RFC: Pluggable Execution Engine for Stream Processing #17501
Labels
discuss
Issues intended to help drive brainstorming and decision making
enhancement
Enhancement or improvement to existing feature or request
RFC
Issues requesting major changes
Search:Performance
Is your feature request related to a problem? Please describe
Overview and Motivation
With the introduction of Arrow streams into OpenSearch, we have been experimenting with the idea of an embeddable execution engine. The idea is to use Lucene for initial doc retrieval and filtering, and to stream doc values out to a pluggable execution engine for analytics. The primary motivation is to bring modern analytics capabilities to OpenSearch, along with performance improvements, particularly through vectorized processing of columnar data.
Describe the solution you'd like
I have been using the Rust-based Apache DataFusion as the engine of choice for experiments, but the idea is to make the engine pluggable. I chose DataFusion because it was easy to get off the ground, it uses Arrow as its in-memory format, and it is performant, with efficient vectorized processing. It was also relatively simple to create vectors in the JVM and send them zero-copy to and from DataFusion. Further, its extensible architecture allows us to customize pretty much anything (language front end, planning, execution, etc.) and to use it for more than simply an execution engine.
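To make "pluggable" a bit more concrete, here is a minimal sketch of what an engine contract could look like on the JVM side. Everything in it is hypothetical and not taken from the POC branches: the `ExecutionEngine` interface, the `EngineRegistry`, and the use of plain `String[]` batches as a stand-in for Arrow vectors are all assumptions made to keep the example dependency-free.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class PluggableEngineSketch {

    /** Hypothetical plugin contract: consume a stream of column batches, return term counts. */
    interface ExecutionEngine {
        String name();

        // Each String[] stands in for one Arrow vector of keyword doc values (assumption).
        Map<String, Long> termCounts(Iterator<String[]> batches);
    }

    /** Pure-Java reference engine; a native engine could sit behind the same interface. */
    static final class JavaEngine implements ExecutionEngine {
        @Override
        public String name() {
            return "java";
        }

        @Override
        public Map<String, Long> termCounts(Iterator<String[]> batches) {
            Map<String, Long> counts = new HashMap<>();
            while (batches.hasNext()) {
                // Process one batch at a time so only a single batch is held in memory.
                for (String term : batches.next()) {
                    counts.merge(term, 1L, Long::sum);
                }
            }
            return counts;
        }
    }

    /** Hypothetical registry that selects an engine by name, e.g. from a cluster setting. */
    static final class EngineRegistry {
        private final Map<String, ExecutionEngine> engines = new HashMap<>();

        void register(ExecutionEngine engine) {
            engines.put(engine.name(), engine);
        }

        ExecutionEngine get(String name) {
            ExecutionEngine engine = engines.get(name);
            if (engine == null) {
                throw new IllegalArgumentException("unknown engine: " + name);
            }
            return engine;
        }
    }
}
```

The point of the sketch is only the shape of the boundary: the search layer hands over batches and gets results back, so a DataFusion-backed implementation and a pure-Java one would be interchangeable behind the same interface.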
Some additional benefits:
Use Cases:
The three use cases I have been targeting for POC are (branches at bottom of this issue):
From an implementation perspective, this is what the execution flow looks like for a terms aggregation in the POC:

At the coordinator:
The coordinator executes the query phase as normal (not pictured), with each data node returning a stream ticket.
On the data node: this resembles today's approach, where aggregators return per-shard results, but with streaming to reduce memory requirements.
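The flow above can be sketched roughly as follows. This is a simplified, in-process illustration under stated assumptions, not the POC code: `StreamTicket`, `DataNode`, `queryPhase`, `fetch`, and `reduce` are hypothetical names, `String[]` batches stand in for Arrow vectors of doc values, and a real implementation would redeem tickets over the network rather than through a direct object reference.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class StreamingTermsSketch {

    /** Hypothetical ticket: identifies one per-shard result stream on one data node. */
    static final class StreamTicket {
        final DataNode node;
        final int id;

        StreamTicket(DataNode node, int id) {
            this.node = node;
            this.id = id;
        }
    }

    /** Hypothetical data node that keeps lazily evaluated result streams keyed by ticket id. */
    static final class DataNode {
        private final Map<Integer, Iterator<Map<String, Long>>> streams = new HashMap<>();
        private int nextId = 0;

        /** Query phase: register a stream of partial term counts and return only its ticket. */
        StreamTicket queryPhase(List<String[]> docValueBatches) {
            final Iterator<String[]> source = docValueBatches.iterator();
            Iterator<Map<String, Long>> partials = new Iterator<Map<String, Long>>() {
                @Override
                public boolean hasNext() {
                    return source.hasNext();
                }

                @Override
                public Map<String, Long> next() {
                    // Partial counts are computed only when the coordinator pulls this batch.
                    Map<String, Long> counts = new HashMap<>();
                    for (String term : source.next()) {
                        counts.merge(term, 1L, Long::sum);
                    }
                    return counts;
                }
            };
            int id = nextId++;
            streams.put(id, partials);
            return new StreamTicket(this, id);
        }

        /** Redeem a ticket; each stream can be consumed once. */
        Iterator<Map<String, Long>> fetch(int id) {
            Iterator<Map<String, Long>> stream = streams.remove(id);
            if (stream == null) {
                throw new IllegalArgumentException("unknown ticket: " + id);
            }
            return stream;
        }
    }

    /** Coordinator: redeem each ticket and fold partial results into the final counts. */
    static Map<String, Long> reduce(List<StreamTicket> tickets) {
        Map<String, Long> merged = new HashMap<>();
        for (StreamTicket ticket : tickets) {
            Iterator<Map<String, Long>> stream = ticket.node.fetch(ticket.id);
            while (stream.hasNext()) {
                stream.next().forEach((term, n) -> merged.merge(term, n, Long::sum));
            }
        }
        return merged;
    }
}
```

Because the coordinator folds each partial result into the running total as it arrives, neither side needs to materialize a full per-shard result, which is the memory benefit the streaming approach is after.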
I’ve benchmarked a couple of aggregation experiments and have seen favorable results in both latency and memory usage. Setup: 3 data nodes + 1 dedicated coordinator; the cluster using DataFusion (DF) is shown in green.
Benchmark - Big5 keyword-terms operation.
I will follow up here shortly with a few more benchmarks using the new red line feature in OSB, but I wanted to get this out there to see what people think.
Related:
Join Support: #15185
Streaming aggs - #16774
Pluggable Storage Engine Support - #17341 (comment)
POC branches - aggregations (term) - https://github.com/mch2/OpenSearch/commits/df-streaming-aggs/
joins - https://github.com/mch2/OpenSearch/commits/mch2-rishma-join
Related component
Search:Performance
Describe alternatives you've considered
Not do this and rely on pure Java implementations?
Additional context
No response