Data Loading for Kafka Pipelines
For Kafka pipelines to perform optimally, SingleStore and Kafka should have an equal number of partitions.
In scenarios where the leaf nodes are processing unequal amounts of data, pipeline ingestion will generally outperform parallel loading through aggregator nodes.
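The effect of matching partition counts can be illustrated with a small model. The sketch below is not SingleStore's scheduler: the round-robin assignment and the names `assign_round_robin` and `imbalance` are assumptions made purely for illustration. It shows how many Kafka partitions each SingleStore partition would handle under that assumption.

```python
def assign_round_robin(kafka_partitions: int, singlestore_partitions: int) -> dict:
    """Hypothetical model: deal Kafka partitions out to SingleStore partitions in turn.

    Returns a mapping of SingleStore partition index -> number of Kafka
    partitions assigned to it.
    """
    load = {p: 0 for p in range(singlestore_partitions)}
    for kafka_partition in range(kafka_partitions):
        load[kafka_partition % singlestore_partitions] += 1
    return load


def imbalance(load: dict) -> int:
    """Difference between the busiest and the idlest SingleStore partition."""
    return max(load.values()) - min(load.values())


if __name__ == "__main__":
    # Compare an equal count with two unequal ones (illustrative values only).
    for kafka_count, singlestore_count in [(4, 4), (6, 4), (2, 4)]:
        load = assign_round_robin(kafka_count, singlestore_count)
        print(kafka_count, singlestore_count, load, "imbalance:", imbalance(load))
```

In this model, only the case with equal counts gives every SingleStore partition exactly one Kafka partition. With unequal counts, some partitions carry more work than others or sit idle.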
The SingleStore Master Aggregator (MA) connects to Kafka’s lead broker and requests metadata about the Kafka cluster, including information about the brokers, topics, and partitions.
SingleStore processes data from Kafka in order within each partition.
In each batch, each leaf node processes a different set of Kafka partitions.
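A toy simulation can make the batching behavior more concrete. Everything in it is an assumption for illustration: the topic contents, the `batch_size` parameter, the `run_batches` function, and the rule used to pick a leaf. Only the two properties described above are taken from the text: messages within a Kafka partition are consumed in order, and within a batch each Kafka partition is handled by a single leaf.

```python
# Hypothetical topic data: Kafka partition id -> messages in offset order.
TOPIC = {
    "P1": ["a0", "a1", "a2"],
    "P2": ["b0", "b1"],
    "P3": ["c0", "c1", "c2"],
    "P4": ["d0"],
}
LEAVES = ["L1", "L2"]


def run_batches(topic, leaves, batch_size=2):
    """Consume every partition in offset order, spreading partitions across leaves.

    Returns a log of (batch number, leaf, partition, messages) tuples.
    """
    offsets = {partition: 0 for partition in topic}
    log = []
    batch = 0
    while any(offsets[p] < len(messages) for p, messages in topic.items()):
        batch += 1
        # Partitions that still have unread messages, in a stable order.
        pending = [p for p in sorted(topic) if offsets[p] < len(topic[p])]
        for i, partition in enumerate(pending):
            leaf = leaves[i % len(leaves)]  # illustrative assignment rule
            start = offsets[partition]
            chunk = topic[partition][start:start + batch_size]
            offsets[partition] = start + len(chunk)  # advance; never re-read
            log.append((batch, leaf, partition, chunk))
    return log


if __name__ == "__main__":
    for entry in run_batches(TOPIC, LEAVES):
        print(entry)
```

Because each partition has its own offset that only moves forward, concatenating the chunks logged for a partition reproduces its original message order, regardless of which leaf handled each batch.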
Kafka to SingleStore One-to-One Relationship
Kafka Cluster | SingleStore Cluster |
---|---|
BKR (P1) (P3) | MA = Master Aggregator |
 | CA = Child aggregator |
BKR (P2) (P4) | L1 & L2 = Leaf 1 and Leaf 2 |
 | P1 - P4 = partitions 1 - 4 |

Last modified: September 11, 2023