Data Loading for Kafka Pipelines

For Kafka pipelines, there should be a 1:1 relationship between the number of leaf node partitions and the number of Kafka partitions. For example, if your database has two leaves with eight partitions each, your Kafka topic should have 16 partitions. If the number of database partitions and the number of Kafka partitions differ, leaf nodes will either sit idle or process uneven amounts of data. However, even when leaf nodes process uneven amounts of data, ingestion using Pipelines is generally more performant than parallel loading through aggregator nodes.
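
The partition arithmetic above can be illustrated with a short Python sketch. The function names `required_kafka_partitions` and `load_per_db_partition` are hypothetical helpers written for this example, not part of any product API. The round-robin assignment of Kafka partitions to database partitions is an assumption made only to show idle and uneven cases; the actual assignment strategy may differ.

```python
def required_kafka_partitions(leaves: int, partitions_per_leaf: int) -> int:
    """Number of Kafka partitions needed for a 1:1 match with database partitions."""
    return leaves * partitions_per_leaf


def load_per_db_partition(db_partitions: int, kafka_partitions: int) -> list[int]:
    """Count how many Kafka partitions each database partition would read.

    ASSUMPTION: Kafka partitions are handed out round-robin. This is an
    illustrative model only, not a description of the real scheduler.
    """
    loads = [0] * db_partitions
    for kafka_partition in range(kafka_partitions):
        loads[kafka_partition % db_partitions] += 1
    return loads


if __name__ == "__main__":
    # Two leaves with eight partitions each, as in the example above.
    db_partitions = required_kafka_partitions(leaves=2, partitions_per_leaf=8)
    print("Kafka partitions needed:", db_partitions)

    # Matched, too few, and too many Kafka partitions.
    for kafka_partitions in (16, 12, 20):
        loads = load_per_db_partition(db_partitions, kafka_partitions)
        print(
            f"{kafka_partitions} Kafka partitions:",
            f"idle={loads.count(0)}, min={min(loads)}, max={max(loads)}",
        )
```

Under this model, a topic with fewer Kafka partitions than database partitions leaves some database partitions with nothing to read, while a topic with more Kafka partitions gives some database partitions a larger share than others.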