Data Loading for HDFS Pipelines

When the master aggregator reads the contents of an HDFS output directory, it schedules each file on a single SingleStoreDB partition. A batch is complete once every leaf partition across the cluster has finished extracting, transforming, and loading its file.
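The scheduling model described above can be illustrated with a small simulation. This is only a sketch, not SingleStoreDB's implementation: the function names (`plan_batches`, `etl`, `run_batch`), the file names, and the policy of filling one batch with at most one file per partition are all assumptions made for illustration. The source states only that each file is scheduled on a single partition and that a batch finishes when every partition has loaded its file.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_batches(files, num_partitions):
    """Group files into batches with at most one file per partition.

    Assumption: files are assigned in listing order. The real assignment
    policy is not described in the text.
    """
    batches = []
    for start in range(0, len(files), num_partitions):
        chunk = files[start:start + num_partitions]
        # Map partition index -> the single file that partition will load.
        batches.append({partition: path for partition, path in enumerate(chunk)})
    return batches


def etl(partition, path):
    # Hypothetical stand-in for extract, transform, and load on one partition.
    return f"partition {partition} loaded {path}"


def run_batch(batch):
    """Run every partition's file in parallel and wait for all of them.

    Returning from this function models "the batch is complete".
    """
    with ThreadPoolExecutor(max_workers=len(batch)) as pool:
        futures = [pool.submit(etl, p, f) for p, f in batch.items()]
        return [future.result() for future in futures]


# Hypothetical directory listing and partition count.
files = ["part-00000", "part-00001", "part-00002", "part-00003", "part-00004"]
batches = plan_batches(files, num_partitions=2)
results = [run_batch(batch) for batch in batches]
```

With five files and two partitions, this sketch produces three batches, and the last batch uses only one partition. The point it illustrates is that a batch waits on its slowest partition, so files of similar size tend to keep partitions evenly busy.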