Data Shaping with Pipelines
After data is extracted from a SingleStore Pipeline’s data source, it can optionally be shaped (modified) before it is loaded.
Some common data shaping operations that can be performed are:
- Lookups from other SingleStore tables (in addition to the destination table(s))
- Normalizing data
- Denormalizing data
- Adding computed columns
- Filtering data (excluding specific columns or records)
- Mapping data values from the data source to new values
- Splitting records from the data source into multiple destination tables
- Adding surrogate keys
Data modifications made during shaping are not written back to the data source, unless done explicitly in a transform (SingleStore Self-Managed only).
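As a language-neutral illustration of a few of the operations listed above (filtering records and columns, mapping values, adding a computed column), the following Python sketch shapes a handful of in-memory records. The field names, the `STATUS_NAMES` mapping, and the `shape` function are hypothetical and are not part of any SingleStore API; the sketch only shows the kind of logic involved, and it leaves the source records untouched.

```python
from typing import Iterable, Iterator

# Hypothetical mapping from source status codes to destination values.
STATUS_NAMES = {"A": "active", "I": "inactive"}


def shape(records: Iterable[dict]) -> Iterator[dict]:
    """Filter, map, and extend records before they reach the destination."""
    for rec in records:
        # Filtering records: skip rows without a positive quantity.
        if rec["quantity"] <= 0:
            continue
        # Filtering columns: only the fields named here are kept ("note" is dropped).
        yield {
            "sku": rec["sku"],
            # Mapping: translate a source code to a new value.
            "status": STATUS_NAMES.get(rec["status"], "unknown"),
            # Computed column: derived from two source fields.
            "total": rec["quantity"] * rec["unit_price"],
        }


source = [
    {"sku": "x1", "status": "A", "quantity": 2, "unit_price": 5, "note": "n/a"},
    {"sku": "x2", "status": "I", "quantity": 0, "unit_price": 9, "note": ""},
]
shaped = list(shape(source))  # new records; `source` itself is not modified
```

Here `shaped` holds one record for `x1` with the mapped status and the computed `total`, while the record for `x2` is filtered out.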
Ways to specify data shaping logic:
- In a CREATE PIPELINE statement.
- In a stored procedure that is called from the pipeline.
- In a transform that is called from the pipeline.
Methods for Data Shaping with Pipelines
The details of each data shaping method are explained in the following table.
Data Shaping Method | Amount of Customization Logic Allowed | Ease of Use | Comments | Examples |
---|---|---|---|---|
In a CREATE PIPELINE statement | Low | Easiest | Pros: Generally, runs the fastest of the three data shaping methods; transactional guarantees. | |
Pipeline Stored Procedure | Medium | More Difficult | Pros: Transactional guarantees; cons of specifying data shaping logic directly in your CREATE PIPELINE statement | See examples in CREATE PIPELINE. |
Transform | High | Most Difficult | Pros: Can use nearly any programming language and leverage third-party libraries. | See the guide Writing a Transform to Use with a Pipeline. |
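
To make the transform row more concrete, here is a minimal Python sketch of transform-style shaping logic. It assumes the transform receives extracted data as CSV text on standard input and writes the shaped CSV to standard output; the exact input framing depends on the pipeline's data source, so treat that as an assumption and consult the transform guide for the real contract. The `transform` function, the two-column layout, and the sample data are hypothetical.

```python
import csv
import io
from typing import TextIO


def transform(src: TextIO, dst: TextIO) -> None:
    """Read CSV rows from src, shape them, and write CSV rows to dst."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if len(row) != 2:
            continue  # Filtering: drop rows that do not have exactly two fields.
        name, amount = row
        # Normalize the name and add a computed column (amount in cents).
        writer.writerow([name.strip().lower(), amount, round(float(amount) * 100)])


# Local demonstration with in-memory streams standing in for stdin/stdout.
out = io.StringIO()
transform(io.StringIO(" Alice ,1.50\nbad-row\nBOB,2\n"), out)
result = out.getvalue()

# When run as an executable transform, the entry point would instead be
# something like: import sys; transform(sys.stdin, sys.stdout)
```

Keeping the logic in a function that takes two streams makes it possible to test the shaping rules locally before attaching the script to a pipeline.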
Last modified: October 23, 2023