Data Shaping with Pipelines
After data is extracted from a SingleStore Pipeline’s data source, it can be optionally shaped (modified).
Some common data shaping operations that can be performed are:
- Lookups from other SingleStore tables (in addition to the destination table(s))
- Normalizing data
- Denormalizing data
- Adding computed columns
- Filtering data (excluding specific columns or records)
- Mapping data values from the data source to new values
- Splitting records from the data source into multiple destination tables
- Adding surrogate keys
Data modifications made during shaping are not written back to the data source, unless done explicitly in a transform (SingleStore Self-Managed only).
Ways to specify data shaping logic:
- In a CREATE PIPELINE statement.
- In a stored procedure that is called from the pipeline.
- In a transform that is called from the pipeline.
Methods for Data Shaping with Pipelines
The details of each data shaping method are explained in the following table.
| Data Shaping Method | Amount of Customization Logic Allowed | Ease of Use | Comments | Examples |
|---|---|---|---|---|
| In a CREATE PIPELINE statement | Low | Easiest | Pros: Generally runs the fastest of the three data shaping methods; transactional guarantees. | |
| Pipeline Stored Procedure | Medium | More Difficult | Pros: Transactional guarantees; cons of specifying data shaping logic directly in your | See examples in CREATE PIPELINE. |
| Transform | High | Most Difficult | Pros: Can use nearly any programming language and leverage third-party libraries. | See the guide Writing a Transform to Use with a Pipeline. |
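To make the transform option more concrete, the following is a minimal sketch of a transform written in Python. It assumes the transform receives comma-separated rows on stdin and writes the shaped rows to stdout. The four-column input layout (`id`, `status`, `price`, `quantity`), the `STATUS_MAP` values, and the `shape_rows` helper are all hypothetical and exist only for illustration. How a pipeline actually frames records on stdin can depend on the data source, so check the transform guide before adapting this.

```python
#!/usr/bin/env python3
"""Hypothetical pipeline transform: CSV rows in on stdin, shaped CSV rows out on stdout."""
import csv
import sys

# Hypothetical mapping from source status codes to destination values.
STATUS_MAP = {"A": "active", "I": "inactive"}


def shape_rows(rows):
    """Yield shaped rows from an iterable of [id, status, price, quantity] rows."""
    for row in rows:
        if len(row) != 4:
            continue  # Filtering: skip records that do not have four fields.
        record_id, status, price, quantity = row
        if status not in STATUS_MAP:
            continue  # Filtering: drop records with an unknown status code.
        try:
            # Computed column: total value of the line item.
            total = float(price) * int(quantity)
        except ValueError:
            continue  # Filtering: skip records with non-numeric price or quantity.
        # Mapping: replace the status code with its destination value.
        # The raw price and quantity columns are excluded from the output.
        yield [record_id, STATUS_MAP[status], f"{total:.2f}"]


def main():
    reader = csv.reader(sys.stdin)
    writer = csv.writer(sys.stdout, lineterminator="\n")
    writer.writerows(shape_rows(reader))


if __name__ == "__main__":
    main()
```

This sketch combines three of the shaping operations listed earlier: filtering records, mapping source values to new values, and adding a computed column. Keeping the shaping logic in a function that is separate from the stdin/stdout handling makes it possible to test the logic without running a pipeline.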
Last modified: October 23, 2023