Data Shaping with Pipelines
After data is extracted from a SingleStore Pipeline’s data source, it can be optionally shaped (modified).
Some common data shaping operations that can be performed are:
- Lookups from other SingleStore tables (in addition to the destination table(s))
- Normalizing data
- Denormalizing data
- Adding computed columns
- Filtering data (excluding specific columns or records)
- Mapping data values from the data source to new values
- Splitting records from the data source into multiple destination tables
- Adding surrogate keys
Data modifications made during shaping are not written back to the data source, unless done explicitly in a transform (SingleStore Self-Managed only).
Ways to specify data shaping logic:
- In a CREATE PIPELINE statement.
- In a stored procedure that is called from the pipeline.
- In a transform that is called from the pipeline.
Methods for Data Shaping with Pipelines
The details of each data shaping method are explained in the following table.
| Data Shaping Method | Amount of Customization Logic Allowed | Ease of Use | Comments | Examples | 
|---|---|---|---|---|
| In a CREATE PIPELINE statement | Low | Easiest | Pros: Generally runs the fastest of the three data shaping methods; transactional guarantees. | |
| Pipeline Stored Procedure | Medium | More Difficult | Pros: Transactional guarantees; the cons of specifying data shaping logic directly in your CREATE PIPELINE statement do not apply. | See examples in CREATE PIPELINE. |
| Transform | High | Most Difficult | Pros: Can use nearly any programming language and leverage third-party libraries. | See the guide Writing a Transform to Use with a Pipeline |
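
To make the transform method more concrete, here is a minimal sketch of what a transform might look like in Python. It assumes the pipeline hands the transform CSV text on standard input and reads the shaped CSV back from standard output, as it would for a file-based source. The four-column input layout (id, status, price, quantity), the `STATUS_MAP` values, and the function names are all hypothetical and chosen only for illustration. The sketch combines three of the shaping operations listed earlier: filtering records, mapping source values to new values, and adding a computed column.

```python
import csv
import sys

# Hypothetical mapping from source status codes to destination values.
STATUS_MAP = {"A": "active", "I": "inactive"}


def shape(rows):
    """Yield shaped rows from (id, status, price, quantity) input rows."""
    for row in rows:
        if len(row) != 4:
            continue  # filter: skip malformed records
        record_id, status, price, quantity = row
        if status not in STATUS_MAP:
            continue  # filter: skip records with an unknown status code
        total = float(price) * int(quantity)  # computed column
        # Map the status code to its new value and drop price/quantity.
        yield [record_id, STATUS_MAP[status], f"{total:.2f}"]


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read CSV records from stdin and write the shaped CSV to stdout."""
    writer = csv.writer(stdout, lineterminator="\n")
    writer.writerows(shape(csv.reader(stdin)))


# When saved as the transform executable, end the script with a call to main().
```

Because the shaping logic lives in `shape()` and the I/O lives in `main()`, the logic can be exercised locally with in-memory streams before the script is attached to a pipeline. The destination table would need columns matching the three output fields.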
Last modified: October 23, 2023