Avro Schema Evolution with Pipelines

Avro schema evolution is the ability of existing consumers of a schema to keep working when the schema is updated.
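To make the definition concrete, here is a minimal Python sketch of how a consumer can handle records written under an older or newer version of a schema. It follows the general Avro resolution rules (a field missing from the record takes the reader's default; a field unknown to the reader is ignored). It is not an Avro library and not SingleStore code: `resolve_record`, the `Product` schema, and its fields are hypothetical names used only for illustration.

```python
# Minimal sketch of Avro-style schema resolution; not a real Avro library.

def resolve_record(record, reader_schema):
    """Project a decoded record onto the reader's schema.

    Fields the reader does not declare are dropped; fields the record
    lacks are filled from the default declared in the reader's schema.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved


# Hypothetical schema, version 2: "color" was added with a default.
schema_v2 = {
    "type": "record",
    "name": "Product",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "color", "type": "string", "default": "unknown"},
    ],
}

old_record = {"id": 1}                                # written before "color" existed
new_record = {"id": 2, "color": "red", "price": 9.5}  # writer has an extra field

print(resolve_record(old_record, schema_v2))  # {'id': 1, 'color': 'unknown'}
print(resolve_record(new_record, schema_v2))  # {'id': 2, 'color': 'red'}
```

The consumer code does not change between the two calls; only the schema it reads against determines how each record is interpreted.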

SingleStore Pipelines support some Avro schema evolution capabilities, which are explained below.

  • When you create a Pipeline, instead of specifying the Avro schema definition directly in the CREATE PIPELINE statement, you can specify the host name or IP address and the port of a Confluent Schema Registry. The schema registry contains the definition of the schema. If fields are added to or removed from the definition, the Pipeline picks up these changes from the registry.

  • Without Avro schema evolution, you need to reset a Pipeline’s offsets after updating the Avro schema. The offsets track the current position in the data source from which the Pipeline is reading. Because a reset moves the offsets back to the beginning position, the Pipeline reads the source again from the start, so you also need to unload the data from the Pipeline’s target table. As an alternative to resetting the offsets and unloading the data, you could collect the Pipeline’s offsets (which may be a difficult process).

  • If you only add fields to the Avro schema (and do not remove any), you do not need to stop the Pipeline before modifying the Pipeline to add the fields.
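The condition in the last bullet (fields added, none removed) can be checked mechanically before you touch a Pipeline. The following Python sketch compares two versions of a schema by field name. It is an illustration only: `classify_change` and the example schemas are hypothetical, and the code does not talk to SingleStore or to a schema registry.

```python
# Minimal sketch: classify a schema change by comparing top-level field names.

def classify_change(old_schema, new_schema):
    """Return which fields were added or removed between two schema versions."""
    old_names = {field["name"] for field in old_schema["fields"]}
    new_names = {field["name"] for field in new_schema["fields"]}
    added = sorted(new_names - old_names)
    removed = sorted(old_names - new_names)
    return {
        "added": added,
        "removed": removed,
        # True only when at least one field was added and none was removed.
        "add_only": bool(added) and not removed,
    }


# Hypothetical schema versions.
v1 = {"type": "record", "name": "Product",
      "fields": [{"name": "id", "type": "int"}]}
v2 = {"type": "record", "name": "Product",
      "fields": [{"name": "id", "type": "int"},
                 {"name": "color", "type": "string", "default": "unknown"}]}

print(classify_change(v1, v2))  # add-only change
print(classify_change(v2, v1))  # a field was removed
```

Only the first case corresponds to the add-only situation described above. The comparison looks at top-level field names only; type changes and nested records are outside the scope of this sketch.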