# Appendix

## Appendix: Understanding the Extraction Process

## Extraction Process

The extraction process consists of two parts:

* Initial Extract
* Delta Extract

## Initial Extract

An initial extract is performed the first time Ingest connects to a database. During this extract, the entire table is replicated from the source database to the destination.

## Delta Extract

After the initial extract, Ingest performs delta extracts. Delta extracts capture only the changes made since the last extraction and merge them with the destination.
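
The difference between the two extract types can be illustrated with a minimal sketch. This is not how Ingest is implemented; the `Change` type, the `apply_delta` function, and the in-memory dictionaries standing in for tables are all hypothetical and exist only to show the idea of a full copy followed by merging changes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    """Hypothetical change record; not an Ingest data structure."""
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


def apply_delta(destination: Dict[int, dict], changes: List[Change]) -> Dict[int, dict]:
    """Merge a batch of changes into a copy of the destination table."""
    merged = dict(destination)
    for change in changes:
        if change.op == "delete":
            merged.pop(change.key, None)
        elif change.op in ("insert", "update"):
            merged[change.key] = change.row
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return merged


# Initial extract: the entire source table is copied to the destination.
source = {1: {"name": "a"}, 2: {"name": "b"}}
destination = dict(source)

# Delta extract: only the changes since the last extraction are merged.
delta = [
    Change("update", 1, {"name": "a2"}),
    Change("delete", 2),
    Change("insert", 3, {"name": "c"}),
]
destination = apply_delta(destination, delta)
print(destination)
```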

A typical delta extract log file looks like this:

```
Extracting 2
Delta Extract database_name:table_name
Info (ME188): Stage pre-BCP
Info (ME190): Stage post-BCP
Info (ME260): Stage post-process
Delta Extract database_name complete (10 records)
Extracted 2
Load file 2
Creating table dbname_schemaname.table_name...
Created table dbname_schemaname.table_name
Loading table dbname_schemaname.table_name with x records(n bytes)
Created new connection org.mariadb.jdbc.Connection@4ca52dc7
Replace data...
Loading ./spool/dbname_schemaname.table_name_2.dat into dbname_schemaname.table_name
Loaded ./spool/dbname_schemaname.table_name_2.dat
Deleted ./spool/dbname_schemaname.table_name_2.dat
Replace data completed
Loaded table dbname_schemaname.table_name(0 of 1 left)
Loaded file 2(Source=2025-01-07 10:01:56 IST)
```
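
If you want to pull a few values out of a log like the one above, a small parser can help. The following is a sketch that assumes the log lines have exactly the shape shown in the sample; the function name `summarize_delta_log` and the regular expressions are illustrative and are not part of Ingest, and real log lines may differ.

```python
import re

# Patterns written against the sample log above (assumed format).
RECORDS_RE = re.compile(r"^Delta Extract (?P<db>\S+) complete \((?P<count>\d+) records\)$")
LOADED_RE = re.compile(r"^Loaded file (?P<seq>\d+)\(Source=(?P<source_ts>[^)]+)\)$")


def summarize_delta_log(text: str) -> dict:
    """Collect the record count and the loaded file sequence from a delta extract log."""
    summary = {"database": None, "records": None, "file_seq": None, "source_ts": None}
    for line in text.splitlines():
        line = line.strip()
        match = RECORDS_RE.match(line)
        if match:
            summary["database"] = match.group("db")
            summary["records"] = int(match.group("count"))
            continue
        match = LOADED_RE.match(line)
        if match:
            summary["file_seq"] = int(match.group("seq"))
            summary["source_ts"] = match.group("source_ts")
    return summary


# A few lines copied from the sample log.
sample = """Extracting 2
Delta Extract database_name:table_name
Delta Extract database_name complete (10 records)
Loaded file 2(Source=2025-01-07 10:01:56 IST)"""

summary = summarize_delta_log(sample)
print(summary)
```

The source timestamp is kept as a string because the sample uses a time zone abbreviation (`IST`) that is ambiguous to parse.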

## First Extract

The first extract must always be a **Full Extract**, which extracts the entire table. Subsequent extracts are delta extracts that run periodically at the frequency you configure.

## Appendix: Additional Configurations

## Source Database

While configuring the source database, there are additional configurations for each **Extract Type**.

* Handle zero length strings: Load zero-length strings directly from the source to the destination.
* Extract Threads: The number of extracting threads to use.
* Log file catchup count: The number of Oracle archive logs processed in one instance.
* Log catchup time (mins):
* Log catchup offset (mins):
* Log file look-ahead:
* Convert RAW to Hex: Convert raw columns to hex strings instead of treating them as CHAR(1).

## Destination Database

While configuring the destination database, there are additional configurations for each **Extract Type**.

* Max Updates: Combine updates that exceed this value.
* Load Threads: The number of loading threads to use.
* Add Database Prefix:
* Truncate table instead of drop:
* Schema for all tables: Ignore the source schema and place all tables in this schema on the destination.
* Ignore database name in schema: Check this option to ignore the database name as part of the schema prefix for destination tables.
* Database for staging tables: Specify the schema name to be used for staging tables in the destination.
* Retain staging tables: Check this option to retain staging tables in the destination.
* Sync Struct Suppressed: When enabled, Ingest will not automatically recreate destination tables or apply any schema changes, even if structure mismatches are detected or **Sync Struct** is run. Enable this to manually control schema changes and preserve custom destination table definitions. It is disabled by default.

## Flow Events for AWS CloudWatch Logs and SNS

Ingest supports connections to AWS CloudWatch Logs, CloudWatch Metrics, and SNS. These integrations enable monitoring of Ingest operations and interaction with other resources that run on AWS. AWS CloudWatch Logs can capture Ingest event logs, such as load completion or failure. These logs can also be used to monitor error conditions and trigger alarms.

The following is a list of events that Ingest pushes to the AWS CloudWatch Logs console and AWS SNS:

| **FlowEvents**     | **Description**                                                                    |
| ------------------ | ---------------------------------------------------------------------------------- |
| `LogfileProcessed` | Archive log file processed (Oracle only)                                           |
| `TableExtracted`   | Source table extraction complete for SQL Server and Oracle (initial extracts only) |
| `ExtractCompleted` | Source extraction batch is complete                                                |
| `TableLoaded`      | Destination table load complete                                                    |
| `LoadCompleted`    | All destination table loads in a batch complete                                    |
| `HaltError`        | Unrecoverable error occurred; the Scheduler is disabled                            |
| `RetryError`       | Error occurred, but process will retry                                             |
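
A consumer of these events might route them by name. The sketch below assumes each event arrives as a JSON object whose `type` field holds the event name; that payload format is an assumption and is not confirmed here. The severity labels and the `classify_event` function are hypothetical.

```python
import json

# Event names from the table above, grouped by how a consumer might treat them.
PROGRESS_EVENTS = {
    "LogfileProcessed",
    "TableExtracted",
    "ExtractCompleted",
    "TableLoaded",
    "LoadCompleted",
}
# Hypothetical severity labels; Ingest does not define these.
ERROR_EVENTS = {"HaltError": "critical", "RetryError": "warning"}


def classify_event(raw_message: str) -> str:
    """Return a severity label for a JSON-encoded event (assumed payload format)."""
    event = json.loads(raw_message)
    event_type = event.get("type")
    if event_type in ERROR_EVENTS:
        return ERROR_EVENTS[event_type]
    if event_type in PROGRESS_EVENTS:
        return "info"
    return "unknown"


print(classify_event('{"type": "HaltError", "message": "example"}'))
```

Treating `HaltError` as more severe than `RetryError` follows from the descriptions above: the former disables the Scheduler, while the latter is retried.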

The following are the details for each of the SingleStore Flow events:

**Event: LogfileProcessed**

| **Attribute**     | **Is Metric(Y/N)?** | **Description**                               |
| ----------------- | ------------------- | --------------------------------------------- |
| type              | N                   | “LogfileProcessed”                            |
| generated         | N                   | Timestamp of message                          |
| source            | N                   | Instance name                                 |
| sourceType        | N                   | “CDC”                                         |
| fileSeq           | N                   | File sequence                                 |
| file              | N                   | File name                                     |
| dictLoadMS        | Y                   | Time taken to load dictionary in milliseconds |
| CurrentDBDate     | N                   | Current database date                         |
| CurrentServerDate | N                   | Current Flow server date                      |
| parseMS           | Y                   | Time taken to parse file in milliseconds      |
| parseComplete     | N                   | Timestamp when parsing is complete            |
| sourceDate        | N                   | Source date                                   |

**Event: TableExtracted**

| **Attribute** | **Is Metric(Y/N)?** | **Description**             |
| ------------- | ------------------- | --------------------------- |
| type          | N                   | “TableExtracted”            |
| subType       | N                   | Table name                  |
| generated     | N                   | Timestamp of message        |
| source        | N                   | Instance name               |
| sourceType    | N                   | “CDC”                       |
| tabName       | N                   | Table name                  |
| success       | N                   | true/false                  |
| message       | N                   | Status message              |
| sourceTS      | N                   | Source date time            |
| sourceInserts | Y                   | Number of Inserts in source |
| sourceUpdates | Y                   | Number of Updates in source |
| sourceDeletes | Y                   | Number of Deletes in source |

**Event: ExtractCompleted**

| **Attribute** | **Is Metric(Y/N)?** | **Description**       |
| ------------- | ------------------- | --------------------- |
| type          | N                   | “ExtractCompleted”    |
| generated     | N                   | Timestamp of message  |
| source        | N                   | Instance name         |
| sourceType    | N                   | “CDC”                 |
| jobType       | N                   | “EXTRACT”             |
| jobSubType    | N                   | Extract type          |
| success       | N                   | Y/N                   |
| message       | N                   | Status message        |
| runId         | N                   | Run ID                |
| sourceDate    | N                   | Source date           |
| dbDate        | N                   | Current database date |
| fromSeq       | N                   | Start file sequence   |
| toSeq         | N                   | End file sequence     |
| extractId     | N                   | Run ID for extract    |
| tableErrors   | Y                   | Count of table errors |
| tableTotals   | Y                   | Count of total tables |

**Event: TableLoaded**

| **Attribute** | **Is Metric(Y/N)?** | **Description**                  |
| ------------- | ------------------- | -------------------------------- |
| type          | N                   | “TableLoaded”                    |
| subType       | N                   | Table name                       |
| generated     | N                   | Timestamp of message             |
| source        | N                   | Instance name                    |
| sourceType    | N                   | “CDC”                            |
| tabName       | N                   | Table name                       |
| success       | N                   | true/false                       |
| message       | N                   | Status message                   |
| sourceTS      | N                   | Source date time                 |
| sourceInserts | Y                   | Number of Inserts in source      |
| sourceUpdates | Y                   | Number of Updates in source      |
| sourceDeletes | Y                   | Number of Deletes in source      |
| destInserts   | Y                   | Number of Inserts in destination |
| destUpdates   | Y                   | Number of Updates in destination |
| destDeletes   | Y                   | Number of Deletes in destination |
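
The metric attributes of a `TableLoaded` event can be compared to see whether source and destination counts line up. The following sketch assumes the event has already been decoded into a dictionary keyed by the attribute names in the table above; the function names and the example values are made up for illustration. A difference between source and destination counts is not necessarily an error, so treat the result as a prompt for investigation only.

```python
# Attributes marked "Y" in the TableLoaded table above.
TABLE_LOADED_METRICS = (
    "sourceInserts", "sourceUpdates", "sourceDeletes",
    "destInserts", "destUpdates", "destDeletes",
)


def table_loaded_metrics(event: dict) -> dict:
    """Pick out the metric attributes of a TableLoaded event (hypothetical helper)."""
    if event.get("type") != "TableLoaded":
        raise ValueError("not a TableLoaded event")
    return {name: int(event[name]) for name in TABLE_LOADED_METRICS if name in event}


def count_differences(metrics: dict) -> dict:
    """Return destination-minus-source counts for operations whose counts differ."""
    diffs = {}
    for op in ("Inserts", "Updates", "Deletes"):
        src = metrics.get("source" + op)
        dst = metrics.get("dest" + op)
        if src is not None and dst is not None and src != dst:
            diffs[op] = dst - src
    return diffs


# Example event with made-up values.
event = {
    "type": "TableLoaded",
    "tabName": "table_name",
    "success": True,
    "sourceInserts": 10, "sourceUpdates": 4, "sourceDeletes": 1,
    "destInserts": 10, "destUpdates": 3, "destDeletes": 1,
}
print(count_differences(table_loaded_metrics(event)))
```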

**Event: LoadCompleted**

| **Attribute** | **Is Metric(Y/N)?** | **Description**       |
| ------------- | ------------------- | --------------------- |
| type          | N                   | “LoadCompleted”       |
| generated     | N                   | Timestamp of message  |
| source        | N                   | Instance name         |
| sourceType    | N                   | “CDC”                 |
| jobType       | N                   | “LOAD”                |
| jobSubType    | N                   | Subtype of the “LOAD” |
| success       | N                   | Y/N                   |
| message       | N                   | Status message        |
| runId         | N                   | Run ID                |
| sourceDate    | N                   | Source date           |
| dbDate        | N                   | Current database date |
| fromSeq       | N                   | Start file sequence   |
| toSeq         | N                   | End file sequence     |
| extractId     | N                   | Run ID for extract    |
| tableErrors   | Y                   | Count of table errors |
| tableTotals   | Y                   | Count of total tables |

**Event: HaltError**

| **Attribute** | **Is Metric (Y/N)?** | **Description**      |
| ------------- | -------------------- | -------------------- |
| type          | N                    | “HaltError”          |
| generated     | N                    | Timestamp of message |
| source        | N                    | Instance name        |
| sourceType    | N                    | “CDC”                |
| message       | N                    | Error message        |
| errorId       | N                    | Short identifier     |

**Event: RetryError**

| **Attribute** | **Is Metric (Y/N)?**  | **Description**      |
| ------------- | --------------------- | -------------------- |
| type          | N                     | “RetryError”         |
| generated     | N                     | Timestamp of message |
| source        | N                     | Instance name        |
| sourceType    | N                     | “CDC”                |
| message       | N                     | Error message        |
| errorId       | N                     | Short identifier     |



***

Modified at: December 16, 2025

Source: [/db/v9.1/load-data/load-data-with-singlestore-flow-on-helios/singlestore-ingest/appendix/](https://docs.singlestore.com/db/v9.1/load-data/load-data-with-singlestore-flow-on-helios/singlestore-ingest/appendix/)

