SingleStore Ingest
Overview
SingleStore Ingest (“Ingest”) is real-time data replication software that replicates data from various sources to SingleStore.
Supported Source Databases
Ingest supports the following database sources:
- Oracle
- Microsoft SQL Server
- MySQL
- PostgreSQL
Contact your SingleStore account team or SingleStore Sales if you want to move data from a source not listed above.
Ingest Architecture
Ingest replicates data from any supported source to a SingleStore destination database.
SingleStore Flow, of which Ingest is a part, offers several deployment strategies for its customers, including:
- Standard deployment in an AWS environment
- High Availability deployment in an AWS environment
- Hybrid deployment using both on-premises and cloud infrastructure
- Fully on-premises deployment
Flow components can be deployed in Google Cloud and Microsoft Azure as well.
Ingest uses log-based Change Data Capture for data replication.
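As a rough illustration of the general idea behind log-based Change Data Capture (and not a description of how Ingest itself is implemented), the sketch below replays an ordered list of change events against an in-memory copy of a table. The `ChangeEvent` type, its `position` field, and the `apply_changes` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry read from a source transaction log (hypothetical shape)."""
    position: int               # monotonically increasing log position
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; unused for deletes


def apply_changes(replica: Dict[int, dict],
                  events: Iterable[ChangeEvent],
                  last_applied: int = 0) -> int:
    """Apply events in log order and return the last applied position.

    Events at or before `last_applied` are skipped, so replaying the
    same stretch of log twice leaves the replica unchanged.
    """
    for event in sorted(events, key=lambda e: e.position):
        if event.position <= last_applied:
            continue  # already applied in an earlier run
        if event.op == "delete":
            replica.pop(event.key, None)
        elif event.op in ("insert", "update"):
            replica[event.key] = dict(event.row or {})
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
        last_applied = event.position
    return last_applied
```

Tracking the last applied position is what would let a replication process of this kind resume after a restart without re-reading the source tables.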
The following diagram serves as the reference for all setup instructions.
Estimated deployment time: Approximately 1 hour
![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blt060ddd51a6772bdf/67ac83e03cc73486be3dd213/singlestore_flow_ingest_architecture-2qyFhL.png?width=850&disable=upscale&auto=webp)
Ingest / AWS Service Integration
The following diagram shows the Ingest architecture integrated with various optional AWS services in a standard deployment.
![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blta9567b5d380c0af5/67ac83f84ee4090fd57d6149/singlestore_flow_ingest_aws_service_integration-LYaiej.png?width=850&disable=upscale&auto=webp)
This architecture diagram illustrates a standard deployment that highlights the following features:
- AWS services running alongside Ingest.
- Recommended Flow architecture for a VPC in AWS.
- Data flow between the source database, AWS, and SingleStore destination database, including security and monitoring features.
- Security, including IAM, organized in a separate group and integrated with Ingest.
Ingest High Availability Architecture
The following High Availability architecture diagram shows how Ingest is deployed in a multi-AZ (Availability Zone) setup.
Estimated deployment time: Approximately 1 day
![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blt8ee38c225ec6f9bf/67ac83f83ff84648e52fcd28/singlestore_flow_ingest_ha_architecture-GrIA1J.png?width=850&disable=upscale&auto=webp)
Ingest Hybrid Architecture
Ingest also offers a hybrid deployment model that combines on-premises services with those in the AWS Cloud.
Estimated deployment time: Approximately 2 hours to 1 day
![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blt5d7775a0de252a6f/67ac83ecbe5d7df40d7d8e26/singlestore_flow_ingest_hybrid_architecture-750qBD.png?width=850&disable=upscale&auto=webp)
Prerequisites
The following are the prerequisites for launching Ingest on Amazon EC2:
- Selection of the Ingest volume.
- Selection of the EC2 instance type.
- Ensure connectivity between the server or EC2 instance hosting the Ingest software and the source. Additionally, ensure connectivity to DynamoDB if the high availability option is required.
To create the necessary AWS services, refer to Environment Preparation.
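The connectivity prerequisite can be spot-checked from the host that will run Ingest. The following is a minimal sketch using only the Python standard library. The `can_connect` helper and the `DEFAULT_PORTS` table are illustrative names, the ports listed are the common defaults for each database and may differ in your environment, and a successful TCP connection does not by itself prove that credentials or replication permissions are correct.

```python
import socket

# Common default listener ports; your source may use different ones.
DEFAULT_PORTS = {
    "Oracle": 1521,
    "Microsoft SQL Server": 1433,
    "MySQL": 3306,
    "PostgreSQL": 5432,
}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_connect("source-db.example.internal", DEFAULT_PORTS["PostgreSQL"])` checks a hypothetical PostgreSQL source. If the high availability option is used, the regional DynamoDB endpoint (of the form `dynamodb.<region>.amazonaws.com`, port 443) can be checked the same way.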
The following are the steps to take before launching Ingest in AWS via custom installation on an EC2 instance:
- Create a policy with a relevant name for EC2, such as `FlowEC2Policy`. Refer to Define custom IAM permissions with customer managed policies for creating policies. Refer to AWS Identity and Access Management (IAM) for SingleStore Flow for the JSON policy.
- Create an IAM role called `FlowEC2Role`. Refer to Create a role to delegate permissions to an IAM user for creating roles.
- Attach the `FlowEC2Policy` to the role.
- Create a Lambda policy for disk checks and attach the Lambda policy JSON. Refer to AWS Recovery for the Lambda policy JSON.
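The policy and role steps above can also be scripted. The sketch below assumes a boto3 IAM client is passed in as `iam` (for example, `boto3.client("iam")`) and that `policy_document` holds the JSON policy taken from the referenced Flow IAM documentation; it is not reproduced here. The function name `create_flow_role` is hypothetical, and the trust policy shown is the standard one that allows EC2 to assume a role, not something specific to Flow.

```python
import json

# Standard trust policy allowing EC2 instances to assume the role.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_flow_role(iam, policy_document: dict,
                     role_name: str = "FlowEC2Role",
                     policy_name: str = "FlowEC2Policy") -> str:
    """Create the policy and role, attach one to the other, return the policy ARN.

    `iam` is expected to behave like a boto3 IAM client.
    """
    policy = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_document),
    )
    policy_arn = policy["Policy"]["Arn"]
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(EC2_TRUST_POLICY),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return policy_arn
```

To use the role on an EC2 instance it must also be placed in an instance profile; the AWS console does this automatically when a role is created for EC2, while scripted setups need to do it explicitly.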
The following are the recommended EC2 options for replicating source data volumes.
Total Data Volume | EC2 Recommended |
---|---|
< 100 GB | t2. |
100 GB – 300 GB | t2. |
300 GB – 1 TB | t2. |
> 1 TB | Contact SingleStore Support |
These recommendations serve as a starting point.
The following are the system requirements when not using Amazon EC2:
- Port `8081` must be open on the server hosting the Ingest software.
- Google Chrome is required as the internet browser on the server hosting the Ingest software.
- Java version 21 or higher is required.
- If using Microsoft SQL Server as a source, download and install the BCP utility.
- Ensure connectivity between the server hosting the Ingest software and the source, and to DynamoDB (if the high availability option is required).
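The Java requirement above can be verified before installation. The following sketch parses the output of `java -version` (which the JVM prints to standard error) and compares the major version against 21. The helper names are illustrative, and the parsing assumes the usual `version "..."` banner format.

```python
import re
import subprocess
from typing import Optional

MIN_JAVA_MAJOR = 21  # taken from the requirement listed above

_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as "1.x".
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major(java_cmd: str = "java") -> Optional[int]:
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run([java_cmd, "-version"],
                                capture_output=True, text=True, check=False)
    except OSError:
        return None  # java is not installed or not on the path
    return java_major_version(result.stderr + result.stdout)


def java_is_supported(major: Optional[int]) -> bool:
    """True when a Java version was found and it meets the minimum."""
    return major is not None and major >= MIN_JAVA_MAJOR
```

Calling `java_is_supported(installed_java_major())` on the target server gives a quick yes/no answer.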
Recommended Hardware Configuration
The following describes the recommended hardware configuration for a Windows server, assuming a few source and target combinations (ideally 3 medium ones). A similar configuration is recommended for a Linux or Ubuntu-based server.
Component | Specification |
---|---|
Processor | 4 cores |
Memory | 16 GB |
Disk requirements | Varies based on the data being extracted, with a minimum of 300 GB |
Network performance | High |
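A candidate server can be compared against the table above with a short script. This is a sketch only: the thresholds are copied from the table, the function names are hypothetical, installed memory is passed in by the caller because the Python standard library has no portable way to read it, and the disk figure uses total capacity of the volume containing `path`, which is an assumption about how the 300 GB minimum should be interpreted.

```python
import os
import shutil
from typing import List, Tuple

MIN_CORES = 4
MIN_MEMORY_GB = 16
MIN_DISK_GB = 300


def hardware_shortfalls(cores: int, memory_gb: float, disk_gb: float) -> List[str]:
    """Return a description of each value that falls below the table's figures."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"cores: {cores} < {MIN_CORES}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"memory: {memory_gb} GB < {MIN_MEMORY_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"disk: {disk_gb:.0f} GB < {MIN_DISK_GB} GB")
    return problems


def local_cores_and_disk(path: str = ".") -> Tuple[int, float]:
    """Read the core count and the total size (in GB) of the volume holding path."""
    cores = os.cpu_count() or 0
    disk_gb = shutil.disk_usage(path).total / 10**9
    return cores, disk_gb
```

An empty list from `hardware_shortfalls` means the three checked values meet the listed figures; network performance is not checked.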
Prerequisites for Software on Server
The following software must be installed on the server:
- 64-bit OpenJDK 21: Amazon Corretto 21 JRE
- For SQL Server sources only, install the following tools and drivers:
- For a MySQL server only, install `mysqlbinlog.exe` on the server and include it in the system path.
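Whether a required command-line tool is actually reachable through the system path can be confirmed with `shutil.which`, which on Windows also resolves the `.exe` extension. The sketch below is illustrative; `locate_tools` is a hypothetical helper, and the tool names passed to it are up to the caller.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on the path."""
    return {name: shutil.which(name) for name in names}
```

For a MySQL source, `locate_tools(["mysqlbinlog", "java"])` returns a `None` entry for any tool that still needs to be installed or added to the path.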
Required Skills
Flow is a suite of applications for replicating data to the cloud. Working with it requires the following skills:
- AWS Cloud Fundamentals
- Basic database skills, including writing and executing database queries (for RDBMS endpoints)
- Familiarity with using Microsoft Windows or Linux-based systems
Installation
For details on how to install Ingest and other Flow components, refer to Install SingleStore Flow.
Last modified: February 7, 2025