Important
The SingleStore 9.1 release candidate (RC) gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 9.0 is recommended for production workloads, which can later be upgraded to SingleStore 9.1.
Configure Kinesis Event Notifications for S3 Pipelines
Overview
S3 pipelines discover new files by periodically scanning the bucket with AWS ListObjectsV2 API calls, each of which returns at most 1,000 objects.
With Kinesis-based file discovery:
-
S3 sends event notifications to Amazon EventBridge when files are created.
-
EventBridge routes events to a Kinesis Data Stream.
-
The SingleStore pipeline reads events from the stream and loads files immediately.
-
File discovery latency is reduced to less than approximately 1.5 seconds.
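To make the cost of scan-based discovery concrete, the following minimal sketch estimates how many ListObjectsV2 calls a single full scan needs, based on the 1,000-object page size described above. The function name and the sample object counts are illustrative assumptions, not part of SingleStore or AWS tooling.

```python
import math

# ListObjectsV2 returns at most 1,000 keys per call.
PAGE_SIZE = 1000


def list_calls_per_scan(object_count: int) -> int:
    """Estimate the ListObjectsV2 calls needed to list `object_count` objects once."""
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    # Even an empty listing costs one call to learn that it is empty.
    return max(1, math.ceil(object_count / PAGE_SIZE))


if __name__ == "__main__":
    # Hypothetical bucket sizes, chosen only to show how the call count grows.
    for count in (500, 1_000, 250_000, 10_000_000):
        print(f"{count:>10} objects: {list_calls_per_scan(count)} calls per scan")
```

Because every periodic scan repeats these calls, the listing cost grows with the number of objects, whereas event-based discovery only handles the files that changed.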
Prerequisites
Ensure the following:
-
An AWS account with permissions to create and configure:
-
S3 buckets and event notifications
-
Amazon EventBridge rules
-
Amazon Kinesis Data Streams
-
IAM roles and policies
-
An existing S3 bucket with data files
Step 1: Enable EventBridge Notifications on the S3 Bucket
-
Navigate to Amazon S3 in the AWS Management Console.
-
Select your S3 bucket.
-
Navigate to the Properties tab.
-
Navigate to the Amazon EventBridge section.
-
Select Edit.
-
On the Edit Amazon EventBridge page, select On.
-
Select Save changes.
Step 2: Create a Kinesis Data Stream
Create a Kinesis Data Stream to receive S3 event notifications.
-
Navigate to Amazon Kinesis in the AWS Management Console.
-
Select Create data stream.
-
Configure the stream:
-
Data stream name: Enter the name of your Kinesis data stream.
-
Capacity mode: Select Provisioned.
Provisioned mode suits predictable workloads. Start with 1 shard.
-
Select Create data stream.
-
Copy and securely store the stream ARN.
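As a rough guide to whether one shard is enough, the sketch below estimates a shard count from an expected event rate. It relies on the per-shard write limits of provisioned Kinesis Data Streams (1,000 records per second and 1 MB per second), and it assumes that each S3 event arrives as one Kinesis record. The function name, the event rates, and the average event size are hypothetical inputs that you would replace with your own measurements.

```python
import math

# Per-shard write limits for provisioned Kinesis Data Streams.
MAX_RECORDS_PER_SEC = 1000
# 1 MB/s, treated here as 1,000,000 bytes as a conservative approximation.
MAX_BYTES_PER_SEC = 1_000_000


def shards_needed(events_per_sec: float, avg_event_bytes: int) -> int:
    """Estimate shards needed, assuming one Kinesis record per S3 event."""
    if events_per_sec < 0 or avg_event_bytes <= 0:
        raise ValueError("events_per_sec must be >= 0 and avg_event_bytes > 0")
    by_count = math.ceil(events_per_sec / MAX_RECORDS_PER_SEC)
    by_bytes = math.ceil(events_per_sec * avg_event_bytes / MAX_BYTES_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_count, by_bytes)


if __name__ == "__main__":
    # Hypothetical workloads: (events per second, average event size in bytes).
    for rate, size in ((200, 1500), (2500, 1000), (500, 4000)):
        print(f"{rate} events/s at {size} B: {shards_needed(rate, size)} shard(s)")
```

This is only a write-side estimate. Read throughput and retention settings are not considered here.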
Step 3: Configure IAM Permissions
Configure permissions for the following:
-
EventBridge to write to Kinesis
-
SingleStore to read from Kinesis and S3
Create an IAM Policy for EventBridge to Kinesis Permissions
Provide EventBridge permission to write events to your Kinesis stream.
-
Navigate to IAM > Policies.
-
Select Create Policy.
-
In the Specify Permissions page, select the JSON tab.
-
Paste the following policy and replace the Resource value with your actual Kinesis stream ARN:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kinesis:PutRecord",
        "kinesis:PutRecords"
      ],
      "Resource": "arn:aws:kinesis:<REGION>:<ACCOUNT_ID>:stream/<STREAM_NAME>"
    }
  ]
}
-
Select Next.
-
On the Review and Create page, configure the following:
-
In Policy details, enter the following:
-
Policy Name: Enter the policy name.
-
Description (optional): Enter the description of the policy.
-
Select Create policy.
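If you prefer to generate the policy document rather than edit the placeholders by hand, a minimal sketch such as the following can render it. The helper names are hypothetical. The region, account ID, and stream name are the sample values used elsewhere on this page.

```python
import json


def kinesis_stream_arn(region: str, account_id: str, stream_name: str) -> str:
    """Build a Kinesis stream ARN in the format used by the policy above."""
    return f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"


def eventbridge_put_policy(stream_arn: str) -> dict:
    """Return the EventBridge-to-Kinesis write policy for one stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
                "Resource": stream_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Sample values only; substitute your own region, account, and stream.
    arn = kinesis_stream_arn("us-east-1", "123456789012", "s3-file-notifications")
    print(json.dumps(eventbridge_put_policy(arn), indent=2))
```

The printed JSON can be pasted into the JSON tab described in the steps above.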
Create an IAM Role for EventBridge
-
Navigate to IAM > Roles and select Create role.
-
In the Select trusted entity page, select or enter the following:
-
Trusted entity type: Custom trust policy.
-
Custom trust policy: Add the following trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Select Next.
-
In the Add permissions page, search for the EventBridge to Kinesis policy you created in Create an IAM Policy for EventBridge to Kinesis Permissions.
-
Enable the EventBridge to Kinesis policy and select Next.
-
In the Name, review and create page, enter the following:
-
Role name: Enter the name of the IAM role for EventBridge.
-
Description (optional): Enter the description of the IAM role for EventBridge.
-
Review the role and select Create role.
SingleStore Access to Kinesis and S3
The IAM credentials used by the SingleStore pipeline must include permissions to read from the Kinesis stream in addition to the existing S3 permissions such as s3:GetObject and s3:ListBucket.
Attach the Kinesis read policy to the appropriate IAM principal based on the credential mode used by the pipeline:
| Credential Mode | Required Configuration | IAM Principal |
|---|---|---|
| Static Credentials | | The IAM user associated with the access keys specified in |
| EKS IRSA | | The IAM role associated with the Kubernetes service account through IAM Roles for Service Accounts (IRSA). |
| EKS IRSA + | | The IAM role specified in role_ |
| Static Credentials + | | The IAM role specified in |
Kinesis Read Policy
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "KinesisReadAccess",
"Effect": "Allow",
"Action": [
"kinesis:DescribeStream",
"kinesis:GetRecords",
"kinesis:GetShardIterator",
"kinesis:ListShards"
],
"Resource": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications"
},
{
"Sid": "S3ReadAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket",
"s3:HeadObject"
],
"Resource": [
"arn:aws:s3:::your-bucket-name",
"arn:aws:s3:::your-bucket-name/*"
]
}
]
}
Step 4: Create an EventBridge Rule
-
Navigate to Amazon EventBridge in the AWS Management Console.
-
Select Create rule.
-
On the Create rule page, in Builder mode, select Advanced builder.
-
In Define rule detail, configure the following:
-
Name: Enter the name of the EventBridge rule.
-
Description (optional): Enter the description of the EventBridge rule.
-
Event bus: Select default.
-
Enable the rule on the selected event bus and select Next.
-
On Build event pattern, configure the following:
-
In Events, select the following:
-
Event source: AWS events or EventBridge partner events.
-
In Event pattern, select the following:
-
Creation method: Select Custom pattern (JSON editor).
-
Event pattern: Select Edit pattern.
Enter the following event JSON and replace <YOUR_BUCKET_NAME> with your S3 bucket name:
{
  "source": ["aws.s3"],
  "detail-type": ["Object Created", "Object Deleted"],
  "detail": {
    "bucket": {
      "name": ["<YOUR_BUCKET_NAME>"]
    }
  }
}
-
In Select target(s), configure the following:
-
Target type: Select AWS service.
-
Select a Target: Select Kinesis stream.
-
Target location: Select from the following:
-
Target in this location
-
Target in another AWS account
-
Stream: Based on the selected target location, enter or select the following:
-
If Target in this location is selected, select your Kinesis stream.
-
If Target in another AWS account is selected, enter the stream ARN.
-
Partition key (optional): Enter JSON path as partition key.
-
Execution role: Select from the following:
-
Create a new role for this specific resource
-
Use existing role
-
Role Name: Based on the selected execution role, enter or select the following:
-
If Create a new role for this specific resource is selected, enter the name of the role.
-
If Use existing role is selected, select the role name.
Select Next.
-
(Optional) In Configure tags, select Add new tag.
Enter Key and Value.
Select Next.
-
In Review and Create, review the configurations and select Create rule.
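Before relying on the rule, it can help to check locally which events the pattern is meant to accept. The following is a minimal sketch of exact-value matching only: lists in the pattern are treated as allowed values and nested objects are matched recursively. It is not EventBridge's full matching logic (prefix, numeric, and array-valued matching are omitted), and the sample event is an assumed, abbreviated shape of an S3 "Object Created" event.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Exact-value subset of event pattern matching (illustrative only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event must have a matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif isinstance(expected, list):
            # Leaf pattern: the event value must equal one of the listed values.
            if actual not in expected:
                return False
        else:
            raise ValueError(f"pattern value for {key!r} must be a list or an object")
    return True


# The event pattern from the step above, with a sample bucket name filled in.
PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created", "Object Deleted"],
    "detail": {"bucket": {"name": ["your-bucket-name"]}},
}

# Assumed, abbreviated event shape; real events carry additional fields.
SAMPLE_EVENT = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "your-bucket-name"},
        "object": {"key": "data/file.json"},
    },
}

if __name__ == "__main__":
    print(matches(PATTERN, SAMPLE_EVENT))
```

Fields that appear in the event but not in the pattern, such as the object key here, do not affect the result.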
Step 5: Test the Configuration
Verify that events flow correctly before creating the pipeline.
Upload a Test File
echo "test data" > test-file.txt
aws s3 cp test-file.txt s3://your-bucket-name/test-file.txt
Verify Events in Kinesis
Get a shard iterator:
SHARD_ITERATOR=$(aws kinesis get-shard-iterator \
  --stream-name <your-stream-name> \
  --shard-id shardId-000000000000 \
  --shard-iterator-type TRIM_HORIZON \
  --query 'ShardIterator' \
  --output text \
  --region us-east-1)
Read records:
aws kinesis get-records \
  --shard-iterator "$SHARD_ITERATOR" \
  --region us-east-1
Verify that an event appears for the uploaded file.
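The AWS CLI prints each record's Data field base64-encoded, so the event is not directly readable in the get-records output. The following minimal sketch decodes that output and lists the created objects. It assumes the record payload is the unmodified EventBridge event JSON (no input transformer on the rule target). The function names are hypothetical.

```python
import base64
import json


def decode_records(get_records_output: str) -> list:
    """Decode the JSON printed by `aws kinesis get-records` into event dicts."""
    response = json.loads(get_records_output)
    events = []
    for record in response.get("Records", []):
        # The CLI returns the record payload as a base64-encoded string.
        payload = base64.b64decode(record["Data"])
        events.append(json.loads(payload))
    return events


def created_objects(events: list) -> list:
    """Return (bucket, key) pairs for "Object Created" events."""
    return [
        (event["detail"]["bucket"]["name"], event["detail"]["object"]["key"])
        for event in events
        if event.get("detail-type") == "Object Created"
    ]


if __name__ == "__main__":
    import sys

    # Usage (assumed file name): aws kinesis get-records ... | python decode_records.py
    for bucket, key in created_objects(decode_records(sys.stdin.read())):
        print(f"s3://{bucket}/{key}")
```

An empty Records list does not necessarily mean that no events were delivered. A single get-records call can return no records, in which case you can repeat the call with the NextShardIterator value from the response.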
Verify EventBridge Activity
-
Navigate to Amazon EventBridge > Rules.
-
Select your EventBridge rule.
-
Select the Monitoring tab.
-
Verify that the Invocations metric shows activity.
Step 6: Create the SingleStore Pipeline
Create the S3 pipeline with Kinesis-based file discovery enabled.
Example 1: Using Static Credentials
CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
"region": "us-east-1",
"file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications"
}'
CREDENTIALS '{
"aws_access_key_id": "YOUR_ACCESS_KEY",
"aws_secret_access_key": "YOUR_SECRET_KEY"
}'
INTO TABLE my_table
FORMAT JSON;
After creating the pipeline, start the pipeline.
Example 2: Using IAM Roles (EKS IRSA)
CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
"region": "us-east-1",
"file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications",
"role_arn": "arn:aws:iam::123456789012:role/singlestore-pipeline-role"
}'
INTO TABLE my_table
FORMAT JSON;
After creating the pipeline, start the pipeline.
Example 3: Using Cross-Account Role Assumption
CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
"region": "us-east-1",
"file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:ACCOUNT_A:stream/s3-file-notifications",
"role_arn": "arn:aws:iam::ACCOUNT_A:role/cross-account-pipeline-role"
}'
CREDENTIALS '{
"aws_access_key_id": "YOUR_ACCESS_KEY",
"aws_secret_access_key": "YOUR_SECRET_KEY"
}'
INTO TABLE my_table
FORMAT JSON;
After creating the pipeline, start the pipeline.
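The three examples differ mainly in the CONFIG JSON. As a hedged sketch, the helper below renders that JSON from the keys shown in the examples (region, file_notifications_kinesis_stream_arn, and the optional role_arn) and rejects a value that does not look like a Kinesis stream ARN. The function name and the ARN check are assumptions for illustration. SingleStore performs its own validation when the pipeline is created.

```python
import json
import re

# Loose shape check for a Kinesis stream ARN (illustrative, not exhaustive).
_STREAM_ARN = re.compile(r"^arn:[^:]+:kinesis:[^:]+:[^:]+:stream/.+$")


def pipeline_config(region: str, stream_arn: str, role_arn: str = None) -> str:
    """Render the CONFIG JSON used in the CREATE PIPELINE examples above."""
    if not _STREAM_ARN.match(stream_arn):
        raise ValueError(f"not a Kinesis stream ARN: {stream_arn!r}")
    config = {
        "region": region,
        "file_notifications_kinesis_stream_arn": stream_arn,
    }
    # role_arn appears only in the role-based examples.
    if role_arn is not None:
        config["role_arn"] = role_arn
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Sample values taken from Example 2.
    print(
        pipeline_config(
            "us-east-1",
            "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications",
            role_arn="arn:aws:iam::123456789012:role/singlestore-pipeline-role",
        )
    )
```

The output is meant to be placed between the single quotes of the CONFIG clause. Values that contain single quotes or backslashes would need SQL escaping, which this sketch does not handle.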