Important

The SingleStore 9.1 release candidate (RC) gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 9.0 is recommended for production workloads, which can later be upgraded to SingleStore 9.1.

Configure Kinesis Event Notifications for S3 Pipelines

Overview

S3 pipelines discover new files by periodically scanning the bucket with AWS ListObjectsV2 API calls, each of which returns at most 1,000 objects. For buckets that contain a large number of files, a full scan can take several minutes.
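
To get a feel for the scanning cost, note that the number of list calls grows linearly with the object count. The following Python sketch computes the minimum number of sequential ListObjectsV2 calls needed to enumerate a bucket, assuming the maximum page size of 1,000 keys. The function name and the sample object count are illustrative and are not part of SingleStore or AWS.

```python
def list_calls_needed(object_count: int, page_size: int = 1000) -> int:
    """Minimum number of paginated list calls needed to enumerate a bucket."""
    if object_count < 0 or page_size <= 0:
        raise ValueError("object_count must be >= 0 and page_size must be > 0")
    # Ceiling division without floating point; even an empty bucket
    # needs one call to learn that there is nothing to list.
    return max(1, -(-object_count // page_size))


if __name__ == "__main__":
    # Hypothetical bucket holding 5 million objects.
    print(list_calls_needed(5_000_000))  # 5000
```

Because each page depends on the continuation token returned by the previous one, these calls cannot be issued in parallel for a single prefix, which is why large buckets take minutes to scan.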

With Kinesis-based file discovery:

  1. S3 sends event notifications to Amazon EventBridge when files are created.

  2. EventBridge routes events to a Kinesis Data Stream.

  3. The SingleStore pipeline reads events from the stream and loads files immediately.

  4. File discovery latency is reduced to less than approximately 1.5 seconds.

Prerequisites

Ensure that you have the following:

  • An AWS account with permissions to create and configure:

    • S3 buckets and event notifications

    • Amazon EventBridge rules

    • Amazon Kinesis Data Streams

    • IAM roles and policies

  • An existing S3 bucket with data files

Step 1: Enable EventBridge Notifications on the S3 Bucket

  1. Navigate to Amazon S3 in the AWS Management Console.

  2. Select your S3 bucket.

  3. Navigate to the Properties tab.

  4. Navigate to the Amazon EventBridge section.

  5. Select Edit.

  6. On the Edit Amazon EventBridge page, select On.

  7. Select Save changes.

Step 2: Create a Kinesis Data Stream

Create a Kinesis Data Stream to receive S3 event notifications.

  1. Navigate to Amazon Kinesis in the AWS Management Console.

  2. Select Create data stream.

  3. Configure the stream:

    • Data stream name: Enter the name of your Kinesis data stream.

    • Capacity mode: Select Provisioned, which suits predictable workloads. Start with 1 shard.

  4. Select Create data stream.

  5. Copy and securely store the stream ARN.

Step 3: Configure IAM Permissions

Configure permissions for the following:

  1. EventBridge to write to Kinesis

  2. SingleStore to read from Kinesis and S3

Create an IAM Policy for EventBridge to Kinesis Permissions

Provide EventBridge permission to write events to your Kinesis stream.

  1. Navigate to IAM > Policies.

  2. Select Create Policy.

  3. In the Specify Permissions page, select the JSON tab.

    1. Paste the following policy and replace the Resource with your actual Kinesis Stream ARN:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "kinesis:PutRecord",
              "kinesis:PutRecords"
            ],
            "Resource": "arn:aws:kinesis:<REGION>:<ACCOUNT_ID>:stream/<STREAM_NAME>"
          }
        ]
      }

    2. Select Next.

  4. On the Review and Create page, configure the following:

    1. In Policy details, enter the following:

      1. Policy Name: Enter the policy name.

      2. Description (optional): Enter the description of the policy.

  5. Select Create policy.
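
The policy above differs between deployments only in its Resource ARN, so it can be generated rather than edited by hand. The following Python sketch builds the same policy document from a region, account ID, and stream name. The function name and the sample values are placeholders chosen for illustration.

```python
import json


def eventbridge_to_kinesis_policy(region: str, account_id: str, stream_name: str) -> dict:
    """Return an IAM policy that allows putting records on one Kinesis stream."""
    stream_arn = f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
                "Resource": stream_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Sample values only; substitute your own region, account, and stream.
    policy = eventbridge_to_kinesis_policy(
        "us-east-1", "123456789012", "s3-file-notifications"
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the JSON tab of the policy editor in place of the template.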

Create an IAM Role for EventBridge

  1. Navigate to IAM > Roles and select Create role.

  2. On the Select trusted entity page, select or enter the following:

    1. Trusted entity type: Custom trust policy.

    2. Custom trust policy: Add the following trust policy:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Service": "events.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
          }
        ]
      }

    Select Next.

  3. In the Add permissions page, search for the EventBridge to Kinesis policy you created in Create an IAM Policy for EventBridge to Kinesis Permissions.

  4. Enable the EventBridge to Kinesis policy and select Next.

  5. In the Name, review and create page, enter the following:

    1. Role name: Enter the name of the IAM role for EventBridge.

    2. Description (optional): Enter the description of the IAM role for EventBridge.

  6. Review the role and select Create role.

SingleStore Access to Kinesis and S3

The IAM credentials used by the SingleStore pipeline must include permissions to read from the Kinesis stream in addition to the existing S3 permissions such as s3:GetObject and s3:ListBucket.

Attach the Kinesis read policy to the appropriate IAM principal based on the credential mode used by the pipeline:

| Credential Mode | Required Configuration | IAM Principal |
| --- | --- | --- |
| Static Credentials | aws_access_key_id, aws_secret_access_key | The IAM user associated with the access keys specified in CREDENTIALS. |
| EKS IRSA | creds_mode: "eks_irsa" | The IAM role associated with the Kubernetes service account through IAM Roles for Service Accounts (IRSA). |
| EKS IRSA + AssumeRole | creds_mode: "eks_irsa", role_arn | The IAM role specified in role_arn. The IRSA role must have sts:AssumeRole permission on the target role, and the target role must include the required Kinesis permissions. |
| Static Credentials + AssumeRole | aws_access_key_id, aws_secret_access_key, role_arn | The IAM role specified in role_arn. |
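
The four credential modes reduce to a short decision: a role_arn always takes precedence, then IRSA, then static keys. The following Python sketch expresses that decision. It assumes a hypothetical `settings` dictionary that merges the keys from the pipeline's CONFIG and CREDENTIALS clauses; the function name and return strings are illustrative only.

```python
def principal_needing_kinesis_policy(settings: dict) -> str:
    """Describe which IAM principal must carry the Kinesis read policy."""
    role_arn = settings.get("role_arn")
    if role_arn:
        # With AssumeRole, the assumed (target) role needs the permissions,
        # whether the caller is an IRSA role or an IAM user.
        return f"IAM role {role_arn}"
    if settings.get("creds_mode") == "eks_irsa":
        return "IAM role bound to the Kubernetes service account (IRSA)"
    if "aws_access_key_id" in settings and "aws_secret_access_key" in settings:
        return "IAM user that owns the access keys"
    raise ValueError("no recognized credential settings found")


if __name__ == "__main__":
    print(principal_needing_kinesis_policy({"creds_mode": "eks_irsa"}))
```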

Kinesis Read Policy

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Sid": "KinesisReadAccess",
     "Effect": "Allow",
     "Action": [
       "kinesis:DescribeStream",
       "kinesis:GetRecords",
       "kinesis:GetShardIterator",
       "kinesis:ListShards"
     ],
     "Resource": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications"
   },
   {
     "Sid": "S3ReadAccess",
     "Effect": "Allow",
     "Action": [
       "s3:GetObject",
       "s3:ListBucket",
       "s3:HeadObject"
     ],
     "Resource": [
       "arn:aws:s3:::your-bucket-name",
       "arn:aws:s3:::your-bucket-name/*"
     ]
   }
 ]
}
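
Before attaching a policy, it can help to confirm that it grants every Kinesis action listed above on the intended stream. The following Python sketch performs that check on a policy document loaded as a dictionary. It is a deliberately simplified comparison, not a reimplementation of IAM policy evaluation: actions and resources are matched literally, so wildcards, Deny statements, and conditions are ignored. The function and constant names are illustrative.

```python
REQUIRED_KINESIS_ACTIONS = {
    "kinesis:DescribeStream",
    "kinesis:GetRecords",
    "kinesis:GetShardIterator",
    "kinesis:ListShards",
}


def _as_list(value):
    """IAM accepts a single item wherever a list is allowed."""
    return value if isinstance(value, list) else [value]


def missing_kinesis_actions(policy: dict, stream_arn: str) -> set:
    """Return required actions that no Allow statement grants on stream_arn."""
    granted = set()
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if stream_arn in _as_list(statement.get("Resource", [])):
            granted.update(_as_list(statement.get("Action", [])))
    return REQUIRED_KINESIS_ACTIONS - granted


if __name__ == "__main__":
    arn = "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications"
    incomplete = {
        "Version": "2012-10-17",
        "Statement": {
            "Effect": "Allow",
            "Action": ["kinesis:GetRecords", "kinesis:GetShardIterator"],
            "Resource": arn,
        },
    }
    print(sorted(missing_kinesis_actions(incomplete, arn)))
```

An empty result means that all four actions are granted on the stream under the literal-match assumption.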

Step 4: Create an EventBridge Rule

  1. Navigate to Amazon EventBridge in the AWS Management Console.

  2. Select Create rule.

  3. On the Create rule page, in Builder mode, select Advanced builder.

  4. In Define rule detail, configure the following:

    1. Name: Enter the name of the EventBridge rule.

    2. Description (optional): Enter the description of the EventBridge rule.

    3. Event bus: Select default.

  5. Enable the rule on the selected event bus and select Next.

  6. On Build event pattern, configure the following:

    1. In Events, select the following:

      1. Event source: AWS events or EventBridge partner events.

    2. In Event pattern, select the following:

      1. Creation method: Select Custom pattern (JSON editor).

      2. Event pattern: Select Edit pattern. Enter the following event JSON and replace <YOUR_BUCKET_NAME> with your S3 bucket:

        {
          "source": ["aws.s3"],
          "detail-type": [
            "Object Created",
            "Object Deleted"
          ],
          "detail": {
            "bucket": {
              "name": ["<YOUR_BUCKET_NAME>"]
            }
          }
        }

  7. In Select target(s), configure the following:

    1. Target type: Select AWS service.

    2. Select a Target: Select Kinesis stream.

    3. Target location: Select from the following:

      1. Target in this location

      2. Target in another AWS account

    4. Stream: Based on the selected target location, enter or select the following:

      1. If Target in this location is selected, select your Kinesis stream.

      2. If Target in another AWS account is selected, enter the stream ARN.

    5. Partition key (optional): Enter a JSON path to use as the partition key.

    6. Execution role: Select from the following:

      1. Create a new role for this specific resource

      2. Use existing role

    7. Role Name: Based on the selected execution role, enter or select the following:

      1. If Create a new role for this specific resource is selected, enter the name of the role.

      2. If Use existing role is selected, select the role name, for example the EventBridge role created in Step 3.

      Select Next.

  8. (Optional) In Configure tags, select Add new tag. Enter Key and Value. Select Next.

  9. In Review and Create, review the configurations and select Create rule.
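
To see which events the pattern in this step selects, the following Python sketch applies exact-value matching to a sample event. It is a simplified model written for illustration: it covers only the nested-object and list-of-values forms used in the pattern above, not the other EventBridge operators such as prefix or anything-but. The sample event is abbreviated (real S3 events carry additional fields), and the bucket name and object key are placeholders.

```python
PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created", "Object Deleted"],
    "detail": {"bucket": {"name": ["your-bucket-name"]}},
}

SAMPLE_EVENT = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "your-bucket-name"},
        "object": {"key": "data/test-file.txt"},
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # A list in the pattern means "any one of these values".
            return False
    return True


if __name__ == "__main__":
    print(matches(PATTERN, SAMPLE_EVENT))  # True
```

Fields that the pattern does not mention, such as the object key, do not affect the result, so every created or deleted object in the bucket is forwarded to the stream.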

Step 5: Test the Configuration

Verify that events flow correctly before creating the pipeline.

Upload a Test File

echo "test data" > test-file.txt
aws s3 cp test-file.txt s3://your-bucket-name/test-file.txt

Verify Events in Kinesis

Get a shard iterator:

SHARD_ITERATOR=$(aws kinesis get-shard-iterator \
  --stream-name <your-stream-name> \
  --shard-id shardId-000000000000 \
  --shard-iterator-type TRIM_HORIZON \
  --query 'ShardIterator' \
  --output text \
  --region us-east-1)

Read records:

aws kinesis get-records \
  --shard-iterator "$SHARD_ITERATOR" \
  --region us-east-1

Verify that an event appears for the uploaded file. The Data field of each returned record is Base64-encoded, so decode it to inspect the event.
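
Because the AWS CLI prints each record's Data field as Base64 text, the event is not readable at a glance. The following Python sketch decodes the JSON output of get-records and extracts the event type, bucket, and object key from each record. It assumes that the EventBridge rule forwards the full, untransformed event; the function name and the sample record are illustrative.

```python
import base64
import json


def s3_events_from_records(get_records_output: dict) -> list:
    """Decode Kinesis records into (detail-type, bucket, key) tuples."""
    events = []
    for record in get_records_output.get("Records", []):
        # Data is Base64 text in the CLI's JSON output.
        payload = json.loads(base64.b64decode(record["Data"]))
        detail = payload.get("detail", {})
        events.append(
            (
                payload.get("detail-type"),
                detail.get("bucket", {}).get("name"),
                detail.get("object", {}).get("key"),
            )
        )
    return events


# Synthetic stand-in for the CLI output, built from an abbreviated event.
_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {"bucket": {"name": "your-bucket-name"}, "object": {"key": "test-file.txt"}},
}
SAMPLE_OUTPUT = {
    "Records": [{"Data": base64.b64encode(json.dumps(_event).encode()).decode()}]
}

if __name__ == "__main__":
    print(s3_events_from_records(SAMPLE_OUTPUT))
```

To use the function on real output, save the result of the get-records command to a file and load it with json.load before passing it in.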

Verify EventBridge Activity

  1. Navigate to Amazon EventBridge > Rules.

  2. Select your EventBridge rule.

  3. Select the Monitoring tab.

  4. Verify that the Invocations metric shows activity.

Step 6: Create the SingleStore Pipeline

Create the S3 pipeline with Kinesis-based file discovery enabled. The following examples demonstrate how to create a SingleStore pipeline with different credential modes.

Example 1: Using Static Credentials

CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
 "region": "us-east-1",
 "file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications"
}'
CREDENTIALS '{
 "aws_access_key_id": "YOUR_ACCESS_KEY",
 "aws_secret_access_key": "YOUR_SECRET_KEY"
}'
INTO TABLE my_table
FORMAT JSON;

After creating the pipeline, start it with the START PIPELINE fast_s3_pipeline; statement.

Example 2: Using IAM Roles (EKS IRSA)

CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
 "region": "us-east-1",
 "file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications",
 "role_arn": "arn:aws:iam::123456789012:role/singlestore-pipeline-role"
}'
INTO TABLE my_table
FORMAT JSON;

After creating the pipeline, start it with the START PIPELINE fast_s3_pipeline; statement.

Example 3: Using Cross-Account Role Assumption

CREATE PIPELINE fast_s3_pipeline AS
LOAD DATA S3 'your-bucket-name/data/'
CONFIG '{
 "region": "us-east-1",
 "file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:ACCOUNT_A:stream/s3-file-notifications",
 "role_arn": "arn:aws:iam::ACCOUNT_A:role/cross-account-pipeline-role"
}'
CREDENTIALS '{
 "aws_access_key_id": "YOUR_ACCESS_KEY",
 "aws_secret_access_key": "YOUR_SECRET_KEY"
}'
INTO TABLE my_table
FORMAT JSON;

After creating the pipeline, start it with the START PIPELINE fast_s3_pipeline; statement.
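
The three examples share the same statement shape and differ only in the contents of CONFIG and in whether a CREDENTIALS clause is present. The following Python sketch renders that shape from dictionaries, which can help keep the JSON well formed when the statement is generated by a script. The function name is hypothetical, only keys that appear in the examples above are used, and the sketch does not implement SQL escaping: it rejects values that would need it, and identifiers are inserted as-is, so pass only trusted names.

```python
import json
from typing import Optional


def create_pipeline_sql(
    name: str,
    s3_path: str,
    table: str,
    config: dict,
    credentials: Optional[dict] = None,
) -> str:
    """Render a CREATE PIPELINE statement in the shape used by the examples."""

    def quoted_json(value: dict) -> str:
        text = json.dumps(value, indent=1)
        if "'" in text or "\\" in text:
            raise ValueError("value needs SQL escaping that this sketch does not implement")
        return f"'{text}'"

    clauses = [
        f"CREATE PIPELINE {name} AS",
        f"LOAD DATA S3 '{s3_path}'",
        f"CONFIG {quoted_json(config)}",
    ]
    if credentials is not None:
        clauses.append(f"CREDENTIALS {quoted_json(credentials)}")
    clauses += [f"INTO TABLE {table}", "FORMAT JSON;"]
    return "\n".join(clauses)


if __name__ == "__main__":
    print(
        create_pipeline_sql(
            "fast_s3_pipeline",
            "your-bucket-name/data/",
            "my_table",
            {
                "region": "us-east-1",
                "file_notifications_kinesis_stream_arn": "arn:aws:kinesis:us-east-1:123456789012:stream/s3-file-notifications",
            },
        )
    )
```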
