S3 Pipeline Errors

For more information about creating pipelines with S3, refer to Load Data from Amazon Web Services (AWS) S3.

S3 Authentication Errors

You may receive authentication errors if you attempt to create an S3 pipeline without providing credentials or if the provided credentials are invalid.

NoCredentialProviders: no valid providers in chain.

This error is caused by one or more of the following conditions:

  • No CREDENTIALS were specified in the CREATE PIPELINE statement or the JSON was malformed.

  • An IAM role was specified, but your EC2 instance was not configured with an instance profile.

"aws_access_key_id" specified, but not "aws_secret_access_key"

This error occurs when the aws_secret_access_key key is missing from the CREDENTIALS JSON of your CREATE PIPELINE statement, or when the JSON key is malformed.

"aws_secret_access_key" specified, but not "aws_access_key_id"

This error occurs when the aws_access_key_id key is missing from the CREDENTIALS JSON of your CREATE PIPELINE statement, or when the JSON is malformed.
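
For reference, a CREDENTIALS clause that avoids both of these errors carries the two keys together in well-formed JSON. The statement below is only a sketch: the pipeline name, bucket, region, table, and key values are hypothetical placeholders, not values taken from this page.

```sql
-- Hypothetical names: my_s3_pipeline, my-bucket-name, my_table.
-- Both keys must be present, spelled exactly as shown, and enclosed in valid JSON.
CREATE PIPELINE my_s3_pipeline
AS LOAD DATA S3 'my-bucket-name'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<your_access_key_id>", "aws_secret_access_key": "<your_secret_access_key>"}'
INTO TABLE my_table
FIELDS TERMINATED BY ',';
```

Omitting either key, or misquoting the JSON, should produce one of the two errors described above.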

InvalidAccessKeyID: The access key ID you provided does not exist in our records

This error is caused by specifying an Access Key ID that does not exist.

SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method

This error is caused by specifying an invalid combination of an Access Key ID and a Secret Access Key.

High Memory Usage for S3 Pipeline

Over time, the memory used by the md_extractors_offsets table of an S3 pipeline may increase. If the increase continues, it can impact performance and eventually lead to out-of-memory conditions. To clear the data in this table, use the optional ENABLE OFFSETS METADATA GC clause.

By default, the pipeline garbage collector (GC) for S3 is not enabled. To enable garbage collection on a new pipeline, add ENABLE OFFSETS METADATA GC to the CREATE PIPELINE statement. To enable garbage collection on an existing pipeline, use the ALTER PIPELINE statement with the ENABLE OFFSETS METADATA GC clause.

For details, see the S3 Pipeline Using Metadata Garbage Collection (GC) section in the CREATE PIPELINE or ALTER PIPELINE topic.
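
The two ways of enabling garbage collection might look like the following. This is a minimal sketch: the pipeline, bucket, and table names are hypothetical, and the position of the clause within CREATE PIPELINE (after CREDENTIALS and before INTO TABLE) is an assumption that should be confirmed against the CREATE PIPELINE topic.

```sql
-- New pipeline: enable offsets metadata GC at creation time.
-- Clause placement is assumed; names and credentials are placeholders.
CREATE PIPELINE my_s3_pipeline
AS LOAD DATA S3 'my-bucket-name'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<your_access_key_id>", "aws_secret_access_key": "<your_secret_access_key>"}'
ENABLE OFFSETS METADATA GC
INTO TABLE my_table;

-- Existing pipeline: enable offsets metadata GC after the fact.
ALTER PIPELINE my_s3_pipeline ENABLE OFFSETS METADATA GC;
```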

To check the memory usage, use the query below:

SELECT * FROM information_schema.INTERNAL_TABLE_STATISTICS WHERE table_name LIKE "md_extractors_offsets" ORDER BY memory_use DESC;
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
| DATABASE_NAME | TABLE_NAME            | ORDINAL | HOST      | PORT | NODE_TYPE  | PARTITION_TYPE | ROWS | MEMORY_USE | STORAGE_TYPE      | ROWS_IN_MEMORY |
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
| ticket_test   | md_extractors_offsets |    NULL | 127.0.0.1 | 3306 | Aggregator | Reference      |    2 |     524544 | INTERNAL_METADATA |              2 |
| ticket_test   | md_extractors_offsets |    NULL | 127.0.0.1 | 3307 | Leaf       | Reference      |    2 |     524544 | INTERNAL_METADATA |              2 |
| ticket_test   | md_extractors_offsets |       1 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       7 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       6 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       5 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       4 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       3 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       2 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       0 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
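
When a cluster has many databases or partitions, the per-partition rows can be hard to scan. As a sketch that uses only the columns shown in the output above, the same view can be aggregated per database; the alias total_memory_use is illustrative.

```sql
-- Sum the memory used by md_extractors_offsets across all nodes and partitions,
-- grouped by database, largest consumers first.
SELECT database_name,
       SUM(memory_use) AS total_memory_use
FROM information_schema.INTERNAL_TABLE_STATISTICS
WHERE table_name = 'md_extractors_offsets'
GROUP BY database_name
ORDER BY total_memory_use DESC;
```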

Last modified: October 8, 2024

