S3 Pipeline Errors

For more information about creating pipelines with S3, refer to Load Data from Amazon Web Services (AWS) S3.

S3 Authentication Errors

You may receive authentication errors if you attempt to create an S3 pipeline without providing credentials or if the provided credentials are invalid.

NoCredentialProviders: no valid providers in chain.

This error is caused by one or more of the following conditions:

  • No CREDENTIALS clause was specified in the CREATE PIPELINE statement, or the CREDENTIALS JSON was malformed.

  • An IAM role was specified, but your EC2 instance was not configured with an instance profile.

"aws_access_key_id" specified, but not "aws_secret_access_key"

This error occurs when the aws_secret_access_key key is missing from the CREDENTIALS JSON of your CREATE PIPELINE statement, or when the JSON is malformed.

"aws_secret_access_key" specified, but not "aws_access_key_id"

This error occurs when the aws_access_key_id key is missing from the CREDENTIALS JSON of your CREATE PIPELINE statement, or when the JSON is malformed.
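
For reference against the errors above, a well-formed key-based CREDENTIALS clause might look like the following sketch. The pipeline name, bucket path, region, table name, and key values are placeholders and do not come from this article. The FIELDS clause assumes comma-separated input.

```sql
-- Hypothetical names: my_s3_pipeline, my-bucket/data/, my_table.
-- Both keys must be present, and the JSON must be valid:
-- double-quoted keys and values, no trailing comma.
CREATE PIPELINE my_s3_pipeline AS
LOAD DATA S3 'my-bucket/data/'
CONFIG '{"region": "us-east-1"}'
CREDENTIALS '{"aws_access_key_id": "<your_access_key_id>", "aws_secret_access_key": "<your_secret_access_key>"}'
INTO TABLE my_table
FIELDS TERMINATED BY ',';
```

If either key is omitted or misspelled, the statement is expected to fail with one of the "specified, but not" errors described above.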

InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records

This error is caused by specifying an Access Key ID that does not exist.

SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method

This error occurs when the Secret Access Key does not match the specified Access Key ID.

High Memory Usage for S3 Pipeline

Over time, an S3 pipeline can increase the memory used by the md_extractors_offsets table. If this growth continues, it can degrade performance and eventually lead to out-of-memory conditions. To clear the data in this table, use the optional ENABLE OFFSETS METADATA GC clause.

By default, the pipeline garbage collector (GC) for S3 is not enabled. To enable pipeline garbage collection on a new pipeline, add ENABLE OFFSETS METADATA GC to the CREATE PIPELINE statement. To enable it on an existing pipeline, use the ALTER PIPELINE statement with the ENABLE OFFSETS METADATA GC clause.
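
The two statements described above can be sketched as follows. The pipeline, bucket, and table names are placeholders, and the position of the clause within CREATE PIPELINE (after the S3 configuration, before INTO TABLE) is an assumption that should be checked against the CREATE PIPELINE syntax reference.

```sql
-- New pipeline: enable offsets metadata GC at creation time.
-- Hypothetical names; the clause position shown here is an assumption.
CREATE PIPELINE my_s3_pipeline AS
LOAD DATA S3 'my-bucket/data/'
CREDENTIALS '{"aws_access_key_id": "<your_access_key_id>", "aws_secret_access_key": "<your_secret_access_key>"}'
ENABLE OFFSETS METADATA GC
INTO TABLE my_table;

-- Existing pipeline: enable offsets metadata GC after the fact.
ALTER PIPELINE my_s3_pipeline ENABLE OFFSETS METADATA GC;
```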

For details, see the S3 Pipeline Using Metadata Garbage Collection (GC) section in the CREATE PIPELINE or ALTER PIPELINE topic.

To check the memory used by the md_extractors_offsets table, run the following query:

SELECT * FROM information_schema.INTERNAL_TABLE_STATISTICS WHERE table_name LIKE "md_extractors_offsets" ORDER BY memory_use DESC;
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
| DATABASE_NAME | TABLE_NAME            | ORDINAL | HOST      | PORT | NODE_TYPE  | PARTITION_TYPE | ROWS | MEMORY_USE | STORAGE_TYPE      | ROWS_IN_MEMORY |
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
| ticket_test   | md_extractors_offsets |    NULL | 127.0.0.1 | 3306 | Aggregator | Reference      |    2 |     524544 | INTERNAL_METADATA |              2 |
| ticket_test   | md_extractors_offsets |    NULL | 127.0.0.1 | 3307 | Leaf       | Reference      |    2 |     524544 | INTERNAL_METADATA |              2 |
| ticket_test   | md_extractors_offsets |       1 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       7 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       6 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       5 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       4 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       3 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       2 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
| ticket_test   | md_extractors_offsets |       0 | 127.0.0.1 | 3307 | Leaf       | Master         |    0 |          0 | INTERNAL_METADATA |              0 |
+---------------+-----------------------+---------+-----------+------+------------+----------------+------+------------+-------------------+----------------+
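
When a workspace has many databases or partitions, a per-database total may be easier to track over time than the per-partition rows above. The following is a minimal sketch that uses only the columns visible in the output above; the aliases total_memory_use and total_rows are illustrative.

```sql
-- Sum the memory used by md_extractors_offsets per database.
-- ROWS is quoted with backticks because it can be treated as a keyword.
SELECT DATABASE_NAME,
       SUM(MEMORY_USE) AS total_memory_use,
       SUM(`ROWS`) AS total_rows
FROM information_schema.INTERNAL_TABLE_STATISTICS
WHERE TABLE_NAME = 'md_extractors_offsets'
GROUP BY DATABASE_NAME
ORDER BY total_memory_use DESC;
```

Because the Reference rows are reported once per node, the totals count each node's copy separately. Running the query periodically and comparing the results is one way to confirm whether the table is actually growing.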

Last modified: October 8, 2024
