# Configuration Options for Different Sources

SingleStore supports a number of configuration options for different sources. These options can be used with the `CONFIG` clause in the [CREATE PIPELINE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/pipelines-commands/create-pipeline/#UUID-6166b957-3476-1e7b-46ae-c04322557883.md) command.

## Kafka Configurations

The following table shows the SingleStore-specific configurations for a Kafka environment.

| Parameter                     | Description                                                                                                                                                                                                                                               |
| ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `spoof.dns`                   | Used when connecting to Kafka through a proxy, for example, when connecting across multiple cloud services. Use `spoof.dns` to re-route the connections to the proxy without modifying the Kafka broker configuration. |
| `operation.timeout.ms`        | Specifies a timeout for operations such as metadata requests and message consumption/production. This value can be adjusted based on the size of the consumed/produced dataset. **Default**: 10 seconds. Example: `CONFIG '{"operation.timeout.ms" : "10000"}'` |
| `sasl.kerberos.cache`         | Used with Kerberos authentication to specify where to cache Kerberos tickets. When this value is not specified, a `"sasl.tmpdir"` + `/pipeline_digest` location is used. **Default** `sasl.tmpdir`: `/tmp`. Example: `CONFIG '{"sasl.kerberos.cache" : "/tmp"}'` |
| `sasl.kerberos.disable.kinit` | Use this parameter if the client does not support `kinit` and refresh tokens with SingleStore. Running `kinit` is not required if a background process keeps the Kerberos ticket cache up to date. Example: `CONFIG '{"sasl.kerberos.disable.kinit" : true}'` |

The `CONFIG` clause of a Kafka pipeline can accept a `spoof.dns` element as an alternative to configuring Kafka brokers. The `spoof.dns` element must be a JSON object consisting of an arbitrary number of key-value pairs with URL string values. When the pipeline attempts to connect to a Kafka broker whose URL matches one of the keys, it connects to the corresponding URL value instead, effectively remapping the broker URLs inside the pipeline's Kafka client.

The following `CREATE PIPELINE` command sets the AWS PrivateLink configuration for Kafka brokers on AWS MSK.

```sql
CREATE PIPELINE <pipeline_name> AS LOAD DATA KAFKA '<Kafka bootstrap server endpoint>:<port>/<topic name>'
CONFIG '{
  "spoof.dns": {
    "<broker 1 endpoint>:<port>":"<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 1>",
    "<broker 2 endpoint>:<port>":"<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 2>",
    "<broker 3 endpoint>:<port>":"<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 3>"
  }
}'
INTO TABLE <table_name>;
```
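
As a concrete illustration of the remapping described above, the sketch below fills in the placeholders with made-up values. The pipeline name, table name, topic, broker hostnames, endpoint hostname, and listener ports are all hypothetical; substitute the values from your own MSK cluster and PrivateLink setup.

```sql
-- Hypothetical values throughout: broker hostnames, endpoint hostname, ports,
-- topic, pipeline name, and table name.
-- Each key is a broker address the pipeline would otherwise connect to;
-- each value is the address it connects to instead.
CREATE PIPELINE msk_orders AS LOAD DATA KAFKA 'b-1.msk.example.com:9092/orders'
CONFIG '{
  "spoof.dns": {
    "b-1.msk.example.com:9092": "vpce-endpoint.example.com:6001",
    "b-2.msk.example.com:9092": "vpce-endpoint.example.com:6002",
    "b-3.msk.example.com:9092": "vpce-endpoint.example.com:6003"
  }
}'
INTO TABLE orders;
```

Note that the last entry in the `spoof.dns` object has no trailing comma, because strict JSON does not allow one.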

Kafka pipelines support several more configuration options. For the full list, see the `CONFIGURATION.md` file in the [librdkafka](https://github.com/confluentinc/librdkafka/tree/v1.9.2) project on GitHub.

> **📝 Note**: Some of the configuration options are not supported in SingleStore. The client will receive a `"Forbidden Key"` error when accessing unsupported configuration options.

The configuration below controls several aspects of the consumer's behavior (e.g., timeouts, fetching behavior, and message handling). Adjust these parameters to optimize the performance and reliability of the Kafka consumer for your environment and requirements.

```sql
CREATE PIPELINE p AS LOAD DATA KAFKA 'host.example.com:9092/whatever'
  CONFIG '{"fetch.max.bytes": "52428800", "topic.metadata.refresh.interval.ms": "300000", "message.max.bytes": "1000000",
           "fetch.wait.max.ms": "500", "session.timeout.ms": "45000", "topic.metadata.refresh.fast.interval.ms": "100",
           "fetch.min.bytes": "1", "max.partition.fetch.bytes": "1048576", "fetch.message.max.bytes": "1048576",
           "socket.keepalive.enable": "true", "fetch.error.backoff.ms": "500", "socket.timeout.ms": "60000"}'
  INTO TABLE t FORMAT CSV;
```

The following configuration sets several communication options used with Kafka brokers (e.g., timeouts, batching behavior, and resource usage). Choose these values based on your application requirements and your specific Kafka deployment environment.

```sql
CREATE PIPELINE p AS LOAD DATA KAFKA 'host.example.com:9092/whatever2'
  CONFIG '{"connections.max.idle.ms": "230000", "client.id": "<client_id>", "fetch.max.bytes": "1000000",
           "operation.timeout.ms": "30000", "batch.num.messages": "1000", "socket.keepalive.enable": "false",
           "socket.timeout.ms": "60000"}'
  INTO TABLE t FORMAT CSV;
```

## S3 Configurations

The following table shows the SingleStore-specific configurations for S3.

| Parameter                               | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `disable_gunzip`                        | When this parameter is set to `true`, files with the `.gz` extension are not decompressed. When this parameter is disabled or missing, files with the `.gz` extension are decompressed. Example: `CONFIG '{"disable_gunzip" : true}'` |
| `request_payer`                         | Specifies who is responsible for paying for the data transfer and request costs associated with accessing an S3 bucket. By default, the owner of an S3 bucket is responsible for paying these costs. However, when using the `request_payer` parameter, the requester is responsible for covering the costs associated with the request. This can include costs such as `GET`, `PUT`, and `LIST` requests, as well as data transfer charges. Example: `CONFIG '{"request_payer" : "name"}'` |
| `endpoint_url`                          | Specifies the URL of the S3-compatible storage provider. This parameter can be used to direct requests to a non-standard endpoint, such as an S3-compatible service other than AWS. Examples include MinIO, which is an S3-compatible storage provider, and private cloud object storage that exposes an S3-like interface. Example: `CONFIG '{"endpoint_url" : "sample_url"}'` |
| `compatibility_mode`                    | Instructs the downloader to use S3 API calls that are better supported by third parties. Example: `CONFIG '{"compatibility_mode" : true}'` |
| `file_compression`                      | Decompresses files with the specified extensions. It can have the following values: `"gz"`, `"lz4"`, `"auto"`, and `"disable"`. This parameter overrides `disable_gunzip`. Example: `CONFIG '{"file_compression" : "gz"}'` |
| `file_time_threshold`                   | If set, files last modified before the specified timestamp are not ingested. The timestamp must be specified in the [Unix Timestamp](https://www.unixtimestamp.com/) format, represented as an integer value. Example: `CONFIG '{"file_time_threshold" : 10070010}'` |
| `file_notifications_kinesis_stream_arn` | Specifies the ARN of a Kinesis Data Stream that receives S3 event notifications through AWS EventBridge. When configured, the pipeline uses event-driven file discovery combined with periodic bucket scanning, reducing file discovery latency for large buckets from minutes to approximately 1–2 seconds. Requires EventBridge configuration to route S3 object creation events to the specified Kinesis stream. If not set, the pipeline uses traditional ListObjects-based bucket scanning. |
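
Several of these parameters can be combined in a single `CONFIG` clause. The sketch below is illustrative only: the pipeline, bucket, and table names are hypothetical, the credentials are placeholders, and the threshold `1704067200` corresponds to 2024-01-01 00:00:00 UTC.

```sql
-- Hypothetical pipeline, bucket, and table names; replace the credential placeholders.
-- file_compression "gz": decompress gzip files (this setting overrides disable_gunzip).
-- file_time_threshold: skip files last modified before 2024-01-01 00:00:00 UTC.
CREATE PIPELINE s3_recent_files
  AS LOAD DATA S3 'example-bucket/exports/'
  CONFIG '{"region": "us-west-1",
           "file_compression": "gz",
           "file_time_threshold": 1704067200}'
  CREDENTIALS '{"aws_access_key_id": "<access_key_id>",
                "aws_secret_access_key": "<secret_access_key>"}'
  INTO TABLE events
  FIELDS TERMINATED BY ',';
```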

No `CONFIG` clause is required to create an S3 pipeline. This clause is used to specify things like the Amazon S3 region where the source bucket is located or an endpoint for an S3-compatible object store. If no `CONFIG` clause is specified, SingleStore automatically uses the `us-east-1` region, also known as `US Standard` in the Amazon S3 console. To specify a different region, such as `us-west-1`, set the `region` key in a `CONFIG` clause, as in the example below.

The `CONFIG` clause can also be used to specify the `suffixes` for files to load. These suffixes are a JSON array of strings. When specified, `CREATE PIPELINE` only loads files that have the specified suffix. Suffixes in the `CONFIG` clause can be specified without a `.` before them, for example, `CONFIG '{"suffixes": ["csv"]}'`.

```sql
CREATE OR REPLACE PIPELINE <pipeline_name>
   AS LOAD DATA S3 'data-test-bucket'
   CONFIG '{"region": "us-east-1", "request_payer": "requester", "endpoint_url": "https://storage.googleapis.com", "compatibility_mode": true}'
   CREDENTIALS '{"aws_access_key_id": "ANIAVX7U2LM9QVJMK2ZT",
                 "aws_secret_access_key": "xxxxxxxxxxxxxxxxxxxxxxx"}'
   INTO TABLE market_data
     (ts, timestamp, event_type, ticker, price, quantity, exchange, conditions);
```
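
The region and `suffixes` options described above can be set together. The following is a minimal sketch, assuming a hypothetical bucket `example-bucket` in `us-west-1` and a hypothetical target table `market_data_west`; the pipeline name and credential placeholders are likewise illustrative.

```sql
-- Hypothetical pipeline, bucket, and table names; replace the credential placeholders.
-- "region" points the pipeline at a bucket outside the default us-east-1 region.
-- "suffixes" restricts loading to files whose names end in "csv".
CREATE PIPELINE s3_west_csv
  AS LOAD DATA S3 'example-bucket'
  CONFIG '{"region": "us-west-1", "suffixes": ["csv"]}'
  CREDENTIALS '{"aws_access_key_id": "<access_key_id>",
                "aws_secret_access_key": "<secret_access_key>"}'
  INTO TABLE market_data_west
  FIELDS TERMINATED BY ',';
```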

## Azure Blob Configurations

The following table shows the SingleStore-specific configurations for Azure Blobs.

| Parameter        | Description                                                                                                                                                                                                           |
| ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `disable_gunzip` | When this parameter is set to `true`, files with the `.gz` extension are not decompressed. When this parameter is disabled or missing, files with the `.gz` extension are decompressed. Example: `CONFIG '{"disable_gunzip" : true}'` |

Note that no `CONFIG` clause is required to create an Azure pipeline unless you need to specify the `suffixes` for files to load. These suffixes are a JSON array of strings. When specified, `CREATE PIPELINE` only loads files that have the specified suffix. Suffixes in the `CONFIG` clause can be specified without a `.` before them, for example, `CONFIG '{"suffixes": ["csv"]}'`.
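
As an illustration of the options above, the sketch below sets both `suffixes` and `disable_gunzip` on an Azure pipeline. The pipeline, container, and table names are hypothetical, and the credential key names (`account_name`, `account_key`) are an assumption about the Azure pipeline syntax that should be checked against the `CREATE PIPELINE` reference.

```sql
-- Hypothetical pipeline, container, and table names.
-- Assumed credential keys: account_name and account_key (verify before use).
-- "suffixes" limits loading to files ending in "csv";
-- "disable_gunzip" leaves files with the .gz extension undecompressed.
CREATE PIPELINE azure_csv
  AS LOAD DATA AZURE 'example-container/exports'
  CONFIG '{"suffixes": ["csv"], "disable_gunzip": true}'
  CREDENTIALS '{"account_name": "<account_name>", "account_key": "<account_key>"}'
  INTO TABLE events
  FIELDS TERMINATED BY ',';
```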

## GCS Configurations

The following table shows the SingleStore-specific configurations for GCS.

| Parameter        | Description                                                                                                                                                                                                           |
| ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `disable_gunzip` | When this parameter is set to `true`, files with the `.gz` extension are not decompressed. When this parameter is disabled or missing, files with the `.gz` extension are decompressed. Example: `CONFIG '{"disable_gunzip" : true}'` |
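
A GCS pipeline using this parameter might look like the following sketch. The pipeline, bucket, and table names are hypothetical, and the credential key names (`access_id`, `secret_key`) are an assumption about the GCS pipeline syntax that should be checked against the `CREATE PIPELINE` reference.

```sql
-- Hypothetical pipeline, bucket, and table names.
-- Assumed credential keys: access_id and secret_key (verify before use).
-- "disable_gunzip" leaves files with the .gz extension undecompressed.
CREATE PIPELINE gcs_no_gunzip
  AS LOAD DATA GCS 'example-bucket/archives'
  CONFIG '{"disable_gunzip": true}'
  CREDENTIALS '{"access_id": "<access_id>", "secret_key": "<secret_key>"}'
  INTO TABLE archives
  FIELDS TERMINATED BY ',';
```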

## HDFS Configurations

The following table shows the SingleStore-specific configurations for HDFS.

| Parameter               | Description                                                                                                                                                                                                                                          |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `disable_partial_check` | When this parameter is set to `true`, a pipeline is created that imports Hive output files. When the pipeline runs, the extractor imports files, but does not check for additional files in the directory. Example: `CONFIG '{"disable_partial_check" : true}'` |
| `disable_gunzip`        | When this parameter is set to `true`, files with the `.gz` extension are not decompressed. When this parameter is disabled or missing, files with the `.gz` extension are decompressed. Example: `CONFIG '{"disable_gunzip" : true}'` |
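
The sketch below shows where `disable_partial_check` would sit in an HDFS pipeline definition. The pipeline name, NameNode address, directory path, table name, and tab delimiter are hypothetical and stand in for your own Hive output location and format.

```sql
-- Hypothetical pipeline name, NameNode host/port, path, and table name.
-- "disable_partial_check": import Hive output files without checking
-- for additional files in the directory.
CREATE PIPELINE hdfs_hive_output
  AS LOAD DATA HDFS 'hdfs://hadoop-namenode:8020/path/to/hive/output'
  CONFIG '{"disable_partial_check": true}'
  INTO TABLE hive_results
  FIELDS TERMINATED BY '\t';
```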

## Filesystem Configurations

The following table shows the SingleStore-specific configurations for the filesystem.

| Parameter                 | Description                                                                                                                                                                                                           |
| ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `disable_gunzip`          | When this parameter is set to `true`, files with the `.gz` extension are not decompressed. When this parameter is disabled or missing, files with the `.gz` extension are decompressed. Example: `CONFIG '{"disable_gunzip" : true}'` |
| `process_zero_byte_files` | When this parameter is set to `true`, zero-byte files are processed. When this parameter is disabled or missing, zero-byte files are not processed. Example: `CONFIG '{"process_zero_byte_files" : true}'` |
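
For example, a filesystem pipeline that also processes empty files could be sketched as follows. The pipeline name, directory path, and table name are hypothetical.

```sql
-- Hypothetical pipeline name, path, and table name.
-- "process_zero_byte_files": zero-byte files are processed rather than skipped.
CREATE PIPELINE fs_with_empty_files
  AS LOAD DATA FS '/var/data/incoming/*.csv'
  CONFIG '{"process_zero_byte_files": true}'
  INTO TABLE incoming
  FIELDS TERMINATED BY ',';
```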

***

Modified at: May 18, 2026

Source: [/db/v9.1/load-data/data-sources/configuration-options-for-different-sources/](https://docs.singlestore.com/db/v9.1/load-data/data-sources/configuration-options-for-different-sources/)

