Configuration Options for Different Sources
SingleStore supports a number of configuration options for different sources. These options are specified in the CONFIG clause in the CREATE PIPELINE command.
Kafka Configurations
The following table shows the SingleStore-specific configurations for a Kafka environment.
| Parameter | Description |
|---|---|
| | Used while connecting to Kafka via a proxy, for example, when connecting across multiple cloud services. |
| | Specifies a timeout for operations such as metadata requests and message consumption/production. Default: 10 seconds |
| | Used with Kerberos authentication to specify where to cache Kerberos tickets. Default |
| | Use this parameter if the client does not support |
The CONFIG clause of a Kafka pipeline can accept a spoof.dns element as an alternative to configuring Kafka brokers. The spoof.dns element must be a JSON object consisting of an arbitrary number of key-value pairs with URL string values.
The following CREATE PIPELINE command sets the AWS PrivateLink configuration for Kafka brokers with AWS MSK.
CREATE PIPELINE <pipeline_name> AS
LOAD DATA KAFKA '<Kafka bootstrap server endpoint>:<port>/<topic name>'
CONFIG '{"spoof.dns": {
  "<broker 1 endpoint>:<port>": "<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 1>",
  "<broker 2 endpoint>:<port>": "<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 2>",
  "<broker 3 endpoint>:<port>": "<SingleStore shared endpoint (outbound)>:<NLB listener port for broker 3>"
}}'
INTO TABLE <table_name>;
There are a few more configuration options that are supported by Kafka. Refer to the CONFIGURATION.md file in the librdkafka project in GitHub to see the full list.
Note
Some of the configuration options are not supported in SingleStore. A "Forbidden Key" error is returned when accessing unsupported configuration options.
The configuration below controls some of the various aspects of the consumer's behavior:
CREATE PIPELINE p AS
LOAD DATA KAFKA 'host.example.com:9092/whatever'
CONFIG '{"fetch.max.bytes": "52428800", "topic.metadata.refresh.interval.ms": "300000", "message.max.bytes": "1000000",
  "fetch.wait.max.ms": "500", "session.timeout.ms": "45000", "topic.metadata.refresh.fast.interval.ms": "100",
  "fetch.min.bytes": "1", "max.partition.fetch.bytes": "1048576", "fetch.message.max.bytes": "1048576",
  "socket.keepalive.enable": "true", "fetch.error.backoff.ms": "500", "socket.timeout.ms": "60000"}'
INTO TABLE t FORMAT CSV;
The following configuration sets some of the different communication options that are used with Kafka brokers:
CREATE PIPELINE p AS
LOAD DATA KAFKA 'host.example.com:9092/whatever2'
CONFIG '{"connections.max.idle.ms": "230000", "client.id": "<client_id>", "fetch.max.bytes": "1000000",
  "operation.timeout.ms": "30000", "batch.num.messages": "1000", "socket.keepalive.enable": "false",
  "socket.timeout.ms": "60000"}'
INTO TABLE t FORMAT CSV;
S3 Configurations
The following table shows the SingleStore-specific configurations for S3.
| Parameter | Description |
|---|---|
| | When this parameter is set to When this parameter is disabled or missing, files with the |
| | Specifies who is responsible for paying for the data transfer and request costs associated with accessing an S3 bucket. By default, the owner of an S3 bucket is responsible for paying these costs. |
| | Specifies the URL of the S3-compatible storage provider. |
| | Instructs the downloader to use S3 API calls that are better supported by third parties. |
| | Decompresses files with the specified extensions. |
| | If set, files last modified before the specified timestamp are not ingested. |
No CONFIG clause is required to create an S3 pipeline. If no CONFIG clause is specified, SingleStore will automatically use the us-east-1 region, also known as US Standard in the Amazon S3 console. To specify a different region, such as us-west-1, include a CONFIG clause as shown in the example below. The CONFIG clause can also be used to specify the suffixes for files to load. When suffixes are specified, CREATE PIPELINE only loads files that have the specified suffix. Suffixes in the CONFIG clause can be specified without a . before them, for example, CONFIG '{"suffixes": ["csv"]}'.
CREATE OR REPLACE PIPELINE <pipeline_name> AS
LOAD DATA S3 'data-test-bucket'
CONFIG '{"region": "us-east-1", "request_payer": "requester", "endpoint_url": "https://storage.googleapis.com", "compatibility_mode": true}'
CREDENTIALS '{"aws_access_key_id": "ANIAVX7U2LM9QVJMK2ZT", "aws_secret_access_key": "xxxxxxxxxxxxxxxxxxxxxxx"}'
INTO TABLE market_data
(ts, timestamp, event_type, ticker, price, quantity, exchange, conditions);
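As a minimal sketch of the region and suffixes options described above, the statement below creates an S3 pipeline that reads from the us-west-1 region and loads only files with the csv suffix. The pipeline, bucket, and table names and the credential values are placeholders, and the FIELDS TERMINATED BY clause assumes comma-separated input; none of these come from the original example, so adjust them to your environment.

```sql
-- Sketch only: names in angle brackets are placeholders, not real resources.
CREATE PIPELINE <pipeline_name> AS
LOAD DATA S3 '<bucket_name>'
-- "region" overrides the default us-east-1; "suffixes" limits which files are loaded.
CONFIG '{"region": "us-west-1", "suffixes": ["csv"]}'
CREDENTIALS '{"aws_access_key_id": "<access_key_id>", "aws_secret_access_key": "<secret_access_key>"}'
INTO TABLE <table_name>
-- Assumes the csv files are comma-separated.
FIELDS TERMINATED BY ',';
```

Both options live in the same CONFIG JSON object, so a region override and a suffix filter can be set in a single clause.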
Azure Blob Configurations
The following table shows the SingleStore-specific configurations for Azure Blobs.
| Parameter | Description |
|---|---|
| | When this parameter is set to When this parameter is disabled or missing, files with the |
Note that no CONFIG clause is required to create an Azure pipeline unless you need to specify the suffixes for files to load. When suffixes are specified, CREATE PIPELINE only loads files that have the specified suffix. Suffixes in the CONFIG clause can be specified without a . before them, for example, CONFIG '{"suffixes": ["csv"]}'.
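The suffixes option can be sketched for an Azure pipeline as follows. This is an illustrative example rather than one taken from this page: the pipeline, container, and table names are placeholders, the account_name and account_key credential keys are assumed from the usual Azure pipeline syntax, and the FIELDS TERMINATED BY clause assumes comma-separated input.

```sql
-- Sketch only: names in angle brackets are placeholders, not real resources.
CREATE PIPELINE <pipeline_name> AS
LOAD DATA AZURE '<container_name>'
-- Load only files whose names end in the csv suffix.
CONFIG '{"suffixes": ["csv"]}'
-- Credential keys are an assumption; check them against your Azure pipeline setup.
CREDENTIALS '{"account_name": "<account_name>", "account_key": "<account_key>"}'
INTO TABLE <table_name>
-- Assumes the csv files are comma-separated.
FIELDS TERMINATED BY ',';
```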
GCS Configurations
The following table shows the SingleStore-specific configurations for GCS.
| Parameter | Description |
|---|---|
| | When this parameter is set to When this parameter is disabled or missing, files with the |
HDFS Configurations
The following table shows the SingleStore-specific configurations for HDFS.
| Parameter | Description |
|---|---|
| | When this parameter is set to |
| | When this parameter is set to When this parameter is disabled or missing, files with the |
Filesystem Configurations
The following table shows the SingleStore-specific configurations for the filesystem.
| Parameter | Description |
|---|---|
| | When this parameter is set to When this parameter is disabled or missing, files with the |
| | When this parameter is set to When this parameter is disabled or missing, zero-byte files are not processed. |
Last modified: November 12, 2024