ERROR 1158 (08S01): Leaf Error (): Error reading packet ### from the connection socket (): Connection timed out
When the input contains an extremely large number of duplicate rows, LOAD DATA IGNORE can keep the leaves waiting so long that they time out.
If you see this error when running LOAD DATA IGNORE, verify that the data does not contain a large number of duplicates.
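One way to check the input for duplicates before loading is sketched below in Python. This is an illustration only: it assumes the input is a delimited text file and that the duplicate key is made up of one or more column positions. The function name `count_duplicate_keys` and its parameters are hypothetical and not part of MemSQL.

```python
import csv
from collections import Counter


def count_duplicate_keys(path, key_columns=(0,), delimiter=","):
    """Return {key: occurrences} for every key that appears more than once.

    key_columns lists the zero-based column positions that make up the
    unique key (an assumption about the file layout, adjust as needed).
    """
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if not row:
                continue  # skip blank lines
            counts[tuple(row[i] for i in key_columns)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

The counter keeps every distinct key in memory, so for very large files it may be more practical to run this on a sample of the data.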
ERROR 1735: Unable to connect … Timed out reading from socket
A MemSQL node is unable to connect to another MemSQL node. This may be because there is no network connectivity (such as a network problem or a firewall blocking connectivity), or because a node is overloaded with connection requests.
Possible solutions:
Ensure that all nodes are able to connect to all other nodes on the configured port (the default is 3306). Update any firewall rules that block connectivity between the nodes.
One way to verify connectivity is to run the command FILL CONNECTION POOLS on all MemSQL nodes. If this fails with the same error, then a node is unable to connect to another node.
Info: Some queries require different amounts of connectivity. For example, some queries only require aggregator-leaf connections while others require aggregator-leaf as well as leaf-leaf connections. As a result, it is possible for some queries to succeed while others fail with this error.
If all nodes are able to connect to all other nodes, the error is likely because your query or queries require opening too many connections at once. Run FILL CONNECTION POOLS on all MemSQL nodes to pre-fill connection pools. If the connection pool size is too small for your workload, adjust the max_pooled_connections configuration variable, which controls the number of pooled connections between each pair of nodes.
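As a complement to FILL CONNECTION POOLS, basic TCP reachability can be checked from each node with a short script. The following Python sketch is only an illustration: the function names are hypothetical, and it tests whether a TCP connection opens on the configured port, not whether MemSQL accepts the session.

```python
import socket


def can_connect(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_nodes(hosts, port=3306, timeout=5.0):
    """Return the hosts that could not be reached from this machine."""
    return [h for h in hosts if not can_connect(h, port, timeout)]
```

For example, `unreachable_nodes(["leaf-1.example", "leaf-2.example"])` (placeholder host names) lists the nodes that this machine cannot reach. Because connectivity is directional, the check would need to be run from every node in the cluster.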
ERROR 1970 (HY000): Subprocess /var/lib/memsql/master-3306/extractors/kafka-extract --get-offsets --kafka-version=0.8.2.2 timed out
This error occurs when there are connectivity issues between a MemSQL node and the data source (e.g. Kafka cluster or S3). This error is particularly common when using S3 pipelines because of throttling and other S3 behavior.
To solve this issue, increase the value of pipelines_extractor_get_offsets_timeout_ms. The default value is 10000 (milliseconds). See Pipeline System Variables for more information on this timeout variable, and see How to Set Pipelines System Variables to change the value.
ERROR 2002 (HY000): Can’t connect to local MySQL server through socket ‘/var/run/mysqld/mysqld.sock’
When the MySQL client connects to
localhost, it attempts to use a socket file instead of TCP/IP. The socket file used is specified in
/etc/mysql/my.cnf when the MySQL client is installed on the system. This is a MySQL socket file, which MemSQL does not use by default. Therefore, connecting with
localhost attempts to connect to MySQL and not MemSQL.
There are two solutions to this problem:
- Specify 127.0.0.1 as the host instead of localhost. That is, use mysql -h 127.0.0.1 -u root instead of mysql -h localhost -u root.
  Info: If you omit the host (mysql -u root), the MySQL client implicitly uses localhost.
- In /etc/mysql/my.cnf, change the socket value to the location of your MemSQL socket file as shown in the example below:
[client]
port = 3306
socket = /var/lib/memsql/data/memsql.sock
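To confirm which socket file the MySQL client will use, the [client] section of the option file can be inspected programmatically. The Python sketch below is an illustration under stated assumptions: it treats the option file as INI-style text, and the helper name `client_socket_path` is hypothetical. Real option files can contain directives such as !includedir that this sketch does not follow.

```python
import configparser


def client_socket_path(cnf_text):
    """Return the socket value from the [client] section, or None if unset."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # tolerate bare options that have no "= value"
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat "%" in values literally
    )
    parser.read_string(cnf_text)
    return parser.get("client", "socket", fallback=None)
```

Passing the contents of /etc/mysql/my.cnf to this function should return the MemSQL socket path once the change above has been made.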
ERROR 2026 (HY000): SSL connection error: SSL_CTX_set_default_verify_paths failed
This error occurs when an incorrect path to the ca-cert.pem file is provided with the
--ssl_ca flag in the connection string to the MemSQL node.
The solution is to verify you are using the correct path to the ca-cert.pem file.
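A quick sanity check of the path can be scripted. The Python sketch below is illustrative only: the function name `check_ca_cert` is hypothetical, and it checks only that the file exists, is readable, and looks like a PEM certificate. It does not validate the certificate itself.

```python
import os


def check_ca_cert(path):
    """Return a short diagnosis for the CA certificate path."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "unreadable"
    with open(path, "r", errors="replace") as f:
        # PEM-encoded certificates contain this marker line.
        if "-----BEGIN CERTIFICATE-----" not in f.read():
            return "not PEM"
    return "ok"
```

Any result other than "ok" suggests that the path given to the --ssl_ca flag should be corrected before retrying the connection.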
ERROR 2026 (HY000): SSL connection error: SSL is required but the server doesn’t support it
This error occurs when you attempt to connect to the affected MemSQL node and either the required SSL configuration was not added to the memsql.cnf file, or it was added but the target MemSQL node was not restarted afterward.
- Check that the correct SSL configuration has been written to the memsql.cnf file of the target MemSQL node.
- Check that the target MemSQL node has been restarted since its memsql.cnf file was updated.
ERROR: Distributed Join Error. Leaf X cannot connect to Leaf Y.
When a distributed join occurs, the leaves within the cluster must reshuffle data amongst themselves, which requires the leaves to connect to one another. If the leaves are not able to communicate with one another, and a distributed join is touching those leaves, the distributed query will not run successfully. The inter-leaf communication that is needed for distributed join queries relies on the DNS cache on each leaf. If this cache is out of sync with the current state of the leaves, the distributed join will fail.
Use the following steps to troubleshoot this scenario:
Confirm that you can connect to MemSQL from one leaf to another in the cluster. This rules out network connection issues.
Info: A manual connection from one leaf to another can succeed even when the distributed join fails, because a manual connection does not use the DNS cache on the leaf.
Run SHOW LEAVES on an affected leaf (e.g. leaf X) in the cluster. The Opened_Connections column should reveal which leaves the affected leaf has open connections with. Verify that leaf Y is not in this list.
When leaves connect to each other, they cache connection information (leaf-1 is at IP 192.0.2.1, leaf-2 is at IP 192.0.2.2, etc.). If the IPs of these leaves ever change, the cache will not automatically update. This ultimately results in an unsuccessful connection attempt because the other leaves in the cluster are using old IP address information. The solution is to flush the DNS cache and connection pools on all affected nodes. You can do so by running the following:
FLUSH HOSTS;
FLUSH CONNECTION POOLS;
FLUSH HOSTS clears the DNS cache on the node. This must be performed on all affected nodes in the cluster.
FLUSH CONNECTION POOLS shuts down all existing connections and closes idle pooled connections.