Use Case: Disk Bottleneck

An update query against a rowstore table is unexpectedly slow, and we’d like to understand why.

Was the query running or waiting for a resource?

To answer this question, we compare the cluster-wide CPU usage associated with the query against cluster-wide wait times. mv_activities reports these in a single row. To find the correct row, we join mv_activities with mv_queries so that we may search by query text.

USE information_schema;
SELECT a.cpu_time_ms, a.cpu_wait_time_ms, a.lock_time_ms, a.disk_time_ms, a.network_time_ms, left(q.query_text, 50), a.activity_name
FROM mv_activities a LEFT JOIN mv_queries q ON q.activity_name = a.activity_name
WHERE q.query_text like "update %";
+-------------+------------------+--------------+--------------+-----------------+----------------------------------------------------+----------------------------------+
| cpu_time_ms | cpu_wait_time_ms | lock_time_ms | disk_time_ms | network_time_ms | left(q.query_text, 50)                             | activity_name                    |
+-------------+------------------+--------------+--------------+-----------------+----------------------------------------------------+----------------------------------+
|          64 |                3 |            0 |          938 |             966 | update rowstore_db.rowstore set c = c+@ where a =  | Update_rowstore_9cb5d2de6e9ae9cc |
+-------------+------------------+--------------+--------------+-----------------+----------------------------------------------------+----------------------------------+

The query mostly waited. Over the interval of time described by mv_activities, the query spent only about 60ms running, compared to over 900ms waiting for disk operations to complete and a similar amount of time waiting on network messages.
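
As a quick sanity check, we can turn these columns into a single "share of time spent waiting" figure. This is only a sketch using the columns shown above; the individual wait columns are not guaranteed to be strictly additive, so treat the percentage as approximate.

USE information_schema;
-- Sum the wait columns and compare them against CPU time to estimate what share of the query's time was spent waiting rather than running.
SELECT a.activity_name, a.cpu_time_ms, a.cpu_wait_time_ms + a.lock_time_ms + a.disk_time_ms + a.network_time_ms AS wait_time_ms, ROUND(100 * (a.cpu_wait_time_ms + a.lock_time_ms + a.disk_time_ms + a.network_time_ms) / (a.cpu_time_ms + a.cpu_wait_time_ms + a.lock_time_ms + a.disk_time_ms + a.network_time_ms)) AS pct_waiting
FROM mv_activities a LEFT JOIN mv_queries q ON q.activity_name = a.activity_name
WHERE q.query_text LIKE "update %";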

Why was the query waiting on the network?

network_time_ms may indicate a network issue, but it also includes time the aggregator spent waiting for responses from the leaves.

To decide whether we have an actual network issue, we use mv_activities_extended to drill down into the per-partition usage associated with the query across the entire cluster. We again join with mv_queries to find the correct rows. We also join with mv_nodes to describe the nodes on which tasks ran.

use information_schema;
select a.elapsed_time_ms, a.cpu_time_ms, a.cpu_wait_time_ms, a.lock_time_ms, a.lock_row_time_ms, a.network_time_ms, a.disk_time_ms, a.log_flush_time_ms, a.log_buffer_time_ms, a.database_name, a.partition_id, n.type node_type
from mv_activities_extended a left join mv_queries q on a.aggregator_activity_name = q.activity_name left join mv_nodes n on a.node_id=n.id
where q.query_text like "update%";
+-----------------+-------------+------------------+--------------+------------------+-----------------+--------------+-------------------+--------------------+---------------+--------------+-----------+
| elapsed_time_ms | cpu_time_ms | cpu_wait_time_ms | lock_time_ms | lock_row_time_ms | network_time_ms | disk_time_ms | log_flush_time_ms | log_buffer_time_ms | database_name | partition_id | node_type |
+-----------------+-------------+------------------+--------------+------------------+-----------------+--------------+-------------------+--------------------+---------------+--------------+-----------+
|             941 |          44 |                1 |            0 |                0 |               0 |            0 |               915 |                  0 | rowstore_db   |            0 | LEAF      |
|             993 |          46 |                2 |            1 |                0 |             956 |            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        |
+-----------------+-------------+------------------+--------------+------------------+-----------------+--------------+-------------------+--------------------+---------------+--------------+-----------+

The roughly 900ms of network_time_ms we saw in mv_activities accumulated on the master aggregator (MA) node, and closely matches the latency of processing the query on the sole relevant leaf partition.

We can deduce that network_time_ms was high because the MA genuinely needed to wait for the leaf to finish processing the query.

Why was latency high on the leaf?

In the previous query, we saw 900ms log_flush_time_ms. The leaf spent its time waiting for transaction logs to be flushed to disk.

Note that we saw zero disk_time_ms in mv_activities_extended, and that mv_activities doesn’t have a log_flush_time_ms column. This is because disk_time_ms in mv_activities is split into disk_time_ms and log_flush_time_ms in mv_activities_extended.
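
We can see this relationship in the data with a rough cross-check (a sketch, not an exact identity, since the two views may not cover exactly the same interval): summing disk_time_ms and log_flush_time_ms across the query's tasks in mv_activities_extended should roughly account for the disk_time_ms reported by mv_activities.

USE information_schema;
-- Sum per-task disk and log-flush wait times for the update query; together these roughly account for the single disk_time_ms figure in mv_activities.
SELECT a.aggregator_activity_name, SUM(a.disk_time_ms) AS disk_time_ms, SUM(a.log_flush_time_ms) AS log_flush_time_ms, SUM(a.disk_time_ms + a.log_flush_time_ms) AS combined_disk_ms
FROM mv_activities_extended a LEFT JOIN mv_queries q ON a.aggregator_activity_name = q.activity_name
WHERE q.query_text LIKE "update%"
GROUP BY a.aggregator_activity_name;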

We can drill down further, to illustrate the system-wide profiling potential of mv_activities_extended.

SingleStore performs actual flushing of transaction logs in a separate thread per partition, which is instrumented with a separate task. We can see the relationship between queries and log flusher threads by expanding our query to include system tasks. In particular, we select all rows of mv_activities_extended associated with the query’s database:

use information_schema;
select a.disk_time_ms, a.log_flush_time_ms, a.log_buffer_time_ms, a.database_name, a.partition_id, n.type node_type, a.activity_name, a.activity_type
from mv_activities_extended a left join mv_nodes n on a.node_id=n.id
where a.database_name="rowstore_db" order by a.disk_time_ms desc;
+--------------+-------------------+--------------------+---------------+--------------+-----------+----------------------------------+---------------+
| disk_time_ms | log_flush_time_ms | log_buffer_time_ms | database_name | partition_id | node_type | activity_name                    | activity_type |
+--------------+-------------------+--------------------+---------------+--------------+-----------+----------------------------------+---------------+
|          923 |                 0 |                  0 | rowstore_db   |            0 | LEAF      | LogFlusher                       | Database      |
|            0 |                 0 |                  0 | rowstore_db   |            2 | LEAF      | LogFlusher                       | Database      |
|            0 |                 0 |                  0 | rowstore_db   |            2 | LEAF      | RunBackgroundMerger              | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | LEAF      | ReplayFile                       | Database      |
|            0 |                 0 |                  0 | rowstore_db   |            1 | LEAF      | RunBackgroundMerger              | Database      |
|            0 |               925 |                  0 | rowstore_db   |            0 | LEAF      | update_rowstore_62c84c96b9194b88 | Query         |
|            0 |                 0 |                  0 | rowstore_db   |            0 | LEAF      | RunBackgroundMerger              | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | LEAF      | StreamFileFromMaster             | Database      |
|            0 |                 0 |                  0 | rowstore_db   |            1 | LEAF      | LogFlusher                       | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        | Update_rowstore_9cb5d2de6e9ae9cc | Query         |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        | LogFlusher                       | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        | MEMSQL_REPLICATION_START         | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        | ReplicaHandler_Loop              | Database      |
|            0 |                 0 |                  0 | rowstore_db   |         NULL | MA        | ReplicaHandler_StreamFiles       | Database      |
+--------------+-------------------+--------------------+---------------+--------------+-----------+----------------------------------+---------------+

The LogFlusher task on partition 0 spent almost exactly as much time waiting for disk writes to complete (923ms) as our query's task spent waiting for its transaction log flushes to be acknowledged (925ms).

We can now build a detailed theory of what happened. Query latency was high because the master aggregator spent most of its time waiting for a leaf node to finish processing the query on partition 0. That leaf node, in turn, was delayed waiting for a transaction log flusher thread to complete its disk writes.

Ultimately, the query was waiting for disk.

What else was using the disk?

We extract disk usage figures from mv_activities_extended, again joining with mv_queries for descriptive query text.

USE information_schema;
SELECT a.disk_logical_write_b, a.disk_logical_read_b, a.activity_name, a.database_name, a.partition_id, LEFT(q.query_text, 50)
FROM mv_activities_extended a LEFT JOIN mv_queries q ON a.aggregator_activity_name = q.activity_name
ORDER BY a.disk_logical_write_b + a.disk_logical_read_b DESC LIMIT 10;
+----------------------+---------------------+-------------------------------------+----------------+--------------+----------------------------------------------------+
| disk_logical_write_b | disk_logical_read_b | activity_name                       | database_name  | partition_id | left(q.query_text, 50)                             |
+----------------------+---------------------+-------------------------------------+----------------+--------------+----------------------------------------------------+
|              1383163 |             1383855 | RunBackgroundMerger                 | columnstore_db |            0 | NULL                                               |
|              1212571 |             1213233 | RunBackgroundMerger                 | columnstore_db |            1 | NULL                                               |
|               797124 |              598624 | RunBackgroundMerger                 | columnstore_db |            2 | NULL                                               |
|                22535 |                   0 | LogFlusher                          | columnstore_db |            1 | NULL                                               |
|                21335 |                   0 | LogFlusher                          | columnstore_db |            2 | NULL                                               |
|                17990 |                   0 | LogFlusher                          | columnstore_db |            0 | NULL                                               |
|                 7076 |                   0 | LogFlusher                          | rowstore_db    |            0 | NULL                                               |
|                 1025 |                   0 | Insert_columnstore_61fc44a9950b4882 | columnstore_db |            1 | insert into columnstore select @*round(@ * @ *rand |
|                  894 |                   0 | Insert_columnstore_61fc44a9950b4882 | columnstore_db |            2 | insert into columnstore select @*round(@ * @ *rand |
|                  769 |                   0 | Insert_columnstore_61fc44a9950b4882 | columnstore_db |            0 | insert into columnstore select @*round(@ * @ *rand |
+----------------------+---------------------+-------------------------------------+----------------+--------------+----------------------------------------------------+

The disk was dominated by work done on behalf of a columnstore insert query in the columnstore_db database.
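
As a possible follow-up (a sketch; the activity name below is taken from the output above and will differ on your cluster), we can recover the full parameterized text of that insert by following its aggregator_activity_name into mv_queries:

USE information_schema;
-- Follow the insert task's aggregator_activity_name into mv_queries to recover the full parameterized statement.
SELECT DISTINCT q.activity_name, q.query_text
FROM mv_activities_extended a LEFT JOIN mv_queries q ON a.aggregator_activity_name = q.activity_name
WHERE a.activity_name = "Insert_columnstore_61fc44a9950b4882";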
