Use Case: Memory Usage by a Query

How to find the total typical memory usage of a recent run of a query?

Use the following query to get the total typical memory usage of a recent run of a query for all activities on the aggregator and on all partitions of the leaf nodes:

SELECT 
  aggregator_activity_name, 
  SUM(memory_bs * 1000) / (SUM((CASE WHEN mvae.activity_name = aggregator_activity_name THEN elapsed_time_ms ELSE 0 END))) AS avg_clusterwide_mem_bytes_per_second,
  ANY_VALUE(query_text)
FROM information_schema.mv_activities_extended mvae
INNER JOIN information_schema.mv_queries mvq ON mvae.aggregator_activity_name = mvq.activity_name
GROUP BY 1 
ORDER BY 2 DESC;

This gives us the cluster-wide memory usage of all instances of user queries, divided by the sum of query latency of those instances. Thus it is the average memory usage per time the query was running.The results are arranged in descending order with the greatest memory users at the top. To view only the top five memory-consuming queries, add a LIMIT 5 clause at the end.

Note

The MV_ACTIVITIES_EXTENDED table provides data only as far back as the uptime, as it is cleared out when the aggregator node restarts.

Warning

The measurement AVG_CLUSTERWIDE_MEM_BYTES_PER_SECOND is the total memory usage, averaged over the execution time of the query. This is a good general measurement, but if there is a large spike in memory usage and  the query is long running, then the memory usage averaged over time may not be representative of that large, short term memory usage.