Key Considerations for Understanding Your Workload
Before you design a schema, it is critical to understand the nature of the workload. Ask yourself the following questions:
Is data loaded in trickles, large batches, or concurrent inserts? Is data frequently updated?
Is data ingest speed more important than query performance?
Do most queries touch only a small subset of rows relative to the entire dataset (roughly 0.1% or less), or do they scan the entire dataset or a large portion of it?
Which tables do you tend to join and what columns do you use to join them?
What columns do you tend to use in filters?
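One way to turn these questions into numbers is to record a few facts about representative queries and summarize them. The Python sketch below is illustrative only: the function names, the sample column names, and the row counts are hypothetical, and the 0.1% cutoff is simply the rough figure mentioned above, not a hard rule.

```python
from collections import Counter

# Rough cutoff taken from the question above: about 0.1% of rows or less.
SELECTIVE_FRACTION = 0.001


def classify_query(rows_touched: int, total_rows: int) -> str:
    """Label a query by the fraction of the table's rows it reads."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    fraction = rows_touched / total_rows
    return "selective" if fraction <= SELECTIVE_FRACTION else "scan-heavy"


def tally_columns(observations):
    """Count how often each column appears across the observed queries."""
    counts = Counter()
    for columns in observations:
        counts.update(columns)
    return counts


# Hypothetical observations: the filter columns used by three sample queries.
filter_columns = [["order_date"], ["order_date", "region"], ["customer_id"]]
top_filters = tally_columns(filter_columns).most_common(2)

print(classify_query(500, 1_000_000))      # 0.05% of rows
print(classify_query(200_000, 1_000_000))  # 20% of rows
print(top_filters)
```

The same tally can be run over join columns. Columns that show up most often in filters and joins are the ones to weigh first when making schema decisions.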