Designing for Multi-Tenant Applications

This article explains the models and trade-offs available for a multi-tenant SaaS application. Choosing the right design early is critical to avoid costly changes later. SingleStoreDB, with its separation of compute and storage, supports several of these models.

We first describe the SaaS application patterns you might consider, then lay out the pros and cons of each pattern, and finally summarize these pros and cons across 5 dimensions.

Note that the compute infrastructure that SingleStoreDB customers can create, pause and resume is called a workspace.

SaaS Architecture: Multi- or Single-Tenancy with Workspaces

When you design your SaaS application, you want your infrastructure cost to align with your business model as closely as possible. As your customers upgrade and their usage grows, your infrastructure grows with them.

Many SaaS applications have a free tier. This free tier will have a high density of tenants that share the same compute workspace. At the other end, you might have customers that each get a single compute workspace to provide high performance and avoid noisy neighbors. SingleStoreDB allows you to provide such a hybrid model.

designing-your-DB-multi-tenant-apps-diagram1.png

Example

Let's assume that one of SingleStore's customers has 10,000 customers with free, standard, premium and dedicated offerings. Workspaces can be used to create different service expectations between these offerings. Each group can be placed on a separate workspace, where the number of users and the size of the compute workspace could be adjusted to provide higher or lower quality of service per tenant.

Example offerings:

  • Free - S-2 Workspace - 9,000 tenants with low performance requirement and infrequent usage

  • Standard - S-4 Workspace - 900 tenants, medium performance requirement with infrequent usage

  • Premium - S-16 Workspace - 99 tenants with high performance requirements and frequent usage

  • Dedicated - S-4 Workspace - 1 tenant whose data and compute capacity need to be dedicated

designing-your-DB-multi-tenant-apps-diagram2A.png

The sizes of workspaces and the number of tenants allocated to each workspace can be adjusted based on customer application SLAs and active workload. If a customer moves from Free to Standard, that customer would now be served by Workspace #2 instead of Workspace #1.
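The routing described above can be sketched in application code. This is a minimal illustration, not a SingleStoreDB API: the workspace names, the `Tenant` class and the two helper functions are hypothetical, and the mapping simply mirrors the example offerings.

```python
from dataclasses import dataclass

# Hypothetical tier-to-workspace mapping, mirroring the example offerings.
WORKSPACE_FOR_TIER = {
    "free": "workspace-1",       # S-2, 9,000 tenants
    "standard": "workspace-2",   # S-4, 900 tenants
    "premium": "workspace-3",    # S-16, 99 tenants
    "dedicated": "workspace-4",  # S-4, 1 tenant
}


@dataclass
class Tenant:
    tenant_id: int
    tier: str


def workspace_for(tenant: Tenant) -> str:
    """Return the workspace that should serve this tenant's queries."""
    if tenant.tier not in WORKSPACE_FOR_TIER:
        raise ValueError(f"unknown tier: {tenant.tier!r}")
    return WORKSPACE_FOR_TIER[tenant.tier]


def change_tier(tenant: Tenant, new_tier: str) -> str:
    """Move a tenant to a new tier and return the workspace now serving it."""
    if new_tier not in WORKSPACE_FOR_TIER:
        raise ValueError(f"unknown tier: {new_tier!r}")
    tenant.tier = new_tier
    return workspace_for(tenant)
```

Keeping the mapping in one place means an upgrade only changes the tenant's tier; the connection logic picks up the new workspace on the next lookup.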

Data Layer Strategy

There are different ways to model your multi-tenancy. When building a multi-tenant application, consider the isolation level you want to achieve:

designing-your-DB-multi-tenant-apps-diagram3.png

No data isolation (NDI): this modeling strategy consolidates all tenants within shared databases and tables.

Table isolation (TI): this modeling strategy consolidates all tenants within a shared database but would isolate each tenant at the table level.

Database isolation (DI): this modeling strategy groups tenants within a shared workspace group but isolates each tenant at the database level.

Workspace group isolation (WGI): this model isolates groups of tenants with dedicated databases and workspaces. This model is useful when you group your tenants by geographical area and/or by cloud provider.
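To make the four models concrete, the sketch below shows where one tenant's rows for a single logical table would live under each model. Every object name in it (`wg_shared`, `app`, `orders`, `tenant_<id>`) is a hypothetical naming convention chosen for illustration, and the choice of one database per tenant inside a WGI group is an assumption.

```python
def locate_tenant_data(model: str, tenant_id: int, region: str = "eu") -> dict:
    """Describe where one tenant's 'orders' rows live under each isolation model."""
    if model == "NDI":
        # Shared database and shared table; rows are separated by a tenant_id filter.
        return {"workspace_group": "wg_shared", "database": "app",
                "table": "orders", "row_filter": {"tenant_id": tenant_id}}
    if model == "TI":
        # Shared database; one table per tenant.
        return {"workspace_group": "wg_shared", "database": "app",
                "table": f"orders_t{tenant_id}", "row_filter": None}
    if model == "DI":
        # Shared workspace group; one database per tenant.
        return {"workspace_group": "wg_shared", "database": f"tenant_{tenant_id}",
                "table": "orders", "row_filter": None}
    if model == "WGI":
        # One workspace group per group of tenants (here grouped by region).
        # Using a database per tenant inside the group is an assumption.
        return {"workspace_group": f"wg_{region}", "database": f"tenant_{tenant_id}",
                "table": "orders", "row_filter": None}
    raise ValueError(f"unknown isolation model: {model!r}")
```

Only NDI needs a row filter; in the other models the isolation boundary is carried by the object name itself.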

Choosing the Right Data Isolation Model for You

Let's look at the benefits and trade-offs that developers and DBAs need to consider. Unfortunately, no data model is a clear winner, and you might need to adopt a hybrid model based on your audience.

Here is a summary table across the 4 models. Keeping costs low is not an issue, since you control how many tenants you allocate per workspace and the size of that workspace. Note that no overall score is given, since each dimension might have a different level of importance to you.

designing-your-DB-multi-tenant-apps-diagram4.png

Scale: 1..10 with 10 being the best experience

No Data Isolation (NDI)

This pattern is the simplest to architect and to scale (to millions of tenants) and has the lowest cost per tenant, since all resources are shared. Adding a tenant does not add complexity because no new database object needs to be created. However, the app's data model needs to be consistent across tenants, which can be a constraint: an application with a premium offering that has more features might need a different schema than its basic offering.

The operational complexity can be high as the application grows in tenants and data. Removing or archiving a customer requires custom code to pull their data out. Backup and recovery at the tenant level is difficult in this model. That said, ETL and ELT operations through pipelines are easier as there are fewer pipelines to manage.

The development complexity is higher as developers need to deeply understand database RBAC and row-level security. SingleStoreDB has built-in capabilities for managing access at a granular level. A bug in your code can start leaking tenants' information. You will need to build and maintain a tenant metadata table to identify whether a tenant exists and to retrieve the data relevant to that tenant. Finally, cost per tenant is also harder to estimate since tenants share the same workspace.
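A common way to reduce the risk of leaking data in a shared-table design is to route every read through a helper that checks the tenant metadata table and always binds `tenant_id` as a query parameter. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in database so it can run anywhere, and the `tenants` and `orders` tables and both function names are hypothetical. It does not replace database-side RBAC or row-level security.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in with a tenant metadata table and a shared table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tenants (tenant_id INTEGER PRIMARY KEY, name TEXT, tier TEXT);
        CREATE TABLE orders  (tenant_id INTEGER, order_id INTEGER, amount REAL);
        INSERT INTO tenants VALUES (1, 'acme', 'free'), (2, 'globex', 'premium');
        INSERT INTO orders  VALUES (1, 100, 9.5), (1, 101, 20.0), (2, 200, 75.0);
    """)
    return conn


def orders_for_tenant(conn: sqlite3.Connection, tenant_id: int) -> list:
    """Return only the rows that belong to one known tenant."""
    # Check the metadata table first so unknown tenants fail loudly.
    known = conn.execute(
        "SELECT 1 FROM tenants WHERE tenant_id = ?", (tenant_id,)
    ).fetchone()
    if known is None:
        raise LookupError(f"unknown tenant: {tenant_id}")
    # The tenant filter is always present and always a bound parameter.
    return conn.execute(
        "SELECT order_id, amount FROM orders WHERE tenant_id = ? ORDER BY order_id",
        (tenant_id,),
    ).fetchall()
```

Centralizing the filter in one function means a forgotten `WHERE tenant_id = ...` clause cannot slip into individual call sites.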

In this model, noisy neighbors sharing compute resources can become a problem if you need more than 5 workspaces to serve all your tenants (5 is currently a hard limit). At that point, you might want to create a second workspace group to avoid noisy neighbors.

Finally, this model might not comply with local regulations (especially around location).

Operating with a No Data Isolation Model

Each table needs at least one column that identifies the tenant (for example, tenant_id). With SingleStoreDB, you can shard the data on tenant_id to distribute your data across partitions for better performance (subject to data and distribution/skew considerations). You can add more columns to the shard key to further optimize performance, especially if you need more granularity in your sharding strategy.
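The effect of adding a column to the shard key can be illustrated with a small simulation. This is only a sketch: the SHA-256 based `partition_for` function is a stand-in and not SingleStoreDB's actual hash function, and the `event_id` column and the partition count of 8 are assumptions made for the example.

```python
import hashlib
from collections import Counter


def partition_for(key: tuple, num_partitions: int) -> int:
    """Map a shard key to a partition with a stable hash (illustrative only)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def rows_per_partition(rows: list, key_columns: tuple, num_partitions: int) -> Counter:
    """Count how many rows land on each partition for a given shard key."""
    counts = Counter()
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        counts[partition_for(key, num_partitions)] += 1
    return counts


# One large tenant with 1,000 rows.
rows = [{"tenant_id": 1, "event_id": i} for i in range(1000)]

# Sharding on tenant_id alone keeps all of this tenant's rows on one partition.
by_tenant = rows_per_partition(rows, ("tenant_id",), 8)

# Adding event_id to the shard key spreads the same rows across partitions.
by_tenant_and_event = rows_per_partition(rows, ("tenant_id", "event_id"), 8)
```

The trade-off is that a single-column key keeps a tenant's rows together, while a compound key spreads a large tenant's rows across more partitions.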

Ideally, your tenants are roughly equal in utilization. If they are not, you might have data skew issues or hot spots in your shards. In that case, we recommend grouping tenants into different databases.
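One way to spot uneven tenants is to compare each tenant's share of the total row count against a threshold. The sketch below assumes you already have per-tenant row counts (for example, from a `GROUP BY tenant_id` query); the function name and the 20% default threshold are arbitrary choices for illustration, not recommended values.

```python
def hot_tenants(rows_by_tenant: dict, threshold: float = 0.2) -> list:
    """Return (tenant_id, share) pairs for tenants above the share threshold.

    rows_by_tenant maps tenant_id to its row count. The result is sorted
    with the largest share first.
    """
    total = sum(rows_by_tenant.values())
    if total == 0:
        return []
    hot = [
        (tenant_id, count / total)
        for tenant_id, count in rows_by_tenant.items()
        if count / total > threshold
    ]
    return sorted(hot, key=lambda pair: pair[1], reverse=True)
```

Tenants flagged this way are candidates for placement in a separate database.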

If you have Free, Standard, and Premium offerings, a good strategy is to place each tenant in a database based on its offering. When a customer upgrades from Free to Premium, move that customer from the database hosting Free tenants to the database hosting Premium tenants.
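Moving a tenant between tier databases amounts to copying that tenant's rows to the destination and then deleting them from the source. The sketch below shows the idea with two in-memory `sqlite3` databases as stand-ins; the `orders` table, `new_tier_db` and `move_tenant` are hypothetical names, and a production move would also need to handle concurrent writes and every table that carries a `tenant_id`.

```python
import sqlite3


def new_tier_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for one tier's database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (tenant_id INTEGER, order_id INTEGER, amount REAL)"
    )
    return conn


def move_tenant(src: sqlite3.Connection, dst: sqlite3.Connection, tenant_id: int) -> int:
    """Copy one tenant's rows from src to dst, then delete them from src."""
    rows = src.execute(
        "SELECT tenant_id, order_id, amount FROM orders WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchall()
    # Copy first and commit, so a failure never leaves the tenant with no data.
    with dst:
        dst.executemany(
            "INSERT INTO orders (tenant_id, order_id, amount) VALUES (?, ?, ?)", rows
        )
    # Only remove the source rows once the copy is committed.
    with src:
        src.execute("DELETE FROM orders WHERE tenant_id = ?", (tenant_id,))
    return len(rows)
```

Copying before deleting means that an interruption leaves the tenant's data in both places rather than in neither, which is the safer failure mode.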

designing-your-DB-multi-tenant-apps-diagram5.png

Table Isolation (TI)

This model is simple to architect and cost efficient but provides only a medium level of data isolation. Your developers will need to be vigilant not to surface data from other tenants in their application. This strategy scales well up to 1,000 tenants, but the operational complexity grows very quickly, especially if you have multiple tables per tenant. You will also need many more pipelines than in the NDI model.
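Because table names generally cannot be bound as query parameters, a table-per-tenant design usually builds them in application code. One defensive approach is to derive the name only from a validated base name and an integer tenant ID. The sketch below assumes a hypothetical `<base>_t<tenant_id>` naming convention.

```python
import re

# Base table names are restricted to lowercase letters, digits and underscores.
_BASE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def tenant_table(base: str, tenant_id: int) -> str:
    """Build the per-tenant table name, rejecting anything unexpected."""
    if not _BASE_NAME.fullmatch(base):
        raise ValueError(f"invalid base table name: {base!r}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(tenant_id, bool) or not isinstance(tenant_id, int) or tenant_id < 0:
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"{base}_t{tenant_id}"
```

For example, `tenant_table("orders", 42)` returns `orders_t42`, while a base name containing spaces or punctuation raises `ValueError`.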

In this model, it is easy to improve the performance for a specific tenant since you can easily optimize a table. However, it can be harder to monitor and manage performance across all tenants.

Backup and recovery at the tenant level is easier than for NDI but still difficult to manage at the table level, because SingleStoreDB only allows backup and restore (including PITR) at the database level.

Finally, this model might not comply with local regulations (especially around location).

Database Isolation (DI)

This model is simple to architect but can become difficult to manage once you scale to thousands of tenants. It is great at providing data isolation and is compliant with regulations like PCI/HIPAA/FedRAMP. Similar to TI, the model quickly grows in complexity as tenants are added, because you need to manage one database per tenant. You will need many more pipelines than in the NDI model, which adds operational complexity. Because each database carries a small amount of memory overhead, a large number of databases (even small ones) adds up to memory overhead that might increase your cost.
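The memory overhead scales linearly with the number of databases, so a back-of-the-envelope estimate is straightforward. In the sketch below, the per-database overhead is an input you would need to measure in your own environment; the source gives no figure, and any value passed in is a placeholder.

```python
def database_overhead_mb(num_databases: int, overhead_per_db_mb: float) -> float:
    """Total memory overhead from having one database per tenant."""
    return num_databases * overhead_per_db_mb


def overhead_fraction(
    num_databases: int, overhead_per_db_mb: float, workspace_memory_mb: float
) -> float:
    """Share of a workspace's memory consumed by per-database overhead."""
    if workspace_memory_mb <= 0:
        raise ValueError("workspace_memory_mb must be positive")
    return database_overhead_mb(num_databases, overhead_per_db_mb) / workspace_memory_mb
```

With a placeholder overhead of 10 MB per database, 1,000 tenant databases would account for 10,000 MB before any tenant data is loaded, which is why the per-database figure is worth measuring before committing to this model at scale.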

Managing backup and restore at the tenant level is simple. Managing performance at the tenant level is also simple. Similar to WGI, managing and monitoring performance across all tenants is more complex.

Workspace Group Isolation (WGI) Across a Group of Tenants

SingleStore does not see the need to isolate one tenant per Workspace Group. However, we see many use cases when you need to isolate a group of tenants in a Workspace Group.

For example, you can isolate groups of tenants based on region (e.g. European tenants placed in their own Workspace Group) or cloud provider (e.g. some tenants need to be on Azure). This model provides the highest level of isolation and compliance.

SingleStore also recommends using that model for isolating work environments between development, staging, and production (3 separate Workspace Groups).

Summary

When deciding which approach is best for your particular circumstances, consider which factors are most important to you and your different offerings. In many cases, you will need a hybrid approach across isolation models.

Low-end or free offerings will have fewer requirements around isolation and far more tenants, and keeping a low TCO is key. The No Data Isolation model will be the best fit for such offerings.

High-end offerings will have far more requirements around compliance and security, and TCO might be less important. For such offerings, Database Isolation and Workspace Group Isolation would be the preferred solutions.

A winning strategy is to be able to move a tenant between models, in line with your offerings and features.