Connect with Tableau Desktop

You can use Tableau Desktop to connect to a SingleStore cluster for exploring your data. 

This guide provides guidelines and best practices for developing business Tableau dashboards using SingleStore data and covers the following:

Prerequisites

Install Tableau Desktop

To develop business Tableau dashboards using SingleStore data, download and install Tableau Desktop on your Windows or Mac PC.

Additionally, download the SingleStore connector for Tableau (Tableau Exchange > Connectors > SingleStoreDB JDBC > Download), and place the .taco file in the Connectors directory of your Tableau installation. The following table specifies the default path of the Connectors directory:

Platform

Path

Windows

C:\Users[user]\Documents\My Tableau Repository\Connectors

Mac

/Users/[user]/Documents/My Tableau Repository/Connectors

(Optional) Install Tableau Reader

You can use Tableau Reader to package a dashboard, along with the supporting data, and share it with the recipients, even if they do not have Tableau Desktop installed.

Install the Singlestore JDBC Driver

Download and install The SingleStore JDBC Driver. Place the .jar file in the Drivers directory of your Tableau installation. The following table specifies the default path of the Drivers directory:

Platform

Path

Windows

C:\Program Files\Tableau\Drivers

Mac

~/Library/Tableau/Drivers or ~/Library/JDBC

Connect Tableau Desktop to SingleStore

After installing Tableau Desktop, connect it to SingleStore through the following steps:

  1. In Tableau Desktop, select Connect > To a Server > More… > SingleStoreDB JDBC by SingleStore. If the connector is not installed, select Install and Restart Tableau.

  2. In the SingleStoreDB JDBC by SingleStore dialog, enter the following details:

    • Server: IP address or hostname of the SingleStore cluster.

    • Port: Default 3306.

    • Username: The SingleStore username.

    • Password: Password for the SingleStore user.

Set Up SingleStore Data Source in Tableau Desktop

On the data source page, perform the following steps:

  1. The data source name defaults to the Server value. If needed, you may enter a unique data source name for use in Tableau. For example, use a data source naming convention that helps other users identify which data source to connect to.

  2. On the left pane, from the Database list, select a database (or search by database name).

  3. Under Table, select a table.

  4. Drag the selected table to the canvas, and then select the sheet tab to start your analysis.

You can also perform different joins, table appends, filters in the data source along with setting up live connections and extracts. See Tableau Online Help for more information.

After the data source is set up, you are ready to visualize and analyze data.

Tableau Best Practices

Follow these strategies to build efficient Tableau dashboards.

Data Strategies

  1. Keep simple data sources: The performance of visualization depends on the underlying data sources. To improve performance, extract only the data that is needed for the worksheet to perform its analysis.

  2. Extract only the required data: Minimize joined tables. If the analysis requires data from joined tables, then edit the connection to remove unused data.

    Execute data source filters before executing traditional filters for context filtering so that the extract is smaller and takes less time to refresh. Although context filtering creates a flat table with initial performance issues, performance improves for subsequent views and filters.

    Use extracts filtered by context filters so that it contains only the data that is needed. Extracts are stored in an internal structure, which is easier for Tableau to query and access. Also, calculated fields are saved as actual data, saving further computation time. One of the drawback of using extracts is that the data is not real-time and a scheduled task is required to refresh the data.

  3. Hide unused columns: Hiding unused columns (dimensions/measures) minimizes extract refresh time or custom SQL query time. You can hide fields in the data window or data source, or allow Tableau to hide all unused fields before creating the extract in the Extract Data box.

  4. Use Extracts: Tableau uses different techniques to optimize the extract. You can also improve visualization by aggregating the data for a visible dimension, known as an aggregated extract. When users interact with an aggregated extract, all calculations and summations have already been compiled.

    Extracts can be filtered with data source filters, which can help to control the size of the extract in two different ways, depending on when the filter is applied:

    • If a data source filter is in place prior to extract creation, the extract will contain filtered records.

    • If a data source filter is put in place after extract creation, the filter will be applied against the full extracted data set. So, your extract will contain all the data but will only show what the data source filter is allowing.

Filtering

  1. Minimize quick filters: Quick filters require Tableau to run a query against the database to determine the values to display for the selected dimension. Therefore, the quick filters that do not require querying the database for values are custom value list, wildcard match, relative date filters, and the browse period date filter.

    It is recommended that you avoid quick filters that require knowledge of the values in the database such as multiple value list, single value list, compact list, slider, measure filters, and ranged date filters.

  2. Use All values in database option in quick filters: Use the default option All values in database for a quick filter and avoid using Only relevant values option. The default option makes all values in the database for that particular field available for user selection. In contrast, Only relevant values compares the values returned from the database with those in other quick filters to show only the values that apply, given the choice made on the other filters. This behavior can bring performance issues, especially if the dashboard contains more than two quick filters.

  3. Avoid quick filters or actions that generate context filters: Context filters create a context TEMP table with the values that go through the filter. All other filters access this TEMP table to draw their values from the limited set of data. This improves the performance of the dashboard; however, if the context filter does not trim down the data to a more manageable set in the new context table then it may cause a performance issue with the visualization.

    If you use a context filter, make the TEMP table as small as possible. For example, eliminate columns that are not needed for that particular visualization to reduce the size of the data set at least to one tenth of the original size. Also, the context filter should be used against slow-changing values or dimensions only.

  4. Keep range quick filters simple: To display results for across large separated periods of time, use a visualization rather than a quick filter. This is applicable for the following date filters:

    • Relative date filters, which are used for a date range that is relative to a specific date.

    • Range of date filters, which are used for a defined range of discrete dates.

    • Discrete date filters, which are used for the individual dates selected from a list. It is recommended to avoid discrete date filters.

  5. Replace quick filters with action filters: Instead of using multiple quick filters, Only Relevant Values option, or quick filters with too many values, use actions filters as they do not require Tableau to run additional queries. These filters work on the users actions, such as clicking on a mark. Action filters can also operate as cascading filters in a filter hierarchy, where values are filtered out as they traverse through the hierarchy.

  6. Avoid using action filters from several sheets for a single dashboard layout as quick filters: Creating visualizations in different worksheets to use them later as action filters in a single dashboard generates extra load every time the dashboard is loaded, as the visualizations are refreshed with every action from the user. When this situation arises, quick filters may be a better solution as they are only loaded once when the dashboard is loaded, and then the filter is applied across all the sheets simultaneously.

SQL Code

  1. Limit the use of customer SQL code in live connections: SQL connections are issued to the database inside a subquery, which can include other clauses from Tableau like GROUP BY, ORDER BY, WHERE and more. Even with efficient SQL code, the extra clauses issued to the database can slow down performance. Use custom SQL only if Tableau cannot generate the desired outcome.

    If a SQL command is necessary, then create a view inside the database and connect to it from Tableau. If it is not possible to create a new view, build a data extract with the SQL code. It will run only once when the extract is built or refreshed, minimizing the effect on visualization performance.

  2. Remove extra clauses: Effective SQL code provides Tableau the required data to produce the desired outcome. Remove extra clauses in the SQL code for Tableau to organize and visualize data effectively.

Calculation

  1. Consider data types for faster calculation: Tableau provides a massive list of functions, divided into different categories that assist in creating calculated fields. The performance impact from different data types may be unnoticeable on smaller data sets, but these differences are more pronounced as the number of records increase. In general, the fastest calculations involve Boolean or number data types, followed by dates, and finally string calculations. It is important to consider ways to achieve same calculated results using faster data types.

  2. Avoid blended calculations: Blended calculations occur when you have to query different data sources to obtain a single calculated field in your visualization. In this case, Tableau needs to query each data source separately to retrieve the values. This can affect performance, especially in large data set. An alternative is to prepare a new view on the data layer on the database server to keep data processing outside Tableau.

  3. Avoid row-level calculations involving parameters: Row-level or record-level calculations operate on every record in the underlying data. Every row calculation consumes time, but when the parameter contains a significant variety of values, for example a table as parameter, it increases time processing exponentially.

Rendering

  1. Avoid high mark counts: Marks are the points, plots, or symbols on the visualizations. Each mark must be created and positioned before the report can be rendered. Use the capabilities of Tableau to drill down or interact with visualizations to obtain better results.

  2. Minimize the file size of images or custom shapes: Big images or shapes result in slow loading and exporting process. Keep images below 50Kb, 32x32 pixels in dimensions, and use efficient image compression formats to reduce load time.

Implementing TDC

Tableau Datasource Customization (TDC) helps optimize the interaction of the Tableau dashboard with a SingleStore cluster. Since Tableau is designed to create TEMP tables, it can sometimes result in performance issues. TDC can stop creation of these tables, along with other configuration settings.

Last modified: October 26, 2023

Was this article helpful?