Exploring Tableau Desktop’s Revolutionary New Data Model: Part I
By: Tyler Zysberg | June 3, 2020
Tableau 2020.2 has arrived, and as always it is full of new and exciting features. With a radical revision of the Data Source and Analysis tab, however, this version of Desktop contains changes as significant as any Tableau has previously released.
A data model is a set of instructions for connecting tables in and across databases. Every data source you have ever used in Tableau depended on a fixed data model, whether it was configured automatically by Tableau, or manually constructed by a user. This is all about to change in the newest version of Tableau. In this post, we’ll give you an inside peek into how this new functionality will help you create data sources in a more robust and efficient way.
Data Model Structure
Prior to 2020.2, for each project you built in Tableau Desktop, you first had to create a data model by specifying all the required tables from your database(s) and designating exactly how they were related to each other. Such data models could take considerable effort to plan and execute, and once complete, they were relatively inflexible.
With the new data modeling feature, Tableau augments or even replaces fixed joins and unions by allowing you to specify the core relationships between tables. This lets the data engine generate SQL queries dynamically, based on the fields you use to create each view. These queries are created on the fly and automatically use only the tables that are required for each view. This is significantly different from the previous method of creating a visualization, which required the data engine to query all the elements in a data source before creating the visualization, whether they were related to the view or not.
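To make the idea of view-driven queries concrete, here is a minimal sketch in Python. It is not Tableau's query generator: the table names, field names, join keys, and the helper functions `tables_for_view` and `build_query` are all hypothetical, and the model is simplified to a single chain of related tables. It only illustrates how the fields in a view can determine which tables end up in the query.

```python
# Hypothetical model: three tables related in a simple chain.
CHAIN = ["customers", "orders", "order_items"]

# Hypothetical mapping from each field to the table that owns it.
FIELD_TABLE = {
    "customer_name": "customers",
    "order_date": "orders",
    "sales": "order_items",
}

# Hypothetical key shared by each pair of adjacent tables.
JOIN_KEYS = {
    ("customers", "orders"): "customer_id",
    ("orders", "order_items"): "order_id",
}


def tables_for_view(fields):
    """Return only the tables needed to serve the fields in a view.

    Tables between the first and last required table are included
    because they are needed to connect the two ends of the chain.
    """
    positions = [CHAIN.index(FIELD_TABLE[f]) for f in fields]
    return CHAIN[min(positions): max(positions) + 1]


def build_query(fields):
    """Build a SQL string that touches only the required tables."""
    tables = tables_for_view(fields)
    columns = ", ".join(f"{FIELD_TABLE[f]}.{f}" for f in fields)
    sql = f"SELECT {columns} FROM {tables[0]}"
    for left, right in zip(tables, tables[1:]):
        key = JOIN_KEYS[(left, right)]
        sql += f" JOIN {right} ON {left}.{key} = {right}.{key}"
    return sql


# A view that uses one field needs one table and no joins.
print(build_query(["order_date"]))

# A view that uses fields from two tables needs exactly one join.
print(build_query(["customer_name", "order_date"]))
```

In this sketch, a view that uses only `order_date` never touches `customers` or `order_items`, which is the kind of saving the new model is aiming for.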
In the new Data Source pane, you will notice that a data source now has two layers: logical and physical. In the newly introduced logical layer, you create relationships between tables; each relationship is drawn as a flexible connecting line known as a “Noodle.” For each relationship, you can change the cardinality as well as the referential integrity assumptions. Each logical table also contains at least one physical table, and within that physical layer you can create joins between tables just as in older versions of Tableau.
The new data model offers several advantages:
- Improves performance. The most obvious advantage is a gain in the speed of your workbooks. The data engine queries only the tables and creates only the joins needed for each visualization, making the SQL queries shorter and dashboards faster.
- Intelligently defines relationships based on visualization. Tableau will now understand what is being displayed in the visualization and create the right relationship between tables to support that view. Each table is stored independently rather than requiring fixed joins between them, allowing the data model to reconfigure itself dynamically for each view.
- Prevents duplicated records. Because the relationship between tables is resolved at the visualization level, duplication of records is no longer an issue, which will alleviate the need for many LOD calculations. You will still want to use LODs for things such as multi-level aggregation (e.g., averaging a set of sums) and cohort analyses (e.g., grouping customers by their first order date).
- Eliminates problems with data blending. In the old data model, blending data sources could create challenges for users. It was often difficult, or even impossible, to create certain LOD calculations or use functions like Distinct Count across the blend. Blends would sometimes give rise to performance issues with granular data, and they could not be published as data sources. In the new data model, many situations that previously required blending can now be handled with relationships, so these issues should be far less troubling.
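The duplication problem mentioned in the list above is easy to reproduce outside Tableau. The sketch below uses Python's built-in `sqlite3` module with made-up `orders` and `order_items` tables; the data and the function names are assumptions for illustration only. It contrasts a fixed join, which repeats each order's shipping cost once per line item, with aggregating each table at its own level of detail, which is roughly the behavior relationships are meant to provide.

```python
import sqlite3

# In-memory database with two hypothetical tables at different
# levels of detail: one row per order, and one row per line item.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, shipping_cost REAL);
    CREATE TABLE order_items (order_id INTEGER, product TEXT, sales REAL);
    INSERT INTO orders VALUES (1, 10.0), (2, 5.0);
    INSERT INTO order_items VALUES
        (1, 'pen', 3.0), (1, 'ink', 7.0), (1, 'pad', 4.0),
        (2, 'desk', 120.0);
    """
)


def total_shipping_fixed_join(conn):
    """Sum shipping cost after a fixed join to the line items.

    Order 1 has three line items, so its shipping cost is counted
    three times and the total is inflated.
    """
    row = conn.execute(
        "SELECT SUM(o.shipping_cost) "
        "FROM orders o JOIN order_items i ON o.order_id = i.order_id"
    ).fetchone()
    return row[0]


def totals_per_table(conn):
    """Aggregate each table at its own level of detail.

    Neither measure is affected by the row count of the other table.
    """
    shipping = conn.execute("SELECT SUM(shipping_cost) FROM orders").fetchone()[0]
    sales = conn.execute("SELECT SUM(sales) FROM order_items").fetchone()[0]
    return shipping, sales


print(total_shipping_fixed_join(conn))  # 35.0, inflated by the join
print(totals_per_table(conn))           # (15.0, 134.0), the true totals
```

With a fixed join, getting the correct shipping total of 15.0 would typically require a workaround such as an LOD calculation; aggregating each table separately avoids the problem in the first place.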
Tips to Keep in Mind
- Relationships cannot be defined on calculated fields. Tableau will not intelligently define a relationship from a user-created calculated field. If this is necessary, it must be done in the physical layer.
- Published data sources cannot be related to each other. If you want to combine data from published data sources, blends are currently your only option.
- To edit relationships and performance options on the Data Source page, the data source must be embedded in the workbook. It is not possible to edit the data model of a published data source, either on the web or in Tableau Desktop.
The new model makes connecting to data simpler and more intuitive, yet more flexible than ever before. At the same time, the new approach manages to improve performance. We see this as one of the biggest changes to the Tableau Desktop user experience in recent memory. In our next post, we’ll dig deeper into the new data model with a demonstration highlighting the differences between the old and new data modeling features.
To read more about this topic and download the beta, please visit the Tableau official website.