Uncovering the Value of Tableau’s Workbook XML Metadata
By: Pasquale Gatti | February 27, 2018
Tableau, at its core, is a visual tool. While its main purpose is to serve as an industry-leading data visualization platform, Tableau offers many other ways to help you get the most out of your data. One of those ways is by providing metadata, in the form of a structured XML tree, in the un-packaged versions of its workbooks. In this post, we’ll use a workbook built on Superstore datasets to highlight some of the major elements of this XML tree structure and illustrate some of the ways you can access and extract valuable information from this Tableau workbook metadata.
Those un-packaged workbooks, .twb files, are essentially XML files containing any and all information on what has been produced and built in the workbooks themselves. They don’t contain any of the data actually being visualized in these workbooks, but rather metadata on those workbooks. Anything you click, drag, drop or do while working in Tableau Desktop will be reflected in some way in this XML file.
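Because a .twb file is plain XML, any standard XML parser can read it. The following is a minimal sketch in Python, not a definitive recipe. The sample string is a hypothetical, heavily simplified stand-in for a real workbook file, and the dashboard name "Overview" is invented for illustration; real files contain many more elements, and names can differ between Tableau versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the contents of a .twb file.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource caption='Sample - Superstore' name='superstore' />
  </datasources>
  <worksheets>
    <worksheet name='Customer Ranking' />
  </worksheets>
  <dashboards>
    <dashboard name='Overview' />
  </dashboards>
</workbook>
"""

def top_level_sections(twb_text):
    """Return the tag names of the workbook's direct children, in document order."""
    root = ET.fromstring(twb_text)
    return [child.tag for child in root]

sections = top_level_sections(SAMPLE_TWB)
print(sections)
```

For a workbook saved on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`, where `path` points at your own .twb file.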
Most of the information available in this XML data, such as the description of a particular field, a calculated field’s formula, or a custom SQL query feeding the workbook its data, is accessible when the workbook is open in Tableau Desktop. Having that information available in XML form means it can also be accessed and organized programmatically, so users can script a solution tailored to their own needs.
This XML metadata has information on all the things that make Tableau go: data sources, worksheets, dashboards, filters, actions, parameters, and everything in between. It provides information about the data connections in the workbook and about each of the fields that make up those data sources.
The two screenshots below allow a comparison between what the user sees in the data source tab in Tableau Desktop and how that information is stored in the underlying XML metadata. The connection element is where the XML metadata indicates what type of data source is being used and whether it’s a live connection or an extract. If a database is involved, it also records the tables being queried and/or the custom query being executed to produce the data source.
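As a rough illustration of reading that connection information with a script, here is a sketch in Python. The XML fragment is hypothetical, and the element and attribute names (`connection`, `class`, `relation`, `extract`, `enabled`) are assumptions modelled on typical .twb files; check them against your own workbook before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element and attribute names are assumptions.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource caption='Sample - Superstore' name='superstore'>
      <connection class='excel-direct' filename='Superstore.xls'>
        <relation name='Orders' table='[Orders$]' type='table' />
      </connection>
      <extract enabled='true' />
    </datasource>
  </datasources>
</workbook>
"""

def describe_connections(twb_text):
    """Summarize each data source: connection type, live vs. extract, relations."""
    root = ET.fromstring(twb_text)
    rows = []
    for ds in root.findall("./datasources/datasource"):
        conn = ds.find("connection")
        if conn is None:
            continue  # nothing to report for this data source
        extract = ds.find("extract")
        is_extract = extract is not None and extract.get("enabled") == "true"
        rows.append({
            "datasource": ds.get("caption") or ds.get("name"),
            "connection_class": conn.get("class"),
            "mode": "extract" if is_extract else "live",
            "relations": [(r.get("type"), r.get("name")) for r in conn.iter("relation")],
        })
    return rows

connections = describe_connections(SAMPLE_TWB)
for row in connections:
    print(row)
```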
Beyond the data source level, the XML metadata also offers detailed information on the fields making up each data source. It stores this field-level metadata in multiple kinds of elements, some in “column” elements and others in “metadata-records” elements. Below is a screenshot showing the “metadata-record” element in the workbook’s XML metadata and, below that, what Tableau Desktop displays when you ask it to describe the corresponding field (right-click a field in the list of “Dimensions” or “Measures”, then select “Describe…”).
As is the case for numeric fields generally, this element holds statistics on the field’s distribution of values: minimum, maximum, average, standard deviation, etc. The final screenshot in this group shows what a CSV file collecting this metadata might look like, including the information on the Postal Code field in row 11.
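One way such a CSV could be produced is sketched here in Python. The fragment is hypothetical, and the child element names listed in `FIELDS` are assumptions based on typical .twb files. The sketch pulls only a handful of descriptive children; any statistics your workbook stores would need their own entries in `FIELDS`, using whatever names appear in your file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical fragment; child element names are assumptions.
SAMPLE_XML = """
<connection class='excel-direct'>
  <metadata-records>
    <metadata-record class='column'>
      <remote-name>Postal Code</remote-name>
      <local-name>[Postal Code]</local-name>
      <parent-name>[Orders]</parent-name>
      <local-type>integer</local-type>
      <aggregation>Sum</aggregation>
    </metadata-record>
    <metadata-record class='column'>
      <remote-name>Sales</remote-name>
      <local-name>[Sales]</local-name>
      <parent-name>[Orders]</parent-name>
      <local-type>real</local-type>
      <aggregation>Sum</aggregation>
    </metadata-record>
  </metadata-records>
</connection>
"""

# Child elements to export, one CSV column each.
FIELDS = ["remote-name", "local-name", "parent-name", "local-type", "aggregation"]

def metadata_records_to_csv(xml_text):
    """Flatten every metadata-record element into one CSV row."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in root.iter("metadata-record"):
        # Missing children become empty cells rather than raising an error.
        writer.writerow({f: record.findtext(f, default="") for f in FIELDS})
    return buffer.getvalue()

csv_text = metadata_records_to_csv(SAMPLE_XML)
print(csv_text)
```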
Worksheets & Dashboards
Another couple of core elements in a workbook’s XML metadata are those containing information on its worksheets and dashboards. Within the worksheet elements, you can find information on the data source(s) behind the visualizations on particular worksheets. You can also find which fields within those data sources are active and how they are being used: as filters, as parameters, on rows, on columns, etc.
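A script can collect the same worksheet details. The Python sketch here works on a hypothetical fragment; the nesting (`table`, `view`, `rows`, `cols`, `filter`) and the field reference syntax are assumptions modelled on typical .twb files and may not match every Tableau version.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; structure and field references are assumptions.
SAMPLE_TWB = """
<workbook>
  <worksheets>
    <worksheet name='Customer Ranking'>
      <table>
        <view>
          <datasources>
            <datasource caption='Sample - Superstore' name='superstore' />
          </datasources>
          <filter class='categorical' column='[superstore].[none:Region:nk]' />
        </view>
        <rows>[superstore].[none:Customer Name:nk]</rows>
        <cols>[superstore].[sum:Sales:qk]</cols>
      </table>
    </worksheet>
  </worksheets>
</workbook>
"""

def summarize_worksheets(twb_text):
    """Map each worksheet name to its data sources, filters, rows and columns."""
    root = ET.fromstring(twb_text)
    summary = {}
    for ws in root.findall("./worksheets/worksheet"):
        summary[ws.get("name")] = {
            "datasources": [d.get("caption") or d.get("name") for d in ws.iter("datasource")],
            "filters": [f.get("column") for f in ws.iter("filter")],
            "rows": (ws.findtext(".//rows") or "").strip(),
            "cols": (ws.findtext(".//cols") or "").strip(),
        }
    return summary

worksheets = summarize_worksheets(SAMPLE_TWB)
print(worksheets)
```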
In the screenshots below, we see in the top image information on a particular worksheet in XML format, including its name (“Customer Ranking”), the data source it uses (“Sample – Superstore”), some of the filters that appear on the worksheet, and some of the formatting changes made to the worksheets, including the fact that the field labels for rows are hidden. All of these worksheet characteristics are reflected in the corresponding Desktop screenshot (bottom image).
Here is a similar set of views, this time showing how dashboards look in Tableau Desktop juxtaposed with how they appear in their workbook’s XML metadata. Among the items included are the title of the dashboard and how it’s formatted and structured, information on the worksheets and objects (e.g. text boxes and containers) making up the dashboard, and the dashboard’s dimensions. Again, all of these features are also captured in the image from a CSV file shown.
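Dashboard details can be gathered in a similar way. This Python sketch assumes a hypothetical fragment in which each dashboard has a `size` element and a set of `zone` elements, where zones that carry a `name` refer to worksheets and zones that carry a `type` are other objects. Those conventions, and the dashboard name "Overview", are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the size/zone conventions are assumptions.
SAMPLE_TWB = """
<workbook>
  <dashboards>
    <dashboard name='Overview'>
      <size maxheight='800' maxwidth='1000' minheight='800' minwidth='1000' />
      <zones>
        <zone id='1' name='Customer Ranking' x='0' y='0' w='50000' h='100000' />
        <zone id='2' type='text' x='50000' y='0' w='50000' h='100000' />
      </zones>
    </dashboard>
  </dashboards>
</workbook>
"""

def summarize_dashboards(twb_text):
    """List each dashboard's name, size attributes, worksheets and other objects."""
    root = ET.fromstring(twb_text)
    result = []
    for db in root.findall("./dashboards/dashboard"):
        size = db.find("size")
        zones = list(db.iter("zone"))
        result.append({
            "name": db.get("name"),
            "size": dict(size.attrib) if size is not None else {},
            "worksheets": [z.get("name") for z in zones if z.get("name")],
            "other_objects": [z.get("type") for z in zones if z.get("type")],
        })
    return result

dashboards = summarize_dashboards(SAMPLE_TWB)
print(dashboards)
```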
These are just a few of the ways the underlying workbook XML can provide valuable information. That information has many potentially beneficial applications, including:
- Streamlining of the documentation creation process
- Better understanding of how a workbook is structured and organized
- Simplification of the process of generating fake data that mimics the real data in a workbook
- Better understanding of the logic behind the workbook’s calculated fields
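As an example of the documentation and calculated-field use cases in the list above, here is a Python sketch that inventories calculated fields and their formulas. The fragment is hypothetical; the assumption is that a calculated field appears as a `column` element with a nested `calculation` element carrying a `formula` attribute, which is how typical .twb files look but is worth verifying against your own workbook.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the column/calculation layout is an assumption.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource caption='Sample - Superstore' name='superstore'>
      <column caption='Profit Ratio' datatype='real' name='[Calculation_1]' role='measure'>
        <calculation class='tableau' formula='SUM([Profit])/SUM([Sales])' />
      </column>
      <column caption='Region' datatype='string' name='[Region]' role='dimension' />
    </datasource>
  </datasources>
</workbook>
"""

def list_calculated_fields(twb_text):
    """Return (data source, field, formula) for every column that has a formula."""
    root = ET.fromstring(twb_text)
    fields = []
    for ds in root.findall("./datasources/datasource"):
        for col in ds.findall("column"):
            calc = col.find("calculation")
            if calc is not None and calc.get("formula"):
                fields.append((
                    ds.get("caption") or ds.get("name"),
                    col.get("caption") or col.get("name"),
                    calc.get("formula"),
                ))
    return fields

calculated_fields = list_calculated_fields(SAMPLE_TWB)
for source, field, formula in calculated_fields:
    print(f"{source} | {field} | {formula}")
```

The resulting tuples could be written to a CSV or a wiki page to seed workbook documentation.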
It is useful to know that every move you make when building a dashboard – from importing the data, to calculating, filtering, formatting and visualizing it – is captured in the workbook metadata, giving you access to any piece of information about a workbook you might want.