Packing Our Gear: Data Preparation With Tableau Prep BuilderBy: Greg Herzing | July 25th, 2022
“If you wanna make ETL a better thing, take a look at your tools and make a change! Shamong.” – Greg Herzing
Have you ever heard the words “change is hard?” It’s a totally human way to feel, but no matter how many times you hear those words, and no matter how many people you know feel the same way, it doesn’t make changing things any easier.
Many times, the motivation for change comes down to one thing. Pain.
That’s why you’re here today. You want to alleviate your pain. The pain that is data manipulation and analysis in Microsoft Excel. You want a new and better way to prepare and explore your data, and you know that there exists somewhere a better way to share and explain your data.
Every data analyst has their list of frustrations and pain points:
- Painstakingly collecting data from storage locations buried deep beneath the top-level file folders
- Repeatedly opening countless spreadsheets, copying data, pasting data into your special spreadsheet
- Dutifully cleaning the data that someone else entered as fast as humanly possible just to make it usable
- Creatively tying that data together to glean something meaningful to you or to someone you work with
- Expertly exploring that data in tables, pivot tables, and charts provided by the enigmatic Chart Wizard
- Dexterously weaving the meaning you’ve found into some ponderous fabric that you can save somewhere in a cedar box until you’re ready to tell the story
- And everyone’s favorite, taking little snips and pasting them all into a Word doc, a PowerPoint, or an email just to send them off to an audience like a message in a bottle
Was your message received? Did anyone pull that scroll out and get your message? Did they understand it? Is it what they wanted in the first place?
The repetitive tasks, the painstaking stewardship, the frustration of pouring your energy into a story that may never be read or appreciated is… well, it’s painful.
But I can help. Will you come with me on this journey past Excel? Be a pioneer with me. Step out from behind the not-so-friendly confines of your ABC 123 grid and discover a new possibility with me. Be a data scout.
Let’s explore Tableau.
Every good scout knows that preparation is the key to success. As a data scout you know that step in the process is called extract, transform, and load or ETL. It’s the magical manipulation of data sets large and small that makes good analysis possible.
In Excel we’re talking about sorting columns, renaming columns, homogenizing cell contents, writing dense formulas, and crudely but effectively joining data sets together with everyone’s favorite function… VLOOKUP. And of course, you must do this every time you need to perform your analysis.
If only there existed a tool that you could show how to do all of that for you the same way every time and perhaps even do it on a schedule!? Tableau has that covered. The first step on our Journey from Excel finds us in the welcoming arms of Tableau Prep Builder.
Tableau Prep Builder (Prep) is a data scout’s Swiss Army knife. In your savvy hands Prep can tackle your laborious Excel ETL tasks in a reliable, repetitive manner and much more. And don’t forget, if you lose your way, pick up that Walkie-Talkie and radio CoEnterprise for help.
Now, let’s use Prep to prepare for our journey beyond Excel ETL. Let’s pack for our hike to a new ecosystem. The Tableau ecosystem. Let’s start with food, shall we?
This is a virtual walk through our home. We have food in three different rooms of our house: kitchen, pantry, and basement. We need to combine them into our pack and identify what we don’t want to take with us.
Here’s how we connect to our food data: We’ll bring in each one of these tables and plan our meals for the trip!
Data Ingestion & Combination
- Click, “Connect to Data” or “Connections +”, then click “Microsoft Excel”
- Navigate to our data source
- Click and drag these rooms, or data tables, onto your blank white canvas.
- Here at the data ingestion step (which precedes the food digestion step), you can control what data fields are imported using the Input Pane. Anything you don’t require you can unclick in the first column of the table. Values can be filtered, data types can be changed, and field names can be edited.
- We’ll bring all of our food options together with a Union.
- This is equivalent to cutting and pasting your identical columns into the same spreadsheet in Excel.
- Add the Union tool, then click and drag each table to the union and ADD
- Then look at the results of your union.
- The color key allows a data scout to identify mismatched fields.
- Two tables (Blue and Green) contain a column named “Healthiness”, but the Yellow table representing our Kitchen.
- We can step back to data ingestion and rename the field, so it matches the others.
- In Excel you might have to ask a data owner to correct their spreadsheet or correct it yourself. Here in the Prep Builder data flow the change is made for you every time once you set it up!
- “Collect the food.” before the field name change.
- In Excel you’re CTRL + Fing for this field name, changing it. In Prep Builder, you know right where to look!
- “Collect the food.” after the field name change. PRESTO CHANGO! That’s “Magicianish” for “fixed”.
Rationalize Our Data…I Mean Food
- It looks like my kids were filling out the “Category” column in our data source. Lets GROUP the misspelled words into a single category.
- In Excel this is multiple “Find and Replaces”. What a pain. Just type right people!
- Now let’s get practical about the amount we can lug through the forest. Who thought a frozen turkey was a good idea to bring along? And potatoes? Those things are all water! We’ll FILTER them out of our packing flow.
- And while we’re at it, let’s not pack anything that we consider to be unhealthy. We’ll EXCLUDE anything with a low “Healthiness” score.
- In Excel we’re sorting and searching our columns for the values we don’t want then deleting them permanently. In Prep Builder we know what happened to our data and why it happened so that we’re able to step back in the flow any time and reverse it.
- FILTERING heavy stuff.
- Fruit Roll Ups?! GONE!
- Let’s see how much this pack weighs. We’ll create a CALCUALTED FIELD that multiplies the weight of a single item by the quantity of that item to figure out how much each food item weighs in total. And we can do the same for calorie count because we don’t want to starve out there!
- In Excel this is similar only in Prep Builder we only do it once and it will be applied to all new data fields that enter the flow without the need to copy/paste or fill rows down.
- We’re doing this in preparation for an AGGREGATION step.
Aggregation & Final Tuning
- So how much does that pack weigh? “Weight Check” will help us answer that question.
- The pack weighs… 24.4 LBS?!!? We’re going to have to take some things out. I guess it’ll have to be the beer. *sigh*
- We can also add group by fields to see what the weight and calorie totals are for any categorical data type. The function by which we aggregate can also be changed.
- Beer’s gone, new weight is acceptable, but I think we can manage some campfire relaxation. Let’s bring in a quick new data set, perform all of our other manipulations on it, and add one last minute item to the pack before we head off on our grand expedition!
- These AGGREGATE tools are just like pivot tables in Excel only easier to manage and react to any changes in data ingestion without having to refresh them and refilter them.
- Here it is! Our data output step and our final packing list for our Journey Past Excel. Yes, this list is set to export to Excel right now… that’s called irony.
- HOWEVER! This output step can also be configured to output to a Tableau Extract (*.hyper) file, a database table, or published to at Tableau Server or Cloud instance!
- Database options range from on-prem SQL Servers to Cloud-based data warehouses like Snowflake!
- In Excel you’re going to need some coding wizardry to do any of this fancy output stuff… that is, other than just saving your spreadsheet. Prep Builder makes it very easy to send data to a plethora of destinations, and every time you want to run this ETL flow you just push “Play”.
- After set up, how long is this going to take us to run in Prep Builder next Monday?
- 00:00 seconds! That spells more productivity… or a longer stroll through the cafeteria as it may be.
Don’t forget to click back here in another month or so to hit the trail toward Excel exoneration when we recheck our pack, identify the landmarks that dot our path, calculate the time it will take us to hike between them, and plot the overall time spent exploring in “Exploration: The Hike to Base Camp.”
Want to take your Tableau projects to the next level?
Learn more about how our experts can help take you analytics project to the next level.
Subscribe to our resources!
Sign up to receive our latest eBooks, webinars, blog posts, newsletter, event invitations, and much more.