Tableau Prep + Python 101

Viviana Márquez
3 min readApr 22, 2020

--

In this post, I will show you how to integrate Python scripts to your Tableau Prep flow. If you’re curious about how to integrate Python to Tableau Desktop, then go to my previous article here.

Tableau Prep ≠ Tableau Desktop

Yes, me too. Until recently, I also was not aware of its existence. So, what is Tableau Prep? It is a data preparation tool allows users to cleanse, aggregate, merge, and do other preparations to their data before it goes to Tableau Desktop for analysis.

Talk Python to me 🐍

That being said, you can also use Python in Tableau Prep to further prepare your data and here I’ll show you how to do it.

First, you have to be running tabpy on your server (click here to see how).

Next, you’ll load your data as you normally would and then click on Script.

Once you do that, click on Tableau Python (TabPy) Server and put in your credentials. If you’re using your local machine, then just do Server: localhost and Port: 9004 and leave the rest empty.

Then you’ll click on browse to find your Python script and specify the function name where you’re doing all the transformations.

What about the script?

Creating the script is somewhat cumbersome and I hope Tableau Prep improves this in future releases. But for now, here’s what you have to keep in mind:

  • You can have as many functions as you wish, but be sure to have a function that will return your desired data frame output. This function is the one you’ll call in the Function Name field on Tableau Prep. In this case, I called it my_fun.
  • If you’re going to add more columns than what you originally had, you have to create a function called get_output_schema where you specify for all your column their types exactly as shown below. I know, annoying.
import pandas as pd
import numpy as np

def sqrt_root(number):
return np.sqrt(number)

def my_fun(df):
df['SqrtRevenue'] = df['Revenue'].apply(lambda n: sqrt_root(n))
return df

# Create Schema
def get_output_schema():
return pd.DataFrame({
'Store': prep_string(),
'Revenue': prep_int(),
'SqrtRevenue': prep_decimal()
})
  • TRICK:
    In case you’re only adding a couple of columns, here’s a trick so you don’t have to create the get_output_schema function. Before loading your Script on Tableau Prep, add a Clean Step where you create a new column (calculated metric) with just a “0” in it, and name it exactly as your new column on the script. Tadah!

That’s all for now, thank you for reading! Feel free to drop questions or comments in the box below if you’d like to learn something new about Tableau!

--

--

Viviana Márquez

🤖👩‍💻 EdTech Founder @MissFactorial 📊🎞️ Making AI & Data as captivating as your favorite series 🎤 Miss Sucre 2021 & TV/Radio Host