Pandas Assign to Add a Column
How do you add a column to a pandas DataFrame using assign?
The documentation for assign is under pandas.DataFrame.assign. It includes a couple of examples in which you have a temperature column in Celsius and want to add a column in Fahrenheit. The formula is F = C × 9/5 + 32. Assuming your DataFrame is called df, here is an example. The new column will be called temp_f. Its values are computed from the existing column temp_c using the formula.
df.assign(temp_f = df['temp_c'] * 9 / 5 + 32)
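To make the one-liner above easier to try, here is a minimal, self-contained sketch. The sample cities and temperatures are made up for illustration; only the column names temp_c and temp_f come from the text. It also shows that assign returns a new DataFrame rather than changing df itself.

```python
import pandas as pd

# Hypothetical sample data; only the column name temp_c comes from the post.
df = pd.DataFrame({
    "city": ["Oslo", "Cairo", "Lima"],
    "temp_c": [0.0, 35.0, 20.0],
})

# assign returns a new DataFrame with the extra column.
with_f = df.assign(temp_f=df["temp_c"] * 9 / 5 + 32)

print(with_f)

# The original DataFrame is left untouched.
print("temp_f" in df.columns)
```

To keep the new column, store the result, for example with df = df.assign(...).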
The syntax is not complex. Let’s create a new function that does absolutely nothing, just for illustration.
def do_nothing(my_string):
    return my_string

df.assign(new_col = do_nothing('abc'))
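The above code returns a copy of the DataFrame (called df) with a new column called new_col. What’s in the new column? Every row has the three characters ‘abc’. Note that assign does not modify df in place, so assign the result back to df if you want to keep the column.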
We can add more than one column with assign(). The following code works.
df = df.assign(ab = '1', dc = '2')
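Beyond constant values, assign also accepts a callable (such as a lambda) for each new column. The callable receives the DataFrame being built, so a later column can refer to one created earlier in the same call. The sketch below assumes a reasonably recent pandas version; the sample data and the is_hot column with its 90-degree threshold are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"temp_c": [0.0, 100.0]})

df = df.assign(
    # First new column, computed from an existing column.
    temp_f=lambda d: d["temp_c"] * 9 / 5 + 32,
    # Second new column, computed from temp_f created just above.
    is_hot=lambda d: d["temp_f"] > 90,
)

print(df)
```

Using lambdas here is a design choice: it lets the second column depend on the first without writing two separate assign calls.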
Without assign
We can create a new column without using assign. In this example we’ll divide one column by another.
df['new_col'] = df[col1] / df[col2]
We can also round the result. Here is the same division, rounded to three decimal places.
df['new_col'] = round(df[col1] / df[col2], 3)
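Here is a small runnable sketch of the same idea without assign. The column names and numbers are hypothetical, and col1 and col2 are assumed to be variables holding column names. It uses the Series.round method, which is the method form of the rounding shown above.

```python
import pandas as pd

# Hypothetical sample data and column names.
df = pd.DataFrame({
    "profit": [1.0, 50.0, 10.0],
    "sales": [3.0, 250.0, 80.0],
})
col1, col2 = "profit", "sales"

# Divide one column by another, round to 3 decimal places,
# and store the result directly as a new column (modifies df in place).
df["new_col"] = (df[col1] / df[col2]).round(3)

print(df)
```

Unlike assign, this form changes df directly, so there is no need to store a returned DataFrame.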
Let’s do some comparisons.
SQL
How do you add a column in SQL?
ALTER TABLE table_name ADD column_name datatype;
After creating the column, you can use UPDATE … SET to fill it with data. Here’s a post on this website called SQL Add Column.
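As a rough sketch of the two SQL steps, the following uses Python’s built-in sqlite3 module with an in-memory database. The table name readings and its sample rows are made up for illustration, and the statements use SQLite’s dialect, so details may differ in other database systems.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (city TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("Oslo", 0.0), ("Cairo", 35.0)],
)

# Step 1: add the new, initially empty column.
conn.execute("ALTER TABLE readings ADD COLUMN temp_f REAL")

# Step 2: fill it from the existing column.
conn.execute("UPDATE readings SET temp_f = temp_c * 9 / 5 + 32")

rows = conn.execute(
    "SELECT city, temp_c, temp_f FROM readings ORDER BY city"
).fetchall()
print(rows)

conn.close()
```

Compared with pandas, SQL needs two statements here: one to define the column and one to populate it.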
DAX
If you are working with DAX, see the post on this website called Calculated Columns Introduction.
Tableau
In the visualization software Tableau, adding a column is easy. These columns are called calculated fields. Calculated fields reference other fields (columns) in your data. For example, if you had a column called Sales and a column called Profit, you could create a calculated field called ‘Profit Ratio’ by dividing Profit by Sales. How do you accomplish this? Click the drop-down arrow at the top of the Data pane (on the left). Select Create Calculated Field. A window opens where you can name the field and type your calculation.