Rename a DataFrame Column


As a data professional, do you have a dataset in Python whose columns need renaming? Why would you want to do this? One reason is to make all of the columns in the dataset (a pandas DataFrame) consistent. You might choose to rename everything into snake_case, the naming convention in which words are written in lowercase and each space is replaced with an underscore (_) character. You might also find that one or more columns are misspelled, in which case you can rename them. Or you might want to shorten or lengthen the column names; just avoid ambiguity when doing so. All of this is part of the data cleaning step.

How do you rename a pandas DataFrame column? Let’s look at an example. I created a project called Rename DataFrame Columns in Jupyter Notebook with Anaconda Navigator, and this is what it looks like.

import pandas as pd

# manually create a new DataFrame
data = {'name': ['Bob', 'Sally', 'Suzie', 'Rowan'],
        'num': [12, 67, 33, 41],
        'category': [2, 2, 4, 5]}
df = pd.DataFrame(data)
df

Now we’ll rename two of the columns and leave the third column alone.

# rename two columns
df = df.rename(columns={'name': 'firstname', 'num': 'amount'})
df
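
As a quick check on the rename above, the sketch below rebuilds the same DataFrame and confirms the new column names. It also shows the errors parameter of rename: by default, a key that does not match any column is silently ignored, while errors='raise' turns that into a KeyError. The misspelled key 'nmae' is a made-up example, not something from the notebook.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Sally', 'Suzie', 'Rowan'],
                   'num': [12, 67, 33, 41],
                   'category': [2, 2, 4, 5]})

# rename returns a new DataFrame; the original is left unchanged
renamed = df.rename(columns={'name': 'firstname', 'num': 'amount'})
print(list(renamed.columns))

# hypothetical typo: 'nmae' is not a column, so by default nothing is renamed
unchanged = df.rename(columns={'nmae': 'firstname'})
print(list(unchanged.columns))

# errors='raise' reports the bad key instead of silently ignoring it
try:
    df.rename(columns={'nmae': 'firstname'}, errors='raise')
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```

Using errors='raise' is a reasonable habit when fixing misspelled column names, since a typo in the mapping itself would otherwise go unnoticed.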

Rename an Unnamed Column

You are working with a pandas DataFrame and you’ve noticed that the first column is Unnamed: 0. It looks like it might be an Id column. You also wonder if the CSV you imported didn’t have a column name specified for the first column. Either way, you want to give a name to the unnamed column.

Suppose we have a DataFrame called df. The unnamed column is called Unnamed: 0. The new column name that we want to use is id.

df.rename(columns={'Unnamed: 0': 'id'}, inplace=True)
df
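
To see where an Unnamed: 0 column can come from, here is a small self-contained sketch. It assumes the CSV was written with the DataFrame index included, which is one common cause; the two-row DataFrame and the in-memory CSV text are stand-ins made up for illustration.

```python
import io
import pandas as pd

original = pd.DataFrame({'name': ['Bob', 'Sally'], 'num': [12, 67]})

# by default, to_csv writes the index as a first column with an empty header
csv_text = original.to_csv()

# reading it back gives that column the placeholder name 'Unnamed: 0'
df = pd.read_csv(io.StringIO(csv_text))
before = list(df.columns)
print(before)

# give the placeholder column a real name
df.rename(columns={'Unnamed: 0': 'id'}, inplace=True)
print(list(df.columns))
```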

Alternatively, you can avoid the unnamed column at import time. Use the read_csv function from the pandas library and set the index_col parameter to 0. This reads in the first column as the index and keeps “Unnamed: 0” from appearing as a column in the resulting DataFrame.

df = pd.read_csv("my_data.csv", index_col=0)
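
The effect of index_col can be tried without a file on disk. In this minimal sketch, the csv_text string is an assumed stand-in for the contents of my_data.csv, with an empty first header cell.

```python
import io
import pandas as pd

# stand-in for my_data.csv: the first header cell is empty
csv_text = ",name,num\n0,Bob,12\n1,Sally,67\n"

# index_col=0 turns the first column into the index instead of a data column
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(df.columns))
print(list(df.index))
```

With this approach the first column becomes the row index rather than a regular column, so it suits cases where those values really are row labels.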

Suppose you had a column called Social Media. The name has a space in it, and you want to rename the column so that there are no spaces. Here’s how you could do that, assuming your DataFrame is named df.

df = df.rename(columns={'Social Media': 'Social_Media'})
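
If several columns contain spaces, rename also accepts a function that is applied to every column name. The sketch below uses that to convert all names to snake_case in one pass. The column names and the to_snake_case helper are hypothetical, and unlike the single-column example above, this version also lowercases the names.

```python
import pandas as pd

# hypothetical column names, made up for illustration
df = pd.DataFrame({'Social Media': ['blog', 'video'],
                   'Post Count': [3, 7],
                   'id': [1, 2]})

def to_snake_case(column_name):
    # trim stray whitespace, lowercase, and swap spaces for underscores
    return column_name.strip().lower().replace(' ', '_')

# passing a function renames every column with it
df = df.rename(columns=to_snake_case)
print(list(df.columns))
```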

Sorting

If you need to sort any DataFrame columns, check out the post called Sorting a pandas DataFrame.
