Manually Create a pandas DataFrame


A DataFrame is a rectangular table of data with an ordered, named collection of columns. Each column has its own data type, which might be numeric, string, boolean, object and so on. The DataFrame has both a row index and a column index. It is two-dimensional, like a table or spreadsheet in Excel or Google Sheets. You can manually construct a DataFrame in several different ways, but this post provides a few small practice DataFrames, each created from a dictionary of equal-length lists or NumPy arrays. Feel free to copy them, edit them and use them in your projects.

Below is a very simple example that illustrates the syntax. In Jupyter Notebook, the bare df on the last line displays the DataFrame.

import pandas as pd  # import the pandas library into Python
data = {'firstname': ['Bob', 'Sally', 'Suzie', 'Rohan'],
        'amount': [12, 67, 33, 41]}
df = pd.DataFrame(data)
df
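
The same table can also be built from NumPy arrays instead of lists. The sketch below is only an illustration: the name df_np is made up for this example, and the exact data types printed at the end depend on your platform and pandas version.

```python
import numpy as np
import pandas as pd

# Same data as above, but each column is a NumPy array instead of a list.
# As with lists, every array must have the same length.
data = {'firstname': np.array(['Bob', 'Sally', 'Suzie', 'Rohan']),
        'amount': np.array([12, 67, 33, 41])}
df_np = pd.DataFrame(data)  # df_np is a hypothetical name for this example

print(df_np.shape)          # (number of rows, number of columns)
print(list(df_np.columns))  # the column index
print(list(df_np.index))    # the default row index: 0, 1, 2, 3
print(df_np.dtypes)         # one data type per column
```

If the lists or arrays have different lengths, pandas raises a ValueError instead of building the DataFrame.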

From Excel to DataFrame

The post Excel Columns to CSV shows how to use an Excel file to avoid the awkward syntax of typing all of this data by hand: it is often easier to enter the data into Excel and use the VBA code to create the lists. In the real world you would normally import the data into Python with a function like read_csv(). Creating a DataFrame manually is still handy when you are writing a tutorial, because you want the examples to be easy for the reader to replicate, and a manual DataFrame avoids the complexity of importing data from an external file.
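
For comparison, here is a minimal sketch of the read_csv() route. To keep it self-contained, the CSV text is a made-up stand-in held in a string rather than a real file. With an actual file you would pass its path to read_csv() instead.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """firstname,amount
Bob,12
Sally,67
Suzie,33
Rohan,41
"""

# read_csv accepts a file path or a file-like object;
# io.StringIO wraps the string so it can be read like a file.
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv)
```

The result has the same columns and values as the manually created DataFrame, which is why a manual DataFrame works well as a stand-in for imported data in a tutorial.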

Company

Here is an example of a very small DataFrame with companies.

import pandas as pd  # import the pandas library into Python
data = {'company': ['ABC Inc.', 'XYZ Corp.', 'Acme Ltd', 'Widget LLC'],
        'sales': [1286, 6722, 3320, 4197],
        'industry': ['Technology', 'Foods', 'Foods', 'Technology'],
        # dates are quoted strings; unquoted, 2/25/2006 would be evaluated as division
        'date founded': ['2/25/2006', '5/17/2003', '3/7/2011', '11/2/2012']}
df = pd.DataFrame(data)
df
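
If the founding dates are stored as strings such as '2/25/2006', the column holds plain text rather than real dates. One possible way to convert it is pd.to_datetime(), sketched below. The month/day/year format is an assumption based on the values, and the name df_dates is made up for this example.

```python
import pandas as pd

data = {'company': ['ABC Inc.', 'XYZ Corp.', 'Acme Ltd', 'Widget LLC'],
        'date founded': ['2/25/2006', '5/17/2003', '3/7/2011', '11/2/2012']}
df_dates = pd.DataFrame(data)  # hypothetical name for this example

# Assumes the strings are in month/day/year order.
df_dates['date founded'] = pd.to_datetime(df_dates['date founded'],
                                          format='%m/%d/%Y')

print(df_dates.dtypes)                            # 'date founded' is now a datetime column
print(df_dates['date founded'].dt.year.tolist())  # [2006, 2003, 2011, 2012]
```

Once the column is a datetime type, you can sort by it or pull out parts such as the year or month.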

Below is another example of a manually created small DataFrame.

import pandas as pd  # import the pandas library into Python
data = {'firstname': ['Bob', 'Sally', 'Suzie', 'Rohan'],
        'amount': [12, 67, 33, 41],
        'group': ['B', 'A', 'A', 'B']}
df = pd.DataFrame(data)
df
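
As one idea for practicing with this DataFrame, the sketch below totals the amounts per group. The choice of groupby() and the name totals are only illustrative, and any other operation would do just as well.

```python
import pandas as pd

data = {'firstname': ['Bob', 'Sally', 'Suzie', 'Rohan'],
        'amount': [12, 67, 33, 41],
        'group': ['B', 'A', 'A', 'B']}
df = pd.DataFrame(data)

# Sum the 'amount' column separately for each value in 'group'.
totals = df.groupby('group')['amount'].sum()  # totals is a hypothetical name
print(totals)  # A -> 100 (67 + 33), B -> 53 (12 + 41)
```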
