pandas is an open-source library built mainly for working with relational or labeled data in Python. It adopts many coding idioms from NumPy, another Python library whose name is short for Numerical Python. The main difference is the kind of data each one targets: pandas is designed for tabular or heterogeneous data, whereas NumPy is designed for homogeneously typed numerical array data.
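The difference between homogeneous and heterogeneous data can be seen in a short sketch. The column names and values below are made up for illustration only; the point is that a NumPy array has a single dtype for all of its elements, while each column of a pandas DataFrame can have its own.

```python
import numpy as np
import pandas as pd

# A NumPy array stores every element with the same dtype.
arr = np.array([1.5, 2.0, 3.25])
print(arr.dtype)

# A DataFrame can hold a different dtype in each column.
# The names and values here are hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cleo"],   # text
    "score": [88.5, 92.0, 79.25],     # floating point
    "passed": [True, True, False],    # boolean
})
print(df.dtypes)
```

Printing df.dtypes lists one dtype per column, which is what makes a DataFrame a natural fit for tabular data with mixed column types.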
pandas became an open-source project in 2010, and its developer community has since grown to over 2,500 contributors. It is widely used by data analysts. The name has nothing to do with the bear: it is derived from panel data, an econometrics term for multidimensional structured datasets, and is also a play on the phrase Python data analysis.
Exploratory Data Analysis (EDA)
This site has a series of posts on EDA with pandas. The first is EDA Discovering with Pandas, the second is EDA Structuring with Pandas, and the third is EDA Cleaning with Pandas.
pandas’ Data Structures
pandas has two main data structures: Series and DataFrame. A Series is a one-dimensional array-like object containing a sequence of values of the same type, together with an associated array of data labels called its index.
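As a minimal sketch of the description above, the following builds a Series with an explicit index and then looks up a value by its label. The values and labels are arbitrary examples, not data from this post.

```python
import pandas as pd

# A Series pairs a one-dimensional sequence of values with an index of labels.
s = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

print(s.index)      # the labels
print(s.to_numpy()) # the underlying values as a NumPy array
print(s["a"])       # look up a value by label: -5

# If no index is given, pandas assigns default integer labels 0, 1, 2, ...
t = pd.Series([10, 20, 30])
print(list(t.index))  # [0, 1, 2]
```

Passing index explicitly is optional; it is shown here only to make the link between labels and values visible.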