NumPy, short for "Numerical Python", is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrices. It was created by Travis Oliphant in 2005. It is an open source project, and you can use it freely.
Python has lists that serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it comes with many supporting functions that make it easy to work with. Arrays are frequently used in data science, where speed and resource usage are very important.
Unlike lists, NumPy arrays are stored in one contiguous block of memory, so processes can access and manipulate them very efficiently. In computer science this property is called locality of reference, and it is the main reason why NumPy is faster than lists. NumPy is also optimized to work with the latest CPU architectures.
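The contiguous layout described above can be inspected directly on an array. The short sketch below is only an illustration: the sample values and the explicit int64 dtype are assumptions chosen so that the element size is the same on every platform.

```python
import numpy as np

# Sample data; dtype is fixed to int64 so each element is exactly 8 bytes.
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

# A freshly created array occupies one contiguous block of memory.
is_contiguous = arr.flags["C_CONTIGUOUS"]

# Every element has the same size, so the step from one element to the
# next (the stride) equals the element size in bytes.
item_size = arr.itemsize     # bytes per element
stride = arr.strides[0]      # bytes to move to the next element
total_bytes = arr.nbytes     # 5 elements * 8 bytes each

print(is_contiguous, item_size, stride, total_bytes)  # True 8 8 40
```

Because the elements sit side by side with a fixed stride, the position of any element can be computed from its index without following references.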
The following simple example is taken from the NumPy tutorial on w3schools.com. A NumPy ndarray object can be created with the array() function.
import numpy as np  # use the np alias
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
It is conventional to import NumPy under the alias np. The version string is stored in the __version__ attribute.
import numpy as np
print(np.__version__)
You can access an array element by referring to its index number. Indexes in NumPy arrays start at 0. For the array arr created in the code above, print(arr[0]) prints 1, and print(arr[2]) prints 3.
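As a small runnable sketch of the indexing described above, the snippet below recreates the same arr so that it stands on its own. The negative index on the last line is an extra illustration that the text does not cover.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

first = arr[0]   # index 0 is the first element
third = arr[2]   # index 2 is the third element
last = arr[-1]   # negative indexes count back from the end

print(first)  # 1
print(third)  # 3
print(last)   # 5
```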
In NumPy, vectorization lets an operation be applied to all elements of an array at once, without writing an explicit Python loop. Data professionals often work with large datasets, and vectorized code helps them process large quantities of data efficiently.
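To make the idea concrete, here is a minimal sketch that doubles every value in an array twice, first with an ordinary loop and then with a single vectorized expression. The array name prices and its values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for this illustration.
prices = np.array([10.0, 20.0, 30.0, 40.0])

# Loop version: Python handles one element at a time.
doubled_loop = []
for p in prices:
    doubled_loop.append(p * 2)

# Vectorized version: one expression operates on the whole array.
doubled_vec = prices * 2

# Reductions such as sum() also run over the whole array in one call.
total = doubled_vec.sum()

print(doubled_vec)  # [20. 40. 60. 80.]
print(total)        # 200.0
```

Both versions produce the same values. The vectorized form is shorter and moves the per-element work out of the Python loop and into NumPy's compiled code.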