Getting Started with NumPy
Harshini Bhat
Data Science Consultant at AlmaBetter
Are you struggling with numerical computations and data manipulation in Python? Meet NumPy, a powerful library for performing numerical operations, linear algebra, and array manipulation in Python efficiently.
NumPy is a must-have tool for anyone who works with data in Python or plans to, from data scientists and data engineers to machine learning enthusiasts. In this article, we will explore the basics of NumPy and get started with this powerful library.
Everything from installing NumPy to creating arrays and performing mathematical operations will be covered in this article. Whether you are an experienced Python developer or just starting out, get ready to upgrade your numerical computing skills with NumPy!
NumPy (short for Numerical Python) is a powerful Python library that is open-source and utilized in practically every discipline of research and engineering. It's the universal Python standard for working with numerical data, and it's at the heart of the scientific Python and PyData ecosystems. NumPy users range from novice coders to expert researchers conducting cutting-edge scientific and corporate research and development. Pandas, SciPy, Matplotlib, scikit-learn, scikit-image, and most other data science and scientific Python packages make substantial use of the NumPy API.
The NumPy library includes multidimensional array and matrix data structures (more on this in later parts). It provides ways of operating efficiently on ndarray, a homogeneous n-dimensional array object.
NumPy can be used to perform a wide range of array-based mathematical operations. It extends Python with sophisticated data structures that enable efficient calculations with arrays and matrices, as well as a vast library of high-level mathematical functions that operate on these arrays and matrices.
NumPy provides a host of features that make it a versatile and indispensable tool for scientific computing and data analysis. It allows you to work with large, multi-dimensional arrays and perform complex mathematical operations on them efficiently. NumPy arrays are faster and more compact than Python lists: an array uses less memory and is convenient to work with. NumPy also has a mechanism for specifying data types, which enables even further optimization of the code. NumPy's extensive library of functions and tools makes it easy to perform linear algebra, statistical analysis, and data manipulation tasks. Plus, NumPy integrates seamlessly with other Python libraries, making it a valuable asset in a wide range of applications.
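As a rough illustration of the memory claim, the sketch below compares a Python list of 1,000 integers with a NumPy array holding the same values. The variable names and the explicit choice of np.int64 are assumptions made for this example, and the exact size of the list depends on the Python version and platform.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)  # dtype chosen explicitly for this sketch

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
# An ndarray stores the raw values in one buffer: 1000 elements * 8 bytes each.
array_bytes = np_array.nbytes

print("List (approx. bytes):", list_bytes)
print("Array data (bytes):", array_bytes)

# Choosing a smaller dtype shrinks the buffer further.
small_array = np_array.astype(np.int16)
print("int16 array data (bytes):", small_array.nbytes)
```

The nbytes attribute counts only the data buffer, not the small fixed overhead of the array object itself, so the comparison is approximate.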
NumPy is an incredibly useful library, designed to solve a wide range of numerical problems efficiently and effectively.
You should be familiar with Python. See the Python tutorial for a refresher.
Both NumPy and Matplotlib are required to run the examples. We will go through the installation in this article.
We strongly advise using a scientific Python distribution to install NumPy. See Installing NumPy for complete instructions on installing NumPy on your operating system.
If you already have Python installed, you can install NumPy using:
conda install numpy
or
pip install numpy
If you don't already have Python, you might want to look into Anaconda. It's the most straightforward way to get started. The advantage of this distribution is that we won't have to worry about separately installing NumPy or any of the other key packages we will be using for data analysis, such as pandas, scikit-learn, and so on.
How to import NumPy?
We can import NumPy and its functions into our Python code as follows:
import numpy as np
We shorten the imported name to np to improve code readability while using NumPy. This is a well-accepted convention that you should adhere to so that anyone working with your code may understand it quickly.
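A quick way to confirm that the installation and the import both worked is to print the installed version. This is only a minimal sanity check; the version string you see depends on your environment.

```python
import numpy as np

# If this import succeeds, NumPy is installed; the version string shows which release.
print("NumPy version:", np.__version__)
```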
What is an array?
The NumPy library's central data structure is the array. An array is a grid of values, and it contains information about the raw data, how to locate an element, and how to interpret an element. Its grid of elements can be indexed in various ways. All of the elements are of the same type, referred to as the array dtype.
An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is its number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
For example:
a = np.array([1, 2, 3, 4, 5, 6])
or:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
With square brackets, we can access the array's elements. When accessing items, keep in mind that NumPy indexing begins at 0. That is, if we wish to access the first element in our array, we will use element "0."
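The indexing options described above can be sketched on the two-dimensional array from the earlier example. The variable names are chosen for this illustration only.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

first_row = a[0]                   # a single integer selects the first row
element = a[1, 2]                  # a tuple of integers: row 1, column 2
mask = a > 8                       # boolean array with the same shape as a
large_values = a[mask]             # keeps only the elements where mask is True
picked_rows = a[np.array([0, 2])]  # another array of integers selects rows 0 and 2

print(first_row)
print(element)
print(large_values)
print(picked_rows)
```

Because indexing starts at 0, a[1, 2] refers to the second row and third column.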
An array is sometimes referred to as an "ndarray," which is shorthand for "N-dimensional array." An N-dimensional array is simply an array with any number of dimensions. We may also come across terms like 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. Both matrices and vectors are represented by the NumPy ndarray class. A vector is a one-dimensional array (there is no distinction between row and column vectors), whereas a matrix is a two-dimensional array. The term tensor is also commonly used for arrays with three or more dimensions.
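To make the terminology concrete, the following sketch builds a vector, a matrix, and a three-dimensional array, then prints the number of dimensions (ndim) and the shape of each. The array contents are arbitrary.

```python
import numpy as np

vector = np.array([1, 2, 3])               # 1-D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2-D array: 2 rows, 3 columns
tensor = np.zeros((2, 3, 4))               # 3-D array filled with zeros

for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim:", arr.ndim, "shape:", arr.shape)
```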
What are the attributes of an array?
Let us look at the attributes of an array with the help of Python code examples:
import numpy as np
# Create a NumPy array of 5 integers
my_array = np.array([10, 20, 30, 40, 50])
# Size attribute: total number of elements
print("Size of array:", my_array.size)
# Type attributes: the container class and the element type (dtype)
print("Type of array:", type(my_array))
print("Element type:", my_array.dtype)
# Indexing attribute
third_element = my_array[2]
print("Third element of array:", third_element)
# Contiguous memory allocation attribute
print("Is C-contiguous:", my_array.flags["C_CONTIGUOUS"])
print("Memory address of array data:", my_array.ctypes.data)
# Homogeneity attribute: every element must match the array's dtype
try:
    my_array[1] = 'hello'  # This raises a ValueError for an integer array
except ValueError as err:
    print("Cannot store a string:", err)
# Fixed size attribute: an ndarray has no append method; np.append returns a new array
bigger_array = np.append(my_array, 60)
print("New size of array:", bigger_array.size)
Arrays support a variety of basic operations. Here are some of the most common ones with examples in Python:
1. Creating an array: Arrays can be created by passing a Python list, written in square-bracket notation, to np.array:
# Create an array of integers
my_array = np.array([1, 2, 3, 4, 5])
2. Accessing an element: Elements in an array can be accessed using their index:
# Access the third element of the array
third_element = my_array[2]
3. Updating an element: Elements in an array can be updated using their index:
# Update the first element of the array
my_array[0] = 10
4. Adding an element: Arrays have a fixed size, so np.append returns a new array with the element added to the end:
# Add an element to the end of the array
my_array = np.append(my_array, 6)
5. Removing an element: np.delete returns a new array without the element at the given index:
# Remove the second element of the array (index 1)
my_array = np.delete(my_array, 1)
6. Sorting an array: Arrays can be sorted using the np.sort function:
# Sort the array in ascending order
sorted_array = np.sort(my_array)
7. Reversing an array: Arrays can be reversed using a slice with a step of -1:
# Reverse the order of the array
reversed_array = my_array[::-1]
8. Finding the length of an array: The length of an array can be found using the len function:
# Find the length of the array
array_length = len(my_array)
9. Iterating over an array: Arrays can be iterated over using a for loop:
# Print each element of the array
for element in my_array:
    print(element)
10. Arithmetic operators can be applied to arrays to perform element-wise operations. Here are some examples in Python:
a = np.array([10, 20, 30, 40, 50])
b = np.arange(5)  # create an array with 5 entries, 0 to 4
b
Output→ array([0, 1, 2, 3, 4])
c = a - b
c
Output→ array([10, 19, 28, 37, 46])
b**2
Output→ array([ 0,  1,  4,  9, 16])
# Raise an array to a power element-wise
array1 = np.array([1, 2, 3])
result_array = array1 ** 2
print(result_array)
Output: [1 4 9]
In NumPy, a universal function (ufunc) is a function that performs element-wise operations on ndarrays. Universal functions are important in NumPy because they allow you to perform fast and vectorized operations on large arrays, without the need for Python loops.
Universal functions such as np.add() operate element-wise on their input arrays, which means that they perform the same operation on each element of the array. For example, if we have two arrays x and y, np.add(x, y) will compute the sum of the first element of x and the first element of y, the sum of the second element of x and the second element of y, and so on.
NumPy also provides many other universal functions, and we can even wrap our own Python functions so that they behave like ufuncs, using np.frompyfunc() or np.vectorize().
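The sketch below shows a few built-in ufuncs (np.add, np.sqrt, np.maximum) and one custom function wrapped with np.vectorize. The helper clip_to_unit is a hypothetical function written only for this illustration.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([10.0, 20.0, 30.0])

total = np.add(x, y)         # element-wise sum, same as x + y
roots = np.sqrt(x)           # element-wise square root
larger = np.maximum(x, 5.0)  # element-wise maximum against a scalar

# np.vectorize wraps a plain Python function so that it accepts arrays.
# It is a convenience, not a speed-up: it still loops in Python internally.
def clip_to_unit(value):  # hypothetical helper for this sketch
    return min(max(value, 0.0), 1.0)

clip = np.vectorize(clip_to_unit)
clipped = clip(np.array([-0.5, 0.25, 2.0]))

print(total, roots, larger, clipped)
```

For performance-critical code, the built-in ufuncs are usually the better choice, since they run in compiled code rather than calling a Python function per element.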
There may be occasions when we need to perform an operation between an array and a single number (also known as an operation between a vector and a scalar) or between arrays of different sizes. For example, our array (which we'll refer to as "data") may contain distances in miles that we want to convert to kilometers.
This operation can be carried out in Python as follows:
data = np.array([1.0, 2.0])
data * 1.6
Output→ array([1.6, 3.2])
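Broadcasting also applies when both operands are arrays of different shapes. The sketch below, with made-up values, multiplies a two-dimensional array by a one-dimensional one, and then shows what happens when the shapes are not compatible.

```python
import numpy as np

miles = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # shape (2, 3)
scale = np.array([1.0, 10.0, 100.0])  # shape (3,)

# The trailing dimensions match (3 and 3), so scale is applied to every row.
scaled = miles * scale
print(scaled)

# Shapes (2, 3) and (2,) are incompatible: the trailing sizes are 3 and 2.
try:
    miles * np.array([1.0, 2.0])
    error_raised = False
except ValueError as err:
    error_raised = True
    print("Broadcast failed:", err)
```

NumPy compares shapes starting from the last dimension, which is why a length-3 array lines up with the columns here while a length-2 array does not.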
NumPy understands that the multiplication should happen with each cell. This is known as broadcasting. Broadcasting is a mechanism that allows NumPy to perform operations on arrays of different shapes. The dimensions of the arrays must be compatible: two dimensions are compatible when their sizes are equal or when one of them is 1. If the dimensions are incompatible, a ValueError is raised.
Learning NumPy is a must for anyone interested in scientific computing or data analysis with Python. NumPy's efficient arrays and sophisticated array operations allow you to manipulate enormous volumes of data quickly and simply. By learning the fundamentals of NumPy, you will be able to use the various tools and libraries that are built on top of it, such as Pandas, Matplotlib, and SciPy.
While the syntax of NumPy may appear difficult at first, it is well worth the time and effort to become acquainted with the library, as it will save you many hours of work in the long run. So, if you have not started already, dive into NumPy and unleash the full power of scientific computing.