NumPy: A Comprehensive Guide with Examples | Lecture 1

 

NumPy (Numerical Python) is a powerful library in Python for numerical computing. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.


Key Features of NumPy

  1. Efficient multi-dimensional array support: ndarray is the core object.
  2. Broadcasting: Apply operations to arrays of different but compatible shapes.
  3. Vectorized operations: Whole-array operations run in compiled code, typically much faster than Python loops.
  4. Integration with other libraries: Pandas, SciPy, etc.
  5. Extensive math functions: Trigonometric, statistical, and linear algebra.
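
To make the "vectorized operations" feature concrete, here is a minimal sketch that computes the same result with a Python loop and with a single array expression. The variable names (`values`, `squares_loop`, `squares_vec`) are illustrative only and do not come from the original post.

```python
import numpy as np

values = np.arange(5)  # array holding 0, 1, 2, 3, 4

# Explicit Python loop: squares one element at a time
squares_loop = [v ** 2 for v in values]

# Vectorized form: the same computation as one array operation
squares_vec = values ** 2

print(squares_vec)  # [ 0  1  4  9 16]
```

Both forms give the same numbers; the vectorized one avoids per-element Python overhead, which tends to matter as arrays grow.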

Getting Started with NumPy

Installation

pip install numpy

Importing NumPy

import numpy as np


Core Concepts

1. Creating Arrays

# 1D array

arr1 = np.array([1, 2, 3, 4])

print(arr1)

 

# 2D array

arr2 = np.array([[1, 2], [3, 4]])

print(arr2)

 

# Array with zeros

zeros = np.zeros((3, 3))  # 3x3 matrix

print(zeros)

 

# Array with ones

ones = np.ones((2, 4))  # 2x4 matrix

print(ones)

 

# Array with a range of numbers

range_arr = np.arange(1, 10, 2)  # Start at 1, end before 10, step 2

print(range_arr)

 

# Array with equally spaced numbers

linspace_arr = np.linspace(0, 1, 5)  # 5 evenly spaced numbers from 0 to 1, both endpoints included

print(linspace_arr)

 

# Random array

rand_arr = np.random.rand(3, 2)  # 3x2 matrix with uniform random values in [0, 1)

print(rand_arr)


2. Array Attributes

arr = np.array([[1, 2, 3], [4, 5, 6]])

 

print(arr.shape)  # Shape of the array (rows, columns)

print(arr.size)   # Total number of elements

print(arr.ndim)   # Number of dimensions

print(arr.dtype)  # Data type of elements


3. Array Indexing and Slicing

arr = np.array([10, 20, 30, 40, 50])

 

# Accessing elements

print(arr[0])  # First element

print(arr[-1])  # Last element

 

# Slicing

print(arr[1:4])  # Elements from index 1 to 3

print(arr[:3])   # First three elements

print(arr[::2])  # Every second element

 

# 2D slicing

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2d[1, 2])  # Element at 2nd row, 3rd column

print(arr2d[:2, 1:])  # Subarray with first two rows and last two columns


4. Operations on Arrays

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

 

# Element-wise operations

print(arr1 + arr2)  # [5 7 9]

print(arr1 * arr2)  # [ 4 10 18]

print(arr1 ** 2)    # [1 4 9]

 

# Mathematical operations

print(np.sum(arr1))     # Sum of elements

print(np.mean(arr1))    # Mean

print(np.std(arr1))     # Standard deviation

print(np.max(arr1))     # Maximum value

print(np.min(arr1))     # Minimum value
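
The same reductions can also run along one axis of a 2D array. The sketch below is an addition to the original examples: the `axis` argument is part of `np.sum`, while the `table`, `col_sums`, and `row_sums` names are made up for illustration.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_sums = np.sum(table, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(table, axis=1)  # collapse the columns: one sum per row

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```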


5. Matrix Operations

# Matrix product (np.dot on 2D arrays performs matrix multiplication; A @ B is equivalent)

A = np.array([[1, 2], [3, 4]])

B = np.array([[5, 6], [7, 8]])

print(np.dot(A, B))

 

# Transpose

print(A.T)

 

# Determinant

print(np.linalg.det(A))

 

# Inverse

print(np.linalg.inv(A))
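
One way to sanity-check an inverse is to multiply it back against the original matrix and compare the result with the identity. This is a small sketch, not part of the original post; `A_inv` and `product` are illustrative names, and `np.allclose` is used because floating-point results are rarely exact.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)

# A times its inverse should equal the 2x2 identity, up to rounding error
product = A @ A_inv

print(np.allclose(product, np.eye(2)))  # True
```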


6. Broadcasting

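Broadcasting lets NumPy apply operations to arrays with different shapes, as long as those shapes are compatible: the smaller operand is treated as if it were stretched to match the larger one. The simplest case combines an array with a scalar.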

arr = np.array([1, 2, 3])

print(arr + 10)  # Add 10 to each element
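
Broadcasting also works between two arrays, not only an array and a scalar. The following sketch assumes a 2x3 matrix combined with a row of shape (3,) and a column of shape (2, 1); the names `matrix`, `row`, and `col` are hypothetical.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

plus_row = matrix + row  # row is added to every row of matrix
plus_col = matrix + col  # col is added to every column of matrix

print(plus_row)
print(plus_col)
```

Shapes are compared from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.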


7. Reshaping and Manipulating Arrays

arr = np.arange(1, 10)

 

# Reshape

reshaped = arr.reshape((3, 3))

print(reshaped)

 

# Flatten

flattened = reshaped.flatten()  # Returns a 1D copy of the data

print(flattened)

 

# Stack arrays

arr1 = np.array([1, 2])

arr2 = np.array([3, 4])

print(np.vstack((arr1, arr2)))  # Vertical stack

print(np.hstack((arr1, arr2)))  # Horizontal stack
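
When reshaping, one dimension can be left for NumPy to work out by passing -1. A minimal sketch, assuming a 12-element array (the `grid` name is illustrative):

```python
import numpy as np

arr = np.arange(1, 13)       # 12 elements: 1 through 12
grid = arr.reshape((3, -1))  # -1 asks NumPy to infer the column count

print(grid.shape)  # (3, 4)
```

The total number of elements must still divide evenly, otherwise `reshape` raises a `ValueError`.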


8. Boolean Indexing

arr = np.array([10, 20, 30, 40, 50])

 

# Boolean condition

print(arr[arr > 25])  # [30 40 50]
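
Conditions can be combined as well. This sketch, which goes slightly beyond the original example, uses the element-wise `&` operator; the `in_range` name is made up for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Use & (and) or | (or) between conditions; each condition needs its own parentheses
in_range = arr[(arr > 15) & (arr < 45)]

print(in_range)  # [20 30 40]
```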


9. Handling Missing Data

arr = np.array([1, 2, np.nan, 4, 5])

 

# Check for NaN

print(np.isnan(arr))

 

# Replace NaN with a value

arr[np.isnan(arr)] = 0

print(arr)
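
Replacing NaN is not always necessary. As a hedged alternative to the approach above, NumPy's NaN-aware functions such as `np.nanmean` skip missing values during the calculation; the variable names below are illustrative.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, 5])

plain_mean = np.mean(arr)    # NaN propagates, so the result is nan
nan_mean = np.nanmean(arr)   # ignores NaN: mean of 1, 2, 4, 5

print(plain_mean)  # nan
print(nan_mean)    # 3.0
```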


10. Saving and Loading Data

arr = np.array([1, 2, 3, 4, 5])

 

# Save array

np.save('array_file.npy', arr)

 

# Load array

loaded_arr = np.load('array_file.npy')

print(loaded_arr)

 

# Save/load text files (values are written and read back as floats by default)

np.savetxt('array_file.txt', arr)

loaded_txt = np.loadtxt('array_file.txt')

print(loaded_txt)


Advanced Topics

1. Random Sampling

# Random integers

rand_ints = np.random.randint(1, 100, size=(3, 3))  # Integers from 1 to 99 (upper bound is excluded)

print(rand_ints)

 

# Random seed

np.random.seed(42)  # Reproducible random numbers

print(np.random.rand(3))
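
To see what "reproducible" means in practice, the sketch below resets the seed and draws again; the two draws should match exactly. The names `first` and `second` are illustrative.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)  # reset to the same seed
second = np.random.rand(3)

print(np.array_equal(first, second))  # True
```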

2. Sorting

arr = np.array([3, 1, 2])

print(np.sort(arr))  # Sorted array
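
A related function, `np.argsort`, returns the indices that would sort the array rather than the sorted values. This is an addition to the original example, shown as a minimal sketch with an illustrative `order` variable.

```python
import numpy as np

arr = np.array([3, 1, 2])

order = np.argsort(arr)  # positions of the elements in ascending order

print(order)       # [1 2 0]
print(arr[order])  # [1 2 3]
```

This is handy when a second array has to be reordered the same way as the first.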

3. Advanced Indexing

arr = np.array([10, 20, 30, 40, 50])

 

# Using a list of indices

indices = [0, 2, 4]

print(arr[indices])  # [10 30 50]
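
The same idea extends to 2D arrays by passing one index list per axis; NumPy pairs them up element by element. A small sketch with hypothetical names (`rows`, `cols`, `picked`):

```python
import numpy as np

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

rows = [0, 1, 2]
cols = [2, 1, 0]

# Selects the elements at (0, 2), (1, 1), and (2, 0)
picked = arr2d[rows, cols]

print(picked)  # [3 5 7]
```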


Conclusion

NumPy is a fundamental library for scientific computing in Python. Its array manipulation tools, mathematical functions, and efficient vectorized computation make it an essential tool for data scientists, engineers, and researchers.

 
