
7 NumPy Tricks for Faster Numerical Computations
Introduction
Numerical computations in Python become much faster and more efficient with NumPy, a library designed for array operations and vectorized mathematical functions. Working on whole arrays at once removes the need for explicit Python loops, which simplifies the code and makes large-scale data computations far less costly.
This article covers seven practical NumPy tricks to speed up numerical tasks and reduce computational overhead. Since the NumPy library plays a starring role in the code examples below, make sure you “import numpy as np” first!
1. Replace Loops with Vectorized NumPy Operations
NumPy’s vectorized operations eliminate the need for loops to perform a variety of array-level operations, such as summing the elements of an array. They use precompiled code written in C behind the scenes to boost efficiency in mathematical operations.
Given sales data over two consecutive days for seven stores, this example shows how to calculate the total sales per store over the two days.
sales = np.array([[120, 130, 115, 140, 150, 160, 170], [90, 85, 88, 92, 95, 100, 105]])
# axis=0 sums down the rows (days), giving one total per store
totals = sales.sum(axis=0)
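To make the contrast with loops concrete, here is a minimal sketch that computes the same per-store totals twice, once with nested Python loops and once with the vectorized call. The names `loop_totals` and `vector_totals` are illustrative and not part of the original example.

```python
import numpy as np

sales = np.array([[120, 130, 115, 140, 150, 160, 170],
                  [90, 85, 88, 92, 95, 100, 105]])

# Loop-based version: add up the two days store by store.
loop_totals = []
for store in range(sales.shape[1]):
    store_total = 0
    for day in range(sales.shape[0]):
        store_total += sales[day, store]
    loop_totals.append(store_total)

# Vectorized version: a single call that sums along the day axis.
vector_totals = sales.sum(axis=0)

print(vector_totals)  # [210 215 203 232 245 260 275]
```

Both versions produce the same numbers. On an array this small the speed difference is negligible, and the gap typically only becomes noticeable as the array grows.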
2. Broadcasting for Efficient Arithmetic
Broadcasting is NumPy’s mechanism that enables fast mathematical computations across arrays that may have different shapes and sizes, provided they are compatible.
Consider daily prices for several products, where we want to apply to every product a discount factor that varies by day:
prices = np.array([[100, 200, 300], [110, 210, 310], [120, 220, 320], [130, 230, 330]])
discounts = np.array([0.9, 0.85, 0.95, 0.8])
final_prices = prices * discounts[:, None]
This broadcasted multiplication does the trick, but there’s a small catch: the shape of prices is (4, 3), whereas discounts is a 1D array of shape (4,). To make them compatible for the element-wise product across the entire price matrix, we first reshape discounts into a 2D array of shape (4, 1) using discounts[:, None].
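The shape catch described above can be checked directly. The sketch below shows that multiplying without the reshape raises a `ValueError`, and that `discounts[:, np.newaxis]` and `discounts.reshape(-1, 1)` are equivalent spellings of the same reshape. The variable names `mismatch_raised`, `column`, `same_a`, and `same_b` are illustrative.

```python
import numpy as np

prices = np.array([[100, 200, 300], [110, 210, 310],
                   [120, 220, 320], [130, 230, 330]])
discounts = np.array([0.9, 0.85, 0.95, 0.8])

# Shapes (4, 3) and (4,) are incompatible: NumPy aligns trailing axes,
# so it would try to match 3 columns against 4 discounts.
try:
    prices * discounts
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Adding a trailing axis turns (4,) into (4, 1), which stretches across columns.
column = discounts[:, None]
final_prices = prices * column

# Equivalent ways to express the same reshape.
same_a = prices * discounts[:, np.newaxis]
same_b = prices * discounts.reshape(-1, 1)

print(column.shape, final_prices.shape)  # (4, 1) (4, 3)
```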
3. Fast Math with np.where()
This trick is a great replacement for conventional Python conditionals in many situations. np.where() applies an element-wise condition across an entire array and selects one value or another for each element based on that condition.
This code applies a 20% surcharge to a default daily cost of $100 on energy fees for days with extreme temperatures below 10 degrees or above 30 degrees.
temps = np.array([15, 22, 28, 31, 18, 10, 5])
surcharge = np.where((temps < 10) | (temps > 30), 1.2, 1.0)
costs = 100 * surcharge
Note that the resulting costs array is also 1D of length 7, as NumPy seamlessly allows element-wise multiplication of an array by a scalar like 100.
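As a sanity check, the sketch below compares the vectorized expression with an equivalent per-element Python conditional. The names `loop_costs` and `vector_costs` are illustrative. Note that each comparison needs its own parentheses, because `|` binds more tightly than `<` and `>`.

```python
import numpy as np

temps = np.array([15, 22, 28, 31, 18, 10, 5])

# Per-element conditional, written as a list comprehension.
loop_costs = [100 * 1.2 if (t < 10 or t > 30) else 100 * 1.0 for t in temps]

# Vectorized equivalent: the condition is evaluated for all days at once.
is_extreme = (temps < 10) | (temps > 30)
vector_costs = 100 * np.where(is_extreme, 1.2, 1.0)

# Only the days at 31 and 5 degrees are surcharged;
# 10 is not strictly below 10, so it keeps the base cost.
print(is_extreme)
```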
4. Direct Matrix Multiplication with @
The @ operator makes standard matrix multiplication easy by using optimized linear algebra routines behind the scenes, without the need for loops that iterate through rows and columns. The following example multiplies two matrices using this operator. Both matrices have shape (4, 3), so we transpose the second one to shape (3, 4) to make the dimensions compatible; the result has shape (4, 4):
prices = np.array([[10, 12, 11], [11, 13, 12], [12, 14, 13], [13, 15, 14]])
quantities = np.array([[5, 2, 3], [6, 3, 2], [7, 2, 4], [8, 3, 5]])
total_revenue = prices @ quantities.T
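The sketch below inspects what this product contains. It assumes, since the article does not say, that each row is one day and each column one product. Under that reading, entry (i, j) pairs day i's prices with day j's quantities, so the matching-day revenue sits on the diagonal. The names `product`, `per_row`, and `per_row_direct` are illustrative.

```python
import numpy as np

prices = np.array([[10, 12, 11], [11, 13, 12], [12, 14, 13], [13, 15, 14]])
quantities = np.array([[5, 2, 3], [6, 3, 2], [7, 2, 4], [8, 3, 5]])

# (4, 3) @ (3, 4) -> (4, 4)
product = prices @ quantities.T

# np.matmul is the function form of the @ operator.
same = np.matmul(prices, quantities.T)

# Diagonal entries pair row i of prices with row i of quantities.
per_row = np.diag(product)

# If only those matching pairs are needed, an element-wise product
# followed by a row sum avoids computing the off-diagonal entries.
per_row_direct = (prices * quantities).sum(axis=1)

print(product.shape)  # (4, 4)
print(per_row)        # [107 129 164 219]
```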
5. Fast Inner Product with np.dot
There’s also a NumPy shortcut to calculate the inner product of two arrays of equal size, thanks to the np.dot() function.
returns = np.array([0.01, -0.02, 0.015, 0.005, 0.02])
weights = np.array([0.4, 0.1, 0.2, 0.2, 0.1])
expected_return = np.dot(returns, weights)
The result is a scalar equal to the inner product of the two 1D arrays passed as arguments.
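For 1D arrays, the inner product is the sum of the element-wise products. The sketch below verifies this for the example data and shows that the `@` operator gives the same scalar. The names `manual` and `via_operator` are illustrative.

```python
import numpy as np

returns = np.array([0.01, -0.02, 0.015, 0.005, 0.02])
weights = np.array([0.4, 0.1, 0.2, 0.2, 0.1])

expected_return = np.dot(returns, weights)

# Same quantity spelled out: multiply element-wise, then sum.
manual = (returns * weights).sum()

# For two 1D arrays, the @ operator also computes the inner product.
via_operator = returns @ weights

# 0.004 - 0.002 + 0.003 + 0.001 + 0.002 = 0.008 (up to floating-point rounding)
print(np.isclose(expected_return, 0.008))  # True
```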
6. Generate Large Random Data Quickly with np.random
When a data variable is assumed to follow a certain probability distribution, you can generate a large set of random samples on the fly with the np.random module by choosing the appropriate distribution function and arguments. This example shows how to generate one million random sales values from a uniform distribution and compute their mean efficiently:
purchases = np.random.uniform(5, 100, size=1_000_000)
avg_spend = purchases.mean()
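The same sampling can also be written with NumPy's `Generator` interface, created via `np.random.default_rng()`, which lets you pass a seed for reproducible results. This is an alternative sketch rather than the approach used above, and the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

# A seeded generator produces the same samples on every run.
rng = np.random.default_rng(seed=42)
purchases = rng.uniform(5, 100, size=1_000_000)
avg_spend = purchases.mean()

# A second generator with the same seed reproduces the data exactly.
rng_again = np.random.default_rng(seed=42)
purchases_again = rng_again.uniform(5, 100, size=1_000_000)

# The sample mean should land close to the midpoint of the interval, 52.5.
print(round(avg_spend, 1))
```

Samples are drawn from the half-open interval [5, 100), so the upper bound itself is excluded.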
7. Prevent Memory-Expensive Copies with np.asarray()
The last example focuses on memory efficiency. When converting array-like data, np.asarray() avoids making a physical copy whenever possible (e.g. when the input is already a NumPy array with a compatible dtype), whereas np.array() defaults to creating a copy. If the input is a plain Python list (as below), a new array will still be allocated; the memory-saving benefit appears when the input is already an ndarray.
data_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
arr = np.asarray(data_list)
mean_val = arr.mean()
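Because the list-based example above allocates a new array either way, the following sketch starts from an existing ndarray to make the difference visible. The names `existing`, `no_copy`, `copied`, and `converted` are illustrative.

```python
import numpy as np

existing = np.arange(9, dtype=np.float64).reshape(3, 3)

# np.asarray returns the very same object when no conversion is needed.
no_copy = np.asarray(existing)

# np.array copies by default, so the result owns separate memory.
copied = np.array(existing)

# Requesting a different dtype forces a conversion, and therefore a copy.
converted = np.asarray(existing, dtype=np.int32)

print(no_copy is existing)                   # True
print(np.shares_memory(copied, existing))    # False
print(np.shares_memory(converted, existing)) # False
```

One consequence worth keeping in mind: since `no_copy` and `existing` are the same array, modifying one in place also changes the other.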
Wrapping Up
The seven NumPy tricks illustrated in this article can significantly improve the efficiency of numerical computations, especially when applied to large datasets. Below is a quick summary of what we learned.
Trick | Value |
---|---|
sum(axis=…) | Performs fast vectorized operations such as aggregations. |
Broadcasting | Allows operations across differently shaped, compatible arrays without explicit loops. |
np.where() | Vectorized conditional logic without looped if-statements. |
@ (matrix multiplication) | Direct, loop-free matrix multiplication. |
np.dot() | Fast inner product between arrays. |
np.random | Single vectorized approach to generate large random datasets. |
np.asarray() | Avoids unnecessary copies when possible to save memory. |