
10 Useful NumPy One-Liners for Time Series Analysis
Introduction
Working with time series data often means wrestling with the same patterns over and over: calculating moving averages, detecting spikes, creating features for forecasting models. Most analysts find themselves writing lengthy loops and complex functions for operations that NumPy can solve in a single line of elegant, easy-to-maintain code.
NumPy’s array operations can help simplify most common time series operations. Instead of thinking step-by-step through data transformations, you can apply vectorized operations that process entire datasets at once.
This article covers 10 NumPy one-liners that can be used for time series analysis tasks you’ll come across often. Let’s get started!
🔗 Link to the Colab notebook
Sample Data
Let’s create realistic time series data to check each of our one-liners:
```python
import numpy as np
import pandas as pd

# Create sample time series data
np.random.seed(42)
dates = pd.date_range('2023-01-01', periods=100, freq='D')
trend = np.linspace(100, 200, 100)
seasonal = 20 * np.sin(2 * np.pi * np.arange(100) / 30)
noise = np.random.normal(0, 5, 100)
values = trend + seasonal + noise

# Additional sample data for examples
stock_prices = np.array([100, 102, 98, 105, 107, 103, 108, 112, 109, 115])
returns = np.array([0.02, -0.03, 0.05, 0.01, -0.02, 0.04, 0.03, -0.01, 0.02, -0.01])
volumes = np.array([1000, 1200, 800, 1500, 1100, 900, 1300, 1400, 1050, 1250])
```
With our sample data generated, let’s get to our one-liners.
1. Creating Lag Features for Prediction Models
Lag features capture temporal dependencies by shifting values backward in time. This is essential for autoregressive models.
```python
# Create multiple lag features
lags = np.column_stack([np.roll(values, i) for i in range(1, 4)])
print(lags)
```
Truncated output:
```
[[217.84819466 218.90590418 219.17551225]
 [102.48357077 217.84819466 218.90590418]
 [104.47701332 102.48357077 217.84819466]
 [113.39337757 104.47701332 102.48357077]
 ...
 [217.47142868 205.96252929 207.85185069]
 [219.17551225 217.47142868 205.96252929]
 [218.90590418 219.17551225 217.47142868]]
```
This gives a matrix where each column represents values shifted by 1, 2, and 3 periods respectively. The first few rows contain wrapped-around values from the end of the series.
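Because np.roll wraps values around, those first rows should usually be discarded or masked before feeding the matrix to a model. Here is a minimal sketch of that cleanup step (the NaN-masking is an addition, not part of the article's one-liner), shown on a tiny array so the wrap-around is easy to see:

```python
import numpy as np

# np.roll wraps values around, so the first `lag` entries of each lag
# column are invalid. Mark them as NaN before model training.
vals = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
max_lag = 3
lags = np.column_stack([np.roll(vals, i) for i in range(1, max_lag + 1)])
for i in range(1, max_lag + 1):
    lags[:i, i - 1] = np.nan  # invalidate wrapped-around entries

print(lags)
```

Rows containing NaN can then be dropped, aligning the lag matrix with the usable portion of the target series.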
2. Calculating Rolling Standard Deviation
Rolling standard deviation is a handy measure of volatility, which is particularly useful in risk assessment.
```python
# 5-period rolling standard deviation
rolling_std = np.array([np.std(values[max(0, i - 4):i + 1]) for i in range(len(values))])
print(rolling_std)
```
Truncated output:
```
[0.         0.99672128 4.7434077  7.91211311 7.617056   6.48794287
 ...
 6.45696044 6.19946918 5.74848214 4.99557589]
```
We get an array showing how volatility changes over time, with early values calculated on fewer periods until the full window is available.
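If the partial early windows aren't needed, NumPy 1.20+ offers a fully vectorized alternative via sliding_window_view. This is a sketch of that variant, not the article's approach; note the output is shorter than the input because only complete windows are evaluated:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Rolling std over complete 3-period windows; the result has
# len(x) - window + 1 entries instead of len(x)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
window = 3
rolling_std = sliding_window_view(x, window).std(axis=1)
print(rolling_std)
```

Because every view shares memory with the original array, this avoids the per-window Python loop entirely.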
3. Detecting Outliers Using Z-Score Method
Outlier detection helps identify unusual data points due to market events or data quality issues.
```python
# Identify outliers beyond 2 standard deviations
outliers = values[np.abs((values - np.mean(values)) / np.std(values)) > 2]
print(outliers)
```
Output:
```
[217.47142868 219.17551225 218.90590418 217.84819466]
```
This returns an array containing only the values that deviate significantly from the mean, useful for flagging anomalous periods.
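One caveat: the z-score rule uses the mean and standard deviation, which are themselves distorted by the outliers being hunted. A robust variant (an alternative technique, not from the article) substitutes the median and median absolute deviation (MAD):

```python
import numpy as np

# Median/MAD-based outlier detection: robust statistics are barely
# affected by the extreme values they are meant to flag
x = np.array([10.0, 11.0, 10.5, 9.8, 10.2, 50.0])
med = np.median(x)
mad = np.median(np.abs(x - med))
# 0.6745 scales MAD to be comparable to a standard deviation for normal data
robust_z = 0.6745 * (x - med) / mad
outliers = x[np.abs(robust_z) > 3.5]
print(outliers)
```

On heavily contaminated series this flags extreme points that the plain z-score method can miss.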
4. Calculating the Exponential Moving Average
Instead of regular moving averages, you may sometimes need an exponential moving average, which gives more weight to recent observations, making it more responsive to trend changes.
```python
ema = np.array([values[0]] + [0.3 * values[i] + 0.7 * ema[i - 1]
                              for i, ema in enumerate([values[0]] + [0] * (len(values) - 1))
                              if i > 0][:len(values) - 1])
print(ema)
```
Well, this won’t work as expected. The exponential moving average calculation is inherently recursive, and recursion isn’t straightforward to express in vectorized form. The above code raises a TypeError. But feel free to uncomment the code cell in the notebook and check for yourself.
Here’s a cleaner approach that works:
```python
# More readable EMA calculation
alpha = 0.3
ema = values.copy()
for i in range(1, len(ema)):
    ema[i] = alpha * values[i] + (1 - alpha) * ema[i - 1]
print(ema)
```
Truncated output:
```
[102.48357077 103.08160353 106.17513574 111.04294223 113.04981966
 ...
 200.79862052 205.80046297 209.81297775 212.54085568 214.13305737]
```
We now get a smoothed series that reacts faster to recent changes compared to simple moving averages.
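If you still want the recursion in a single NumPy expression, one known trick (not the article's approach, and slower than the loop on large arrays since it makes Python-level calls) is np.frompyfunc combined with accumulate:

```python
import numpy as np

alpha = 0.3
x = np.array([10.0, 20.0, 30.0, 40.0])

# accumulate threads the running EMA through the series:
# result[0] = x[0], result[i] = alpha * x[i] + (1 - alpha) * result[i - 1]
step = np.frompyfunc(lambda prev, cur: alpha * cur + (1 - alpha) * prev, 2, 1)
ema = step.accumulate(x, dtype=object).astype(float)
print(ema)
```

The dtype=object is required because frompyfunc produces object arrays; casting back to float at the end restores a normal numeric array.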
5. Finding Local Maxima and Minima
Peak and trough detection is important for identifying trend reversals and support or resistance levels. Let’s now find local maxima in the sample data.
```python
# Find local peaks (maxima)
peaks = np.where((values[1:-1] > values[:-2]) & (values[1:-1] > values[2:]))[0] + 1
print(peaks)
```
Output:
```
[ 3  6  9 12 15 17 20 22 25 27 31 34 36 40 45 47 50 55 59 65 67 71 73 75
 82 91 94 97]
```
We now get an array of indices where local maxima occur. This can help identify potential selling points or resistance levels.
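If SciPy is available, scipy.signal.find_peaks performs the same neighbor comparison with useful extras such as minimum height, prominence, and spacing between peaks. A small sketch (SciPy is an assumption; the article itself stays NumPy-only):

```python
import numpy as np
from scipy.signal import find_peaks

# find_peaks returns indices of local maxima plus a dict of properties
x = np.array([0.0, 2.0, 1.0, 3.0, 0.5, 4.0, 1.0])
peaks, _ = find_peaks(x)
print(peaks)
```

Passing keyword arguments such as height= or distance= filters out minor wiggles that the bare comparison would count as peaks.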
6. Calculating Cumulative Returns from Price Changes
It’s often helpful to transform per-period returns into cumulative performance metrics.
```python
# Cumulative returns from daily returns
cumulative_returns = np.cumprod(1 + returns) - 1
print(cumulative_returns)
```
Output:
```
[ 0.02       -0.0106      0.03887     0.0492587   0.02827353  0.06940447
  0.1014866   0.09047174  0.11228117  0.10115836]
```
This shows total return over time, which is essential for performance analysis and portfolio tracking.
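An equivalent route, common in finance though not shown in the article, goes through log returns, which simply add up. The two formulations agree exactly:

```python
import numpy as np

# Log returns are additive: cumulative return can be written as
# exp(cumsum(log(1 + r))) - 1, matching cumprod(1 + r) - 1
r = np.array([0.02, -0.03, 0.05])
via_cumprod = np.cumprod(1 + r) - 1
via_logsum = np.expm1(np.cumsum(np.log1p(r)))
print(via_cumprod, via_logsum)
```

np.log1p and np.expm1 keep the computation accurate for the small per-period returns typical of daily data.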
7. Normalizing Data to 0-1 Range
Min-max scaling maps all features to the same [0, 1] range, preventing features with large raw values from dominating an analysis.
```python
# Min-max normalization
normalized = (values - np.min(values)) / (np.max(values) - np.min(values))
print(normalized)
```
Truncated output:
```
[0.05095609 0.06716856 0.13968446 0.21294383 0.17497438 0.20317761
 ...
 0.98614086 1.         0.9978073  0.98920506]
```
Now the values are all scaled between 0 and 1, preserving the original distribution shape while standardizing the range.
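One caveat worth guarding against (an addition, not in the article): a constant series has zero range, so the division produces NaN. A small sketch using np.ptp, which computes max minus min in one call:

```python
import numpy as np

# Guard against a zero range before dividing
x = np.array([5.0, 5.0, 5.0])
rng = np.ptp(x)  # peak-to-peak: max(x) - min(x)
normalized = (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
print(normalized)
```

Mapping a constant series to all zeros is one reasonable convention; some pipelines prefer 0.5 instead.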
8. Calculating Percentage Change
Percentage changes provide scale-independent measures of movement:
```python
# Percentage change between consecutive periods
pct_change = np.diff(stock_prices) / stock_prices[:-1] * 100
print(pct_change)
```
Output:
```
[ 2.         -3.92156863  7.14285714  1.9047619  -3.73831776  4.85436893
  3.7037037  -2.67857143  5.50458716]
```
The output is an array showing percentage movement between each period, with length one less than the original series.
9. Creating Binary Trend Indicator
Sometimes you may need binary indicators instead of continuous values. As an example, let’s convert continuous price movements into discrete trend signals for classification models.
```python
# Binary trend (1 for up, 0 for down)
trend_binary = (np.diff(values) > 0).astype(int)
print(trend_binary)
```
Output:
```
[1 1 1 0 1 1 0 0 1 0 0 1 0 0 1 0 1 0 0 1 0 1 0 1 1 0 1 0 1 1 1 0 0 1 0 1
 0 1 1 1 0 0 0 0 1 0 1 0 0 1 0 0 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 0 0 1 1 0
 1 0 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 0 1 1 0 0]
```
The output is a binary array indicating upward (1) or downward (0) movements between consecutive periods.
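If flat periods matter for your model, np.sign gives a three-state variant in the same spirit; this extension is an assumption, not part of the article:

```python
import numpy as np

# Three-state trend signal: 1 up, 0 flat, -1 down
x = np.array([100.0, 101.0, 101.0, 99.0])
signal = np.sign(np.diff(x)).astype(int)
print(signal)
```

The boolean version above collapses flat moves into the "down" class, which the sign-based encoding avoids.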
10. Calculating Useful Correlations
We’ll often need to calculate the correlation between variables for meaningful analysis and interpretation. Let’s measure the relationship between price movements and trading activity.
```python
# Correlation coefficient in one line
price_volume_corr = np.corrcoef(stock_prices, volumes)[0, 1]
print(np.round(price_volume_corr, 4))
```
We get a single correlation coefficient between -1 and 1, indicating the strength and direction of the linear relationship between prices and volumes.
Wrapping Up
These NumPy one-liners show how you can use vectorized operations to make time series tasks easier and faster. They cover common real-world problems — like creating lag features for machine learning, spotting unusual data points, and calculating financial stats — while keeping the code short and clear.
The real benefit of these one-liners isn’t just that they’re short, but that they run efficiently and are easy to understand. Since NumPy is built for speed, these operations handle large datasets well and help keep your code clean and readable.
Once you get the hang of these techniques, you’ll be able to write time series code that’s both efficient and easy to work with.