mGrowTech

Forecasting the Future with Tree-Based Models for Time Series

by Josh
December 2, 2025
in AI, Analytics and Automation


In this article, you will learn how to turn a raw time series into a supervised learning dataset and use decision tree-based models to forecast future values.

Topics we will cover include:

  • Engineering lag features and rolling statistics from a univariate series.
  • Preparing a chronological train/test split and fitting a decision tree regressor.
  • Evaluating with MAE and avoiding data leakage with proper feature design.

Let’s not waste any more time.


Introduction

Decision tree-based models in machine learning are frequently used for a wide range of predictive tasks such as classification and regression, typically on structured, tabular data. However, when combined with the right data processing and feature extraction approaches, decision trees also become a powerful predictive tool for other data formats like text, images, or time series.


This article demonstrates how decision trees can be used for time series forecasting. Specifically, we show how to extract informative features from a raw time series, such as lagged values and rolling statistics, and then train a decision tree-based model on this structured data to forecast future values.

Building Decision Trees for Time Series Forecasting

In this hands-on tutorial, we will use the monthly airline passengers dataset available for free in the sktime library. This is a small univariate time series dataset containing monthly passenger numbers for an airline indexed by year-month, between 1949 and 1960.

Let’s start by loading the dataset — you may need to pip install sktime first if you haven’t used the library before:

import pandas as pd
from sktime.datasets import load_airline

# Load the monthly airline passengers series (a pandas Series)
y = load_airline()
y.head()

Since this is a univariate time series, it is managed as a one-dimensional pandas Series indexed by date (month-year), rather than a two-dimensional DataFrame object.
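To make the Series-versus-DataFrame distinction concrete, here is a minimal sketch that does not need sktime. It builds a tiny stand-in series with made-up values and assumes a monthly PeriodIndex, which is what I would expect load_airline to return; the name toy_series and the numbers are purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the airline series: made-up values on a
# monthly PeriodIndex (an assumption about the index type).
idx = pd.period_range("1949-01", periods=4, freq="M")
toy_series = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx, name="passengers")

print(toy_series.ndim)             # a Series is one-dimensional
print(type(toy_series.index).__name__)

# Converting to a DataFrame gives a single-column table that can
# later be widened with engineered features.
toy_frame = toy_series.to_frame()
print(toy_frame.shape)

# The index also exposes calendar fields such as the month number.
months = toy_series.index.month.tolist()
print(months)
```

If the real index turns out to be a DatetimeIndex instead, the same .month accessor should still work.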

To extract relevant features from our time series and turn it into a fully structured dataset, we define a custom function called make_lagged_df_with_rolling, which takes the raw time series as input, plus two keyword arguments: lags and roll_window, which we will explain shortly:

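def make_lagged_df_with_rolling(series, lags=12, roll_window=3):
    # Start from a one-column DataFrame holding the target
    df = pd.DataFrame({"y": series})

    # Lag features: the value observed 1, 2, ..., `lags` steps earlier
    for lag in range(1, lags + 1):
        df[f"lag_{lag}"] = df["y"].shift(lag)

    # Rolling statistics over past values only: shift(1) excludes the current row
    df[f"roll_mean_{roll_window}"] = df["y"].shift(1).rolling(roll_window).mean()
    df[f"roll_std_{roll_window}"] = df["y"].shift(1).rolling(roll_window).std()

    # The first rows lack a full history, so drop them
    return df.dropna()

df_features = make_lagged_df_with_rolling(y, lags=12, roll_window=3)
df_features.head()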

Time to revisit the above code and see what happened inside the function:

  1. We first force our univariate time series to become a pandas DataFrame, as we will shortly expand it with several additional features.
  2. We incorporate lagged features; i.e., given the passenger value at a timestamp, we collect the values from the preceding months. In our scenario, at time t, we include every reading from t-1 back to t-12, as shown in the image below. For January 1950, for instance, the row holds the original passenger number plus the values for the previous 12 months, stored in 12 additional attributes ordered from most recent to oldest.
  3. Finally, we add two more attributes containing the rolling average and rolling standard deviation, respectively, spanning three months. That is, given a monthly reading of passenger numbers, we calculate the average or standard deviation of the latest n = 3 months excluding the current month (see the use of .shift(1) before the .rolling() call), which prevents look-ahead leakage.
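The effect of .shift(1) before .rolling() is easiest to see on a toy series with round numbers. The sketch below is an illustration only: the values are made up, and the column leaky_mean_3 is a hypothetical counterexample showing what happens when the shift is omitted.

```python
import pandas as pd

# Toy series with easy-to-check values (not the airline data).
s = pd.Series([10, 20, 30, 40, 50, 60], name="y")

toy = pd.DataFrame({"y": s})
# Previous value only.
toy["lag_1"] = toy["y"].shift(1)
# Mean of the three values strictly before the current row.
toy["roll_mean_3"] = toy["y"].shift(1).rolling(3).mean()
# Without the shift, the window includes the current target: leakage.
toy["leaky_mean_3"] = toy["y"].rolling(3).mean()

print(toy)
```

At row 3, where y is 40, roll_mean_3 is the mean of 10, 20, and 30, while leaky_mean_3 averages 20, 30, and 40 and therefore already contains the value the model is supposed to predict.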

The resulting enriched dataset should look like this:

Augmented time series with lagged and rolling features

After that, training and testing the decision tree works as with any other scikit-learn model. The main question is which variable to predict. We want to forecast the passenger count for a given month from the extracted features, so the original time series column, y, becomes the target. Because the target is numeric, use DecisionTreeRegressor rather than a classifier.

Partitioning the dataset into training and test, and separating the labels from predictor features:

# Chronological split: the first 80% of rows train, the last 20% test
train_size = int(len(df_features) * 0.8)
train, test = df_features.iloc[:train_size], df_features.iloc[train_size:]

X_train, y_train = train.drop("y", axis=1), train["y"]
X_test, y_test = test.drop("y", axis=1), test["y"]
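The single 80/20 split above keeps every test row later in time than every training row. If you want more than one evaluation window, one option (not used in the original walkthrough) is scikit-learn's TimeSeriesSplit, which produces several folds with the same ordering guarantee. The sketch below uses a hypothetical 20-row array, X_demo, as a stand-in for the feature table.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical stand-in for a time-ordered feature table with 20 rows.
X_demo = np.arange(20).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
folds = []
for train_idx, test_idx in tscv.split(X_demo):
    folds.append((train_idx, test_idx))
    print(
        f"train rows {train_idx[0]}-{train_idx[-1]}, "
        f"test rows {test_idx[0]}-{test_idx[-1]}"
    )
```

Each fold trains on an expanding prefix of the data and tests on the block that follows it, so no fold ever trains on rows from the future.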

Training and evaluating the decision tree error (MAE):

from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# A shallow tree keeps the model from memorizing the small training set
dt_reg = DecisionTreeRegressor(max_depth=5, random_state=42)
dt_reg.fit(X_train, y_train)
y_pred = dt_reg.predict(X_test)

print("Forecasting:")
print("MAE:", mean_absolute_error(y_test, y_pred))

In one run, the resulting error was MAE ≈ 45.32. That is not bad, considering that monthly passenger numbers in the dataset are in the several hundreds; of course, there is room for improvement by using ensembles, extracting additional features, tuning hyperparameters, or exploring alternative models.
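A MAE figure is easier to judge next to simple baselines. The following sketch is run on a synthetic monthly series (trend, yearly seasonality, and noise, all assumed for illustration, not the airline data) and compares two naive forecasts with a decision tree and a random forest. The baselines reuse the lag columns: lag_1 is "same as last month" and lag_12 is "same month last year". No results are quoted here because they depend on the synthetic data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in series: linear trend + yearly cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(144)
values = 100 + 2.0 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)
series = pd.Series(values, index=pd.period_range("2000-01", periods=144, freq="M"))

# Lag features only, to keep the sketch short.
df = pd.DataFrame({"y": series})
for lag in range(1, 13):
    df[f"lag_{lag}"] = df["y"].shift(lag)
df = df.dropna()

# Chronological 80/20 split.
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]
X_train, y_train = train.drop("y", axis=1), train["y"]
X_test, y_test = test.drop("y", axis=1), test["y"]

# Naive baselines read straight from the lag columns.
results = {
    "naive (lag_1)": mean_absolute_error(y_test, X_test["lag_1"]),
    "seasonal naive (lag_12)": mean_absolute_error(y_test, X_test["lag_12"]),
}

models = {
    "decision tree": DecisionTreeRegressor(max_depth=5, random_state=42),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=42),
}
preds = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds[name] = model.predict(X_test)
    results[name] = mean_absolute_error(y_test, preds[name])

for name, mae in results.items():
    print(f"{name}: MAE = {mae:.2f}")
```

One caveat worth keeping in mind when reading such a comparison: a tree's prediction is an average of training targets, so it cannot exceed the largest target value seen during training. On a series with a strong upward trend, that alone can inflate the test error, which is one reason the baselines are a useful sanity check.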

A final takeaway: traditional time series forecasting methods typically model the series directly as a sequence of its own past values. The decision tree we built instead treats forecasting as a tabular regression problem over engineered features (lags and rolling statistics), which are themselves derived from past values of the series. In practice, it is often effective to combine both approaches with two different model types to obtain more robust predictions.

Wrapping Up

This article showed how to train decision tree models capable of dealing with time series data by extracting features from them. Starting with a raw univariate time series of monthly passenger numbers for an airline, we extracted lagged features and rolling statistics to act as predictor attributes and performed forecasting via a trained decision tree.


