A Hands-On Introduction to cuML for GPU-Accelerated Machine Learning Workflows

By Josh
September 28, 2025


In this article, you will learn what cuML is, and how it can significantly speed up the training of machine learning models through GPU acceleration.

Topics we will cover include:

  • The aim and distinctive features of cuML.
  • How to prepare datasets and train a machine learning model for classification with cuML in a scikit-learn-like fashion.
  • How to easily compare results with an equivalent conventional scikit-learn model, in terms of classification accuracy and training time.

Let’s not waste any more time.

Introduction

This article offers a hands-on introduction to cuML, a Python library from RAPIDS (NVIDIA’s open-source suite of GPU-accelerated data science libraries) for training widely used machine learning models on GPUs. Together with its DataFrame-oriented sibling, cuDF, cuML has gained popularity among practitioners who need scalable, production-ready machine learning solutions.

The hands-on tutorial below uses cuML together with cuDF for GPU-accelerated dataset management in a DataFrame format. For an introduction to cuDF, check out this related article.

About cuML: An “Accelerated Scikit-Learn”

RAPIDS cuML (short for CUDA Machine Learning) is an open-source library that accelerates scikit-learn–style machine learning on NVIDIA GPUs. It provides drop-in replacements for many popular algorithms, often reducing training and inference times on large datasets — without major code changes or a steep learning curve for those familiar with scikit-learn.

Its three most distinctive features are:

  • cuML follows a scikit-learn-like API, easing the transition from CPU to GPU for machine learning with minimal code changes
  • It covers a broad set of techniques — all GPU-accelerated — including regression, classification, ensemble methods, clustering, and dimensionality reduction
  • Through tight integration with the RAPIDS ecosystem, cuML works hand-in-hand with cuDF for data preprocessing, as well as with related libraries to facilitate end-to-end, GPU-native pipelines
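
The first point, the scikit-learn-like API, can be sketched as follows. This is a minimal illustration, assuming only that `cuml.linear_model.LogisticRegression` and `sklearn.linear_model.LogisticRegression` share the same class name; the helper `load_logistic_regression` is a hypothetical name introduced here, not part of either library.

```python
import importlib
import importlib.util


def load_logistic_regression():
    """Return (backend_name, estimator_class), preferring cuML over scikit-learn.

    Hypothetical helper for illustration; returns (None, None) if neither
    library can be imported.
    """
    for backend in ("cuml", "sklearn"):
        if importlib.util.find_spec(backend) is None:
            continue  # library not installed
        try:
            module = importlib.import_module(f"{backend}.linear_model")
        except Exception:
            # e.g. cuML is installed but no usable GPU or driver is present
            continue
        return backend, module.LogisticRegression
    return None, None


backend, LogReg = load_logistic_regression()
print("Using backend:", backend)
```

Because both classes expose `fit`, `predict`, and `score`, code written against `LogReg` should need few or no changes when the backend switches, although individual hyperparameters and defaults can differ between the two libraries.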

Hands-On Introductory Example

To illustrate the basics of cuML for building GPU-accelerated machine learning models, we will use the adult income dataset, which is fairly large yet easily accessible via a public URL in Jason Brownlee’s repository. It is a slightly class-imbalanced dataset intended for binary classification: predicting whether an adult’s income is high (above $50K) or low ($50K or below) based on a set of demographic and socio-economic features.

IMPORTANT: To run the code below on Google Colab or a similar notebook environment, make sure you change the runtime type to GPU; otherwise, a warning will be raised indicating cuDF cannot find the specific CUDA driver library it utilizes.
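
Before importing cuDF or cuML, it can help to confirm that the runtime actually exposes an NVIDIA GPU. The sketch below is one way to do this, assuming the `nvidia-smi` command-line tool is on the `PATH` when a GPU runtime is active; the function name `nvidia_gpu_visible` is a hypothetical helper, not part of RAPIDS.

```python
import shutil
import subprocess


def nvidia_gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` lists at least one GPU (hypothetical helper)."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # tool not installed, so most likely no GPU runtime
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU"
    return result.returncode == 0 and "GPU" in result.stdout


print("NVIDIA GPU visible:", nvidia_gpu_visible())
```

If this prints `False` on Colab, switching the runtime type to GPU and rerunning the notebook is the usual fix.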

We start by importing the necessary libraries for our scenario:

import cudf
import cuml
from cuml.model_selection import train_test_split as gpu_train_test_split
from cuml.linear_model import LogisticRegression as cuLogReg
from IPython.display import display

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import time

Note that, in addition to cuML modules and functions to split the dataset and train a logistic regression classifier, we have also imported their classical scikit-learn counterparts. While not mandatory for using cuML (as it works independently from plain scikit-learn), we are importing equivalent scikit-learn components for the sake of comparison in the rest of the example.

Next, we load the dataset into a cuDF dataframe optimized for GPU usage:

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/adult-all.csv"

# Column names (they are not included in the dataset's CSV file we will read)
cols = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country", "income"
]

df = cudf.read_csv(url, header=None, names=cols)
display(df.head())

Once the data is loaded, we identify the target variable and convert it into binary (1 for high income, 0 for low income):

df["income"] = df["income"].str.strip()
df["income"] = (df["income"] == ">50K").astype("int32")
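
To see what these two lines do, here is the same logic in plain Python on a handful of toy labels. The strings below are hypothetical values chosen to include stray whitespace; they are not rows copied from the dataset.

```python
# Toy labels with stray whitespace (hypothetical values, not rows from the dataset)
raw_labels = [" <=50K", ">50K ", "<=50K", " <=50K"]

# Same logic as the cuDF version: strip whitespace, compare, cast to integer
binary = [int(label.strip() == ">50K") for label in raw_labels]
print(binary)  # [0, 1, 0, 0]

# Share of the positive (high-income) class, a quick check of class balance
positive_share = sum(binary) / len(binary)
print(f"Positive share: {positive_share:.2f}")  # Positive share: 0.25
```

Without the `strip()` step, a label such as `">50K "` would fail the equality test and be silently encoded as 0.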

This dataset mixes numeric and categorical features, with slightly more of the latter. Most scikit-learn models, including decision trees and logistic regression, do not natively handle string-valued categorical features, so these must be encoded first. The same applies to cuML; hence, we will select a small number of features to train our classifier and one-hot encode the categorical ones.

# Feature selection (let's say based on domain expertise!)
features = ["age", "education_num", "hours_per_week", "workclass", "occupation", "sex"]
X = df[features]
y = df["income"]

# One-hot encode categorical features
X_enc = cudf.get_dummies(X, drop_first=True)
print("Encoded feature shape:", X_enc.shape)
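
The effect of `drop_first=True` is easiest to see on a tiny example. The following plain-Python sketch mimics the encoding on three toy rows; it illustrates the idea only and is not how cuDF implements `get_dummies`. The function `one_hot_drop_first` and the toy rows are hypothetical.

```python
def one_hot_drop_first(rows, column):
    """One-hot encode `column`, dropping the first category in sorted order."""
    categories = sorted({row[column] for row in rows})
    kept = categories[1:]  # the dropped category becomes the all-zeros baseline
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in kept:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded


# Toy rows (hypothetical values)
toy_rows = [
    {"age": 39, "sex": "Male"},
    {"age": 28, "sex": "Female"},
    {"age": 52, "sex": "Male"},
]

encoded = one_hot_drop_first(toy_rows, "sex")
print(encoded[0])  # {'age': 39, 'sex_Male': 1}
print(encoded[1])  # {'age': 28, 'sex_Male': 0}
```

A column with k categories therefore yields k - 1 indicator columns, which avoids perfectly collinear features in a linear model such as logistic regression.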

So far, we have used cuDF much as we would use Pandas; the cuML steps that follow mirror classical scikit-learn just as closely.

Now comes the interesting part. We will split the dataset into training and test sets and train a logistic regression classifier twice: once on the GPU with cuML and once on the CPU with scikit-learn. We will then compare both the classification accuracy and the time taken to train each model. Here’s the complete code for the model training and comparison:

# MODEL 1: GPU (cuML) train-test split and training
t0 = time.time()
X_train, X_test, y_train, y_test = gpu_train_test_split(X_enc, y, test_size=0.2, random_state=42)

model_gpu = cuLogReg(max_iter=1000)
model_gpu.fit(X_train, y_train)
gpu_time = time.time() - t0

acc_gpu = model_gpu.score(X_test, y_test)
print(f"cuML Logistic Regression accuracy: {acc_gpu:.4f}, time: {gpu_time:.3f} sec")

# MODEL 2: Scikit-learn and Pandas-driven train-test split and model training
# (data loading and encoding are repeated in Pandas and excluded from the timing)
df_pd = pd.read_csv(url, header=None, names=cols)
df_pd["income"] = df_pd["income"].str.strip()
df_pd["income"] = (df_pd["income"] == ">50K").astype("int32")

X_pd = df_pd[features]
y_pd = df_pd["income"]
X_pd = pd.get_dummies(X_pd, drop_first=True)

t0 = time.time()
X_train_pd, X_test_pd, y_train_pd, y_test_pd = train_test_split(X_pd, y_pd, test_size=0.2, random_state=42)

model_cpu = LogisticRegression(max_iter=1000)
model_cpu.fit(X_train_pd, y_train_pd)
cpu_time = time.time() - t0

acc_cpu = model_cpu.score(X_test_pd, y_test_pd)
print(f"scikit-learn Logistic Regression accuracy: {acc_cpu:.4f}, time: {cpu_time:.3f} sec")

The results are quite interesting. They should look something like:

cuML Logistic Regression accuracy: 0.8014, time: 0.428 sec
scikit-learn Logistic Regression accuracy: 0.8097, time: 15.184 sec

As we can observe, the model trained with cuML achieved very similar classification accuracy to its scikit-learn counterpart (0.8014 versus 0.8097), but it trained roughly 35 times faster in this run: about 0.43 seconds compared to about 15.2 seconds for the scikit-learn classifier. Note that both timings include the train-test split as well as model fitting. Your exact numbers will vary with hardware, drivers, and library versions.
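
Since single-run timings are noisy, a slightly more careful comparison repeats each measurement and keeps the fastest run. The sketch below shows one way to do that with the standard library; `best_time` and `speedup` are hypothetical helpers, and the toy workload merely stands in for a `fit` call. Repeating runs can also help on the GPU side, where the first call often includes one-time initialization overhead.

```python
import time


def best_time(fn, repeats=3):
    """Run fn() `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU time to GPU time; values above 1 favor the GPU."""
    return cpu_seconds / gpu_seconds


# Timings reported in the sample output above
print(f"Speedup: {speedup(15.184, 0.428):.1f}x")  # Speedup: 35.5x

# Toy workload standing in for a model.fit call
elapsed = best_time(lambda: sum(i * i for i in range(100_000)))
print(f"Best of 3: {elapsed:.4f} sec")
```

In the tutorial code, `fn` could be a small function wrapping `model_gpu.fit(X_train, y_train)` or `model_cpu.fit(X_train_pd, y_train_pd)`.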

Wrapping Up

This article provided a gentle, hands-on introduction to cuML, a library for GPU-accelerated training of machine learning models for classification, regression, clustering, and more. Through a simple comparison, we showed how cuML can help build effective models with significantly shorter training times.


