mGrowTech
A Coding Guide to High-Quality Image Generation, Control, and Editing Using HuggingFace Diffusers

by Josh
February 21, 2026
in AI, Analytics and Automation


In this tutorial, we build a practical image-generation workflow with the Hugging Face Diffusers library. We start by stabilizing the environment, then generate images from text prompts using Stable Diffusion v1.5 with the UniPC multistep scheduler. We accelerate inference with an LCM-LoRA (latent consistency) adapter, guide composition with ControlNet conditioned on Canny edges, and finally perform localized edits via inpainting. Throughout, we focus on techniques that balance image quality, speed, and controllability.

!pip -q uninstall -y pillow Pillow || true
!pip -q install --upgrade --force-reinstall "pillow<12.0"
!pip -q install --upgrade diffusers transformers accelerate safetensors huggingface_hub opencv-python


import os, math, random
import torch
import numpy as np
import cv2
from PIL import Image, ImageDraw, ImageFilter
from diffusers import (
   StableDiffusionPipeline,
   StableDiffusionInpaintPipeline,
   ControlNetModel,
   StableDiffusionControlNetPipeline,
   UniPCMultistepScheduler,
)

We prepare a clean, compatible runtime by removing any preinstalled Pillow, reinstalling a release below 12.0, and installing the rest of the required libraries. Constraining Pillow this way avoids image-processing conflicts with the other packages in the environment. We then import the core modules needed for the generation, control, and inpainting workflows.
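As a quick sanity check after the install cell, the following sketch reads the installed package versions with the standard-library `importlib.metadata` module and flags a Pillow release of 12.0 or newer. The helper names `installed_version` and `major_of` are illustrative assumptions and are not part of the original notebook.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_of(version):
    """Extract the leading integer of a version string such as '11.3.0'."""
    return int(version.split(".")[0])


# Report what is actually installed in the current runtime.
for name in ["pillow", "diffusers", "transformers", "accelerate"]:
    print(name, "->", installed_version(name))

# Warn if the Pillow constraint from the install cell did not take effect.
pillow_version = installed_version("pillow")
if pillow_version is not None and major_of(pillow_version) >= 12:
    print("Pillow is 12.0 or newer; the version constraint was not applied.")
```

In a notebook, a runtime restart may be needed before a reinstalled Pillow is picked up by an already-running kernel.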

def seed_everything(seed=42):
   """Seed Python, NumPy, and PyTorch (CPU and all CUDA devices)."""
   random.seed(seed)
   np.random.seed(seed)
   torch.manual_seed(seed)
   torch.cuda.manual_seed_all(seed)


def to_grid(images, cols=2, bg=255):
   """Paste equally sized PIL images into one grid, filled row by row."""
   if isinstance(images, Image.Image):
       images = [images]
   w, h = images[0].size
   rows = math.ceil(len(images) / cols)
   grid = Image.new("RGB", (cols*w, rows*h), (bg, bg, bg))
   for i, im in enumerate(images):
       # Column index is i % cols, row index is i // cols.
       grid.paste(im, ((i % cols)*w, (i // cols)*h))
   return grid


device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
print("device:", device, "| dtype:", dtype)

We define utility functions for reproducibility and for arranging visual outputs. We seed the Python, NumPy, and PyTorch random number generators so that generations are repeatable across runs on the same setup. We also detect the available hardware and pick the precision accordingly: float16 on a CUDA GPU and float32 on the CPU.
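To make the grid arithmetic in `to_grid` concrete, here is a minimal sketch that computes only the canvas size and the paste offsets, without touching any images. The function name `grid_offsets` is a hypothetical helper introduced for illustration.

```python
import math


def grid_offsets(n_images, cols, w, h):
    """Return the canvas size and the top-left paste offset of each tile."""
    rows = math.ceil(n_images / cols)
    offsets = [((i % cols) * w, (i // cols) * h) for i in range(n_images)]
    return (cols * w, rows * h), offsets


# Three 768x512 images in a two-column grid.
canvas, offsets = grid_offsets(n_images=3, cols=2, w=768, h=512)
print(canvas)   # (1536, 1024)
print(offsets)  # [(0, 0), (768, 0), (0, 512)]
```

With three images and two columns, the last cell of the second row stays empty and shows the background color.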

seed_everything(7)
BASE_MODEL = "runwayml/stable-diffusion-v1-5"


pipe = StableDiffusionPipeline.from_pretrained(
   BASE_MODEL,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)


if device == "cuda":
   pipe.enable_attention_slicing()
   pipe.enable_vae_slicing()


prompt = "a cinematic photo of a futuristic street market at dusk, ultra-detailed, 35mm, volumetric lighting"
negative_prompt = "blurry, low quality, deformed, watermark, text"


img_text = pipe(
   prompt=prompt,
   negative_prompt=negative_prompt,
   num_inference_steps=25,
   guidance_scale=6.5,
   width=768,
   height=512,
).images[0]

We initialize the base Stable Diffusion pipeline and replace its default scheduler with UniPC, a multistep scheduler that reaches good quality in relatively few steps. We then generate a 768×512 image directly from a text prompt using 25 inference steps, a guidance scale of 6.5, and a negative prompt that steers away from common artifacts. This establishes a baseline for the subsequent improvements in speed and control.
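The two settings that matter most here, resolution and guidance scale, can be sketched in a few lines of plain Python. The sketch assumes the usual Stable Diffusion v1.x layout (4 latent channels and a VAE that downsamples by a factor of 8) and the standard classifier-free guidance formula; `latent_shape` and `apply_guidance` are illustrative names, not Diffusers functions.

```python
def latent_shape(width, height, channels=4, vae_scale=8):
    """Latent tensor shape (C, H/8, W/8), assuming an SD v1.x-style VAE."""
    if width % vae_scale or height % vae_scale:
        raise ValueError("width and height must be divisible by 8")
    return (channels, height // vae_scale, width // vae_scale)


def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Classifier-free guidance: uncond + scale * (text - uncond), elementwise."""
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


print(latent_shape(768, 512))                       # (4, 64, 96)
print(apply_guidance([0.0, 1.0], [1.0, 1.0], 6.5))  # [6.5, 1.0]
```

A guidance scale of 1.0 returns the text-conditioned prediction unchanged, and larger values push the result further toward the prompt. When a negative prompt is supplied, its prediction takes the place of the unconditional one.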

LCM_LORA = "latent-consistency/lcm-lora-sdv1-5"
pipe.load_lora_weights(LCM_LORA)


try:
   pipe.fuse_lora()
   lora_fused = True
except Exception as e:
   lora_fused = False
   print("LoRA fuse skipped:", e)


fast_prompt = "a clean product photo of a minimal smartwatch on a reflective surface, studio lighting"
fast_images = []
for steps in [4, 6, 8]:
   fast_images.append(
       pipe(
           prompt=fast_prompt,
           negative_prompt=negative_prompt,
           num_inference_steps=steps,
           guidance_scale=1.5,
           width=768,
           height=512,
       ).images[0]
   )


grid_fast = to_grid(fast_images, cols=3)
print("LoRA fused:", lora_fused)


# Draw a simple layout sketch: a rectangle, an ellipse, and a horizontal line.
W, H = 768, 512
layout = Image.new("RGB", (W, H), "white")
draw = ImageDraw.Draw(layout)
draw.rectangle([40, 80, 340, 460], outline="black", width=6)
draw.ellipse([430, 110, 720, 400], outline="black", width=6)
draw.line([0, 420, W, 420], fill="black", width=5)


# Extract Canny edges and stack them into a 3-channel conditioning image.
edges = cv2.Canny(np.array(layout), 80, 160)
edges = np.stack([edges]*3, axis=-1)
canny_image = Image.fromarray(edges)


CONTROLNET = "lllyasviel/sd-controlnet-canny"
controlnet = ControlNetModel.from_pretrained(
   CONTROLNET,
   torch_dtype=dtype,
).to(device)


cn_pipe = StableDiffusionControlNetPipeline.from_pretrained(
   BASE_MODEL,
   controlnet=controlnet,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


cn_pipe.scheduler = UniPCMultistepScheduler.from_config(cn_pipe.scheduler.config)


if device == "cuda":
   cn_pipe.enable_attention_slicing()
   cn_pipe.enable_vae_slicing()


cn_prompt = "a modern cafe interior, architectural render, soft daylight, high detail"
img_controlnet = cn_pipe(
   prompt=cn_prompt,
   negative_prompt=negative_prompt,
   image=canny_image,
   num_inference_steps=25,
   guidance_scale=6.5,
   controlnet_conditioning_scale=1.0,
).images[0]

We accelerate inference by loading the LCM-LoRA adapter into the base pipeline, fusing it into the weights when possible, and sampling with only 4, 6, and 8 steps at a low guidance scale of 1.5. We then draw a simple layout sketch, extract its Canny edges, and pass the edge map to a Canny ControlNet to guide the layout of the generated scene. This preserves the composition while the text prompt still controls style and content.
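Conceptually, fusing a LoRA adapter folds its low-rank update into the base weight, so that a single matrix multiply replaces the base layer plus the adapter branch. The toy sketch below checks this equivalence with plain Python lists. It illustrates the general LoRA merge formula `W + scale * (up @ down)` and is not the Diffusers implementation; the function names and all numbers are made up for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down), the merged matrix."""
    delta = matmul(lora_up, lora_down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 layer with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]   # shape (rank=1, in_features=2)
up = [[0.5], [0.0]]   # shape (out_features=2, rank=1)
x = [[3.0], [4.0]]    # input as a column vector

# Fused path: one multiply with the merged weight.
fused = merge_lora(weight, down, up)
y_fused = matmul(fused, x)

# Unfused path: base output plus the adapter branch computed separately.
base = matmul(weight, x)
branch = matmul(up, matmul(down, x))
y_unfused = [[b[0] + a[0]] for b, a in zip(base, branch)]

print(fused)      # [[1.5, 1.0], [0.0, 1.0]]
print(y_fused)    # [[8.5], [4.0]]
print(y_unfused)  # [[8.5], [4.0]]
```

Both paths produce the same output, which is why fusing removes the per-step overhead of the adapter without changing results.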

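# White (255) marks the region to repaint; black (0) is kept.
mask = Image.new("L", img_controlnet.size, 0)
mask_draw = ImageDraw.Draw(mask)
mask_draw.rectangle([60, 90, 320, 170], fill=255)
# Feather the mask edge so the edit blends into its surroundings.
mask = mask.filter(ImageFilter.GaussianBlur(2))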


inpaint_pipe = StableDiffusionInpaintPipeline.from_pretrained(
   BASE_MODEL,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


inpaint_pipe.scheduler = UniPCMultistepScheduler.from_config(inpaint_pipe.scheduler.config)


if device == "cuda":
   inpaint_pipe.enable_attention_slicing()
   inpaint_pipe.enable_vae_slicing()


inpaint_prompt = "a glowing neon sign that says 'CAFÉ', cyberpunk style, realistic lighting"


img_inpaint = inpaint_pipe(
   prompt=inpaint_prompt,
   negative_prompt=negative_prompt,
   image=img_controlnet,
   mask_image=mask,
   num_inference_steps=30,
   guidance_scale=7.0,
).images[0]


os.makedirs("outputs", exist_ok=True)
img_text.save("outputs/text2img.png")
grid_fast.save("outputs/lora_fast_grid.png")
layout.save("outputs/layout.png")
canny_image.save("outputs/canny.png")
img_controlnet.save("outputs/controlnet.png")
mask.save("outputs/mask.png")
img_inpaint.save("outputs/inpaint.png")


print("Saved outputs:", sorted(os.listdir("outputs")))
print("Done.")

We create a rectangular mask, soften its edge with a small Gaussian blur, and apply inpainting so that only the masked region of the ControlNet image is regenerated. A targeted prompt describes the new content for that region while the rest of the image stays intact. Finally, we save all intermediate and final outputs to disk for inspection and reuse.
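The role of the blurred mask can be illustrated with a simple per-pixel blend: white mask values take the newly generated content, black values keep the original, and the grey values produced by the blur mix the two. This is a pixel-space sketch of the idea only; the pipeline itself works on latents during denoising, and `composite_pixel` is a hypothetical helper with made-up pixel values.

```python
def composite_pixel(original, generated, mask_value):
    """Blend one pixel: mask 255 takes the generated value, 0 keeps the original."""
    alpha = mask_value / 255.0
    return round(alpha * generated + (1.0 - alpha) * original)


# One row crossing the mask edge; the blurred mask ramps from 0 to 255.
original_row = [100, 100, 100, 100, 100]
generated_row = [200, 200, 200, 200, 200]
mask_row = [0, 51, 102, 204, 255]

blended = [composite_pixel(o, g, m)
           for o, g, m in zip(original_row, generated_row, mask_row)]
print(blended)  # [100, 120, 140, 180, 200]
```

Without the blur, the mask would jump straight from 0 to 255 and the edit would tend to show a hard seam at the rectangle boundary.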

In conclusion, we demonstrated how a basic Diffusers text-to-image pipeline can grow into a flexible image-generation workflow. We moved from plain text-to-image generation to few-step sampling, structural control, and targeted image editing without changing frameworks or tooling. Combining schedulers, LoRA adapters, ControlNet, and inpainting in this way yields controllable, efficient generative pipelines that are easy to extend for more advanced creative or applied use cases.

