mGrowTech
The journey of Modernizing TorchVision – Memoirs of a TorchVision developer – 3

by Josh
May 29, 2025
  • May 21, 2022
  • Vasilis Vryniotis

It’s been a while since I last posted a new entry in the TorchVision memoirs series. Though I’ve previously shared news on the official PyTorch blog and on Twitter, I thought it would be a good idea to talk more about what happened in the last release of TorchVision (v0.12), what’s coming in the next one (v0.13) and what our plans are for 2022H2. My aim is to go beyond an overview of new features and instead provide insights into where we want to take the project in the following months.

TorchVision v0.12 was a sizable release with a dual focus: a) updating our deprecation and model contribution policies to improve transparency and attract more community contributors and b) doubling down on our modernization efforts by adding popular new model architectures, datasets and ML techniques.

Updating our policies

The key to a successful open-source project is maintaining a healthy, active community that contributes to it and drives it forward. Thus an important goal for our team is to increase the number of community contributions, with the long-term vision of enabling the community to contribute big features (new models, ML techniques, etc.) on top of the usual incremental improvements (bug/doc fixes, small features, etc.).

Historically, even though the community was eager to contribute such features, our team hesitated to accept them. The key blocker was the lack of a concrete model contribution and deprecation policy. To address this, Joao Gomes worked with the community to draft and publish our first model contribution guidelines, which provide clarity over the process of contributing new architectures, pre-trained weights and features that require model training. Moreover, Nicolas Hug worked with PyTorch core developers to formulate and adopt a concrete deprecation policy.

The aforementioned changes had immediate positive effects on the project. The new contribution policy helped us receive numerous community contributions for large features (more details below) and the clear deprecation policy enabled us to clean up our code-base while still ensuring that TorchVision offers strong Backwards Compatibility guarantees. Our team is very motivated to continue working with open-source developers, research teams and downstream library creators to keep TorchVision relevant and fresh. If you have any feedback, comments or feature requests, please reach out to us.

Modernizing TorchVision

It’s no secret that for the last few releases our target was to add to TorchVision all the necessary Augmentations, Losses, Layers, Training utilities and novel architectures so that our users can easily reproduce SOTA results using PyTorch. TorchVision v0.12 continued down that route:

  • Our rockstar community contributors, Hu Ye and Zhiqiang Wang, have contributed the FCOS architecture which is a one-stage object detection model.

  • Nicolas Hug has added support for optical flow in TorchVision by adding the RAFT architecture.

  • Yiwen Song has added support for Vision Transformer (ViT) and I have added the ConvNeXt architecture along with improved pre-trained weights.

  • Finally, with the help of our community, we’ve added 14 new classification and 5 new optical flow datasets.

  • As per usual, the release came with numerous smaller enhancements, bug fixes and documentation improvements. To see all of the new features and the list of our contributors please check the v0.12 release notes.

TorchVision v0.13 is just around the corner, with its expected release in early June. It is a very big release with a significant number of new features and big API improvements.

Wrapping up Modernizations and closing the gap from SOTA

We are continuing our journey of modernizing the library by adding the necessary primitives, model architectures and recipe utilities to produce SOTA results for key Computer Vision tasks:

  • With the help of Victor Fomin, I have added important missing Data Augmentation techniques such as AugMix, Large Scale Jitter etc. These techniques enabled us to close the gap from SOTA and produce better weights (see below).

  • With the help of Aditya Oke, Hu Ye, Yassine Alouini and Abhijit Deo, we have added important common building blocks such as the DropBlock layer, the MLP block, the cIoU and dIoU losses, etc. Finally, I worked with Shen Li to fix a long-standing issue in PyTorch’s SyncBatchNorm layer which affected the detection models.

  • Hu Ye with the support of Joao Gomes added Swin Transformer along with improved pre-trained weights. I added the EfficientNetV2 architecture and several post-paper architectural optimizations on the implementation of RetinaNet, FasterRCNN and MaskRCNN.

  • As I discussed earlier on the PyTorch blog, we have put significant effort into improving our pre-trained weights by creating an improved training recipe. This enabled us to improve the accuracy of our Classification models by 3 accuracy points, achieving new SOTA for various architectures. A similar effort was performed for Detection and Segmentation, where we improved the accuracy of the models by over 8.1 mAP on average. Finally, Yosua Michael M worked with Laura Gustafson, Mannat Singh and Aaron Adcock to add support for SWAG, a set of new highly accurate state-of-the-art pre-trained weights for ViT and RegNets.

New Multi-weight support API

As I previously discussed on the PyTorch blog, TorchVision has extended its existing model builder mechanism to support multiple pre-trained weights. The new API is fully backwards compatible, allows instantiating models with different weights and provides mechanisms to get useful meta-data (such as categories, number of parameters, metrics etc) and the preprocessing inference transforms of the model. There is a dedicated feedback issue on GitHub to help us iron out any rough edges.

Revamped Documentation

Nicolas Hug led the effort to restructure the model documentation of TorchVision. The new structure makes use of features from the Multi-weight Support API to offer better documentation for the pre-trained weights and their use in the library. Massive shout-out to our community members for helping us document all architectures on time.

Though our detailed roadmap for 2022H2 is not yet finalized, here are some key projects that we are currently planning to work on:

  • We are working closely with Haoqi Fan and Christoph Feichtenhofer from PyTorch Video, to add the Improved Multiscale Vision Transformer (MViTv2) architecture to TorchVision.

  • Philip Meier and Nicolas Hug are working on an improved version of the Datasets API (v2) which uses TorchData and Data pipes. Philip Meier, Victor Fomin and I are also working on extending our Transforms API (v2) to support not only images but also bounding boxes, segmentation masks etc.

  • Finally, the community is helping us keep TorchVision fresh and relevant by adding popular architectures and techniques. Lezwon Castelino is currently working with Victor Fomin to add the SimpleCopyPaste augmentation. Hu Ye is currently working to add the DETR architecture.

If you would like to get involved with the project, please have a look at our good first issues and the help wanted lists. If you are a seasoned PyTorch/Computer Vision veteran and you would like to contribute, we have several candidate projects for new operators, losses, augmentations and models.

I hope you found the article interesting. If you want to get in touch, hit me up on LinkedIn or Twitter.




