PyTorch Hub is a central repository designed to facilitate the discovery and usage of pre-trained machine learning models within the PyTorch ecosystem. Launched by the PyTorch team, its core mission is to accelerate research and development by making a wide array of cutting-edge, pre-trained models easily accessible to researchers, developers, and data scientists. Instead of requiring users to manually download model weights and code, PyTorch Hub provides a simple API, torch.hub.load(), to load models directly from GitHub repositories with minimal boilerplate.
The platform hosts a curated selection of models across various domains like computer vision, natural language processing (NLP), audio processing, and generative AI. By promoting reproducibility and ease of use, PyTorch Hub serves as a valuable resource for transfer learning, benchmarking, educational purposes, and integrating AI capabilities into applications.
PyTorch Hub offers several features to streamline the use of pre-trained models:
torch.hub.load(): The cornerstone of PyTorch Hub, this function allows users to load pre-trained models (including model architecture and weights) directly from a specified GitHub repository and entry point with a single line of Python code.
hubconf.py: A file in the repository that defines the available models and their entry points (callable functions that create and return the model instance).
torch.hub.list(): A function to list the available models or entry points within a specified GitHub repository.
torch.hub.help(): Provides the docstring and help for a specific model entry point in a Hub repository.
force_reload option: Allows users to bypass the cache and force a fresh download of the model and its dependencies.
skip_validation and trust_repo options: Provide controls for advanced users regarding repository validation and trusting non-official repositories (use with caution).
PyTorch Hub is a valuable resource for various machine learning tasks and workflows:
Using models from PyTorch Hub is designed to be straightforward:
Ensure PyTorch is Installed: PyTorch Hub is part of the core PyTorch library. Make sure you have PyTorch installed in your Python environment.
pip install torch torchvision torchaudio
Browse or Find a Model:
Loading a Model using torch.hub.load():
The primary way to load a model is with the torch.hub.load() function. The basic syntax is:
import torch
model = torch.hub.load('repository_owner/repository_name[:branch_or_tag]', 'model_entrypoint', arg1, arg2, ..., pretrained=True, **kwargs)
'repository_owner/repository_name[:branch_or_tag]': The GitHub repository path (e.g., 'pytorch/vision', 'ultralytics/yolov5'). An optional branch or tag can be specified.
'model_entrypoint': The name of the function defined in the repository's hubconf.py file that creates and returns the model (e.g., 'resnet18', 'yolov5s').
*args, **kwargs: Any arguments required by the model's entry point function (e.g., pretrained=True is common to load pre-trained weights).
source='github': This is the default and usually does not need to be specified explicitly. It can also be 'local' to load from a local directory.
force_reload=False (optional): Set to True to discard the existing cache and force a fresh download of the model and dependencies.
skip_validation=False (optional): Set to True to skip the GitHub API check that the specified branch or commit belongs to the repository owner (use with caution with untrusted repositories).
trust_repo=None (optional): Set to True to trust a repository without being asked, False to be prompted for confirmation, or 'check' to be prompted only when the repository is not already on the local trusted list. If left as None, recent PyTorch versions emit a warning asking you to choose one of these values.
Example (loading a pre-trained ResNet18 model from pytorch/vision):
import torch
model = torch.hub.load('pytorch/vision:v0.13.0', 'resnet18', pretrained=True)
model.eval() # Set the model to evaluation mode
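The source='local' option described above can be tried without network access. The following is a minimal sketch, not an official recipe: it writes a throwaway hubconf.py into a temporary directory and loads its entry point. The entry point name tiny_mlp and its hidden argument are hypothetical and exist only for illustration.

```python
import tempfile
import textwrap
from pathlib import Path

import torch

# Hypothetical hubconf.py contents; `tiny_mlp` is an illustrative entry point.
HUBCONF_SOURCE = textwrap.dedent(
    '''
    dependencies = ["torch"]

    import torch


    def tiny_mlp(hidden=16):
        """Return a small untrained MLP (illustration only)."""
        return torch.nn.Sequential(
            torch.nn.Linear(4, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, 2),
        )
    '''
)

with tempfile.TemporaryDirectory() as repo_dir:
    Path(repo_dir, "hubconf.py").write_text(HUBCONF_SOURCE)
    # With source="local", the first argument is a directory path, not "owner/repo".
    local_model = torch.hub.load(repo_dir, "tiny_mlp", source="local", hidden=8)

local_model.eval()
with torch.no_grad():
    local_output = local_model(torch.zeros(1, 4))
print(local_output.shape)
```

Keyword arguments that torch.hub.load() does not consume itself (here, hidden=8) are forwarded to the entry point function.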
Performing Inference: Once the model is loaded, you can use it for inference as you would with any other PyTorch model:
# Example using the loaded ResNet18 model
# (Preprocessing steps like resizing, normalization, etc., are model-specific
# and usually detailed in the model's documentation or Hub page examples)
import torchvision.transforms as T
from PIL import Image

# Example preprocessing (may vary)
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = Image.open("path/to/your/image.jpg").convert("RGB")  # force 3 channels
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0)  # Create a mini-batch as expected by the model

# Move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)

# Post-process the output (e.g., get class probabilities, apply softmax)
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)
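The post-processing step can be taken one step further by picking the most likely classes. The sketch below uses random logits as a stand-in for the model output, so the printed classes carry no meaning; with a real model you would pass its output tensor instead and map the indices to label names from the model's documentation.

```python
import torch

torch.manual_seed(0)
# Stand-in for `output` from the model: one image, 1000 ImageNet-style logits.
logits = torch.randn(1, 1000)

# Convert logits to probabilities, then keep the five largest per image.
probabilities = torch.nn.functional.softmax(logits, dim=1)
top_prob, top_idx = torch.topk(probabilities, k=5, dim=1)

for p, i in zip(top_prob[0].tolist(), top_idx[0].tolist()):
    print(f"class {i}: {p:.4f}")
```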
Preparing for Fine-Tuning: For transfer learning, you typically load a pre-trained model, replace its final classification layer (or other relevant layers) to suit your new task, and then train it on your custom dataset. The exact steps depend on the model architecture and your specific task.
# Example: Modifying ResNet18 for a new number of classes
NUM_NEW_CLASSES = 10  # placeholder: set to the number of classes in your dataset
num_ftrs = model.fc.in_features
model.fc = torch.nn.Linear(num_ftrs, NUM_NEW_CLASSES)
# Now, model can be trained on your new dataset
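A common variant of this recipe freezes the pre-trained layers and trains only the new head. The sketch below illustrates the idea under stated assumptions: TinyBackbone is a hypothetical stand-in for a Hub model that exposes an fc attribute (as ResNet18 does), and the data is random, so it shows the shape of a training step rather than a real fine-tuning run.

```python
import torch
from torch import nn

NUM_NEW_CLASSES = 3  # placeholder value


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained network that exposes `.fc`."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.fc = nn.Linear(16, 1000)

    def forward(self, x):
        return self.fc(self.features(x))


ft_model = TinyBackbone()

# Freeze every existing parameter, then attach a fresh (trainable) head.
for param in ft_model.parameters():
    param.requires_grad = False
ft_model.fc = nn.Linear(ft_model.fc.in_features, NUM_NEW_CLASSES)

# Only the new head's weight and bias are handed to the optimizer.
trainable = [p for p in ft_model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)
criterion = nn.CrossEntropyLoss()

# One training step on random data, just to show the loop shape.
inputs = torch.randn(4, 8)
targets = torch.randint(0, NUM_NEW_CLASSES, (4,))
optimizer.zero_grad()
loss = criterion(ft_model(inputs), targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

Freezing the backbone keeps training cheap and is often a reasonable starting point for small datasets; unfreezing more layers later is a separate design choice.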
Exploring Models in a Repository:
torch.hub.list(github, force_reload=False): Lists available entry points in a repository's hubconf.py.
torch.hub.list('pytorch/vision')
torch.hub.help(github, model, force_reload=False): Shows the docstring for a specific model entry point.
torch.hub.help('pytorch/vision', 'resnet18')
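Repositories fetched by these calls are kept in a local cache directory, which is what force_reload refreshes. The short sketch below relies only on the torch.hub.get_dir() and torch.hub.set_dir() helpers to inspect and temporarily redirect that location; the temporary directory is purely illustrative.

```python
import tempfile

import torch

# Where downloaded repositories and weights are currently cached.
original_dir = torch.hub.get_dir()
print("current hub cache:", original_dir)

with tempfile.TemporaryDirectory() as scratch:
    torch.hub.set_dir(scratch)        # later hub downloads would land here
    redirected_dir = torch.hub.get_dir()
    torch.hub.set_dir(original_dir)   # restore the previous location

print("redirected to:", redirected_dir)
```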
PyTorch Hub itself is a free resource provided as part of the open-source PyTorch project. The models listed on PyTorch Hub are also generally open-source and available for free use, subject to the specific licenses provided by the model contributors in their respective GitHub repositories. There are no subscription fees or charges imposed by PyTorch Hub for accessing or using these models.
Users are responsible for any computational costs incurred if they run or train these models on cloud platforms or their own hardware.
hubconf.py
Researchers and developers can make their pre-trained PyTorch models available through PyTorch Hub by:
Adding a hubconf.py file in the root of their repository. This file defines one or more entry point functions. Each entry point is a Python callable that:
Accepts optional arguments (e.g., pretrained=False, num_classes=1000).
Returns the model (typically a torch.nn.Module).
Declaring a dependencies = ['torch', 'torchvision'] list in hubconf.py if the model has specific library dependencies beyond PyTorch itself (though the goal is often minimal dependencies for Hub models).
While the PyTorch Hub website primarily showcases models from repositories explicitly indexed by PyTorch (e.g., in pytorch/hub), torch.hub.load() can load models from any public GitHub repository that has a valid hubconf.py, promoting broader sharing.
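The publishing steps above can be sketched as a small hubconf.py. This is an illustration under assumptions, not a template from any official repository: the entry point tiny_classifier, its architecture, and WEIGHTS_URL are hypothetical, and a real repository would point the URL at its own hosted checkpoint.

```python
# hubconf.py (sketch)
dependencies = ["torch"]  # packages that must be importable to load the entry points

import torch

# Hypothetical placeholder; a real repository would point at its own hosted weights.
WEIGHTS_URL = "https://example.com/tiny_classifier.pth"


def tiny_classifier(pretrained=False, num_classes=1000):
    """Small classifier over 32 input features (illustration only).

    This docstring is what torch.hub.help() would display for the entry point.
    """
    model = torch.nn.Sequential(
        torch.nn.Linear(32, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, num_classes),
    )
    if pretrained:
        # Downloads and caches the checkpoint, then loads it into the model.
        state_dict = torch.hub.load_state_dict_from_url(WEIGHTS_URL, progress=True)
        model.load_state_dict(state_dict)
    return model
```

Entry points are plain functions, so calling tiny_classifier(num_classes=5) directly is a quick sanity check before publishing.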
Q1: What is PyTorch Hub?
A1: PyTorch Hub is a semi-centralized repository of pre-trained PyTorch models. It provides a simple API (torch.hub.load()) to discover and use these models from GitHub with minimal code, facilitating research reproducibility and easy access to state-of-the-art models.
Q2: How does torch.hub.load() work?
A2: It downloads (and caches) a model's definition and pre-trained weights from a specified public GitHub repository (which must contain a hubconf.py file defining model entry points). It then instantiates and returns the PyTorch model ready for use.
Q3: Is PyTorch Hub free?
A3: Yes, PyTorch Hub is a free component of the PyTorch open-source project. The models available through it are generally also open-source and free to use, but you must check the specific license of each model you download.
Q4: How are models added to PyTorch Hub?
A4: Model authors can make their models discoverable via PyTorch Hub by ensuring their public GitHub repository contains a hubconf.py file that defines the model entry points. While some models are prominently featured on the PyTorch Hub website, torch.hub.load() can fetch from any compliant GitHub repository.
Q5: Can I use PyTorch Hub models for commercial projects?
A5: This depends entirely on the license of the specific model you intend to use. PyTorch Hub itself doesn't impose licensing restrictions, but each model contributor specifies their own license (e.g., MIT, Apache 2.0). Always check the model's source repository for its license before commercial use.
Q6: What kind of models are available on PyTorch Hub?
A6: A wide variety, including models for image classification (ResNet, VGG), object detection (YOLO, SSD), segmentation, natural language processing (BERT, Transformers for translation/summarization), speech synthesis (Tacotron2), speech recognition (Wav2Vec2), and generative models.
Q7: How does PyTorch Hub ensure reproducibility?
A7: By allowing users to load specific versions of models (via Git tags or branches in the repo_or_dir argument) and their pre-trained weights directly from source repositories, PyTorch Hub helps in creating reproducible research workflows.
Q8: What is hubconf.py?
A8: hubconf.py is a Python file that must be present in the root of a GitHub repository for its models to be loadable via torch.hub.load(). It defines entry point functions that create and return model instances, and can also list dependencies.
torch.hub: https://pytorch.org/docs/stable/hub.html
pytorch/vision: https://github.com/pytorch/vision
ultralytics/yolov5: https://github.com/ultralytics/yolov5
Last updated: May 26, 2025