# Reference
Use this as a fast refresher after reading the tutorial.
## Glossary
| Term | Meaning |
|---|---|
| Hub | Central repository platform for models, datasets, and Spaces |
| Model card | README-like documentation for a model |
| Dataset card | Documentation for a dataset |
| Space | Hosted demo app |
| Pipeline | High-level inference wrapper in transformers |
| Tokenizer | Converts raw input into model-readable tokens |
| Checkpoint | Saved model weights/state |
| PEFT | Parameter-efficient fine-tuning |
| LoRA | A common PEFT method for adapting large models cheaply |
| Revision | A specific version of a repo: a branch, tag, or commit hash |
| Gated model | Requires approval or terms acceptance before access |
## What to Learn First
If you are new, use this order:
- Learn what the ecosystem contains
- Browse models and read model cards
- Run a `pipeline`
- Learn `AutoTokenizer` and the auto model classes
- Load datasets and compute metrics
- Fine-tune only when needed
- Deploy with a Space or endpoint
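The "run a pipeline" step above takes only a couple of lines; a minimal sketch (the default model downloads on first use):

```python
from transformers import pipeline

# Downloads a default sentiment-analysis checkpoint on first run
classifier = pipeline("sentiment-analysis")
result = classifier("Hugging Face makes this easy.")[0]
print(result["label"], round(result["score"], 3))
```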
## Fast Decision Guide
| If You Need To... | Start Here |
|---|---|
| Browse available models | Hub search and model cards |
| Run something in 5 minutes | `pipeline()` |
| Build a proper Python workflow | `transformers` + `datasets` |
| Adapt a model cheaply | `peft` / LoRA |
| Share an interactive demo | Space |
| Put a model behind an API | Endpoint or self-hosted service |
| Reuse files programmatically | `huggingface_hub` |
## Common Commands
```bash
# Install core packages
pip install -U transformers datasets tokenizers evaluate accelerate peft huggingface_hub

# Log in
hf auth login

# Download a file from a repo
hf download distilbert/distilbert-base-uncased config.json
```
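The same download can be done from Python with `huggingface_hub`; a short sketch:

```python
from huggingface_hub import hf_hub_download

# Downloads config.json into the local cache and returns its path
path = hf_hub_download(
    repo_id="distilbert/distilbert-base-uncased",
    filename="config.json",
)
print(path)
```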
## Common Python Patterns
```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
```
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
```
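Putting a tokenizer and model together for a single prediction looks like this; a sketch using the same checkpoint as above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tokenize, run a forward pass, and map the top logit back to a label
inputs = tokenizer("This library is a joy to use.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = model.config.id2label[logits.argmax(dim=-1).item()]
print(pred)
```

This is what `pipeline()` does for you under the hood, minus batching and pre/post-processing conveniences.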
```python
from datasets import load_dataset

dataset = load_dataset("imdb")
```
## Common Mistakes
- Choosing a model without reading the license
- Confusing a good demo with production readiness
- Ignoring tokenizer/model compatibility
- Skipping evaluation on real data
- Using fine-tuning when prompting or smaller models would suffice
- Forgetting to pin revisions
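Pinning a revision (the last mistake above) means passing an explicit branch, tag, or commit hash to `from_pretrained`; a sketch — `"main"` runs as-is, but for real reproducibility you would substitute a full commit SHA from the repo's history:

```python
from transformers import AutoTokenizer

# "main" resolves to the branch tip; replace it with a commit SHA
# to guarantee you always load exactly the same files
tokenizer = AutoTokenizer.from_pretrained(
    "distilbert/distilbert-base-uncased",
    revision="main",
)
print(type(tokenizer).__name__)
```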
## A Sensible Learning Roadmap
### Day 1
- Read chapters 01-03
- Browse the Hub and shortlist a few interesting repos
- Run one pipeline locally
### Day 2
- Read chapters 04-05
- Load one dataset and one model in Python
- Compare two models on a few real inputs
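The Day 2 comparison can be as simple as running two pipelines over the same inputs; a sketch assuming both model ids below exist on the Hub (verify before running, and note different models may use different label names):

```python
from transformers import pipeline

inputs = ["The update fixed my bug.", "Support never replied."]
model_ids = [
    "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "cardiffnlp/twitter-roberta-base-sentiment-latest",
]

# Collect each model's labels for the same inputs, side by side
results = {}
for model_id in model_ids:
    clf = pipeline("sentiment-analysis", model=model_id)
    results[model_id] = [p["label"] for p in clf(inputs)]
    print(model_id, results[model_id])
```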
### Day 3+
- Read chapters 06-09
- Fine-tune only if you have a clear use case
- Publish a small demo or internal proof of concept