Core Products
This chapter covers the main parts of Hugging Face you will use most often.
The Hub
The Hub is the central platform where models, datasets, and demos are stored, versioned, and discovered.
It hosts:
- Model repositories
- Dataset repositories
- Space repositories
- Versioned files
- Documentation-like model cards and dataset cards
- Community discussions and pull requests
Each repo is Git-based, so its files carry a commit history and can be branched, tagged, and changed through pull requests.
Example: A model repo might contain model weights, tokenizer files, config files, a README model card, and example usage.
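To make that layout concrete, here is a minimal sketch that groups the file names in a model repo by role. The suffix and file-name rules, the `classify_repo_files` helper, and the sample file list are illustrative assumptions, not an official Hub schema. The optional `list_files_from_hub` helper assumes the `huggingface_hub` package is installed and relies on its `list_repo_files` function.

```python
from collections import defaultdict

# Illustrative rules only: real repos vary, and this is not an official schema.
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".pt", ".h5", ".gguf")
TOKENIZER_NAMES = {
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.txt",
    "merges.txt",
    "special_tokens_map.json",
}


def classify_repo_files(filenames):
    """Group file names from a model repo into rough roles."""
    groups = defaultdict(list)
    for name in filenames:
        base = name.rsplit("/", 1)[-1]  # ignore any folder prefix
        if base.lower() == "readme.md":
            groups["model_card"].append(name)
        elif base in TOKENIZER_NAMES:
            groups["tokenizer"].append(name)
        elif base.endswith(WEIGHT_SUFFIXES):
            groups["weights"].append(name)
        elif base.endswith(".json"):
            groups["config"].append(name)
        else:
            groups["other"].append(name)
    return dict(groups)


def list_files_from_hub(repo_id):
    """Fetch the file list for a public repo (needs network access)."""
    from huggingface_hub import list_repo_files  # imported lazily on purpose
    return list_repo_files(repo_id)


# Hypothetical file list, shaped like a typical model repo.
sample = [
    "README.md",
    "config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
]
print(classify_repo_files(sample))
```

Passing the result of `list_files_from_hub("some-org/some-model")` (a placeholder repo id) to `classify_repo_files` gives the same grouping for a real repo.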
Models
Model pages usually include:
- Task tags
- License
- Downloads and likes
- Supported languages or domains
- Intended use and limitations
- Example inference code
- Files such as weights and tokenizer config
What to Check First
| Check | Why It Matters |
|---|---|
| Task | Make sure the model matches your job |
| License | Confirm you can legally use it |
| Size | Large models may be too slow or expensive |
| Languages/domain | General models may perform poorly on niche data |
| Last update | Stale repos can still be good, but inspect carefully |
| Model card quality | Good documentation usually means easier adoption |
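The checklist above can be sketched as a small screening function. Everything here is an assumption made for illustration: the `screen_model` helper, the keys of the metadata dictionary, the one-year staleness threshold, and the candidate values. In practice you would fill the dictionary by reading the model page or its card.

```python
from datetime import date

STALE_AFTER_DAYS = 365  # arbitrary threshold chosen for this sketch


def screen_model(meta, task, language, allowed_licenses, max_size_gb, today=None):
    """Return a list of concerns; an empty list means nothing was flagged."""
    today = today or date.today()
    concerns = []
    if meta.get("task") != task:
        concerns.append(f"task is {meta.get('task')!r}, wanted {task!r}")
    if meta.get("license") not in allowed_licenses:
        concerns.append(f"license {meta.get('license')!r} is not on the allow list")
    if meta.get("size_gb", 0) > max_size_gb:
        concerns.append(f"size {meta['size_gb']} GB exceeds {max_size_gb} GB")
    if language not in meta.get("languages", []):
        concerns.append(f"language {language!r} is not listed")
    last = meta.get("last_updated")
    if last is not None and (today - last).days > STALE_AFTER_DAYS:
        concerns.append("no update in over a year; inspect carefully")
    if not meta.get("has_model_card", False):
        concerns.append("model card is missing or empty")
    return concerns


# Hypothetical metadata for one candidate model.
candidate = {
    "task": "translation",
    "license": "apache-2.0",
    "size_gb": 1.2,
    "languages": ["en", "fr"],
    "last_updated": date(2022, 1, 15),
    "has_model_card": True,
}

for concern in screen_model(
    candidate,
    task="translation",
    language="de",
    allowed_licenses={"apache-2.0", "mit"},
    max_size_gb=5,
    today=date(2024, 1, 1),
):
    print("-", concern)
```

A flagged concern is a prompt to look closer, not an automatic rejection: as the table notes, a stale repo can still be a good one.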
Datasets
Dataset pages describe:
- Where the data came from
- What fields exist
- How the data is split
- Known biases or limitations
- Licensing and usage constraints
Example: For named entity recognition, a dataset might have tokens and ner_tags columns.
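A single record from such a dataset might look like the sketch below. The label list, the sample sentence, and the `decode_tags` helper are made up for illustration, and the sketch assumes `ner_tags` stores integer ids that index into a label list; a real dataset defines its own labels on its dataset card.

```python
# Hypothetical label list in the common BIO style.
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

# One made-up record with the two columns named in the text.
record = {
    "tokens": ["Ada", "Lovelace", "visited", "London", "."],
    "ner_tags": [1, 2, 0, 5, 0],
}


def decode_tags(record, labels):
    """Pair each token with the label name its integer tag points to."""
    tokens, tags = record["tokens"], record["ner_tags"]
    if len(tokens) != len(tags):
        raise ValueError("tokens and ner_tags must have the same length")
    return [(token, labels[tag]) for token, tag in zip(tokens, tags)]


for token, label in decode_tags(record, LABELS):
    print(f"{token}\t{label}")
```

The length check matters because the two columns are aligned position by position: one tag per token.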
Spaces
Spaces are hosted demos and mini-apps.
Common frameworks:
- Gradio for quick ML demos
- Streamlit for data/app UIs
- Docker for custom environments
- Static for simple front-end pages
Example: A translation demo where users enter text, pick languages, and see model output.
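A translation demo like this could be wired up with Gradio roughly as follows. This is a sketch under stated assumptions: the `translate` function is a placeholder that only echoes its input with a language prefix, where a real Space would call a translation model; the language list is illustrative; and `build_demo` assumes the `gradio` package is installed.

```python
LANGUAGES = ["English", "French", "German"]  # illustrative choices


def translate(text, source_lang, target_lang):
    """Placeholder logic: a real Space would call a translation model here."""
    if not text.strip():
        return ""
    if source_lang == target_lang:
        return text
    return f"[{source_lang} -> {target_lang}] {text}"


def build_demo():
    """Wire the function into a Gradio UI (requires the gradio package)."""
    import gradio as gr  # lazy import so translate() works without gradio

    return gr.Interface(
        fn=translate,
        inputs=[
            gr.Textbox(label="Text"),
            gr.Dropdown(choices=LANGUAGES, value="English", label="From"),
            gr.Dropdown(choices=LANGUAGES, value="French", label="To"),
        ],
        outputs=gr.Textbox(label="Translation"),
        title="Translation demo (sketch)",
    )


# To serve the UI locally: build_demo().launch()
print(translate("Hello", "English", "French"))
```

Keeping the model call in a plain function, separate from the UI wiring, lets you test the logic without starting a web server.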
Inference API and Endpoints
Hugging Face offers hosted inference options so you do not have to run everything yourself.
Two broad patterns:
- Shared/simple hosted inference for quick testing or moderate usage
- Dedicated endpoints for production-grade deployment and control
Use shared inference when:
- You are prototyping
- Traffic is low
- You need convenience more than control
Use dedicated endpoints when:
- You need predictable latency
- You want autoscaling or private networking
- You need production ownership
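The two lists above amount to a simple decision rule, sketched below. The `Requirements` class and `choose_inference_option` function are hypothetical names invented for this example; they encode only the criteria listed in this section and are not part of any Hugging Face API.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Production needs taken from the 'dedicated endpoints' list."""
    predictable_latency: bool = False
    autoscaling_or_private_networking: bool = False
    production_ownership: bool = False


def choose_inference_option(req):
    """Return ('dedicated', reasons) if any production need is set, else ('shared', [])."""
    reasons = [name for name, wanted in vars(req).items() if wanted]
    return ("dedicated" if reasons else "shared"), reasons


# Prototype with low traffic: no production needs are set.
print(choose_inference_option(Requirements()))

# A service with a latency target.
print(choose_inference_option(Requirements(predictable_latency=True)))
```

Returning the reasons alongside the choice makes it easy to record why a project moved from shared inference to a dedicated endpoint.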
Account and Authentication
You can browse most public resources without an account.
You need an account when you want to:
- Like or discuss repos
- Create models, datasets, or Spaces
- Push content
- Access gated assets
- Use private repos
Access Tokens
Tokens are used for programmatic access.
Best practice:
- Store them in environment variables or secure secret managers
- Give the minimum permissions needed
- Rotate them if exposed
Do not hardcode tokens in notebooks or committed source files.
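One way to follow this advice is to read the token from the environment at run time. The sketch below uses only the standard library. The `load_hf_token` and `redact` helpers are hypothetical names; `HF_TOKEN` is used as the default because it is the variable name the `huggingface_hub` library looks for, but any name works if you pass it explicitly.

```python
import os


def load_hf_token(env_var="HF_TOKEN"):
    """Read the access token from the environment, failing loudly if absent."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this script."
        )
    return token


def redact(token, visible=4):
    """Mask a token for logging so the full value never reaches the output."""
    if len(token) <= visible:
        return "*" * len(token)
    return token[:visible] + "*" * (len(token) - visible)


# Typical use inside a script:
#   token = load_hf_token()
#   print("Using token", redact(token))
```

Because the token never appears in the source file, the notebook or script can be committed and shared without exposing it.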
Public, Private, and Gated
| Visibility | Meaning |
|---|---|
| Public | Anyone can access |
| Private | Only authorized users can access |
| Gated | Publicly visible listing, but download requires approval or accepting terms |
Example: Some model providers require you to accept a license agreement before downloading weights.
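The table can be read as a small access rule, sketched here. This is a simplified model written for illustration: the `can_download` function and its parameters are hypothetical, and the Hub's real permission checks are more detailed than three flags.

```python
def can_download(visibility, is_authorized=False, access_granted=False):
    """Apply the visibility table to decide whether files can be downloaded.

    is_authorized:  the user has been given access to a private repo.
    access_granted: the user was approved or accepted the terms of a gated repo.
    """
    if visibility == "public":
        return True
    if visibility == "private":
        return is_authorized
    if visibility == "gated":
        return access_granted
    raise ValueError(f"unknown visibility: {visibility!r}")


print(can_download("public"))                      # anyone can access
print(can_download("gated"))                       # listing is visible, files are not
print(can_download("gated", access_granted=True))  # after approval or accepting terms
```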
Organizations and Collaboration
Hugging Face supports users and organizations.
Organizations help teams:
- Share ownership of repos
- Manage permissions
- Publish work under a company or project name
- Keep models, datasets, and demos grouped together
Typical End-to-End Flow
A common path looks like this:
- Search the Hub for a model
- Read the model card
- Test it in the browser or a notebook
- Download it with transformers or huggingface_hub
- Fine-tune or evaluate if needed
- Push your improved artifact back to the Hub
- Create a Space or endpoint for others to use
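The search, download, and push steps of this flow can be sketched with the `huggingface_hub` library. The function names below (`hub_url`, `find_candidates`, `download_snapshot`, `publish_folder`) are hypothetical wrappers written for this example; the library calls inside them (`HfApi.list_models`, `snapshot_download`, `HfApi.create_repo`, `HfApi.upload_folder`) assume a reasonably recent version of the package, plus network access and a token with write permission for the push step.

```python
def hub_url(repo_id, repo_type="model"):
    """Build the web URL for a repo; datasets and Spaces live under a prefix."""
    prefixes = {"model": "", "dataset": "datasets/", "space": "spaces/"}
    if repo_type not in prefixes:
        raise ValueError(f"unknown repo_type: {repo_type!r}")
    return f"https://huggingface.co/{prefixes[repo_type]}{repo_id}"


def find_candidates(task, limit=5):
    """Search the Hub for models tagged with a task (needs network access)."""
    from huggingface_hub import HfApi  # lazy import: only needed when called

    models = HfApi().list_models(filter=task, limit=limit)
    return [model.id for model in models]


def download_snapshot(repo_id):
    """Download every file in the repo and return the local folder path."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


def publish_folder(local_dir, repo_id, token):
    """Create the target repo if needed, upload a folder, return its URL."""
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    api.create_repo(repo_id=repo_id, exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id)
    return hub_url(repo_id)


# Placeholder repo id, shown only to illustrate the URL shapes.
print(hub_url("your-username/your-model"))
print(hub_url("your-username/your-demo", repo_type="space"))
```

Reading the model card, testing, and fine-tuning stay manual steps between `download_snapshot` and `publish_folder`; the sketch covers only the parts that talk to the Hub.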
The Minimum You Need to Remember
If you forget everything else, remember this:
- Hub = repository and discovery layer
- Model card = documentation and warning label
- Dataset card = data provenance and risk notes
- Space = live demo
- Endpoint = managed deployment