Core Products

This chapter covers the main parts of Hugging Face you will use most often.

The Hub

The Hub is the center of everything.

It hosts:

  • Model repositories
  • Dataset repositories
  • Space repositories
  • Versioned files
  • Documentation such as model cards and dataset cards
  • Community discussions and pull requests

Each repo is a Git repository under the hood: files are versioned, and changes land as commits you can inspect and roll back.

Example: A model repo might contain model weights, tokenizer files, config files, a README model card, and example usage.
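Because files are versioned, every file in a repo is reachable at a predictable URL. A minimal sketch (the repo id, filename, and revision below are illustrative placeholders; the Hub serves raw files under a /resolve/ path):

```python
# Sketch: how versioned files in a Hub repo map to download URLs.
# The repo_id, filename, and revision values below are placeholders.

def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the download URL for one file in a Hub repo.

    The Hub serves raw files at /{repo_id}/resolve/{revision}/{filename};
    pinning `revision` to a commit hash or tag gives reproducible downloads.
    """
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Latest file on the default branch:
print(hub_file_url("bert-base-uncased", "config.json"))
# The same file pinned to a specific revision (placeholder hash):
print(hub_file_url("bert-base-uncased", "config.json", revision="abc123"))
```

Pinning a revision rather than tracking main is what makes a download reproducible even if the repo later changes.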

Models

Model pages usually include:

  • Task tags
  • License
  • Downloads and likes
  • Supported languages or domains
  • Intended use and limitations
  • Example inference code
  • Files such as weights and tokenizer config

What to Check First

Check                 Why It Matters
Task                  Make sure the model matches your job
License               Confirm you can legally use it
Size                  Large models may be too slow or expensive
Languages/domain      General models may perform poorly on niche data
Last update           Stale repos can still be good, but inspect carefully
Model card quality    Good documentation usually means easier adoption

Datasets

Dataset pages describe:

  • Where the data came from
  • What fields exist
  • How the data is split
  • Known biases or limitations
  • Licensing and usage constraints

Example: For named entity recognition, a dataset might have tokens and ner_tags columns.
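To make that example concrete, here is a sketch of what one record in such a dataset might look like. The sentence and tag names are invented for illustration; real datasets document their exact label scheme on the dataset card:

```python
# Sketch of one record in a token-classification (NER) dataset.
# The sentence and the B-/I-/O tag names are invented for illustration.
record = {
    "tokens":   ["Ada", "Lovelace", "lived", "in", "London", "."],
    "ner_tags": ["B-PER", "I-PER", "O", "O", "B-LOC", "O"],
}

# Each token lines up with exactly one tag.
assert len(record["tokens"]) == len(record["ner_tags"])

# Pair them up for easy inspection.
for token, tag in zip(record["tokens"], record["ner_tags"]):
    print(f"{token:10s} {tag}")
```

The invariant that the two columns have equal length is exactly the kind of detail a good dataset card spells out.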

Spaces

Spaces are hosted demos and mini-apps.

Common frameworks:

  • Gradio for quick ML demos
  • Streamlit for data/app UIs
  • Docker for custom environments
  • Static for simple front-end pages

Example: A translation demo where users enter text, pick languages, and see model output.

Inference API and Endpoints

Hugging Face offers hosted inference options so you do not have to run everything yourself.

Two broad patterns:

  • Shared/simple hosted inference for quick testing or moderate usage
  • Dedicated endpoints for production-grade deployment and control

Use shared inference when:

  • You are prototyping
  • Traffic is low
  • You need convenience more than control

Use dedicated endpoints when:

  • You need predictable latency
  • You want autoscaling or private networking
  • You need production ownership
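Either hosting pattern is reached over plain HTTPS. As a sketch of the request shape, a call typically POSTs a JSON body with an inputs field and a bearer-token header. The model id is a placeholder, the shared-inference URL pattern is an assumption about the hosted API, and a dedicated endpoint gets its own URL when you create it. The request is built but never sent, so this runs offline:

```python
import json
import os
import urllib.request

# Sketch of an inference request, built but not sent.
# MODEL_ID is a placeholder; the api-inference URL pattern is an assumption
# about the shared hosted API, and dedicated endpoints use their own URL.
MODEL_ID = "some-user/some-sentiment-model"
url = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

# Token comes from the environment, never hardcoded.
token = os.environ.get("HF_TOKEN", "hf_placeholder")
payload = json.dumps({"inputs": "I loved this movie!"}).encode("utf-8")

request = urllib.request.Request(
    url,
    data=payload,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Actually sending it would be: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

The same request shape works against a dedicated endpoint; only the URL and the scaling behavior behind it change.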

Account and Authentication

You can browse most public resources without an account.

You need an account when you want to:

  • Like or discuss repos
  • Create models, datasets, or Spaces
  • Push content
  • Access gated assets
  • Use private repos

Access Tokens

Access tokens authenticate programmatic access: API calls, pushes from scripts, and downloads of private or gated content.

Best practice:

  • Store them in environment variables or secure secret managers
  • Give the minimum permissions needed
  • Rotate them if exposed

Do not hardcode tokens in notebooks or committed source files.
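The environment-variable practice can be sketched in a few lines. HF_TOKEN is used as the variable name here by convention, but any name works; the point is that the secret never appears in source:

```python
import os

# Sketch: read the access token from the environment instead of hardcoding it.
# HF_TOKEN is a conventional variable name; the secret itself stays in your
# shell profile or secret manager, never in committed files.

def load_hf_token() -> str:
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export it in your shell or secret manager "
            "rather than hardcoding it in notebooks or source files."
        )
    return token

# Typical usage: pass the token to whichever client needs it, e.g.
# client = SomeHubClient(token=load_hf_token())   # placeholder client name
```

Failing loudly when the variable is missing is preferable to silently falling back to an anonymous or hardcoded credential.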

Public, Private, and Gated

Visibility    Meaning
Public        Anyone can access
Private       Only authorized users can access
Gated         Publicly visible listing, but download requires approval or accepting terms

Example: Some model providers require you to accept a license agreement before downloading weights.

Organizations and Collaboration

Hugging Face supports users and organizations.

Organizations help teams:

  • Share ownership of repos
  • Manage permissions
  • Publish work under a company or project name
  • Keep models, datasets, and demos grouped together

Typical End-to-End Flow

A common path looks like this:

  1. Search the Hub for a model
  2. Read the model card
  3. Test it in the browser or in a notebook
  4. Download it with transformers or huggingface_hub
  5. Fine-tune or evaluate if needed
  6. Push your improved artifact back to the Hub
  7. Create a Space or endpoint for others to use

The Minimum You Need to Remember

If you forget everything else, remember this:

  • Hub = repository and discovery layer
  • Model card = documentation and warning label
  • Dataset card = data provenance and risk notes
  • Space = live demo
  • Endpoint = managed deployment