Internals: It's Just Files
This chapter opens the .git directory and shows you that everything you learned is just blobs, trees, commits, and refs.
Why This Matters
You can use git productively without ever looking inside .git/. But one hour of poking around makes every command make sense. When git push rejects you or git checkout complains about your index, you'll know what's actually going on.
The .git Directory
Start a fresh repo and look:
mkdir demo && cd demo
git init
ls -a .git
HEAD config description hooks/ info/
objects/ refs/
Key files:
- HEAD: a text file with one line, usually ref: refs/heads/main.
- config: this repo's config (remotes, user, etc.).
- objects/: every blob, tree, commit, and tag in the repo.
- refs/: the pointers. refs/heads/ for branches, refs/tags/ for tags, refs/remotes/ for remote-tracking branches.
The Four Object Types
Everything git stores is one of four kinds of object. Each is identified by the SHA-1 hash of a short header (its type and size) followed by its content.
Blob
A blob is the content of a file. No filename, no permissions. Just bytes.
echo "hello" | git hash-object --stdin
ce013625030ba8dba906f756967f9e9ca394464a
That's the ID of a blob containing hello\n. Git hashes a short header plus the content, so it differs from a plain SHA-1 of the string. Any file with that exact content gets the same hash, anywhere in the world, on any machine.
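To see where that number comes from, here is a minimal sketch that rebuilds the hash without git. It assumes a repo using the default SHA-1 object format and that sha1sum (GNU coreutils) is on your PATH; on macOS, shasum is the usual substitute. The variable names are only for illustration.

```shell
# Hash "hello\n" the way git does for a blob:
# SHA-1 over the header "blob SIZE" + NUL byte, followed by the content.
content='hello'
size=$(printf '%s\n' "$content" | wc -c | tr -d ' ')   # 6 bytes: five letters plus a newline
manual=$( { printf 'blob %s\0' "$size"; printf '%s\n' "$content"; } | sha1sum | cut -d' ' -f1 )
echo "$manual"
# ce013625030ba8dba906f756967f9e9ca394464a
```

The header is why the value differs from running sha1sum on the bare string.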
To write it into the object store:
echo "hello" | git hash-object --stdin -w
ls .git/objects/ce/
013625030ba8dba906f756967f9e9ca394464a
Git uses the first two hex characters of the hash as a directory name and the remaining 38 as the filename. Old filesystems preferred fewer files per directory.
Read it back:
git cat-file -p ce013625
hello
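Besides -p, cat-file can report an object's type and size, which is handy when you have a hash and no idea what it is. A small sketch in a throwaway repo (the temp directory is just for the demo):

```shell
# Create a scratch repo so nothing here touches a real project.
repo=$(mktemp -d) && cd "$repo" && git init -q
hash=$(echo "hello" | git hash-object -w --stdin)
git cat-file -t "$hash"    # object type: blob
git cat-file -s "$hash"    # content size in bytes: 6
git cat-file -e "$hash" && echo "object exists"   # -e only sets the exit status
```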
Tree
A tree is a directory listing. It maps names to objects (other trees or blobs), with a mode (file/executable/symlink/directory).
After a commit, you can see one:
echo "# notes" > README.md
git add README.md
git commit -m "first"
git cat-file -p HEAD^{tree}
100644 blob a906049... README.md
That tree has one entry: a file called README.md with mode 100644 (regular file), whose content is the blob a906049....
Trees reference other trees for subdirectories:
100644 blob a906049... README.md
040000 tree b3ad56c... src
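Because blobs carry no filename, two paths with identical content end up pointing at the same blob from different trees. The sketch below shows this in a scratch repo; the file names and the demo identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"            # placeholder identity for this throwaway repo
git config user.email "demo@example.com"
mkdir src
echo "same bytes" > README.md
echo "same bytes" > src/copy.txt
git add . && git commit -q -m "two files, one blob"
git cat-file -p 'HEAD^{tree}'               # top-level tree: one blob entry, one tree entry
git ls-tree -r HEAD                         # flattened view across the nested tree
top=$(git rev-parse HEAD:README.md)         # blob ID reached through the top-level tree
nested=$(git rev-parse HEAD:src/copy.txt)   # blob ID reached through the src tree
[ "$top" = "$nested" ] && echo "both paths share blob $top"
```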
Commit
A commit points at a tree (the snapshot) and zero or more parents (none for the first commit, two or more for a merge), and carries author, committer, date, and a message.
git cat-file -p HEAD
tree 3a2b1c...
parent 4d5e6f...
author Ada Lovelace <ada@example.com> 1713521700 +0000
committer Ada Lovelace <ada@example.com> 1713521700 +0000
Add login form
A commit is four or five fields and a message, text-serialized, hashed, written to .git/objects/.
Branches, rebases, and merges all boil down to creating more commits and moving refs to point at them.
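You can check the "text-serialized, hashed" part yourself. This sketch recomputes a commit's ID from its raw text, assuming the default SHA-1 object format and GNU sha1sum; the repo and identity are throwaway placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# notes" > README.md
git add README.md && git commit -q -m "first"
size=$(git cat-file -s HEAD)              # byte length of the serialized commit
# Same recipe as for blobs, with "commit" as the type in the header.
manual=$( { printf 'commit %s\0' "$size"; git cat-file commit HEAD; } | sha1sum | cut -d' ' -f1 )
echo "computed: $manual"
echo "git says: $(git rev-parse HEAD)"
```

Change any field (message, date, parent, tree) and the hash changes with it, which is why rewriting history always produces new commits.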
Tag
An annotated tag is the fourth object type. It wraps another object, almost always a commit, with a name, tagger, date, and message.
git tag -a v1.0 -m "first release"
git cat-file -p v1.0
object 3f1a22c...
type commit
tag v1.0
tagger Ada Lovelace <ada@example.com> 1713521700 +0000
first release
Lightweight tags (no -a) aren't tag objects; they're just named refs pointing at a commit. That's why annotated tags can carry a message and lightweight tags can't.
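The difference shows up as soon as you ask git what each tag name resolves to. A quick sketch in a scratch repo (tag names and identity are placeholders):

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "# notes" > README.md
git add README.md && git commit -q -m "first"
git tag -a v1.0 -m "first release"        # annotated: writes a tag object
git tag v1.0-light                        # lightweight: only writes a ref
git cat-file -t v1.0                      # tag
git cat-file -t v1.0-light                # commit
# Peeling the annotated tag lands on the same commit the lightweight tag names.
git rev-parse 'v1.0^{commit}' v1.0-light HEAD
```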
Refs: Named Pointers
A ref is a name that points at a commit (usually). Most refs are files under .git/refs/:
cat .git/refs/heads/main
3f1a22c1b0abcdef1234567890abcdef12345678
Single line, a commit hash. Moving a branch forward is just rewriting this file.
Git sometimes packs refs for performance:
cat .git/packed-refs
# pack-refs with: peeled fully-peeled sorted
3f1a22c... refs/heads/main
4d5e6f7... refs/tags/v1.0
git update-ref is the low-level command to change a ref:
git update-ref refs/heads/main 4d5e6f7
git branch -f, git reset, git commit: each of them ends by updating a ref, which is the same thing update-ref does.
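Here is a sketch of update-ref creating and moving a branch by hand. It also uses the optional third argument (the expected old value), which makes the update fail if the ref has moved in the meantime. The branch name experiment is made up for the demo, and the cat line assumes the ref is stored as a loose file rather than packed.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
old=$(git rev-parse HEAD)
git update-ref refs/heads/experiment "$old"        # creating a branch is just writing a ref
cat .git/refs/heads/experiment                     # assumes a loose (unpacked) ref file
git commit -q --allow-empty -m "second"
new=$(git rev-parse HEAD)
git update-ref refs/heads/experiment "$new" "$old" # move it only if it still points at $old
# A second attempt with the stale old value is refused, because the ref is now at $new.
if ! git update-ref refs/heads/experiment "$old" "$old" 2>/dev/null; then
  echo "stale update refused"
fi
```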
HEAD
HEAD is usually a symbolic ref pointing at a branch:
cat .git/HEAD
ref: refs/heads/main
Detached HEAD is when HEAD points directly at a commit:
3f1a22c1b...
That's the only difference. Git warns about detached HEAD because commits you make there aren't captured by any branch; only the reflog holds on to them.
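To watch the file change, detach HEAD in a scratch repo and compare .git/HEAD before and after. This is a sketch under the assumption that the repo uses the default file-based ref storage.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
attached=$(cat .git/HEAD)                 # "ref: refs/heads/BRANCH"
echo "$attached"
git symbolic-ref HEAD                     # the same answer via plumbing
git checkout -q --detach                  # stay on the same commit, but drop the branch
detached=$(cat .git/HEAD)                 # now a bare commit hash
echo "$detached"
git symbolic-ref -q HEAD || echo "HEAD is detached"
```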
The Index (Staging Area)
The index is a binary file at .git/index. It's an ordered list of the paths and hashes git expects in the next commit.
git ls-files --stage shows you:
git ls-files --stage
100644 a906049... 0 README.md
100644 3fb2e5c... 0 src/index.ts
Mode, blob hash, stage number, path. Stage 0 is the normal state; stages 1, 2, 3 hold ancestor/ours/theirs during a merge conflict.
git add updates the index. git commit creates a tree from the index and writes a commit pointing at that tree.
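A short sketch makes the division of labor visible: git add records a blob hash in the index, later edits to the file don't touch that entry, and the tree in the commit is exactly the tree the index produces. File names and identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "draft" > notes.txt
expected=$(git hash-object notes.txt)     # hash of the current file content
git add notes.txt                         # writes the blob and records it in the index
staged=$(git ls-files --stage notes.txt | cut -d' ' -f2)
echo "edited" > notes.txt                 # the working tree changes, the index does not
still=$(git ls-files --stage notes.txt | cut -d' ' -f2)
tree=$(git write-tree)                    # build a tree object from the index
git commit -q -m "snapshot"
git rev-parse 'HEAD^{tree}'               # same ID as $tree
```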
Packfiles
A fresh repo stores each object as a file under .git/objects/. Over time, git packs them: compresses many objects into one packfile, using delta compression so near-identical objects share storage.
ls .git/objects/pack/
pack-a1b2c3.idx
pack-a1b2c3.pack
git gc triggers packing manually. Normally git runs git gc --auto for you after certain commands, and it only does real work once enough loose objects have piled up.
Packfiles are why a git repo with a million commits is often a few hundred megabytes rather than several gigabytes.
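git count-objects -v reports how many objects are loose and how many are packed, so you can watch the move happen. A sketch in a scratch repo with three small commits:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "line $i" >> log.txt
  git add log.txt
  git commit -q -m "commit $i"
done
git count-objects -v                      # "count" is loose objects, "in-pack" is packed ones
git gc -q                                 # pack everything reachable
git count-objects -v
in_pack=$(git count-objects -v | awk '/^in-pack:/ { print $2 }')
echo "objects now in packs: $in_pack"
ls .git/objects/pack/                     # a .pack file and its .idx index
```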
Plumbing vs Porcelain
Git commands fall into two categories:
- Porcelain: high-level commands for humans. git commit, git log, git branch, git merge.
- Plumbing: low-level commands for scripts. git cat-file, git hash-object, git update-ref, git ls-files, git rev-parse.
You use porcelain daily. Plumbing is for writing tools, debugging weird states, and understanding what porcelain does.
Making a commit entirely with plumbing, in a fresh repo with an empty index:
# Write a blob
blob=$(echo "hello" | git hash-object -w --stdin)
# Stage it in the index, then build a tree from the index
git update-index --add --cacheinfo 100644,"$blob",hello.txt
tree=$(git write-tree)
# Make a commit with no parent (a root commit)
commit=$(echo "first" | git commit-tree "$tree")
# Update a branch to point at it
git update-ref refs/heads/main "$commit"
You now have a commit. git log main will show it, and so will plain git log if HEAD points at main. Porcelain commands are shortcuts around this kind of flow.
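Adding history is the same flow with one more flag: commit-tree takes -p to name a parent. The sketch below builds two linked commits in a scratch repo; the branch name plumbing-demo and the identity are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # commit-tree reads the identity from config too
git config user.email "demo@example.com"
# First commit: one blob, one tree, no parent.
blob=$(echo "hello" | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob",hello.txt
first=$(echo "first" | git commit-tree "$(git write-tree)")
# Second commit: new content for the same path, with the first commit as parent.
blob2=$(echo "hello again" | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob2",hello.txt
second=$(echo "second" | git commit-tree "$(git write-tree)" -p "$first")
git update-ref refs/heads/plumbing-demo "$second"
git log --oneline plumbing-demo           # two commits, newest first
```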
What This Unlocks
Knowing the objects makes error messages readable:
- "fatal: not a git repository": no .git/ directory.
- "fatal: bad object": a hash isn't in .git/objects/, or the object is corrupt.
- "cannot lock ref 'refs/heads/main'": another git process holds the ref's lock, a stale .lock file was left behind, or the ref no longer has the value git expected.
- "detached HEAD": HEAD points at a commit, not a branch.
It also makes recovery possible. If a branch ref was deleted but the commit exists in .git/objects/, you can recreate the branch:
git update-ref refs/heads/rescued 3f1a22c
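If you don't have the hash written down, git fsck can usually find it. The sketch below deletes a branch in a scratch repo, locates the orphaned commit, and points a new ref at it. It passes --no-reflogs so that commits held only by the reflog are reported, and it assumes there is exactly one dangling commit; in a real repo you would inspect the candidates first.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "base" > file.txt && git add file.txt && git commit -q -m "base"
base=$(git symbolic-ref --short HEAD)     # whatever the default branch is called here
git checkout -q -b feature
echo "work" >> file.txt && git commit -q -am "feature work"
git checkout -q "$base"
git branch -q -D feature                  # the ref is gone, the commit object is not
# List commits that no ref reaches any more.
lost=$(git fsck --no-reflogs 2>/dev/null | awk '$1 == "dangling" && $2 == "commit" { print $3 }')
git update-ref refs/heads/rescued "$lost"
git log --oneline -1 rescued
```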
Common Pitfalls
Editing files under .git/ directly. Usually a bad idea; use plumbing commands. The exception is safe reads (cat HEAD, cat refs/heads/main) for debugging.
Assuming git gc runs automatically in every state. Headless servers sometimes skip it. A repo that has grown unusually large may shrink after an occasional git gc --aggressive, at the cost of a much slower run.
Treating SHA-1 collisions as a security concern. Git uses a collision-detecting SHA-1 implementation by default and is migrating to SHA-256. For everyday use, assume hashes are unique.
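Trying to understand git rebase without understanding objects. Rebase is "make new commits that apply the same changes on top of different parents". The new commits get new hashes, and usually new trees too, because the base underneath them changed. Once objects click, rebase stops being scary.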
Next Steps
Continue to 10-workflows.md to see how teams put this together in practice.