Version Control with Git
Learning Objectives
By the end of this reading, you will be able to:
- Understand the fundamental concepts of version control systems
- Use Git for tracking changes in your codebase
- Create and manage branches effectively
- Merge and rebase branches appropriately
- Implement common Git workflows
- Collaborate with teams using distributed version control
Introduction
Version control is a system that records changes to files over time, allowing you to recall specific versions later. Git is a distributed version control system that has become the industry standard for software development. Unlike centralized systems, Git gives every developer a complete copy of the repository, including its full history.
Why Version Control?
- History Tracking: See what changed, when, and by whom
- Collaboration: Multiple developers can work simultaneously
- Backup: Every clone is a full backup
- Experimentation: Try new features without affecting stable code
- Rollback: Revert to previous versions if something breaks
Core Concepts
Repository
A repository (repo) is a directory that contains your project files and the complete history of changes. Git stores this information in a hidden .git directory.
# Example: Project structure with Git
"""
my_project/
├── .git/              # Git metadata and history
├── src/
│   ├── main.py
│   └── utils.py
├── tests/
│   └── test_main.py
├── .gitignore         # Files to ignore
└── README.md
"""
Commits
A commit is a snapshot of your repository at a specific point in time. Each commit has:
- A unique SHA-1 hash identifier
- Author information
- Timestamp
- Commit message describing the changes
- Reference to parent commit(s)
# Creating commits
git add main.py utils.py # Stage files
git commit -m "Add utility functions for data processing"
# View commit history
git log --oneline --graph --all
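To see why a commit's identifier depends on everything listed above, it helps to compute one by hand. The sketch below follows Git's object format (a header of the form "type size" plus a NUL byte, followed by the body, hashed with SHA-1). The helper names git_object_id and commit_body are illustrative, and the fixed +0000 timezone is a simplifying assumption.

```python
import hashlib
from typing import List


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over the header '<type> <size>' + NUL byte, followed by the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: List[str], author: str,
                timestamp: int, message: str) -> bytes:
    """Build the text of a commit object (timezone fixed to +0000 for simplicity)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author} {timestamp} +0000")
    lines.append(f"committer {author} {timestamp} +0000")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


empty_tree = git_object_id("tree", b"")
first = git_object_id("commit", commit_body(empty_tree, [], "Ada <ada@example.com>", 0, "Initial commit"))
second = git_object_id("commit", commit_body(empty_tree, [first], "Ada <ada@example.com>", 60, "Second commit"))
```

Changing the message, author, timestamp, or parent changes the resulting hash, which is why two commits with identical file contents can still have different identifiers.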
The Three States
Git has three main states that files can be in:
- Modified: Changed but not staged
- Staged: Marked for inclusion in next commit
- Committed: Safely stored in the repository
# Example: Understanding file states
"""
Working Directory        Staging Area           Repository
-----------------        ------------           ----------
main.py (modified)   ->  main.py (staged)   ->  main.py (committed)
utils.py (new)       ->  utils.py (staged)  ->  utils.py (committed)
config.py (modified)
"""
Branching
Branches allow you to diverge from the main line of development and work independently without affecting the main codebase.
Creating and Switching Branches
# Create a new branch
git branch feature/user-authentication
# Switch to the branch
git checkout feature/user-authentication
# Create and switch in one command
git checkout -b feature/payment-integration
# Modern Git (2.23+)
git switch feature/user-authentication
git switch -c feature/new-feature
Branch Strategy Example
"""
Branch Naming Conventions:
main/master - Production-ready code
develop - Integration branch for features
feature/NAME - New features
bugfix/NAME - Bug fixes
hotfix/NAME - Urgent production fixes
release/VERSION - Release preparation
Example:
feature/user-login
bugfix/memory-leak
hotfix/security-patch
release/v2.0.0
"""
Visualizing Branches
# Example: Branch visualization
"""
main:     A---B---C-------F---G
                   \     /
feature:            D---E
A, B, C: Commits on main
D, E: Commits on feature branch
F: Merge commit
G: New commit on main
"""
Merging
Merging integrates changes from one branch into another.
Fast-Forward Merge
Occurs when the target branch has no new commits since the feature branch was created, so Git can simply move the target branch pointer forward.
# Fast-forward merge example
git checkout main
git merge feature/simple-update
# Force a merge commit even when a fast-forward is possible
git merge --no-ff feature/simple-update
# Fast-forward visualization
"""
Before:
main:     A---B---C
                   \
feature:            D---E
After (fast-forward):
main: A---B---C---D---E
"""
Three-Way Merge
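Occurs when both branches have new commits since they diverged. Git compares the two branch tips with their common ancestor and records the result in a new merge commit.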
# Three-way merge
git checkout main
git merge feature/complex-feature
# Three-way merge visualization
"""
Before:
main:     A---B---C---F
               \
feature:        D---E
After (merge commit M):
main:     A---B---C---F---M
               \         /
feature:        D-------E
"""
Handling Merge Conflicts
# Example: Conflict in calculator.py
"""
<<<<<<< HEAD
def calculate_total(items):
    return sum(item.price * item.quantity for item in items)
=======
def calculate_total(items):
    total = 0
    for item in items:
        total += item.price * item.quantity * (1 - item.discount)
    return total
>>>>>>> feature/add-discounts
"""
# Resolution
def calculate_total(items):
    """Calculate total price including discounts."""
    total = 0
    for item in items:
        discount = getattr(item, 'discount', 0)
        total += item.price * item.quantity * (1 - discount)
    return total
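A common mistake is committing a file that still contains conflict markers. The following sketch shows one way to scan text for leftover markers before committing; conflict_marker_lines is a hypothetical helper, and the check is deliberately simple (a line of seven equals signs in ordinary text would also be flagged).

```python
import re
from typing import List

# Seven marker characters at the start of a line, followed by a space or end of line.
MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def conflict_marker_lines(text: str) -> List[int]:
    """Return the 1-based line numbers that look like Git conflict markers."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if MARKER.match(line)]
```

An empty result means no markers were found; any other result lists the lines to inspect.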
Rebasing
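Rebasing replays the commits of one branch on top of a new base commit, producing a linear history. In interactive mode it can also combine, reorder, or drop commits.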
Basic Rebase
# Rebase feature branch onto main
git checkout feature/my-feature
git rebase main
# Interactive rebase (last 3 commits)
git rebase -i HEAD~3
# Rebase visualization
"""
Before:
main:     A---B---C---D
               \
feature:        E---F
After rebase:
main:     A---B---C---D
                       \
feature:                E'---F'
Note: E' and F' are new commits with same changes but different hashes
"""
Merge vs Rebase
"""
MERGE:
Pros:
- Preserves complete history
- Safe for public branches
- Easy to understand
Cons:
- Creates merge commits
- Non-linear history
REBASE:
Pros:
- Clean, linear history
- Easier to follow
- No merge commits
Cons:
- Rewrites history (dangerous for shared branches)
- Can be complex with conflicts
Golden Rule: Never rebase public/shared branches!
"""
Interactive Rebase
# Interactive rebase allows you to:
# - Reword commit messages
# - Squash commits together
# - Reorder commits
# - Drop commits
git rebase -i HEAD~4
# Example interactive rebase file:
# pick a1b2c3d Add user model
# squash e4f5a6b Fix typo in user model
# reword c7d8e9f Update user validation
# drop 0a1b2c3 Experimental feature
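To make the effect of each action concrete, here is a simplified simulation of how a todo list transforms a sequence of commit messages. It ignores file contents and the editor prompts Git would show; apply_todo and the new_messages mapping are hypothetical.

```python
from typing import Dict, List, Tuple


def apply_todo(todo: List[Tuple[str, str]], new_messages: Dict[str, str]) -> List[str]:
    """Return the commit messages that remain after applying a simplified todo list."""
    result: List[str] = []
    for action, message in todo:
        if action == "pick":
            result.append(message)                        # keep the commit as is
        elif action == "reword":
            result.append(new_messages.get(message, message))  # keep changes, new message
        elif action == "squash":
            if not result:
                raise ValueError("cannot squash without a previous commit")
            result[-1] = result[-1] + "\n\n" + message    # fold into the previous commit
        elif action == "drop":
            continue                                      # discard the commit entirely
        else:
            raise ValueError(f"unknown action: {action}")
    return result


todo = [
    ("pick", "Add user model"),
    ("squash", "Fix typo in user model"),
    ("reword", "Update user validation"),
    ("drop", "Experimental feature"),
]
messages = apply_todo(todo, {"Update user validation": "Validate user email and password"})
```

Four commits go in and two come out: the typo fix is folded into the first commit, the third gets a new message, and the experiment disappears.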
Git Workflows
Centralized Workflow
Simple workflow for small teams.
"""
Everyone works on main branch:
Developer 1: A---B---C
Developer 2: D---E
Merged: A---B---C---D---E
"""
Feature Branch Workflow
Each feature gets its own branch.
"""
main:       A---B-------F---G
                 \     /
feature-1:        C---D
                 \
feature-2:        E (still in progress)
Process:
1. Create feature branch from main
2. Work on feature
3. Merge back to main when complete
4. Delete feature branch
"""
# Feature branch workflow
git checkout main
git pull origin main
git checkout -b feature/user-profile
# ... make changes ...
git add .
git commit -m "Add user profile page"
git push origin feature/user-profile
# ... create pull request ...
# ... after merge ...
git checkout main
git pull origin main
git branch -d feature/user-profile
Gitflow Workflow
Structured workflow for release management.
"""
main:     A-----------H-------K
                     /       /
develop:  B---C---E---F---I---J
               \ /
feature:        D---
Branch types:
- main: Production code
- develop: Integration branch
- feature/*: New features
- release/*: Release preparation
- hotfix/*: Production fixes
"""
# Gitflow example
# Start new feature
git checkout develop
git checkout -b feature/shopping-cart
# Finish feature
git checkout develop
git merge feature/shopping-cart
git branch -d feature/shopping-cart
# Start release
git checkout -b release/1.0.0 develop
# ... version bump, bug fixes ...
git checkout main
git merge release/1.0.0
git tag -a v1.0.0 -m "Release version 1.0.0"
git checkout develop
git merge release/1.0.0
git branch -d release/1.0.0
Forking Workflow
Common in open source projects.
"""
Original Repo:       main:    A---B---C---D
Fork (Developer 1):  main:    A---B---C---D---E
                                           \
                     feature:               F---G
Process:
1. Fork repository
2. Clone your fork
3. Create feature branch
4. Push to your fork
5. Create pull request to original repo
"""
Essential Git Commands
Configuration
# Set user information
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
# Set default editor
git config --global core.editor "vim"
# View configuration
git config --list
Basic Operations
# Initialize repository
git init
# Clone repository
git clone https://github.com/user/repo.git
# Check status
git status
# Add files to staging
git add file.txt # Specific file
git add *.py # Pattern
git add . # All changes in the current directory
# Commit changes
git commit -m "Message"
git commit -am "Message" # Add and commit tracked files
# View history
git log
git log --oneline
git log --graph --all --decorate
# View changes
git diff # Unstaged changes
git diff --staged # Staged changes
git diff main..feature # Between branches
Remote Operations
# Add remote
git remote add origin https://github.com/user/repo.git
# View remotes
git remote -v
# Fetch changes
git fetch origin
# Pull changes (fetch + merge)
git pull origin main
# Push changes
git push origin main
git push -u origin feature # Set upstream
# Delete remote branch
git push origin --delete feature/old-feature
Undoing Changes
# Unstage file
git reset HEAD file.txt
git restore --staged file.txt # Modern Git (2.23+)
# Discard changes in working directory
git checkout -- file.txt
git restore file.txt # Modern Git
# Amend last commit
git commit --amend
# Undo commit (keep changes)
git reset --soft HEAD~1
# Undo commit and discard its changes (also discards uncommitted work)
git reset --hard HEAD~1
# Revert commit (create new commit)
git revert abc123
Best Practices
1. Write Meaningful Commit Messages
# Good commit messages
"""
Add user authentication with JWT tokens
Implement login, logout, and token refresh endpoints.
Add middleware for protected routes.
Include unit tests for auth service.
Closes #123
"""
# Bad commit messages
"""
Fixed stuff
WIP
Updated files
asdf
"""
# Format:
# <type>: <subject>
#
# <body>
#
# <footer>
# Types: feat, fix, docs, style, refactor, test, chore
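A format like this can be checked automatically, for example from a commit-msg hook. The sketch below validates the subject line against the types listed above. The function name check_commit_message and the 72-character subject limit are assumptions for illustration, not rules stated in this reading.

```python
import re
from typing import List

ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")
SUBJECT = re.compile(r"^(?P<type>[a-z]+): (?P<subject>\S.*)$")


def check_commit_message(message: str, max_subject: int = 72) -> List[str]:
    """Return a list of problems; an empty list means the message passes."""
    lines = message.splitlines()
    if not lines:
        return ["message is empty"]
    problems: List[str] = []
    match = SUBJECT.match(lines[0])
    if not match:
        problems.append("subject must look like '<type>: <subject>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type: {match.group('type')}")
    if len(lines[0]) > max_subject:
        problems.append(f"subject is longer than {max_subject} characters")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    return problems
```

A message such as "feat: Add product recommendation engine" followed by a blank line and a body passes, while "Fixed stuff" is reported as not matching the format.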
2. Commit Often, Perfect Later
# Make frequent commits during development
git commit -m "WIP: Add basic user model"
git commit -m "WIP: Add validation"
git commit -m "WIP: Add tests"
# Clean up before merging with interactive rebase
git rebase -i HEAD~3
# Squash into single meaningful commit
3. Use .gitignore
# .gitignore example
"""
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
.venv
# IDEs
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Project specific
config/local_settings.py
*.log
.env
secrets.json
"""
4. Branch Naming Conventions
"""
Use descriptive, hierarchical names:
feature/user-authentication
feature/payment-integration
bugfix/login-error
hotfix/security-vulnerability
release/v2.0.0
docs/api-documentation
Avoid:
- Special characters other than -, /, and . (dots as in version numbers)
- Spaces
- Ambiguous names (temp, fix, stuff)
"""
Practical Example: Complete Workflow
# Example: Adding a new feature
"""
Project: E-commerce platform
Feature: Product recommendation system
"""
# 1. Start from updated main
# git checkout main
# git pull origin main
# 2. Create feature branch
# git checkout -b feature/product-recommendations
# 3. Implement feature
# File: recommendations.py
class RecommendationEngine:
    """Generate product recommendations based on user history."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.history = self._load_user_history()

    def _load_user_history(self):
        """Load user purchase and browse history."""
        # Placeholder: a real implementation would query a data store
        return []

    def get_recommendations(self, count=5):
        """
        Get personalized product recommendations.

        Args:
            count (int): Number of recommendations to return

        Returns:
            list: List of recommended product IDs
        """
        # Implementation using collaborative filtering
        similar_users = self._find_similar_users()
        recommendations = self._aggregate_preferences(similar_users)
        return recommendations[:count]

    def _find_similar_users(self):
        """Find users with similar purchase patterns."""
        # Placeholder: returns no users until implemented
        return []

    def _aggregate_preferences(self, users):
        """Aggregate product preferences from similar users."""
        # Placeholder: returns no product IDs until implemented
        return []
# 4. Write tests
# File: test_recommendations.py
import pytest
from recommendations import RecommendationEngine


class TestRecommendationEngine:
    def test_initialization(self):
        engine = RecommendationEngine(user_id=1)
        assert engine.user_id == 1

    def test_get_recommendations_returns_list(self):
        engine = RecommendationEngine(user_id=1)
        recs = engine.get_recommendations(count=3)
        assert isinstance(recs, list)
        assert len(recs) <= 3

    def test_recommendations_are_unique(self):
        engine = RecommendationEngine(user_id=1)
        recs = engine.get_recommendations(count=10)
        assert len(recs) == len(set(recs))
# 5. Commit changes
# git add recommendations.py test_recommendations.py
# git commit -m "feat: Add product recommendation engine
#
# Implement collaborative filtering for personalized recommendations.
# Include comprehensive test suite.
#
# Related to #456"
# 6. Push to remote
# git push -u origin feature/product-recommendations
# 7. Create pull request (on GitHub/GitLab)
# 8. Code review and address feedback
# 9. Merge to main
# 10. Delete feature branch (switch away from it and update main first)
# git checkout main
# git pull origin main
# git branch -d feature/product-recommendations
Exercises
Basic Exercises
Repository Setup
- Initialize a Git repository
- Create a Python project with at least 3 files
- Make your first commit
- Check the commit history
Basic Workflow
- Modify two files
- Stage only one file
- Commit the staged file
- Check status and diff for remaining changes
- Commit the second file
Branching Practice
- Create a new branch called feature/calculator
- Add a simple calculator function
- Switch back to main
- Merge the feature branch
Intermediate Exercises
Merge Conflict Resolution
- Create two branches from main
- Modify the same line in the same file in both branches
- Merge one branch to main
- Attempt to merge the second branch (will conflict)
- Resolve the conflict and complete the merge
Gitflow Simulation
- Set up main and develop branches
- Create a feature branch from develop
- Implement a feature (e.g., user authentication)
- Merge back to develop
- Create a release branch
- Merge release to both main and develop
Interactive Rebase
- Create 5 small commits
- Use interactive rebase to squash them into 2 meaningful commits
- Reword one commit message
Advanced Exercises
Complex Workflow
- Simulate a team environment with multiple features
- Create 3 feature branches
- Make commits on each branch
- Handle conflicts during merging
- Use rebase to maintain clean history
Cherry-Pick Scenario
- Create a feature branch with multiple commits
- Identify one specific commit that's needed urgently
- Cherry-pick that commit to main
- Handle any conflicts
- Research: git cherry-pick <commit-hash>
Recovering Lost Work
- Make several commits
- Reset hard to an earlier commit (losing recent work)
- Use git reflog to find lost commits
- Recover the lost work
- Research: git reflog and git reset --hard <commit>
Bisect Debugging
- Create a repository with 10 commits
- Introduce a bug in one of the middle commits
- Use git bisect to identify which commit introduced the bug
- Research: git bisect start, git bisect good, git bisect bad
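Before trying the bisect exercise, it may help to see the idea behind it: a binary search over the commit history. The sketch below is a conceptual model, not Git's implementation. It assumes a linear history ordered oldest to newest, a good first commit, a bad last commit, and that every commit after the first bad one is also bad; first_bad_commit and is_bad are hypothetical names.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit under the assumptions stated above."""
    low, high = 0, len(commits) - 1  # invariant: commits[low] is good, commits[high] is bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid   # the bug was introduced at mid or earlier
        else:
            low = mid    # the bug was introduced after mid
    return commits[high]


commits = [f"c{i}" for i in range(1, 11)]          # ten commits, c1 is the oldest
culprit = first_bad_commit(commits, lambda c: int(c[1:]) >= 6)
```

With ten commits, only about three or four need testing instead of all ten, which is why bisecting scales well to long histories.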
Common Pitfalls
1. Working Directly on Main
# Bad
"""
main: A---B---C---D (all your work)
"""
# Good
"""
main:     A-----------D (merges only)
           \         /
feature:    B-------C
"""
2. Committing Large Binaries
# Avoid committing large files:
# - Videos, large images
# - Database dumps
# - Compiled binaries
# - Dependencies (use package managers)
# Use Git LFS for necessary large files
# Or store in external storage with references
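One way to catch large files before they enter history is to scan the working tree prior to committing. The following sketch lists files above a size limit; the 5 MiB default and the name find_large_files are arbitrary choices for illustration.

```python
from pathlib import Path
from typing import List, Tuple


def find_large_files(root: Path, limit_bytes: int = 5 * 1024 * 1024) -> List[Tuple[Path, int]]:
    """List files under `root` larger than `limit_bytes`, skipping the .git directory."""
    large = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # Git's own storage is not part of the working tree
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                large.append((relative, size))
    # Largest files first
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling find_large_files(Path(".")) from the repository root returns the offending paths with their sizes, which can then be added to .gitignore or moved to Git LFS.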
3. Not Pulling Before Pushing
# Pushing without pulling first is rejected if the remote has new commits
# git push origin main
# ! [rejected] main -> main (fetch first)
# Always pull first
git pull origin main
# Resolve any conflicts
git push origin main
Summary
Version control with Git is essential for modern software development. Key takeaways:
- Commits are snapshots of your codebase at specific points in time
- Branches allow parallel development without affecting main code
- Merging integrates changes from different branches
- Rebasing creates linear history but rewrites commits
- Workflows provide structure for team collaboration
- Best practices include meaningful commit messages, frequent commits, a .gitignore file, and consistent branch naming
Git enables:
- Effective collaboration across teams
- Safe experimentation with features
- Complete history tracking
- Ability to roll back changes
- Code review through pull requests
Mastering Git takes practice, but the investment pays dividends in productivity and code quality.
Additional Resources
- Pro Git Book - Comprehensive Git resource
- Git Documentation - Official documentation
- Learn Git Branching - Interactive tutorial
- Oh Shit, Git!?! - Fixing common mistakes
Next Reading
Continue to 02-sdlc.md to learn about the Software Development Lifecycle and how teams plan, develop, and deliver software projects.