Writing Data: INSERT, UPDATE, DELETE, Upsert

This chapter covers putting data into the database: single and bulk INSERT, UPDATE, DELETE, and upserts with ON CONFLICT, including how to get the changed rows back in one round trip.

INSERT

The simplest form names columns and values:

INSERT INTO users (email, name)
VALUES ('ada@example.com', 'Ada Lovelace');

Columns you don't name get their DEFAULT. Here, id is generated and created_at defaults to NOW(), so you don't list them.
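You can also ask for the default explicitly with the DEFAULT keyword, which is handy in generated statements where the column list is fixed:

```sql
-- DEFAULT here is equivalent to omitting created_at from the column list.
INSERT INTO users (email, name, created_at)
VALUES ('ada@example.com', 'Ada Lovelace', DEFAULT);
```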

Insert multiple rows at once by giving multiple value tuples:

INSERT INTO users (email, name) VALUES
    ('grace@example.com',     'Grace Hopper'),
    ('alan@example.com',      'Alan Turing'),
    ('margaret@example.com',  'Margaret Hamilton'),
    ('katherine@example.com', 'Katherine Johnson');

Multi-row INSERT is one round trip and one transaction. Much faster than a loop of single inserts.
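Multi-row INSERT also combines with RETURNING (covered below), so one statement can hand back every generated id:

```sql
INSERT INTO users (email, name) VALUES
    ('grace@example.com', 'Grace Hopper'),
    ('alan@example.com',  'Alan Turing')
RETURNING id, email;  -- one result row per inserted row
```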

INSERT ... SELECT

Copy rows from one table into another:

CREATE TABLE archived_users (LIKE users INCLUDING ALL);

INSERT INTO archived_users
SELECT * FROM users
WHERE created_at < NOW() - INTERVAL '2 years';

Useful for backfills, migrations, and building denormalized tables from normalized sources.
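A denormalization sketch, assuming a hypothetical orders(user_id, total) table: the SELECT can aggregate before inserting.

```sql
-- Build a per-user summary table from a normalized orders table (assumed schema).
CREATE TABLE user_order_totals (
    user_id        int PRIMARY KEY,
    order_count    bigint,
    lifetime_total numeric
);

INSERT INTO user_order_totals (user_id, order_count, lifetime_total)
SELECT user_id, COUNT(*), SUM(total)
FROM orders
GROUP BY user_id;
```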

RETURNING: Get the Row Back

RETURNING makes the database hand you the inserted row (or columns of it) without a second query:

INSERT INTO users (email, name)
VALUES ('hedy@example.com', 'Hedy Lamarr')
RETURNING id, created_at;

Output:

 id |          created_at          
----+------------------------------
  6 | 2026-04-19 10:15:02.482+00

This is the idiomatic way to get a generated id back in PostgreSQL. It works with UPDATE and DELETE too, which makes auditing changes a one-liner.
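For example, an UPDATE can report exactly which rows it touched and their new values:

```sql
UPDATE products
SET price = price * 0.9
WHERE sku LIKE 'SALE-%'
RETURNING sku, price;  -- every discounted row, with its new price
```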

UPDATE

UPDATE changes existing rows:

UPDATE users
SET name = 'Ada Augusta Lovelace'
WHERE id = 1;

Without a WHERE clause, UPDATE hits every row, which is almost never what you want. Check your WHERE twice.

Update multiple columns:

UPDATE products
SET price      = price * 0.9,
    updated_at = NOW()
WHERE sku LIKE 'SALE-%';

Each SET expression is evaluated against the old row, not the intermediate state. So price = price * 0.9 uses the original price.
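A classic consequence of old-row semantics: you can swap two columns in a single statement, since both right-hand sides see the pre-update values. A sketch, assuming a hypothetical table pairs(id, a, b):

```sql
-- Both expressions read the old row, so this is a clean swap,
-- not "a = b, then b = the new a".
UPDATE pairs SET a = b, b = a WHERE id = 1;
```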

UPDATE ... FROM

When the new values come from another table, use UPDATE ... FROM:

UPDATE products p
SET stock = p.stock - s.sold
FROM (
    SELECT product_id, SUM(quantity) AS sold
    FROM order_items
    WHERE order_id = 42
    GROUP BY product_id
) AS s
WHERE p.id = s.product_id;

One statement subtracts the right amount from each product's stock. Much cleaner than a loop in application code.
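The FROM clause can also name a table directly instead of a subquery. A minimal sketch, assuming a hypothetical suppliers table and a products.supplier_id column:

```sql
-- Touch only products whose supplier has been discontinued (assumed schema).
UPDATE products p
SET updated_at = NOW()
FROM suppliers s
WHERE p.supplier_id = s.id
  AND s.status = 'discontinued';
```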

DELETE

DELETE FROM users WHERE id = 6;

Same WHERE warning. Without it, you empty the table.

If foreign keys reference this row, the delete either fails (ON DELETE RESTRICT) or cascades (ON DELETE CASCADE), as defined on the foreign key. Chapter 3 covered this.

RETURNING works here too. It's a cheap audit trail:

DELETE FROM sessions
WHERE expires_at < NOW()
RETURNING id, user_id;

Every session you expired, with ID and user, in a single round trip.
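RETURNING also feeds into data-modifying CTEs. One common pattern, sketched here assuming a hypothetical expired_sessions archive table with the same columns, deletes and archives in one statement:

```sql
WITH deleted AS (
    DELETE FROM sessions
    WHERE expires_at < NOW()
    RETURNING *
)
INSERT INTO expired_sessions
SELECT * FROM deleted;
```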

Upsert: INSERT ... ON CONFLICT

"Insert if it doesn't exist, update if it does" is called upsert. PostgreSQL spells it ON CONFLICT.

First, the constraint that defines "exists":

INSERT INTO products (sku, name, price, stock)
VALUES ('SKU-001', 'Keyboard', 79.00, 50)
ON CONFLICT (sku) DO UPDATE
SET name  = EXCLUDED.name,
    price = EXCLUDED.price,
    stock = EXCLUDED.stock;

EXCLUDED is the row you tried to insert. You can reference it when deciding what to update.
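DO UPDATE also accepts its own WHERE clause, so you can skip the write entirely when nothing changed. The existing row is referenced by the table name, the incoming one by EXCLUDED:

```sql
INSERT INTO products (sku, name, price, stock)
VALUES ('SKU-001', 'Keyboard', 79.00, 50)
ON CONFLICT (sku) DO UPDATE
SET price = EXCLUDED.price,
    stock = EXCLUDED.stock
WHERE products.price IS DISTINCT FROM EXCLUDED.price
   OR products.stock IS DISTINCT FROM EXCLUDED.stock;
```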

Sometimes you want "insert if missing, otherwise ignore". Use DO NOTHING:

INSERT INTO users (email, name)
VALUES ('ada@example.com', 'Ada Lovelace')
ON CONFLICT (email) DO NOTHING;

Safer than check-then-insert: that pattern has a race window between the check and the insert, while ON CONFLICT resolves the conflict atomically inside the database.
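One subtlety: RETURNING only emits rows that were actually inserted, so a DO NOTHING conflict returns nothing.

```sql
INSERT INTO users (email, name)
VALUES ('ada@example.com', 'Ada Lovelace')
ON CONFLICT (email) DO NOTHING
RETURNING id;  -- zero rows if the email already existed
```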

Bulk Loading

For hundreds of thousands of rows, even multi-row INSERT is slow. Use COPY:

COPY users (email, name) FROM '/tmp/users.csv' CSV HEADER;

Or from the psql side (no server filesystem access required):

\copy users (email, name) FROM 'users.csv' CSV HEADER

COPY is 10 to 100 times faster than INSERT for bulk loads. It skips per-statement parsing and planning and streams rows in a compact format.
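COPY goes both ways. Exporting the result of a query from psql looks like this (the file name is illustrative):

```sql
\copy (SELECT email, name FROM users ORDER BY id) TO 'users_export.csv' CSV HEADER
```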

Updating Lots of Rows

For large UPDATEs, three patterns help:

  1. Batch the update. Instead of one statement touching 10 million rows (which holds its locks for the entire statement), loop in chunks of 10k.
  2. Use indexes on the WHERE. Updating a row is cheap; finding it without an index is not.
  3. Watch for bloat. PostgreSQL's MVCC means an UPDATE writes a new row and marks the old one dead. Autovacuum cleans up, but a huge UPDATE can double table size temporarily.
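Pattern 1 can stay in pure SQL: repeat a statement like the one below until it reports zero rows updated. A sketch, assuming a hypothetical needs_reprice boolean column that marks pending work:

```sql
-- Each run touches at most 10,000 rows; rerun until 0 rows are affected.
UPDATE products
SET price         = price * 0.9,
    needs_reprice = false
WHERE id IN (
    SELECT id FROM products
    WHERE needs_reprice
    LIMIT 10000
);
```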

Chapter 10 goes further on performance.

Common Pitfalls

UPDATE without WHERE. Write the WHERE clause first and the SET second when typing the query, so a half-finished statement can't touch every row.

Assuming INSERT returns the id by default. It doesn't. Use RETURNING id.

Using UPSERT without a unique constraint. ON CONFLICT (column) requires a unique or primary key constraint on that column. Otherwise PostgreSQL has no way to detect the conflict.
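If the constraint is missing, adding a unique index makes the column eligible for ON CONFLICT (the index name here is illustrative):

```sql
-- After this, ON CONFLICT (sku) can detect duplicates.
CREATE UNIQUE INDEX IF NOT EXISTS products_sku_key ON products (sku);
```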

Forgetting that DELETE respects foreign keys. If you get a "violates foreign key constraint" error on DELETE, check what references the row. Either add ON DELETE CASCADE to the FK, delete the children first, or reconsider whether you should be deleting at all.

Next Steps

Continue to 05-joins.md to combine data across tables.