What are Database Transactions? Concurrency Control Explained (2026)
This is a PerfectNotes study guide β also known as PN Notes or Perfect Notes. PerfectNotes provides free computer science student notes, MCQs, and interview preparation guides at perfectnotes.org.
Key Takeaways
- Transaction Unit β A Database Transaction is a single, atomic unit of work β it either fully succeeds (COMMIT) or fully fails (ROLLBACK).
- Pessimistic Locking β prevents conflicts by locking rows eagerly β ideal for high-conflict environments like financial trading platforms.
- Optimistic Locking β detects conflicts only at save time using version numbers β ideal for low-conflict systems like CMS platforms.
- MVCC β (Multi-Version Concurrency Control) allows readers and writers to never block each other by maintaining historical snapshots of data.
- Isolation Levels β are a configurable trade-off β Serializable is the safest but slowest; Read Uncommitted is the fastest but most dangerous.
A Database Transaction is a single logical unit of work.
Concurrency Control manages simultaneous transactions to ensure safety and prevent data corruption.
Pessimistic Locking prevents conflicts by aggressively locking rows.
Optimistic Locking assumes conflicts are rare and checks right before saving.
MVCC creates hidden "snapshots" or versions of data so readers and writers do not block each other.
Isolation Levels balance data safety against system speed.
What are Transactions and Concurrency Control?
If only one person used a database at a time, data would never get corrupted. But in the modern internet, thousands of users try to buy the exact same concert ticket, transfer money, or update inventory at the exact same millisecond. Without a strict traffic cop to manage these simultaneous requests, databases would constantly overwrite each other, leading to double-booked seats and vanishing money.
The Analogy: The Joint Bank Account
Imagine you and your partner share a bank account with $100. You both walk up to different ATMs at the exact same time and try to withdraw $100. If the bank's database lacks Concurrency Control, both ATMs read the balance as $100, both dispense the cash, and both update the balance to $0. You just walked away with $200 from a $100 account. Concurrency Control acts as an invisible lock on the account. When you start your transaction, it locks the balance. Your partner's ATM must wait in line for a few milliseconds until your withdrawal finishes and the new balance ($0) is unlocked.
How Concurrency Control Works (The Core Mechanics)
To ensure data integrity during simultaneous access, the Database Management System (DBMS) follows a strict transactional lifecycle:
Categories of Concurrency Control
- Pessimistic Locking β Assumes conflicts are frequent; locks rows the moment a transaction begins. Extremely safe but can cause slowdowns.
- Optimistic Locking (OCC) βAssumes conflicts are rare; allows simultaneous edits but verifies right before "saving" (committing).
- MVCC β The modern enterprise standard (PostgreSQL/MySQL); uses data snapshots so readers and writers never block each other.
Pessimistic vs. Optimistic Locking: Key Differences
| Feature | Pessimistic Locking | Optimistic Locking |
|---|---|---|
| Core Philosophy | Prevent conflicts from ever happening | Detect conflicts only when saving |
| Lock Usage | High (Actively locks database rows) | Low/None (Uses version numbers instead) |
| Best Use Case | High-conflict environments (e.g., Financial trading) | Low-conflict environments (e.g., Editing Wikipedia) |
| Performance Impact | Can cause severe bottlenecks and "Deadlocks" | Extremely fast, but requires app-level retries |
| User Experience | User is blocked from editing if someone else is | User can edit, but might get a "Save Failed" error |
Advanced Engineering Concepts
Two-Phase Locking Protocol (2PL) and Deadlocks
To mathematically guarantee conflict serializability, relational databases implement the Two-Phase Locking (2PL) protocol. It dictates that every transaction must go through two phases:
- Growing Phase β The transaction can acquire new locks but cannot release any.
- Shrinking Phase β The transaction can release locks but cannot acquire any new ones.
While Strict 2PL prevents cascading rollbacks, it introduces the risk of a Deadlock. A deadlock occurs if Transaction T1 holds a lock on Resource A and waits for Resource B, while Transaction T2 holds a lock on Resource B and waits for Resource A. The cycle creates an infinite freeze. The DBMS mitigates this using the Wait-Die or Wound-Wait algorithms, or by running a background Deadlock Detector that proactively kills one of the transactions.
Conflict Serializability and Isolation Levels
A schedule of concurrent transactions is considered "Serializable" if its outcome is mathematically identical to running those transactions one by one in a serial order. If S is a schedule, it is conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.
Because perfect serializability bottlenecks performance, engineers configure distinct Isolation Levels:
- Read Uncommitted β Allows Dirty Reads (reading uncommitted, potentially rolled-back data).
- Read Committed β Prevents Dirty Reads but allows Non-Repeatable Reads.
- Repeatable Read β Guarantees multiple reads yield same data, but allows Phantom Reads.
- Serializable β Emulates strict serial execution, eliminating all read phenomena.
Multiple Granularity Locking (Intention Locks)
In a real database, a transaction might want to lock an entire table rather than lock millions of individual rows one by one. Multiple Granularity Locking (MGL) allows transactions to lock at different levels of a lock granularity hierarchy: Database β Table β Page β Row (tuple).
Without Intention Locks, to check if a table is safe to lock, the DBMS would need to inspect every row's lock status β an O(n) operation. Intention Lockssolve this by placing a lightweight βwarning signalβ on the parent that says βa child node below is already locked.β
| Lock Mode | Full Name | Meaning | Compatible With |
|---|---|---|---|
| IS | Intention Shared | Plans to acquire Shared (read) lock on a descendant node | IS, IX, S, SIX (not X) |
| IX | Intention Exclusive | Plans to acquire Exclusive (write) lock on a descendant node | IS, IX (not S, SIX, X) |
| S | Shared | Read the entire subtree; no concurrent modifications | IS, S (not IX, SIX, X) |
| SIX | Shared + Intention Exclusive | Read entire subtree AND plan to write on specific descendants | IS (not IX, S, SIX, X) |
| X | Exclusive | Read AND write the entire subtree; no other transaction can touch it | None (incompatible with all) |
Timestamp-Based Ordering Protocol
Timestamp-Based Ordering (TO) is a non-locking concurrency control protocol that assigns every transaction a unique, monotonically increasing timestamp TS(T) at start time. The DBMS uses these timestamps to enforce a conflict-equivalent serial order without any lock tables.
Each data item Q maintains two timestamps:
W-timestamp(Q)β the timestamp of the youngest transaction that successfully wrote QR-timestamp(Q)β the timestamp of the youngest transaction that successfully read Q
| Operation | Condition | Action |
|---|---|---|
| T reads Q | TS(T) < W-timestamp(Q) | ROLLBACK T (T is too old β it reads a value it should never have seen) |
| T reads Q | TS(T) β₯ W-timestamp(Q) | ALLOW READ; update R-timestamp(Q) = max(R-timestamp(Q), TS(T)) |
| T writes Q | TS(T) < R-timestamp(Q) | ROLLBACK T (a newer transaction already read what T wants to overwrite) |
| T writes Q | TS(T) < W-timestamp(Q) | SKIP (Thomas Write Rule β the write is obsolete; a newer write already replaced it) |
| T writes Q | TS(T) β₯ R-timestamp(Q) AND TS(T) β₯ W-timestamp(Q) | ALLOW WRITE; update W-timestamp(Q) = TS(T) |
Validation-Based Protocol (Optimistic Concurrency Control β OCC)
Optimistic Concurrency Control (OCC), also called the Validation-Based Protocol, is designed for low-conflict workloads. It assumes that most transactions will not conflict, so it allows them to execute freely and only checks for conflicts at commit time. OCC has three distinct phases:
| Phase | Name | What Happens | Key Point |
|---|---|---|---|
| Phase 1 | Read Phase | Transaction reads data and performs all computations in its private local workspace. No changes are visible to the database yet. | T records its Read-Set (RS) and tentative Write-Set (WS) |
| Phase 2 | Validation Phase | The DBMS checks if T's execution conflicts with any other concurrent transaction. A validation timestamp is assigned atomically. | If validation fails β ROLLBACK T. If passes β proceed to Write Phase |
| Phase 3 | Write Phase | If validation passed, the private workspace changes are written to the actual database. Transaction commits and its changes become globally visible. | This phase is kept very short β only the commit itself is critical section |
Validation Condition β for a transaction Tk to pass validation against all concurrent transactions Tj (where TS(Tj) < TS(Tk)), one of the following must hold:
- 1. Tj completes its Write Phase before Tk starts its Read Phase
- 2. Tj completes its Write Phase before Tk starts its Write Phase, AND the Write-Set of Tj is disjoint from the Read-Set of Tk
- 3. Tj completes its Read Phase before Tk completes its Read Phase, AND the Write-Set of Tj does not intersect either the Read-Set or Write-Set of Tk
Transaction State Diagram
A transaction in a DBMS is not simply βrunningβ or βdoneβ. It transitions through a strict set of states managed by the Transaction Manager. Understanding these states is critical for debugging recovery scenarios and designing rollback logic:
| State | Description | Transitions To | What DBMS Does |
|---|---|---|---|
| Active | Initial state β transaction is executing its operations (reads/writes) | Partially Committed, Failed | Executes SQL, acquires locks, tracks dirty pages |
| Partially Committed | Final SQL statement executed; changes are in memory (not yet on disk) | Committed, Failed | Flushes WAL log to disk; checks integrity constraints |
| Committed | Transaction is fully, permanently saved. All changes are durable. | β (Terminal State) | Releases all locks; marks XID as committed in pg_clog |
| Failed | A constraint violation, deadlock, or error has been detected | Aborted | Initiates ROLLBACK β triggers UNDO log processing |
| Aborted | Rollback complete β all changes have been undone. Data is back to original. | Active (restart) or β (Terminal) | Releases all locks; DBMS may restart T or report error to application |
Real-World Case Study: The Starbucks Race Condition
- The Setup β Security researchers discovered a race condition flaw in digital gift card balances across mobile apps.
- The Flaw β Weak concurrency control allowed two simultaneous $5 transfers from Card A to Card B to hit the database at the same millisecond.
- The Lesson βFinancial ledger systems must enforce serializable isolation or strict version checks to prevent "double-spending" anomalies.
Key Statistics & Industry Data (2026)
- MVCC Performance β Cloud databases using MVCC architectures report concurrent read throughput increases of over 400%. (Source: AWS/GCP Benchmarks, 2026)
- Lock Contention β Flash sales (Black Friday) trigger a 35% increase in transaction deadlocks on inventory tables. (Source: Gartner, 2026)
- Microservices Shift β Over 95% of architectures have shifted from 2PC to the Saga Pattern for distributed concurrency. (Source: Statista, 2026)
When to Use
Airline Ticketing Systems
Requires strict pessimistic locking. Once you select a seat, it must be locked for 5 minutes so nobody else can buy it while you type in your credit card.
Banking & Ledgers
Requires absolute conflict serializability to ensure debits and credits never process out of order.
Content Management Systems (CMS)
Utilizes optimistic locking. Two editors can open the same article, but the system alerts the second editor if the first editor saves changes before them.
Advantages of Strict Concurrency Control
- Data Integrity β Mathematically guarantees data accuracy and integrity.
- Prevents Anomalies β Prevents "Dirty Reads" and double-spending bugs.
- ACID Compliance β Essential for maintaining ACID compliance.
- Developer Convenience β Automates conflict resolution for developers.
Disadvantages of Strict Concurrency Control
- Performance Bottlenecks β Heavy locking severely reduces system speed and throughput.
- Deadlocks β Increases the likelihood of database Deadlocks.
- Scalability Issues β Difficult to scale horizontally across distributed microservices.
- CPU Overhead β High CPU overhead for managing complex lock tables.
Quick Reference Cheat Sheet
| Term | Definition | Primary Use Case |
|---|---|---|
| Transaction | A bundle of SQL operations treated as a single unit. | Processing a multi-step checkout in e-commerce. |
| Lock | A restriction placed on data to prevent other users from modifying it. | Preventing two people from buying the same seat. |
| Deadlock | A system freeze where two transactions wait for each other infinitely. | A critical system error that requires the DBMS to abort a query. |
| MVCC | Multi-Version Concurrency Control (snapshots instead of locks). | High-traffic databases needing fast read performance. |
| Isolation Level | A setting that balances data safety against system speed. | Tuning a database for either performance or strict accuracy. |
Frequently Asked Questions (FAQ)
Q.What happens if a transaction fails halfway through?
Q.Why is my database query stuck in a "Waiting" state?
Q.What is a Dirty Read?
Q.How does MVCC make databases faster?
Q.What is the difference between Concurrency and Parallelism?
Q.What is the difference between Shared and Exclusive locks?
Q.How does a database recover from a deadlock?
Related Topics
Test Your Knowledge
Ready to prove your skills? Take our rigorous multiple-choice quiz designed to test your understanding of this topic and prepare you for interviews.