Saturday, November 20, 2010

Snapshot Isolation in concurrent Transactions

Snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot.
In practice snapshot isolation is implemented within multiversion concurrency control (MVCC), where generational values of each data item (versions) are maintained.
Snapshot isolation is called "serializable" mode in Oracle.
Eg:
If V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2.
If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = -$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 - $200) is now -$200), or T2 happens first and similarly prevents T1 from committing.
Under snapshot isolation, however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = -$100, and V1 + V2 = -$200.
Snapshot isolation present this.
If built on MVCC, snapshot isolation allows transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed.
  • Materialize the conflict: Add a special conflict table, which both transactions update in order to create a direct write-write conflict.
  • Promotion: Have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write-write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE).
Materialize the conflict: by adding a new table which makes the hidden constraint explicit, mapping each person to their total balance. Phil would start off with a total balance of $200, and each transaction would attempt to subtract $200 from this, creating a write-write conflict that would prevent the two from succeeding concurrently. This approach violates the normal form.
Alternatively, we can promote one of the transaction's reads to a write. For instance, T2 could set V1 = V1, creating an artificial write-write conflict with T1 and, again, preventing the two from succeeding concurrently. This solution may not always be possible.

What is MVCC?
- provides access to implement transactional memory (shared memory allowing a concurrent group of load and store instructions to execute in an atomic way).

For detailed study:
Snapshot Isolation

related blog entries:
ACID Properties
Compensation Transaction

2 comments:

  1. Hi,

    Just to complete this abstract about snapshot isolation anomalies...

    The anomaly created by the execution of the above example, about the bank accounts, is called Write-Skew. This problem can actually be detected statically by verifying the database code and then corrected using the approaches described above.

    The static verification and the solutions were proposed in [1] and in [2] is reported a tool that automatically verifies the database application and corrects the code.

    The advantage of Snapshot Isolation being implemented on top of MVCC is that read-only transactions will never abort because they are running in the transaction's private snapshot. This is very useful for long-running transactions.

    Also, almost all Software Transactional Memory systems are not implemented using MVCC. STM systems are implemented using versioned read-write locks and non-blocking techniques.

    There is also a work in the STM area that tries to statically identify transactional programs that will misbehave when running under Snapshot Isolation [3]. This work verifies Java programs and identities the groups of transactions that when executing in parallel will trigger a Write-Skew anomaly.

    [1] Alan Fekete, Dimitrios Liarokapis, Elizabeth O’Neil, Patrick O’Neil, and Dennis Shasha. Making snapshot isolation serializable. ACM Trans. Database Syst., 30(2):492–528, 2005.

    [2] Sudhir Jorwekar, Alan Fekete, Krithi Ramamritham, and S. Sudarshan. Automating the detection of snapshot isolation anomalies. In VLDB ’07: Proceedings of the 33rd international conference on Very large data bases, pages 1263–1274. VLDB Endowment, 2007.

    [3] Ricardo Dias, João Seco and João Lourenço. Snapshot Isolation Anomalies Detection in Software Transactional Memory. In InForum'10: Proceedings of 2nd InForum Simpósio de Informática. Universidade do Minho, Portugal, 2010.
    doi: http://inforum.org.pt/INForum2010/papers/ciencia-e-engenharia-de-software/Paper053.pdf

    ReplyDelete
  2. Thanks Ricardo, for sharing your knowledge.
    Your research paper is very interesting.
    cheers!

    ReplyDelete