10/21/2008

Reading:
* Jim Gray et al. Granularity of Locking and Degrees of Consistency in a Shared Data Base. From "Modeling in Data Base Management Systems", G.M. Nijssen (ed.), 1976. In Red Book.

=Locking=

Different granularities of locks are beneficial for different operations: table-level locks for a full table scan, record-level locks for index operations.

Locking hierarchy:
- database - locks entire DB, low concurrency
- table - low concurrency
- page
- record - high overhead of locks

Want to lock at the level that gets high concurrency without too much locking overhead.

Records can have multiple parents:

        db
       /  \
  tables  indices
       \  /
      records

OR

      db
       |
     table
    /  |  \
   A   B   C   <- partition table

==Intention Locks==

New lock types:
- IX - intend to update a child in the hierarchy
- IS - intend to read a child in the hierarchy
- SIX - will discuss in a bit

Request a lock at each level of the hierarchy if you get that far down the hierarchy.

Lock compatibility table, before adding SIX (Y means T1 and T2 can both acquire the locks; N means you block one):

             T1
       IS  IX   S   X
   IS   Y   Y   Y   N
T2 IX   Y   Y   N   N
    S   Y   N   Y   N
    X   N   N   N   N

What if you want to scan an entire table and update parts of it? You want to combine S and IX. That is SIX.

             T1
       IS  IX   S  SIX   X
   IS   Y   Y   Y   Y    N
   IX   Y   Y   N   N    N
T2  S   Y   N   Y   N    N
  SIX   Y   N   N   N    N
    X   N   N   N   N    N

Locking protocol:
- Before a transaction can get an S or IS lock on an object, it must have an IS lock on all parents.
- Before a transaction can get an X, IX, or SIX lock on an object, it must have one of {X, IX, SIX} on all parents.

Implicit locking: an X (resp. S) lock on a parent implies an X (resp. S) lock on all children, so you don't have to keep locking the entire way down the hierarchy.

Lock release protocol: release from the bottom up, so that you maintain the invariant described in the locking protocol.

==Partitioning==

      db
       |
     table
    /  |  \
   A   B   C   <- partition table

Partitions can be ranges: A = keys 0...10, B = keys 11...20, etc. This is useful if you perform range queries.
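The compatibility table and locking protocol above can be sketched as a small lock-manager check. This is a minimal illustration, not Gray's actual implementation: the names `COMPAT`, `PARENT_LOCKS`, `compatible`, and `may_request` are invented here, and the assumption that IX/SIX also satisfy the parent requirement for S/IS (since they carry read intent) goes slightly beyond the notes.

```python
# Sketch of multigranularity locking from the notes (names are illustrative).

# Compatibility matrix from the second table above: COMPAT[held] is the set
# of lock modes another transaction may acquire concurrently (the Y entries).
COMPAT = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

def compatible(held, requested):
    """True iff the table has a Y for (held, requested)."""
    return requested in COMPAT[held]

# Locking protocol: locks a transaction must already hold on every parent.
# The notes require IS on parents for S/IS; assuming here that the stronger
# intent locks IX and SIX also qualify, since they imply read intent.
PARENT_LOCKS = {
    "S":   {"IS", "IX", "SIX"},
    "IS":  {"IS", "IX", "SIX"},
    "X":   {"X", "IX", "SIX"},
    "IX":  {"X", "IX", "SIX"},
    "SIX": {"X", "IX", "SIX"},
}

def may_request(mode, locks_on_parents):
    """Check the protocol: every parent (root -> leaf) carries a suitable lock."""
    return all(l in PARENT_LOCKS[mode] for l in locks_on_parents)

# To X-lock a record, first hold IX on the database and on the table:
assert may_request("X", ["IX", "IX"])      # db, table -> record: OK
assert not may_request("X", ["IS", "IX"])  # IS on db is not enough below an X
# SIX conflicts with everything except IS, as in the table:
assert compatible("SIX", "IS") and not compatible("SIX", "S")
```

The matrix makes the motivation for intention locks concrete: a scanner holding S on a table blocks writers' IX requests at the table level, without either side touching individual records.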
Especially helps with phantom locking, since record-level locking on its own would allow phantom insertions to happen. If you had locked a logical range, then no phantom insertions could happen. This doesn't work in practice, since it's hard to figure out dynamic ranges for locks. Instead, people tend to lock next pointers on pages in a B+Tree, since then another transaction can't insert a new item into a logically different part of the tree while you hold the lock. If you have no B+Tree, then you need a shared lock on the entire table to keep phantom records from popping up.

=Degrees of Consistency=

Some Xactions can tolerate reduced consistency. For example, you might want an aggregate to run without worrying about seeing other Xactions that update stuff or abort while it is running. At the same time, you don't want your Xaction's not caring to affect other Xactions that might care.

"Short" means hold the lock only while the operation is happening; "long" means keep the lock until the end of the Xaction. Assume everyone else is running in serializable mode - getting long read and long write locks. Your transaction is in:

Degree 0 - short write locks, no read locks (no one ever uses this, since it permits cascading aborts)
- means another Xaction can read your writes. If you abort later, then it has to abort to be consistent (cascading aborts)

Degree 1 - long write locks, no read locks (good for read-only Xactions)
- you might read dirty data of other Xactions
- you might have non-repeatable reads
- no cascading aborts, since no one else can read your possibly aborted data

Degree 2 - long write, short read
- avoid reading dirty data
- still susceptible to non-repeatable reads

Degree 3 - long write, long read
- this is 2-phase locking

SQL Standard calls these:
- READ UNCOMMITTED (Deg 1)
- READ COMMITTED (Deg 2)
- REPEATABLE READ (Deg 3)
- SERIALIZABLE (Deg 3 + no phantoms)
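The degrees above are just policies over lock durations, which can be tabulated directly. A small sketch, with the `DEGREES`/`SQL_LEVELS` names and the anomaly labels chosen here for illustration (the duration columns and SQL-level mapping are from the notes):

```python
# Each degree of consistency as a (write-lock duration, read-lock duration)
# policy, plus the anomalies the notes say it still permits.
DEGREES = {
    0: {"write": "short", "read": None,
        "anomalies": {"cascading aborts", "dirty reads", "non-repeatable reads"}},
    1: {"write": "long",  "read": None,
        "anomalies": {"dirty reads", "non-repeatable reads"}},
    2: {"write": "long",  "read": "short",
        "anomalies": {"non-repeatable reads"}},
    3: {"write": "long",  "read": "long",   # 2-phase locking;
        "anomalies": set()},                # phantoms still need range locks
}

# Mapping from the SQL standard's isolation-level names, per the notes.
SQL_LEVELS = {"READ UNCOMMITTED": 1, "READ COMMITTED": 2, "REPEATABLE READ": 3}

def permits(degree, anomaly):
    """True iff a transaction at this degree can observe the anomaly."""
    return anomaly in DEGREES[degree]["anomalies"]

# Degree 2 releases read locks early, so re-reading a record mid-Xaction
# can see a different value:
assert permits(SQL_LEVELS["READ COMMITTED"], "non-repeatable reads")
# Degree 1 takes no read locks at all, so dirty reads are possible:
assert permits(SQL_LEVELS["READ UNCOMMITTED"], "dirty reads")
# Degree 3 holds everything to end of Xaction (2PL):
assert not permits(3, "non-repeatable reads")
```

Note how each step up the degrees only lengthens a lock duration, and each longer duration eliminates exactly one class of anomaly.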