6.824 Lecture 3: Threads and RPC

Today:
  - RPC
  - finalize comparison between RPC paper and our thread implementation
  - Threads, pthreads

=RPCs=

When multiple threads send messages over the same socket, it is best to
put outgoing messages on a queue and have a single thread send them
asynchronously. This keeps worker threads from blocking on a send
because the other side is slow, etc.

YFS RPC vs. Birrell paper's RPC
  - Similarities
    - Both provide at-most-once semantics
    - Nice programming abstraction by use of stubs, etc., to give a
      generic RPC framework
  - Differences
    - YFS uses TCP, Birrell uses UDP
        tcp: reliable unless machine failure, guarantees FIFO.
             TCP backs off under congestion so that bottleneck links
             are used effectively.
        udp: if a packet arrives, it's intact, but there is no
             guarantee on delivery.
             UDP has no services for congestion control.
        TCP is preferable over wide area networks, where the
        bottleneck capacity of the links varies.
    - Birrell assumes the client and server are written for high-speed
      LANs specifically designed for RPC traffic. It made no affordances
      for flow/congestion control because of that assumption.
    - Birrell cares most about performance, to convince programmers to
      use it.
    - Birrell avoids the context switches, etc. of using multiple
      threads to send messages. YFS assumes the bottleneck will be the
      (WAN) network anyway, and will try to avoid RPCs altogether by
      thinking about caching locks, etc., to turn RPCs into procedure
      calls.
    - Birrell sequentializes RPC calls because UDP gives no ordering
      guarantee, but because of this it doesn't have to maintain state
      on the server about many unacknowledged calls. YFS instead uses
      TCP and thus can pipeline messages along the network, but now the
      parallel calls have to be stored on the server until acknowledged.

=Threads=

Thread is short for thread of control: a running program with its own
program counter, stack pointer, etc. (For this class a process is one
or more threads executing in a single address space.)
Primary purpose: a way of running code concurrently within a single
process. For example, in lab 1, if one client is waiting for lock a,
the server may want to process requests from other clients, in
particular ones for different locks.

The primary reasons to use concurrent programming with threads:
  exploit several processors to run an application faster
  hide long delays (e.g., while waiting for a disk, do something else
    on the processor)
  run long-running ops concurrently with short ones in user interfaces
  network servers and RPC

At a minimum, a thread interface must support:
  creating and managing threads
  ways of avoiding race conditions for updates to shared variables
    assume each thread runs on its own processor, sharing one memory
    instructions that appear to be atomic might not be (e.g., x = x + 1)
  ways of coordinating different threads

We will mostly deal with the pthreads library, which is slightly
lower-level.
  - Thread
    - create: creates a thread
    - join: waits for the completion of another thread
    - exit: terminates a thread
  - Mutexes
    - lock: waits until the mutex is unlocked, then acquires it
    - unlock: unlocks the mutex
  - Condition Variables
    - cond_wait: waits on some condition
    - signal: alerts one waiter that something has changed
    - broadcast: tells all waiting threads that something has changed.
      This is sometimes wasteful: if you know only one thread will
      consume the result, every other waiter still wakes up and goes
      back to sleep (wasted context switches).

pthread_cond_wait(cond_var, mutex) releases the mutex while waiting, so
another thread can update the shared state. The release and the wait
happen atomically (instead of as two separate steps), so you don't miss
a signal or broadcast sent in between. The mutex is re-acquired before
pthread_cond_wait returns.

Re-entrant locks
  - If you lock a mutex within a thread and call a method that tries to
    lock it again, a re-entrant lock will allow the thread to re-acquire
    the lock.
  - The default in pthreads is to not be re-entrant, which Frans agrees
    with: you can't analyze your invariants as easily if your locks
    don't have a clean interface that you can make correctness
    arguments about. Try to lock only once, so you know when locking
    happens.
  - One argument for re-entrancy is composability: reuse of functions
    that already lock at a fine level, when the caller needs to hold
    the lock as well.

Pitfalls of multithreaded programming
  race conditions
    may be difficult to reproduce
  deadlock
    a better bug to have than a race: your program stops when it happens
  wrong lock granularity
    the finer the granularity, the more parallelism you can allow, or
    the more problem-specific reason you have for locking
  starvation; can you give me an example?

Multithreaded programming is more difficult than sequential programming!
  It is worth rereading the paper as you get more experience.

How to use the interface. Let's look at lock_tester.cc:
  count_mutex
    note that it must be initialized with pthread_mutex_init.
    why do check_grant and check_release use it?
  main
    how are threads created?
    how can you wait for a thread to terminate?
    what are the differences between test3, test4, and test5?

Lab 1
  How do you make an RPC call?
    see lock_demo.cc
  What are the call semantics?
    a request may arrive 0, 1, 2, ... times at the server
  RPC semantics impact on lab 1?
    duplicates may arrive

More next lecture, when looking at the implementation of the RPC library.
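
A minimal sketch of the basic pthreads pattern asked about above: how
threads are created, how to wait for them, and why an update like
x = x + 1 needs a mutex. The names Counter, add_many, and
count_with_threads are hypothetical and are not taken from
lock_tester.cc; only the pthread_* calls are the real interface.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical shared state for this sketch.
struct Counter {
    pthread_mutex_t mu;  // protects value
    long value;
    int iters;           // increments each thread performs
};

// Thread body. value = value + 1 is a read-modify-write, so it is
// done while holding the mutex.
static void *add_many(void *arg) {
    Counter *c = static_cast<Counter *>(arg);
    for (int i = 0; i < c->iters; i++) {
        pthread_mutex_lock(&c->mu);
        c->value = c->value + 1;
        pthread_mutex_unlock(&c->mu);
    }
    return nullptr;
}

// Starts nthreads threads, waits for all of them, returns the count.
long count_with_threads(int nthreads, int iters) {
    Counter c;
    c.value = 0;
    c.iters = iters;
    int rc = pthread_mutex_init(&c.mu, nullptr);  // must precede first use
    assert(rc == 0);

    std::vector<pthread_t> tids(nthreads);
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], nullptr, add_many, &c);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_join(tids[i], nullptr);  // wait for termination
        assert(rc == 0);
    }
    pthread_mutex_destroy(&c.mu);
    return c.value;
}
```

Calling count_with_threads(4, 10000) should return 40000. Without the
lock/unlock pair around the increment, some updates could be lost and
the total could come out smaller, and not reproducibly so.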
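
The advice from the RPC section (queue outgoing messages and let a
single thread send them) can be sketched with a mutex and a condition
variable. This is only an illustration under stated assumptions:
SendQueue, sq_enqueue, sq_close, sender_loop, and send_all are
hypothetical names, and a vector named sent stands in for the socket.
It is not the YFS RPC library's implementation.

```cpp
#include <pthread.h>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical outgoing-message queue shared by workers and one sender.
struct SendQueue {
    pthread_mutex_t mu;              // protects msgs and closed
    pthread_cond_t nonempty;         // signaled on enqueue or close
    std::queue<std::string> msgs;
    bool closed = false;
    std::vector<std::string> sent;   // stand-in for the socket
};

// Called by worker threads: blocks only briefly on the mutex, never
// on the network.
void sq_enqueue(SendQueue *q, const std::string &m) {
    pthread_mutex_lock(&q->mu);
    q->msgs.push(m);
    pthread_cond_signal(&q->nonempty);  // one consumer, so signal suffices
    pthread_mutex_unlock(&q->mu);
}

// Tells the sender to exit once the queue is drained.
void sq_close(SendQueue *q) {
    pthread_mutex_lock(&q->mu);
    q->closed = true;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->mu);
}

// Body of the single sender thread.
static void *sender_loop(void *arg) {
    SendQueue *q = static_cast<SendQueue *>(arg);
    pthread_mutex_lock(&q->mu);
    for (;;) {
        // Re-check the condition in a loop after every wakeup.
        while (q->msgs.empty() && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->mu);
        if (q->msgs.empty()) break;  // closed and drained
        std::string m = q->msgs.front();
        q->msgs.pop();
        pthread_mutex_unlock(&q->mu);
        // A real sender would write m to the socket here, without
        // holding the mutex, so a slow peer never stalls the workers.
        q->sent.push_back(m);
        pthread_mutex_lock(&q->mu);
    }
    pthread_mutex_unlock(&q->mu);
    return nullptr;
}

// Demo driver: enqueue msgs, drain them through the sender thread.
std::vector<std::string> send_all(const std::vector<std::string> &msgs) {
    SendQueue q;
    pthread_mutex_init(&q.mu, nullptr);
    pthread_cond_init(&q.nonempty, nullptr);
    pthread_t sender;
    int rc = pthread_create(&sender, nullptr, sender_loop, &q);
    assert(rc == 0);
    for (const std::string &m : msgs) sq_enqueue(&q, m);
    sq_close(&q);
    pthread_join(sender, nullptr);
    pthread_cond_destroy(&q.nonempty);
    pthread_mutex_destroy(&q.mu);
    return q.sent;
}
```

The wait sits inside a while loop because pthread_cond_wait may return
when the condition is not (or is no longer) true, so the waiter must
re-check the shared state while holding the mutex. With a single
producer, as in send_all, messages come out in the order they were
enqueued.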