Plan
- RPC
- Threads
- RPC in YFS (for lab1)
- RPC system in paper contrasted w/ the one in YFS

RPC (Remote Procedure Call)
- execute a procedure on a remote machine so that it looks like a call on the local machine
- structure:
  caller/client - method() -> calls a stub, which sends the request to the callee and blocks waiting for the reply
  callee/server - stub gets the request, calls the implementation, and returns the result to the caller's stub

RPC Challenges
- Communication failure - link dies, requests arrive out of order
- Node failure - no response from server might mean two things:
  1) failure before calling method() -> operation never happened
  2) failure after calling method() -> operation occurred
- At the communication level, the RPC system doesn't let the client know whether the operation happened or not. Program state has to be maintained at a higher level in the protocol.
- For performance, the server doesn't serialize requests: multiple threads/events will run concurrently.

Threads (of computation)
- An abstraction that allows you to suspend a computation, saving its state, and resume it later. The idea is to provide a lightweight implementation in which the threads run in parallel.
- Once created, a thread has its own stack, registers, and program counter until it is deleted
- One thread can wait on another; this operation is called a join
- pthreads (POSIX threads) is the standard threading API
- Threads help hide latency (while one thread waits, another can execute) and allow interleaving independent sections of code

Pitfalls
- Race conditions: occur when the output of a computation depends on the order in which multiple threads run, usually because of shared variables. E.g., when each thread runs:
    // y = x = 0 before running
    if (x == 0) x = y = 1;
    print y
  This might output "1 _", "_ 1", or "1 1", depending on whether the if statement is evaluated in parallel before x is set to 1.
  To get consistent output, put acquire(L) before the "if (..." and release(L) after the "print ...". This avoids undesired interleavings of thread execution.
  It's nontrivial to decide where to put acquire/release statements, and you have no guarantee that other threads will execute the acquire in the appropriate places in their code (to protect shared variables).
- Deadlock: if a thread acquires a lock and doesn't release it, other threads may sit waiting on it indefinitely

RPC semantics
- exactly-once semantics: return from a procedure call means the procedure succeeded exactly once, no fewer and no more. This would require logging each call on the server, i.e., one disk write per call, adding milliseconds to an operation that would otherwise take microseconds.
- at-least-once semantics: client keeps retrying until the server responds w/ success. This means operations with side effects (e.g. acquire/release) might be run multiple times. NFS RPC does this, since the protocol is built to handle failure cases.
- at-most-once semantics: procedure is executed either 0 times or 1 time, but never more than once. If the client asks for a call twice and the server has restarted in between, the server will say "I don't know whether I previously did this or not." This is implemented by the server choosing a unique nonce each time it restarts; the client learns the nonce on bind, and the server checks it on each communication.

At most vs. at least once
- not an argument w/ a definite solution
- Reason for at least once: no matter what you do, the server has to be able to tell whether something is a repeat (maybe at the application level). So why do it twice? Applications log information anyway...
- Reason for at most once: even if you implement duplicate detection at the application level, programmer errors creep in in unexpected ways unless the RPC layer enforces no duplicates.
- At most once is what RPC in YFS will use. YFS RPCs run over TCP sockets, so packet loss is not a concern. The RPC paper assumes unreliable sockets (e.g., UDP), which makes the implementation more complicated.
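
The at-most-once mechanism described above (a per-restart nonce learned on bind, plus duplicate detection on the server) can be sketched roughly as follows. This is a minimal in-memory illustration, not the YFS implementation: the class name `Server`, the methods `bind` and `call`, the `(client_id, seq)` request key, and the "add to a counter" procedure are all hypothetical names chosen for the example.

```python
import secrets


class Server:
    """Hypothetical at-most-once RPC server (illustrative, in-memory only)."""

    def __init__(self):
        # Fresh random nonce on every (re)start; clients learn it on bind.
        self.nonce = secrets.token_hex(8)
        # Cached replies keyed by (client_id, seq), used to detect repeats.
        self.replies = {}
        # Application state changed by a procedure with a side effect.
        self.counter = 0

    def bind(self):
        return self.nonce

    def call(self, nonce, client_id, seq, amount):
        if nonce != self.nonce:
            # The server restarted since the client bound, so it cannot
            # tell whether it ran this request before the restart.
            return ("UNKNOWN", None)
        key = (client_id, seq)
        if key not in self.replies:
            # First time this request is seen: execute the procedure once.
            self.counter += amount
            self.replies[key] = self.counter
        # A repeat gets the cached reply and is not executed again.
        return ("OK", self.replies[key])


server = Server()
nonce = server.bind()
first = server.call(nonce, "c1", 1, 5)   # executed
retry = server.call(nonce, "c1", 1, 5)   # duplicate: cached reply, no re-execution
count_before_restart = server.counter

server = Server()                        # simulate a server restart
after_restart = server.call(nonce, "c1", 1, 5)  # stale nonce is rejected

print(first, retry, count_before_restart, after_restart)
```

In this sketch the retry returns the same reply as the first call and the counter is incremented only once. After the simulated restart the old nonce no longer matches, so the server reports that it does not know whether the call ran. A real server would also need to bound the reply cache, which this sketch leaves out.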