6.824 2020 Final Exam *** Transactions and Serializability This is a 120-minute exam. Please write down any assumptions you make. You are allowed to consult 6.824 papers, notes, and lab code. If you have questions, please ask them on Piazza in a private post to the staff. Please don't discuss the exam with anyone. Alyssa has built a storage system that provides its clients with transactions. Alyssa won't tell us how her system works, but she does guarantee that the transactions are serializable. Following Section 9.1.6 of the reading for Lecture 12, serializable means that, if you run a bunch of concurrent transactions and they produce some results (output and updated database records), there exists some serial order of those transactions that would, if followed, yield the same results. You start three concurrent transactions at about the same time: T1: begin() put(y, 2) end() T2: begin() put(x, 99) put(y, 99) put(z, 99) end() T3: begin() tmpx = get(x) tmpy = get(y) tmpz = get(z) print tmpx, tmpy, tmpz end() x, y, and z are identifiers of different database records. All three start out with value 0. put() and get() write and read database records, and begin() and end() indicate the start and finish of a transaction. There is no activity in the system other than these three transactions, and no failures occur. If T3 prints 99, 2, 99, is that a serializable result? Explain why or why not. --- If T3 prints 0, 2, 99, is that a serializable result? Explain why or why not. *** Two-Phase Commit Recall the Two-Phase Commit protocol for distributed transactions from Lecture 12 and Section 9.6.3 of the reading for Lecture 12. A common criticism of two-phase commit is that it causes workers to hold locks for a long time; for example, in Figure 9.37 (page 9-89), each worker must hold its locks while waiting to receive the COMMIT message. What would go wrong if each worker released its locks after replying to the PREPARE message? *** Spanner Refer to the paper "Spanner: Google's Globally-Distributed Database," by Corbett et al. The start of Section 4.1.2 says that all of a read/write transactions's writes are assigned the same timestamp (the timestamp assigned to the Paxos write that commits the transaction). Suppose, instead, that each write is assigned a timestamp equal to TT.now().latest at the time when the client transaction code calls the Spanner library function to modify the data. What aspect of Spanner would this break, and why? *** FaRM Ben is using FaRM (see the paper "No compromises: distributed transactions with consistency, availability, and performance" by Dragojevic et al). Here is Ben's transaction code: begin() if x > y: y = y + 1 else: x = x + 1 end() x and y are objects stored in FaRM. begin() and end() mark the start and end of the transaction. Ben starts two instances of this transaction at exactly the same time on two different machines. The two transactions read x and y at exactly the same time, and send LOCK messages at exactly the same time. There is no other activity in the system, and no failures occur. x and y both start out with value zero. Explain what the outcome will be, and how the FaRM commit protocol (Figure 4) arrives at that outcome. --- Ben thinks that some transactions might execute faster if he modified FaRM to run Figure 4's VALIDATE phase before the LOCK phase. It turns out that this change can cause incorrect results in some situations. Outline such a situation. Your answer can involve transactions other than the one shown above. *** Spark Ben is using Spark to compute PageRank on a huge database of web links. He's using PageRank code much like that shown in Section 3.2.2 of "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing," by Zaharia et al. Ben's PageRank job takes hours to run, and he wants to find out which parts of the job take the most time. He inserts code that gets the wall-clock time before and after each line of the PageRank shown in Section 3.2.2, and prints the difference (i.e. prints how long that line took to execute). He sees that each line in the "for" loop in Section 3.2.2 takes only a fraction of a second to execute, and that the whole loop itself takes less than a second, even though the entire Spark job takes hours. Explain why the for loop takes so much less time than the whole job. --- Ben thinks that telling Spark to cache the "ranks" RDD may make his PageRank job complete in less time, since ranks is used in each iteration of the "for" loop. He adds a call to "ranks.persist()" after each of the two assignments to ranks. However, he finds that his PageRank job takes the same amount of time to complete despite this change. Explain. *** Memcache / Facebook Ben runs a small web site. He has a bunch of web servers, a single memcached server, and a single MySQL DB. The web servers store data persistently in the MySQL DB, and cache database records in memcached. Thus Ben's web servers act as clients to the memcached and MySQL servers. Though Ben has read "Scaling Memcache at Facebook," by Nishtala et al, he ignores all the ideas in that paper. For example, Ben's setup has nothing like Figure 6's invalidates or McSqueal, and no leases. Ben has programmed his web servers to use memcached and MySQL as follows. When a web server needs a data item, it first tries to read it from memcached; if that read misses, the web server reads the data item from MySQL. However, his web servers do not insert data read from MySQL into memcached. Instead, when one of Ben's web servers writes data, it always inserts (or updates) the data in both memcached and MySQL. Here's pseudo-code describing how Ben's web server's use memcached and MySQL: read(k): v = get(k) // is the key/value in memcached? if v is nil { // miss -- not in memcached v = fetch from DB } return v write(k,v): send k,v to DB set(k,v) // insert the new key/value into memcached Ben reasons that, since every write updates both MySQL and memcached, the two will always agree except during the small window of time between the two lines of code in write(). Explain why Ben is wrong. *** COPS You're using COPS (see "Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS," by Lloyd et al.). You have four data centers, and clients and a COPS Data Store Cluster at each data center. Four clients, each at a different data center, each start executing code that uses COPS at about the same time. Here's the code each client executes: Client C1: put(x, 1) put(y, 2) Client C2: tmp_y = get(y) put(z, 100 + tmp_y) Client C3: tmp_x = get(x) tmp_y = get(y) tmp_z = get(z) print "x=", tmp_x print "y=", tmp_y print "z=", tmp_z Client C4: tmp_z = get(z) tmp_y = get(y) tmp_x = get(x) print "x=", tmp_x print "y=", tmp_y print "z=", tmp_z Note that C3 and C4 read x, y, and z in opposite orders. All values start as zero. The clients start executing these code sequences at about the same time. The system uses COPS, not COPS-GT (that is, there are no transactions). There is no other activity in the system, just these four clients. There are no failures. Which of the following are possible outputs from C3? Mark all that apply. [ ] x=1 y=2 z=102 [ ] x=1 y=0 z=102 [ ] x=0 y=2 z=102 [ ] x=1 y=2 z=0 [ ] x=1 y=0 z=0 [ ] x=0 y=2 z=0 --- Which are possible outputs from C4? Mark all that apply. [ ] x=1 y=2 z=102 [ ] x=1 y=0 z=102 [ ] x=0 y=2 z=102 [ ] x=1 y=2 z=0 [ ] x=1 y=0 z=0 [ ] x=0 y=2 z=0 *** Lab 3 Explain what would go wrong if clients sent Lab 3 Get() RPCs to followers, and followers responded with the values in their replica of the data (without talking to the leader, and without adding anything to the log). ---- Alyssa wonders whether one could use Spanner's TrueTime (the API in Table 1 of the Spanner paper) to help design a correct version of Lab 3 in which followers can respond to client Get() RPCs. Outline a design for this. You can assume that both clients and servers have TrueTime, and that the key/value service is continuously busy with both Put() and Get() operations. *** 6.824 Which papers should we omit in future 6.824 years, because they are not useful or are too hard to understand? [ ] 6.033 chapter on transactions [ ] Spanner [ ] FaRM [ ] Spark [ ] Memcached at Facebook [ ] COPS [ ] Certificate Transparency [ ] Bitcoin [ ] BlockStack [ ] AnalogicFS --- Which papers did you find most useful? [ ] 6.033 chapter on transactions [ ] Spanner [ ] FaRM [ ] Spark [ ] Memcached at Facebook [ ] COPS [ ] Certificate Transparency [ ] Bitcoin [ ] BlockStack [ ] AnalogicFS