Spinnaker FAQ

Q: What is timeline consistency?
A: It means that all writes are applied in some total order, but that Gets are allowed to return the value of an earlier Put instead of the most recent one.

Q: When there is only 1 node up in the cohort, the paper says it's still timeline consistency; how is that possible?
A: Timeline consistency allows the system to return stale values: a Get can return a value that was written in the past but is not the most recent one. Thus, if a server is partitioned off from the majority, and perhaps has missed some Puts, it can still execute a Get as long as the Get doesn't have to be strongly consistent.

Q: What are the trade-offs of the different consistency levels?
A: Strongly-consistent systems are easier to program because they behave like a non-replicated system. Building applications on top of eventually-consistent systems is generally more difficult because the programmer must consider scenarios that couldn't come up in a non-replicated system. Furthermore, certain applications simply require strong consistency.

Q: What is the CAP theorem about?
A: The CAP theorem captures the intuition that it is difficult to achieve consistency, availability, and partition tolerance at the same time. The reason is that, to handle partitions and still achieve consistency, there are situations where the system cannot make progress (thus reducing availability) because no partition contains a majority. For example, in Raft, if leader election fails because there is no majority in any partition, then Raft cannot process new client requests.

Q: Could Spinnaker use Raft as the replication protocol rather than Paxos?
A: Although the details are slightly different, my thinking is that they are interchangeable to first order. The main reason that we are reading this paper is as a case study of how to build a complete system using a replicated log. It is similar to lab 3 and lab 4, except you will be using your Raft library.
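To make the timeline-consistency answers above concrete, here is a minimal sketch of the idea. It is not Spinnaker's code: the Replica and Cohort classes and the sync_follower method are hypothetical names invented for illustration, and the sketch ignores quorums, failures, and recovery. It assumes a single leader that orders all Puts in one log and a follower that has applied only a prefix of that log. A strongly consistent Get reads from the leader; a timeline-consistent Get may read from the follower, so it can see an older value but never sees writes out of order.

```python
class Replica:
    """Hypothetical replica holding a prefix of the cohort's write log."""

    def __init__(self):
        self.store = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, log, upto):
        # Apply entries strictly in log order; never skip or reorder.
        for key, value in log[self.applied:upto]:
            self.store[key] = value
        self.applied = max(self.applied, upto)

    def get(self, key):
        return self.store.get(key)


class Cohort:
    """Hypothetical cohort: one leader and one (possibly lagging) follower."""

    def __init__(self):
        self.log = []  # total order of all Puts
        self.leader = Replica()
        self.follower = Replica()

    def put(self, key, value):
        # The leader appends to the log and applies immediately.
        self.log.append((key, value))
        self.leader.apply(self.log, len(self.log))

    def sync_follower(self, upto=None):
        # The follower catches up to some prefix of the log.
        self.follower.apply(self.log, len(self.log) if upto is None else upto)

    def get(self, key, consistent):
        # consistent=True: strong read from the leader.
        # consistent=False: timeline read, possibly stale.
        replica = self.leader if consistent else self.follower
        return replica.get(key)


cohort = Cohort()
cohort.put("x", 1)
cohort.sync_follower()   # follower has seen x=1
cohort.put("x", 2)       # follower has not seen this Put yet
print(cohort.get("x", consistent=True))   # 2, the latest value
print(cohort.get("x", consistent=False))  # 1, stale but from the same timeline
```

Because the follower only ever applies a prefix of the one log, a stale read is always a value that really was the latest at some earlier point in the total order.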
Q: The paper mentions Paxos hadn't previously been used for database replication. What was it used for?
A: Paxos was not often used at all until recently. By 2011 most uses were for configuration services (e.g., Google's Chubby or Zookeeper), but not to directly replicate data.

Q: What is the reason for logical truncation?
A: Logical truncation exists because Spinnaker merges the logs of 3 cohorts into a single physical log for performance on magnetic disks. This complicates log truncation, because when one cohort wants to truncate the log there may be log entries from other cohorts, which it cannot truncate. So, instead of truncating the log, the cohort remembers in a skip list which of its entries have been truncated. When the cohort recovers and starts replaying log entries from the beginning of the log, it skips the entries in the skip list, because they are already present in the last checkpoint of memtables.

Q: What exactly is the key range (i.e. 'r') defined in the paper for leader election?
A: One of the key ranges of a shard (e.g., [0,199] in figure 2).

Q: Is there a more detailed description of Spinnaker's replication protocol somewhere?
A: http://www.vldb.org/2011/files/slides/research15/rSession15-1.ppt

Q: How does Spinnaker's leader election ensure there is at most one leader?
A: The new leader is the candidate with the max n.lst in Zookeeper under /r/candidates, using Zookeeper sequence numbers to break ties.

Q: Does Spinnaker have something corresponding to Raft's terms?
A: Yes, it has epoch numbers (see appendix B).

Q: Section 9.1 says that a Spinnaker leader replies to a consistent read without consulting the followers. How does the leader ensure that it is still the leader, so that it doesn't reply to a consistent read with stale data?
A: Unclear. Maybe the leader learns from Zookeeper that it isn't the leader anymore, because some lease/watch times out.

Q: Would it be possible to replace the use of Zookeeper with a Raft-like leader election protocol?
A: Yes, that would be a very reasonable thing to do. My guess is that they needed Zookeeper for shard assignment and then decided to also use it for leader election.

Q: What is the difference between a forced and a non-forced log write?
A: After a forced write returns, it is guaranteed that the data is on persistent storage. After a non-forced write returns, the write to persistent storage has been issued, but the data may not yet be on persistent storage.
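The forced/non-forced distinction can be sketched with ordinary file system calls. This is an illustration only, not how Spinnaker implements its log: the append_record helper, the file name wal.log, and the record contents are made up, and the sketch assumes that "forced" corresponds to calling fsync after the write. Whether fsync truly reaches stable media also depends on the operating system and the storage device.

```python
import os
import tempfile


def append_record(fd, record, force):
    """Append one record to the log (hypothetical helper).

    force=False: hand the data to the OS; it may still be only in memory.
    force=True:  additionally ask the OS to flush it to the storage device
                 before returning.
    """
    view = memoryview(record)
    while view:
        # os.write may write fewer bytes than requested, so loop.
        written = os.write(fd, view)
        view = view[written:]
    if force:
        os.fsync(fd)


path = os.path.join(tempfile.mkdtemp(), "wal.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)

append_record(fd, b"put k1 v1\n", force=False)  # non-forced: issued, maybe not durable
append_record(fd, b"commit\n", force=True)      # forced: flushed before returning

os.close(fd)
```

A non-forced write is cheaper because it avoids waiting for the disk, which is why a system might force only the writes whose durability it has to promise to someone else.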