Read two papers: Experiences in Building and Operating ePOST, a Reliable Peer-to-Peer Application, by Mislove et al., 2006, and Exploiting KAD: Possible Uses and Misuses, by Steiner et al., 2007.
Open DHTs are an attractive way to store data for decentralized applications, since DHTs are themselves decentralized; IPFS, Ethereum, Storj, and BitTorrent all use DHTs. The ePOST paper is a case study of using a DHT in a decentralized application (e-mail); it explains a number of unexpected reliability and consistency problems that arose, along with their solutions. The KAD paper focuses on Sybil attacks on a Kademlia DHT. Our job is to form an opinion about whether, all things considered, open DHTs are a promising foundation for decentralized applications.
Are there lessons from ePOST about basing decentralized applications on peer-to-peer storage, particularly DHTs?
In ePOST Section 3.1, what is the point of the multicast? What would go wrong if senders didn't perform the multicast (i.e., if they only inserted the new message into the DHT)?
Why does ePOST store a log of each user's actions? Why not just store the user's mail directly in the DHT?
How is the ePOST log structured (e.g. how does one find the log records)? The POST paper explains some details.
ePOST Section 3.3 says that almost all of the data in the DHT is immutable. What prevents the data from being changed? Is e-mail data naturally immutable, or does some aspect of the design force use of immutable storage? Would it make sense to represent folders, "read" flags, and deletion using mutable data?
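As a starting point for thinking about the immutability question, here is a minimal sketch of one common way a DHT can make stored data self-verifying: the key of a block is the hash of its content. Whether ePOST does exactly this is something to check against Section 3.3; the class name `ContentHashStore`, the in-memory dictionary, and the choice of SHA-1 are assumptions made for illustration only.

```python
import hashlib


class ContentHashStore:
    """Toy stand-in for a DHT whose keys are hashes of the stored bytes."""

    def __init__(self):
        # Hypothetical in-memory table; a real DHT spreads this over many nodes.
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so the writer cannot pick it.
        key = hashlib.sha1(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # Any reader can re-hash the bytes and detect a modified block.
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("block does not match its key")
        return data
```

Under this scheme, changing a block's content changes its key, so an update is really a new block. That is worth keeping in mind when considering how folders, "read" flags, and deletion could be represented.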
ePOST Section 4.1 says that network partitions occur in real life, and that they caused trouble for ePOST. How would other systems we've looked at cope with partitions?
The POST paper is a short read and explains some of the background for ePOST.
The POST paper's Section 2 mentions that it relies on the Pastry secure routing system. How does Pastry provide security?
Could the certificate authority in Section 3 of the POST paper use Keybase or Blockstack?
The KAD paper's Section 5 suggests an anti-Sybil scheme in which a central CA registers users and assigns them IDs. Does that seem like a good design?
Why not instead require each node to have a public/private key pair, to use its public key (or a hash of it) as its ID, and to have other nodes check signatures on messages? Would that prevent nodes from being able to choose their own IDs, and thus prevent nodes from executing eclipse attacks?
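One way to explore this question is to experiment with a small sketch of the scheme. The code below assumes 160-bit IDs computed as the SHA-1 hash of the public key, and it uses random bytes as a stand-in for generating a fresh key pair; the function names are hypothetical and signature checking is omitted. It lets you measure how many keys a node would have to generate before its ID shares a given number of leading bits with a target ID.

```python
import hashlib
import os

ID_BITS = 160  # assumption: SHA-1 sized IDs


def node_id_from_key(public_key: bytes) -> int:
    """Derive a node ID by hashing the public key bytes."""
    return int.from_bytes(hashlib.sha1(public_key).digest(), "big")


def id_matches_key(claimed_id: int, public_key: bytes) -> bool:
    """What a peer would check before trusting a claimed ID."""
    return claimed_id == node_id_from_key(public_key)


def shared_prefix_bits(a: int, b: int) -> int:
    """Number of leading bits two IDs have in common (XOR closeness)."""
    return ID_BITS - (a ^ b).bit_length()


def grind_key_near(target: int, want_bits: int, max_tries: int = 1_000_000):
    """Keep generating keys until one hashes close to the target ID."""
    for tries in range(1, max_tries + 1):
        key = os.urandom(32)  # stand-in for creating a new key pair
        if shared_prefix_bits(node_id_from_key(key), target) >= want_bits:
            return key, tries
    return None, max_tries
```

For example, `grind_key_near(node_id_from_key(b"victim"), 8)` typically returns after a few hundred tries, since each extra matching bit roughly doubles the expected work. Consider how many matching bits an eclipse attack would need in a network of a given size.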
Why not require each node to use an ID that is a hash of its IP address? Wouldn't that prevent nodes from being able to choose their own IDs, and also prevent attackers from creating large numbers of fake "Sybil" nodes?
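The IP-based scheme can be sketched the same way. The code below assumes the ID is the SHA-1 hash of the packed IP address and that a peer compares the claimed ID against the source address it observes; the function names are hypothetical. It also shows how many distinct IDs a single party obtains if it controls a block of addresses.

```python
import hashlib
import ipaddress


def node_id_from_ip(ip: str) -> int:
    """Derive a node ID by hashing the packed IP address."""
    packed = ipaddress.ip_address(ip).packed
    return int.from_bytes(hashlib.sha1(packed).digest(), "big")


def accept_peer(claimed_id: int, observed_ip: str) -> bool:
    """Accept a peer only if its ID matches the address it connects from."""
    return claimed_id == node_id_from_ip(observed_ip)


def ids_for_network(cidr: str) -> list[int]:
    """All IDs available to someone who controls every host in a block."""
    network = ipaddress.ip_network(cidr)
    return [node_id_from_ip(str(addr)) for addr in network.hosts()]
```

Two things to consider when evaluating the design: nodes behind the same NAT present the same address and would therefore collide on one ID, and anyone holding a large address block (an IPv6 allocation, for instance) gets one ID per address.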
What was discovered to be wrong with Ethereum's use of Kademlia in early 2018? Was it serious? What was the fix?