6.824 2015 Lecture 23: Bitcoin

Bitcoin: A Peer-to-Peer Electronic Cash System, by Satoshi Nakamoto
2008

why aren't credit cards the perfect digital money?
  +  work online 
  +  hard to steal (a complex situation)
  +- can cancel transactions, call customer support
  -  relies on 3rd parties to verify (banks)
     b/c users cannot directly verify charges
  -  3% fee
  -  long settling time
  -  hard to become a merchant
  +- tied to currency controlled by government

bitcoin: e-money without a central trusted party
  a public ledger: anyone can verify transactions

what's hard technically?
  forgery
  double spending
  theft

what's hard socially/economically?
  why does it have value?
  how to pay for infrastructure?
  monetary policy (intentional inflation &c)
  laws (taxes, laundering, drugs, terrorists)

let's design OneBit, a simple e-money system
  to illustrate a public, verifiable ledger using transaction chain
  each user owns some coins
  single server -- OneBank -- everyone talks to it
  OneBank records all transactions

OneBit transactions
  every coin has a chain of transaction records
    one for each time this coin was transferred as payment
  OneBank maintains the complete chain for each coin
  chain helps ensure that only the current owner spends

what's in a OneBit transaction record?
  public key of new owner
  hash of this coin's previous transaction record
  signed by private key of previous owner
  (BitCoin is much more complex: amount (fractional), multiple in/out, ...)

OneBit example:
  Y owns a coin, previously given to it by X:
    T7: pub(Y), hash(T6), sig(X)
  Y buys a hamburger from Z and pays with this coin
    Z tells Y Z's public key ("address")
    Y creates a new transaction and signs it
    T8: pub(Z), hash(T7), sig(Y)
  OneBank verifies that:
    no other transaction mentions hash(T7),
    T8's sig() corresponds to T7's pub()
  Z waits until OneBank has seen/verified T8,
    verifies that T8's pub() is Z's public key,
    then Z gives hamburger to Y

why include pub(Z)?

why sign with sig(Y)?

why include hash(T7)?
  hash(T7) identifies the exact coin to be spent
  a coin ID scheme might be ambiguous if Y owned this coin previously

where is Z's resulting coin value "stored"?
  coin balance = unspent xaction
  the "identity" of a coin is the (hash of) its most recent xaction
  Z "owns" the coin: has private key that allows Z to make next xaction

does OneBit's transaction chain prevent stealing?
  current owner's private key needed to sign next transaction
  danger: attacker can steal Z's private key
    Z uses private key a lot, so probably on his PC, easy to steal?
  a significant problem for BitCoin in practice

what if OneBank is corrupt?
  it can't forge transactions (doesn't know private keys)
  but it can help people double-spend!

double-spending with OneBit
  suppose OneBank is cooperating with Y to cheat Z or Q
  Y creates two transactions for same coin: Y->Z, Y->Q
    both have has pointing to same current end of chain
  OneBank shows chain ending in Y->Z to Z, and Y->Q to Q
    both transactions look good, including signatures and hash
  now both Z and Q will give hamburgers to Y

why was double-spending possible?
  OneBank can *hide* some information,
  or selectively reveal it

what's needed?
  many servers ("peers")
  send all transactions to all peers
  much harder for a few bad peers to hide transactions
  conventions to un-fork if problems do arise

the BitCoin block chain
  single block chain contains transactions on all coins
  a copy stored in each peer
    so each peer can validate new transactions against old ones
  each block:
    hash(prevblock)
    set of transactions
    "nonce" (not quite a nonce in the usual cryptographic sense)
    current time (wall clock timestamp)
  a transaction isn't real unless it's in the block chain
  new block every 10 minutes containing xactions since prev block

who creates each new block?
  all the peers try
  requirement: hash(block) < "target"
  peers try nonce values until this works out
  can't predict a winning nonce, since cryptographic hash
  trying one nonce is fast, but most nonces won't work
  it would take one CPU months to create one block
  but thousands of peers are working on it
  such that expected time to first to find is about 10 minutes
  the winner sends the new block to all peers
  (this is part of "mining")

how do transactions work w/ block chain?
  start: all peers know ...<-B5
    and are working on B6 (trying different nonces)
  Z sends public key (address) to Y
  Y sends Y->Z transaction to peers, which flood it
  peers buffer the transaction until B6 computed
  peers that heard Y->Z include it in next block
  so eventually ...<-B5<-B6<-B7, where B7 includes Y->Z

what if bad person wants to double-spend?
  start with block chain ...<-B6
  Y gets Y->Z into block chain
    ...<-B6<-BZ (BZ contains Y->Z)
  Z will see Y->Z in chain and give Y a hamburger
  can Y create ...<-B6<-BQ
    and persuade peers to accept it in place of ...<-B6<-BZ?

when will a peer accept chain CX it hears from another peer?
  suppose peer already knows of chain CY
  it will accept CX if len(CX) > len(CY)
  and if CX passes some validity tests
  will not accept if len(CX) = len(CY): first chain of same length wins

so attacker needs a longer chain to double-spend
  e.g. ...<-B6<-BQ<-B8, which is longer than ...<-B6<-BZ
  and must create it in less than 10 minutes
    *before* main chain grows by another block
  10 minutes is the time it takes the 1000s of honest peers
    to find one block
  if the attacker has just one CPU, will take months to create the blocks
    by that time the main chain will be much longer
    no peer will accept the attacker's shorter chain
    attacker has no chance
  if the attacker has 1000s of CPUs -- more than all the honest
    bitcoin peers -- then the attacker can double spend
 
summary:
  attacker can force honest peers to switch chains if attacker
    controls majority of peer CPU power

how long should Z wait before giving Y the hamburger?
  until Z sees Y flood the transaction to many peers?
    no -- not in the chain, Y might flood conflicting xaction
    maybe OK for low-value transactions
  until Z sees chain ...<-BZ?
    maybe
    risky -- non-zero chance that some other chain will win
      i.e. some lucky machine discovered a few blocks in a row
        quickly, but its network msgs have so far been lost
      perhaps that chain won't have Y->Z
      probability goes down rapidly with number of blocks
  until Z sees chain with multiple blocks after BZ?
    yes -- slim chance attacker with few CPUs could catch up

validation checks:
  much of burden is on (honest) peers, to check new xactions/blocks
    to avoid ever having to scan the whole block chain
    and so that clients don't have to maintain the whole block chain
  peer, new xaction:
    no other transaction refers to same previous transaction
    signature is by private key of previous transaction
    [ then will add transaction to txn list for new block being mined ]
  peer, new block:
    hash value < target (i.e. nonce is right, proof of work)
    previous block hash exists
    new chain longer than current chain
    all transactions in block are valid
  Z:
    Y->Z is in a recent block
    Z's public key / address is in the transaction
    multiple peers have accepted that block
    there's several more blocks in the chain
  (other stuff has to be checked as well, lots of details)

where does each bitcoin originally come from?
  each time a peer creates a block, it gets 25 bitcoins
    assuming it is the winner for that block
  it puts its public key in a special transaction in the block
  this is incentive for people to operate bitcoin peers
  but that number halves every 210,000 blocks (abt 4 years)
  the point: motivate people to run peers

Q: how do peers communicate / find each other?

Q: what prevents a bad peer from modifying an existing block?

Q: what if two nodes disagree on the validity of a transaction?
   (slight implementation differences between software versions)

Q: 10 minutes is annoying; could it be made much shorter?

Q: are transactions anonymous?  pseudonymous?  analogy: IP addresses.

Q: can bitcoins be stolen?

Q: if I steal bitcoins, is it safe to spend them?

Q: what can adversary do with a majority of CPU power in the world?
   can double-spend
   cannot steal others' bitcoins
   can prevent xaction from entering chain
   can revert past xactions (by building longer chain from before that block)

Q: how rich are you likely to get with one machine mining?

Q: if more people (CPUs) mine, will that create new bitcoin faster?
   important use of block timestamps: control difficulty (hash target)

Q: why mine at all? why not start with a fixed number of coins?

Q: why does it make sense for the mining reward to decrease with time?

Q: is it a problem that there will be a fixed number of coins?

Q: what if the real economy grows (or shrinks)?

Q: why do bitcoins have value?
   e.g., 1 BTC appears to be around US$242 on may 12th 2015.

Q: will we still need banks, credit cards, &c?
   today, dollar bills are only a small fraction of total money
   same may be true of bitcoin
   so properties of bitcoin (anonymity &c) may not be very important

Q: what are the benefits of banks, credit cards, &c?
   disputes (Z takes payment and does not give hamburger to Y)
   loss / recovery (user cannot find their private key)

Q: will bitcoin scale well?
   as transaction rate increases?
     claim CPU limits to 4,000 tps (signature checks)
     more than Visa but less than cash
   as block chain length increases?
     do you ever need to look at very old blocks?
     do you ever need to xfer the whole block chain?
     merkle tree: block headers vs txn data.

key idea: block chain
  public ledger is a great idea
  decentralization might be good
  mining seems imperfect, but does avoid centralized trust
  tieing ledger to new currency seems bad