1000 Datapoints & 1000 Nodes
I am not entirely sure I got this correct but I’m going to post my (attempted) solution. Anyway, here’s the problem description:
We have 1,000 data items to store on 1,000 nodes. Each node can store copies of exactly three different items. Propose a replication scheme to minimize data loss as nodes fail. What is the expected number of data entries that get lost when three random nodes fail?
Solution Link to heading
Important info.:
Each node can store copies of exactly three different items
So I thought,
Therefore we need 334 nodes to store the 1000 datapoints. The node configuration will look like:
So when we get to node 334 we insert data point 1000 in the first slot and use the two spares to store replicas of data point 1 and 2. And so forth from node 334 up to 668:
At this stage we have stored all the original data points in the first 334 nodes, then used the next 334 to backup the first set of replicas. We thus have 332 spare nodes to use for additional backups. Yet we notice that we would need another 334 nodes to fully backup the data the second time, but we only have 332 left. Therefore the best we can do is minimise losses. I think this is why the exercise is describe the way it is.
A naive replication scheme would be simply to replicate all data again from node 668 to 1000 and ignore data points 997 - 1000:
We now need to think of the different scenarios where nodes fail. Not every scenario causes the same outcome after all.
Scenarios: Link to heading
0 data loss, i.e. nodes 1, 2, and 3 fail, and we lose the originals of data point 1 - 9, but their replicas in nodes 334 - 336 are not affected.
2 data entries lost, i.e., failure of nodes 1, 334, and 668, would trigger the loss of originals of entry 1 and its two replicas, in addition to the loss of entry 1000 and its one replica in node 668. Another example are entries 999 and 1000 if we lose nodes 333, 334, and 668 we lose both their originals and one replica. Similarly for entry 997 and 998 if we lose nodes 333, and 667 we lose both their originals and one replica (the failure of the third node doesn’t cause any further data loss).
1 data entry is lost, i.e. combinations that fall outside the 0-loss and 2-entry loss scenarios.
Since no assumption is made on the distribution of failure, I assume that any given node is independent and can fail with equal probability 1/1000
, and thus in terms of expectations we have:
$$ E(data \ entries \ lost) = \frac {1}{3} (0 + 2 + 1) = 1 $$
Share on: