8/03/2009

Consistent Hashing - Theory & Implementation

What's it?

The consistent hashing comes from the solving of hot spot problem in Internet system, I.E., it comes from the distributed cache system. [1][2]

The idea is simple and straight forward (more detail in paper[1][2]):
Hot Spot -> Centric Cache -> Distributed Cache (Communicate using IP Multicast) -> Harvest Cache(Tree Structure Cache, structured communication)[9] -> Random Tree Cache(different Tree for different cached Objects, hash mapping from tree node to machine node) -> Random Tree + Consistent Hashing Cache(deal with machine node dynamics: machine may leave/down and join/recover)

Essentially, a Consistent Hash Function is one that changes minimally as the range of the function changes. A hash function can be looked as a mapping from items to buckets, suppose we have such a mapping m1. If some buckets are added/removed, we have another mapping m2. In normal hash functions, m1 -> m2 will introduce many item movements (from one bucket to another). A consistent hash function has the characteristic that item movements are minimal.

How does it accomplish that?

In consistent hash function:
1. Items and buckets are all hashed(normal hash) into some integer interval (for example: [0, 1024]).
2. Item is mapped to a bucket that is closet to it.
3. Bucket may be replicated to improve even distribution balance.

NOTE: here "closet" means the first bucket it meets when traverse clock wise along the integer circle (see diagram below)

Suppose the normal hash output range is [0, 1023], you have 4 buckets and 8 items, each bucket is replicated twice. One possible consistent hashing may be illustrated as below:


Current Mapping/Hashing:
Bucket1 - Item1, Item3, Item8
Bucket2 - Item2, Item7
Bucket3 - Item4, Item6
Bucket4 - Item5

If Bucket3 is down/broken, the new Mapping/Hashing will be:


Bucket1 - Item1, Item3, Item8
Bucket2 - Item2, Item6, Item7
Bucket4 - Item4, Item5


You can see that only Items on Bucket3 are changed, and they are distributed among the remaining buckets.

If a new Bucket5 is added, you can see that only small number of items are changed, and the new bucket gets load from the original 4 Buckets.

How to Implement it?

Normal hash function is stateless, the return value only determines by the input parameter, but the consistent hash function is a stateful function. The state is how the buckets are arranged on the integer circle.

It's natural to store how the buckets are arranged on the integer circle as a search tree, since it's in fact a typical search algorithm - you need to know which segment an item (its hash value) belongs to.

In practical system, this stateful data structure will be stored on each client that uses this consistent hash function. As buckets(machine nodes) join and leave, the state will change. But different client may see the join/leave at different time, or even in different order, thus will produce different hash value using the same consistent hash function(It's said to have different view in paper[1]).

But it is proven in [1], that the number of buckets that one item may belong and the number of items that on one particular bucket won't be very large (the so called Spread/Load property).

A C++ version of consistent hashing function can be found here, it uses STL map as the binary search tree.

The impact of the bucket replica number can be visualized as below (code can be found in testmain.cxx):

You can see that as the replica count increases, the item distribution over buckets will become more and more even.

[Reference]
1. Theory Paper Consistent hashing and random trees
: distributed caching protocols for relieving hot spots on the World Wide Web
2. Practical Paper Web Caching with Consistent Hashing
3. Blog About Consistent Hashing with Java Code
4. Blog Understanding Consistent Hash
5. http://en.wikipedia.org/wiki/Consistent_hashing
6. http://en.wikipedia.org/wiki/Distributed_hash_table
7. The Chord P2P system
8. A Hierarchical Internet Object Cache

No comments: