3/15/2008

LiveJournal's Backend: A history of scaling

Today, Livejournal has 20 million+ dynamic page views per day for 1 million users. This scalable website was developed by Danga Interactive(now part of Six Apart). In late 2005, the developers released a presentation about the backend architecture of this web site. The interesting part of it is not that it gave its architecture, but that it gave the history of the architecture, you can learn a lot from the scalable infrastructure evolution.

The PDF version of the presentation can be found at: http://www.danga.com/words/2007_06_usenix/usenix.pdf

You can also see the Flash version here: http://www.slideshare.net/vishnu/livejournals-backend-a-history-of-scaling

From this presentation we can learn that, the main components of a high scalable web site(interactive, which means it hosts user created content) are:

1. Load Balancer (Most likely, it will be F5 hardware like BIG/IP, or software Reverse Web Proxy, many web server can act as this role, or even LVS/HA). It sits at the most front of the whole system. Redistribute the client requests among the back end server pool and send response back properly.

2. Web Server Farm (Easies part to scale, maybe diskless.).It's CPU bound server. Since web request processing is in per-user style, the task division of web site application is very natural. The scalability of the web logic component is very easy and straight forward to reach - "Add More Machines". But in the case of user session data is involved in a unstable server farm(which means server may down), standalone cache server is needed. If the business logic is complex and separated from application logic, some framework like EJB/Spring is needed. But this only happens in business processing(Mainly in Java) world.

3. Distributed Cache (It's inside your application logic, not Content delivery network. To speed up your database access. It's network I/O bound and needs lots of memory). It should be apart from web server to provide better whole system availability. The performance and capability trade-off is very interesting and challenging. There are many available solutions in this area, for example, Oracle Coherence(formerly Tangosol), JBoss Cache and Danga's MemCacheD.

4. Database Partition/Replication/Clustering (Most likely, it will be the bottle neck of your system. Both I/O & CPU bounded). This is the most challenging part to be scalable. The presentation devotes most of its content to DB discussion. Good DB schema(vertical/horizontal partitioned) design and clustering(replication/mirroring, master/slave) configuration are very critical to extreme scalable and available web system.

5. Reliable Distributed Storage System, especially for those media(Photo/Video) hosting service provider. (RAID, NFS or home brewed file system)

6. Monitoring and Fail over service.

Danga design&implemented their own Cache/Reverse Proxy/Distributed Storage system. It's so cool!

Their memcached is widely used in many very popular web sites.

P.S.
1. Since web farm nodes are all diskless, the web server nodes are netbooted from a redundant NFS image. Thus the server farm is more cheaper and more stable. (Disk failure is one of the main system failure factor)

2. The current Database(mySql) architecture is that: one global db cluster, nine user specific data cluster. In one user data cluster, it's master-master pattern(mirroring each other actually). Only global db cluster is master-slave pattern. Why not make all clusters in master/slave pattern? Because this can help speed up write operations. If one master(write) node and 100 slave(read) nodes, many time will be spent on write propagation. In the meanwhile, each piece of data is redundant in every db nodes, it's not necessary, only wasting spaces.

A good article about Danga's presentation can be found here: http://www.linuxjournal.com/article/7451

No comments: