Scaling Distributed Counters

In: HBase

10 Dec 2010

Distributed counters is an important functionality many distributed databases offer. For an ad network distributed counters are important for many reasons. Real time ad impressions and click data can be used for ad optimization. HBase and Cassandra both support distributed counters.

Ultimately, whatever system you may choose, scaling distributed counters remains a challenge. It boils down to a fundamental problem – atomic updates. At a any given time unit, only one process/thread can update a counter. This counter cannot live on multiple servers at the same time. It has to be updated on one node. Thus, there is no such thing as a true distributed counter! You will always be limited by how many operations a node can execute in a given amount of time.

In spite of being a fundamental problem, there are some simple strategies you can employ to alleviate this problem. In this article I plan to discuss some strategies we used to solve this problem.

Break the counter into multiple counters
First strategy is to break the counter, make it more granular. Let’s say we want to update the ad impressions counter for ad that is being served millions of times across thousands of websites. Here, instead of updating one counter per ad, we can choose to update a counter per ad- website combination. Thus, when you want to find out how many ad impressions were served in any given time interval, you will have to sum up all the website impressions for an ad. This will help in distributing the load. In HBase speak, one ad-website combination can be on one region server and another ad-website combination can be on the other region server. HBase allows you to design your key in such a way that it is easy to do partial key queries.

Let’s say that we employ this strategy to find out that your ad is being served on a huge website (such as cnn.com) where an-ad website combination itself is not enough for a node to handle. What do you do then? You employ In Memory Aggregation.

In Memory Aggregation
In most systems, “Real Time” does not have to be an absolute real time. Let’s say you have a cluster of 40 web servers, and a cluster of 10 HBase nodes. All the 40 nodes are updating counters on 3 region servers. What if you keep these counters in the web server memory, accumulate them for 30 seconds, and then dump the counters to HBase? Thousands of requests to Hbase will immediately reduce to 40 requests per 30 second! That’s a huge reduction in load! Your ability to get real time data will go away, and you will always get 30 seconds delayed data. However, most systems can live with such a delay.

Java offers concept of shared memory; hence it’s easy to do In Memory Aggregation with Java. In Java, each process is handled by a thread in the same JVM process. On other platforms, you must have memcache, or a similar system installed on each of the servers as every request is processed by a separate process (no shared memory exist).

This optimization will take you so far, that you will not have to worry about scaling distributed counters unless your web server farms grows beyond thousands of servers! If that happens, you are in the same league as large companies such as Facebook and Google. In that case you wouldn’t need my advice, you would have some brilliant engineers dedicated to solving all your problems.

Share and Enjoy:
  • Print
  • Digg
  • Sphinn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Blogplay
  • Guest

    Clear and concise overview of Hbase.
    Thanks

  • Guest

    Thanks for your great post. Breaking is a good strategy to implement distributed counters, but as you say, “absolute real-time” will be scarified. However, in some cases, ”
    absolute real-time” may be preferable. For example, if the counts relate to money, delays may incur losses. Are there any good approached to handle such a situation? Any suggestions are welcome. Thank you. 

  • Anonymous

    In that case breaking the counter into multiple counters is preferable. You loose absolute real time only if you are using the second strategy described above – In Memory Aggregation. if you are running a tuned Hbase on your own servers, you can get pretty far with a good powered machine.

  • Anonymous

    HBase counters and Cassandra from one giant bottleneck: it reads current value from disk, increments and the saves updated value to memory. This make io the bottleneck bad not very scalable.
    One way to over come it is to write the delta as just another column and write a process which aggregates it once in a while. Also when reading aggregate on the fly the remaining existing delta.

blog comments powered by Disqus

WhyNosql subscription by Email

Name:
E-Mail Address:

Top Commentators

Individuals who contribute to WhyNoSQL on a regular basis, through commenting, will be rewarded here. When will you be on this list?
  • No commentators.