Adventures in Scaling, Part 1: Using REE

The engineering team at Miso has been quite busy the last few weeks hacking on new features for the site, as well as on experimental ideas regarding the ‘future of social TV’. We are also busy fleshing out the Developer Platform that allows others to build applications using our API and to embed widgets displaying a user’s latest watching activity. [A post on building an OAuth REST API is forthcoming...] As always, we are very focused on infrastructure and stability as our user base grows.

We have been learning a lot of important lessons as we continue to scale our application. An ongoing theme of this blog will be to detail our adventures in scaling our various web services as traffic grows. We decided to start simple: this first post explains how developers hosting a Rails application on Ruby 1.8.7 can use Ruby Enterprise Edition (REE) to take advantage of its performance tuning capabilities.

Ruby Enterprise Edition is, very simply, an improved version of the 1.8.7 MRI Ruby runtime. The improvements include a copy-on-write friendly garbage collector, an improved memory allocator, the ability to debug and tune garbage collection, and various thread bug fixes and performance improvements. In short, if you are using 1.8.7 in your Rails or Rack application anyway, there is really no reason not to switch to REE. We have been using it in production for months and the benefits are worth the switch.

First, let’s get REE installed on your servers. For this tutorial, we will assume you are running Ubuntu 32-bit in production (if not then certain details might be slightly different):

wget http://rubyenterpriseedition.googlecode.com/files/ruby-enterprise_1.8.7-2011.01_i386_ubuntu10.04.deb
sudo dpkg -i ruby-enterprise_1.8.7-2011.01_i386_ubuntu10.04.deb

This installs the REE package under /usr/local/ by default. Don’t worry, it can happily co-exist with your existing Ruby installation. Next, reconfigure your web server to use this version of Ruby. If you are using Passenger, for instance, run the Passenger command that installs the Nginx or Apache module and then change the web server configuration to point to that Passenger installation.

REE comes with the ability to tune performance by tweaking the garbage collector. This can have a significant impact on your application and is worth the extra effort. We have seen Ruby performance improve by up to 20-30% simply by fine-tuning these parameters. To tune Ruby, we need to create a wrapper script that sets the appropriate environment variables and then launches Ruby.

There are many different recommended settings for the variables and these do depend on your application. Let’s take a look at the various adjustable options for the garbage collector. Each has a different effect on the performance of your server:

RUBY_HEAP_MIN_SLOTS

The first option sets the initial number of heap slots Ruby allocates at startup. It affects memory usage because the larger the initial heap, the more memory is required up front. However, most Ruby applications need much more memory than the default allocation provides. Increasing this value means Ruby spends less time growing the heap and collecting garbage while the application boots, which decreases startup time and increases throughput. The default is 10000 slots but the recommended range is between 500000 and 1250000. For most applications I have tested, the sweet spot is roughly 800000.
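To get a feel for the memory cost of a larger initial heap, the slot count can be converted into a rough size. The sketch below assumes a heap slot occupies 20 bytes on 32-bit Ruby 1.8 and 40 bytes on 64-bit; these sizes and the helper name are assumptions for illustration, so check them against your own build.

```ruby
# Rough estimate of the memory reserved for the initial heap.
# ASSUMPTION: 20 bytes per slot on 32-bit Ruby 1.8, 40 bytes on 64-bit.
BYTES_PER_SLOT = { 32 => 20, 64 => 40 }

# Hypothetical helper: converts a slot count into megabytes.
def initial_heap_megabytes(slots, word_size = 32)
  slots * BYTES_PER_SLOT.fetch(word_size) / (1024.0 * 1024.0)
end

# Compare the default with the recommended range from this post.
[10_000, 500_000, 800_000, 1_250_000].each do |slots|
  printf("%9d slots ~ %.1f MB\n", slots, initial_heap_megabytes(slots))
end
```

Under these assumptions, even the top of the recommended range costs only a few tens of megabytes per process on a 32-bit system.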

RUBY_HEAP_SLOTS_INCREMENT

This option is the number of additional heap slots that will be allocated the first time Ruby is forced to grow the heap. The default value is 10000 but this is lower than most Rails applications could benefit from. The recommended range is between 100000 and 300000, which lets the heap grow much faster and, depending on your application, allows for better throughput and faster response times. Our recommended setting is 250000.

RUBY_HEAP_SLOTS_GROWTH_FACTOR

This option is the multiplier that Ruby uses to calculate how many new heap slots to allocate the next time it needs more. The default is 1.8, but if you raise the slots increment to a much higher value as recommended above, this should be changed to 1. The increment is already sized correctly, so there is no need for each allocation to be larger than the last.
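The interaction between the increment and the growth factor is easier to see with numbers. The following is a simplified model based only on the descriptions above, not REE's actual allocator code: the first extension adds the increment, and each later extension adds the previous amount multiplied by the growth factor. The function name is made up for this sketch.

```ruby
# Simplified model of heap growth (an illustration, NOT REE's allocator):
# the first extension adds `increment` slots, and every later extension
# adds the previous amount multiplied by `growth_factor`.
def heap_sizes(min_slots, increment, growth_factor, extensions)
  sizes = [min_slots]
  step = increment
  extensions.times do
    sizes << sizes.last + step.round
    step *= growth_factor
  end
  sizes
end

default = heap_sizes(10_000, 10_000, 1.8, 5)
tuned   = heap_sizes(800_000, 250_000, 1, 5)

puts "default: #{default.join(', ')}"
puts "tuned:   #{tuned.join(', ')}"
```

In this model the default settings need several extensions before the heap is large enough for a typical Rails application, while the tuned settings start large and grow by a constant 250000 slots each time.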

RUBY_GC_MALLOC_LIMIT

This option is the amount of C data structures that can be allocated before a garbage collection is triggered. This value is really important because the garbage collector in Ruby can be very slow, so minimizing how often it runs can significantly increase performance. The default is 8000000 but recommended values range from 30000000 to 80000000, which allows many more structures to be created before a collection is triggered. This means more memory consumption in exchange for less frequent sweeping, which can translate to significant performance gains.
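As a back-of-the-envelope illustration, the number of collections forced by this limit scales inversely with its value. The sketch below treats the limit as a byte count, assumes a made-up workload that allocates 400000000 bytes, and ignores collections triggered by running out of heap slots as well as any runtime adjustment of the limit, so treat it as a rough model only.

```ruby
# Rough model: one collection each time `malloc_limit` bytes have been
# allocated. Ignores heap-slot-triggered collections and any runtime
# adjustment of the limit.
def malloc_triggered_collections(bytes_allocated, malloc_limit)
  bytes_allocated / malloc_limit
end

workload = 400_000_000 # hypothetical bytes allocated by some batch of work

[8_000_000, 30_000_000, 79_000_000].each do |limit|
  runs = malloc_triggered_collections(workload, limit)
  puts "limit #{limit}: about #{runs} collections"
end
```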

RUBY_HEAP_FREE_MIN

This option is the number of heap slots that should be free after a garbage collection has run. If fewer slots than this are free after the sweep, Ruby must allocate more heap space. The default value is 4096 but a much higher value can be used in conjunction with the other settings above. A value closer to 100000 is more suitable: allocations happen less frequently, but each one adds a much larger number of slots, as defined above. This translates to higher performance in most cases.

Bringing it all together

These changes significantly impact the way Ruby manages memory and performs garbage collection. Normally, Ruby starts with far too little memory for a production web application. With these settings, Ruby starts with enough heap to hold the application from the initial launch. Memory then grows linearly as more is required, rather than with the default exponential growth. Garbage collection also happens far less frequently while your application runs. The downside is higher peak memory usage but the upside is significant performance gains.

In our case, we ended up using settings very similar to those used by GitHub and Twitter. We are going to show those settings below, but feel free to research and tweak based on your own analysis of an individual application’s needs.

Let’s create a wrapper for the tweaked ruby settings and save it to /usr/local/bin/tuned_ruby:

#!/bin/bash
export RUBY_HEAP_MIN_SLOTS=800000
export RUBY_HEAP_FREE_MIN=100000
export RUBY_HEAP_SLOTS_INCREMENT=300000
export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
export RUBY_GC_MALLOC_LIMIT=79000000
exec "/usr/local/bin/ruby" "$@"

Then let’s set the appropriate permissions:

sudo chmod a+x /usr/local/bin/tuned_ruby
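One way to confirm that the wrapper really exports the settings is to run a small script through it and print what Ruby sees in its environment. This is a minimal sketch; the file name check_gc_env.rb and the helper name are hypothetical.

```ruby
# check_gc_env.rb (hypothetical file name)
# Prints the GC tuning variables visible to the running Ruby process.
GC_TUNING_VARS = %w[
  RUBY_HEAP_MIN_SLOTS
  RUBY_HEAP_FREE_MIN
  RUBY_HEAP_SLOTS_INCREMENT
  RUBY_HEAP_SLOTS_GROWTH_FACTOR
  RUBY_GC_MALLOC_LIMIT
]

# Returns one "NAME=value" line per variable, marking any that are missing.
def gc_tuning_report(env = ENV)
  GC_TUNING_VARS.map do |name|
    "#{name}=#{env[name] || '(not set)'}"
  end
end

puts gc_tuning_report
```

Running it as /usr/local/bin/tuned_ruby check_gc_env.rb should list the five values from the wrapper, while running it with the plain ruby binary should show them as not set unless they are exported elsewhere.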

Once the tuned wrapper is executable, change your configuration so that the web server uses it as its Ruby interpreter. For instance, in Passenger 3 for Nginx (our deployment tool of choice), the change would look like:

# /etc/nginx/conf/includes/passenger.conf
# ...
passenger_ruby /usr/local/bin/tuned_ruby;
# ...

Then restart your web server for the changes to take effect. In our case:

sudo god restart nginx

Now your application will be using the new tuned REE runtime and will likely spend less time in garbage collection and respond faster for users. Miso uses a variety of profiling tools to gauge the performance of our application (post forthcoming). For now, it is important to at least mention that profiling to measure the actual impact is essential after switching to REE and changing these parameters.
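As a starting point for that kind of measurement, the sketch below times a block of work and counts the garbage collections that happen while it runs. It assumes REE's GC statistics methods (GC.enable_stats, GC.clear_stats and GC.collections) are available in your build and falls back to GC.count on Rubies that provide it instead. The helper name and the workload are made up for illustration.

```ruby
require 'benchmark'

# Runs the block and reports wall-clock time plus the number of GC runs.
# ASSUMPTION: REE provides GC.enable_stats, GC.clear_stats and
# GC.collections; Rubies without them fall back to GC.count (1.9+).
def measure_gc
  use_ree_stats = GC.respond_to?(:enable_stats) && GC.respond_to?(:collections)
  if use_ree_stats
    GC.enable_stats
    GC.clear_stats
  end
  runs_before = use_ree_stats ? 0 : GC.count
  seconds = Benchmark.realtime { yield }
  runs_after = use_ree_stats ? GC.collections : GC.count
  { :seconds => seconds, :gc_runs => runs_after - runs_before }
end

# Hypothetical workload: allocate many short-lived strings.
result = measure_gc { 200_000.times { |i| "request-#{i}" } }
printf("%.3fs, %d GC runs\n", result[:seconds], result[:gc_runs])
```

Running the same workload with the plain ruby binary and with the tuned wrapper gives a first, crude comparison before reaching for a full profiler.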

This is only the first blog post of this series. We will be releasing another one soon about database optimization (for Postgres or MySQL) and how to tune your database for better performance as you scale.

6 Comments

  1. Hrmm says:

    Is that it?

    A little more information about what those numbers affect and a rough guide would make this blog post a lot more useful.

    As it is you have presented a black box “Do this and it might make things better. If you want to know more, go and learn it yourself”.

    Kind of defeats the point of these kind of posts I think.

    Is the Postgresql just going to be another dump of numbers without explanation? I expect posts like this from companies to be along the lines of Anchor’s wiki site.

    full of useful information like what the numbers affect and what you should set them to depending on some scenarios.

    1. nesquena says:

      You are right we can definitely do a better job explaining the full details of each value. In this case, we wrote up a quick guide to our experience so far. It could certainly be improved a great deal. In the future, we will try to go into additional depth when we do these types of guides. Thanks again for the feedback.

  2. [...] Adventures in Scaling, Part 1: Using REE (tags: ruby rails tuning passenger sysadmin) [...]

  3. Ooooooo says:

    Just came back for a re-read. Hugely more informative and much more useful I think. Thank you for taking the time to expand it.

  4. UnConundrum says:

    What happened to the followups you promised?

    1. nesquena says:

      They will be coming soon as time permits, we haven’t forgotten.

  5. [...] The good news: it’s easy to quickly (< 10 min) see how much your app is impacted from garbage collection. You’re likely to improve your performance by 20-30% just by tuning your garbage collection parameters. [...]

  6. [...] months ago, I wrote a post about REE Garbage Collection Tuning with the intent of kicking off a series dedicated to different approaches and methods applied at [...]

  7. John Bachir says:

    This is the best article online about REE tuning, thanks!

    One thing that would be useful to point out — which settings strictly improve startup performance, and which affect runtime performance? (if it is in fact that case that some only affect startup performance, to my understanding)
