Friday, June 10, 2011

Speeding Up Rails Load Time

The rake and mongrel start-up times increased quite a bit in my production environment after upgrading a large app from Rails 2.3.4 to Rails 3.0.7. I've read a lot about Rails 3 starting up slower in general, but what I was seeing seemed like more than that.

In production, my gems and code are stored on a shared disk mounted via NFS so that the exact same code can be shared between multiple machines. I suspected the start-up problem was related to NFS.

I did a quick test with rake db:migrate to see how many file stats it performs. In Rails 2, rake did 280k. In Rails 3, rake did 480k. Wow! No wonder NFS becomes overwhelmed, especially when several of these run at the same time. So I started to look at the cause behind this huge increase.

Almost all of these file stats are caused by Ruby scanning its load path trying to locate files for require calls. The load path in my application has 100 directories in it, which means require 'foo.rb' will on average look at 50 directories.

In general, you don't see require 'foo.rb' but rather require 'foo' (without the extension). This makes the problem worse, as Ruby not only needs to traverse the directories in the load path but also needs to guess the file extension in each directory. This is system specific, but at least on Linux, it tries .rb and .so. So in fact, require 'foo' will on average look at 100 files.
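To make the arithmetic concrete, here is a rough sketch that lists the files a require could end up probing. The helper name candidate_paths and the /gems/libN directories are made up for illustration, and the model assumes that each directory is tried with each extension as described above. It is not Ruby's actual lookup code.

```ruby
# Hypothetical helper: lists the files a `require` might probe.
# This only models the search described in the post; it is not
# Ruby's real implementation.
def candidate_paths(feature, load_path, extensions = %w[.rb .so])
  # With an explicit extension only one name is tried per directory;
  # without one, every known extension is tried per directory.
  names = if File.extname(feature).empty?
            extensions.map { |ext| feature + ext }
          else
            [feature]
          end
  load_path.flat_map { |dir| names.map { |name| File.join(dir, name) } }
end

# A made-up load path with 100 directories, as in the application above.
load_path = (1..100).map { |i| "/gems/lib#{i}" }

puts candidate_paths("foo", load_path).size     # worst case: 200 probes
puts candidate_paths("foo.rb", load_path).size  # worst case: 100 probes
```

On average a file is found about halfway through, which gives the 100 and 50 figures quoted above.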

I don't know about you, but 99.9% of my files are .rb, so looking for .so in the vast majority of cases is completely useless. So I overrode require to force Ruby to first scan the load path looking for .rb files and, if nothing is found, fall back to the original behavior.

module RequirePatch
  def require(file)
    unless file.to_s =~ /\.[a-z]+$/
      begin
        super("#{file}.rb")
      rescue LoadError
        super
      end
    else
      super
    end
  end
end

This drops the number of file stats to 160k! Note, this is great but has nothing to do with Rails 2 vs Rails 3, which is really what I wanted to answer.

I realized that in the Rails 2 version of the application, rails was in vendor/rails, and in the Rails 3 version of the application, rails is in the Gemfile. This difference causes the paths belonging to rails (active support, active record, etc.) to end up in different locations in the load path. Here is why:

The load path is initially set up by Bundler based on the Gemfile. It does so by building a dependency graph of all the gems and placing the dependents of a gem before that gem in the load path (presumably this is to allow dependents to override functionality by naming files in the same way as the gem they depend on ... although this seems like a terrible idea). This means gems that have lots of dependents tend to be towards the end of the load path and gems without any dependents tend to be towards the beginning of the load path. Since the rails gems are popular and have lots of dependents, Bundler puts them towards the end of the load path.
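The ordering described above can be modeled with a topological sort. The sketch below uses Ruby's standard TSort module on a made-up dependency graph. The GemGraph class, the load_path_order method, and the my_plugin gem are hypothetical, and this is only an illustration of the ordering, not Bundler's actual code.

```ruby
require "tsort"

# Hypothetical model of a resolved Gemfile: gem name => gems it depends on.
class GemGraph
  include TSort

  def initialize(dependencies)
    @dependencies = dependencies
  end

  # TSort hook: enumerate every gem in the graph.
  def tsort_each_node(&block)
    @dependencies.each_key(&block)
  end

  # TSort hook: enumerate the gems that `name` depends on.
  def tsort_each_child(name, &block)
    @dependencies.fetch(name, []).each(&block)
  end

  # tsort lists dependencies before the gems that need them, so
  # reversing it puts every dependent ahead of the gem it depends on.
  def load_path_order
    tsort.reverse
  end
end

graph = GemGraph.new(
  "my_plugin"     => ["activerecord"],   # made-up gem that depends on rails
  "activerecord"  => ["activesupport"],
  "activesupport" => []
)

p graph.load_path_order
```

In this toy graph the gem that everything depends on, activesupport, comes out last, which matches the observation that the rails gems drift towards the end of the load path.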

After Bundler is done with the load path, Rails starts modifying the load path (usually by prepending) during its initialization process. One of the directories that is prepended, if it exists, is vendor/rails.

So using vendor/rails causes rails to be towards the front of the load path and using Gemfile causes rails to be towards the end of the load path. Since so many files come from rails, this causes the large difference in file stats.
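One way to check this in your own application is to look at where the rails directories sit in the load path. The helper below is a hypothetical diagnostic (the name load_path_positions and the sample paths are made up); it takes the load path as an argument so it can be tried on a sample array, and defaults to $LOAD_PATH.

```ruby
# Hypothetical diagnostic: for each gem name, report the index of the
# first load-path entry that mentions it (nil when nothing matches).
def load_path_positions(gem_names, load_path = $LOAD_PATH)
  gem_names.each_with_object({}) do |name, positions|
    positions[name] = load_path.index { |path| path.include?(name) }
  end
end

# Made-up load path for illustration only.
sample = [
  "/app/lib",
  "/gems/my_plugin-1.0/lib",
  "/gems/activesupport-3.0.7/lib"
]

p load_path_positions(%w[activesupport activerecord], sample)
```

The larger the index, the more directories Ruby has to stat before it reaches that gem's files.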

Knowing this, I wrote some additional code to move the rails gem to the front of the load path. Here is how it looks,

rails_gems = [ 'activeresource', 'actionpack', 'activerecord', 'activemodel', 'activesupport' ]
# compact skips any gem that is not in the load path, so nil is never added
rails_paths = rails_gems.map { |rails_gem| $LOAD_PATH.detect { |path| path.include? rails_gem } }.compact
rails_paths.each { |path| $LOAD_PATH.unshift(path) } # put in front

Putting the two solutions together, the number of file stats dropped to 120k. No more problems in production related to NFS, and a few seconds got shaved off the start time.

Putting it all together, create a file config/speed_up_rails_load_time.rb,

module SpeedUpRailsLoadTime
  class << self
    def yes!
      move_rails_gem_to_front_of_load_path
      try_to_require_rb_files_first
    end

    def move_rails_gem_to_front_of_load_path
      rails_gems = [ 'activeresource', 'actionpack', 'activerecord', 'activemodel', 'activesupport' ]
      # compact skips any gem that is not in the load path, so nil is never added
      rails_paths = rails_gems.map { |rails_gem| $LOAD_PATH.detect { |path| path.include? rails_gem } }.compact
      rails_paths.each { |path| $LOAD_PATH.unshift(path) } # put in front
    end

    def try_to_require_rb_files_first
      Object.send(:include, ::SpeedUpRailsLoadTime::RequirePatch)
    end
  end

  module RequirePatch
    def require(file)
      unless file.to_s =~ /\.[a-z]+$/
        begin
          super("#{file}.rb")
        rescue LoadError
          super
        end
      else
        super
      end
    end
  end
end

SpeedUpRailsLoadTime.yes!

And require it in config/application.rb before require 'rails/all' like so,

require File.expand_path('../boot', __FILE__)
require File.expand_path('../speed_up_rails_load_time', __FILE__)
require 'rails/all'

2 Comments:

Anonymous said...

Hi,

This looks like great advice.

Where is the best place to put this to get the most effect from the rails load process?

(We are running 2.3.11 w/ >300 models - looking at the migration to 3.0/3.1)

Thanks,
Keenan

June 15, 2011 at 7:21 AM  
Paul Kmiec said...

Wow ... 300 models. I thought we were large with 150 models :). We should talk. I'm interested in how you deal with memory and GC issues. Which Ruby version are you using?

I added a full example and the right place to hook it into Rails 3. In Rails 2, however, you'll probably want to insert it right after Bundler.setup. Let me know if that's clear.

June 16, 2011 at 11:13 AM  
