Tuesday, May 17, 2011

Rails3 leaks (more) memory?

I'm involved in upgrading a rather large Rails app from Rails 2.3.4 to Rails 3.0.7. I say rather large because this app has about 150 active record models and another 200 classes that use the active model functionality. It also uses 90ish gems and plugins. The app is deployed with mongrels and REE 1.8.7 (2010.02 .. 32bit) with GC tuned.

With Rails2, a mongrel initially uses about 210MB of memory and eventually grows to high 200's as requests force additional parts of the app to be loaded. Monit is set to restart anything that goes beyond 300MB, which happens periodically but at a manageable rate (i.e. once per hour).

With Rails3, a mongrel initially uses about 190MB of memory and is restarted by Monit after few requests. Turning off Monit, the mongrel process quickly grows to 400s with initial few requests to few different controllers and then to high 500s without even hitting the more obscure parts of the app.

Something is wrong!

Additional experiments revealed that it's not a leak in the traditional sense. Meaning hitting the same url over and over does not grow the memory. Only the first and sometimes also the second requests cause memory to jump. This points to the memory growth being related to Rails' lazy loading. Rails3 (much more so than Rails2) defers lots of initialization until "later" to help achieve faster start up time. That "later" is usually when a request is being handled at which point Rails initializes whatever is needed to complete the request. For example, even though the User model is required during start time, the columns are not obtained from the db and the attribute methods are not created until they are needed.

I configured the app such that I can run in in production mode on my MacBook. I added memprof to the Gemfile. I created the following dead simple selenium script,
login # /sessions/new, fill out form, redirected to /dashboards open_and_wait("/owners/new") open_and_wait("/owners/new") open_and_wait("/owners/new") open_and_wait("/owners/new") open_and_wait("/final_gc") This scripts hits the SessionsController, DashboardsController, and OwnersController. The final_gc is a middleware right after Rails::Lock which just GCs and returns 200. Afterwards, the heap is dumped via kill -URG <pid> as provided by memprof gem.

I was surprised to find instances of the controllers on the heap (and therefore everything that controllers reference: env, templates, etc). Let's focus on the OwnersController. That controller was hit 4 times and two instances were hanging around. In fact, it was an instance that handled the first request and another instance that handled the second request. The third and fourth requests did not leave controller instances. This behavior is very consistent.

At first, I was convinced the complexity of the app was causing the problem (i.e. some monkey patch or custom functionality). But the more I poked around the less it seemed related to the app itself.

So, I created a brand new Rails 3.0.7 application with two models: Post and Author. I've modified the selenium script to do the following,
open_and_wait("/authors/new") open_and_wait("/authors/new") open_and_wait("/authors/new") open_and_wait("/authors/new") open_and_wait("/authors/new") open_and_wait("/final_gc") where /authors/new is just,
def new @author = Author.new end Amazingly, this simple app has the same problem in that an instance of AuthorsController, which handled the first request, is hanging around (metal.rb line 178 is where controller instances are created),
{"_id":"0x80d2ab0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_controller/metal.rb", "line":178,"time":1305667260896820,"type":"object", "class":"0x1d3f6b0", "class_name":"AuthorsController", "ivars":{"@_status":200,"@_response_body":"0x809d928", "@_env":"0x80f69d8", "@_view_runtime":"0x8091830", "@_request":"0x80d29e8", "@lookup_context":"0x80bfaa0", "@_headers":"0x80d2998", "@author":"0x80ae3e0", "@action_has_layout":true,"@_action_name":"0x80d2e70", "@_config":"0x80aea98", "@view_context_class":null,"@_db_runtime":0,"@_response":"0x80d2920"},"code":2} so following my understanding of GC, the controller instance is hanging around because something in the global scope is directly or indirectly referencing this instance. So, I ran a script to produce a everything that directly and indirectly reference this instance,
{"_id":"0x80afbc8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/callbacks.rb", "line":93, ...} {"_id":"0x80d2920", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_controller/metal/rack_delegation.rb", ...} {"_id":"0x80d29e8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/http/request.rb", "line":42, ...} {"_id":"0x80d2ab0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_controller/metal.rb", "line":178, ...} {"_id":"0x80f59e8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/middleware/remote_ip.rb", "line":47, ...} {"_id":"0x80f69d8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/rack-1.2.2/lib/rack/handler/mongrel.rb", "line":46, ...} {"_id":"0x82964c8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/http/response.rb", "line":149, ...} It is a limited amount of stuff and nothing stands out as to why GC would no collect this graph (just to be sure, I uploaded this heap dump to memprof.com and confirmed this graph is correct). I noticed that the heap data contains the thread responsible for creating that thing on the heap. Using this, I can find everything that was created (and is still around) during the request that caused the controller instance to leak,
{"_id":"0x809da40", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template.rb", "line":105,"time":1305667260916474,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a09c0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template.rb", "line":105,"time":1305667260916068,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a2e28", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":10,"time":1305667260915127,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a2f18", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":10,"time":1305667260915111,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a4958", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":10,"time":1305667260912128,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a5cb8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/core_ext/class/attribute.rb", "line":73,"time":1305667260911863,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a6e38", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/core_ext/class/attribute.rb", "line":73,"time":1305667260911852,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a70b8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/routing/route_set.rb", "line":283,"time":1305667260911649,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a7130", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/routing/route_set.rb", "line":283,"time":1305667260911638,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a9908", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":10,"time":1305667260905821,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80a99f8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":10,"time":1305667260905808,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80aa8d0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260905562,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80aa8f8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260905553,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80ab6e0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260905452,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80ab708", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260905443,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80ad850", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260904746,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80ad878", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activemodel-3.0.7/lib/active_model/attribute_methods.rb", "line":110,"time":1305667260904737,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80ae980", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/ordered_hash.rb", "line":37,"time":1305667260904385,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80aea48", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/ordered_hash.rb", "line":37,"time":1305667260904366,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80afbc8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/activesupport-3.0.7/lib/active_support/callbacks.rb", "line":93,"time":1305667260904191,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80b0e38", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/http/parameter_filter.rb", "line":27,"time":1305667260903865,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c1058", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/lookup_context.rb", "line":25,"time":1305667260899722,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c1080", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/lookup_context.rb", "line":25,"time":1305667260899715,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c1c88", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/lookup_context.rb", "line":25,"time":1305667260899620,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c1cb0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/lookup_context.rb", "line":25,"time":1305667260899614,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c3da8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":9,"time":1305667260899365,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x80c3ec0", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_view/template/resolver.rb", "line":9,"time":1305667260899331,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} {"_id":"0x82964c8", "file":"/Users/pkmiec/.rvm/gems/ree-1.8.7-2010.02/gems/actionpack-3.0.7/lib/action_dispatch/http/response.rb", "line":149,"time":1305667261167737,"type":"data", "class":"0x22f31b0", "class_name":"Proc", "thread":"0x80f7310" ...} As expected, you see some of the Rails lazy loading stuff (i.e. attribute_methods for active model and resolver / template stuff for partial). I began thinking that lazy loading is related to the memory leak because that's the only difference between the first request and consequent requests. This test this theory, I added the following code to production.rb which pre-loads model and controller bits,
class Paul def self.init_controllers(controllers) controllers.each do |c| puts "init_controller #{c.name}" begin c.view_context_class c.config rescue => ignored end init_controllers(c.subclasses) end end def self.init_models(models) models.each do |m| puts "init_model #{m.name}" begin m.define_attribute_methods m.new({}) rescue => ignored end init_models(m.subclasses) end end end config.after_initialize do Paul.init_models(ActiveRecord::Base.subclasses) Paul.init_controllers(ApplicationController.subclasses) end Eureka ... no more controller instances hanging around! This same trick avoids controller instances / memory leak on the rather large app which I started with.

But the question of WHY still remains. I believe it has to do with some of the craziness of lazy-initialization (i.e. eigen / singleton classes created by attribute_methods, scope capturing in resolver.rb with Hash.new { }, etc). But the heap dump does not show it ... which leads to be believe the problem is somehow fundamental to ruby vm.

I've tried both mysql and sqlite3. I've tried mongrel and webrick. I've tried ruby-1.8.7-p334, ree-1.8.7-2010.02, and ree-1.8.7-2011.03 (memprof does not work with 1.9 yet). All show the same behavior.

Perhaps someone with more memory / GC /gdb experience can shed a light ...

2 Comments:

Blogger @yinhm said...

I'v hit this problem when I upgraded to rails3 too, memory was raise to 500M quickly, I remember it was raised to 1.5G once, after couple of hours digging I ends up migrating to ruby 1.9, problem solved.

May 24, 2011 at 11:41 PM  
Blogger Paul Kmiec said...

I wouldn't be surprised if ruby 1.9 does not have this problem as it's completely different code. When I get a change, I'll try to verify it though.

May 25, 2011 at 10:00 PM  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home