Data Processing

Reducing ActiveRecord memory consumption using Apache Arrow

The pluck method provided by ActiveRecord retrieves one or a few field values from a database as an array or an array of arrays. Compared with finder methods, pluck can considerably reduce memory consumption because it does not instantiate model objects.
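The difference between the two access styles can be sketched in plain Ruby without a database. This is only an illustration under stated assumptions: FakeUser, COLUMNS, USER_ROWS, finder_style, pluck_style, and allocations are hypothetical names invented for this sketch, and a real ActiveRecord model holds considerably more state per instance than the stand-in class here.

```ruby
# Hypothetical stand-in for a model class (not ActiveRecord itself).
class FakeUser
  def initialize(attributes)
    @attributes = attributes
  end

  def name
    @attributes["name"]
  end
end

COLUMNS = %w[id name].freeze

# Raw rows, shaped like what a database driver might hand back.
USER_ROWS = Array.new(1_000) { |i| [i, "user#{i}"] }.freeze

# Finder-style access: every row is wrapped in an attribute hash and an object.
def finder_style(rows)
  rows.map { |row| FakeUser.new(COLUMNS.zip(row).to_h) }
end

# Pluck-style access: only the requested column value is taken from each row.
def pluck_style(rows, column)
  index = COLUMNS.index(column)
  rows.map { |row| row[index] }
end

# Number of Ruby objects allocated while the given block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

finder_allocs = allocations { finder_style(USER_ROWS) }
pluck_allocs  = allocations { pluck_style(USER_ROWS, "name") }
puts "finder-style allocations: #{finder_allocs}"
puts "pluck-style allocations:  #{pluck_allocs}"
```

The finder-style path allocates several objects per row, while the pluck-style path allocates little beyond the result array, which is the effect the paragraph above describes.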

In this talk, I would like to introduce my new approach to reducing the memory consumption of ActiveRecord. The approach employs Apache Arrow as the internal data representation of an ActiveRecord::Result object. It achieves a remarkable reduction in the memory consumption of the pluck method: it is 2-12x more memory-efficient than the original implementation.
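Why a columnar representation can be smaller than an array of row arrays can be illustrated with a rough plain-Ruby sketch. It does not use Apache Arrow or its Ruby bindings, and it is not the implementation presented in the talk. Packed strings stand in for contiguous column buffers, and MEASUREMENT_ROWS, to_columns, column_value, row_oriented_bytes, and columnar_bytes are hypothetical names used only here.

```ruby
require "objspace"

# Row-oriented result set: one small Ruby Array per row.
MEASUREMENT_ROWS = Array.new(10_000) { |i| [i, i * 2] }.freeze

# Column-oriented copy: each column becomes one contiguous buffer of
# native-endian signed 64-bit integers (8 bytes per value).
def to_columns(rows)
  column_count = rows.first.size
  Array.new(column_count) do |c|
    rows.map { |row| row[c] }.pack("q*")
  end
end

# Read a single value back out of a packed column buffer.
def column_value(buffer, index)
  buffer.byteslice(index * 8, 8).unpack1("q")
end

# Approximate bytes held by the outer array plus every per-row array.
def row_oriented_bytes(rows)
  ObjectSpace.memsize_of(rows) + rows.sum { |row| ObjectSpace.memsize_of(row) }
end

# Approximate bytes held by the packed column buffers.
def columnar_bytes(buffers)
  buffers.sum { |buffer| ObjectSpace.memsize_of(buffer) }
end

columns = to_columns(MEASUREMENT_ROWS)
puts "row-oriented bytes: #{row_oriented_bytes(MEASUREMENT_ROWS)}"
puts "columnar bytes:     #{columnar_bytes(columns)}"
puts "row 42, column 1:   #{column_value(columns[1], 42)}"
```

ObjectSpace.memsize_of is only a hint and its numbers vary between Ruby versions, so the printed sizes should be read as a rough comparison rather than a benchmark.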

RubyKaigi 2019 https://rubykaigi.org/2019/presentations/mrkn.html#apr20
