Benchmarking JSON Generation in Ruby

At theScore, we have a big JSON API with hundreds of endpoints that expose sports data. Since sports data is very rich, the JSON representation of most of the resources on our API tends to be complex. As a result, a lot of time is spent generating JSON in our Rails application.

At the time we wrote the API, RABL was a great choice. When ActiveModel Serializers came out with an object-oriented approach to generating JSON, we were excited. We tried it out, and ended up with something that is more maintainable than the RABL counterparts.

But migrating the entire JSON generation code from RABL to ActiveModel Serializers (AMS) would require massive effort. Since RABL is still relatively easy to use and maintain, the extra ease of use and maintainability of AMS is not a good enough reason on its own to do the migration. On the other hand, if we can significantly increase the performance of JSON generation, then the effort to do this migration is justified.

Having said that, we had been using plain Ruby presenters to generate JSON with great success in some of our smaller API projects. If AMS turns out to be much slower than presenters, then we might want to explore presenters a little more as a possible approach for our main API.

So that's enough background as to why we're doing this benchmark.

The Setup

For the benchmark, we extracted real RABL code from our API that generates JSON for basketball teams and events, and simplified it a bit. Then, we implemented AMS serializers and plain Ruby presenters to generate the same JSON. We have 3 different cases that we want to benchmark.

Case 1: Ultra Simple

The simplest case in which we generate JSON for a single team object:

{
  "abbreviation": "TOR",
  "full_name": "Toronto Raptors",
  "location": "Toronto"
}
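
For readers who have not seen the presenter style, here is a minimal sketch of a plain Ruby presenter that produces the JSON above. It is an illustration only, not the code from our benchmark: the `Team` struct and `TeamPresenter` class are hypothetical stand-ins, and the standard library's `json` is used for encoding.

```ruby
require "json"

# Hypothetical stand-in for the real team model.
Team = Struct.new(:abbreviation, :full_name, :location)

# Hypothetical presenter: builds a plain Hash and lets the JSON library encode it.
class TeamPresenter
  def initialize(team)
    @team = team
  end

  # The Hash representation; only the attributes we want to expose.
  def as_json(*)
    {
      abbreviation: @team.abbreviation,
      full_name: @team.full_name,
      location: @team.location
    }
  end

  def to_json(*args)
    as_json.to_json(*args)
  end
end

team = Team.new("TOR", "Toronto Raptors", "Toronto")
puts TeamPresenter.new(team).to_json
```

Because the presenter is just an object that returns a Hash, there is no template lookup or DSL evaluation involved, which is part of why this style tends to be cheap.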

Case 2: Simple

A slightly more complex case, in which we embed team objects into the main event object:

{
  "game_date": "2014-05-21",
  "game_type": "Regular Season",
  "status": "Final",
  "away_team": {
    "abbreviation": "MIA",
    "full_name": "Miami Heat",
    "location": "Miami",
    "medium_name": null,
    "short_name": "Heat"
  },
  "home_team": {
    "abbreviation": "TOR",
    "full_name": "Toronto Raptors",
    "location": "Toronto",
    "medium_name": "Toronto",
    "short_name": "Raptors"
  }
}
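
Embedding in the presenter style amounts to one presenter calling another. The sketch below shows the idea for this case; the structs and class names are hypothetical and chosen only to mirror the JSON above, so treat it as one possible shape rather than the benchmarked implementation.

```ruby
require "json"

# Hypothetical stand-ins for the real models.
Team  = Struct.new(:abbreviation, :full_name, :location, :medium_name, :short_name)
Event = Struct.new(:game_date, :game_type, :status, :away_team, :home_team)

class TeamPresenter
  def initialize(team)
    @team = team
  end

  def as_json(*)
    {
      abbreviation: @team.abbreviation,
      full_name: @team.full_name,
      location: @team.location,
      medium_name: @team.medium_name,
      short_name: @team.short_name
    }
  end
end

class EventPresenter
  def initialize(event)
    @event = event
  end

  # Embedded teams are rendered by delegating to TeamPresenter.
  def as_json(*)
    {
      game_date: @event.game_date,
      game_type: @event.game_type,
      status: @event.status,
      away_team: TeamPresenter.new(@event.away_team).as_json,
      home_team: TeamPresenter.new(@event.home_team).as_json
    }
  end
end

heat    = Team.new("MIA", "Miami Heat", "Miami", nil, "Heat")
raptors = Team.new("TOR", "Toronto Raptors", "Toronto", "Toronto", "Raptors")
event   = Event.new("2014-05-21", "Regular Season", "Final", heat, raptors)

json = JSON.generate(EventPresenter.new(event).as_json)
puts json
```

Note that a `nil` attribute such as the away team's `medium_name` is encoded as JSON `null`, matching the sample output.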

Case 3: Complex

The most complex case, in which we not only embed team objects and a box score object into the main event object, but the box score in turn also embeds a last play object. There are also more attributes on the event object.

{
  "game_date": "2014-05-21",
  "game_type": "Regular Season",
  "status": "Final",
  "share_url": "http://thesco.re/123",
  "sport_name": "basketball",
  "away_ranking": 10,
  "away_region": "Pac-12",
  "home_ranking": 15,
  "home_region": "Top 25",
  "important": true,
  "location": "Washington, DC",
  "away_team": {
    "abbreviation": "MIA",
    "full_name": "Miami Heat",
    "location": "Miami",
    "medium_name": null,
    "short_name": "Heat"
  },
  "home_team": {
    "abbreviation": "TOR",
    "full_name": "Toronto Raptors",
    "location": "Toronto",
    "medium_name": "Toronto",
    "short_name": "Raptors"
  },
  "box_score": {
    "has_statistics": true,
    "progress": "11:23 2nd",
    "attendance": "21,307",
    "referees": "Thuva, Nate, Roel",
    "last_play": {
      "points_type": "Field Goal",
      "player_fouls": 10,
      "player_score": 15,
      "record_type": "Postseason",
      "seconds": 100
    }
  }
}

We will also benchmark generating JSON for a collection of 100 objects for each case above.

Further, the benchmark is run under Ruby 2.1.1, with the latest versions of RABL and AMS (0.9.3 and 0.8.1 respectively) at the time of writing. The machine on which the benchmark is run is irrelevant because we're only interested in the relative performance of RABL, AMS, and presenters. You can check out the setup here.

We will be running the benchmark twice. You will see why shortly.
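
To give a feel for the shape of such a benchmark, here is a rough sketch using Ruby's standard `Benchmark` module. It only exercises a presenter-style method, since RABL and AMS need their gems and templates. The `present` helper, the struct, and the reduced iteration count are assumptions made for this illustration, not the actual setup.

```ruby
require "benchmark"
require "json"

# Hypothetical stand-in for the real team model.
Team = Struct.new(:abbreviation, :full_name, :location)

# Hypothetical presenter-style helper; a real run would have one
# report per approach (RABL, AMS, presenters) for each case.
def present(team)
  { abbreviation: team.abbreviation, full_name: team.full_name, location: team.location }
end

# Assumption: lowered from the 10,000 iterations used for the published
# numbers, to keep this sketch quick to run.
ITERATIONS = 1_000

team  = Team.new("TOR", "Toronto Raptors", "Toronto")
teams = Array.new(100) { team } # the "collection of 100 objects" variant

# Benchmark.bm prints user/system/total/real columns, the same layout
# as the result tables in this post. The argument is the label width.
results = Benchmark.bm(40) do |x|
  x.report("Presenters Ultra Simple") do
    ITERATIONS.times { JSON.generate(present(team)) }
  end
  x.report("Presenters Ultra Simple: Collection") do
    ITERATIONS.times { JSON.generate(teams.map { |t| present(t) }) }
  end
end
```

Each report measures the whole loop, so the printed figures are totals over all iterations rather than per-call times.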

First Run

As you can see below, AMS is roughly 3-4X slower than presenters. Since we're not going for pure speed, it's still worth using AMS, as it provides many features that we would likely have to implement ourselves for presenters. So we're going to stop comparing them now.

On the other hand, RABL is significantly slower than AMS: about 20-25X slower. This difference remains roughly the same when we deal with a collection of objects.

The times you see are the totals for 10,000 iterations. Refer to "The Setup" section above to understand the different cases: Ultra Simple, Simple, and Complex.

                                               user     system      total        real
RABL Ultra Simple                          1.610000   0.420000   2.030000 (  2.029277)
AMS Ultra Simple                           0.080000   0.000000   0.080000 (  0.083180)
Presenters Ultra Simple                    0.030000   0.000000   0.030000 (  0.029341)
--------------------------------------------------------------------------------------
RABL Simple                                9.080000   2.250000  11.330000 ( 11.342106)
AMS Simple                                 0.510000   0.000000   0.510000 (  0.505133)
Presenters Simple                          0.140000   0.000000   0.140000 (  0.140356)
--------------------------------------------------------------------------------------
RABL Complex                              19.030000   4.490000  23.520000 ( 23.529937)
AMS Complex                                1.010000   0.000000   1.010000 (  1.007729)
Presenters Complex                         0.320000   0.010000   0.330000 (  0.320893)


                                               user     system      total        real
RABL Ultra Simple: Collection              1.400000   0.420000   1.820000 (  1.829971)
AMS Ultra Simple: Collection               0.060000   0.000000   0.060000 (  0.062879)
Presenters Ultra Simple: Collection        0.020000   0.000000   0.020000 (  0.017285)
--------------------------------------------------------------------------------------
RABL Simple: Collection                    8.420000   2.180000  10.600000 ( 10.590213)
AMS Simple: Collection                     0.440000   0.000000   0.440000 (  0.443547)
Presenters Simple: Collection              0.110000   0.000000   0.110000 (  0.105298)
--------------------------------------------------------------------------------------
RABL Complex: Collection                  18.370000   4.370000  22.740000 ( 22.744374)
AMS Complex: Collection                    0.940000   0.000000   0.940000 (  0.944308)
Presenters Complex: Collection             0.270000   0.000000   0.270000 (  0.272239)
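
To make the relative numbers easy to check, the sketch below recomputes the slowdown ratios from the "real" column of the single-object table. The timings are copied from the results above; the hash layout and variable names are just for this illustration.

```ruby
# Wall-clock seconds ("real" column) from the first run, single-object cases.
FIRST_RUN = {
  "Ultra Simple" => { rabl: 2.029277,  ams: 0.083180, presenters: 0.029341 },
  "Simple"       => { rabl: 11.342106, ams: 0.505133, presenters: 0.140356 },
  "Complex"      => { rabl: 23.529937, ams: 1.007729, presenters: 0.320893 }
}

# For each case, how many times slower one approach is than the next.
ratios = FIRST_RUN.map do |name, t|
  [name, { rabl_vs_ams: t[:rabl] / t[:ams], ams_vs_presenters: t[:ams] / t[:presenters] }]
end.to_h

ratios.each do |name, r|
  puts format("%-12s RABL/AMS: %.1fx  AMS/Presenters: %.1fx",
              name, r[:rabl_vs_ams], r[:ams_vs_presenters])
end
```

The RABL-to-AMS ratio stays in the low twenties across all three cases, which suggests the gap is a fairly constant factor rather than something that grows with the complexity of the JSON.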

Second Run

As per the results of the first benchmark run, RABL is much slower compared to AMS. There is no good reason why it has to be. So we did a bit of investigation, and discovered that RABL spends a lot of time on template lookup. We can configure RABL to cache this lookup:

Rabl.configure do |config|
  config.cache_sources = true
end

Below, you can see the results of the benchmark with template lookup caching enabled for RABL.

Although performance doubles for RABL, it is still much slower. It is now about 12X slower compared to AMS.

                                               user     system      total        real
RABL Ultra Simple                          1.050000   0.000000   1.050000 (  1.051644)
AMS Ultra Simple                           0.080000   0.000000   0.080000 (  0.084204)
Presenters Ultra Simple                    0.030000   0.000000   0.030000 (  0.028589)
--------------------------------------------------------------------------------------
RABL Simple                                5.820000   0.020000   5.840000 (  5.834718)
AMS Simple                                 0.490000   0.000000   0.490000 (  0.492227)
Presenters Simple                          0.130000   0.000000   0.130000 (  0.128998)
--------------------------------------------------------------------------------------
RABL Complex                              12.790000   0.020000  12.810000 ( 12.810967)
AMS Complex                                0.990000   0.010000   1.000000 (  0.998783)
Presenters Complex                         0.330000   0.000000   0.330000 (  0.322893)


                                               user     system      total        real
RABL Ultra Simple: Collection              0.830000   0.000000   0.830000 (  0.832709)
AMS Ultra Simple: Collection               0.060000   0.000000   0.060000 (  0.062520)
Presenters Ultra Simple: Collection        0.020000   0.000000   0.020000 (  0.018377)
--------------------------------------------------------------------------------------
RABL Simple: Collection                    5.420000   0.010000   5.430000 (  5.428974)
AMS Simple: Collection                     0.440000   0.000000   0.440000 (  0.444187)
Presenters Simple: Collection              0.100000   0.000000   0.100000 (  0.103999)
--------------------------------------------------------------------------------------
RABL Complex: Collection                  12.350000   0.010000  12.360000 ( 12.364715)
AMS Complex: Collection                    0.940000   0.010000   0.950000 (  0.942812)
Presenters Complex: Collection             0.270000   0.000000   0.270000 (  0.263692)
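
The effect of template lookup caching can be checked the same way. This sketch compares RABL's "real" times for the single-object cases across the two runs, and against AMS in the second run. Again, only the timings come from the tables; the names are made up for the illustration.

```ruby
# RABL wall-clock seconds without and with config.cache_sources = true.
RABL_REAL = {
  "Ultra Simple" => { uncached: 2.029277,  cached: 1.051644 },
  "Simple"       => { uncached: 11.342106, cached: 5.834718 },
  "Complex"      => { uncached: 23.529937, cached: 12.810967 }
}

# AMS wall-clock seconds from the second run.
AMS_REAL_SECOND_RUN = {
  "Ultra Simple" => 0.084204,
  "Simple"       => 0.492227,
  "Complex"      => 0.998783
}

summary = RABL_REAL.map do |name, t|
  [name, {
    speedup: t[:uncached] / t[:cached],                 # gain from caching
    rabl_vs_ams: t[:cached] / AMS_REAL_SECOND_RUN[name] # remaining gap to AMS
  }]
end.to_h

summary.each do |name, s|
  puts format("%-12s caching speedup: %.2fx  RABL/AMS: %.1fx",
              name, s[:speedup], s[:rabl_vs_ams])
end
```

The speedup comes out a little under 2X in every case, so "doubles" is a fair approximation, and the remaining gap to AMS is close to 12X.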

Conclusion

To reiterate, AMS is significantly faster than RABL. Migrating from RABL to AMS not only gives us greater ease of use and maintenance, but also gives us great performance gains. So the effort to do the migration is justified.

Benchmarking is a tricky business. There might be some big holes in the benchmarking above. Check out the setup here, and feel free to comment if you find any issues with it.

Update: When the post was originally written, the assumption was that Oj was being used as the JSON encoding engine by all of RABL, AMS, and presenters. However, that turned out not to be the case.

In fact, it was only being used by RABL, as it is the engine RABL uses by default. In other words, the benchmark was unfair to AMS and presenters. With the Oj boost to AMS and presenters, the relative performance of RABL gets even worse. This post has been updated to include the latest benchmark results. You can find the original benchmark results here.