Adding Sorbet to a Rails Monolith

The Types Must Flow

William Pride
Flexport Engineering

--

After writing Java for six years at my previous job, onboarding to Flexport’s large Rails monolith was a daunting task. Flexport’s domain, international logistics, is complex. Our operations span the globe and cover diverse business units, from finance and customs to trucking and ocean freight. Even with a perfect codebase with zero incidental complexity, the amount of inherent complexity would still be huge.

Ruby on Rails served us well getting up and running, but as our operations grew, the shortcomings of a “convention over configuration” monolith with a large team became more apparent. Models and code were becoming highly coupled across the team. The data model was very hard for new engineers to learn. Code changes in central parts of the application felt dangerous.

We’ve had a number of initiatives to manage this complexity. We created visualization tools to better understand our data graph. We created Rails Engines to enforce domain modularity, reducing the surface area engineers had to worry about. We implemented a customized linter that let us enforce formatting standards and ban unnecessarily confusing or non-idiomatic Ruby.

Unfortunately, even well-written and structured Ruby can be difficult for a new engineer to understand. Without type signatures, it’s hard to tell exactly what a method’s inputs and outputs are. You can’t reliably search for class and method definitions or usages since many aren’t explicitly defined in source code or are called indirectly. IDE features like Find Usages work poorly and so aren’t much help; command line tools can’t warn you about misspelled variables or calls to non-existent methods.

The underlying causes of these problems are the same: meta-programming and dynamic typing. Because Ruby lets you define a method at runtime you can’t possibly know what methods are defined on a class just from searching the source code. There could be any method; there could even be all of them! The same is true for constants and even classes. These problems are particularly pronounced in Rails which makes heavy use of these meta-programming constructs.
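As a small illustration (the Shipment class and its method names are invented for this sketch and are not taken from our codebase), the plain Ruby below answers calls to several methods that never appear as a def in the source, so searching for their definitions finds nothing:

```ruby
# Hypothetical example: the predicates used at the bottom are never written with `def`.
class Shipment
  STATUSES = %w[pending in_transit delivered].freeze

  attr_reader :status

  def initialize(status)
    @status = status
  end

  # define_method creates pending?, in_transit? and delivered? when the class loads.
  STATUSES.each do |name|
    define_method("#{name}?") { status == name }
  end

  # method_missing answers any call shaped like `not_<status>?` at call time.
  def method_missing(method_name, *args)
    match = method_name.to_s.match(/\Anot_(\w+)\?\z/)
    return super unless match && STATUSES.include?(match[1])

    status != match[1]
  end

  def respond_to_missing?(method_name, include_private = false)
    match = method_name.to_s.match(/\Anot_(\w+)\?\z/)
    (match && STATUSES.include?(match[1])) || super
  end
end

shipment = Shipment.new('in_transit')
puts shipment.in_transit?    # true
puts shipment.not_delivered? # true
```

Neither in_transit? nor not_delivered? can be found by searching for "def in_transit?" or "def not_delivered?", which is exactly the situation that defeats Find Usages and similar tools.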

For faster feedback loops during development, we need static code analysis. Static analysis lets IDEs and tools provide instantaneous error warnings and suggestions. This is much harder for untyped languages; however, some tools can add this functionality. Flexport had already added Flow to our JavaScript code with great results. Enter Sorbet, a gradual typing system for Ruby.

Sorbet

Sorbet brings to Ruby the static analysis that makes writing and refactoring code in typed languages so much safer and smoother. Sorbet warns when you try to reference a non-existent class, call a non-existent method, or pass in an argument of the wrong type. Sorbet will warn you about unused variables or unreachable code paths. Sorbet can even top Java in some regards; since types in Sorbet are non-nullable by default, Sorbet will warn you about calls on values that might be nil, Ruby’s equivalent of a null pointer error. Best of all, Sorbet is blazingly fast.

In order to be fast enough, static analysis must work without running your program code. Usually only a tightly optimized native executable (Sorbet itself is written in C++) will be performant enough.

This raises the question: if classes and methods are added by the Ruby runtime, but static code analysis can’t run Ruby code, how can a static tool possibly know all the types available in your Ruby codebase?

Sorbet cleverly solves this problem with a two-step process. First, Sorbet generates RBI (Ruby interface) files that enumerate the dynamic methods and classes used by your code. Sorbet does this by executing your Ruby code and performing runtime introspection to see what methods and classes are added. Then, Sorbet can merge the types in your Ruby and RBI files and determine whether all the methods and classes referenced actually exist. Sorbet also provides a signature DSL for specifying the argument and return types of your methods. These annotations in your Ruby files work hand-in-hand with the generated RBI files to comprehensively analyze your codebase.

Because Ruby has so many constructs that are antithetical to static typing you have to do some work creating these RBI files in order for Sorbet to statically understand your codebase. In the rest of this post we’ll go over how we added Sorbet to our monorepo at Flexport, the pitfalls we ran into, and the tools we made (and found) along the way. We’ve open sourced any tools that we thought might be useful to the broader community.

srb init

The first step in adding Sorbet to a codebase is running srb init successfully. This command creates the set of RBI files that statically declare all of the modules, classes, constants, and methods that are added by Ruby at runtime. This is how Sorbet can be aware of methods defined by define_method or method_missing during static analysis, without running any (slow) Ruby code.

Sorbet learns about these dynamic methods and constants through runtime introspection; that is, by requiring all of your classes and seeing what methods and constants get added. In a large codebase this can take a long time — currently around 45 minutes for us at Flexport, though we understand we’re on the high end. Unsurprisingly, Rails applications seem to have a lot of dynamic methods that need to be enumerated here.
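To make the idea of runtime introspection concrete, here is a toy sketch. It is not Sorbet’s actual implementation: the Container class and the rbi_stub_for helper are invented for illustration. It loads a class, asks Ruby which methods now exist on it, and prints RBI-style stub declarations for them:

```ruby
# Toy sketch of runtime introspection; NOT how Sorbet itself is implemented.
class Container
  # These three methods only exist once this loop has run.
  %w[length width height].each do |dimension|
    define_method("#{dimension}_cm") { 0 }
  end
end

# Builds RBI-style stub text for the public instance methods defined directly on klass.
def rbi_stub_for(klass)
  method_names = klass.public_instance_methods(false).sort
  body = method_names.map { |name| "  def #{name}; end" }
  (["class #{klass.name}"] + body + ['end']).join("\n")
end

puts rbi_stub_for(Container)
# class Container
#   def height_cm; end
#   def length_cm; end
#   def width_cm; end
# end
```

The printed stub is something a static tool can read without executing the loop that created the methods. The cost is that every class has to be loaded first, which is where the long run times come from.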

Ideally we would only have had to run this once; unfortunately, we had to run it multiple times to work out all the issues. Sorbet has a way of unearthing many of the skeletons in a codebase’s closet. Because Sorbet needs to know about every line of your codebase, and because it can only learn by doing, the executable will likely run many files that haven’t been edited or even loaded in years. To get srb init to complete we had to:

  1. Remove an old 45MB migration file that was crashing the executable
  2. Remove a 10MB constants file
  3. Ignore many .rb scripts that looped indefinitely or raised an error when run locally. You can ignore files by adding # typed: ignore to the top of the file or by adding an --ignore option to your Sorbet config file.
  4. Fix, delete, or ignore many broken Ruby files that would crash when run due to things like missing constants or imports

The command told us which file it was running when it crashed. This was usually enough information to work out the issue.

Getting to false

Once we managed to get srb init to complete we had a set of RBI files declaring all of the classes and modules we include from gems, as well as the methods and classes dynamically added by our own classes. This meant all of our types were known to Sorbet.

The next step was getting the Sorbet typecheck to pass on our codebase at the minimum strictness level. Sorbet is a gradual typing system, meaning that you “opt in” each file to a certain level of strictness and increase the strictness over time. Briefly, the levels are:

  • ignore: Sorbet cannot see this file.
  • false: All referenced classes and constants must be known
  • true: All referenced methods must be known and called with the correct arity and type of arguments
  • strict: All methods must have signatures
  • strong: Sorbet knows the types of 100% of the variables in the file
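As a rough illustration of the difference between false and true (the Invoice class is invented for this sketch), the file below runs fine in plain Ruby until the broken method is actually called. Based on the level descriptions above, we would expect Sorbet to stay silent about the typo at typed: false and to report the missing method at typed: true:

```ruby
# typed: true
# The sigil comment above is how a file opts in to a strictness level.

class Invoice
  def total_cents
    1_000
  end

  def summary
    # Typo: `total_cent` is not defined anywhere. Plain Ruby only fails when
    # this method is called; a checker that verifies method names can flag
    # the line without running it.
    "Total: #{total_cent}"
  end
end

invoice = Invoice.new
puts invoice.total_cents

begin
  invoice.summary
rescue NameError => e
  puts "Only discovered at runtime: #{e.class}"
end
```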

First we want Sorbet to pass with all files typed at false. At this level Sorbet checks that all of the constants and classes referenced in the codebase are valid, but does not check that methods exist. After running srb init all of the dynamic constants should be in RBI files; the rest should be in our normal Ruby files. So this type level should pass no-sweat, right?

Unfortunately we had more work remaining. Sorbet cannot parse some Ruby constructs and so some of our formerly valid Ruby code was now considered invalid. Usually there was a one-to-one replacement for the disallowed syntax, but we had to fix all of these before Sorbet could pass at even the most lenient type level.

No Dynamic Includes

Ruby in all its permissiveness allows you to include an expression, as in include Rails.application.routes.url_helpers. This makes static analysis impossible since Sorbet cannot know what the expression will evaluate to at runtime and so can’t know what constants will and will not be available in the class. All of these have to be replaced with constant includes.
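A minimal sketch of the two forms, using invented stand-ins (UrlHelpers and Routes are hypothetical and only mimic the shape of the Rails call):

```ruby
# Hypothetical stand-ins for a helper module and the object that returns it.
module UrlHelpers
  def shipment_path(id)
    "/shipments/#{id}"
  end
end

module Routes
  def self.url_helpers
    UrlHelpers
  end
end

class DynamicMailer
  # The argument is a method call, so its value is only known at runtime.
  include Routes.url_helpers
end

class StaticMailer
  # The argument is a constant, which can be resolved without running code.
  include UrlHelpers
end

puts DynamicMailer.new.shipment_path(7) # /shipments/7
puts StaticMailer.new.shipment_path(7)  # /shipments/7
```

Both classes behave identically at runtime; only the second tells a static tool which module was mixed in.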

This was probably the trickiest to fix as we used DryTypes quite extensively and the core import statement is dynamic. Further, most of the classes being imported are themselves dynamically created! One of the joys (or horrors) of adding Sorbet to your codebase is learning about crazy cleverness going on under the hood. To address this particular problem we created a (now open sourced) RuboCop rule that can auto-correct dynamic DryTypes imports to a static formulation and forbids the dynamic form.

Sorbet provides its own Struct system that replaces DryTypes and is statically analyzable by Sorbet. We’re in the process of migrating all of our DryTypes to T::Structs.

No Constant Inheritance

The next largest set of changes we had to make was significantly easier; in Sorbet, child classes do not inherit constants from their parents. So, in places where a subclass accessed an inherited constant we either had to add the constant to the subclass, or add the full path to the superclass constant.
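The two fixes look roughly like this (class and constant names are invented for the sketch). All three methods return the same value in plain Ruby; the difference is only in how the constant is referenced:

```ruby
class Carrier
  MAX_RETRIES = 3
end

class OceanCarrier < Carrier
  # Original form: relies on constant lookup through the superclass.
  def retries_inherited
    MAX_RETRIES
  end

  # Fix 1: spell out the full path to the superclass constant.
  def retries_qualified
    Carrier::MAX_RETRIES
  end
end

class TruckingCarrier < Carrier
  # Fix 2: add the constant to the subclass itself.
  MAX_RETRIES = Carrier::MAX_RETRIES

  def retries
    MAX_RETRIES
  end
end

puts OceanCarrier.new.retries_inherited # 3
puts OceanCarrier.new.retries_qualified # 3
puts TruckingCarrier.new.retries        # 3
```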

Bad, Bad Code

Sorbet will find references to classes and constants that just do not exist in methods that are never called or classes that are never included. Usually these classes had been moved or renamed and some reference was missed in the shuffle. While these errors are a bit embarrassing to find they are also the proof in the pudding: once you’ve added Sorbet they become a thing of the past.

And the rest

Sorbet doesn’t allow some random Ruby constructs we were using lightly such as Regular Expression back references, a newer range syntax, and private_class_method. Sorbet is also stricter about using absolute paths than the Rails autoloader will require you to be, so some relative paths have to be updated.
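The exact rewrites are not recorded here, so the following is only a sketch of plausible one-to-one replacements in plain Ruby. It assumes "back references" means the $1-style globals and "newer range syntax" means endless ranges; all method and class names are invented:

```ruby
# Instead of reading the $1 global after a match, use the MatchData object.
def container_prefix(code)
  match = code.match(/\A([A-Z]{4})\d+\z/)
  match && match[1]
end

# Instead of an endless range such as rows[1..], give an explicit end index.
def without_header(rows)
  rows[1..-1]
end

# Instead of `private_class_method :build`, use a private section in `class << self`.
class Quote
  class << self
    def create
      build
    end

    private

    def build
      new
    end
  end
end

puts container_prefix('MSCU1234567')     # MSCU
puts without_header(%w[id 1 2]).inspect  # ["1", "2"]
puts Quote.create.class                  # Quote
```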

We made each of these changes in separate pull requests before adding Sorbet to the codebase so that the changes would be more digestible and any failures easily identifiable. We added the DryTypes RuboCop rule early so that engineers would stop adding soon-to-be-forbidden includes during the rollout.

Sorbet Forever

Once we had addressed all of that, running srb tc returned no errors. Relief washes over me in an awesome wave.

We were then ready to merge the change into master. In addition to adding the RBI files and new gems we started running srb typecheck as part of the CI build to ensure no new type errors could be introduced. And because srb typecheck is so fast we added this check to our pre-commit hook so that developers receive immediate feedback about errors.

Tools

IDE

The Sorbet team has had a VSCode plug-in in the pipeline for a while but there’s been no word recently on its release date. In the meantime, running watchman-make -p '**/*.rb' '**/*.rbi' --run srb inside VSCode gives us fast, actionable type-checking while we’re working.

Sorbet

The built-in srb rbi suggest-typed tool is also terrific; this will automatically bump all files to the strictest possible type level that doesn’t cause errors.

RuboCop

During the migration we used RuboCop heavily to forbid newly-disallowed syntaxes and to codemod existing violations. Before switching to Sorbet we were using Contracts for runtime method type checking. Since we wanted to be able to preserve this work we implemented a RuboCop rule to automatically convert most Contracts into Signatures. This rule also prevents users from adding new Contracts to the codebase. We did the same with DryTypes. We’ve used this quite successfully and have now open sourced this code in the excellent rubocop-sorbet library.

sorbet-rails

For a Rails application the sorbet-rails Gem is indispensable. Rails makes heavy use of meta-programming to add methods to your controllers, mailers, and models in particular. This gem will create RBI files containing all of the methods you need for Sorbet to understand Rails and ActiveRecord in particular.

Tapioca

We’ll discuss this more below, but one of our biggest challenges with Sorbet was generating RBI files. These commands tend to be slow and sometimes produce type conflicts. To address this, Shopify released Tapioca, which takes a different approach to generating your gem RBI files. We’ve found this to be much faster and more reliable than the built-in srb rbi gems command and have switched over entirely.

Challenges

RBI Files

While executing srb typecheck is blazingly fast, generating RBI files is not. Sorbet provides two generator commands: srb rbi gems and srb rbi hidden-definitions. srb rbi gems creates types for all of the classes available from your Gem imports. srb rbi hidden-definitions is a catch-all for all the types that get added to your classes at runtime.

These commands are slow, sometimes taking upwards of an hour to run. This usually isn’t a problem since we don’t need to update the files that often: only when we update Gems or need new hidden functions added. However, because they’re updated so infrequently, they often change substantially when they are updated, surfacing conflicts that had been introduced previously. Ideally we would run srb rbi hidden-definitions as part of our CI; however, until the run is faster than 45 minutes we can’t introduce this bottleneck. Dealing with this friction has been the biggest pain point for our users.

We haven’t had much luck bringing the runtime down, so we’re now working to remove our dependency on these generated RBI files altogether. As mentioned above, we use Tapioca to generate the Gem RBI files. This has worked incredibly smoothly for us and we no longer use srb rbi gems.

Killing the hidden-definitions file was far more difficult; there’s no one-to-one replacement and when we deleted the file we had over 6000 type errors. All of these types would need to be declared elsewhere, mostly using hand-written RBI files for all of the missing types. Fortunately there were not 6000 missing types as adding one type sometimes fixed hundreds of errors.

Further, we discovered that not all of the definitions were actually needed. For example, Rails generates all sorts of finder/dirty methods that really never need to be used. To address this, we wrote a script to remove unused definitions (those that don’t result in errors) to vastly descope the manual work needed to kill the hidden-definitions file.
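The script itself is not shown here, so the following is only a sketch of one way such pruning could work, under stated assumptions: prune_definitions and the typechecks callable are hypothetical, and the callable stands in for writing the candidate definitions to an RBI file and running the type checker. A definition is dropped only if the codebase still typechecks without it:

```ruby
# Hypothetical sketch; `typechecks` returns true when a set of definitions
# is sufficient for the codebase to pass with no errors.
def prune_definitions(definitions, typechecks)
  kept = definitions.dup
  definitions.each do |definition|
    candidate = kept - [definition]
    # Drop the definition only if nothing breaks without it.
    kept = candidate if typechecks.call(candidate)
  end
  kept
end

# Fake checker for the demo: pretend the codebase only calls these two methods.
used = %w[Shipment#status Shipment#carrier]
checker = ->(defs) { (used - defs).empty? }

all_defs = %w[
  Shipment#status
  Shipment#status_changed?
  Shipment#carrier
  Shipment#find_by_status
]

puts prune_definitions(all_defs, checker).inspect
# ["Shipment#status", "Shipment#carrier"]
```

Checking one definition at a time is only practical because the typecheck itself is fast; a real script would likely batch removals to reduce the number of runs.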

Literal Types

By far the biggest “missing feature” for us is Sorbet’s lack of support for literal types. Many of our Contracts and DryTypes structs use literal enums and we can’t automatically convert these.

Rails itself makes extensive use of literal enums so these are pervasive in our application. Recently the Sorbet team released T::Enums which go part of the way but still require a conversion step from literal to class.

Type Polymorphism

The other big missing feature is the lack of support for type polymorphism. Ruby on Rails has many commonly used overloaded methods; for example, `find` can accept a single `id` and return a single record, or an array of `ids` and return an array of records. Sorbet cannot express this, so as a result we treat `find` as only having the former usage and must use `find_n` for the latter.
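A toy in-memory model shows the shape of the problem. Port is an invented stand-in for an ActiveRecord-style class, and this find_n is only a guess at what such a helper might look like, not our actual implementation:

```ruby
# Hypothetical stand-in for a model backed by a database table.
class Port
  RECORDS = { 1 => 'Shanghai', 2 => 'Rotterdam', 3 => 'Long Beach' }.freeze

  # Overloaded, like the Rails method: the return type depends on the argument type.
  def self.find(id_or_ids)
    return id_or_ids.map { |id| RECORDS.fetch(id) } if id_or_ids.is_a?(Array)

    RECORDS.fetch(id_or_ids)
  end

  # One argument shape and one return shape, so a single signature can describe it.
  def self.find_n(ids)
    ids.map { |id| RECORDS.fetch(id) }
  end
end

puts Port.find(1)                # Shanghai
puts Port.find([1, 2]).inspect   # ["Shanghai", "Rotterdam"]
puts Port.find_n([2, 3]).inspect # ["Rotterdam", "Long Beach"]
```

With a single signature per method, find can be declared as "one id in, one record out" and find_n as "array in, array out", at the cost of callers having to choose the right method.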

Generics

Generic types exist in Sorbet but are pretty poorly supported and documented. They are not checked at runtime and advanced features like bounded generics are not offered at all. Ideally the type system would have full support for generics. This seems to be the aspiration based on the Sorbet team’s Generics milestone.

Conclusions

Overall we’re thrilled with the benefits we’ve gotten from Sorbet with comparatively little engineering work. Despite the limitations and frustrations listed above we’ve found the benefits of Sorbet far outweigh the drawbacks. Teams creating new Ruby gems or Rails engines continue to add Sorbet to their apps. Excluding our test files, we’ve typed 83% of our Ruby files true or higher, with 11170 (!) manually written method signatures.

Thanks

Thank you to @jez, @ptarjan, and the Sorbet team at Stripe for creating and open-sourcing this incredible tool. Huge thanks to @HarryDoan for the absolutely essential sorbet-rails library, and to the Shopify team for the rubocop-sorbet and tapioca libraries. Finally, thanks to @Gopi for support, and to @KevinMiller, @RileyKlinger, @MichaelHarris, and @MaxHeinritz for helping bring Sorbet to Flexport.
