Elasticsearch + Ruby: Use Query Objects to Build and Execute Queries

Writing code to construct and execute Elasticsearch queries gets complex and lengthy quickly. On top of building and executing the query, you usually have to format the returned results.

When I was implementing search at Zaarly, I wasn't super confident that the place I was putting all of this logic made sense. Before getting into where I believe it should go, let's quickly walk through what's involved in getting back search results.

Birth of a search result

Building a query is usually straightforward, assuming familiarity with Elasticsearch's verbose syntax. Queries just tend to be lengthy. We're talking potentially hundreds of lines of code to construct the JSON for the query. Some of this length comes from conditional filters. For example, if a user wants results only within a certain price range, then a range filter should be applied; otherwise it shouldn't.
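As a rough illustration of a conditional filter, here is a minimal sketch that builds the query body as a plain Ruby hash. The method name build_query, the min_price/max_price parameters, and the price, title, and description fields are assumptions made for the example, and it uses the bool/filter form of the query DSL rather than whatever syntax the original code used.

```ruby
# Builds an Elasticsearch query body as a plain Ruby hash.
# The field names (title, description, price) are assumed for illustration.
def build_query(term, min_price: nil, max_price: nil)
  filters = []

  # Only add the range filter when the user supplied at least one bound.
  if min_price || max_price
    bounds = {}
    bounds[:gte] = min_price if min_price
    bounds[:lte] = max_price if max_price
    filters << { range: { price: bounds } }
  end

  {
    query: {
      bool: {
        must: [{ multi_match: { query: term, fields: %w[title description] } }],
        filter: filters
      }
    }
  }
end
```

Calling build_query("foo") leaves the filter list empty, while build_query("foo", min_price: 10) adds a single range clause with only a gte bound. Every additional optional filter adds another branch like this one, which is where the length comes from.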

Once the query is constructed it needs to be executed. This can involve specifying the mapping types and indices to search within.

Finally, results may need to be formatted before being used. For example, a document's created_at field comes back as a String rather than a Time, so you may want to convert it for convenience.
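A minimal sketch of that formatting step follows. It assumes the response is a Hash with the usual hits.hits array of documents, each carrying its fields under _source, and that created_at is stored as an ISO 8601 string. The method name format_results is borrowed from later in this post; the "id" key is an assumption.

```ruby
require "time"

# Turns a raw search response into an array of hashes,
# converting created_at from a String into a Time.
def format_results(response)
  response.fetch("hits").fetch("hits").map do |hit|
    source = hit.fetch("_source")
    source.merge(
      "id" => hit["_id"],
      "created_at" => Time.iso8601(source.fetch("created_at"))
    )
  end
end
```

If created_at is stored in some other date format, Time.iso8601 will raise an ArgumentError, so the parsing call would need to match the mapping actually in use.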

This all needs a home. 🏡

Just put it in the model

First I stored all of the query logic right inside of the model. A simple Item model with attributes title, description, and approved might look something like:
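The original listing is not reproduced here, so the following is only a sketch of the shape being described. It assumes an index named "items", a created_at field, and a client object that responds to search(index:, body:) the way the elasticsearch-ruby client does. The client accessor is a hypothetical injection point added so the example runs without Rails.

```ruby
require "time"

# Sketch only. In a Rails app this class would inherit from ActiveRecord::Base.
class Item
  class << self
    # Hypothetical injection point: any object responding to
    # #search(index:, body:), such as an Elasticsearch::Client.
    attr_accessor :client

    # Builds the query, runs it, and formats the hits, all in one method.
    def search(term)
      query = {
        query: {
          bool: {
            must: [{ multi_match: { query: term, fields: %w[title description] } }],
            filter: [{ term: { approved: true } }]
          }
        }
      }

      response = client.search(index: "items", body: query)

      response["hits"]["hits"].map do |hit|
        source = hit["_source"]
        source.merge("created_at" => Time.iso8601(source["created_at"]))
      end
    end
  end
end
```

Even in this trimmed-down form, the method never touches an Item instance. A real version with several conditional filters would be far longer.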

What bothered me about this approach was that this massive method in the model didn't actually have anything to do with the model aside from its name. No Item instances get created and no other methods, validations, callbacks, etc. get used when calling Item.search.

A concerning concern

So, uh, next I explored moving the search method to a concern. Having the .search method elsewhere made item.rb look cleaner, but nothing really changed. Item.search still had nothing to do with the Item model, and it still had way too much responsibility for one method.

Not that I'd recommend this approach. At all. But here's what it looks like:
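The original listing is missing, so this is a sketch under assumptions rather than the code that was actually tried. The module name Searchable and the search_client accessor are hypothetical, and a plain included hook stands in for ActiveSupport::Concern so the example runs without Rails.

```ruby
# Sketch only. With Rails you would `extend ActiveSupport::Concern` and use a
# `class_methods do ... end` block instead of the `included` hook below.
module Searchable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Hypothetical injection point: any object responding to
    # #search(index:, body:).
    attr_accessor :search_client

    # Same oversized method as before, just living in a different file.
    def search(term)
      query = {
        query: {
          bool: {
            must: [{ multi_match: { query: term, fields: %w[title description] } }],
            filter: [{ term: { approved: true } }]
          }
        }
      }

      response = search_client.search(index: "items", body: query)
      response["hits"]["hits"].map { |hit| hit["_source"] }
    end
  end
end

# item.rb is shorter now, but Item.search still does everything.
class Item
  include Searchable
end
```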

Query objects

I had seen query objects used to extract complex queries out of a model before, but had never "needed" to use them myself. After going through some examples, I came up with a version that gets used throughout our codebase.
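The version used in the codebase is not shown here. A minimal sketch of the pattern, assuming a class-level client accessor (hypothetical), an "items" index, and the title, description, and approved fields from the Item model, could look like this:

```ruby
# Sketch of a query object: one class per query, with a small public surface.
class ItemSearchQuery
  class << self
    # Hypothetical injection point: any object responding to
    # #search(index:, body:), such as an Elasticsearch::Client.
    attr_accessor :client

    # Convenience entry point: ItemSearchQuery.execute("foo")
    def execute(*args)
      new(*args).execute
    end
  end

  def initialize(term)
    @term = term
  end

  # Runs the query and returns formatted results.
  def execute
    response = self.class.client.search(index: "items", body: build_query)
    format_results(response)
  end

  private

  # Assembles the query body; conditional filters would be added here.
  def build_query
    {
      query: {
        bool: {
          must: [{ multi_match: { query: @term, fields: %w[title description] } }],
          filter: [{ term: { approved: true } }]
        }
      }
    }
  end

  # Flattens each hit into a hash of its fields plus its id.
  def format_results(response)
    response["hits"]["hits"].map do |hit|
      hit["_source"].merge("id" => hit["_id"])
    end
  end
end
```

Each step now has its own method, so build_query and format_results can grow without turning execute into a wall of code.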

To execute a search query we can just run ItemSearchQuery.execute("foo").

The Item model is now no longer polluted with logic that doesn't pertain to the lifetime of its instances and the large .search method gets a nice refactoring into #execute, #build_query, and #format_results.

The naming convention works for all types of queries, e.g. ItemSearchQuery, ItemAutocompleteQuery, ItemSuggestionQuery. You can even extract .execute into a concern for less code duplication if you end up having multiple queries:
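The original listing is not available, so here is a sketch of one way to do it. The module name Executable is hypothetical, a plain included hook stands in for ActiveSupport::Concern, and the example query returns its body instead of calling a client so that it runs on its own.

```ruby
# Shared class-level .execute for every query object.
# With Rails this could be written using ActiveSupport::Concern instead.
module Executable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Instantiates the query with the given arguments and runs it.
    def execute(*args)
      new(*args).execute
    end
  end
end

class ItemAutocompleteQuery
  include Executable

  def initialize(prefix)
    @prefix = prefix
  end

  # A real query would send build_query to Elasticsearch and format the
  # response; this sketch just returns the body it would send.
  def execute
    build_query
  end

  private

  def build_query
    { query: { match_phrase_prefix: { title: @prefix } } }
  end
end
```

With the module included, ItemAutocompleteQuery.execute("fo") works without each query class redefining the class method.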