The Promotions team at theScore is responsible for enabling dynamic marketing campaigns within theScore Bet app. Each promotional campaign allows us to specify an audience of users who will be available to participate in the campaign, by comparing on past actions that a given user has taken against a set of dynamic rules. For example, when creating a new promotion, we can specify that its audience can only include patrons who have previously deposited a specific amount of money, or placed a specific quantity of bets.
Disclaimer: The specific scenario, and format of the data shown in the examples throughout, is contrived entirely for the purposes of this article.
An overview of the PostgreSQL schema
The Promotions service depends on the ability to read past events placed by a patron. These events are received asynchronously from other internal services, via Kafka, and then stored internally in the Promotions database as a generic "user action". These actions are used for a variety of purposes within our Promotions service, but in this case, we are exploring looking at the past actions of a given user to see if they fulfill the audience criteria of a given promotional campaign. The table which they get stored within has been created with a migration that resembled the code sample below.
defmodule Promotions.Repo.Migrations.CreateUserActionsTable do use Ecto.Migration def change do create table(:user_actions) do add :action_id, :string, null: false add :action_type, :action_type, null: false add :occurred_at, :utc_datetime_usec, null: false add :region, :region, null: false add :user_id, :binary_id, null: false add :attributes, :jsonb, null: false timestamps() end end end
The corresponding Ecto schema for the "user_actions" table looks like the following.
defmodule Promotions.UserAction do use Ecto.Schema alias Promotions.UserAction.CasinoWagerAttributes alias Promotions.UserAction.DepositAttributes alias Promotions.UserAction.RegistrationAttributes alias Promotions.UserAction.SportsbookBetAttributes schema "user_actions" do field :action_id, :string field :action_type, Ecto.Enum, values: [:registration, :deposit, :sportsbook_bet, :casino_wager] field :occurred_at, :utc_datetime_usec field :region, Ecto.Enum, values: [:usa_new_jersey, :usa_colorado, :usa_indiana, :usa_iowa, :can_ontario] field :user_id, Ecto.UUID embeds_one :casino_wager_attributes, CasinoWagerAttributes, source: :attributes embeds_one :deposit_attributes, DepositAttributes, source: :attributes embeds_one :registration_attributes, RegistrationAttributes, source: :attributes embeds_one :sportsbook_bet_attributes, SportsbookBetAttributes, source: :attributes timestamps() end end
Each of the various embedded schemas to populate the
attributes field is represented in a separate Ecto schema of its own. For the purposes of this article, only a truncated sample of the
SportsbookBetAttributes embedded schema will be shown below as well.
defmodule Promotions.UserAction.SportsbookBetAttributes do use Ecto.Schema @primary_key false embedded_schema do field :amount_cents_wagered, :integer field :amount_cents_payout, :integer field :bet_type, Ecto.Enum, values: [:straight, :parlay] field :odds, :integer embeds_many :legs, SportsbookBetLegAttributes end end defmodule Promotions.UserAction.SportsbookBetLegAttributes do use Ecto.Schema @primary_key false embedded_schema do field :event_id, :string field :event_type, :string field :market_id, :string field :market_selection_id, :string field :market_type, :string field :in_play, :boolean field :odds, :integer end end
Example Criteria to Turn into Database Queries
The audience criteria of a promotional campaign can be based on any type of action that we record (ie. deposit, sportsbook bet, etc) and the various attributes of the action. The following are examples of such an audience rule.
- The patron registered their account in January 2022
- The patron has placed at least 3 deposits from within New Jersey
- The patron placed at least one sportsbook bet on the last years Super Bowl
For the first two examples, the data is stored within more fields that are of a more primitive data type within the
user_actions table of the database, and are standard to query. For the last example, it requires querying the JSONB
attributes field to determine if any of the "legs" on any of the users previous sportsbook bets contain the
event_id associated with last years Super Bowl.
Similarly, the audience criteria also needs to support scenarios where "The patron has placed a sportsbook bet that was not on last years Super Bowl" as well. This would require looking for a sportsbook bet that does not have an
event_id corresponding to the Super Bowl for the previous year in any of its legs.
The queries we need to write need to work for both "at least one leg of a sportsbook contains" and "none of the legs of a sportsbook bet can contain" scenarios. To do so, we need to investigate how to make precise queries on a set of data within a JSONB field in PostgreSQL, and subsequently how we can turn those queries into composable Ecto queries within our Elixir application as well.
Notably, we want to be able to support operators of "in" and "not in" when querying for data, as opposed to operators such as "equals", "not equals", "greater than or equals", etc.
Query of a JSONB Field
The first results our team came across while looking into this was using PostgreSQL native functions and operators. We quickly realized that these would not fit all of our use-cases. It would be challenging to turn it into a reusable Ecto query that would work for a variety of fields and would be an inelegant solution for deeply nested data.
The next option we came across was to use the function
jsonb_to_recordset to create a queryable schema of records from an array within a JSONB field (ie. the
embeds_many :legs section of our bet attributes).
In the example below, we’ll try to find if the user has at least one bet with the following attributes.
user_id: "fcca980b-8edc-4848-91cc-4e00ff47019c" region: :can_ontario amount_cents_wagered: at least 1500 (ie. $15) event_id: (not in any of the following) - "e8a95b17-972c-4421-9090-8bf66c0804a9" - "5fbef652-e772-4bae-84f5-31255fe2fadd" - "dd92d86c-fe6d-43a7-8765-a3e1026145db"
jsonb_to_recordset with SQL
The first step to transforming this into a usable query within our Elixir app was to try to get it working with a query made directly against the database to validate that it is the right tool for the job.
After some initial research, we composed the following SQL query:
SELECT public.user_actions AS action, jsonb_to_recordset(action.attributes->'legs') AS leg("event_id" varchar) WHERE action.user_id = 'f0bb75d3-2cce-4328-9177-02349fa99de6' AND action.region = 'ontario' AND (action.attributes->'amount_cents_wagered')::integer >= 1500 AND leg.event_id NOT IN ( 'e8a95b17-972c-4421-9090-8bf66c0804a9', '5fbef652-e772-4bae-84f5-31255fe2fadd', 'dd92d86c-fe6d-43a7-8765-a3e1026145db' );
While this allows us to use the operators we want (ie.
NOT IN) on data within the "attributes" JSONB field, and gets us the exact records we want, it would be a challenge to try to make it fit into an Ecto query since Ecto queries only allow for one entity to be in the
from clause of a given query. For example, we are permitted to begin a query with
from UserAction, but are unable to create a query that selects from two data sources such as
from UserAction, fragment("jsonb_to_recordset(...)"). If there is a way around that, we were unable to find a solution.
The next option we looked to was modifying the first query by utilizing a lateral-join to try to make a "user action" such that it has a one-to-many relationship with its own "legs" nested within the JSONB "attributes" field. By doing so, we eliminate the need to use multiple clauses in the
FROM section of the query, which helps gets us closer to a valid Ecto query translation. The query then looks like this:
SELECT action.* FROM user_actions AS action INNER JOIN LATERAL ( SELECT * FROM jsonb_to_recordset(action.attributes->'legs') AS leg("event_id" varchar) ) AS leg ON true WHERE action.user_id = 'f0bb75d3-2cce-4328-9177-02349fa99de6' AND action.region = 'can_ontario' AND (action.attributes->'amount_cents_wagered')::integer >= 1500 AND leg.event_id NOT IN ( 'e8a95b17-972c-4421-9090-8bf66c0804a9', '5fbef652-e772-4bae-84f5-31255fe2fadd', 'dd92d86c-fe6d-43a7-8765-a3e1026145db' )
The problem we quickly saw with this query is that will look to see if any of the "legs" of a specific bet adhere to the condition. Instead, we want to make sure that the all of the legs within the bet are not on those listed events (ie. cannot have any of the specified "event id" values). As such, we found that putting the inner join in a subquery helped to achieve the goal, wrapping it in a "NOT EXISTS" condition.
SELECT action.* FROM user_actions AS action WHERE action.user_id = 'f0bb75d3-2cce-4328-9177-02349fa99de6' AND action.region = 'ontario' AND (action.attributes->'amount_cents_wagered')::integer >= 1500 AND NOT EXISTS ( SELECT * FROM user_actions AS lateral_action INNER JOIN LATERAL ( SELECT * FROM jsonb_to_recordset(lateral_action.attributes->'legs') AS leg("event_id" varchar) ) AS leg ON true WHERE action.id = lateral_action.id AND leg.event_id IN ( 'e8a95b17-972c-4421-9090-8bf66c0804a9', '5fbef652-e772-4bae-84f5-31255fe2fadd', 'dd92d86c-fe6d-43a7-8765-a3e1026145db' ) );
This query seems to meet all of the goals that we were looking to achieve, so the next step was to take it and turn it into a composable query within Ecto.
Translating SQL query to an Ecto Query
Many parts of our query are pretty quick and easy to make into an
Ecto.Query structure. We can begin with everything before the subquery. For this example, we can assume all of the values were passed in and the operators for each part of the expression are known:
def query(user_id, region, amount_cents, event_ids) from action in Promotions.UserAction, as: :user_action, where: action.user_id == ^user_id, where: action.region == ^region, where: fragment("(?->?)::integer", action.sportsbook_bet_attributes, "amount_cents_wagered") >= ^amount_cents end
We also need to build a query around the
jsonb_to_recordset data that we created in the lateral join seen earlier.
This was a bit challenging, as
jsonb_to_recordset is a native function of PostgreSQL and there are no functions built into Ecto to work with it directly. It also requires us to define an
AS ... ON ... set of clauses to use with it. Since Ecto will create its own
AS alias with each "from" or "join", that presents challenges as well.
After some research on this issue on the elixir forums, we were pointed in the right direction to wrap the subquery in parentheses, so that we could use our own
AS within it, and it would not conflict with the one Ecto adds on.
def sportsbook_bet_legs_query do from action in Promotions.UserAction, as: :leg_user_action, inner_lateral_join: legs in fragment( "(SELECT * FROM jsonb_to_recordset(?->'legs') AS record(event_id varchar)", field(action, :sportsbook_bet_attributes) ) on: true as: :sportsbook_bet_legs end
To utilize both queries together, we need to include the
sportsbook_bet_legs_query function as a subquery within the first, and wrap it in a
where not exists clause of the outer query.
Since the subquery uses a lateral join, we need to remember to join it back to the outer query using Ecto’s
parent_as/1 function. Another example of this can be found in Ecto’s documentation for
exists/1 as well (https://hexdocs.pm/ecto/Ecto.Query.API.html#exists/1)
def sportsbook_bet_query(user_id, region, amount_cents_wagered, event_ids) do from action in Promotions.UserAction, as: :action, where: action.user_id == ^user_id, where: action.region == ^region, where: fragment("(?->?)::integer", action.sportsbook_bet_attributes, "amount_cents_wagered") >= ^amount_cents_wagered, where: not exists( from [ leg_user_action: action_subquery, sportsbook_bet_legs: sportsbook_bet_legs ] in event_ids_query(), where: parent_as(:action).id == action_subquery.id, where: sportsbook_bet_legs.event_id in ^event_ids ) end
With the function now complete, it can be called by simply taking the earlier example and passing in the arguments to the reusable function:
user_id = "fcca980b-8edc-4848-91cc-4e00ff47019c" region = :can_ontario amount_cents_wagered = 1_500 event_ids = [ "e8a95b17-972c-4421-9090-8bf66c0804a9", "5fbef652-e772-4bae-84f5-31255fe2fadd", "dd92d86c-fe6d-43a7-8765-a3e1026145db" ] ecto_query = sportsbook_bet_query(user_id, region, amount_cents_wagered, event_id) matching_bets = Repo.all(ecto_query) if not Enum.empty?(matching_bets) do # user qualified for promotion end
As a reminder, one of the goals of our function was to find a "user action" record that did not contain any of the values of the list of
event_id’s from the attributes of any of its array of "leg" data points. One benefit of structuring the query in the way that has been shown here is that if we want to flip it to find if one (or more) legs does include a particular
event_id, we can simply remove the
not from the
not exists condition without any other modifications.
The queries that we need in our application are more robust, including changing up the operators that we use instead of statically choosing "equals", "greater than or equals" and "not in" as outlined in our example, but from this template we were able to build out a library of robust and composable queries to scope results based on attributes both in more primitive data types within our database schema, as well as JSONB fields too.
I've set up a small-scale Elixir application which can be used to quickly scaffold a database with records, as seen in the format throughout this blog post, so you can test the queries seen within this article and tweak them as needed to experiment. Github Repository: Ecto JSONB Example.