This document lists all the methods Metarank can use for ranking. The method is selected by the models.<name>.type field of the config file:

    type: lambdamart 


LambdaMART

LambdaMART is a Learn-to-Rank model that optimizes the NDCG metric. The LambdaMART in Depth article by Doug Turnbull describes in detail how it works. In simplified terms, LambdaMART in the scope of Metarank does the following:

  1. Takes a ranking and some relevancy judgements over its items as input (judgements can be implicit, like clicks, or explicit, like stars in movie recommendations).

  2. Each item in the ranking has a set of characteristics (ML features, such as genre or CTR).

  3. A pair of items is sampled from the ranking.

  4. The ML model is trained to guess which item in the pair has the higher relevancy judgement.

  5. This is repeated over all pairs in the ranking.

  6. This is repeated over all the rankings in the dataset.

At the end, items with higher judgements should be ranked higher, making the resulting ranking more relevant.
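
The pair enumeration in steps 3 to 5 can be sketched in Python as follows. This is only an illustration of how preference pairs arise from judgements: the preference_pairs helper and the sample ranking are hypothetical, and the sketch leaves out the gradient boosting and NDCG-based pair weighting that the real backends perform.

```python
from itertools import combinations

def preference_pairs(ranking):
    """Yield (better, worse) item id pairs from a single ranking.

    `ranking` is a list of (item_id, judgement) tuples. Pairs with equal
    judgements express no preference and are skipped.
    """
    for (id_a, rel_a), (id_b, rel_b) in combinations(ranking, 2):
        if rel_a > rel_b:
            yield (id_a, id_b)
        elif rel_b > rel_a:
            yield (id_b, id_a)

# Hypothetical ranking: item2 was clicked (judgement 1), the others were not.
ranking = [("item1", 0), ("item2", 1), ("item3", 0)]
pairs = list(preference_pairs(ranking))
print(pairs)  # [('item2', 'item1'), ('item2', 'item3')]
```

Each pair tells the model that the first item should be scored above the second one.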

In Metarank there are two supported library backends implementing this algorithm: XGBoost and LightGBM.

To configure the model, use the following snippet:

  <model name>:
    type: lambdamart
    backend:
      type: xgboost # supported values: xgboost, lightgbm
      iterations: 100 # optional (default 100), number of iterations while training the model
      seed: 0 # optional (default = random), a seed to make training deterministic
    weights: # types and weights of interactions used in the model training
      click: 1 # you can increase the weight of some events to hint the model to optimize more for them
    features: # features from the previous section used in the model
      - foo
      - bar
#    selector: # optional set of selectors to filter events for this specific model
#      rankingField: source
#      value: search

#    split: optional definition of train/test splitting strategy. See below for examples.

#    eval: optional list of evaluation metrics.

#    warmup: optional API warmup settings
#      sampledRequests: 100 # how many requests to sample during training
#      duration: 5s # how long to perform the warmup.
  • backend: required, xgboost or lightgbm, specifies the backend and its configuration.

  • weights: required, list of string:number pairs, specifies what interaction events are used for training. You can specify multiple events with different weights.

  • features: required, list of string, features used for model training, see Feature extractors documentation.

  • selector: optional, list of selectors, a set of rules to filter which events should be accepted by this model.

  • split: optional, a train/test splitting strategy. Default: time=80%. Options: random, hold_last or time, each with an optional ratio (default: 80%, meaning 80% of the data goes to the train set and 20% to the test set). Example: random=80% splits the dataset randomly, with 80% allocated to the train set.

  • eval: optional, a list of eval metrics to measure after training. Default value is ["NDCG@10"], supported metrics are NDCG, NDCG@k, MAP, MAP@k, MRR (where k is the cutoff value).

  • warmup: optional, API warmup settings. See the API warmup section for details.
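
To illustrate what the default NDCG@10 eval metric measures, here is a minimal Python sketch of NDCG@k. It assumes the common exponential gain formulation (2^rel - 1 with a log2 position discount); the exact variant computed by Metarank or its backends may differ, and the function names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# The only relevant item sits at position 2 instead of position 1.
score = ndcg_at_k([0, 1, 0], k=10)
print(round(score, 4))  # 0.6309
```

A score of 1.0 means the relevant items are already in the best possible order, and items below the cutoff k do not contribute at all.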

Interaction weight

Interactions define the way your users interact with the items you want to personalize, e.g. click, add-to-wishlist, purchase, like.

Interactions can be used in feature extractors, for example to calculate the click-through rate. By defining weights you can control the optimization goal of your model: do you want to increase the number of likes, the number of purchases, or balance between them?

You can define an interaction by name and set a weight for how much this interaction affects the model:

  click: 1.0
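
To make the effect of weights concrete, here is a minimal Python sketch. It assumes, purely for illustration, that an item's training label is the sum of the weights of the interactions it received. The purchase weight, the event tuples and the relevance_labels helper are hypothetical and do not describe Metarank's internals.

```python
from collections import defaultdict

def relevance_labels(interactions, weights):
    """Sum the configured weights per item; unlisted interaction types are ignored."""
    labels = defaultdict(float)
    for item_id, interaction_type in interactions:
        if interaction_type in weights:
            labels[item_id] += weights[interaction_type]
    return dict(labels)

weights = {"click": 1.0, "purchase": 3.0}  # hypothetical weights
interactions = [
    ("item1", "click"),
    ("item2", "click"),
    ("item2", "purchase"),
    ("item3", "like"),  # no weight configured for "like"
]
labels = relevance_labels(interactions, weights)
print(labels)  # {'item1': 1.0, 'item2': 4.0}
```

Under this assumption, raising the purchase weight makes purchased items stand out more strongly against items that were only clicked.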

Event selectors

When serving multiple models, there are cases when you need to separate ranking and interaction events per model. This is useful when your models have different contexts, e.g. you do personalized ranking for search results and recommendation results using the same Metarank installation, but utilizing different models.

Metarank supports a selector configuration that can be used to route your events to the correct model or to drop events in certain scenarios.

Metarank supports the following event selectors:

  • Accept selector. Enabled by default when no selectors are defined, so that all events are accepted.

  accept: true # true = accept all, false = reject all
  • Field selector. Accepts an event when its ranking event has a specific string (or string-list) field with the given value. For example:

  rankingField: source
  value: search

The filter above will accept only events that have the source=search field defined in the fields section of the event:

  {
    "event": "ranking",
    "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
    "timestamp": "1599391467000",
    "user": "user1",
    "session": "session1",
    "fields": [
        {"name": "source", "value": "search"}
    ],
    "items": [
      {"id": "item1", "fields": [{"name": "relevancy", "value": 1.0}]}
    ]
  }

  • Sampling selector: randomly accept or drop an event, depending on the acceptance ratio:

  ratio: 0.5

The sampling selector above will randomly accept about half of the events.

  • Max interaction position selector: only accept click-through events when the interaction position is not too high or too low. This can be useful to exclude visitor sessions with discovery-style browsing behavior, or rankings that are too short.

  maxInteractionPosition: 10
  minInteractionPosition: 3 # both fields are optional
  • Ranking length selector: only accept click-through events with a number of items within a defined range.

  minItems: 10 # so there should be at least 10 items in the ranking
  maxItems: 20 # both min and max are optional
  • AND/OR/NOT selector: combine multiple selectors within a single boolean combination:

  and:
    - rankingField: source
      value: search
    - or:
        - rankingField: segment
          value: test
        - ratio: 0.5
        - not:
            accept: false

AND and OR selectors take a list of nested selectors as arguments; the NOT selector takes only a single selector argument.
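
The way nested selectors combine can be sketched with a small recursive evaluator in Python. This is a hypothetical illustration, not Metarank's implementation: the accepts function, the dict representation of the YAML and the ranking_fields argument are assumptions, and the position and length selectors are left out for brevity.

```python
import random

def accepts(selector, ranking_fields, rng=random):
    """Evaluate a selector (a dict mirroring the YAML) against ranking event fields.

    `ranking_fields` maps a field name to a string or a list of strings.
    """
    if "accept" in selector:
        return bool(selector["accept"])
    if "rankingField" in selector:
        value = ranking_fields.get(selector["rankingField"])
        values = value if isinstance(value, list) else [value]
        return selector["value"] in values
    if "ratio" in selector:
        return rng.random() < selector["ratio"]
    if "and" in selector:
        return all(accepts(s, ranking_fields, rng) for s in selector["and"])
    if "or" in selector:
        return any(accepts(s, ranking_fields, rng) for s in selector["or"])
    if "not" in selector:
        return not accepts(selector["not"], ranking_fields, rng)
    raise ValueError(f"unknown selector: {selector}")

# The same structure as the AND/OR/NOT example above.
selector = {"and": [
    {"rankingField": "source", "value": "search"},
    {"or": [
        {"rankingField": "segment", "value": "test"},
        {"ratio": 0.5},
        {"not": {"accept": False}},
    ]},
]}
print(accepts(selector, {"source": "search"}))  # True
print(accepts(selector, {"source": "recs"}))    # False
```

In this example the OR branch always succeeds, because negating accept: false yields true, so the outcome depends only on the source field.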

Train/test splitting strategies

Metarank supports three train/test splitting strategies:

  • random: split dataset randomly.

  • hold_last: for each session with multiple rankings, take the last N% of rankings as a test set. Can be useful to measure the impact of in-session personalization.

  • time: split dataset by a timestamp.

Each strategy in the config file can optionally be given a split ratio (80% by default). Examples:

  • random=80%: split the dataset randomly. Be careful with random splitting, as it may introduce label leakage.

  • hold_last: split within each session, using the default 80% split ratio.
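
The three strategies can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Metarank's code: rankings are modeled as dicts with hypothetical session and timestamp keys, the parse_split and split helpers are invented for the example, and sessions with a single ranking are assumed to go entirely to the train set.

```python
import random

def parse_split(spec):
    """Parse a spec such as 'random=80%' or 'hold_last' into (strategy, train_ratio)."""
    name, _, ratio = spec.partition("=")
    if name not in ("random", "hold_last", "time"):
        raise ValueError(f"unknown split strategy: {name}")
    train_ratio = float(ratio.rstrip("%")) / 100 if ratio else 0.8
    return name, train_ratio

def split(rankings, spec, seed=0):
    """Split rankings into (train, test) according to the strategy spec."""
    strategy, ratio = parse_split(spec)
    if strategy == "hold_last":
        by_session = {}
        for r in sorted(rankings, key=lambda r: r["timestamp"]):
            by_session.setdefault(r["session"], []).append(r)
        train, test = [], []
        for items in by_session.values():
            # Assumption: a session with a single ranking stays in the train set.
            cut = round(len(items) * ratio) if len(items) > 1 else len(items)
            train += items[:cut]
            test += items[cut:]
        return train, test
    ordered = list(rankings)
    if strategy == "random":
        random.Random(seed).shuffle(ordered)
    else:  # "time": the oldest rankings go to train, the newest to test
        ordered.sort(key=lambda r: r["timestamp"])
    cut = round(len(ordered) * ratio)
    return ordered[:cut], ordered[cut:]
```

The time strategy never trains on rankings that are newer than the test set, which is what avoids the leakage that random splitting can introduce.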

XGBoost and LightGBM backend options

  • iterations: optional, number, default: 100, number of trees in the model.

  • learningRate: optional, number, default: 0.1, a higher rate means faster training but a less precise model.

  • ndcgCutoff: optional, number, default: 10, only the first N items affect the NDCG.

  • maxDepth: optional, number, default: 8, the depth of the tree.

  • seed: optional, string or number, default: random, set it to make model training deterministic.

  • sampling: optional, default: 0.8, fraction of features used to build a tree, useful to prevent over-fitting.

  • debias: optional, default: false. Enable booster-native position bias removal support. See these two articles about the unbiased LTR for XGBoost and LightGBM for details.

LightGBM also supports these specific options:

  • numLeaves: optional, number, default: 16, how many leaves the tree may have.

Please consult the LightGBM and XGBoost docs for guidance on tuning these parameters.


Shuffle

Shuffle is a baseline model that can be used in A/B tests as a "worst-case" ranking scenario, in which the order of items is random. The shuffle model is configured in the following way:

  <model name>:
    type: shuffle
    maxPositionChange: 5
  • Parameter maxPositionChange controls the amount of randomness that shuffle can introduce in the original ranking. In other words, maxPositionChange sets how far away an item can drift from its original position.
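
One simple way to bound how far an item can drift is to shuffle items only inside consecutive blocks of maxPositionChange + 1 positions. The Python sketch below illustrates this idea; it is an assumption made for the example and not necessarily the algorithm Metarank uses, and bounded_shuffle is a hypothetical name.

```python
import random

def bounded_shuffle(items, max_position_change, seed=None):
    """Shuffle so that no item moves more than max_position_change positions.

    Items are shuffled inside consecutive blocks of max_position_change + 1
    elements, so every displacement stays within the bound.
    """
    rng = random.Random(seed)
    block = max_position_change + 1
    result = []
    for start in range(0, len(items), block):
        chunk = list(items[start:start + block])
        rng.shuffle(chunk)
        result.extend(chunk)
    return result

original = list(range(23))
shuffled = bounded_shuffle(original, max_position_change=5, seed=42)
```

With max_position_change set to 0 the ranking is returned unchanged, and larger values allow progressively more randomness.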


Noop

Noop is also a baseline model, which does nothing. The main purpose of this model is to serve as a baseline of the original ranking sent to Metarank during A/B tests. It's configured with the following snippet:

  <model name>:
    type: noop

It has no options and does not do any modifications to the ranking, just bouncing it back as-is.
