Models


Last updated 1 year ago


This document lists all the methods Metarank may use for ranking. The method is selected by the models.<name>.type field of the config file:

models:
  default: 
    type: lambdamart 

LambdaMART

LambdaMART is a Learn-to-Rank model that optimizes the NDCG metric. The article "Lambdamart in Depth" by Doug Turnbull describes in detail how it works. In simplified terms, LambdaMART in the scope of Metarank does the following:

  1. Takes a ranking and some relevancy judgements over items as an input (judgements can be implicit, like clicks, or explicit, like stars in movie recommendations)

  2. All items in the ranking have a set of characteristics (ML features, like genre or CTR as an example)

  3. A pair of items from the ranking is sampled.

  4. The ML model tries to predict which item in the pair has the higher relevancy judgement.

  5. Repeat over all pairs in the ranking.

  6. Repeat over all the rankings in the dataset.

At the end, items with higher judgements should be ranked higher, making the resulting ranking more relevant.
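The pairwise idea in the steps above can be illustrated with a small Python sketch. It is not Metarank's or any backend's training code: the function name `pairwise_accuracy` and the sample data are hypothetical, and the sketch only measures how many item pairs a given scoring orders correctly, which is the quantity a pairwise learner tries to improve.

```python
from itertools import combinations

def pairwise_accuracy(scores, judgements):
    """Fraction of item pairs with different judgements that the scores order correctly."""
    correct = 0
    total = 0
    # Step 3 and 5: visit every pair of items in a single ranking.
    for i, j in combinations(range(len(scores)), 2):
        if judgements[i] == judgements[j]:
            continue  # equal judgements express no preference, skip the pair
        total += 1
        # Step 4: the pair is correct when the item with the higher
        # judgement also received the higher model score.
        if (scores[i] - scores[j]) * (judgements[i] - judgements[j]) > 0:
            correct += 1
    return correct / total if total else 1.0

# Hypothetical ranking: only the second item was clicked.
judgements = [0, 1, 0, 0]
scores = [0.2, 0.9, 0.4, 0.1]
print(pairwise_accuracy(scores, judgements))
```

A real LambdaMART implementation additionally weights each pair by how much swapping the two items would change NDCG; that part is omitted here.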

In Metarank there are two supported library backends implementing this algorithm:

  • XGBoost: rank:pairwise objective

  • LightGBM: lambdarank objective

To configure the model, use the following snippet:

  <model name>:
    type: lambdamart 
    backend:
      type: xgboost # supported values: xgboost, lightgbm
      iterations: 100 # optional (default 100), number of iterations while training the model
      seed: 0 # optional (default = random), a seed to make training deterministic
    weights: # types and weights of interactions used in the model training
      click: 1 # you can increase the weight of some events to hint model to optimize more for them
    features: # features from the previous section used in the model
      - foo
      - bar
#    selector: # optional set of selectors to filter events for this specific model
#      rankingField: source
#      value: search

#    split: optional definition of train/test splitting strategy. See below for examples.

#    eval: optional list of evaluation metrics.

#    warmup: optional API warmup settings
#      sampledRequests: 100 # how many requests to sample during training
#      duration: 5s # how long to perform the warmup.
  • backend: required, xgboost or lightgbm, specifies the backend and its configuration.

  • weights: required, list of string:number pairs, specifies what interaction events are used for training. You can specify multiple events with different weights.

  • selector: optional, list of selectors, a set of rules to filter which events should be accepted by this model.

  • split: optional, a train/test splitting strategy. Default: time=80%. Options: random/hold_last/time with an optional ratio (default: 80%, which means 80% allocated to train, 20% to test). Example: random=80% means split dataset randomly, 80% should be allocated to the train set.

  • eval: optional, a list of eval metrics to measure after training. Default value is ["NDCG@10"], supported metrics are NDCG, NDCG@k, MAP, MAP@k, MRR (where k is the cutoff value).
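To make the NDCG@k metric from the eval option more concrete, here is a minimal Python sketch. It assumes the common exponential-gain formulation of DCG; the exact gain and tie handling used by Metarank and its backends may differ, and the function names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)  # pos is 0-based, so the discount starts at log2(2)
        for pos, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels listed in the order the items were shown.
print(ndcg_at_k([0, 1, 0], k=10))
```

The cutoff k means that items below position k contribute nothing to the score, which is why NDCG@10 only rewards improvements within the first ten positions.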

Interaction weight

Interactions define the way your users interact with the items you want to personalize, e.g. click, add-to-wishlist, purchase, like.

Interactions can be used in the feature extractors, for example to calculate the click-through rate. By defining weights you can also control the optimization goal of your model: whether to increase the number of likes, the number of purchases, or a balance between them.

You can define interaction by name and set weight for how much this interaction affects the model:

  click: 1.0
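One way to picture the effect of weights is as a mapping from interactions to per-item training labels. The Python sketch below is an assumption made for illustration only: it sums the weights of all interactions an item received, which may not match how Metarank actually derives labels. The function name `item_labels` and the sample events are hypothetical.

```python
def item_labels(item_ids, interactions, weights):
    """Sum the configured weight of every interaction each item received.

    Assumption for illustration: labels are additive. Interaction types
    that are missing from `weights` are ignored.
    """
    labels = {item: 0.0 for item in item_ids}
    for interaction_type, item in interactions:
        if item in labels and interaction_type in weights:
            labels[item] += weights[interaction_type]
    return labels

weights = {"click": 1.0, "purchase": 3.0}  # a purchase counts three times as much as a click
interactions = [("click", "a"), ("purchase", "a"), ("click", "b"), ("like", "c")]
print(item_labels(["a", "b", "c"], interactions, weights))
```

Under this assumption, raising the weight of purchase relative to click pushes the model to prefer items that get bought over items that only get clicked.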

Event selectors

When serving multiple models, there are cases when you need to separate ranking and interaction events per model. This is useful when your models have different contexts, e.g. you do personalized ranking for search results and recommendation results using the same Metarank installation, but utilizing different models.

Metarank supports selector configuration that can be used to route your events to the correct model or to drop events in certain scenarios.

Metarank supports the following event selectors:

  • Accept selector. Enabled by default to accept all events, if no selectors are defined.

selector:
  accept: true # true = accept all, false = reject all
  • Field selector. Accepts event when it has a specific string (or string-list) field defined for a ranking event. For example:

selector:
  rankingField: source
  value: search

The filter above will accept only events that have the source=search field defined in the fields section of the event:

{
  "event": "ranking",
  "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
  "timestamp": "1599391467000",
  "user": "user1",
  "session": "session1",
  "fields": [
      {"name": "source", "value": "search"}
  ],
  "items": [
    {"id": "item1", "fields": [{"name": "relevancy", "value": 1.0}]}
  ]
}
  • Sampling selector: randomly accept or drop an event, depending on the acceptance ratio:

selector:
  ratio: 0.5

The sampling selector above will randomly accept about half of the events.

  • Max interaction position selector: only accept click-through events when interaction position is not too high/low. Can be useful to exclude visitor sessions with discovery-style browsing behavior, or too short rankings.

selector:
  maxInteractionPosition: 10
  minInteractionPosition: 3 # both fields are optional
  • Ranking length selector: only accept click-through events with number of items within a defined range.

selector:
  minItems: 10 # so there should be at least 10 items in the ranking
  maxItems: 20 # both min and max are optional
  • AND/OR/NOT selector: combine multiple selectors within a single boolean combination:

selector:
  and:
    - rankingField: source
      value: search
    - or:
        - rankingField: segment
          value: test
        - ratio: 0.5
        - not:
            accept: false
    

AND and OR selectors take a list of nested selectors as arguments; the NOT selector takes only a single selector argument.
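The way nested selectors combine can be sketched as a small recursive evaluator in Python. This is an illustrative re-implementation, not Metarank's code: the function name `accepts` is hypothetical, selectors and events are plain dictionaries mirroring the YAML and JSON shown above, and the position and ranking-length selectors are left out for brevity.

```python
import random

def accepts(selector, event, rng=random):
    """Return True when the (hypothetical) selector dict accepts the ranking event."""
    if "accept" in selector:
        return bool(selector["accept"])
    if "rankingField" in selector:
        wanted = selector["value"]
        for field in event.get("fields", []):
            if field["name"] == selector["rankingField"]:
                value = field["value"]
                # a string-list field matches when any element equals the wanted value
                return wanted in value if isinstance(value, list) else value == wanted
        return False  # the field is not present in the event
    if "ratio" in selector:
        return rng.random() < selector["ratio"]
    if "and" in selector:
        return all(accepts(s, event, rng) for s in selector["and"])
    if "or" in selector:
        return any(accepts(s, event, rng) for s in selector["or"])
    if "not" in selector:
        return not accepts(selector["not"], event, rng)
    raise ValueError(f"unsupported selector: {selector}")

# The AND/OR/NOT example from above, written as a dict.
selector = {"and": [
    {"rankingField": "source", "value": "search"},
    {"or": [
        {"rankingField": "segment", "value": "test"},
        {"ratio": 0.5},
        {"not": {"accept": False}},
    ]},
]}
event = {"event": "ranking", "fields": [{"name": "source", "value": "search"}]}
print(accepts(selector, event))
```

In this particular example the OR branch always succeeds because `not: accept: false` is always true, so the whole selector reduces to the source=search check.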

Train/test splitting strategies

Metarank supports three train/test splitting strategies:

  • random: split dataset randomly.

  • hold_last: for each session having multiple rankings, take the last N% of rankings as a test set. Can be useful to measure the impact of in-session personalization.

  • time: split dataset by a timestamp.

Each strategy definition in a config file can optionally be configured with a split ratio, which is 80% by default. Examples:

  • random=80%: split dataset randomly. Be careful with random splitting, as it may introduce label leaking.

  • hold_last: split within session with a default 80% splitting ratio.
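The difference between the three strategies can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Metarank's implementation: rankings are represented as dictionaries with hypothetical `session` and `timestamp` keys, and sessions with a single ranking are assumed to go entirely to the train set.

```python
import random
from collections import defaultdict

def split_random(rankings, ratio=0.8, seed=0):
    """Shuffle all rankings and cut at the ratio."""
    shuffled = rankings[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]

def split_time(rankings, ratio=0.8):
    """Oldest rankings go to train, the most recent ones to test."""
    ordered = sorted(rankings, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * ratio)
    return ordered[:cut], ordered[cut:]

def split_hold_last(rankings, ratio=0.8):
    """Within each session, hold out the last rankings as the test set."""
    by_session = defaultdict(list)
    for r in sorted(rankings, key=lambda r: r["timestamp"]):
        by_session[r["session"]].append(r)
    train, test = [], []
    for session_rankings in by_session.values():
        if len(session_rankings) < 2:
            train.extend(session_rankings)  # assumption: nothing to hold out
            continue
        cut = int(len(session_rankings) * ratio)
        train.extend(session_rankings[:cut])
        test.extend(session_rankings[cut:])
    return train, test

# Ten hypothetical rankings spread over two sessions.
rankings = [{"id": i, "session": "s1" if i < 5 else "s2", "timestamp": i} for i in range(10)]
train, test = split_hold_last(rankings)
print([r["id"] for r in test])
```

The time and hold_last strategies never train on rankings that happened after the test rankings they are evaluated on (globally for time, within a session for hold_last), which is the property random splitting lacks.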

XGBoost and LightGBM backend options

  • iterations: optional, number, default: 100, number of trees in the model.

  • learningRate: optional, number, default: 0.1, a higher rate means faster training but a less precise model.

  • ndcgCutoff: optional, number, default: 10, only N first items may affect the NDCG.

  • maxDepth: optional, number, default: 8, the depth of the tree.

  • seed: optional, string or number, default: random, set a fixed value to make model training deterministic.

  • sampling: optional, default: 0.8, fraction of features used to build a tree, useful to prevent over-fitting.

LightGBM also supports these specific options:

  • numLeaves: optional, number, default: 16, how many leaves the tree may have.

Shuffle

Shuffle is a baseline model that may be used in A/B tests as a "worst-case" ranking scenario, in which the order of items is random. The shuffle model is configured in the following way:

  <model name>:
    type: shuffle
    maxPositionChange: 5
  • Parameter maxPositionChange controls the amount of randomness that shuffle can introduce in the original ranking. In other words, maxPositionChange sets how far away an item can drift from its original position.
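A bounded shuffle of this kind can be sketched in Python as below. This is one possible technique that keeps every item within a fixed distance of its original position, not necessarily the algorithm Metarank uses; the function name `bounded_shuffle` and its `seed` parameter are hypothetical.

```python
import random

def bounded_shuffle(items, max_position_change, seed=None):
    """Shuffle so that no item moves more than max_position_change positions.

    Each item gets the sort key `position + noise` with noise drawn from
    [0, max_position_change]. Items more than max_position_change apart can
    never swap, because their keys cannot cross, so the drift stays bounded.
    """
    rng = random.Random(seed)
    keyed = [(pos + rng.uniform(0, max_position_change), pos) for pos in range(len(items))]
    keyed.sort()
    return [items[pos] for _, pos in keyed]

original = list(range(20))
shuffled = bounded_shuffle(original, max_position_change=5, seed=42)
# Largest distance any item moved from its original position.
print(max(abs(new_pos - item) for new_pos, item in enumerate(shuffled)))
```

With max_position_change set to 0 the noise vanishes and the original order is returned unchanged.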

Noop

Noop is also a baseline model, which does nothing. The main purpose of this model is to serve as a baseline of the original ranking sent to Metarank during A/B tests. It's configured with the following snippet:

  <model name>:
    type: noop

It has no options and does not do any modifications to the ranking, just bouncing it back as-is.

The following options also apply to the lambdamart model type:

  • features: required, list of strings, features used for model training, see the Feature extractors documentation.

  • warmup: optional, API warmup settings. See the API warmup section for details.

  • debias: optional, default: false. Enable booster-native position bias removal support. See the two articles about unbiased LTR for XGBoost and for LightGBM for details.

Please consult the LightGBM and XGBoost docs about tuning the backend parameters.
