
# Diversification

## diversity

Computes how much each item in a ranking differs from the other items in the same ranking. Numeric and string fields are supported.

### Diversification over numeric fields

Suppose every item in your inventory has a numeric `price` field:

```json
{
  "event": "item",
  "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
  "item": "item1",
  "timestamp": "1599391467000",
  "fields": [{"name": "price", "value": 69.0}]
}
```

Then, for the ranking below:

```json
{
  "event": "ranking",
  "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
  "timestamp": "1599391467000",
  "user": "user1",
  "session": "session1",
  "items": [
    {"id": "item1"},
    {"id": "item2"},
    {"id": "item3"} 
  ]
}
```

we can compute how much each item's price differs from the median price across the whole ranking with the following configuration snippet:

```yaml
- name: price_diff
  type: diversity
  source: item.price # only item.* fields are accepted
  ttl: 90d # optional, when to expire tracked fields
  top: 10 # optional, take only top-N items to compute the median
```

For example, given the following item prices:

* p1: price=100
* p2: price=200
* p3: price=250
* p4: price=300
* p5: price=220

For the ranking `[p1, p2, p3, p4, p5]`, the median price is 220, so the per-item differences are:

* p1: price\_diff=-120
* p2: price\_diff=-20
* p3: price\_diff=30
* p4: price\_diff=80
* p5: price\_diff=0

When a ranking is very long, consider limiting the number of items used to compute the median. With `top: 3`, the median for the same items is computed over p1, p2, and p3 only and equals 200, while the difference is still reported for every item:

* p1: price\_diff=-100
* p2: price\_diff=0
* p3: price\_diff=50
* p4: price\_diff=100
* p5: price\_diff=20
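
The two worked examples above can be reproduced with a short Python sketch. This is an illustration of the arithmetic only, not Metarank's implementation: the function name `median_diffs` is hypothetical, and the use of `statistics.median` (which averages the two middle values for an even number of items) is an assumption, since both examples here use an odd number of items.

```python
from statistics import median

def median_diffs(prices, top=None):
    """Return price - median for every item in ranking order.

    `top` mirrors the `top` option: when set, only the first `top`
    items contribute to the median (hypothetical helper, for illustration).
    """
    window = prices if top is None else prices[:top]
    m = median(window)
    # The difference is computed for all items, not only the top-N window.
    return [p - m for p in prices]

prices = [100, 200, 250, 300, 220]  # p1..p5 in ranking order

print(median_diffs(prices))         # [-120, -20, 30, 80, 0]
print(median_diffs(prices, top=3))  # [-100, 0, 50, 100, 20]
```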

### Diversification over string fields

This type of diversification is useful for seeing how much your items differ across low-cardinality fields such as tags, colors, sizes, and categories. Both `string` and `string[]` field types are supported.

Suppose every item in your inventory has a `color` field, as in the example below:

```json
{
  "event": "item",
  "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
  "item": "item1",
  "timestamp": "1599391467000",
  "fields": [{"name": "color", "value": "red"}]
}
```

Then, for the ranking below:

```json
{
  "event": "ranking",
  "id": "81f46c34-a4bb-469c-8708-f8127cd67d27",
  "timestamp": "1599391467000",
  "user": "user1",
  "session": "session1",
  "items": [
    {"id": "item1"},
    {"id": "item2"},
    {"id": "item3"} 
  ]
}
```

we can compute how frequently each color appears in the result set with the following configuration snippet:

```yaml
- name: color_diff
  type: diversity
  source: item.color # only item.* fields are accepted
  ttl: 90d # optional, when to expire tracked fields
  top: 10 # optional, take only top-N items to compute the histogram
```

The algorithm builds tag frequencies over the ranking (a `color -> count` histogram in the example above, expressed as shares), and then scores each item by summing the frequencies of the tags it carries. For example:

* Given the frequencies {red: 50%, green: 30%, blue: 20%}:
* For an item that is only red, the score is 50%.
* For a red-blue item, the score is 50% + 20% = 70%.
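
The scoring described above can be sketched in Python as follows. This is a minimal illustration rather than Metarank's actual code: the names `tag_frequencies` and `diversity_score` are hypothetical, and normalizing each count by the total number of tag occurrences in the ranking is an assumption chosen so that the frequencies add up to 100%, as in the example.

```python
from collections import Counter

def tag_frequencies(ranking_tags, top=None):
    """Share of each tag among all tag occurrences in the ranking.

    `ranking_tags` holds one list of tags per item, in ranking order.
    `top` mirrors the `top` option and limits the histogram to the
    first `top` items (hypothetical helper, for illustration).
    """
    window = ranking_tags if top is None else ranking_tags[:top]
    counts = Counter(tag for tags in window for tag in tags)
    total = sum(counts.values())
    # Assumption: normalize by the total number of tag occurrences.
    return {tag: n / total for tag, n in counts.items()} if total else {}

def diversity_score(item_tags, freqs):
    """Sum of the ranking-wide frequencies of the tags this item carries."""
    return sum(freqs.get(tag, 0.0) for tag in set(item_tags))

# Ten single-color items reproduce the frequencies from the example:
# red 50%, green 30%, blue 20%.
ranking = [["red"]] * 5 + [["green"]] * 3 + [["blue"]] * 2
freqs = tag_frequencies(ranking)

print(f"{diversity_score(['red'], freqs):.0%}")          # 50%
print(f"{diversity_score(['red', 'blue'], freqs):.0%}")  # 70%
```

A tag that never appears in the ranking contributes 0 to the score, so an item with only unseen tags scores 0 in this sketch.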


