# Diversification

## diversity

Computes how different the current item is from the other items within the same ranking. Numeric and string fields are supported.

### Diversification over numeric fields

Consider that all items in your inventory have a numeric `price` field:

Then for a ranking below:

we can compute how much each item's price differs from the median price across the whole ranking with the following configuration snippet:

For example, given the following item prices:

- p1: price=100
- p2: price=200
- p3: price=250
- p4: price=300
- p5: price=220

So for a ranking `[p1, p2, p3, p4, p5]` we compute a median value of 220, and then compute each item's difference from it:

- p1: price_diff=-120
- p2: price_diff=-20
- p3: price_diff=30
- p4: price_diff=80
- p5: price_diff=0

When you have a very long ranking, it is worth limiting the number of items taken into account when computing the median. With `top=3`, for the same set of items in the ranking above, the median is computed over the first three items only and equals 200:

- p1: price_diff=-100
- p2: price_diff=0
- p3: price_diff=50
- p4: price_diff=100
- p5: price_diff=20
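The arithmetic in both examples above can be reproduced with a short Python sketch. This is only an illustration of the computation as described here, not the actual implementation: the function name `price_diffs` is hypothetical, and how a median is resolved for an even number of items is an assumption (Python's `statistics.median` averages the two middle values).

```python
from statistics import median

def price_diffs(prices, top=None):
    """Subtract the median of the first `top` prices (all prices if top is None)
    from every price in the ranking. Hypothetical helper for illustration."""
    window = prices if top is None else prices[:top]
    m = median(window)
    return [p - m for p in prices]

# Prices of p1..p5 in ranking order, as in the example above.
prices = [100, 200, 250, 300, 220]

print(price_diffs(prices))         # [-120, -20, 30, 80, 0]
print(price_diffs(prices, top=3))  # [-100, 0, 50, 100, 20]
```

Note that with `top=3` the median is taken over the first three items only, but the difference is still computed for every item in the ranking.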

### Diversification over string fields

This type of diversification can be useful to see how different your items are over low-cardinality fields like tags, colors, sizes and categories. Both `string` and `string[]` field types are supported.

When all your inventory items have a `color` field, as in the example below:

Then for a ranking below:

we can compute how frequently each color appears in the result set with the following configuration snippet:

The difference algorithm builds tag frequencies over the ranking (so `color -> count` in our example above), and then computes the relative intersection between the item's tags and those tag frequencies. An example:

Given a frequency of `{red: 50%, green: 30%, blue: 20%}`:

- for an item having only the red color, the score is 50%;
- for a red-blue item, the score is 50% + 20% = 70%.
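A minimal Python sketch of this scoring follows. It assumes that frequencies are normalized by the total number of tag occurrences in the ranking, so that they sum to 100%; the text above does not spell out the normalization, so treat that as an assumption. The names `tag_frequencies`, `diversity_score` and the sample ranking are hypothetical, chosen so that the frequencies match the example.

```python
from collections import Counter

def tag_frequencies(ranking):
    """Map each tag to its share of all tag occurrences in the ranking.
    Assumption: normalization is by total tag occurrences."""
    counts = Counter(tag for tags in ranking.values() for tag in tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

def diversity_score(item_tags, freqs):
    """Sum the ranking-wide frequencies of the tags this item carries."""
    return sum(freqs.get(tag, 0.0) for tag in set(item_tags))

# Hypothetical ranking: red appears 5 times, green 3 times, blue 2 times.
ranking = {
    "p1": ["red"],
    "p2": ["red", "blue"],
    "p3": ["red"],
    "p4": ["red"],
    "p5": ["red", "green"],
    "p6": ["green"],
    "p7": ["green"],
    "p8": ["blue"],
}

freqs = tag_frequencies(ranking)
print(f"{diversity_score(ranking['p1'], freqs):.0%}")  # 50%
print(f"{diversity_score(ranking['p2'], freqs):.0%}")  # 70%
```

A plain `string` field can be handled the same way by treating it as a one-element list.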
