User Profile
User-Agent field extractor
A typical HTTP User-Agent field has quite a lot of embedded meta information, which can be useful for ranking:
Is it mobile or desktop? Mobile visitors behave differently from desktop ones: they scroll less and get distracted more quickly.
iOS or Android? Assuming that Apple devices are on average more expensive than Android ones, this can also hint at the visitor's goals.
Stock browser or something custom?
How old is the OS? On Android, an ancient OS version can mean an old and unsupported device, so it can also be a signal for your ranking.
But the User-Agent string itself is quite cryptic.
There is a large collaborative effort to build a database of typical UA patterns, [UA-Parser](https://github.com/ua-parser), which is used to extract the available metadata from these strings.
To map this to actual ML features, there is a predefined set of mappers:
platform: mobile, desktop, tablet
os: ios, android, windows, linux, macos, chrome os
browser: safari, chrome, firefox, opera, ie, other
bot: is it a known crawler or not
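To make the mapping from a raw string to features more concrete, here is a minimal Python sketch. It does not reproduce UA-Parser or Metarank's actual logic: the substring heuristics, the sample UA string, the helper names (`guess_platform`, `guess_os`, `one_hot`), and the choice of one-hot encoding for the output are all assumptions made for illustration.

```python
# Illustrative User-Agent string (an assumed example of a mobile Safari UA).
SAMPLE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 14_4 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 "
    "Mobile/15E148 Safari/604.1"
)

# Category vocabularies, in the order listed for the mappers above.
PLATFORMS = ["mobile", "desktop", "tablet"]
OSES = ["ios", "android", "windows", "linux", "macos", "chrome os"]


def guess_platform(ua: str) -> str:
    """Crude platform guess from substrings (hypothetical heuristic)."""
    s = ua.lower()
    if "ipad" in s or "tablet" in s:
        return "tablet"
    if "android" in s and "mobi" not in s:
        # Android tablets conventionally omit the "Mobile" token.
        return "tablet"
    if "mobi" in s or "iphone" in s:
        return "mobile"
    return "desktop"


def guess_os(ua: str):
    """Crude OS guess; check order matters because tokens overlap."""
    s = ua.lower()
    if "iphone" in s or "ipad" in s:
        return "ios"  # checked first: iOS UAs also contain "like Mac OS X"
    if "android" in s:
        return "android"  # checked before "linux": Android UAs contain "Linux"
    if "windows" in s:
        return "windows"
    if "cros " in s:
        return "chrome os"
    if "mac os x" in s or "macintosh" in s:
        return "macos"
    if "linux" in s:
        return "linux"
    return None  # unrecognized OS


def one_hot(value, vocabulary):
    """Encode a single category as a 0/1 vector over the vocabulary."""
    return [1 if value == v else 0 for v in vocabulary]


platform_vector = one_hot(guess_platform(SAMPLE_UA), PLATFORMS)
os_vector = one_hot(guess_os(SAMPLE_UA), OSES)
```

A real pattern database covers far more cases than these few substring checks, which is the reason to rely on UA-Parser rather than hand-written rules.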
To configure the extractor, use this YAML snippet:
The UA field is taken from each ranking request, so it should always be present.
Interacted with
For the current item, did this visitor interact with another item that has the same field value?
Example:
For this example, Metarank will track all color field values of all the items the visitor clicked and intersect this set with the per-item field values in the ranking.
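The track-and-intersect idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Metarank's implementation: the `InteractedWith` class and its methods are hypothetical, and reducing the intersection to an overlap count is an assumed choice, since the text above does not say how the intersection becomes a numeric feature.

```python
from collections import defaultdict


class InteractedWith:
    """Hypothetical per-visitor tracker of field values seen on clicked items."""

    def __init__(self, fields):
        self.fields = list(fields)
        # visitor id -> field name -> set of values seen on clicked items
        self.seen = defaultdict(lambda: defaultdict(set))

    def on_click(self, visitor, item_fields):
        """Remember the tracked field values of an item the visitor clicked."""
        for field in self.fields:
            self.seen[visitor][field].update(item_fields.get(field, []))

    def overlap(self, visitor, item_fields):
        """Per field, count values shared between the profile and a ranked item."""
        return {
            field: len(self.seen[visitor][field] & set(item_fields.get(field, [])))
            for field in self.fields
        }


tracker = InteractedWith(["color"])
tracker.on_click("visitor-1", {"color": ["red"]})
tracker.on_click("visitor-1", {"color": ["blue", "green"]})

# Score two candidate items from a ranking request.
match = tracker.overlap("visitor-1", {"color": ["red", "white"]})
no_match = tracker.overlap("visitor-1", {"color": ["black"]})
```

Passing several field names to the constructor keeps one value set per field inside the same visitor profile.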
The interacted_with extractor can also track multiple fields at once within a single visitor profile:
Referer
For user/ranking/interaction events, it's possible to parse an HTTP Referer field and extract the source medium. We use the Snowplow referer-parser library, which defines 6 types of referer medium: unknown, search, internal, social, email, paid.
For a ranking event:
and a configuration:
It will detect that it's a "search" medium and one-hot-encode it to [0, 1, 0, 0, 0, 0].
A source field can be of a user/ranking/interaction type, and the feature extractor memorizes all the referer fields ingested:
it matches the HTTP Referer semantics, as the referer field is sent on each request
there can be multiple referers. For example, a visitor lands on a site from Google (and gets a "search" referer), then does a couple of interactions with the site (and also gets an "internal" referer medium)
When a visitor has multiple referers memorized, the one-hot-encoded vector will have multiple flags enabled, like [0, 1, 1, 0, 0, 0] for the case with search+internal referer mediums.
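The encoding described above can be sketched as follows. This is a minimal illustration, not Metarank's code: the `RefererProfile` class is hypothetical, the medium is assumed to be already detected by the referer parser, and the vector order is assumed to follow the order in which the six mediums are listed above (which agrees with both example vectors).

```python
# The six mediums, in the order assumed for the output vector.
MEDIUMS = ["unknown", "search", "internal", "social", "email", "paid"]


def encode_mediums(seen):
    """Set one flag per medium that appears in the memorized set."""
    return [1 if medium in seen else 0 for medium in MEDIUMS]


class RefererProfile:
    """Hypothetical store that memorizes every medium seen per visitor."""

    def __init__(self):
        self.seen = {}  # visitor id -> set of mediums

    def observe(self, visitor, medium):
        if medium not in MEDIUMS:
            raise ValueError(f"unknown medium: {medium}")
        self.seen.setdefault(visitor, set()).add(medium)

    def features(self, visitor):
        return encode_mediums(self.seen.get(visitor, set()))


profile = RefererProfile()

# The visitor lands from a search engine.
profile.observe("visitor-1", "search")
after_landing = profile.features("visitor-1")   # [0, 1, 0, 0, 0, 0]

# Later requests carry an internal referer as well.
profile.observe("visitor-1", "internal")
after_browsing = profile.features("visitor-1")  # [0, 1, 1, 0, 0, 0]
```

Storing the mediums as a set means repeated internal referers do not change the vector after the first one.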