
Semantic similarity

semantic is a content-based recommendation model that computes item similarity solely from the difference between the neural embeddings of items.

This model is useful for solving the cold-start problem in recommendations, as it requires no user feedback.
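To illustrate the idea of similarity between embeddings, here is a minimal Python sketch. It assumes cosine similarity as the measure, which this page does not confirm as the metric Metarank uses internally. The helper names (`cosine_similarity`, `most_similar`) and the toy 3-dimensional vectors are hypothetical and not part of Metarank.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(item_id, embeddings, k=3):
    """Return up to k (item, score) pairs most similar to item_id, best first."""
    query = embeddings[item_id]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy item embeddings (hypothetical values, 3 dimensions for readability).
embeddings = {
    "p1": [1.0, 2.0, 3.0],
    "p2": [2.0, 1.5, 1.0],
    "p3": [1.1, 2.1, 2.9],
}

# p3 points in almost the same direction as p1, so it ranks first.
print(most_similar("p1", embeddings, k=2))
```

Because the ranking depends only on item content, a newly added item can be recommended as soon as its embedding exists, without any interaction history.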

Configuration

- type: semantic
  encoder:
    type: bert
    model: metarank/all-MiniLM-L6-v2
    dim: 384 # embedding size
  itemFields: [title, description]
  • itemFields: the item fields whose values are used to compute the embedding

  • encoder: the method used to compute embeddings

Metarank's support for embeddings is limited to two encoder types:

  • bert: only supports ONNX-encoded sentence-transformers models from HuggingFace.

  • csv: loads a custom pre-made dictionary of embeddings.

- type: semantic
  encoder:
    type: csv
    dim: 384 # embedding size
    path: /opt/dic.csv
  itemFields: [title, description]

The dictionary should be a comma-separated CSV file, where:

  • the 1st column is the product id

  • columns 2 through N+1 are the float values of the N-dimensional embedding

Example (with 3-dimensional embeddings):

p1,1.0,2.0,3.0
p2,2.0,1.5,1.0
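
A dictionary in this layout can be produced and sanity-checked with a few lines of Python. The sketch below uses only the standard library. The function names (`write_dictionary`, `load_dictionary`) and the temporary file path are hypothetical, and the validation shows one way to catch rows whose length does not match the configured dim, not a description of how Metarank itself parses the file.

```python
import csv
import os
import tempfile


def write_dictionary(path, embeddings):
    """Write {product_id: [float, ...]} as one CSV row per product."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for product_id, vector in embeddings.items():
            writer.writerow([product_id, *vector])


def load_dictionary(path, dim):
    """Read the CSV back, checking that every row has 1 + dim columns."""
    result = {}
    with open(path, newline="") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            if not row:
                continue  # skip blank lines
            if len(row) != dim + 1:
                raise ValueError(
                    f"line {line_no}: expected {dim + 1} columns, got {len(row)}"
                )
            result[row[0]] = [float(value) for value in row[1:]]
    return result


# Same toy data as the example above (3 dimensions instead of 384).
embeddings = {"p1": [1.0, 2.0, 3.0], "p2": [2.0, 1.5, 1.0]}

# Hypothetical location; a real setup would use the path from the config.
dictionary_path = os.path.join(tempfile.mkdtemp(), "dic.csv")
write_dictionary(dictionary_path, embeddings)
loaded = load_dictionary(dictionary_path, dim=3)
print(loaded)
```

Running a check like this before deployment helps ensure the number of values per row agrees with the dim setting in the encoder configuration.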
Last updated 2 years ago