Changelog
In a human-readable format. For a technical changelog for robots, see . Check our for more detailed updates.
expose redis click-through store TTL to config
a bugfix release: slash/semicolon in key/value, kinesis retries
a bugfix release: race condition in cache invalidation, booster native memleak
Rate feature can now be scoped to
You can now specify should be computed on training.
Proper handling of
support for rocksdb-backed file storage
a bugfix release
Support for kv-granular Redis TTLs
Support HF tokenizers for biencoders: now you can run a multi-lingual E5 model in Metarank!
a minor bugfix release
fixed an important bug with dataset preparation (symptom: NDCG reported by the booster was higher than NDCG computed after the training) - prediction quality should go up a lot.
print NDCG before and after reranking
print statistics for mem usage after training
fix for crash when using file-based clickthrough store
Upgrading: note that the Redis state format has a non-backwards-compatible change, so you need to reimport the data when upgrading.
Local caching for state; the import should be 2x-3x faster.
fixed a bug for the case when there is a click on a non-existent item.
cache.maxSize for Redis now disables client-side caching altogether, making Metarank compatible with GCP Memorystore Redis.
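A minimal sketch of where this sits in the config, assuming the cache block lives under the redis state section and that a zero size is what switches the client-side cache off:

```yaml
state:
  type: redis
  host: redis.example.com   # placeholder host
  port: 6379
  cache:
    maxSize: 0   # assumption: 0 disables client-side caching entirely
```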
fixed mem leak in clickthrough joining buffer.
lower mem allocation pressure in interacted_with feature.
interacted_with feature now supports string[] fields
fixed a notorious bug with local-file click-through store.
XGBoost LambdaMART impl now supports categorical encoding.
linux/aarch64 support, so docker images on Mac M1 are OK.
a ton of bugfixes
bugfix: add explicit sync on /feedback api call
bugfix: config decoding error for field_match over terms
bugfix: version detect within docker was broken
bugfix: issue with improper iface being bound in docker
Notable features:
Most notable improvements:
Highlights of this release are:
Highlights of this release are:
Highlights of this release are:
Highlights of this release are:
Kubernetes support: now it's possible to have a production-ready metarank deployment in K8S
Kinesis source: on par with Kafka and Pulsar
Custom connector options pass-through
Metarank is a multi-stage and multi-component system, and now it's possible to get it deployed in minutes inside a Kubernetes cluster:
Bootstrap, Upload and Update jobs can be run both locally (to simplify things for small datasets) and inside the cluster in a distributed mode.
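As a rough illustration only, a minimal Deployment for the serving part could look like the sketch below; the image tag, CLI arguments and port are assumptions rather than the official manifests:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metarank-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: metarank-api
  template:
    metadata:
      labels:
        app: metarank-api
    spec:
      containers:
        - name: metarank
          image: metarank/metarank:latest                    # pin to a concrete release in practice
          args: ["serve", "--config", "/conf/config.yml"]    # hypothetical arguments
          ports:
            - containerPort: 8080                            # assumed API port
          volumeMounts:
            - name: conf
              mountPath: /conf
      volumes:
        - name: conf
          configMap:
            name: metarank-config                            # hypothetical ConfigMap with config.yml
```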
As we're using Flink's connector library for pulling data from Kafka/Kinesis/Pulsar, there is a ton of custom options you can tune for each connector. It's impossible to expose all of them directly, so the connector config now has a free-form options section, allowing you to set any supported option for the underlying connector.
An example of forcing the Kinesis connector to use the EFO consumer:
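A sketch of such a config, assuming a kinesis source block; the stream name and region are placeholders, and the two option keys are the standard Flink Kinesis consumer properties passed through verbatim:

```yaml
source:
  type: kinesis
  region: us-east-1          # placeholder region
  topic: metarank-events     # placeholder stream name
  options:
    # Flink Kinesis consumer settings forwarded as-is to the connector:
    flink.stream.recordpublisher: EFO
    flink.stream.efo.consumername: metarank
```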
/inference API to expose bi- and cross-encoders.
Support for
field_match
support for
field_match
support for
field_match
support for
Relevance judgments can now also
feature extractor
feature
support for similar and trending items models.
JAVA_OPTS env variable to control JVM heap size.
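For example, with Docker Compose the heap cap could be wired in like this (the compose layout and the 2g value are only illustrative):

```yaml
services:
  metarank:
    image: metarank/metarank:latest
    environment:
      JAVA_OPTS: "-Xmx2g"   # cap the JVM heap at 2 GB
```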
when serving multiple models.
for storing clickthrough data.
Redis AUTH override with
Prometheus /metrics
Per-item
--split option for CLI with
Redis
Redis
feature now has much less overhead in Redis, and supports multiple fields in a single visitor profile.
, and not inside Redis, also reducing the overall costs of running Metarank
it is now possible to datasets for further hyper-parameter optimization.
, with ballpark estimations useful for resource planning.
with reducer and autofeature support.
which is common on managed Redis setups.
, which is 2x faster and 4x more compact than JSON
, so metarank can now be run natively on Mac M1/M2.
and an official guide on how to do production deployment on k8s.
for further hyper-parameter tuning.
support, so having 1 click over 2 impressions no longer results in a 50% CTR.
based on a dynamic position feature.
based on an existing dataset
added sub-command
added parameters to reduce memory consumption
added and
Flink is removed. As a result, only memory and redis modes are supported now.
now has an updated structure and is not compatible with the previous format.
CLI is updated; most of the options are moved to .
We have updated the mode of the CLI, so you can use it to validate your data and configuration.
Inference API is just a regular
Job management is done with
See for details.
can also be used as an event source. It still has a couple of drawbacks compared with Kafka/Pulsar: for example, due to its maximum 7-day data retention it cannot be effectively used as permanent storage of historical events. But it's still possible to pair it with AWS Firehose writing events to S3, so realtime events come from Kinesis and historical events are offloaded to S3.
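A rough sketch of that split, assuming a kinesis source for the realtime path and a file source pointing at the Firehose output for the historical one; the section and field names are assumptions:

```yaml
# realtime events are consumed straight from the Kinesis stream
inference:
  source:
    type: kinesis
    region: us-east-1         # placeholder
    topic: metarank-events    # placeholder stream name

# historical events are offloaded by Firehose to S3 and read from there
bootstrap:
  source:
    type: file
    path: s3://my-bucket/firehose-output/   # hypothetical bucket path
```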
Check out for details and examples.
See for details.