Kubernetes

Metarank can be deployed in a distributed fashion inside a Kubernetes cluster.

Installation overview

Prerequisites

For a distributed K8s deployment, Metarank requires the following external services and tools to be available:

  1. Helm: used to install the Metarank chart.

  2. Redis: used as the persistent data store for inference. It can either be installed inside k8s with Helm, or provided as a managed service like AWS ElastiCache for Redis.

  3. A distributed event bus for event ingestion: Kafka, Pulsar, Kinesis, and the internal RESTful API are supported.

Data Import

Metarank supports multiple ways of ingesting training data into the system:

  • the event file can be HTTP POSTed to the /feedback endpoint using the REST API. Metarank does not do any in-memory buffering, so if your dataset is below 1 GiB in size, this may be the simplest way to ingest it.

  • events can be imported from a Kafka/Pulsar/Kinesis topic, or read from local files. Note that distributed import is not yet supported.

We suggest starting with the HTTP-based event import, and switching to offline local import if you run into any issues with it.
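As a sketch of the HTTP path, assuming Metarank's API is reachable at localhost:8080 and your training events sit in a newline-delimited JSON file (events.jsonl is a placeholder name — check the API reference for the exact payload format):

```shell
# POST the event file to the /feedback endpoint.
# Host, port and file name are assumptions for this example.
curl -X POST http://localhost:8080/feedback \
  --data-binary @events.jsonl
```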

Tuning the Helm chart

With Helm installed according to its official installation guide, you need to add the Metarank Helm repo:
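For example (the repository URL below is an assumption — verify it against the current Metarank documentation):

```shell
# Add the Metarank chart repo and refresh the local index.
# The URL is a placeholder for the canonical chart repository.
helm repo add metarank https://metarank.github.io/helm-charts
helm repo update
```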

In the chart directory, there are metarank.conf and values.yaml files that you'll need to update before the deployment:

The metarank.conf file is a regular metarank configuration file, so you can check the configuration guide to set things up manually, or use an automatic data-based config generator.

The metarank.conf file requires you to define a Redis endpoint for the state store. An example config file is shown below:
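A minimal sketch of the state section, assuming an in-cluster Redis service named redis-master (a placeholder; a complete config also needs models and features sections, as described in the configuration guide):

```yaml
# metarank.conf -- state store sketch; the host name is a placeholder
# for your own Redis service or managed endpoint.
state:
  type: redis
  host: redis-master
  port: 6379
```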

The values.yaml is a generic Helm deployment configuration file. You can tune it, but the defaults usually require no extra changes.

Resources

The default Helm chart sets no specific memory requests and limits, but they can be configured in values.yaml.
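A sketch of such an override in values.yaml, using the standard Kubernetes resources fields (the amounts are placeholders to size for your own workload):

```yaml
# values.yaml -- resource requests/limits sketch; values are placeholders.
resources:
  requests:
    memory: "2Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
```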

The Metarank docker container accepts a JAVA_OPTS environment variable to control the JVM memory usage. It defaults to JAVA_OPTS="-Xmx1g -verbose:gc" which means:

  • Use 1 GiB for the JVM heap. The actual RSS memory usage will be a bit higher due to extra JVM overhead.

  • Enable verbose GC logging. You may notice lines like the following in the log; they are normal:
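On modern JDKs, -verbose:gc maps to unified GC logging (-Xlog:gc), so the periodic collection lines look roughly like this; the exact timings and heap sizes will differ:

```
[0.004s][info][gc] Using G1
[1.250s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 24M->4M(1024M) 3.214ms
```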

Installing the chart

The chart itself is agnostic to the Metarank version and has separate versioning. For the latest Metarank 0.7.9 release, use the following command to install the chart:
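A sketch of the install command, assuming the repo was added under the name metarank and the chart reference is metarank/metarank (verify both against the chart's README):

```shell
# Install the chart with your tuned values.yaml.
# Release name and chart reference are assumptions for this example.
helm install metarank metarank/metarank --values values.yaml
```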

After that, a single Metarank pod should be running:
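You can check this with kubectl; the pod name suffix and age shown here are illustrative:

```shell
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
metarank-5f6d8b9c7d-abcde   1/1     Running   0          1m
```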

Next steps

After a successful deployment, you may want to do the following:
