Running in production
These are general recommendations on running Metarank in a production environment.
Persistence
Metarank provides several Persistence options, however for production setup we recommend using only Redis persistance as it operates separately from running Metarank instances.
Redis does not depend on Metarank instances being re-deployed and should be configured with disc backup.
At the moment, Metarank stores only processed events in Redis, so we recommend storing all events separately.
API Serving
Metarank CLI exposes several modes with which you can run Metarank: standalone
and serve
.
Although standalone
mode is great for development purposes, it can't be used for production deployment:
standalone mode cannot be scaled as it's not possible to run several instances that point to the same database
you cannot re-train the model without restarting Metarank
For production deployment, you should only use the serve
mode. You can have as many serve
instances as you need, depending on the load you have and you can perform graceful restarts of Metarank with 0 downtime.
Resource consumption of the serve
mode is relatively low as it performs minimal computations, so you can use cheaper nodes than when training the model.
At the moment, Metarank does not provide clustering capabilities out of the box, so you will need to use an external load balancer when deploying multiple API instances.
Re-training
Metarank exposes a train
mode that re-trains your model based on the calculated features. Training is a long-running process with high memory consumption, which depends on the amount of data that is stored, so we recommend running this process on-demand. You can re-train your model once a week or once a month, so there's no need to keep a large instance online all the time.
Last updated