Docker/Kubernetes + Gunicorn/芹菜-多个工人 VS 复制品？

我想知道使用 Gunicorn & 芹菜部署一个集装箱化的 Django 应用程序的正确方法是什么。

具体来说，这些过程中的每一个都有一个内置的垂直缩放方式，对于古尼科恩使用 workers，对于芹菜使用 concurrency。然后是使用 replicas的库伯内特方法

还有一种设置 worker 等于 CPU 某些功能的概念

每个核心2-4个工人

然而，我很困惑，在 K8中 CPU 是一个可分的共享资源，除非我使用 resoureceQuotas，否则这意味着什么。

我想知道最佳实践是什么，我能想到的有三种选择:

是否为 gunicorn 提供单个 worker 和为芹菜提供1的并发性，并使用副本对它们进行比例扩展? (水平比例扩展)
在一个内部缩放(垂直缩放)的复制部署中运行 Gunicorn 和芹菜。这将意味着分别设置相当高的 worker 和并发值。
一种介于1和2之间的混合方法，其中我们运行 gunicorn 和芹菜，工作人员和并发性的值很小(比如2) ，然后使用 K8s Deployment 副本横向扩展。

围绕这个问题有一些关于 SO 的问题，但没有一个能提供深入/深思熟虑的答案。如果有人可以分享他们的经验，将不胜感激。

注意: 我们对 Gunicorn 使用默认的 worker _ class sync

14201

小开

These technologies aren't as similar as they initially seem. They address different portions of the application stack and are actually complementary.

Gunicorn is for scaling web request concurrency, while celery should be thought of as a worker queue. We'll get to kubernetes soon.

Gunicorn

Web request concurrency is primarily limited by network I/O or "I/O bound". These types of tasks can be scaled using cooperative scheduling provided by threads. If you find request concurrency is limiting your application, increasing gunicorn worker threads may well be the place to start.

Celery

Heavy lifting tasks e.g. compress an image, run some ML algo, are "CPU bound" tasks. They can't benefit from threading as much as more CPUs. These tasks should be offloaded and parallelized by celery workers.

Kubernetes

Where Kubernetes comes in handy is by providing out-of-the-box horizontal scalability and fault tolerance.

Architecturally, I'd use two separate k8s deployments to represent the different scalablity concerns of your application. One deployment for the Django app and another for the celery workers. This allows you to independently scale request throughput vs. processing power.

I run celery workers pinned to a single core per container (-c 1) this vastly simplifies debugging and adheres to Docker's "one process per container" mantra. It also gives you the added benefit of predictability, as you can scale the processing power on a per-core basis by incrementing the replica count.

Scaling the Django app deployment is where you'll need to DYOR to find the best settings for your particular application. Again stick to using --workers 1 so there is a single process per container but you should experiment with --threads to find the best solution. Again leave horizontal scaling to Kubernetes by simply changing the replica count.

It's definitely something I had to wrap my head around when working on similar projects.