Django Celery Logging Best Practices

I'm trying to get Celery logging working with Django. I have logging set up in settings.py to go to the console (which works fine, since I'm hosting on Heroku). At the top of each module, I have:

import logging
logger = logging.getLogger(__name__)

And in my tasks I have:

from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)

This works great for logging from within a task, and I get output like this:

2012-11-13T18:05:38+00:00 app[worker.1]: [2012-11-13 18:05:38,527: INFO/PoolWorker-2] Syc feed is starting

But if that task then calls a method in another module, e.g. a queryset method, I get duplicate log entries, e.g.:

2012-11-13T18:00:51+00:00 app[worker.1]: [INFO] utils.generic_importers.ftp_processor process(): File xxx.csv already imported. Not downloaded
2012-11-13T18:00:51+00:00 app[worker.1]: [2012-11-13 18:00:51,736: INFO/PoolWorker-6] File xxx.csv already imported. Not downloaded

I thought I could use

CELERY_HIJACK_ROOT_LOGGER = False

to just use the Django logging, but this didn't work when I tried it, and even if I had gotten it working, I would lose the "PoolWorker-6" bit, which I want. (Incidentally, I can't figure out how to get the task name to show up in the log entries from Celery, as the docs seem to indicate it should.)

I suspect I'm missing something simple here.


When your logger is initialized at the top of "another module", it links to an ancestor logger that ends up handling your messages. This can be the root logger, or, as I usually see in Django projects, the logger with the name ''.
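A minimal sketch of the name-based lookup involved (the module name is copied from the question's log output):

import logging

# In "another module", a logger named after the module is created:
logger = logging.getLogger(__name__)  # e.g. 'utils.generic_importers.ftp_processor'

# It has no handlers of its own, so each record bubbles up the dotted-name
# hierarchy ('utils.generic_importers' -> 'utils' -> root) and is handled by
# every ancestor that has handlers attached - here, both your console handler
# and the handler Celery attached to the root logger, hence the duplicates.
logger.info('File xxx.csv already imported. Not downloaded')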

The best approach here is to override your logging config:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'simple': {
            'format': '%(levelname)s %(message)s',
            'datefmt': '%y %b %d, %H:%M:%S',
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'simple',
        },
        'celery': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'celery.log',
            'formatter': 'simple',
            'maxBytes': 1024 * 1024 * 100,  # 100 MB
        },
    },
    'loggers': {
        'celery': {
            'handlers': ['celery', 'console'],
            'level': 'DEBUG',
        },
    },
}


from logging.config import dictConfig
dictConfig(LOGGING)

With this config in place, I believe it should work as you expect.
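For instance, a task logger created with get_task_logger is parented under 'celery.task', so its records propagate up to the 'celery' logger configured above and reach both celery.log and the console. A minimal sketch, assuming Celery 3.x or later (the task name is illustrative):

from celery import shared_task
from celery.utils.log import get_task_logger

# get_task_logger() sets the logger's parent to 'celery.task', so records
# flow up to the 'celery' logger and its two handlers defined above.
logger = get_task_logger(__name__)

@shared_task
def sync_feed():
    logger.info('Syc feed is starting')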

P.S. dictConfig was added in Python 2.7.

It is troubling that Celery interferes with the root logger (which is not best practice and can't be controlled completely), but it does not disable your app's custom loggers in any way, so use your own handler names and define your own behavior rather than trying to fix this issue within Celery. (I like to keep my application logging separate anyway.) You can use separate handlers, or the same ones, for Django code and Celery tasks; you just need to define them in your Django LOGGING config. Add formatting args for module, filename, and processName to your formatter for sanity, to help you distinguish where messages originate.

(This assumes you have set up a handler for 'yourapp' in the LOGGING setting - it sounds like you are aware of this, though.)
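A minimal sketch of such a config, with the suggested formatter args (handler and logger names are placeholders):

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'detailed': {
            # module/filename/processName make it obvious where a record
            # originated - Django request, or a Celery pool worker:
            'format': '%(asctime)s %(levelname)s %(processName)s '
                      '%(module)s %(filename)s: %(message)s',
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'detailed',
        },
    },
    'loggers': {
        'yourapp': {
            'handlers': ['console'],
            'level': 'INFO',
        },
    },
}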

views.py

import logging

from tasks import yourtask  # wherever your task is defined

log = logging.getLogger('yourapp')

def view_fun():
    log.info('about to call a task')
    yourtask.delay()

tasks.py

import logging

from celery import task

log = logging.getLogger('yourapp')

@task
def yourtask():
    log.info('doing task')

For the logging that Celery itself generates, use the celeryd --logfile flag to send Celery output (e.g., worker init, task started, task failed) to a separate place if desired. Or, use the other answer here that sends the 'celery' logger to a file of your choosing.

Note: I would not use RotatingFileHandler - it is not safe for multi-process apps. Log rotation from an external tool like logrotate is safer. The same goes for logging from Django, assuming you have multiple processes there, or share log files with the Celery workers. If you're using a multi-server setup, you probably want to log somewhere centralized anyway.
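If you still want a per-file log while rotating externally, logging.handlers.WatchedFileHandler is a reasonable swap-in for the RotatingFileHandler in the other answer, since it reopens the file after logrotate moves it. A sketch, assuming logrotate manages celery.log:

'handlers': {
    'celery': {
        'level': 'DEBUG',
        # WatchedFileHandler detects the rename done by logrotate and
        # reopens the file, so rotation stays in the external tool:
        'class': 'logging.handlers.WatchedFileHandler',
        'filename': 'celery.log',
        'formatter': 'simple',
        # no maxBytes here - size management is logrotate's job
    },
},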

To fix the duplicate logging issue, what worked for me was to set propagate to False when declaring my settings.LOGGING dict:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'verbose',
        },
    },
    'formatters': {
        'verbose': {
            'format': '%(asctime)s %(levelname)s module=%(module)s, '
                      'process_id=%(process)d, %(message)s',
        },
    },
    'loggers': {
        'my_app1': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': False,  # this will do the trick
        },
        'celery': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}

Let's say your Django project layout is as follows:
my_project/
- tasks.py
- email.py

and let's say one of your tasks makes a call to some function in email.py; the logging will happen in email.py, and then that logging will get propagated to the 'parent', which in this case happens to be your Celery task. Thus, double logging. But setting propagate to False for a particular logger means that for that logger/app, its logs won't get propagated to the parent, so there will be no 'double' logging. By default, 'propagate' is set to True.

The Django docs have a section on this parent/child logger behavior.
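To see the mechanism in isolation, here is a minimal, self-contained sketch of the behavior (logger names mirror the config above):

import logging

logging.basicConfig()  # stands in for the handler Celery attaches to the root logger

app_logger = logging.getLogger('my_app1')
app_logger.addHandler(logging.StreamHandler())  # the 'console' handler

log = logging.getLogger('my_app1.email')  # as used inside email.py
log.warning('File xxx.csv already imported')  # emitted twice: by 'my_app1' and by root

app_logger.propagate = False
log.warning('File xxx.csv already imported')  # emitted once: propagation stops at 'my_app1'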

In case it helps someone: my problem was sending all Celery logs to Graylog. Here is the solution.

celery.py:

import os

from celery import Celery

# Standard bootstrapping assumed by this snippet; the project name matches
# the 'my_project' logger in the LOGGING config below:
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_project.settings')

app = Celery('my_project')
app.config_from_object('django.conf:settings', namespace='CELERY')


# ====== Magic starts
from celery.signals import setup_logging


@setup_logging.connect
def config_loggers(*args, **kwargs):
    from logging.config import dictConfig
    from django.conf import settings

    dictConfig(settings.LOGGING)
# ===== Magic ends


# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
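The trick here is that when a receiver is connected to the setup_logging signal, Celery skips configuring logging itself, so the dictConfig(settings.LOGGING) call above becomes the single place where both Django and Celery loggers are set up.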

settings.py:

LOGGING = {
    'version': 1,
    'handlers': {
        'graypy': {
            'class': 'graypy.GELFTCPHandler',
            'host': GRAYLOG_HOST,
            'port': GRAYLOG_PORT,
        },
    },
    'loggers': {
        'my_project': {
            'handlers': ['graypy'],
            'level': 'INFO',
        },
        # ====== Magic starts
        'celery': {
            'handlers': ['graypy'],
            'level': 'INFO',
        },
        # ===== Magic ends
    },
}
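With that in place, any module under the project package picks up the Graylog handler through its dotted logger name. A usage sketch (the message is illustrative):

import logging

logger = logging.getLogger(__name__)  # e.g. 'my_project.tasks'
logger.info('This record is shipped to Graylog via the graypy handler')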