Scheduling a Python script to run exactly once every hour

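Before I ask: cron jobs and Task Scheduler will be my last options. This script will be used on both Windows and Linux, and I'd prefer a coded-out way of doing this rather than leaving it to the end user.

Is there a Python library I can use to schedule tasks? I need to run a function once every hour. However, if I run a script once every hour and use a sleep call, then over time "once every hour" will run at a different part of the hour than it did the previous day, because of the delay inherent in executing the script and/or function.

What is the best way to schedule a function to run at a specific time of day (more than once) without using a cron job or scheduling it with Task Scheduler?

Or, if this is not possible, I would like your input as well.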

AP Scheduler fit my needs exactly.

Version < 3.0

import datetime
import time
from apscheduler.scheduler import Scheduler


# Start the scheduler
sched = Scheduler()
sched.daemonic = False
sched.start()


def job_function():
    print("Hello World")
    print(datetime.datetime.now())
    time.sleep(20)


# Schedules job_function to be run once each minute
sched.add_cron_job(job_function,  minute='0-59')

Output:

>Hello World
>2014-03-28 09:44:00.016492
>Hello World
>2014-03-28 09:45:00.014110

Version > 3.0

(From Animesh Pandey's answer below)

from apscheduler.schedulers.blocking import BlockingScheduler


sched = BlockingScheduler()


@sched.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')


@sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')


# options_from_ini_file is not defined in this snippet; it stands for a dict of scheduler options
sched.configure(options_from_ini_file)
sched.start()

Maybe this can help: Advanced Python Scheduler

Here's a small piece of code from their documentation:

from apscheduler.schedulers.blocking import BlockingScheduler


def some_job():
    print("Decorated job")


scheduler = BlockingScheduler()
scheduler.add_job(some_job, 'interval', hours=1)
scheduler.start()

The Python standard library does provide sched and threading for this task. But this means your scheduler script will have to be running all the time instead of leaving its execution to the OS, which may or may not be what you want.
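As a rough illustration of the standard-library route, here is a minimal sketch that uses sched with absolute target times so that runs stay pinned to the top of the hour. The names next_top_of_hour and run_hourly, and the now_func and max_runs parameters, are made up for this example; they are not part of any library.

```python
import sched
import time
from datetime import datetime, timedelta


def next_top_of_hour(now):
    """Return the first whole hour strictly after `now`."""
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def run_hourly(job, scheduler, now_func=datetime.now, max_runs=None):
    """Run `job` at every top of the hour.

    `now_func` and `max_runs` are hypothetical hooks that exist only so the
    loop can be stopped or driven by a fake clock in a demo.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        target = next_top_of_hour(now_func())
        # enterabs() takes an absolute time, so however long the job takes,
        # the next run is still aimed at the next whole hour.
        scheduler.enterabs(target.timestamp(), 1, job)
        scheduler.run()  # blocks until the queued job has run
        runs += 1


# Example (blocks forever):
# run_hourly(lambda: print("tick", datetime.now()),
#            sched.scheduler(time.time, time.sleep))
```

Because the target is recomputed from the wall clock after every run, a slow job delays nothing except itself; if a job ever takes longer than an hour, the missed slot is simply skipped.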

One option is to write a C/C++ wrapper that executes the Python script on a regular basis. Your end user would run the C/C++ executable, which would remain running in the background and periodically execute the Python script. This may not be the best solution, and may not work if you don't know C/C++ or want to keep this 100% Python. But it does seem like the most user-friendly approach, since people are used to clicking on executables. All of this assumes that Python is installed on your end user's computer.

Another option is to use a cron job/Task Scheduler, but to set it up from the installer as a script so your end user doesn't have to do it.
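To make the installer idea a little more concrete, here is a hedged sketch that only builds what an installer might register: a schtasks argument list on Windows, or a crontab line elsewhere. The function name, the default task name and the parameters are hypothetical, and nothing is executed or installed by this code.

```python
import sys


def hourly_registration_command(script_path, task_name="hourly_python_job",
                                python_exe=sys.executable, platform=sys.platform):
    """Return what an installer could register for an hourly run.

    On Windows this is an argument list for schtasks; on other platforms it
    is a single crontab line. The caller decides how to apply it.
    """
    command = f'"{python_exe}" "{script_path}"'
    if platform.startswith("win"):
        # /SC HOURLY: hourly schedule, /TN: task name, /TR: command to run,
        # /F: overwrite an existing task with the same name
        return ["schtasks", "/Create", "/SC", "HOURLY",
                "/TN", task_name, "/TR", command, "/F"]
    # Minute 0 of every hour, every day
    return f"0 * * * * {command}"
```

On Windows the returned list could be handed to subprocess.run; on Linux the line would have to be merged into the user's existing crontab, which this sketch deliberately leaves out.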

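To run something at 10 minutes past every hour.

import time
from datetime import datetime, timedelta


while True:
    print('Run something..')

    # Next run: 10 minutes past the following hour
    dt = datetime.now() + timedelta(hours=1)
    dt = dt.replace(minute=10, second=0, microsecond=0)

    # Poll once a second until that time arrives
    while datetime.now() < dt:
        time.sleep(1)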

For apscheduler < 3.0, see Unknown's answer.

For apscheduler > 3.0

from apscheduler.schedulers.blocking import BlockingScheduler


sched = BlockingScheduler()


@sched.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')


@sched.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')


# options_from_ini_file is not defined in this snippet; it stands for a dict of scheduler options
sched.configure(options_from_ini_file)
sched.start()

Update:

apscheduler documentation.

This is for apscheduler-3.3.1 on Python 3.6.2.

"""
Following configurations are set for the scheduler:


- a MongoDBJobStore named “mongo”
- an SQLAlchemyJobStore named “default” (using SQLite)
- a ThreadPoolExecutor named “default”, with a worker count of 20
- a ProcessPoolExecutor named “processpool”, with a worker count of 5
- UTC as the scheduler’s timezone
- coalescing turned off for new jobs by default
- a default maximum instance limit of 3 for new jobs
"""


from pytz import utc
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ProcessPoolExecutor


"""
Method 1:
"""
jobstores = {
    'mongo': {'type': 'mongodb'},
    'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
}
executors = {
    'default': {'type': 'threadpool', 'max_workers': 20},
    'processpool': ProcessPoolExecutor(max_workers=5)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}


"""
Method 2 (ini format):
"""
gconfig = {
    'apscheduler.jobstores.mongo': {
        'type': 'mongodb'
    },
    'apscheduler.jobstores.default': {
        'type': 'sqlalchemy',
        'url': 'sqlite:///jobs.sqlite'
    },
    'apscheduler.executors.default': {
        'class': 'apscheduler.executors.pool:ThreadPoolExecutor',
        'max_workers': '20'
    },
    'apscheduler.executors.processpool': {
        'type': 'processpool',
        'max_workers': '5'
    },
    'apscheduler.job_defaults.coalesce': 'false',
    'apscheduler.job_defaults.max_instances': '3',
    'apscheduler.timezone': 'UTC',
}


sched_method1 = BlockingScheduler()  # configured below with the Method 1 objects
sched_method2 = BlockingScheduler()  # configured below with the same settings in ini format (Method 2)


@sched_method1.scheduled_job('interval', seconds=10)
def timed_job():
    print('This job is run every 10 seconds.')


@sched_method2.scheduled_job('cron', day_of_week='mon-fri', hour=10)
def scheduled_job():
    print('This job is run every weekday at 10am.')


sched_method1.configure(jobstores=jobstores, executors=executors, job_defaults=job_defaults, timezone=utc)
sched_method1.start()  # BlockingScheduler.start() blocks, so nothing below runs until this scheduler is shut down


sched_method2.configure(gconfig=gconfig)
sched_method2.start()

For the version posted by sunshinekitty under "Version < 3.0", you may need to specify apscheduler 2.1.2. I accidentally had version 3 on my 2.7 install, so I ran:

pip uninstall apscheduler
pip install apscheduler==2.1.2

It worked correctly after that. Hope that helps.

# For scheduling task execution
import schedule
import time


def job():
    print("I'm working...")


schedule.every(1).minutes.do(job)
#schedule.every().hour.do(job)
#schedule.every().day.at("10:30").do(job)
#schedule.every(5).to(10).minutes.do(job)
#schedule.every().monday.do(job)
#schedule.every().wednesday.at("13:15").do(job)
#schedule.every().minute.at(":17").do(job)


while True:
    schedule.run_pending()
    time.sleep(1)

The simplest option I can suggest is using the schedule library.

In your question, you said "I will need to run a function once every hour". The code to do this is very simple:

import time

import schedule


def thing_you_wanna_do():
    ...
    ...
    return


schedule.every().hour.do(thing_you_wanna_do)


while True:
    schedule.run_pending()
    time.sleep(1)  # avoid a busy loop that pins the CPU

You also asked how to do something at a certain time of the day. Some examples of how to do this are:

import time

import schedule


def thing_you_wanna_do():
    ...
    ...
    return


schedule.every().day.at("10:30").do(thing_you_wanna_do)
schedule.every().monday.do(thing_you_wanna_do)
schedule.every().wednesday.at("13:15").do(thing_you_wanna_do)
# If you would like some randomness / variation you could also do something like this
schedule.every(1).to(2).hours.do(thing_you_wanna_do)


while True:
    schedule.run_pending()
    time.sleep(1)  # avoid a busy loop that pins the CPU

90% of the code used is the example code of the schedule library. Happy scheduling!
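If you would rather not add a dependency for the "specific times of day" case, the core calculation is small. The following is only a sketch using the standard library; next_run_time is a made-up helper name, not something provided by schedule.

```python
from datetime import datetime, time as dtime, timedelta


def next_run_time(now, times_of_day):
    """Return the earliest datetime strictly after `now` that falls on one
    of the given times of day."""
    candidates = []
    for t in times_of_day:
        candidate = datetime.combine(now.date(), t)
        if candidate <= now:
            candidate += timedelta(days=1)  # already passed today, use tomorrow
        candidates.append(candidate)
    return min(candidates)


# Example: the next 10:30 or 13:15, whichever comes first
# next_run_time(datetime.now(), [dtime(10, 30), dtime(13, 15)])
```

A loop could then sleep for (next_run_time(...) - datetime.now()).total_seconds() before each run, so every run is aimed at the wall-clock time rather than at "some interval after the previous run finished".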

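Run the script on every quarter of the hour (:00, :15, :30, :45). For example, you may want to receive 15-minute stock price quotes, which are updated every 15 minutes.

import time
from datetime import datetime


while True:
    print("Update data:", datetime.now())
    # Minutes left until the next quarter-hour mark; 15 means we are on one now
    minutes_left = 15 - datetime.now().minute % 15
    if minutes_left == 15:
        run_strategy()  # your function, defined elsewhere
    time.sleep(minutes_left * 60)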

You probably have a solution already, @lukik, but if you want to remove a scheduled job, you should use:

job = scheduler.add_job(myfunc, 'interval', minutes=2)
job.remove()

or

scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
scheduler.remove_job('my_job_id')

if you need to use an explicit job ID.

For more information, you should check: https://apscheduler.readthedocs.io/en/stable/userguide.html#removing-jobs

I found that the scheduler needs to wake the program every second, which would be costly on an online server. So I use the following instead:

It runs every minute at the 5th second, and you can change it to hours or days by recalculating the waiting period in seconds.

import time
import datetime

Initiating = True
print(datetime.datetime.now())
while True:
    if Initiating == True:
        print("Initiate")
        print(datetime.datetime.now())
        time.sleep(60 - time.time() % 60 + 5)
        Initiating = False
    else:
        time.sleep(60)
        print("working")
        print(datetime.datetime.now())
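One thing to watch in the loop above: after the first alignment it sleeps a flat 60 seconds, so any time spent doing the work is added on top and the runs slowly slide away from the 5th second. A possible variation, sketched here with hypothetical names (seconds_until_slot, run_aligned) and test hooks (clock, sleep, max_runs), recomputes the wait before every run:

```python
import time


def seconds_until_slot(now, period, offset=0):
    """Seconds from `now` (epoch seconds) until the next moment that lies
    `offset` seconds past a multiple of `period`."""
    remaining = (offset - now) % period
    # 0 means we are exactly on a slot right now; wait for the next one
    return remaining if remaining > 0 else period


def run_aligned(job, period, offset=0, max_runs=None,
                clock=time.time, sleep=time.sleep):
    """Call `job` once per `period`, always at `offset` seconds into it."""
    runs = 0
    while max_runs is None or runs < max_runs:
        # Recomputed every time, so the job's own duration never accumulates
        sleep(seconds_until_slot(clock(), period, offset))
        job()
        runs += 1


# Example: every minute at the 5th second (blocks forever)
# run_aligned(lambda: print("working"), period=60, offset=5)
```

With period=3600 the same function waits for a fixed point in every hour. Note that the slots are aligned to the Unix epoch (UTC), so hourly slots land on local whole hours only in time zones with a whole-hour offset.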

This method worked for me using relativedelta and datetime and a modulo boolean check for every hour. It runs every hour from the time you start it.

import time
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta


# Track the next run outside the loop and update it inside the loop
nxt_run = datetime.now()


# The loop spins much faster than the schedule, so a boolean check decides when the next run is due
while True:
    cnow = datetime.now()  # track the current time
    time.sleep(1)  # good to have so the CPU doesn't spike
    # cnow.hour % 1 is always 0; change 1 to n to restrict runs to hours divisible by n
    if (cnow.hour % 1 == 0 and cnow >= nxt_run):
        print(f"start @{cnow}: next run @{nxt_run}")
        nxt_run = cnow + relativedelta(hours=1)  # add an hour to the next run
    else:
        print(f"next run @{nxt_run}")
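A related sketch, for comparison: instead of polling once a second and adding an hour to the current time, the deadline can be advanced from the previous deadline and the loop can sleep for exactly the remaining time. The name run_every and the clock, sleep and max_runs parameters are illustrative only.

```python
import time


def run_every(interval, job, max_runs=None,
              clock=time.monotonic, sleep=time.sleep):
    """Call `job` every `interval` seconds, counted from the first call."""
    next_run = clock()
    runs = 0
    while max_runs is None or runs < max_runs:
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)  # one long sleep instead of polling every second
        job()
        runs += 1
        # Advance from the previous deadline, not from "now", so the time the
        # job takes does not push later runs back.
        next_run += interval


# Example: once an hour, starting immediately (blocks forever)
# run_every(3600, lambda: print("working"))
```

time.monotonic is used here because it is not affected by system clock adjustments; the trade-off is that the runs are tied to the start time, not to wall-clock hours.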

clock.py

from apscheduler.schedulers.blocking import BlockingScheduler
import pytz


sched = BlockingScheduler(timezone=pytz.timezone('Africa/Lagos'))


@sched.scheduled_job('cron', day_of_week='mon-sun', hour=22)
def scheduled_job():
    print('This job is run every day at 10pm.')
    # your job here


sched.start()

Procfile

clock: python clock.py

requirements.txt

APScheduler==3.0.0

After deployment, the final step is to scale up the clock process. This is a singleton process, meaning you’ll never need to scale up more than 1 of these processes. If you run two, the work will be duplicated.

$ heroku ps:scale clock=1

Source: https://devcenter.heroku.com/articles/clock-processes-python

Perhaps Rocketry suits your needs. It's a powerful scheduler that is very easy to use, has a lot of built-in scheduling options and it is easy to extend:

from rocketry import Rocketry
from rocketry.conds import daily, every, after_success


app = Rocketry()


@app.task(every("1 hour 30 minutes"))
def do_things():
    ...


@app.task(daily.between("12:00", "17:00"))
def do_daily_afternoon():
    ...


@app.task(daily & after_success(do_things))
def do_daily_after_task():
    ...


if __name__ == "__main__":
    app.run()

It has much more though:

  • String based scheduling syntax
  • Logical statements (AND, OR, NOT)
  • A lot of built-in scheduling options
  • Easy to customize (custom conditions, parameters etc.)
  • Parallelization (run on separate thread or process)
  • Parametrization (execution order and input-output)
  • Persistence: put the logs anywhere you like
  • Modify scheduler on runtime (i.e. build API on top of it)

Disclaimer: I'm the author