Python Simplified

How to schedule Python scripts using schedule library

Automate Python scripts using Schedule

Introduction

You have written a lot of Python scripts for daily tasks such as sending emails, processing Excel/CSV files, converting currencies, scraping websites, or tracking stock prices, and now you want to automate them. One good approach is to schedule these scripts to run when you need them, for example daily at 5 PM or once a week on Sunday at 12 AM.

There are a number of tools at your disposal, such as schedule, apscheduler, python-crontab, and apache-airflow, that you can use to schedule your Python jobs. In this blog post, we will use the schedule library to schedule Python scripts.

Schedule

The schedule package is a simple and elegant solution that can help you automate daily tasks such as sending emails or processing Excel/CSV files.

As mentioned in the official documentation, the schedule package is not what you need if you are looking for:

  • Job persistence (remember schedule between restarts)
  • Exact timing (sub-second precision execution)
  • Concurrent execution (multiple threads)
  • Localization (time zones, workdays, or holidays)

If your requirement is to automate/schedule scripts in production, then I would recommend you explore Apache Airflow and APScheduler. These provide a lot more functionality than the schedule package. We will also cover these in future blog posts. Stay tuned!

Also note that the author of the library, Dan Bader, says, “schedule is a job scheduler well-suited for periodic tasks, preferably ones with relatively small intervals between executions”.

Installation

				
					pip install schedule
				
			

Examples

Let’s look at how to use the schedule library with a simple example. The script below schedules the function named job and executes it every 5 seconds, starting from the moment you run the code. As you can see, the code is largely self-explanatory.

				
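# myscript1.py
import schedule
import time

def job():
    print("I'm working...")

# Schedule job() to run every 5 seconds
schedule.every(5).seconds.do(job)

# Check once per second whether any job is due, and run it if so
while True:
    schedule.run_pending()
    time.sleep(1)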
				
			

Output:

				
(base) C:\Users\swath\Desktop>python myscript1.py
I'm working...
I'm working...
I'm working...
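
To make the while True loop less of a black box, here is a simplified stand-in for what a polling scheduler does on each pass. It is a sketch for illustration only and not the library's actual implementation; TinyJob, TinyScheduler, and the injectable clock argument are hypothetical names.

```python
import time


class TinyJob:
    """A function plus the interval (in seconds) at which it should run."""

    def __init__(self, interval, func, now):
        self.interval = interval
        self.func = func
        self.next_run = now + interval  # first run is one interval from now

    def run(self, now):
        self.func()
        self.next_run = now + self.interval


class TinyScheduler:
    """Hypothetical, minimal scheduler: it only runs jobs when it is polled."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.jobs = []

    def every(self, seconds, func):
        job = TinyJob(seconds, func, self.clock())
        self.jobs.append(job)
        return job

    def run_pending(self):
        # Run every job whose due time has been reached
        now = self.clock()
        for job in self.jobs:
            if now >= job.next_run:
                job.run(now)


def demo():
    # A fake clock keeps the demo instant and deterministic
    fake_now = [0]
    calls = []
    scheduler = TinyScheduler(clock=lambda: fake_now[0])
    scheduler.every(5, lambda: calls.append(fake_now[0]))
    for second in range(1, 16):  # simulate 15 one-second ticks
        fake_now[0] = second
        scheduler.run_pending()
    return calls


print(demo())  # prints [5, 10, 15]
```

Nothing happens between polls in this sketch, which is why a script has to keep calling run_pending() in a loop for its jobs to fire.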
				
			

That is just one simple example; you can schedule jobs according to your own requirements. The examples below give you an idea of the different ways to schedule your Python jobs.

				
# Run the job every 5 minutes from now
schedule.every(5).minutes.do(job)

# Run the job every 5 hours from now
schedule.every(5).hours.do(job)

# Run the job every 5 days from now
schedule.every(5).days.do(job)

# Run the job every 5 weeks from now
schedule.every(5).weeks.do(job)

# Run the job on a specific day of the week, optionally at a specific time
schedule.every().monday.do(job)
schedule.every().sunday.at("06:00").do(job)

# Run the job every day at a specific time, given as HH:MM or HH:MM:SS
schedule.every().day.at("10:30:00").do(job)
				
			

Using a decorator to schedule the job

Alternatively, you can use the @repeat decorator to schedule jobs. To run a job every 5 minutes, you could use the code below. The syntax differs slightly, but it works the same way as the previous examples.

				
					# myscript2.py
from schedule import every, repeat, run_pending
import time

@repeat(every(5).minutes)
def job():
    print("Job scheduled using decorator...")
    
while True:
    run_pending()
    time.sleep(1)
				
			

Output:

				
(base) C:\Users\swath\Desktop>python myscript2.py
Job scheduled using decorator...
Job scheduled using decorator...
...
				
			

Passing arguments to the job

Sometimes you may want to pass arguments to the job. Since it is a function that we are scheduling, you can pass positional and keyword arguments as needed. Refer to the example below.

				
					# myscript3.py
import time
import schedule
import pandas as pd

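def job(file_type):
    # Read the report in the requested format
    if file_type == 'xlsx':
        df = pd.read_excel('report.xlsx')
    elif file_type == 'csv':
        df = pd.read_csv('report.csv')
    print(f'{file_type} file is read successfully')

# Extra arguments given to do() are passed on to job() each time it runs
schedule.every(1).seconds.do(job, file_type='xlsx')
schedule.every(2).seconds.do(job, file_type='csv')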

while True:
    schedule.run_pending()
    time.sleep(1)
				
			

Output:

				
					(base) C:\Users\swath\Desktop>python myscript3.py
xlsx file is read successfully
csv file is read successfully
xlsx file is read successfully
xlsx file is read successfully
....
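
One way to picture what happens to the extra arguments given to do() is functools.partial from the standard library, which binds arguments to a function ahead of time. The sketch below uses a hypothetical describe function in place of the file-reading job so that it runs without any files; it illustrates the idea and is not a description of the library's internals.

```python
import functools


def describe(file_type, verbose=False):
    """Hypothetical stand-in for the file-reading job above."""
    suffix = " (verbose)" if verbose else ""
    return f"{file_type} file is read successfully{suffix}"


# Bind the arguments now; call the resulting objects later with no arguments,
# much like a scheduler calls a job when it is due
xlsx_job = functools.partial(describe, "xlsx")
csv_job = functools.partial(describe, file_type="csv", verbose=True)

print(xlsx_job())  # prints: xlsx file is read successfully
print(csv_job())   # prints: csv file is read successfully (verbose)
```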
				
			

Retrieve all the jobs

If you want to see all the jobs that are scheduled, you can use the get_jobs() function.

				
					# myscript4.py
import schedule

def job():
    print('Hello world')
    
schedule.every().second.do(job)
schedule.every().minute.do(job)
schedule.every().hour.do(job)

print(schedule.get_jobs())
				
			

Output:

				
					(base) C:\Users\swath\Desktop>python myscript4.py
[Every 1 second do job() (last run: [never], next run: 2021-10-09 15:36:50), 
Every 1 minute do job() (last run: [never], next run: 2021-10-09 15:37:49), 
Every 1 hour do job() (last run: [never], next run: 2021-10-09 16:36:49)]
				
			

Cancelling the job(s)

You can cancel a single job with cancel_job(), passing it the job object returned by do(), or cancel all jobs at once with clear().

				
					# myscript5.py
import schedule

def greet():
    print('Welcome to Python Simplified !!')

job1 = schedule.every().second.do(greet)
job2 = schedule.every().minute.do(greet)
job3 = schedule.every().hour.do(greet)

print('# of jobs scheduled', len(schedule.get_jobs()))

# Cancelling job1
schedule.cancel_job(job1)
print('# of jobs scheduled', len(schedule.get_jobs()))

# Cancelling all jobs (job2 and job3)
schedule.clear()
print('# of jobs scheduled', len(schedule.get_jobs()))
				
			

Output:

				
					(base) C:\Users\swath\Desktop>python myscript5.py
# of jobs scheduled 3
# of jobs scheduled 2
# of jobs scheduled 0
				
			

Running jobs until a certain time

If you want a job to keep running only until a certain time, you can use the until() method. The following examples will help you get started with it.

				
					# myscript6.py
import schedule
from datetime import datetime, timedelta, time

def job():
    print('Python Simplified')

# Note: the deadline passed to until() must still be in the future when
# the job is scheduled, so adjust the dates below before running.

# Run job until 15:30 today
schedule.every(1).hours.until("15:30").do(job)

# Run job until 2021-10-31 23:59
schedule.every(2).hours.until("2021-10-31 23:59").do(job)

# Run job for the next 12 hours
schedule.every(3).hours.until(timedelta(hours=12)).do(job)

# Run job until 23:59:59 today
schedule.every(4).hours.until(time(23, 59, 59)).do(job)

# Run job until a specific datetime
schedule.every(5).hours.until(datetime(2021, 12, 31, 23, 59, 59)).do(job)

print(schedule.get_jobs())
				
			

Output:

				
(base) C:\Users\swath\Desktop>python myscript6.py
[Every 1 hour do job() (last run: [never], next run: 2021-10-10 10:37:28), 
Every 2 hours do job() (last run: [never], next run: 2021-10-10 11:37:28), 
Every 3 hours do job() (last run: [never], next run: 2021-10-10 12:37:28), 
Every 4 hours do job() (last run: [never], next run: 2021-10-10 13:37:28), 
Every 5 hours do job() (last run: [never], next run: 2021-10-10 14:37:28)]
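
The examples above pass a datetime, a timedelta, and a time to until(). To see how those three kinds of value can each be turned into one concrete deadline, here is a standard-library sketch. resolve_deadline is a hypothetical helper written for this post, not part of the schedule package, and the rejection of past deadlines is an assumption of the sketch.

```python
from datetime import datetime, time, timedelta


def resolve_deadline(until, now):
    """Hypothetical helper: turn an until-style argument into a datetime."""
    if isinstance(until, datetime):
        deadline = until                                # absolute point in time
    elif isinstance(until, timedelta):
        deadline = now + until                          # relative to "now"
    elif isinstance(until, time):
        deadline = datetime.combine(now.date(), until)  # that time, today
    else:
        raise TypeError("expected datetime, timedelta, or time")
    if deadline < now:
        # Assumption of this sketch: a deadline in the past is an error
        raise ValueError("deadline is in the past")
    return deadline


# A fixed "now" keeps the output reproducible
now = datetime(2021, 10, 10, 9, 37, 28)

print(resolve_deadline(timedelta(hours=12), now))  # prints: 2021-10-10 21:37:28
print(resolve_deadline(time(23, 59, 59), now))     # prints: 2021-10-10 23:59:59
```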
				
			

How to run the scheduler in the background

If you look at all the code examples above, and those in the official documentation, you will notice that the scheduler loop blocks the main thread. So how do you run the scheduler continuously in the background without blocking the main thread?

You can achieve this by creating your own thread and letting it run the scheduler in the background. Refer to the example below, which is adapted from the example in the official documentation.

In the example below:

  • The run_continuously() function creates and starts a thread, and returns the threading event cease_continuous_run.
  • background_job_1 and background_job_2 are the two jobs that you want to run in the background without blocking the main thread.
  • In the main code, you schedule both jobs to run every second.
  • Next, when run_continuously() is called, it starts a single thread that runs both background jobs. The jobs keep running in the background even after the main code has logged that it is completed. Refer to the output shown below and you will see this.
  • If you want, you can stop the background jobs by calling stop_run_continuously.set().

I suggest you play around with the code below to better understand how you can schedule jobs to run in the background without blocking the main thread.

				
import schedule
import logging
import threading
import time

def run_continuously(interval=1):
    # Event used to tell the background thread to stop
    cease_continuous_run = threading.Event()

    class ScheduleThread(threading.Thread):
        def run(self):
            while not cease_continuous_run.is_set():
                schedule.run_pending()
                # Pause between checks so the loop does not spin the CPU
                time.sleep(interval)

    continuous_thread = ScheduleThread()
    continuous_thread.start()
    return cease_continuous_run

def background_job_1():
    logging.info("Background 1 thread running...")

def background_job_2():
    logging.info("Background 2 thread running...")

if __name__ == "__main__":
    log_format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=log_format, level=logging.INFO, datefmt="%H:%M:%S")
    logging.info("Main - starting thread")
    schedule.every().second.do(background_job_1)
    schedule.every().second.do(background_job_2)
    stop_run_continuously = run_continuously()
    logging.info("Main: completed !!")
    # Let the jobs run for 5 seconds, then stop the background thread
    time.sleep(5)
    stop_run_continuously.set()
				
			

Output: If you comment out the final time.sleep(5) and stop_run_continuously.set() calls in the above code, the background jobs will run continuously.

				
					PS C:\Users\swath\PycharmProjects\pythonProject> python main.py
20:43:37: Main - starting thread
20:43:37: Main: completed !!
20:43:38: Background 1 thread running...
20:43:38: Background 2 thread running...
20:43:39: Background 1 thread running...
20:43:39: Background 2 thread running...
20:43:40: Background 1 thread running...
20:43:40: Background 2 thread running...
20:43:41: Background 1 thread running...
20:43:41: Background 2 thread running...
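
The stop mechanism in the example rests on threading.Event, and it can be easier to study on its own. The standard-library sketch below shows one possible variation: Event.wait() is used as an interruptible sleep, and the thread is joined after the stop request. start_ticker and on_tick are hypothetical names, and the schedule package is deliberately left out so that the threading part stands alone.

```python
import threading
import time


def start_ticker(interval, on_tick):
    """Hypothetical helper: call on_tick every `interval` seconds in a thread."""
    stop_event = threading.Event()

    def loop():
        # Event.wait() doubles as an interruptible sleep: it returns True as
        # soon as the event is set, and False if the timeout expires first
        while not stop_event.wait(interval):
            on_tick()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event, thread


ticks = []
stop_event, thread = start_ticker(0.05, lambda: ticks.append(1))

time.sleep(0.3)          # the main thread is free to do other work here
stop_event.set()         # ask the loop to finish
thread.join(timeout=1)   # wait until it has actually stopped

print("still running:", thread.is_alive())  # prints: still running: False
```

Marking the thread as a daemon means it will not keep the interpreter alive if the main program exits, while the join() call confirms that the loop has ended before the program moves on.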
				
			

I hope you now have a good understanding of how to use the schedule package to schedule your Python scripts. The official documentation also covers further topics such as parallel execution, logging, and exception handling.
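
On exception handling in particular: schedule does not catch exceptions raised inside a job, so a single failing job would stop the loops shown in this post. One possible approach is to wrap each job in a decorator that logs the error and carries on. The sketch below uses only the standard library; catch_exceptions, flaky_job, and healthy_job are hypothetical names.

```python
import functools
import logging


def catch_exceptions(func):
    """Hypothetical decorator: log any exception from func instead of raising."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logging.exception records the message together with the traceback
            logging.exception("Job %s failed", func.__name__)
            return None

    return wrapper


@catch_exceptions
def flaky_job():
    return 1 / 0  # always raises ZeroDivisionError


@catch_exceptions
def healthy_job():
    return "done"


print(flaky_job())    # logs the traceback, then prints: None
print(healthy_job())  # prints: done
```

A job decorated this way can be passed to do() like any other function.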

Disadvantages

The main disadvantage of the schedule package is that you need to keep your computer up and running all the time for the scripts to run in the background. This should not be a problem for non-production tasks, but for production tasks you may have to set up your own server or use a third-party cloud service.

As mentioned earlier, schedule is not suited to tasks that require job persistence, concurrent execution, localization, etc.

Chetan Ambi

A Software Engineer & Team Lead with over 10 years of IT experience and a Technical Blogger with a passion for cutting-edge technology. Currently working in the field of Python, Machine Learning & Data Science. Chetan Ambi holds a Bachelor of Engineering Degree in Computer Science.