Recently, I was working on a Django project that was already deployed on Elastic Beanstalk. A new feature required a Python script that connects to the same database as the Django app and reuses a few functions already written in the project. I also had to set up a cron job to run this script exactly once a day.

 

Elastic Beanstalk makes it easy to deploy a Django project. We can do it with the AWS CLI or upload a zip file via the console. But when it comes to running a standalone Django script on a schedule, things get a little more complex. In this post, I am going to talk about how to create such a script and how to set it up to run as a cron job in an Elastic Beanstalk environment.

 

Let’s create a simple python script first.

def main():
    print("Hello world")
if __name__ == "__main__":
    main()

Put this script in your Django project. You may create a folder named ‘scripts’ in the root of your project; the rest of this post assumes that location.

 

Now, let’s start using the Django context in this script and name the file daily_report.py.

from django.conf import settings
# Assume that the example project has a User model in an app named 'users'
from users.models import User
from users.utils import get_registered_users


def send_new_registered_emails_to_admin(registered_users):
    recipients = settings.DAILY_STATS_RECEIPENT_EMAIL_LIST
    # TODO: add logic to send the list of registered emails to the recipients
    pass


def main():
    registered_users = get_registered_users()
    send_new_registered_emails_to_admin(registered_users)


if __name__ == "__main__":
    main()
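
As a rough sketch of how the TODO in send_new_registered_emails_to_admin might be filled in, the helper below builds a subject and a plain-text body from a list of email addresses. The name build_report_email and the wording of the message are assumptions made for illustration, not part of the original project; the result could then be handed to Django's send_mail together with the recipients read from settings.

```python
from datetime import date


def build_report_email(registered_emails, report_date):
    """Return (subject, body) for the daily registration report.

    Hypothetical helper for illustration only.
    """
    count = len(registered_emails)
    subject = f"New registrations on {report_date.isoformat()}: {count}"
    if registered_emails:
        # One address per line, sorted so the report is stable between runs
        body = "\n".join(sorted(registered_emails))
    else:
        body = "No new users registered."
    return subject, body


subject, body = build_report_email(
    ["b@example.com", "a@example.com"], date(2024, 1, 2)
)
print(subject)
print(body)
```

Keeping the formatting in a plain function like this means it can be checked without a database or a mail server.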

Now, if I run daily_report.py as written, it will throw an error saying that Django settings are not properly configured. To use utility functions, models, and the Django ORM, we need to configure Django inside the script. To do that, we need to make sure that the project's root directory is on Python's module search path (sys.path), set the DJANGO_SETTINGS_MODULE environment variable, and call django.setup(). This can be done by adding the following lines at the top of the script, before the imports of models and other app code:

import os
import sys
import django
# Put the project root (the parent of the 'scripts' folder) on the module search path
if __name__ == '__main__' and __package__ is None:
    sys.path.append(
        os.path.dirname(
            os.path.dirname(
                os.path.abspath(__file__))))
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project_name.settings")
django.setup()  # load settings and the app registry so models can be imported
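
To make the two nested os.path.dirname calls concrete, here is a small standalone sketch that does not need Django. The path /srv/myproject/scripts/daily_report.py is only an assumed example location, and a plain dictionary stands in for os.environ to show that setdefault leaves an already-set value alone.

```python
import posixpath

# Assumed example location of the script (illustrative only)
script_path = "/srv/myproject/scripts/daily_report.py"

# One dirname gives the 'scripts' folder, a second one gives the project root
scripts_dir = posixpath.dirname(script_path)
project_root = posixpath.dirname(scripts_dir)
print(scripts_dir)
print(project_root)

# A plain dict standing in for os.environ: setdefault only writes the
# variable when it is not already present
env = {}
env.setdefault("DJANGO_SETTINGS_MODULE", "project_name.settings")
env.setdefault("DJANGO_SETTINGS_MODULE", "other.settings")
print(env["DJANGO_SETTINGS_MODULE"])
```

Because setdefault does not overwrite, a DJANGO_SETTINGS_MODULE value exported in the shell or by the platform takes precedence over the one hard-coded in the script.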

 

Now we can run this script successfully. The next step is to add some configuration to our project so that this script gets executed as a cron job in the Elastic Beanstalk environment. For that, we need to create a config file under the .ebextensions folder.

 

Configuration files are YAML formatted documents with a .config file extension that you place in a folder named .ebextensions and deploy in your application source bundle.

 

Let’s call it my_cron.config with the following content as per AWS docs.

files:
    "/usr/local/bin/check_leader_only_instance.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2>/dev/null`
            REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document 2>/dev/null | jq -r .region`

            # Find the Auto Scaling Group name from the Elastic Beanstalk environment
            ASG=`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
                --region $REGION --output json | jq -r '.[][] | select(.Key=="aws:autoscaling:groupName") | .Value'`

            # Find the first instance in the Auto Scaling Group
            FIRST=`aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG \
                --region $REGION --output json | \
                jq -r '.AutoScalingGroups[].Instances[] | select(.LifecycleState=="InService") | .InstanceId' | sort | head -1`

            # If the instance ids are the same exit 0
            [ "$FIRST" = "$INSTANCE_ID" ]

    "/usr/local/bin/my_cron_script.sh":
        mode: "000755"
        owner: root
        group: root
        content: |            
            #!/bin/bash
            /usr/local/bin/check_leader_only_instance.sh || exit
            # Now run commands that should run on only 1 instance.
            

    "/etc/cron.d/daily_cron":
        mode: "000644"
        owner: root
        group: root
        content: |
            0 0 * * * root /usr/local/bin/my_cron_script.sh 

commands:
  rm_old_cron:
    command: "rm -fr /etc/cron.d/*.bak"
    ignoreErrors: true

With the above configuration, we are adding three files to each server running our EB environment.

  1. /etc/cron.d/daily_cron: This is a cron file which contains the schedule of the cron job and the command to execute. In our example, it calls another shell script named my_cron_script.sh.
  2. check_leader_only_instance.sh: Shell script to check if the server on which it is being executed is a leader server or not. Only needed when you have multiple servers behind a load balancer.
  3. my_cron_script.sh: This shell script will first call check_leader_only_instance.sh to check if the server is a leader server or not. After this check is done, we can add a command to run our Django script.
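
The leader check in check_leader_only_instance.sh comes down to three steps: list the InService instances of the Auto Scaling group, sort their IDs, and compare the first one with the current instance's ID. The sketch below expresses the same selection in Python on a hand-written dictionary shaped like the JSON output of describe-auto-scaling-groups. The instance IDs and the is_leader name are made up for illustration.

```python
def is_leader(asg_response, instance_id):
    """Return True when instance_id is the lowest InService instance ID."""
    in_service = sorted(
        instance["InstanceId"]
        for group in asg_response["AutoScalingGroups"]
        for instance in group["Instances"]
        if instance["LifecycleState"] == "InService"
    )
    # No InService instances means nobody is the leader
    return bool(in_service) and in_service[0] == instance_id


# Hand-written sample data (hypothetical instance IDs)
sample = {
    "AutoScalingGroups": [
        {
            "Instances": [
                {"InstanceId": "i-0bbb", "LifecycleState": "InService"},
                {"InstanceId": "i-0aaa", "LifecycleState": "InService"},
                {"InstanceId": "i-0000", "LifecycleState": "Terminating"},
            ]
        }
    ]
}

print(is_leader(sample, "i-0aaa"))
print(is_leader(sample, "i-0bbb"))
```

Every instance evaluates the same rule on the same data, so exactly one of them sees itself as the leader without any coordination between servers.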

Now, let’s see the command to execute the Django script. We need to either activate the virtual environment or call the Python interpreter inside the virtual environment folder directly:

/path/to/venv/bin/python /path/to/folder/daily_report.py

You may be wondering how to activate the virtual environment and where Python lives on a server managed by EB. On the platform version used here, the commands look like this (paths can differ on other platform versions, so verify them on your instance):

source /opt/python/run/venv/bin/activate
source /opt/python/current/env
# /opt/python/current/app is the root folder of the source code you uploaded to Elastic Beanstalk
cd /opt/python/current/app
python scripts/daily_report.py

Now, let’s add these commands to our config file so that it is ready to deploy. Your final config file should look like this:

files:
    "/usr/local/bin/check_leader_only_instance.sh":
        mode: "000755"
        owner: root
        group: root
        content: |
            #!/bin/bash
            INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2>/dev/null`
            REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document 2>/dev/null | jq -r .region`

            # Find the Auto Scaling Group name from the Elastic Beanstalk environment
            ASG=`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
                --region $REGION --output json | jq -r '.[][] | select(.Key=="aws:autoscaling:groupName") | .Value'`

            # Find the first instance in the Auto Scaling Group
            FIRST=`aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG \
                --region $REGION --output json | \
                jq -r '.AutoScalingGroups[].Instances[] | select(.LifecycleState=="InService") | .InstanceId' | sort | head -1`

            # If the instance ids are the same exit 0
            [ "$FIRST" = "$INSTANCE_ID" ]

    "/usr/local/bin/my_cron_script.sh":
        mode: "000755"
        owner: root
        group: root
        content: |            
            #!/bin/bash
            /usr/local/bin/check_leader_only_instance.sh || exit
            source /opt/python/run/venv/bin/activate
            source /opt/python/current/env
            cd /opt/python/current/app
            python scripts/daily_report.py

    "/etc/cron.d/daily_cron":
        mode: "000644"
        owner: root
        group: root
        content: |
            0 0 * * * root /usr/local/bin/my_cron_script.sh 

commands:
  rm_old_cron:
    command: "rm -fr /etc/cron.d/*.bak"
    ignoreErrors: true
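
For reference, an entry in /etc/cron.d has five schedule fields followed by the user and the command, so 0 0 * * * means minute 0 of hour 0 on every day, that is, once a day at midnight in the server's time zone. The small sketch below just splits such a line into named parts; parse_cron_d_line is a hypothetical helper for illustration, not something cron or Elastic Beanstalk provides.

```python
def parse_cron_d_line(line):
    """Split an /etc/cron.d entry into its seven parts."""
    # The first six fields are whitespace-separated; the rest is the command
    (minute, hour, day_of_month, month,
     day_of_week, user, command) = line.strip().split(None, 6)
    return {
        "minute": minute,
        "hour": hour,
        "day_of_month": day_of_month,
        "month": month,
        "day_of_week": day_of_week,
        "user": user,
        "command": command,
    }


entry = parse_cron_d_line("0 0 * * * root /usr/local/bin/my_cron_script.sh ")
print(entry["minute"], entry["hour"])
print(entry["user"])
print(entry["command"])
```

Changing the hour field (for example to 6) would move the daily run to a different time without touching the scripts themselves.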

That’s it. Thanks for reading.