I have a scrapy crawler on an elastic beanstalk app that I can run by SSH like this:
source /opt/python/run/venv/bin/activate
source /opt/python/current/env
cd /opt/python/current/app
scrapy crawl spidername
I want to set up a cronjob to run this for me. So I followed the suggestions here.
My setup.config file looks like this:
container_commands:
  01_cron_hemnet:
    command: "cat .ebextensions/spider_cron.txt > /etc/cron.d/crawl_spidername && chmod 644 /etc/cron.d/crawl_spidername"
    leader_only: true
My spider_cron.txt file looks like this:
# The newline at the end of this file is extremely important. Cron won't run without it.
* * * * * root sh /opt/python/current/app/runcrawler.sh &>/tmp/mycommand.log
# There is a newline here.
My runcrawler.sh file is located at /opt/python/current/app/runcrawler.sh and looks like this:
#!/bin/bash
cd /opt/python/current/app/
PATH=$PATH:/usr/local/bin
export PATH
scrapy crawl spidername
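One thing worth noting: cron starts jobs with an almost empty environment, so the virtualenv activation done in the interactive SSH session is not in effect when the wrapper runs, and the venv's scrapy may not be on PATH at all. A minimal sketch of a wrapper that recreates that environment itself (the paths are the ones from the SSH session above; the guards are only there so the sketch stays runnable on a machine without Beanstalk's layout):

```shell
#!/bin/bash
# Sketch: activate the Beanstalk virtualenv explicitly, since cron will not.
# Paths assumed from the SSH session above; adjust if your app differs.
if [ -f /opt/python/run/venv/bin/activate ]; then
    source /opt/python/run/venv/bin/activate   # puts the venv's scrapy on PATH
    cd /opt/python/current/app/
fi
export PATH="$PATH:/usr/local/bin"
echo "PATH is now: $PATH"
# Only attempt the crawl if scrapy is actually available here.
if command -v scrapy >/dev/null 2>&1; then
    scrapy crawl spidername
fi
```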
I can navigate to /etc/cron.d/ and see that crawl_spidername exists there. But when I run crontab -l or crontab -u root -l it says that no crontab exists.
I get no log errors, no deployment errors, and the /tmp/mycommand.log file that the cron entry should write its output to is never created. It seems the cron job is never started.
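For what it's worth (this is an assumption about cron.d behavior, not something stated above): crontab -l only lists per-user crontabs, so an entry dropped into /etc/cron.d would not show up there even when it is working. A minimal sketch for sanity-checking the installed file itself, against the two classic cron.d pitfalls of wrong file mode and a missing trailing newline (written to /tmp here so the sketch is runnable anywhere; the real path would be /etc/cron.d/crawl_spidername):

```shell
# Recreate the cron entry and verify mode and trailing newline.
f=/tmp/crawl_spidername
printf '* * * * * root sh /opt/python/current/app/runcrawler.sh &>/tmp/mycommand.log\n' > "$f"
chmod 644 "$f"

stat -c '%a' "$f"                # expect: 644
tail -c 1 "$f" | od -An -tx1     # expect: 0a (the newline cron requires)
```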
Ideas?