Restarting Airflow Scheduler Service When it's Unhealthy

Airflow Scheduler Service

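I run the Airflow scheduler as a systemd daemon, using a unit file that should restart it any time it fails. But I noticed the scheduler doesn't restart when it can't connect to its backend (metadata) database. The process keeps running, so systemd still reports the service as active and Restart=always never kicks in.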

I'm using MariaDB (MySQL-compatible) as the backend database, but it runs on a separate VM. So if that VM starts after the Airflow scheduler, the scheduler gets stuck in an unhealthy state and doesn't restart.

Here's the MySQL error in the Airflow scheduler's status output:

● airflow-scheduler.service - Airflow scheduler daemon
     Loaded: loaded (/etc/systemd/system/airflow-scheduler.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-02-12 00:28:21 EST; 1 day 9h ago
   Main PID: 152 (airflow)
      Tasks: 2 (limit: 134815)
     Memory: 71.0M
        CPU: 7.994s
     CGroup: /system.slice/airflow-scheduler.service
             ├─152 /usr/bin/python3 /home/airflow/.local/bin/airflow scheduler
             └─420 airflow serve-logs

Feb 12 00:28:29 anonAirflow airflow[152]:   File "/home/airflow/.local/lib/python3.8/site-packages/mysql/connector/__init__.py", lin>
Feb 12 00:28:29 anonAirflow airflow[152]:     return CMySQLConnection(*args, **kwargs)
Feb 12 00:28:29 anonAirflow airflow[152]:   File "/home/airflow/.local/lib/python3.8/site-packages/mysql/connector/connection_cext.p>
Feb 12 00:28:29 anonAirflow airflow[152]:     self.connect(**kwargs)
Feb 12 00:28:29 anonAirflow airflow[152]:   File "/home/airflow/.local/lib/python3.8/site-packages/mysql/connector/abstracts.py", li>
Feb 12 00:28:29 anonAirflow airflow[152]:     self._open_connection()
Feb 12 00:28:29 anonAirflow airflow[152]:   File "/home/airflow/.local/lib/python3.8/site-packages/mysql/connector/connection_cext.p>
Feb 12 00:28:29 anonAirflow airflow[152]:     raise errors.get_mysql_exception(msg=exc.msg, errno=exc.errno,
Feb 12 00:28:29 anonAirflow airflow[152]: sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 2003 (HY000): Can't c>
Feb 12 00:28:29 anonAirflow airflow[152]: (Background on this error at: http://sqlalche.me/e/13/4xp6)
root@anonAirflow:/# sudo systemctl status airflow-scheduler.service | grep sqlalchemy.exc.DatabaseError
systemctl status airflow-scheduler.service
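
If you'd rather check for this error from a script than by eye, something like the following might work. It's only a sketch: the function names are made up for illustration, and the choice of the last 50 journal lines is an arbitrary assumption, not something from my setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the text on stdin contains
# the SQLAlchemy connection error shown in the status output above.
has_db_error() {
    # -F matches the string literally, -q suppresses output
    grep -qF 'sqlalchemy.exc.DatabaseError'
}

# Hypothetical helper: scans the most recent journal lines of the unit.
# The line count (50) is an assumption; adjust as needed.
scheduler_has_db_error() {
    journalctl -u airflow-scheduler.service -n 50 --no-pager | has_db_error
}
```

Calling scheduler_has_db_error in an if statement would tell you whether the error shows up in the recent log lines. Note that it only sees what is still among those lines, so it's a rough check at best.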

Restart Airflow Scheduler

So I added a cron job that checks every hour whether the scheduler is healthy, using the webserver's /health endpoint, and restarts it if it isn't!

58 * * * * if [ "$(curl -s http://1.2.3.4:8080/health | jq .scheduler.status)" != '"healthy"' ]; then systemctl restart airflow-scheduler.service; fi
crontab to restart scheduler if it's not healthy
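
The same check can also live in a small script, which is easier to read and tweak than a one-liner in the crontab. This is a minimal sketch, not what I actually run: the function names, the --run argument, and the 10-second curl timeout are all assumptions made for illustration, and the URL is the same placeholder as in the crontab line.

```shell
#!/bin/sh
# Assumed defaults; override via environment variables if needed.
HEALTH_URL="${HEALTH_URL:-http://1.2.3.4:8080/health}"
UNIT="${UNIT:-airflow-scheduler.service}"

# Prints the scheduler status reported by the webserver's /health endpoint.
# Prints nothing if the request fails, and "null" if the field is missing.
scheduler_status() {
    curl -s --max-time 10 "$HEALTH_URL" | jq -r '.scheduler.status' 2>/dev/null
}

# Succeeds (exit 0) when the given status is anything other than "healthy",
# including an empty string from a failed request.
needs_restart() {
    [ "$1" != "healthy" ]
}

# Checks once and restarts the unit if the scheduler is not healthy.
check_and_restart() {
    status="$(scheduler_status)"
    if needs_restart "$status"; then
        systemctl restart "$UNIT"
    fi
}

# Only act when explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

Saved somewhere like /usr/local/bin/check-airflow-scheduler.sh (a hypothetical path) and made executable, the crontab entry would then just call that script with --run at minute 58. As with the one-liner, it needs to run from a crontab that is allowed to restart the service, and a failed request to the webserver also counts as "not healthy".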

And for anyone looking for the unit file I'm using for the scheduler, please see below!

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service mysql.service redis.service rabbitmq-server.service
Wants=postgresql.service mysql.service redis.service rabbitmq-server.service

[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=/bin/airflow scheduler
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
Scheduler daemon unit file for systemd, from Airflow's GitHub
airflow/airflow-scheduler.service at main · apache/airflow