Hi, I am deploying a production Airflow (2.10.3) environment on CentOS 7 and I want to make sure every component is highly available (HA). The webserver and Celery workers are fine because they are stateless, but after I started three schedulers on different machines, I found that all three were sending UPDATE dag SQL statements to the backend metadata DB (MySQL 8.0). I am sure all three schedulers sent the UPDATE SQL to the DB. Does this mean that the three schedulers are all working at the same time and could corrupt the data in the metadata DB?
I was following the part of the Airflow documentation that says we can run more than one scheduler at the same time for scheduler HA by relying on the database's
SELECT * FROM ... FOR UPDATE NOWAIT
support, as long as the metadata DB implements that SQL: https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/scheduler.html#running-more-than-one-scheduler
Thanks for the help!
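For context, here is a minimal sketch of the row-level locking pattern the linked docs describe. The dag_run table and its state and last_scheduling_decision columns come from Airflow's metadata schema, but the exact query below is illustrative, not the statement Airflow actually issues:

```sql
-- Each scheduler locks a small batch of runnable dag_run rows.
-- Rows already locked by another scheduler are skipped (SKIP LOCKED),
-- or would raise an immediate error with NOWAIT, so no two schedulers
-- make scheduling decisions for the same dag_run at the same time.
BEGIN;
SELECT *
  FROM dag_run
 WHERE state = 'running'
 ORDER BY last_scheduling_decision
 LIMIT 20
   FOR UPDATE SKIP LOCKED;
-- ... examine the locked rows and queue task instances ...
COMMIT;
```

MySQL 8.0 supports both SKIP LOCKED and NOWAIT, which is why it qualifies for running multiple schedulers.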
Replies: 2 comments 1 reply
-
No. They might do the same "work", i.e. parse the same files and update their serialized DAG code. This is built into the Airflow scheduler design: a DagFileProcessor is not supposed to "avoid" parsing a file that other processors also parse. There are various ways you can control it.
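(For reference, beyond what this thread states: in Airflow 2.x the parsing load can be tuned with options such as [scheduler] parsing_processes, and since Airflow 2.3 DAG parsing can be taken out of the schedulers entirely by setting [scheduler] standalone_dag_processor = True and running the separate airflow dag-processor component.)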
-
Thanks, I think parsing the files multiple times by multiple schedulers is OK because it has nothing to do with scheduling. I just want to avoid one DAG being scheduled multiple times by multiple schedulers at the same time. How can I confirm that this does not happen?
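One way to check (a sketch that assumes the standard Airflow metadata schema, where dag_run carries dag_id and execution_date columns) is to look for duplicate runs in the metadata DB after the schedulers have been running for a while:

```sql
-- If two schedulers ever created the same run twice, duplicates would
-- appear here. The unique constraints on dag_run plus the row-level
-- locking should keep this result set empty.
SELECT dag_id, execution_date, COUNT(*) AS n
  FROM dag_run
 GROUP BY dag_id, execution_date
HAVING COUNT(*) > 1;
```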