AIP-72: Handle SIGTERM signal on Supervisor #44626
I will follow up with a PR to handle signals for the actual Task process.

# The actual signal sent to the task process is tested in `TestWatchedSubprocessKill` class
mock_kill.assert_called_once_with(signal.SIGTERM, force=True)
mock_logger.error.assert_called_once_with(
    "Received termination signal in supervisor. Terminating watched subprocess",
I should probably also assert that a final TI patch was called, 🤔
Can only do that if I call `supervise` instead of just `proc._setup_signal_handlers()`
Alternative is:
    def test_supervisor_signal_handling(self, mocker, monkeypatch, captured_logs):
        """Verify that the supervisor correctly handles signals and terminates the task process."""
        mocker.patch("airflow.sdk.execution_time.supervisor.MIN_HEARTBEAT_INTERVAL", 0.01)
        mock_client = MagicMock(spec=sdk_client.Client)

        subprocess_pid = os.getpid()

        def subprocess_main():
            os.kill(subprocess_pid, signal.SIGTERM)
            # It should not take 5 seconds. The process should terminate immediately
            sleep(5)

        proc = WatchedSubprocess.start(
            path=os.devnull,
            ti=TaskInstance(id=TI_ID, task_id="b", dag_id="c", run_id="d", try_number=1),
            client=mock_client,
            target=subprocess_main,
        )

        rc = proc.wait()

        assert rc == -signal.SIGTERM
        assert proc.final_state == TerminalTIState.FAILED
        assert proc._exit_code == -signal.SIGTERM

        assert {
            "signal": signal.SIGTERM,
            "process_pid": proc.pid,
            "supervisor_pid": mocker.ANY,
            "event": "Received termination signal in supervisor. Terminating watched subprocess",
            "timestamp": mocker.ANY,
            "level": "error",
            "logger": "supervisor",
        } in captured_logs

        assert {
            "pid": proc.pid,
            "exit_code": -signal.SIGTERM,
            "signal": "SIGTERM",
            "event": "Process exited",
            "timestamp": mocker.ANY,
            "level": "info",
            "logger": "supervisor",
        } in captured_logs

        mock_client.task_instances.finish.assert_called_once_with(
            id=TI_ID, state=TerminalTIState.FAILED, when=mocker.ANY
        )
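For anyone reading along: the `in captured_logs` assertions above depend on `mocker.ANY` (pytest-mock's alias for `unittest.mock.ANY`) comparing equal to any value, so fields such as `timestamp` do not need to be pinned. A minimal standalone sketch of that matching follows; the log entries and the `has_log_entry` helper are made up for illustration and are not taken from the Airflow test suite.

```python
from unittest.mock import ANY

# Hypothetical captured log entries, shaped like the ones asserted above.
captured_logs = [
    {"event": "Process exited", "pid": 1234, "timestamp": "2024-12-04T10:00:00Z", "level": "info"},
    {"event": "Something else", "pid": 1234, "timestamp": "2024-12-04T10:00:01Z", "level": "debug"},
]


def has_log_entry(logs, expected):
    """Return True if any entry equals `expected`; ANY values match anything."""
    return expected in logs


# `timestamp` is matched loosely, every other key must be equal.
expected = {"event": "Process exited", "pid": 1234, "timestamp": ANY, "level": "info"}
```

Note that the match is still exact on the set of keys: an expected dict with a missing or extra key will not be found, even if every shared value matches.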
Parking this for now to work on #44481. If someone wants to take it on (including propagating signals to the subprocess and its children), go for it.
As part of AIP-72, this PR introduces proper signal handling for the supervisor process. The supervisor now intercepts `SIGTERM` signals to ensure that child processes (task processes) are terminated gracefully.

Without signal handling, terminating the supervisor process (e.g., via `kill -TERM`) could leave child processes running as orphaned tasks. This change ensures that when the supervisor is stopped, all managed subprocesses are also terminated cleanly.

Sample output when the supervisor receives SIGTERM (via `kill -TERM 138`) vs. without signal handling:

[Two screenshots in the original PR: with and without signal handling]

A key difference is that in the first case the task is also set to the `failed` state, as it should be, whereas before the TI was left as `running`.
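The general idea can be sketched outside Airflow with only the standard library. This is not the PR's implementation: `supervise` here is a hypothetical helper (unrelated to the Airflow function of the same name), it assumes a POSIX system, and it must be called from the main thread because `signal.signal` only works there.

```python
import signal
import subprocess


def supervise(child_argv):
    """Run `child_argv` and forward any SIGTERM this process receives to the child.

    Returns the child's return code. On POSIX, a child killed by a signal
    reports a negative code, e.g. -signal.SIGTERM.
    """
    child = subprocess.Popen(child_argv)

    def forward_sigterm(signum, frame):
        # Pass the termination request on instead of dying and orphaning the child.
        # send_signal() does nothing if the child has already been reaped.
        child.send_signal(signal.SIGTERM)

    # Install the handler only while the child is being supervised.
    previous_handler = signal.signal(signal.SIGTERM, forward_sigterm)
    try:
        # wait() resumes after the handler runs, so this returns once the child exits.
        return child.wait()
    finally:
        signal.signal(signal.SIGTERM, previous_handler)
```

A caller could then treat any non-zero return code as a failed task, which mirrors the `rc == -signal.SIGTERM` and `TerminalTIState.FAILED` assertions in the test discussed above.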