fix: use loopback address for single node again
AlpinDale committed Sep 1, 2024
1 parent 523ac99 commit cabca73
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions aphrodite/executor/ray_gpu_executor.py
@@ -226,6 +226,16 @@ def sort_by_driver_then_worker_ip(worker):
        self._run_workers("update_environment_variables",
                          all_args=all_args_to_update_environment_variables)

        if len(node_gpus) == 1:
            # in single node case, we don't need to get the IP address.
            # the loopback address is sufficient
            # NOTE: a node may have several IP addresses, one for each
            # network interface. `get_ip()` might return any of them,
            # while they might not work for communication inside the node
            # if the network setup is complicated. Using the loopback address
            # solves this issue, as it always works for communication inside
            # the node.
            driver_ip = "127.0.0.1"
        distributed_init_method = get_distributed_init_method(
            driver_ip, get_open_port())
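
The change above can be illustrated in isolation with a minimal sketch. The helpers `choose_driver_ip`, `find_open_port`, and `build_init_method` are hypothetical stand-ins written for this example; they are not the project's `get_ip()`, `get_open_port()`, or `get_distributed_init_method()`, and the `tcp://ip:port` string format is an assumption about what such an init method typically looks like.

```python
import socket

LOOPBACK_IP = "127.0.0.1"


def choose_driver_ip(node_gpus: dict, detected_ip: str) -> str:
    """Pick the address workers use to reach the driver.

    `node_gpus` maps a node identifier to the GPU ids on that node
    (assumed shape, mirroring the name used in the diff).
    """
    if len(node_gpus) == 1:
        # Single node: the loopback interface always works for
        # communication inside the node, whichever interface the
        # detected address happens to belong to.
        return LOOPBACK_IP
    # Multiple nodes: other machines must reach the driver, so a
    # routable address is still required.
    return detected_ip


def find_open_port(host: str = LOOPBACK_IP) -> int:
    """Ask the OS for a free TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


def build_init_method(ip: str, port: int) -> str:
    """Format a TCP rendezvous URL (assumed format)."""
    return f"tcp://{ip}:{port}"


if __name__ == "__main__":
    single_node = {"node-a": [0, 1]}
    ip = choose_driver_ip(single_node, detected_ip="10.0.0.5")
    print(build_init_method(ip, find_open_port()))
```

With two or more entries in `node_gpus`, the sketch falls back to the detected address, which matches the diff leaving `driver_ip` untouched in the multi-node case.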
