@abrarsheikh
Contributor

@abrarsheikh abrarsheikh commented Nov 8, 2025

Summary

This PR refactors the replica rank system to support multi-dimensional ranking (global, node-level, and local ranks) in preparation for node-local rank tracking. The ReplicaRank object now contains three fields instead of being a simple integer, enabling better coordination of replicas across nodes.

Motivation

Currently, Ray Serve only tracks a single global rank per replica. For advanced use cases like tensor parallelism, model sharding across nodes, and node-aware coordination, we need to track:

  • Global rank: Replica's rank across all nodes (0 to N-1)
  • Node rank: Which node the replica is on (0 to M-1)
  • Local rank: Replica's rank on its specific node (0 to K-1)
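To make the three dimensions concrete, here is a minimal sketch that derives them from a list of replica placements. It is an illustration only, not how Ray Serve assigns ranks: `RankTriple`, `assign_ranks`, and the node IDs are hypothetical names, and the sketch assumes node ranks are handed out in order of first appearance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RankTriple:
    """Hypothetical container for the three rank dimensions."""
    global_rank: int  # position across all replicas (0 to N-1)
    node_rank: int    # index of the node hosting the replica (0 to M-1)
    local_rank: int   # position among replicas on the same node (0 to K-1)


def assign_ranks(replica_nodes: List[str]) -> List[RankTriple]:
    """Derive rank triples from the node ID each replica runs on.

    Assumption: nodes are numbered in order of first appearance and
    replicas on a node are numbered in the order they are listed.
    """
    node_ranks: Dict[str, int] = {}
    next_local: Dict[str, int] = {}
    ranks: List[RankTriple] = []
    for global_rank, node_id in enumerate(replica_nodes):
        # A node seen for the first time gets the next free node rank.
        node_rank = node_ranks.setdefault(node_id, len(node_ranks))
        local_rank = next_local.get(node_id, 0)
        next_local[node_id] = local_rank + 1
        ranks.append(RankTriple(global_rank, node_rank, local_rank))
    return ranks


# Five replicas spread over two nodes.
triples = assign_ranks(["node-a", "node-a", "node-b", "node-a", "node-b"])
```

With this placement, the third replica gets global rank 2, node rank 1, and local rank 0, because it is the first replica on the second node.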

This PR lays the groundwork by introducing the expanded ReplicaRank schema while preserving existing functionality (the type of the rank itself changes; see Breaking Changes).

Changes

Core Implementation

  • schema.py: Extended ReplicaRank to include node_rank and local_rank fields (currently set to -1 as placeholders)
  • replica.py: Updated replica actors to handle ReplicaRank objects
  • context.py: Changed ReplicaContext.rank type from Optional[int] to ReplicaRank

Current Behavior

  • node_rank and local_rank are set to -1 (placeholder values). This will change in the future
  • Global rank assignment and management works as before
  • All existing functionality is preserved
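The shape described above could look roughly like the following sketch. It is a stand-in, not the actual definition in schema.py: the class name `ReplicaRankSketch`, the global field name `rank`, the `UNSET_RANK` constant, and the `has_node_info` helper are all assumptions made for illustration. Only the `node_rank` and `local_rank` field names and the -1 placeholder come from this PR.

```python
from dataclasses import dataclass

# Placeholder value this PR uses for fields that are not populated yet.
UNSET_RANK = -1


@dataclass(frozen=True)
class ReplicaRankSketch:
    """Hypothetical stand-in for ReplicaRank (global field name is assumed)."""
    rank: int                     # global rank, assigned as before
    node_rank: int = UNSET_RANK   # placeholder for now
    local_rank: int = UNSET_RANK  # placeholder for now

    def has_node_info(self) -> bool:
        """True once both node-level fields carry real values."""
        return self.node_rank != UNSET_RANK and self.local_rank != UNSET_RANK


current = ReplicaRankSketch(rank=3)
populated = ReplicaRankSketch(rank=3, node_rank=1, local_rank=0)
```

Callers that want to use node-level information can check for the placeholder first, so the same code keeps working once the fields are populated.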

Breaking Changes

The type of ReplicaContext.rank changes from Optional[int] to ReplicaRank. Code that treats the rank as an integer needs to be updated.
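For code that has to run against both the old and the new type during a migration, a small adapter along these lines may help. This is a sketch under assumptions: `ReplicaRankSketch` is a hypothetical stand-in for ReplicaRank, its global field name `rank` is assumed, and `global_rank_of` is not part of Ray Serve.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ReplicaRankSketch:
    """Hypothetical stand-in for ReplicaRank (global field name is assumed)."""
    rank: int
    node_rank: int = -1
    local_rank: int = -1


def global_rank_of(
    rank: Union[int, ReplicaRankSketch, None],
) -> Optional[int]:
    """Return the global rank as an int for either representation."""
    if rank is None:
        # The old Optional[int] type allowed a missing rank.
        return None
    if isinstance(rank, int):
        # Old behavior: the rank is already a plain integer.
        return rank
    # New behavior: read the global rank off the structured object.
    return rank.rank


old_style = global_rank_of(2)
new_style = global_rank_of(ReplicaRankSketch(rank=2))
```

Both calls yield the integer 2, so downstream arithmetic or indexing on the global rank can stay unchanged while callers migrate.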

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Nov 8, 2025
@abrarsheikh abrarsheikh changed the title [Serve] Refactor replica rank to prepare for node local ranks [2/n] [Serve] Refactor replica rank to prepare for node local ranks Nov 8, 2025