11"""
22Backend interface and registry for generative AI model interactions.
33
4- This module provides the abstract base class and interface for implementing
5- backends that communicate with generative AI models. Backends handle the
6- lifecycle of generation requests, including startup, validation, request
7- processing, and shutdown phases.
4+ Provides the abstract base class for implementing backends that communicate with
5+ generative AI models. Backends handle the lifecycle of generation requests.
86
97Classes:
108 Backend: Abstract base class for generative AI backends with registry support.
@@ -42,44 +40,38 @@ class Backend(
4240 """
4341 Abstract base class for generative AI backends with registry and lifecycle.
4442
45- This class defines the interface for implementing backends that communicate with
46- generative AI models. It combines the registry pattern for automatic discovery
47- with a well-defined lifecycle for process-based distributed execution.
43+ Provides a standard interface for backends that communicate with generative AI
44+ models. Combines the registry pattern for automatic discovery with a defined
45+ lifecycle for process-based distributed execution.
4846
49- The backend lifecycle consists of four main phases:
50- 1. Creation and initial configuration (constructor and factory methods)
51- 2. Process startup - Initialize resources within a worker process
52- 3. Validation - Verify backend readiness and configuration
53- 4. Request resolution - Process generation requests iteratively
54- 5. Process shutdown - Clean up resources when process terminates
47+ Backend lifecycle phases:
48+ 1. Creation and configuration
49+ 2. Process startup - Initialize resources in worker process
50+ 3. Validation - Verify backend readiness
51+ 4. Request resolution - Process generation requests
52+ 5. Process shutdown - Clean up resources
5553
56- All backend implementations must ensure that their state (excluding resources
57- created during process_startup) is pickleable to support transfer across
58- process boundaries in distributed execution environments.
54+ Backend state (excluding process_startup resources) must be pickleable for
55+ distributed execution across process boundaries.
5956
6057 Example:
6158 ::
62- # Register a custom backend implementation
6359 @Backend.register("my_backend")
6460 class MyBackend(Backend):
6561 def __init__(self, api_key: str):
6662 super().__init__("my_backend")
6763 self.api_key = api_key
6864
6965 async def process_startup(self):
70- # Initialize process-specific resources
7166 self.client = MyAPIClient(self.api_key)
7267
73- ...
74-
75- # Create backend instance using factory method
7668 backend = Backend.create("my_backend", api_key="secret")
7769 """
7870
7971 @classmethod
8072 def create(cls, type_: BackendType, **kwargs) -> "Backend":
8173 """
82- Factory method to create a backend instance based on the backend type.
74+ Create a backend instance based on the backend type.
8375
8476 :param type_: The type of backend to create.
8577 :param kwargs: Additional arguments for backend initialization.
@@ -93,65 +85,72 @@ def create(cls, type_: BackendType, **kwargs) -> "Backend":
9385
9486 def __init__(self, type_: BackendType):
9587 """
96- Initialize a backend instance with the specified type.
88+ Initialize a backend instance.
9789
98- :param type_: The backend type identifier for this instance.
90+ :param type_: The backend type identifier.
9991 """
10092 self.type_ = type_
10193
10294 @property
10395 def processes_limit(self) -> Optional[int]:
10496 """
105- :return: The maximum number of worker processes supported by the
106- backend. None if not limited.
97+ :return: Maximum number of worker processes supported. None if unlimited.
10798 """
10899 return None
109100
110101 @property
111102 def requests_limit(self) -> Optional[int]:
112103 """
113- :return: The maximum number of concurrent requests that can be processed
114- at once globally by the backend. None if not limited.
104+ :return: Maximum number of concurrent requests supported globally.
105+ None if unlimited.
115106 """
116107 return None
117108
109+ @abstractmethod
110+ def info(self) -> dict[str, Any]:
111+ """
112+ :return: Backend metadata including model information, endpoints, and
113+ configuration data for reporting and diagnostics.
114+ """
115+ ...
116+
118117 @abstractmethod
119118 async def process_startup(self):
120119 """
121120 Initialize process-specific resources and connections.
122121
123- This method is called when a backend instance is transferred to a worker
124- process and needs to establish connections, initialize clients, or set up
125- any other resources required for request processing. All resources created
126- here are process-local and do not need to be pickleable.
127- If there are any errors during startup, this method should raise an
128- appropriate exception.
122+ Called when a backend instance is transferred to a worker process.
123+ Creates connections, clients, and other resources required for request
124+ processing. Resources created here are process-local and need not be
125+ pickleable.
129126
130- Must be called before validate() or resolve() can be used.
127+ Must be called before validate() or resolve().
128+
129+ :raises: Exception if startup fails.
131130 """
132131 ...
133132
134133 @abstractmethod
135- async def validate(self):
134+ async def process_shutdown(self):
136135 """
137- Validate backend configuration and readiness for request processing.
136+ Clean up process-specific resources and connections.
138137
139- This method verifies that the backend is properly configured and can
140- successfully communicate with the target model service. It should be
141- called after process_startup() and before resolve() to ensure the
142- backend is ready to handle generation requests.
143- If the backend cannot connect to the service or is not ready,
144- this method should raise an appropriate exception.
138+ Called when the worker process is shutting down. Cleans up resources
139+ created during process_startup(). After this method, validate() and
140+ resolve() should not be used.
145141 """
142+ ...
146143
147144 @abstractmethod
148- async def process_shutdown(self):
145+ async def validate(self):
149146 """
150- Clean up process-specific resources and connections.
147+ Validate backend configuration and readiness.
151148
152- This method is called when the worker process is shutting down and
153- should clean up any resources created during process_startup(). After
154- this method is called, validate() and resolve() should not be used.
149+ Verifies the backend is properly configured and can communicate with the
150+ target model service. Should be called after process_startup() and before
151+ resolve().
152+
153+ :raises: Exception if backend is not ready or cannot connect.
155154 """
156155 ...
157156
@@ -167,37 +166,23 @@ async def resolve(
167166 """
168167 Process a generation request and yield progressive responses.
169168
170- This method processes a generation request through the backend's model
171- service, yielding intermediate responses as the generation progresses.
172- The final yielded item contains the complete response and timing data.
173-
174- The request_info parameter is updated with timing metadata and other
175- tracking information throughout the request processing lifecycle.
169+ Processes a generation request through the backend's model service,
170+ yielding intermediate responses as generation progresses. The final
171+ yielded item contains the complete response and timing data.
176172
177- :param request: The generation request containing content and parameters.
178- :param request_info: Request tracking information to be updated with
179- timing and progress metadata during processing.
173+ :param request: The generation request with content and parameters.
174+ :param request_info: Request tracking information updated with timing
175+ and progress metadata during processing.
180176 :param history: Optional conversation history for multi-turn requests.
181- Each tuple contains a previous request-response pair that provides
182- context for the current generation.
183- :yields: Tuples of (response, updated_request_info) as the generation
184- progresses. The final tuple contains the complete response.
185- """
186- ...
187-
188- @abstractmethod
189- async def info(self) -> dict[str, Any]:
190- """
191- :return: Dictionary containing backend metadata such as model
192- information, service endpoints, version details, and other
193- configuration data useful for reporting and diagnostics.
177+ Each tuple contains a previous request-response pair.
178+ :yields: Tuples of (response, updated_request_info) as generation
179+ progresses. Final tuple contains the complete response.
194180 """
195181 ...
196182
197183 @abstractmethod
198184 async def default_model(self) -> str:
199185 """
200- :return: The model name or identifier that this backend is
201- configured to use by default for generation requests.
186+ :return: The default model name or identifier for generation requests.
202187 """
203188 ...