
Commit 73909f3

[DP] PipelinesHandler and Start Pipeline
1 parent 06d3fa2 commit 73909f3

2 files changed: +28 -7 lines

docs/declarative-pipelines/PipelinesHandler.md

Lines changed: 23 additions & 5 deletions
@@ -20,9 +20,15 @@ handlePipelinesCommand(
 | `DROP_DATAFLOW_GRAPH` | [Drops a pipeline](#DROP_DATAFLOW_GRAPH) |
 | `DEFINE_DATASET` | [Defines a dataset](#DEFINE_DATASET) |
 | `DEFINE_FLOW` | [Defines a flow](#DEFINE_FLOW) |
-| `START_RUN` | [START_RUN](#START_RUN) |
+| `START_RUN` | [Starts a pipeline](#START_RUN) |
 | `DEFINE_SQL_GRAPH_ELEMENTS` | [DEFINE_SQL_GRAPH_ELEMENTS](#DEFINE_SQL_GRAPH_ELEMENTS) |
 
+`handlePipelinesCommand` reports an `UnsupportedOperationException` for incorrect commands:
+
+```text
+[other] not supported
+```
+
 ---
 
 `handlePipelinesCommand` is used when:
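
The command table above is essentially a dispatch table: `handlePipelinesCommand` branches on the type of the incoming command and throws for anything it does not recognize. A minimal, self-contained sketch of that pattern (using a made-up command ADT, not the actual Spark Connect protobuf types):

```scala
// Hypothetical sketch of the dispatch pattern; the real handler matches on
// Spark Connect protobuf command types, not on this simplified ADT.
sealed trait PipelineCommand
case class DropDataflowGraph(graphId: String) extends PipelineCommand
case class DefineDataset(graphId: String, name: String) extends PipelineCommand
case class DefineFlow(graphId: String, name: String) extends PipelineCommand
case class StartRun(graphId: String, dry: Boolean) extends PipelineCommand
case class Other(name: String) extends PipelineCommand

object PipelinesHandlerSketch {
  def handlePipelinesCommand(cmd: PipelineCommand): Unit = cmd match {
    case DropDataflowGraph(id)   => println(s"dropping graph $id")
    case DefineDataset(id, name) => println(s"defining dataset $name in graph $id")
    case DefineFlow(id, name)    => println(s"defining flow $name in graph $id")
    case StartRun(id, dry)       => println(s"starting run on graph $id (dry=$dry)")
    case other                   =>
      // Mirrors the "[other] not supported" message quoted above.
      throw new UnsupportedOperationException(s"$other not supported")
  }

  def main(args: Array[String]): Unit = {
    handlePipelinesCommand(StartRun("graph-1", dry = false))
    handlePipelinesCommand(Other("BOGUS")) // throws UnsupportedOperationException
  }
}
```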
@@ -49,7 +55,7 @@ Define pipelines flow cmd received: [cmd]
 
 `handlePipelinesCommand` [defines a flow](#defineFlow).
 
-### startRun { #startRun }
+### Start Pipeline { #startRun }
 
 ```scala
 startRun(
@@ -58,9 +64,21 @@ startRun(
   sessionHolder: SessionHolder): Unit
 ```
 
-`startRun`...FIXME
+`startRun` prints out the following INFO message to the logs:
+
+```text
+Start pipeline cmd received: [cmd]
+```
+
+`startRun` finds the [GraphRegistrationContext](GraphRegistrationContext.md) by `dataflowGraphId` in the [DataflowGraphRegistry](DataflowGraphRegistry.md) (in the given `SessionHolder`).
+
+`startRun` creates a `PipelineEventSender` to send pipeline events back to the Spark Connect client (_Python pipeline runtime_).
+
+`startRun` creates a [PipelineUpdateContextImpl](PipelineUpdateContextImpl.md) (with the `PipelineEventSender`).
+
+In the end, `startRun` requests the `PipelineUpdateContextImpl` for the [PipelineExecution](PipelineExecution.md) to [runPipeline](PipelineExecution.md#runPipeline) or [dryRunPipeline](PipelineExecution.md#dryRunPipeline) for the `run` or `dry-run` command, respectively.
 
-### createDataflowGraph { #createDataflowGraph }
+### Create Dataflow Graph { #createDataflowGraph }
 
 ```scala
 createDataflowGraph(
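
The new `startRun` walkthrough amounts to four steps: find the registered graph by its id, wire up an event sender back to the client, wrap both in an update context, and then run or dry-run the pipeline. A toy, self-contained sketch of that sequence (simplified stand-ins for the classes named above, not the real Spark implementations):

```scala
// Hypothetical, heavily simplified stand-ins; names mirror the docs, but these
// are not the actual Spark Declarative Pipelines classes.
case class GraphRegistrationContext(graphId: String)

class PipelineEventSender {
  // Stands in for streaming pipeline events back over Spark Connect.
  def send(event: String): Unit = println(s"event -> client: $event")
}

class PipelineUpdateContextImpl(graph: GraphRegistrationContext, sender: PipelineEventSender) {
  // Stands in for the PipelineExecution the update context exposes.
  object pipelineExecution {
    def runPipeline(): Unit    = sender.send(s"running ${graph.graphId}")
    def dryRunPipeline(): Unit = sender.send(s"validating ${graph.graphId} (dry run)")
  }
}

object StartRunSketch {
  // Stands in for the DataflowGraphRegistry keyed by dataflowGraphId.
  private val registry = Map("graph-1" -> GraphRegistrationContext("graph-1"))

  def startRun(dataflowGraphId: String, dryRun: Boolean): Unit = {
    println(s"Start pipeline cmd received: $dataflowGraphId")  // the INFO message above
    val graph   = registry.getOrElse(dataflowGraphId, sys.error(s"unknown graph $dataflowGraphId"))
    val sender  = new PipelineEventSender                      // events back to the client
    val context = new PipelineUpdateContextImpl(graph, sender)
    if (dryRun) context.pipelineExecution.dryRunPipeline()     // `dry-run` command
    else context.pipelineExecution.runPipeline()                // `run` command
  }

  def main(args: Array[String]): Unit = {
    startRun("graph-1", dryRun = false)
    startRun("graph-1", dryRun = true)
  }
}
```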
@@ -105,7 +123,7 @@ For unknown types, `defineDataset` reports an `IllegalArgumentException`:
 Unknown dataset type: [type]
 ```
 
-### defineFlow { #defineFlow }
+### Define Flow { #defineFlow }
 
 ```scala
 defineFlow(

docs/declarative-pipelines/SparkPipelines.md

Lines changed: 5 additions & 2 deletions
@@ -96,7 +96,7 @@ Creating dataflow graph...
 `run` sends a `CreateDataflowGraph` command for execution in the Spark Connect server.
 
 !!! note "Spark Connect Server and Command Execution"
-    `CreateDataflowGraph` and other pipeline commands are handled by [PipelinesHandler](PipelinesHandler.md) on the Spark Connect server.
+    `CreateDataflowGraph` is handled by [PipelinesHandler](PipelinesHandler.md#createDataflowGraph) on the Spark Connect Server.
 
 `run` prints out the following log message:
 
@@ -118,6 +118,9 @@ Registering graph elements...
 Starting run (dry=[dry], full_refresh=[full_refresh], full_refresh_all=[full_refresh_all], refresh=[refresh])...
 ```
 
-`run` sends a `StartRun` command for execution in the Spark Connect server.
+`run` sends a `StartRun` command for execution in the Spark Connect Server.
+
+!!! note "StartRun Command and PipelinesHandler"
+    `StartRun` command is handled by [PipelinesHandler](PipelinesHandler.md#startRun) on the Spark Connect Server.
 
 In the end, `run` keeps printing out pipeline events from the Spark Connect server.
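
Taken together, the `run` flow described in this file is a three-phase exchange with the Spark Connect Server: create the dataflow graph, register the graph elements, then start the run and keep printing the pipeline events streamed back. A hedged outline of that sequence (the helpers below are illustrative stand-ins, not the actual `SparkPipelines` API):

```scala
// Hypothetical outline of the client-side flow described above; these helpers
// are illustrative stand-ins, not the real SparkPipelines / Spark Connect API.
object SparkPipelinesRunSketch {
  def createDataflowGraph(): String = {
    println("Creating dataflow graph...")             // handled by PipelinesHandler#createDataflowGraph
    "graph-1"
  }

  def registerGraphElements(graphId: String): Unit =
    println(s"Registering graph elements... (graph=$graphId)")  // DEFINE_DATASET / DEFINE_FLOW commands

  def startRun(graphId: String, dry: Boolean): Iterator[String] = {
    println(s"Starting run (dry=$dry)...")             // handled by PipelinesHandler#startRun
    Iterator("RUN_STARTED", "FLOW_COMPLETED", "RUN_COMPLETED")
  }

  def run(dry: Boolean): Unit = {
    val graphId = createDataflowGraph()
    registerGraphElements(graphId)
    startRun(graphId, dry).foreach(println)            // keep printing pipeline events
  }

  def main(args: Array[String]): Unit = run(dry = false)
}
```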
