Commit 71736c0

ResolveSessionCatalog Logical Resolution Rule
1 parent: edc0305

5 files changed: 81 additions, 33 deletions

docs/connector/TableProvider.md

Lines changed: 6 additions & 4 deletions
````diff
@@ -2,9 +2,11 @@
 
 `TableProvider` is an [abstraction](#contract) of [table providers](#implementations) (for `DataSourceV2Utils` utility when requested for a [Table](../connectors/DataSourceV2Utils.md#getTableFromProvider)).
 
+`TableProvider` is part of [Connector API](index.md) and serves as an indication to use newer code execution paths (e.g., [ResolveSessionCatalog](../logical-analysis-rules/ResolveSessionCatalog.md) logical resolution rule).
+
 ## Contract
 
-### <span id="getTable"> Table
+### Table { #getTable }
 
 ```java
 Table getTable(
@@ -20,7 +22,7 @@ Used when:
 * `DataFrameWriter` is requested to [save data](../DataFrameWriter.md#save)
 * `DataSourceV2Utils` utility is used to [getTableFromProvider](../connectors/DataSourceV2Utils.md#getTableFromProvider)
 
-### <span id="inferPartitioning"> Inferring Partitioning
+### Inferring Partitioning { #inferPartitioning }
 
 ```java
 Transform[] inferPartitioning(
@@ -33,7 +35,7 @@ Used when:
 
 * `DataSourceV2Utils` utility is used to [getTableFromProvider](../connectors/DataSourceV2Utils.md#getTableFromProvider)
 
-### <span id="inferSchema"> Inferring Schema
+### Inferring Schema { #inferSchema }
 
 ```java
 StructType inferSchema(
@@ -44,7 +46,7 @@ Used when:
 
 * `DataSourceV2Utils` utility is used to [getTableFromProvider](../connectors/DataSourceV2Utils.md#getTableFromProvider)
 
-### <span id="supportsExternalMetadata"> supportsExternalMetadata
+### supportsExternalMetadata { #supportsExternalMetadata }
 
 ```java
 boolean supportsExternalMetadata()
````
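
Taken together, the contract documented above is small enough to sketch. Below is a minimal, hypothetical Scala implementation (the `DemoTableProvider` name, the fixed schema, and the bare `Table` are illustrative only; they are not part of Spark or of this commit) showing how the four methods fit together:

```scala
import java.util
import org.apache.spark.sql.connector.catalog.{Table, TableCapability, TableProvider}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical provider: infers a fixed schema and hands out a bare Table.
class DemoTableProvider extends TableProvider {

  // Called when no user-supplied schema is available.
  override def inferSchema(options: CaseInsensitiveStringMap): StructType =
    new StructType().add("id", "long").add("name", "string")

  // inferPartitioning has a default (no partitioning), so it is not overridden here.

  override def getTable(
      tableSchema: StructType,
      partitioning: Array[Transform],
      properties: util.Map[String, String]): Table = new Table {
    override def name(): String = "demo"
    override def schema(): StructType = tableSchema
    // No capabilities declared; a real connector would mix in SupportsRead/SupportsWrite.
    override def capabilities(): util.Set[TableCapability] =
      util.Collections.emptySet[TableCapability]()
  }

  // false: Spark should rely on inferSchema/inferPartitioning rather than external metadata.
  override def supportsExternalMetadata(): Boolean = false
}
```

Registered via its fully-qualified class name in `spark.read.format(...)`, such a provider should be handed to `DataSourceV2Utils.getTableFromProvider` as the contract describes.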

docs/connector/index.md

Lines changed: 9 additions & 2 deletions
````diff
@@ -1,5 +1,12 @@
 # Connector API
 
-**Connector API** is a new API in Spark 3 for Spark SQL developers to create [connectors](../connectors/index.md) (_data sources_).
+**Connector API** is a new API in Spark 3 for Spark SQL developers to create [connectors](../connectors/index.md) (_data sources_ or _providers_).
 
-Connector API is meant to replace the older (and soon deprecated) DataSource v1 and v2.
+!!! note
+    Connector API is meant to replace the older (deprecated) DataSource v1 and v2.
+
+    Although the "Data Source V2" name has already been used, Connector API is considered the "real" Data Source V2.
+
+[ResolveSessionCatalog](../logical-analysis-rules/ResolveSessionCatalog.md) logical resolution rule uses [TableProvider](TableProvider.md) to recognize Data Source V2 providers.
+
+[spark.sql.sources.useV1SourceList](../configuration-properties.md#spark.sql.sources.useV1SourceList) configuration property lists the connectors for which the Data Source V2 code path is disabled (they fall back to the Data Source V1 execution paths).
````
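
As a quick illustration of that fallback switch, here is a small sketch (the session settings and format names are just an example) that pins `csv` and `json` to the Data Source V1 path when building a `SparkSession`:

```scala
import org.apache.spark.sql.SparkSession

// A sketch: keep csv and json on the Data Source V1 code path.
// The property value is a comma-separated list of connector (data source) names.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("useV1SourceList demo")
  .config("spark.sql.sources.useV1SourceList", "csv,json")
  .getOrCreate()

// Connectors on the list are not handled by the V2 code path,
// so the V1 execution paths are used for them instead.
println(spark.conf.get("spark.sql.sources.useV1SourceList"))
```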

docs/hive/HiveAnalysis.md

Lines changed: 0 additions & 16 deletions
````diff
@@ -3,19 +3,3 @@
 `HiveAnalysis` is a HiveSessionStateBuilder.md#postHocResolutionRules[logical posthoc resolution rule] that the HiveSessionStateBuilder.md#analyzer[Hive-specific logical query plan analyzer] uses to <<apply, FIXME>>.
 
 Technically, `HiveAnalysis` is a ../catalyst/Rule.md[Catalyst rule] for transforming ../spark-sql-LogicalPlan.md[logical plans], i.e. `Rule[LogicalPlan]`.
-
-[source, scala]
-----
-// FIXME Example of HiveAnalysis
-----
-
-=== [[apply]] Applying HiveAnalysis Rule to Logical Plan (Executing HiveAnalysis) -- `apply` Method
-
-[source, scala]
-----
-apply(plan: LogicalPlan): LogicalPlan
-----
-
-NOTE: `apply` is part of ../catalyst/Rule.md#apply[Rule Contract] to apply a rule to a ../spark-sql-LogicalPlan.md[logical plan].
-
-`apply`...FIXME
````
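
The surviving context line describes `HiveAnalysis` as a `Rule[LogicalPlan]`. As a point of reference, a minimal, made-up Catalyst rule with that shape (the object name is illustrative and not part of Spark) looks like this:

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// A Catalyst rule is just a (named) function from one logical plan to another;
// HiveAnalysis follows the same shape.
object NoopPostHocRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan // leaves the plan untouched
}
```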

docs/logical-analysis-rules/ResolveSessionCatalog.md

Lines changed: 42 additions & 5 deletions
````diff
@@ -17,13 +17,50 @@ title: ResolveSessionCatalog
 
 `ResolveSessionCatalog` is created as an extended resolution rule when [HiveSessionStateBuilder](../hive/HiveSessionStateBuilder.md#analyzer) and [BaseSessionStateBuilder](../BaseSessionStateBuilder.md#analyzer) are requested for the analyzer.
 
-## <span id="apply"> Executing Rule
+## Executing Rule { #apply }
+
+??? note "Rule"
+
+    ```scala
+    apply(
+      plan: LogicalPlan): LogicalPlan
+    ```
+
+    `apply` is part of the [Rule](../catalyst/Rule.md#apply) abstraction.
+
+`apply` resolves the following logical operators:
+
+* [AddColumns](../logical-operators/AddColumns.md)
+* [CreateTable](../logical-operators/CreateTable.md) with the default [session catalog](../connector/catalog/CatalogV2Util.md#isSessionCatalog) (`spark_catalog`) and a [non-v2 table provider](#isV2Provider)
+* [CreateTableAsSelect](../logical-operators/CreateTableAsSelect.md) with the default [session catalog](../connector/catalog/CatalogV2Util.md#isSessionCatalog) (`spark_catalog`) and a [non-v2 table provider](#isV2Provider)
+* _others_
+
+### constructV1TableCmd { #constructV1TableCmd }
 
 ```scala
-apply(
-  plan: LogicalPlan): LogicalPlan
+constructV1TableCmd(
+  query: Option[LogicalPlan],
+  tableSpec: TableSpecBase,
+  ident: TableIdentifier,
+  tableSchema: StructType,
+  partitioning: Seq[Transform],
+  ignoreIfExists: Boolean,
+  storageFormat: CatalogStorageFormat,
+  provider: String): CreateTable
 ```
 
-`apply`...FIXME
+`constructV1TableCmd` [builds a catalog table](#buildCatalogTable) and creates a new [CreateTable](../logical-operators/CreateTable.md) logical operator.
+
+### isV2Provider { #isV2Provider }
+
+```scala
+isV2Provider(
+  provider: String): Boolean
+```
+
+`isV2Provider` [looks up the DataSourceV2 implementation](../DataSource.md#lookupDataSourceV2) (the [TableProvider](../connector/TableProvider.md)) for the given `provider`.
+
+!!! note "provider"
+    The `provider` name can be an alias (a short name) or a fully-qualified class name.
 
-`apply` is part of the [Catalyst Rule](../catalyst/Rule.md#apply) abstraction.
+`isV2Provider` is `true` for all the [connectors](../connector/index.md) except [FileDataSourceV2](../files/FileDataSourceV2.md), for which it is `false`.
````
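
The rule's effect can be observed from a user session. A hedged sketch (table names are arbitrary; a plain Spark 3.x `spark-shell` with the default `spark_catalog` is assumed):

```scala
// parquet is backed by a FileDataSourceV2, so isV2Provider is false for it and
// ResolveSessionCatalog should plan the statement through the V1 CreateTable command.
spark.sql("CREATE TABLE resolve_demo (id LONG, name STRING) USING parquet")

// EXPLAIN EXTENDED prints the parsed/analyzed/optimized plans without creating
// the table, which shows whether a V1 command or a V2 operator was chosen.
spark.sql("EXPLAIN EXTENDED CREATE TABLE resolve_demo2 (id LONG) USING parquet")
  .show(truncate = false)
```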

docs/logical-operators/CreateTable.md

Lines changed: 24 additions & 6 deletions
````diff
@@ -4,6 +4,29 @@ title: CreateTable
 
 # CreateTable Logical Operator
 
+`CreateTable` is a [LogicalPlan](LogicalPlan.md).
+
+## Creating Instance
+
+`CreateTable` takes the following to be created:
+
+* <span id="tableDesc"> [CatalogTable](../CatalogTable.md)
+* <span id="mode"> [SaveMode](../DataFrameWriter.md#SaveMode)
+* <span id="query"> Optional Query ([LogicalPlan](LogicalPlan.md))
+
+While being created, `CreateTable` asserts the following:
+
+* The [table](#tableDesc) to be created must have the [provider](../CatalogTable.md#provider)
+* With no [query](#query), the [SaveMode](#mode) must be `ErrorIfExists` or `Ignore`
+
+`CreateTable` is created when:
+
+* [DataFrameWriter.saveAsTable](../DataFrameWriter.md#saveAsTable) operator is used (to [create a table](../DataFrameWriter.md#createTable))
+* [ResolveSessionCatalog](../logical-analysis-rules/ResolveSessionCatalog.md) logical resolution rule is executed (to [constructV1TableCmd](../logical-analysis-rules/ResolveSessionCatalog.md#constructV1TableCmd))
+
+<!---
+## Review Me
+
 `CreateTable` is a [logical operator](LogicalPlan.md) that represents (is <<creating-instance, created>> for) the following:
 
 * `DataFrameWriter` is requested to [create a table](../DataFrameWriter.md#createTable) (for [DataFrameWriter.saveAsTable](../DataFrameWriter.md#saveAsTable) operator)
@@ -38,9 +61,4 @@ The optional <<query, AS query>> is defined when used for the following:
 * [[tableDesc]] [Table metadata](../CatalogTable.md)
 * [[mode]] [SaveMode](../DataFrameWriter.md#SaveMode)
 * [[query]] Optional AS query ([Logical query plan](../logical-operators/LogicalPlan.md))
-
-When created, `CreateTable` makes sure that the optional <<query, logical query plan>> is undefined only when the <<mode, mode>> is `ErrorIfExists` or `Ignore`. `CreateTable` throws an `AssertionError` otherwise:
-
-```
-assertion failed: create table without data insertion can only use ErrorIfExists or Ignore as SaveMode.
-```
+-->
````
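
The first creation path listed in the new text can be exercised directly from user code. A small sketch (the table name is arbitrary) that goes through `DataFrameWriter.saveAsTable` and should therefore end up as a `CreateTable` operator with an AS-query:

```scala
// DataFrameWriter.saveAsTable creates the table through the session catalog;
// with an AS-query attached, SaveModes other than ErrorIfExists/Ignore are also legal.
spark.range(3).toDF("id")
  .write
  .mode("errorifexists")
  .format("parquet")
  .saveAsTable("create_table_demo")
```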
