Commit 3a87f8d

CreateDataSourceTableAsSelectCommand Logical Command
1 parent ffbefe9 commit 3a87f8d

2 files changed

+20
-63
lines changed

docs/logical-operators/CreateDataSourceTableAsSelectCommand.md

Lines changed: 19 additions & 63 deletions
@@ -4,80 +4,36 @@ title: CreateDataSourceTableAsSelectCommand
 
 # CreateDataSourceTableAsSelectCommand Logical Command
 
-`CreateDataSourceTableAsSelectCommand` is a <<DataWritingCommand.md#, logical command>> that <<run, creates a DataSource table>> with the data from a <<query, structured query>> (_AS query_).
+`CreateDataSourceTableAsSelectCommand` is a [LeafRunnableCommand](LeafRunnableCommand.md) that [creates a DataSource table](#run) with the data (from a [AS query](#query)).
 
-NOTE: A DataSource table is a Spark SQL native table that uses any data source but Hive (per `USING` clause).
+!!! note
+A [DataSource table](../connectors/DDLUtils.md#isDatasourceTable) is a Spark SQL native table that uses any data source but Hive (per `USING` clause).
 
-`CreateDataSourceTableAsSelectCommand` is <<creating-instance, created>> when [DataSourceAnalysis](../logical-analysis-rules/DataSourceAnalysis.md) post-hoc logical resolution rule is executed (and resolves a CreateTable.md[CreateTable] logical operator for a Spark table with a <<query, AS query>>).
-
-NOTE: CreateDataSourceTableCommand.md[CreateDataSourceTableCommand] is used instead when a CreateTable.md[CreateTable] logical operator is used with no <<query, AS query>>.
-
-[source,plaintext]
-----
-val ctas = """
-CREATE TABLE users
-USING csv
-COMMENT 'users table'
-LOCATION '/tmp/users'
-AS SELECT * FROM VALUES ((0, "jacek"))
-"""
-scala> sql(ctas)
-... WARN HiveExternalCatalog: Couldn't find corresponding Hive SerDe for data source provider csv. Persisting data source table `default`.`users` into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.
-
-val plan = sql(ctas).queryExecution.logical.numberedTreeString
-org.apache.spark.sql.AnalysisException: Table default.users already exists. You need to drop it first.;
-at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:159)
-at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
-at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
-at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:115)
-at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:194)
-at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3370)
-at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
-at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
-at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
-at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3370)
-at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
-at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
-at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
-... 49 elided
-----
+??? note "CreateDataSourceTableCommand Logical Operator"
+[CreateDataSourceTableCommand](CreateDataSourceTableCommand.md) is used instead for [CreateTable](CreateTable.md) logical operators with no [AS query](#query).
 
 ## Creating Instance
 
 `CreateDataSourceTableAsSelectCommand` takes the following to be created:
 
-* [[table]] [CatalogTable](../CatalogTable.md)
-* [[mode]] [SaveMode](../DataFrameWriter.md#SaveMode)
-* [[query]] AS query ([LogicalPlan](../logical-operators/LogicalPlan.md))
-* [[outputColumnNames]] Output column names (`Seq[String]`)
+* <span id="table"> [CatalogTable](../CatalogTable.md)
+* <span id="mode"> [SaveMode](../DataFrameWriter.md#SaveMode)
+* <span id="query"> `AS` query ([LogicalPlan](LogicalPlan.md))
+* <span id="outputColumnNames"> Output Column Names
 
-=== [[run]] Executing Data-Writing Logical Command -- `run` Method
+`CreateDataSourceTableAsSelectCommand` is created when:
 
-[source, scala]
-----
-run(
-sparkSession: SparkSession,
-child: SparkPlan): Seq[Row]
-----
+* [DataSourceAnalysis](../logical-analysis-rules/DataSourceAnalysis.md) post-hoc logical resolution rule is executed (to resolve [CreateTable](CreateTable.md) logical operators with a [datasource table](../connectors/DDLUtils.md#isDatasourceTable))
 
-NOTE: `run` is part of DataWritingCommand.md#run[DataWritingCommand] contract.
+## Executing Command { #run }
 
-`run`...FIXME
+??? note "RunnableCommand"
 
-`run` throws an `AssertionError` when the [tableType](../CatalogTable.md#tableType) of the [CatalogTable](#table) is `VIEW` or the [provider](../CatalogTable.md#provider) is undefined.
+```scala
+run(
+sparkSession: SparkSession): Seq[Row]
+```
 
-## <span id="saveDataIntoTable"> saveDataIntoTable
+`run` is part of the [RunnableCommand](RunnableCommand.md#run) abstraction.
 
-```scala
-saveDataIntoTable(
-session: SparkSession,
-table: CatalogTable,
-tableLocation: Option[URI],
-physicalPlan: SparkPlan,
-mode: SaveMode,
-tableExists: Boolean): BaseRelation
-```
-
-`saveDataIntoTable` creates a [BaseRelation](../BaseRelation.md) for...FIXME
-
-`saveDataIntoTable`...FIXME
+`run`...FIXME
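
The changed page says that `CreateDataSourceTableAsSelectCommand` is used for a `CreateTable` operator on a DataSource table with an AS query, and `CreateDataSourceTableCommand` when there is no AS query. Below is a minimal, Spark-free sketch of that decision. Every type here is a simplified, hypothetical stand-in (not the real Spark classes), and the `ignoreIfExists` mapping from `SaveMode.Ignore` is an assumption made for illustration.

```scala
// Hypothetical, simplified stand-ins for Spark SQL types (NOT the real classes).
sealed trait SaveMode
object SaveMode {
  case object ErrorIfExists extends SaveMode
  case object Append extends SaveMode
  case object Overwrite extends SaveMode
  case object Ignore extends SaveMode
}

// Stand-in for CatalogTable: only the name and the provider from the USING clause.
final case class TableDesc(name: String, provider: Option[String])

// Stand-in for the AS query (a LogicalPlan in Spark).
final case class Query(sql: String)

// Stand-in for the CreateTable logical operator.
final case class CreateTable(table: TableDesc, mode: SaveMode, query: Option[Query])

sealed trait Command
final case class CreateDataSourceTableCommand(
    table: TableDesc, ignoreIfExists: Boolean) extends Command
final case class CreateDataSourceTableAsSelectCommand(
    table: TableDesc, mode: SaveMode, query: Query) extends Command

object DataSourceAnalysisSketch {
  // A DataSource table uses any data source but Hive.
  def isDatasourceTable(t: TableDesc): Boolean =
    t.provider.exists(_.toLowerCase != "hive")

  // Picks the command for a CreateTable operator; None means "not handled here".
  def resolve(op: CreateTable): Option[Command] = op match {
    case CreateTable(t, mode, Some(q)) if isDatasourceTable(t) =>
      Some(CreateDataSourceTableAsSelectCommand(t, mode, q))
    case CreateTable(t, mode, None) if isDatasourceTable(t) =>
      // Assumption: Ignore mode maps to ignoreIfExists = true.
      Some(CreateDataSourceTableCommand(t, ignoreIfExists = mode == SaveMode.Ignore))
    case _ => None
  }
}
```

With a `csv` table and an AS query, `resolve` yields the CTAS command; without a query it yields the plain create command; a `hive` provider falls through to `None`.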

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -734,6 +734,7 @@ nav:
 - SerializerBuildHelper: SerializerBuildHelper.md
 - Dataset, DataFrame and RDD: spark-sql-dataset-rdd.md
 - Dataset and SQL: spark-sql-dataset-vs-sql.md
+- DDLUtils: connectors/DDLUtils.md
 - Implicits: implicits.md
 - Row: Row.md
 - Data Source API:
