Commit c3ef65a

BasePredicate Expressions
1 parent 737ce74 commit c3ef65a

5 files changed: 89 additions, 19 deletions
docs/BindReferences.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# BindReferences
+
+`BindReferences` utility allows [bindReferences](#bindReferences).
+
+## <span id="bindReferences"> bindReferences
+
+```scala
+bindReference[A <: Expression](
+  expression: A,
+  input: AttributeSeq,
+  allowFailures: Boolean = false): A
+bindReferences[A <: Expression](
+  expressions: Seq[A],
+  input: AttributeSeq): Seq[A] // (1)!
+```
+
+1. `bindReference` of all the given `expressions` to the `input` schema
+
+For every `AttributeReference` expression in the given `expression`, `bindReferences` finds the `ExprId` in the `input` schema and creates a [BoundReference](expressions/BoundReference.md) expression.
+
+`bindReferences` throws an `IllegalStateException` when an `AttributeReference` could not be found in the input schema:
+
+```text
+Couldn't find [a] in [input]
+```
+
+---
+
+`bindReferences` is used when:
+
+* `ExpressionEncoder` is requested to [resolveAndBind](ExpressionEncoder.md#resolveAndBind)
+* _others_
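A minimal sketch of the binding that docs/BindReferences.md describes, assuming a `spark-shell` session on Spark 3.x where the Catalyst internals are on the classpath. The `id` and `name` attributes and the `missing` attribute are made up for illustration, and the exact exception type may differ between Spark versions.

```scala
import scala.util.Try
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.{LongType, StringType}

// Hypothetical input schema with two attributes (illustration only)
val id = AttributeReference("id", LongType)()
val name = AttributeReference("name", StringType)()
val input: AttributeSeq = Seq[Attribute](id, name)

// An unbound expression that refers to `id` by its ExprId
val unbound = GreaterThan(id, Literal(5L))

// Binding replaces the AttributeReference with a BoundReference
// that points at the position of `id` in the input schema
val bound = BindReferences.bindReference(unbound, input)
val ordinals = bound.collect { case b: BoundReference => b.ordinal }

// An attribute that is not part of the input schema cannot be bound
val missing = AttributeReference("missing", LongType)()
val failed = Try(BindReferences.bindReference(GreaterThan(missing, Literal(5L)), input))

// With allowFailures = true the unresolvable reference is left as-is
val lenient = BindReferences.bindReference(
  GreaterThan(missing, Literal(5L)), input, allowFailures = true)
```

`ordinals` holds the positions of the bound columns, `failed` wraps the exception raised for the unknown attribute, and `lenient` still contains the original `AttributeReference`.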

docs/ExpressionEncoder.md

Lines changed: 14 additions & 8 deletions
@@ -124,35 +124,41 @@ resolveAndBind(
 
 `resolveAndBind`...FIXME
 
+---
+
 `resolveAndBind` is used when:
 
-* `ResolveEncodersInUDF` analysis rule is executed
 * `Dataset` is requested for [resolvedEnc](Dataset.md#resolvedEnc)
-* `TypedAggregateExpression` is created
-* `ResolveEncodersInScalaAgg` extended resolution rule is executed
 * _others_
 
 ### Demo
 
-```text
+```scala
 case class Person(id: Long, name: String)
 import org.apache.spark.sql.Encoders
 val schema = Encoders.product[Person].schema
+```
 
+```scala
 import org.apache.spark.sql.catalyst.encoders.{RowEncoder, ExpressionEncoder}
 import org.apache.spark.sql.Row
 val encoder: ExpressionEncoder[Row] = RowEncoder.apply(schema).resolveAndBind()
+val deserializer = encoder.deserializer
+```
 
+```scala
 import org.apache.spark.sql.catalyst.InternalRow
-val row = InternalRow(1, "Jacek")
-
-val deserializer = encoder.deserializer
+val input = InternalRow(1, "Jacek")
+```
 
-scala> deserializer.eval(row)
+```text
+scala> deserializer.eval(input)
 java.lang.UnsupportedOperationException: Only code-generated evaluation is supported
   at org.apache.spark.sql.catalyst.expressions.objects.CreateExternalRow.eval(objects.scala:1105)
   ... 54 elided
+```
 
+```scala
 import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
 val ctx = new CodegenContext
 val code = deserializer.genCode(ctx).code

docs/expressions/BasePredicate.md

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
# BasePredicate Expressions
2+
3+
`BasePredicate` is an [abstraction](#contract) of [predicate expressions](#implementations) that can be [evaluated](#eval) to a `Boolean` value.
4+
5+
`BasePredicate` is created using [Predicate.create](Predicate.md#create) utility.
6+
7+
## Contract
8+
9+
### <span id="eval"> Evaluating
10+
11+
```scala
12+
eval(
13+
r: InternalRow): Boolean
14+
```
15+
16+
### <span id="initialize"> Initializing
17+
18+
```scala
19+
initialize(
20+
partitionIndex: Int): Unit
21+
```
22+
23+
## Implementations
24+
25+
* `InterpretedPredicate`
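A minimal sketch of the `BasePredicate` contract (`initialize`, then `eval`), assuming a `spark-shell` session on Spark 3.x. The expression is built directly from a `BoundReference`, so no binding step is involved; the column position and the sample rows are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.LongType

// An already-bound expression: "column 0 is greater than 5"
val boundExpr = GreaterThan(BoundReference(0, LongType, nullable = false), Literal(5L))

// Predicate.create is the entry point named in the page above
val predicate: BasePredicate = Predicate.create(boundExpr)

// initialize is called once with the partition index before any row is evaluated
predicate.initialize(0)

// eval returns a plain Boolean for every InternalRow
val rows = Seq(InternalRow(3L), InternalRow(10L))
val results = rows.map(r => predicate.eval(r))
```

Whether the returned instance is code-generated or an `InterpretedPredicate` is up to Spark; the calls above only rely on the two contract methods.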

docs/expressions/Predicate.md

Lines changed: 16 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,23 +1,14 @@
 # Predicate Expressions
 
-`Predicate` is an extension of the [Expression](Expression.md) abstraction for [expressions](#implementations) that return a [boolean](#dataType) value (_predicates_).
+`Predicate` is an extension of the [Expression](Expression.md) abstraction for [predicate expressions](#implementations) that evaluate to a value of [BooleanType](#dataType) type.
 
 ## Implementations
 
-* And
-* AtLeastNNonNulls
 * [BinaryComparison](BinaryComparison.md)
-* DynamicPruning
 * [Exists](Exists.md)
 * [In](In.md)
 * [InSet](InSet.md)
-* `InSubquery`
-* IsNaN
-* IsNotNull
-* IsNull
-* Not
-* Or
-* StringPredicate
+* _others_
 
 ## <span id="dataType"> DataType
 
@@ -27,4 +18,18 @@ dataType: DataType
 
 `dataType` is part of the [Expression](Expression.md#dataType) abstraction.
 
+---
+
 `dataType` is always [BooleanType](../types/DataType.md#BooleanType).
+
+## <span id="create"> Creating BasePredicate for Bound Expression
+
+```scala
+create(
+  e: Expression): BasePredicate
+create(
+  e: Expression,
+  inputSchema: Seq[Attribute]): BasePredicate
+```
+
+`create` [creates a BasePredicate](#createObject) for the given [Expression](Expression.md) that is [bound](#bindReference) to the input schema ([Attribute](Attribute.md)s).
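A minimal sketch of the two-argument `create`, assuming a `spark-shell` session on Spark 3.x. It passes an unbound expression together with its input schema and leaves the binding to `create`; the `id` and `name` attributes and the sample rows are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.{LongType, StringType}
import org.apache.spark.unsafe.types.UTF8String

// Hypothetical input schema (illustration only)
val id = AttributeReference("id", LongType)()
val name = AttributeReference("name", StringType)()

// Unbound predicate expression over the second column
val unbound = EqualTo(name, Literal("Jacek"))

// create binds `name` to its position in the input schema
val predicate = Predicate.create(unbound, Seq(id, name))
predicate.initialize(0)

// Catalyst rows carry strings as UTF8String, not java.lang.String
val jacek = InternalRow(1L, UTF8String.fromString("Jacek"))
val agata = InternalRow(2L, UTF8String.fromString("Agata"))
val matches = Seq(jacek, agata).map(r => predicate.eval(r))
```

The single-argument overload expects an expression whose references are already bound, for example one produced by `BindReferences.bindReference`.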

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -365,6 +365,7 @@ nav:
 - Aggregator: expressions/Aggregator.md
 - AttributeSeq: expressions/AttributeSeq.md
 - Attribute: expressions/Attribute.md
+- BasePredicate: expressions/BasePredicate.md
 - BinaryComparison: expressions/BinaryComparison.md
 - BinaryOperator: expressions/BinaryOperator.md
 - BoundReference: expressions/BoundReference.md
@@ -943,6 +944,7 @@ nav:
 - Hive Partitioned Parquet Table and Partition Pruning: demo/hive-partitioned-parquet-table-partition-pruning.md
 - Using JDBC Data Source to Access PostgreSQL: demo/using-jdbc-data-source-to-access-postgresql.md
 - Misc:
+- BindReferences: BindReferences.md
 - IntervalUtils: IntervalUtils.md
 - ExplainUtils: ExplainUtils.md
 - PartitionedFileUtil: PartitionedFileUtil.md