docs/UnsafeRow.md
3 additions & 3 deletions
@@ -122,7 +122,7 @@ Object getBaseObject()

 * `UnsafeWriter` is requested to `write` an `UnsafeRow`
 * `UnsafeExternalRowSorter` is requested to `insertRow` an `UnsafeRow`
-* `UnsafeFixedWidthAggregationMap` is requested to [getAggregationBufferFromUnsafeRow](UnsafeFixedWidthAggregationMap.md#getAggregationBufferFromUnsafeRow)
+* `UnsafeFixedWidthAggregationMap` is requested to [getAggregationBufferFromUnsafeRow](physical-operators/UnsafeFixedWidthAggregationMap.md#getAggregationBufferFromUnsafeRow)
 * `UnsafeKVExternalSorter` is requested to `insertKV`
 * `ExternalAppendOnlyUnsafeRowArray` is requested to [add an UnsafeRow](ExternalAppendOnlyUnsafeRowArray.md#add)
 * `UnsafeHashedRelation` is requested to [get](physical-operators/UnsafeHashedRelation.md#get), [getValue](physical-operators/UnsafeHashedRelation.md#getValue), [getWithKeyIndex](physical-operators/UnsafeHashedRelation.md#getWithKeyIndex), [getValueWithKeyIndex](physical-operators/UnsafeHashedRelation.md#getValueWithKeyIndex), [apply](physical-operators/UnsafeHashedRelation.md#apply)
@@ -179,8 +179,8 @@ void copyFrom(

 `copyFrom` is used when:

-* `ObjectAggregationIterator` is requested to [processInputs](ObjectAggregationIterator.md#processInputs) (using `SortBasedAggregator`)
-* `TungstenAggregationIterator` is requested to [produce the next UnsafeRow](TungstenAggregationIterator.md#next) and [outputForEmptyGroupingKeyWithoutInput](TungstenAggregationIterator.md#outputForEmptyGroupingKeyWithoutInput)
+* `ObjectAggregationIterator` is requested to [processInputs](physical-operators/ObjectAggregationIterator.md#processInputs) (using `SortBasedAggregator`)
+* `TungstenAggregationIterator` is requested to [produce the next UnsafeRow](physical-operators/TungstenAggregationIterator.md#next) and [outputForEmptyGroupingKeyWithoutInput](physical-operators/TungstenAggregationIterator.md#outputForEmptyGroupingKeyWithoutInput)
-**(internal)** The number of rows of an in-memory hash map (to store aggregation buffer) before [ObjectHashAggregateExec](physical-operators/ObjectHashAggregateExec.md) ([ObjectAggregationIterator](ObjectAggregationIterator.md#processInputs) precisely) falls back to sort-based aggregation
+**(internal)** The number of rows of an in-memory hash map (to store aggregation buffer) before [ObjectHashAggregateExec](physical-operators/ObjectHashAggregateExec.md) ([ObjectAggregationIterator](physical-operators/ObjectAggregationIterator.md#processInputs) precisely) falls back to sort-based aggregation
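The fallback that this property controls can be illustrated with a small, Spark-free sketch. It is not the `ObjectAggregationIterator` implementation: the class `FallbackSketch`, the method `aggregate` and the parameter `fallbackThreshold` are hypothetical names, and the sketch simply assumes that aggregation starts hash-based and switches to a sort-based pass once the in-memory map holds `fallbackThreshold` keys and more input remains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Spark.
final class FallbackSketch {
    /** Per-key sums plus a flag telling whether the sort-based path was taken. */
    static final class Result {
        final Map<String, Integer> sums;
        final boolean fellBack;

        Result(Map<String, Integer> sums, boolean fellBack) {
            this.sums = sums;
            this.fellBack = fellBack;
        }
    }

    static Result aggregate(List<Map.Entry<String, Integer>> rows, int fallbackThreshold) {
        Map<String, Integer> hashMap = new HashMap<>();
        Iterator<Map.Entry<String, Integer>> input = rows.iterator();
        boolean fellBack = false;

        // Phase 1: hash-based aggregation until the map reaches the threshold.
        while (input.hasNext() && !fellBack) {
            Map.Entry<String, Integer> row = input.next();
            hashMap.merge(row.getKey(), row.getValue(), Integer::sum);
            if (hashMap.size() >= fallbackThreshold && input.hasNext()) {
                fellBack = true;
            }
        }

        // Partial results collected so far, ordered by key.
        TreeMap<String, Integer> result = new TreeMap<>(hashMap);

        // Phase 2 (only after a fallback): sort the remaining rows by key and
        // fold each run of equal keys into the partial results.
        if (fellBack) {
            List<Map.Entry<String, Integer>> rest = new ArrayList<>();
            input.forEachRemaining(rest::add);
            rest.sort(Map.Entry.comparingByKey());
            String currentKey = null;
            int runningSum = 0;
            for (Map.Entry<String, Integer> row : rest) {
                if (currentKey != null && !currentKey.equals(row.getKey())) {
                    result.merge(currentKey, runningSum, Integer::sum);
                    runningSum = 0;
                }
                currentKey = row.getKey();
                runningSum += row.getValue();
            }
            if (currentKey != null) {
                result.merge(currentKey, runningSum, Integer::sum);
            }
        }
        return new Result(result, fellBack);
    }
}
```

Calling `FallbackSketch.aggregate(rows, 2)` on input with three distinct keys takes the sort-based path after the second key is seen, while a large threshold keeps the whole aggregation hash-based. Both paths produce the same sums.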
docs/expressions/AggregateExpression.md
1 addition & 1 deletion
@@ -20,7 +20,7 @@

 * For `PartialMerge` or `Final` modes, the input to the [AggregateFunction](#aggregateFunction) is [immutable input aggregation buffers](AggregateFunction.md#inputAggBufferAttributes), and the actual children of the `AggregateFunction` is not used

-* [AggregateExpression](../AggregationIterator.md#aggregateExpressions)s of a [AggregationIterator](../AggregationIterator.md) cannot have more than 2 distinct modes nor the modes be among `Partial` and `PartialMerge` or `Final` and `Complete` mode pairs
+* [AggregateExpression](../physical-operators/AggregationIterator.md#aggregateExpressions)s of a [AggregationIterator](../physical-operators/AggregationIterator.md) cannot have more than 2 distinct modes nor the modes be among `Partial` and `PartialMerge` or `Final` and `Complete` mode pairs

 * `Partial` and `Complete` or `PartialMerge` and `Final` pairs are supported
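The first bullet in this hunk says that in `PartialMerge` or `Final` modes an aggregate function reads aggregation buffers rather than its actual children. A minimal sketch of that idea for an average follows. It does not use Spark's API: `AggModesSketch`, `Buffer`, `partial`, `partialMerge` and `finalResult` are hypothetical names chosen for illustration.

```java
import java.util.List;

// Hypothetical illustration only; none of these names come from Spark.
final class AggModesSketch {
    /** Aggregation buffer for an average: a running sum and a row count. */
    static final class Buffer {
        long sum;
        long count;
    }

    /** Partial-style step: reads raw input values and fills a fresh buffer. */
    static Buffer partial(List<Integer> values) {
        Buffer buffer = new Buffer();
        for (int value : values) {
            buffer.sum += value;
            buffer.count++;
        }
        return buffer;
    }

    /** PartialMerge-style step: reads buffers only and produces another buffer. */
    static Buffer partialMerge(List<Buffer> inputBuffers) {
        Buffer merged = new Buffer();
        for (Buffer input : inputBuffers) {
            // Input buffers are only read, never modified.
            merged.sum += input.sum;
            merged.count += input.count;
        }
        return merged;
    }

    /** Final-style step: merges buffers and evaluates the result. */
    static double finalResult(List<Buffer> inputBuffers) {
        Buffer merged = partialMerge(inputBuffers);
        return merged.count == 0 ? Double.NaN : (double) merged.sum / merged.count;
    }
}
```

In this sketch only `partial` ever sees the raw values. The other two steps work purely on buffers, which is why the original child expressions are not needed there.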
docs/expressions/DeclarativeAggregate.md
2 additions & 2 deletions
@@ -16,7 +16,7 @@ Used when:

 * `EliminateAggregateFilter` logical optimization is executed
 * `AggregatingAccumulator` utility is used to create an `AggregatingAccumulator`
-* `AggregationIterator` is requested for the [generateResultProjection](../AggregationIterator.md#generateResultProjection)
+* `AggregationIterator` is requested for the [generateResultProjection](../physical-operators/AggregationIterator.md#generateResultProjection)
 * `HashAggregateExec` physical operator is requested to [doProduceWithoutKeys](../physical-operators/HashAggregateExec.md#doProduceWithoutKeys) and [generateResultFunction](../physical-operators/HashAggregateExec.md#generateResultFunction)
 * `AggregateProcessor` is [created](../window-functions/AggregateProcessor.md#apply)

@@ -32,7 +32,7 @@ Used when:

 * `EliminateAggregateFilter` logical optimization is executed
 * `AggregatingAccumulator` utility is used to create an `AggregatingAccumulator`
-* `AggregationIterator` is [created](../AggregationIterator.md#expressionAggInitialProjection)
+* `AggregationIterator` is [created](../physical-operators/AggregationIterator.md#expressionAggInitialProjection)
 * `HashAggregateExec` physical operator is requested to [doProduceWithoutKeys](../physical-operators/HashAggregateExec.md#doProduceWithoutKeys), [createHashMap](../physical-operators/HashAggregateExec.md#createHashMap) and [getEmptyAggregationBuffer](../physical-operators/HashAggregateExec.md#getEmptyAggregationBuffer)
 * `HashMapGenerator` is created
 * `AggregateProcessor` is [created](../window-functions/AggregateProcessor.md#apply)
docs/expressions/ImperativeAggregate.md
5 additions & 5 deletions
@@ -15,9 +15,9 @@ Used when:

 * `EliminateAggregateFilter` logical optimization is executed
 * `AggregatingAccumulator` is requested to `createBuffer`
-* `AggregationIterator` is requested to [initializeBuffer](../AggregationIterator.md#initializeBuffer)
-* `ObjectAggregationIterator` is requested to [initAggregationBuffer](../ObjectAggregationIterator.md#initAggregationBuffer)
-* `TungstenAggregationIterator` is requested to [createNewAggregationBuffer](../TungstenAggregationIterator.md#createNewAggregationBuffer)
+* `AggregationIterator` is requested to [initializeBuffer](../physical-operators/AggregationIterator.md#initializeBuffer)
+* `ObjectAggregationIterator` is requested to [initAggregationBuffer](../physical-operators/ObjectAggregationIterator.md#initAggregationBuffer)
+* `TungstenAggregationIterator` is requested to [createNewAggregationBuffer](../physical-operators/TungstenAggregationIterator.md#createNewAggregationBuffer)
 * `AggregateProcessor` is requested to [initialize](../window-functions/AggregateProcessor.md#initialize)

 ### <span id="merge"> merge
@@ -31,7 +31,7 @@ merge(
 Used when:

 * `AggregatingAccumulator` is requested to `merge`
-* `AggregationIterator` is requested to [generateProcessRow](../AggregationIterator.md#generateProcessRow)
+* `AggregationIterator` is requested to [generateProcessRow](../physical-operators/AggregationIterator.md#generateProcessRow)

 ### <span id="update"> update

@@ -44,7 +44,7 @@ update(
 Used when:

 * `AggregatingAccumulator` is requested to `add` an `InternalRow`
-* `AggregationIterator` is requested to [generateProcessRow](../AggregationIterator.md#generateProcessRow)
+* `AggregationIterator` is requested to [generateProcessRow](../physical-operators/AggregationIterator.md#generateProcessRow)
 * `AggregateProcessor` is requested to [update](../window-functions/AggregateProcessor.md#update)
-* <span id="resultExpressions"> Result [NamedExpression](expressions/NamedExpression.md)s
+* <span id="resultExpressions"> Result [NamedExpression](../expressions/NamedExpression.md)s
 * <span id="newMutableProjection"> Function to create a new `MutableProjection` given expressions and attributes (`(Seq[Expression], Seq[Attribute]) => MutableProjection`)
-`HashAggregateExec` uses [TungstenAggregationIterator](../TungstenAggregationIterator.md) (to iterate over `UnsafeRows` in partitions) when [executed](#doExecute).
+`HashAggregateExec` uses [TungstenAggregationIterator](TungstenAggregationIterator.md) (to iterate over `UnsafeRows` in partitions) when [executed](#doExecute).

 !!! note
-`HashAggregateExec` uses `TungstenAggregationIterator` that can (theoretically) [switch to a sort-based aggregation when the hash-based approach is unable to acquire enough memory](../TungstenAggregationIterator.md#switchToSortBasedAggregation).
+`HashAggregateExec` uses `TungstenAggregationIterator` that can (theoretically) [switch to a sort-based aggregation when the hash-based approach is unable to acquire enough memory](TungstenAggregationIterator.md#switchToSortBasedAggregation).

 See [testFallbackStartsAt](#testFallbackStartsAt) internal property and [spark.sql.TungstenAggregate.testFallbackStartsAt](../configuration-properties.md#spark.sql.TungstenAggregate.testFallbackStartsAt) configuration property.
-`supportsAggregate` [checks support for aggregation](../UnsafeFixedWidthAggregationMap.md#supportsAggregationBufferSchema) given the aggregation buffer [Attribute](../expressions/Attribute.md)s.
+`supportsAggregate` [checks support for aggregation](UnsafeFixedWidthAggregationMap.md#supportsAggregationBufferSchema) given the aggregation buffer [Attribute](../expressions/Attribute.md)s.

 `supportsAggregate` is used when:

@@ -228,10 +228,10 @@ In the end, `doExecute` calculates the <<aggTime, aggTime>> metric and returns a

 * A single-element `Iterator[UnsafeRow]` with the <<TungstenAggregationIterator.md#outputForEmptyGroupingKeyWithoutInput, single UnsafeRow>>

-* The [TungstenAggregationIterator](../TungstenAggregationIterator.md)
+* The [TungstenAggregationIterator](TungstenAggregationIterator.md)

 !!! note
-The [numOutputRows](#numOutputRows), [peakMemory](#peakMemory), [spillSize](#spillSize) and [avgHashProbe](#avgHashProbe) metrics are used exclusively to create the [TungstenAggregationIterator](../TungstenAggregationIterator.md).
+The [numOutputRows](#numOutputRows), [peakMemory](#peakMemory), [spillSize](#spillSize) and [avgHashProbe](#avgHashProbe) metrics are used exclusively to create the [TungstenAggregationIterator](TungstenAggregationIterator.md).

 !!! note
 `doExecute` (by `RDD.mapPartitionsWithIndex` transformation) adds a new `MapPartitionsRDD` to the RDD lineage. Use `RDD.toDebugString` to see the additional `MapPartitionsRDD`.
@@ -352,7 +352,7 @@ finishAggregate(
 createHashMap(): UnsafeFixedWidthAggregationMap
 ```

-`createHashMap` creates a [UnsafeFixedWidthAggregationMap](../UnsafeFixedWidthAggregationMap.md) (with the <<getEmptyAggregationBuffer, empty aggregation buffer>>, the <<bufferSchema, bufferSchema>>, the <<groupingKeySchema, groupingKeySchema>>, the current `TaskMemoryManager`, `1024 * 16` initial capacity and the page size of the `TaskMemoryManager`)
+`createHashMap` creates a [UnsafeFixedWidthAggregationMap](UnsafeFixedWidthAggregationMap.md) (with the <<getEmptyAggregationBuffer, empty aggregation buffer>>, the <<bufferSchema, bufferSchema>>, the <<groupingKeySchema, groupingKeySchema>>, the current `TaskMemoryManager`, `1024 * 16` initial capacity and the page size of the `TaskMemoryManager`)