GroupedData
Defined in: grouped-data.ts:22
Returned by DataFrame.groupBy, DataFrame.rollup,
DataFrame.cube, and related methods. Mirrors RelationalGroupedDataset
in the JVM Spark API.
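GroupedData is not a DataFrame. It is a transient builder that
captures grouping expressions and waits for an aggregation call
(agg, count, sum, avg, min, max) to produce the final DataFrame.
pivot is the exception: it returns another GroupedData, which must
itself be followed by an aggregation call.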
Example
const totals = df.groupBy("department").agg(sum("salary").alias("payroll"));
Spark source: RelationalGroupedDataset.scala
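To make the builder idea concrete, here is a minimal in-memory sketch. It is not the library's implementation: `MiniGrouped`, `Agg`, and `sumOf` are hypothetical stand-ins, and plain arrays of objects take the place of DataFrames. It only illustrates that grouping captures state, that `agg` is the step that produces rows, and that the grouping column is carried into the output.

```typescript
// Hypothetical in-memory stand-ins; none of these names come from the library.
type Row = Record<string, string | number>;

// An aggregate is a named reducer over the rows of one group.
interface Agg {
  name: string;
  reduce: (rows: Row[]) => number;
}

// Illustrative helper that plays the role of sum(column).alias(name).
const sumOf = (column: string, name: string): Agg => ({
  name,
  reduce: (rows) => rows.reduce((acc, r) => acc + Number(r[column]), 0),
});

// The builder only remembers the rows and the grouping keys.
// It produces no result until agg() is called.
class MiniGrouped {
  private readonly rows: Row[];
  private readonly keys: string[];

  constructor(rows: Row[], keys: string[]) {
    this.rows = rows;
    this.keys = keys;
  }

  agg(...aggs: Agg[]): Row[] {
    // Bucket rows by the values of their grouping keys.
    const groups = new Map<string, Row[]>();
    for (const row of this.rows) {
      const id = JSON.stringify(this.keys.map((k) => row[k]));
      const bucket = groups.get(id) ?? [];
      bucket.push(row);
      groups.set(id, bucket);
    }
    const out: Row[] = [];
    groups.forEach((bucket) => {
      const result: Row = {};
      // Grouping columns come first, followed by each aggregate.
      for (const k of this.keys) result[k] = bucket[0][k];
      for (const a of aggs) result[a.name] = a.reduce(bucket);
      out.push(result);
    });
    return out;
  }
}

const employees: Row[] = [
  { department: "eng", salary: 100 },
  { department: "eng", salary: 120 },
  { department: "ops", salary: 90 },
];

const totals = new MiniGrouped(employees, ["department"]).agg(sumOf("salary", "payroll"));
// [{ department: "eng", payroll: 220 }, { department: "ops", payroll: 90 }]
```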
Constructors
Constructor
new GroupedData(df, groupExprs, groupType?, pivot?): GroupedData;
Defined in: grouped-data.ts:28
Parameters
| Parameter | Type | Default value |
|---|---|---|
| df | DataFrame | undefined |
| groupExprs | Expression[] | undefined |
| groupType | "groupby" \| "rollup" \| "cube" \| "pivot" | "groupby" |
| pivot? | { col: Expression; values: (string \| number \| boolean)[]; } | undefined |
| pivot.col? | Expression | undefined |
| pivot.values? | (string \| number \| boolean)[] | undefined |
Returns
GroupedData
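The groupType values groupby, rollup, and cube differ in which combinations of the grouping columns get aggregated. The sketch below lists those combinations under standard Spark SQL semantics. `groupingSets` is a hypothetical helper written for illustration, not a library function, and the "pivot" type is left out because its handling is not described here.

```typescript
// Hypothetical helper: which column subsets are aggregated per grouping type.
type GroupType = "groupby" | "rollup" | "cube";

function groupingSets(columns: string[], groupType: GroupType): string[][] {
  // groupby: a single grouping set containing all columns.
  if (groupType === "groupby") return [columns.slice()];

  if (groupType === "rollup") {
    // rollup: prefixes from longest to shortest, ending with the grand total [].
    const prefixes: string[][] = [];
    for (let n = columns.length; n >= 0; n--) prefixes.push(columns.slice(0, n));
    return prefixes;
  }

  // cube: every subset of the columns, so 2^n grouping sets.
  const subsets: string[][] = [];
  const total = 1 << columns.length;
  for (let mask = total - 1; mask >= 0; mask--) {
    subsets.push(
      columns.filter((_, i) => (mask & (1 << (columns.length - 1 - i))) !== 0),
    );
  }
  return subsets;
}

const rollupSets = groupingSets(["dept", "role"], "rollup");
// [["dept", "role"], ["dept"], []]
const cubeSets = groupingSets(["dept", "role"], "cube");
// [["dept", "role"], ["dept"], ["role"], []]
```

With two grouping columns, rollup yields three grouping sets and cube yields four; the empty set corresponds to the grand-total row.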
Methods
agg()
agg(...exprs): DataFrame;
Defined in: grouped-data.ts:52
Apply one or more aggregate functions.
Parameters
| Parameter | Type |
|---|---|
| ...exprs | Column[] |
Returns
DataFrame
Example
df.groupBy(col("dept")).agg(
  functions.sum(col("salary")).alias("total_salary"),
  functions.avg(col("age")).alias("avg_age"),
);
Each Column passed in should be an aggregate expression (sum, avg, count, min, max). The grouping columns are implicitly included in the output.
avg()
avg(...columnNames): DataFrame;
Defined in: grouped-data.ts:88
Shorthand for agg(avg(column))
Parameters
| Parameter | Type |
|---|---|
| ...columnNames | string[] |
Returns
DataFrame
count()
count(): DataFrame;
Defined in: grouped-data.ts:65
Shorthand for agg(count("*"))
Returns
DataFrame
max()
max(...columnNames): DataFrame;
Defined in: grouped-data.ts:112
Shorthand for agg(max(column))
Parameters
| Parameter | Type |
|---|---|
| ...columnNames | string[] |
Returns
DataFrame
min()
min(...columnNames): DataFrame;
Defined in: grouped-data.ts:100
Shorthand for agg(min(column))
Parameters
| Parameter | Type |
|---|---|
| ...columnNames | string[] |
Returns
DataFrame
pivot()
pivot(pivotCol, values?): GroupedData;
Defined in: grouped-data.ts:124
Pivot a column for aggregation.
Parameters
| Parameter | Type | Default value |
|---|---|---|
| pivotCol | string \| Column | undefined |
| values | (string \| number \| boolean)[] | [] |
Returns
GroupedData
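As an illustration of what pivoting means, the sketch below works on plain in-memory rows. Assuming the library follows JVM Spark's pivot semantics, it approximates what `df.groupBy("year").pivot("course", ["dotNET", "Java"]).sum("earnings")` would be expected to compute. `pivotSum` and the sample data are hypothetical and not part of the library.

```typescript
type Row = Record<string, string | number | null>;

// Hypothetical helper: group by `key`, turn each listed value of `pivotCol`
// into an output column, and sum `valueCol` into the matching column.
function pivotSum(
  rows: Row[],
  key: string,
  pivotCol: string,
  values: string[],
  valueCol: string,
): Row[] {
  const byKey = new Map<string | number | null, Row>();
  for (const row of rows) {
    let out = byKey.get(row[key]);
    if (out === undefined) {
      // Start every pivot column at null, as Spark does for empty cells.
      out = { [key]: row[key] };
      for (const v of values) out[v] = null;
      byKey.set(row[key], out);
    }
    const label = String(row[pivotCol]);
    // Rows whose pivot value is not listed contribute to no column.
    if (values.indexOf(label) !== -1) {
      out[label] = Number(out[label] ?? 0) + Number(row[valueCol]);
    }
  }
  return Array.from(byKey.values());
}

const sales: Row[] = [
  { year: 2012, course: "dotNET", earnings: 10000 },
  { year: 2012, course: "Java", earnings: 20000 },
  { year: 2012, course: "dotNET", earnings: 5000 },
  { year: 2013, course: "dotNET", earnings: 48000 },
];

const pivoted = pivotSum(sales, "year", "course", ["dotNET", "Java"], "earnings");
// [{ year: 2012, dotNET: 15000, Java: 20000 }, { year: 2013, dotNET: 48000, Java: null }]
```

Each value in the list becomes one output column, and a group with no matching rows for a value gets null in that column.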
sum()
sum(...columnNames): DataFrame;
Defined in: grouped-data.ts:76
Shorthand for agg(sum(column))
Parameters
| Parameter | Type |
|---|---|
| ...columnNames | string[] |
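One plausible reading of the shorthand methods (sum, avg, min, max) is that each column name expands to one aggregate expression handed to agg. The sketch below models only that expansion. `AggFn`, `AggExpr`, and `expandShorthand` are hypothetical names, and the actual expansion and output column names in the library may differ.

```typescript
// Hypothetical descriptor for one aggregate; not a type from the library.
type AggFn = "avg" | "max" | "min" | "sum";

interface AggExpr {
  fn: AggFn;
  column: string;
}

// Assumption: each column name becomes its own aggregate expression,
// so sum("a", "b") would behave like agg(sum("a"), sum("b")).
function expandShorthand(fn: AggFn, ...columnNames: string[]): AggExpr[] {
  return columnNames.map((column) => ({ fn, column }));
}

const expanded = expandShorthand("sum", "salary", "bonus");
// [{ fn: "sum", column: "salary" }, { fn: "sum", column: "bonus" }]
```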