GroupedData

Defined in: grouped-data.ts:22

Returned by DataFrame.groupBy, DataFrame.rollup, DataFrame.cube, and related methods. Mirrors RelationalGroupedDataset in the JVM Spark API.

GroupedData is not a DataFrame. It is a transient builder that captures grouping expressions and waits for an aggregation call (agg, count, sum, avg, min, max, pivot) to produce the final DataFrame.

const totals = df.groupBy("department").agg(sum("salary").alias("payroll"));

Spark source: RelationalGroupedDataset.scala
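
The builder behaviour described above can be sketched with a pair of stand-in classes. This is only an illustration under stated assumptions: `Plan`, `SketchFrame`, and `SketchGrouped` are hypothetical names invented for this sketch, not the library's classes, and aggregates are plain strings rather than Column objects. The grouping step only records its columns, and a plan node appears once the terminal `agg` call runs.

```typescript
// Hypothetical plan nodes; the client's real plan representation is not shown on this page.
type Plan =
  | { kind: "scan"; table: string }
  | { kind: "aggregate"; input: Plan; groupBy: string[]; aggregates: string[] };

class SketchFrame {
  readonly plan: Plan;
  constructor(plan: Plan) {
    this.plan = plan;
  }
  // Returns a builder; no aggregate node exists yet.
  groupBy(...columns: string[]): SketchGrouped {
    return new SketchGrouped(this, columns);
  }
}

class SketchGrouped {
  private readonly parent: SketchFrame;
  private readonly columns: string[];
  constructor(parent: SketchFrame, columns: string[]) {
    this.parent = parent;
    this.columns = columns;
  }
  // The terminal call combines the captured grouping with the aggregates.
  agg(...aggregates: string[]): SketchFrame {
    return new SketchFrame({
      kind: "aggregate",
      input: this.parent.plan,
      groupBy: this.columns,
      aggregates,
    });
  }
}

const employees = new SketchFrame({ kind: "scan", table: "employees" });
const grouped = employees.groupBy("department"); // builder only, carries no plan
const totals = grouped.agg("sum(salary)"); // a frame again, with an aggregate plan
```

Keeping the builder free of a plan of its own is one way to make "group without aggregating" unrepresentable as a query.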

new GroupedData(
df,
groupExprs,
groupType?,
pivot?): GroupedData;

Defined in: grouped-data.ts:28

Parameter      Type                                                         Default value
df             DataFrame                                                    undefined
groupExprs     Expression[]                                                 undefined
groupType      "groupby" | "rollup" | "cube" | "pivot"                      "groupby"
pivot?         { col: Expression; values: (string | number | boolean)[]; }  undefined
pivot.col?     Expression                                                   undefined
pivot.values?  (string | number | boolean)[]                                undefined

Returns: GroupedData

agg(...exprs): DataFrame;

Defined in: grouped-data.ts:52

Apply one or more aggregate functions.

Parameter  Type
exprs      Column[]

Returns: DataFrame

df.groupBy(col("dept")).agg(
  functions.sum(col("salary")).alias("total_salary"),
  functions.avg(col("age")).alias("avg_age"),
)

Each Column passed in should be an aggregate expression (for example sum, avg, count, min, or max). The grouping columns are implicitly included in the output.
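
To make the output shape concrete, here is a minimal in-memory sketch of grouping followed by several aggregates. It does not use the library: `StaffRow`, `AggSpec`, `sumOf`, `avgOf`, and `groupAndAggregate` are hypothetical helpers written for this illustration, and the data is made up. Each result row carries the grouping column first, followed by one column per aggregate alias.

```typescript
type StaffRow = Record<string, string | number>;

// Hypothetical aggregate descriptor: an output alias plus a reducer over a group's rows.
interface AggSpec {
  alias: string;
  compute: (rows: StaffRow[]) => number;
}

const sumOf = (column: string, alias: string): AggSpec => ({
  alias,
  compute: (rows) => rows.reduce((acc, r) => acc + Number(r[column]), 0),
});

const avgOf = (column: string, alias: string): AggSpec => ({
  alias,
  compute: (rows) => rows.reduce((acc, r) => acc + Number(r[column]), 0) / rows.length,
});

function groupAndAggregate(rows: StaffRow[], key: string, ...aggs: AggSpec[]): StaffRow[] {
  // Bucket rows by the value of the grouping column.
  const groups = new Map<string | number, StaffRow[]>();
  for (const row of rows) {
    const bucket = groups.get(row[key]);
    if (bucket) bucket.push(row);
    else groups.set(row[key], [row]);
  }
  // Emit one row per group: the grouping column first, then each aggregate.
  const result: StaffRow[] = [];
  groups.forEach((members, value) => {
    const out: StaffRow = { [key]: value };
    for (const a of aggs) out[a.alias] = a.compute(members);
    result.push(out);
  });
  return result;
}

const staff: StaffRow[] = [
  { dept: "eng", salary: 100, age: 30 },
  { dept: "eng", salary: 200, age: 40 },
  { dept: "ops", salary: 50, age: 50 },
];

const summary = groupAndAggregate(
  staff,
  "dept",
  sumOf("salary", "total_salary"),
  avgOf("age", "avg_age"),
);
// summary has one row per dept, e.g. { dept: "eng", total_salary: 300, avg_age: 35 }
```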

avg(...columnNames): DataFrame;

Defined in: grouped-data.ts:88

Shorthand for agg(avg(column)).

Parameter    Type
columnNames  string[]

Returns: DataFrame
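
One plausible reading of the name-based shorthands (avg, sum, min, max) is that each column name expands into its own aggregate expression, so that avg("salary", "age") behaves like agg(avg("salary"), avg("age")). The sketch below only models that expansion with strings. `expandShorthand` and `AggName` are hypothetical, and the `fn(column)` naming follows Spark's convention for unaliased aggregates; whether this client names its output columns the same way is an assumption.

```typescript
// Hypothetical helper: expands shorthand column names into one aggregate expression each.
type AggName = "avg" | "sum" | "min" | "max";

function expandShorthand(fn: AggName, columnNames: string[]): string[] {
  return columnNames.map((name) => `${fn}(${name})`);
}

const expanded = expandShorthand("avg", ["salary", "age"]);
// expanded is ["avg(salary)", "avg(age)"]
```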


count(): DataFrame;

Defined in: grouped-data.ts:65

Shorthand for agg(count("*")).

Returns: DataFrame
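
In Spark SQL, count("*") counts every row in a group, including rows where other columns are null. The following in-memory sketch illustrates that semantics; `NullableRow` and `countPerGroup` are hypothetical helpers written for this example and are not part of the library.

```typescript
type NullableRow = Record<string, string | number | null>;

// Counts every row in each group, regardless of nulls in other columns.
function countPerGroup(rows: NullableRow[], key: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const group = String(row[key]);
    counts.set(group, (counts.get(group) ?? 0) + 1);
  }
  return counts;
}

const payroll: NullableRow[] = [
  { dept: "eng", salary: 100 },
  { dept: "eng", salary: null }, // still counted: count("*") ignores column values
  { dept: "ops", salary: 50 },
];

const rowCounts = countPerGroup(payroll, "dept");
// rowCounts.get("eng") is 2, rowCounts.get("ops") is 1
```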


max(...columnNames): DataFrame;

Defined in: grouped-data.ts:112

Shorthand for agg(max(column)).

Parameter    Type
columnNames  string[]

Returns: DataFrame


min(...columnNames): DataFrame;

Defined in: grouped-data.ts:100

Shorthand for agg(min(column)).

Parameter    Type
columnNames  string[]

Returns: DataFrame


pivot(pivotCol, values?): GroupedData;

Defined in: grouped-data.ts:124

Pivot a column for aggregation.

Parameter  Type                           Default value
pivotCol   string | Column                undefined
values     (string | number | boolean)[]  []

Returns: GroupedData
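
As a rough illustration of what a pivot followed by an aggregation produces, the sketch below groups rows by one column and turns the listed values of a second column into output columns, summing a third. It is an in-memory model under stated assumptions, not the library's implementation: `SaleRow` and `pivotSum` are hypothetical, the data is made up, and empty cells are filled with 0 for brevity where Spark would yield null.

```typescript
type SaleRow = { region: string; quarter: string; amount: number };

// Hypothetical helper: groups by region, then spreads the listed `quarter`
// values into columns, summing `amount` into the matching cell.
function pivotSum(rows: SaleRow[], values: string[]): Record<string, string | number>[] {
  const byRegion = new Map<string, Record<string, string | number>>();
  for (const row of rows) {
    let out = byRegion.get(row.region);
    if (!out) {
      out = { region: row.region };
      for (const v of values) out[v] = 0; // simplification: Spark would leave null
      byRegion.set(row.region, out);
    }
    // Rows whose pivot value is not listed do not contribute to any cell.
    if (values.indexOf(row.quarter) !== -1) {
      out[row.quarter] = (out[row.quarter] as number) + row.amount;
    }
  }
  return Array.from(byRegion.values());
}

const sales: SaleRow[] = [
  { region: "east", quarter: "q1", amount: 10 },
  { region: "east", quarter: "q1", amount: 5 },
  { region: "east", quarter: "q2", amount: 7 },
  { region: "west", quarter: "q1", amount: 3 },
  { region: "west", quarter: "q3", amount: 9 }, // q3 is not listed, so it is ignored
];

const pivoted = pivotSum(sales, ["q1", "q2"]);
// pivoted has one row per region with a q1 and a q2 column
```

Listing the pivot values up front fixes the output columns in advance, which is why the sketch takes them as an argument.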


sum(...columnNames): DataFrame;

Defined in: grouped-data.ts:76

Shorthand for agg(sum(column)).

Parameter    Type
columnNames  string[]

Returns: DataFrame