
SparkSession

Defined in: spark-session.ts:127

The client handle for a Spark Connect session.

Holds the transport, session identifier, and runtime-adapter hooks (for example the Arrow decoder). All DataFrame operations are scheduled against a SparkSession; most applications create one session at startup and reuse it.

Construct a session with the runtime-specific builder, for example SparkSessionBuilder from @spark-connect-js/node.

import { SparkSessionBuilder } from "@spark-connect-js/node";

const spark = await SparkSessionBuilder
  .remote("sc://localhost:15002")
  .build();
const df = await spark.sql("SELECT 1 AS n");
console.log(await df.collect());

Spark source: SparkSession.scala

readonly catalog: Catalog;

Defined in: spark-session.ts:156

Access the session catalog for inspecting databases, tables, and columns.


readonly conf: RuntimeConfig;

Defined in: spark-session.ts:162

Read and write Spark configuration entries on the connected server.


readonly sessionId: string;

Defined in: spark-session.ts:128


readonly udf: UDFRegistration;

Defined in: spark-session.ts:159

Register Java UDFs and UDAFs as SQL functions.

get read(): DataFrameReader;

Defined in: spark-session.ts:165

Returns a DataFrameReader for building Read plans.

DataFrameReader

addTag(tag): void;

Defined in: spark-session.ts:272

Tag every subsequent operation on this session with tag. Tags are carried on ExecutePlanRequest.tags and let you cancel a group of operations with interruptTag.

Parameter | Type
tag | string

void

Throws InvalidInputError if the tag is empty or contains a comma (,).
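Because a tag applies to every subsequent operation, it is often useful to scope it to one unit of work so it does not leak onto later operations. The sketch below relies only on the addTag and removeTag signatures documented on this page; the TagScope interface and the withTag helper are hypothetical names introduced for illustration and are not part of the library.

```typescript
// Minimal structural view of the tag methods documented on this page.
// `TagScope` is a hypothetical name; a SparkSession is assumed to satisfy it.
interface TagScope {
  addTag(tag: string): void;
  removeTag(tag: string): void;
}

// Run `work` while `tag` is active, then remove the tag even if `work` throws.
async function withTag<T>(
  session: TagScope,
  tag: string,
  work: () => Promise<T>,
): Promise<T> {
  // Documented to throw InvalidInputError for an empty tag or one containing ",".
  session.addTag(tag);
  try {
    return await work();
  } finally {
    // Documented as a no-op if the tag is not set.
    session.removeTag(tag);
  }
}
```

A call might look like `await withTag(spark, "nightly-report", () => spark.sql("SELECT 1 AS n").collect())`. Note that this helper also removes the tag if it was already active before the call.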


clearTags(): void;

Defined in: spark-session.ts:288

Drop all active tags.

void


createDataFrame(data, schema?): DataFrame;

Defined in: spark-session.ts:210

Create a DataFrame from Arrow IPC data.

Parameter | Type | Description
data | Uint8Array | Arrow IPC streaming format bytes
schema? | string | Optional DDL-formatted schema string (e.g. "id INT, name STRING")

DataFrame

const arrowData = ArrowEncoder.encode(rows, schema);
const df = spark.createDataFrame(arrowData);

getTags(): string[];

Defined in: spark-session.ts:283

Return a snapshot of the currently active tags.

string[]


interruptAll(): Promise<string[]>;

Defined in: spark-session.ts:296

Interrupt every running operation in this session.

Promise<string[]>


interruptOperation(operationId): Promise<string[]>;

Defined in: spark-session.ts:307

Interrupt a single running operation by its operation ID.

Parameter | Type
operationId | string

Promise<string[]>


interruptTag(tag): Promise<string[]>;

Defined in: spark-session.ts:301

Interrupt every running operation tagged with tag.

Parameter | Type
tag | string

Promise<string[]>
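One use of interruptTag is a client-side deadline: arm a timer when tagged work starts and interrupt the tag if the work has not finished in time. The following is a minimal sketch under that assumption. TagInterrupter and interruptTagAfter are hypothetical names, and the meaning of the returned string[] is not stated on this page, so the sketch ignores it.

```typescript
// Structural view of the documented interruptTag signature.
// `TagInterrupter` is a hypothetical name; a SparkSession is assumed to satisfy it.
interface TagInterrupter {
  interruptTag(tag: string): Promise<string[]>;
}

// Arm a timer that interrupts everything tagged `tag` after `ms` milliseconds.
// Returns a disarm function to call once the work has finished in time.
function interruptTagAfter(
  session: TagInterrupter,
  tag: string,
  ms: number,
  onError: (err: unknown) => void = () => {},
): () => void {
  const timer = setTimeout(() => {
    // Fire and forget: the resolved string[] is ignored in this sketch.
    session.interruptTag(tag).catch(onError);
  }, ms);
  return () => clearTimeout(timer);
}
```

The caller is expected to have added the tag with addTag before starting the work, and to call the returned function in a finally block so the timer does not fire after the work completes.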


range(
  startOrEnd,
  end?,
  step?,
  numPartitions?): DataFrame;

Defined in: spark-session.ts:188

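Create a DataFrame with a single id column containing a sequence of integers from start (inclusive) to end (exclusive), incrementing by step. When called with a single argument, that argument is the exclusive end and the sequence starts at 0.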

Mirrors PySpark’s spark.range(start, end, step, numPartitions).

Parameter | Type | Default value
startOrEnd | number | undefined
end? | number | undefined
step? | number | 1
numPartitions? | number | undefined

DataFrame

spark.range(10) // 0, 1, 2, ..., 9
spark.range(1, 10) // 1, 2, 3, ..., 9
spark.range(0, 10, 2) // 0, 2, 4, 6, 8
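The way the first argument switches between start and end can be made concrete with a small local function. This is only an illustration of the argument handling implied by the examples above, not the library's implementation: rangeIds is a hypothetical helper that builds the id values in memory, and it covers only positive steps.

```typescript
// Hypothetical local helper: lists the ids that a range(...) call with the
// same arguments is documented to produce. Covers positive steps only.
function rangeIds(startOrEnd: number, end?: number, step: number = 1): number[] {
  if (step <= 0) {
    throw new RangeError("this sketch only handles positive steps");
  }
  // With one argument, the value is the exclusive end and start is 0.
  const start = end === undefined ? 0 : startOrEnd;
  const stop = end === undefined ? startOrEnd : end;
  const ids: number[] = [];
  for (let id = start; id < stop; id += step) {
    ids.push(id);
  }
  return ids;
}

console.log(rangeIds(10));       // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
console.log(rangeIds(1, 10));    // [1, 2, 3, 4, 5, 6, 7, 8, 9]
console.log(rangeIds(0, 10, 2)); // [0, 2, 4, 6, 8]
```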

removeTag(tag): void;

Defined in: spark-session.ts:278

Remove a previously added tag. No-op if the tag wasn’t set.

Parameter | Type
tag | string

void


sql(query): DataFrame;

Defined in: spark-session.ts:170

Execute a SQL query.

Parameter | Type
query | string

DataFrame


stop(): Promise<void>;

Defined in: spark-session.ts:340

Stop the session, releasing server-side state and closing the transport.

Promise<void>
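Since stop releases server-side state, short-lived scripts may want to guarantee it runs even when the work fails. A minimal sketch of that pattern follows. Stoppable and withSession are hypothetical names, and only the documented stop(): Promise<void> signature is assumed.

```typescript
// Structural view of the documented stop() signature.
// `Stoppable` is a hypothetical name; a SparkSession is assumed to satisfy it.
interface Stoppable {
  stop(): Promise<void>;
}

// Open a session, run `use` with it, and always stop the session afterwards.
async function withSession<S extends Stoppable, T>(
  open: () => Promise<S>,
  use: (session: S) => Promise<T>,
): Promise<T> {
  const session = await open();
  try {
    return await use(session);
  } finally {
    // Runs on success and on failure; an error thrown by stop() propagates.
    await session.stop();
  }
}
```

With the Node builder shown at the top of this page, `open` could be `() => SparkSessionBuilder.remote("sc://localhost:15002").build()`. Long-running applications that reuse one session would instead call stop once at shutdown.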


version(): Promise<string>;

Defined in: spark-session.ts:332

Return the Apache Spark version reported by the connected server.

Each call issues one AnalyzePlan RPC. The result is not cached, so call it once and store the value if you need it repeatedly.

Mirrors pyspark.sql.SparkSession.version.

Promise<string>
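Because the result is not cached, callers that need the version in several places can memoize it themselves. The sketch below shows one way to do that. VersionSource and cachedVersion are hypothetical names, and only the documented version(): Promise<string> signature is assumed.

```typescript
// Structural view of the documented version() signature.
// `VersionSource` is a hypothetical name; a SparkSession is assumed to satisfy it.
interface VersionSource {
  version(): Promise<string>;
}

// Return a function that calls session.version() at most once per success
// and hands the same promise to every later caller.
function cachedVersion(session: VersionSource): () => Promise<string> {
  let cached: Promise<string> | undefined;
  return () => {
    if (cached === undefined) {
      cached = session.version().catch((err) => {
        cached = undefined; // forget a failed lookup so a later call can retry
        throw err;
      });
    }
    return cached;
  };
}
```

Storing the promise rather than the resolved string means concurrent callers share a single in-flight request.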


static builder(): SparkSessionBuilder;

Defined in: spark-session.ts:149

SparkSessionBuilder