DataFrameReader
Defined in: spark-session.ts:468
Fluent reader for loading data into a DataFrame. Returned by
spark.read; configure the format, options, and schema, then terminate
with a format shortcut (csv, json, parquet, orc, text) or .load().
Example
spark.read.parquet("s3://bucket/events/");
spark.read
  .schema("id INT, name STRING")
  .option("header", "true")
  .csv("/data/people.csv");
Spark source: DataFrameReader.scala
Constructors
Constructor
new DataFrameReader(session): DataFrameReader;
Defined in: spark-session.ts:474
Parameters
| Parameter | Type |
|---|---|
| session | SparkSession |
Returns
DataFrameReader
Methods
csv()
csv(path): DataFrame;
Defined in: spark-session.ts:555
Shortcut for .format("csv").load(path).
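The shortcut relationship can be sketched with a small stand-in class. `ReaderSketch` and its `ReadPlan` return type are hypothetical names used only for illustration, not this library's implementation, and the default format is an assumption. The point is that a shortcut sets the format and then delegates to load().

```typescript
// Hypothetical stand-in for illustration; not the library's actual class.
interface ReadPlan {
  format: string;
  paths: string[];
}

class ReaderSketch {
  private fmt = "parquet"; // assumed default, chosen only for this sketch

  format(fmt: string): this {
    this.fmt = fmt;
    return this;
  }

  load(path: string): ReadPlan {
    return { format: this.fmt, paths: [path] };
  }

  // The shortcut is format("csv") followed by load(path).
  csv(path: string): ReadPlan {
    return this.format("csv").load(path);
  }
}

const viaShortcut = new ReaderSketch().csv("/data/people.csv");
const viaLongForm = new ReaderSketch().format("csv").load("/data/people.csv");
```

Under these assumptions both calls produce the same plan, so the shortcuts add convenience rather than behavior.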
Parameters
| Parameter | Type |
|---|---|
| path | string |
Returns
DataFrame
format()
format(fmt): this;
Defined in: spark-session.ts:508
Parameters
| Parameter | Type |
|---|---|
| fmt | string |
Returns
this
json()
json(path): DataFrame;
Defined in: spark-session.ts:550
Shortcut for .format("json").load(path).
Parameters
| Parameter | Type |
|---|---|
| path | string |
Returns
DataFrame
load()
load(path): DataFrame;
Defined in: spark-session.ts:530
Create a Read plan node for the given path. The resulting DataFrame is lazy; no data is fetched until an action such as .collect() is called.
This maps to Spark Connect's Relation.Read with ReadType.DataSource:
{ format: "parquet", paths: […], options: {…} }
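To make the mapping concrete, the following sketch builds a plain object in the shape quoted above. The `DataSourceRead` interface and `buildRead` helper are hypothetical, and the object is a simplified stand-in for the Spark Connect message rather than its actual protobuf encoding. Building it performs no I/O, which is consistent with the DataFrame staying lazy.

```typescript
// Simplified, hypothetical shape modeled on the { format, paths, options } example.
interface DataSourceRead {
  format: string;
  paths: string[];
  options: Record<string, string>;
  schema?: string; // DDL string, if one was set on the reader
}

// Assemble the plan node from reader state. Nothing is read here.
function buildRead(
  format: string,
  path: string,
  options: Record<string, string> = {},
  schema?: string,
): DataSourceRead {
  // Copy the options so later changes to the caller's object do not leak in.
  const node: DataSourceRead = { format, paths: [path], options: { ...options } };
  if (schema !== undefined) {
    node.schema = schema;
  }
  return node;
}

const readNode = buildRead("parquet", "s3://bucket/events/", { mergeSchema: "true" });
console.log(JSON.stringify(readNode));
```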
Parameters
| Parameter | Type |
|---|---|
| path | string |
Returns
DataFrame
option()
option(key, value): this;
Defined in: spark-session.ts:513
Parameters
| Parameter | Type |
|---|---|
| key | string |
| value | string |
Returns
this
options()
options(opts): this;
Defined in: spark-session.ts:518
Parameters
| Parameter | Type |
|---|---|
| opts | Record<string, string> |
Returns
this
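One plausible way for option() and options() to interact is for both to write into the same key/value map, with later calls overwriting earlier ones. The sketch below assumes that behavior. `OptionsSketch` and its `snapshot()` method are hypothetical, and the real reader may differ, for example in how it treats key case.

```typescript
// Hypothetical sketch: assumes options accumulate in one map, last write wins.
class OptionsSketch {
  private readonly opts: Record<string, string> = {};

  option(key: string, value: string): this {
    this.opts[key] = value;
    return this;
  }

  options(opts: Record<string, string>): this {
    Object.assign(this.opts, opts);
    return this;
  }

  // Return a copy so callers cannot mutate internal state.
  snapshot(): Record<string, string> {
    return { ...this.opts };
  }
}

const merged = new OptionsSketch()
  .option("header", "true")
  .options({ sep: ";", header: "false" })
  .snapshot();
// merged is { header: "false", sep: ";" }
```

Note that values are strings even for boolean-like settings, matching the `string` parameter types in the tables above.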
orc()
orc(path): DataFrame;
Defined in: spark-session.ts:565
Shortcut for .format("orc").load(path).
Parameters
| Parameter | Type |
|---|---|
| path | string |
Returns
DataFrame
parquet()
parquet(path): DataFrame;
Defined in: spark-session.ts:560
Shortcut for .format("parquet").load(path).
Parameters
| Parameter | Type |
|---|---|
| path | string |
Returns
DataFrame
schema()
schema(schema): this;
Defined in: spark-session.ts:483
Set the schema for the data source. Accepts a DDL-formatted string (e.g. "name STRING, age INT") or a StructType with a toDDL() method.
Parameters
| Parameter | Type |
|---|---|
| schema | string \| { toDDL: string; } |
Returns
this
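Because schema() accepts two input shapes, an implementation has to normalize them to a single DDL string at some point. The helper below is a hypothetical sketch of that step. The type table lists toDDL as a string while the description calls it a method, so the sketch accepts either form. `SchemaLike` and `toDdlString` are illustrative names, not part of the documented API.

```typescript
// Hypothetical type: toDDL may be a string property or a zero-argument method.
type SchemaLike = string | { toDDL: string | (() => string) };

// Reduce either accepted input shape to a plain DDL string.
function toDdlString(schema: SchemaLike): string {
  if (typeof schema === "string") {
    return schema;
  }
  return typeof schema.toDDL === "function" ? schema.toDDL() : schema.toDDL;
}

const fromString = toDdlString("name STRING, age INT");
const fromObject = toDdlString({ toDDL: () => "name STRING, age INT" });
```

Both calls yield the same DDL text, so the rest of a reader could store and forward a single string regardless of how the schema was supplied.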
table()
table(tableName): DataFrame;
Defined in: spark-session.ts:541
Read a named table (catalog table or temp view).
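By analogy with the data-source read used by load(), a table read presumably produces a Read node that names the table instead of listing paths. Spark Connect's protocol has a NamedTable read type that carries an unparsed identifier for this purpose. The sketch below is a simplified, hypothetical stand-in for such a node. `NamedTableRead` and `buildTableRead` are illustrative names, and the empty-name check is an assumption rather than documented behavior.

```typescript
// Simplified, hypothetical stand-in for a named-table read node.
interface NamedTableRead {
  unparsedIdentifier: string; // e.g. "db.events" or a temp view name
  options: Record<string, string>;
}

function buildTableRead(
  tableName: string,
  options: Record<string, string> = {},
): NamedTableRead {
  if (tableName.trim() === "") {
    // Validation assumed for this sketch only.
    throw new Error("tableName must not be empty");
  }
  return { unparsedIdentifier: tableName, options: { ...options } };
}

const tableNode = buildTableRead("analytics.events");
```

The identifier is passed through unparsed, so resolving it against the catalog or temp views would be left to the server.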
Parameters
| Parameter | Type |
|---|---|
| tableName | string |
Returns
DataFrame
text()
text(path): DataFrame;
Defined in: spark-session.ts:570
Shortcut for .format("text").load(path).
Parameters
| Parameter | Type |
|---|---|
| path | string |