pyspark_stubs_typed_sparksession_rdd_transformations_quickstart.py
Python: a type-annotated PySpark script demonstrating SparkSession initialization
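# Postpone annotation evaluation so that RDD[int] below is never subscripted
# at runtime. On PySpark releases where RDD is generic only in the type stubs,
# evaluating RDD[int] would raise a TypeError.
from __future__ import annotations

from pyspark.rdd import RDD
from pyspark.sql import SparkSession

# Initialize a local SparkSession that uses all available cores
spark: SparkSession = (
    SparkSession.builder
    .master("local[*]")
    .appName("pyspark-stubs-example")
    .getOrCreate()
)

# Create an RDD with an explicit element type.
# The annotation lets mypy check that each transformation
# receives and returns values of the expected type.
data: RDD[int] = spark.sparkContext.parallelize([1, 2, 3, 4, 5])

# Double every element, then keep only the values greater than 5
results: RDD[int] = data.map(lambda x: x * 2).filter(lambda x: x > 5)

# Collect the results to the driver and print them
print(results.collect())

# Stop the session and release its resources
spark.stop()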
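
The map-then-filter pipeline above can also be traced without a Spark installation. The following is a minimal plain-Python sketch, not part of the original script: `map_then_filter`, `double`, and `greater_than_five` are hypothetical names introduced here for illustration. It mirrors the kind of signature a type checker relies on: the function passed to the map step determines the element type that the filter predicate must accept.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_then_filter(
    items: Iterable[T],
    transform: Callable[[T], U],
    keep: Callable[[U], bool],
) -> List[U]:
    """Apply transform to every item, then keep the results that pass keep."""
    return [y for y in (transform(x) for x in items) if keep(y)]


def double(x: int) -> int:
    # Same operation as the map step in the script: x * 2
    return x * 2


def greater_than_five(x: int) -> bool:
    # Same predicate as the filter step in the script: x > 5
    return x > 5


# Hypothetical local stand-in for the RDD pipeline, using the same input data
local_results: List[int] = map_then_filter([1, 2, 3, 4, 5], double, greater_than_five)
print(local_results)  # [6, 8, 10]
```

Using named, annotated functions instead of lambdas gives the type checker explicit signatures to compare. For example, passing a predicate annotated as taking `str` where `int` elements are produced would be reported as an error before the job runs.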