pyspark_sparkmeasure_stagemetrics_quickstart_with_performance_report.py
This quickstart initializes a Spark session with the sparkmeasure jar and u
from pyspark.sql import SparkSession
from sparkmeasure import StageMetrics

# Initialize a Spark session and pull in the sparkMeasure jar from Maven.
# Note: the artifact suffix (_2.12) is the Scala version and must match
# the Scala version of the Spark build you are using.
spark = SparkSession.builder \
    .appName("SparkMeasure Quickstart") \
    .config("spark.jars.packages", "ch.cern.sparkmeasure:spark-measure_2.12:0.24") \
    .getOrCreate()

# Initialize the StageMetrics object
stagemetrics = StageMetrics(spark)

# Start collecting metrics
stagemetrics.begin()

# Run a sample workload (count() is an action, so it triggers a Spark job)
spark.range(1000).count()

# Stop collecting metrics
stagemetrics.end()

# Print the report
stagemetrics.print_report()

# Optional: Stop the Spark session
spark.stop()
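
The begin/end/print_report sequence above can be wrapped in a context manager so that collection always stops, even if the workload raises. The following is a minimal sketch, not part of sparkmeasure: the helper name `measured` is hypothetical, and it assumes only that the object passed in exposes the `begin()`, `end()`, and `print_report()` methods used in the quickstart.

```python
from contextlib import contextmanager


@contextmanager
def measured(metrics):
    """Hypothetical helper: collect metrics around a block of work.

    `metrics` is assumed to expose begin(), end(), and print_report(),
    as the StageMetrics object in the quickstart does.
    """
    metrics.begin()
    try:
        yield metrics
    finally:
        # Runs whether or not the workload raised, so collection is
        # always stopped and the report is always printed.
        metrics.end()
        metrics.print_report()


# Possible usage with the objects from the quickstart (needs a live session):
# with measured(StageMetrics(spark)):
#     spark.range(1000).count()
```

Printing the report inside `finally` is a design choice for this sketch; if a report for a failed workload is not useful, the `print_report()` call can be moved after the `with` block instead.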