
emr_notebooks_magics_spark_session_quickstart.py

python

Loads the EMR notebook magics extension and initializes a Spark session.

# Run each numbered step in its own notebook cell. Cell magics (%%...) must be
# the first line of their cell, so keep the step comments outside those cells.

# 1. Install the magics package if not already installed
# %pip install emr-notebooks-magics

# 2. Load the extension
%load_ext emr_notebooks_magics

# 3. Connect to an EMR cluster
# Replace <cluster-id> with your actual EMR cluster ID (e.g., j-1234567890ABC)
%connect_to_cluster --cluster_id <cluster-id>

# 4. (Optional) Create a Spark session with specific configurations
%%configure_session
{
    "driverMemory": "4G",
    "executorMemory": "4G",
    "conf": {
        "spark.executor.instances": "2"
    }
}

# 5. Verify the connection by running a simple Spark command
%%spark
df = spark.createDataFrame([(1, "foo"), (2, "bar")], ["id", "label"])
df.show()
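
Step 3 is easy to run with the `<cluster-id>` placeholder still in place. As a minimal sketch, the check below catches that before connecting. The helper name `looks_like_cluster_id` is hypothetical, and the pattern is an assumption inferred only from the example ID in the snippet (`j-` followed by uppercase letters and digits), not from an official format specification.

```python
import re

# Assumption: cluster IDs resemble the snippet's example, "j-1234567890ABC",
# i.e. "j-" followed by uppercase letters and digits. Inferred from the
# example only; adjust the pattern if your IDs differ.
CLUSTER_ID_PATTERN = re.compile(r"j-[A-Z0-9]+")


def looks_like_cluster_id(value: str) -> bool:
    """Return True if value resembles an EMR cluster ID (hypothetical helper)."""
    # fullmatch requires the whole stripped string to fit the pattern.
    return CLUSTER_ID_PATTERN.fullmatch(value.strip()) is not None


# The example ID from the snippet passes; the unreplaced placeholder does not.
print(looks_like_cluster_id("j-1234567890ABC"))
print(looks_like_cluster_id("<cluster-id>"))
```

This runs as plain Python in any cell before step 3 and does not depend on the magics extension being loaded.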