sagemaker_data_insights_pandas_dataframe_quality_report.py
This quickstart loads a dataset into a pandas DataFrame and uses
import pandas as pd
from sagemaker_data_insights.insights import DataInsights

# 1. Load your dataset
# For this example, we build a small dummy DataFrame; a CSV loaded with
# pd.read_csv() would work the same way
df = pd.DataFrame({
    'feature_1': [1, 2, 3, 4, 5, None, 7, 8, 9, 10],
    'feature_2': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'C', 'B', 'A'],
    'target': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
})

# 2. Initialize the DataInsights object with your DataFrame
insights = DataInsights(df)

# 3. Generate the insights report
# This performs analysis on data types, missing values, and distributions
report = insights.get_report()

# 4. Display the report in a Jupyter notebook environment
report.display()

# Optional: Save the report to an HTML file for sharing
# report.save_as_html("data_insights_report.html")
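
The comments above say the report covers data types, missing values, and distributions. As a rough cross-check that does not depend on sagemaker_data_insights, those three facets can be approximated with plain pandas. This is only a minimal sketch: the helper name `quick_quality_summary` is hypothetical and not part of any library, and its output is not claimed to match what the report itself produces.

```python
import pandas as pd


def quick_quality_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per column with dtype and missing-value stats."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),        # inferred pandas dtype per column
        "missing": df.isna().sum(),            # count of missing cells
        "missing_frac": df.isna().mean(),      # fraction of missing cells
        "n_unique": df.nunique(dropna=True),   # distinct non-missing values
    })


# Same dummy data as in the snippet above
df = pd.DataFrame({
    "feature_1": [1, 2, 3, 4, 5, None, 7, 8, 9, 10],
    "feature_2": ["A", "B", "A", "C", "B", "A", "A", "C", "B", "A"],
    "target": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

summary = quick_quality_summary(df)
print(summary)

# Distribution of a numeric column (count, mean, std, quartiles)
print(df["feature_1"].describe())

# Distribution of a categorical column (frequency of each label)
print(df["feature_2"].value_counts())
```

The `None` in `feature_1` makes pandas store that column as floating point with one `NaN`, which is why it shows a single missing value in the summary.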