
kedro_datasets_csv_save_load_with_datacatalog.py

python

This quickstart demonstrates how to instantiate a dataset, save a pandas

15d ago | 25 lines | docs.kedro.org
import pandas as pd
from kedro.io import DataCatalog
from kedro_datasets.pandas import CSVDataset

# 1. Prepare some data
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# 2. Create a dataset object
# Note: This creates a CSVDataset pointing to 'my_data.csv'
csv_dataset = CSVDataset(filepath="my_data.csv")

# 3. Save the data to the dataset
csv_dataset.save(df)

# 4. Load the data back
loaded_df = csv_dataset.load()

# 5. Using the DataCatalog (the recommended Kedro way)
# The catalog acts as a registry for all your datasets
catalog = DataCatalog({"my_pandas_csv": csv_dataset})

# You can now load and save via the catalog name
catalog_df = catalog.load("my_pandas_csv")

print(catalog_df)
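
The comment in step 5 describes the catalog as a registry that maps names to dataset objects. To make that idea concrete, here is a minimal standard-library sketch of the pattern. It is not Kedro's implementation: `TinyCSVDataset` and `TinyCatalog` are hypothetical stand-ins invented for illustration, and the data is a list of dicts rather than a pandas DataFrame.

```python
import csv
import tempfile
from pathlib import Path


class TinyCSVDataset:
    """Hypothetical stand-in for a CSV dataset: it knows its file path
    and how to save and load rows (a list of dicts)."""

    def __init__(self, filepath):
        self.filepath = Path(filepath)

    def save(self, rows):
        # Write a header from the first row's keys, then every row.
        with self.filepath.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)

    def load(self):
        # csv.DictReader returns every value as a string.
        with self.filepath.open(newline="") as f:
            return list(csv.DictReader(f))


class TinyCatalog:
    """Hypothetical registry that maps dataset names to dataset objects."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def save(self, name, data):
        # Look the dataset up by name and delegate to its save().
        self._datasets[name].save(data)

    def load(self, name):
        # Look the dataset up by name and delegate to its load().
        return self._datasets[name].load()


with tempfile.TemporaryDirectory() as tmp:
    dataset = TinyCSVDataset(Path(tmp) / "my_data.csv")
    catalog = TinyCatalog({"my_pandas_csv": dataset})

    # Callers refer to the data by name only; the file path stays in the dataset.
    catalog.save("my_pandas_csv", [{"col1": 1, "col2": 3}, {"col1": 2, "col2": 4}])
    rows = catalog.load("my_pandas_csv")

print(rows)
```

The point of the design is that calling code only needs the name `"my_pandas_csv"`; where and how the data is stored stays inside the dataset object, so the storage details can change without touching the callers.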