datacompy_pandas_dataframe_comparison_with_join_key.py
Python. Compares two pandas DataFrames using a common join key and prints a summary report.
import pandas as pd
import datacompy

# Left-hand ("Original") frame: the row with id 16 has column_a == 2
df1 = pd.DataFrame(
    {
        "id": [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        "column_a": [1, 1, 1, 1, 1, 1, 2, 1, 1, 1],
        "column_b": [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    }
)

# Right-hand ("New") frame: identical except column_a is 1 in every row
df2 = pd.DataFrame(
    {
        "id": [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
        "column_a": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        "column_b": [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    }
)

compare = datacompy.Compare(
    df1,
    df2,
    join_columns="id",    # can also be a list, like ["id", "key"]
    abs_tol=0,            # optional, absolute difference tolerance
    rel_tol=0,            # optional, relative difference tolerance
    df1_name="Original",  # optional, name of the left side
    df2_name="New",       # optional, name of the right side
)

# report() returns a human-readable summary string; print it to stdout
print(compare.report())