Pandas helpers
Simple helper methods related to pandas
pd_equals
If you have ever pulled your hair out over None values being converted to np.nan when a dataframe is stored on disk with pandas to_csv, causing issues when comparing two dataframes, then pd_equals is for you. It is meant to compare a stored CSV with a dataframe in memory, and does so by writing the latter to disk and reading it back, so that both go through the same funky conversion.
Of course, the real solution would be to use the converters option of read_csv (see the official documentation), but that can be quite tedious and is frankly overkill for tests.
# Test pd_equals (None vs. np.nan subtlety: the dataframe has to be written to and read back from disk)
df = pd.DataFrame.from_dict({'a': [1], 'b': None})
self.assertIsNone(pd_equals(df, TEST_FILE))
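The round-trip idea described above can be sketched as follows. This is not the actual implementation of pd_equals: the name roundtrip_equals and its signature are hypothetical, and the sketch assumes default to_csv/read_csv settings, no index column, and an in-memory buffer instead of a temporary file on disk.

```python
import io

import pandas as pd
from pandas.testing import assert_frame_equal


def roundtrip_equals(df, csv_path):
    """Hypothetical sketch: compare an in-memory frame with a stored CSV.

    The in-memory frame is serialized to CSV text and parsed again, so it
    goes through the same conversion as the stored file.
    Returns None on success and raises AssertionError on mismatch.
    """
    # Write the in-memory frame as CSV text (assumption: no index column).
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)
    buffer.seek(0)

    # Parse both sides with the same reader so missing values match.
    roundtripped = pd.read_csv(buffer)
    stored = pd.read_csv(csv_path)

    assert_frame_equal(roundtripped, stored)
```

Comparing the in-memory frame directly with DataFrame.equals would fail here, because the column holding None has a different dtype from the one read back from the file.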
jsonify_series
As with pd_equals, it can be useful to convert a pandas Series into something JSON accepts, with np.nan values turned into None.
Simple example:
# Test jsonify_series (None vs. np.nan subtlety)
df = pd.DataFrame.from_dict({'a': [1, 2], 'b': [np.nan, 2]})
self.assertDictEqual(jsonify_series(df['b']), {0: None, 1: 2.0})
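For illustration, a conversion with the same result as the test above could look like the sketch below. It is not the actual implementation of jsonify_series: the name jsonify_series_sketch is hypothetical, and the sketch assumes a Series of scalar values.

```python
import json

import numpy as np
import pandas as pd


def jsonify_series_sketch(series):
    """Hypothetical sketch: return a plain dict with missing values as None."""
    return {
        key: (None if pd.isna(value) else value)
        for key, value in series.items()
    }


series = pd.Series([np.nan, 2])
# Missing values become null in the JSON output instead of NaN.
print(json.dumps(jsonify_series_sketch(series)))
```

Without such a conversion, json.dumps emits NaN by default, which is not valid JSON and is rejected by strict parsers.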