python - Unable to save DataFrame to HDF5 ("object header message is too large")
I have a DataFrame in pandas:
    In [7]: my_df
    Out[7]:
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 34 entries, 0 to 0
    Columns: 2661 entries, airplane to zoo
    dtypes: float64(2659), object(2)
When I try to save it to disk:

    store = pd.HDFStore(p_full_h5)
    store.append('my_df', my_df)
I get:

    File "H5A.c", line 254, in H5Acreate2
      unable to create attribute
    File "H5A.c", line 503, in H5A_create
      unable to create attribute in object header
    File "H5Oattribute.c", line 347, in H5O_attr_create
      unable to create new attribute in header
    File "H5Omessage.c", line 224, in H5O_msg_append_real
      unable to create new message
    File "H5Omessage.c", line 1945, in H5O_msg_alloc
      unable to allocate space for message
    File "H5Oalloc.c", line 1142, in H5O_alloc
      object header message is too large
    End of HDF5 error back trace

    Can't set attribute 'non_index_axes' in node:
    /my_df (Group) u''.
Why?
Note: in case it matters, the DataFrame column names are simple, small strings:
    In [12]: max([len(x) for x in list(my_df.columns)])
    Out[12]: 47
This is pandas 0.11 and the latest stable versions of IPython, Python, and HDF5.
HDF5 has a header limit of 64KB for all of the columns' metadata. This includes names, types, and so on. When you go above roughly 2000 columns, you will run out of space to store all the metadata. This is a fundamental limitation of PyTables, and I don't think they will have a workaround on their side any time soon. You will either have to split the table up or choose another storage format.
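As a minimal sketch of the column-splitting option, assuming the DataFrame is called my_df and a hypothetical chunk size of 500 columns per sub-table (small enough that each table's column metadata stays well under the 64KB header limit):

    import pandas as pd

    CHUNK = 500  # assumed chunk size; anything comfortably below ~2000 columns

    # Write: store each slice of columns under its own key.
    with pd.HDFStore('my_df_split.h5') as store:
        for i in range(0, len(my_df.columns), CHUNK):
            store.append('my_df/chunk_%d' % (i // CHUNK),
                         my_df.iloc[:, i:i + CHUNK])

    # Read: concatenate the chunks back along the column axis,
    # sorting keys numerically so the column order is preserved.
    with pd.HDFStore('my_df_split.h5') as store:
        keys = sorted(store.keys(), key=lambda k: int(k.rsplit('_', 1)[1]))
        restored = pd.concat([store[k] for k in keys], axis=1)

Each sub-table then only carries metadata for its own slice of columns, so no single object header has to hold all 2661 names.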