python - Iteratively writing to HDF5 Stores in Pandas
Pandas has the following examples of how to store Series, DataFrames and Panels in HDF5 files:
Prepare some data:
In [1142]: store = HDFStore('store.h5')
In [1143]: index = date_range('1/1/2000', periods=8)
In [1144]: s = Series(randn(5), index=['a', 'b', 'c', 'd', 'e'])
In [1145]: df = DataFrame(randn(8, 3), index=index,
   ......:                columns=['a', 'b', 'c'])
   ......:
In [1146]: wp = Panel(randn(2, 5, 4), items=['item1', 'item2'],
   ......:            major_axis=date_range('1/1/2000', periods=5),
   ......:            minor_axis=['a', 'b', 'c', 'd'])
   ......:
Save it in a store:
In [1147]: store['s'] = s
In [1148]: store['df'] = df
In [1149]: store['wp'] = wp
Inspect what's in the store:
In [1150]: store
Out[1150]:
<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
/df            frame        (shape->[8,3])
/s             series       (shape->[5])
/wp            wide         (shape->[2,5,4])
Close the store:
In [1151]: store.close()
Questions:
In the code above, when is the data actually written to disk?
Say I want to add thousands of large DataFrames living in .csv files to a single .h5 file. I would need to load them and add them to the .h5 file one by one, since I cannot afford to have them all in memory at once; they would take too much memory. Is this possible with HDF5? What would be the correct way to do it?
The pandas documentation says the following:
"These stores are not appendable once written (though you can simply remove them and rewrite). Nor are they queryable; they must be retrieved in their entirety."
What does it mean by not appendable nor queryable? Also, shouldn't it say once closed instead of written?
The data is written as soon as the statement is executed, e.g. store['df'] = df. Calling .close() closes the actual file (it will also be closed for you if the process exits, but that prints a warning message).
Read the section http://pandas.pydata.org/pandas-docs/dev/io.html#storing-in-table-format
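To make the write-then-close sequence concrete, here is a minimal sketch. It assumes a recent pandas with PyTables installed; the temporary path and the variable names are made up for illustration. Using the store as a context manager closes the file when the block ends, and the optional flush(fsync=True) call asks the operating system to push its buffers to disk before that.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical location for the example file
path = os.path.join(tempfile.mkdtemp(), "store.h5")
df = pd.DataFrame(np.random.randn(8, 3), columns=["A", "B", "C"])

with pd.HDFStore(path) as store:
    store["df"] = df                  # the node is written by this statement
    keys_after_write = store.keys()   # the new node is already listed
    store.flush(fsync=True)           # optional: sync OS buffers to disk
# leaving the with-block closes the file, no explicit store.close() needed

# Read the node back from the closed file
roundtrip = pd.read_hdf(path, "df")
print(keys_after_write, roundtrip.shape)
```

The context manager is only a convenience here; calling store.close() explicitly, as in the question, has the same effect.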
It is generally not a good idea to put a LOT of nodes in an .h5 file. You probably want to append and create a smaller number of nodes.
You can just iterate through your .csv files and store/append them one by one. Something like:
for f in files:
    df = pd.read_csv(f)
    df.to_hdf('file.h5', key=f)
would be one way (creating a separate node for each file).
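If the goal is a smaller number of nodes, one possible approach is to append every file to a single table node instead. The sketch below assumes a recent pandas with PyTables installed and that all files share the same columns and dtypes; the generated CSV files, the node name "data" and the chunk size are placeholders chosen for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical setup: three small CSV files stand in for the large ones
workdir = tempfile.mkdtemp()
csv_files = []
for i in range(3):
    csv_path = os.path.join(workdir, f"part_{i}.csv")
    part = pd.DataFrame({"a": np.arange(10) + 10 * i,
                         "b": np.random.randn(10)})
    part.to_csv(csv_path, index=False)
    csv_files.append(csv_path)

h5_path = os.path.join(workdir, "file.h5")

with pd.HDFStore(h5_path, mode="w") as store:
    for csv_path in csv_files:
        # chunksize keeps only a slice of each CSV in memory at a time
        for chunk in pd.read_csv(csv_path, chunksize=4):
            # append() writes in table format and extends the existing node
            store.append("data", chunk)

# Read everything back only to check the result; with real data you would
# normally select just a part of the table.
combined = pd.read_hdf(h5_path, "data")
print(len(combined))
```

Reading in chunks is optional when each individual file fits in memory; in that case a plain pd.read_csv(csv_path) followed by store.append works the same way.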
Not appendable: once you write it, you can only retrieve it all at once, e.g. you cannot select a sub-section.
If you have a table, then you can do things like:
pd.read_hdf('my_store.h5', 'a_table_node', where=['index>100'])
which is like a database query, only getting part of the data.
Thus, a store is not appendable, nor queryable, while a table is both.
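The difference between the two formats can be checked directly. This is a small sketch assuming a recent pandas with PyTables installed; the node names and data are invented, and the exact exception types raised for the fixed format may differ between pandas versions, so both ValueError and TypeError are caught.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical file and data for the comparison
path = os.path.join(tempfile.mkdtemp(), "my_store.h5")
df = pd.DataFrame({"x": np.arange(200)})

with pd.HDFStore(path, mode="w") as store:
    store.put("fixed_node", df, format="fixed")    # same format store['key'] = df uses
    store.put("a_table_node", df, format="table")

    # Appending to a fixed-format node is expected to fail
    try:
        store.append("fixed_node", df)
        fixed_appendable = True
    except (ValueError, TypeError):
        fixed_appendable = False

    # Querying a fixed-format node is expected to fail as well
    try:
        store.select("fixed_node", where="index > 100")
        fixed_queryable = True
    except (ValueError, TypeError):
        fixed_queryable = False

    # A table node accepts both operations
    store.append("a_table_node", df)
    subset = store.select("a_table_node", where="index > 100")

print(fixed_appendable, fixed_queryable, len(subset))
```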