Append data to HDF5 file with Pandas, Python
I have large pandas DataFrames of financial data. I have no problem appending and concatenating additional columns and DataFrames to my .h5 file.
The financial data is being updated every minute, and I need to append a row of data to the existing tables inside the .h5 file every minute.
Here is what I have tried so far, but no matter what I do, it overwrites the .h5 file and does not append the data.
The HDFStore way:
    # we open the HDF5 file
    save_hdf = HDFStore('test.h5')
    ohlcv_candle.to_hdf('test.h5')
    # we give the dataframe a key value
    # format=table so we can append data
    save_hdf.put('name_of_frame', ohlcv_candle, format='table', data_columns=True)
    # we print our dataframe by calling the hdf file with the key
    # just doing this as a test
    print(save_hdf['name_of_frame'])
The other way I have tried it, with to_hdf:
    # format=t so we can append data, mode=r+ to specify that the file exists and
    # we want to append to it
    tohlcv_candle.to_hdf('test.h5', key='this_is_a_key', mode='r+', format='t')
    # again printing to check if it worked
    print(pd.read_hdf('test.h5', key='this_is_a_key'))
Here is what one of the DataFrames looks like after being read with read_hdf:
             time     open     high      low    close    volume           pp
    0  1505305260  3137.89  3147.15  3121.17  3146.94  6.205397  3138.420000
    1  1505305320  3146.86  3159.99  3130.00  3159.88  8.935962  3149.956667
    2  1505305380  3159.96  3160.00  3159.37  3159.66  4.524017  3159.676667
    3  1505305440  3159.66  3175.51  3151.08  3175.51  8.717610  3167.366667
    4  1505305500  3175.25  3175.53  3170.44  3175.53  3.187453  3173.833333
The next time I get data (every minute), I would like a row of it added at index 5 of all my columns, then 6, 7, and so on, without having to read and manipulate the entire file in memory, as that would defeat the point of doing this. If there is a better way of solving this, do not be shy to recommend it.
P.S. Sorry for the formatting of the table in here.
Try this:
    store = pd.HDFStore('test.h5')
    store.append('name_of_frame', ohlcv_candle, format='t', data_columns=True)
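For context, `put` replaces whatever is stored under the key, while `append` adds rows to an existing table on disk. Here is a minimal, self-contained sketch of the append approach. The sample frames, the reduced column set, and the temporary file path are made up for illustration and stand in for the asker's `ohlcv_candle` data; it also assumes the PyTables package is installed, which pandas needs for HDF5 support.

```python
import os
import tempfile

import pandas as pd  # HDF5 support also requires the PyTables package ("tables")

# Hypothetical sample data standing in for the asker's ohlcv_candle frame.
history = pd.DataFrame(
    {
        "time": [1505305260, 1505305320],
        "open": [3137.89, 3146.86],
        "close": [3146.94, 3159.88],
    }
)
new_row = pd.DataFrame(
    {"time": [1505305380], "open": [3159.96], "close": [3159.66]},
    index=[len(history)],  # continue the integer index (here: 2)
)

# Temporary file so the sketch does not touch a real test.h5.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

# Mode "a" opens the file without truncating it.
with pd.HDFStore(path, mode="a") as store:
    # The first append creates the table; later appends add rows on disk.
    store.append("name_of_frame", history, format="table", data_columns=True)
    store.append("name_of_frame", new_row, format="table", data_columns=True)

    total_rows = store.get_storer("name_of_frame").nrows
    # Read back only the last row instead of loading the whole table.
    last_row = store.select("name_of_frame", start=total_rows - 1)

print(total_rows)
print(last_row)
```

Each appended frame needs the same columns and dtypes as the table that already exists, and the index is not renumbered automatically, which is why the new row sets its own index here. `DataFrame.to_hdf` should be able to do the same thing when called with `append=True` and `format='table'`.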