python - Efficient way to build a data set from FITS images -
I have a set of 32000 FITS images, each with resolution (256, 256). From this dataset I want to build a matrix with output shape (32000, 256*256).

The simple solution is a for loop, something like:

    # file_names is a list of paths
    samples = []
    for file_name in file_names:
        hdu = pyfits.open(file_name)
        samples.append(hdu[0].data.flatten())
        hdu.close()
    # then numpy.vstack(samples) gives the (32000, 256*256) ndarray
    # (numpy.concatenate would merge everything into one 1-D array)

This solution is very, very slow. What is the best way to build such a big data set?
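One common way to reduce the cost of the loop above is to pre-allocate the output matrix once and fill it row by row, instead of accumulating a 32000-element list and copying it again at the end. Below is a minimal sketch of that pattern; `build_dataset` is my own name, and the synthetic zero-image stand-in (noted in the comments) replaces the real `pyfits.open` call only so the sketch runs without FITS files on disk.

```python
import numpy as np

def build_dataset(file_names, image_shape=(256, 256)):
    """Pre-allocate the output matrix and fill it row by row,
    avoiding the per-file list plus the final stacking copy."""
    n = len(file_names)
    out = np.empty((n, image_shape[0] * image_shape[1]), dtype=np.float32)
    for i, file_name in enumerate(file_names):
        # With real FITS files this row would come from:
        #   hdu = pyfits.open(file_name)
        #   out[i] = hdu[0].data.ravel()
        #   hdu.close()
        # Here a synthetic image stands in so the sketch is runnable.
        out[i] = np.zeros(image_shape, dtype=np.float32).ravel()
    return out

# small demo: 4 "images" instead of 32000
data = build_dataset(["a.fits", "b.fits", "c.fits", "d.fits"])
print(data.shape)  # (4, 65536)
```

Writing each flattened image directly into a pre-allocated row also keeps peak memory at one copy of the matrix, which matters at 32000 × 65536 floats.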
This isn't intended as a main answer; it felt too long for a comment, and it is relevant.

I am not a Python expert by any means, but I believe there are a few things you can do without adjusting your code.

Python is a language specification that is implemented in different ways. The traditional implementation is CPython, which you download from the website. However, there are other implementations (see here).

Long story short, try PyPy: it runs faster on "memory-hungry Python" such as yours. Here is a nice reddit post on the advantages of each; my suggestion is to use PyPy and let someone more experienced than me optimize the code itself. Additionally, I have never used numpy, but this post suggests you might be able to keep numpy and still use PyPy.

(Normally I would suggest Cython, but it does not appear to work nicely with numpy at all. I don't know whether Cython has numpy support; you can google that yourself.) Good luck!