python - Pandas groupby on series with intervals
Here is an example to illustrate the point:

    import numpy as np
    import pandas as pd

    missing_values = -999.0
    level1 = pd._libs.interval.Interval(-np.inf, 1, closed='right')
    level2 = pd._libs.interval.Interval(1, np.inf, closed='right')
    data = pd.DataFrame({'a': [level1, missing_values, level2]})

    >>> data
               a
    0  (-inf, 1]
    1       -999
    2   (1, inf]

When I try data.groupby(['a']).count(), it goes wrong:

    TypeError: unorderable types: Interval() > float()
But if I put -999 in the first row, or use three interval levels, it runs!

    >>> data
               a
    0       -999
    1  (-inf, 1]
    2   (1, inf]

    >>> data.groupby(['a']).count()
    -999.0       1
    (-inf, 1]    1
    (1, inf]     1

    >>> data
               a
    0  (-inf, 1]
    1       -999
    2     (1, 2]
    3   (2, inf]

    >>> data.groupby(['a']).count()
    (-inf, 1]    1
    -999.0       1
    (1, 2]       1
    (2, inf]     1
    Name: a, dtype: int64

Does that mean groupby can sort a mix of Interval and float values? If so, what does the TypeError mean?
I'm not sure groupby works with intervals, but it does work with categories. You can wrap the column in pd.Categorical and then group by that:

    data.groupby(pd.Categorical(data.a)).count()

    (-inf, 1]  1
    -999.0     1
    (1, inf]   1
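To make the TypeError a bit more concrete, here is a minimal sketch. It assumes the public alias pd.Interval (rather than the private pd._libs.interval path used in the question) and shows that ordering an Interval against a plain float is not defined, which is presumably what the key sorting inside groupby runs into. The second half is a hypothetical alternative to the pd.Categorical approach, not something from the original answer: it groups on string labels so that all keys are the same type. The names labels and counts are illustrative only.

```python
import numpy as np
import pandas as pd

missing_value = -999.0
level1 = pd.Interval(-np.inf, 1, closed='right')
level2 = pd.Interval(1, np.inf, closed='right')

# An Interval has no defined ordering relative to a plain float,
# so this comparison raises TypeError.
try:
    level1 > missing_value
    comparison_failed = False
except TypeError:
    comparison_failed = True

data = pd.DataFrame({'a': [level1, missing_value, level2]})

# Hypothetical workaround: convert every key to its string form so the
# group keys are all of one type and can be sorted as text.
labels = data['a'].map(str)

# size() returns the number of rows in each group.
counts = data.groupby(labels).size()
print(comparison_failed)
print(counts)
```

The string keys sort lexicographically, so the group order will not follow the numeric order of the intervals. If the order matters, the pd.Categorical route with explicit categories is probably the better fit.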