Group a list of Object arrays by timestamp in Java
I have a List<Object[]>, where one of the columns in the Object[] is a LocalDateTime. The other columns are a location (String) and an item price (double).
Basically, my list looks like this:
2017-01-01 02:05:00 newyork 26.89
2017-01-01 02:10:00 newyork 72.00
2017-01-01 02:15:00 newyork 73.10
2017-01-01 02:20:00 newyork 70.11
2017-01-01 02:25:00 newyork 79.90
2017-01-01 02:30:00 newyork 72.33
2017-01-01 02:35:00 newyork 75.69
2017-01-01 02:40:00 newyork 72.12
2017-01-01 02:45:00 newyork 73.09
2017-01-01 02:50:00 newyork 72.67
2017-01-01 02:55:00 newyork 72.56
2017-01-01 03:00:00 newyork 72.76
2017-01-01 02:05:00 boston 26.89
2017-01-01 02:10:00 boston 42.00
2017-01-01 02:15:00 boston 23.10
2017-01-01 02:20:00 boston 77.11
2017-01-01 02:25:00 boston 49.92
2017-01-01 02:30:00 boston 72.63
2017-01-01 02:35:00 boston 73.19
2017-01-01 02:40:00 boston 76.18
2017-01-01 02:45:00 boston 83.59
2017-01-01 02:50:00 boston 76.67
2017-01-01 02:55:00 boston 52.06
2017-01-01 03:00:00 boston 76.06
What I need is a time-weighted average of the price over intervals of 15 minutes, per city. The datetime associated with each interval is the latest one in it. Running the algorithm on the list above should produce a list that looks like this:
01-01-2017 02:15:00 newyork 57.33 (average of 2:05, 2:10, 2:15)
01-01-2017 02:30:00 newyork 74.11 (average of 2:20, 2:25, 2:30)
01-01-2017 02:45:00 newyork 73.63 (...)
01-01-2017 03:00:00 newyork 72.60
01-01-2017 02:15:00 boston 30.66 (average of 2:05, 2:10, 2:15)
01-01-2017 02:30:00 boston 66.55 (average of 2:20, 2:25, 2:30)
01-01-2017 02:45:00 boston 77.65 (...)
01-01-2017 03:00:00 boston 68.26
I'm thinking the first step is to group the records by 15-minute interval and city. The rest is a matter of iterating through the groups and computing the average, which I can figure out on my own.
I have no idea how to go about grouping by LocalDateTime, much less on a 15-minute basis. One last thing to mention: there are missing rows. Some intervals may be empty, in which case I can ignore the interval altogether. Any help is appreciated.
Update 1: I'm assuming there is a better way to group them than sorting them, iterating over each one and comparing timestamps, something like the first answer in this post: How to group objects in a list into other lists by attribute using streams & Java 8?
Update 2: Also, the timestamps are not always 5 minutes apart. They can fall at random times, and some intervals may have 3 or 5 rows in them.
Update 3: This is not a duplicate; the question is about grouping, not rounding down. I understand that rounding down to 15 minutes is one way of doing it, but afterwards I'd still have to keep the real timestamps to perform the time-weighted average. It is not the only way to do this.
Design a class to hold your data instead of the Object arrays. Each object of the class would hold a timestamp, a location and an item price. It may also hold the result of rounding the timestamp to a whole 15 minutes (up, in your case, since each interval is labelled with its latest time); alternatively, give it a method that does the rounding. As I understand it, you can do the rounding yourself (otherwise there's inspiration to be found in this question: round time by seconds).
With such a class you can use a stream, as in the answer you linked to, to group and average. If you prefer, you may start with a stream of your List<Object[]> and map each array to an object before further processing.
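As a rough illustration of the class-based approach, here is a minimal sketch. The class name PriceRow, its accessors, fromArray() and averageByQuarter() are all made up for this example, and the column order {timestamp, location, price} is an assumption about your arrays. It takes a plain (unweighted) average per location and per 15-minute interval, labelling each interval with its end time.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical row class; names and column order are illustrative assumptions.
final class PriceRow {
    private final LocalDateTime timestamp;
    private final String location;
    private final double price;

    PriceRow(LocalDateTime timestamp, String location, double price) {
        this.timestamp = timestamp;
        this.location = location;
        this.price = price;
    }

    String location() { return location; }
    double price() { return price; }

    // Assumes the array layout {LocalDateTime, String, Number}.
    static PriceRow fromArray(Object[] arr) {
        return new PriceRow((LocalDateTime) arr[0], (String) arr[1],
                ((Number) arr[2]).doubleValue());
    }

    // End of the quarter this row falls in: 02:05 -> 02:15, 02:15:00 -> 02:15.
    LocalDateTime intervalEnd() {
        LocalDateTime truncated = timestamp.truncatedTo(ChronoUnit.MINUTES);
        int remainder = truncated.getMinute() % 15;
        if (remainder == 0 && truncated.isEqual(timestamp)) {
            return timestamp; // already exactly on a quarter
        }
        return truncated.plusMinutes(15 - remainder);
    }

    // Plain average of the price, grouped by location and then by interval end.
    // TreeMap keeps both levels sorted by key.
    static Map<String, Map<LocalDateTime, Double>> averageByQuarter(List<PriceRow> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(PriceRow::location, TreeMap::new,
                        Collectors.groupingBy(PriceRow::intervalEnd, TreeMap::new,
                                Collectors.averagingDouble(PriceRow::price))));
    }

    public static void main(String[] args) {
        List<Object[]> raw = Arrays.asList(
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 5), "newyork", 26.89 },
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 10), "newyork", 72.00 },
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 15), "newyork", 73.10 });
        List<PriceRow> rows = raw.stream().map(PriceRow::fromArray).collect(Collectors.toList());
        System.out.println(averageByQuarter(rows));
    }
}
```

Because the rows stay in typed objects, the real timestamps remain available inside each group if you later replace the plain average with a weighted one.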
Edit: I seem to understand that you prefer to do without a class for your rows. Of course that can be done:
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

private static List<Object[]> averageByQuarterOfHour(
        final int indexOfTime, int otherGroupingIndex, int indexToAverage, List<Object[]> myList) {
    return myList.stream()
            // group on the pair (end of quarter, other grouping column)
            .collect(Collectors.groupingBy(
                    arr -> Arrays.asList(
                            roundUpToWholeQuarterOfHour((LocalDateTime) arr[indexOfTime]),
                            arr[otherGroupingIndex])))
            .entrySet()
            .stream()
            // one result row per group: quarter, grouping value, plain average
            .map(e -> new Object[] {
                    e.getKey().get(0),
                    e.getKey().get(1),
                    e.getValue().stream()
                            .map(arr -> (Number) arr[indexToAverage])
                            .mapToDouble(Number::doubleValue)
                            .average()
                            .getAsDouble() })
            .collect(Collectors.toList());
}

static LocalDateTime roundUpToWholeQuarterOfHour(LocalDateTime timeToRound) {
    LocalDateTime truncated = timeToRound.truncatedTo(ChronoUnit.MINUTES);
    int minute = truncated.getMinute();
    if (truncated.isEqual(timeToRound) && minute % 15 == 0) {
        // already on a whole quarter
        return timeToRound;
    }
    int minutesToAdd = 15 - (minute % 15);
    return truncated.plusMinutes(minutesToAdd);
}
Feeding your list into averageByQuarterOfHour() gives:
[2017-01-01T03:00, newyork, 72.66333333333334]
[2017-01-01T02:15, newyork, 57.330000000000005]
[2017-01-01T02:30, newyork, 74.11333333333333]
[2017-01-01T02:45, newyork, 73.63333333333334]
[2017-01-01T02:45, boston, 77.65333333333334]
[2017-01-01T03:00, boston, 68.26333333333334]
[2017-01-01T02:15, boston, 30.663333333333338]
[2017-01-01T02:30, boston, 66.55333333333333]
I leave the sorting to you.
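If it helps, one way the sorting could look is sketched below. The class and method names (ResultSorter, sortByLocationThenTime) are invented for this example, and it assumes result rows shaped like the output above, that is {LocalDateTime, String, Double}. It orders by location alphabetically and then by time, which may not be the order you want.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; assumes rows of the form {LocalDateTime, String, Double}.
final class ResultSorter {

    // Returns a sorted copy: by location (index 1), then by timestamp (index 0).
    static List<Object[]> sortByLocationThenTime(List<Object[]> rows) {
        List<Object[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator
                .comparing((Object[] arr) -> (String) arr[1])
                .thenComparing(arr -> (LocalDateTime) arr[0]));
        return sorted;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
                new Object[] { LocalDateTime.of(2017, 1, 1, 3, 0), "newyork", 72.66 },
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 15), "newyork", 57.33 },
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 15), "boston", 30.66 });
        for (Object[] row : sortByLocationThenTime(rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The input list is left untouched, since the result of Collectors.toList() is not guaranteed to be modifiable.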
You may want to think twice, though. A class for your rows may have the advantage of better modelling, which impacts your entire application, not just the average calculation, though on the other hand it would require you to model each of your entities separately, whatever data each entity holds. More locally, a class, or even just an auxiliary class holding the array and a method for getting the quarter of hour, might still make the above code more readable.
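Note that averageByQuarterOfHour() takes a plain mean, while the question asks for a time-weighted one. The question does not define the weighting, so the following is only a sketch under one assumed reading: each price is weighted by the time elapsed since the previous row in the same interval, and the first row by the time since the interval started. The class and method names are made up for the example. With evenly spaced rows this gives the same result as the plain mean.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; the weighting rule is an assumption, not taken from the question.
final class TimeWeightedAverage {

    // group: all rows of one location whose timestamps lie in
    // (intervalEnd - 15 minutes, intervalEnd]; must not be empty.
    static double timeWeightedAverage(LocalDateTime intervalEnd, List<Object[]> group,
            int indexOfTime, int indexToAverage) {
        List<Object[]> sorted = new ArrayList<>(group);
        sorted.sort(Comparator.comparing((Object[] arr) -> (LocalDateTime) arr[indexOfTime]));

        LocalDateTime previous = intervalEnd.minusMinutes(15); // start of the interval
        double weightedSum = 0;
        double totalWeight = 0;
        for (Object[] arr : sorted) {
            LocalDateTime time = (LocalDateTime) arr[indexOfTime];
            // weight = time elapsed since the previous row (or the interval start)
            double weight = Duration.between(previous, time).toMillis();
            weightedSum += weight * ((Number) arr[indexToAverage]).doubleValue();
            totalWeight += weight;
            previous = time;
        }
        return weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        List<Object[]> group = Arrays.asList(
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 5), "newyork", 10.0 },
                new Object[] { LocalDateTime.of(2017, 1, 1, 2, 15), "newyork", 20.0 });
        System.out.println(timeWeightedAverage(LocalDateTime.of(2017, 1, 1, 2, 15), group, 0, 2));
    }
}
```

This could replace the inner average() call once the groups are formed, because each group still carries the original timestamps.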