Speeding up SQL Server cross apply to get aggregated data -

- April 15, 2013

in sql server, trying put single query grabs row , includes aggregated data 2 hour window before row aggregated data 1 hour window after. how can make run faster?

the rows have time stamps millisecond precision, , not evenly spaced. have on 50 million rows in table, , query not seem completing. there indexes in many places, don't seem help. thinking using window function, not sure possible have sliding window unevenly distributed rows. also, future 1 hour window, not sure how done sql window.

box string , has 10 unique values. process string , has 30 unique values. average duration_ms 200 ms. errors account less 0.1% of data. 50 million rows describes years worth of data.

select  c1.start_time, c1.end_time, c1.box, c1.process, datediff(ms,c1.start_time,c1.end_time) duration_ms, datepart(dw,c1.start_time) day_of_week, datepart(hour,c1.start_time) hour_of_day, c3.*, c5.* metrics_table c1 cross apply (select      avg(cast(datediff(ms,c2.start_time,c2.end_time) numeric)) avg_ms,     count(1) num_process_total,     count(distinct process) num_process_unique,     count(distinct box) num_box_unique     metrics_table c2     datediff(minute,c2.start_time,c1.start_time) <= 120     , c1.start_time> c2.start_time     , c2.error_code = 0 ) c3 cross apply (select     avg(case when datediff(ms,c4.start_time,c4.end_time)>1000 1.0 else 0.0 end) percent_over_thresh     metrics_table c4     datediff(hour,c1.start_time,c4.start_time) <= 1     , c4.start_time> c1.start_time     , c4.error_code= 0 ) c5 c1.error_code= 0

edit

version: sql azure 12.0

adding execution plan:

the following should step in right direction... note: c2.start_time & c4.start_time no longer wrappen in datediff functions making them sargable...

select     c1.start_time,     c1.end_time,     c1.box,     c1.process,     datediff(ms, c1.start_time, c1.end_time) duration_ms,     datepart(dw, c1.start_time) day_of_week,     datepart(hour, c1.start_time) hour_of_day,     --c3.*,     avg_ms = case when      c5.*     dbo.metrics_table c1     cross apply (                 select                     avg(cast(datediff(ms, c2.start_time, c2.end_time) numeric)) avg_ms,                     count(1) num_process_total,                     count(distinct process) num_process_unique,                     count(distinct box) num_box_unique                                     dbo.metrics_table c2                                     --datediff(minute,c2.start_time,c1.start_time) <= 120                     c2.start_time <= dateadd(minute, -120, c1.start_time)                     --and c1.start_time> c2.start_time                     , c2.error_code = 0                 ) c3     cross apply (                 select                     avg(case when datediff(ms, c4.start_time, c4.end_time) > 1000 1.0 else 0.0 end                     ) percent_over_thresh                                     dbo.metrics_table c4                                     --datediff(hour, c1.start_time, c4.start_time) <= 1                     c4.start_time >= dateadd(hour, 1, c1.start_time)                     --and c4.start_time> c1.start_time                     , c4.error_code = 0                 ) c5     c1.error_code = 0;

of course, making query sargable doesn't unless there's appropriate index available. following should 3 metrics_table references... (see indexes available, there's chance may not need create new index)

create nonclustered index ixf_metricstable_errorcode_starttime on dbo.metrics_table (     error_code,     start_time     ) include (     end_time,     box,     process     )      error_code = 0;

Search This Blog

ANy

Speeding up SQL Server cross apply to get aggregated data -

Comments

Post a Comment

Popular posts from this blog

ZeroMQ on Windows, with Qt Creator -

ios - MKAnnotationView layer is not of expected type: MKLayer -

python - Error while using APScheduler: 'NoneType' object has no attribute 'now' -