sql - Calculate weighted average for each day and id based on time intervals in PostgreSQL

September 15, 2015

I have a table in a PostgreSQL database that looks like this:

    stid | e5   | e10  | diesel | date
    -----+------+------+--------+------------------------
    e850 | 1300 | 1400 | 1500   | 2016-05-02 05:30:01+02
    e850 | 1400 | 1500 | 1700   | 2016-05-02 08:30:01+02
    e850 | 1300 | 1400 | 1500   | 2016-05-02 21:00:01+02
    e850 | 1200 | 1300 | 1350   | 2016-05-03 10:30:01+02
    e850 | 1300 | 1400 | 1500   | 2016-05-03 21:00:01+02
    954d | 1200 | 1100 | 1300   | 2016-05-02 03:30:01+02
    954d | 1300 | 1100 | 1300   | 2016-05-02 15:00:01+02
    954d | 1400 | 1800 | 1400   | 2016-05-02 22:30:01+02
    954d | 1700 | 1900 | 1400   | 2016-05-03 09:30:01+02
    954d | 1500 | 1900 | 1200   | 2016-05-03 23:30:01+02

So I have unique IDs (stid), prices (e5, e10, diesel), and a timestamp (date) that indicates when the price was introduced. I want to calculate the average price per day and stid, weighted by the duration the price was charged. And I only want to take the period between 8 am and 8 pm into account.

To calculate the weighted average price of e5 for stid e850 and date 2016-05-02 between 8 am and 8 pm, I would do the following:

    (1300 * 1800 + 1400 * 41400) / 43200 = 1395.83333

1300 is the price set at 5:30, and 1800 is the duration in seconds between 8 and 8:30 am. 1400 is the price set at 8:30, and 41400 is the duration in seconds between 8:30 am and 8 pm.
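The arithmetic above can be checked directly in PostgreSQL. This is only a minimal sketch: the inline VALUES list and the column names price and seconds are made up for illustration and are not part of the original table. Note the cast to numeric; without it, dividing two integer sums would truncate the result to a whole number.

```sql
-- Weighted average = sum(price * seconds) / sum(seconds).
-- The two rows are the e5 prices of stid e850 on 2016-05-02, clipped to 8 am .. 8 pm.
SELECT sum(price * seconds)::numeric / sum(seconds) AS average_e5
FROM (VALUES (1300, 1800),    -- 08:00 .. 08:30
             (1400, 41400)    -- 08:30 .. 20:00
     ) AS v(price, seconds);
-- approximately 1395.83333
```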

In the end I would like to have a table that looks like this:

    stid | date       | average_e5 | average_e10 | average_diesel
    -----+------------+------------+-------------+---------------
    e850 | 2016-05-02 | 1395.83333 | 1495.83333  | 1691.66667
    e850 | 2016-05-03 | 1220.83333 | 1320.83333  | 1381.25
    954d | 2016-05-02 | 1241.66667 | 1100        | 1300
    954d | 2016-05-03 | 1662.5     | 1887.5      | 1400

What I was able to achieve so far is to calculate the duration each price was charged, using this query:

    select  stid,
            e5,
            e10,
            diesel,
            date,
            nextdate,
            extract(epoch from (nextdate - date))
    from (
        select  stid,
                e5,
                e10,
                diesel,
                date,
                -- timestamp of the next price change for the same station
                (   select  min(date)
                    from    mytable t2
                    where   t2.stid = t1.stid
                    and     t2.date > t1.date
                ) as nextdate
        from    mytable t1
    ) t;
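For comparison, the same durations can be sketched with the lead() window function instead of a correlated subquery, which typically lets PostgreSQL compute the next timestamp in one sorted pass per stid rather than one lookup per row. The alias duration_seconds and the window name w are illustrative choices, not taken from the original query. The two forms agree as long as a stid never has two rows with the same timestamp.

```sql
-- lead(date) returns the date of the following row within the same stid,
-- or NULL for the most recent price of each station.
SELECT stid, e5, e10, diesel, date,
       lead(date) OVER w AS nextdate,
       extract(epoch FROM lead(date) OVER w - date) AS duration_seconds
FROM mytable
WINDOW w AS (PARTITION BY stid ORDER BY date);
```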

The table and query can be found on rextester.

But I'm having trouble figuring out a way to calculate the weighted average with the time restriction. I thought about adding dummy prices at 8 am and 8 pm before calculating the durations, but I don't know how to do this.

Since the dataset is pretty big, I would like to do this in a time-efficient way.

I did not need a CTE, but it makes this more readable:

    t=# with a as (
      select *
      , case
          -- clamp every timestamp into the 08:00 .. 20:00 window of its own day
          when date < date_trunc('day', date) + '8 hours'::interval then date_trunc('day', date) + '8 hours'::interval
          when date > date_trunc('day', date) + '20 hours'::interval then date_trunc('day', date) + '20 hours'::interval
          else date
        end d
      , date_trunc('day', date) dt
      from mytable
    )
    , b as (
      select stid, e5, e10, diesel, date, d, dt
      -- seconds until the next (clamped) price change on the same day
      , extract(epoch from lead(d) over (partition by stid, dt order by stid, d) - d) diff
      from a
    )
    select stid, e5, date, d, diff
    , sum(e5*diff*1.0) over (partition by stid, dt) / sum(diff) over (partition by stid, dt) e5_weight_avg
    from b
    order by stid desc, date;
     stid |   e5    |        date         |          d          | diff  |  e5_weight_avg
    ------+---------+---------------------+---------------------+-------+------------------
     e850 | 1300.00 | 2016-05-02 05:30:01 | 2016-05-02 08:00:00 |  1801 | 1395.83101851852
     e850 | 1400.00 | 2016-05-02 08:30:01 | 2016-05-02 08:30:01 | 41399 | 1395.83101851852
     e850 | 1300.00 | 2016-05-02 21:00:01 | 2016-05-02 20:00:00 |       | 1395.83101851852
     e850 | 1200.00 | 2016-05-03 10:30:01 | 2016-05-03 10:30:01 | 34199 |             1200
     e850 | 1300.00 | 2016-05-03 21:00:01 | 2016-05-03 20:00:00 |       |             1200
     954d | 1200.00 | 2016-05-02 03:30:01 | 2016-05-02 08:00:00 | 25201 | 1241.66435185185
     954d | 1300.00 | 2016-05-02 15:00:01 | 2016-05-02 15:00:01 | 17999 | 1241.66435185185
     954d | 1400.00 | 2016-05-02 22:30:01 | 2016-05-02 20:00:00 |       | 1241.66435185185
     954d | 1700.00 | 2016-05-03 09:30:01 | 2016-05-03 09:30:01 | 37799 |             1700
     954d | 1500.00 | 2016-05-03 23:30:01 | 2016-05-03 20:00:00 |       |             1700
    (10 rows)

And thus, skipping the middle steps:

    t=# with a as (
      select *
      , case
          when date < date_trunc('day', date) + '8 hours'::interval then date_trunc('day', date) + '8 hours'::interval
          when date > date_trunc('day', date) + '20 hours'::interval then date_trunc('day', date) + '20 hours'::interval
          else date
        end d
      , date_trunc('day', date) dt
      from mytable
    )
    , b as (
      select stid, e5, e10, diesel, date, d, dt
      , extract(epoch from lead(d) over (partition by stid, dt order by stid, d) - d) diff
      from a
    )
    select distinct stid, dt
    , sum(e5*diff*1.0) over (partition by stid, dt) / sum(diff) over (partition by stid, dt) e5_weight_avg
    from b
    order by stid desc, dt;
     stid |         dt          |  e5_weight_avg
    ------+---------------------+------------------
     e850 | 2016-05-02 00:00:00 | 1395.83101851852
     e850 | 2016-05-03 00:00:00 |             1200
     954d | 2016-05-02 00:00:00 | 1241.66435185185
     954d | 2016-05-03 00:00:00 |             1700
    (4 rows)
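Note that the values for 2016-05-03 in the output above (1200 and 1700) differ from the 1220.83333 and 1662.5 expected in the question. Each day is partitioned on its own, so the price still in force from the previous evening is not counted between 8 am and the first change of the day. The small differences on 2016-05-02 come from the :01 seconds in the sample timestamps.

One possible way to include the carried-over price, and all three fuels, is sketched below. It rests on assumptions not stated in the original: a price stays valid until the next row of the same stid, and the latest price of a station stays valid indefinitely. The names intervals, days, clipped, valid_from, valid_to, day_start and secs are illustrative only.

```sql
WITH intervals AS (
    -- validity period of every price row: from its own date to the next change
    SELECT stid, e5, e10, diesel,
           date AS valid_from,
           coalesce(lead(date) OVER (PARTITION BY stid ORDER BY date),
                    'infinity') AS valid_to
    FROM mytable
),
days AS (
    -- one row per calendar day covered by the data
    SELECT generate_series(date_trunc('day', min(date)),
                           date_trunc('day', max(date)),
                           interval '1 day') AS day_start
    FROM mytable
),
clipped AS (
    -- overlap in seconds between each validity period and the 08:00 .. 20:00 window
    SELECT i.stid, d.day_start, i.e5, i.e10, i.diesel,
           extract(epoch FROM
                 least(i.valid_to, d.day_start + interval '20 hours')
               - greatest(i.valid_from, d.day_start + interval '8 hours')) AS secs
    FROM intervals i
    JOIN days d
      ON i.valid_from < d.day_start + interval '20 hours'
     AND i.valid_to   > d.day_start + interval '8 hours'
)
SELECT stid,
       day_start::date               AS date,
       sum(e5 * secs) / sum(secs)     AS average_e5,
       sum(e10 * secs) / sum(secs)    AS average_e10,
       sum(diesel * secs) / sum(secs) AS average_diesel
FROM clipped
GROUP BY stid, day_start
ORDER BY stid DESC, day_start;
```

The range join between intervals and days may become expensive on a large table, so it would be worth checking the plan with EXPLAIN ANALYZE and restricting days to the period of interest before relying on it.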
