python - Pandas: Multiple rolling periods


I want to compute rolling means and standard deviations over multiple rolling periods, for several columns simultaneously.

This is the code I am using with rolling(5):

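import numpy as np

def add_mean_std_cols(df):
    # rolling mean and std of every column, window of 5 rows
    res = df.rolling(5).agg(['mean', 'std'])
    # flatten ('col', 'mean') into 'col_mean'
    res.columns = res.columns.map('_'.join)
    # interleave each original column with its mean and std columns
    cols = np.concatenate(list(zip(df.columns, res.columns[0::2], res.columns[1::2])))
    final = res.join(df).loc[:, cols]
    return final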

I would like to compute the rolling (5), (15), (30), and (45) periods in the same operation.

I thought of iterating over the periods, but I do not know how to avoid getting the rolling mean/std of the rolling mean/std...

I would suggest creating a DataFrame with a MultiIndex as its columns. There's no way around using a loop here to iterate over your windows. The resulting form is something that's easy to index and easy to read with pd.read_csv. Initialize an empty DataFrame with np.empty of the appropriate shape and use .loc to assign its values.

import numpy as np
import pandas as pd
np.random.seed(123)

df = pd.DataFrame(np.random.randn(100,3)).add_prefix('col')

windows = [5, 15, 30, 45]
stats = ['mean', 'std']
cols = pd.MultiIndex.from_product([windows, df.columns, stats],
                                  names=['window', 'feature', 'metric'])

df2 = pd.DataFrame(np.empty((df.shape[0], len(cols))), columns=cols,
                   index=df.index)

# each pass rolls over the original df, never over earlier results
for window in windows:
    df2.loc[:, window] = df.rolling(window=window).agg(stats).values
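As an alternative to preallocating with np.empty, the same layout can probably be reached with pd.concat and its keys argument. This is a minimal sketch of a different technique, not the method used in this answer; the name df2_alt is made up for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(100, 3)).add_prefix('col')

windows = [5, 15, 30, 45]
stats = ['mean', 'std']

# Every rolling call starts from the original df, so no window is
# computed on top of another window's output.
pieces = [df.rolling(window=w).agg(stats) for w in windows]

# keys=windows adds the window size as the outermost column level.
df2_alt = pd.concat(pieces, axis=1, keys=windows)  # df2_alt: hypothetical name
df2_alt.columns.names = ['window', 'feature', 'metric']

print(df2_alt.shape)
```

The list comprehension is still a loop over the windows, so this only changes how the pieces are assembled.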

Now you have a result df2 that has the same index as the original object. It has 3 column levels: the first is the window, the second is the columns from the original frame, and the third is the statistic.

print(df2.shape)
(100, 24)

This makes it easy to check values for a specific rolling window:

print(df2[5])  # rolling window = 5
feature     col0              col1              col2
metric      mean      std     mean      std     mean      std
0            NaN      NaN      NaN      NaN      NaN      NaN
1            NaN      NaN      NaN      NaN      NaN      NaN
2            NaN      NaN      NaN      NaN      NaN      NaN
3            NaN      NaN      NaN      NaN      NaN      NaN
4       -0.87879  1.45348 -0.26559  0.71236  0.53233  0.89430
..           ...      ...      ...      ...      ...      ...
95      -0.44231  1.02552 -1.22138  0.45140 -0.36440  0.95324
96      -0.58638  1.10246 -0.90165  0.79723 -0.44543  1.00166
97      -0.70564  0.85711 -0.42644  1.07174 -0.44766  1.00284
98      -0.95702  1.01302 -0.03705  1.05066  0.16437  1.32341
99      -0.57026  1.10978  0.08730  1.02438  0.39930  1.31240

print(df2[5]['col0'])  # rolling window = 5, stats of col0
metric     mean      std
0           NaN      NaN
1           NaN      NaN
2           NaN      NaN
3           NaN      NaN
4      -0.87879  1.45348
..          ...      ...
95     -0.44231  1.02552
96     -0.58638  1.10246
97     -0.70564  0.85711
98     -0.95702  1.01302
99     -0.57026  1.10978

print(df2.loc[:, (5, slice(None), 'mean')]) # rolling window = 5,
                                            # means of each column
window         5
feature     col0     col1     col2
metric      mean     mean     mean
0            NaN      NaN      NaN
1            NaN      NaN      NaN
2            NaN      NaN      NaN
3            NaN      NaN      NaN
4       -0.87879 -0.26559  0.53233
..           ...      ...      ...
95      -0.44231 -1.22138 -0.36440
96      -0.58638 -0.90165 -0.44543
97      -0.70564 -0.42644 -0.44766
98      -0.95702 -0.03705  0.16437
99      -0.57026  0.08730  0.39930
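The selections above fix the window first. To cut across windows instead, DataFrame.xs and pd.IndexSlice are two options worth trying. The sketch below rebuilds df2 with pd.concat purely to stay short (an assumption, not the construction used above); all_means, col1_std and short_windows are illustrative names.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(100, 3)).add_prefix('col')

windows = [5, 15, 30, 45]
stats = ['mean', 'std']

# Rebuilt with pd.concat for brevity; same three column levels.
df2 = pd.concat([df.rolling(window=w).agg(stats) for w in windows],
                axis=1, keys=windows)
df2.columns.names = ['window', 'feature', 'metric']

# All rolling means, for every window and every original column.
all_means = df2.xs('mean', axis=1, level='metric')

# Rolling std of col1 for every window; the remaining column level
# is the window size.
col1_std = df2.xs(('col1', 'std'), axis=1, level=['feature', 'metric'])

# pd.IndexSlice is a more readable spelling of slice(None).
idx = pd.IndexSlice
short_windows = df2.loc[:, idx[[5, 15], :, 'mean']]
```

xs drops the selected levels by default, which is why col1_std ends up with plain window sizes as its column labels.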

And lastly, to make a single-indexed DataFrame, here's some kludgy use of itertools.

df = pd.DataFrame(np.random.randn(100,3)).add_prefix('col')

import itertools

# Build the names in the same order as the concatenated values below:
# window first, then column, then statistic.
iters = ['_'.join([col, stat, str(win)])
         for win, col, stat in itertools.product(windows, df.columns, stats)]

df2 = [df.rolling(window=window).agg(stats).values for window in windows]
df2 = pd.DataFrame(np.concatenate(df2, axis=1), columns=iters,
                   index=df.index)
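If the MultiIndex frame already exists, another way to get single-level columns is to join the level values of each column tuple. This is a sketch under assumptions: the frame is rebuilt here with pd.concat to keep the example self-contained, and the 'feature_metric_window' naming scheme and the name flat are choices made for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(100, 3)).add_prefix('col')

windows = [5, 15, 30, 45]
stats = ['mean', 'std']

# Rebuilt with pd.concat for brevity; columns are (window, feature, metric).
df2 = pd.concat([df.rolling(window=w).agg(stats) for w in windows],
                axis=1, keys=windows)

# Iterating over MultiIndex columns yields one tuple per column, so the
# flat names are built in exactly the order of the data they label.
flat = df2.copy()
flat.columns = ['{}_{}_{}'.format(feature, metric, window)
                for window, feature, metric in df2.columns]

print(flat.columns[:4].tolist())
```

Deriving the names from the existing columns avoids keeping a separately built list of labels in step with the data.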
