python - Infer header names in a pandas DataFrame for numeric types
    id          91  57  60  79  888  111
    06/03/2015   1   2   2   4    1    1
    03/03/2015   1   2   2   2    2    3
    06/04/2015   1   2   2   2    1    1
    17/04/2015   1   3   2   2    1    3
    21/04/2015   3   2   1   1    2    1
    12/05/2015   1   3   2   2    2    3

I have a CSV file with columns of IDs (numeric values), with a value (1-4) assigned to each ID against dates. I would like to have the data in the following format:
    date        score  id
    06/03/2015  1      91
    03/03/2015  1      91
    06/04/2015  1      91
    17/04/2015  1      91
    21/04/2015  3      91
    12/05/2015  1      91
    06/03/2015  2      57
    03/03/2015  2      57
    06/04/2015  2      57
    17/04/2015  3      57
    21/04/2015  2      57
    12/05/2015  3      57
    etc...
My attempt:
My thinking is to start by creating a pandas DataFrame as follows:
    df = pd.read_csv("file.csv", sep=', ', delimiter=None, header='infer')

The problem I am having is that infer does not seem to be able to detect the header names, since the values are numeric?
From here, I was hoping to perform DataFrame operations to get the data into the desired format.
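A minimal sketch of what header inference does with numeric-looking names, assuming the file is whitespace-separated (the real separator in file.csv is not shown, so the in-memory sample below is a hypothetical stand-in). Under that assumption the numeric headers are picked up without trouble, just as strings, while a separator that never occurs in the file leaves everything in a single column:

```python
import io
import pandas as pd

# Hypothetical stand-in for file.csv, assuming whitespace-separated columns.
raw = """id 91 57 60 79 888 111
06/03/2015 1 2 2 4 1 1
03/03/2015 1 2 2 2 2 3
"""

# header='infer' is the default: the first row becomes the column labels.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header="infer")

# Labels are read as strings, even when they look numeric.
print(df.columns.tolist())

# A separator that does not occur in the data (here ', ') yields one wide column.
one_col = pd.read_csv(io.StringIO(raw), sep=", ", engine="python")
print(one_col.shape)
```

If the labels are needed as integers later, they can be converted after reading, as the answer below does with astype(int).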
Use melt, and rename the columns if necessary:
    # \s+ is a whitespace separator; change it if necessary
    df = pd.read_csv("file.csv", sep=r'\s+')

    d = {'id': 'date'}
    cols = ['date', 'score', 'id']
    df = df.rename(columns=d).melt('date', var_name='id', value_name='score')[cols]
    # convert the id column to int
    df['id'] = df['id'].astype(int)
    print(df)

            date  score  id
    0 2015-06-03      1  91
    1 2015-03-03      1  91
    2 2015-06-04      1  91
    3 2015-04-17      1  91
    4 2015-04-21      3  91
    5 2015-12-05      1  91
    6 2015-06-03      2  57
    7 2015-03-03      2  57
    8 2015-06-04      2  57
    ...

But if the first column is the index, it is possible to use unstack:
    df = pd.read_csv("file.csv", sep=r'\s+', index_col=[0], parse_dates=[0])

Then it is possible to convert the columns to int:
    df.columns = df.columns.astype(int)
    cols = ['date', 'score', 'id']
    df = df.unstack().rename_axis(('id', 'date')).reset_index(name='score')[cols]
    print(df)

            date  score  id
    0 2015-06-03      1  91
    1 2015-03-03      1  91
    2 2015-06-04      1  91
    3 2015-04-17      1  91
    4 2015-04-21      3  91
    5 2015-12-05      1  91
    6 2015-06-03      2  57
    ...
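For anyone who wants to try the melt approach without a file on disk, here is a self-contained sketch using the sample data from the question. It assumes the file is whitespace-separated and that the dates are day-first (17/04/2015 can only be 17 April), so it parses them with an explicit format; the names raw, wide and tidy are only illustrative.

```python
import io
import pandas as pd

# Sample data from the question, assumed to be whitespace-separated.
raw = """id 91 57 60 79 888 111
06/03/2015 1 2 2 4 1 1
03/03/2015 1 2 2 2 2 3
06/04/2015 1 2 2 2 1 1
17/04/2015 1 3 2 2 1 3
21/04/2015 3 2 1 1 2 1
12/05/2015 1 3 2 2 2 3
"""

wide = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# The first column is labelled 'id' but holds dates, so rename it before melting.
tidy = (
    wide.rename(columns={"id": "date"})
        .melt(id_vars="date", var_name="id", value_name="score")
        [["date", "score", "id"]]
)

# The former column labels are strings; convert them to integers.
tidy["id"] = tidy["id"].astype(int)

# Assumption: day-first dates, parsed with an explicit format to avoid ambiguity.
tidy["date"] = pd.to_datetime(tidy["date"], format="%d/%m/%Y")

print(tidy.head(7))
```

Giving the format explicitly avoids pandas guessing month-first for ambiguous values such as 06/03/2015; if the dates are in fact month-first, the format string would need to change accordingly.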