Python - format CSV data for a binary classifier
I have CSV data formatted like so:
headers = [artist_list, song_list, lyrics_track, lyrics_artist, lyrics]
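and this snippet:

    import csv

    # newline='' lets the csv module handle line breaks inside quoted lyrics
    with open('lyrics.tsv', newline='') as f:
        reader = csv.reader(f, delimiter="\t")
        for i, line in enumerate(reader):
            print('line[{}] = {}'.format(i, line))
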
prints:
(...) line[808] = ['pearl jam', 'wishlist', 'wishlist', 'pearl jam', "i wish neutron bomb\nfor once go off\ni wish sacrifice\nbut somehow still lived on\ni wish sentimental\nornament hung on\nthe christmas tree, wish was\nthe star went on top\ni wish evidence\ni wish grounds\nfor fifty million hands upraised , opened toward sky\ni wish sailor with\nsomeone waited me\ni wish fortunate\nas fortunate me\ni wish messenger\nand news good\ni wish full moon shining\noff camaro's hood\ni wish alien\nat home behind sun\ni wish souvenir\nyou kept house key on\ni wish pedal break\nthat depended on\ni wish verb trust\nand never let down\ni wish radio song\nthe 1 turned up\ni wish ..."]
Now I want to use this data for classification, keeping only the lyrics lines and adding a column with a binary value (always the same, 0), so that the data is transformed into:
lyrics                                     type
(...)                                      (...)
wish neutron bomb\nfor once go off..       0
How can I do this, starting from the code above?
I think this might work (assuming the data is in a DataFrame named lyrics_df):
lyrics_df['type'] = 0
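
If the data is not in a DataFrame yet, something similar can probably be done with the csv module alone, starting from the reader in the question. The sketch below is only an illustration under a few assumptions: the file has a header row with the column names listed in the question, the helper name label_lyrics is made up, and the in-memory SAMPLE_TSV (including its invented second row) merely stands in for lyrics.tsv.

```python
import csv
import io

# Hypothetical two-row sample standing in for lyrics.tsv. The header names
# come from the question; the second data row is invented for illustration.
SAMPLE_TSV = (
    "artist_list\tsong_list\tlyrics_track\tlyrics_artist\tlyrics\n"
    'pearl jam\twishlist\twishlist\tpearl jam\t"i wish neutron bomb\nfor once go off"\n'
    'some artist\tsome song\tsome song\tsome artist\t"first line\nsecond line"\n'
)


def label_lyrics(tsv_file, label=0):
    """Return (lyrics, label) pairs, keeping only the lyrics column.

    Assumes the first row of the file is a header that contains 'lyrics'.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [(row["lyrics"], label) for row in reader]


# newline="" keeps the line breaks inside quoted lyrics untouched
rows = label_lyrics(io.StringIO(SAMPLE_TSV, newline=""))
for lyrics, label in rows:
    print(repr(lyrics), label)
```

With the real file, the same helper could be called as label_lyrics(f) inside a with open('lyrics.tsv', newline='') as f: block. If the file turns out to have no header row, csv.reader with line[4] for the lyrics column would be the closer fit.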