python - What is the best way to access values in a DataFrame column?
For example, I have

    df = pd.DataFrame({'a': [1, 2, 3]})
    df[df['a'] == 3].a = 4

This does not assign 4 where the value is 3, but

    df[df['a'] == 3] = 4

works.

I am confused about how assignment works here. I would appreciate any references or an explanation.
You do not want to use the second method. It selects a subslice of the DataFrame (whole rows) and assigns the same value to every column in each of those rows.
For example,
    df
       a  b
    0  1  4
    1  2  3
    2  3  6

    df[df['a'] == 3]
       a  b
    2  3  6

    df[df['a'] == 3] = 3
    df
       a  b
    0  1  4
    1  2  3
    2  3  3
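The effect described above can be reproduced with a small sketch. The two-column frame mirrors the example; the variable name `mask` is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Two-column frame matching the example above.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 3, 6]})

# Boolean mask selecting the rows where column 'a' equals 3.
mask = df['a'] == 3

# Assigning through the mask alone targets whole rows,
# so every column in the matching row receives the value.
df[mask] = 3

print(df)
```

Column `b` in the last row is overwritten along with `a`, which is usually not what was intended.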
The first method does not work because boolean indexing returns a copy of the selected rows. You are assigning to a column (Series) of that copy, so the assignment never reaches df:
    df[df['a'] == 3].a = 4
    /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/core/generic.py:3110: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      self[name] = value
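A minimal sketch of why this fails silently: the assignment lands on a temporary copy, so df itself is unchanged afterwards. The warning is suppressed here only to keep the output quiet; whether a warning is emitted at all may depend on the pandas version.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})

with warnings.catch_warnings():
    # Silence SettingWithCopyWarning (if this pandas version emits it).
    warnings.simplefilter('ignore')
    # df[...] builds a new DataFrame; .a = 4 modifies that temporary object.
    df[df['a'] == 3].a = 4

# The original frame still holds the old values.
print(df)
```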
So your options are .loc (access by name) or .iloc (access by integer position) based indexing:
    df.loc[df.a == 3, 'a'] = 4
    df
       a
    0  1
    1  2
    2  4
If you are passing a boolean mask (a boolean Series such as df.a == 3), you cannot pass it to iloc directly.
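As a sketch of both options: .loc takes the boolean Series and the column label in a single indexing step, so the write goes to df itself. For .iloc, one possible workaround (an assumption on my part, not something the answer above states) is to convert the mask to a plain NumPy array and look up the column position first. The names `mask`, `df2`, `by_label` and `by_position` are illustrative.

```python
import pandas as pd

# Label-based: row mask and column name in one .loc call.
df = pd.DataFrame({'a': [1, 2, 3]})
mask = df['a'] == 3
df.loc[mask, 'a'] = 4
by_label = df['a'].tolist()

# Position-based: .iloc needs a positional column index, and the
# boolean mask is converted to a NumPy array before being passed in.
df2 = pd.DataFrame({'a': [1, 2, 3]})
mask2 = (df2['a'] == 3).to_numpy()
col_pos = df2.columns.get_loc('a')
df2.iloc[mask2, col_pos] = 4
by_position = df2['a'].tolist()

print(by_label, by_position)
```

In most cases .loc is the simpler choice, since it accepts the mask as is.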