python - Filter NaN values in a dataframe column
    y = data.loc[data['column1'] != float('nan'), 'column1']

The code above still returns rows with NaN values in 'column1'. I'm not sure what I'm doing wrong. Please help!
NaN, by definition, is not equal to NaN.

    In [1262]: np.nan == np.nan
    Out[1262]: False

Read about the mathematical concept on Wikipedia.
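To see why the filter in the question keeps every row, here is a minimal sketch. The three-element Series is made-up sample data, not the asker's frame. Since any comparison with NaN using `!=` evaluates to True, the mask from the question selects everything, while a NaN-aware mask does not.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([1.0, np.nan, 4.0], name="column1")

# NaN compares unequal to everything, itself included.
nan_equals_itself = float("nan") == float("nan")

# So `!= NaN` is True for every element and nothing is filtered out.
mask_from_question = s != float("nan")

# A NaN-aware mask is True only for the non-missing values.
mask_notnull = s.notnull()

print(nan_equals_itself)            # False
print(mask_from_question.tolist())  # [True, True, True]
print(mask_notnull.tolist())        # [True, False, True]
```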
Option 1
Using pd.Series.notnull:

    df

       column1
    0      1.0
    1      2.0
    2    345.0
    3      NaN
    4      4.0
    5     10.0
    6      NaN
    7    100.0
    8      NaN

    y = df.loc[df.column1.notnull(), 'column1']
    y

    0      1.0
    1      2.0
    2    345.0
    4      4.0
    5     10.0
    7    100.0
    Name: column1, dtype: float64

Option 2
As MSeifert suggested, use np.isnan:

    y = df.loc[~np.isnan(df.column1), 'column1']
    y

    0      1.0
    1      2.0
    2    345.0
    4      4.0
    5     10.0
    7    100.0
    Name: column1, dtype: float64

Option 3
If it's just one column, call pd.Series.dropna:

    y = df.column1.dropna()
    y

    0      1.0
    1      2.0
    2    345.0
    4      4.0
    5     10.0
    7    100.0
    Name: column1, dtype: float64
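For anyone who wants to run the three options side by side, here is a self-contained sketch. It rebuilds the example frame from the values shown in the answer. The names `y1`, `y2` and `y3` are mine, added only to tell the results apart.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the values shown in the answer.
df = pd.DataFrame(
    {"column1": [1.0, 2.0, 345.0, np.nan, 4.0, 10.0, np.nan, 100.0, np.nan]}
)

y1 = df.loc[df["column1"].notnull(), "column1"]    # Option 1
y2 = df.loc[~np.isnan(df["column1"]), "column1"]   # Option 2
y3 = df["column1"].dropna()                        # Option 3

# Each result keeps the original index labels of the surviving rows.
print(y1.index.tolist())
# Check that the three options produce identical Series.
print(y1.equals(y2) and y1.equals(y3))
```

One practical difference, as far as I know: np.isnan expects a numeric dtype, so on an object column (for example strings mixed with missing values) options 1 and 3 are likely the safer choice.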