python - Filter NaN values in a dataframe column
y = data.loc[data['column1'] != float('nan'),'column1']
The code above still returns rows with NaN values in 'column1'. I'm not sure what I'm doing wrong. Please help!
NaN is, by definition, not equal to NaN.
In [1262]: np.nan == np.nan
Out[1262]: False
You can read more about this mathematical concept on Wikipedia.
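To see why the filter from the question keeps every row, here is a minimal sketch. The sample values are made up for illustration; only the `!=` comparison is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original question
df = pd.DataFrame({'column1': [1.0, 2.0, np.nan, 4.0]})

# NaN compares unequal to everything, including itself,
# so every element of this mask is True
mask = df['column1'] != float('nan')
print(mask.all())      # True

# Because the mask is all True, nothing is filtered out
y = df.loc[mask, 'column1']
print(y.isna().sum())  # 1 (the NaN row is still present)
```

In other words, the comparison is valid code, it just never evaluates to False, so it cannot serve as a NaN filter.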
Option 1
Using pd.Series.notnull:

df

   column1
0      1.0
1      2.0
2    345.0
3      NaN
4      4.0
5     10.0
6      NaN
7    100.0
8      NaN

y = df.loc[df.column1.notnull(), 'column1']
y

0      1.0
1      2.0
2    345.0
4      4.0
5     10.0
7    100.0
Name: column1, dtype: float64
Option 2
As mseifert suggested, use np.isnan:

y = df.loc[~np.isnan(df.column1), 'column1']
y

0      1.0
1      2.0
2    345.0
4      4.0
5     10.0
7    100.0
Name: column1, dtype: float64
Option 3
If it's only one column, call pd.Series.dropna:

y = df.column1.dropna()
y

0      1.0
1      2.0
2    345.0
4      4.0
5     10.0
7    100.0
Name: column1, dtype: float64
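As a quick sanity check, the three options can be run side by side. This is only a sketch: it rebuilds the example frame from the values shown above, and the names `y1`, `y2`, and `y3` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the values shown in the answer
df = pd.DataFrame(
    {'column1': [1.0, 2.0, 345.0, np.nan, 4.0, 10.0, np.nan, 100.0, np.nan]}
)

y1 = df.loc[df.column1.notnull(), 'column1']   # Option 1
y2 = df.loc[~np.isnan(df.column1), 'column1']  # Option 2
y3 = df.column1.dropna()                       # Option 3

# All three keep the same rows, values, and original index labels
print(y1.equals(y2) and y2.equals(y3))  # True
print(y1.index.tolist())                # [0, 1, 2, 4, 5, 7]
```

Note that the original index labels are preserved in every case, so the result skips 3, 6, and 8 rather than being renumbered.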