Handling missing data

python
pandas
numpy
Examples of dealing with missing data
Author

Youfeng Zhou

Published

October 30, 2022

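pandas provides four main methods for handling NA (missing) values: dropna, fillna, isnull, and notnull. isna and notna are aliases of isnull and notnull, and fillna can be complemented by forward filling (ffill) or backward filling (bfill), which propagate neighboring valid values into the gaps.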

flowchart LR
  A([Handling NA]) --> B(dropna)
  A --> C(fillna)
  A --> D(isnull)
  A --> E(notnull)
  C --> F[ffill]
  C --> G[bfill]
import pandas as pd
import numpy as np
string_data = pd.Series(['apple', 'orange', np.nan, 'avocado'])
string_data
0      apple
1     orange
2        NaN
3    avocado
dtype: object
string_data.isna()
0    False
1    False
2     True
3    False
dtype: bool
string_data.isnull()
0    False
1    False
2     True
3    False
dtype: bool
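The two outputs above are identical because isnull is an alias of isna (and notnull of notna). A minimal sketch of how that could be checked programmatically follows; the variable names `check_data`, `same_mask`, and `inverse_mask` are made up for this illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Same values as string_data at this point in the post; a separate
# (hypothetical) name is used so the notebook's variable is untouched.
check_data = pd.Series(['apple', 'orange', np.nan, 'avocado'])

# isna() and isnull() should produce the same boolean mask.
same_mask = check_data.isna().equals(check_data.isnull())

# notna() should be the element-wise negation of isnull().
inverse_mask = check_data.notna().equals(~check_data.isnull())

print(same_mask, inverse_mask)
```

Which spelling to use is a matter of style; picking one pair (for example isna/notna) and using it consistently keeps code easier to read.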
string_data[0] = None  # Python's None is also treated as NA in an object Series
string_data.isnull()
0     True
1    False
2     True
3    False
dtype: bool
string_data.notnull()
0    False
1     True
2    False
3     True
dtype: bool
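The examples above cover the detection methods only. As a rough sketch of the remaining branches of the flowchart (dropna, fillna, ffill, bfill), the snippet below uses a small numeric Series. The data and the variable names are assumptions chosen for illustration, not taken from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data with two gaps (not from the post above).
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

# dropna: remove missing entries; the surviving rows keep their index labels.
dropped = s.dropna()

# fillna: replace every missing entry with a constant.
filled_zero = s.fillna(0)

# ffill: carry the last valid value forward into each gap.
forward = s.ffill()

# bfill: pull the next valid value backward into each gap.
backward = s.bfill()

print(dropped.tolist())      # [1.0, 3.0, 5.0]
print(filled_zero.tolist())  # [1.0, 0.0, 3.0, 0.0, 5.0]
print(forward.tolist())      # [1.0, 1.0, 3.0, 3.0, 5.0]
print(backward.tolist())     # [1.0, 3.0, 3.0, 5.0, 5.0]
```

None of these calls modify `s` in place; each returns a new Series. After dropna the index is no longer contiguous (0, 2, 4 here), so `reset_index(drop=True)` may be useful if positional labels are needed afterwards.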