While working with Pandas dataframes, I encountered a situation where I needed to locate the row indexes that satisfy a certain condition so that I could later drop them from the dataframe.

A True/False series covering the entire index of a dataframe can be built in various ways depending on your needs, but for this example I will use a True/False series that marks whether each row contains a NaN value.

Here’s the trick to do it.

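# Boolean Series: True for every row that contains at least one NaN
isna_bool_list = data.isna().any(axis=1)

# Index labels of the flagged rows, kept as a plain Python list
pop_row_index_list = data.index[isna_bool_list].tolist()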
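
To make the two steps concrete, here is a minimal self-contained sketch. The tiny `data` frame and its column names `a` and `b` are made up for illustration; they are not the dataset used in this post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the dataset from this post)
data = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, 4.0],
        "b": [5.0, 6.0, np.nan, 8.0],
    }
)

# True for each row that has at least one NaN
isna_bool_list = data.isna().any(axis=1)

# Index labels of those rows
pop_row_index_list = data.index[isna_bool_list].tolist()
print(pop_row_index_list)  # [1, 2]
```

Rows 1 and 2 each hold one NaN, so those are the labels that come back.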

Later on, if I wanted to drop the rows with these indices, then I could do something like this:

# drop() expects index labels, so pass the collected labels directly
data4 = data.drop(index=pop_row_index_list)
print(data4.shape)
print(data.shape)

>> output
(682, 9)
(768, 9)
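
Because `drop` works on index labels, the labels collected earlier can be handed to it as they are, and this also holds when the index is not the default 0..n-1. A small sketch, assuming a hypothetical frame with string labels `x`, `y`, `z`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels instead of 0..n-1
data = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]},
    index=["x", "y", "z"],
)

# Labels of the rows that contain at least one NaN
pop_row_index_list = data.index[data.isna().any(axis=1)].tolist()

# drop() returns a new frame; the original is left unchanged
data4 = data.drop(index=pop_row_index_list)

print(pop_row_index_list)  # ['y', 'z']
print(data4.shape)         # (1, 2)
print(data.shape)          # (3, 2)
```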

There is another way that uses NumPy functions, but it's messier, so I recommend the approach above.
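
For comparison, a NumPy-based version might look like the sketch below. This is only one possible form, assuming `np.flatnonzero` applied to the boolean mask; the toy frame is again hypothetical. NumPy returns integer positions rather than index labels, so an extra lookup is needed before dropping.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels
data = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]},
    index=["x", "y", "z"],
)

# Boolean mask as a plain NumPy array
mask = data.isna().any(axis=1).to_numpy()

# Integer positions of the True entries
positions = np.flatnonzero(mask)

# Translate positions back into index labels before dropping
labels = data.index[positions].tolist()
data4 = data.drop(index=labels)

print(positions.tolist())  # [1, 2]
print(labels)              # ['y', 'z']
```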
