Loop over rows in pandas
Re-learning pandas in a deeper way

Loop over rows

Suppose you have a dataset in which one column holds a comma-separated list, such as the hobbies column below:

id  name    hobbies
1   virdio  listen to music, reading, sleeping
2   juan    sleeping, fishing, eating
3   dani    reading, ski, gym
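To experiment with the steps that follow, the sample table can be rebuilt in memory. This is only a minimal sketch: the variable name data matches the snippets in this post, and the values are copied from the table above.

```python
import pandas as pd

# Sample table from this post; `data` is the name the later snippets use.
data = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["virdio", "juan", "dani"],
        "hobbies": [
            "listen to music, reading, sleeping",
            "sleeping, fishing, eating",
            "reading, ski, gym",
        ],
    }
)

print(data)
```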

Your goal is to transform that table into a long format with one hobby per row:

id  name    hobbies
1   virdio  listen to music
1   virdio  reading
1   virdio  sleeping
2   juan    sleeping
2   juan    fishing
2   juan    eating
3   dani    reading
3   dani    ski
3   dani    gym

Two approaches you can use

  1. Vectorized operations

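Using .str.split() followed by .apply(func), we can process every value in the column without writing an explicit for loop (.apply still calls the function once per value, but pandas handles the iteration):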

# Split on commas, then strip the whitespace around each item.
data['hobbies'] = data['hobbies'].str.split(',').apply(
    lambda items: [item.strip() for item in items]
)

This is generally the preferred approach: the code is shorter and stays at the column level.
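As a variation on the same idea, the split and the whitespace cleanup can be folded into a single regular-expression split, which avoids the lambda. This is a sketch rather than the method used in this post: it assumes pandas 1.4 or newer (for the regex argument of str.split) and uses a small stand-in Series named hobbies.

```python
import pandas as pd

# Stand-in for data['hobbies'] (illustrative values taken from the sample table).
hobbies = pd.Series(
    [
        "listen to music, reading, sleeping",
        "sleeping, fishing, eating",
        "reading, ski, gym",
    ]
)

# Trim the outer whitespace, then split on a comma plus any spaces around it.
cleaned = hobbies.str.strip().str.split(r"\s*,\s*", regex=True)

print(cleaned.tolist())
```

The trade-off is readability: the regular expression is compact, but the split-then-strip version may be easier to follow for readers who are new to pandas.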

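  2. Looping with a for loop

Alternatively, we can visit each row by its index label and overwrite the cell with .at:

# Cast to object so that each cell can hold a Python list.
data['hobbies'] = data['hobbies'].astype(object)

for idx in data.index:
    # Split the raw string on commas and trim each item.
    items = str(data.at[idx, 'hobbies']).split(',')
    cleaned = [i.strip() for i in items]
    # Write the cleaned list back into the same cell.
    data.at[idx, 'hobbies'] = cleaned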

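This approach is more verbose and usually slower on large tables, because every cell is read and written individually from Python. It is rarely needed when a column-wise method is available.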

Both approaches leave a list in every cell of the hobbies column. The final step is to explode each list into separate rows:

data = data.explode('hobbies')

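This yields one row per hobby, with id and name repeated on each row. Note that explode also repeats the original index labels; call reset_index(drop=True) afterwards if you want a fresh index.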
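Putting the steps together, a minimal end-to-end sketch might look like the following. It rebuilds the sample table, applies the split-and-strip step, and explodes the result. The ignore_index=True argument (available in pandas 1.1 and later) and the name long_data are additions not used above, included so that the output gets a fresh index.

```python
import pandas as pd

# Sample table from this post.
data = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["virdio", "juan", "dani"],
        "hobbies": [
            "listen to music, reading, sleeping",
            "sleeping, fishing, eating",
            "reading, ski, gym",
        ],
    }
)

# Split each string into a list of trimmed hobbies.
data["hobbies"] = data["hobbies"].str.split(",").apply(
    lambda items: [item.strip() for item in items]
)

# One row per hobby; ignore_index=True renumbers the rows from 0.
long_data = data.explode("hobbies", ignore_index=True)

print(long_data)
```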