pandas data processing

  1. Count non-null values per column: df.count()
  2. Drop rows/columns that contain null values: df.dropna(axis)
  3. Drop rows by index or by condition: df.drop(indexList)

Count non-null values per column: df.count()

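Output (this is the per-column summary that df.info() prints; df.count() itself returns a Series of non-null counts):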

Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   eid     21317 non-null  float64
 1   21-0.0  21251 non-null  float64
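
As a minimal sketch of the difference between the two calls, the snippet below builds a small made-up DataFrame. Only the column names mirror the output above; the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the output above.
df = pd.DataFrame({
    "eid": [1.0, 2.0, 3.0, 4.0],
    "21-0.0": [0.5, np.nan, 1.5, np.nan],
})

# count() returns a Series with the number of non-null values per column.
non_null_per_column = df.count()
print(non_null_per_column)

# axis=1 counts non-null values per row instead.
non_null_per_row = df.count(axis=1)
print(non_null_per_row)

# info() prints the summary table (column, non-null count, dtype).
df.info()
```

df.count() is the better choice when the counts are needed as data, for example to filter columns; df.info() only prints.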

Drop rows/columns that contain null values: df.dropna(axis)

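axis: 0 (the default) drops rows that contain null values; 1 drops columns that contain null values.

subset=['name', 'born']: only the listed columns are checked; a row is dropped if any of them is null.

thresh=2: a row/column is kept only if it has at least 2 non-null values; otherwise it is dropped.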

print(df)           # original DataFrame
print(df.dropna())  # rows containing any NaN removed

Output:

     a    b
0  1.0  NaN
1  NaN  2.0
2  3.0  3.0
3  4.0  4.0

     a    b
2  3.0  3.0
3  4.0  4.0
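
The axis, subset, and thresh parameters described above can be sketched on the same four-row DataFrame. The construction of df is reconstructed from the printed output, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Same values as the printed DataFrame above.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, 2.0, 3.0, 4.0],
})

# axis=0 (default): drop every row that contains a NaN.
rows_dropped = df.dropna()

# axis=1: drop every column that contains a NaN (here both columns do).
cols_dropped = df.dropna(axis=1)

# subset: only column "a" is checked, so row 0 (NaN in "b") survives.
subset_dropped = df.dropna(subset=["a"])

# thresh=2: keep only rows with at least 2 non-null values.
thresh_kept = df.dropna(thresh=2)

print(rows_dropped)
print(cols_dropped.shape)
print(subset_dropped)
print(thresh_kept)
```

Note that dropna returns a new DataFrame and leaves df unchanged, so the result has to be assigned if it is needed later.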

Drop rows by index or by condition: df.drop(indexList)

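df['age'] < 18 returns a boolean mask with one entry per row, e.g. l = [False, True, True, False]

df[l].index returns the index labels of the rows where the mask is True, e.g. indexList = [1, 2]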

# Usage: drop() expects index labels, so pass the .index of the filtered rows
df = df.drop(
    df[df['age'] < 18].index
)
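
The mask, index list, and drop steps can be sketched end to end as follows. The data is hypothetical, chosen so that the mask matches the [False, True, True, False] example in the text; the column "name" and the variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; ages chosen so the mask is [False, True, True, False].
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "age": [30, 12, 15, 42],
})

# Step 1: boolean mask, True where the row should be removed.
mask = df["age"] < 18

# Step 2: index labels of the matching rows.
index_list = df[mask].index

# Step 3: drop those rows by label.
adults = df.drop(index_list)

# Equivalent alternative: keep the rows where the mask is False.
adults_alt = df[~mask]

print(adults)
```

Filtering with df[~mask] gives the same result in one step; drop() is mainly useful when the labels to remove are already available as a list.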

Please credit the source when reposting: https://tianweiye.github.io