@浙大疏锦行 Python Training Camp Day 4
Contents: handling tabular data with pandas:
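- View summary statistics for the table:
  - data.mean()
  - data.mode()
  - data.median()
- View table information:
  - data.info()
  - data.describe()
  - data.isnull()
  - data.head()
- Fill missing values in a column:
  - Numeric columns: fill with mean() or mode()
  - String columns (object): fill with a constant such as "" or with mode(). Note that mode() returns a Series rather than a single value, so take an element by index (for example mode()[0]) to get a scalar to fill with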
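As a quick illustration of the calls listed above, here is a minimal sketch on a small made-up DataFrame. The column names and values are hypothetical and are not taken from credit_data.csv. It also shows why the mode needs indexing: mode() returns a Series because several values can tie for most frequent. The numeric_only=True argument is used for the table-wide mean on the assumption that the table contains string columns, which recent pandas versions refuse to average.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only
df = pd.DataFrame({
    "income": [10.0, 20.0, np.nan, 20.0],
    "purpose": ["car", None, "car", "home"],
})

# Overview of the table
df.info()                          # prints dtypes and non-null counts, returns None
print(df.head())                   # first rows
print(df.describe())               # summary statistics for numeric columns
null_counts = df.isnull().sum()    # number of missing values per column
print(null_counts)

# Table-wide mean, restricted to numeric columns
col_means = df.mean(numeric_only=True)

# Statistics on a single column; NaN is skipped by default
income_mean = df["income"].mean()
income_median = df["income"].median()
income_mode = df["income"].mode()        # a Series, not a scalar
purpose_mode = df["purpose"].mode()[0]   # first entry of the mode Series

print(income_mean, income_median, income_mode[0], purpose_mode)
```

Indexing with [0] picks the first of the tied modes, which is a simple but arbitrary choice when there is more than one.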
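Code:
# Handling missing values
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv('./data/credit_data.csv')
print(f"data shape: {data.shape}")
print(f"data head: {data.head()}")
data.info()  # info() prints its report itself and returns None, so it is not wrapped in print()
print(f"null mask: {data.isnull()}")

print(data.isnull().sum())  # missing-value count per column before filling
data_columns = data.columns.to_list()
for column in data_columns:
    if data[column].dtype != 'object':
        # Numeric column: fill with the mean.
        # Assign the result back instead of calling fillna(inplace=True) on a
        # column selection, which newer pandas versions warn about and which
        # may not update data at all.
        data[column] = data[column].fillna(data[column].mean())
    elif data[column].isnull().sum() > 0:
        # String (object) column with missing values: fill with the mode
        print(column)
        data[column] = data[column].fillna(data[column].mode()[0])
print(data.isnull().sum())  # missing-value count per column after filling

# data.head()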
# data.info()
# data.describe()
# data.isnull()
#
# data.mode()
# data.mean()
# data.median()
#
# print(data.dtypes, data.columns)
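The filling loop above could also be wrapped in a small reusable helper. This is a sketch under stated assumptions, not the course's reference solution: the function name fill_missing and the toy data are hypothetical, numeric columns are detected with pandas.api.types.is_numeric_dtype instead of comparing the dtype against 'object', and the input DataFrame is copied rather than modified.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def fill_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with NaNs filled: mean for numeric columns, mode otherwise."""
    out = df.copy()
    for column in out.columns:
        if not out[column].isnull().any():
            continue  # nothing to fill in this column
        if is_numeric_dtype(out[column]):
            fill_value = out[column].mean()
        else:
            modes = out[column].mode()
            if modes.empty:
                continue  # column is entirely missing; leave it untouched
            fill_value = modes[0]
        # Assign back to the column instead of using inplace=True
        out[column] = out[column].fillna(fill_value)
    return out


# Hypothetical toy data, for illustration only
toy = pd.DataFrame({
    "income": [10.0, np.nan, 30.0],
    "purpose": ["car", "car", None],
})
filled = fill_missing(toy)
print(filled)
```

Returning a copy keeps the original table available for comparison, for example to check how many values were filled per column.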