Beginner

Handling outliers and missing values in revolUtil

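Beginner reference: the complete example code below is for reference; it is best to understand it first and then retype it yourself.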
def solve():
    from io import StringIO
    import pandas as pd
    from pyodide.http import open_url

    # Download the raw loans CSV and load it into a DataFrame.
    loans_raw_csv = open_url("https://data.zuihe.com/dbd/riskctrl/state_02/loans_raw.csv").read()
    df = pd.read_csv(StringIO(loans_raw_csv))

    # Non-numeric entries become NaN so they are treated as missing values.
    df['revolUtil'] = pd.to_numeric(df['revolUtil'], errors='coerce')
    print("处理前:")  # "Before processing:"
    print(df['revolUtil'].describe().round(2).to_string())
    print(f"异常值(>100): {(df['revolUtil'] > 100).sum()}")  # "Outliers (>100):"

    # Take the median before clipping, cap values at 100, then fill NaN with that median.
    median = df['revolUtil'].median()
    df['revolUtil'] = df['revolUtil'].clip(upper=100).fillna(median)
    print("处理后:")  # "After processing:"
    print(df['revolUtil'].describe().round(2).to_string())
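The same three steps (coerce to numeric, cap at 100, fill missing values with the median) can be tried without Pyodide or network access. The following is a minimal sketch on made-up data; the helper name `clean_revol_util`, its `cap` parameter, and the toy values are hypothetical and are not part of the exercise.

```python
import pandas as pd

def clean_revol_util(values, cap=100):
    """Coerce to numeric, cap at `cap`, and fill missing values with the median."""
    s = pd.to_numeric(pd.Series(values), errors="coerce")  # unparseable entries -> NaN
    median = s.median()  # NaN is skipped when the median is computed
    return s.clip(upper=cap).fillna(median)

# Hypothetical toy data: one outlier (118.1), one missing value, one bad string.
raw = [10.0, 50.0, 118.1, None, "n/a", 70.0]
cleaned = clean_revol_util(raw)
print(cleaned.tolist())  # [10.0, 50.0, 100.0, 60.0, 60.0, 70.0]
```

Here the median of the four valid values (10, 50, 70, 118.1) is 60, so both the missing entry and the unparseable string are filled with 60, while 118.1 is capped at 100.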

Example

Input
solve()
Expected output
处理前:
count    9996.00
mean       51.59
std        24.64
min         0.00
25%        33.20
50%        51.70
75%        70.80
max       118.10
异常值(>100): 36
处理后:
count    10000.00
mean        51.58
std         24.62
min          0.00
25%         33.20
50%         51.70
75%         70.80
max        100.00
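In the expected output, the count rises from 9996 to 10000, which is consistent with four missing or non-numeric entries being filled, and the maximum drops from 118.10 to 100.00 because of the cap. The sketch below reproduces that pattern on a small made-up column; the values are hypothetical and chosen only to make the effect visible.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column: two values above 100 and two missing entries.
s = pd.Series([20.0, 40.0, 60.0, 80.0, 105.0, 118.1, np.nan, np.nan])

n_missing = int(s.isna().sum())    # entries that fillna will fill
n_outliers = int((s > 100).sum())  # entries that clip will cap

cleaned = s.clip(upper=100).fillna(s.median())

print(f"missing={n_missing}, outliers={n_outliers}")
print(f"count: {s.count()} -> {cleaned.count()}")  # grows by the number of filled entries
print(f"max: {s.max()} -> {cleaned.max()}")        # capped at 100
print(f"median unchanged: {s.median() == cleaned.median()}")
```

Because the cap lies above the median and the filled entries equal the median, the median stays the same in this toy case, while the mean and standard deviation shift slightly, as they do in the exercise output.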