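I am trying to write a function that strips leading and trailing whitespace.
This is a simple example, but the real file contains far more complex rows and columns.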
    import numpy as np
    import pandas as pd

    df = pd.DataFrame([["A b ", 2, 3], [np.nan, 2, 3],
                       [" random", 43, 4], [" any txt is possible ", " 2 1", 22],
                       ["", 23, 99], [" help ", 23, np.nan]],
                      columns=['A', 'B', 'C'])
The result should have all leading and trailing whitespace removed, while keeping the spaces inside the text.
    df = pd.DataFrame([["A b", 2, 3], [np.nan, 2, 3],
                       ["random", 43, 4], ["any txt is possible", "2 1", 22],
                       ["", 23, 99], ["help", 23, np.nan]],
                      columns=['A', 'B', 'C'])
Note that the function needs to cover all possible cases. Thanks.
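I think you need to check whether each value is a string, because the columns hold mixed values (numbers alongside strings), and call strip only on the strings: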
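    # Strip only str values; numbers and NaN pass through unchanged.
    # In pandas 2.1+ applymap is deprecated in favor of DataFrame.map.
    df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
    print (df)

                         A    B     C
    0                  A b    2   3.0
    1                  NaN    2   3.0
    2               random   43   4.0
    3  any txt is possible  2 1  22.0
    4                        23  99.0
    5                 help   23   NaN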
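The same idea can be wrapped in a reusable function. This is only a minimal sketch: the name `strip_strings` is hypothetical, and it assumes that skipping numeric columns and mapping every other column value by value is acceptable for your data. It relies on `Series.map`, so it does not depend on `applymap`.

```python
import numpy as np
import pandas as pd


def strip_strings(frame):
    """Return a copy of frame with whitespace stripped from str cells.

    Hypothetical helper: non-string values (numbers, NaN) are left as-is.
    """
    out = frame.copy()
    for col in out.columns:
        # Purely numeric columns cannot contain strings, so skip them.
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        out[col] = out[col].map(
            lambda x: x.strip() if isinstance(x, str) else x
        )
    return out


df = pd.DataFrame(
    [["A b ", 2, 3], [np.nan, 2, 3], [" random", 43, 4],
     [" any txt is possible ", " 2 1", 22], ["", 23, 99],
     [" help ", 23, np.nan]],
    columns=["A", "B", "C"],
)
cleaned = strip_strings(df)
print(cleaned)
```

Because the function works on a copy, the original `df` keeps its untrimmed values.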
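If each column holds values of a single type, you can use the version below instead. With mixed columns, as in column B of your sample, the numeric values turn into NaN: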
    cols = df.select_dtypes(['object']).columns
    df[cols] = df[cols].apply(lambda x: x.str.strip())
    print (df)

                         A    B     C
    0                  A b  NaN   3.0
    1                  NaN  NaN   3.0
    2               random  NaN   4.0
    3  any txt is possible  2 1  22.0
    4                       NaN  99.0
    5                 help  NaN   NaN
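The NaN values in column B come from `.str.strip()` returning NaN for every element that is not a string. The sketch below reproduces this on a small mixed Series and shows one possible way to keep the vectorized call while restoring the non-string values with `Series.where`. The variable names are illustrative only, and this is an assumption about what you might want, not part of the answer above.

```python
import pandas as pd

# A mixed object column, similar to column B in the question.
s = pd.Series([2, " 2 1", 23], dtype=object)

# .str.strip() yields NaN wherever the element is not a string.
stripped = s.str.strip()

# Keep the stripped value for strings, fall back to the original otherwise.
is_str = s.map(lambda v: isinstance(v, str))
restored = stripped.where(is_str, s)

print(stripped.tolist())
print(restored.tolist())
```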