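I'm trying to filter a DataFrame down to the rows for a given product. However, whenever I run the code, I get the pandas error 'DataFrame' object has no attribute 'str'.
Here is the line of code: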
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
If anyone has any suggestions, please let me know. I have searched many times and I'm very confused.
The Product column has the object dtype.
Edit:
import __future__
import os
import pandas as pd
import numpy as np
import tensorflow as tf
import math

data = pd.read_csv("FILE.csv", header = None)
headerName=["DRID","Product","M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]
cliques = [(Confidential)]
data.columns=[headerName]
log_df = data
log_df = np.log(1+data[["M24","M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M9","M8","M7","M6","M5","M4","M3","M2","M1"]])
copy = data[["DRID","Product"]].copy()
log_df = copy.join(log_df)
include_clique = log_df.loc[log_df['Product'].str.contains("Product A")]
Here is the head:
       ID PRODUCT       M24       M23       M22  M21
0  123421       A  0.000000  0.000000  1.098612  0.0
1  141840       A  0.693147  1.098612  0.000000  0.0
2  212006       A  0.693147  0.000000  0.000000  0.0
3  216097       A  1.098612  0.000000  0.000000  0.0
4  219517       A  1.098612  0.693147  1.098612  0.0
Edit 2: Here is print(data); A is the product. When I print it out, it looks as though A isn't sitting under the Product column.
     DRID Product  M24  M23  M22  M21  M20  \
0   52250       A  0.0  0.0  2.0  0.0  0.0
1  141840       A  1.0  2.0  0.0  0.0  0.0
2  212006       A  1.0  0.0  0.0  0.0  0.0
3  216097       A  2.0  0.0  0.0  0.0  0.0
The answer is simple: change data.columns=[headerName] to data.columns=headerName
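Explanation: When you set data.columns=[headerName], the extra pair of brackets makes the columns a MultiIndex object. As a result, your log_df['Product'] is a DataFrame, and a DataFrame has no str attribute.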
When you set data.columns=headerName, your log_df['Product'] is a single column (a Series), and you can use the str attribute.
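To see the difference for yourself, here is a minimal sketch. The three-row frame and the name header_name are made up for illustration; they stand in for FILE.csv and headerName, which I don't have.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv("FILE.csv", header=None)
data = pd.DataFrame([[52250, "Product A", 0.0],
                     [141840, "Product B", 1.0],
                     [212006, "Product A", 1.0]])
header_name = ["DRID", "Product", "M24"]

# Wrapping the list in another list produces MultiIndex columns
nested = data.copy()
nested.columns = [header_name]
print(type(nested.columns).__name__)     # MultiIndex
print(type(nested["Product"]).__name__)  # DataFrame, so .str is not available

# Assigning the flat list produces ordinary columns
flat = data.copy()
flat.columns = header_name
print(type(flat["Product"]).__name__)    # Series, so .str works

# The original filter now runs and keeps the "Product A" rows
include_clique = flat.loc[flat["Product"].str.contains("Product A")]
print(include_clique)
```

Checking type(log_df.columns) right after assigning the header is a quick way to catch this kind of mistake.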
If for any reason you need to keep the columns as a MultiIndex, there is another solution: first convert your log_df['Product'] to a Series. After that, the str attribute is available.
products = pd.Series(log_df['Product'].values.flatten())
include_clique = products[products.str.contains("Product A")]
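Note that the snippet above returns only the matching product names, not the full rows. If you want the rows of log_df while keeping the MultiIndex columns, one possible variation is to use the flattened Series as a boolean mask. This is just a sketch on made-up data, and passing index=log_df.index is my own addition so that the mask lines up with the rows.

```python
import pandas as pd

# Hypothetical data standing in for the real log_df
log_df = pd.DataFrame([[52250, "Product A", 0.0],
                       [141840, "Product B", 0.693147],
                       [212006, "Product A", 0.693147]])
log_df.columns = [["DRID", "Product", "M24"]]  # MultiIndex columns, as in the question

# Flatten the one-column DataFrame into a Series that shares log_df's row index
products = pd.Series(log_df["Product"].values.flatten(), index=log_df.index)

# Boolean mask: True where the product name contains "Product A"
mask = products.str.contains("Product A")

# Select the matching rows; the columns stay a MultiIndex
include_clique = log_df.loc[mask]
print(include_clique)
```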
However, I guess the first solution is the one you are looking for.