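I need to get the contents of a JSON file into a pandas DataFrame in a particular layout, so that I can run pandasql on it to transform the data and then pass it through a scoring model.
file = C:\scoring_model\json.js (the contents of "file" are shown below)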
{
  "response":{
    "version":"1.1",
    "token":"dsfgf",
    "body":{
      "customer":{
        "customer_id":"1234567",
        "verified":"true"
      },
      "contact":{
        "email":"mr@abc.com",
        "mobile_number":"0123456789"
      },
      "personal":{
        "gender": "m",
        "title":"Dr.",
        "last_name":"Muster",
        "first_name":"Max",
        "family_status":"single",
        "dob":"1985-12-23",
      }
    }
  }
I need the DataFrame to look like this (all values belong on a single row; it is split in two here only to fit the width of the question):
version | token | customer_id | verified | email      | mobile_number | gender |
1.1     | dsfgf | 1234567     | true     | mr@abc.com | 0123456789    | m      |

title | last_name | first_name | family_status | dob
Dr.   | Muster    | Max        | single        | 23.12.1985
I have looked at all the other questions on this topic and tried various ways of loading the JSON file into pandas:
    with open(r'C:\scoring_model\json.js', 'r') as f:
        c = pd.read_json(f.read())

    with open(r'C:\scoring_model\json.js', 'r') as f:
        c = f.readlines()
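I also tried pd.Panel() as suggested in this solution: Python Pandas: How to split a sorted dictionary in a column of a dataframe
With the DataFrame built from [yo = f.readlines()], I considered splitting the contents of each cell on ("") and then finding a way to put the pieces into separate columns, but I have had no luck so far. Your expertise is greatly appreciated. Thank you in advance.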
If you load the entire JSON as a dict (or list), e.g. with json.load, you can use json_normalize:
    In [11]: d = {"response": {"body": {"contact": {"email": "mr@abc.com", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}}

    In [12]: df = pd.json_normalize(d)

    In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])

    In [14]: df
    Out[14]:
            email mobile_number customer_id verified         dob family_status first_name gender last_name title  token version
    0  mr@abc.com    0123456789     1234567     true  1985-12-23        single        Max      m    Muster   Dr.  dsfgf     1.1
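To go from the file on disk to the layout asked for in the question, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: `load_flat` and `WANTED_ORDER` are names made up for this example, the column order is copied from the table in the question, and the demo writes a corrected copy of the sample to a temporary file instead of reading C:\scoring_model\json.js. Note that the sample as posted has a trailing comma after "dob" and appears to be missing a closing brace; Python's json module rejects trailing commas, so the real file would need the same cleanup before json.load accepts it.

```python
import json
import os
import tempfile

import pandas as pd

# Column order copied from the layout requested in the question (an assumption
# about what the scoring model expects).
WANTED_ORDER = [
    "version", "token", "customer_id", "verified", "email", "mobile_number",
    "gender", "title", "last_name", "first_name", "family_status", "dob",
]


def load_flat(path):
    """Hypothetical helper: load one nested JSON document as a one-row DataFrame."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)  # strict parser: a trailing comma raises an error

    df = pd.json_normalize(data)
    # Keep only the last part of each dotted name,
    # e.g. "response.body.contact.email" -> "email".
    df.columns = [name.split(".")[-1] for name in df.columns]
    # Rewrite the ISO date as DD.MM.YYYY, matching the requested output.
    df["dob"] = pd.to_datetime(df["dob"], format="%Y-%m-%d").dt.strftime("%d.%m.%Y")
    return df[WANTED_ORDER]


# Demo data: the sample from the question, with the trailing comma removed
# and the braces balanced.
sample = {
    "response": {
        "version": "1.1",
        "token": "dsfgf",
        "body": {
            "customer": {"customer_id": "1234567", "verified": "true"},
            "contact": {"email": "mr@abc.com", "mobile_number": "0123456789"},
            "personal": {
                "gender": "m",
                "title": "Dr.",
                "last_name": "Muster",
                "first_name": "Max",
                "family_status": "single",
                "dob": "1985-12-23",
            },
        },
    }
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "json.js")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    flat = load_flat(demo_path)

print(flat.to_string(index=False))
```

Because json.load keeps the quoted values as strings, the leading zero in mobile_number survives. One caveat with keeping only the last part of each column name: if two nested objects ever share a key (say, an "id" under both customer and contact), the resulting columns would collide, in which case keeping the full dotted names is safer.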