How do you use Scrapy to scrape a web request that returns JSON? For example, the JSON looks like this:
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        { "type": "home", "number": "212 555-1234" },
        { "type": "fax", "number": "646 555-4567" }
    ]
}
I'd like to scrape specific items (for example, name and fax in the JSON above) and save them to CSV.
It's the same as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you should use the json module to parse the response:
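import json

import scrapy


class MySpider(scrapy.Spider):  # called BaseSpider in old Scrapy releases
    ...

    def parse(self, response):
        # Decode the JSON body into ordinary Python dicts and lists.
        jsonresponse = json.loads(response.text)
        item = MyItem()  # MyItem is your own scrapy.Item subclass
        item["firstName"] = jsonresponse["firstName"]
        return item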
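To get at the name and fax from the question, the decoded JSON can be walked like any other dict. Here's a minimal sketch that runs without Scrapy. The helper names extract_record and to_csv are made up for illustration, and it assumes "name" means the first and last name joined with a space:

```python
import csv
import io
import json

# Sample payload from the question, inlined so the sketch runs on its own.
SAMPLE = """
{ "firstName": "John", "lastName": "Smith", "age": 25,
  "address": { "streetAddress": "21 2nd Street", "city": "New York",
               "state": "NY", "postalCode": "10021" },
  "phoneNumber": [ { "type": "home", "number": "212 555-1234" },
                   { "type": "fax", "number": "646 555-4567" } ] }
"""


def extract_record(data):
    """Hypothetical helper: pick the name and fax number out of one decoded object."""
    # phoneNumber is a list of {"type", "number"} dicts; take the first fax entry,
    # or None when the record has no fax number.
    fax = next(
        (p["number"] for p in data.get("phoneNumber", []) if p.get("type") == "fax"),
        None,
    )
    # Assumption: "name" is the first and last name joined with a space.
    return {"name": f'{data["firstName"]} {data["lastName"]}', "fax": fax}


def to_csv(records):
    """Hypothetical helper: serialise the records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "fax"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


record = extract_record(json.loads(SAMPLE))
print(to_csv([record]))
```

Inside the spider you probably wouldn't write the CSV yourself. Returning the item (or a dict like the one above) from parse and running the crawl with Scrapy's feed export option, e.g. `scrapy crawl myspider -o items.csv` (the spider name is a placeholder), should produce the file for you.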
Hope that helps.