This is the spider:
import scrapy
from danmurphys.items import DanmurphysItem


class MySpider(scrapy.Spider):
    name = 'danmurphys'
    allowed_domains = ['danmurphys.com.au']
    start_urls = ['https://www.danmurphys.com.au/dm/navigation/navigation_results_gallery.jsp?params=fh_location%3D%2F%2Fcatalog01%2Fen_AU%2Fcategories%3C%7Bcatalog01_2534374302084767_2534374302027742%7D%26fh_view_size%3D120%26fh_sort%3D-sales_value_30_days%26fh_modification%3D&resetnav=false&storeExclusivePage=false']

    def parse(self, response):
        urls = response.xpath('//h2/a/@href').extract()
        for url in urls:
            request = scrapy.Request(url, callback=self.parse_page)
            yield request

    def parse_page(self, response):
        item = DanmurphysItem()
        item['brand'] = response.xpath('//span[@itemprop="brand"]/text()').extract_first().strip()
        item['name'] = response.xpath('//span[@itemprop="name"]/text()').extract_first().strip()
        item['url'] = response.url
        return item
This is the item:
import scrapy


class DanmurphysItem(scrapy.Item):
    brand = scrapy.Field()
    name = scrapy.Field()
    url = scrapy.Field()
When I run the spider with the following command:
scrapy crawl danmurphys -o output.csv
The output looks like this:
To fix this in Scrapy 1.3, you can patch it by adding newline='' as an argument to io.TextIOWrapper in the __init__ method of the CsvItemExporter class in scrapy.exporters.
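To see what that argument changes, here is a minimal sketch that uses only the standard library and is not Scrapy's own code. The helper name write_rows and the sample rows are made up for illustration. It passes newline='\r\n' explicitly to mimic the Windows default on any platform, on the assumption that the symptom comes from newline translation in the text wrapper.

```python
import csv
import io


def write_rows(rows, newline):
    """Write rows with csv.writer through a TextIOWrapper; return the raw bytes.

    Hypothetical helper for illustration only; it is not part of Scrapy.
    """
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(
        buffer,
        line_buffering=False,
        write_through=True,
        encoding="utf-8",
        newline=newline,  # the argument the suggested patch adds
    )
    csv.writer(stream).writerows(rows)
    stream.flush()
    data = buffer.getvalue()
    stream.detach()  # release the buffer so closing the wrapper cannot close it
    return data


rows = [["brand", "name"], ["A", "B"]]

# Simulated Windows default: each "\n" that csv.writer emits as part of
# "\r\n" is expanded again, so every row ends in "\r\r\n".
translated = write_rows(rows, newline="\r\n")

# With newline='' nothing is translated and the csv module's own "\r\n"
# line terminator reaches the file unchanged.
untouched = write_rows(rows, newline="")

print(repr(translated))
print(repr(untouched))
```

In Scrapy the same keyword would go into the io.TextIOWrapper call that the answer points to, inside CsvItemExporter.__init__.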