Python Scrapy: Storing the Scraped Data

The simplest way to store scraped data is to use Feed exports, by running the following command:
scrapy crawl quotes -O quotes.json

This will create a quotes.json file with all scraped items, serialized in JSON.
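
Once the file exists, it can be read back with Python's standard json module. The following is a minimal sketch, assuming the crawl produced a quotes.json file containing a single JSON array of items; the helper name load_items is hypothetical and not part of Scrapy.

```python
import json
from pathlib import Path


def load_items(path):
    """Return the list of scraped items stored in a JSON feed file.

    Hypothetical helper: assumes the file holds one top-level JSON array.
    """
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    feed = Path("quotes.json")  # file name taken from the command above
    if feed.exists():
        items = load_items(feed)
        print(f"Loaded {len(items)} items")
```

The existence check only keeps the sketch from failing when no crawl has been run yet.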

The -O command-line switch overwrites any existing file. To append new content to an existing file, use the -o switch instead. However, appending to a JSON file makes its contents invalid JSON, because the new items are written after the end of the data that is already there. For this reason, consider using a different serialization format like JSON Lines:
scrapy crawl quotes -o quotes.jsonl
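
The problem with appending to a JSON file can be reproduced with the standard library alone. This is a rough sketch, assuming each crawl writes its own complete JSON array; the sample items and the is_valid_json helper are made up for illustration.

```python
import json

# Hypothetical items standing in for the output of two separate crawls.
first_run = [{"text": "quote one", "author": "A"}]
second_run = [{"text": "quote two", "author": "B"}]

# Simulate appending: the second array is written after the first one.
appended = json.dumps(first_run) + "\n" + json.dumps(second_run)


def is_valid_json(text):
    """Return True if the whole text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


single_run_ok = is_valid_json(json.dumps(first_run))
appended_ok = is_valid_json(appended)
print(single_run_ok, appended_ok)
```

A parser expects exactly one top-level value, so the second array is rejected as extra data.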

JSON Lines is a simple format where each line represents a separate JSON object, making it easier to append data without affecting the overall file structure.
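
To illustrate why appending is safe with this format, here is a minimal sketch that uses only the standard library. The helper names append_items and read_items are hypothetical, the sample items are invented, and a temporary directory is used so that no real output file is touched.

```python
import json
import os
import tempfile


def append_items(path, items):
    """Append each item to the file as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item) + "\n")


def read_items(path):
    """Parse every non-empty line of the file as its own JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "quotes.jsonl")
    append_items(path, [{"text": "quote one"}])  # first simulated run
    append_items(path, [{"text": "quote two"}])  # second simulated run
    items = read_items(path)

print(len(items))
```

Because every line is parsed independently, items from both simulated runs are read back without any repair step.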

