
scrapy_spider_quotes_scraper_with_pagination.py

python

A self-contained spider that scrapes quote text and authors from quotes.toscrape.com and follows the "next" pagination link until no further pages remain.

19d ago · 18 lines · docs.scrapy.org
import scrapy

class QuotesSpider(scrapy.Spider):
    # Name Scrapy uses to identify this spider.
    name = "quotes"
    start_urls = [
        'https://quotes.toscrape.com/tag/humor/',
    ]

    def parse(self, response):
        # Yield one item per quote block found on the page.
        for quote in response.css('div.quote'):
            yield {
                'author': quote.xpath('span/small/text()').get(),
                'text': quote.css('span.text::text').get(),
            }

        # Follow the "next" link, if present, and parse that page the same way.
        next_page = response.css('li.next a::attr("href")').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)
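
The pagination step works because response.follow accepts a relative href and resolves it against the URL of the page being parsed. The sketch below imitates that resolution with the standard library only, so it runs without Scrapy or network access. It is a rough illustration, not Scrapy's implementation: resolve_next is a hypothetical helper, and the href value /tag/humor/page/2/ is an assumed example rather than one taken from a live response.

```python
from urllib.parse import urljoin


def resolve_next(current_url, href):
    """Return the absolute URL for a pagination link, or None if there is no next page."""
    # Mirrors the `if next_page is not None` guard in the spider.
    if href is None:
        return None
    # Resolve a relative or root-relative href against the current page URL.
    return urljoin(current_url, href)


page_url = "https://quotes.toscrape.com/tag/humor/"

# Assumed href of the "next" link on the first page (hypothetical value).
print(resolve_next(page_url, "/tag/humor/page/2/"))

# On the last page the selector returns None, so there is nothing to follow.
print(resolve_next(page_url, None))
```

Because the spider passes self.parse as the callback, each followed page goes through the same extraction and the same "next" check, so the crawl stops on its own at the first page without a li.next link.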