frenzy

The Scrape pipeline stage replaces the content that an RSS item carries in the feed itself with content scraped from the item's webpage.

`extractor`

A string: either `builtin` or the module name of a specific extractor (see below).
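
As an illustration of the two accepted forms, the value could be checked along these lines. This is a minimal sketch in Python, not Frenzy's own code; `is_valid_extractor` and the module-name pattern are assumptions made for the example.

```python
import re

# Assumed shape of a module name: dot-separated, capitalized segments,
# e.g. "Frenzy.Pipeline.Extractor.DaringFireball".
MODULE_NAME = re.compile(r"[A-Z][A-Za-z0-9_]*(\.[A-Z][A-Za-z0-9_]*)*")


def is_valid_extractor(value):
    """Return True for "builtin" or a string shaped like a module name."""
    if not isinstance(value, str):
        return False
    return value == "builtin" or MODULE_NAME.fullmatch(value) is not None
```

A site's domain (such as `daringfireball.net`) is not an accepted value under this check; the option expects the extractor's module name.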

`convert_to_data_uris`

A boolean that controls whether images in posts should be fetched from the web, converted to data URIs and injected into RSS items.

Enabling this option significantly increases the database size, because the images are stored directly in the database.

Note: This option may be disabled by server administrators and is restricted to certain MIME types (PNG, JPG, TIFF, HEIF, and HEIC).
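
The conversion itself can be pictured roughly as follows. This is a hedged sketch in Python rather than Frenzy's actual implementation: `to_data_uri` and `ALLOWED_MIME_TYPES` are hypothetical names, the MIME strings are assumed to correspond to the formats listed above, and fetching the image from the web is left out.

```python
import base64

# MIME types assumed to match the permitted formats (PNG, JPG, TIFF, HEIF, HEIC).
ALLOWED_MIME_TYPES = {
    "image/png",
    "image/jpeg",
    "image/tiff",
    "image/heif",
    "image/heic",
}


def to_data_uri(image_bytes, mime_type):
    """Return a base64 data URI, or None if the MIME type is not allowed."""
    if mime_type not in ALLOWED_MIME_TYPES:
        return None  # caller would keep the original image URL
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Base64 encoding makes each image roughly one third larger than its binary form, which adds to the storage cost mentioned above.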

Extractors

Extractors define how the content of a web page is isolated from the rest of the page. The `builtin` extractor uses a general-purpose algorithm to isolate and extract that content, but it may be unreliable for some websites. For this reason, Frenzy also includes extractors for specific websites:

beckyhansmeyer.com: Frenzy.Pipeline.Extractor.BeckyHansmeyer
daringfireball.net: Frenzy.Pipeline.Extractor.DaringFireball
ericasadun.com: Frenzy.Pipeline.Extractor.EricaSadun
finertech.com: Frenzy.Pipeline.Extractor.FinerTech
512pixels.net: Frenzy.Pipeline.Extractor.FiveTwelvePixels
macstories.net: Frenzy.Pipeline.Extractor.MacStories
om.co: Frenzy.Pipeline.Extractor.OmMalik
whatever.scalzi.com: Frenzy.Pipeline.Extractor.WhateverScalzi
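
When deciding which value to put in the `extractor` option for a feed, the list above amounts to a lookup from host name to module name, with `builtin` as the fallback. The sketch below shows that lookup in Python; `extractor_for` is a hypothetical helper for illustration only, and stripping a leading `www.` is an assumption. Frenzy itself takes the extractor from the stage's options.

```python
from urllib.parse import urlsplit

# Site-specific extractors, copied from the list above.
SITE_EXTRACTORS = {
    "beckyhansmeyer.com": "Frenzy.Pipeline.Extractor.BeckyHansmeyer",
    "daringfireball.net": "Frenzy.Pipeline.Extractor.DaringFireball",
    "ericasadun.com": "Frenzy.Pipeline.Extractor.EricaSadun",
    "finertech.com": "Frenzy.Pipeline.Extractor.FinerTech",
    "512pixels.net": "Frenzy.Pipeline.Extractor.FiveTwelvePixels",
    "macstories.net": "Frenzy.Pipeline.Extractor.MacStories",
    "om.co": "Frenzy.Pipeline.Extractor.OmMalik",
    "whatever.scalzi.com": "Frenzy.Pipeline.Extractor.WhateverScalzi",
}


def extractor_for(url):
    """Suggest an `extractor` value for a site URL, defaulting to "builtin"."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: treat www.example.com like example.com
    return SITE_EXTRACTORS.get(host, "builtin")
```

For example, `extractor_for("https://daringfireball.net/")` returns `Frenzy.Pipeline.Extractor.DaringFireball`, while a site that is not in the list falls back to `builtin`.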