32 Commits

Author SHA1 Message Date
Shadowfacts 1beff21fc5
Switch to Mojito for HTTP requests 2020-09-11 19:15:19 -04:00
Shadowfacts bd42073e24
Fix whatever.scalzi.com extractor 2020-08-14 21:55:38 -04:00
Shadowfacts ab105d71ae
Add Gemini document -> HTML converter stage 2020-07-18 23:13:42 -04:00
Shadowfacts 26bfb2e58f
Store item content MIME type 2020-07-18 23:13:24 -04:00
Shadowfacts 12bb742be9
Add Gemini protocol scrape stage 2020-07-18 19:50:41 -04:00
Shadowfacts 4f16933198
Add gemini protocol feed fetching 2020-07-18 19:27:53 -04:00
Shadowfacts fc2b8f6036
Add basic LiveView pipeline editor, scrape stage config editing 2020-06-08 22:49:45 -04:00
Shadowfacts 55c6d6fd88
Remove newsletter info from om.co extractor 2020-06-01 23:05:18 -04:00
Shadowfacts 4cccab8df0
Remove old code 2020-06-01 18:30:59 -04:00
Shadowfacts c37bed932f
Fix pipeline validation not working 2020-05-31 15:56:27 -04:00
Shadowfacts 4a09ce1cb0
Fix scraping images w/ URLs w/o schemes 2020-02-17 12:09:03 -05:00
Shadowfacts e684737fcd
Implement basic favicon scraping 2019-11-10 14:23:07 -05:00
Shadowfacts f84d849432
Add conditional stage 2019-11-01 22:50:25 -04:00
Allows applying another pipeline stage based on a condition, which can either be a whole filter or a single filter rule.
Shadowfacts 13c44d5e10
Refactor filtering logic into separate module 2019-11-01 22:49:52 -04:00
Shadowfacts c9cc9f2428
Fix crash while scraping images 2019-11-01 18:29:41 -04:00
Shadowfacts 9264c9a97d
Add extractor for om.co 2019-11-01 18:27:15 -04:00
Shadowfacts 5d38d9567e
Fix error while validating scrape stage options 2019-11-01 18:27:08 -04:00
Shadowfacts 3bc37952d1
Add option to convert images in article content to data URIs 2019-10-31 21:59:55 -04:00
Shadowfacts 98a182986c
Fix module name of whatever.scalzi.com extractor 2019-10-31 19:15:35 -04:00
Shadowfacts 8c96a94cd3
Add extractor for finertech.com 2019-10-31 19:04:15 -04:00
Shadowfacts 118de4ae53
Add extractor for macstories.net 2019-10-31 18:48:36 -04:00
Shadowfacts 957f271425
Add extractor for 512pixels.net 2019-10-31 18:38:01 -04:00
Shadowfacts c0f04791b6
Add extractor for beckyhansmeyer.com 2019-10-31 17:46:32 -04:00
Shadowfacts cfd9f7505a
Rewrite image URLs without hosts to use the host of the article URL 2019-10-31 17:38:16 -04:00
Shadowfacts eec0b918e7
Change extractors to accept/return html trees 2019-10-31 17:12:02 -04:00
Shadowfacts d476839fce
Add extractor for ericasadun.com 2019-10-31 17:03:34 -04:00
Shadowfacts 6f568a03e1
Add whatever.scalzi.com extractor 2019-10-31 16:52:01 -04:00
Shadowfacts 3192969889
Replace site-specific pipeline stages with new extractor architecture 2019-10-31 16:45:52 -04:00
Shadowfacts 1015fd5162
Add types, Dialyzer, fix Dialyzer warnings 2019-08-30 19:31:38 -04:00
Shadowfacts e55a694194
Add Daring Fireball scraper 2019-07-21 19:04:43 -04:00
Shadowfacts 17310911ce
Add pipeline stage option validation/error reporting 2019-07-21 12:21:28 -04:00
Shadowfacts 0a1909dbc4
Start pipeline system 2019-07-08 22:45:02 -04:00