80 Commits

Author SHA1 Message Date
Shadowfacts 5087ef0395
Fix feed favicon_url not setting 2020-06-01 22:45:47 -04:00
Shadowfacts 2a3c085fef
Add item pagination 2020-06-01 22:24:18 -04:00
Shadowfacts 4cccab8df0
Remove old code 2020-06-01 18:30:59 -04:00
Shadowfacts 50af019c6f
Update hackney, fixes issue with OTP 23 2020-06-01 18:27:23 -04:00
Shadowfacts e429d256d6
Add configurable refresh frequency to feeds 2020-05-31 22:48:04 -04:00
Shadowfacts 939470767b
Use HTTP helper when refreshing feeds 2020-05-31 16:23:51 -04:00
Shadowfacts b63081392c
Make feed refresh request asynchronous 2020-05-31 16:13:00 -04:00
Shadowfacts c37bed932f
Fix pipeline validation not working 2020-05-31 15:56:27 -04:00
Shadowfacts 09d2e4ae72
Fix wrong return type 2020-05-29 20:06:42 -04:00
Shadowfacts 5ee8515bb2
Add missing error case 2020-05-29 20:04:41 -04:00
Shadowfacts b0d9189399
Prevent unnecessary refetching of favicons 2020-05-29 19:47:14 -04:00
Shadowfacts 4a09ce1cb0
Fix scraping images with URLs without schemes 2020-02-17 12:09:03 -05:00
Shadowfacts bc8b3a8a38
Change default item date to now if item date couldn't be parsed 2020-01-29 22:01:23 -05:00
Shadowfacts 2086a31537
Update HTTPoison, improve handling for URIs with no scheme 2020-01-26 21:46:49 -05:00
Shadowfacts 740e0e7fca
Allow items without dates 2020-01-12 21:51:58 -05:00
Shadowfacts fbe98c2e66
Fix typo 2019-12-21 22:57:15 -05:00
Shadowfacts 66f7206b47
Add fallback handler for unknown response codes 2019-12-21 22:56:10 -05:00
Shadowfacts 2c06b785c9
Fix incorrect handling of relative favicon links 2019-11-10 15:27:40 -05:00
Shadowfacts 8d790b8af0
Fix HTTP not handling relative redirects correctly 2019-11-10 15:15:54 -05:00
Shadowfacts 2a5bfb22db
Prevent crash when trying to load a favicon from a non-existent URL 2019-11-10 15:05:48 -05:00
Shadowfacts a09901e44e
When fetching favicons, if a feed doesn't have a site URL, fall back on
the root page of the domain.
2019-11-10 15:03:38 -05:00
Shadowfacts e684737fcd
Implement basic favicon scraping 2019-11-10 14:23:07 -05:00
Shadowfacts 2d88b0b4e1
Prune unread items after two weeks 2019-11-10 14:03:13 -05:00
Shadowfacts 4888a45243
Make item titles not required by changesets
Fixes a bug where items without titles could not be marked as read
2019-11-10 11:48:27 -05:00
Shadowfacts 0d0c749b68
Make pipelines not tied directly to feeds
Allows using the same pipeline for multiple different feeds
2019-11-08 22:27:46 -05:00
Shadowfacts f84d849432
Add conditional stage
Allows applying another pipeline stage based on a condition, which can
either be a whole filter or a single filter rule.
2019-11-01 22:50:25 -04:00
Shadowfacts 13c44d5e10
Refactor filtering logic into separate module 2019-11-01 22:49:52 -04:00
Shadowfacts c9cc9f2428
Fix crash while scraping images 2019-11-01 18:29:41 -04:00
Shadowfacts 9264c9a97d
Add extractor for om.co 2019-11-01 18:27:15 -04:00
Shadowfacts 5d38d9567e
Fix error while validating scrape stage options 2019-11-01 18:27:08 -04:00
Shadowfacts 1a934430cc
Change feed refresh interval to 30 minutes 2019-10-31 22:21:32 -04:00
Shadowfacts d8139c6ce0
Make item creation multi-threaded 2019-10-31 22:21:17 -04:00
Shadowfacts 3bc37952d1
Add option to convert images in article content to data URIs 2019-10-31 21:59:55 -04:00
Shadowfacts 98a182986c
Fix module name of whatever.scalzi.com extractor 2019-10-31 19:15:35 -04:00
Shadowfacts 8c96a94cd3
Add extractor for finertech.com 2019-10-31 19:04:15 -04:00
Shadowfacts 118de4ae53
Add extractor for macstories.net 2019-10-31 18:48:36 -04:00
Shadowfacts 957f271425
Add extractor for 512pixels.net 2019-10-31 18:38:01 -04:00
Shadowfacts c0f04791b6
Add extractor for beckyhansmeyer.com 2019-10-31 17:46:32 -04:00
Shadowfacts cfd9f7505a
Rewrite image URLs without hosts to use the host of the article URL 2019-10-31 17:38:16 -04:00
Shadowfacts eec0b918e7
Change extractors to accept/return html trees 2019-10-31 17:12:02 -04:00
Shadowfacts d476839fce
Add extractor for ericasadun.com 2019-10-31 17:03:34 -04:00
Shadowfacts 6f568a03e1
Add whatever.scalzi.com extractor 2019-10-31 16:52:01 -04:00
Shadowfacts 3192969889
Replace site-specific pipeline stages with new extractor architecture 2019-10-31 16:45:52 -04:00
Shadowfacts 9e6b185cfd
Fix crash caused by filter tombstoning an item 2019-10-31 14:26:02 -04:00
Shadowfacts 07a62fec25
Remove old code from before filters and scraping were combined into pipelines 2019-10-31 14:18:41 -04:00
Shadowfacts 27dd2c4a8e
Add default HTTP response handler 2019-09-26 11:59:44 -04:00
Shadowfacts bd896641df
Fix crash on certificate errors 2019-09-26 11:55:42 -04:00
Shadowfacts 4f002037fb
Better logging for feed fetch error 2019-09-18 15:58:43 -04:00
Shadowfacts 56121de540
Fix crash on storing items with too precise dates 2019-09-18 15:55:22 -04:00
Shadowfacts 5c4b989446
Use feed_parser instead of fiet 2019-09-01 16:46:56 -04:00