12 Commits

Author SHA1 Message Date
Ben Olive 5dd52d5698 Ensure `remove_tag` returns a valid html_tree
If the entire input is stripped out, this used to return `nil` which
caused downstream parsing to fail. Instead, return `[]` which is the
Floki representation of an empty tree.

Fixes #36
2018-10-11 10:35:16 +09:00
Ben Olive b35746bfed Strip out atom tags
Standard tags are returned by Mochiweb as binaries. The atom tags are
for special-case parsing (such as PHP includes). Since that's not going
to be part of the article, simply exclude those while normalizing.

Fixes #30

See also:

Mochiweb parser: 9608d786ef/src/mochiweb_html.erl (L345)
2018-10-11 10:34:29 +09:00
keepcosmos 2ed20b6fe1 update deps and deprecated 2018-07-24 18:50:08 +09:00
Fernando Mendes ebc8c90e71 Convert relative img paths into absolute
Fixes #27
2018-06-30 11:14:17 +01:00
Chi Ngan Lee b2f8a3b4da Add Elixir 1.6 formatter config file and format the codebase 2018-02-09 11:42:08 +08:00
Adlan Razalan 9e43c454e7 Manually compare tag type for candidate
The match? method is no longer available starting Floki 0.15.0.
2017-11-03 23:40:18 +08:00
keepcosmos 1aa682a31a fix some bugs and update deps 2017-02-05 18:48:26 +09:00
Eason Goodale 6840a9d0d7 Fixes crash when html has an xml version tag by stripping it out 2016-08-13 22:11:01 -07:00
keepcosmos 93bdf48b8c add summarize function
this closes #4, closes #3
2016-05-07 18:28:39 +09:00
keepcosmos 23db20bbf0 add document 2016-04-24 18:40:35 +09:00
keepcosmos b131d7effa add candidate builder
add test
2016-04-23 12:31:03 +09:00
keepcosmos 4e4a712718 add filter algorithms 2016-04-17 15:28:33 +09:00