author David Yip <yipdw@member.fsf.org> 2018-02-10 02:32:39 -0600
committer David Yip <yipdw@member.fsf.org> 2018-02-10 10:36:16 -0600
commit 9105b0c95428e3bbecd6f8ad106508095eed5643
tree 11ff2569e754309924eece01021bc118b5d3c3c0
parent 53c86b29f05049d77d17a35a0ca6287174431783
Introduce html2text for extracting plaintext from statuses. #236.

Unlike strip_tags, html2text will preserve text present in other nodes,
e.g. anchor tags:

    [1] pry(main)> str = '<a href="http://www.example.com">A link</a>'
    => "<a href=\"http://www.example.com\">A link</a>"
    [2] pry(main)> Html2Text.convert(str)
    => "[A link](http://www.example.com)"
    [3] pry(main)> include ActionView::Helpers::SanitizeHelper
    => Object
    [4] pry(main)> strip_tags(str)
    => "A link"

Preserving the href of an anchor allows keyword mutes to also match on
URLs, which the frontend regex filter can already do.
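
To illustrate why keeping the href matters for keyword mutes, here is a minimal sketch. The `muted?` helper is hypothetical and is not part of this commit. The two input strings are the outputs shown in the pry session above, hard-coded so the example runs without the html2text gem or Rails loaded.

```ruby
# Hypothetical helper, not part of this commit: returns true when any
# muted keyword occurs in the plaintext, ignoring case.
def muted?(plaintext, keywords)
  haystack = plaintext.downcase
  keywords.any? { |keyword| haystack.include?(keyword.downcase) }
end

# Outputs copied from the pry session in the commit message.
html2text_output  = '[A link](http://www.example.com)'
strip_tags_output = 'A link'

# A mute on a domain only matches when the href survives conversion.
puts muted?(html2text_output, ['example.com'])  # prints "true"
puts muted?(strip_tags_output, ['example.com']) # prints "false"
```

A plain substring check stands in here for whatever matching the keyword mute code actually performs; the point is only that the URL has to be present in the extracted text to be matched at all.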
-rw-r--r-- Gemfile 1
-rw-r--r-- Gemfile.lock 3
2 files changed, 4 insertions, 0 deletions
diff --git a/Gemfile b/Gemfile
index 1d128d657..d2cd3b42d 100644
--- a/Gemfile
+++ b/Gemfile
@@ -42,6 +42,7 @@ gem 'fast_blank', '~> 1.0'
 gem 'goldfinger', '~> 2.1'
 gem 'hiredis', '~> 0.6'
 gem 'redis-namespace', '~> 1.5'
+gem 'html2text'
 gem 'htmlentities', '~> 4.3'
 gem 'http', '~> 3.0'
 gem 'http_accept_language', '~> 2.1'
diff --git a/Gemfile.lock b/Gemfile.lock
index 3a65f35a5..3400b1a0f 100644
--- a/Gemfile.lock
+++ b/Gemfile.lock
@@ -205,6 +205,8 @@ GEM
     highline (1.7.10)
     hiredis (0.6.1)
     hkdf (0.3.0)
+    html2text (0.2.1)
+      nokogiri (~> 1.6)
     htmlentities (4.3.4)
     http (3.0.0)
       addressable (~> 2.3)
@@ -601,6 +603,7 @@ DEPENDENCIES
   goldfinger (~> 2.1)
   hamlit-rails (~> 0.2)
   hiredis (~> 0.6)
+  html2text
   htmlentities (~> 4.3)
   http (~> 3.0)
   http_accept_language (~> 2.1)