author | Teddy Wing | 2021-03-14 17:24:18 +0100 |
---|---|---|
committer | Teddy Wing | 2021-03-14 17:24:18 +0100 |
commit | 62c083b5e3a164d596b49132c8c53248aa2daf42 | |
tree | 7d6ef1866c54e09a9a5b9716f779934cb6bd29b4 /Cargo.toml | |
parent | 7d46438c015e400ca6c035f5d99da040e6765740 | |
Strip HTML tags from single-part HTML emails

When an HTML body is fed to 'whatlang', it recognises it as English.
This is likely due to the English-based HTML syntax. Remove all HTML
tags with a simple regex substitution so that language recognition runs
on the text content instead of the markup.

This doesn't remove CSS, which could also confuse the language
recogniser. In a limited test, recognition seemed to work without
removing any CSS, so CSS is left in place for now.

This still needs to be made to work for multipart emails.
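
For illustration only, here is a minimal sketch of the tag-stripping idea. This view of the commit is limited to Cargo.toml, so the actual pattern and call site are not shown. The sketch therefore makes assumptions: `strip_html_tags` is a hypothetical name, and a small standard-library scanner stands in for the regex substitution (something like replacing matches of `<[^>]*>` with an empty string) so that it runs without any dependencies.

```rust
/// Hypothetical helper, not taken from the commit: drops every character
/// from a '<' up to and including the next '>'.
///
/// This approximates substituting a pattern such as `<[^>]*>` with "".
/// One difference: an unterminated '<' swallows the rest of the input here,
/// whereas the regex would leave it untouched.
fn strip_html_tags(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_tag = false;

    for c in input.chars() {
        match c {
            // A '<' opens a tag; nothing is copied until it closes.
            '<' => in_tag = true,
            // A '>' closes the tag only if one is open.
            '>' if in_tag => in_tag = false,
            // Everything outside a tag is kept as-is.
            _ if !in_tag => out.push(c),
            // Characters inside a tag are discarded.
            _ => {}
        }
    }

    out
}

fn main() {
    let html = "<html><body><p>Bonjour tout le monde</p></body></html>";
    // Prints: Bonjour tout le monde
    println!("{}", strip_html_tags(html));

    // The contents of a <style> element survive, matching the CSS caveat
    // in the commit message.
    let with_css = "<style>p { color: red; }</style><p>Hola</p>";
    // Prints: p { color: red; }Hola
    println!("{}", strip_html_tags(with_css));
}
```

The stripped string would then be what is passed to the language detector in place of the raw HTML body.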
Diffstat (limited to 'Cargo.toml')
-rw-r--r-- | Cargo.toml | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -6,6 +6,7 @@ edition = "2018"
 [dependencies]
 exitcode = "1.1.2"
 mailparse = "0.13.2"
+regex = "1.4.4"
 thiserror = "1.0.24"
 whatlang = "0.11.1"
 xdg = "2.2.0"