author Teddy Wing 2021-03-14 17:24:18 +0100
committer Teddy Wing 2021-03-14 17:24:18 +0100
commit 62c083b5e3a164d596b49132c8c53248aa2daf42
tree 7d6ef1866c54e09a9a5b9716f779934cb6bd29b4
parent 7d46438c015e400ca6c035f5d99da040e6765740
Strip HTML tags from single-part HTML emails
When an HTML body is fed to 'whatlang', it is recognised as English, most likely because HTML tag and attribute names are English words. Remove all HTML tags with a simple regex substitution so that language recognition works on the message text itself.

This doesn't remove CSS, which could also confuse the language recogniser. In a limited test, recognition worked without removing any CSS, so that is left alone for now.

Still need to get this working for multipart emails.
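
As a rough illustration of the tag stripping described above, here is a minimal sketch. The diff shown here is limited to Cargo.toml, so the pattern the commit actually uses is not visible; this sketch assumes it behaves like replacing `<[^>]*>` with an empty string. To stay dependency-free it uses a hand-written scan from the standard library rather than the 'regex' crate, and the function name `strip_html_tags` is hypothetical.

```rust
/// Remove anything that looks like an HTML tag: a '<', then any run of
/// non-'>' characters, then a '>'. This is assumed to have the same effect
/// as substituting the pattern `<[^>]*>` with an empty string.
/// (Hypothetical helper; not taken from the commit.)
fn strip_html_tags(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;

    while let Some(open) = rest.find('<') {
        // Copy the text that precedes the candidate tag.
        out.push_str(&rest[..open]);

        match rest[open..].find('>') {
            // Skip past the closing '>' and keep scanning.
            Some(close) => rest = &rest[open + close + 1..],
            // No closing '>': not a tag, so keep the remainder verbatim.
            None => {
                out.push_str(&rest[open..]);
                return out;
            }
        }
    }

    // Whatever is left contains no '<' at all.
    out.push_str(rest);
    out
}

fn main() {
    let html = "<html><body><p>Bonjour tout le monde</p></body></html>";
    println!("{}", strip_html_tags(html));
}
```

Note that only the tags themselves are dropped. The contents of a `<style>` element would pass through unchanged, which matches the caveat about CSS in the message above.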
Diffstat (limited to 'Cargo.toml')
-rw-r--r-- Cargo.toml 1
1 file changed, 1 insertion, 0 deletions
diff --git a/Cargo.toml b/Cargo.toml
index ba30916..d64f53d 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -6,6 +6,7 @@ edition = "2018"
 [dependencies]
 exitcode = "1.1.2"
 mailparse = "0.13.2"
+regex = "1.4.4"
 thiserror = "1.0.24"
 whatlang = "0.11.1"
 xdg = "2.2.0"