| author | Teddy Wing | 2016-04-27 05:50:40 -0400 |
|---|---|---|
| committer | Teddy Wing | 2016-04-27 05:50:40 -0400 |
| commit | 78e7b4c607d8fc3b35f90ed88614093bda437195 (patch) | |
| tree | 97edb00d77cb39866f0b9ab4abf0b126a488bb03 /.gitignore | |
| parent | 2b99fd30c28fecc95d9425f229eab426080dbc85 (diff) | |
| download | mutt-alias-auto-add-78e7b4c607d8fc3b35f90ed88614093bda437195.tar.bz2 | |
Read aliases file as bytes and convert to string
Discovered that my Mutt aliases file uses the latin1 character encoding.
That caused a "stream did not contain valid UTF-8" error when trying to
read the file in the `Alias#find_in_file` function.
This error was ostensibly triggered by a `str::from_utf8` call in the
standard library
(https://github.com/rust-lang/rust/blob/2174bd97c1458d89a87eb2b614135d7ad68d6f18/src/libstd/io/mod.rs#L315-L338).
I ended up finding this Stack Overflow answer with an easy solution:
http://stackoverflow.com/questions/28169745/what-are-the-options-to-convert-iso-8859-1-latin-1-to-a-string-utf-8/28175593#28175593
    fn latin1_to_string(s: &[u8]) -> String {
        s.iter().map(|c| c as char).collect()
    }
Since the 256 latin1 code points map one-to-one onto the first 256
Unicode code points, we can just read the bytes from the file and cast
them to Rust chars (which are Unicode scalar values). Collecting those
chars into a `String` gives us UTF-8 text, an encoding that we can
actually work with in Rust.
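To illustrate why the cast is lossless, here is a minimal sketch. The
helper name `latin1_byte_to_char` is made up for this example and is not
part of the project. It relies only on the fact that a latin1 byte value
equals the Unicode code point of the same character:

```rust
/// Hypothetical helper: map one latin1 byte to the Unicode scalar value
/// that has the same number.
fn latin1_byte_to_char(b: u8) -> char {
    // `u8 as char` is always valid because every value 0..=255 is a
    // Unicode scalar value.
    b as char
}

fn main() {
    // 0xE9 is "é" in latin1 and U+00E9 in Unicode.
    let c = latin1_byte_to_char(0xE9);

    // A `String` stores its contents as UTF-8, so the single latin1 byte
    // becomes the two-byte sequence 0xC3 0xA9.
    let s: String = std::iter::once(c).collect();
    println!("{} -> {:?}", c, s.as_bytes());
}
```

The same byte that made the original file invalid as UTF-8 (a lone 0xE9)
ends up as a well-formed two-byte sequence once it has gone through
`char`.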
At first I got frustrated because the suggestion didn't compile for me.
It was suggested in January 2015, before Rust 1.0, so perhaps that
factors into the error I was getting. Here it is:
    src/alias.rs:59:41: 59:45 error: mismatched types:
     expected `&[u8]`,
        found `core::result::Result<collections::string::String, std::io::error::Error>`
    (expected &-ptr,
        found enum `core::result::Result`) [E0308]
    src/alias.rs:59             let line = latin1_to_string(line);
                                                            ^~~~
    src/alias.rs:59:41: 59:45 help: run `rustc --explain E0308` to see a detailed explanation
    src/alias.rs:99:22: 99:31 error: only `u8` can be cast as `char`, not `&u8`
    src/alias.rs:99     s.iter().map(|c| c as char).collect()
                                         ^~~~~~~~~
    error: aborting due to 2 previous errors
A recommendation from 'niconii' in Mozilla#rust-beginners was to use the
Encoding library in order to do the conversion
(https://github.com/lifthrasiir/rust-encoding). That certainly seems
more robust and would be a good idea to try if this change doesn't work
out in the long term. But the Stack Overflow answer just seemed so short
and sweet that I really didn't like the idea of adding a dependency if I
could get what I wanted with 3 lines of code.
Finally took another look and reworked the suggested code to take a
vector (which is what `BufReader#split` gives us) and clone the u8
characters to clear the compiler error of not being able to cast an &u8.
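The rework described above might look roughly like the sketch below.
This is not the commit's actual code (the page only shows the
`.gitignore` diffstat). The sample alias line and the in-memory `Cursor`
standing in for the aliases file are assumptions made for the example:

```rust
use std::io::{BufRead, BufReader, Cursor};

/// Sketch of the reworked conversion: take an owned byte vector and copy
/// each `u8` out of its reference before casting it to `char`.
fn latin1_to_string(bytes: Vec<u8>) -> String {
    bytes.iter().cloned().map(|b| b as char).collect()
}

fn main() {
    // Stand-in for the aliases file: one latin1 line where 0xE9 is "é".
    let data: Vec<u8> = b"alias zoe Zo\xE9 <zoe@example.com>\n".to_vec();
    let reader = BufReader::new(Cursor::new(data));

    // `split` yields `io::Result<Vec<u8>>`, so each item must be
    // unwrapped before it can be passed to the conversion function.
    for line in reader.split(b'\n') {
        let line = line.expect("failed to read line");
        println!("{}", latin1_to_string(line));
    }
}
```

Reading raw bytes with `split` avoids the UTF-8 validation that string
reads such as `lines` perform, and `cloned()` turns the `&u8` items from
`iter()` into plain `u8` values that can be cast to `char`. Unwrapping
the `Result` addresses the first compiler error, and `cloned()` the
second.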
Diffstat (limited to '.gitignore')
0 files changed, 0 insertions, 0 deletions
