path: root/src/lib.rs
Age  Commit message  Author
2019-11-02  Add license (GNU GPLv3+)  Teddy Wing
2019-11-02  get_urls_from_pdf: Extract link annotation check to a function  Teddy Wing
Give this condition a more descriptive name.
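As a rough illustration of what extracting such a check can look like, the sketch below moves the condition into a named predicate. The `Dict` type is a hypothetical stand-in for a PDF dictionary (the log does not show the real types), and the check assumes only the PDF convention that link annotations carry a `Subtype` of `Link`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a PDF dictionary: name keys mapped to name values.
/// The real code presumably works with a PDF library's own dictionary type.
type Dict<'a> = HashMap<&'a str, &'a str>;

/// Returns true when the dictionary describes a link annotation,
/// i.e. its `Subtype` entry is `Link`.
fn is_link_annotation(dict: &Dict) -> bool {
    dict.get("Subtype").map_or(false, |subtype| *subtype == "Link")
}
```

A call site can then read `if is_link_annotation(&dict) { ... }` instead of repeating the inline comparison.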
2019-11-02  get_urls_from_pdf: Add a short doc string  Teddy Wing
2019-11-02  get_urls_from_pdf: Allow out of order URLs in test  Teddy Wing
For now I'm going to allow URLs to be printed out of their apparent visual order. Change the test so that it passes.
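One way to make a test tolerate reordering is to compare sorted copies of both lists. This is only a sketch: the helper name `same_urls_ignoring_order` is made up for illustration, and the actual test may have been changed differently (for example by reordering the expected values).

```rust
/// Compares two URL lists while ignoring the order of their elements.
/// Both inputs are copied and sorted, so the callers' slices are untouched.
fn same_urls_ignoring_order(left: &[&str], right: &[&str]) -> bool {
    let mut left = left.to_vec();
    let mut right = right.to_vec();
    left.sort_unstable();
    right.sort_unstable();
    left == right
}
```

Because the lists are sorted rather than turned into sets, a duplicated URL on one side still counts as a mismatch.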
2019-11-02  get_urls_from_pdf: Remove duplicate URLs  Teddy Wing
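A minimal sketch of duplicate removal, assuming the URLs are collected into a `Vec<String>`. It keeps the first occurrence of each URL; the commit itself may use a different approach (such as sorting followed by `dedup`), and the function name here is hypothetical.

```rust
use std::collections::HashSet;

/// Drops repeated URLs, keeping the first occurrence of each one.
/// `HashSet::insert` returns false for a value that is already present,
/// so the filter only lets a URL through the first time it is seen.
fn dedup_preserving_order(urls: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    urls.into_iter()
        .filter(|url| seen.insert(url.clone()))
        .collect()
}
```

This handles the case where a hyperlink wrapped over two lines is stored as two link areas pointing at the same target.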
2019-11-02  get_urls_from_pdf: Test extracted URLs  Teddy Wing
Add a test with a simple text-only PDF with three URLs. Currently I'm getting the following failure, so evidently the extracted order is not necessarily the same as the visible order, and multi-line hyperlinks can be encoded as two link areas:

    ---- tests::get_urls_from_pdf_extracts_urls_from_pdf stdout ----
    thread 'tests::get_urls_from_pdf_extracts_urls_from_pdf' panicked at 'assertion failed: `(left == right)`
      left: `["http://www.gutenberg.org/ebooks/11", "https://ia800908.us.archive.org/6/items/alicesadventures19033gut/19033-h/images/i002.jpg", "https://science.nasa.gov/news-article/black-hole-image-makes-history"]`,
     right: `["http://www.gutenberg.org/ebooks/11", "https://science.nasa.gov/news-article/black-hole-image-makes-history", "https://ia800908.us.archive.org/6/items/alicesadventures19033gut/19033-h/images/i002.jpg", "https://ia800908.us.archive.org/6/items/alicesadventures19033gut/19033-h/images/i002.jpg"]`', src/lib.rs:65:9
2019-11-02  get_urls_from_pdf: Return a `Vec<String>` instead of printing  Teddy Wing
Facilitate testing by returning a vec of URLs instead of printing them directly to STDOUT.
2019-11-02  get_urls_from_pdf: Remove `return`s to fix URL output  Teddy Wing
Turns out when I removed the `unwrap`s in 92f8f57b76b32c3d3e52d4b61dcdf25969f47ab7, the `return`s I added to the `match` expressions caused the loops to exit early without iterating over all the objects in the PDF. Remove the `return`s and fix up the expression return types to get URLs printing again.
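The pitfall described here can be reproduced in a few lines. In the sketch below, `parse_url` is a hypothetical stand-in for the fallible per-object lookups; the first function shows the early-exit shape, and the second shows one way to skip a failed object and keep iterating. It is not the author's actual code.

```rust
/// Hypothetical stand-in for a fallible lookup on one PDF object.
fn parse_url(raw: &str) -> Result<String, String> {
    if raw.starts_with("http") {
        Ok(raw.to_owned())
    } else {
        Err(format!("not a URL: {}", raw))
    }
}

/// Problematic shape: `return` in the error arm leaves the whole function
/// at the first object without a URL, so later objects are never visited.
fn collect_with_early_return(objects: &[&str]) -> Vec<String> {
    let mut urls = Vec::new();
    for raw in objects {
        match parse_url(raw) {
            Ok(url) => urls.push(url),
            Err(_) => return urls,
        }
    }
    urls
}

/// Fixed shape: `continue` skips only the current object.
fn collect_all(objects: &[&str]) -> Vec<String> {
    let mut urls = Vec::new();
    for raw in objects {
        match parse_url(raw) {
            Ok(url) => urls.push(url),
            Err(_) => continue,
        }
    }
    urls
}
```

Given `["http://a.example", "not-a-link", "http://b.example"]`, the first function stops after one URL while the second returns both.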
2019-11-02  get_urls_from_pdf: Remove `unwrap`s and replace with an error type  Teddy Wing
Create a custom error type to use instead of the `unwrap`s.
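A custom error type that replaces `unwrap` calls often looks something like the sketch below. The enum name `ExtractError`, its variants, and both helper functions are assumptions made for illustration; the log does not show which failure cases the real type covers.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::Path;
use std::str::{self, Utf8Error};

/// Hypothetical error type gathering the failures that `unwrap` used to hide.
#[derive(Debug)]
enum ExtractError {
    Io(io::Error),
    Utf8(Utf8Error),
}

impl fmt::Display for ExtractError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExtractError::Io(err) => write!(f, "could not read file: {}", err),
            ExtractError::Utf8(err) => write!(f, "invalid UTF-8 in URL: {}", err),
        }
    }
}

impl std::error::Error for ExtractError {}

// The `From` impls let `?` convert the underlying errors automatically.
impl From<io::Error> for ExtractError {
    fn from(err: io::Error) -> Self {
        ExtractError::Io(err)
    }
}

impl From<Utf8Error> for ExtractError {
    fn from(err: Utf8Error) -> Self {
        ExtractError::Utf8(err)
    }
}

/// Reads a file, reporting I/O failures through `ExtractError`.
fn read_bytes(path: &Path) -> Result<Vec<u8>, ExtractError> {
    Ok(fs::read(path)?)
}

/// Decodes raw bytes as a URL string instead of calling `unwrap`.
fn bytes_to_url(bytes: &[u8]) -> Result<String, ExtractError> {
    let text = str::from_utf8(bytes)?;
    Ok(text.to_owned())
}
```

With the `From` conversions in place, each former `unwrap` can become a `?`, and callers decide how to report the failure.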
2019-11-01  lib: Use `std::str`  Teddy Wing
Get rid of `::str`-prefixed calls.
2019-11-01  get_urls_from_pdf: Change argument type to `AsRef<Path>`  Teddy Wing
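Taking `AsRef<Path>` lets callers pass a `&str`, `String`, `&Path`, or `PathBuf` without converting first. The sketch below shows the pattern on a small hypothetical helper, `has_pdf_extension`, rather than on `get_urls_from_pdf` itself, whose body is not shown in this log.

```rust
use std::path::Path;

/// Accepts anything that can be viewed as a `Path` and reports whether
/// its extension is exactly `pdf` (the comparison is case-sensitive).
fn has_pdf_extension<P: AsRef<Path>>(path: P) -> bool {
    path.as_ref()
        .extension()
        .map_or(false, |ext| ext == "pdf")
}
```

Calls such as `has_pdf_extension("a.pdf")` and `has_pdf_extension(PathBuf::from("a.pdf"))` both compile against the same signature.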
2019-11-01  get_urls_from_pdf: Take PDF path as an argument  Teddy Wing
2019-11-01  main: Move URL extraction code into lib.rs  Teddy Wing