Age | Commit message | Author
2021-06-12 | main: Remove commented multithreading test code | Teddy Wing
Remove my old tests now that we have a multi-threading setup that actually works.
2021-06-12 | Process repositories on multiple threads | Teddy Wing
Use 'rayon' to parallelise the repository processing. Each repository is processed in a thread in the default 'rayon' pool. In order to get thread-safe access to the database, I followed some advice from a Stack Overflow answer by VasiliNovikov (https://stackoverflow.com/users/1091436/vasilinovikov): https://stackoverflow.com/questions/62560396/how-to-use-sqlite-via-rusqlite-from-multiple-threads/62560397#62560397 VasiliNovikov recommended creating a database connection pool using 'r2d2_sqlite'. This way we don't have to share a database connection between threads, but each thread can have its own connection. This also means we can remove mutable requirements in a bunch of places involving our `database::Db` type since we're no longer managing the database connections directly.
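The pattern described above, where each worker thread checks out its own connection instead of sharing one, can be sketched roughly as follows. This is not the project's code: to stay dependency-free it swaps 'rayon' for `std::thread::scope` and 'r2d2_sqlite' for a toy `Pool`, and `Conn`, `Pool`, and `process_all` are hypothetical names invented for the illustration.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a database connection (not `rusqlite::Connection`).
struct Conn {
    id: usize,
}

/// Toy pool standing in for an `r2d2`-style connection pool (assumption).
struct Pool {
    idle: Mutex<Vec<Conn>>,
}

impl Pool {
    fn new(size: usize) -> Self {
        let idle = (0..size).map(|id| Conn { id }).collect();
        Pool { idle: Mutex::new(idle) }
    }

    /// Check a connection out of the pool; `None` if all are in use.
    fn get(&self) -> Option<Conn> {
        self.idle.lock().unwrap().pop()
    }

    /// Return a connection to the pool.
    fn put(&self, conn: Conn) {
        self.idle.lock().unwrap().push(conn);
    }

    fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

/// Process each repository on its own thread. Only `&Pool` is shared;
/// every thread works with a connection that it alone holds.
fn process_all(repos: &[&str], pool: &Pool) -> Vec<String> {
    thread::scope(|s| {
        let handles: Vec<_> = repos
            .iter()
            .map(|repo| {
                s.spawn(move || {
                    let conn = pool.get().expect("pool exhausted");
                    let line = format!("{repo} processed on connection {}", conn.id);
                    pool.put(conn);
                    line
                })
            })
            .collect();
        // Join in spawn order so results line up with `repos`.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because the threads only ever see a shared `&Pool`, nothing in `process_all` needs `&mut` access, which mirrors the point about dropping mutable requirements once connections are no longer managed directly.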
2021-06-12 | Switch from 'reqwest' to 'ureq'; Remove async | Teddy Wing
Remove all async from the project by switching from 'reqwest' to 'ureq'. This should make the code simpler, and hopefully enable us to try out multithreading.
2021-06-12 | Db::connect(): Fix database open call | Teddy Wing
Turns out I need to specify all the flags I want in the open call, including the one to open the database for reading and writing. This fixes the "Error code 21: Library used incorrectly" error I was getting earlier.
2021-06-12 | run(): Add context to database errors | Teddy Wing
To allow us to work out where the error is coming from.
2021-06-12 | main: try! error from `process_repo` | Teddy Wing
2021-06-12 | main: Remove async database calls | Teddy Wing
Remove all the async database calls and Tokio spawning. Still haven't worked out the error code 21 database error from earlier, but this will hopefully allow us to use normal threads directly.
2021-06-11 | Replace 'sqlx' with 'rusqlite' | Teddy Wing
Trying to get rid of async. This compiles, but fails with the following runtime error: "Error code 21: Library used incorrectly". Need to investigate further.
2021-06-11 | Try moving things around for multi-threading | Teddy Wing
Still isn't multi-threaded. Not sure what I'm doing wrong.
2021-06-07 | Add usage example in README | Teddy Wing
2021-06-07 | Update TODO | Teddy Wing
2021-06-07 | Add license (GNU GPLv3+) | Teddy Wing
2021-06-07 | Delete src/repo.rs | Teddy Wing
Looks like I didn't delete the file in 67d7632b900f7221c1a3fb1927cd97b7cb60c71e.
2021-06-07 | main: Limit to 5 repos for thread debugging | Teddy Wing
2021-06-07 | main: Not multi-threaded | Teddy Wing
Looks like the work doesn't happen on multiple threads. All of the tasks printed the same thread ID. Need to do some more work to get this working properly, it seems.
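The diagnostic mentioned above, comparing the thread ID each task reports, can be sketched like this. It is a minimal illustration with plain `std::thread` rather than the Tokio tasks used at this point in the history, and `distinct_thread_ids` is a hypothetical helper name.

```rust
use std::collections::HashSet;
use std::thread::{self, ThreadId};

/// Run `n` jobs on freshly spawned OS threads and return the set of
/// distinct thread IDs the jobs observed.
fn distinct_thread_ids(n: usize) -> HashSet<ThreadId> {
    // Spawn every job first, then join, so the jobs can overlap.
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| thread::current().id()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

If the returned set only ever contains one ID for many jobs, the work is all landing on a single thread, which is the symptom this commit describes.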
2021-06-07 | main: Collect errors from spawned tasks | Teddy Wing
Collect all errors into a list. I think I'm going to return them as a list from this function. The runtime appears a lot slower with this change. Need to figure out what that's about.
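A minimal sketch of the "collect every error and return the list" idea. The names `process_repo` and `run` echo the commit messages, but the bodies here are hypothetical, synchronous stand-ins, and using `String` as the error type is an assumption made for brevity.

```rust
/// Hypothetical per-repository job: fails for names starting with "bad".
fn process_repo(name: &str) -> Result<(), String> {
    if name.starts_with("bad") {
        Err(format!("{name}: mirror failed"))
    } else {
        Ok(())
    }
}

/// Run every job and return all errors, rather than stopping at the first.
fn run(names: &[&str]) -> Vec<String> {
    names
        .iter()
        .filter_map(|name| process_repo(name).err())
        .collect()
}
```

Returning the list leaves the decision about printing to the caller, so `run()` itself does not have to write to the terminal.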
2021-06-07 | Switch `futures::executor` to Tokio runtime | Teddy Wing
Use the Tokio runtime we created to run the blocking async tasks. Trying to set this up so I can get results back from the spawned tasks, but I'm currently having trouble working out how to extract them from the async task and return them from `run()`. I suppose I could just print out the errors directly in that `while let` loop, but ideally I'd like to return all errors from `run()` rather than printing in `run()`.
2021-06-06 | Split database mutex lock and create calls onto multiple lines | Teddy Wing
To separate the actions more.
2021-06-06 | main: Add a comment about the repo size flag parse error handling | Teddy Wing
2021-06-06 | main::run(): Get repositories from GitHub API call | Teddy Wing
Remove the hard-coded test repositories I was using and replace them with real ones retrieved from the GitHub API. Enable I/O and timers on the Tokio runtime in order to enable the async GitHub API request.
2021-06-06 | main: Remove `unwrap` when parsing `--skip-larger-than` | Teddy Wing
Don't panic here so we can use our own error message template.
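Roughly, replacing an `unwrap` with a custom error message when parsing the flag value could look like the sketch below. It assumes the value is a plain unsigned integer; the real flag's accepted format and the project's error template are not shown in this log, so `parse_max_size` and its message wording are hypothetical.

```rust
/// Parse the value given to `--skip-larger-than`, returning a readable
/// message instead of panicking when the value is not a valid number.
fn parse_max_size(value: &str) -> Result<u64, String> {
    value
        .parse::<u64>()
        .map_err(|e| format!("error: invalid value '{value}' for --skip-larger-than: {e}"))
}
```

The caller can then print the `Err` string with whatever prefix the program uses for its other errors.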
2021-06-06 | main(): Remove `unwrap` | Teddy Wing
Print the error instead of unwrapping.
2021-06-06 | main: Add function documentation | Teddy Wing
2021-06-06 | github::Repo: Remove TODO | Teddy Wing
Don't see any reason to do this now.
2021-06-06 | database: Add documentation headers | Teddy Wing
2021-06-06 | Update TODO | Teddy Wing
2021-06-06 | Provide an option to skip repos larger than a given size | Teddy Wing
Allows a maximum repo size to be given as a command line argument. Repos larger than this will not be mirrored. This gives us a way to save server space by avoiding gigantic repositories.
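The size check described above amounts to a filter over the repository list. A minimal sketch, assuming a hypothetical `Repo` struct whose `size` uses the same unit as the command line value (the unit itself is not stated in this log):

```rust
/// Hypothetical subset of the repository metadata.
struct Repo {
    name: String,
    size: u64,
}

/// Keep only the repositories that should be mirrored. With no limit
/// given, everything is kept; otherwise repos larger than `max_size`
/// are dropped.
fn repos_to_mirror(repos: Vec<Repo>, max_size: Option<u64>) -> Vec<Repo> {
    repos
        .into_iter()
        .filter(|repo| match max_size {
            Some(max) => repo.size <= max,
            None => true,
        })
        .collect()
}
```

Modelling the limit as an `Option` keeps the "flag not given" case explicit instead of relying on a sentinel value.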
2021-06-06 | Remove old in-progress threading code | Teddy Wing
Remove this now that we have something that I think works.
2021-06-06 | Explicitly use tokio's multi-threaded runtime | Teddy Wing
Rather than relying on the Cargo features we've enabled to define this, create a multi-threaded runtime in code.
2021-06-06 | Make repo mirroring multi-threaded | Teddy Wing
I think, at least. Took a lot of research and trial and error to get this to compile, working out how to set up the multi-threading for async code. The idea here is to be able to process each repo in potentially multiple threads and do that processing work in parallel.
2021-06-06 | Update TODO | Teddy Wing
2021-06-06 | Update TODO | Teddy Wing
2021-06-05 | run(): Move comment | Teddy Wing
2021-06-05 | run(): Remove `unwrap`s | Teddy Wing
2021-06-05 | main(): Remove commented test code | Teddy Wing
This is no longer relevant.
2021-06-05 | Add commented GitHub fetch call with command line username argument | Teddy Wing
Add command line argument value here in preparation for when we enable this code.
2021-06-05 | Replace hard-coded values with command line option values | Teddy Wing
2021-06-05 | Use database path from command line argument | Teddy Wing
2021-06-05 | Move command line option parsing code to `run()` | Teddy Wing
2021-06-05 | Add command line option parsing | Teddy Wing
Define the options we want to take. Not using them yet.
2021-06-03 | main::update_mtime(): Use the packed-refs file if no default branch ref | Teddy Wing
A repository cloned with `git clone --mirror REPO` doesn't have any ref files in `repo.git/refs/heads/*`. Instead, the refs are stored in `repo.git/packed-refs`. Update the mtime of the 'packed-refs' file if the default branch ref file doesn't exist. CGit will look at the time on the 'packed-refs' file when that's the case.
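The fallback described above can be sketched as a small path-selection helper. `mtime_target` is a hypothetical name, not the project's `update_mtime()`, and the sketch only picks the file; it leaves the actual timestamp change out.

```rust
use std::path::{Path, PathBuf};

/// Choose the file whose mtime should be updated: the default branch's
/// ref file when it exists, otherwise `packed-refs`.
fn mtime_target(repo_dir: &Path, default_branch: &str) -> PathBuf {
    let branch_ref = repo_dir.join("refs").join("heads").join(default_branch);
    if branch_ref.is_file() {
        branch_ref
    } else {
        repo_dir.join("packed-refs")
    }
}
```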
2021-06-03 | main(): Use a smaller forked repository for testing | Teddy Wing
The Angular.js repo was 51 MB, while DDHotKey is 95 K.
2021-05-30 | main::update_mtime(): Add function documentation | Teddy Wing
2021-05-30 | Update TODO | Teddy Wing
2021-05-30 | Set repository mtime to GitHub `updated_at` time | Teddy Wing
CGit reads the repository modification time from the following locations, in order from top to bottom:
1. agefile
2. repo.git/refs/heads/{default_branch | "master"}
3. repo.git/packed-refs
(https://git.zx2c4.com/cgit/tree/ui-repolist.c?id=bd6f5683f6cde4212364354b3139c1d521f40f39#n35)
Update the `/refs/heads/{default_branch}` file mtime when cloning and updating the repo to match the GitHub `updated_at` time. This ensures that when mirroring old repositories, they don't appear at the top of the CGit repository index list when sorting by age.
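Setting a file's mtime to a given time can be done with the standard library alone, as in the sketch below. It assumes the GitHub `updated_at` value has already been converted to Unix seconds (that parsing is not shown), that `File::set_modified` is available (Rust 1.75 or later), and `set_mtime` is a hypothetical helper rather than the project's `update_mtime()`.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Set the modification time of `path` to `unix_secs` seconds after the
/// Unix epoch.
fn set_mtime(path: &Path, unix_secs: u64) -> io::Result<()> {
    let time: SystemTime = UNIX_EPOCH + Duration::from_secs(unix_secs);
    // The file is opened for writing so its timestamps can be changed.
    let file = File::options().write(true).open(path)?;
    file.set_modified(time)
}
```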
2021-05-30 | github::fetch_repos(): Request repos be sorted by updated time | Teddy Wing
2021-05-30 | github::fetch_repos(): Extract username to function argument | Teddy Wing
2021-05-30 | github::fetch_repos(): Add documentation | Teddy Wing
2021-05-30 | Add TODO | Teddy Wing
2021-05-30 | github::fetch_repos(): Fetch all repos from all pages | Teddy Wing
Also switch from `reqwest::blocking` to async because I was getting this error, probably because I call `fetch_repos()` in the async 'tokio' function `main()`:

thread 'main' panicked at 'Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.', $HOME/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/blocking/shutdown.rs:51:21
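Fetching every page of a paginated listing can be sketched as a loop over page numbers. This is an illustration only: `fetch_all` and its `fetch_page` callback are hypothetical, the HTTP request is abstracted away entirely, and stopping at the first empty page is an assumption about how the end of the listing is detected.

```rust
/// Collect items from every page. `fetch_page(page)` is called with page
/// numbers starting at 1 until it returns an empty page; the first error
/// aborts the loop and is returned.
fn fetch_all<T, E>(
    mut fetch_page: impl FnMut(u32) -> Result<Vec<T>, E>,
) -> Result<Vec<T>, E> {
    let mut all = Vec::new();
    let mut page = 1;
    loop {
        let items = fetch_page(page)?;
        if items.is_empty() {
            return Ok(all);
        }
        all.extend(items);
        page += 1;
    }
}
```

Taking the page fetcher as a closure keeps the loop independent of whichever HTTP client is in use.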