reflectub - Mirror a user’s GitHub repositories

Age	Commit message	Author
2021-06-13	run(): Return multiple errors	Teddy Wing
	Return all errors from repo processing. This allows us to provide information on all errors that happened while processing, but continue processing all the repos even if there's an error in one of them. A new `MultiError` type wraps a list of errors to do this.
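A minimal sketch of the collect-then-return pattern this commit describes. Only the `MultiError` name comes from the commit message; its field layout, the `Display` format, and the `process_all` helper are assumptions made for illustration and are not taken from the project.

```rust
use std::error::Error;
use std::fmt;

/// Wraps every error collected while processing a batch of items.
/// The single `errors` field is an assumed layout.
#[derive(Debug)]
pub struct MultiError {
    errors: Vec<Box<dyn Error + Send + Sync>>,
}

impl fmt::Display for MultiError {
    /// Prints one wrapped error per line.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, e) in self.errors.iter().enumerate() {
            if i > 0 {
                writeln!(f)?;
            }
            write!(f, "{}", e)?;
        }
        Ok(())
    }
}

impl Error for MultiError {}

/// Hypothetical helper: runs `process` on every item, keeps going after a
/// failure, and returns all failures together at the end.
pub fn process_all<T, F>(items: &[T], mut process: F) -> Result<(), MultiError>
where
    F: FnMut(&T) -> Result<(), Box<dyn Error + Send + Sync>>,
{
    let errors: Vec<_> = items
        .iter()
        .filter_map(|item| process(item).err())
        .collect();

    if errors.is_empty() {
        Ok(())
    } else {
        Err(MultiError { errors })
    }
}
```

The caller sees a single `Result`, so `main()` can still print every failure after all repositories have been attempted.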
2021-06-13	main: Remove unused `r2d2_sqlite::SqliteConnectionManager` import	Teddy Wing
	Not sure when or why I added this.
2021-06-12	main: Remove commented multithreading test code	Teddy Wing
	Remove my old tests now that we have a multi-threading setup that actually works.
2021-06-12	Process repositories on multiple threads	Teddy Wing
	Use 'rayon' to parallelise the repository processing. Each repository is processed in a thread in the default 'rayon' pool. In order to get thread-safe access to the database, I followed some advice from a Stack Overflow answer by VasiliNovikov (https://stackoverflow.com/users/1091436/vasilinovikov): https://stackoverflow.com/questions/62560396/how-to-use-sqlite-via-rusqlite-from-multiple-threads/62560397#62560397 VasiliNovikov recommended creating a database connection pool using 'r2d2_sqlite'. This way we don't have to share a database connection between threads, but each thread can have its own connection. This also means we can remove mutable requirements in a bunch of places involving our `database::Db` type since we're no longer managing the database connections directly.
2021-06-12	Switch from 'reqwest' to 'ureq'; Remove async	Teddy Wing
	Remove all async from the project by switching from 'reqwest' to 'ureq'. This should make the code simpler, and hopefully enable us to try out multithreading.
2021-06-12	run(): Add context to database errors	Teddy Wing
	To allow us to work out where the error is coming from.
2021-06-12	main: try! error from `process_repo`	Teddy Wing

2021-06-12	main: Remove async database calls	Teddy Wing
	Remove all the async database calls and Tokio spawning. Still haven't worked out the error code 21 database error from earlier, but this will hopefully allow us to use normal threads directly.
2021-06-11	Replace 'sqlx' with 'rusqlite'	Teddy Wing
	Trying to get rid of async. This compiles, but fails with the following runtime error:
	Error code 21: Library used incorrectly
	Need to investigate further.
2021-06-11	Try moving things around for multi-threading	Teddy Wing
	Still isn't multi-threaded. Not sure what I'm doing wrong.
2021-06-07	Add license (GNU GPLv3+)	Teddy Wing

2021-06-07	main: Limit to 5 repos for thread debugging	Teddy Wing

2021-06-07	main: Not multi-threaded	Teddy Wing
	Looks like the work doesn't happen on multiple threads. All of the tasks printed the same thread ID. Need to do some more work to get this working properly, it seems.
2021-06-07	main: Collect errors from spawned tasks	Teddy Wing
	Collect all errors into a list. I think I'm going to return them as a list from this function. The runtime appears a lot slower with this change. Need to figure out what that's about.
2021-06-07	Switch `futures::executor` to Tokio runtime	Teddy Wing
	Use the Tokio runtime we created to run the blocking async tasks. Trying to set this up so I can get results back from the spawned tasks, but I'm currently having trouble working out how to extract them from the async task and return them from `run()`. I suppose I could just print out the errors directly in that `while let` loop, but ideally I'd like to return all errors from `run()` rather than printing in `run()`.
2021-06-06	Split database mutex lock and create calls onto multiple lines	Teddy Wing
	To separate the actions more.
2021-06-06	main: Add a comment about the repo size flag parse error handling	Teddy Wing

2021-06-06	main::run(): Get repositories from GitHub API call	Teddy Wing
	Remove the hard-coded test repositories I was using and replace them with real ones retrieved from the GitHub API. Enable I/O and timers on the Tokio runtime in order to enable the async GitHub API request.
2021-06-06	main: Remove `unwrap` when parsing `--skip-larger-than`	Teddy Wing
	Don't panic here so we can use our own error message template.
2021-06-06	main(): Remove `unwrap`	Teddy Wing
	Print the error instead of unwrapping.
2021-06-06	main: Add function documentation	Teddy Wing

2021-06-06	Provide an option to skip repos larger than a given size	Teddy Wing
	Allows a maximum repo size to be given as a command line argument. Repos larger than this will not be mirrored. This gives us a way to save server space by avoiding gigantic repositories.
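The size check could look roughly like the sketch below. The `Repo` struct, its `size` field, and the `should_skip` function are hypothetical names; the sketch also assumes the limit and the repository size are expressed in the same unit.

```rust
/// Hypothetical subset of the repository data returned by the GitHub API.
pub struct Repo {
    pub name: String,
    /// Repository size, assumed to be in the same unit as the limit.
    pub size: u64,
}

/// Returns true when a limit was given and the repository exceeds it.
/// With no limit (`None`), nothing is skipped.
pub fn should_skip(repo: &Repo, max_size: Option<u64>) -> bool {
    max_size.map_or(false, |max| repo.size > max)
}
```

Modelling the limit as an `Option` keeps the "no flag given" case explicit rather than relying on a sentinel value.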
2021-06-06	Remove old in-progress threading code	Teddy Wing
	Remove this now that we have something that I think works.
2021-06-06	Explicitly use tokio's multi-threaded runtime	Teddy Wing
	Rather than relying on the Cargo features we've enabled to define this, create a multi-threaded runtime in code.
2021-06-06	Make repo mirroring multi-threaded	Teddy Wing
	I think, at least. Took a lot of research and trial and error to get this to compile, working out how to set up the multi-threading for async code. The idea here is to be able to process each repo in potentially multiple threads and do that processing work in parallel.
2021-06-05	run(): Move comment	Teddy Wing

2021-06-05	run(): Remove `unwrap`s	Teddy Wing

2021-06-05	main(): Remove commented test code	Teddy Wing
	This is no longer relevant.
2021-06-05	Add commented GitHub fetch call with command line username argument	Teddy Wing
	Add command line argument value here in preparation for when we enable this code.
2021-06-05	Replace hard-coded values with command line option values	Teddy Wing

2021-06-05	Use database path from command line argument	Teddy Wing

2021-06-05	Move command line option parsing code to `run()`	Teddy Wing

2021-06-05	Add command line option parsing	Teddy Wing
	Define the options we want to take. Not using them yet.
2021-06-03	main::update_mtime(): Use the packed-refs file if no default branch ref	Teddy Wing
	A repository cloned with:
	$ git clone --mirror REPO
	doesn't have any ref files in `repo.git/refs/heads/*`. Instead, the refs are stored in `repo.git/packed-refs`. Update the mtime of the 'packed-refs' file if the default branch ref file doesn't exist. CGit will look at the time on the 'packed-refs' file when that's the case.
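The fallback can be sketched as a small path-selection function. The name `mtime_target` and its signature are assumptions for illustration; the project's `update_mtime()` may be structured differently.

```rust
use std::path::{Path, PathBuf};

/// Picks the file whose mtime should be updated: the default branch ref
/// file if it exists, otherwise the repository's `packed-refs` file.
/// `repo_dir` is the bare repository directory (e.g. `repo.git`).
pub fn mtime_target(repo_dir: &Path, default_branch: &str) -> PathBuf {
    let branch_ref = repo_dir.join("refs").join("heads").join(default_branch);

    if branch_ref.exists() {
        branch_ref
    } else {
        repo_dir.join("packed-refs")
    }
}
```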
2021-06-03	main(): Use a smaller forked repository for testing	Teddy Wing
	The Angular.js repo was 51 MB, while DDHotKey is 95 K.
2021-05-30	main::update_mtime(): Add function documentation	Teddy Wing

2021-05-30	Set repository mtime to GitHub `updated_at` time	Teddy Wing
	CGit reads the repository modification time from the following locations, in order from top to bottom:
	1. agefile
	2. repo.git/refs/heads/{default_branch | "master"}
	3. repo.git/packed-refs
	(https://git.zx2c4.com/cgit/tree/ui-repolist.c?id=bd6f5683f6cde4212364354b3139c1d521f40f39#n35)
	Update the `/refs/heads/{default_branch}` file mtime when cloning and updating the repo to match the GitHub `updated_at` time. This ensures that when mirroring old repositories, they don't appear at the top of the CGit repository index list when sorting by age.
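One way to set a file's mtime using only the standard library is sketched below. It assumes Rust 1.75 or later for `File::set_modified`, and assumes the `updated_at` value has already been converted to seconds since the Unix epoch. The `set_mtime` name is hypothetical, and the project may well use a different mechanism.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Sets the modification time of `path` to `updated_at_secs` (seconds since
/// the Unix epoch) and returns the time that was applied.
pub fn set_mtime(path: &Path, updated_at_secs: u64) -> io::Result<SystemTime> {
    let mtime = UNIX_EPOCH + Duration::from_secs(updated_at_secs);

    // Open with write access; changing timestamps can require it.
    let file = File::options().write(true).open(path)?;
    file.set_modified(mtime)?;

    Ok(mtime)
}
```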
2021-05-30	github::fetch_repos(): Extract username to function argument	Teddy Wing

2021-05-30	github::fetch_repos(): Fetch all repos from all pages	Teddy Wing
	Also switch from `reqwest::blocking` to async because I was getting this error, probably because I call `fetch_repos()` in the async 'tokio' function `main()`:
	thread 'main' panicked at 'Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.', $HOME/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/blocking/shutdown.rs:51:21
2021-05-30	main: Move current main code to `run()`	Teddy Wing
	We'll call this from `main()` when things are more ready.
2021-05-30	main::mirror(): Remove handled comment	Teddy Wing
	This is all done now.
2021-05-30	main::mirror(): Make the base cgitrc file optional	Teddy Wing
	Don't require the user to specify a base cgitrc file for cloned repositories.
2021-05-30	main::mirror(): Copy a base cgitrc file into the mirrored repository	Teddy Wing
	This lets us define common cgitrc configuration for all mirrored repos.
2021-05-30	Only update repository description if the description changed	Teddy Wing
	Check the repository description that comes back from the GitHub API against our cached description in the database. Only write the new description if it changed, so we can avoid writing to the file when it hasn't.
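A rough sketch of the compare-before-write step. The function name and signature are assumptions; in the project the cached value comes from the database, which is represented here by a plain string argument.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `fetched` to the repository's `description` file only when it
/// differs from `cached`. Returns whether a write happened.
pub fn write_description_if_changed(
    repo_dir: &Path,
    cached: &str,
    fetched: &str,
) -> io::Result<bool> {
    if cached == fetched {
        // Nothing changed, so leave the file untouched.
        return Ok(false);
    }

    fs::write(repo_dir.join("description"), fetched)?;

    Ok(true)
}
```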
2021-05-30	database: Always try to create the database and tables	Teddy Wing
	This is simpler, and means we don't have to check if the database file exists and only initialise if it doesn't. Here, we can just run the code and trust it will do the right thing in both cases.
2021-05-30	main::update(): Add TODO	Teddy Wing

2021-05-30	main::update(): Update repository description on fetch update	Teddy Wing
	If the repository was updated, write the description into the `description` file. Add a `github::Repo.description()` method to get an empty string if the description is `None`. This facilitates writing to the `description` file.
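The accessor described here might look like the following. The `Repo` struct is reduced to a single field and the `&str` return type is an assumption; only the method name and its empty-string behaviour come from the commit message.

```rust
/// Hypothetical, reduced version of `github::Repo`.
pub struct Repo {
    pub description: Option<String>,
}

impl Repo {
    /// Returns the description, or an empty string if there is none.
    pub fn description(&self) -> &str {
        self.description.as_deref().unwrap_or("")
    }
}
```

Returning a borrowed `&str` avoids allocating a new `String` each time the description is written out.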
2021-05-30	main: Remove unused repo variable `r`	Teddy Wing
	Looks like I'm not going to be using this, since the functions in this match arm that take `database::Repo`s should take the one based on the `github::Repo` rather than the one fetched from the database.
2021-05-30	main: Fetch from repositories that exist and have been updated	Teddy Wing

2021-05-30	Clone forks to a `/fork/` path	Teddy Wing
	Separate source and fork repositories into different paths.