Age | Commit message | Author |
|
Return all errors from repo processing. This allows us to provide
information on all errors that happened while processing, but continue
processing all the repos even if there's an error in one of them.
A new `MultiError` type wraps a list of errors to do this.
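
A minimal sketch of how a `MultiError`-style wrapper could look. Only
the type name comes from this commit; the field, the `process_all`
helper, and the closure signature are assumptions for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Illustrative stand-in for the `MultiError` type named above: a
/// wrapper around every error collected during a run.
#[derive(Debug)]
pub struct MultiError {
    errors: Vec<Box<dyn Error + Send + Sync>>,
}

impl fmt::Display for MultiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print one error per line so none of them are lost.
        for (i, e) in self.errors.iter().enumerate() {
            if i > 0 {
                writeln!(f)?;
            }
            write!(f, "{}", e)?;
        }
        Ok(())
    }
}

impl Error for MultiError {}

/// Hypothetical helper: run `process` on every repo, keep going after
/// a failure, and report all failures together at the end.
pub fn process_all<F>(repos: &[&str], process: F) -> Result<(), MultiError>
where
    F: Fn(&str) -> Result<(), Box<dyn Error + Send + Sync>>,
{
    let errors: Vec<_> = repos
        .iter()
        .filter_map(|&repo| process(repo).err())
        .collect();

    if errors.is_empty() {
        Ok(())
    } else {
        Err(MultiError { errors })
    }
}
```

The caller gets a single `Err` that still carries every individual
failure, so one bad repo doesn't hide the others.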
|
|
Not sure when or why I added this.
|
|
Remove my old tests now that we have a multi-threading setup that
actually works.
|
|
Use 'rayon' to parallelise the repository processing. Each repository is
processed in a thread in the default 'rayon' pool.
In order to get thread-safe access to the database, I followed some
advice from a Stack Overflow answer by VasiliNovikov
(https://stackoverflow.com/users/1091436/vasilinovikov):
https://stackoverflow.com/questions/62560396/how-to-use-sqlite-via-rusqlite-from-multiple-threads/62560397#62560397
VasiliNovikov recommended creating a database connection pool using
'r2d2_sqlite'. This way we don't have to share a database connection
between threads, but each thread can have its own connection.
This also means we can remove mutable requirements in a bunch of places
involving our `database::Db` type since we're no longer managing the
database connections directly.
|
|
Remove all async from the project by switching from 'reqwest' to 'ureq'.
This should make the code simpler, and hopefully enable us to try out
multithreading.
|
|
To allow us to work out where the error is coming from.
|
|
|
|
Remove all the async database calls and Tokio spawning. Still haven't
worked out the error code 21 database error from earlier, but this will
hopefully allow us to use normal threads directly.
|
|
Trying to get rid of async. This compiles, but fails with the following
runtime error:
Error code 21: Library used incorrectly
Need to investigate further.
|
|
Still isn't multi-threaded. Not sure what I'm doing wrong.
|
|
|
|
|
|
Looks like the work doesn't happen on multiple threads. All of the tasks
printed the same thread ID. Need to do some more work to get this
working properly, it seems.
|
|
Collect all errors into a list. I think I'm going to return them as a
list from this function.
The runtime appears a lot slower with this change. Need to figure out
what that's about.
|
|
Use the Tokio runtime we created to run the blocking async tasks.
Trying to set this up so I can get results back from the spawned tasks,
but I'm currently having trouble working out how to extract them from
the async task and return them from `run()`. I suppose I could just
print out the errors directly in that `while let` loop, but ideally I'd
like to return all errors from `run()` rather than printing in `run()`.
|
|
To separate the actions more.
|
|
|
|
Remove the hard-coded test repositories I was using and replace them
with real ones retrieved from the GitHub API.
Enable I/O and timers on the Tokio runtime in order to enable the async
GitHub API request.
|
|
Don't panic here so we can use our own error message template.
|
|
Print the error instead of unwrapping.
|
|
|
|
Allows a maximum repo size to be given as a command line argument.
Repos larger than this will not be mirrored. This gives us a way to save
server space by avoiding gigantic repositories.
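
Roughly how the size check could work, as a sketch. The `SizedRepo`
type, the `within_size_limit` name, and the kilobyte unit are
assumptions; the commit only says that repos over the given maximum are
skipped.

```rust
/// Hypothetical, trimmed-down repo record for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct SizedRepo {
    pub name: String,
    /// Repository size, assumed here to be in kilobytes.
    pub size_kb: u64,
}

/// Keep only the repos at or under the optional size limit. With no
/// limit given, every repo is kept.
pub fn within_size_limit(repos: Vec<SizedRepo>, max_size_kb: Option<u64>) -> Vec<SizedRepo> {
    match max_size_kb {
        None => repos,
        Some(max) => repos.into_iter().filter(|r| r.size_kb <= max).collect(),
    }
}
```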
|
|
Remove this now that we have something that I think works.
|
|
Rather than relying on the Cargo features we've enabled to define this,
create a multi-threaded runtime in code.
|
|
I think, at least. Took a lot of research and trial and error to get
this to compile, working out how to set up the multi-threading for async
code. The idea here is to be able to process each repo in potentially
multiple threads and do that processing work in parallel.
|
|
|
|
|
|
This is no longer relevant.
|
|
Add command line argument value here in preparation for when we enable
this code.
|
|
|
|
|
|
|
|
Define the options we want to take. Not using them yet.
|
|
A repository cloned with:
$ git clone --mirror REPO
doesn't have any ref files in `repo.git/refs/heads/*`. Instead, the refs
are stored in `repo.git/packed-refs`. Update the mtime of 'packed-refs'
if the default branch ref file doesn't exist. CGit will look at the time
on the 'packed-refs' file when that's the case.
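
The fallback described here can be sketched as below. The function name
`mtime_target` and its parameters are made up for illustration; only
the two file locations come from the message above.

```rust
use std::path::{Path, PathBuf};

/// Pick the file whose modification time should be updated for a bare
/// mirror at `repo_git` (for example `repo.git`): the default branch
/// ref file if it exists, otherwise `packed-refs`.
pub fn mtime_target(repo_git: &Path, default_branch: &str) -> PathBuf {
    let branch_ref = repo_git.join("refs").join("heads").join(default_branch);
    if branch_ref.is_file() {
        branch_ref
    } else {
        repo_git.join("packed-refs")
    }
}
```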
|
|
The Angular.js repo was 51 MB, while DDHotKey is 95 K.
|
|
|
|
CGit reads the repository modification time from the following
locations, in order from top to bottom:
1. agefile
2. repo.git/refs/heads/{default_branch | "master"}
3. repo.git/packed-refs
(https://git.zx2c4.com/cgit/tree/ui-repolist.c?id=bd6f5683f6cde4212364354b3139c1d521f40f39#n35)
Update the `/refs/heads/{default_branch}` file mtime when cloning and
updating the repo to match the GitHub `updated_at` time.
This ensures that when mirroring old repositories, they don't appear at
the top of the CGit repository index list when sorting by age.
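
A sketch of the mtime update using only the standard library. It
assumes the GitHub `updated_at` value has already been parsed into
seconds since the Unix epoch; `set_mtime` is an illustrative name, and
`File::set_modified` needs Rust 1.75 or newer.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Set the modification time of `path` to `updated_at_secs`
/// (seconds since the Unix epoch).
pub fn set_mtime(path: &Path, updated_at_secs: u64) -> io::Result<()> {
    let mtime: SystemTime = UNIX_EPOCH + Duration::from_secs(updated_at_secs);
    // Open the existing file for writing so the timestamp change is
    // permitted; the file contents are left untouched.
    let file = File::options().write(true).open(path)?;
    file.set_modified(mtime)
}
```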
|
|
|
|
Also switch from `reqwest::blocking` to async because I was getting this
error, probably because I call `fetch_repos()` in the async 'tokio'
function `main()`:
thread 'main' panicked at 'Cannot drop a runtime in a context where
blocking is not allowed. This happens when a runtime is dropped from
within an asynchronous context.',
$HOME/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.6.1/src/runtime/blocking/shutdown.rs:51:21
|
|
We'll call this from `main()` when things are more ready.
|
|
This is all done now.
|
|
Make the base cgitrc file optional, so the user doesn't have to specify
one for cloned repositories.
|
|
This lets us define common cgitrc configuration for all mirrored repos.
|
|
Check the repository description that comes back from the GitHub API
against our cached description in the database. Only write the new
description if it changed, so we can avoid writing to the file when
nothing is different.
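
In sketch form, the comparison could look like this. The function name
and signature are assumptions; here the cached value is passed in as an
`Option<&str>` rather than read from the database.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `fresh` to the description file only when it differs from the
/// cached value. Returns `true` if the file was written.
pub fn write_description_if_changed(
    description_file: &Path,
    cached: Option<&str>,
    fresh: &str,
) -> io::Result<bool> {
    if cached == Some(fresh) {
        // Nothing changed, so leave the file alone.
        return Ok(false);
    }
    fs::write(description_file, fresh)?;
    Ok(true)
}
```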
|
|
This is simpler, and means we don't have to check if the database file
exists and only initialise if it doesn't. Here, we can just run the code
and trust it will do the right thing in both cases.
|
|
|
|
If the repository was updated, write the description into the
`description` file.
Add a `github::Repo.description()` method to get an empty string if the
description is `None`. This facilitates writing to the `description`
file.
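
Something along these lines, as a sketch. The struct here is a cut-down
stand-in for `github::Repo`, and the `&str` return type is a guess; the
real method may return an owned `String`.

```rust
/// Cut-down stand-in for `github::Repo` with only the field needed
/// for this example.
pub struct Repo {
    pub description: Option<String>,
}

impl Repo {
    /// The description, or an empty string when there is none.
    pub fn description(&self) -> &str {
        self.description.as_deref().unwrap_or("")
    }
}
```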
|
|
Looks like I'm not going to be using this, since the functions in this
match arm that take `database::Repo`s should take the one based on the
`github::Repo` rather than the one fetched from the database.
|
|
|
|
Separate source and fork repositories into different paths.
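
One way the split could be laid out, purely as an illustration. The
directory names `sources` and `forks`, the `.git` suffix, and the
`repo_path` function are all assumptions, not taken from the code.

```rust
use std::path::{Path, PathBuf};

/// Build the on-disk path for a mirrored repo, placing forks and
/// non-forks under different (hypothetical) subdirectories.
pub fn repo_path(base: &Path, name: &str, is_fork: bool) -> PathBuf {
    let kind = if is_fork { "forks" } else { "sources" };
    base.join(kind).join(format!("{}.git", name))
}
```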
|