We Have Github At Home
The art of writing software is the art of willful underestimation, combined with creative choice of goalposts. Last weekend I decided to quickly throw together my own github.
Over the past few years I’ve found my enjoyment of the github platform rapidly declining. It seems some boffin over at microsoft hq has commanded a hard pivot to AI and we regular folk are to be left dealing with the fallout.
It’s also about time they start squeezing on cost and features. I would be happy to pay for github (and in fact, I was happy paying), however, I’m not going to pay for llm garbage. For me at least, github has become sourceforge.
But what to use instead?
SourceHut is a bit much for me, mailing-list driven workflows aren’t my vibe, and anyhow their lack of support for large repositories is a non-starter for games.
Codeberg is open-source exclusive, which is cool, but I don’t want to be stuck with only public open-source repositories.
Self-hosted Gitlab is really just a pure clone of github, LLM memes and all, which is not an improvement. It’s also just pure overkill and complex to host.
Forgejo has a cringe name.
Bitbucket is made by Atlassian… Plus more LLM nonsense.
Azure Git Repos just can’t manage to build a normal pricing page. It’s also under the microsoft umbrella, so it’s hard to imagine it not becoming regular github over time.
With the reasonable options out of the way, the only choice remaining is to build and self-host my own software! The VPS which hosts this site is mostly idle anyway, and there’s always room for one more hobby project in the clown car called “free time”.
Let’s compare feature lists:
| | github | git.nega.tv |
|---|---|---|
| Git Hosting | ☑ | ☑ |
| SSH Repo Access | ☑ | ☑ |
| HTTPS Repo Access | ☑ | ☑ |
| Web Code View | ☑ | ☑ |
| Issue Tracking | ☑ | ☐ |
| Code Review | ☑ | ☐ |
| Automation | ☑ | ☐ |
| Project Wiki | ☑ | ☐ |
| Analytics and Social | ☑ | ☐ |
| AI Dogshit | ☑ | ☐ |
SSH Access
Basic SSH access for git repos is trivial:
- Create a linux user account for each git user.
- Set their shell to git-shell.
However, that approach breaks down if you plan on more than a handful of users. It also fails if you need fine-grained access control where multiple maintainers can share a single account.
I had no plan for either of those things, but I wanted to support them regardless. I also wanted to isolate git access behind a ‘git’ user account in the way that the Real Github does it. I’m a serious engineer.
Once again, there are a few turnkey options, most notably gitosis and gitolite, but I’ll roll my own.
The goal is to have a single git user, which everyone can access, and then to
filter repository access based on explicit permissions. There’s a handy feature
of sshd which I use: the command= option in the
authorized_keys file.
Essentially, instead of executing the user’s login shell (configured in
/etc/passwd), I can configure per authorized key a specific command,
including arbitrary parameters, to run after authentication.
For example, the git user account’s authorized_keys file might look like:
command="/home/git/git-shell-multiplex josh",restrict sk-ecdsa-nsa-backdoor ...
command="/home/git/git-shell-multiplex sophie",restrict sk-ecdsa-nsa-backdoor ...
command="/home/git/git-shell-multiplex bazza69",restrict sk-ecdsa-nsa-backdoor ...
Then, I implement git-shell-multiplex (in Rust, of course) to run permission checks and validate that users are only using git commands. Rust is a questionable choice for this kind of scripting, but I don’t let bad ideas get in the way of doing whatever I like. The script is available here!
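To give a rough idea of what the validation half of such a multiplexer involves, here is a minimal sketch. It is not the actual git-shell-multiplex source: the names `Service` and `parse_git_command`, and the path policy, are made up for illustration. It leans on two facts: sshd puts the command the client asked for in the `SSH_ORIGINAL_COMMAND` environment variable when a forced command is in effect, and a git client asks for one of three services followed by a single-quoted path.

```rust
/// The services a git client may request over SSH.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Service {
    UploadPack,    // clone / fetch (read)
    UploadArchive, // git archive --remote (read)
    ReceivePack,   // push (write)
}

impl Service {
    /// Only pushes need write permission.
    pub fn needs_write(self) -> bool {
        matches!(self, Service::ReceivePack)
    }
}

/// Parse a command such as `git-upload-pack 'josh/narcissus.git'`.
/// Returns None for anything that is not a plain git service call on a
/// relative repository path, so the caller can refuse it.
pub fn parse_git_command(cmd: &str) -> Option<(Service, String)> {
    let (name, arg) = cmd.trim().split_once(' ')?;
    let service = match name {
        "git-upload-pack" => Service::UploadPack,
        "git-upload-archive" => Service::UploadArchive,
        "git-receive-pack" => Service::ReceivePack,
        _ => return None,
    };
    // The git client wraps the repository path in single quotes.
    let path = arg.trim().strip_prefix('\'')?.strip_suffix('\'')?;
    // Hypothetical policy: conservative character set, relative paths only,
    // and no empty or ".." components.
    let chars_ok = path
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || matches!(c, '/' | '.' | '_' | '-'));
    let parts_ok = path.split('/').all(|p| !p.is_empty() && p != "..");
    if !chars_ok || !parts_ok {
        return None;
    }
    Some((service, path.to_string()))
}
```

A real entry point would read `SSH_ORIGINAL_COMMAND` with `std::env::var`, take the user name from its first argument (the one baked into authorized_keys), and only run the matching git command once the permission checks pass.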
In order to define the repository permissions I’m currently reusing the git-daemon-export-ok
marker file to enable read-access, and a new file, git-shell-multiplex-contributors,
which contains a list of users with write access [1].
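As a sketch of how those two files could drive a permission check (again, not the real implementation), consider the function below. The name `is_allowed` is hypothetical, the contributors file is assumed to hold one user name per line, and contributors are assumed to get read access as well as write access; none of that is confirmed by the actual script.

```rust
use std::fs;
use std::path::Path;

/// Decide whether `user` may read or write the repository at `repo_dir`.
/// `owner` is the account the repository lives under.
pub fn is_allowed(repo_dir: &Path, owner: &str, user: &str, write: bool) -> bool {
    // The owning account always has full access to its own repositories.
    if user == owner {
        return true;
    }
    // Assumed format: one user name per line. A missing file means nobody.
    let is_contributor = match fs::read_to_string(repo_dir.join("git-shell-multiplex-contributors")) {
        Ok(text) => text.lines().any(|line| line.trim() == user),
        Err(_) => false,
    };
    if write {
        return is_contributor;
    }
    // Read access: public via the marker file, or (assumption) a contributor.
    is_contributor || repo_dir.join("git-daemon-export-ok").is_file()
}
```

With this shape, a repository with neither file is visible only to its owner, adding the marker makes it world-readable, and listing a name in the contributors file lets that user push.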
With that sorted, git clone git@nega.tv:josh/narcissus.git and
git push origin main work! Not too hard after all.
One downside is the configuration nightmare in the authorized_keys file, but
I’m just one person so it’s not a big deal. You could also replace the
authorized_keys file with an authorized_keys command using
AuthorizedKeysCommand in sshd_config.
This would allow writing a single script to look up the appropriate keys and
configuration from a database shared with the multiplexer.
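The core of such an AuthorizedKeysCommand helper is small, because sshd only expects it to print lines in authorized_keys format on stdout. The sketch below shows just the rendering step under some assumptions: `KeyEntry` and `render_authorized_keys` are hypothetical names, the entries would come from whatever store the multiplexer shares, and the key material in any example is a placeholder.

```rust
/// One registered key: the account name and its public key line,
/// e.g. "ssh-ed25519 AAAA... comment".
pub struct KeyEntry<'a> {
    pub user: &'a str,
    pub public_key: &'a str,
}

/// Render entries in authorized_keys format, one forced command per key.
/// Entries with unsafe user names or multi-line keys are skipped, since the
/// user name ends up inside a quoted command string.
pub fn render_authorized_keys(entries: &[KeyEntry<'_>]) -> String {
    let mut out = String::new();
    for entry in entries {
        let user_ok = !entry.user.is_empty()
            && entry
                .user
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
        let key = entry.public_key.trim();
        if !user_ok || key.is_empty() || key.contains('\n') || key.contains('\r') {
            continue;
        }
        out.push_str(&format!(
            "command=\"/home/git/git-shell-multiplex {}\",restrict {}\n",
            entry.user, key
        ));
    }
    out
}
```

A small binary could print the result of this function for all known keys, which keeps the per-user configuration in one place instead of a hand-edited file.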
HTTPS Git Access
Git has two different HTTP protocols: a simple ‘dumb’ protocol that just serves static files, and a ‘smart’ protocol that talks to git itself. Since I don’t care about pushing over https, I just set up git-http-backend in a read-only configuration.
nginx.conf (excerpt)
# requests that need to go to git-http-backend
location ~ ^(.*)\.git/(HEAD|info/refs|objects/info/.*|git-(upload|receive)-pack)$ {
    include fastcgi_params;
    gzip off;
    fastcgi_param SCRIPT_FILENAME /usr/libexec/git-core/git-http-backend;
    fastcgi_param PATH_INFO $1/$2;
    fastcgi_param GIT_PROJECT_ROOT /var/git;
    fastcgi_param REMOTE_USER $remote_user;
    fastcgi_pass unix:/run/fcgiwrap.sock;
}
I filter out a bunch of git-specific paths and punt them to git-http-backend
over fastcgi. Of special note are the captures in the location regex; they let
me drop the .git suffix when constructing PATH_INFO so an incoming url like
https://git.nega.tv/josh/git-shell-multiplex.git invokes git-http-backend
without the suffix, finding the actual repository at /var/git/josh/git-shell-multiplex/.
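To make the capture trick concrete, here is a small sketch that mirrors the location regex in plain Rust and shows what ends up in PATH_INFO. The function name `path_info` is made up for illustration; nginx does the real work, and this only reproduces the string manipulation.

```rust
/// Mirror of the nginx location regex: split "<repo>.git/<git path>" and
/// rebuild PATH_INFO as "<repo>/<git path>". Returns None when the URI is
/// not one of the paths routed to git-http-backend.
pub fn path_info(uri: &str) -> Option<String> {
    // `(.*)` is greedy, so the right-most ".git/" is tried first.
    for (idx, sep) in uri.rmatch_indices(".git/") {
        let repo = &uri[..idx]; // capture $1
        let rest = &uri[idx + sep.len()..]; // capture $2
        let is_git_path = rest == "HEAD"
            || rest == "info/refs"
            || rest.starts_with("objects/info/")
            || rest == "git-upload-pack"
            || rest == "git-receive-pack";
        if is_git_path {
            return Some(format!("{}/{}", repo, rest));
        }
    }
    None
}
```

For example, a request for `/josh/git-shell-multiplex.git/info/refs` yields `/josh/git-shell-multiplex/info/refs`, which git-http-backend resolves relative to GIT_PROJECT_ROOT.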
git-http-backend looks for the file git-daemon-export-ok in a repo, and only
allows access to repos with the marker file. This leaks the existence of private
repos, since it reports ‘no permissions’ rather than ‘not found’, but for my
purposes this isn’t a big deal.
Web Interface
There are two simple options for a web-based repository viewer: cgit and gitweb. Since gitweb ships as part of the git distribution I use that one, but honestly I’m not super excited about either option. This also requires fastcgi.
nginx.conf (excerpt)
location /gitweb.cgi {
    include fastcgi_params;
    gzip off;
    fastcgi_param SCRIPT_FILENAME /var/www/git/gitweb.cgi;
    fastcgi_param PATH_INFO $uri;
    fastcgi_param GITWEB_CONFIG /etc/gitweb.conf;
    fastcgi_pass unix:/run/fcgiwrap.sock;
}
location / {
    root /var/www/git;
    index gitweb.cgi;
}
There’s nothing exciting in the config here. In gitweb.conf I enable syntax
highlighting and blame, but that’s it. In the same way as git-http-backend,
gitweb is configured to only expose repositories which contain the
git-daemon-export-ok marker file.
FastCGI Setup
Honestly I’m somewhat baffled by the fact that CGI still exists. One of the first work tasks I was ever given was configuring CGI scripts, and somehow this practice continues in the current day. Incredible.
Both gitweb and git-http-backend are CGI programs, so I run them behind FastCGI with fcgiwrap, set up to play nice with the git user and the httpd user. For sanity, I set this up as a systemd service triggered by a systemd socket.
/etc/systemd/system/fcgiwrap.socket
[Unit]
Description=fcgiwrap Socket
[Socket]
ListenStream=/run/fcgiwrap.sock
[Install]
WantedBy=sockets.target
/etc/systemd/system/fcgiwrap.service
[Unit]
Description=Simple CGI Server
After=nss-user-lookup.target
[Service]
ExecStart=/usr/sbin/fcgiwrap
User=git
Group=nginx
StandardError=syslog
[Install]
Also=fcgiwrap.socket
Take note of the user and group. git tooling doesn’t like repositories that it doesn’t own, so I make the owner ‘git’, and nginx needs access to the unix socket, so I use the ‘nginx’ group. This is probably not ideal - you might want to change the groups of each user instead.
Also note the StandardError=syslog line. It’s important when trying to debug.
Future Work
I’ve duct-taped enough software together to create the basics of an SCM host. It’s good enough for me, but I wouldn’t mind having some more bells and whistles.
- Issue Tracker.
- Code Review.
- Build automation, and BORS equivalent.
- Web view with code analysis. Would be nice to somehow plug rust-analyzer into the code view so it’s actually explorable.
- Beautiful Rust monolith web application instead of nightmare configuration file soup.
However, these are all significantly more work than is achievable in a single weekend; they’ll have to wait until next weekend.
You can find the fruits of my labors over at git.nega.tv.
[1] This is in addition to the account which contains the repository, so ‘josh’ always has full access to ‘josh/repo.git’.