What pgit is

A Git-like CLI that imports a repo into PostgreSQL so you can run SQL, pre-built analyses, and cross-history search on its commit history.

explanation · beginner · 5m
applies to pgit 4
by Oliver Seifert

pgit is a Git-like version control CLI where the repository lives in PostgreSQL instead of on the filesystem. You import a git repo, and every version of every file lands in a handful of database tables. From there you ask questions: which files change together, who owns which subsystem, what the busiest week of the year was. No scripts, no parsing git log, no piping things through awk. Just answers.

The storage underneath is pg-xpatch, a PostgreSQL extension that keeps a thousand near-identical versions of a file for about the cost of one. pgit is, in the author's words, primarily a demo for that compression. It also happens to be a genuinely useful way to interrogate a codebase's history.
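To see why near-identical versions can be cheap to keep, here is a minimal sketch of delta storage in Python. It is not pg-xpatch's algorithm or on-disk format: it uses the standard library's `difflib` as a stand-in, and the helper names (`make_delta`, `apply_delta`, `stored_chars`) and the sample history are made up for illustration.

```python
import difflib

def make_delta(base, target):
    """Encode `target` as copy/insert operations against `base` (lists of lines)."""
    ops = []
    matcher = difflib.SequenceMatcher(a=base, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse base[i1:i2] unchanged
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # only the new lines are stored
        # "delete" needs no operation: the base span is simply never copied
    return ops

def apply_delta(base, ops):
    """Rebuild the target version from `base` plus the recorded operations."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out

def stored_chars(ops):
    """Characters of new text a delta carries (copy bookkeeping is ignored)."""
    return sum(len(line) for op in ops if op[0] == "insert" for line in op[1])

# Hypothetical history: a 200-line file where each version edits one line.
first = [f"line {i}\n" for i in range(200)]
versions = [first]
for n in range(1, 20):
    nxt = list(versions[-1])
    nxt[n] = f"line {n} edited\n"
    versions.append(nxt)

# Cost of storing every version in full vs. the first version plus deltas.
full_cost = sum(len(line) for v in versions for line in v)
delta_cost = sum(len(line) for line in first) + sum(
    stored_chars(make_delta(prev, cur)) for prev, cur in zip(versions, versions[1:])
)
print("full copies:", full_cost, "chars; base + deltas:", delta_cost, "chars")
```

Each later version costs only the text that actually changed, which is the intuition behind "a thousand versions for about the cost of one". A real engine also has to bound how long a chain of deltas gets before reads become slow; this sketch ignores that.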

What it is good at

Importing a git repo

Pull a full history (merges, renames, real author dates) into Postgres with one command.

Pre-built analyses

churn, coupling, hotspots, authors, activity, bus-factor. One command each, no SQL.

Raw SQL

Everything is in tables you can query directly when a built-in analysis is not enough.

Search across history

Grep every version of every file, not just the current checkout.

What it is not

pgit is not trying to replace git for day-to-day development. Its history is append-only: an imported repo is immutable, and the primary loop is import, then analyze / sql / search. The reset and resolve commands that earlier versions had were removed in v4 for exactly this reason.

It can still act like git

init, add, commit, log, diff, checkout, blame, and friends all exist and behave the way you expect. They are a real (if secondary) workflow. But the reason to reach for pgit is the analysis, not the daily commit.

How it fits together

Three pieces do the work:

  • The CLI (pgit), a single Go binary with no runtime dependencies of its own.
  • A local PostgreSQL container running the pg-xpatch image, shared across every pgit repo on your machine. Each repo gets its own database inside it. This is why pgit needs Docker or Podman.
  • pg-xpatch, the table access method that delta-compresses the versioned content. pgit configures it; you rarely touch it directly.

A PostgreSQL connection URL doubles as your "remote". There is no separate auth system: if you can reach a pg-xpatch database, you can push, pull, and clone against it.
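As a rough illustration of what such a "remote" carries, the sketch below splits a standard PostgreSQL connection URL into its parts with Python's `urllib.parse`. The URL, the credentials, and the idea that the database name in the path identifies one repo are assumptions made for the example; check the pgit documentation for the exact form it expects.

```python
from urllib.parse import urlsplit

def describe_remote(url):
    """Break a PostgreSQL connection URL into the parts a client needs."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # Assumption: one database per repo, named in the URL path.
        "database": parts.path.lstrip("/"),
    }

# Hypothetical remote; none of these values come from pgit itself.
remote = "postgresql://alice:secret@db.example.com:5432/myrepo"
print(describe_remote(remote))
```

Everything needed to reach the repo is in that one string, which is why ordinary PostgreSQL access control is the only gate.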

When it pays off

pgit shines when you want to understand a repository rather than develop in it:

  • finding hidden coupling between files that always change together
  • spotting maintenance hotspots and knowledge silos (bus-factor)
  • charting commit velocity over years
  • running any custom analytics you can express in SQL

It is a poorer fit when you just need a working tree and fast local commits; git already does that better. And the compression is competitive with git, not magic: it wins on source-heavy repos with incremental changes and loses on repos full of large binary or vendored assets. The "Compression vs git" page has the honest numbers.
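To make "files that always change together" concrete, here is a minimal Python sketch of a co-change count. It is not pgit's coupling analysis or its schema: the function name `co_change_counts` and the commit data are hypothetical, and a real analysis would read the commit-to-file mapping from the imported tables.

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """Count how often each pair of files is touched by the same commit."""
    pairs = Counter()
    for files in commits:
        # Sort and de-duplicate so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical history: each inner list is the set of files one commit touched.
commits = [
    ["api/handler.go", "api/handler_test.go"],
    ["api/handler.go", "api/handler_test.go", "README.md"],
    ["db/schema.sql"],
    ["api/handler.go", "api/handler_test.go", "db/schema.sql"],
]

for (a, b), n in co_change_counts(commits).most_common(3):
    print(n, a, b)
```

Raw pair counts favor files that change often in general, so a fuller analysis would normally normalize by how often each file changes on its own.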

Where to go next

Install pgit

Go, a prebuilt binary, or a system package, plus the one runtime requirement.

Quickstart

Import a real repo and run your first analysis in a few minutes.

How it works

The tables, the delta groups, and why the queries are shaped the way they are.

What pg-xpatch is

The compression engine underneath, documented separately.
