Migrating ReadonlyREST's build catalog off AWS S3 — in progress

I'm migrating Beshu Tech's own ReadonlyREST build catalog — 145 versions, 8 years of release history, ~103,088 objects, 1.71 TB on AWS S3 (eu-west-1) — to Hetzner Object Storage (hel1) via DeltaGlider. This page is the engineering retrospective: what worked, what broke, what I fixed.

The final compression ratio will be published when the full migration completes (within the next week or two as of writing).

Why migrate at all

Versioned artifact storage compounds. The same set of Kibana plugin variants is uploaded for every release across 8 years of 1.12.x → 1.69.0. Most of the bytes are identical between versions. Storing each release as a full copy is paying to store the same bytes hundreds of times.

I picked Hetzner Object Storage as the destination because it's the cheapest S3-compatible option that fits the migration shape: cold-tier-friendly pricing, EU jurisdiction, and no enterprise sales motion — which is fine for this use case. Beshu Tech has no partnership with Hetzner; this is a vendor choice, not a sponsored decision.

Migration shape

  • Source: s3://readonlyrest-data/build/ (AWS S3, eu-west-1)
  • Destination: s3://beshu/ror/builds/ (Hetzner, hel1)
  • Runner: single t4g.medium EC2 in eu-west-1 (intra-region read = no AWS egress charge)
  • Pipeline per version: aws s3 sync → merge enterprise/free/pro/* into legacy/ → deltaglider cp -r to Hetzner
  • Cold storage: the 99 oldest versions were on Glacier Deep Archive; restored via Bulk tier (~$2 total). ~$9 in EC2 compute so far.
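
The per-version pipeline above can be sketched in Python. This is a minimal illustration, not the actual migration script: the function names, the per-version key layout ({version}/ under each bucket prefix), and the collision handling are assumptions, and any endpoint or credential configuration is left out. The two external commands are only built, not run.

```python
import shutil
from pathlib import Path

# Assumption: these are the edition subdirectories named in the merge step.
EDITIONS = ("enterprise", "free", "pro")


def merge_into_legacy(version_dir: Path) -> int:
    """Move every file under enterprise/, free/ and pro/ into legacy/.

    Returns the number of files moved. Relative paths below each edition
    directory are preserved; a name collision raises instead of overwriting.
    """
    legacy = version_dir / "legacy"
    moved = 0
    for edition in EDITIONS:
        src_root = version_dir / edition
        if not src_root.is_dir():
            continue
        # Materialize the listing first so moving files does not disturb it.
        for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
            dst = legacy / src.relative_to(src_root)
            if dst.exists():
                raise FileExistsError(f"{dst} already exists")
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            moved += 1
        shutil.rmtree(src_root)
    return moved


def pipeline_commands(version: str, staging: Path) -> list[list[str]]:
    """Build (but do not run) the two external commands for one version.

    The {version}/ suffix on both URLs is a hypothetical layout.
    """
    local = staging / version
    return [
        ["aws", "s3", "sync", f"s3://readonlyrest-data/build/{version}/", str(local)],
        ["deltaglider", "cp", "-r", str(local), f"s3://beshu/ror/builds/{version}/"],
    ]
```

Raising on a collision instead of overwriting is a deliberate choice in this sketch: if two editions ship a file with the same relative path, that is worth knowing before anything is uploaded.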

Per-version results so far (verified, SHA-256 round-tripped)

Version                          Size (Encrypted)             Verdict
1.69.0 (warm-path)               2.8 GB → 32 MB (-98.8%)      Excellent
1.38.0 (warm-path, multi-shape)  13.7 GB → 3.3 GB (-75.6%)    Fair (cross-major ES range)
1.17.1 (cold from Glacier)       26 MB → 6.8 MB (-74.0%)      Small dataset, dominated by reference

The 1.38.0 result is publishable as-is because it tells the truth: the legacy deltaspace mixes ES 6/7/8 Kibana plugins — structurally dissimilar artifacts. Root deltaspace compressed 15.2×; legacy deltaspace only 2.7×. The Delta Efficiency Panel correctly classifies legacy as "Fair, near Poor." Useful signal for an operator who wants to fix it.
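
The percentages and the N× ratios are two views of the same number. A small sketch of the conversion, assuming decimal units (1 GB = 1000 MB); the function names are mine. Recomputing from the rounded sizes shown here will not match the published percentages to the last digit.

```python
def savings_percent(source_bytes: float, stored_bytes: float) -> float:
    """Size change relative to the source, as a negative percentage."""
    return (stored_bytes / source_bytes - 1.0) * 100.0


def ratio_to_savings_percent(ratio: float) -> float:
    """Convert an N-times compression ratio into the same negative percentage."""
    return (1.0 / ratio - 1.0) * 100.0


GB = 1000 ** 3  # assumption: decimal units
MB = 1000 ** 2

print(round(savings_percent(2.8 * GB, 32 * MB), 1))  # -98.9 (from rounded sizes)
print(round(ratio_to_savings_percent(15.2), 1))      # -93.4 (root deltaspace)
print(round(ratio_to_savings_percent(2.7), 1))       # -63.0 (legacy deltaspace)
```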

What broke and how I fixed it

Real migrations don't go in a straight line. Here's what tripped me:

Bug 1: deltaglider CLI hardcoded /tmp

The Python deltaglider 6.1.1 CLI hardcoded /tmp for working files. /tmp on my EC2 instance was a 1.9 GB tmpfs. Partway through migrating a large version, it ran out of space.

Fix: patched client.py + main.py in-place to honor os.environ.get("TMPDIR"). Upstreaming the fix.
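
The shape of that change, as a minimal sketch. This is not the actual patch to client.py or main.py; make_work_dir and its prefix are hypothetical names used only for illustration.

```python
import os
import tempfile


def make_work_dir(prefix: str = "deltaglider-") -> str:
    """Create a scratch directory under $TMPDIR, falling back to /tmp.

    The environment is read on every call rather than through
    tempfile.gettempdir(), which caches its first answer.
    """
    base = os.environ.get("TMPDIR") or "/tmp"
    return tempfile.mkdtemp(prefix=prefix, dir=base)
```

With this in place, exporting TMPDIR to a directory on the EBS volume before running the CLI keeps working files off the tmpfs.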

Bug 2: 30 GB EBS too small

Staging 13 GB of source, plus xdelta3 temp files, the downloaded archive, and the staging area, overflowed the 30 GB EBS volume at the first multi-GB version.

Fix: grew EBS to 100 GB. Cost difference: ~$8/month for the migration window. Cheap.
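
A preflight check would have caught this before the first byte was staged. A minimal sketch, assuming a 3× headroom multiplier; that figure is a guess for illustration, not a measured number, and has_headroom is a hypothetical helper.

```python
import shutil


def has_headroom(path: str, source_bytes: int, multiplier: float = 3.0) -> bool:
    """Return True if the filesystem holding `path` has room for the source
    plus temp files and staging copies (approximated by `multiplier`)."""
    free = shutil.disk_usage(path).free
    return free >= source_bytes * multiplier
```

Calling has_headroom("/data", 13 * 1000**3) before syncing a 13 GB version would have asked for roughly 39 GB free, which a 30 GB volume cannot provide.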

Bug 3: pipefail + head silent kill

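My migration script had set -o pipefail and piped deltaglider output through grep -E ... | head -200. After 200 matched lines, head exited and closed the pipe, so the upstream processes received SIGPIPE on their next write. With pipefail set, the pipeline's failure status ended the script. Migration silently aborted mid-upload.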

Fix: dropped the | head. Logs are slightly noisier but the script doesn't lie about whether it succeeded.
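
If the 200-line cap is still wanted, the safe way is to keep reading after the cap is reached instead of closing the pipe. This is an alternative to the fix above, not what the migration script does; run_and_filter is a hypothetical helper.

```python
import re
import subprocess
import sys


def run_and_filter(cmd, pattern, limit=200):
    """Run cmd, keep the first `limit` matching stdout lines, return (code, kept).

    The loop keeps reading after the limit is reached, so the child never
    writes into a closed pipe and its exit code reflects its own outcome.
    """
    regex = re.compile(pattern)
    kept = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            if len(kept) < limit and regex.search(line):
                kept.append(line.rstrip("\n"))
        # Leaving the with-block waits for the child to exit.
    return proc.returncode, kept


# Usage with a stand-in child process that prints 1000 lines.
code, lines = run_and_filter(
    [sys.executable, "-c", "for i in range(1000): print('upload', i)"],
    r"upload",
    limit=200,
)
```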

Bug 4: Traefik 60s read timeout

Production Coolify deployment had Traefik with the default respondingTimeouts.readTimeout = 60s. Cross-internet uploads of multi-GB files hit exactly 60s and got killed with a 502.

Fix: set Traefik readTimeout to 30m. Documented the gotcha in /docs/troubleshooting because every reverse-proxy default is wrong for large S3 uploads.
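
The arithmetic behind the failure, as a rough sketch. It assumes the read timeout bounds the time to receive the whole request body, and it uses a hypothetical 2 GB object with decimal units.

```python
def min_throughput_mb_s(size_gb: float, timeout_s: float) -> float:
    """Sustained MB/s needed to send size_gb before the timeout fires."""
    return size_gb * 1000.0 / timeout_s


print(round(min_throughput_mb_s(2.0, 60), 1))    # 33.3
print(round(min_throughput_mb_s(2.0, 1800), 1))  # 1.1
```

At 60s the uplink has to sustain over 33 MB/s for a 2 GB object; at 30m, about 1 MB/s is enough.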

How I'm verifying the migration is correct

For every migrated version:

  1. SHA-256 of the source object (as reported by aws s3api head-object, which returns the checksum stored with the object rather than computing one)
  2. SHA-256 of the round-tripped object via the DeltaGlider proxy (GET, hash, compare)
  3. Manifest check: same object count, same total source size

No exceptions logged so far. Bit-perfect round-trip on every sampled object (root, legacy, universal deltaspaces).
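
A minimal sketch of the three checks, not the verification script itself. The helper names and the manifest shape, an iterable of (key, size) pairs, are assumptions. S3 reports SHA-256 checksums base64-encoded, so the sketch converts them to hex before comparing.

```python
import base64
import hashlib
from typing import BinaryIO, Iterable, Tuple


def sha256_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a stream in chunks so multi-GB objects never sit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def s3_checksum_to_hex(checksum_b64: str) -> str:
    """Convert a base64-encoded SHA-256 checksum into a hex digest."""
    return base64.b64decode(checksum_b64).hex()


def manifests_match(
    source: Iterable[Tuple[str, int]], migrated: Iterable[Tuple[str, int]]
) -> bool:
    """Same object count and same total size on both sides."""
    src, dst = list(source), list(migrated)
    return len(src) == len(dst) and sum(s for _, s in src) == sum(s for _, s in dst)
```

One caveat for anyone reusing this: for multipart uploads the checksum S3 stores can be a composite of per-part checksums rather than a hash of the whole object, in which case the source hash has to be computed from a download instead.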

Final ratio

Published when the full 145-version migration completes. The mix of "Excellent" newer versions and "Fair" older cross-major versions means I expect the overall ratio to land somewhere between 10× and 50×. Watch this page.

Reproducibility

When the migration completes, this page will include:

  • The migration script (open source)
  • The per-version manifest (CSV)
  • The SHA-256 verification script
  • A tarball with raw results so anyone can verify

This is the kind of case study that's hard to fake. That's the point.

Try the same migration on your data