Metrics reference

Every Prometheus metric the proxy exposes, with labels, types, and bucket boundaries.


GET /_/metrics returns the Prometheus text exposition format on the same port as the S3 API. Metrics are collected via lock-free atomics on the hot path: no mutexes, no sampling, and negligible overhead per request.

For scrape configuration, Grafana panels, and alerting rules, see Monitoring and alerts.

Quick sanity check

curl -s http://localhost:9000/_/metrics | head -20

# If you have promtool:
curl -s http://localhost:9000/_/metrics | promtool check metrics
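If promtool is not available, the response can also be inspected with a few lines of Python. The following is a minimal sketch of a parser for the text exposition format; the function name `parse_metrics` and the sample payload (including its HELP text and values) are made up for illustration, and optional timestamps are not handled.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The sample value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative payload only; the HELP text and values are invented.
SAMPLE = """\
# HELP deltaglider_cache_hits_total Reference cache hits
# TYPE deltaglider_cache_hits_total counter
deltaglider_cache_hits_total 42
deltaglider_http_requests_total{method="GET",status="200",operation="get_object"} 7
"""

samples = parse_metrics(SAMPLE)
print(samples["deltaglider_cache_hits_total"])
```

In practice the `SAMPLE` string would be replaced by the body returned from /_/metrics.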

Process and build

| Metric | Type | Labels | Description |
|---|---|---|---|
| process_start_time_seconds | Gauge | | Unix timestamp when the process started |
| deltaglider_build_info | Gauge | version, backend_type | Always 1; labels carry build metadata |
| process_peak_rss_bytes | Gauge | | Peak resident set size (updated on scrape) |
| process_* (Linux only) | various | | Standard process collector: RSS, CPU seconds, open FDs, virtual memory |

HTTP requests

| Metric | Type | Labels | Description |
|---|---|---|---|
| deltaglider_http_requests_total | Counter | method, status, operation | Total requests by method, HTTP status code, S3 operation |
| deltaglider_http_request_duration_seconds | Histogram | method, operation | Request latency distribution |
| deltaglider_http_request_size_bytes | Histogram | method | Request body size distribution |
| deltaglider_http_response_size_bytes | Histogram | method | Response body size distribution |

operation label values (bounded)

| Value | Meaning |
|---|---|
| list_buckets | GET / |
| head_root | HEAD / |
| list_objects | GET /:bucket |
| create_bucket | PUT /:bucket |
| delete_bucket | DELETE /:bucket |
| head_bucket | HEAD /:bucket |
| post_bucket | POST /:bucket (batch delete) |
| get_object | GET /:bucket/*key |
| put_object | PUT /:bucket/*key |
| delete_object | DELETE /:bucket/*key |
| head_object | HEAD /:bucket/*key |
| post_object | POST /:bucket/*key (multipart) |
| health | GET /health |
| stats | GET /stats |
| metrics | GET /_/metrics |
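The mapping above depends only on the HTTP method and the depth of the path, which is why the label stays bounded. The sketch below shows one way such a classifier could look. It is not the proxy's actual code: the name `classify_operation`, the lookup tables, and the choice to return `None` for unmatched requests are all assumptions for illustration.

```python
# Fixed routes from the table above, matched before bucket/key parsing.
FIXED_ROUTES = {
    ("GET", "/health"): "health",
    ("GET", "/stats"): "stats",
    ("GET", "/_/metrics"): "metrics",
}

# (method, depth) -> operation; depth 0 is "/", 1 is "/:bucket", 2 is "/:bucket/*key".
BY_DEPTH = {
    ("GET", 0): "list_buckets",
    ("HEAD", 0): "head_root",
    ("GET", 1): "list_objects",
    ("PUT", 1): "create_bucket",
    ("DELETE", 1): "delete_bucket",
    ("HEAD", 1): "head_bucket",
    ("POST", 1): "post_bucket",
    ("GET", 2): "get_object",
    ("PUT", 2): "put_object",
    ("DELETE", 2): "delete_object",
    ("HEAD", 2): "head_object",
    ("POST", 2): "post_object",
}


def classify_operation(method, path):
    """Return the operation label for a request, or None if nothing matches."""
    method = method.upper()
    if (method, path) in FIXED_ROUTES:
        return FIXED_ROUTES[(method, path)]
    segments = [s for s in path.split("/") if s]
    # Any key, however deeply nested, counts as object level.
    depth = min(len(segments), 2)
    return BY_DEPTH.get((method, depth))


print(classify_operation("PUT", "/releases/v1/app.zip"))
```

Because object keys collapse to a single depth, the label never carries a bucket name or key.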

Histogram buckets

  • Duration: default Prometheus buckets (0.005s … 10s)
  • Body sizes: exponential [1KB, 10KB, 100KB, 1MB, 10MB, 100MB]
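Prometheus histogram buckets are cumulative: each `le` bucket counts every observation less than or equal to its upper bound, and a final `+Inf` bucket counts everything. The sketch below works through that for the body-size boundaries. It assumes 1KB means 1024 bytes, which the text above does not specify, and the observations are invented.

```python
from bisect import bisect_left

KB = 1024  # assumption: binary kilobytes; the source does not say 1000 vs 1024
MB = 1024 * KB
SIZE_BOUNDS = [1 * KB, 10 * KB, 100 * KB, 1 * MB, 10 * MB, 100 * MB]


def cumulative_counts(observations, bounds):
    """Return cumulative bucket counts; the last entry is the +Inf bucket."""
    per_bucket = [0] * (len(bounds) + 1)
    for value in observations:
        # Index of the first bound >= value (the le bound is inclusive).
        per_bucket[bisect_left(bounds, value)] += 1
    cumulative, running = [], 0
    for count in per_bucket:
        running += count
        cumulative.append(running)
    return cumulative


# Invented request body sizes in bytes.
observed = [500, 1024, 2048, 5 * MB, 200 * MB]
print(cumulative_counts(observed, SIZE_BOUNDS))
```

The 200MB body lands only in the `+Inf` bucket, so sizes above 100MB cannot be distinguished from one another in these histograms.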

Delta compression

| Metric | Type | Labels | Description |
|---|---|---|---|
| deltaglider_delta_compression_ratio | Histogram | | Ratio distribution (delta_size / original_size). Lower = better; 0.1 = 90% saved |
| deltaglider_delta_bytes_saved_total | Counter | | Cumulative bytes saved by delta compression |
| deltaglider_delta_encode_duration_seconds | Histogram | | Time spent in xdelta3 encode |
| deltaglider_delta_decode_duration_seconds | Histogram | | Time spent in xdelta3 decode |
| deltaglider_delta_decisions_total | Counter | decision | Storage decision counts |

decision label values

  • delta — stored as a delta patch against the reference baseline
  • passthrough — stored as-is (non-eligible file type, or poor compression ratio)
  • reference — new reference baseline created for a deltaspace

Histogram buckets

  • Codec duration: [1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s]
  • Compression ratio: [0.01, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
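As a worked example of the ratio definition (delta_size / original_size), the sketch below computes the ratio for one object, the fraction saved, and the smallest histogram bucket that would count it. The helper names and the object sizes are illustrative. That the bytes-saved counter grows by original_size minus delta_size is an assumption, not something stated above.

```python
from bisect import bisect_left

RATIO_BOUNDS = [0.01, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]


def compression_ratio(delta_size, original_size):
    """delta_size / original_size; lower is better."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return delta_size / original_size


def smallest_bucket(ratio):
    """Upper bound of the first bucket containing ratio, or None for +Inf."""
    index = bisect_left(RATIO_BOUNDS, ratio)
    return RATIO_BOUNDS[index] if index < len(RATIO_BOUNDS) else None


# Invented sizes: a 10 MB original stored as a 1 MB delta.
original_size, delta_size = 10_000_000, 1_000_000
ratio = compression_ratio(delta_size, original_size)
bytes_saved = original_size - delta_size  # assumed definition of "bytes saved"

print(ratio, smallest_bucket(ratio), bytes_saved)
```

A ratio above 1.0 means the delta is larger than the original and would fall only in the `+Inf` bucket.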

Cache

| Metric | Type | Labels | Description |
|---|---|---|---|
| deltaglider_cache_hits_total | Counter | | Reference cache hits (cheap Bytes refcount clone) |
| deltaglider_cache_misses_total | Counter | | Reference cache misses (triggers backend read) |
| deltaglider_cache_size_bytes | Gauge | | Current weighted cache size (updated on scrape) |
| deltaglider_cache_entries | Gauge | | Current number of cached reference entries |
| deltaglider_cache_max_bytes | Gauge | | Configured max capacity (constant, set at startup) |
| deltaglider_cache_utilization_ratio | Gauge | | weighted_size / max_capacity (0.0–1.0) |
| deltaglider_cache_miss_rate_ratio | Gauge | | misses / (hits + misses) since startup (0.0–1.0) |

The ratio gauges are pre-computed so that dashboards and alerts don't need PromQL arithmetic:

deltaglider_cache_utilization_ratio > 0.9   # cache nearly full
deltaglider_cache_miss_rate_ratio > 0.5     # cache thrashing
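For reference, the two ratios can be reproduced from the raw counters and gauges using the formulas in the table. This is a minimal sketch: the function names are hypothetical, and returning 0.0 before any lookup or with zero capacity is an assumption about edge cases the text does not cover.

```python
def miss_rate(hits, misses):
    """misses / (hits + misses), counted since startup."""
    total = hits + misses
    # Assumption: report 0.0 before the first cache lookup.
    return misses / total if total else 0.0


def utilization(weighted_size, max_capacity):
    """weighted_size / max_capacity."""
    # Assumption: report 0.0 when no capacity is configured.
    return weighted_size / max_capacity if max_capacity else 0.0


# Invented readings: 75 hits, 25 misses, 512 MiB used of a 1 GiB cache.
print(miss_rate(75, 25))
print(utilization(512 * 1024**2, 1024**3))
```

Because the miss rate is cumulative since startup, it reacts slowly on long-running processes; a windowed view would need `rate()` over the two counters instead.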

Codec concurrency

| Metric | Type | Labels | Description |
|---|---|---|---|
| deltaglider_codec_semaphore_available | Gauge | | Available xdelta3 subprocess permits. 0 = all slots busy |

Auth

| Metric | Type | Labels | Description |
|---|---|---|---|
| deltaglider_auth_attempts_total | Counter | result | Auth attempts: success or failure |
| deltaglider_auth_failures_total | Counter | reason | Failure breakdown: missing_header, invalid_presigned, invalid_signature |

Auth metrics stay at zero when SigV4 is disabled.

Label cardinality

All label sets are bounded:

| Label | Max values |
|---|---|
| method | ~5 (GET, PUT, HEAD, DELETE, POST) |
| status | ~15 HTTP status codes in practice |
| operation | 15 (see table above) |
| decision | 3 (delta, passthrough, reference) |
| result | 2 (success, failure) |
| reason | 3 (missing_header, invalid_presigned, invalid_signature) |

No bucket names, no object keys in labels. No unbounded cardinality.
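To put a number on "bounded", the sketch below multiplies the per-label maxima into a worst-case series count for the HTTP metrics. It treats the approximate values (~5, ~15) as exact, and it counts each histogram label combination as its finite buckets plus the `+Inf` bucket, `_sum`, and `_count`. The helper names are illustrative.

```python
from math import prod

# Per-label maxima from the cardinality table, treating "~" values as exact.
LABEL_VALUES = {"method": 5, "status": 15, "operation": 15}

DURATION_BUCKETS = 11  # default Prometheus buckets, 0.005s to 10s
SIZE_BUCKETS = 6       # 1KB to 100MB


def counter_series(labels):
    """Worst-case number of series for a counter with these labels."""
    return prod(LABEL_VALUES[name] for name in labels)


def histogram_series(labels, finite_buckets):
    """Each label combination exposes the finite buckets, +Inf, _sum and _count."""
    return counter_series(labels) * (finite_buckets + 3)


requests_total = counter_series(["method", "status", "operation"])
duration = histogram_series(["method", "operation"], DURATION_BUCKETS)
body_size = histogram_series(["method"], SIZE_BUCKETS)

print(requests_total, duration, body_size)
```

These are loose upper bounds: many method and operation pairs never occur together (there is no DELETE health, for example), so real series counts will be well below them.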

What's NOT in /_/metrics

/_/stats returns aggregate storage statistics (total_objects, total_original_size, total_stored_size, savings_percentage, truncated). These are intentionally excluded from /_/metrics because computing them requires scanning storage objects. The endpoint has a 10-second server-side cache and caps at 1,000 objects (the truncated field signals more exist). Use /_/stats for admin dashboards; use /_/metrics for Prometheus.
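The cap and the truncated flag can be pictured with a short sketch: scan at most 1,000 objects, then check whether anything was left over. This is not the proxy's code. The `summarize` function, the (original_size, stored_size) tuple shape, and the savings_percentage formula are all assumptions for illustration.

```python
from itertools import islice


def summarize(objects, limit=1000):
    """Aggregate (original_size, stored_size) pairs, scanning at most `limit`."""
    iterator = iter(objects)
    scanned = list(islice(iterator, limit))
    # If one more item is available, the scan stopped early.
    truncated = next(iterator, None) is not None
    total_original = sum(original for original, _ in scanned)
    total_stored = sum(stored for _, stored in scanned)
    # Assumed formula: share of the original bytes that did not need storing.
    savings = (1 - total_stored / total_original) * 100 if total_original else 0.0
    return {
        "total_objects": len(scanned),
        "total_original_size": total_original,
        "total_stored_size": total_stored,
        "savings_percentage": savings,
        "truncated": truncated,
    }


# Invented data: 1,500 objects of 100 bytes, each stored in 25 bytes.
stats = summarize([(100, 25)] * 1500)
print(stats["total_objects"], stats["savings_percentage"], stats["truncated"])
```

When truncated is true, the totals describe only the scanned subset, so they should be read as a sample and not as whole-store figures.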

Implementation details

  • Counters and histograms use the prometheus crate's atomic collectors — no mutex on the hot path.
  • Gauges requiring state inspection (cache_size_bytes, codec_semaphore_available, process_peak_rss_bytes) are computed lazily on each scrape via O(1) atomic reads.
  • The HTTP metrics middleware sits between TraceLayer and auth, so it captures the full request lifecycle including auth time.
  • The process feature of the prometheus crate adds standard Linux process metrics. On macOS, only process_peak_rss_bytes is populated (via getrusage).
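The getrusage call mentioned in the last point is also reachable from Python's standard library, which makes it easy to cross-check the gauge from a script. The sketch below reads the peak RSS of the current Python process, not of the proxy. The unit handling reflects the documented platform difference: `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS. The function name is illustrative.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of the current process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is already in bytes on macOS, and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024


print(peak_rss_bytes())
```

The `resource` module is only available on Unix-like systems.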