Files
sx/design/bundled-zig-link-backend-design.md
agra b6a7378af4 feat(dist): bundled-zig link backend for hermetic macOS/Linux/Windows builds
Drive a bundled `zig` as `zig cc` for the AOT link step, supplying lld + CRT
+ libc (musl/glibc/mingw) so `sx build` produces native binaries with no host
toolchain. Default Linux output is static musl (portable-anywhere).

- src/zig_backend.zig: discover zig ($SX_ZIG / bundled-next-to-exe / PATH);
  bundled-vs-PATH provenance gates auto-activation.
- src/target.zig: selectZigLinker + emitZigLinkArgv + zigTargetTriple, dispatched
  before the per-OS branches; macOS/Linux/Windows in scope.
- src/ir/emit_llvm.zig: LLVMNormalizeTargetTriple so vendor-less zig triples
  (e.g. x86_64-windows-gnu) parse to the correct OS/object format (COFF not ELF).
- src/main.zig: --self-contained / --no-self-contained; linux-musl, linux-musl-arm,
  windows-gnu shorthands; de-vendor linux/linux-arm to match the corpus runner.
- examples/1660: Windows Win32 print-42 + exit(0) via kernel32 (ir-only off-Windows).

Auto-activates only for a bundled zig; a PATH-only zig engages under
--self-contained, so native dev/CI builds are never silently rerouted.

Docs: readme Cross-Compilation, design/bundled-zig-link-backend-design.md, current/PLAN-DIST.md.
2026-06-16 15:56:06 +03:00

18 KiB
Raw Permalink Blame History

Bundled zig Link Backend for sx — Design Doc & Proposal

Status: core landed (macOS / Linux / Windows). This is the design-of-record for how a distributed sx links native binaries hermetically. The phased plan lives in ../current/PLAN-DIST.md; keep the two in sync. User-facing surface is documented in readme.md (Cross-Compilation §).


Implementation status (landed)

The core backend is implemented and verified on a macOS host:

Target Result Notes
--target linux-musl static ELF zig cc -target x86_64-linux-musl -static
--target windows-gnu PE32+ zig cc -target x86_64-windows-gnu
--target macos Mach-O (runs) zig cc -target <arch>-macos, no -static

What shipped, and where it refined the original locked decisions:

  • Scope = macOS + Linux + Windows (not Linux-first). iOS/Android/wasm keep their specialized toolchains. (TargetConfig.zigBackendInScope.)
  • Auto-activation = a bundled zig is found (a real distribution, or a pinned $SX_ZIG). A PATH-only zig is the dev fallback and engages only under --self-contained — so native dev/CI builds are never silently rerouted, across all three OSes. This is the precise meaning of the §5.5 "zig found (B)" column: B = bundled. (Refinement of "auto when zig found": PATH-zig does not auto-engage; the musl-only auto gating considered mid-design was dropped in favor of bundled-vs-PATH, which is OS-agnostic.)
  • No translation table (per the triple-scheme decision): sx triples are passed straight to zig cc, and emit_llvm runs them through LLVMNormalizeTargetTriple so vendor-less zig triples (e.g. x86_64-windows-gnu) land their OS/env in LLVM's canonical positions — otherwise "windows" sits in the vendor slot and the object silently falls back to ELF. The one unavoidable exception is macOS: the object must be emitted from Apple's apple-darwin triple (LLVM needs it for Mach-O), but zig's -target parser rejects that scheme, so the linker triple alone is the vendor-less <arch>-macos. One OS-specific line, not a table.
  • New shorthands: linux-musl, linux-musl-arm, windows-gnu (zig scheme). The existing linux/linux-arm shorthands were also de-vendored (x86_64-linux-gnu, matching the corpus runner's own expander).

Files: src/zig_backend.zig (discovery), src/target.zig (selectZigLinker / emitZigLinkArgv / zigTargetTriple / dispatch in link), src/ir/emit_llvm.zig (triple normalization), src/main.zig (--self-contained / --no-self-contained + shorthands).

Not yet done: distribution packaging (Phase 3 — vendoring zig into libexec/), and a corpus regression test (needs the runner to thread --self-contained; manual verification only so far).

The sections below are the original proposal; where they say "Linux-first" or "follow-up" for macOS/Windows, the table above supersedes them.


0. TL;DR + feasibility

Problem. A distributed sx compiler can run on a Linux box (static-LLVM binary + relocatable library/), but it cannot finish a build: the final link step shells out to the host's cc, and relies on the host's libc + CRT objects. No cc/glibc/SDK on the box → no binary. That is the gap between "sx runs here" and "sx is a toolchain here."

Proposal. Bundle a pinned zig binary inside the sx distribution and use zig cc as the link backend for sx build. zig cc brings its own lld, CRT objects, and libc (musl or glibc) for the chosen target. Default Linux output is statically-linked musl, which runs on any Linux with zero dependencies — the property that makes Zig's own output portable.

Feasibility: high. The change is contained:

  • The linker is selected through a single hook — TargetConfig.getLinker() at src/target.zig:194-196 — and the final link argv is built in one place, the Unix cc-style branch at src/target.zig:524-564.
  • zig cc is a clang-compatible driver, so -o / -L / -l / extra objects pass through that branch unchanged. The backend only has to prepend zig cc and add -target … / -static.
  • Exe-relative resolution (for finding the bundled zig) is already solved for the stdlib in src/imports.zig:204-227 and can be mirrored.
  • sx run is JIT and never links, so it is wholly unaffected.

The cost is a ~5060 MB vendored zig (binary + its lib/) in the distribution, and version-pinning discipline.


1. Motivation & background

1.1 Current state

Concern Today File
Compiler binary Self-containable via -Dstatic-llvm (no system LLVM) build.zig:9-10,156-162
Stdlib Relocatable, found relative to the exe src/imports.zig:204-227
Linking Shells to system cc src/target.zig:524-564
libc / CRT Provided by the host cc driver implicitly (no -lc/crt passed)

So two of three legs of a portable toolchain already stand. The third — the linker and the libc/CRT it pulls in — is the host dependency this design removes.

1.2 Why this matters for distribution

The goal is to hand someone a tarball and have sx build app.sx produce a working binary on a stock Linux machine — a fresh container, a minimal CI image, a box without build-essential. Today that fails at the link step. Zig solved exactly this problem for its own users; since sx is built with Zig, the cleanest fix is to stand on Zig's hermetic toolchain rather than re-implement it.


2. Goals & non-goals

Goals

  • sx build produces a native Linux binary with no host cc/ld/libc/SDK.
  • Default Linux output is portable (static musl): runs on any Linux.
  • Zero-config in the common case: a bundled or PATH zig is detected and used automatically; the operator sets nothing.
  • A fully-specified, documented configuration surface (this document) for the cases that do need tuning.
  • No regression for existing users: system cc remains a fallback, and any explicit --linker still wins.

Non-goals (this iteration)

  • Reimplementing lld in-process or building libc from source (see §7 — Zig already does both; we reuse it).
  • First-class Windows/macOS cross-compilation (nearly free as a follow-up, but unverified — §11).
  • Routing C-import compilation (src/c_import.zig, which also shells cc) through the backend.
  • Glibc-floor version pinning (…-gnu.2.28); exposed only if needed.

3. How Zig achieves hermetic builds (the model we're borrowing)

Zig's turnkey cross-compilation rests on bundling the two things sx borrows from the host:

  1. In-process lld. Zig embeds LLVM's lld (ELF/COFF/Mach-O/wasm) and links without spawning an external linker.
  2. libc as data. Zig ships musl source (builds libc.a + crt*.o on demand, cached → static, no dynamic linker → portable output) and glibc stubs generated from .abilist per version. For Windows it ships mingw .def files and synthesizes import libraries.

zig cc exposes all of this behind a clang-compatible driver: zig cc -target x86_64-linux-musl -static foo.o -o foo yields a portable binary on any host, with nothing installed. This design consumes that driver rather than rebuilding its internals — the whole second column above arrives for free by vendoring the zig binary.


4. Design overview

sx build gains a link backend abstraction with two implementations:

  • system_cc — today's behavior (shell cc, host libc).
  • bundled_zig — shell <zig> cc -target <triple> [-static] ….

Selection is automatic (§5.5): if a usable zig is discovered and the user gave no explicit --linker, bundled_zig is used; otherwise system_cc. The backend plugs into the existing Unix link branch — it contributes the leading zig cc tokens and the -target/-static flags; the rest of the argv assembly is unchanged because zig cc is clang-compatible.

One supporting change: when bundled_zig is active, the triple handed to LLVM in src/ir/emit_llvm.zig is aligned to the link target (x86_64-linux) so the emitted object links cleanly against the selected musl CRT.


5. Detailed design (the configuration surface)

5.1 zig discovery — resolution order

discoverZig() (new src/zig_backend.zig) returns the first hit:

  1. $SX_ZIG — explicit override.
  2. <exe_dir>/../libexec/zig/ziginstall layout (§6).
  3. <exe_dir>/../../zig-bundle/zigdev vendored layout (§6).
  4. zig on PATHdev fallback (the only one active today).

<exe_dir> is resolved exactly as src/imports.zig resolves the stdlib. If none resolve, behavior depends on activation (§5.5): auto-mode silently falls back to system_cc; --self-contained errors.

5.2 Environment variables

Var Effect Default
SX_ZIG Absolute path to the zig used as the link backend. Highest-priority discovery source. unset
ZIG_LIB_DIR Path to the bundled zig's lib/. Needed only if zig was relocated away from its lib/. In the supported layout (§6) they ship together and zig self-locates — leave unset. unset
SX_DEBUG_ZIG Trace discovery: each candidate path and the chosen one (or "none → cc"). Mirrors SX_DEBUG_STDLIB. unset
SX_DEBUG_LINK Existing. Prints the full link argv — shows the exact zig cc … invocation. unset
SX_STDLIB_PATH Existing. Stdlib override; unrelated to linking but noted because a full distribution sets neither and relies on exe-relative discovery for both. unset

5.3 CLI flags (sx build)

Flag Effect
--self-contained Force bundled_zig ON. If no usable zig is found, error — do not silently fall back.
--no-self-contained Force system_cc.
--linker <cmd> Existing. Explicit linker; supplying it disables auto-activation (user's choice wins). To pin a specific zig, prefer SX_ZIG + --self-contained.
--target <triple|shorthand> Existing. Selects target + ABI (§5.4). With bundled_zig active and target unspecified on a Linux host → x86_64-linux-musl static.
--sysroot <path> Existing. Forwarded to the linker; rarely needed with bundled_zig (zig brings its own sysroot).

5.4 Target → ABI mapping

The default (no --target) deliberately differs from the legacy linux shorthand, because portable static output is the entire point.

sx invocation zig -target Link mode Portable?
(no --target, Linux host) x86_64-linux-musl -static any Linux
--target linux-musl (new) x86_64-linux-musl -static
--target linux / linux-x86 x86_64-linux-gnu dynamic host glibc, versioned
--target linux-arm aarch64-linux-musl -static
--target windows x86_64-windows-gnu per zig follow-up (§11)
--target macos / macos-arm aarch64-macos per zig follow-up (§11)
  • A new linux-musl shorthand is added; the existing linux shorthand keeps its current gnu/dynamic meaning for back-compat.
  • The LLVM emit triple is aligned to the link target so the .o links cleanly against the selected libc/CRT (§4).

5.5 Activation truth table

B = a usable zig was discovered (§5.1). Subcommand = sx build.

--self-contained --no-self-contained --linker zig found (B) Result
no yes bundled_zig (auto)
no no system cc (silent fallback)
yes * user's --linker
yes * yes bundled_zig (forced)
yes * no error: --self-contained but no zig
yes * * system cc (forced off)
  • --self-contained + --linker together: backend choice goes to --self-contained; treat the literal combination as a usage error (document, don't guess).
  • sx run / sx ir / sx asm never link → backend not consulted.

5.6 Emit-triple alignment

src/ir/emit_llvm.zig (LLVMSetTarget, ~L246-284) currently uses the host default triple when --target is unspecified (on Linux, x86_64-unknown-linux-gnu). When bundled_zig is active, set the module triple to match the link target (x86_64-linux) so codegen and the musl CRT agree. Pure codegen objects are ABI-compatible across gnu/musl; aligning the triple removes the edge-case risk (TLS model, stack protector) up front.


6. Distribution layout (packaging)

A relocatable tree; everything resolves relative to bin/sx, so the whole directory moves/untars anywhere with no env vars set:

sx-<os>-<arch>/
├── bin/
│   └── sx                 # built -Dstatic-llvm (no system LLVM dep)
├── libexec/
│   └── zig/
│       ├── zig            # pinned zig binary
│       └── lib/           # zig's lib/ (musl/glibc sources, lld data, …)
└── library/               # sx stdlib (existing discovery)
    └── modules/…

Rules:

  • zig and its lib/ must ship together under libexec/zig/ so zig self-locates lib/; splitting them forces ZIG_LIB_DIR.
  • Pinned zig version: 0.16.0 (matches the build toolchain). Record the exact version in the release manifest — a mismatched zig cc CLI is the likeliest future breakage.
  • Vendor the matching zig release per host os/arch from ziglang.org at package time.

7. Alternatives considered

Alternative Why not (now)
In-process lld + bundled musl sysroot (sx owns the pipeline; no zig) Requires a custom LLVM build with lld — the Homebrew llvm@19 here ships none (liblld*.a, headers, ld.lld all absent) — plus a C++ lld shim and per-arch prebuilt musl. Strictly more work for the same user-visible result. The right eventual target if we want zero foreign binaries; tracked as a follow-up.
Full Zig-style: build libc from source on demand Most flexible (any arch/libc version, no prebuilt blobs) but the most work; only worth it after the in-process-lld path exists.
Document a hard dependency on system cc Zero engineering, but defeats the goal — the box still needs build-essential. Acceptable only as the current fallback, not the distribution story.
Bundle just ld.lld + a musl sysroot (no full zig) Smaller than a whole zig, but we'd hand-manage crt object selection, dynamic-linker paths, and import libs — i.e. re-derive what zig cc already encapsulates. Bundle-size saving doesn't justify the fragility.

Vendoring zig wins on effort-to-result because sx already builds with Zig: it's a first-party dependency, not a foreign toolchain, and it unlocks Windows/macOS targets later for nearly free.


8. Phasing

Detail in ../current/PLAN-DIST.md. Summary:

  1. Resolve zigdiscoverZig() + SX_DEBUG_ZIG; PATH fallback only.
  2. Link backend — generalize the linker to a driver argv; emit zig cc -target … -static; align the emit triple.
  3. Auto activation — wire the §5.5 truth table; cc fallback intact.
  4. Packagingbuild.zig dist step assembling the §6 tree.
  5. Verify & lockfile/ldd shows "statically linked"; host/arch-gated corpus test honoring the snapshot-integrity + FFI-cadence rules.

The minimum end-to-end proof is Phases 0+1 against PATH zig.


9. Open decisions

Locked:

  • Default Linux ABI = static musl (portable output).
  • Activation = auto when a usable zig is found and no --linker.
  • Dev uses PATH zig; vendoring deferred to Phase 3.

Still open:

  • Exact spelling of the force flags (--self-contained vs e.g. --bundled-linker); name chosen here pending review.
  • Whether auto-mode should warn on silent cc fallback or stay quiet (leaning quiet, with SX_DEBUG_ZIG for diagnosis).
  • Whether to gate the Phase-4 corpus test behind a .build target sidecar or keep it manual until a Linux CI runner exists.

10. Risks

  • Bundle size ≈ 5060 MB (zig + lib/). Acceptable for a toolchain; call it out in release notes.
  • zig CLI drift across versions — pin hard, record in the manifest; the most likely future breakage.
  • gnu vs musl ABI for the emitted object — covered by the emit-triple alignment (§5.6); TLS/stack-protector are the only realistic friction.
  • Operator confusion: default-no-target (musl) diverging from the linux shorthand (gnu). Mitigated by the new linux-musl shorthand and explicit documentation (§5.4).

11. Out of scope / follow-ups

  • Windows / macOS targets via the same zig cc -target: nearly free after the Linux path, but Apple-SDK and Windows specifics need their own verification — not documented as supported until tested.
  • src/c_import.zig still shells system cc for C imports in JIT mode; route through the backend later.
  • In-process lld (alternative in §7) as the eventual zero-foreign-binary endgame.

Appendix — quick recipes (once implemented)

# Portable static Linux binary (default when a bundled zig is present):
sx build app.sx -o app
file app        # → "ELF 64-bit … statically linked"

# Force the backend; fail loudly if no zig is bundled:
sx build app.sx --self-contained

# Use a specific zig:
SX_ZIG=/opt/zig-0.16.0/zig sx build app.sx --self-contained

# Opt out, use the system toolchain:
sx build app.sx --no-self-contained

# Dynamic glibc instead of static musl:
sx build app.sx --target linux

# Debug discovery + the exact link invocation:
SX_DEBUG_ZIG=1 SX_DEBUG_LINK=1 sx build app.sx