Files
sx/design/bundled-zig-link-backend-design.md
agra b6a7378af4 feat(dist): bundled-zig link backend for hermetic macOS/Linux/Windows builds
Drive a bundled `zig` as `zig cc` for the AOT link step, supplying lld + CRT
+ libc (musl/glibc/mingw) so `sx build` produces native binaries with no host
toolchain. Default Linux output is static musl (portable-anywhere).

- src/zig_backend.zig: discover zig ($SX_ZIG / bundled-next-to-exe / PATH);
  bundled-vs-PATH provenance gates auto-activation.
- src/target.zig: selectZigLinker + emitZigLinkArgv + zigTargetTriple, dispatched
  before the per-OS branches; macOS/Linux/Windows in scope.
- src/ir/emit_llvm.zig: LLVMNormalizeTargetTriple so vendor-less zig triples
  (e.g. x86_64-windows-gnu) parse to the correct OS/object format (COFF not ELF).
- src/main.zig: --self-contained / --no-self-contained; linux-musl, linux-musl-arm,
  windows-gnu shorthands; de-vendor linux/linux-arm to match the corpus runner.
- examples/1660: Windows Win32 print-42 + exit(0) via kernel32 (ir-only off-Windows).

Auto-activates only for a bundled zig; a PATH-only zig engages under
--self-contained, so native dev/CI builds are never silently rerouted.

Docs: readme Cross-Compilation, design/bundled-zig-link-backend-design.md, current/PLAN-DIST.md.
2026-06-16 15:56:06 +03:00

385 lines
18 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Bundled `zig` Link Backend for sx — Design Doc & Proposal
> Status: **core landed (macOS / Linux / Windows).** This is the
> design-of-record for how a distributed sx links native binaries
> hermetically. The phased plan lives in
> [../current/PLAN-DIST.md](../current/PLAN-DIST.md); keep the two in sync.
> User-facing surface is documented in `readme.md` (Cross-Compilation §).
---
## Implementation status (landed)
The core backend is implemented and verified on a macOS host:
| Target | Result | Notes |
|--------|--------|-------|
| `--target linux-musl` | static ELF | `zig cc -target x86_64-linux-musl -static` |
| `--target windows-gnu` | PE32+ | `zig cc -target x86_64-windows-gnu` |
| `--target macos` | Mach-O (runs) | `zig cc -target <arch>-macos`, no `-static` |
What shipped, and where it **refined** the original locked decisions:
- **Scope = macOS + Linux + Windows** (not Linux-first). iOS/Android/wasm keep
their specialized toolchains. (`TargetConfig.zigBackendInScope`.)
- **Auto-activation = a *bundled* zig is found** (a real distribution, or a
pinned `$SX_ZIG`). A `PATH`-only zig is the dev fallback and engages **only**
under `--self-contained` — so native dev/CI builds are never silently
rerouted, across all three OSes. This is the precise meaning of the §5.5
"zig found (B)" column: **B = bundled**. *(Refinement of "auto when zig
found": PATH-zig does not auto-engage; the musl-only auto gating considered
mid-design was dropped in favor of bundled-vs-PATH, which is OS-agnostic.)*
- **No translation table** (per the triple-scheme decision): sx triples are
passed straight to `zig cc`, and `emit_llvm` runs them through
`LLVMNormalizeTargetTriple` so vendor-less zig triples (e.g.
`x86_64-windows-gnu`) land their OS/env in LLVM's canonical positions —
otherwise "windows" sits in the vendor slot and the object silently falls
back to ELF. The one unavoidable exception is **macOS**: the object must be
emitted from Apple's `apple-darwin` triple (LLVM needs it for Mach-O), but
zig's `-target` parser rejects that scheme, so the *linker* triple alone is
the vendor-less `<arch>-macos`. One OS-specific line, not a table.
- **New shorthands:** `linux-musl`, `linux-musl-arm`, `windows-gnu` (zig
scheme). The existing `linux`/`linux-arm` shorthands were also de-vendored
(`x86_64-linux-gnu`, matching the corpus runner's own expander).
Files: `src/zig_backend.zig` (discovery), `src/target.zig`
(`selectZigLinker` / `emitZigLinkArgv` / `zigTargetTriple` / dispatch in
`link`), `src/ir/emit_llvm.zig` (triple normalization), `src/main.zig`
(`--self-contained` / `--no-self-contained` + shorthands).
Not yet done: distribution packaging (Phase 3 — vendoring `zig` into
`libexec/`), and a corpus regression test (needs the runner to thread
`--self-contained`; manual verification only so far).
The sections below are the original proposal; where they say "Linux-first" or
"follow-up" for macOS/Windows, the table above supersedes them.
---
## 0. TL;DR + feasibility
**Problem.** A distributed `sx` compiler can run on a Linux box (static-LLVM
binary + relocatable `library/`), but it cannot *finish a build*: the final
link step shells out to the host's `cc`, and relies on the host's libc + CRT
objects. No `cc`/glibc/SDK on the box → no binary. That is the gap between
"sx runs here" and "sx is a toolchain here."
**Proposal.** Bundle a pinned `zig` binary inside the sx distribution and use
`zig cc` as the link backend for `sx build`. `zig cc` brings its own lld,
CRT objects, and libc (musl or glibc) for the chosen target. Default Linux
output is **statically-linked musl**, which runs on any Linux with zero
dependencies — the property that makes Zig's own output portable.
**Feasibility: high.** The change is contained:
- The linker is selected through a single hook —
`TargetConfig.getLinker()` at `src/target.zig:194-196` — and the final
link argv is built in one place, the Unix `cc`-style branch at
`src/target.zig:524-564`.
- `zig cc` is a clang-compatible driver, so `-o` / `-L` / `-l` / extra
objects pass through that branch unchanged. The backend only has to
prepend `zig cc` and add `-target …` / `-static`.
- Exe-relative resolution (for finding the bundled zig) is already solved
for the stdlib in `src/imports.zig:204-227` and can be mirrored.
- `sx run` is JIT and never links, so it is wholly unaffected.
The cost is a ~5060 MB vendored `zig` (binary + its `lib/`) in the
distribution, and version-pinning discipline.
---
## 1. Motivation & background
### 1.1 Current state
| Concern | Today | File |
|---------|-------|------|
| Compiler binary | Self-containable via `-Dstatic-llvm` (no system LLVM) | `build.zig:9-10,156-162` |
| Stdlib | Relocatable, found relative to the exe | `src/imports.zig:204-227` |
| **Linking** | **Shells to system `cc`** | `src/target.zig:524-564` |
| **libc / CRT** | **Provided by the host `cc` driver implicitly** | (no `-lc`/crt passed) |
So two of three legs of a portable toolchain already stand. The third — the
linker and the libc/CRT it pulls in — is the host dependency this design
removes.
### 1.2 Why this matters for distribution
The goal is to hand someone a tarball and have `sx build app.sx` produce a
working binary on a stock Linux machine — a fresh container, a minimal CI
image, a box without `build-essential`. Today that fails at the link step.
Zig solved exactly this problem for its own users; since sx is *built with*
Zig, the cleanest fix is to stand on Zig's hermetic toolchain rather than
re-implement it.
---
## 2. Goals & non-goals
### Goals
- `sx build` produces a native Linux binary with **no host `cc`/ld/libc/SDK**.
- Default Linux output is **portable** (static musl): runs on any Linux.
- **Zero-config in the common case**: a bundled or PATH `zig` is detected and
used automatically; the operator sets nothing.
- A fully-specified, documented configuration surface (this document) for the
cases that *do* need tuning.
- No regression for existing users: system `cc` remains a fallback, and any
explicit `--linker` still wins.
### Non-goals (this iteration)
- Reimplementing lld in-process or building libc from source (see §7 —
Zig already does both; we reuse it).
- First-class Windows/macOS cross-compilation (nearly free as a follow-up,
but unverified — §11).
- Routing C-import compilation (`src/c_import.zig`, which also shells `cc`)
through the backend.
- Glibc-floor version pinning (`…-gnu.2.28`); exposed only if needed.
---
## 3. How Zig achieves hermetic builds (the model we're borrowing)
Zig's turnkey cross-compilation rests on bundling the two things sx borrows
from the host:
1. **In-process lld.** Zig embeds LLVM's lld (ELF/COFF/Mach-O/wasm) and links
without spawning an external linker.
2. **libc as data.** Zig ships musl *source* (builds `libc.a` + `crt*.o` on
demand, cached → static, no dynamic linker → portable output) and glibc
stubs generated from `.abilist` per version. For Windows it ships mingw
`.def` files and synthesizes import libraries.
`zig cc` exposes all of this behind a clang-compatible driver: `zig cc
-target x86_64-linux-musl -static foo.o -o foo` yields a portable binary on
any host, with nothing installed. **This design consumes that driver rather
than rebuilding its internals** — the whole second column above arrives for
free by vendoring the `zig` binary.
---
## 4. Design overview
`sx build` gains a **link backend** abstraction with two implementations:
- `system_cc` — today's behavior (shell `cc`, host libc).
- `bundled_zig` — shell `<zig> cc -target <triple> [-static] …`.
Selection is automatic (§5.5): if a usable `zig` is discovered and the user
gave no explicit `--linker`, `bundled_zig` is used; otherwise `system_cc`.
The backend plugs into the existing Unix link branch — it contributes the
leading `zig cc` tokens and the `-target`/`-static` flags; the rest of the
argv assembly is unchanged because `zig cc` is clang-compatible.
One supporting change: when `bundled_zig` is active, the triple handed to
LLVM in `src/ir/emit_llvm.zig` is aligned to the link target (`x86_64-linux`)
so the emitted object links cleanly against the selected musl CRT.
---
## 5. Detailed design (the configuration surface)
### 5.1 zig discovery — resolution order
`discoverZig()` (new `src/zig_backend.zig`) returns the first hit:
1. `$SX_ZIG` — explicit override.
2. `<exe_dir>/../libexec/zig/zig`**install layout** (§6).
3. `<exe_dir>/../../zig-bundle/zig`**dev vendored layout** (§6).
4. `zig` on `PATH`**dev fallback** (the only one active today).
`<exe_dir>` is resolved exactly as `src/imports.zig` resolves the stdlib.
If none resolve, behavior depends on activation (§5.5): auto-mode silently
falls back to `system_cc`; `--self-contained` errors.
### 5.2 Environment variables
| Var | Effect | Default |
|-----|--------|---------|
| `SX_ZIG` | Absolute path to the `zig` used as the link backend. Highest-priority discovery source. | unset |
| `ZIG_LIB_DIR` | Path to the bundled zig's `lib/`. Needed **only** if `zig` was relocated away from its `lib/`. In the supported layout (§6) they ship together and zig self-locates — leave unset. | unset |
| `SX_DEBUG_ZIG` | Trace discovery: each candidate path and the chosen one (or "none → cc"). Mirrors `SX_DEBUG_STDLIB`. | unset |
| `SX_DEBUG_LINK` | **Existing.** Prints the full link argv — shows the exact `zig cc …` invocation. | unset |
| `SX_STDLIB_PATH` | **Existing.** Stdlib override; unrelated to linking but noted because a full distribution sets neither and relies on exe-relative discovery for both. | unset |
### 5.3 CLI flags (`sx build`)
| Flag | Effect |
|------|--------|
| `--self-contained` | Force `bundled_zig` ON. If no usable zig is found, **error** — do not silently fall back. |
| `--no-self-contained` | Force `system_cc`. |
| `--linker <cmd>` | **Existing.** Explicit linker; supplying it **disables** auto-activation (user's choice wins). To pin a specific zig, prefer `SX_ZIG` + `--self-contained`. |
| `--target <triple\|shorthand>` | **Existing.** Selects target + ABI (§5.4). With `bundled_zig` active and target unspecified on a Linux host → `x86_64-linux-musl` static. |
| `--sysroot <path>` | **Existing.** Forwarded to the linker; rarely needed with `bundled_zig` (zig brings its own sysroot). |
### 5.4 Target → ABI mapping
The default (no `--target`) deliberately differs from the legacy `linux`
shorthand, because portable static output is the entire point.
| `sx` invocation | zig `-target` | Link mode | Portable? |
|-----------------|---------------|-----------|-----------|
| *(no `--target`, Linux host)* | `x86_64-linux-musl` | `-static` | ✅ any Linux |
| `--target linux-musl` *(new)* | `x86_64-linux-musl` | `-static` | ✅ |
| `--target linux` / `linux-x86` | `x86_64-linux-gnu` | dynamic | ❌ host glibc, versioned |
| `--target linux-arm` | `aarch64-linux-musl` | `-static` | ✅ |
| `--target windows` | `x86_64-windows-gnu` | per zig | follow-up (§11) |
| `--target macos` / `macos-arm` | `aarch64-macos` | per zig | follow-up (§11) |
- A **new** `linux-musl` shorthand is added; the existing `linux` shorthand
keeps its current gnu/dynamic meaning for back-compat.
- The LLVM emit triple is aligned to the link target so the `.o` links
cleanly against the selected libc/CRT (§4).
### 5.5 Activation truth table
`B` = a usable zig was discovered (§5.1). Subcommand = `sx build`.
| `--self-contained` | `--no-self-contained` | `--linker` | zig found (B) | Result |
|:---:|:---:|:---:|:---:|--------|
| — | — | no | yes | **bundled_zig** (auto) |
| — | — | no | no | system `cc` (silent fallback) |
| — | — | yes | * | user's `--linker` |
| yes | — | * | yes | **bundled_zig** (forced) |
| yes | — | * | no | **error**: `--self-contained` but no zig |
| — | yes | * | * | system `cc` (forced off) |
- `--self-contained` + `--linker` together: backend choice goes to
`--self-contained`; treat the literal combination as a usage error
(document, don't guess).
- `sx run` / `sx ir` / `sx asm` never link → backend not consulted.
### 5.6 Emit-triple alignment
`src/ir/emit_llvm.zig` (`LLVMSetTarget`, ~L246-284) currently uses the host
default triple when `--target` is unspecified (on Linux,
`x86_64-unknown-linux-gnu`). When `bundled_zig` is active, set the module
triple to match the link target (`x86_64-linux`) so codegen and the musl CRT
agree. Pure codegen objects are ABI-compatible across gnu/musl; aligning the
triple removes the edge-case risk (TLS model, stack protector) up front.
---
## 6. Distribution layout (packaging)
A relocatable tree; everything resolves relative to `bin/sx`, so the whole
directory moves/untars anywhere with no env vars set:
```
sx-<os>-<arch>/
├── bin/
│ └── sx # built -Dstatic-llvm (no system LLVM dep)
├── libexec/
│ └── zig/
│ ├── zig # pinned zig binary
│ └── lib/ # zig's lib/ (musl/glibc sources, lld data, …)
└── library/ # sx stdlib (existing discovery)
└── modules/…
```
Rules:
- `zig` and its `lib/` **must** ship together under `libexec/zig/` so zig
self-locates `lib/`; splitting them forces `ZIG_LIB_DIR`.
- Pinned zig version: **0.16.0** (matches the build toolchain). Record the
exact version in the release manifest — a mismatched `zig cc` CLI is the
likeliest future breakage.
- Vendor the matching zig release per host os/arch from ziglang.org at
package time.
---
## 7. Alternatives considered
| Alternative | Why not (now) |
|-------------|---------------|
| **In-process lld + bundled musl sysroot** (sx owns the pipeline; no zig) | Requires a custom LLVM build *with* lld — the Homebrew `llvm@19` here ships none (`liblld*.a`, headers, `ld.lld` all absent) — plus a C++ lld shim and per-arch prebuilt musl. Strictly more work for the same user-visible result. The right *eventual* target if we want zero foreign binaries; tracked as a follow-up. |
| **Full Zig-style: build libc from source on demand** | Most flexible (any arch/libc version, no prebuilt blobs) but the most work; only worth it after the in-process-lld path exists. |
| **Document a hard dependency on system `cc`** | Zero engineering, but defeats the goal — the box still needs `build-essential`. Acceptable only as the current fallback, not the distribution story. |
| **Bundle just `ld.lld` + a musl sysroot (no full zig)** | Smaller than a whole zig, but we'd hand-manage crt object selection, dynamic-linker paths, and import libs — i.e. re-derive what `zig cc` already encapsulates. Bundle-size saving doesn't justify the fragility. |
Vendoring `zig` wins on effort-to-result because sx already builds with Zig:
it's a first-party dependency, not a foreign toolchain, and it unlocks
Windows/macOS targets later for nearly free.
---
## 8. Phasing
Detail in [../current/PLAN-DIST.md](../current/PLAN-DIST.md). Summary:
0. **Resolve zig**`discoverZig()` + `SX_DEBUG_ZIG`; PATH fallback only.
1. **Link backend** — generalize the linker to a driver argv; emit
`zig cc -target … -static`; align the emit triple.
2. **Auto activation** — wire the §5.5 truth table; `cc` fallback intact.
3. **Packaging**`build.zig` `dist` step assembling the §6 tree.
4. **Verify & lock**`file`/`ldd` shows "statically linked"; host/arch-gated
corpus test honoring the snapshot-integrity + FFI-cadence rules.
The minimum end-to-end proof is Phases 0+1 against PATH zig.
---
## 9. Open decisions
**Locked:**
- Default Linux ABI = **static musl** (portable output).
- Activation = **auto** when a usable zig is found and no `--linker`.
- Dev uses **PATH zig**; vendoring deferred to Phase 3.
**Still open:**
- Exact spelling of the force flags (`--self-contained` vs e.g.
`--bundled-linker`); name chosen here pending review.
- Whether auto-mode should *warn* on silent `cc` fallback or stay quiet
(leaning quiet, with `SX_DEBUG_ZIG` for diagnosis).
- Whether to gate the Phase-4 corpus test behind a `.build` `target`
sidecar or keep it manual until a Linux CI runner exists.
---
## 10. Risks
- **Bundle size** ≈ 5060 MB (zig + `lib/`). Acceptable for a toolchain;
call it out in release notes.
- **zig CLI drift** across versions — pin hard, record in the manifest;
the most likely future breakage.
- **gnu vs musl ABI** for the emitted object — covered by the emit-triple
alignment (§5.6); TLS/stack-protector are the only realistic friction.
- **Operator confusion**: default-no-target (musl) diverging from the
`linux` shorthand (gnu). Mitigated by the new `linux-musl` shorthand and
explicit documentation (§5.4).
---
## 11. Out of scope / follow-ups
- **Windows / macOS targets** via the same `zig cc -target`: nearly free
after the Linux path, but Apple-SDK and Windows specifics need their own
verification — not documented as supported until tested.
- **`src/c_import.zig`** still shells system `cc` for C imports in JIT mode;
route through the backend later.
- **In-process lld** (alternative in §7) as the eventual zero-foreign-binary
endgame.
---
## Appendix — quick recipes (once implemented)
```sh
# Portable static Linux binary (default when a bundled zig is present):
sx build app.sx -o app
file app # → "ELF 64-bit … statically linked"
# Force the backend; fail loudly if no zig is bundled:
sx build app.sx --self-contained
# Use a specific zig:
SX_ZIG=/opt/zig-0.16.0/zig sx build app.sx --self-contained
# Opt out, use the system toolchain:
sx build app.sx --no-self-contained
# Dynamic glibc instead of static musl:
sx build app.sx --target linux
# Debug discovery + the exact link invocation:
SX_DEBUG_ZIG=1 SX_DEBUG_LINK=1 sx build app.sx
```