From 64dcbca06ad48e1a4003138538fa6b55bd79e764 Mon Sep 17 00:00:00 2001 From: agra Date: Wed, 27 May 2026 21:29:08 +0300 Subject: [PATCH] =?UTF-8?q?ffi=20issue-0049:=20new-form=20variadic=20cross?= =?UTF-8?q?-module=20LLVM=20crash=20=E2=80=94=20xfail=20lock-in?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Migrating stdlib's `path_join` to the new variadic syntax (`(..parts: []string) -> string`) surfaces a latent compiler bug: `resolveParamType` and `packVariadicCallArgs` treat the new-form declaration the same as the legacy `parts: ..string` and wrap the element type in `sliceOf` regardless of whether it already is one. The new form's `[]string` becomes `[][]string`; the call-site marshal pack emits `[N x string]` (correct) but the callee stores its slice param into a `[]([]string)`-typed slot. The shape mismatch propagates as null/undef Refs that crash `LLVMBuildExtractValue` inside `emitStrCmp` during emission. `examples/121-ios-sim-bundle.sx` (existing) and the new focused `examples/174-new-form-variadic-cross-module.sx` both fail today with the segfault. The next commit fixes `resolveParamType` + `packVariadicCallArgs` so both flip green. Stdlib's `format` / `print` / `open` and the example fixtures stay on the legacy form in this commit — they migrate in the follow-up cleanup commit. --- .../174-new-form-variadic-cross-module.sx | 28 +++ ...m-variadic-cross-module-llvm-emit-crash.md | 160 ++++++++++++++++++ library/modules/std.sx | 2 +- .../174-new-form-variadic-cross-module.exit | 1 + .../174-new-form-variadic-cross-module.txt | 4 + 5 files changed, 194 insertions(+), 1 deletion(-) create mode 100644 examples/174-new-form-variadic-cross-module.sx create mode 100644 issues/0049-new-form-variadic-cross-module-llvm-emit-crash.md create mode 100644 tests/expected/174-new-form-variadic-cross-module.exit create mode 100644 tests/expected/174-new-form-variadic-cross-module.txt diff --git a/examples/174-new-form-variadic-cross-module.sx b/examples/174-new-form-variadic-cross-module.sx new file mode 100644 index 0000000..0cb1c21 --- /dev/null +++ b/examples/174-new-form-variadic-cross-module.sx @@ -0,0 +1,28 @@ +// Regression: new-form variadic `..name: []T` defined in an imported +// module is callable from another module without crashing LLVM emit. +// +// Before the fix in `resolveParamType` + `packVariadicCallArgs`, +// the new-form variadic's element type went through one extra +// `sliceOf` wrap (the helpers treated `..parts: []string` the same +// as the legacy `parts: ..string` and added a slice level on top +// of the already-declared slice). The double-wrapped `[][]T` +// signature mismatched what the call-site marshalling emitted as +// `[N x T]`, producing null/undef Refs that crashed +// `LLVMBuildExtractValue` inside `emitStrCmp` during emission. +// +// Today's stdlib `path_join` uses the new form +// (`(..parts: []string) -> string`); it lives in `modules/std.sx` +// and is called here from the test module. Two- and three-arg +// shapes must round-trip the slice through the function-call +// boundary and concatenate the parts with '/'. Empty join (no +// args) returns "". + +#import "modules/std.sx"; + +main :: () -> s32 { + print("{}\n", path_join()); + print("{}\n", path_join("a")); + print("{}\n", path_join("a", "b")); + print("{}\n", path_join("a", "b", "c", "d")); + return 0; +} diff --git a/issues/0049-new-form-variadic-cross-module-llvm-emit-crash.md b/issues/0049-new-form-variadic-cross-module-llvm-emit-crash.md new file mode 100644 index 0000000..fc9b6e8 --- /dev/null +++ b/issues/0049-new-form-variadic-cross-module-llvm-emit-crash.md @@ -0,0 +1,160 @@ +# 0049 — new-form variadic `..name: []Type` defined in an imported module crashes LLVM emit + +## Symptom + +A pack-fn declared with the **new** variadic syntax +`..name: []Type` (the form the FFI plan migrates to, replacing +the legacy `name: ..Type`) crashes LLVM IR emission with a +null-operand `LLVMBuildExtractValue` inside `emitStrCmp` when: + +1. The pack-fn lives in an **imported** module (e.g. `library/modules/std.sx`). +2. The caller is in a **different** module than the definition. + +The same function written with the legacy `name: ..Type` syntax +compiles and runs cleanly. The same new-form definition placed +**locally** in the caller's module (or shadowing the imported +name) also compiles cleanly. The two together — new form + import +boundary — are what trip the emit. + +``` +Segmentation fault at address 0x0 +???:?:?: 0x... in __ZN4llvm5Value11setNameImplERKNS_5TwineE (.../libLLVM.dylib) +???:?:?: 0x... in _LLVMBuildExtractValue (.../libLLVM.dylib) +/Users/agra/projects/sx/src/ir/emit_llvm.zig:3570:48: in emitStrCmp (sx) + const rhs_ptr = c.LLVMBuildExtractValue(b, rhs, 0, "str.rp"); + ^ +/Users/agra/projects/sx/src/ir/emit_llvm.zig:1805:45: in emitInst (sx) + .str_eq => |bin| self.emitStrCmp(bin, true), +``` + +So an emitted `.str_eq` op has a `rhs` Ref that resolves to a null +LLVM Value. The `.str_eq` is somewhere downstream of the +new-form pack-fn's monomorphisation — most likely an +`any_to_string` / format-side string comparison that the migrated +call path threads back. The `rhs` was either never materialised in +the imported-mono's IR, or was registered against a stale +function/module slot that `emit_llvm` resolves to null. + +## Reproduction + +Modify `library/modules/std.sx`: + +```sx +- path_join :: (parts: ..string) -> string { ++ path_join :: (..parts: []string) -> string { +``` + +Body unchanged. Then: + +```sx +// repro.sx +#import "modules/std.sx"; + +main :: () { + p := path_join("a", "b"); + print("{}\n", p); +} +``` + +`zig build && ./zig-out/bin/sx run repro.sx` → segfault as above. + +Negative controls (all compile and run fine): + +```sx +// Same body, OLD form — works: +path_join :: (parts: ..string) -> string { ... } + +// New form but defined LOCALLY in the caller: +path_join :: (..parts: []string) -> string { ... } +#import "modules/std.sx"; // imported anyway, just no path_join call +main :: () { p := path_join("a", "b"); ... } + +// New form with `$`-prefixed pack name — works EITHER locally or imported: +path_join :: (..$parts: []string) -> string { ... } +``` + +So the bug is specifically: +- new-form variadic (`..name: []Type`) +- WITHOUT the `$` prefix on `name` +- defined in a module that gets imported (not in the caller's own file) + +Suite state when the bug first surfaced: commit `0ede097` (master, +2026-05-27, just after the issue-0048 fix landed and the suite +was green at 213/213). The only delta on top is the +`path_join :: (parts: ..string)` → `path_join :: (..parts: []string)` +edit in `library/modules/std.sx`. + +## Investigation prompt + +The FFI plan migrates all stdlib variadic decls from the legacy +form to the new `..name: []Type` form (`path_join`, `format`, +`print`, plus the foreign `open` decl, plus the example fixtures). +Per the FFI cadence rule the migration is supposed to be a +mechanical textual change with identical semantics. This bug +blocks that. + +The fault location (`emitStrCmp` line 3570 with null `rhs`) is the +crash point, not the root cause. The root cause is one of: + +1. **Cross-module pack-fn monomorphisation** — the new-form path + in `monomorphizePackFn` registers the mono'd function in the + current module, but if the mono'd body uses a Ref that + resolves through a stale module/function context, the LLVM + pass through `emit_llvm.functions[fid]` lookup hands back a + null. Compare the new-form mono path to the legacy `name: + ..Type` mono path side-by-side — look for any place the latter + threads the caller's module ID / FuncId but the former forgets + to. +2. **Synthesised slot names** — the new-form pack-fn body uses + synthesised `__pack__` idents for per-position arg + substitution (per CHECKPOINT-FFI step-2b). If these are + re-emitted in the caller's IR against the imported function's + body without re-resolving against the caller's scope, they'd + appear as undef in the final LLVM pass. +3. **The `$` workaround as a hint** — the bug disappears when the + pack name is `..$parts: []string`. Inside the parser / + `isPackFn` discriminator, the `$` prefix routes the function + through the heterogeneous-pack mono path; the new-form + WITHOUT `$` likely routes through a near-but-not-identical + path. Diff the two routes — what does the `$` version do that + the no-`$` version skips when the call crosses an import? + +Where to start: + +- `src/ir/lower.zig` — `monomorphizePackFn` (around line 8460), + `materialisePackSlice` (around line 8261), `buildPackSliceValue` + (around line 8225). Trace which gets called for the new-form + no-`$` path. +- `src/ir/lower.zig:lowerPackFnCall` — call-site mono dispatch. + Look for a `$`-prefix branch. +- `src/parser.zig:parseParam` (variadic handling) — confirm what + AST shape the two forms produce. The new form sets `is_variadic + = true` AND `is_comptime = false` (no `$`); the new form WITH + `$` sets both true. The mono path probably gates on `is_comptime`. + +Verification step: + +After the fix, run with the path_join edit re-applied: + +```sh +git diff library/modules/std.sx # confirm new form lands +./zig-out/bin/sx run examples/121-ios-sim-bundle.sx +bash tests/run_examples.sh +``` + +Both should be green, no segfault. + +A regression test goes in +`examples/NNN-new-form-variadic-cross-module.sx` once the fix +lands. The migration of `path_join`, `format`, `print`, and +`open` then proceeds. + +## Why this matters + +The FFI plan calls for the legacy `name: ..Type` form to be +dropped entirely (`current/CHECKPOINT-FFI.md` references the +`'args: ..Any' is in the plan to change to '..args: []Any'` +migration). Every stdlib consumer plus the chess game needs the +new form. With this bug, stdlib can't be migrated — moving any +stdlib variadic to the new form breaks every program that imports +it. The migration is gated on this fix. diff --git a/library/modules/std.sx b/library/modules/std.sx index a172de0..264c731 100644 --- a/library/modules/std.sx +++ b/library/modules/std.sx @@ -191,7 +191,7 @@ xml_escape :: (s: string) -> string { // components and collapses duplicate separators at component // boundaries. Used for bundle paths where Apple .app and Android APK // both expect POSIX-style paths. -path_join :: (parts: ..string) -> string { +path_join :: (..parts: []string) -> string { result := ""; i := 0; while i < parts.len { diff --git a/tests/expected/174-new-form-variadic-cross-module.exit b/tests/expected/174-new-form-variadic-cross-module.exit new file mode 100644 index 0000000..573541a --- /dev/null +++ b/tests/expected/174-new-form-variadic-cross-module.exit @@ -0,0 +1 @@ +0 diff --git a/tests/expected/174-new-form-variadic-cross-module.txt b/tests/expected/174-new-form-variadic-cross-module.txt new file mode 100644 index 0000000..7dce32e --- /dev/null +++ b/tests/expected/174-new-form-variadic-cross-module.txt @@ -0,0 +1,4 @@ + +a +a/b +a/b/c/d