// issue-0025: Composite types larger than 16 bytes are passed without the
// LLVM `byval()` attribute, and the `call_indirect` (fn-pointer cast)
// path doesn't apply C-ABI parameter coercion at all. Both gaps cause
// silent shape-mismatch when sx code calls foreign C functions that take
// large aggregates by value, OR when sx code calls an sx fn through a
// fn-pointer typed with a large-aggregate parameter.
//
// ── Two failing forms ─────────────────────────────────────────────────────
//
// (A) Direct call to an sx function with a >16B param:
//
//       Wide :: struct { a: s64; b: s64; c: s64; d: s64; }   // 32 bytes
//       accept :: (w: Wide) -> s64 { w.a + w.b + w.c + w.d; }
//       accept(Wide.{ a = 1, b = 10, c = 100, d = 1000 })    // expect 1111
//
//     src/ir/emit_llvm.zig:2747-2795 (`abiCoerceParamType`):
//       - <=8 bytes   → coerced to i64
//       - 9-16 bytes  → coerced to [2 x i64]
//       - >16 bytes   → returns llvm_ty unchanged with TODO at line 2793
//
//     The TODO is the bug: large composites should be coerced to `ptr`
//     with a `byval(struct.T)` LLVM attribute. LLVM's backend then
//     materializes the right machine code per target. Today the struct
//     is left as-is, which LLVM tries to pass across registers + stack
//     slots in ways that don't match the C ABI promise.
//
// (B) Indirect call via fn-pointer cast (the `xx objc_msgSend` idiom):
//
//       fn_ptr : (Wide) -> s64 = xx accept;
//       fn_ptr(Wide.{ ... })
//
//     src/ir/emit_llvm.zig:902-967 (`.call_indirect`): both the
//     FunctionInfo-known arm (939-952) and the LLVMTypeOf-fallback arm
//     (953-956) construct `param_tys[j]` WITHOUT routing through
//     `abiCoerceParamType`. So even if (A) is fixed, fn-pointer-cast call
//     sites still mis-marshal large composites.
//
// ── Real-world impact ──────────────────────────────────────────────────────
//
// Every `xx objc_msgSend` call site in library/modules/platform/uikit.sx
// + library/modules/gpu/metal.sx. Works in practice today only because:
//   - We never pass aggregates >16 bytes by value through fn-pointer casts
//     (workaround: declare param as `*T` + pass `@local`; arm64 AAPCS's
//     indirect-by-ref happens to match this machine-state-wise).
//   - HFAs (CGSize 2×f64, MTLClearColor 4×f64, CGRect 4×f64 as return)
//     are correctly classified at emit_llvm.zig:2766-2779.
//
// ── Workarounds in use ─────────────────────────────────────────────────────
//
// library/modules/gpu/metal.sx declares MTLRegion (48B) + MTLScissorRect
// (32B) call sites with `*MTLRegion` / `*MTLScissorRect` and passes
// `@region` / `@rect`. Should not be needed once this issue is fixed.
//
// ── Fix sketch ─────────────────────────────────────────────────────────────
//
// (A) emit_llvm.zig:2793: return `ptr` and emit `byval(struct.T)` on
//     the param via `LLVMAddCallSiteAttribute` / `LLVMCreateTypeAttribute`.
//     At call sites, alloca + memcpy + pass the alloca pointer. Apply
//     identically at function-definition emission so direct calls roundtrip.
//
// (B) emit_llvm.zig:902-967: factor out a helper
//       `coerceCallParams(param_count, src_args, dst_fn_param_tys)
//          -> (coerced_args, coerced_tys)` that wraps `abiCoerceParamType`.
//     Use the helper from both arms.
//
// Edge cases to preserve:
//   - Variadic foreign functions (printf family): the variadic tail per
//     AAPCS64 still passes composites in their natural form. Keep
//     existing behavior for variadic args.
//   - HFAs already handled at 2766-2779: don't touch.
//   - Structs <=8 bytes coerced to `i64`, 9-16 bytes to `[2 x i64]`:
//     don't touch.

#import "modules/std.sx";

Wide :: struct { a: s64; b: s64; c: s64; d: s64; }

accept :: (w: Wide) -> s64 { w.a + w.b + w.c + w.d; }

main :: () -> s32 {
    w := Wide.{ a = 1, b = 10, c = 100, d = 1000 };

    direct := accept(w);                      // exercises path (A)
    if direct != 1111 { return 1; }

    fn_ptr : (Wide) -> s64 = xx accept;
    indirect := fn_ptr(w);                    // exercises path (B)
    if indirect != 1111 { return 2; }

    0;
}