sx/examples/issue-0025.sx
agra a1647eab9b metal: pause step 3b pending sx-side fixes (filed 0024-0030)
Step 3b code is wired across UIRenderer + GlyphCache + UIPipeline +
chess game (gpu_mode = .metal on iOS, MetalGPU bound via the GPU
protocol). macOS GL chess, iOS-sim GLES chess, and iOS-sim Metal
triangle (63-metal-clear.sx) all still render.

iOS-sim Metal chess crashes inside replaceRegion uploading the 1MB
font atlas. Bisecting that crash exposed several sx-language issues
where mid-bisect tracers (NSLog inside if/else branch bodies) didn't
produce output, blocking further investigation.

Filing each finding as examples/issue-NNNN.sx rather than working
around piecemeal:

Bugs:
- 0024 NSLog/foreign-call inside if/else body not producing output
- 0025 C-ABI param coercion incomplete for composites >16B
       (combined direct-call abiCoerceParamType TODO + call_indirect
        path that doesn't apply C-ABI coercion at all)
- 0026 replaceRegion 1MB upload crash (likely downstream of 0025)

Features needed for step 4 + cleanup:
- 0027 Obj-C block bridge (^{...}) for animateWithDuration:
- 0028 Optional protocol box (?GPU = null) replaces T = ---; has_T: bool
- 0029 destroy_texture/buffer/shader on GPU protocol
- 0030 extern cross-file globals

Library-side: renderer.sx + glyph_cache.sx + pipeline.sx gain a
`gpu: GPU = ---; has_gpu: bool` field pair + branches that route every
GL touchpoint through the protocol when has_gpu. glyph_cache.init
saves/restores those fields around its memset. pipeline.set_gpu()
propagates to renderer + font. Renderer's MSL shader source added as
UI_MSL_SRC using packed_float2/packed_float4 to keep the 12-float
interleaved vertex layout tight (48 bytes).

metal.sx: dual-phase init (init(null, 0, 0) for eager device+queue,
re-init with the layer once UIKit installs the SxMetalView).
setStorageMode:.shared on every texture descriptor to ensure CPU-
writable atlas pixels on Apple Silicon iOS-sim.

Regression suite: 68 passing, 0 failed. WASM chess build currently
broken under step 3b state (silent compiler crash); documented in
CHECKPOINT.md, likely fallout from one of the filed issues (probably
0028 — the verbose protocol-box pattern). Step 3b resumes after
0024-0030 land.
2026-05-17 21:17:17 +03:00

95 lines
4.5 KiB
Plaintext
// issue-0025: Composite types larger than 16 bytes are passed without the
// LLVM `byval(<ty>)` attribute, and the `call_indirect` (fn-pointer cast)
// path doesn't apply C-ABI parameter coercion at all. Both gaps cause
// silent shape-mismatch when sx code calls foreign C functions that take
// large aggregates by value, OR when sx code calls a sx fn through a
// fn-pointer typed with a large-aggregate parameter.
//
// ── Two failing forms ─────────────────────────────────────────────────────
//
// (A) Direct call to a sx function with a >16B param:
//
//       Wide :: struct { a: s64; b: s64; c: s64; d: s64; }   // 32 bytes
//       accept :: (w: Wide) -> s64 { w.a + w.b + w.c + w.d; }
//       accept(Wide.{ a = 1, b = 10, c = 100, d = 1000 })    // expect 1111
//
//     src/ir/emit_llvm.zig:2747-2795 (`abiCoerceParamType`):
//       - <=8 bytes  → coerced to i64
//       - 9-16 bytes → coerced to [2 x i64]
//       - >16 bytes  → returns llvm_ty unchanged with TODO at line 2793
//
//     The TODO is the bug: large composites should be coerced to `ptr`
//     with a `byval(struct.T)` LLVM attribute on targets whose C ABI
//     passes them on the stack (x86-64 SysV); arm64 AAPCS64 instead
//     takes a plain `ptr` to a caller-made copy. LLVM's backend then
//     materializes the right machine code per target. Today the struct
//     is left as-is, which LLVM tries to pass across registers + stack
//     slots in ways that don't match the C ABI promise.
//
// (B) Indirect call via fn-pointer cast (the `xx objc_msgSend` idiom):
//
//       fn_ptr : (Wide) -> s64 = xx accept;
//       fn_ptr(Wide.{ ... })
//
//     src/ir/emit_llvm.zig:902-967 (`.call_indirect`): both the
//     FunctionInfo-known arm (939-952) and the LLVMTypeOf-fallback arm
//     (953-956) construct `param_tys[j]` WITHOUT routing through
//     `abiCoerceParamType`. So even if (A) is fixed, fn-pointer-cast call
//     sites still mis-marshal large composites.
//
// ── Real-world impact ──────────────────────────────────────────────────────
//
// Every `xx objc_msgSend` call site in library/modules/platform/uikit.sx
// + library/modules/gpu/metal.sx. Works in practice today only because:
//   - We never pass aggregates >16 bytes by value through fn-pointer casts
//     (workaround: declare param as `*T` + pass `@local`; arm64 AAPCS's
//     indirect-by-ref happens to match this machine-state-wise).
//   - HFAs (CGSize 2×f64, MTLClearColor 4×f64, CGRect 4×f64 as return)
//     are correctly classified at emit_llvm.zig:2766-2779.
//
// ── Workarounds in use ─────────────────────────────────────────────────────
//
// library/modules/gpu/metal.sx declares MTLRegion (48B) + MTLScissorRect
// (32B) call sites with `*MTLRegion` / `*MTLScissorRect` and passes
// `@region` / `@rect`. Should not be needed once this issue is fixed.
//
// ── Fix sketch ─────────────────────────────────────────────────────────────
//
// (A) emit_llvm.zig:2793: return `ptr` and, where the target ABI uses
//     it, emit `byval(struct.T)` on the param. Build the attribute with
//     `LLVMCreateTypeAttribute`, then attach it with
//     `LLVMAddAttributeAtIndex` on the function definition and
//     `LLVMAddCallSiteAttribute` on each call, so direct calls roundtrip.
//     At call sites, alloca + memcpy + pass the alloca pointer.
//
// (B) emit_llvm.zig:902-967: factor out a helper
//       `coerceCallParams(param_count, src_args, dst_fn_param_tys)
//           -> (coerced_args, coerced_tys)`
//     that wraps `abiCoerceParamType`. Use the helper from both arms.
//
// Edge cases to preserve:
//   - Variadic foreign functions (printf family): the variadic tail per
//     AAPCS64 still passes composites in their natural form. Keep
//     existing behavior for variadic args.
//   - HFAs already handled at 2766-2779: don't touch.
//   - Structs <=8 bytes coerced to `i64`, 9-16 bytes to `[2 x i64]`:
//     don't touch.
#import "modules/std.sx";

Wide :: struct {
    a: s64; b: s64; c: s64; d: s64;
}

accept :: (w: Wide) -> s64 {
    w.a + w.b + w.c + w.d;
}

main :: () -> s32 {
    w := Wide.{ a = 1, b = 10, c = 100, d = 1000 };

    direct := accept(w);        // exercises path (A)
    if direct != 1111 { return 1; }

    fn_ptr : (Wide) -> s64 = xx accept;
    indirect := fn_ptr(w);      // exercises path (B)
    if indirect != 1111 { return 2; }

    0;
}
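
For comparison, here is a minimal C sketch of the same two call forms. It is not part of the sx repository; it only mirrors the repro above so that a C compiler's handling of a by-value parameter larger than 16 bytes can be inspected on a given target (for example by compiling with `clang -S -emit-llvm` and reading the signature of `accept`). The helper name `check_wide_calls` is hypothetical, and its return codes are chosen to match the exit codes of the sx `main`.

```c
#include <assert.h>
#include <stdint.h>

/* C mirror of the sx `Wide` struct: four 64-bit fields, 32 bytes total. */
struct Wide {
    int64_t a, b, c, d;
};

/* The issue only concerns aggregates larger than 16 bytes. */
static_assert(sizeof(struct Wide) == 32, "Wide must be 32 bytes");

/* Takes the aggregate by value, like the sx `accept`. */
int64_t accept(struct Wide w) {
    return w.a + w.b + w.c + w.d;
}

/* Hypothetical driver: 0 on success, 1 if the direct call returns the
 * wrong sum, 2 if the call through a function pointer does. */
int check_wide_calls(void) {
    struct Wide w = { 1, 10, 100, 1000 };

    /* Form (A): direct call with a by-value aggregate. */
    if (accept(w) != 1111) return 1;

    /* Form (B): the same call through a function pointer. */
    int64_t (*fn_ptr)(struct Wide) = accept;
    if (fn_ptr(w) != 1111) return 2;

    return 0;
}
```

Calling `check_wide_calls()` from a `main` and returning its result gives a C baseline whose exit code can be compared against the sx repro on the same target.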