Step 3b code is wired across UIRenderer + GlyphCache + UIPipeline +
chess game (gpu_mode = .metal on iOS, MetalGPU bound via the GPU
protocol). macOS GL chess, iOS-sim GLES chess, and iOS-sim Metal
triangle (63-metal-clear.sx) all still render.

iOS-sim Metal chess crashes inside replaceRegion uploading the 1MB
font atlas. Bisecting that crash exposed several sx-language issues
where mid-bisect tracers (NSLog inside if/else branch bodies) didn't
produce output, blocking further investigation.

Filing each finding as examples/issue-NNNN.sx rather than working
around piecemeal:

Bugs:
- 0024 NSLog/foreign-call inside if/else body not producing output
- 0025 C-ABI param coercion incomplete for composites >16B
  (combined direct-call abiCoerceParamType TODO + call_indirect
  path that doesn't apply C-ABI coercion at all)
- 0026 replaceRegion 1MB upload crash (likely downstream of 0025)

Features needed for step 4 + cleanup:
- 0027 Obj-C block bridge (^{...}) for animateWithDuration:
- 0028 Optional protocol box (?GPU = null) replaces T = ---; has_T: bool
- 0029 destroy_texture/buffer/shader on GPU protocol
- 0030 extern cross-file globals

Library-side: renderer.sx + glyph_cache.sx + pipeline.sx gain a
`gpu: GPU = ---; has_gpu: bool` field pair + branches that route every
GL touchpoint through the protocol when has_gpu. glyph_cache.init
saves/restores those fields around its memset. pipeline.set_gpu()
propagates to renderer + font. Renderer's MSL shader source added as
UI_MSL_SRC using packed_float2/packed_float4 to keep the 12-float
interleaved vertex layout tight (48 bytes).

metal.sx: dual-phase init (init(null, 0, 0) for eager device+queue,
re-init with the layer once UIKit installs the SxMetalView).
setStorageMode:.shared on every texture descriptor to ensure CPU-
writable atlas pixels on Apple Silicon iOS-sim.

Regression suite: 68 passing, 0 failed. WASM chess build currently
broken under step 3b state (silent compiler crash); documented in
CHECKPOINT.md, likely fallout from one of the filed issues (probably
0028, the verbose protocol-box pattern). Step 3b resumes after
0024-0030 land.
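
For readers unfamiliar with the `gpu: GPU = ---; has_gpu: bool` field pair and the save/restore around memset mentioned in the library-side notes, here is a rough C analogue. It is only a sketch under assumptions: the `GPU` struct layout, `upload_texture`, `stub_upload`, and the `GlyphCache` fields are hypothetical stand-ins, not the actual sx declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the sx GPU protocol box: a context pointer plus
   one operation. The real protocol has more entry points. */
typedef struct {
    void *ctx;
    void (*upload_texture)(void *ctx, const unsigned char *pixels, size_t len);
} GPU;

/* Hypothetical cache state. Only gpu/has_gpu mirror the commit message. */
typedef struct {
    GPU  gpu;                 /* only meaningful when has_gpu is true */
    bool has_gpu;
    unsigned char atlas[64];  /* placeholder for the real atlas */
    size_t glyph_count;
} GlyphCache;

/* Zero the cache, but keep a backend that was bound before init ran. */
void glyph_cache_init(GlyphCache *c) {
    assert(c != NULL);
    GPU  saved_gpu = c->gpu;
    bool saved_has = c->has_gpu;
    memset(c, 0, sizeof *c);
    c->gpu = saved_gpu;
    c->has_gpu = saved_has;
}

/* Route the upload through the protocol when one is bound. Returns false
   when the caller should fall back to the GL path. */
bool glyph_cache_upload(GlyphCache *c) {
    assert(c != NULL);
    if (c->has_gpu) {
        c->gpu.upload_texture(c->gpu.ctx, c->atlas, sizeof c->atlas);
        return true;
    }
    return false;
}

/* Stub backend for experimentation: counts bytes handed to it. */
void stub_upload(void *ctx, const unsigned char *pixels, size_t len) {
    (void)pixels;
    *(size_t *)ctx += len;
}
```

The explicit flag is what issue 0028 proposes to replace with an optional protocol box, at which point the save/restore would cover a single field instead of a pair.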
// issue-0025: Composite types larger than 16 bytes are passed without the
// LLVM `byval(<ty>)` attribute, and the `call_indirect` (fn-pointer cast)
// path doesn't apply C-ABI parameter coercion at all. Both gaps cause
// silent shape-mismatch when sx code calls foreign C functions that take
// large aggregates by value, OR when sx code calls a sx fn through a
// fn-pointer typed with a large-aggregate parameter.
//
// ── Two failing forms ─────────────────────────────────────────────────────
//
// (A) Direct call to a sx function with a >16B param:
//
//     Wide :: struct { a: s64; b: s64; c: s64; d: s64; } // 32 bytes
//     accept :: (w: Wide) -> s64 { w.a + w.b + w.c + w.d; }
//     accept(Wide.{ a = 1, b = 10, c = 100, d = 1000 }) // expect 1111
//
//     src/ir/emit_llvm.zig:2747-2795 (`abiCoerceParamType`):
//       - <=8 bytes → coerced to i64
//       - 9-16 bytes → coerced to [2 x i64]
//       - >16 bytes → returns llvm_ty unchanged with TODO at line 2793
//
//     The TODO is the bug: large composites should be coerced to `ptr`
//     with a `byval(struct.T)` LLVM attribute. LLVM's mid-end then
//     materializes the right machine code per target. Today the struct
//     is left as-is, which LLVM tries to pass across registers + stack
//     slots in ways that don't match the C ABI promise.
//
// (B) Indirect call via fn-pointer cast (the `xx objc_msgSend` idiom):
//
//     fn_ptr : (Wide) -> s64 = xx accept;
//     fn_ptr(Wide.{ ... })
//
//     src/ir/emit_llvm.zig:902-967 (`.call_indirect`): both the
//     FunctionInfo-known arm (939-952) and the LLVMTypeOf-fallback arm
//     (953-956) construct `param_tys[j]` WITHOUT routing through
//     `abiCoerceParamType`. So even if (A) is fixed, fn-pointer-cast call
//     sites still mis-marshal large composites.
//
// ── Real-world impact ──────────────────────────────────────────────────────
//
// Every `xx objc_msgSend` call site in library/modules/platform/uikit.sx
// + library/modules/gpu/metal.sx. Works in practice today only because:
//   - We never pass aggregates >16 bytes by value through fn-pointer casts
//     (workaround: declare param as `*T` + pass `@local`; arm64 AAPCS's
//     indirect-by-ref happens to match this machine-state-wise).
//   - HFAs (CGSize 2×f64, MTLClearColor 4×f64, CGRect 4×f64 as return)
//     are correctly classified at emit_llvm.zig:2766-2779.
//
// ── Workarounds in use ─────────────────────────────────────────────────────
//
// library/modules/gpu/metal.sx declares MTLRegion (48B) + MTLScissorRect
// (32B) call sites with `*MTLRegion` / `*MTLScissorRect` and passes
// `@region` / `@rect`. Should not be needed once this issue is fixed.
//
// ── Fix sketch ─────────────────────────────────────────────────────────────
//
// (A) emit_llvm.zig:2793: return `ptr` and emit `byval(struct.T)` on
//     the param via `LLVMAddCallSiteAttribute` / `LLVMCreateTypeAttribute`.
//     At call sites, alloca + memcpy + pass the alloca pointer. Apply
//     identically at function-definition emission so direct calls roundtrip.
//
// (B) emit_llvm.zig:902-967: factor out a helper
//     `coerceCallParams(param_count, src_args, dst_fn_param_tys)
//     -> (coerced_args, coerced_tys)` that wraps `abiCoerceParamType`.
//     Use the helper from both arms.
//
// Edge cases to preserve:
//   - Variadic foreign functions (printf family): variadic tail per
//     AAPCS64 still passes composites in their natural form. Keep
//     existing behavior for variadic args.
//   - HFAs already handled at 2766-2779; don't touch.
//   - Structs <=8 bytes coerced to `i64`, 9-16 bytes to `[2 x i64]`;
//     don't touch.

#import "modules/std.sx";

Wide :: struct {
    a: s64; b: s64; c: s64; d: s64;
}

accept :: (w: Wide) -> s64 {
    w.a + w.b + w.c + w.d;
}

main :: () -> s32 {
    w := Wide.{ a = 1, b = 10, c = 100, d = 1000 };
    direct := accept(w); // exercises path (A)
    if direct != 1111 { return 1; }

    fn_ptr : (Wide) -> s64 = xx accept;
    indirect := fn_ptr(w); // exercises path (B)
    if indirect != 1111 { return 2; }

    0;
}
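
As a cross-check for the repro above, the following C sketch shows the size-based classification the issue describes, with the proposed `ptr` + `byval` outcome for the >16 byte case, next to a C version of the same direct and fn-pointer calls. It is a sketch under assumptions, not the compiler's code: `ParamClass`, `classify_param`, `accept_wide`, `call_direct`, and `call_indirect` are hypothetical names, and checking the HFA case before the size buckets is an assumption about how `abiCoerceParamType` orders its tests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the outcomes listed in the issue text. */
typedef enum {
    PASS_I64,        /* <= 8 bytes: coerced to i64 */
    PASS_I64_PAIR,   /* 9-16 bytes: coerced to [2 x i64] */
    PASS_HFA,        /* homogeneous float aggregate: existing path, untouched */
    PASS_BYVAL_PTR   /* > 16 bytes: proposed ptr + byval(T) */
} ParamClass;

/* Size-based classification with the proposed fix applied.
   Assumption: HFAs are recognized before the size buckets. */
ParamClass classify_param(size_t size_bytes, int is_hfa) {
    if (is_hfa) return PASS_HFA;
    if (size_bytes <= 8) return PASS_I64;
    if (size_bytes <= 16) return PASS_I64_PAIR;
    return PASS_BYVAL_PTR;
}

/* C counterpart of the sx `Wide` struct: four 64-bit integers. */
typedef struct { int64_t a, b, c, d; } Wide;
static_assert(sizeof(Wide) == 32, "Wide is expected to be 32 bytes");

int64_t accept_wide(Wide w) { return w.a + w.b + w.c + w.d; }

/* Path (A): direct by-value call. */
int64_t call_direct(Wide w) { return accept_wide(w); }

/* Path (B): the same call through a function pointer. */
int64_t call_indirect(Wide w) {
    int64_t (*fn_ptr)(Wide) = accept_wide;
    return fn_ptr(w);
}
```

A conforming C compiler yields 1111 from both calls for `{1, 10, 100, 1000}`, which is the behavior the sx repro expects once both paths route through the same coercion.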