Adds `examples/ffi-jni-call-04-jint-return.sx` exercising
`#jni_call(s32)(env, target, "getCount", "()I")` inside a runtime-
reachable but never-invoked helper (`g_should_call` stays false, so
the dereferences don't fire). Today the emit_llvm switch falls
through to `LLVMGetUndef` for any non-void return — the IR snapshot
captures that placeholder.
The next commit adds the `.s32 => 49` (CallIntMethod) arm. The
snapshot will update to show the full GetObjectClass → GetMethodID →
CallIntMethod sequence (going through the slot interning landed in
1.17 with its own slot pair, since `("getCount", "()I")` is a fresh
literal pair).
Two `#jni_call` sites with the same string-literal `(name, sig)` pair
now share a single `jclass` GlobalRef slot and a single `jmethodID`
slot, populated lazily on the first call to any matching site.
Non-literal sites keep the per-call `GetObjectClass` + `GetMethodID`
sequence from step 1.15.
Per-call-site lowering for literal sites:
    %cached_mid = load ptr, @SX_JNI_MID_<key>
    %is_cached = icmp ne ptr %cached_mid, null
    br i1 %is_cached, cont, miss
  miss:
    %local_cls = GetObjectClass(env, target)
    %global_cls = NewGlobalRef(env, local_cls)   ; vtable slot 21
    store ptr %global_cls, @SX_JNI_CLS_<key>
    %fresh_mid = GetMethodID(env, global_cls, name, sig)
    store ptr %fresh_mid, @SX_JNI_MID_<key>
    br cont
  cont:
    %mid = phi ptr [%cached_mid, before], [%fresh_mid, miss]
    call <Type>Method(env, target, %mid, args...)
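The branch structure above can be modeled in a few lines of Python. This is only a simulation of the control flow, not the emitted code: `FakeEnv`, `call_site`, and the `slots` dictionary are hypothetical stand-ins for the JNI env and the `@SX_JNI_*` globals.

```python
class FakeEnv:
    """Hypothetical stand-in for a JNIEnv that counts the expensive lookup."""

    def __init__(self):
        self.get_method_id_calls = 0

    def get_object_class(self, target):
        return ("class-of", target)

    def new_global_ref(self, local_ref):
        return ("global", local_ref)

    def get_method_id(self, cls, name, sig):
        self.get_method_id_calls += 1
        return ("mid", name, sig)


# One slot pair per literal (name, sig); None plays the role of a null pointer.
slots = {"cls": None, "mid": None}


def call_site(env, target, name, sig):
    mid = slots["mid"]                  # %cached_mid = load
    if mid is None:                     # the "miss" block
        cls = env.new_global_ref(env.get_object_class(target))
        slots["cls"] = cls
        mid = env.get_method_id(cls, name, sig)
        slots["mid"] = mid
    return mid                          # value of the phi in "cont"


env = FakeEnv()
first = call_site(env, "obj_a", "noop", "()V")
second = call_site(env, "obj_b", "noop", "()V")
```

In this model the second site takes the cached path, so `GetMethodID` runs once no matter how many matching sites execute.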
Wiring:
- `JniMsgSend.cache_key: ?CacheKey` (new) carries `(name_str,
sig_str)` when both `name` and `sig` are string-literal AST nodes;
null for non-literal call sites.
- `lower.zig` populates `cache_key` from the AST.
- `emit_llvm.zig` `getOrCreateJniSlots(name, sig)` returns the
`{cls_slot, mid_slot}` pair, creating and caching them on first
lookup. Key is `name\x00sig` so the separator can't collide with
any JNI identifier byte.
- `mangleJniKey` builds an LLVM-identifier suffix from the pair, used
in the `@SX_JNI_{CLS,MID}_<suffix>` global names.
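A rough sketch of the slot keying described in the wiring list, under stated assumptions: `SlotTable` and `make_key` are hypothetical names, and the slot values are placeholder strings rather than LLVM globals. It shows why joining with a NUL byte keeps distinct pairs distinct where plain concatenation would not.

```python
def make_key(name: str, sig: str) -> str:
    # The commit notes that the NUL separator cannot collide with any byte
    # of a JNI identifier, so the joined key is unambiguous.
    return name + "\x00" + sig


class SlotTable:
    """Hypothetical model of getOrCreateJniSlots: one slot pair per key."""

    def __init__(self):
        self._slots = {}

    def get_or_create(self, name: str, sig: str):
        key = make_key(name, sig)
        if key not in self._slots:
            index = len(self._slots)
            # Placeholder names; the real code creates two LLVM globals.
            self._slots[key] = (f"cls_slot_{index}", f"mid_slot_{index}")
        return self._slots[key]


table = SlotTable()
a = table.get_or_create("noop", "()V")
b = table.get_or_create("noop", "()V")
c = table.get_or_create("getCount", "()I")
```

Here `a` and `b` are the same pair and `c` is a fresh one. Without the separator, `("ab", "c")` and `("a", "bc")` would both produce the key `abc`.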
IR snapshot at `tests/expected/ffi-jni-call-03-methodid-sharing.ir`
updated: two call sites against literal `("noop", "()V")` now share
`@SX_JNI_CLS_noop____V` and `@SX_JNI_MID_noop____V`. Pre-1.17 snapshot
had two independent `GetMethodID` calls; post-1.17 has one global
slot pair plus per-call lazy-init branches.
Note: an unrelated regression in `examples/ffi-objc-call-12-rect-u64-returns.sx`
exists in the working tree (parse error from an in-progress C-import
block) and is left untouched.
Adds `examples/ffi-jni-call-03-methodid-sharing.sx` with two
`#jni_call` sites against the same (class, method, sig). Today each
site emits its own `GetObjectClass` + `GetMethodID` + `Call<Type>Method`
sequence (6 vtable indirections total for the two-call test); 1.17
will collapse the two `GetMethodID` calls into a single cached
`jmethodID` static slot populated at module init, mirroring the
`OBJC_SELECTOR_REFERENCES_*` shape that 1.5 introduced for `#objc_call`.
Runtime is a no-op — `unused_jni` is reachable through a
runtime-readable `g_should_call` global that stays false, so the JNI
dereferences never execute. A plain `if false` would get
constant-folded, taking the function definition out of the IR
entirely; the global keeps both the function and its body present
for the IR-snapshot harness.
IR snapshot at `tests/expected/ffi-jni-call-03-methodid-sharing.ir`
locks the pre-caching shape. The next commit (1.17) updates it to the
collapsed shape.
113/113 host tests pass.
New `.jni_msg_send` IR opcode carrying `{env, target, name, sig,
args[], is_static}`. `lowerFfiIntrinsicCall` now dispatches on
`fic.kind`: `.objc_call` keeps the existing path; `.jni_call` and
`.jni_static_call` route through `lowerJniCall`, which emits the new
opcode.
emit_llvm.zig expands `.jni_msg_send` into the JNI vtable
indirection:
    %ifs              = load ptr, %env                  ; vtable
    %get_obj_class    = load ptr, gep(%ifs, i32 31)
    %cls              = call ptr %get_obj_class(%env, %target)
    %get_method_id    = load ptr, gep(%ifs, i32 33)
    %mid              = call ptr %get_method_id(%env, %cls, %name, %sig)
    %call_void_method = load ptr, gep(%ifs, i32 61)
    call void %call_void_method(%env, %target, %mid, args...)
Per step 1.15's scope: only `.jni_call` (instance) + `void` return
are wired through the switch. `.jni_static_call` (1.23) and the
non-void returns (1.18–1.22) drop to a placeholder `LLVMGetUndef` so
the build doesn't fault — the next-step commits flip those arms one
shape at a time. Method-ID caching is step 1.17.
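The indices used here (31, 33, 61, and 49 for the later `s32` arm) follow the layout of the JNI function table, where the plain `Call<Type>Method` entries start at 34 and sit three apart because each has `V` and `A` variants. A small sketch of that arithmetic follows; the helper names are hypothetical and are not the ones in emit_llvm.zig.

```python
# Indices into the JNI function table, per the JNI specification.
GET_OBJECT_CLASS = 31
GET_METHOD_ID = 33
CALL_OBJECT_METHOD = 34

# Order of the Call<Type>Method groups in the table.
RETURN_KINDS = ["object", "boolean", "byte", "char", "short",
                "int", "long", "float", "double", "void"]


def call_method_index(kind: str) -> int:
    """Table index of the plain Call<Kind>Method entry (instance calls)."""
    return CALL_OBJECT_METHOD + 3 * RETURN_KINDS.index(kind)


def byte_offset(index: int, pointer_size: int = 8) -> int:
    """Byte offset of a table entry, assuming 64-bit pointers by default."""
    return index * pointer_size
```

`call_method_index("void")` gives 61 and `call_method_index("int")` gives 49, matching the slots named in these commits.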
Two small helpers landed alongside:
- `loadJniFn(ifs, offset, name)` — GEP into the vtable + load.
- `extractSlicePtr(val)` — string literals lower as `{ptr, i64}`
slices in sx IR; JNI's `GetMethodID` expects raw C strings, so
this extracts field 0 when the source is a slice.
Android cross-compile now passes for `examples/ffi-jni-call-02-void.sx`
(2/2 cross targets green). Host run_examples still passes 112/112.
Chess iOS-sim + Android both compile clean.
Adds `examples/ffi-jni-call-02-void.sx` exercising `#jni_call(void)
(env, target, "name", "sig")` inside an `inline if OS == .android`
arm, plus a new tuple in `tests/cross_compile.sh`. Host run_examples
passes (the inline-if strips the JNI body, leaving "skipped"); the
Android cross-compile FAILs because `lowerFfiIntrinsicCall` still
emits the placeholder diagnostic for any `fic.kind != .objc_call`.
Per the FFI cadence rule this is a test-add (xfail); the next
commit makes the Android cross-compile green by adding the
`.jni_msg_send` opcode and its emit_llvm expansion.
Closes the runtime-verification gap from cluster 1.32. The migrated
`uikit_keyboard_will_change_frame` body uses both shapes but isn't
reached by chess startup (the soft keyboard doesn't open without user
input), so runtime verification was transitive only: `#objc_call(CGRect)`
via the structurally-identical `#objc_call(UIEdgeInsets)` (4×f64 HFA)
in ffi-objc-call-07, and `#objc_call(u64)` via the LLVM-equivalent
`#objc_call(s64)` `hash` test in ffi-objc-call-04.
This example installs two IMPs via `class_addMethod`:
- `rect_imp` returns a CGRect of {10.5, 20.5, 30.5, 40.5} through the
32-byte HFA path (v0..v3 on AAPCS64).
- `u64_imp` returns `0x7FEDCBA987654321` through the i64 path.
`#objc_call(CGRect)` and `#objc_call(u64)` dispatch through them and
the values are printed for snapshot lockdown.
Works around the parser quirk noted in the checkpoint and in 0.1:
integer literals ≥ 2^63 are rejected even when the receiving type is
u64, so the test value keeps the high bit clear.
111/111 host tests pass.
`collectCaptures` in `src/ir/lower.zig` is the closure free-variable
analyzer that decides which names from a closure body need to be
boxed into the env struct at lambda-build time. Its switch on AST
node kind enumerated every other shape (`.call`, `.if_expr`,
`.match_expr`, `.for_expr`, etc.) but had no arm for
`.ffi_intrinsic_call`, so the trailing `else => {}` quietly skipped
the walks over its `args[]` and `return_type`. Names referenced inside
`#objc_call(T)(recv, "sel:", ...)` from a closure body never made it
into the captures list, so when lowering bound the closure scope from
env, those names came back as "unresolved".
The fix adds the missing arm — walk `return_type` and every `args[i]`
the same way `.call` walks `callee` + `args`.
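The failure mode is easy to reproduce on a toy AST. The sketch below is a Python model, not the Zig code: the dict-based node shapes and the `visit_ffi` switch are hypothetical, with `visit_ffi=False` standing in for the pre-fix `else => {}` fall-through.

```python
def collect_captures(node, bound, out, visit_ffi=True):
    """Walk a tiny dict-based AST and record free identifiers in `out`."""
    kind = node["kind"]
    if kind == "ident":
        if node["name"] not in bound and node["name"] not in out:
            out.append(node["name"])
    elif kind == "call":
        collect_captures(node["callee"], bound, out, visit_ffi)
        for arg in node["args"]:
            collect_captures(arg, bound, out, visit_ffi)
    elif kind == "ffi_intrinsic_call" and visit_ffi:
        # The arm the fix adds: walk the return type and every argument.
        collect_captures(node["return_type"], bound, out, visit_ffi)
        for arg in node["args"]:
            collect_captures(arg, bound, out, visit_ffi)
    else:
        pass  # literals, type names, and (pre-fix) unhandled node kinds
    return out


# Models the closure body `#objc_call(s64)(recv, "hash")`.
body = {
    "kind": "ffi_intrinsic_call",
    "return_type": {"kind": "type", "name": "s64"},
    "args": [{"kind": "ident", "name": "recv"},
             {"kind": "string", "value": "hash"}],
}

before_fix = collect_captures(body, bound=set(), out=[], visit_ffi=False)
after_fix = collect_captures(body, bound=set(), out=[], visit_ffi=True)
```

With the arm missing, `before_fix` is empty and `recv` never reaches the env struct; with it, `after_fix` contains `recv`.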
Companion changes:
- `examples/issue-0038.sx` → `examples/103-ffi-closure-capture.sx`
(out of the open-issue namespace; comment header tightened to
describe the feature, not the historical bug).
- `examples/ffi-objc-call-09-in-construct.sx` drops the
`g_hasher_recv` module-global workaround that was added for this
bug — the closure now captures `recv` from `make_hasher`'s arg
list normally.
Uncomments the second passthrough case in `examples/issue-0038.sx`
that captures `recv` from the enclosing function into a closure body
that uses it inside `#objc_call(s64)(recv, "hash")`. Current behavior
is a hard error from the name-resolution pass:
examples/issue-0038.sx:28:48: error: unresolved: 'recv'
Snapshot locks the failure in (exit 1 + that error message) so the
next commit can flip it to passing without ambiguity. Per the FFI
cadence rule this is a test-add (xfail); the make-green follow-up
adds the missing recursion arm in `lower.zig`'s `collectCaptures` for
`.ffi_intrinsic_call` nodes.
Six remaining dispatch clusters migrated in one pass:
- `uikit_setup_renderbuffer`: `renderbufferStorage:fromDrawable:` (BOOL).
- `uikit_present_renderbuffer`: `presentRenderbuffer:` (BOOL, every frame).
- `uikit_gl_view_tick`: `targetTimestamp` and `duration` reads (f64,
every frame — three call sites total across the keyboard-anim path
and the frame-closure path).
- `uikit_compute_layer_pixel_size`: `bounds` (CGRect HFA).
- `uikit_touch_location`: `locationInView:` (CGPoint HFA — first
standalone `#objc_call(CGPoint)` exercise, structurally identical to
the 2×f64 NSPoint already verified by ffi-objc-call-05).
- `uikit_first_touch`: `anyObject` (*void).
Net -15 lines. uikit.sx is now 839 lines — Phase 1D started at 937,
so this is -98 cumulative across the migration. Zero `xx objc_msgSend`
typed casts left in the file.
iOS-sim chess regression smoke: launched chess, tapped a black pawn
through the Simulator window, watched the move (d7→d5) play, then a
second tap played d5→d4. The render loop, touch handlers, layout
math, and the BOOL-returning EAGL presentation calls are all on the
exercised path, so this is the strongest runtime verification any
Phase 1D commit has had so far.
22 `sel_registerName` calls remain in the file, all legitimate:
- `class_addMethod` IMP registrations (runtime class build-out).
- SEL-as-arg to dispatch selectors that take a SEL value
(`addObserver:selector:name:object:`,
`displayLinkWithTarget:selector:`). A future `#objc_selector("foo")`
literal would replace these, but it's not part of Phase 1.
The keyboard notification callback. First standalone exercises of
`#objc_call(CGRect)` (HFA — structurally equivalent to UIEdgeInsets,
already verified by 1.25 and ffi-objc-call-07) and `#objc_call(u64)`
(LLVM-equivalent to s64; ffi-objc-call-04 already locks in the i64
return path).
Migrates:
- `userInfo` (*void)
- `objectForKey:` with NSString arg (*void)
- `CGRectValue` (CGRect HFA)
- `doubleValue` (f64)
- `unsignedLongValue` (u64)
- `screen` (*void)
- `bounds` (CGRect HFA)
Net -14 lines. uikit.sx now 854 lines (-83 cumulative across Phase 1D).
iOS-sim chess regression smoke: launch is clean; the callback is
registered through cluster 1.30's notification-center wiring and the
function lowers without IR-verifier complaints. The callback body
itself isn't exercised at runtime by chess startup (the game doesn't
open the soft keyboard) — runtime verification of this specific
function is transitive via the other clusters that exercise the same
call shapes.
The biggest Phase 1D cluster: the iOS scene-lifecycle entry that runs
at every launch. UIWindow alloc/init, UIViewController alloc/init, GL
view alloc/init/install, root-view-controller wiring, layer access +
setOpaque:, EAGL drawable-properties dictionary build,
screen/nativeScale DPI scaling, makeKeyAndVisible, UITextField subview
install, CADisplayLink construct + addToRunLoop. Every return shape
this function uses (void, *void, f64) and every arg shape (BOOL via `xx
0`/`xx 1`, multi-arg selectors `displayLinkWithTarget:selector:` and
`setObject:forKey:`) is exercised by this single launch.
Net -44 lines on this commit (104 → 60). Also drops a stale
`EAGLContext := objc_getClass(...)` decl that wasn't referenced inside
this function — EAGL context creation lives in uikit_create_gl_context
(already migrated in 1.29). uikit.sx is now 868 lines (-69 cumulative
across Phase 1D).
iOS-sim chess regression smoke: app launches cleanly, board renders
with status-bar clearance, sharp DPI scaling, compositor working,
display-link tick driving frames. Every part of the migrated function
is on the launch path and all of it succeeds.
Closes the runtime-verification gap from cluster 1.28: chess startup
doesn't reach the keyboard `becomeFirstResponder` / `resignFirstResponder`
path, so `#objc_call(bool)` was only compile-verified. This example
installs two BOOL-returning IMPs via `class_addMethod` (type encoding
"B@:") and dispatches both through `#objc_call(bool)`. Also exercises
the nil-receiver guarantee (libobjc returns a zero slot, which decodes
as false).
This is a test-add commit (per the FFI cadence rule): it locks in
current behavior without changing any lowering. Lowering shape is
identical to `#objc_call(u8)` at the ABI layer; this test makes the
source-level type explicit and gives `git bisect` a target if a
future emit_llvm change inadvertently breaks single-byte returns.
110/110 host tests pass.
Apple documents `-becomeFirstResponder` and `-resignFirstResponder` as
returning `BOOL`. The pre-`#objc_call` cast pattern in this file used
`u8` because BOOL is ABI-equivalent to a 1-byte unsigned integer on
both i386 (signed char) and arm64 (`bool`). The initial 1.28
migration carried that `u8` typing forward without question; switching
to `bool` matches the documented API and aligns with the BOOL→bool
mapping called out in PLAN-FFI.md Phase 3.
First standalone exercise of `#objc_call(bool)`. The lowering is
identical to `#objc_call(u8)` at the ABI layer (single byte in `w0`
on AAPCS64), but the source-level type is now meaningful.
Three Phase 1D clusters in one commit (user opted for less iOS-sim
verification between each).
1.28 — `show_keyboard` / `hide_keyboard` use `#objc_call(u8)` against
`becomeFirstResponder` / `resignFirstResponder`. Compile-only; chess
startup doesn't reach the keyboard path, so the runtime side of this
cluster is a verification gap to backfill at the end of Phase 1D.
1.29 — `uikit_create_gl_context` migrates `alloc` / `initWithAPI:` /
`setCurrentContext:` and folds in the same `mainScreen.nativeScale`
read shape already migrated in 1.27. EAGL context creation runs on
launch, so this cluster IS runtime-exercised.
1.30 — `uikit_subscribe_keyboard_notifications` migrates the
`defaultCenter` + `addObserver:selector:name:object:` pair. First
standalone exercise of a 4-keyword selector through `#objc_call`.
Notification-center wiring runs at launch, so runtime-exercised.
Net -23 lines across the file.
iOS-sim chess regression smoke: app launches cleanly into a fresh
board state. Status-bar clearance, sharp rendering, and asset loading
all good — confirming clusters 1.25–1.27 still work alongside the new
ones.
Third Phase 1D cluster. `UIScreen.mainScreen.nativeScale` chain reads
through `#objc_call(*void)` + `#objc_call(f64)`. First standalone
`#objc_call(f64)` exercise — `f64` returns had only been covered
indirectly by the 4×f64 UIEdgeInsets HFA path. Net -4 lines.
iOS-sim chess regression smoke: sharp text rendering + accurate touch
hit-testing both confirm `plat.dpi_scale` is being populated correctly
through the new path.
Second Phase 1D cluster. NSBundle.mainBundle.resourcePath chain now
dispatches through `#objc_call(*void)` instead of a shared `msg_o`
typed cast — covers both class-method (`+mainBundle`) and
instance-method (`-resourcePath`) shapes through one intrinsic. Net
-3 lines.
iOS-sim chess regression smoke: app launches with all piece assets
rendered, which is the visible signal that `chdir` to the bundle's
resource path still succeeds.
First Phase 1D migration cluster. `uikit_refresh_safe_insets` reads
`safeAreaInsets` through `#objc_call(UIEdgeInsets)` instead of the
hand-typed `objc_msgSend` cast + `sel_registerName` triple, and a dead
`sel_safe_insets` selector decl in `uikit_scene_will_connect_ios` goes
away with it. Net -3 lines.
iOS-sim chess regression smoke: SxChess launches, board renders with
correct status-bar clearance — `safe_top` is populated correctly,
which is the actual ABI under test (32 B HFA returned in v0..v3).
109/109 regression tests pass; chess Android + iOS-sim still
build clean.
Root cause: sx's `xx <ptr>` cast targeting an integer type
(common pattern: `xx u64 = xx @some_global`) lowered to a no-op
because `coerceToType` had branches for int↔float and same-kind
widen/narrow, but nothing for pointer↔integer. The cast left the
value as a pointer Ref, and `emitInst`'s `.ret` arm tried to
coerce a `ptr` value to an `i64` slot. `coerceArg` had no
ptr↔int branch either, so it fell through to undef.
Why it worked in main but failed in helpers: an
`alloca u64`+`store ptr @g, alloca`+`load i64, alloca` sequence
preserves the address bits as raw memory, so the
"store-then-load through an alloca" workaround happened to do
the right thing without a real cast. A `ret i64 <ptr>` has no
such intermediate slot and triggers an LLVM type mismatch.
Fix layered into two existing IR opcodes:
lower.zig (coerceToType):
new branch — when src and dst types are ptr↔int, emit a
`bitcast` IR opcode with the right from/to. Mirrors how
int↔float emits `.int_to_float` / `.float_to_int`.
emit_llvm.zig (.bitcast arm):
dispatch ptr→int to `LLVMBuildPtrToInt` (+ trunc/zext if the
target int width != 64), int→ptr to `LLVMBuildIntToPtr`. The
"real bitcast" path stays for same-kind type punning.
Modern LLVM's BuildBitCast rejects ptr↔int directly, hence
the dispatch.
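The dispatch in the `.bitcast` arm can be summarized as a small decision function. This is a sketch under assumptions: `lower_cast` is a hypothetical name, it returns instruction names rather than calling the LLVM C API, and the pointer width is taken to be 64 bits as the commit implies.

```python
def lower_cast(src_kind: str, dst_kind: str, int_bits: int = 64,
               ptr_bits: int = 64) -> list:
    """Return the LLVM instructions a cast lowers to, as a list of names."""
    if src_kind == "ptr" and dst_kind == "int":
        steps = ["ptrtoint"]            # to a pointer-sized integer first
        if int_bits < ptr_bits:
            steps.append("trunc")       # narrower target integer
        elif int_bits > ptr_bits:
            steps.append("zext")        # wider target integer
        return steps
    if src_kind == "int" and dst_kind == "ptr":
        return ["inttoptr"]
    # Same-kind type punning keeps the plain bitcast path.
    return ["bitcast"]
```

For example, `lower_cast("ptr", "int", 32)` yields `["ptrtoint", "trunc"]`, while a 64-bit target needs only `ptrtoint`.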
The fix also closes a quiet behavior gap that affected non-`#foreign`
globals (any `xx @<global>` from a helper fn). Surfaced while
investigating issue-0037; verified independently with a
non-`#foreign` sx-side global of type `s64`.
File mechanics: issue-0037 promoted to a focused feature example
per CLAUDE.md's resolution flow:
examples/issue-0037.sx -> examples/102-foreign-global-from-helper.sx
tests/expected/issue-0037.{txt,exit} -> tests/expected/102-foreign-global-from-helper.{txt,exit}
ffi-objc-call-03 + ffi-objc-call-06 IR snapshots updated to
reflect the ptr→int store-via-ptrtoint shape that's now correct
at the LLVM-IR level (same bits in memory, but properly typed).
109/109 host tests pass; tests/cross_compile.sh's first real tuple
(`android | examples/ffi-objc-call-10-os-gate.sx`) compiles
through `sx build --target android` without finding any
`@objc_msgSend` / `@sel_registerName` symbols in the output —
the `inline if OS == .ios { #objc_call(...) }` arm is stripped
at sx compile time before emit_llvm runs, so the Android
toolchain (Bionic + libGLESv3 / NDK linker) doesn't see the
Obj-C runtime references that would otherwise be undefined.
Host (macOS): the example prints "host stripped both" — the iOS
arm is stripped (we're not iOS) AND the Android arm is stripped
(we're not Android), confirming `inline if OS == { case }`
symmetric strip-and-render works around `#objc_call` sites.
The example carries a 3-line `android_main` trampoline so the
NDK linker's `-u ANativeActivity_onCreate` / entry-point
discovery is satisfied — pattern shared with chess + the other
android examples.
108/108 regression tests pass (+ffi-objc-call-09-in-construct,
+issue-0038 from the prior commit).
One trivial Obj-C call (`[obj hash]` returning NSUInteger) routed
through four sx surface constructs:
  1. struct method body          Probe.fetch
  2. protocol impl method body   impl Hashable for Probe
  3. closure value body          make_hasher
  4. generic function body       hash_through(recv: $T)
No new ABI shapes touched — pins that the `objc_msg_send` lowering
emits identical call shapes regardless of enclosing scope. Each
case validates the result `h_N == h_1` after threading `recv`
appropriately for each context.
The closure path reaches `recv` via a module-level global rather
than capturing the surrounding parameter — issue-0038 (prior
commit) documents the closure free-variable analyzer missing the
`FfiIntrinsicCall` node, with a clean workaround pinned.
Surfaced while writing the Phase 1.11 in-construct test. The
closure free-variable analyzer doesn't recursively visit the
`ffi_intrinsic_call` AST node introduced in Phase 1.1, so any
identifier used inside `#objc_call` / `#jni_call` /
`#jni_static_call` from a closure body trips:
error: unresolved: '<name>'
The same identifier captured from the same scope into a plain
expression resolves fine — so the bug is localized to whatever
recursive arm-walk powers the capture analysis.
Likely fix: add an `ffi_intrinsic_call => { ... }` arm wherever
the `.call =>` arm visits `callee` + `args`. Candidate files:
- src/sema.zig (capture / scope tracking)
- src/ir/lower.zig (closure body lowering / `lowerLambda`)
Both should be checked.
Workaround in the meantime: reach the captured value via a
module-level global from inside the closure body. See the
`g_hasher_recv` pattern in
examples/ffi-objc-call-09-in-construct.sx for an applied
instance.
106/106 regression tests pass (+ffi-objc-call-08-multi-keyword).
`#objc_call(s32)(instance, "combine:and:", 7, 42)` round-trips
end-to-end via class_addMethod-registered IMP that does
`a * 100 + b` → 742. Pins three things:
1. The two-keyword selector "combine:and:" parses, mangles, and
interns under the symbol `@OBJC_SELECTOR_REFERENCES_combine_and_`
(every `:` → `_` — matches clang).
2. Multi-arg call lowering correctly puts arg0 / arg1 in the right
slots after recv / sel.
3. The IMP-side sx fn signature `(self, _cmd, a: s32, b: s32)`
with `callconv(.c)` interops with the Obj-C runtime's typical
IMP shape, and the runtime forwards the keyword args to the
right physical positions.
No codegen change — Phase 1.6's variadic-args branch in the
`objc_msg_send` lowering already handled this; this test just
locks in the surface.
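As a quick check of the two values this test pins, here is a sketch of the selector-to-symbol mangling and the IMP arithmetic. Both functions are Python stand-ins with hypothetical names; only the `:` to `_` rule and the `a * 100 + b` body come from the commit.

```python
def selector_slot_name(selector: str) -> str:
    """Global name for an interned selector: every ':' becomes '_'."""
    return "OBJC_SELECTOR_REFERENCES_" + selector.replace(":", "_")


def combine_imp(a: int, b: int) -> int:
    """Python stand-in for the body of the test IMP."""
    return a * 100 + b
```

`selector_slot_name("combine:and:")` produces `OBJC_SELECTOR_REFERENCES_combine_and_`, and `combine_imp(7, 42)` produces 742.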
105/105 regression tests pass (+ffi-objc-call-07-fp-hfa-return).
Same round-trip pattern as 1.8 — register an Obj-C class at
runtime with class_addMethod, IMP returns specific non-zero values,
#objc_call reads them back — but for an all-double 32 B HFA
instead of a 24 B int aggregate.
Locks in the f32-vs-f64 landmine that bit us when we first
wrote safeAreaInsets in uikit.sx: the homogeneous-float-aggregate
ABI routes 1..4 f32 or f64 fields through v0..v3 (AAPCS64) /
xmm0..xmm3 (SysV AMD64) WITHOUT integer coercion. As long as the
LLVM call-site function type carries the precise struct (which
our `objc_msg_send` arm does), the backend lowers it correctly.
This is the smaller cousin of 1.8 — 1.8 needed an emit_llvm code
change to make the sret transform work; 1.9 needs no codegen
change because HFAs of any size up to v0..v3 stay register-resident.
The test just pins that path with a real, value-bearing IMP so a
future ABI-rule shake-up has a regression net.
104/104 regression tests pass. The Triple round-trip
(triple_imp writes {11, 22, 33} on the IMP side → #objc_call(Triple)
reads them back) is the test of record.
emit_llvm.zig changes:
1. `objc_msg_send` arm — when `needsByval(ret_ty)` (same predicate
the plain-foreign-call path uses), apply the sret transform:
- ret type collapses to void
- prepend a `ptr` param at index 0 (call site provides an
alloca slot)
- mirror `sret(<RetType>)` on the call site so the AArch64 x8
/ SysV-AMD64 hidden-ptr ABI lowers correctly
- load the result from the slot post-call
The IR shape now matches clang exactly:
call void @objc_msgSend(ptr sret({...}) %slot, ptr %recv, ptr %sel)
2. `.ret` arm — the body-side counterpart for sx fns whose declared
return type is sret-shaped (sx-defined IMPs registered via
`class_addMethod` produce these). When the current function's
`needsByval(func.ret)` predicate holds, store the IR ret value
through the prepended sret slot (param 0) and emit `ret void`.
Previously the unconditional coerceArg path turned the struct
value into `undef` and emitted `ret void undef` — illegal LLVM.
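The call-site half of the transform amounts to a rewrite of the function signature. The sketch below models it on type strings; `apply_sret` is a hypothetical helper, and the `needs_byval` flag stands in for the result of the real `needsByval` predicate, which is not reproduced here.

```python
def apply_sret(ret_type: str, params: list, needs_byval: bool):
    """Return (ret_type, params, post_call_steps) after the sret rewrite."""
    if not needs_byval:
        # Register-sized returns keep their signature unchanged.
        return ret_type, list(params), []
    # Return type collapses to void; the result slot becomes parameter 0.
    new_params = [f"ptr sret({ret_type})"] + list(params)
    # The caller reads the result back out of the slot after the call.
    return "void", new_params, [f"load {ret_type}, ptr %slot"]


triple = "{ i64, i64, i64 }"
rewritten = apply_sret(triple, ["ptr", "ptr"], needs_byval=True)
untouched = apply_sret("i64", ["ptr", "ptr"], needs_byval=False)
```

`rewritten` has a `void` return with the `sret` pointer prepended to `(recv, sel)`, which is the shape of the `call void @objc_msgSend(ptr sret({...}) %slot, ...)` line above.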
Test mechanics: registers `SxTripleProbe : NSObject` at runtime via
`objc_allocateClassPair` + `class_addMethod`, IMP returns
Triple{11, 22, 33}. `#objc_call(Triple)(instance, "tripleValue")`
gets them back, round-trip pinned in the .txt snapshot and the
IR-shape snapshot.
104/104 regression tests pass (+ffi-objc-call-06-sret-return).
The runtime output is misleadingly clean — `[nil tripleValue]`
zeros all three fields because libobjc's nil-stub clears the
return registers. But the IR snapshot reveals the actual ABI
mismatch:
%objc.msg = call { i64, i64, i64 } @objc_msgSend(ptr null, ptr %load)
A live receiver returning a non-zero `Triple` would surface
garbage in the third field — the AArch64 backend lowers
{ i64, i64, i64 } returns to x0/x1 pair + a third register that
the runtime's sret-shaped stub doesn't populate.
Next commit (1.8b): emit_llvm's `objc_msg_send` arm gains the
same sret transform we did for plain `#foreign` calls in Phase
0.3 — ret type collapses to void, prepend a ptr sret param,
alloca the result slot at the call site, mirror the
`sret(<T>)` attribute on the call, load result from the slot
post-call. IR snapshot will flip to:
%slot = alloca <Triple>
call void @objc_msgSend(ptr sret(<Triple>) %slot, ptr null, ptr %load)
%objc.msg = load <Triple>, ptr %slot
103/103 regression tests pass (+ffi-objc-call-05-struct-returns).
Three return shapes all round-trip cleanly with the existing Phase
1.6 `objc_msg_send` lowering — no codegen change needed because
emit_llvm.zig hands the IR struct type straight to LLVMBuildCall2
and the AArch64 / SysV AMD64 backends already know how to lower:
  NSPoint   16 B HFA (2×f64)   → v0, v1 (AAPCS64) / xmm0, xmm1 (SysV)
  NSRange   16 B 2×u64         → x0, x1 register pair via [2 x i64]
  NSRect    32 B HFA (4×f64)   → v0..v3 (AAPCS64) / xmm0..xmm3 (SysV)
Verified against the Obj-C runtime's `[nil structMethod]`-returns-
zero contract — no real-object setup needed, but the wider ABI
path runs exactly as it would for live receivers (the registers
the runtime stub uses come back through the same lowering).
>16 B non-HFA aggregates (e.g. {3×s64}) trip a sret cliff and
land in Phase 1.8. Verified locally that they return garbage in
the trailing field today — register pair / quad won't carry the
extra storage, and emit_llvm's `objc_msg_send` arm doesn't apply
the sret transform yet.
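The split between the shapes that work today and the one that needs Phase 1.8 follows the AAPCS64 return rules. A simplified sketch, covering AArch64 only and ignoring padding and alignment; `classify_return_aapcs64` and the field encoding are hypothetical.

```python
def classify_return_aapcs64(fields: list) -> str:
    """Classify a struct return; `fields` is a list of (kind, size_bytes)."""
    kinds = {kind for kind, _ in fields}
    sizes = {size for _, size in fields}
    total = sum(size for _, size in fields)
    # Homogeneous floating-point aggregate: 1..4 members of one float type.
    if kinds == {"float"} and len(sizes) == 1 and 1 <= len(fields) <= 4:
        return "hfa"    # v0..v3
    if total <= 16:
        return "gpr"    # x0 / x1
    return "sret"       # caller-provided slot, address passed in x8


NSPOINT = [("float", 8)] * 2
NSRANGE = [("int", 8)] * 2
NSRECT = [("float", 8)] * 4
TRIPLE = [("int", 8)] * 3
```

The three shapes this commit covers classify as `hfa`, `gpr`, and `hfa`; the 24-byte `TRIPLE` is the only one that classifies as `sret`.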
102/102 regression tests pass; chess Android + iOS-sim still build
clean. `ffi-objc-call-04-primitive-returns` flips from xfail to
passing with both nil-recv and real-recv flavors of *void / s64
returns exercised.
Key change: a new `objc_msg_send` IR opcode bundles (recv, sel,
extra args) and carries the return type via the `Inst.ty` field.
emit_llvm.zig builds a per-call-site LLVM function type from the
argument Refs' IR types (recv/sel as ptr; extra args through
abiCoerceParamType) and dispatches with LLVMBuildCall2. One
declared `@objc_msgSend` symbol is reused across every return
type — opaque pointers make the function value type-erased, so
each call site picks its own ABI.
  before: one (recv, sel) -> ptr LLVM declaration, hard-coded
          per call site; only void return wired in 1.3.
  after:  same declaration, each call site provides a fresh
          LLVMBuildCall2 fn-type → s64 / *void / bool / f64
          returns all dispatch correctly without separate FuncIds.
Selector init mechanism: stayed with the @llvm.global_ctors
constructor. Investigated clang's
`__DATA,__objc_selrefs` + `externally_initialized` shape — works
for fully-linked binaries (dyld substitutes the SEL at load
time) but **LLVM ORC JIT** (the engine behind `sx run`) doesn't
process Mach-O Obj-C metadata sections, so the slot keeps its
initial value (the method-name string pointer) and dispatch
crashes with "<null selector>". The portable choice: keep the
constructor AND inject a direct call to it at `main`'s entry —
idempotent under dyld (sel_registerName returns the same SEL on
re-registration), required for ORC JIT.
Files touched:
  src/ir/inst.zig            | new ObjcMsgSend struct + opcode
  src/ir/lower.zig           | drop the void-only restriction; emit the
                             | new opcode; remove the orphaned
                             | getObjcMsgSendFid path (objc_msgSend
                             | declaration moved to emit_llvm)
  src/ir/emit_llvm.zig       | objc_msg_send arm (per-call-site
                             | LLVMBuildCall2); lazy `@objc_msgSend`
                             | declaration via getObjcMsgSendValue;
                             | emitObjcSelectorInit refactored to inject
                             | the ctor call at main's entry
  src/ir/{print,interp}.zig  | switch arms for the new opcode
`ffi-objc-call-03-selector-sharing.ir` snapshot updates to
reflect the new shape (the `call ... @objc_msgSend` call sites
no longer mention a typed wrapper).
102/102 regression tests pass (+ffi-objc-call-04-primitive-returns
with xfail snapshot capturing today's diagnostic).
Pinned scenario: `[NSObject class]` — `#objc_call(*void)(null, "class")`.
Should return a non-null Class pointer once the lowering supports
non-void returns. Today the Phase 1.3 restriction trips with:
#objc_call: only `void` return + (recv, selector) is lowered today;
non-void / arg-bearing arities land in later phase-1 steps
The next commit (1.6b) introduces an `objc_msg_send` IR opcode that
bundles (recv, sel, args, ret_ty) and emit_llvm builds a per-call-
site LLVM function type, sharing one declared `@objc_msgSend`
symbol across return-type variants. Five primitive returns
(*void / bool / s32 / s64 / f64) get folded in across 1.6b–c.
101/101 regression tests pass; the IR snapshot for the selector-
sharing test diff flips from four per-call `sel_registerName` calls
to two (one per unique selector) routed through a module-init
constructor — matching what clang emits for `@selector(...)`.
Hot-path cost collapses from a libobjc hashtable lookup per call to
a single load of a static `SEL*` slot:
Before (Phase 1.3):
    %sel = call ptr @sel_registerName(<"init">)
    call ptr @objc_msgSend(<recv>, %sel)
After (Phase 1.5):
    %sel = load ptr, ptr @OBJC_SELECTOR_REFERENCES_init
    call ptr @objc_msgSend(<recv>, %sel)
  + @OBJC_SELECTOR_REFERENCES_init = internal global ptr null
  + @OBJC_SELECTOR_REFERENCES_release = internal global ptr null
  + define internal void @__sx_objc_selector_init() {
  +   %sel = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_)
  +   store ptr %sel, ptr @OBJC_SELECTOR_REFERENCES_init
  +   %sel1 = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_.2)
  +   store ptr %sel1, ptr @OBJC_SELECTOR_REFERENCES_release
  +   ret void
  + }
  + @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }]
  +   [{ ..., ptr @__sx_objc_selector_init, ptr null }]
Implementation:
  module.zig    | new `objc_selector_cache: ArrayList(ObjcSelectorEntry)`
                | with `lookupObjcSelector` / `appendObjcSelector`. List
                | (not hashmap) keeps emit order stable across builds so
                | the IR snapshot doesn't flicker on rehash.
  lower.zig     | `internObjcSelector(sel)` creates the slot on first
                | use, returns the same `GlobalId` on every subsequent
                | call to the same selector. lowerFfiIntrinsicCall now
                | emits `global_addr + load` for literal selectors.
                | Non-literal selectors keep the `sel_registerName`
                | fallback. Declares `sel_registerName` lazily on
                | first intern so emit_llvm finds it for the
                | constructor body.
  emit_llvm.zig | new `emitObjcSelectorInit` pass synthesizes a void
                | constructor that loops over the cache, calls
                | `sel_registerName` for each unique selector string,
                | stores the result in the slot. Constructor is
                | registered in `@llvm.global_ctors` with default
                | priority (65535) so dyld runs it before main.
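The interning scheme can be sketched as a list with linear lookup, which is what keeps the emission order deterministic. This is a Python model with hypothetical names (`SelectorCache`, `constructor_body`), not the Zig implementation; the strings it produces are descriptive placeholders, not IR.

```python
class SelectorCache:
    """Hypothetical model of objc_selector_cache: a list, not a hash map."""

    def __init__(self):
        self.entries = []           # insertion order == emission order

    def intern(self, selector: str) -> int:
        for index, existing in enumerate(self.entries):
            if existing == selector:
                return index        # same slot for every later use
        self.entries.append(selector)
        return len(self.entries) - 1

    def constructor_body(self) -> list:
        # One sel_registerName + store per unique selector.
        return [f"store sel_registerName({s!r}) -> slot {i}"
                for i, s in enumerate(self.entries)]


cache = SelectorCache()
slot_ids = [cache.intern(s) for s in ["init", "init", "release", "init"]]
```

Four call sites over two unique selectors yield two slots and two constructor entries, in first-use order.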
The `@OBJC_METH_VAR_NAME_` private string globals and unnamed-addr
flag match clang's exact emission shape — picked up by the system
linker into the right Mach-O sections on macOS / iOS. Chess
Android + iOS-sim still build clean (no `#objc_call` in chess yet —
phase-3 migration will start exercising this).
run_examples.sh now supports an optional `tests/expected/<name>.ir`
sibling to `.txt`/`.exit`. When present, the runner also captures
`sx ir <file>` output, normalizes target-/host-specific noise
(module ID, target triple/datalayout, attribute groups, LLVM's
auto-suffixed %temp numbering), and diffs against the snapshot.
`--update` regenerates it alongside the runtime output.
Catches lowering changes that don't affect what the program prints
— exactly the shape Phase 1.5's selector interning will produce
(same runtime output, very different IR).
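A minimal sketch of the kind of normalization described above, assuming line-prefix filtering and a regex rename for numbered temporaries. The real runner is a shell script and its exact rules are not shown in this commit, so `normalize_ir` and its patterns are illustrative only; in particular this version drops `attributes #N` definition lines but leaves `#N` references in place.

```python
import re


def normalize_ir(text: str) -> str:
    """Drop host-specific lines and renumber auto-suffixed temporaries."""
    kept = []
    for line in text.splitlines():
        if line.startswith(("; ModuleID", "target triple",
                            "target datalayout", "attributes #")):
            continue
        kept.append(line)
    body = "\n".join(kept)

    # Map each digit-suffixed %name to a stable ordinal by first appearance.
    seen = {}

    def rename(match):
        name = match.group(0)
        if name not in seen:
            seen[name] = f"%t{len(seen)}"
        return seen[name]

    return re.sub(r"%[A-Za-z_.]*\d+\b", rename, body)
```

Two IR dumps that differ only in module ID, target triple, and temp numbering normalize to the same text, so the diff stays quiet unless the lowering itself changes.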
First snapshot: `ffi-objc-call-03-selector-sharing.ir`. Today the
test emits four `call ptr @sel_registerName(ptr @str.N)` lines for
its four call sites; after 1.5 we expect two static
`@OBJC_SELECTOR_REFERENCES_<sel>` globals + loads at each call
site. The diff between the two snapshots will be the visible
artifact of the optimization.
101/101 regression tests pass (+ffi-objc-call-03-selector-sharing).
Test exercises four call sites — three sharing "init" and one
"release" — to pin the multi-site / multi-selector lowering before
1.5 changes how SEL lookups are cached.
Runtime behavior: identical before and after 1.5 (every call site
messages a nil receiver, which libobjc treats as a no-op). The
improvement is visible only in the emitted IR. Today:
$ ./zig-out/bin/sx ir examples/ffi-objc-call-03-selector-sharing.sx \
| grep -c "call ptr @sel_registerName"
4
After 1.5 (planned): 2 — one `sel_registerName` per unique selector
string, materialized into a static `OBJC_SELECTOR_REFERENCES_<sel>`
global at module init, then loaded at each call site. Matches the
shape clang produces for `@selector(...)`. Worth re-running the
above grep after 1.5 lands as a manual sanity check.
The IR-shape snapshot harness (auto-diff of `sx ir` output) is
deferred; for now we verify by eye.
100/100 regression tests pass; ffi-objc-call-02-void-return flips
from xfail (codegen rejection) to passing ("ok").
Lowering for `#objc_call(void)(recv, "selector:")` lands in
lower.zig as `lowerFfiIntrinsicCall`:
%sel = call ptr @sel_registerName(<"selector:">)
%call = call ptr @objc_msgSend(<recv>, %sel)
Two extern decls (`sel_registerName(*u8) -> *void` and
`objc_msgSend(*void, *void) -> *void`) are declared lazily and
cached on the Lowering instance via `objc_msg_send_fid` /
`sel_register_name_fid`, so multiple call sites share one
declaration each.
Phase 1.3 deliberately keeps scope tight: only the `void` return type
with the two-argument (recv, selector) form is wired. Non-void returns
and calls with extra arguments fall through with a diagnostic and are
owned by subsequent phase-1 steps (1.6 primitive returns; 1.7..1.9
struct shapes; 1.10 multi-keyword selectors).
Selector resolution is still per-call-site `sel_registerName` —
the planned 1.5 interning turns the per-call hashtable lookup into
a single static-global load. Chess Android + iOS-sim builds clean
— no regression on the existing typed-`objc_msgSend`-cast pattern.
100/100 regression tests pass (+ffi-objc-call-02-void-return xfail
snapshot).
Without the `inline if false` guard, the intrinsic reaches
sema/codegen and trips an "unresolved: 'unknown_expr'" error: the
FfiIntrinsicCall AST node from Phase 1.1 has no lowering rules in
lower.zig / emit_llvm.zig yet.
nil receiver was chosen so the test doesn't need a real Obj-C
object graph: the runtime guarantees `[nil msg]` is a no-op with
zero result for void returns. macOS-gated via `inline if OS == .macos`
so the runner stays portable.
Next commit: emit_llvm.zig produces the per-call-site
%sel = call ptr @sel_registerName(ptr "init.0")
call void @objc_msgSend(ptr null, ptr %sel)
lowering. Snapshot flips to "ok". Selector interning (one shared
global per unique selector string) lands as a separate step (1.5).
99/99 regression tests pass (+ffi-jni-call-01-parse).
Locks in the same parse-surface contract for the JNI intrinsics
that ffi-objc-call-01-parse pins for the Obj-C side:
#jni_call(*void)(null, null, "getWindow", "()Landroid/view/Window;");
#jni_static_call(s32)(null, null, "max", "(II)I", 3, 7);
#jni_call(bool)(null, null, "isShown", "()Z");
All three lower through the shared `FfiIntrinsicCall` AST node
added in 1.1; only the kind tag distinguishes them. `inline if false`
keeps sema/codegen out of the picture until later phase-1 steps
wire those in.
98/98 regression tests pass; ffi-objc-call-01-parse flips from
parse-error xfail to passing.
Shape: `#<intrinsic>(ReturnT)(args...)`. The return-type generic
sits in the first parens, the actual call args in the second. All
three intrinsics share the same parse rule; only the kind tag and
the downstream lowering differ.
token.zig | three new hash_* tags
lexer.zig | matches the directive keywords with the same
isIdentContinue boundary check as the rest
ast.zig | FfiIntrinsicCall node with `kind`, `return_type`,
and `args` fields; FfiIntrinsicKind enum
parser.zig | parseFfiIntrinsicCall — same call-arg loop shape
as Call, with the leading return-type slot
sema.zig | analyzeNode + findNodeAtOffset arms walk the args
+ return-type child nodes
lsp/server.zig | classify the new tokens as ST.keyword
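The boundary check mentioned for lexer.zig can be sketched in C as follows. This assumes identifier-continue means ASCII alphanumerics plus underscore; the actual `isIdentContinue` definition is not shown in this log, and `match_directive` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed definition: ASCII letters, digits, and underscore. */
int is_ident_continue(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Returns the keyword length when `src` starts with `kw` and the byte
 * right after it cannot continue an identifier; otherwise returns 0. */
size_t match_directive(const char *src, const char *kw) {
    assert(src != NULL && kw != NULL);
    size_t n = strlen(kw);
    if (strncmp(src, kw, n) != 0) {
        return 0;
    }
    /* src[n] is readable here: the first n bytes matched, so the byte
     * at index n is either real input or the terminating NUL. */
    if (is_ident_continue(src[n])) {
        return 0;
    }
    return n;
}
```

Without the boundary check, a longer identifier that merely starts with a directive keyword would be split into a directive plus a stray suffix.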
Codegen for the new intrinsic isn't wired yet — examples that
reach the body of a non-suppressed call would fail at lowering.
The current parse test uses `inline if false { ... }` to suppress
the dead branch, so sema/codegen don't see the node. Phase 1.3+
adds the lowering and the gate comes off.
Chess Android + iOS-sim builds clean — no regression on the
existing `objc_msgSend` cast pattern or the JNI helper.
98/98 regression tests pass (+ffi-objc-call-01-parse with xfail
snapshot capturing today's parse error).
Phase 1 of PLAN-FFI.md introduces three compiler intrinsics
(`#objc_call`, `#jni_call`, `#jni_static_call`) that lift the
ceremony off the existing typed-`objc_msgSend` and JNI dispatch
patterns. This is the first step of the cadence:
1.0 (this commit): test-add. Locks the current parse rejection.
1.1 (next): make-green. Parser accepts the new syntax;
this snapshot updates to whatever the next
pipeline stage produces (sema/codegen still
can't lower the intrinsic — that's later
phase-1 steps).
1.3+: codegen lands; the test eventually runs
cleanly against Foundation.
`inline if false` wraps the call site so the AST carries the node
but no codegen runs for it. Lets Phase 1.1's parse-only test pass
without dragging in the sema/codegen plumbing prematurely.
97/97 regression tests pass (94 expected updated; +issue-0037 from
the prior commit).
The companion `94-foreign-global-helper.sx` ALSO declares
`__stdinp : *void #foreign;`. Two sx files referencing the same
extern symbol must link cleanly — LLVM dedupes the named global at
the module level, and the C linker resolves both refs to the one
libSystem definition.
The full ergonomic story (helper computes the *same* address as the
main file's direct read) is blocked on issue-0037: lower.zig's
`address_of(global)` branch produces `undef` when the body is a
non-main function, even single-file. Once issue-0037 closes, fold
the helper's address back into an equality check here.
The cross-file link itself works today and is the lemma we're
locking in. This is also the closest thing today to the cross-file
extern-global ergonomics that issue-0030 asks for: `#foreign` already
works across files; the missing piece is sx-side `extern` decls for
sx-defined globals.
Repro found while writing PLAN-FFI step 0.10.
In a single file:
__stdinp : *void #foreign;
stdinp_addr :: () -> u64 { xx @__stdinp; }
main :: () -> s32 {
a : u64 = xx @__stdinp; // a = real symbol address
b := stdinp_addr(); // b = 0
...
}
The emitted IR for the helper is `ret i64 undef`, suggesting the
`address_of(identifier=__stdinp)` branch in lower.zig (~line 1719)
doesn't see `__stdinp` in `global_names` at the moment the helper's
body is being lowered — even though the same lookup succeeds inside
main's body in the same compilation unit.
Likely cause: lazy-body lowering ordering vs. the pass that
registers extern global decls into `global_names`. Worth verifying
which before fixing — could also be per-function scoping of the
map. Phase 1 of the FFI plan doesn't depend on this, so it stays
filed as an open issue and gets addressed when convenient (or when
sx-side `extern` cross-file globals from issue-0030 land and need
the same lookup to work everywhere).
`inline if OS == { case .macos: ... case .ios: ... else: ... }` is
already supported (see library/modules/platform/sdl3.sx:42 and
examples/38-build-config.sx:30). Cleaner than the chained
`inline if OS == .a; inline if OS == .b; ...` form the prior
commit used.
Same expected output — only the macOS arm survives codegen on the
host. Snapshot unchanged.
96/96 regression tests pass (+ffi-09-foreign-result-chain).
Opaque C-handle pattern that mirrors how real sx code threads
MTLBuffer*, AAssetManager*, file pointers, etc. through composite
sx values. C side has a trivial heap-int handle (`ffi_chain_make`
returning `void*`, `ffi_chain_bump` / `_peek` / `_dispose`). The sx
side exercises:
1. Chained calls — make -> bump -> bump -> peek; one handle
threaded through four FFI sites in sequence.
2. Struct field — `Counter { handle: *void; label: string; }`
hosts the handle; methods/accesses go through
`.handle` to feed back into C.
3. List(*void) — push N handles, iterate, peek each, iterate
again to bump each, iterate again to read
back. Catches any aliasing / lifetime breakage
when handles round-trip through the slice
backing of List.
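For orientation, the C side of such a heap-int handle might look like the sketch below. The function names come from the text above; the signatures and bodies are assumptions, since the companion .c file is not reproduced in this log.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates one int on the heap and hands it out as an opaque handle.
 * Returns NULL if allocation fails. */
void *ffi_chain_make(int initial) {
    int *cell = malloc(sizeof *cell);
    if (cell != NULL) {
        *cell = initial;
    }
    return cell;
}

/* Increments the value behind the handle by one. */
void ffi_chain_bump(void *handle) {
    assert(handle != NULL);
    *(int *)handle += 1;
}

/* Reads the current value without modifying it. */
int ffi_chain_peek(const void *handle) {
    assert(handle != NULL);
    return *(const int *)handle;
}

/* Releases the handle; it must not be used afterwards. */
void ffi_chain_dispose(void *handle) {
    free(handle);
}
```

The caller never sees the `int`; it only carries the `void*` around, which is the property the struct-field and List(*void) cases exercise.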
95/95 regression tests pass (+ffi-08-foreign-in-method).
One trivial C helper (`ffi_method_helper`) called from each of the
major sx surface constructs that can host an FFI site:
1. struct method body Counter.next
2. protocol impl method body impl Doubler for Counter
3. closure value body make_adder's `closure(...)`
4. comptime-gated branch `inline if OS == .macos { ... }`
No new ABI shapes — the lowering route a `#foreign` call site takes
shouldn't depend on its enclosing construct, and the test pins that
lemma. A future lowering refactor that, say, breaks protocol-dispatch
fast-paths for FFI-calling impl methods will fail here directly
instead of being caught only by the chess Android regression.
The `inline if` branches for ios/linux compile down to nothing on
macOS, so only the macOS arm fires at runtime — useful smoke test
that the comptime gate works around FFI sites too.
vendors/ is a third-party namespace (stb_image, kb_text_shape, etc.);
test fixtures don't belong there. The .c/.h companion files for the
Phase-0 FFI baselines now sit alongside the .sx that drives them in
examples/, with matching basenames:
examples/ffi-01-primitives.{sx,c,h} <- was vendors/ffi_primitives/
examples/ffi-02-small-struct.{sx,c,h} <- was vendors/ffi_structs/
examples/ffi-03-large-struct.{sx,c,h} <- was vendors/ffi_large_struct/
examples/ffi-04-fp-struct.{sx,c,h} <- was vendors/ffi_fp_struct/
examples/ffi-05-string-args.{sx,c,h} <- was vendors/ffi_strings/
examples/ffi-06-callback.{sx,c,h} <- was vendors/ffi_callback/
examples/101-ffi-medium-struct.{sx,c} <- was vendors/ffi_medium_struct/
`#source` / `#include` paths in the .sx files become bare filenames
(no prefix) since imports.zig's base_dir resolution finds them
relative to the importing .sx file's directory.
`library/vendors/sx_ffi_resolve_test/` stays put — that one's the
whole point: regression coverage for the stdlib-search branch of
the resolution chain, so it must live where ONLY that branch can
find it.
94/94 regression tests pass.
94/94 regression tests pass (+ffi-07-c-import-block).
Companion C helper lives only at
`library/vendors/sx_ffi_resolve_test/`. Critically NOT in
`sx/vendors/` (the sx repo root) and NOT in the importing
example's directory — so the `vendors/...` paths in this
example are findable solely via the stdlib search branch
(`<exe>/../../library`, `<exe>/../library`, `<exe>/library`).
That branch is the one the JNI insets bridge needs to reach
`library/vendors/sx_android_jni/sx_android_jni.c` without
forcing chess (or any consumer) to vendor an identically-named
copy. The test pins the resolution end-to-end:
- #include resolves; clang parses the .h; c_import.zig
synthesizes #foreign fn decls for `sx_ffi_resolve_test_add` /
`_mul`.
- #source resolves; the .c is compiled into the build's
object list.
- sx calls the synthesized decls and prints results.
Latent bug from the stdlib-path resolution introduced in 4849cfb.
The earlier shape captured `const ci = decl.data.c_import_decl;`
BEFORE mutating `decl.data.c_import_decl.{sources,includes}` with
the resolved paths, then passed the stale `ci.includes` to
`c_import.processCImport`. Result: `#include "vendors/..."` paths
that resolved via the stdlib branch (i.e. only existed under
sx/library/vendors/) reached clang as the original unresolved
string and failed to parse — silently producing no synthesized
`#foreign` decls.
`#source` survived because the source list is re-read from
decl.data later (collectCImportSources walks the AST), so it
picked up the mutated value. Only `#include`'s synthesis path was
broken.
Fix: do the resolution first inside its own scope, then re-bind
`ci` from `decl.data.c_import_decl` so the include list passed to
processCImport sees the resolved paths.
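The same stale-copy hazard, reduced to a C sketch. `Decl`, `CImport`, and both helper functions are hypothetical stand-ins for the Zig structures; the point is only the ordering of the copy relative to the in-place update.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *include; } CImport;
typedef struct { CImport c_import; } Decl;

/* Buggy order: the by-value copy is taken before the in-place update,
 * so it still holds the unresolved path. */
const char *include_seen_buggy(Decl *decl, const char *resolved) {
    assert(decl != NULL);
    CImport ci = decl->c_import;        /* snapshot taken too early */
    decl->c_import.include = resolved;  /* update lands in decl only */
    return ci.include;
}

/* Fixed order: resolve first, then re-bind the copy from the decl. */
const char *include_seen_fixed(Decl *decl, const char *resolved) {
    assert(decl != NULL);
    decl->c_import.include = resolved;
    CImport ci = decl->c_import;        /* snapshot sees the update */
    return ci.include;
}
```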
Caught by ffi-07 baseline (next commit) — the test deliberately
puts its C helper only under library/vendors/ so the path is
findable solely via the stdlib chain.
93/93 regression tests pass (+ffi-06-callback).
Mirrors the `app->onInputEvent` install pattern from
library/modules/platform/android.sx:
1. (s32) -> s32 — single primitive arg/return
2. (*void, s32) -> s32 — opaque ctx pointer + value
(the onInputEvent shape)
Side effects via two file-level globals so the test observes both
the return value AND state mutation across multiple calls:
- g_callback_hits = N proves the callback fired N times.
- g_callback_sum = sum of args proves each individual call landed
with the correct value.
The ctx-pointer variant casts `*void` back to `*s32` inside the
callback and reads through it (`p.*`), proving the pointer survives
the round-trip with no aliasing weirdness.
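A C-only sketch of the two callback shapes and the hit/sum bookkeeping. In the real test the callbacks and globals presumably live on the sx side; here everything is in C for illustration, and every name except `g_callback_hits` and `g_callback_sum` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ValueCallback)(int value);           /* (s32) -> s32 */
typedef int (*CtxCallback)(void *ctx, int value);  /* (*void, s32) -> s32 */

int g_callback_hits = 0;  /* how many times the callback fired */
int g_callback_sum = 0;   /* sum of every argument it received */

/* Shape 1: records the call, then returns twice the argument. */
int record_and_double(int value) {
    g_callback_hits += 1;
    g_callback_sum += value;
    return value * 2;
}

/* Shape 2: casts the opaque ctx back to int* and reads through it. */
int add_ctx(void *ctx, int value) {
    int *p = (int *)ctx;
    return *p + value;
}

/* Calls `cb` once per element and returns the last result (0 if n == 0). */
int invoke_each(ValueCallback cb, const int *values, size_t n) {
    assert(cb != NULL);
    int last = 0;
    for (size_t i = 0; i < n; i++) {
        last = cb(values[i]);
    }
    return last;
}

/* Forwards ctx untouched, as an installer of a ctx-style handler would. */
int invoke_with_ctx(CtxCallback cb, void *ctx, int value) {
    assert(cb != NULL);
    return cb(ctx, value);
}
```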
92/92 regression tests pass (+ffi-05-string-args).
Covers the four shapes that actually appear at the sx ↔ C boundary
today:
1. [:0]u8 string literal -> const char* (ffi_strlen, ffi_first_byte)
2. sx `string` value via .ptr (slice-decay branch in
coerceArg pulls the pointer)
3. [*]u8 raw buffer + length (ffi_sum_bytes, mutated via
ffi_write_byte and read back)
4. C-returned const char* (round-trips back as [*]u8)
The mutate-via-C path catches any pointer-aliasing regression — sx
allocates the fixed array `bytes : [4]u8`, passes `.ptr` to C which
writes index 1, and the sx side reads `bytes[1]` to confirm the
mutation took effect through the same memory.
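The C helpers named above could plausibly look like this. The names are from the text; the signatures and bodies are assumptions, since the companion .c file is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a NUL-terminated string, as seen from C. */
size_t ffi_strlen(const char *s) {
    assert(s != NULL);
    return strlen(s);
}

/* First byte of a NUL-terminated string (0 for the empty string). */
unsigned char ffi_first_byte(const char *s) {
    assert(s != NULL);
    return (unsigned char)s[0];
}

/* Sums `len` bytes of a raw buffer. */
unsigned ffi_sum_bytes(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    unsigned total = 0;
    for (size_t i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}

/* Writes one byte in place; the caller owns the buffer and the bounds. */
void ffi_write_byte(unsigned char *buf, size_t index, unsigned char value) {
    assert(buf != NULL);
    buf[index] = value;
}
```

Because `ffi_write_byte` mutates the caller's memory rather than a copy, reading the same index back from the caller's array is what confirms the pointer was passed through unchanged.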
91/91 regression tests pass (+ffi-04-fp-struct).
Single-file regression net for the all-float / all-double aggregate
ABI path:
FQuad — 16 B, 4×f32 (same slot as ffi-02's Vec4f)
DQuad — 32 B, 4×f64 (UIEdgeInsets-shape — the f32-vs-f64 landmine)
Already nominally covered by ffi-02's Vec4f, but pinning it as a
focused single-file test means a future ABI rule change that breaks
the HFA path fails *this* test directly without a noisy drag-in from
the multi-shape baseline.
DQuad at 32 B sits exactly at the AAPCS64 HFA limit (at most 4 members
of the same floating-point type, so 32 B for f64); it stays as a struct
value passed through v0..v3 rather than going indirect. The snapshot
confirms the values arrive intact.
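A C view of the two shapes, with the sizes pinned at compile time. The struct names are from the text; the field names and the `dquad_scale` helper are hypothetical.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } FQuad;   /* 4 x f32 */
typedef struct { double x, y, z, w; } DQuad;  /* 4 x f64 */

/* Compile-time size checks matching the sizes quoted above. */
static_assert(sizeof(FQuad) == 16, "FQuad should be 16 bytes");
static_assert(sizeof(DQuad) == 32, "DQuad should be 32 bytes");

/* Takes and returns the aggregate by value, so both directions of the
 * all-double path are exercised. */
DQuad dquad_scale(DQuad q, double k) {
    DQuad r = { q.x * k, q.y * k, q.z * k, q.w * k };
    return r;
}
```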
90/90 regression tests pass (+ffi-03-large-struct).
vendors/ffi_large_struct/{.h,.c} defines:
Big24 — 24 B, three s64 (byval params + sret return)
Big48 — 48 B, six s64 (same path, larger)
`make / rotate-or-reverse / sum` helpers per shape. The sx-side
example imports via `#source` only and declares matching structs +
hand-written #foreign decls.
Snapshot pins today's >16-byte aggregate ABI now that the
emit_llvm.zig sret-return transform is in place (previous commit).
That gives us a regression net for all four C-ABI aggregate slots
in one place:
≤8 B int — i64 coercion (ffi-01 vec-likes)
9..16 B int — [2 x i64] coercion (ffi-02 Pair64/Quad32, 101)
16 B HFA — struct, no coercion (ffi-02 Vec2/Vec4f)
>16 B — byval params + sret (this commit)
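Read as a decision procedure, the table above might be sketched like this. It is a simplification under stated assumptions: `classify_aggregate` and the `AbiSlot` names are hypothetical, HFA detection is passed in as a flag rather than computed, and the real rules in `abiCoerceParamType` may consider more than size.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SLOT_I64,       /* <=8 B integer-only: coerced to i64 */
    SLOT_2XI64,     /* 9..16 B integer-only: coerced to [2 x i64] */
    SLOT_HFA,       /* homogeneous float aggregate: struct, no coercion */
    SLOT_INDIRECT   /* >16 B non-HFA: byval params + sret return */
} AbiSlot;

/* Picks the slot for an aggregate of `size` bytes. HFAs are checked
 * first because they keep their struct type regardless of size. */
AbiSlot classify_aggregate(size_t size, int is_hfa) {
    assert(size > 0);
    if (is_hfa) {
        return SLOT_HFA;
    }
    if (size <= 8) {
        return SLOT_I64;
    }
    if (size <= 16) {
        return SLOT_2XI64;
    }
    return SLOT_INDIRECT;
}
```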
Foreign functions that return a >16-byte non-HFA aggregate (e.g.
Big24 / UIEdgeInsets on iOS / clang-shaped struct returns) need the
indirect-return ABI: caller allocates space, passes its pointer as a
hidden first arg with `sret(<T>)`, callee writes through it and
returns void. AAPCS64 puts the pointer in x8; SysV AMD64 puts it in
the first int register and treats the named return as void.
The existing >16-byte branch in `abiCoerceParamType` was returning
`ptr` for BOTH params and returns. That works for byval params (the
established pattern — caller alloca + store + pass ptr, callee loads
in prologue), but is wrong for returns: it caused the function decl
to look like `ptr @fn(...)` rather than `void @fn(ptr sret(<T>), ...)`,
and the call site read whatever happened to be in x0 as a struct
pointer — segfault on dereference (caught while writing the ffi-03
baseline).
Fix layered into the same `abiCoerceParamType` / call-site code path:
emitFunctionDecl:
- Compute `uses_sret = needs_c_abi && needsByval(ret_ty, raw_ret_ty)`.
- Ret type collapses to void.
- Prepend a `ptr` param at slot 0.
- Add `sret(<RetType>)` type attribute on param-index 1
(LLVMAttributeIndex 1 = first parameter; 0 = return value).
.call lowering:
- Detect callee_uses_sret via the same predicate.
- Allocate the result on the caller's stack (`sret.slot`).
- Prepend it as args[0] (with sret_off index alignment so the
original sx args land at args[1..]).
- After LLVMBuildCall2, set the same `sret(<T>)` attribute on
the call site's arg 1 (mirrors the fn-decl attribute — both
land in the AArch64 backend's lowering pass).
- Load the result from the slot to produce the IR value.
`call_indirect` (function-pointer dispatch — uikit.sx's typed
`objc_msgSend` casts) keeps its existing behavior for now; the iOS
path already round-trips UIEdgeInsets via that route. Folding the
same sret transform into call_indirect is a follow-up.
89/89 regression tests still pass. Chess Android + iOS-sim both
build clean.
Now that emit_llvm.zig bridges the struct<->[2 x i64] ABI mismatch
(previous commit), the 9..16-byte integer-only shapes round-trip
cleanly. Extended `examples/ffi-02-small-struct.sx` to cover four
aggregate shapes in one place:
Vec2 — 8 B, two f32 (register pair, float)
Vec4f — 16 B, four f32 (HFA — homogeneous float aggregate)
Pair64 — 16 B, two s64 (9..16 B int — [2 x i64] coercion slot)
Quad32 — 16 B, four s32 (same slot as Pair64)
Vendor helpers (`vendors/ffi_structs/{ffi_structs.h,ffi_structs.c}`)
grow `ffi_pair64_*` + `ffi_quad32_*` companions. Snapshot updated
to capture the full output. 89/89 regression tests pass.
`examples/101-ffi-medium-struct.sx` keeps a minimal focused repro
of the Pair64 case so the issue's emergence-and-fix history stays
greppable.