Trailing `args: ..T` on a #foreign declaration now lowers to the C
calling convention's `...` instead of sx-side slice-packing. Drops
the per-arity #foreign-shim workaround for callers of variadic C
APIs (__android_log_print, printf-family, etc.). Closes issue-0043.
- IR: Function.is_variadic on inst.Function; declareFunction drops
the variadic param from the IR signature for foreign+variadic
decls.
- emit_llvm: LLVMFunctionType receives is_var_arg=1 when the flag
is set; call lowering passes extras through unchanged.
- Lowering: packVariadicCallArgs early-outs for foreign+variadic
(no slice-pack); new promoteCVariadicArgs applies C default
argument promotion (bool/s8/s16/u8/u16 -> s32, f32 -> f64) to
extras past the fixed param count.
- Test: examples/ffi-foreign-cvariadic.sx + .c exercise s64/f64/s32
returns through C va_arg over s32/f64/*u8 element types.
134 host + 6 cross tests pass on the WIP-less baseline.
Adds the constructor-invocation arm of the foreign-class DSL:
`SurfaceView.new(ctx)` (where `SurfaceView` is a `#foreign #jni_class`
with `static new :: (ctx: *Context) -> *Self;`) lowers to
`FindClass(env, "android/view/SurfaceView") + GetMethodID(env, cls,
"<init>", "(args)V") + NewObject(env, cls, mid, args...)`. Returns
the fresh jobject.
- inst.zig: `JniMsgSend.is_constructor` flag + `parent_class_path`
re-purposed to carry the class being constructed (alongside its
existing nonvirtual-super-class use). Mutually exclusive with
`is_static` / `is_nonvirtual`.
- lower.zig: `lowerCall.field_access` arm now recognises
`Alias.method(args)` where `Alias` resolves in `foreign_class_map`
and the matching member is `static`. `new` routes to a new
`lowerForeignStaticCall` that derives a `(args)V` JNI descriptor
and emits a `JniMsgSend` with `is_constructor=true`. Non-`new`
static calls report a clear "use #jni_static_call" diagnostic
until that sugar lands.
- emit_llvm.zig: new `NewObject` vtable slot (28) + `emitJniConstructor`
helper expanding the FindClass+GetMethodID+NewObject chain. The
jni_msg_send arm short-circuits to it when `is_constructor` is set.
Smoke `ffi-jni-main-03-ctor.sx` exercises both this slice and the
previous super-dispatch slice in a single `onCreate` body: calls
`super.onCreate(b)` then constructs a `SurfaceView` with the Activity
as Context. IR shows the expected six-stage chain (FindClass+GetMethodID+
CallNonvirtual + FindClass+GetMethodID+NewObject); APK builds clean.
Naming caveat: the Java type `android.content.Context` clashes with
sx stdlib's `Context :: struct {...}` (heap-context). The smoke aliases
it `JContext` — future work could add a path-prefix or `as` rename
form on `#jni_class` to avoid the manual rename.
133 host / 6 cross / zig build test all green.
Inside a `#jni_main` (or any sx-defined `#jni_class`) bodied method,
`super.method(args)` lowers to JNI's nonvirtual dispatch against the
parent class resolved via `#extends` (default `android.app.Activity`).
- lower.zig: tracks `current_foreign_class` + `current_foreign_method`
around each `synthesizeJniMainStub` body; pushes the JNIEnv* arg
onto the lexical `#jni_env` stack so omitted-env JNI calls inside
the body see env without a wrapper. New `lowerSuperCall` handles
the `super.method(args)` receiver pattern: derives parent path,
reuses the enclosing method's signature when names match (the
common `super.<override>(args)` case), or looks up the method on
the parent class declared as `#foreign #jni_class`.
- inst.zig: `JniMsgSend` gains `is_nonvirtual: bool` and
`parent_class_path: ?[]const u8` — the dispatch tag + super class
foreign path. Mutually exclusive with `is_static`.
- emit_llvm.zig: new `CallNonvirtual<T>Method` vtable slots + a
fourth dispatch arm. Resolves the parent jclass via
`FindClass(env, parent_path)` (per-call; caching is follow-up),
then `GetMethodID(env, parent_cls, name, sig)`, then
`CallNonvirtual<T>Method(env, obj, parent_cls, mid, args...)`.
Disassembly on the smoke confirms the chain:
`ldr [env+0x30]` (FindClass) → `ldr [env+0x108]` (GetMethodID) →
`ldr [env+0x2d8]` (CallNonvirtualVoidMethod) with `(env, self,
parent_cls, mid, bundle)`.
132 host / 5 cross / zig build test all green. The slice unblocks
Activity lifecycle overrides (onCreate, onResume, onPause) calling
their required `super.<method>(args)` without raw `#jni_call`
boilerplate.
`#jni_call` collapses to a single surface — env is *always* implicit:
either picked up from the lexically-enclosing `#jni_env(env) { ... }`
block's Ref (cheap, register-resident, no TL touch) or from the
runtime's thread-local slot via `sx_jni_env_tl_get()` (one fn call
per dispatch). The explicit-env shape is gone — chess and the
existing tests migrate cleanly by wrapping their helper-fn bodies
in `#jni_env(env) { ... }`.
The TL slot lives outside the user's IR module so the LLVM ORC JIT
can load object files cleanly without `orc_rt` for TLS support:
library/vendors/sx_jni_runtime/sx_jni_env_tl.c:
static _Thread_local void *sx_jni_env_tl_slot;
void *sx_jni_env_tl_get(void) { return sx_jni_env_tl_slot; }
void sx_jni_env_tl_set(void *env) { sx_jni_env_tl_slot = env; }
Linkage:
- sx-the-compiler links the .c file via build.zig so the JIT
process-symbol generator resolves `sx_jni_env_tl_get`/`_set`.
- AOT targets get the same .c file auto-linked via the lowering
pass: when lower touches the TL externs, it sets
`needs_jni_env_tl_runtime`, and `Compilation.lowerToIR` appends a
synthetic `CImportInfo` to `lowering_extra_c_sources` that
`collectCImportSources` merges with user-written ones.
Lowering-side changes:
- `getJniEnvTlFids` lazily declares the two externs (parallel
to `getSelRegisterNameFid`) and flips `needs_jni_env_tl_runtime`.
- `#jni_env(env) { body }` emits save→set→body→restore via three
`call` ops to the externs; the inner body sees env via the
lexical-direct stack.
- `lowerJniCall` resolves env from `jni_env_stack` (top) or the TL
fallback. The explicit-env branch is gone.
- `jni_env_stack_base` tracks per-fn lexical scope so lazy-lowering
a callee doesn't accidentally see the caller's Ref (Refs are only
valid inside one fn's instruction stream).
Test migration (mechanical):
- ffi-jni-call-{01..09}: each helper fn wraps `#jni_call(...)`
bodies in `#jni_env(env) { ... }`. Returning values pass through
the block as an expression — `#jni_env` now also lowers in
expression position.
Verified:
- zig build test + tests/run_examples.sh: 130/130 green.
- tests/cross_compile.sh: 3/3 green.
- Chess APK rebuilt + reinstalled on Pixel. Board renders with
status-bar clearance + info panel intact; no crashes in logcat.
Safe-insets dispatch through `#jni_env` + lexical-direct now
fully exercised end-to-end on real hardware.
Static dispatch wired in. The early `is_static` bail in
`.jni_msg_send` is gone; both paths now share the same lazy-cache +
phi structure with two static-specific differences:
1. `GetObjectClass` is skipped — for static calls, `target` IS the
`jclass`. The cached `cls` slot just stores `NewGlobalRef(target)`
directly.
2. The method-ID lookup uses `GetStaticMethodID` (slot 113), and the
dispatch uses `CallStatic<Type>Method` (Object 114 / Boolean 117
/ Int 129 / Long 132 / Float 135 / Double 138 / Void 141).
Slot interning still applies: the `@SX_JNI_{CLS,MID}_<key>` pair is
shared between instance and static literal call sites with the same
`(name, sig)` — though in practice the JNI runtime treats instance
and static method-IDs as distinct, so two sites with the same name
but different dispatch kinds would collide in the cache. This isn't
a problem the chess Android backend hits (each method is uniquely
either static or instance in the API), so the simpler single-key
intern stays.
IR snapshot updated: `ret i32 undef` replaced by the full
NewGlobalRef → GetStaticMethodID → CallStaticIntMethod sequence
through vtable slots 21, 113, 129. Args `i32 3, i32 7` thread through
the existing arg-coercion loop.
Closes the return-type matrix. Pointer-return types aren't a simple
`TypeId` enum case (they're user-defined types interned into the
table), so the dispatch checks `TypeInfo.pointer | .many_pointer`
ahead of the primitive switch:
const is_pointer_ret = switch (types.get(ret_ty_id)) {
.pointer, .many_pointer => true,
else => false,
};
const offset = if (is_pointer_ret)
Jni.CallObjectMethod
else switch (ret_ty_id) { .void => ..., .s32 => ..., ... };
LocalRef cleanup deferred: returned jobjects are JNI LocalRefs
bounded by the native frame. Chains of calls within one frame
consume them inline; cross-frame use must promote via `NewGlobalRef`
(already wired in the slot-interning path from 1.17). The chess
Android backend will consume objects inline, matching the manual
pattern in `sx_android_jni.c`.
Return-type matrix done: void, s32, s64, f64, bool, *void all
dispatch through their respective vtable slots. Static dispatch
(1.23) is next.
One-line addition: `.bool => Jni.CallBooleanMethod`. The lazy-cache
+ dispatch from 1.17 handles the rest. JNI's `jboolean` is i8 in the
C ABI but always carries 0 or 1; LLVM's call boundary truncates the
return byte to i1 and the sx-level bool reads the low bit
canonically.
IR snapshot updated: `ret i1 undef` replaced by the full sequence
through vtable slot 37 keyed on `("isShown", "()Z")`.
One-line addition to the switch: `.f64 => Jni.CallDoubleMethod`.
First non-integer JNI return type; same lazy-cache + dispatch
infrastructure from 1.17 handles the rest.
IR snapshot updated: `ret double undef` replaced by the full
sequence through vtable slot 58 keyed on `("getValue", "()D")`.
One-line addition to the `call_method_offset` switch: `.s64 =>
Jni.CallLongMethod`. The 1.17 caching infrastructure and the named-
constants struct from c1877fc handle the rest.
IR snapshot at `tests/expected/ffi-jni-call-05-jlong-return.ir`
updated: `ret i64 undef` replaced by the full lazy-cache +
CallLongMethod (vtable slot 52) sequence keyed on
`("currentTimeMillis", "()J")`.
The numeric slot indices (21, 31, 33, 49, 61) in the `#jni_call`
lowering are JNI-spec constants from `<jni.h>` but appeared as bare
magic numbers — only the trailing comment told you which JNI
function you were loading. Moving them into a private `const Jni`
namespace at file scope makes the call sites self-documenting:
loadJniFn(ifs, Jni.GetObjectClass, "jni.GetObjectClass")
loadJniFn(ifs, Jni.NewGlobalRef, "jni.NewGlobalRef")
loadJniFn(ifs, Jni.GetMethodID, "jni.GetMethodID")
switch (ret_ty_id) {
.void => Jni.CallVoidMethod,
.s32 => Jni.CallIntMethod,
...
}
Also pre-loaded the remaining Call<Type>Method slots (Object,
Boolean, Long, Float, Double) so steps 1.19–1.22 just add the
corresponding switch arm — no new magic-number lookups in the diff.
Behavior-preserving refactor: IR snapshots unchanged, all 113 host
tests still pass, both cross-compile tuples still green.
One-line addition to the `call_method_offset` switch in
`emit_llvm.zig` — `.s32 => 49` (CallIntMethod). The 1.17 caching
infrastructure handles the rest: GetObjectClass → NewGlobalRef →
GetMethodID populate the shared `@SX_JNI_{CLS,MID}_<key>` pair on
miss; per-call lowering loads the cached jmethodID and dispatches
through vtable slot 49 with an `i32` return.
IR snapshot at `tests/expected/ffi-jni-call-04-jint-return.ir`
updated: the `ret i32 undef` placeholder is replaced by the full
lazy-cache + CallIntMethod sequence keyed on
`("getCount", "()I")`. Pre-1.18 snapshot was 1d7ea72.
Two `#jni_call` sites with the same string-literal `(name, sig)` pair
now share a single `jclass` GlobalRef slot and a single `jmethodID`
slot, populated lazily on the first call to any matching site.
Non-literal sites keep the per-call `GetObjectClass` + `GetMethodID`
sequence from step 1.15.
Per-call-site lowering for literal sites:
%cached_mid = load ptr, @SX_JNI_MID_<key>
%is_cached = icmp ne ptr %cached_mid, null
br i1 %is_cached, cont, miss
miss:
%local_cls = GetObjectClass(env, target)
%global_cls = NewGlobalRef(env, local_cls) ; vtable slot 21
store ptr %global_cls, @SX_JNI_CLS_<key>
%fresh_mid = GetMethodID(env, global_cls, name, sig)
store ptr %fresh_mid, @SX_JNI_MID_<key>
br cont
cont:
%mid = phi ptr [%cached_mid, before], [%fresh_mid, miss]
call <Type>Method(env, target, %mid, args...)
Wiring:
- `JniMsgSend.cache_key: ?CacheKey` (new) carries `(name_str,
sig_str)` when both `name` and `sig` are string-literal AST nodes;
empty for non-literal call sites.
- `lower.zig` populates `cache_key` from the AST.
- `emit_llvm.zig` `getOrCreateJniSlots(name, sig)` returns the
`{cls_slot, mid_slot}` pair, creating and caching them on first
lookup. Key is `name\x00sig` so the separator can't collide with
any JNI identifier byte.
- `mangleJniKey` builds an LLVM-identifier suffix from the pair, used
in the `@SX_JNI_{CLS,MID}_<suffix>` global names.
IR snapshot at `tests/expected/ffi-jni-call-03-methodid-sharing.ir`
updated: two call sites against literal `("noop", "()V")` now share
`@SX_JNI_CLS_noop____V` and `@SX_JNI_MID_noop____V`. Pre-1.17 snapshot
had two independent `GetMethodID` calls; post-1.17 has one global
slot pair plus per-call lazy-init branches.
Note: an unrelated regression in `examples/ffi-objc-call-12-rect-u64-returns.sx`
exists in the working tree (parse error from an in-progress C-import
block) and is left untouched.
New `.jni_msg_send` IR opcode carrying `{env, target, name, sig,
args[], is_static}`. `lowerFfiIntrinsicCall` now dispatches on
`fic.kind`: `.objc_call` keeps the existing path; `.jni_call` and
`.jni_static_call` route through `lowerJniCall`, which emits the new
opcode.
emit_llvm.zig expands `.jni_msg_send` into the JNI vtable
indirection:
%ifs = load ptr, %env ; vtable
%get_obj_class = load ptr, gep(%ifs, i32 31)
%cls = call ptr %get_obj_class(%env, %target)
%get_method_id = load ptr, gep(%ifs, i32 33)
%mid = call ptr %get_method_id(%env, %cls, %name, %sig)
%call_void_method = load ptr, gep(%ifs, i32 61)
call void %call_void_method(%env, %target, %mid, args...)
Per step 1.15's scope: only `.jni_call` (instance) + `void` return
are wired through the switch. `.jni_static_call` (1.23) and the
non-void returns (1.18–1.22) drop to a placeholder `LLVMGetUndef` so
the build doesn't fault — the next-step commits flip those arms one
shape at a time. Method-ID caching is step 1.17.
Two small helpers landed alongside:
- `loadJniFn(ifs, offset, name)` — GEP into the vtable + load.
- `extractSlicePtr(val)` — string literals lower as `{ptr, i64}`
slices in sx IR; JNI's `GetMethodID` expects raw C strings, so
this extracts field 0 when the source is a slice.
Android cross-compile now passes for `examples/ffi-jni-call-02-void.sx`
(2/2 cross targets green). Host run_examples still passes 112/112.
Chess iOS-sim + Android both compile clean.
109/109 regression tests pass; chess Android + iOS-sim still
build clean.
Root cause: sx's `xx <ptr>` cast targeting an integer type
(common pattern: `xx u64 = xx @some_global`) lowered to a no-op
because `coerceToType` had branches for int↔float and same-kind
widen/narrow, but nothing for pointer↔integer. The cast left the
value as a pointer Ref, and `emitInst`'s `.ret` arm tried to
coerce a `ptr` value to an `i64` slot — coerceArg had no
ptr↔int branch either, fell through to undef.
Why it worked in main but failed in helpers: an
`alloca u64`+`store ptr @g, alloca`+`load i64, alloca` sequence
preserves the address bits as raw memory, so the
"store-then-load through an alloca" workaround happened to do
the right thing without a real cast. A `ret i64 <ptr>` has no
such intermediate slot and triggers an LLVM type mismatch.
Fix layered into two existing IR opcodes:
lower.zig (coerceToType):
new branch — when src and dst types are ptr↔int, emit a
`bitcast` IR opcode with the right from/to. Mirrors how
int↔float emits `.int_to_float` / `.float_to_int`.
emit_llvm.zig (.bitcast arm):
dispatch ptr→int to `LLVMBuildPtrToInt` (+ trunc/zext if the
target int width != 64), int→ptr to `LLVMBuildIntToPtr`. The
"real bitcast" path stays for same-kind type punning.
Modern LLVM's BuildBitCast rejects ptr↔int directly, hence
the dispatch.
The fix also closes a quiet behavior gap that affected non-`#foreign`
globals (any `xx @<global>` from a helper fn). Surfaced while
investigating issue-0037; verified independently with a
non-`#foreign` sx-side global of type `s64`.
File mechanics: issue-0037 promoted to a focused feature example
per CLAUDE.md's resolution flow:
examples/issue-0037.sx -> examples/102-foreign-global-from-helper.sx
tests/expected/issue-0037.{txt,exit} -> tests/expected/102-foreign-global-from-helper.{txt,exit}
ffi-objc-call-03 + ffi-objc-call-06 IR snapshots updated to
reflect the ptr→int store-via-ptrtoint shape that's now correct
at the LLVM-IR level (same bits in memory, but properly typed).
104/104 regression tests pass. The Triple round-trip
(triple_imp writes {11, 22, 33} on the IMP side → #objc_call(Triple)
reads them back) is the test of record.
emit_llvm.zig changes:
1. `objc_msg_send` arm — when `needsByval(ret_ty)` (same predicate
the plain-foreign-call path uses), apply the sret transform:
- ret type collapses to void
- prepend a `ptr` param at index 0 (call site provides an
alloca slot)
- mirror `sret(<RetType>)` on the call site so the AArch64 x8
/ SysV-AMD64 hidden-ptr ABI lowers correctly
- load the result from the slot post-call
The IR shape now matches clang exactly:
call void @objc_msgSend(ptr sret({...}) %slot, ptr %recv, ptr %sel)
2. `.ret` arm — the body-side counterpart for sx fns whose declared
return type is sret-shaped (sx-defined IMPs registered via
`class_addMethod` produce these). When the current function's
`needsByval(func.ret)` predicate holds, store the IR ret value
through the prepended sret slot (param 0) and emit `ret void`.
Previously the unconditional coerceArg path turned the struct
value into `undef` and emitted `ret void undef` — illegal LLVM.
Test mechanics: registers `SxTripleProbe : NSObject` at runtime via
`objc_allocateClassPair` + `class_addMethod`, IMP returns
Triple{11, 22, 33}. `#objc_call(Triple)(instance, "tripleValue")`
gets them back, round-trip pinned in the .txt snapshot and the
IR-shape snapshot.
102/102 regression tests pass; chess Android + iOS-sim still build
clean. `ffi-objc-call-04-primitive-returns` flips from xfail to
passing with both nil-recv and real-recv flavors of *void / s64
returns exercised.
Key change: a new `objc_msg_send` IR opcode bundles (recv, sel,
extra args) and carries the return type via the `Inst.ty` field.
emit_llvm.zig builds a per-call-site LLVM function type from the
argument Refs' IR types (recv/sel as ptr; extra args through
abiCoerceParamType) and dispatches with LLVMBuildCall2. One
declared `@objc_msgSend` symbol is reused across every return
type — opaque pointers make the function value type-erased, so
each call site picks its own ABI.
before: one (recv, sel) -> ptr LLVM declaration, hard-coded
per call site; only void return wired in 1.3.
after: same declaration, each call site provides a fresh
LLVMBuildCall2 fn-type → s64 / *void / bool / f64
returns all dispatch correctly without separate FuncIds.
Selector init mechanism: stayed with the @llvm.global_ctors
constructor. Investigated clang's
`__DATA,__objc_selrefs` + `externally_initialized` shape — works
for fully-linked binaries (dyld substitutes the SEL at load
time) but **LLVM ORC JIT** (the engine behind `sx run`) doesn't
process Mach-O Obj-C metadata sections, so the slot keeps its
initial value (the method-name string pointer) and dispatch
crashes with "<null selector>". The portable choice: keep the
constructor AND inject a direct call to it at `main`'s entry —
idempotent under dyld (sel_registerName returns the same SEL on
re-registration), required for ORC JIT.
Files touched:
src/ir/inst.zig | new ObjcMsgSend struct + opcode
src/ir/lower.zig | drop the void-only restriction; emit the
new opcode; remove the orphaned
getObjcMsgSendFid path (objc_msgSend
declaration moved to emit_llvm)
src/ir/emit_llvm.zig | objc_msg_send arm (per-call-site
LLVMBuildCall2); lazy `@objc_msgSend`
declaration via getObjcMsgSendValue;
emitObjcSelectorInit refactored to inject
the ctor call at main's entry
src/ir/{print,interp}.zig | switch arms for the new opcode
`ffi-objc-call-03-selector-sharing.ir` snapshot updates to
reflect the new shape (the `call ... @objc_msgSend` call sites
no longer mention a typed wrapper).
101/101 regression tests pass; the IR snapshot for the selector-
sharing test diff flips from four per-call `sel_registerName` calls
to two (one per unique selector) routed through a module-init
constructor — matching what clang emits for `@selector(...)`.
Hot-path cost collapses from a libobjc hashtable lookup per call to
a single load of a static `SEL*` slot:
Before (Phase 1.3):
%sel = call ptr @sel_registerName(<"init">)
call ptr @objc_msgSend(<recv>, %sel)
After (Phase 1.5):
%sel = load ptr, ptr @OBJC_SELECTOR_REFERENCES_init
call ptr @objc_msgSend(<recv>, %sel)
+ @OBJC_SELECTOR_REFERENCES_init = internal global ptr null
+ @OBJC_SELECTOR_REFERENCES_release = internal global ptr null
+ define internal void @__sx_objc_selector_init() {
+ %sel = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_)
+ store ptr %sel, ptr @OBJC_SELECTOR_REFERENCES_init
+ %sel1 = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_.2)
+ store ptr %sel1, ptr @OBJC_SELECTOR_REFERENCES_release
+ ret void
+ }
+ @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }]
+ [{ ..., ptr @__sx_objc_selector_init, ptr null }]
Implementation:
module.zig | new `objc_selector_cache: ArrayList(ObjcSelectorEntry)`
with `lookupObjcSelector` / `appendObjcSelector`. List
(not hashmap) keeps emit order stable across builds so
the IR snapshot doesn't flicker on rehash.
lower.zig | `internObjcSelector(sel)` creates the slot on first
use, returns the same `GlobalId` on every subsequent
call to the same selector. lowerFfiIntrinsicCall now
emits `global_addr + load` for literal selectors.
Non-literal selectors keep the `sel_registerName`
fallback. Declaring `sel_registerName` lazily on
first intern so emit_llvm finds it for the
constructor body.
emit_llvm.zig | new `emitObjcSelectorInit` pass synthesizes a void
constructor that loops over the cache, calls
`sel_registerName` for each unique selector string,
stores the result in the slot. Constructor is
registered in `@llvm.global_ctors` with default
priority (65535) so dyld runs it before main.
The `@OBJC_METH_VAR_NAME_` private string globals and unnamed-addr
flag match clang's exact emission shape — picked up by the system
linker into the right Mach-O sections on macOS / iOS. Chess
Android + iOS-sim still build clean (no `#objc_call` in chess yet —
phase-3 migration will start exercising this).
Foreign functions that return a >16-byte non-HFA aggregate (e.g.
Big24 / UIEdgeInsets on iOS / clang-shaped struct returns) need the
indirect-return ABI: caller allocates space, passes its pointer as a
hidden first arg with `sret(<T>)`, callee writes through it and
returns void. AAPCS64 puts the pointer in x8; SysV AMD64 puts it in
the first int register and treats the named return as void.
The existing >16-byte branch in `abiCoerceParamType` was returning
`ptr` for BOTH params and returns. That works for byval params (the
established pattern — caller alloca + store + pass ptr, callee loads
in prologue), but is wrong for returns: it caused the function decl
to look like `ptr @fn(...)` rather than `void @fn(ptr sret(<T>), ...)`,
and the call site read whatever happened to be in x0 as a struct
pointer — segfault on dereference (caught while writing the ffi-03
baseline).
Fix layered into the same `abiCoerceParamType` / call-site code path:
emitFunctionDecl:
- Compute `uses_sret = needs_c_abi && needsByval(ret_ty, raw_ret_ty)`.
- Ret type collapses to void.
- Prepend a `ptr` param at slot 0.
- Add `sret(<RetType>)` type attribute on param-index 1
(LLVMAttributeIndex 1 = first parameter; 0 = return value).
.call lowering:
- Detect callee_uses_sret via the same predicate.
- Allocate the result on the caller's stack (`sret.slot`).
- Prepend it as args[0] (with sret_off index alignment so the
original sx args land at args[1..]).
- After LLVMBuildCall2, set the same `sret(<T>)` attribute on
the call site's arg 1 (mirrors the fn-decl attribute — both
land in the AArch64 backend's lowering pass).
- Load the result from the slot to produce the IR value.
`call_indirect` (function-pointer dispatch — uikit.sx's typed
`objc_msgSend` casts) keeps its existing behavior for now; the iOS
path already round-trips UIEdgeInsets via that route. Folding the
same sret transform into call_indirect is a follow-up.
89/89 regression tests still pass. Chess Android + iOS-sim both
build clean.
Resolves issue-0036 (LLVM verifier failure on 16-byte integer-only
struct by value through #foreign). The mismatch:
Call parameter type does not match function signature!
%load = load { i64, i64 }, ptr %alloca, align 8
[2 x i64] %call = call [2 x i64] @fn({ i64, i64 } %load)
`abiCoerceParamType` had already chosen `[2 x i64]` for 9..16-byte
non-HFA structs (the AAPCS64 / SysV AMD64 register-pair ABI slot for
that size class) on the foreign-decl side, but `coerceArg` only knew
how to bridge struct<->integer (the ≤8 B case) — not struct<->array.
LLVM's verifier rejects type-mismatched call args, so the call site
never landed.
Added the symmetric branches in coerceArg:
- Struct -> Array : alloca <array>; store <struct>; load <array>
- Array -> Struct : alloca <array>; store <array>; load <struct>
Both use the LLVM opaque-pointer memory-bitcast pattern already in
place for the integer case. They're paired with the existing
i64 <-> small-struct bridge so all four (≤8 B int, 9..16 B int,
16 B HFA, >16 B byval) ABI slots round-trip cleanly through
emit_llvm now.
File mechanics: promotes the issue-0036 repro to a focused feature
example per CLAUDE.md's issue-resolution workflow:
examples/issue-0036.sx -> examples/101-ffi-medium-struct.sx
tests/expected/issue-0036.{txt,exit} -> tests/expected/101-ffi-medium-struct.{txt,exit}
vendors/issue_0036/issue_0036.c -> vendors/ffi_medium_struct/ffi_medium_struct.c
Snapshot updated to the passing output. 89/89 regression tests pass;
chess Android build still clean.
Protocol structs registered via registerProtocolDecl carry a new
is_protocol flag; the ?T paths in sizeOf/typeSizeBytes/toLLVMType
recognise it and lay out ?Protocol as the protocol struct itself
(ctx == null IS the "none" state), matching how ?Closure / ?*T are
sentinel-shaped — no extra storage.
Method dispatch on ?Protocol auto-unwraps in lowerCall's field-access
path; the unwrap is structurally a no-op so we just rebind obj_ty to
the payload type. resolveCallParamTypes extended for optional-protocol
receivers so enum-literal args (gpu.create_texture(.r8, ...)) get the
right target_type and don't silently collapse to tag=0 : s32 — same
issue-0031-class bug closed in Session 66, one type-system layer
deeper.
Library: UIRenderer / UIPipeline / GlyphCache migrated from the verbose
gpu: GPU = ---; has_gpu: bool pattern to gpu: ?GPU = null. set_gpu no
longer maintains a parallel bool flag.
Bundled: dock.sx threads delta_time as a struct field rather than via
a global pointer (cleanup unrelated to issue-0028, committed alongside).
Verified: 85/85 regression tests pass; iOS-sim chess + macOS chess
both render correctly post-migration.
Three stacked compiler bugs were causing iOS-sim chess to crash inside
[MTLTexture replaceRegion:...]. Fixing them lets every replaceRegion call
site succeed (1×1 RGBA8, 1MB R8 atlas, 440×440 chess pieces).
Path B for callconv(.c) fn-pointer casts:
- FunctionInfo now carries call_conv: CallConv (TypeInfo.CallConv) so
function-type interning distinguishes sx-CC from C-CC. Inst.zig's
Function.CallingConvention aliases the same enum.
- Parser accepts an optional `callconv(.c)` suffix on fn-pointer type
spellings (factored into parseOptionalCallConv() shared with parseFnDecl
and parseLambda).
- resolveFunctionType passes the parsed CC through functionTypeCC().
- .call_indirect reads fp.call_conv == .c and applies the C-ABI
alloca+materialize for >16B aggregate args (Path A's behaviour at .call).
Apple ARM64 ABI (drop LLVM byval):
- Side-by-side asm diff vs clang's emission for the equivalent C call site
showed LLVM's `byval` attribute lowers Apple-arm64 byval on the stack,
while clang passes the struct via a pointer in the next int register
(x2 for replaceRegion:). The runtime objc_msgSend dispatch path expects
clang's convention.
- Dropped the byval attribute from the function-signature emission and
from both call sites (.call and .call_indirect). The materialize-into-
alloca + pass-plain-ptr pattern stays — the call site now matches
clang's `mov x2, sp` exactly.
- Path A's sx-to-sx case continues to work since both ends use plain ptr
(caller does alloca+store+pass, callee loads from the ptr in prologue).
Protocol dispatch (emitProtocolDispatch):
- Untargeted `null` lowers as const_null with type .void (per
target_type orelse .void). The "wrap-value-in-alloca-pass-pointer"
branch alloca'd a void slot, which LLVM's IRBuilder asserts on —
EXC_BREAKPOINT in getTypeSizeInBits, manifesting as exit 133 / SIGTRAP
when building the chess game. Fixed by re-emitting as
constNull(void_ptr) when arg_ty == .void && expected_ty == void_ptr.
- is_pointer_ty only recognized .pointer, so [*]T (many_pointer) was
alloca-wrapped — the heap pixels pointer from stbi_load was stored
into a stack slot and the slot's address was passed as the *void arg.
Fixed by extending the check to `.pointer or .many_pointer`.
metal.sx call sites + lifecycle guards:
- msg_replace (replaceRegion:, MTLRegion = 48B) and the two setScissorRect:
sites (MTLScissorRect = 32B) now spell their fn-pointer types with
by-value params + callconv(.c) — the *MTLRegion/@local workaround is
gone.
- metal_begin_frame_ios bails before nextDrawable when pixel_w/h are 0
(drawableSize 0×0 makes nextDrawable abort via XPC).
- metal_init_ios only sets drawableSize when dims are positive.
- begin_frame's encoder/cmd_buffer failure paths now clear self.drawable
so a partial failure doesn't leak a drawable back into the pool.
Examples + tests:
- examples/86-callconv-c-fnptr-large-aggregate.sx — new, covers Path B
with C-CC fn-ptr cast.
- examples/87-fnptr-cast-large-aggregate.sx — renamed from issue-0025.sx,
covers Path B with default sx-CC (the negative case).
- examples/85-cc-c-large-aggregate.sx — from Session 60, covers Path A.
- examples/issue-0014.sx, issue-0024.sx, issue-0025.sx — removed
(resolved earlier this work).
71 regression tests pass, 0 failed. Chess game builds clean for iOS sim
and reaches its frame loop without aborting. Runtime: chess UI still
doesn't render — remaining issue is in the UIKit lifecycle / CAMetalLayer
setup (legacy-app vs scene-API hybrid), not a compiler bug. See
current/CHECKPOINT.md "Next step" for the diagnosis + options.
Phase 8 step 3a of the Metal renderer port:
- New library/modules/gpu/ with types.sx (handles + ClearColor +
TextureFormat enum), api.sx (GPU :: protocol { ... } covering the
lifecycle / per-frame / resource / per-draw surface), and metal.sx
(MetalGPU backend implementing the protocol against CAMetalLayer).
Resource handles are 1-based indices into backend List(*void) tables.
MTL aggregates >16 bytes (MTLRegion, MTLScissorRect) pass via *T to
match arm64 Apple's indirect-by-reference ABI; MTLClearColor + CGSize
go through the HFA path as direct fn-pointer casts on objc_msgSend.
- UIKitPlatform got a gpu_mode: GpuMode toggle + sibling SxMetalView
class registration. In metal mode init skips EAGL context, the
did_finish_launching IMP skips the EAGL drawable-properties dict,
layoutSubviews reads the layer's bounds * dpi_scale into pixel_w/h
instead of allocating a GL renderbuffer, and end_frame is a no-op
(the MetalGPU owns its own present).
- examples/63-metal-clear.sx verifies the pipeline end-to-end on iOS
sim — compiles a pass-through MSL shader (packed_float2/packed_float4
to avoid alignment padding), uploads 3 vertices, draws a colored
triangle on a dark-blue clear.
Compiler fixes (filed-and-fixed in this branch):
- inline if X { return E; } followed by a fall-through final expression
no longer emits two terminators into the same basic block. Verified
by examples/83-inline-if-return-fallthrough.sx.
- Top-level type alias Name :: u32; now resolves correctly as the type
annotation on a global variable (was treated as ptr {}, breaking
comparisons + initializers). Verified by examples/84-global-type-alias.sx.
Issue->feature promotion:
- 16 historical examples/issue-NNNN.sx repros now confirmed-fixed and
renamed to focused feature names (67-82). Each gains a
tests/expected/*.txt + .exit pair so the regression suite covers them.
- 5 stale issue repros deleted (subsumed by broader tests).
Regression suite: 68 passing, 0 failed. macOS chess builds + runs; wasm
chess builds; iOS sim GLES chess still renders the full board; iOS sim
Metal demo renders the triangle.