Commit Graph

70 Commits

Author SHA1 Message Date
agra
e388687f1a ffi 1.8b: sret transform for #objc_call(>16 B non-HFA struct)
104/104 regression tests pass. The Triple round-trip
(triple_imp writes {11, 22, 33} on the IMP side → #objc_call(Triple)
reads them back) is the test of record.

emit_llvm.zig changes:

1. `objc_msg_send` arm — when `needsByval(ret_ty)` (same predicate
   the plain-foreign-call path uses), apply the sret transform:
     - ret type collapses to void
     - prepend a `ptr` param at index 0 (call site provides an
       alloca slot)
     - mirror `sret(<RetType>)` on the call site so the AArch64 x8
       / SysV-AMD64 hidden-ptr ABI lowers correctly
     - load the result from the slot post-call
   The IR shape now matches clang exactly:
     call void @objc_msgSend(ptr sret({...}) %slot, ptr %recv, ptr %sel)

2. `.ret` arm — the body-side counterpart for sx fns whose declared
   return type is sret-shaped (sx-defined IMPs registered via
   `class_addMethod` produce these). When the current function's
   `needsByval(func.ret)` predicate holds, store the IR ret value
   through the prepended sret slot (param 0) and emit `ret void`.
   Previously the unconditional coerceArg path turned the struct
   value into `undef` and emitted `ret void undef` — illegal LLVM.

Test mechanics: registers `SxTripleProbe : NSObject` at runtime via
`objc_allocateClassPair` + `class_addMethod`, IMP returns
Triple{11, 22, 33}. `#objc_call(Triple)(instance, "tripleValue")`
gets them back, round-trip pinned in the .txt snapshot and the
IR-shape snapshot.
2026-05-19 18:50:26 +03:00
agra
d43385112c ffi 1.6: objc_msg_send IR opcode + per-call-site LLVM fn type
102/102 regression tests pass; chess Android + iOS-sim still build
clean. `ffi-objc-call-04-primitive-returns` flips from xfail to
passing with both nil-recv and real-recv flavors of *void / s64
returns exercised.

Key change: a new `objc_msg_send` IR opcode bundles (recv, sel,
extra args) and carries the return type via the `Inst.ty` field.
emit_llvm.zig builds a per-call-site LLVM function type from the
argument Refs' IR types (recv/sel as ptr; extra args through
abiCoerceParamType) and dispatches with LLVMBuildCall2. One
declared `@objc_msgSend` symbol is reused across every return
type — opaque pointers make the function value type-erased, so
each call site picks its own ABI.

  before:  one (recv, sel) -> ptr LLVM declaration, hard-coded
           per call site; only void return wired in 1.3.
  after:   same declaration, each call site provides a fresh
           LLVMBuildCall2 fn-type → s64 / *void / bool / f64
           returns all dispatch correctly without separate FuncIds.

Selector init mechanism: stayed with the @llvm.global_ctors
constructor. Investigated clang's
`__DATA,__objc_selrefs` + `externally_initialized` shape — works
for fully-linked binaries (dyld substitutes the SEL at load
time) but **LLVM ORC JIT** (the engine behind `sx run`) doesn't
process Mach-O Obj-C metadata sections, so the slot keeps its
initial value (the method-name string pointer) and dispatch
crashes with "<null selector>". The portable choice: keep the
constructor AND inject a direct call to it at `main`'s entry —
idempotent under dyld (sel_registerName returns the same SEL on
re-registration), required for ORC JIT.

Files touched:
  src/ir/inst.zig    | new ObjcMsgSend struct + opcode
  src/ir/lower.zig   | drop the void-only restriction; emit the
                       new opcode; remove the orphaned
                       getObjcMsgSendFid path (objc_msgSend
                       declaration moved to emit_llvm)
  src/ir/emit_llvm.zig | objc_msg_send arm (per-call-site
                       LLVMBuildCall2); lazy `@objc_msgSend`
                       declaration via getObjcMsgSendValue;
                       emitObjcSelectorInit refactored to inject
                       the ctor call at main's entry
  src/ir/{print,interp}.zig | switch arms for the new opcode

`ffi-objc-call-03-selector-sharing.ir` snapshot updates to
reflect the new shape (the `call ... @objc_msgSend` call sites
no longer mention a typed wrapper).
2026-05-19 18:39:10 +03:00
agra
b8a412ddc7 ffi 1.5: intern Obj-C selectors — one static SEL slot per unique name
101/101 regression tests pass; the IR snapshot for the selector-
sharing test diff flips from four per-call `sel_registerName` calls
to two (one per unique selector) routed through a module-init
constructor — matching what clang emits for `@selector(...)`.

Hot-path cost collapses from a libobjc hashtable lookup per call to
a single load of a static `SEL*` slot:

  Before (Phase 1.3):
    %sel = call ptr @sel_registerName(<"init">)
    call ptr @objc_msgSend(<recv>, %sel)

  After (Phase 1.5):
    %sel = load ptr, ptr @OBJC_SELECTOR_REFERENCES_init
    call ptr @objc_msgSend(<recv>, %sel)

  +  @OBJC_SELECTOR_REFERENCES_init    = internal global ptr null
  +  @OBJC_SELECTOR_REFERENCES_release = internal global ptr null
  +  define internal void @__sx_objc_selector_init() {
  +    %sel  = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_)
  +    store ptr %sel, ptr @OBJC_SELECTOR_REFERENCES_init
  +    %sel1 = call ptr @sel_registerName(ptr @OBJC_METH_VAR_NAME_.2)
  +    store ptr %sel1, ptr @OBJC_SELECTOR_REFERENCES_release
  +    ret void
  +  }
  +  @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }]
  +    [{ ..., ptr @__sx_objc_selector_init, ptr null }]

Implementation:
  module.zig    | new `objc_selector_cache: ArrayList(ObjcSelectorEntry)`
                  with `lookupObjcSelector` / `appendObjcSelector`. List
                  (not hashmap) keeps emit order stable across builds so
                  the IR snapshot doesn't flicker on rehash.
  lower.zig     | `internObjcSelector(sel)` creates the slot on first
                  use, returns the same `GlobalId` on every subsequent
                  call to the same selector. lowerFfiIntrinsicCall now
                  emits `global_addr + load` for literal selectors.
                  Non-literal selectors keep the `sel_registerName`
                  fallback. Declaring `sel_registerName` lazily on
                  first intern so emit_llvm finds it for the
                  constructor body.
  emit_llvm.zig | new `emitObjcSelectorInit` pass synthesizes a void
                  constructor that loops over the cache, calls
                  `sel_registerName` for each unique selector string,
                  stores the result in the slot. Constructor is
                  registered in `@llvm.global_ctors` with default
                  priority (65535) so dyld runs it before main.

The `@OBJC_METH_VAR_NAME_` private string globals and unnamed-addr
flag match clang's exact emission shape — picked up by the system
linker into the right Mach-O sections on macOS / iOS. Chess
Android + iOS-sim still build clean (no `#objc_call` in chess yet —
phase-3 migration will start exercising this).
2026-05-19 13:09:34 +03:00
agra
7fd6decdc9 emit_llvm: sret return for >16-byte aggregate foreign returns
Foreign functions that return a >16-byte non-HFA aggregate (e.g.
Big24 / UIEdgeInsets on iOS / clang-shaped struct returns) need the
indirect-return ABI: caller allocates space, passes its pointer as a
hidden first arg with `sret(<T>)`, callee writes through it and
returns void. AAPCS64 puts the pointer in x8; SysV AMD64 puts it in
the first int register and treats the named return as void.

The existing >16-byte branch in `abiCoerceParamType` was returning
`ptr` for BOTH params and returns. That works for byval params (the
established pattern — caller alloca + store + pass ptr, callee loads
in prologue), but is wrong for returns: it caused the function decl
to look like `ptr @fn(...)` rather than `void @fn(ptr sret(<T>), ...)`,
and the call site read whatever happened to be in x0 as a struct
pointer — segfault on dereference (caught while writing the ffi-03
baseline).

Fix layered into the same `abiCoerceParamType` / call-site code path:

  emitFunctionDecl:
    - Compute `uses_sret = needs_c_abi && needsByval(ret_ty, raw_ret_ty)`.
    - Ret type collapses to void.
    - Prepend a `ptr` param at slot 0.
    - Add `sret(<RetType>)` type attribute on param-index 1
      (LLVMAttributeIndex 1 = first parameter; 0 = return value).

  .call lowering:
    - Detect callee_uses_sret via the same predicate.
    - Allocate the result on the caller's stack (`sret.slot`).
    - Prepend it as args[0] (with sret_off index alignment so the
      original sx args land at args[1..]).
    - After LLVMBuildCall2, set the same `sret(<T>)` attribute on
      the call site's arg 1 (mirrors the fn-decl attribute — both
      land in the AArch64 backend's lowering pass).
    - Load the result from the slot to produce the IR value.

`call_indirect` (function-pointer dispatch — uikit.sx's typed
`objc_msgSend` casts) keeps its existing behavior for now; the iOS
path already round-trips UIEdgeInsets via that route. Folding the
same sret transform into call_indirect is a follow-up.

89/89 regression tests still pass. Chess Android + iOS-sim both
build clean.
2026-05-19 11:40:54 +03:00
agra
7d2c2fb062 emit_llvm: bridge struct<->array ABI for 9..16-byte foreign structs
Resolves issue-0036 (LLVM verifier failure on 16-byte integer-only
struct by value through #foreign). The mismatch:

  Call parameter type does not match function signature!
    %load = load { i64, i64 }, ptr %alloca, align 8
  [2 x i64]  %call = call [2 x i64] @fn({ i64, i64 } %load)

`abiCoerceParamType` had already chosen `[2 x i64]` for 9..16-byte
non-HFA structs (the AAPCS64 / SysV AMD64 register-pair ABI slot for
that size class) on the foreign-decl side, but `coerceArg` only knew
how to bridge struct<->integer (the ≤8 B case) — not struct<->array.
LLVM's verifier rejects type-mismatched call args, so the call site
never landed.

Added the symmetric branches in coerceArg:
  - Struct -> Array : alloca <array>; store <struct>; load <array>
  - Array -> Struct : alloca <array>; store <array>;  load <struct>

Both use the LLVM opaque-pointer memory-bitcast pattern already in
place for the integer case. They're paired with the existing
i64 <-> small-struct bridge so all four (≤8 B int, 9..16 B int,
16 B HFA, >16 B byval) ABI slots round-trip cleanly through
emit_llvm now.

File mechanics: promotes the issue-0036 repro to a focused feature
example per CLAUDE.md's issue-resolution workflow:

  examples/issue-0036.sx              -> examples/101-ffi-medium-struct.sx
  tests/expected/issue-0036.{txt,exit} -> tests/expected/101-ffi-medium-struct.{txt,exit}
  vendors/issue_0036/issue_0036.c     -> vendors/ffi_medium_struct/ffi_medium_struct.c

Snapshot updated to the passing output. 89/89 regression tests pass;
chess Android build still clean.
2026-05-19 11:31:04 +03:00
agra
79419b99bd issue-0028: ?Protocol = null sentinel-shaped optional protocols
Protocol structs registered via registerProtocolDecl carry a new
is_protocol flag; the ?T paths in sizeOf/typeSizeBytes/toLLVMType
recognise it and lay out ?Protocol as the protocol struct itself
(ctx == null IS the "none" state), matching how ?Closure / ?*T are
sentinel-shaped — no extra storage.

Method dispatch on ?Protocol auto-unwraps in lowerCall's field-access
path; the unwrap is structurally a no-op so we just rebind obj_ty to
the payload type. resolveCallParamTypes extended for optional-protocol
receivers so enum-literal args (gpu.create_texture(.r8, ...)) get the
right target_type and don't silently collapse to tag=0 : s32 — same
issue-0031-class bug closed in Session 66, one type-system layer
deeper.

Library: UIRenderer / UIPipeline / GlyphCache migrated from the verbose
gpu: GPU = ---; has_gpu: bool pattern to gpu: ?GPU = null. set_gpu no
longer maintains a parallel bool flag.

Bundled: dock.sx threads delta_time as a struct field rather than via
a global pointer (cleanup unrelated to issue-0028, committed alongside).

Verified: 85/85 regression tests pass; iOS-sim chess + macOS chess
both render correctly post-migration.
2026-05-18 18:32:55 +03:00
agra
f9ecf9d00e iOS lock step keyboard + metal 2026-05-18 17:40:10 +03:00
agra
63565e41ff abi: pass >16B aggregates by ptr-in-next-reg (Apple ARM64 ABI) + Path B for fn-ptr casts
Three stacked compiler bugs were causing iOS-sim chess to crash inside
[MTLTexture replaceRegion:...]. Fixing them lets every replaceRegion call
site succeed (1×1 RGBA8, 1MB R8 atlas, 440×440 chess pieces).

Path B for callconv(.c) fn-pointer casts:
- FunctionInfo now carries call_conv: CallConv (TypeInfo.CallConv) so
  function-type interning distinguishes sx-CC from C-CC. Inst.zig's
  Function.CallingConvention aliases the same enum.
- Parser accepts an optional `callconv(.c)` suffix on fn-pointer type
  spellings (factored into parseOptionalCallConv() shared with parseFnDecl
  and parseLambda).
- resolveFunctionType passes the parsed CC through functionTypeCC().
- .call_indirect reads fp.call_conv == .c and applies the C-ABI
  alloca+materialize for >16B aggregate args (Path A's behaviour at .call).

Apple ARM64 ABI (drop LLVM byval):
- Side-by-side asm diff vs clang's emission for the equivalent C call site
  showed LLVM's `byval` attribute lowers Apple-arm64 byval on the stack,
  while clang passes the struct via a pointer in the next int register
  (x2 for replaceRegion:). The runtime objc_msgSend dispatch path expects
  clang's convention.
- Dropped the byval attribute from the function-signature emission and
  from both call sites (.call and .call_indirect). The materialize-into-
  alloca + pass-plain-ptr pattern stays — the call site now matches
  clang's `mov x2, sp` exactly.
- Path A's sx-to-sx case continues to work since both ends use plain ptr
  (caller does alloca+store+pass, callee loads from the ptr in prologue).

Protocol dispatch (emitProtocolDispatch):
- Untargeted `null` lowers as const_null with type .void (per
  target_type orelse .void). The "wrap-value-in-alloca-pass-pointer"
  branch alloca'd a void slot, which LLVM's IRBuilder asserts on —
  EXC_BREAKPOINT in getTypeSizeInBits, manifesting as exit 133 / SIGTRAP
  when building the chess game. Fixed by re-emitting as
  constNull(void_ptr) when arg_ty == .void && expected_ty == void_ptr.
- is_pointer_ty only recognized .pointer, so [*]T (many_pointer) was
  alloca-wrapped — the heap pixels pointer from stbi_load was stored
  into a stack slot and the slot's address was passed as the *void arg.
  Fixed by extending the check to `.pointer or .many_pointer`.

metal.sx call sites + lifecycle guards:
- msg_replace (replaceRegion:, MTLRegion = 48B) and the two setScissorRect:
  sites (MTLScissorRect = 32B) now spell their fn-pointer types with
  by-value params + callconv(.c) — the *MTLRegion/@local workaround is
  gone.
- metal_begin_frame_ios bails before nextDrawable when pixel_w/h are 0
  (drawableSize 0×0 makes nextDrawable abort via XPC).
- metal_init_ios only sets drawableSize when dims are positive.
- begin_frame's encoder/cmd_buffer failure paths now clear self.drawable
  so a partial failure doesn't leak a drawable back into the pool.

Examples + tests:
- examples/86-callconv-c-fnptr-large-aggregate.sx — new, covers Path B
  with C-CC fn-ptr cast.
- examples/87-fnptr-cast-large-aggregate.sx — renamed from issue-0025.sx,
  covers Path B with default sx-CC (the negative case).
- examples/85-cc-c-large-aggregate.sx — from Session 60, covers Path A.
- examples/issue-0014.sx, issue-0024.sx, issue-0025.sx — removed
  (resolved earlier this work).

71 regression tests pass, 0 failed. Chess game builds clean for iOS sim
and reaches its frame loop without aborting. Runtime: chess UI still
doesn't render — remaining issue is in the UIKit lifecycle / CAMetalLayer
setup (legacy-app vs scene-API hybrid), not a compiler bug. See
current/CHECKPOINT.md "Next step" for the diagnosis + options.
2026-05-18 00:11:23 +03:00
agra
a938c4f900 metal: GPU protocol + MetalGPU renders MSL triangle on iOS
Phase 8 step 3a of the Metal renderer port:

- New library/modules/gpu/ with types.sx (handles + ClearColor +
  TextureFormat enum), api.sx (GPU :: protocol { ... } covering the
  lifecycle / per-frame / resource / per-draw surface), and metal.sx
  (MetalGPU backend implementing the protocol against CAMetalLayer).
  Resource handles are 1-based indices into backend List(*void) tables.
  MTL aggregates >16 bytes (MTLRegion, MTLScissorRect) pass via *T to
  match arm64 Apple's indirect-by-reference ABI; MTLClearColor + CGSize
  go through the HFA path as direct fn-pointer casts on objc_msgSend.

- UIKitPlatform got a gpu_mode: GpuMode toggle + sibling SxMetalView
  class registration. In metal mode init skips EAGL context, the
  did_finish_launching IMP skips the EAGL drawable-properties dict,
  layoutSubviews reads the layer's bounds * dpi_scale into pixel_w/h
  instead of allocating a GL renderbuffer, and end_frame is a no-op
  (the MetalGPU owns its own present).

- examples/63-metal-clear.sx verifies the pipeline end-to-end on iOS
  sim — compiles a pass-through MSL shader (packed_float2/packed_float4
  to avoid alignment padding), uploads 3 vertices, draws a colored
  triangle on a dark-blue clear.

Compiler fixes (filed-and-fixed in this branch):

- inline if X { return E; } followed by a fall-through final expression
  no longer emits two terminators into the same basic block. Verified
  by examples/83-inline-if-return-fallthrough.sx.

- Top-level type alias Name :: u32; now resolves correctly as the type
  annotation on a global variable (was treated as ptr {}, breaking
  comparisons + initializers). Verified by examples/84-global-type-alias.sx.

Issue->feature promotion:

- 16 historical examples/issue-NNNN.sx repros now confirmed-fixed and
  renamed to focused feature names (67-82). Each gains a
  tests/expected/*.txt + .exit pair so the regression suite covers them.

- 5 stale issue repros deleted (subsumed by broader tests).

Regression suite: 68 passing, 0 failed. macOS chess builds + runs; wasm
chess builds; iOS sim GLES chess still renders the full board; iOS sim
Metal demo renders the triangle.
2026-05-17 19:36:37 +03:00
agra
f9dda972d2 fixes 2026-03-05 16:20:36 +02:00
agra
22bc2439ce fixes 2026-03-04 17:12:56 +02:00
agra
67e02a20a5 ... 2026-03-04 09:18:24 +02:00
agra
004aff5f67 wasm shell + destructuring 2026-03-03 13:21:54 +02:00
agra
03074472e5 build options #compiler 2026-03-03 09:35:50 +02:00
agra
bbb5426777 sm 2026-03-02 21:00:55 +02:00
agra
2f4f898d54 asm... 2026-03-02 17:19:41 +02:00
agra
ba9c4d69ce wasm 2026-03-02 09:49:43 +02:00
agra
f763765ea2 ir done'ish 2026-03-01 22:38:41 +02:00
agra
6a920dbd2c ir 2026-02-28 18:03:38 +02:00
agra
2552882ce6 05 2026-02-26 14:46:21 +02:00