Commit Graph

17 Commits

Author SHA1 Message Date
agra
1d17b0abcf lang: introduce cstring — the C-boundary string (Odin model)
cstring is ONE pointer to a null-terminated u8 buffer, C's char*: thin
(8 bytes, no length; cstring_len walks to the terminator), crossing
#foreign boundaries verbatim in both directions, with ?cstring as the
nullable case lowering to the same bare pointer (null = absent).

Conversion discipline mirrors Odin: a string LITERAL coerces implicitly
(its bytes are terminated constants); any other string is rejected with
a diagnostic naming to_cstring (it may be an unterminated view); and
cstring never coerces to string implicitly — from_cstring(c) is the
explicit zero-copy view, pricing the strlen.

Plumbing: TypeId/TypeInfo builtin slot 18 (first_user 19), name
classifiers, size/align/name tables, LLVM ptr lowering, the ?T pointer
niche, the xx pointer ladder, the literal-gated coercion plan
(isConstString + data_ptr), and the reserved-spelling set. std gains
cstring_len/from_cstring/to_cstring (fmt.sx, re-exported); the old
cstring(size) allocator helper is renamed alloc_string everywhere;
getenv migrates to (name: cstring) -> ?cstring as the canonical user
and env() drops its manual strlen/memcpy.

Pinned: examples/1222 (FFI both directions, literal coercion,
?cstring null paths, round trip) and examples/1173 (both coercion
diagnostics); FAIL pre-feature. The alloc_string rename + getenv
signature shift the .ir snapshots — regenerated. zig build test
426/426; run_examples 604/604.

Spec: reserved spelling + cstring section + C-interop rows.
2026-06-12 14:50:53 +03:00
agra
d88bdd7242 fix(0128): foreign cstring returns + conflicting same-symbol bindings
Two genuine defects behind the 0128 filing (whose original repros were
both poisoned by binding getenv, which std already declares -> *u8):

1. Re-declaring a C symbol was silent first-wins: every call through
   the later declaration was typed by the older signature. Foreign
   registration now dedupes — equal signatures share one FuncId,
   conflicting ones are diagnosed.

2. Foreign -> string / -> ?string returns read garbage: C returns one
   char*, but the LLVM signature declared the fat {ptr,i64} (len =
   register garbage), and ?string was mis-declared SRET (the hidden
   out-pointer landed in the callee's first arg register). cstrRetKind
   now classifies such returns, declares them as plain ptr (never
   sret), and the call site synthesizes {ptr, strlen} via a
   branch-guarded strlen (NULL -> {null,0} / optional null), wrapping
   {string, i1} for ?string.

?[:0]u8 itself resolves fine (it is ?string); the spelling works in
return, param, local, and alias positions.

Regression: examples/1221 (plain + optional non-null + NULL paths) and
examples/1172 (conflict diagnostic); both FAIL pre-fix. The extern
dedupe collapses duplicate libc decls, so affected .ir snapshots were
regenerated. zig build test 426/426; run_examples 602/602;
distribution suite 21/21.
2026-06-12 14:13:01 +03:00
agra
d8076b9333 lang: rename signed integer types sN -> iN
Surface rename of the signed integer family: s1..s64 become i1..i64
(u1..u64, usize, isize unchanged). 'string' keeps the s-prefix arm in
name classification; width parsing moves to the i-prefix arm next to
isize.

Internal TypeId tags follow the surface (.s8/.s16/.s32/.s64 ->
.i8/.i16/.i32/.i64), as do mono-key mangle fragments (ptr_i64,
tu_i64_bool) and all display/diagnostic formatting (i{d}).

Migrated in the same sweep: stdlib + examples + issue repros + FFI C
companions (shared symbol names like ffi_id_i64), expected
stdout/stderr/ir snapshots, specs.md, readme.md, CLAUDE.md/AGENTS.md,
implementation_plan.md, docs/, issue writeups. Vendored stb_image and
historical flow state left untouched.

zig build test: 426/426; examples suite: 595/595.
2026-06-12 09:31:53 +03:00
agra
837b5d375f fix(0124): large stack arrays lower to in-place access, not first-class values
Two lowering sites materialized a local array as a whole LLVM value;
the legalizer scalarizes each such op into one SelectionDAG node per
element, and at ~64K elements the DAG combiner segfaults
(DAGCombiner::visitMERGE_VALUES → ReplaceAllUsesWith).

- lowerVarDecl: an array-typed `---` initializer emits NO store — the
  slot stays uninitialized instead of receiving a whole-array undef
  store. The tuple zero-init carve-out stays; non-array `---` keeps
  the undef store. The interp is unchanged either way (slots start
  .undef).
- lowerIndexExpr: element reads on an array with addressable storage
  GEP the storage and load one element — the general-expression
  sibling of 0110's lowerFor fix — without value-lowering the object
  (a dead whole-array load would still reach the DAG). Storage-less
  arrays keep the index_get fallback.

Sibling shape filed as 0125: any_to_string's per-array-type arms still
pass the array by value, so a 64K+ array type + any {} print crashes.

Regression: examples/0055-basic-large-stack-array.sx (sx build
segfaulted pre-fix). 22 .ir snapshots re-pinned: removed undef stores
and ig.tmp spills, in-place gep+load (instruction-shape-only churn,
reviewed).
2026-06-12 08:19:20 +03:00
agra
b3b78e25c0 std: organise the facade — thematic groups, aligned columns
Same 47 re-exports, regrouped: compiler-resolved types, type-system/
reflection builtins, output + libc escape hatch, formatting (print/
format up front, the *_to_string family together), string ops +
allocation helpers, fmt internals marked as legacy-surface, List, tail.
Zero output-stream diffs; 37 .ir snapshots re-pinned (renumbering from
decl reordering). Gates: suite 588/588, zig build test 0, m3te 23/23,
game builds + bundles.
2026-06-11 21:46:03 +03:00
agra
c75cd9c63c std: drop the redundant flat mem.sx import from the facade
The flat #import of mem.sx predated the namespace tail — the tail's
mem :: #import already puts mem.sx in the program graph, which is all
the ufcs helpers (context.allocator.create/alloc/free/clone) and the
CAllocator default-context machinery need; std.sx itself references no
mem name. Probe-verified the full mem surface + all gates: suite
588/588, zig build test 0, m3te 23/23, game builds + bundles. The
double import was also duplicating lowered IR — the 37 re-pinned .ir
snapshots net ~2.5k lines smaller; output streams byte-identical.
2026-06-11 21:25:32 +03:00
agra
49a36bb492 std: the prelude becomes a pure re-export facade — implementations move to std/core.sx, std/fmt.sx, std/list.sx
std.sx now contains only alias declarations (the re-export mechanism:
own decls carry one flat-import level) over three part-files: core.sx
(builtins, libc escape hatch, Source_Location/Allocator/Context/Into,
the reserved `string` decl — which needs and permits no alias), fmt.sx
(print/format/any_to_string/string ops/cstring/alloc_slice), list.sx
(List). The namespace tail is unchanged; the part-file namespaces
(core/fmt/list) carry alongside it. Consumer surface is byte-identical
— every bare prelude name resolves through the aliases (0120/0121
machinery). 37 .ir snapshots re-pinned: pure string-constant
renumbering from the changed import graph (digit-normalized diff is
empty). Gates: zig build test 426/426, suite 588/588, m3te 23/23,
game SxChess builds + bundles.
2026-06-11 19:25:49 +03:00
agra
51194a26d8 mem: BufAlloc.init returns the state by value — full buffer usable, no header carve 2026-06-11 17:31:20 +03:00
agra
84e0fb0752 mem: typed allocation helpers + drop bare malloc/free (Phase 2.2); resolve 0119 as |>-contract clarification 2026-06-11 16:17:39 +03:00
agra
88bae3c9f5 mem: rename Allocator primitives to alloc_bytes/dealloc_bytes (Phase 4 naming pulled forward, Agra-approved) 2026-06-11 15:33:35 +03:00
agra
8908e78943 lang: array-typed '::' consts as immutable globals (PLAN-CONST-AGG step 1)
K : [4]s64 : .[...] and the untyped A :: .[1, 2, 3] register as
is_const globals: one storage, reads GEP it, dead-global elimination
drops unused ones, source-aware reads come free via selectGlobalAuthor.

- registerConstArrayGlobal (scanDecls pass 2): typed via the annotation
  (array-ness + dimension/count checked), untyped via element-type
  unification — all ints s64; ANY float promotes the element type to
  f64 with ints converting exactly; bool/string homogeneous; a
  non-numeric mix or non-inferable element asks for an annotation.
- constExprValue converts int elements into float destinations exactly
  (the int+float promotion rule, element-wise).
- emitGlobals marks is_const globals LLVMSetGlobalConstant — also flips
  the comptime-backed #run globals and __sx_default_context to
  'constant' (37 pinned IR snapshots regenerated; runtime unchanged).
- Element shapes: nested arrays, struct elements, strings, bools.
  Non-constant elements / dim mismatch / mixed types diagnose loudly.

Examples: 0177 (feature matrix incl. @K reads through *[4]s64 — needs
the 0117 fix), 1159/1160/1161 (diagnostics), 0837 repointed to values.
2026-06-11 12:26:26 +03:00
agra
330c3aeef7 std: full namespace tail — fs/process/socket/json/cli/hash/test
With 0115's own-wins globals landed, the remaining tail modules join
std.sx: every '#import "modules/std.sx"' now carries mem/xml/log/fs/
process/socket/json/cli/hash/test as namespaces (trace stays a direct
import).

Enablers in the same change:
- emit: dead-global elimination — a plain-data global no instruction
  references is not emitted, so tail modules' data (hash's 64-entry K
  table, OS/ARCH/POINTER_SIZE) stays out of binaries that don't use it.
  Comptime-backed globals keep their #run evaluation. 37 pinned IR
  snapshots regenerated (dead globals dropped + string renumbering from
  the larger module).
- 1055/1056 stop pinning the global error-tag ordinal (it shifts with
  program composition); they assert nonzero + tag identity + name.
- specs/readme/CLAUDE.md tail docs updated.
2026-06-11 10:49:39 +03:00
agra
59f0aa7716 std: restructure — std/ modules, namespace tail, std/xml.sx
allocators/fs/process/socket/log/trace/test move under modules/std/
(allocators.sx becomes std/mem.sx; the Allocator protocol moves into
the std.sx prelude, impls stay in mem.sx). New std/xml.sx holds
xml_escape as xml.escape. std.sx gains the carried namespace tail —
flat-importing std.sx now also provides mem./xml./log. — with the
remaining modules (fs/process/socket/json/cli/hash/test) deferred from
the tail until the global last-wins maps are fully own-wins (pulling
them into every closure collides bare names corpus-wide; they stay
direct imports: modules/std/fs.sx etc.). log.sx's internal emit
renamed log_emit (it clobbered consumer fns named emit program-wide).
bundle.sx uses xml.escape via the carried alias. Consumer import paths
swept mechanically; .ir snapshots recaptured for the larger std
closure. m3te + game build unchanged.
2026-06-11 06:10:59 +03:00
agra
878c4226a6 fix(0109): hoist all per-instruction allocas to the function entry block
An alloca built at its use site re-executes on every pass through that
block, and LLVM reclaims allocas only at ret — so loop-body locals,
nested-loop index slots, and emitter spill temps (ig.tmp, sret slots, ABI
coercion temps, byval materialization) grew the stack per iteration and
long loops segfaulted on stack exhaustion.

New LLVMEmitter.buildEntryAlloca inserts after existing entry-block
allocas and restores the builder position; every LLVMBuildAlloca site
reachable during instruction emission now routes through it.
Initialization stores stay at the use site (per-iteration re-init is
unchanged), and entry slots become mem2reg-promotable. The 35 .ir
snapshot diffs are pure alloca position moves (type multisets verified
identical per file).

Regression: examples/0047-basic-loop-local-stack-reuse.sx (segfaulted
pre-fix on both the 1M-iteration body-local loop and the 3M-iteration
nested loop).
2026-06-10 17:27:11 +03:00
agra
5f64ee4426 fix(ir): reflection builtins on an Any read its runtime tag, not payload [F0.8]
`type_name` / `type_is_unsigned` on an `Any` argument unconditionally read
the Any's payload as a TypeId index. That is correct only when the Any holds
a Type value (`{ .any, tid }`); for an Any holding a runtime *value*
(`av : Any = 6`, tag s64, payload 6) it returned `types[6]` — `type_name(av)`
gave "u8" and `type_is_unsigned(av)` gave true.

Both backends now branch on the Any's runtime type-tag: tag == `.any` → the
box is a Type value, use the payload as the TypeId; otherwise the tag IS the
held value's type. So `type_name(av)` → "s64", `type_is_unsigned(av)` → false,
while `type_name(type_of(x))` still names the held type. The `{}` formatter is
unchanged (it already passed `type_of(val)`, a proper Type value).

- src/ir/interp.zig: shared `Value.reflectTypeId` tag-branching resolver; the
  `type_name` / `type_is_unsigned` interp arms route through it.
- src/backend/llvm/ops.zig: shared `Ops.reflectArgTypeId` emits
  extractvalue-tag / icmp-eq-.any / select for the runtime path; both
  reflection arms route through it. The two backends agree.
- examples/0164-types-reflection-any-tag.sx: regression pinning type_name /
  type_is_unsigned / print on an Any holding a value vs a Type.
- src/ir/interp.test.zig: unit test for `reflectTypeId`.
- 22 .ir snapshots: the new select appears in every std-importing program's
  IR (any_to_string embeds these builtins) — benign, verified structurally
  identical apart from the three new instructions.
- issues/0090, specs.md: documented the Any-tag rule.
2026-06-05 12:09:52 +03:00
agra
64f77e9779 fix(std): render integer formatter extremes — i64::MIN and unsigned all-ones [F0.8]
Resolves issue 0090. The `{}` integer formatter mis-rendered both ends of
the 64-bit range:

- `int_to_string` computed the magnitude as `0 - n`, which overflows for
  `s64::MIN` (its magnitude is unrepresentable as a positive s64) — the
  value stayed negative, the digit loop ran zero times, so only `-`
  printed. It now extracts digits straight from `n` (per-digit
  `|n % 10|`, `n` truncating toward zero), never negating MIN.

- `any_to_string`'s `case int:` formatted every integer as s64, so a u64
  all-ones value printed as `-1`. There was no `uint` type-category to
  distinguish signedness. Added an additive `type_is_unsigned(T)`
  reflection builtin (static fold + dynamic interp/LLVM paths, mirroring
  `type_name`), backed by the new `TypeTable.isUnsignedInt` predicate, and
  a `uint_to_string` formatter (unsigned decimal via long-division over
  four 16-bit limbs). `case int:` routes through `type_is_unsigned(type)`.

The 16-bit-limb split is factored into a shared `decompose_u16x4`, now
reused by `int_to_hex_string` (no second unsigned-math routine).

Regression: examples/0046-basic-int-formatter-extremes pins both extremes
plus a width spread; unit tests cover `isUnsignedInt`. Docs (specs.md
representation note, readme std API) updated for unsigned/extreme `{}`
behavior. IR snapshots refreshed for the two new std functions.
2026-06-05 09:05:37 +03:00
agra
df386a422e test(ir): lock protocol/impl lookup before A4.2 extraction (A4.2 scaffolding step 1)
Test-first scaffolding ahead of extracting src/ir/protocols.zig — no code change
to the refactor targets (registerProtocolDecl / registerImplBlock /
registerParamImpl / hasImplPlain / tryUserConversion / tryPackImplMatch /
createProtocolThunk / buildProtocolValue).

Adds 4 .ir snapshots (only 0400 existed for 04xx), each captured surgically via
`sx ir | normalize_ir`, path-free, idempotent, and print-free at IR-gen time
(the 0524 contamination lesson):
- 0413-protocols-parameterized-protocol-value  parameterized protocol
                                               (registerParamImpl + tryUserConversion)
- 0414-protocols-generic-struct-protocol-erase generic-struct erasure
                                               (createProtocolThunk + buildProtocolValue)
- 0416-protocols-auto-type-erasure             auto erasure (buildProtocolValue + thunk)
- 0528-packs-protocol-pack-methods             pack-variadic impl (tryPackImplMatch)

With existing 0400 (impl-for-builtin) they pin erasure (auto/generic/builtin) +
parameterized + pack-variadic + dispatch; the 0410/0411/0412 runtime anchors
already pin cross-module visibility + duplicate-impl rejections.

Adds 1 unit test via the public surface (no new exposure, mirroring A4.1
sub-step 1): registerProtocolDecl -> getProtocolInfo builds the dispatch method
table (method names, param_types with self excluded, concrete vs Self return
with ret_is_self + *void encoding). The impl-lookup / conversion plan-object
tests (hasImplPlain, tryUserConversion, tryPackImplMatch — private today) land
with the registry in sub-step 2.

zig build, zig build test, tests/run_examples.sh (357/0) all green.
2026-06-02 21:44:01 +03:00