feat(asm): Phase F — global (module-scope) asm

A top-level `asm { "tmpl", };` block (template only) lowers to LLVM `module asm`;
a lib-less `extern` declaration calls into the symbols it defines (the import
direction reuses the existing C-FFI extern path — no new surface).

- ast.zig: asm_global node (AsmGlobal { template }).
- parser.zig: parseAsmGlobal, dispatched from parseTopLevel on kw_asm — rejects
  `volatile` and any operands/clobbers (template only). The in-function asm
  expression form stays in parsePrimary.
- module.zig: Module.global_asm list; lower/decl.zig captures each template in
  lowerMainAndComptime (the real top-level pass — lowerDecls is dead for
  top-level); emit_llvm.zig emit() appends each via LLVMAppendModuleInlineAsm in
  source order.
- the new node forced asm_global arms in sema.zig (analyzeNode +
  findNodeAtOffset) and semantic_diagnostics.zig (checkBindingNames).

Verified end-to-end: an aarch64 `_my_add` global routine, called via `extern`,
returns 42 — AOT only (the ORC JIT doesn't link module-asm symbols; global-asm
symbols live in the final linked binary). Locked with 1648-platform-asm-global
({ "aot": true, "target": "macos" } → AOT build+run on aarch64, ir-only else).
zig build test green (656 corpus, 446 unit).
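
For readers more familiar with C, the mechanism has a close analogue in file-scope (basic) `asm`. The following is a minimal sketch and not part of this commit. It assumes GCC or Clang, an aarch64 or x86_64 target with the System V style argument registers, and a hypothetical `SYM` helper macro that adds the leading underscore Mach-O expects (which is why the sx example defines `_my_add` but declares `my_add`).

```c
#include <stdint.h>

/* Hypothetical helper: Mach-O prefixes C symbols with '_', ELF does not. */
#if defined(__APPLE__)
#define SYM(s) "_" #s
#else
#define SYM(s) #s
#endif

/* File-scope asm: template only, no operands, no clobbers, and no '%'
 * substitution, so register names are written literally. */
#if defined(__aarch64__)
__asm__(
    ".text\n"
    ".globl " SYM(my_add) "\n"
    SYM(my_add) ":\n"
    "    add x0, x0, x1\n"   /* x0 = a + b */
    "    ret\n");
#elif defined(__x86_64__) && !defined(_WIN32)
__asm__(
    ".text\n"
    ".globl " SYM(my_add) "\n"
    SYM(my_add) ":\n"
    "    leaq (%rdi,%rsi), %rax\n"   /* rax = a + b */
    "    ret\n");
#else
#error "sketch only covers aarch64 and System V x86_64"
#endif

/* Plain declaration, the counterpart of the lib-less `extern` in sx:
 * the linker resolves it against the symbol the asm block defines. */
int64_t my_add(int64_t a, int64_t b);
```

Calling `my_add(40, 2)` from C then yields 42 by the same route as the sx example: the assembler defines the symbol, and the ordinary declaration is resolved at link time.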
@@ -6,7 +6,26 @@ commit, one step at a time per the cadence rule (no commit may both add a test
 and make it pass).
 
 ## Last completed step
-**E** — multi-output tuples. **Inline asm now returns tuples.** Replaced the
+**F** — global (module-scope) asm. A top-level `asm { "tmpl", };` block (template
+only) lowers to LLVM `module asm`, and a lib-less `extern` calls into the symbols
+it defines. New `asm_global` AST node (`src/ast.zig`) + `parseAsmGlobal`
+(`src/parser.zig`, dispatched from `parseTopLevel` on `kw_asm`) — rejects
+`volatile` and any operands/clobbers. The node forced (and got) arms in the same
+three `Node.Data` switches as `asm_expr` (`sema.zig` ×2, `semantic_diagnostics.zig`).
+`Module` gains a `global_asm: ArrayList([]const u8)` (`src/ir/module.zig`);
+`lowerMainAndComptime` captures each template (the dead `lowerDecls` is NOT the
+top-level pass — `lowerRoot` Pass 2 uses `lowerMainAndComptime`); `emit_llvm.zig`'s
+`emit()` appends each via `LLVMAppendModuleInlineAsm` (source order). Verified
+end-to-end: an aarch64 `_my_add` global routine called via `extern` returns 42 —
+**AOT only** (the ORC JIT doesn't link module-asm symbols, so `sx run` is wrong;
+the design ties global-asm symbols to the final linked binary). Locked with
+`examples/1648-platform-asm-global.sx` (`.build { "aot": true, "target": "macos" }`
+→ AOT build+run on aarch64, ir-only elsewhere). `zig build test` green (656
+corpus, 446 unit). Files: `src/ast.zig`, `src/parser.zig`, `src/sema.zig`,
+`src/ir/semantic_diagnostics.zig`, `src/ir/module.zig`, `src/ir/lower/decl.zig`,
+`src/ir/emit_llvm.zig`, `examples/1648-*`.
+
+Prior: **E** — multi-output tuples. **Inline asm now returns tuples.** Replaced the
 N>1 bail with a shared `asmResultType` helper (`src/ir/lower/expr.zig`, mixed
 into `Lowering`) that derives the result type from the `out_value` operands
 (0→void, 1→T, N→named tuple, named via the §II.5 effective-name rule). The key
@@ -135,10 +154,13 @@ pipeline: lex (A.0) → parse (A.1) → validate (B.0/B.1 + `%[name]` check) →
 (C.0) → lower-builds-op + LLVM emit + JIT asm-parser init (C.1/D) → multi-output
 tuples (E). Register-class + register-pinned operands, inputs, clobbers, `#string`
 multi-instruction templates, `%[name]`/`%%` rewriting, and the §II.5 auto-naming
-rule all work and execute on the host JIT. **Remaining feature gaps:** `-> @place`
-write-through / read-write / indirect-memory outputs (rejected at parse — Phase 2)
-and global `asm { … }` + `extern` call-into-asm (Phase F). `readme.md` has no
-inline-asm section yet (docs-track-changes follow-up).
+rule all work and execute on the host JIT. Global `asm { … }` (Phase F) works AOT (call-into-asm
+via lib-less `extern`). **Remaining feature gap:** `-> @place` write-through /
+read-write / indirect-memory outputs (rejected at parse — Phase 2). Smaller
+follow-ups: the comptime-call guard for global asm (`#run` into a module-asm
+symbol should fail loud via dlsym-miss — pin a test), a JIT-vs-global-asm note
+(`sx run` silently mishandles module-asm symbols; AOT is correct), the x86_64
+syscall ir-only example, and a `readme.md` inline-asm section (docs-track-changes).
 
 Known orthogonal bug: **issue 0137** — `sx run` on a program with no `main`
 segfaults (`src/target.zig:256-273`, unguarded JIT entry lookup). Pre-existing,
@@ -150,21 +172,17 @@ Phase E–F feasibility already confirmed against the live tree
 `extern`, 60 sites; `--target` a global CLI flag).
 
 ## Next step
-Two independent directions (pick either):
-- **Phase F — global asm** (smaller; the plan calls it "Small"): top-level
-`asm { … }` decl (template only — reject operands/`volatile`) → lower to
-`c.LLVMAppendModuleInlineAsm`; the call-INTO-asm direction reuses the existing
-lib-less `extern` (no new surface). Parser: recognize `asm {` at decl scope →
-an `asm_global` decl. Plus the comptime-call guard (a global-asm symbol isn't
-in the JIT host — dlsym-miss must be loud). See `PLAN-ASM.md` Phase F.
-- **Phase 2 — `-> @place` outputs** (write-through, read-write `"+r" -> @place`,
-indirect-memory `"=*m"`): currently rejected at parse. Needs place-expr
-lowering for the output target + the indirect-constraint handling, plus
-output-to-`const` rejection.
+**Phase 2 — `-> @place` outputs** (the last feature gap): write-through
+(`"=…" -> @place`), read-write (`"+…" -> @place`), and indirect-memory (`"=*m"`)
+outputs, currently rejected at parse. Needs: parse `-> @<place-expr>` into an
+`out_place` operand (payload = the place expr), lower the place to an address +
+`store` the asm result through it (place outputs don't join the result tuple),
+the `+` read-write seeding, and output-to-`const` rejection. See `PLAN-ASM.md`
+Phase G / design §II.2 Dev 5 + cookbook (`cas`, `memcpy_bytes`, `cpuid_into`).
 
-Also worth doing soon: the **x86_64 syscall-write** ir-only example (plan's D
-verification) and a **readme.md** inline-asm section (docs-track-changes). And the
-orthogonal **issue 0137** (no-`main` segfault) whenever.
+Smaller polish (any order): comptime-call guard test for global asm; `sx run`
+should error (not silently mishandle) a module-asm symbol; x86_64 syscall-write
+ir-only example; `readme.md` inline-asm section. Orthogonal: **issue 0137**.
 
 ## Log
 - (init) Plan + design doc written; ASM stream opened.
@@ -205,6 +223,10 @@ orthogonal **issue 0137** (no-`main` segfault) whenever.
 struct, so emit unchanged; the asm struct return IS the sx tuple. Runs on
 aarch64 (1647: `split`→`(lo,hi)`); 1640 → x86 multi-output IR lock (ir-only).
 `zig build test` green (655 corpus, 446 unit).
+- (F) global asm — `asm_global` AST node + `parseAsmGlobal` (top-level, rejects
+volatile/operands); `Module.global_asm` captured in `lowerMainAndComptime`;
+`emit()` appends via `LLVMAppendModuleInlineAsm`; call-into via lib-less
+`extern`. AOT-verified (1648, `_my_add`→42). `zig build test` green (656 corpus).
 
 ## Known issues
 - **0137** — `sx run` on a program with no `main` segfaults (unguarded JIT entry

20
examples/1648-platform-asm-global.sx
Normal file
@@ -0,0 +1,20 @@
+// ASM stream Phase F — top-level (global) `asm { … }`: a template-only block at
+// module scope, lowered to LLVM `module asm` (LLVMAppendModuleInlineAsm). It
+// defines a symbol that a lib-less `extern` declaration calls into — the
+// import direction reuses the existing C-FFI extern path, no new surface.
+// Built+run via `aot` (a module-asm symbol lives in the final linked binary,
+// not the JIT host); aarch64-macos-pinned, so ir-only on a non-matching host.
+asm {
+    #string ASM
+    .global _my_add
+    _my_add:
+     add x0, x0, x1
+     ret
+    ASM,
+};
+
+my_add :: (a: i64, b: i64) -> i64 extern;
+
+main :: () -> i64 {
+    return my_add(40, 2); // 42, computed by the global-asm routine
+}

1
examples/expected/1648-platform-asm-global.build
Normal file
@@ -0,0 +1 @@
+{ "aot": true, "target": "macos" }

1
examples/expected/1648-platform-asm-global.exit
Normal file
@@ -0,0 +1 @@
+42

16
examples/expected/1648-platform-asm-global.ir
Normal file
@@ -0,0 +1,16 @@
+
+module asm ".global _my_add"
+module asm "_my_add:"
+module asm " add x0, x0, x1"
+module asm " ret"
+
+; Function Attrs: nounwind
+declare i64 @my_add(i64, i64) #0
+
+; Function Attrs: nounwind
+define i32 @main() #0 {
+entry:
+  %call = call i64 @my_add(i64 40, i64 2)
+  %ca.tr = trunc i64 %call to i32
+  ret i32 %ca.tr
+}

1
examples/expected/1648-platform-asm-global.stderr
Normal file
@@ -0,0 +1 @@
+

1
examples/expected/1648-platform-asm-global.stdout
Normal file
@@ -0,0 +1 @@
+

10
src/ast.zig
@@ -96,6 +96,7 @@ pub const Node = struct {
         runtime_class_decl: RuntimeClassDecl,
         jni_env_block: JniEnvBlock,
         asm_expr: AsmExpr,
+        asm_global: AsmGlobal,
 
         pub fn declName(self: Data) ?[]const u8 {
             return switch (self) {
@@ -259,6 +260,15 @@ pub const AsmOperand = struct {
     };
 };
 
+/// Top-level (module-scope) global assembly: `asm { "tmpl", };` (ASM stream
+/// design §II.2 Deviation 6). Template only — no operands, no `volatile`, no
+/// `clobbers`, no `%` substitution. Lowers to `LLVMAppendModuleInlineAsm`;
+/// multiple blocks concatenate in source order. Symbols it defines are reached
+/// with a lib-less `extern` declaration.
+pub const AsmGlobal = struct {
+    template: *Node, // string-literal / `#string` heredoc node
+};
+
 pub const Identifier = struct {
     name: []const u8,
     /// True when written as a backtick raw identifier (`` `i2 ``). Carried so a

@@ -347,6 +347,14 @@ pub const LLVMEmitter = struct {
         // Must precede any DISubprogram (created per function below).
         self.debugInfo().initDebugInfo();
 
+        // Top-level global asm (ASM stream Phase F): append each block verbatim
+        // to the module. Multiple blocks concatenate in source order; LLVM emits
+        // them as module-level `module asm`. Symbols they define are reached via
+        // lib-less `extern` declarations.
+        for (self.ir_mod.global_asm.items) |asm_text| {
+            c.LLVMAppendModuleInlineAsm(self.llvm_module, asm_text.ptr, asm_text.len);
+        }
+
         // Pass 0: Declare and initialize globals
         self.emitGlobals();
 

@@ -1342,6 +1342,17 @@ pub fn lowerMainAndComptime(self: *Lowering, decls: []const *const Node) void {
                     self.lowerMainAndComptime(ns.decls);
                 }
             },
+            // Top-level global asm (Phase F): capture the verbatim template; it
+            // is appended to the LLVM module at emit time (source order). The
+            // template must be a comptime-known string (parser guarantees a
+            // string node here).
+            .asm_global => |ag| {
+                if (ag.template.data == .string_literal) {
+                    self.module.global_asm.append(self.alloc, ag.template.data.string_literal.raw) catch unreachable;
+                } else if (self.diagnostics) |diags| {
+                    diags.addFmt(.err, decl.span, "global asm template must be a compile-time-known string", .{});
+                }
+            },
             else => {},
         }
     }

@@ -51,6 +51,10 @@ pub const Module = struct {
     /// (trampoline emission, +alloc/-dealloc synthesis) can re-walk
     /// `members` for fields / methods / `#extends` / `#implements`.
     objc_defined_class_cache: std.ArrayList(ObjcDefinedClassEntry),
+    /// Top-level `asm { … }` blocks (ASM stream Phase F), in source order.
+    /// Each is verbatim assembly appended to the LLVM module via
+    /// `LLVMAppendModuleInlineAsm` at emit time; multiple blocks concatenate.
+    global_asm: std.ArrayList([]const u8),
     alloc: Allocator,
     /// Owns the per-instruction operand slices the Builder dupes (aggregate
     /// fields, call args, branch args, switch cases, block params). These live
@@ -100,6 +104,7 @@ pub const Module = struct {
             .objc_selector_cache = std.ArrayList(ObjcSelectorEntry).empty,
             .objc_class_cache = std.ArrayList(ObjcClassEntry).empty,
             .objc_defined_class_cache = std.ArrayList(ObjcDefinedClassEntry).empty,
+            .global_asm = std.ArrayList([]const u8).empty,
             .alloc = alloc,
             .slice_arena = std.heap.ArenaAllocator.init(alloc),
         };
@@ -115,6 +120,7 @@ pub const Module = struct {
         self.objc_selector_cache.deinit(self.alloc);
         self.objc_class_cache.deinit(self.alloc);
         self.objc_defined_class_cache.deinit(self.alloc);
+        self.global_asm.deinit(self.alloc);
         self.types.deinit();
         self.slice_arena.deinit();
     }

@@ -316,6 +316,7 @@ pub const UnknownTypeChecker = struct {
                 self.checkBindingNames(ae.template);
                 for (ae.operands) |op| self.checkBindingNames(op.payload);
             },
+            .asm_global => |ag| self.checkBindingNames(ag.template),
             // ── Named type / alias / import declarations: a bare reserved
             // spelling as the declared name is rejected. These
             // have no nested binding sites, so only the name is checked. A

@@ -104,6 +104,13 @@ pub const Parser = struct {
             return try self.createNode(start, .{ .import_decl = .{ .path = path, .name = null } });
         }
 
+        // Top-level (module-scope) global assembly: `asm { "tmpl", };`
+        // (template only — no operands/volatile/clobbers). The in-function
+        // `asm { … }` expression form is parsed in `parsePrimary` instead.
+        if (self.current.tag == .kw_asm) {
+            return self.parseAsmGlobal(start);
+        }
+
         // Top-level #run directive
         if (self.current.tag == .hash_run) {
             self.advance();
@@ -2801,6 +2808,24 @@ pub const Parser = struct {
         } });
     }
 
+    /// Top-level global assembly `asm { "tmpl", };` — template only. Rejects
+    /// `volatile` and any operands/clobbers (design §II.2 Deviation 6).
+    fn parseAsmGlobal(self: *Parser, start: u32) anyerror!*Node {
+        self.advance(); // consume `asm`
+        if (self.isContextualWord("volatile")) {
+            return self.fail("global (top-level) asm cannot be `volatile`");
+        }
+        try self.expect(.l_brace);
+        const template = try self.parseExpr();
+        if (self.current.tag == .comma) self.advance(); // optional trailing comma
+        if (self.current.tag != .r_brace) {
+            return self.fail("global (top-level) asm takes no operands, inputs, or clobbers — only a template string");
+        }
+        try self.expect(.r_brace);
+        try self.expect(.semicolon);
+        return try self.createNode(start, .{ .asm_global = .{ .template = template } });
+    }
+
     fn parsePrimary(self: *Parser) anyerror!*Node {
         const start = self.current.loc.start;
         // Pack references in expression position:

@@ -1367,6 +1367,7 @@ pub const Analyzer = struct {
                 try self.analyzeNode(ae.template);
                 for (ae.operands) |op| try self.analyzeNode(op.payload);
             },
+            .asm_global => |ag| try self.analyzeNode(ag.template),
             .impl_block => |ib| {
                 // Each impl block gets its own scope so methods don't conflict across impls
                 try self.pushScope();
@@ -1843,6 +1844,9 @@ pub fn findNodeAtOffset(node: *Node, offset: u32) ?*Node {
                 if (findNodeAtOffset(op.payload, offset)) |found| return found;
             }
         },
+        .asm_global => |ag| {
+            if (findNodeAtOffset(ag.template, offset)) |found| return found;
+        },
     }
 
     return node;