# Slovo v0 Specification

> This is not a complete language specification.
>
> This is the supported v0 contract for the first Slovo parser, formatter,
> checker, diagnostics, test runner, and LLVM-oriented compiler prototype.

---

## 1. Status

Version: `v0`

File extension: `.slo`

Primary backend target: **LLVM IR**

Primary design goals:

- parse Slovo forms
- format Slovo forms
- type-check a small core
- run tests
- lower simple programs toward LLVM IR
- produce structured diagnostics

Compiler-supported fixtures for the frozen v0 contract:

```text
examples/supported/add.slo
examples/supported/top-level-test.slo
examples/supported/local-variables.slo
examples/supported/if.slo
examples/supported/while.slo
examples/supported/struct.slo
examples/supported/array.slo
examples/supported/option-result.slo
examples/supported/unsafe.slo
```

For strict-manifest iteration1, "compiler-supported" means the form is parsed, lowered, type-checked, emitted as LLVM IR or handled by its required tool mode, and covered by an automated Glagol test. This file uses "current" only inside the v0 boundary. Post-v0 promotions live in `SPEC-v1.md` and may expand the repository's supported surface without changing the frozen v0 contract.

Supported v0 subset, as exercised by the v0 fixtures:

- one `(module name)` form
- top-level `(fn ...)` forms
- top-level `(struct Name (field i32)...)` forms
- top-level `(test "name" body...
final-expression)` forms, where body forms are optional local declarations, assignments, or first-pass loops
- explicitly typed `i32` function parameters and `i32` return values
- direct constructor-return functions with `(option i32)` and `(result i32 i32)` return types under section 14
- integer literals used as `i32` values
- function parameter references
- binary integer addition with `+`
- equality comparison `=` as a bool-producing test expression
- ordering comparison `<` as a bool-producing condition over `i32` operands
- user-defined function calls
- the temporary compiler intrinsic `(print_i32 value)` call as a statement-like expression returning builtin `unit`
- local `i32` bindings with `(let name i32 value)` and `(var name i32 value)`
- assignment to mutable local `i32` bindings with `(set name value)`
- local references after declaration
- value-producing `(if condition then-expression else-expression)`
- first-pass `(while condition body...)` as a non-final sequential body form
- first-pass struct constructor expressions and immediate field access
- first-pass fixed `i32` array constructors, immutable array locals, and literal checked indexing
- first-pass `i32` option/result constructors used only as direct function returns
- lexical `(unsafe body... final-expression)` expression blocks whose body forms and final expression are otherwise supported safe v0 forms
- final-expression function returns

Implementation-recognized forms are not automatically supported language features. A form remains a design target until it has parser/lowerer behavior, checker behavior, backend behavior or explicit unsupported diagnostics, formatter behavior where applicable, and tests.

All examples under `examples/speculative/` are v0 design targets, not current compiler-supported fixtures.

Formatter fixtures under `examples/formatter/` are canonical-layout fixtures for the same strict supported syntax. They are not new language features and must not include design-target forms.

Top-level `(test "name" body... final-expression)` is supported only in the strict form defined in section 16.

Non-goals for v0:

- macros
- generics
- concurrency
- advanced ownership
- package manager
- full standard library
- direct x86/ARM backend

---

## 2. Source Files

A Slovo source file uses the `.slo` extension.

A source file contains one top-level module form.

```slo
(module main)
```

A practical file usually contains:

```slo
(module main)

(fn main () -> i32
  0)
```

---

## 3. Lexical Syntax

Whitespace separates tokens but has no semantic meaning. Tabs and spaces are equivalent.

Line comments begin with `;`.

```slo
; this is a comment
(+ 1 2)
```

A form is a parenthesized tree:

```slo
(name arg arg arg)
```

Identifiers may contain letters, digits, `_`, `-`, `?`, `!`, and selected symbolic operator names.

String literals use double quotes.

```slo
"Hello, Slovo"
```

Integer literals:

```slo
0
42
-7
```

Current compiler-supported integer literals are signed `i32` values in the inclusive range `-2147483648` to `2147483647`. Glagol must reject literals outside that range with `IntegerOutOfRange` before narrowing them into the AST.

Floating-point literals:

```slo
3.14
-0.5
```

Floating-point literals are a design target, not current compiler-supported syntax.

---

## 4. Top-Level Forms

Current compiler-supported top-level forms:

```text
module
fn
test
struct
```

Design-target top-level forms:

```text
import
```

`test` is supported only as the top-level form described in section 16. `struct` is supported only as the first-pass top-level form described in section 12. `import` remains a broader design target.

---

## 5. Types

Current compiler-supported scalar type:

```text
i32
```

First-pass struct declarations introduce nominal struct types as described in section 12.
Those types are supported only as constructor temporaries consumed by immediate field access; they are not currently supported as function parameter types, function return types, or local declaration types.

First-pass fixed arrays are specified in section 13. `(array i32 N)` is a current compiler-supported immutable local declaration type only when the local is initialized directly from an `(array i32 value...)` constructor with matching length. It is not currently supported as a function parameter type or function return type.

First-pass option/result constructors are specified in section 14. `(option i32)` and `(result i32 i32)` are current compiler-supported function return types only for functions whose single body expression is the matching direct constructor. They are not currently supported as function parameter types or local declaration types.

`print_i32` status:

`print_i32` is a temporary compiler intrinsic in v0. It is not yet specified as a user-defined function, standard-library binding, runtime binding, import, or foreign function. Programs may use `(print_i32 value)` only as part of the current strict compiler subset described in section 1.

`unit` status:

`unit` is a valid internal/builtin result type for unit-like expressions such as `print_i32` and the supported `(set name value)` form. It is not a user-declarable supported function return type until Glagol supports parsing, checking, lowering, formatting, diagnostics, backend behavior, and tests for user-declared `unit` returns.

Design-target primitive types:

```text
bool
i8
i16
i64
u8
u16
u32
u64
f32
f64
char
string
unit
never
```

Broader design-target compound types:

```slo
(ptr T)
(array T N)
(slice T)
(option T)
(result T E)
```

The first-pass `(array i32 N)`, `(option i32)`, and `(result i32 i32)` cases above are the only promoted compound-type exceptions. Generic, nested, and non-`i32` compound-type payloads remain design targets.

---

## 6. Functions

Function form:

```slo
(fn name ((arg Type) ...) -> ReturnType
  body...)
```

Example:

```slo
(fn add ((a i32) (b i32)) -> i32
  (+ a b))
```

v0 rules:

- parameters are explicitly typed
- return type is explicit
- body contains one or more expressions
- the last expression is the return value
- all control paths must produce the declared return type
- current supported fixtures use `i32` returns, except for the direct option/result constructor-return functions described in section 14

---

## 7. Variables

Status: supported for explicit `i32` local bindings in function bodies and top-level test bodies.

Immutable binding:

```slo
(let name type value)
```

Mutable binding:

```slo
(var name type value)
(set name value)
```

Supported implementation scope:

- Explicit local types are required in v0.
- `i32` locals are supported with `let` and `var`; immutable `(array i32 N)` locals are supported only under the first-pass array contract in section 13.
- Local declarations are allowed only inside function bodies, top-level test bodies, and promoted lexical unsafe blocks contained by those bodies.
- Local declarations and `set` forms are sequential body forms before the final result expression.
- A function or test body must still have a final result expression.

Binding rules:

- `(let name type value)` creates a new immutable local binding initialized from `value`.
- `(var name type value)` creates a new mutable local binding initialized from `value`.
- The initializer is checked before the new local is introduced, so a binding may not refer to itself.
- The initializer expression must exactly match the explicit local type.
- The explicit local type must be `i32`, except for the immutable `(array i32 N)` `let` locals allowed under section 13.
- Local bindings are visible only after their declaration and only within the containing function body, test body, or lexical unsafe block. Bindings from an outer function or test body remain visible inside a nested unsafe block.
- Lexical unsafe blocks introduce the only current nested local scope.
`if` branches and first-pass `while` bodies do not introduce local declaration scopes.

Shadowing and redeclaration:

- v0 forbids shadowing in local bodies.
- A local binding name must not match any parameter name in the same function.
- A local binding name must not match any earlier local binding in the same function or test body.
- A local binding name must not match a top-level function name or compiler intrinsic visible in the same module.
- Parameters are immutable for v0 and are not valid `set` targets.
- Top-level function names are not local bindings. A local reference resolves to a parameter or local binding; a call resolves to a callable top-level function or compiler intrinsic unless a later namespace rule says otherwise.

Assignment:

- `(set name value)` assigns an existing mutable local binding.
- `set` produces builtin `unit`.
- `set` may appear only as a non-final sequential body form, because supported function/test results are not `unit`.
- The assigned expression must exactly match the mutable local's type.
- Assigning an unknown name is an `UnknownVariable` diagnostic.
- Assigning a parameter is a `CannotAssignParameter` diagnostic.
- Assigning an immutable `let` binding is a `CannotAssignImmutableLocal` diagnostic.
- Assigning a value with the wrong type is a `TypeMismatch` diagnostic.
- Duplicating a local name is a `DuplicateLocal` diagnostic.
- Redeclaring a parameter as a local is a `LocalRedeclaresParameter` diagnostic.
- Colliding with a visible function or compiler intrinsic is a `LocalShadowsCallable` diagnostic.
- Declaring any non-`i32` local is an `UnsupportedLocalType` diagnostic, apart from the immutable `(array i32 N)` locals promoted in section 13.
- Using a local declaration in an inline expression is a `LocalDeclarationNotAllowed` diagnostic.
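
The assignment rules above can be sketched as a small checker function. This is an illustrative Rust sketch only, not Glagol's implementation: `Type`, `Binding`, `SetDiagnostic`, and `check_set` are hypothetical names, the environment is assumed to be a plain map from names to bindings, and the order in which the diagnostics are tried is an assumption. Only the diagnostic names come from this section.

```rust
use std::collections::HashMap;

/// Types used by this sketch; v0 locals are `i32`, and `set` yields `unit`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Type {
    I32,
    Bool,
    Unit,
}

/// How a name was introduced in the current function or test body (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Binding {
    Parameter(Type),
    Let(Type),
    Var(Type),
}

/// Diagnostic names taken from the assignment rules in this section.
#[derive(Debug, PartialEq, Eq)]
enum SetDiagnostic {
    UnknownVariable,
    CannotAssignParameter,
    CannotAssignImmutableLocal,
    TypeMismatch { expected: Type, found: Type },
}

/// Checks `(set name value)` given the already-checked type of `value`.
/// On success the form has builtin type `unit`.
fn check_set(
    env: &HashMap<&str, Binding>,
    name: &str,
    value_type: Type,
) -> Result<Type, SetDiagnostic> {
    match env.get(name) {
        None => Err(SetDiagnostic::UnknownVariable),
        Some(Binding::Parameter(_)) => Err(SetDiagnostic::CannotAssignParameter),
        Some(Binding::Let(_)) => Err(SetDiagnostic::CannotAssignImmutableLocal),
        // The assigned expression must exactly match the mutable local's type.
        Some(Binding::Var(declared)) if *declared != value_type => {
            Err(SetDiagnostic::TypeMismatch {
                expected: *declared,
                found: value_type,
            })
        }
        Some(Binding::Var(_)) => Ok(Type::Unit),
    }
}
```

With an environment holding a parameter `n`, a `let` local `sum`, and a `var` local `doubled`, only `check_set(&env, "doubled", Type::I32)` succeeds; every other target or value type maps to one of the diagnostics listed above.
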

Typed-core meaning:

```text
LocalLet { name, type: i32, init: TExpr, mutable: false, span, name_span, type_span, init_span }
LocalVar { name, type: i32, init: TExpr, mutable: true, span, name_span, type_span, init_span }
SetLocal { target: LocalId, value: TExpr, type: unit, span, name_span, value_span }
```

Lowering:

- `let` and `var` lower to typed local storage or SSA values at the implementation's discretion, provided source order, mutability checks, and diagnostics are preserved.
- `set` lowers to an assignment/store to the resolved mutable local.
- A local that is never read may be accepted in v0; unused-local warnings are outside the v0 required diagnostics.

Formatter behavior:

```slo
(fn add_then_double ((a i32) (b i32)) -> i32
  (let sum i32 (+ a b))
  (var doubled i32 (+ sum sum))
  (set doubled (+ doubled 1))
  doubled)
```

The formatter must print each local declaration and `set` as one body form indented by two spaces. Simple `let`, `var`, and `set` forms stay inline when their initializer or assigned expression is an inline expression under the current expression formatter. The final body expression follows the same two-space indentation and may refer to earlier locals.

---

## 8. Expressions

Slovo v0 is expression-oriented. Most forms produce a value.

Current compiler-supported expressions are the expression forms required by the supported fixtures: integer literals, parameter references, local references, local declarations, local assignment, binary `+`, user function calls, temporary intrinsic `print_i32` calls, equality comparison in tests, ordering comparison in loop conditions, value-producing `if`, first-pass `while`, struct construction with immediate field access, fixed `i32` array construction, literal array indexing, first-pass option/result constructors, lexical `unsafe` blocks containing otherwise supported safe body forms, and final-expression returns.
Struct construction and immediate field access are supported under the strict first-pass contract in section 12. Array construction, immutable array locals, and literal checked indexing are supported under the strict first-pass contract in section 13. Option/result constructors are supported under the strict first-pass direct-return contract in section 14. Lexical `unsafe` blocks are supported under the strict first-pass contract in section 15.

Arithmetic:

```slo
(+ a b)
(- a b)
(* a b)
(/ a b)
```

Comparison:

```slo
(= a b)
(< a b)
(> a b)
(<= a b)
(>= a b)
```

v0 rules:

- operands must have compatible exact types
- numeric casts are explicit
- comparisons return `bool`

Status note: arithmetic and comparison beyond the supported fixture are design targets until covered by the strict support rule in section 1.

The top-level `test` contract in section 16 introduces an expected-`bool` context, but it does not promote any expression form by itself. A promoted test fixture must use only bool-producing expressions that also satisfy the strict support rule, such as equality comparison once it is accepted as part of a promoted fixture.

---

## 9. Canonical Formatter

Status: current strict supported syntax only.

The formatter is part of the Slovo language contract.

For the current supported subset, canonical output follows these rules:

- use LF line endings and end every file with one trailing newline
- remove trailing whitespace
- use spaces, not tabs
- put top-level forms at column 1
- keep `(module name)` on one line
- separate top-level forms with exactly one blank line
- print each supported top-level `fn` as a multi-line form
- print the function header on one line as `(fn name ((arg Type) ...)
-> ReturnType`
- print an empty parameter list as `()`
- keep supported parameter lists inline as `((name i32) (name i32))`
- indent each function body expression by two spaces
- print one supported body expression per line
- keep supported expression forms inline: integer literals, parameter references, local references, local declarations, `set`, binary `+`, user-defined function calls, `(print_i32 value)`, and first-pass option/result constructors
- keep nested supported calls inline, for example `(print_i32 (add 20 22))`
- place the closing `)` for a function on the final body-expression line when that expression is inline
- preserve full-line comments attached to the nearest following form, without reflowing comment text

Canonical current fixture:

```slo
(module main)

(fn add ((a i32) (b i32)) -> i32
  (+ a b))

(fn main () -> i32
  (print_i32 (add 20 22))
  0)
```

Formatter rules for `let`, `var`, and `set` are part of the current supported formatter fixture set. The `if` formatter contract and first-pass `while` formatter contract are also part of the current formatter fixture set. The first-pass formatter rules for `struct` are also part of the current formatter fixture set.

Section 13 specifies the first-pass array/index formatter contract, and `examples/formatter/array.slo` is the canonical formatter fixture for that promoted subset. Section 14 specifies the first-pass option/result constructor formatter contract, and `examples/formatter/option-result.slo` is the canonical formatter fixture for that promoted subset. Section 15 specifies the lexical `unsafe` formatter contract, and `examples/formatter/unsafe.slo` is the canonical formatter fixture for that promoted subset.

Formatter rules for strings, slices, broader option/result value flow, pointer types, raw memory operations, and general user-declared `unit` returns remain design targets.
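
The `fn` layout rules listed earlier in this section can be illustrated with a small printer. This is a hedged Rust sketch, not the Glagol formatter: `format_fn` and `format_file` are hypothetical helpers, body expressions are assumed to arrive as already-formatted inline strings, and comments are not handled.

```rust
/// Prints one supported top-level `fn` in the canonical layout.
/// `body` holds already-formatted inline body expressions in source order.
fn format_fn(name: &str, params: &[(&str, &str)], return_type: &str, body: &[&str]) -> String {
    // Parameter lists stay inline: `((a i32) (b i32))`, or `()` when empty.
    let params_text = if params.is_empty() {
        String::from("()")
    } else {
        let inner: Vec<String> = params
            .iter()
            .map(|(param, ty)| format!("({} {})", param, ty))
            .collect();
        format!("({})", inner.join(" "))
    };

    // The function header goes on one line.
    let mut out = format!("(fn {} {} -> {}", name, params_text, return_type);

    // One body expression per line, indented by two spaces.
    for expr in body {
        out.push_str("\n  ");
        out.push_str(expr);
    }

    // The closing `)` sits on the final body-expression line.
    out.push(')');
    out
}

/// Joins top-level forms with exactly one blank line and ends the file
/// with one trailing LF newline.
fn format_file(forms: &[String]) -> String {
    let mut out = forms.join("\n\n");
    out.push('\n');
    out
}
```

Passing `(module main)` plus the `add` and `main` functions from the canonical fixture through `format_file` reproduces that fixture's layout, under the assumptions stated above.
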

Top-level `test` formatter contract:

```slo
(test "add works"
  (= (add 2 3) 5))
```

The formatter must print each top-level test as a multi-line top-level form, keep the string name on the opening line, indent the single test expression by two spaces, and place the closing `)` on the expression line when that expression is inline. Test forms follow the same top-level blank-line and comment attachment rules as `fn`.

---

## 10. Conditionals

Status: compiler-supported value expression through `examples/supported/if.slo` under the strict support rule in section 1.

Conditional form:

```slo
(if condition then-expression else-expression)
```

v0 rules:

- condition must be `bool`
- both branches must have the same type
- `if` returns that type
- `if` is an expression form and may be used where its result type is expected.
- Branches are expressions, not declaration scopes. A local declaration inside an `if` branch remains unsupported in this pass.
- The formatter prints `if` as a multi-line expression:

```slo
(if (< value 3)
  10
  20)
```

Diagnostics:

- `MalformedIfForm`: missing condition, then expression, else expression, or an invalid operand shape. Span: whole `if` form. Expected: `(if condition then-expression else-expression)`.
- `IfConditionNotBool`: checked condition type is not `bool`. Span: condition expression. Expected: `bool`. Found: checked condition type.
- `IfBranchTypeMismatch`: then and else branches do not have the same checked type. Span: whole `if` form. Expected: then-branch type. Found: else-branch type.

---

## 11. Loops

Status: compiler-supported first-pass contract through `examples/supported/while.slo` under the strict support rule in section 1.

First-pass v0 loop form:

```slo
(while condition body-form...)
```

Surface placement:

- `(while condition body-form...)` is a sequential body form.
- It is allowed only inside function bodies and top-level test bodies.
- In the first supported implementation, it may appear only before the final result expression of supported `i32` functions and `bool` tests.
- It is not valid as an inline expression argument.
- It is not valid as the final result expression of a supported `i32` function or `bool` test, because it returns builtin `unit`.

Checking rules:

- The condition is checked in the current local environment and must have type `bool`.
- The condition is re-evaluated before each iteration.
- A false initial condition executes zero body iterations.
- The body must contain one or more body forms.
- `while` returns builtin `unit`.
- The first implementation does not introduce nested local-scope semantics. Local declarations inside a `while` body are not supported.
- Supported first-pass loop body forms are limited to assignments to existing mutable locals with `(set name value)` and supported unit-producing calls such as `(print_i32 value)`.
- Loop body forms must each check as `unit`.
- Existing locals visible before the `while` are visible in the condition and body. Assignments in the body update the existing mutable local.
- `break` and `continue` are not part of this pass.
- Nested `while` is not part of the first-pass support target.

Typed-core meaning:

```text
While { condition: TExpr, body: [TBodyForm], type: unit, span, condition_span, body_spans }
```

Lowering:

- A `while` lowers to a condition block, a body block, and an exit block.
- Control enters the condition block first.
- If the condition evaluates to true, control enters the body block; after the body completes, control branches back to the condition block.
- If the condition evaluates to false, control branches to the exit block.
- The exit continuation has builtin `unit` as the loop result and then continues with the next sequential body form.
- Locals assigned in the body must preserve source-order mutation semantics.
The implementation may use storage slots or equivalent SSA construction, but it must not treat the loop body as a new declaration scope in this first pass.

Formatter behavior:

```slo
(fn sum_to ((n i32)) -> i32
  (var i i32 0)
  (var sum i32 0)
  (while (< i n)
    (set sum (+ sum i))
    (set i (+ i 1)))
  sum)
```

The formatter must print `while` as a multi-line body form. The opening line contains `(while` and the formatted condition. Each loop body form is printed on its own line, indented two spaces deeper than the `while` form. The closing `)` is placed on the final body-form line when that body form is inline. The following function or test body form resumes at the original body indentation.

Diagnostics:

- `MalformedWhileForm`: missing condition or an invalid operand shape. Span: whole `while` form or offending operand. Expected: `(while condition body...)`.
- `WhileConditionNotBool`: checked condition type is not `bool`. Span: condition expression. Expected: `bool`. Found: checked condition type.
- `EmptyWhileBody`: a `while` has no body forms after the condition. Span: whole `while` form or the location where the first body form was expected.
- `LocalDeclarationInWhileBodyUnsupported`: a `let` or `var` appears directly in a first-pass `while` body. Span: offending local declaration. Hint: declare the local before the loop and update an existing mutable local with `set`.
- `NestedWhileUnsupported`: a nested `while` appears directly in a first-pass `while` body. Span: nested `while`. Hint: keep first-pass loop bodies flat.
- `WhileBodyFormNotUnit`: a loop body form does not produce `unit`. Span: offending body form. Expected: `unit`. Found: checked body-form type.
- Existing diagnostics such as `UnknownVariable`, `CannotAssignParameter`, `CannotAssignImmutableLocal`, `TypeMismatch`, `ArityMismatch`, and `UnsupportedBackendFeature` apply inside `while` conditions and bodies.
- Using a `while` as the final expression of an `i32` function must report `ReturnTypeMismatch` with expected `i32` and found `unit`.
- Using a `while` as the final expression of a top-level test must report `TestExpressionNotBool` with expected `bool` and found `unit`.

Minimal comparison dependency:

The promoted first fixture for `while` uses `<` to produce the loop condition. Promoting `while` does not by itself promote the broader comparison family or unrelated arithmetic. The supported comparison surface for this pass is the exact `i32, i32 -> bool` `<` form exercised by the fixture, with parser/lowerer support, checker behavior, LLVM behavior, formatter behavior, and tests.

---

## 12. Structs

Status: compiler-supported first-pass contract through `examples/supported/struct.slo` under the strict support rule in section 1.

Promotion boundary:

- The supported fixture may use only top-level `i32` struct declarations, constructor expressions, and field access returning `i32`.
- Local struct storage is a follow-up. The first pass does not support `(let p Point ...)`, `(var p Point ...)`, assigning a struct value, passing a struct as a parameter, or returning a struct from a function.
- Methods, generics, field mutation, nested struct fields, recursive structs, struct params/returns, layout reflection, cross-module structs, and ABI promises are explicitly outside the first pass.

Struct definition:

```slo
(struct Point
  (x i32)
  (y i32))
```

Struct declarations are top-level forms, sibling to `fn` and `test`.

Surface rules:

- The form is `(struct Name (field i32)...)`.
- `Name` is a normal identifier.
- Each `field` is a normal identifier.
- A struct must declare one or more fields.
- First-pass fields must be exactly `i32`.
- Field order is source order and is part of the v0 typed-core contract.
- Struct names are nominal. Two structs with the same fields are not the same type.
- A struct name must be unique in the module and must not collide with a top-level function name or compiler intrinsic visible in constructor expression position.
- Field names must be unique within the struct.
- A struct declaration is not an expression and produces no runtime value.

Constructor expression:

```slo
(Point (x 3) (y 4))
```

Constructor rules:

- The callee position must be the declared struct name.
- Each constructor field is written as `(field value)`, not as a keyword token.
- Constructor fields must list every declared field exactly once.
- Constructor fields must appear in declaration order for the first pass.
- Each field value is checked in source order and must exactly match the declared field type. Since first-pass fields are `i32`, each value must check as `i32`.
- Constructor evaluation preserves field value evaluation order.
- A constructor produces a value of the nominal struct type.
- In the first promotion target, a struct constructor may be used only as the immediate value being inspected by field access, for example `(. (Point (x 3) (y 4)) x)`. Wider struct value flow requires the follow-up local/signature/storage contract.

Field access:

```slo
(. p x)
```

First-pass examples should use immediate constructor access:

```slo
(. (Point (x 3) (y 4)) x)
```

Field access rules:

- The form is `(. value field)`.
- `field` is a normal identifier, not a keyword token or string.
- `value` must check as a known struct type.
- The named field must exist on that struct type.
- The expression result type is the field type. In the first pass this is always `i32`.
- The value expression is evaluated once.
- Field access is read-only. There is no field mutation form in the first pass.

Checking rules:

- The checker collects and validates top-level struct declarations before checking function and test bodies in the same module.
- Struct declaration checking validates top-level placement, identifier shape, module-level name uniqueness, field-name uniqueness, and `i32` field types.
- Constructor checking resolves the callee name in the struct namespace, then checks constructor field names, field order, field completeness, duplicate fields, extra fields, and each field value type.
- Field access checking first checks the value expression, then resolves the requested field against the value's nominal struct type.
- A struct constructor or other struct-typed expression outside direct field access is not part of the first pass. It must report `StructValueFlowUnsupported`, except for the more specific local and signature diagnostics below.
- Existing expected-type diagnostics apply after field access. For example, `(. (Point (x 3) (y 4)) x)` checks as `i32` and may be used anywhere an `i32` expression is currently supported.

Typed-core meaning:

```text
StructDecl { name: StructId, fields: [StructField], span, name_span }
StructField { name: FieldId, type: i32, index: FieldIndex, span, name_span, type_span }
StructConstruct { struct_id: StructId, fields: [(FieldId, TExpr)], type: Struct(StructId), span, name_span, field_spans }
StructFieldAccess { value: TExpr, field_id: FieldId, type: i32, span, value_span, field_span }
```

Lowering:

- A struct declaration lowers to type metadata in the checked module, not to a runtime declaration by itself.
- A constructor lowers to an aggregate value or equivalent typed-core value with fields in declaration order.
- Field access lowers to extracting the selected field from that aggregate.
- For the first promotion target, an implementation may lower direct constructor field access without materializing addressable struct storage.
- No stable LLVM, C, FFI, or cross-module ABI layout is promised in v0.
- If a checked struct form reaches a backend gap, Glagol must report `UnsupportedBackendFeature` instead of panicking.
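
The constructor field checks described in the checking rules above (duplicates, extra fields, completeness, declaration order) can be sketched as follows. This is an illustrative Rust sketch, not Glagol's checker: `ConstructorDiagnostic` and `check_constructor_fields` are hypothetical names, fields are reduced to plain name strings without spans, and the precedence between the diagnostics is an assumption, since this section does not fix one. The diagnostic names are the ones listed in this section's diagnostics.

```rust
/// Constructor diagnostics named in this section, reduced to field names.
#[derive(Debug, PartialEq, Eq)]
enum ConstructorDiagnostic {
    DuplicateStructConstructorField(String),
    UnknownStructField(String),
    MissingStructField(String),
    StructConstructorFieldOrderMismatch { expected: String, found: String },
}

/// Checks the field names written in a constructor against the declaration.
/// Both slices are in source order.
fn check_constructor_fields(
    declared: &[&str],
    written: &[&str],
) -> Result<(), ConstructorDiagnostic> {
    // Reject duplicate and undeclared field names first (assumed precedence).
    for (i, name) in written.iter().enumerate() {
        if written[..i].contains(name) {
            return Err(ConstructorDiagnostic::DuplicateStructConstructorField(
                (*name).to_string(),
            ));
        }
        if !declared.contains(name) {
            return Err(ConstructorDiagnostic::UnknownStructField((*name).to_string()));
        }
    }

    // Every declared field must be listed.
    for name in declared {
        if !written.contains(name) {
            return Err(ConstructorDiagnostic::MissingStructField((*name).to_string()));
        }
    }

    // Same set of names at this point, so only the order can still differ.
    // Report the first out-of-order position.
    for (expected, found) in declared.iter().zip(written) {
        if expected != found {
            return Err(ConstructorDiagnostic::StructConstructorFieldOrderMismatch {
                expected: (*expected).to_string(),
                found: (*found).to_string(),
            });
        }
    }

    Ok(())
}
```

For `Point` declared with fields `x` then `y`, the written order `x, y` passes, while `y, x` reports the order mismatch at the first position (expected `x`, found `y`). Field value type checking would follow as a separate step.
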

Formatter behavior:

```slo
(struct Point
  (x i32)
  (y i32))

(fn point_x () -> i32
  (. (Point (x 3) (y 4)) x))
```

The formatter must print each top-level `struct` as a multi-line top-level form. The opening line is `(struct Name`; each field declaration is printed on its own line indented two spaces; the closing `)` is placed on the final field line. Top-level blank-line and comment attachment rules match `fn` and `test`.

Constructor expressions with inline `i32` field values stay inline as `(Name (field value)...)`. Field access stays inline when its value expression is inline. If a future formatter must break a constructor over multiple lines, the constructor opening line is `(Name`, each `(field value)` pair is indented two spaces, and field order remains declaration order.

Diagnostics:

- `MalformedStructForm`: missing name, non-identifier name, missing fields, or invalid field shape. Span: whole `struct` form or offending operand. Expected: `(struct Name (field i32)...)`.
- `StructDeclarationNotTopLevel`: a `struct` form appears anywhere other than the top level. Span: offending `struct` form.
- `EmptyStructUnsupported`: a struct declares no fields. Span: whole `struct` form. Hint: declare at least one `i32` field.
- `DuplicateStruct`: duplicate struct name in the same module. Span: duplicate name, with a related span on the original struct name.
- `StructNameConflictsCallable`: struct name collides with a top-level function name or compiler intrinsic visible in constructor position. Span: struct name, with a related span when the colliding source name exists.
- `DuplicateStructField`: duplicate field name within one struct. Span: duplicate field name, with a related span on the original field name.
- `UnsupportedStructFieldType`: field type is not `i32`. Span: field type. Expected: `i32`. Found: written field type.
- `MalformedStructConstructor`: constructor operand shape is invalid. Span: whole constructor or offending field pair.
Expected: `(Name (field value)...)`.
- `UnknownStruct`: constructor name does not resolve to a struct. Span: constructor name.
- `DuplicateStructConstructorField`: constructor lists a field more than once. Span: duplicate constructor field, with a related span on the first use.
- `MissingStructField`: constructor omits a declared field. Span: whole constructor. Expected: missing field name.
- `UnknownStructField`: constructor or field access names a field not declared by the struct. Span: extra constructor field name or accessed field name.
- `StructConstructorFieldOrderMismatch`: constructor field order differs from declaration order. Span: first out-of-order constructor field. Expected: declared field name at that position. Found: written field name.
- `TypeMismatch`: constructor field value does not match its declared field type. Span: field value. Expected: declared field type. Found: checked value type.
- `MalformedFieldAccess`: field access is not `(. value field)`. Span: whole access form or offending operand.
- `FieldAccessOnNonStruct`: field access value does not check as a struct type. Span: value expression. Expected: struct value. Found: checked value type.
- `UnsupportedStructLocal`: a struct type is used as a local `let` or `var` type in the first pass. Span: local type.
- `UnsupportedStructSignatureType`: a struct type is used as a function parameter or return type in the first pass. Span: signature type.
- `UnsupportedStructFieldAccess`: field access uses a non-immediate struct constructor value in the first pass. Span: field access expression.
- `FieldMutationUnsupported`: an implementation that accepts a generalized assignment target must reject field mutation in the first pass. Span: attempted field assignment target.

---

## 13. Arrays and Checked Indexing

Status: strict first-pass compiler-supported syntax, promoted through `examples/supported/array.slo` and `examples/formatter/array.slo`.

Promotion boundary:

- The first pass supports only fixed-length arrays with `i32` elements.
- Array lengths must be positive integer literals. Zero-length arrays are not supported in this pass.
- Array constructors produce immutable temporary array values.
- Immutable `let` locals may store `(array i32 N)` values only when initialized directly from a matching array constructor.
- The first promoted fixture may index only an immediate array constructor or an immutable array local with a non-negative integer literal index known at compile time.
- Dynamic indices are not supported in this pass, so v0 does not yet define a runtime bounds trap or test-runner trap result for arrays.
- Mutable array locals, array parameters, array returns, array mutation, nested arrays, arrays of structs, slices, and unchecked indexing are follow-up work.

Fixed array type:

```slo
(array i32 N)
```

Type rules:

- `N` is a source integer literal in type position, not an expression.
- `N` must be greater than `0`.
- The only first-pass element type is `i32`.
- `(array i32 N)` is supported only as an immutable local declaration type in the first pass. Function parameter and return types remain unsupported.
- `(array T N)` with any `T` other than `i32` is rejected by the first pass.

Array constructor expression:

```slo
(array i32 value...)
```

Constructor rules:

- The first operand after `array` is the element type and must be `i32`.
- At least one value is required.
- Each value is checked in source order and must check exactly as `i32`.
- The constructor length is inferred from the value count.
- The constructor result type is `(array i32 count)`.
- In an expected-type context, a constructor of type `(array i32 count)` is valid for `(array i32 N)` only when `count == N`.
- Constructor value evaluation order is left to right.
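
The length rules above (a non-empty constructor, length inferred from the value count, and `count == N` in an expected-type context) can be sketched as follows. This is an illustrative Rust sketch under stated assumptions, not Glagol's checker: `ArrayDiagnostic`, `infer_array_length`, and `check_against_expected` are hypothetical names, and element type checking is left out. The two diagnostic names are the ones this section's diagnostics use for empty constructors and length mismatches.

```rust
use std::num::NonZeroUsize;

/// Length-related array diagnostics named in this section.
#[derive(Debug, PartialEq, Eq)]
enum ArrayDiagnostic {
    EmptyArrayUnsupported,
    ArrayLengthMismatch { expected: usize, found: usize },
}

/// Infers the constructor length from its value count.
/// At least one value is required, so zero values is rejected.
fn infer_array_length(value_count: usize) -> Result<NonZeroUsize, ArrayDiagnostic> {
    NonZeroUsize::new(value_count).ok_or(ArrayDiagnostic::EmptyArrayUnsupported)
}

/// Checks a constructor against an expected `(array i32 N)` type:
/// it is valid only when the inferred count equals `N`.
fn check_against_expected(
    value_count: usize,
    expected: NonZeroUsize,
) -> Result<NonZeroUsize, ArrayDiagnostic> {
    let count = infer_array_length(value_count)?;
    if count == expected {
        Ok(count)
    } else {
        Err(ArrayDiagnostic::ArrayLengthMismatch {
            expected: expected.get(),
            found: count.get(),
        })
    }
}
```

For `(let values (array i32 3) (array i32 4 5 6))` the constructor has three values and the expected length is `3`, so the check passes; dropping one value would report `ArrayLengthMismatch` with expected `3` and found `2`.
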

Checked index expression:

```slo
(index array-expr index-expr)
```

First-pass supported example:

```slo
(index (array i32 10 20 30) 1)
```

```slo
(let values (array i32 3) (array i32 4 5 6))
(index values 2)
```

Index rules:

- The form is exactly `(index array-expr index-expr)`.
- `array-expr` must check as `(array i32 N)`.
- For the first promotion target, `array-expr` must be an immediate array constructor or an immutable array local.
- `index-expr` must be a non-negative `i32` integer literal in the first pass.
- The literal index must satisfy `0 <= index < N`.
- The result type is `i32`.
- The array expression is evaluated once.
- Because accepted first-pass indices are compile-time literals, bounds checking happens during checking. A dynamic checked-index form must be rejected until runtime trap behavior is specified and tested.

Typed-core meaning:

```text
ArrayType {
  element: i32,
  length: NonZeroUsize,
  span,
  element_span,
  length_span
}

ArrayConstruct {
  element: i32,
  length: NonZeroUsize,
  values: [TExpr],
  type: Array(i32, length),
  span,
  element_span,
  value_spans
}

ArrayIndex {
  array: TExpr,
  index: usize,
  type: i32,
  span,
  array_span,
  index_span
}
```

Lowering:

- An array constructor lowers to an aggregate value or equivalent typed-core value with elements in source order.
- A first-pass index lowers to extracting the compile-time constant in-bounds element from that aggregate.
- An immutable array local lowers to implementation-owned storage sufficient for literal indexing inside the current function or test body.
- The backend must not generate unchecked dynamic array access for this contract. Dynamic indices remain unsupported until runtime checked-index trap behavior is specified.
- For the first promotion target, an implementation may lower immediate constructor indexing without materializing addressable array storage, and may lower immutable local indexing through local aggregate storage.
- No stable LLVM, C, FFI, or cross-module ABI layout is promised in v0.
- If a checked array form reaches a backend gap, Glagol must report `UnsupportedBackendFeature` instead of panicking.

Formatter behavior:

```slo
(fn second () -> i32
  (index (array i32 10 20 30) 1))

(fn local_sum () -> i32
  (let values (array i32 3) (array i32 4 5 6))
  (+ (index values 0) (index values 2)))
```

The formatter prints `(array i32 N)` types inline. It prints constructor expressions with inline `i32` values as `(array i32 value...)`. It prints first-pass index expressions inline as `(index array-expr index-expr)` when the array expression and index expression are inline. Function and test body indentation follows the existing one-body-form-per-line rule.

Diagnostics:

- `MalformedArrayType`: array type syntax is not `(array i32 N)`. Span: whole type form or offending operand. Expected: `(array i32 N)`.
- `UnsupportedArrayElementType`: array element type is not `i32`. Span: element type. Expected: `i32`. Found: written element type.
- `ZeroLengthArrayUnsupported`: array length is zero. Span: array type. Hint: use one or more `i32` elements.
- `MalformedArrayConstructor`: constructor syntax is not `(array i32 value...)`. Span: whole constructor or offending operand.
- `InvalidArrayElementType`: constructor element type syntax is invalid. Span: element type. Hint: first-pass arrays use `i32` elements.
- `EmptyArrayUnsupported`: constructor has no element values. Span: whole constructor. Hint: provide one or more `i32` values.
- `ArrayLengthMismatch`: constructor value count does not match the expected `(array i32 N)` length. Span: whole constructor. Expected: expected length. Found: value count.
- `TypeMismatch`: constructor element value does not check as `i32`. Span: element value. Expected: `i32`. Found: checked value type.
- `MalformedArrayIndex`: index syntax is not `(index array-expr index-expr)`. Span: whole index form or offending operand.
- `IndexOnNonArray`: `array-expr` does not check as an array type. Span: array expression. Expected: `(array i32 N)`. Found: checked value type.
- `ArrayIndexNotI32`: `index-expr` does not check as `i32`. Span: index expression. Expected: `i32`. Found: checked value type.
- `UnsupportedArrayIndexBase`: index uses an array value that is not an immediate constructor or immutable array local. Span: array expression. Hint: use `(index (array i32 ...) N)` or index a `let` array local.
- `DynamicArrayIndexUnsupported`: `index-expr` is not an integer literal. Span: index expression. Hint: first-pass checked indexing requires a literal index.
- `ArrayIndexOutOfBounds`: literal index is outside `0 <= index < N`. Span: index literal. Expected: valid index range. Found: literal index.
- `UnsupportedArrayLocalInitializer`: an array local is initialized from anything other than a direct array constructor. Span: initializer expression. Hint: use `(let values (array i32 N) (array i32 ...))`.
- `MutableArrayLocalUnsupported`: `(array i32 N)` is used as a mutable `var` local. Span: local type. Hint: declare array locals with `let`.
- `UnsupportedArraySignatureType`: `(array i32 N)` is used as a function parameter or return type in the first pass. Span: signature type.
- `UnsupportedArrayEquality`: an array value is compared with `=`. Span: equality expression.
- `UnsupportedArrayPrint`: an array value is passed to `print_i32`. Span: print argument.
- `ArrayMutationUnsupported`: an implementation that accepts a generalized assignment target must reject element mutation in the first pass. Span: attempted array element assignment target.

Slices remain a broader design target. No slice type, slice constructor, borrowed view, pointer decay, or slice indexing behavior is part of this first-pass array contract.

---

## 14. Option and Result

Status: strict first-pass compiler-supported syntax, promoted through `examples/supported/option-result.slo` and `examples/formatter/option-result.slo`.

Nullable absence uses `option`. Recoverable failure uses `result`.

Promotion boundary:

- The first pass supports only `(option i32)` and `(result i32 i32)`.
- Constructors are the only promoted option/result value operations.
- Constructors are supported only as the single final body expression of a function whose declared return type exactly matches the constructor result.
- Function parameters, local declarations, mutable storage, assignment, matching, unwrap, equality, printing, and calls that pass or receive option/result values are not part of this pass.
- Nested option/result types, non-`i32` payloads, strings, arrays, structs, slices, and pointers as payloads are not part of this pass.
- No stable LLVM, C, FFI, cross-module, or serialized ABI/layout promise is made for option/result values in v0.

First-pass option type:

```slo
(option i32)
```

First-pass result type:

```slo
(result i32 i32)
```

Constructor expressions:

```slo
(some i32 value)
(none i32)
(ok i32 i32 value)
(err i32 i32 value)
```

Type rules:

- `(option i32)` is supported only as a function return type under the direct constructor-return rule above.
- `(result i32 i32)` is supported only as a function return type under the direct constructor-return rule above.
- `(some i32 value)` checks `value` as `i32` and produces `(option i32)`.
- `(none i32)` produces `(option i32)` and has no payload expression.
- `(ok i32 i32 value)` checks `value` as `i32` and produces `(result i32 i32)`.
- `(err i32 i32 value)` checks `value` as `i32` and produces `(result i32 i32)`.
- Constructor type operands are type names in source, not expressions.
- Constructor payload expressions use the already-supported `i32` expression rules. The constructor itself may not be nested in another expression for this first pass.
- Pattern matching is not v0.
  There is no promoted operation that observes, destructures, compares, prints, or unwraps an option/result value.

Typed-core meaning:

```text
OptionType {
  payload: i32,
  span,
  payload_span
}

ResultType {
  ok: i32,
  err: i32,
  span,
  ok_span,
  err_span
}

OptionConstruct {
  variant: Some | None,
  payload: i32,
  value: Option<TExpr>,
  type: Option(i32),
  span,
  payload_span,
  value_span
}

ResultConstruct {
  variant: Ok | Err,
  ok: i32,
  err: i32,
  value: TExpr,
  type: Result(i32, i32),
  span,
  ok_span,
  err_span,
  value_span
}
```

Lowering and backend boundary:

- A first-pass option/result constructor lowers to a compiler-owned tagged value representation sufficient to return the value from the direct-return function.
- The backend may choose any internal representation that preserves the checked variant and `i32` payload for future typed-core use.
- The backend must not expose or rely on a stable option/result ABI, layout, discriminant value, padding, or cross-module representation in v0.
- No storage, mutation, comparison, print, unwrap, pattern-match, or parameter passing lowering is required or supported by this contract.
- If a checked option/result form reaches a backend gap, Glagol must report `UnsupportedBackendFeature` instead of panicking.

Formatter behavior:

```slo
(fn some_value ((value i32)) -> (option i32)
  (some i32 value))

(fn no_value () -> (option i32)
  (none i32))

(fn ok_value ((value i32)) -> (result i32 i32)
  (ok i32 i32 value))

(fn err_value ((code i32)) -> (result i32 i32)
  (err i32 i32 code))
```

The formatter prints `(option i32)` and `(result i32 i32)` return types inline. It prints first-pass constructors inline as `(some i32 value)`, `(none i32)`, `(ok i32 i32 value)`, and `(err i32 i32 value)` when the payload expression is inline. Function body indentation follows the existing one-body-form-per-line rule.

Diagnostics:

- `MalformedOptionType`: option type syntax is not `(option i32)`. Span: whole type form or offending operand. Expected: `(option i32)`.
- `UnsupportedOptionPayloadType`: option payload type is not `i32`. Span: payload type. Expected: `i32`. Found: written payload type.
- `MalformedResultType`: result type syntax is not `(result i32 i32)`. Span: whole type form or offending operand. Expected: `(result i32 i32)`.
- `UnsupportedResultPayloadType`: result ok or err type is not `i32`. Span: offending payload type. Expected: `i32`. Found: written payload type.
- `MalformedOptionConstructor`: constructor syntax is not `(some i32 value)` or `(none i32)`. Span: whole constructor or offending operand.
- `MalformedResultConstructor`: constructor syntax is not `(ok i32 i32 value)` or `(err i32 i32 value)`. Span: whole constructor or offending operand.
- `TypeMismatch`: constructor payload value does not check as `i32`. Span: payload value. Expected: `i32`. Found: checked value type.
- `UnsupportedOptionResultReturn`: an option/result return type is used by a function whose body is not exactly one matching direct constructor expression. Span: return type or final expression.
- `UnsupportedOptionResultSignatureType`: an option/result type is used as a function parameter type. Span: signature type.
- `UnsupportedOptionResultLocal`: an option/result type is used as a local declaration type. Span: local type.
- `UnsupportedOptionResultFlow`: an option/result constructor or value is used anywhere other than the supported direct-return position. Span: value expression.
- `UnsupportedOptionResultEquality`: an option/result value is compared with `=`. Span: equality expression.
- `UnsupportedOptionResultPrint`: an option/result value is passed to `print_i32`. Span: print argument.
- `UnsupportedOptionResultPatternMatch`: a match/destructure/unwrap form tries to observe an option/result value. Span: observing expression. Hint: pattern matching is not part of the v0 option/result constructor pass.

---

## 15. Unsafe

Status: strict lexical marker support, promoted through `examples/supported/unsafe.slo` and `examples/formatter/unsafe.slo`.

First-pass unsafe form:

```slo
(unsafe body-form... final-expression)
```

Promotion boundary:

- The first pass supports only the lexical `unsafe` expression block.
- The marker is lexically visible in source; there is no implicit unsafe context in v0.
- An unsafe block may contain the same sequential body forms already supported in normal function and top-level test bodies, followed by one final expression.
- The block returns the final expression's checked type.
- The block marks a lexical unsafe context for diagnostics only. It does not make raw memory operations supported.
- Raw allocation, deallocation, pointer loads or stores, pointer arithmetic, unchecked indexing, raw reinterpretation, and FFI calls remain unsupported.

Example using only supported safe forms:

```slo
(fn add_one_in_unsafe ((value i32)) -> i32
  (unsafe
    (let one i32 1)
    (+ value one)))
```

Surface rules:

- The form is exactly `(unsafe body-form... final-expression)`.
- The form must contain at least one expression after `unsafe`.
- `unsafe` is an expression form and may be used wherever its final expression's type is expected.
- Body forms before the final expression use the same first-pass sequential body-form rules as the surrounding function or top-level test body. Today that means local declarations, local assignment, supported unit-producing calls, and first-pass loops only where those forms are otherwise supported.
- Local declarations inside an unsafe block are scoped to that block. They are visible only after their declaration and only until the end of the block.
- Names declared inside an unsafe block must not shadow parameters, locals from an outer body, earlier locals in the same unsafe block, top-level functions, or compiler intrinsics under the existing v0 no-shadowing rules.
- A nested unsafe block is allowed only if its body forms and final expression satisfy this same lexical contract.

Typed-core meaning:

```text
UnsafeBlock {
  body: [TBodyForm],
  result: TExpr,
  type: T,
  span,
  body_spans,
  result_span
}
```

The checker carries an `in_unsafe` lexical flag while checking the block body and final expression. That flag is only used to choose diagnostics for unsafe operation heads; it does not widen the set of supported expression forms.

Unsafe operation heads requiring the lexical marker:

```text
alloc
dealloc
load
store
ptr_add
unchecked_index
reinterpret
ffi_call
```

These names are reserved as unsafe operation heads in expression position for v0. They do not resolve as ordinary user-defined calls in this pass.

Diagnostic boundary for those heads:

- Outside an unsafe block, any expression whose head is one of the names above must report `UnsafeRequired`. Span: whole operation form, with the head span as the primary operation name span. Hint: wrap the operation in an `unsafe` block.
- Inside an unsafe block, the same operation heads must report `UnsupportedUnsafeOperation`. Span: whole operation form, with the head span as the primary operation name span. Hint: raw memory operations are outside the v0 unsafe contract.

Lowering and backend boundary:

- A lexical unsafe block lowers like an ordinary expression block: lower each checked body form in source order, then lower the final expression and use its value as the block value.
- The lowered representation must preserve source order and local block scope.
- There is no backend lowering for raw memory operations in this pass.
- If a checked lexical unsafe block containing only supported safe forms reaches a backend gap, Glagol must report `UnsupportedBackendFeature` instead of panicking.

Formatter behavior:

```slo
(fn add_one_in_unsafe ((value i32)) -> i32
  (unsafe
    (let one i32 1)
    (+ value one)))
```

The formatter must print `unsafe` as a multi-line expression block.
The opening line contains only `(unsafe`. Each body form and the final expression are printed on their own lines, indented two spaces deeper than the `unsafe` form. The closing `)` is placed on the final expression line when that expression is inline. If an unsafe block appears inside another expression, the block may be broken over multiple lines using the same indentation rule.

Diagnostics:

- `MalformedUnsafeForm`: an `unsafe` form has no final expression or has an invalid operand shape. Span: whole `unsafe` form or the location where the first expression was expected. Expected: `(unsafe body-form... final-expression)`.
- `UnsafeRequired`: a raw unsafe operation head appears outside a lexical unsafe block. Span: whole operation form, with the head span as the operation name.
- `UnsupportedUnsafeOperation`: a raw unsafe operation head appears inside a lexical unsafe block. Span: whole operation form, with the head span as the operation name.
- Existing diagnostics for local declarations, assignments, loops, calls, comparisons, and type mismatches apply inside unsafe blocks.

Out of scope for v0:

- pointer allocation
- pointer load/store
- pointer arithmetic
- unchecked indexing behavior
- raw reinterpretation
- FFI calls
- pointer locals
- stable ABI/layout promises
- unsafe abstractions

---

## 16. Tests

Status: supported in the strict top-level form below.

Current supported top-level test form:

```slo
(test "name" body-form... final-expression)
```

Surface rules:

- `test` is a top-level declaration, sibling to `fn`; it is not an expression and is not valid inside a function body or another expression.
- The form has a string-literal name followed by one or more body expressions.
- The test name is metadata for the test runner, not a runtime `string` value.
- v0 test names must be non-empty printable ASCII without embedded quotes, backslashes, or newlines until the general string escaping contract is specified.
- Test names must be unique within a module after decoding.
- A single final expression is supported as the degenerate body.
- Non-final body forms may be local declarations or local assignments: `(let name i32 value)`, `(var name i32 value)`, or `(set name value)`.
- When `while` is promoted, it is also a valid non-final test body form under the first-pass loop contract in section 11.
- The final test expression has expected type `bool`; Slovo performs no implicit conversion to `bool`.
- Tests may refer to functions in the same module using the same name resolution rules as function bodies. Tests are not callable and do not introduce names into the value namespace.
- Normal compilation must parse, lower, and check tests, but must not run them. Test mode runs top-level tests in source order.

Typed-core meaning:

```text
TypedTest {
  name: TestName,
  body: [TBodyForm],
  final_expr: TExpr,
  span,
  name_span,
  body_spans
}
```

Top-level tests use the same sequential local-body rule as function bodies:

```slo
(test "local update"
  (let base i32 20)
  (var total i32 (+ base 1))
  (set total (+ total 1))
  (= total 22))
```

Local declarations and `set` forms may appear before the final test expression. The final expression remains the test result and must check as `bool`. This extension does not promote any broader block syntax.

The checker resolves and checks each test's final expression in a test context with expected type `bool`. A false result is a failed test result, not a compile-time diagnostic. Compile-time diagnostics are reserved for malformed test forms, invalid names, duplicate names, name-resolution failures, type errors, and unsupported implementation gaps.

Lowering and execution:

- `TypedTest` entries are kept in a module test list separate from normal functions.
- Tests do not affect `main`, exported functions, or the normal program ABI.
- In test-runner lowering, each `TypedTest` may lower to an internal zero-argument `bool`/LLVM `i1` thunk plus a registry entry containing the test name and source span.
- A test passes when its expression evaluates to true, fails when it evaluates to false, and errors if evaluation traps or the runner cannot execute it.
- In normal LLVM emission, an implementation may omit test thunks after parsing, lowering, and checking them. It must not silently ignore malformed or invalid tests.

Formatter behavior:

```slo
(test "add works"
  (= (add 2 3) 5))
```

The canonical formatter keeps the test name on the opening line. In the current supported subset, it prints each body form and the final expression on separate two-space-indented lines. Each expression itself uses the normal canonical expression formatter.

Diagnostics:

- `MalformedTestForm`: missing name, non-string name, missing final expression, wrong operand count, or invalid body form. Span: whole test form or offending operand. Expected: `(test "name" body... final-expression)`.
- `InvalidTestName`: empty name or a name outside the v0 name subset. Span: string literal name.
- `DuplicateTestName`: duplicate decoded name in the same module. Span: duplicate name, with a related span on the original test name.
- `TestExpressionNotBool`: checked expression type is not `bool`. Span: test expression. Expected: `bool`. Found: checked expression type.
- Existing expression diagnostics such as `UnknownVariable`, `ArityMismatch`, `TypeMismatch`, and `UnsupportedBackendFeature` apply inside test expressions.

Supported example:

```slo
(test "add works"
  (= (add 2 3) 5))
```

Supported local-body example:

```slo
(test "locals work"
  (let base i32 2)
  (var value i32 (add_local base))
  (set value (+ value 1))
  (= value 5))
```

---

## 17. Diagnostics

Diagnostics should have both human-readable and machine-readable forms.

Status: Glagol currently has diagnostics with byte spans and human/machine rendering.
Slovo v0 requires both byte spans and line/column ranges for user-facing and machine-readable diagnostics. Exact diagnostic snapshots remain an implementation/test contract, but implementations must not report only unstructured text or only byte offsets.

Machine-readable diagnostic example:

```slo
(error
  (code TypeMismatch)
  (expected i32)
  (found string)
  (message "expected i32, found string")
  (span "main.slo" (bytes 42 49) (range 12 8 12 15))
  (hint "Use an integer value or convert explicitly."))
```

Required diagnostic fields:

- code
- primary source span as a zero-based, half-open byte range
- primary source range as one-based start line, start column, end line, and end column
- expected, when applicable
- found, when applicable
- message
- hint, when useful and safe

Byte spans are the canonical location for tools. Line/column ranges are required for human-readable output and machine diagnostics so editors and external tools can display errors without reimplementing Slovo's source mapper.

Line and column values are derived from the original source file, not formatter output. Columns are one-based byte columns within a UTF-8 source line; a tab in source counts as one input byte for the machine range, even though canonical formatting uses spaces.

Checked forms that reach a backend feature gap must return a structured `UnsupportedBackendFeature` diagnostic instead of panicking. Unsupported signature types, string literals without runtime lowering, and other speculative forms must remain out of `examples/supported/` until they have backend behavior or explicit diagnostics plus tests.

Test-specific diagnostics introduced by the top-level `test` contract are `MalformedTestForm`, `InvalidTestName`, `DuplicateTestName`, and `TestExpressionNotBool`. They follow the same required fields above and must use original source spans, not formatter output. `DuplicateTestName` includes a related span pointing to the original test name.
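
The span and range rules in this section can be illustrated with a short routine that derives the one-based line/column range from a zero-based byte span. This is a minimal Rust sketch under stated assumptions, not Glagol's source mapper: `line_col` and `render_span` are hypothetical names, only `\n` is treated as a line terminator (the text above does not say how `\r\n` is handled), and the end position is computed from the exclusive end offset, which agrees with the example above where `49 - 42` equals `15 - 8`.

```rust
/// Returns the one-based line and one-based byte column of a zero-based
/// byte offset in the original source.
/// Assumption: only `\n` ends a line. Panics if `offset` exceeds the
/// source length.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let prefix = &source.as_bytes()[..offset];
    // Each newline before the offset starts a new line.
    let line = 1 + prefix.iter().filter(|&&b| b == b'\n').count();
    // The current line starts just after the last newline, or at byte 0.
    let line_start = prefix
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    // Columns count bytes, so a tab advances the column by exactly one.
    (line, offset - line_start + 1)
}

/// Renders the `(span ...)` field for a half-open byte span `start..end`,
/// following the shape of the machine-readable example above.
fn render_span(file: &str, source: &str, start: usize, end: usize) -> String {
    let (start_line, start_col) = line_col(source, start);
    let (end_line, end_col) = line_col(source, end);
    format!(
        "(span \"{}\" (bytes {} {}) (range {} {} {} {}))",
        file, start, end, start_line, start_col, end_line, end_col
    )
}
```

For a source whose second line begins with a tab followed by a three-byte token at bytes 17 to 20, this sketch reports columns 2 through 5 on line 2, because the tab contributes a single byte column.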

Loop-specific diagnostics introduced by the first-pass `while` contract are `MalformedWhileForm`, `WhileConditionNotBool`, `EmptyWhileBody`, `LocalDeclarationInWhileBodyUnsupported`, `NestedWhileUnsupported`, and `WhileBodyFormNotUnit`. They follow the same required fields above and must use original source spans, not formatter output.

Struct-specific diagnostics introduced by the first-pass `struct` contract are listed in section 12. They follow the same required fields above, must use original source spans, and must keep related spans for duplicate names or fields when applicable.

Array-specific diagnostics introduced by the first-pass array/indexing contract are listed in section 13. They follow the same required fields above and must use original source spans, including byte ranges for malformed array types, constructors, literal indices, and unsupported array value-flow forms.

Option/result-specific diagnostics introduced by the first-pass constructor contract are listed in section 14. They follow the same required fields above and must use original source spans, including byte ranges for malformed option/result types, constructors, and unsupported option/result value-flow forms.

---

## 18. LLVM Lowering Sketch

Slovo v0 should lower through an internal typed core before LLVM IR.

Example Slovo:

```slo
(fn add ((a i32) (b i32)) -> i32
  (+ a b))
```

Approximate LLVM IR shape:

```llvm
define i32 @add(i32 %a, i32 %b) {
entry:
  %0 = add i32 %a, %b
  ret i32 %0
}
```

---

## 19. v0 Implementation Milestones

1. Parse forms
2. Print parsed tree
3. Canonical formatter
4. Basic name resolution
5. Primitive type checker
6. Function checker
7. `if` checker
8. `let` / `var` / `set`
9. Struct definitions
10. Top-level `test` checker and test runner
11. Structured diagnostics
12. Lower simple functions to LLVM IR
13. Compile and run `examples/supported/add.slo`
14. Preserve the `glagol` binary CLI contract for supported v0 modes, stderr diagnostics, and exit codes
15. Keep native executable output as an explicit LLVM-plus-runtime linking step until a later backend contract