slovo/docs/language/SPEC-v0.md
2026-05-22 08:38:43 +02:00

Slovo v0 Specification

This is not a complete language specification.

This is the supported v0 contract for the first Slovo parser, formatter, checker, diagnostics, test runner, and LLVM-oriented compiler prototype.


1. Status

Version: v0

File extension: .slo

Primary backend target: LLVM IR

Primary design goals:

  • parse Slovo forms
  • format Slovo forms
  • type-check a small core
  • run tests
  • lower simple programs toward LLVM IR
  • produce structured diagnostics

Compiler-supported fixtures for the frozen v0 contract:

examples/supported/add.slo
examples/supported/top-level-test.slo
examples/supported/local-variables.slo
examples/supported/if.slo
examples/supported/while.slo
examples/supported/struct.slo
examples/supported/array.slo
examples/supported/option-result.slo
examples/supported/unsafe.slo

For strict-manifest iteration1, "compiler-supported" means the form is parsed, lowered, type-checked, emitted as LLVM IR or handled by its required tool mode, and covered by an automated Glagol test.

This file uses "current" only inside the v0 boundary. Post-v0 promotions live in SPEC-v1.md and may expand the repository's supported surface without changing the frozen v0 contract.

Supported v0 subset, as exercised by the v0 fixtures:

  • one (module name) form
  • top-level (fn ...) forms
  • top-level (struct Name (field i32)...) forms
  • top-level (test "name" body... final-expression) forms, where body forms are optional local declarations, assignments, or first-pass loops
  • explicitly typed i32 function parameters and i32 return values
  • direct constructor-return functions with (option i32) and (result i32 i32) return types under section 14
  • integer literals used as i32 values
  • function parameter references
  • binary integer addition with +
  • equality comparison = as a bool-producing test expression
  • ordering comparison < as a bool-producing i32 condition
  • user-defined function calls
  • the temporary compiler intrinsic (print_i32 value) call as a statement-like expression returning builtin unit
  • local i32 bindings with (let name i32 value) and (var name i32 value)
  • assignment to mutable local i32 bindings with (set name value)
  • local references after declaration
  • value-producing (if condition then-expression else-expression)
  • first-pass (while condition body...) as a non-final sequential body form
  • first-pass struct constructor expressions and immediate field access
  • first-pass fixed i32 array constructors, immutable array locals, and literal checked indexing
  • first-pass i32 option/result constructors used only as direct function returns
  • lexical (unsafe body... final-expression) expression blocks whose body forms and final expression are otherwise supported safe v0 forms
  • final-expression function returns

Implementation-recognized forms are not automatically supported language features. A form remains a design target until it has parser/lowerer behavior, checker behavior, backend behavior or explicit unsupported diagnostics, formatter behavior where applicable, and tests.

All examples under examples/speculative/ are v0 design targets, not current compiler-supported fixtures.

Formatter fixtures under examples/formatter/ are canonical-layout fixtures for the same strict supported syntax. They are not new language features and must not include design-target forms.

Top-level (test "name" body... final-expression) is supported only in the strict form defined in section 16.

Non-goals for v0:

  • macros
  • generics
  • concurrency
  • advanced ownership
  • package manager
  • full standard library
  • direct x86/ARM backend

2. Source Files

A Slovo source file uses the .slo extension.

A source file contains one top-level module form.

(module main)

A practical file usually contains:

(module main)

(fn main () -> i32
  0)

3. Lexical Syntax

Whitespace separates tokens but has no semantic meaning.

Tabs and spaces are equivalent.

Line comments begin with ;.

; this is a comment
(+ 1 2)

A form is a parenthesized tree:

(name arg arg arg)
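The lexical rules in this section (whitespace as a separator, ; line comments, parenthesized forms, double-quoted strings) can be illustrated with a small reader. This is a minimal Python sketch, not Glagol's parser: the names tokenize and read_forms are hypothetical, atoms are kept as raw strings, and string escapes are ignored because this section does not define them.

```python
def tokenize(source):
    """Split source text into '(', ')', string-literal, and atom tokens."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # tabs, spaces, and newlines only separate tokens
        elif ch == ";":
            # Line comment: skip to the end of the line.
            while i < len(source) and source[i] != "\n":
                i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            # Assumption: no escape sequences inside string literals.
            end = source.find('"', i + 1)
            if end == -1:
                raise SyntaxError("unterminated string literal")
            tokens.append(source[i:end + 1])
            i = end + 1
        else:
            start = i
            while (i < len(source) and not source[i].isspace()
                   and source[i] not in '();"'):
                i += 1
            tokens.append(source[start:i])
    return tokens


def read_forms(source):
    """Parse every top-level form into nested lists of token strings."""
    tokens = tokenize(source)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing closing parenthesis")
            pos += 1  # consume ')'
            return items
        if token == ")":
            raise SyntaxError("unexpected closing parenthesis")
        return token

    forms = []
    while pos < len(tokens):
        forms.append(read())
    return forms
```

Under these assumptions, read_forms applied to the comment example above yields a single form with three atoms, and tabs or repeated spaces between tokens make no difference to the result.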

Identifiers may contain letters, digits, _, -, ?, and !. Selected symbolic operator names are also accepted as identifiers.

String literals use double quotes.

"Hello, Slovo"

Integer literals:

0
42
-7

Current compiler-supported integer literals are signed i32 values in the inclusive range -2147483648 to 2147483647. Glagol must reject literals outside that range with IntegerOutOfRange before narrowing them into the AST.
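The range rule can be sketched as a check that runs on the literal's text before any fixed-width value is produced. This is an illustrative Python sketch, not Glagol's code: check_i32_literal is a hypothetical helper, the exception class only borrows the diagnostic name, and the input is assumed to be a token that already matched the decimal integer literal syntax.

```python
I32_MIN = -2147483648
I32_MAX = 2147483647


class IntegerOutOfRange(Exception):
    """Stands in for the IntegerOutOfRange diagnostic."""


def check_i32_literal(text):
    # Python integers are unbounded, so the comparison happens on the
    # exact written value; nothing is narrowed before the range check.
    value = int(text)
    if not (I32_MIN <= value <= I32_MAX):
        raise IntegerOutOfRange(text)
    return value
```

Both bounds are inclusive, so "2147483647" and "-2147483648" are accepted while "2147483648" is rejected.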

Floating-point literals:

3.14
-0.5

Floating-point literals are a design target, not current compiler-supported syntax.


4. Top-Level Forms

Current compiler-supported top-level forms:

module
fn
test
struct

Design-target top-level forms:

import

test is supported only as the top-level form described in section 16. struct is supported only as the first-pass top-level form described in section 12. import remains a broader design target.


5. Types

Current compiler-supported scalar type:

i32

First-pass struct declarations introduce nominal struct types as described in section 12. Those types are supported only as constructor temporaries consumed by immediate field access; they are not currently supported as function parameter types, function return types, or local declaration types.

First-pass fixed arrays are specified in section 13. (array i32 N) is currently supported as an immutable local declaration type only when the local is initialized directly from an (array i32 value...) constructor with matching length. It is not currently supported as a function parameter type or function return type.

First-pass option/result constructors are specified in section 14. (option i32) and (result i32 i32) are currently supported as function return types only for functions whose single body expression is the matching direct constructor. They are not currently supported as function parameter types or local declaration types.

print_i32 status: print_i32 is a temporary compiler intrinsic in v0. It is not yet specified as a user-defined function, standard-library binding, runtime binding, import, or foreign function. Programs may use (print_i32 value) only as part of the current strict compiler subset described in section 1.

unit status: unit is a valid internal/builtin result type for unit-like expressions such as print_i32 and the supported (set name value) form. It is not supported as a user-declared function return type until Glagol supports parsing, checking, lowering, formatting, diagnostics, backend behavior, and tests for user-declared unit returns.

Design-target primitive types:

bool
i8 i16 i64
u8 u16 u32 u64
f32 f64
char
string
unit
never

Broader design-target compound types:

(ptr T)
(array T N)
(slice T)
(option T)
(result T E)

The first-pass (array i32 N), (option i32), and (result i32 i32) cases above are the only promoted compound-type exceptions. Generic, nested, and non-i32 compound-type payloads remain design targets.


6. Functions

Function form:

(fn name ((arg Type) ...) -> ReturnType
  body...)

Example:

(fn add ((a i32) (b i32)) -> i32
  (+ a b))

v0 rules:

  • parameters are explicitly typed
  • return type is explicit
  • body contains one or more expressions
  • the last expression is the return value
  • all control paths must produce the declared return type
  • currently supported fixtures use i32 returns, except for the direct option/result constructor-return functions described in section 14

7. Variables

Status: supported for explicit i32 local bindings in function bodies and top-level test bodies.

Immutable binding:

(let name type value)

Mutable binding:

(var name type value)
(set name value)

Supported implementation scope:

  • Explicit local types are required in v0.
  • i32 locals are supported with let and var; immutable (array i32 N) locals are supported only under the first-pass array contract in section 13.
  • Local declarations are allowed only inside function bodies, top-level test bodies, and promoted lexical unsafe blocks contained by those bodies.
  • Local declarations and set forms are sequential body forms before the final result expression.
  • A function or test body must still have a final result expression.

Binding rules:

  • (let name type value) creates a new immutable local binding initialized from value.
  • (var name type value) creates a new mutable local binding initialized from value.
  • The initializer is checked before the new local is introduced, so a binding may not refer to itself.
  • The initializer expression must exactly match the explicit local type.
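  • The explicit local type must be i32, except that an immutable let local may have type (array i32 N) under the first-pass array contract in section 13.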
  • Local bindings are visible only after their declaration and only within the containing function body, test body, or lexical unsafe block. Bindings from an outer function or test body remain visible inside a nested unsafe block.
  • Lexical unsafe blocks introduce the only current nested local scope. if branches and first-pass while bodies do not introduce local declaration scopes.

Shadowing and redeclaration:

  • v0 forbids shadowing in local bodies.
  • A local binding name must not match any parameter name in the same function.
  • A local binding name must not match any earlier local binding in the same function or test body.
  • A local binding name must not match a top-level function name or compiler intrinsic visible in the same module.
  • Parameters are immutable for v0 and are not valid set targets.
  • Top-level function names are not local bindings. A local reference resolves to a parameter or local binding; a call resolves to a callable top-level function or compiler intrinsic unless a later namespace rule says otherwise.

Assignment:

  • (set name value) assigns an existing mutable local binding.
  • set produces builtin unit.
  • set may appear only as a non-final sequential body form, because supported function/test results are not unit.
  • The assigned expression must exactly match the mutable local's type.
  • Assigning an unknown name is an UnknownVariable diagnostic.
  • Assigning a parameter is a CannotAssignParameter diagnostic.
  • Assigning an immutable let binding is a CannotAssignImmutableLocal diagnostic.
  • Assigning a value with the wrong type is a TypeMismatch diagnostic.
  • Duplicating a local name is a DuplicateLocal diagnostic.
  • Redeclaring a parameter as a local is a LocalRedeclaresParameter diagnostic.
  • Colliding with a visible function or compiler intrinsic is a LocalShadowsCallable diagnostic.
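  • Declaring a local with an unsupported type is an UnsupportedLocalType diagnostic, unless a more specific diagnostic such as UnsupportedStructLocal (section 12) applies. The immutable (array i32 N) let locals allowed by section 13 are not rejected.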
  • Using a local declaration in an inline expression is a LocalDeclarationNotAllowed diagnostic.
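The binding, redeclaration, and assignment rules above can be sketched as a small scope tracker for i32 locals. This is a hedged Python illustration, not Glagol's checker: LocalScope and its methods are hypothetical, each method returns a diagnostic name or None, the order in which competing diagnostics are tried is an assumption, and array locals, initializer type checking, and nested unsafe scopes are left out. A caller is assumed to check the initializer before calling declare, so a binding cannot refer to itself.

```python
class LocalScope:
    """Tracks parameters, visible callables, and i32 locals of one body."""

    def __init__(self, params, callables):
        self.params = set(params)
        self.callables = set(callables)  # top-level functions and intrinsics
        self.locals = {}  # name -> True for var (mutable), False for let

    def declare(self, name, type_name, mutable):
        """Return a diagnostic name, or None when the binding is accepted."""
        if type_name != "i32":
            return "UnsupportedLocalType"
        if name in self.params:
            return "LocalRedeclaresParameter"
        if name in self.locals:
            return "DuplicateLocal"
        if name in self.callables:
            return "LocalShadowsCallable"
        self.locals[name] = mutable
        return None

    def assign(self, name):
        """Return a diagnostic name, or None when (set name ...) is allowed."""
        if name in self.params:
            return "CannotAssignParameter"
        if name not in self.locals:
            return "UnknownVariable"
        if not self.locals[name]:
            return "CannotAssignImmutableLocal"
        return None
```

For a function with parameters a and b, declaring sum with let and doubled with var succeeds; a later set on doubled is accepted, while a set on sum, on a, or on an undeclared name produces the corresponding diagnostic.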

Typed-core meaning:

LocalLet {
  name,
  type: i32,
  init: TExpr<i32>,
  mutable: false,
  span,
  name_span,
  type_span,
  init_span
}

LocalVar {
  name,
  type: i32,
  init: TExpr<i32>,
  mutable: true,
  span,
  name_span,
  type_span,
  init_span
}

SetLocal {
  target: LocalId,
  value: TExpr<i32>,
  type: unit,
  span,
  name_span,
  value_span
}

Lowering:

  • let and var lower to typed local storage or SSA values at the implementation's discretion, provided source order, mutability checks, and diagnostics are preserved.
  • set lowers to an assignment/store to the resolved mutable local.
  • A local that is never read may be accepted in v0; unused-local warnings are outside the v0 required diagnostics.

Formatter behavior:

(fn add_then_double ((a i32) (b i32)) -> i32
  (let sum i32 (+ a b))
  (var doubled i32 (+ sum sum))
  (set doubled (+ doubled 1))
  doubled)

The formatter must print each local declaration and set as one body form indented by two spaces. Simple let, var, and set forms stay inline when their initializer or assigned expression is an inline expression under the current expression formatter. The final body expression follows the same two-space indentation and may refer to earlier locals.


8. Expressions

Slovo v0 is expression-oriented.

Most forms produce a value.

Current compiler-supported expressions are the expression forms required by the supported fixtures: integer literals, parameter references, local references, local declarations, local assignment, binary +, user function calls, temporary intrinsic print_i32 calls, equality comparison in tests, ordering comparison in loop conditions, value-producing if, first-pass while, struct construction with immediate field access, fixed i32 array construction, literal array indexing, first-pass option/result constructors, lexical unsafe blocks containing otherwise supported safe body forms, and final-expression returns.

Struct construction and immediate field access are supported under the strict first-pass contract in section 12. Array construction, immutable array locals, and literal checked indexing are supported under the strict first-pass contract in section 13. Option/result constructors are supported under the strict first-pass direct-return contract in section 14. Lexical unsafe blocks are supported under the strict first-pass contract in section 15.

Arithmetic:

(+ a b)
(- a b)
(* a b)
(/ a b)

Comparison:

(= a b)
(< a b)
(> a b)
(<= a b)
(>= a b)

v0 rules:

  • operands must have compatible exact types
  • numeric casts are explicit
  • comparisons return bool

Status note: only +, =, and < are currently supported (see section 1). The other arithmetic and comparison forms listed above are design targets until covered by the strict support rule in section 1.

The top-level test contract in section 16 introduces an expected-bool context, but it does not promote any expression form by itself. A promoted test fixture must use only bool-producing expressions that also satisfy the strict support rule, such as equality comparison once it is accepted as part of a promoted fixture.


9. Canonical Formatter

Status: current strict supported syntax only.

The formatter is part of the Slovo language contract. For the current supported subset, canonical output follows these rules:

  • use LF line endings and end every file with one trailing newline
  • remove trailing whitespace
  • use spaces, not tabs
  • put top-level forms at column 1
  • keep (module name) on one line
  • separate top-level forms with exactly one blank line
  • print each supported top-level fn as a multi-line form
  • print the function header on one line as (fn name ((arg Type) ...) -> ReturnType
  • print an empty parameter list as ()
  • keep supported parameter lists inline as ((name i32) (name i32))
  • indent each function body expression by two spaces
  • print one supported body expression per line
  • keep supported expression forms inline: integer literals, parameter references, local references, local declarations, set, binary +, user-defined function calls, (print_i32 value), and first-pass option/result constructors
  • keep nested supported calls inline, for example (print_i32 (add 20 22))
  • place the closing ) for a function on the final body-expression line when that expression is inline
  • preserve full-line comments attached to the nearest following form, without reflowing comment text

Canonical current fixture:

(module main)

(fn add ((a i32) (b i32)) -> i32
  (+ a b))

(fn main () -> i32
  (print_i32 (add 20 22))
  0)
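A few of the layout rules above (one-line module form, one-line function header, two-space body indentation, closing parenthesis on the final body line, one blank line between top-level forms, one trailing newline) can be sketched in Python. This is an illustration only: forms are assumed to be nested lists of strings, the function names are hypothetical, and comments, test, struct, if, and while layout are not handled.

```python
def format_inline(expr):
    """Print an expression on a single line."""
    if isinstance(expr, list):
        return "(" + " ".join(format_inline(item) for item in expr) + ")"
    return expr


def format_fn(form):
    # Assumed shape: ["fn", name, params, "->", return_type, body...]
    _, name, params, arrow, return_type, *body = form
    header = f"(fn {name} {format_inline(params)} {arrow} {return_type}"
    lines = [header] + ["  " + format_inline(expr) for expr in body]
    # The closing ")" lands on the final body-expression line.
    return "\n".join(lines) + ")"


def format_module(forms):
    parts = []
    for form in forms:
        if form[0] == "fn":
            parts.append(format_fn(form))
        else:
            parts.append(format_inline(form))  # (module name) on one line
    # Exactly one blank line between top-level forms, one trailing newline.
    return "\n\n".join(parts) + "\n"
```

Given the canonical fixture as nested lists, format_module reproduces the text shown above, including () for the empty parameter list of main.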

Formatter rules for let, var, and set, the if formatter contract, the first-pass while formatter contract, and the first-pass formatter rules for struct are all part of the current formatter fixture set. Three promoted subsets have their formatter contract specified in a later section, each with a canonical formatter fixture:

  • section 13: first-pass array/index formatter contract, examples/formatter/array.slo
  • section 14: first-pass option/result constructor formatter contract, examples/formatter/option-result.slo
  • section 15: lexical unsafe formatter contract, examples/formatter/unsafe.slo

Formatter rules for strings, slices, broader option/result value flow, pointer types, raw memory operations, and general user-declared unit returns remain design targets.

Top-level test formatter contract:

(test "add works"
  (= (add 2 3) 5))

The formatter must print each top-level test as a multi-line top-level form, keep the string name on the opening line, indent the single test expression by two spaces, and place the closing ) on the expression line when that expression is inline. Test forms follow the same top-level blank-line and comment attachment rules as fn.


10. Conditionals

Status: compiler-supported value expression through examples/supported/if.slo under the strict support rule in section 1.

Conditional form:

(if condition then-expression else-expression)

v0 rules:

  • condition must be bool
  • both branches must have the same type
  • if returns that type
  • if is an expression form and may be used where its result type is expected.
  • Branches are expressions, not declaration scopes. A local declaration inside an if branch remains unsupported in this pass.
  • The formatter prints if as a multi-line expression:
(if (< value 3)
  10
  20)

Diagnostics:

  • MalformedIfForm: missing condition, then expression, else expression, or an invalid operand shape. Span: whole if form. Expected: (if condition then-expression else-expression).
  • IfConditionNotBool: checked condition type is not bool. Span: condition expression. Expected: bool. Found: checked condition type.
  • IfBranchTypeMismatch: then and else branches do not have the same checked type. Span: whole if form. Expected: then-branch type. Found: else-branch type.
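The three if diagnostics can be read as a short decision sequence over the already-checked operand types. The following Python sketch is illustrative only: check_if is a hypothetical helper, types are plain strings, and the order in which the checks run is an assumption.

```python
def check_if(operand_types):
    """Return ("ok", result_type) or (diagnostic_name, details).

    operand_types holds the checked types of the operands that follow
    `if`, in source order.
    """
    if len(operand_types) != 3:
        return ("MalformedIfForm", None)
    condition, then_type, else_type = operand_types
    if condition != "bool":
        return ("IfConditionNotBool",
                {"expected": "bool", "found": condition})
    if then_type != else_type:
        return ("IfBranchTypeMismatch",
                {"expected": then_type, "found": else_type})
    return ("ok", then_type)
```

With a bool condition and two i32 branches the form checks as i32; a missing else branch, an i32 condition, or branches of different types each map to one of the diagnostics listed above.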

11. Loops

Status: compiler-supported first-pass contract through examples/supported/while.slo under the strict support rule in section 1.

First-pass v0 loop form:

(while condition
  body-form...)

Surface placement:

  • (while condition body-form...) is a sequential body form.
  • It is allowed only inside function bodies and top-level test bodies.
  • In the first supported implementation, it may appear only before the final result expression of supported i32 functions and bool tests.
  • It is not valid as an inline expression argument.
  • It is not valid as the final result expression of a supported i32 function or bool test, because it returns builtin unit.

Checking rules:

  • The condition is checked in the current local environment and must have type bool.
  • The condition is re-evaluated before each iteration.
  • A false initial condition executes zero body iterations.
  • The body must contain one or more body forms.
  • while returns builtin unit.
  • The first implementation does not introduce nested local-scope semantics. Local declarations inside a while body are not supported.
  • Supported first-pass loop body forms are limited to assignments to existing mutable locals with (set name value) and supported unit-producing calls such as (print_i32 value).
  • Loop body forms must each check as unit.
  • Existing locals visible before the while are visible in the condition and body. Assignments in the body update the existing mutable local.
  • break and continue are not part of this pass.
  • Nested while is not part of the first-pass support target.

Typed-core meaning:

While {
  condition: TExpr<bool>,
  body: [TBodyForm<unit>],
  type: unit,
  span,
  condition_span,
  body_spans
}

Lowering:

  • A while lowers to a condition block, a body block, and an exit block.
  • Control enters the condition block first.
  • If the condition evaluates to true, control enters the body block; after the body completes, control branches back to the condition block.
  • If the condition evaluates to false, control branches to the exit block.
  • The exit continuation has builtin unit as the loop result and then continues with the next sequential body form.
  • Locals assigned in the body must preserve source-order mutation semantics. The implementation may use storage slots or equivalent SSA construction, but it must not treat the loop body as a new declaration scope in this first pass.
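The three-block shape described above can be made concrete with a toy control-flow graph and a loop that walks it. This Python sketch is not LLVM IR and not Glagol's lowering: lower_while, run, and sum_to are hypothetical names, the condition and body forms are plain Python callables standing in for checked typed-core forms, and locals live in one shared dictionary, which corresponds to the "storage slots" option.

```python
def lower_while(condition, body_forms):
    """Build the condition/body/exit block structure for one while."""
    return {
        "entry": "cond",  # control enters the condition block first
        "cond": {"test": condition, "if_true": "body", "if_false": "exit"},
        "body": {"forms": body_forms, "next": "cond"},
    }


def run(cfg, env):
    """Walk the blocks until control reaches the exit block."""
    block = cfg["entry"]
    while block != "exit":
        if block == "cond":
            node = cfg["cond"]
            block = node["if_true"] if node["test"](env) else node["if_false"]
        else:
            for form in cfg["body"]["forms"]:
                form(env)  # each body form is unit-producing
            block = cfg["body"]["next"]  # branch back to the condition
    return None  # stands in for builtin unit


def sum_to(n):
    # Locals are declared before the loop; the body only assigns them.
    env = {"n": n, "i": 0, "sum": 0}

    def add_i_to_sum(e):
        e["sum"] = e["sum"] + e["i"]

    def increment_i(e):
        e["i"] = e["i"] + 1

    cfg = lower_while(lambda e: e["i"] < e["n"], [add_i_to_sum, increment_i])
    run(cfg, env)
    return env["sum"]
```

Because the condition block is entered first, sum_to(0) runs zero body iterations and returns 0, while sum_to(4) adds 0, 1, 2, and 3 and returns 6.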

Formatter behavior:

(fn sum_to ((n i32)) -> i32
  (var i i32 0)
  (var sum i32 0)
  (while (< i n)
    (set sum (+ sum i))
    (set i (+ i 1)))
  sum)

The formatter must print while as a multi-line body form. The opening line contains (while and the formatted condition. Each loop body form is printed on its own line, indented two spaces deeper than the while form. The closing ) is placed on the final body-form line when that body form is inline. The following function or test body form resumes at the original body indentation.

Diagnostics:

  • MalformedWhileForm: missing condition or an invalid operand shape. Span: whole while form or offending operand. Expected: (while condition body...).
  • WhileConditionNotBool: checked condition type is not bool. Span: condition expression. Expected: bool. Found: checked condition type.
  • EmptyWhileBody: a while has no body forms after the condition. Span: whole while form or the location where the first body form was expected.
  • LocalDeclarationInWhileBodyUnsupported: a let or var appears directly in a first-pass while body. Span: offending local declaration. Hint: declare the local before the loop and update an existing mutable local with set.
  • NestedWhileUnsupported: a nested while appears directly in a first-pass while body. Span: nested while. Hint: keep first-pass loop bodies flat.
  • WhileBodyFormNotUnit: a loop body form does not produce unit. Span: offending body form. Expected: unit. Found: checked body-form type.
  • Existing diagnostics such as UnknownVariable, CannotAssignParameter, CannotAssignImmutableLocal, TypeMismatch, ArityMismatch, and UnsupportedBackendFeature apply inside while conditions and bodies.
  • Using a while as the final expression of an i32 function must report ReturnTypeMismatch with expected i32 and found unit.
  • Using a while as the final expression of a top-level test must report TestExpressionNotBool with expected bool and found unit.

Minimal comparison dependency:

The first promoted fixture for while uses < to produce the loop condition. Promoting while does not by itself promote the broader comparison family or unrelated arithmetic. The supported comparison surface for this pass is the exact i32, i32 -> bool < form exercised by the fixture, with parser/lowerer support, checker behavior, LLVM behavior, formatter behavior, and tests.


12. Structs

Status: compiler-supported first-pass contract through examples/supported/struct.slo under the strict support rule in section 1.

Promotion boundary:

  • The supported fixture may use only top-level i32 struct declarations, constructor expressions, and field access returning i32.
  • Local struct storage is a follow-up. The first pass does not support (let p Point ...), (var p Point ...), assigning a struct value, passing a struct as a parameter, or returning a struct from a function.
  • Methods, generics, field mutation, nested struct fields, recursive structs, struct params/returns, layout reflection, cross-module structs, and ABI promises are explicitly outside the first pass.

Struct definition:

(struct Point
  (x i32)
  (y i32))

Struct declarations are top-level forms, sibling to fn and test.

Surface rules:

  • The form is (struct Name (field i32)...).
  • Name is a normal identifier.
  • Each field is a normal identifier.
  • A struct must declare one or more fields.
  • First-pass fields must be exactly i32.
  • Field order is source order and is part of the v0 typed-core contract.
  • Struct names are nominal. Two structs with the same fields are not the same type.
  • A struct name must be unique in the module and must not collide with a top-level function name or compiler intrinsic visible in constructor expression position.
  • Field names must be unique within the struct.
  • A struct declaration is not an expression and produces no runtime value.

Constructor expression:

(Point (x 3) (y 4))

Constructor rules:

  • The callee position must be the declared struct name.
  • Each constructor field is written as (field value), not as a keyword token.
  • Constructor fields must list every declared field exactly once.
  • Constructor fields must appear in declaration order for the first pass.
  • Each field value is checked in source order and must exactly match the declared field type. Since first-pass fields are i32, each value must check as i32.
  • Constructor evaluation preserves field value evaluation order.
  • A constructor produces a value of the nominal struct type.
  • In the first promotion target, a struct constructor may be used only as the immediate value being inspected by field access, for example (. (Point (x 3) (y 4)) x). Wider struct value flow requires the follow-up local/signature/storage contract.

Field access:

(. p x)

First-pass examples should use immediate constructor access:

(. (Point (x 3) (y 4)) x)

Field access rules:

  • The form is (. value field).
  • field is a normal identifier, not a keyword token or string.
  • value must check as a known struct type.
  • The named field must exist on that struct type.
  • The expression result type is the field type. In the first pass this is always i32.
  • The value expression is evaluated once.
  • Field access is read-only. There is no field mutation form in the first pass.

Checking rules:

  • The checker collects and validates top-level struct declarations before checking function and test bodies in the same module.
  • Struct declaration checking validates top-level placement, identifier shape, module-level name uniqueness, field-name uniqueness, and i32 field types.
  • Constructor checking resolves the callee name in the struct namespace, then checks constructor field names, field order, field completeness, duplicate fields, extra fields, and each field value type.
  • Field access checking first checks the value expression, then resolves the requested field against the value's nominal struct type.
  • A struct constructor or other struct-typed expression outside direct field access is not part of the first pass. It must report StructValueFlowUnsupported, except for the more specific local and signature diagnostics below.
  • Existing expected-type diagnostics apply after field access. For example, (. (Point (x 3) (y 4)) x) checks as i32 and may be used anywhere an i32 expression is currently supported.

Typed-core meaning:

StructDecl {
  name: StructId,
  fields: [StructField],
  span,
  name_span
}

StructField {
  name: FieldId,
  type: i32,
  index: FieldIndex,
  span,
  name_span,
  type_span
}

StructConstruct {
  struct_id: StructId,
  fields: [(FieldId, TExpr<i32>)],
  type: Struct(StructId),
  span,
  name_span,
  field_spans
}

StructFieldAccess {
  value: TExpr<Struct(StructId)>,
  field_id: FieldId,
  type: i32,
  span,
  value_span,
  field_span
}

Lowering:

  • A struct declaration lowers to type metadata in the checked module, not to a runtime declaration by itself.
  • A constructor lowers to an aggregate value or equivalent typed-core value with fields in declaration order.
  • Field access lowers to extracting the selected field from that aggregate.
  • For the first promotion target, an implementation may lower direct constructor field access without materializing addressable struct storage.
  • No stable LLVM, C, FFI, or cross-module ABI layout is promised in v0.
  • If a checked struct form reaches a backend gap, Glagol must report UnsupportedBackendFeature instead of panicking.

Formatter behavior:

(struct Point
  (x i32)
  (y i32))

(fn point_x () -> i32
  (. (Point (x 3) (y 4)) x))

The formatter must print each top-level struct as a multi-line top-level form. The opening line is (struct Name; each field declaration is printed on its own line indented two spaces; the closing ) is placed on the final field line. Top-level blank-line and comment attachment rules match fn and test.

Constructor expressions with inline i32 field values stay inline as (Name (field value)...). Field access stays inline when its value expression is inline. If a future formatter must break a constructor over multiple lines, the constructor opening line is (Name, each (field value) pair is indented two spaces, and field order remains declaration order.

Diagnostics:

  • MalformedStructForm: missing name, non-identifier name, missing fields, or invalid field shape. Span: whole struct form or offending operand. Expected: (struct Name (field i32)...).
  • StructDeclarationNotTopLevel: a struct form appears anywhere other than the top level. Span: offending struct form.
  • EmptyStructUnsupported: a struct declares no fields. Span: whole struct form. Hint: declare at least one i32 field.
  • DuplicateStruct: duplicate struct name in the same module. Span: duplicate name, with a related span on the original struct name.
  • StructNameConflictsCallable: struct name collides with a top-level function name or compiler intrinsic visible in constructor position. Span: struct name, with a related span when the colliding source name exists.
  • DuplicateStructField: duplicate field name within one struct. Span: duplicate field name, with a related span on the original field name.
  • UnsupportedStructFieldType: field type is not i32. Span: field type. Expected: i32. Found: written field type.
  • MalformedStructConstructor: constructor operand shape is invalid. Span: whole constructor or offending field pair. Expected: (Name (field value)...).
  • UnknownStruct: constructor name does not resolve to a struct. Span: constructor name.
  • DuplicateStructConstructorField: constructor lists a field more than once. Span: duplicate constructor field, with a related span on the first use.
  • MissingStructField: constructor omits a declared field. Span: whole constructor. Expected: missing field name.
  • UnknownStructField: constructor or field access names a field not declared by the struct. Span: extra constructor field name or accessed field name.
  • StructConstructorFieldOrderMismatch: constructor field order differs from declaration order. Span: first out-of-order constructor field. Expected: declared field name at that position. Found: written field name.
  • TypeMismatch: constructor field value does not match its declared field type. Span: field value. Expected: declared field type. Found: checked value type.
  • MalformedFieldAccess: field access is not (. value field). Span: whole access form or offending operand.
  • FieldAccessOnNonStruct: field access value does not check as a struct type. Span: value expression. Expected: struct value. Found: checked value type.
  • UnsupportedStructLocal: a struct type is used as a local let or var type in the first pass. Span: local type.
  • UnsupportedStructSignatureType: a struct type is used as a function parameter or return type in the first pass. Span: signature type.
  • UnsupportedStructFieldAccess: field access uses a non-immediate struct constructor value in the first pass. Span: field access expression.
  • FieldMutationUnsupported: an implementation that accepts a generalized assignment target must reject field mutation in the first pass. Span: attempted field assignment target.
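The constructor checks (name resolution, unknown, duplicate, and missing fields, declaration order, and field value types) can be sketched as one function over a table of declared structs. This Python sketch is an illustration, not Glagol's checker: check_constructor is a hypothetical helper that reports only the first diagnostic name it finds, and the priority among diagnostics when several apply is an assumption that the rules above do not fix.

```python
def check_constructor(structs, name, fields):
    """Return a diagnostic name, or None when the constructor checks.

    structs: {struct name: [declared field names in declaration order]}
    fields:  [(written field name, checked value type)] in source order
    """
    if name not in structs:
        return "UnknownStruct"
    declared = structs[name]

    seen = set()
    for field, _ in fields:
        if field not in declared:
            return "UnknownStructField"
        if field in seen:
            return "DuplicateStructConstructorField"
        seen.add(field)

    for field in declared:
        if field not in seen:
            return "MissingStructField"

    # Every declared field now appears exactly once; compare positions.
    for (field, _), expected in zip(fields, declared):
        if field != expected:
            return "StructConstructorFieldOrderMismatch"

    for _, value_type in fields:
        if value_type != "i32":  # first-pass fields are always i32
            return "TypeMismatch"
    return None
```

With Point declared as x then y, (Point (x 3) (y 4)) passes, (Point (y 4) (x 3)) reports the order mismatch, and (Point (x 3)) reports the missing field.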

13. Arrays and Checked Indexing

Status: strict first-pass compiler-supported syntax, promoted through examples/supported/array.slo and examples/formatter/array.slo.

Promotion boundary:

  • The first pass supports only fixed-length arrays with i32 elements.
  • Array lengths must be positive integer literals. Zero-length arrays are not supported in this pass.
  • Array constructors produce immutable temporary array values.
  • Immutable let locals may store (array i32 N) values only when initialized directly from a matching array constructor.
  • The first promoted fixture may index only an immediate array constructor or an immutable array local with a non-negative integer literal index known at compile time.
  • Dynamic indices are not supported in this pass, so v0 does not yet define a runtime bounds trap or test-runner trap result for arrays.
  • Mutable array locals, array parameters, array returns, array mutation, nested arrays, arrays of structs, slices, and unchecked indexing are follow-up work.

Fixed array type:

(array i32 N)

Type rules:

  • N is a source integer literal in type position, not an expression.
  • N must be greater than 0.
  • The only first-pass element type is i32.
  • (array i32 N) is supported only as an immutable local declaration type in the first pass. Function parameter and return types remain unsupported.
  • (array T N) with any T other than i32 is rejected by the first pass.

Array constructor expression:

(array i32 value...)

Constructor rules:

  • The first operand after array is the element type and must be i32.
  • At least one value is required.
  • Each value is checked in source order and must check exactly as i32.
  • The constructor length is inferred from the value count.
  • The constructor result type is (array i32 count).
  • In an expected-type context, a constructor of type (array i32 count) is valid for (array i32 N) only when count == N.
  • Constructor value evaluation order is left to right.
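
The constructor rules above can be sketched as a small check. This is an illustration in Python, not Glagol's checker: the function name `check_array_constructor`, the string-encoded types, and the tuple return shape are all hypothetical. Only the diagnostic codes come from this section, and the sketch assumes the element type operand has already been validated as i32.

```python
# Hypothetical sketch: each value's checked type is passed in as a string.
def check_array_constructor(value_types, expected_length=None):
    """Return ("ok", array_type) or ("error", diagnostic_code)."""
    if len(value_types) == 0:
        # At least one value is required.
        return ("error", "EmptyArrayUnsupported")
    # Values are checked in source order; the first non-i32 value is reported.
    for value_type in value_types:
        if value_type != "i32":
            return ("error", "TypeMismatch")
    # The constructor length is inferred from the value count.
    count = len(value_types)
    # In an expected-type context, count must equal the declared N.
    if expected_length is not None and count != expected_length:
        return ("error", "ArrayLengthMismatch")
    return ("ok", ("array", "i32", count))
```

Under these assumptions, `(let values (array i32 3) (array i32 4 5 6))` corresponds to `check_array_constructor(["i32", "i32", "i32"], expected_length=3)`.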

Checked index expression:

(index array-expr index-expr)

First-pass supported example:

(index (array i32 10 20 30) 1)
(let values (array i32 3) (array i32 4 5 6))
(index values 2)

Index rules:

  • The form is exactly (index array-expr index-expr).
  • array-expr must check as (array i32 N).
  • For the first promotion target, array-expr must be an immediate array constructor or an immutable array local.
  • index-expr must be a non-negative i32 integer literal in the first pass.
  • The literal index must satisfy 0 <= index < N.
  • The result type is i32.
  • The array expression is evaluated once.
  • Because accepted first-pass indices are compile-time literals, bounds are verified by the checker, before lowering. A dynamic checked-index form must be rejected until runtime trap behavior is specified and tested.
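
A minimal sketch of the literal-index rule, assuming the array expression has already checked as (array i32 N). It is written in Python purely for illustration; `check_literal_index` and its return shape are hypothetical, and `None` stands in for any index-expr that is not an integer literal.

```python
# Hypothetical sketch of the compile-time bounds check for (index array-expr index-expr).
def check_literal_index(array_length, index):
    """array_length is N; index is the literal value, or None if index-expr is not a literal."""
    if index is None:
        # Dynamic indices are rejected until runtime trap behavior is specified.
        return ("error", "DynamicArrayIndexUnsupported")
    if not (0 <= index < array_length):
        return ("error", "ArrayIndexOutOfBounds")
    # An accepted index expression has type i32.
    return ("ok", "i32")
```

For the example above, `(index values 2)` on a three-element array maps to `check_literal_index(3, 2)`, which is accepted, while an index of 3 would be reported as out of bounds.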

Typed-core meaning:

ArrayType {
  element: i32,
  length: NonZeroUsize,
  span,
  element_span,
  length_span
}

ArrayConstruct {
  element: i32,
  length: NonZeroUsize,
  values: [TExpr<i32>],
  type: Array(i32, length),
  span,
  element_span,
  value_spans
}

ArrayIndex {
  array: TExpr<Array(i32, length)>,
  index: usize,
  type: i32,
  span,
  array_span,
  index_span
}

Lowering:

  • An array constructor lowers to an aggregate value or equivalent typed-core value with elements in source order.
  • A first-pass index lowers to extracting the compile-time constant in-bounds element from that aggregate.
  • An immutable array local lowers to implementation-owned storage sufficient for literal indexing inside the current function or test body.
  • The backend must not generate unchecked dynamic array access for this contract. Dynamic indices remain unsupported until runtime checked-index trap behavior is specified.
  • For the first promotion target, an implementation may lower immediate constructor indexing without materializing addressable array storage, and may lower immutable local indexing through local aggregate storage.
  • No stable LLVM, C, FFI, or cross-module ABI layout is promised in v0.
  • If a checked array form reaches a backend gap, Glagol must report UnsupportedBackendFeature instead of panicking.

Formatter behavior:

(fn second () -> i32
  (index (array i32 10 20 30) 1))

(fn local_sum () -> i32
  (let values (array i32 3) (array i32 4 5 6))
  (+ (index values 0) (index values 2)))

The formatter prints (array i32 N) types inline. It prints constructor expressions with inline i32 values as (array i32 value...). It prints first-pass index expressions inline as (index array-expr index-expr) when the array expression and index expression are inline. Function and test body indentation follows the existing one-body-form-per-line rule.

Diagnostics:

  • MalformedArrayType: array type syntax is not (array i32 N). Span: whole type form or offending operand. Expected: (array i32 N).
  • UnsupportedArrayElementType: array element type is not i32. Span: element type. Expected: i32. Found: written element type.
  • ZeroLengthArrayUnsupported: array length is zero. Span: array type. Hint: use one or more i32 elements.
  • MalformedArrayConstructor: constructor syntax is not (array i32 value...). Span: whole constructor or offending operand.
  • InvalidArrayElementType: constructor element type syntax is invalid. Span: element type. Hint: first-pass arrays use i32 elements.
  • EmptyArrayUnsupported: constructor has no element values. Span: whole constructor. Hint: provide one or more i32 values.
  • ArrayLengthMismatch: constructor value count does not match the expected (array i32 N) length. Span: whole constructor. Expected: expected length. Found: value count.
  • TypeMismatch: constructor element value does not check as i32. Span: element value. Expected: i32. Found: checked value type.
  • MalformedArrayIndex: index syntax is not (index array-expr index-expr). Span: whole index form or offending operand.
  • IndexOnNonArray: array-expr does not check as an array type. Span: array expression. Expected: (array i32 N). Found: checked value type.
  • ArrayIndexNotI32: index-expr does not check as i32. Span: index expression. Expected: i32. Found: checked value type.
  • UnsupportedArrayIndexBase: index uses an array value that is not an immediate constructor or immutable array local. Span: array expression. Hint: use (index (array i32 ...) N) or index a let array local.
  • DynamicArrayIndexUnsupported: index-expr is not an integer literal. Span: index expression. Hint: first-pass checked indexing requires a literal index.
  • ArrayIndexOutOfBounds: literal index is outside 0 <= index < N. Span: index literal. Expected: valid index range. Found: literal index.
  • UnsupportedArrayLocalInitializer: an array local is initialized from anything other than a direct array constructor. Span: initializer expression. Hint: use (let values (array i32 N) (array i32 ...)).
  • MutableArrayLocalUnsupported: (array i32 N) is used as a mutable var local. Span: local type. Hint: declare array locals with let.
  • UnsupportedArraySignatureType: (array i32 N) is used as a function parameter or return type in the first pass. Span: signature type.
  • UnsupportedArrayEquality: an array value is compared with =. Span: equality expression.
  • UnsupportedArrayPrint: an array value is passed to print_i32. Span: print argument.
  • ArrayMutationUnsupported: an implementation that accepts a generalized assignment target must reject element mutation in the first pass. Span: attempted array element assignment target.

Slices remain a broader design target. No slice type, slice constructor, borrowed view, pointer decay, or slice indexing behavior is part of this first-pass array contract.


14. Option and Result

Status: strict first-pass compiler-supported syntax, promoted through examples/supported/option-result.slo and examples/formatter/option-result.slo.

Nullable absence uses option.

Recoverable failure uses result.

Promotion boundary:

  • The first pass supports only (option i32) and (result i32 i32).
  • Constructors are the only promoted option/result value operations.
  • Constructors are supported only as the single final body expression of a function whose declared return type exactly matches the constructor result.
  • Function parameters, local declarations, mutable storage, assignment, matching, unwrap, equality, printing, and calls that pass or receive option/result values are not part of this pass.
  • Nested option/result types, non-i32 payloads, strings, arrays, structs, slices, and pointers as payloads are not part of this pass.
  • No stable LLVM, C, FFI, cross-module, or serialized ABI/layout promise is made for option/result values in v0.

First-pass option type:

(option i32)

First-pass result type:

(result i32 i32)

Constructor expressions:

(some i32 value)
(none i32)

(ok i32 i32 value)
(err i32 i32 value)

Type rules:

  • (option i32) is supported only as a function return type under the direct constructor-return rule above.
  • (result i32 i32) is supported only as a function return type under the direct constructor-return rule above.
  • (some i32 value) checks value as i32 and produces (option i32).
  • (none i32) produces (option i32) and has no payload expression.
  • (ok i32 i32 value) checks value as i32 and produces (result i32 i32).
  • (err i32 i32 value) checks value as i32 and produces (result i32 i32).
  • Constructor type operands are type names in source, not expressions.
  • Constructor payload expressions use the already-supported i32 expression rules. The constructor itself may not be nested in another expression for this first pass.
  • Pattern matching is not v0. There is no promoted operation that observes, destructures, compares, prints, or unwraps an option/result value.

Typed-core meaning:

OptionType {
  payload: i32,
  span,
  payload_span
}

ResultType {
  ok: i32,
  err: i32,
  span,
  ok_span,
  err_span
}

OptionConstruct {
  variant: Some | None,
  payload: i32,
  value: Option<TExpr<i32>>,
  type: Option(i32),
  span,
  payload_span,
  value_span
}

ResultConstruct {
  variant: Ok | Err,
  ok: i32,
  err: i32,
  value: TExpr<i32>,
  type: Result(i32, i32),
  span,
  ok_span,
  err_span,
  value_span
}

Lowering and backend boundary:

  • A first-pass option/result constructor lowers to a compiler-owned tagged value representation sufficient to return the value from the direct-return function.
  • The backend may choose any internal representation that preserves the checked variant and i32 payload for future typed-core use.
  • The backend must not expose or rely on a stable option/result ABI, layout, discriminant value, padding, or cross-module representation in v0.
  • No storage, mutation, comparison, print, unwrap, pattern-match, or parameter passing lowering is required or supported by this contract.
  • If a checked option/result form reaches a backend gap, Glagol must report UnsupportedBackendFeature instead of panicking.

Formatter behavior:

(fn some_value ((value i32)) -> (option i32)
  (some i32 value))

(fn no_value () -> (option i32)
  (none i32))

(fn ok_value ((value i32)) -> (result i32 i32)
  (ok i32 i32 value))

(fn err_value ((code i32)) -> (result i32 i32)
  (err i32 i32 code))

The formatter prints (option i32) and (result i32 i32) return types inline. It prints first-pass constructors inline as (some i32 value), (none i32), (ok i32 i32 value), and (err i32 i32 value) when the payload expression is inline. Function body indentation follows the existing one-body-form-per-line rule.

Diagnostics:

  • MalformedOptionType: option type syntax is not (option i32). Span: whole type form or offending operand. Expected: (option i32).
  • UnsupportedOptionPayloadType: option payload type is not i32. Span: payload type. Expected: i32. Found: written payload type.
  • MalformedResultType: result type syntax is not (result i32 i32). Span: whole type form or offending operand. Expected: (result i32 i32).
  • UnsupportedResultPayloadType: result ok or err type is not i32. Span: offending payload type. Expected: i32. Found: written payload type.
  • MalformedOptionConstructor: constructor syntax is not (some i32 value) or (none i32). Span: whole constructor or offending operand.
  • MalformedResultConstructor: constructor syntax is not (ok i32 i32 value) or (err i32 i32 value). Span: whole constructor or offending operand.
  • TypeMismatch: constructor payload value does not check as i32. Span: payload value. Expected: i32. Found: checked value type.
  • UnsupportedOptionResultReturn: an option/result return type is used by a function whose body is not exactly one matching direct constructor expression. Span: return type or final expression.
  • UnsupportedOptionResultSignatureType: an option/result type is used as a function parameter type. Span: signature type.
  • UnsupportedOptionResultLocal: an option/result type is used as a local declaration type. Span: local type.
  • UnsupportedOptionResultFlow: an option/result constructor or value is used anywhere other than the supported direct-return position. Span: value expression.
  • UnsupportedOptionResultEquality: an option/result value is compared with =. Span: equality expression.
  • UnsupportedOptionResultPrint: an option/result value is passed to print_i32. Span: print argument.
  • UnsupportedOptionResultPatternMatch: a match/destructure/unwrap form tries to observe an option/result value. Span: observing expression. Hint: pattern matching is not part of the v0 option/result constructor pass.

15. Unsafe

Status: strict lexical marker support, promoted through examples/supported/unsafe.slo and examples/formatter/unsafe.slo.

First-pass unsafe form:

(unsafe
  body-form...
  final-expression)

Promotion boundary:

  • The first pass supports only the lexical unsafe expression block.
  • The marker is lexically visible in source; there is no implicit unsafe context in v0.
  • An unsafe block may contain the same sequential body forms already supported in normal function and top-level test bodies, followed by one final expression.
  • The block returns the final expression's checked type.
  • The block marks a lexical unsafe context for diagnostics only. It does not make raw memory operations supported.
  • Raw allocation, deallocation, pointer loads or stores, pointer arithmetic, unchecked indexing, raw reinterpretation, and FFI calls remain unsupported.

Example using only supported safe forms:

(fn add_one_in_unsafe ((value i32)) -> i32
  (unsafe
    (let one i32 1)
    (+ value one)))

Surface rules:

  • The form is exactly (unsafe body-form... final-expression).
  • The form must contain at least one expression after unsafe.
  • unsafe is an expression form and may be used wherever its final expression's type is expected.
  • Body forms before the final expression use the same first-pass sequential body-form rules as the surrounding function or top-level test body. Today that means local declarations, local assignment, supported unit-producing calls, and first-pass loops only where those forms are otherwise supported.
  • Local declarations inside an unsafe block are scoped to that block. They are visible only after their declaration and only until the end of the block.
  • Names declared inside an unsafe block must not shadow parameters, locals from an outer body, earlier locals in the same unsafe block, top-level functions, or compiler intrinsics under the existing v0 no-shadowing rules.
  • A nested unsafe block is allowed only if its body forms and final expression satisfy this same lexical contract.

Typed-core meaning:

UnsafeBlock {
  body: [TBodyForm],
  result: TExpr<T>,
  type: T,
  span,
  body_spans,
  result_span
}

The checker carries an in_unsafe lexical flag while checking the block body and final expression. That flag is only used to choose diagnostics for unsafe operation heads; it does not widen the set of supported expression forms.

Unsafe operation heads requiring the lexical marker:

alloc
dealloc
load
store
ptr_add
unchecked_index
reinterpret
ffi_call

These names are reserved as unsafe operation heads in expression position for v0. They do not resolve as ordinary user-defined calls in this pass.

Diagnostic boundary for those heads:

  • Outside an unsafe block, any expression whose head is one of the names above must report UnsafeRequired. Span: whole operation form, with the head span as the primary operation name span. Hint: wrap the operation in an unsafe block.
  • Inside an unsafe block, the same operation heads must report UnsupportedUnsafeOperation. Span: whole operation form, with the head span as the primary operation name span. Hint: raw memory operations are outside the v0 unsafe contract.
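
The two cases above reduce to one lookup plus the in_unsafe flag. The following Python sketch is illustrative only; `unsafe_head_diagnostic` is a hypothetical helper, while the reserved head names and diagnostic codes are taken from this section.

```python
# Reserved unsafe operation heads listed in this section.
UNSAFE_HEADS = frozenset({
    "alloc", "dealloc", "load", "store",
    "ptr_add", "unchecked_index", "reinterpret", "ffi_call",
})

def unsafe_head_diagnostic(head, in_unsafe):
    """Return the diagnostic code for an expression head, or None for ordinary heads."""
    if head not in UNSAFE_HEADS:
        return None
    # The flag only selects the diagnostic; neither case makes the operation supported.
    return "UnsupportedUnsafeOperation" if in_unsafe else "UnsafeRequired"
```

Either way the operation is rejected in v0, so the marker changes the message a user sees, not the set of accepted programs.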

Lowering and backend boundary:

  • A lexical unsafe block lowers like an ordinary expression block: lower each checked body form in source order, then lower the final expression and use its value as the block value.
  • The lowered representation must preserve source order and local block scope.
  • There is no backend lowering for raw memory operations in this pass.
  • If a checked lexical unsafe block containing only supported safe forms reaches a backend gap, Glagol must report UnsupportedBackendFeature instead of panicking.

Formatter behavior:

(fn add_one_in_unsafe ((value i32)) -> i32
  (unsafe
    (let one i32 1)
    (+ value one)))

The formatter must print unsafe as a multi-line expression block. The opening line contains only (unsafe. Each body form and the final expression are printed on their own lines, indented two spaces deeper than the unsafe form. The closing ) is placed on the final expression line when that expression is inline. If an unsafe block appears inside another expression, the block may be broken over multiple lines using the same indentation rule.

Diagnostics:

  • MalformedUnsafeForm: an unsafe form has no final expression or has an invalid operand shape. Span: whole unsafe form or the location where the first expression was expected. Expected: (unsafe body-form... final-expression).
  • UnsafeRequired: a raw unsafe operation head appears outside a lexical unsafe block. Span: whole operation form, with the head span as the operation name.
  • UnsupportedUnsafeOperation: a raw unsafe operation head appears inside a lexical unsafe block. Span: whole operation form, with the head span as the operation name.
  • Existing diagnostics for local declarations, assignments, loops, calls, comparisons, and type mismatches apply inside unsafe blocks.

Out of scope for v0:

  • pointer allocation
  • pointer load/store
  • pointer arithmetic
  • unchecked indexing behavior
  • raw reinterpretation
  • FFI calls
  • pointer locals
  • stable ABI/layout promises
  • unsafe abstractions

16. Tests

Status: supported in the strict top-level form below.

Current supported top-level test form:

(test "name"
  body-form...
  final-expression)

Surface rules:

  • test is a top-level declaration, sibling to fn; it is not an expression and is not valid inside a function body or another expression.
  • The form has a string-literal name followed by zero or more body forms and exactly one final expression.
  • The test name is metadata for the test runner, not a runtime string value.
  • v0 test names must be non-empty printable ASCII without embedded quotes, backslashes, or newlines until the general string escaping contract is specified.
  • Test names must be unique within a module after decoding.
  • A single final expression is supported as the degenerate body.
  • Non-final body forms may be local declarations or local assignments: (let name i32 value), (var name i32 value), or (set name value).
  • When while is promoted, it is also a valid non-final test body form under the first-pass loop contract in section 11.
  • The final test expression has expected type bool; Slovo performs no implicit conversion to bool.
  • Tests may refer to functions in the same module using the same name resolution rules as function bodies. Tests are not callable and do not introduce names into the value namespace.
  • Normal compilation must parse, lower, and check tests, but must not run them. Test mode runs top-level tests in source order.
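
The name rules above can be sketched as follows. This is a Python illustration with a hypothetical `check_test_names` helper. It assumes "printable ASCII" means bytes 0x20 through 0x7E, which includes the space used in names such as "add works", and it assumes names are passed in already decoded.

```python
# Hypothetical sketch: validate a module's test names in source order.
def check_test_names(names):
    """Return a list of (index, diagnostic_code) for offending names."""
    diagnostics = []
    seen = set()
    for i, name in enumerate(names):
        # Non-empty printable ASCII, with no quotes or backslashes.
        # Newlines fall outside the 0x20..0x7E range and are rejected too.
        valid = len(name) > 0 and all(
            0x20 <= ord(c) <= 0x7E and c not in '"\\' for c in name
        )
        if not valid:
            diagnostics.append((i, "InvalidTestName"))
        elif name in seen:
            # The later occurrence is the one reported.
            diagnostics.append((i, "DuplicateTestName"))
        else:
            seen.add(name)
    return diagnostics
```

A real implementation would also attach the related span of the original name to each DuplicateTestName entry, as the diagnostics list below requires.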

Typed-core meaning:

TypedTest {
  name: TestName,
  body: [TBodyForm],
  final_expr: TExpr<bool>,
  span,
  name_span,
  body_spans
}

Top-level tests use the same sequential local-body rule as function bodies:

(test "local update"
  (let base i32 20)
  (var total i32 (+ base 1))
  (set total (+ total 1))
  (= total 22))

Local declarations and set forms may appear before the final test expression. The final expression remains the test result and must check as bool. This extension does not promote any broader block syntax.

The checker resolves and checks each test's final expression in a test context with expected type bool. A false result is a failed test result, not a compile-time diagnostic. Compile-time diagnostics are reserved for malformed test forms, invalid names, duplicate names, name-resolution failures, type errors, and unsupported implementation gaps.

Lowering and execution:

  • TypedTest entries are kept in a module test list separate from normal functions.
  • Tests do not affect main, exported functions, or the normal program ABI.
  • In test-runner lowering, each TypedTest may lower to an internal zero-argument bool/LLVM i1 thunk plus a registry entry containing the test name and source span.
  • A test passes when its final expression evaluates to true, fails when it evaluates to false, and errors if evaluation traps or the runner cannot execute it.
  • In normal LLVM emission, an implementation may omit test thunks after parsing, lowering, and checking them. It must not silently ignore malformed or invalid tests.
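
The three outcomes can be pictured with a small runner loop. This Python sketch is an assumption-laden illustration: the registry is modeled as a list of (name, thunk) pairs, and a trap is modeled as a Python exception, neither of which is Glagol's actual representation.

```python
# Hypothetical sketch: run registered test thunks in source order.
def run_tests(registry):
    """registry: list of (name, thunk); thunk is a zero-argument callable returning bool."""
    results = []
    for name, thunk in registry:
        try:
            # true is a pass, false is a fail.
            outcome = "pass" if thunk() is True else "fail"
        except Exception:
            # A trap during evaluation is an error, not a failed assertion.
            outcome = "error"
        results.append((name, outcome))
    return results
```

Keeping "fail" and "error" distinct lets a runner report a false result separately from a test that could not finish evaluating.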

Formatter behavior:

(test "add works"
  (= (add 2 3) 5))

The canonical formatter keeps the test name on the opening line. In the current supported subset, it prints each body form and the final expression on separate two-space-indented lines. Each expression itself uses the normal canonical expression formatter.

Diagnostics:

  • MalformedTestForm: missing name, non-string name, missing final expression, wrong operand count, or invalid body form. Span: whole test form or offending operand. Expected: (test "name" body-form... final-expression).
  • InvalidTestName: empty name or a name outside the v0 name subset. Span: string literal name.
  • DuplicateTestName: duplicate decoded name in the same module. Span: duplicate name, with a related span on the original test name.
  • TestExpressionNotBool: the checked type of the final test expression is not bool. Span: final test expression. Expected: bool. Found: checked expression type.
  • Existing expression diagnostics such as UnknownVariable, ArityMismatch, TypeMismatch, and UnsupportedBackendFeature apply inside test expressions.

Supported example:

(test "add works"
  (= (add 2 3) 5))

Supported local-body example:

(test "locals work"
  (let base i32 2)
  (var value i32 (add_local base))
  (set value (+ value 1))
  (= value 5))

17. Diagnostics

Diagnostics should have both human-readable and machine-readable forms.

Status: Glagol currently has diagnostics with byte spans and human/machine rendering. Slovo v0 requires both byte spans and line/column ranges for user-facing and machine-readable diagnostics. Exact diagnostic snapshots remain an implementation/test contract, but implementations must not report only unstructured text or only byte offsets.

Machine-readable diagnostic example:

(error
  (code TypeMismatch)
  (expected i32)
  (found string)
  (message "expected i32, found string")
  (span "main.slo"
    (bytes 42 49)
    (range 12 8 12 15))
  (hint "Use an integer value or convert explicitly."))

Required diagnostic fields:

  • code
  • primary source span as a zero-based, half-open byte range
  • primary source range as one-based start line, start column, end line, and end column
  • expected, when applicable
  • found, when applicable
  • message
  • hint, when useful and safe

Byte spans are the canonical location for tools. Line/column ranges are required for human-readable output and machine diagnostics so editors and external tools can display errors without reimplementing Slovo's source mapper. Line and column values are derived from the original source file, not formatter output. Columns are one-based byte columns within a UTF-8 source line; a tab in source counts as one input byte for the machine range, even though canonical formatting uses spaces.
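
A minimal sketch of deriving the line/column range from a byte span under the rules above: zero-based half-open byte offsets in, one-based lines and one-based byte columns out. It is written in Python for illustration, and `byte_span_to_range` is a hypothetical name, not part of Glagol.

```python
# Hypothetical sketch: map a byte span to (start_line, start_col, end_line, end_col).
def byte_span_to_range(source: bytes, start: int, end: int):
    def position(offset):
        # One-based line: count the newlines that precede the offset.
        line = source.count(b"\n", 0, offset) + 1
        # Byte offset where the current line starts (0 if no newline precedes it).
        line_start = source.rfind(b"\n", 0, offset) + 1
        # One-based byte column; a tab counts as a single byte.
        return line, offset - line_start + 1
    return position(start) + position(end)
```

For the source bytes `b"ab\n\tcd\n"`, the span (4, 6) covering `cd` maps to (2, 2, 2, 4): the leading tab occupies column 1, and the end column is one past the last byte, which keeps the range half-open like the byte span.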

Checked forms that reach a backend feature gap must return a structured UnsupportedBackendFeature diagnostic instead of panicking. Unsupported signature types, string literals without runtime lowering, and other speculative forms must remain out of examples/supported/ until they have backend behavior or explicit diagnostics plus tests.

Test-specific diagnostics introduced by the top-level test contract are MalformedTestForm, InvalidTestName, DuplicateTestName, and TestExpressionNotBool. They follow the same required fields above and must use original source spans, not formatter output. DuplicateTestName includes a related span pointing to the original test name.

Loop-specific diagnostics introduced by the first-pass while contract are MalformedWhileForm, WhileConditionNotBool, EmptyWhileBody, LocalDeclarationInWhileBodyUnsupported, NestedWhileUnsupported, and WhileBodyFormNotUnit. They follow the same required fields above and must use original source spans, not formatter output.

Struct-specific diagnostics introduced by the first-pass struct contract are listed in section 12. They follow the same required fields above, must use original source spans, and must keep related spans for duplicate names or fields when applicable.

Array-specific diagnostics introduced by the first-pass array/indexing contract are listed in section 13. They follow the same required fields above and must use original source spans, including byte ranges for malformed array types, constructors, literal indices, and unsupported array value-flow forms.

Option/result-specific diagnostics introduced by the first-pass constructor contract are listed in section 14. They follow the same required fields above and must use original source spans, including byte ranges for malformed option/result types, constructors, and unsupported option/result value-flow forms.


18. LLVM Lowering Sketch

Slovo v0 should lower through an internal typed core before LLVM IR.

Example Slovo:

(fn add ((a i32) (b i32)) -> i32
  (+ a b))

Approximate LLVM IR shape:

define i32 @add(i32 %a, i32 %b) {
entry:
  %0 = add i32 %a, %b
  ret i32 %0
}
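
To make the shape concrete, here is a small Python sketch that emits this IR text from a toy expression tree. It is not Glagol's lowering: the tuple encoding `("+", lhs, rhs)`, the function name `lower_function`, and the restriction to i32 addition over parameters and integer literals are all assumptions made for illustration.

```python
# Hypothetical sketch: lower an i32 function whose body uses only +, parameters, and literals.
def lower_function(name, params, body):
    lines = []
    counter = 0

    def lower(expr):
        nonlocal counter
        if isinstance(expr, int):
            return str(expr)      # integer literal operand
        if isinstance(expr, str):
            return f"%{expr}"     # parameter reference
        op, lhs, rhs = expr
        assert op == "+", "only addition is sketched here"
        left = lower(lhs)         # operands are lowered left to right
        right = lower(rhs)
        temp = f"%{counter}"      # sequentially numbered temporaries
        counter += 1
        lines.append(f"  {temp} = add i32 {left}, {right}")
        return temp

    result = lower(body)
    signature = ", ".join(f"i32 %{p}" for p in params)
    return "\n".join(
        [f"define i32 @{name}({signature}) {{", "entry:", *lines, f"  ret i32 {result}", "}"]
    )
```

Calling `lower_function("add", ["a", "b"], ("+", "a", "b"))` reproduces the IR text shown above.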

19. v0 Implementation Milestones

  1. Parse forms
  2. Print parsed tree
  3. Canonical formatter
  4. Basic name resolution
  5. Primitive type checker
  6. Function checker
  7. if checker
  8. let / var / set
  9. Struct definitions
  10. Top-level test checker and test runner
  11. Structured diagnostics
  12. Lower simple functions to LLVM IR
  13. Compile and run examples/supported/add.slo
  14. Preserve the glagol binary CLI contract for supported v0 modes, stderr diagnostics, and exit codes
  15. Keep native executable output as an explicit LLVM-plus-runtime linking step until a later backend contract