Platform Bindings and System Intrinsics

Platform bindings define the interface between Clef code and platform-specific operations. This chapter specifies the three-layer binding architecture used by CCS (Clef Compiler Service) and Firefly.

Overview

Clef uses a three-layer architecture for platform operations:

Layer                       Purpose                          Examples
Layer 1: CCS Intrinsics     Native type universe operations  Sys.write, NativePtr.set
Layer 2: Binding Libraries  External library bindings        GTK, CMSIS, OpenGL
Layer 3: User Code          Applications and libraries       User programs

This approach:

  • Avoids BCL dependencies (System.Runtime.InteropServices)
  • Enables compile-time platform specialization
  • Provides type-safe syscall interfaces
  • Separates intrinsic operations from external library bindings

Layer 1: CCS Intrinsics

CCS treats certain operations as intrinsic to the native type universe. It recognizes them by module path pattern, so user code does not need to declare them.

The Sys Module

The Sys module provides direct system call primitives:

module Sys =
    /// Write bytes to a file descriptor
    /// fd: file descriptor (0=stdin, 1=stdout, 2=stderr)
    /// buffer: pointer to data
    /// count: number of bytes to write
    /// Returns: number of bytes written, or negative on error
    val write : fd:int -> buffer:nativeptr<byte> -> count:int -> int

    /// Read bytes from a file descriptor
    /// fd: file descriptor
    /// buffer: pointer to receive data
    /// maxCount: maximum bytes to read
    /// Returns: number of bytes read, or negative on error
    val read : fd:int -> buffer:nativeptr<byte> -> maxCount:int -> int

    /// Exit the process with the specified code
    /// This function never returns
    val exit : code:int -> 'T

The NativePtr Module

Pointer operations intrinsic to native compilation:

module NativePtr =
    val get         : nativeptr<'T> -> int -> 'T
    val set         : nativeptr<'T> -> int -> 'T -> unit
    val add         : nativeptr<'T> -> int -> nativeptr<'T>
    val toNativeInt : nativeptr<'T> -> nativeint
    val ofNativeInt : nativeint -> nativeptr<'T>
    val stackalloc  : int -> nativeptr<'T>
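
The signatures above have the same shape as the NativePtr module in FSharp.Core, so a rough feel for them can be had in ordinary F#. The sketch below runs against FSharp.Core rather than CCS; whether Clef's intrinsics behave identically is an assumption, and sumViaPointers is a name made up for this illustration.

```fsharp
#nowarn "9" // native pointer operations produce unverifiable IL

open FSharp.NativeInterop

/// Fill a 4-element stack buffer, then read it back through derived pointers.
let sumViaPointers () : int =
    let buffer = NativePtr.stackalloc<int> 4      // room for 4 ints on the stack
    for i in 0 .. 3 do
        NativePtr.set buffer i (i * 10)           // buffer = [0; 10; 20; 30]
    let second = NativePtr.add buffer 1           // points at element 1
    // Round-trip through nativeint: toNativeInt and ofNativeInt are inverses.
    let same = NativePtr.ofNativeInt<int> (NativePtr.toNativeInt buffer)
    NativePtr.get second 0 + NativePtr.get same 2 // 10 + 20
```

Note that get and set take an element index, not a byte offset, and add advances by whole elements of 'T.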

Intrinsic Recognition

CCS recognizes intrinsics by module path pattern during type checking. When code calls Sys.write, CCS:

  1. Matches the module path Sys
  2. Matches the member name write
  3. Returns the intrinsic’s native type signature
  4. Marks the call as SemanticKind.Intrinsic in the SemanticGraph

The compiler (Alex) then provides platform-specific implementations during code generation.
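
The recognition steps can be modelled as a table lookup keyed on module path and member name. This is a minimal sketch in plain F#, not CCS source: the SemanticKind shape, intrinsicTable, and classify are hypothetical, and signatures are kept as strings purely for illustration.

```fsharp
// Hypothetical model of intrinsic recognition; not the CCS implementation.
type SemanticKind =
    | Intrinsic of signature: string
    | Ordinary

// (module path, member name) -> native type signature, copied from the
// Sys and NativePtr listings in this chapter.
let intrinsicTable : Map<string * string, string> =
    Map.ofList [
        ("Sys", "write"), "fd:int -> buffer:nativeptr<byte> -> count:int -> int"
        ("Sys", "read"), "fd:int -> buffer:nativeptr<byte> -> maxCount:int -> int"
        ("Sys", "exit"), "code:int -> 'T"
        ("NativePtr", "set"), "nativeptr<'T> -> int -> 'T -> unit"
    ]

/// Steps 1-4: match the path and member, return the signature, mark the call.
let classify (modulePath: string) (memberName: string) : SemanticKind =
    match Map.tryFind (modulePath, memberName) intrinsicTable with
    | Some signature -> Intrinsic signature
    | None -> Ordinary
```

Under this model, classify "Sys" "write" yields an Intrinsic carrying the write signature, while a call such as classify "Console" "write" falls through to Ordinary and is resolved like any other function.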


Platform-Specific Implementation

Alex provides implementations of Sys intrinsics for each target platform.

Linux x86-64

Intrinsic    Implementation
Sys.write    syscall(1, fd, buffer, count)
Sys.read     syscall(0, fd, buffer, count)
Sys.exit     syscall(60, code)

Syscall convention: rax=syscall number, rdi=arg1, rsi=arg2, rdx=arg3

Linux ARM64

Intrinsic    Implementation
Sys.write    svc #0 with x8=64
Sys.read     svc #0 with x8=63
Sys.exit     svc #0 with x8=93

macOS x86-64

Intrinsic    Implementation
Sys.write    syscall(0x2000004, fd, buffer, count)
Sys.read     syscall(0x2000003, fd, buffer, count)
Sys.exit     syscall(0x2000001, code)

Note: macOS x86-64 uses BSD syscall numbers with 0x2000000 offset.

macOS ARM64

Intrinsic    Implementation
Sys.write    svc #0x80 with x16=4
Sys.read     svc #0x80 with x16=3
Sys.exit     svc #0x80 with x16=1
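
The four Unix-like tables above amount to one lookup from (target, intrinsic) to a syscall number. The sketch below restates them in plain F#; the Target and SysIntrinsic types and the syscallNumber function are hypothetical names for illustration and say nothing about how Alex stores this data.

```fsharp
// Hypothetical types; only the numbers come from the tables in this chapter.
type Target = LinuxX64 | LinuxArm64 | MacOSX64 | MacOSArm64
type SysIntrinsic = Write | Read | Exit

/// Syscall number loaded into rax (x86-64 Linux), x8 (ARM64 Linux),
/// or x16 (ARM64 macOS) before the trap instruction.
let syscallNumber (target: Target) (intrinsic: SysIntrinsic) : int64 =
    match target, intrinsic with
    | LinuxX64, Write -> 1L
    | LinuxX64, Read -> 0L
    | LinuxX64, Exit -> 60L
    | LinuxArm64, Write -> 64L
    | LinuxArm64, Read -> 63L
    | LinuxArm64, Exit -> 93L
    // macOS x86-64: BSD number plus the 0x2000000 class offset.
    | MacOSX64, Write -> 0x2000004L
    | MacOSX64, Read -> 0x2000003L
    | MacOSX64, Exit -> 0x2000001L
    // macOS ARM64: plain BSD numbers.
    | MacOSArm64, Write -> 4L
    | MacOSArm64, Read -> 3L
    | MacOSArm64, Exit -> 1L
```

Because the match is exhaustive over both unions, adding a new target or intrinsic to either type makes the F# compiler warn about every missing row, which is one reason a closed union is a reasonable fit for this kind of table.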

Windows x86-64

Intrinsic    Implementation
Sys.write    NtWriteFile via ntdll
Sys.read     NtReadFile via ntdll
Sys.exit     NtTerminateProcess

Freestanding

For bare-metal targets, intrinsics may:

  • Map to hardware registers
  • Generate inline assembly
  • Require target-specific configuration

Layer 2: Binding Libraries

External library bindings (GTK, CMSIS, OpenGL, etc.) require rich semantic metadata that CCS cannot know intrinsically:

  • Memory layouts and alignment
  • Ownership semantics (managed, unmanaged, refcounted)
  • Volatile access requirements
  • Callback calling conventions
  • Register mappings (for hardware peripherals)
  • FFI calling conventions

The Quotation Solution

F# quotations (<@ ... @>) are compile-time inspectable data structures. Unlike regular code, which compiles to instructions, a quotation compiles to an expression tree that can be examined during compilation.

This makes quotations ideal for carrying binding metadata:

  • Generated by Farscape from C/C++ headers
  • Compiled as regular F# code
  • Inspected at compile time by CCS
  • Never executed at runtime

How Quotation Binding Works

Step 1: Farscape Generates Binding Library

Farscape parses C/C++ headers and generates F# binding libraries:

// Generated by Farscape from gtk.h
module Gtk.Bindings

open BAREWire.Descriptors
open Memory

/// Type descriptor - quotation carries layout and semantics
let gtkWindowDescriptor: Expr<TypeDescriptor> = <@
    { TypeName = "GtkWindow"
      CName = "GtkWindow"
      Layout = { Size = 24un; Alignment = 8un }
      Ownership = Unmanaged
      RefCounted = true
      Destructor = Some "gtk_widget_destroy" }
@>

/// Function descriptor - quotation carries calling convention
let gtkWindowNewDescriptor: Expr<FunctionDescriptor> = <@
    { CName = "gtk_window_new"
      Parameters = [|
          { Name = "type"; Type = I32; PassBy = Value }
      |]
      ReturnType = Ptr gtkWindowDescriptor
      CallingConvention = CDecl
      OwnershipTransfer = CallerOwns }
@>

/// The callable function - paired with the descriptor above
let windowNew (windowType: int) : nativeptr<GtkWindow> =
    // Placeholder body, never executed: CCS locates the matching
    // descriptor quotation at compile time and Alex emits the FFI call
    failwith "Binding placeholder"

Step 2: CCS Inspects Quotations at Compile Time

When CCS encounters a call to a binding function, it:

  1. Recognizes the binding pattern by module structure
  2. Finds associated quotations by naming convention (*Descriptor)
  3. Inspects quotation structure to extract metadata
  4. Attaches metadata to SemanticGraph nodes

User code: let window = Gtk.windowNew 0
                ↓
CCS: "This calls Gtk.Bindings.windowNew"
                ↓
CCS: "Find associated descriptor quotation"
                ↓
CCS: Inspects <@ { CName = "gtk_window_new"; ... } @>
                ↓
SemanticGraph node gets:
  - FFI.CName = "gtk_window_new"
  - FFI.CallingConvention = CDecl
  - FFI.OwnershipTransfer = CallerOwns
  - MemoryRegion = Unmanaged
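
To make the inspection step concrete, here is a minimal sketch that pulls the same fields out of a descriptor quotation using the standard FSharp.Quotations patterns. It is ordinary F# and leans on FSharp.Reflection to obtain field names, which CCS is described as avoiding, so treat it as an illustration of the idea and not of the CCS mechanism. The trimmed-down FunctionDescriptor and the describe and extractMetadata helpers are assumptions made for this example.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Reflection

// Trimmed-down stand-ins for the descriptor types in this chapter.
type CallConv = CDecl | StdCall | FastCall
type Transfer = CallerOwns | CalleeOwns | Borrowed
type FunctionDescriptor =
    { CName: string
      CallingConvention: CallConv
      OwnershipTransfer: Transfer }

let gtkWindowNewDescriptor : Expr<FunctionDescriptor> =
    <@ { CName = "gtk_window_new"
         CallingConvention = CDecl
         OwnershipTransfer = CallerOwns } @>

/// Render one field initializer as text without evaluating the quotation.
let describe (fieldExpr: Expr) : string =
    match fieldExpr with
    | Value (value, _) -> string value                 // literal, e.g. a string
    | NewUnionCase (unionCase, []) -> unionCase.Name   // nullary union case
    | other -> sprintf "%A" other                      // anything more complex

/// Walk a record-construction quotation and return (key, value) pairs.
let extractMetadata (descriptor: Expr) : (string * string) list =
    match descriptor with
    | NewRecord (recordType, fieldExprs) ->
        let fields = FSharpType.GetRecordFields recordType |> Array.toList
        List.zip fields fieldExprs
        |> List.map (fun (field, expr) -> "FFI." + field.Name, describe expr)
    | _ -> []
```

Calling extractMetadata gtkWindowNewDescriptor walks the expression tree only; the record itself is never constructed, which mirrors the "never executed at runtime" property of binding quotations.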

Step 3: Alex Uses Metadata for Code Generation

The metadata flows from SemanticGraph to Alex:

SemanticGraph node (with FFI metadata)
                ↓
Alex sees: "FFI call to gtk_window_new, CDecl, returns owned pointer"
                ↓
Generates: LLVM call with correct ABI, ownership tracking

Active Patterns for Recognition

Binding libraries also provide active patterns for SemanticGraph traversal:

/// Active pattern for matching GTK window creation
let (|GtkWindowCreate|_|) (node: SemanticNode) =
    match node.Kind with
    | Application(funcNode, args) when
        funcNode.Symbol = Some "Gtk.Bindings.windowNew" ->
        Some { WindowType = extractArg args 0 }
    | _ -> None

/// Active pattern for matching GTK signal connection
let (|GtkSignalConnect|_|) (node: SemanticNode) =
    match node.Kind with
    | Application(funcNode, args) when
        funcNode.Symbol = Some "Gtk.Bindings.signalConnect" ->
        let widget = extractArg args 0
        let signal = extractArg args 1
        let callback = extractArg args 2
        Some { Widget = widget; Signal = signal; Callback = callback }
    | _ -> None

These patterns enable Alex to recognize and handle specific binding patterns during code generation.

Quotation Structure Requirements

Binding quotations must follow specific structure for CCS inspection:

/// Type descriptor quotation
type TypeDescriptor = {
    TypeName: string          // F# type name
    CName: string             // C/C++ type name
    Layout: LayoutInfo        // Size, alignment
    Ownership: OwnershipKind  // Managed | Unmanaged | RefCounted
    RefCounted: bool          // Uses reference counting
    Destructor: string option // Cleanup function name
}

/// Function descriptor quotation
type FunctionDescriptor = {
    CName: string                 // C function name
    Parameters: ParameterInfo[]   // Parameter types and passing
    ReturnType: TypeRef           // Return type reference
    CallingConvention: CallConv   // CDecl | StdCall | FastCall
    OwnershipTransfer: Transfer   // CallerOwns | CalleeOwns | Borrowed
}

/// Hardware register descriptor (for embedded)
type RegisterDescriptor = {
    Name: string              // Register name
    Address: unativeint       // Memory-mapped address
    AccessKind: AccessKind    // ReadOnly | WriteOnly | ReadWrite | Volatile
    ResetValue: uint32        // Value after reset
    Fields: FieldInfo[]       // Bit field definitions
}

Why Quotations, Not Attributes?

Approach      Characteristics
Attributes    Limited to simple values (strings, numbers)
Interfaces    Require runtime dispatch
Reflection    Requires runtime, BCL dependency
Quotations    Full F# expressions, compile-time inspectable, BCL-free

Quotations can express:

  • Nested structures (layouts containing fields)
  • References to other types (pointer to TypeDescriptor)
  • Complex expressions (computed offsets, conditional layouts)
  • All without runtime overhead

Layer 3: User Code

User code uses intrinsics and binding libraries. It does NOT declare platform bindings.

Correct Usage Pattern

// Console module - uses CCS intrinsics directly
module Console

let inline write (s: string) : unit =
    Sys.write 1 s.Pointer s.Length |> ignore

let inline writeln (s: string) : unit =
    write s
    write "\n"

Incorrect Pattern (Deprecated)

The following pattern is deprecated and should not be used:

// WRONG - Do not declare platform bindings with BCL stubs
module Platform.Bindings =
    let writeBytes fd buffer count : int =
        Unchecked.defaultof<int>  // BCL dependency!

This pattern was used historically but creates BCL dependencies and requires special handling in the compiler.


Standard File Descriptors

Standard file descriptors on Unix-like systems:

Descriptor  Value  Purpose
stdin       0      Standard input
stdout      1      Standard output
stderr      2      Standard error

Intrinsic Constraints

CCS intrinsics have restrictions:

  1. No closures: Intrinsics cannot capture environment
  2. Primitive types only: Arguments and returns must be primitive or pointer types
  3. No exceptions: Errors returned via return values
  4. No allocation: Intrinsics do not allocate managed memory
  5. No currying: Intrinsics must be called with all arguments

Diagnostics

Code     Message
FS8030   Platform intrinsic not available for target
FS8031   Invalid intrinsic signature
FS8032   Intrinsic requires primitive types
FS8033   Intrinsic called with partial application
Platform Descriptor

The Platform Descriptor is a quotation-based structure that defines all platform-specific characteristics. It flows from Fidelity.Platform through CCS to Alex, enabling compile-time platform specialization.

Structure

type PlatformDescriptor = {
    Architecture: Architecture          // X86_64, ARM64, RISCV64, etc.
    OperatingSystem: OperatingSystem    // Linux, Windows, MacOS, BareMetal
    Dimensions: Map<WidthDimension, int> // Pointer → 64, Register → 64, etc.
    Endianness: Endianness              // Little or Big
    TypeLayouts: Map<string, TypeLayout>  // Type sizes and alignments
    SyscallConvention: SyscallConvention  // Syscall ABI
    MemoryRegions: MemoryRegion list      // Stack, Heap, Text, Data, etc.
    FreestandingStartup: FreestandingStartup option  // Entry point for freestanding mode
}

Freestanding Startup

For freestanding builds (no libc), the platform descriptor includes startup information:

type FreestandingStartup = {
    EntrySymbol: string    // "_start" on Linux
    ExitSyscall: int64     // 60 on Linux x86-64
}

When output_kind = "freestanding" is specified in the project file, the compiler:

  1. Generates a _start wrapper function
  2. _start creates an empty string array, calls the F# main, and calls Sys.exit
  3. Links with -Wl,-e,_start to set the entry point
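
The behaviour of the generated wrapper in step 2 can be written down as an ordinary function. This is only a model of the control flow, assuming the exit primitive is passed in as a parameter so the sketch stays runnable; startWrapper is a hypothetical name, and the real wrapper is emitted by the compiler and calls Sys.exit directly.

```fsharp
/// Model of the _start wrapper: build an empty argument array,
/// run main, and hand its result to the exit primitive.
let startWrapper (main: string[] -> int) (exit: int -> 'R) : 'R =
    let argv : string[] = Array.empty   // freestanding mode passes no arguments
    let code = main argv                // call the F# entry point
    exit code                           // in the real wrapper this never returns
```

The generic return type mirrors the Sys.exit signature (code:int -> 'T): because exit never returns, the wrapper's result type is unconstrained.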

Console Mode

For console builds (with libc), no special entry point handling is needed:

  • libc provides _start which initializes the runtime and calls main
  • The F# main function is emitted with C-compatible signature
  • The F# type array<string> -> int maps to the platform C ABI

Linux x86-64 Example

let platform: Expr<PlatformDescriptor> = <@
    { Architecture = X86_64
      OperatingSystem = Linux
      Dimensions = Map.ofList [ (Pointer, 64); (Register, 64) ]
      Endianness = Little
      TypeLayouts = (* ... *)
      SyscallConvention =
        { CallingConvention = SysV_AMD64
          ArgRegisters = [| RDI; RSI; RDX; R10; R8; R9 |]
          ReturnRegister = RAX
          SyscallNumberRegister = RAX
          SyscallInstruction = Syscall }
      MemoryRegions = (* ... *)
      FreestandingStartup = Some {
        EntrySymbol = "_start"
        ExitSyscall = 60L
      }
    }
@>

Compile-Time Resolution

The platform descriptor is inspected at compile time:

  1. CCS reads the platform descriptor from Fidelity.Platform
  2. For freestanding mode, Intrinsic Elaboration generates the _start wrapper
  3. The wrapper uses Sys.emptyStringArray and Sys.exit intrinsics
  4. Alex emits MLIR; the linker sets the entry point via -Wl,-e,_start

The F# code author writes idiomatic F# (main: string[] -> int); the compiler handles entry point generation based on the platform and output mode.

