hex programming language

Specification

Introduction

hex is a minimalist, concatenative, stack-based programming language designed for experimenting with the concatenative programming paradigm. It is inspired by the min programming language and aims to provide a small yet powerful language for creating short scripts and automating common tasks.

hex supports 32-bit integers (written only in hexadecimal format), strings, and quotations (lists). It features a set of built-in symbols that implement arithmetic operations, boolean logic, bitwise operations, comparison of integers, I/O operations, file manipulation, external process execution, and stack manipulation. The language is fully homoiconic, meaning that everything in hex is data.

hex was created with simplicity in mind, both in its implementation and usage. The language's design encourages a minimalist approach, focusing on essential features and avoiding unnecessary complexity.

Syntax

The syntax of hex is designed to be simple and intuitive, following the principles of concatenative programming. In hex, programs are composed of sequences of literals and symbols, which are evaluated from left to right.

Literals push values onto the stack, while symbols manipulate the stack or perform operations. There are no explicit control structures; instead, hex relies on stack manipulation and quotations to achieve flow control and data management. Symbols in hex can be used to store values globally, providing a way to manage state across different parts of a program.

hex programs are written as sequences of whitespace-separated tokens. Tokens can be literals, symbols, or comments.

This is an example of a simple hex program:

    ; Filters a quotation to keep only the even numbers
    (0x2 0x3 0x4 0x5 0x6) (0x2 % 0x0 ==) filter

This example includes:

One single-line comment: ; Filters a quotation to keep only the even numbers
Two quotations: (0x2 0x3 0x4 0x5 0x6) and (0x2 % 0x0 ==)
Three symbols: %, ==, and filter

Comments

Comments in hex are used to annotate code and are ignored during execution. There are two types of comments: single-line comments and multi-line comments.

Single-line Comments

Single-line comments start with a semicolon (;) and continue until the end of the line. Everything after the semicolon is ignored.

Example:

    ; This is a single-line comment
    0x2 0x3 + ; This adds 0x2 and 0x3

Multi-line Comments

Multi-line comments start with #| and end with |#. Everything between these markers is ignored, allowing comments to span multiple lines.

Example:

    #|
      This is a multi-line comment
      It can span multiple lines
    |#
    0x2 0x3 + #| This adds 0x2 and 0x3 |#

Integer Literals

Integer literals in hex are always written in hexadecimal form, prefixed with 0x. They can contain up to 8 hexadecimal digits, representing 32-bit integers. Hexadecimal digits include the numbers 0-9 and the letters a-f (or A-F), which correspond to the decimal values 10-15.

Integers in hex can be positive or negative, and are implemented using two's complement representation.

Examples:

0x1 represents the decimal value 1.
0xa represents the decimal value 10.
0x1f represents the decimal value 31.
0xffffffff represents the decimal value -1 (in two's complement).

Integers are case-insensitive; typically, lowercase letters are preferred but not mandatory.

String Literals

String literals in hex are delimited by double quotes ("). They can contain any character except for a newline, meaning that strings must be on a single line. To include special characters within a string, hex supports the following escape codes:

\n - Newline
\t - Tab
\r - Carriage return
\b - Backspace
\f - Form feed
\v - Vertical tab
\\ - Backslash
\" - Double quote

Example:

"Hello, World!\nThis is a new line."

Quotation Literals

Quotations in hex are delimited by parentheses (they must start with ( and end with )). They can contain integers, strings, symbols, and even other quotations, allowing for nested structures.

Examples:

(0x1 0x2 0x3) - A quotation containing three integer literals.
(0x1 "hello" (0x2 0x3)) - A nested quotation containing an integer, a string, and another quotation.

Unlike string literals, quotations can span multiple lines, making them suitable for representing complex data structures and control flow mechanisms.

Symbol Identifiers

Symbol identifiers in hex are used to represent built-in native symbols and user-defined symbols.

There are 0x40 (64) native symbols in hex, and some of them consist of special characters, like == or . (a single period).

User-defined symbols, by contrast:

must start with a letter (a-z or A-Z) or an underscore (_)
can contain additional letters (a-z or A-Z), digits (0-9), dashes (-) and underscores (_)

Symbols are case-sensitive.
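
The rules for user-defined symbol identifiers can be expressed as a regular expression. The following Python sketch is only an illustration: the helper name is_valid_user_symbol is hypothetical, and the canonical implementation (written in C) may apply further limits, such as a maximum length.

```python
import re

# Must start with a letter or an underscore; may continue with
# letters, digits, dashes, and underscores.
USER_SYMBOL = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_valid_user_symbol(name: str) -> bool:
    """Return True if name follows the user-defined symbol rules above."""
    return USER_SYMBOL.fullmatch(name) is not None


print(is_valid_user_symbol("t-count"))  # True
print(is_valid_user_symbol("_n"))       # True
print(is_valid_user_symbol("2fast"))    # False: starts with a digit
print(is_valid_user_symbol("=="))       # False: special characters are not allowed
```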

Data Types

hex supports the following data types:

Integers — 32-bit signed integers represented in hexadecimal form.
Strings — Sequences of characters delimited by double quotes.
Quotations — Lists of literals, symbols, and other quotations delimited by parentheses.
Symbols — Identifiers representing native or user-defined symbols.

Integers

Integers in hex are 32-bit signed values represented in hexadecimal form. They can be positive or negative (using two's complement), and range from -2,147,483,648 (-2³¹) to 2,147,483,647 (2³¹ - 1).

Integers are written using the prefix 0x followed by up to 8 hexadecimal digits.

hex integers are case-insensitive, meaning that 0x1f and 0X1F are equivalent (however, lowercase letters are preferred).

Because hex has no boolean data type, 0x0 is assumed to be false, and any other integer value is assumed to be true.

Examples:

0x1 — Represents the decimal value 1.
0xffffffff — Represents the decimal value -1.
0x10 — Represents the decimal value 16.
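
The two's complement reading of a literal can be checked with a few lines of Python. This is a sketch for illustration only; parse_hex_literal is a hypothetical helper, not part of hex.

```python
import string


def parse_hex_literal(token: str) -> int:
    """Convert a literal such as '0xffffffff' to a signed 32-bit value."""
    if not token.lower().startswith("0x"):
        raise ValueError(f"missing 0x prefix: {token!r}")
    digits = token[2:]
    if not 1 <= len(digits) <= 8:
        raise ValueError(f"expected 1 to 8 hexadecimal digits: {token!r}")
    if any(c not in string.hexdigits for c in digits):
        raise ValueError(f"invalid hexadecimal digit in {token!r}")
    value = int(digits, 16)      # unsigned value, 0 .. 2**32 - 1
    if value >= 0x80000000:      # sign bit set: map to the negative range
        value -= 0x100000000
    return value


print(parse_hex_literal("0x1"))         # 1
print(parse_hex_literal("0x10"))        # 16
print(parse_hex_literal("0xffffffff"))  # -1
print(parse_hex_literal("0x80000000"))  # -2147483648
```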

Strings

Strings in hex are sequences of characters delimited by double quotes ("). They can contain any character except for a newline character, and special characters can be escaped using backslashes.

Strings are used to represent textual data and can be manipulated using various string manipulation symbols in hex.

Examples:

"Hello, World!" — Represents the string Hello, World!.
"This is a string with a newline:\nSecond line." — Represents a string with a newline character.

Quotations

Quotations in hex are lists of literals (including other quotations) and symbols delimited by parentheses (( and )). They are used to represent structured data and are a fundamental part of the language's syntax.

An important thing to remember about quotations is that the symbols they contain are not executed when the quotation is pushed on the stack. This is a fundamental property of hex and other concatenative programming languages: quotations effectively act as code blocks, holding code that can be executed later on using appropriate dequoting symbols.

Consider the following example:

    0x0 "t-count" :
    (t-count 0xa <)
        (
            t-count puts
            t-count 0x1 + "t-count" :
        )
    while
    "t-count" #

This example defines a symbol t-count that counts from 0 to 9 and prints each number to the standard output. The quotation (t-count 0xa <) is used to check if the count is less than 10, and the while symbol repeats the process until the condition is no longer met.

In this case, the two quotations are first pushed on the stack, and then the while symbol performs the dequoting necessary to implement the expected control flow.

Symbols

In hex there are native symbols and user-defined symbols. Native symbols are built-in functions that perform specific operations, while user-defined symbols are created by the user to store values or define custom behavior.

hex provides 64 (0x40) native symbols that cover a wide range of functionality, including arithmetic operations, control flow, I/O operations, file manipulation, and stack manipulation.

You can think of symbols either as functions that manipulate the stack or as variables that store literal values.

While native symbol identifiers sometimes consist of special characters, like ==, user-defined symbol identifiers must adhere to specific rules (see Symbol Identifiers).

All symbols are stored in a single registry, implemented as a simple dictionary. Therefore, all symbols in hex are global, and not lexically scoped. The main driver for this is to keep the language as simple as possible.

You can store your own symbols and free them using the memory management symbols provided natively. However, native symbols cannot be freed.

Stack

The stack is a fundamental data structure in hex that holds values and controls the flow of execution. hex is a stack-based language, meaning that all operations are performed on a stack of values. The order according to which items are added to or removed from the stack is LIFO.

In the canonical implementation, the hex stack can contain up to 256 items. If you try to push more items on the stack, a stack overflow error will be raised and the program will terminate. While this may seem a relatively low number, in practice there are rarely more than 5-10 items on the stack at any time, because symbols frequently consume items from the stack.
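
The LIFO discipline and the size limit can be modelled in a few lines of Python. This is a rough sketch rather than the canonical C implementation; the class name and the exception types are assumptions chosen for illustration.

```python
STACK_SIZE = 256  # capacity of the canonical implementation, per the text above


class Stack:
    """A bounded last-in, first-out stack."""

    def __init__(self, capacity: int = STACK_SIZE):
        self.capacity = capacity
        self.items = []

    def push(self, item) -> None:
        if len(self.items) >= self.capacity:
            raise OverflowError("stack overflow")
        self.items.append(item)

    def pop(self):
        if not self.items:
            raise IndexError("stack underflow")
        return self.items.pop()


stack = Stack()
for value in (1, 2, 3):
    stack.push(value)
print(stack.pop())  # 3: the last item pushed is the first one removed
```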

Pushing Literals

Literals are values that are directly pushed onto the stack. In hex, literals can be integers, strings, or quotations. When a literal is encountered in a hex program, it is pushed onto the stack for further processing.

Examples:

0x1 — Pushes the integer 1 onto the stack.
"Hello, World!" — Pushes the string Hello, World! onto the stack.
(0x1 0x2 0x3) — Pushes the quotation (0x1 0x2 0x3) onto the stack.

Pushing Symbols

Symbols in hex are used to represent native or user-defined functions and values. When a symbol is encountered in a hex program, it is looked up in the registry, and its associated value or function is pushed onto the stack.

Native symbols can perform manipulations on the stack; they can drop values from the stack and add values back in.

In the canonical implementation, native symbols are implemented as native C functions that are executed whenever the corresponding native symbol is pushed on the stack.

By contrast, you can only store hex literals as user-defined symbols. When storing a quotation as a symbol, it can be used as data (a list of values) or as a portion of a hex program that can then be dequoted through symbols like ., which pushes all the items in a quotation on the stack, one by one.

Consider the following example hex program:

    (dup *) "square" :
    0x3 square . puts ; prints 9

This program uses the : symbol to define a symbol square that can be used to calculate the square of an integer. From then on, if square is found anywhere in the same hex program, it will be substituted with (dup *). However, this is not enough to calculate the square, because the logic to do so is inside a quotation. To "execute" (dequote) a quotation, you must use the . symbol, which pushes all the items in the quotation on the stack. The program is therefore equivalent to the following:

    0x3 dup * puts ; prints 9

While the : symbol can be used to store quotations that can then be dequoted later using ., typically you want to define operators that are immediately dequoted when pushed on the stack, thus behaving like their native counterparts.

You can achieve this using the :: symbol, and the previous example can be rewritten as follows:

    (dup *) "square" ::
    0x3 square puts ; prints 9

In this case, you no longer need to explicitly dequote square using ., because it has been stored as an operator and hex knows it has to be immediately dequoted when pushed on the stack.
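
The difference between : and :: can be modelled with a toy evaluator. The Python sketch below is an assumption-laden illustration, not the canonical implementation: Sym, run, and the registry layout are hypothetical names, only five native symbols are modelled, and 32-bit wrap-around is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A symbol token; ints, strs, and lists stand for hex literals."""
    name: str


def store(stack, registry, operator):
    name = stack.pop()    # s: the identifier, on top of the stack
    value = stack.pop()   # a: the literal to store
    registry[name] = (value, operator)


def multiply(stack, registry):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)   # 32-bit wrap-around is ignored in this sketch


NATIVE = {
    "dup": lambda stack, registry: stack.append(stack[-1]),
    "*": multiply,
    ":": lambda stack, registry: store(stack, registry, operator=False),
    "::": lambda stack, registry: store(stack, registry, operator=True),
    ".": lambda stack, registry: run(stack.pop(), stack, registry),
}


def run(program, stack, registry):
    """Push every item of a program (or quotation) in order."""
    for item in program:
        if not isinstance(item, Sym):
            stack.append(item)                  # literal: pushed as-is
        elif item.name in NATIVE:
            NATIVE[item.name](stack, registry)  # native symbol: executed
        else:
            value, operator = registry[item.name]
            if operator and isinstance(value, list):
                run(value, stack, registry)     # stored with :: so dequoted at once
            else:
                stack.append(value)             # stored with : so pushed as data
    return stack


square = [Sym("dup"), Sym("*")]

# (dup *) "square" : 0x3 square .
print(run([square, "square", Sym(":"), 3, Sym("square"), Sym(".")], [], {}))  # [9]
# (dup *) "square" :: 0x3 square
print(run([square, "square", Sym("::"), 3, Sym("square")], [], {}))           # [9]
```

Leaving out the final . in the first program would leave the quotation itself on the stack, on top of the integer 3.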

Registry

The registry in hex is a simple dictionary that stores symbols and their associated values or functions. The registry is used to look up symbols when they are encountered in a hex program and to store user-defined symbols and their values.

When a symbol is pushed onto the stack, hex looks up the symbol in the registry and pushes its associated value or function onto the stack. If the symbol is not found in the registry, an error is raised.

The registry is implemented as a simple key-value store, where the keys are symbol identifiers and the values are the associated values or functions. The registry is global and shared across the entire hex program.

hex provides a set of native symbols that are pre-defined in the registry and cannot be deleted or modified. These symbols provide basic functionality for arithmetic operations, control flow, I/O operations, file manipulation, and stack manipulation.

hex also allows users to define their own symbols and store values in the registry. User-defined symbols can be created, modified, and deleted using the memory management symbols provided natively.

It is important to note that the registry is a global store, meaning that symbols are not lexically scoped and can be accessed from anywhere in the program. This design choice was made to keep the language simple and straightforward.

In the canonical hex implementation, the registry can hold up to 4096 symbols (4032 of which can be user-defined symbols).

Hex Bytecode eXecutable (HBX) Format

hex programs can be compiled to a binary format called Hex Bytecode eXecutable (HBX). HBX is a compact binary representation of hex programs that can be executed by the hex interpreter. HBX files are typically smaller and faster to load than hex source files, making them ideal for distribution and execution.

HBX files are structured as follows:

Bytecode Header (8 bytes)
Bytecode Symbol Table — containing the list of all symbols that have been defined by the user in the compiled program.
Bytecode Program — containing the compiled hex program as a sequence of opcodes and payload.

Bytecode Header

The header of an HBX file consists of 8 bytes:

01 — Header Start
68 — The letter 'h'
65 — The letter 'e'
78 — The letter 'x'
01 — Version
00 — First byte indicating the size of the symbol table (little-endian)
00 — Second byte indicating the size of the symbol table (little-endian)
02 — Header End
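
The header layout can be checked with a short Python sketch. The function name parse_hbx_header is hypothetical, and treating any unexpected marker byte as an error is an assumption rather than documented behavior.

```python
import struct


def parse_hbx_header(data: bytes) -> tuple[int, int]:
    """Return (version, symbol_table_size) from the first 8 bytes of an HBX file."""
    if len(data) < 8:
        raise ValueError("an HBX header is 8 bytes long")
    # 1 start byte, 3 magic bytes, 1 version byte,
    # a little-endian 16-bit size, 1 end byte
    start, magic, version, table_size, end = struct.unpack("<B3sBHB", data[:8])
    if start != 0x01 or magic != b"hex" or end != 0x02:
        raise ValueError("not an HBX header")
    return version, table_size


print(parse_hbx_header(bytes.fromhex("01 68 65 78 01 00 00 02")))  # (1, 0)
print(parse_hbx_header(bytes.fromhex("01 68 65 78 01 02 00 02")))  # (1, 2)
```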

Bytecode Symbol Table

The symbol table in an HBX file contains the list of all symbols that have been defined by the user in the compiled program. Symbols are stored sequentially using the following format:

Symbol Length (1 byte) — The length of the symbol identifier (Can be up to 255 characters long).
Symbol Identifier (variable length) — The symbol identifier as a sequence of ASCII characters (not null-terminated).

The symbol table can theoretically contain up to 65535 entries (the maximum value representable in two bytes); however, the maximum number of user-defined symbols is currently limited to 4032, since the registry has a maximum size of 4096 items and 64 are reserved for native symbols.
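
Reading the symbol table amounts to walking a list of length-prefixed ASCII strings. The Python sketch below assumes that the size field in the header counts entries (as the full bytecode example in this document suggests) and that the table starts right after the 8-byte header; parse_symbol_table is a hypothetical helper.

```python
def parse_symbol_table(data: bytes, count: int, offset: int = 8) -> tuple[list[str], int]:
    """Read count length-prefixed identifiers starting at offset.

    Returns the identifiers and the offset of the first byte after the table.
    """
    symbols = []
    for _ in range(count):
        length = data[offset]                    # 1 byte: identifier length
        offset += 1
        raw = data[offset:offset + length]
        if len(raw) != length:
            raise ValueError("truncated symbol table")
        symbols.append(raw.decode("ascii"))      # ASCII, not null-terminated
        offset += length
    return symbols, offset


hbx = bytes.fromhex(
    "01 68 65 78 01 02 00 02"            # header, symbol table of size 2
    "02 5f 6e"                           # _n
    "09 66 61 63 74 6f 72 69 61 6c"      # factorial
)
print(parse_symbol_table(hbx, 2))  # (['_n', 'factorial'], 21)
```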

Bytecode Program

The bytecode program in an HBX file contains the compiled hex program as a sequence of opcodes and payload. Each opcode is represented by a single byte, and some opcodes may have an associated payload.

The following opcodes are defined for pushing different types of values on the stack:

00 — (LOOKUP) Lookup user symbol
01 — (PUSHIN) Push Integer
02 — (PUSHST) Push String
03 — (PUSHQT) Push Quotation

Other opcodes are assigned to each native symbol, and range from 10 to 4f.

Each of the four opcodes for pushing data has an associated payload, which is used to provide additional information to the opcode. The payload is represented as a sequence of bytes following the opcode byte.

Opcodes for native symbols, instead, do not have any associated payload.

00 - LOOKUP

The 00 (LOOKUP) opcode is used to look up a user-defined symbol in the symbol table and push its associated value onto the stack. The 00 opcode is followed by two bytes representing the index of the symbol in the symbol table, in little-endian format.

For example, the sequence 00 03 00 instructs the interpreter to perform a lookup in the symbol table and retrieve the 4th symbol (index 3).

01 - PUSHIN

The 01 (PUSHIN) opcode is used to push an integer value onto the stack. The 01 opcode is followed by:

One byte representing the number of following bytes used to represent the integer (1 to 4).
One to four bytes representing the signed integer value using two's complement, in little-endian format.

For example, the sequence 01 04 fe ff ff ff represents the integer -2 (0xfffffffe), and the sequence 01 01 10 represents the integer 16 (0x10).
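
A decoder for this opcode might look like the Python sketch below. It assumes that payloads shorter than four bytes are zero-extended before the value is read as a 32-bit two's complement integer, which matches the two examples above but is not stated explicitly; decode_pushin is a hypothetical helper.

```python
def decode_pushin(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode a PUSHIN instruction; return (value, offset of the next opcode)."""
    if data[offset] != 0x01:
        raise ValueError("not a PUSHIN opcode")
    size = data[offset + 1]
    if not 1 <= size <= 4:
        raise ValueError("integer payload must be 1 to 4 bytes long")
    raw = data[offset + 2:offset + 2 + size]
    if len(raw) != size:
        raise ValueError("truncated integer payload")
    value = int.from_bytes(raw, "little")  # assumption: zero-extended to 32 bits
    if value >= 0x80000000:                # sign bit set: negative value
        value -= 0x100000000
    return value, offset + 2 + size


print(decode_pushin(bytes.fromhex("01 04 fe ff ff ff")))  # (-2, 6)
print(decode_pushin(bytes.fromhex("01 01 10")))           # (16, 3)
```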

02 - PUSHST

The 02 (PUSHST) opcode is used to push a string value onto the stack. The 02 opcode is followed by:

A variable number of bytes representing the length of the string, encoded using the Little Endian Base 128 (LEB128) algorithm.
Variable-length sequence of bytes representing the ASCII characters of the string, without the null terminator. Note that only ASCII characters are supported by the HBX format right now; attempting to encode non-ASCII characters will result in a compiler error.

The following sequence:

02 16 54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 73 74 72 69 6e 67 21

represents the string "This is a test string!"
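
The sequence above can be decoded with the following Python sketch. It assumes the unsigned variant of LEB128 (a length cannot be negative); decode_uleb128 and decode_pushst are hypothetical helper names.

```python
def decode_uleb128(data: bytes, offset: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 number; return (value, next offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift   # the low 7 bits carry data
        if byte & 0x80 == 0:               # high bit clear: last byte
            return result, offset
        shift += 7


def decode_pushst(data: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode a PUSHST instruction; return (string, offset of the next opcode)."""
    if data[offset] != 0x02:
        raise ValueError("not a PUSHST opcode")
    length, offset = decode_uleb128(data, offset + 1)
    raw = data[offset:offset + length]
    if len(raw) != length:
        raise ValueError("truncated string payload")
    return raw.decode("ascii"), offset + length


sequence = bytes.fromhex(
    "02 16 54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 73 74 72 69 6e 67 21"
)
print(decode_pushst(sequence))  # ('This is a test string!', 24)
```

Lengths below 128 fit in a single byte, as in this example; longer strings need additional length bytes.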

03 - PUSHQT

The 03 (PUSHQT) opcode is used to push a quotation value onto the stack. The 03 opcode is followed by:

A variable number of bytes representing the number of items in the quotation, encoded using the Little Endian Base 128 (LEB128) algorithm.
The opcode sequences for each item of the quotation.

The following sequence:

03 05 02 04 74 65 73 74 01 01 01 38 3d 45

represents the quotation ("test" 0x1 dec cat puts)
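
Seen from the compiler's side, the same sequence can be produced by the Python sketch below. It is an illustration under stated assumptions: integers are written with the smallest number of bytes that holds their unsigned 32-bit form, item counts use unsigned LEB128, only three native opcodes are listed, and user-defined symbols (LOOKUP) are left out. Native and encode_item are hypothetical names.

```python
class Native(str):
    """Marks a token as a native symbol rather than a string literal."""


# Subset of the opcodes listed in the Native Symbol Reference.
NATIVE_OPCODES = {"dec": 0x38, "cat": 0x3D, "puts": 0x45}


def encode_uleb128(n: int) -> bytes:
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_item(item) -> bytes:
    if isinstance(item, Native):
        return bytes([NATIVE_OPCODES[item]])
    if isinstance(item, int):
        unsigned = item & 0xFFFFFFFF                      # two's complement form
        size = max(1, (unsigned.bit_length() + 7) // 8)   # assumed minimal size
        return bytes([0x01, size]) + unsigned.to_bytes(size, "little")
    if isinstance(item, str):
        raw = item.encode("ascii")
        return b"\x02" + encode_uleb128(len(raw)) + raw
    if isinstance(item, list):
        body = b"".join(encode_item(child) for child in item)
        return b"\x03" + encode_uleb128(len(item)) + body
    raise TypeError(f"cannot encode {item!r}")


quotation = ["test", 1, Native("dec"), Native("cat"), Native("puts")]
print(encode_item(quotation).hex(" "))
# 03 05 02 04 74 65 73 74 01 01 01 38 3d 45
```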

Full Bytecode Example

Consider the following hex program:

    (
      "_n" :
      (_n 0x0 <=)
        (0x1)
        (_n dup 0x1 - factorial *)
      if
      "_n" #
    ) "factorial" ::
    0x5 factorial dec puts

This gets compiled to the following bytecode:

01 68 65 78 01 02 00 02 
02 5f 6e 09 66 61 63 74 
6f 72 69 61 6c 03 08 02 
02 5f 6e 10 03 03 00 00 
00 01 01 00 31 03 01 01
01 01 03 06 00 00 00 19 
01 01 01 22 00 01 00 23 
14 02 02 5f 6e 12 02 09 
66 61 63 74 6f 72 69 61 
6c 11 01 01 05 00 01 00 
38 45

And here is an annotated breakdown:

; Header with symbol table of size 2
01 68 65 78 01 02 00 02 
; Symbol Table: _n, factorial
02 5f 6e 
09 66 61 63 74 6f 72 69 61 6c 
; Push quotation of eight items
03 08 
   ; Push string "_n"
   02 02 5f 6e 
   10 ; Symbol :
   ; Push quotation of three items
   03 03 
      ; Lookup first symbol (_n)
      00 00 00 
      ; Push integer 0x0
      01 01 00 
      31 ; Symbol <=
   ; Push quotation of one item
   03 01 
      ; Push integer 0x1
      01 01 01 
   ; Push quotation of six items
   03 06 
      ; Lookup first symbol (_n)
      00 00 00 
      19 ; Symbol dup
      ; Push integer 0x1
      01 01 01 
      22 ; Symbol -
      ; Lookup second symbol (factorial)
      00 01 00 
      23 ; Symbol *
   14 ; Symbol if
   ; Push string "_n"
   02 02 5f 6e
   12 ; Symbol #
; Push string "factorial"
02 09 66 61 63 74 6f 72 69 61 6c
11 ; Symbol ::
; Push integer 5
01 01 05 
; Lookup second symbol (factorial)
00 01 00 
38 ; Symbol dec
45 ; Symbol puts
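
The breakdown above can be reproduced mechanically. The following Python sketch of a disassembler is not part of hex: it assumes that native opcodes follow the order of the Native Symbol Reference (10 to 4f), that lengths and counts are unsigned LEB128, and it prints integers in their unsigned hexadecimal form without re-escaping strings. All function names are hypothetical.

```python
NATIVE = (
    ": :: # symbols if while error try throw dup stack drop swap . ! ' debug "
    "+ - * / % & | ^ ~ << >> == != > < >= <= and or not xor "
    "int str dec hex ord chr type cat len get index join split sub map "
    "puts warn print gets read write append args exit exec run"
).split()  # 64 names, opcodes 0x10 to 0x4f


def uleb128(data: bytes, pos: int) -> tuple[int, int]:
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def decode_item(data: bytes, pos: int, symbols: list[str]) -> tuple[str, int]:
    op = data[pos]
    pos += 1
    if op == 0x00:                                   # LOOKUP
        index = int.from_bytes(data[pos:pos + 2], "little")
        return symbols[index], pos + 2
    if op == 0x01:                                   # PUSHIN
        size = data[pos]
        value = int.from_bytes(data[pos + 1:pos + 1 + size], "little")
        return hex(value), pos + 1 + size
    if op == 0x02:                                   # PUSHST
        length, pos = uleb128(data, pos)
        return '"' + data[pos:pos + length].decode("ascii") + '"', pos + length
    if op == 0x03:                                   # PUSHQT
        count, pos = uleb128(data, pos)
        items = []
        for _ in range(count):
            text, pos = decode_item(data, pos, symbols)
            items.append(text)
        return "(" + " ".join(items) + ")", pos
    if 0x10 <= op <= 0x4F:                           # native symbol
        return NATIVE[op - 0x10], pos
    raise ValueError(f"unknown opcode {op:#04x}")


def disassemble(data: bytes) -> str:
    if data[:4] != b"\x01hex" or data[7] != 0x02:
        raise ValueError("not an HBX file")
    count = int.from_bytes(data[5:7], "little")
    pos, symbols = 8, []
    for _ in range(count):                           # symbol table
        length = data[pos]
        symbols.append(data[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    tokens = []
    while pos < len(data):                           # program
        text, pos = decode_item(data, pos, symbols)
        tokens.append(text)
    return " ".join(tokens)


HBX = bytes.fromhex(
    "01 68 65 78 01 02 00 02 02 5f 6e 09 66 61 63 74"
    "6f 72 69 61 6c 03 08 02 02 5f 6e 10 03 03 00 00"
    "00 01 01 00 31 03 01 01 01 01 03 06 00 00 00 19"
    "01 01 01 22 00 01 00 23 14 02 02 5f 6e 12 02 09"
    "66 61 63 74 6f 72 69 61 6c 11 01 01 05 00 01 00"
    "38 45"
)
print(disassemble(HBX))
# ("_n" : (_n 0x0 <=) (0x1) (_n dup 0x1 - factorial *) if "_n" #) "factorial" :: 0x5 factorial dec puts
```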

Native Symbol Reference

hex provides a set of 64 (0x40) native symbols that are built-in and pre-defined in the registry. The following section provides details on each of these symbols, including a signature illustrating how each symbol manipulates the stack.

The notation used to specify the signature of a symbol is as follows:

    in1 in2 ... inN → out1 out2 ... outM

Where in1, in2, ..., inN are the items consumed from the stack, and out1, out2, ..., outM are the items pushed back onto the stack.

Note that the → character represents the symbol being described, and:

inN is the item on top of the stack before the symbol is pushed on the stack.
outM is the item on top of the stack after the symbol is pushed on the stack.

The following abbreviations are used to represent different types of literals (and each can have a numerical suffix for differentiation within the signature):

a — Any literal value
s — String
q — Quotation
i — Integer

Additionally, * is used to represent zero or more literals of any type.

Consider, for example, the following signature for the swap symbol:

a1 a2 → a2 a1

This signature indicates that the symbol swap drops two items from the stack (a1 and a2), and then pushes them back onto the stack in reverse order (a2 and a1).
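
Read against a concrete stack, the signature can be illustrated with a Python list whose last element is the top of the stack. This is only a sketch of the notation, not of how the canonical implementation works.

```python
def swap(stack: list) -> None:
    """a1 a2 → a2 a1: exchange the two topmost items in place."""
    if len(stack) < 2:
        raise IndexError("swap needs two items on the stack")
    a2 = stack.pop()   # a2 is on top of the stack before swap
    a1 = stack.pop()
    stack.append(a2)
    stack.append(a1)   # a1 is on top of the stack after swap


stack = [1, "x", 2]    # the top of the stack is the last element
swap(stack)
print(stack)           # [1, 2, 'x']
```

Items below the ones named in the signature (here, the integer 1) are left untouched.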

Memory Management Symbols

`:` Symbol

a s →

OPCODE: 10

Stores the literal a in the registry as the symbol s.

`::` Symbol

a s →

OPCODE: 11

Stores the literal a in the registry as the symbol s. If a is a quotation, it will be immediately dequoted when pushed on the stack.

`#` Symbol

s →

OPCODE: 12

Frees the symbol s from the registry.

`symbols` Symbol

→ q

OPCODE: 13

Pushes a quotation on the stack containing the identifiers of all the symbols currently stored in the registry.

Control Flow Symbols

`if` Symbol

q1 q2 q3 → *

OPCODE: 14

Dequotes quotation q1; if it pushes a positive integer on the stack, it dequotes q2, otherwise it dequotes q3.

`while` Symbol

q1 q2 → *

OPCODE: 15

Dequotes quotation q1; if it pushes a positive integer on the stack, it dequotes q2 and repeats the process.

`error` Symbol

→ s

OPCODE: 16

Pushes the last error message to the stack.

`try` Symbol

q1 q2 → *

OPCODE: 17

Dequotes quotation q1; if it throws an error, it dequotes q2.

`throw` Symbol

s →

OPCODE: 18

Throws an error printing error message s.

Stack Management Symbols

`dup` Symbol

a → a a

OPCODE: 19

Duplicates literal a and pushes it on the stack.

`stack` Symbol

→ q

OPCODE: 1a

Pushes the items currently on the stack as a quotation.

`drop` Symbol

a →

OPCODE: 1b

Removes the top item from the stack.

`swap` Symbol

a1 a2 → a2 a1

OPCODE: 1c

Swaps the top two items on the stack.

Evaluation Symbols

`.` Symbol

q → *

OPCODE: 1d

Dequotes quotation q.

`!` Symbol

(s1|q) s2 → *

OPCODE: 1e

Evaluates the string s1 as a hex program, or the quotation q of integers as hex bytecode (HBX format). s2 is used as the file name displayed in stack traces.

`'` Symbol

a → q

OPCODE: 1f

Pushes the literal a wrapped in a quotation on the stack.

`debug` Symbol

q → *

OPCODE: 20

Dequotes q with debugging enabled.

Arithmetic Symbols

`+` Symbol

i1 i2 → i

OPCODE: 21

Pushes the result of the sum of i1 and i2 on the stack.

`-` Symbol

i1 i2 → i

OPCODE: 22

Pushes the result of the subtraction of i2 from i1 on the stack.

`*` Symbol

i1 i2 → i

OPCODE: 23

Pushes the result of the multiplication of i1 and i2 on the stack.

`/` Symbol

i1 i2 → i

OPCODE: 24

Pushes the result of the division of i1 by i2 on the stack.

`%` Symbol

i1 i2 → i

OPCODE: 25

Pushes the result of the modulo of i1 by i2 on the stack.

Bitwise Operations Symbols

`&` Symbol

i1 i2 → i

OPCODE: 26

Pushes the result of a bitwise and of i1 and i2 on the stack.

`|` Symbol

i1 i2 → i

OPCODE: 27

Pushes the result of a bitwise or of i1 and i2 on the stack.

`^` Symbol

i1 i2 → i

OPCODE: 28

Pushes the result of a bitwise xor of i1 and i2 on the stack.

`~` Symbol

i → i

OPCODE: 29

Pushes the result of a bitwise not of i on the stack.

`<<` Symbol

i1 i2 → i

OPCODE: 2a

Pushes the result of shifting i1 by i2 bits to the left.

`>>` Symbol

i1 i2 → i

OPCODE: 2b

Pushes the result of shifting i1 by i2 bits to the right.

Comparisons Symbols

`==` Symbol

a1 a2 → i

OPCODE: 2c

Pushes 0x1 on the stack if a1 and a2 are equal, or 0x0 otherwise.

`!=` Symbol

a1 a2 → i

OPCODE: 2d

Pushes 0x1 on the stack if a1 and a2 are not equal, or 0x0 otherwise.

`>` Symbol

i1 i2 → i

OPCODE: 2e

Pushes 0x1 on the stack if i1 is greater than i2, or 0x0 otherwise.

`<` Symbol

i1 i2 → i

OPCODE: 2f

Pushes 0x1 on the stack if i1 is less than i2, or 0x0 otherwise.

`>=` Symbol

i1 i2 → i

OPCODE: 30

Pushes 0x1 on the stack if i1 is greater than or equal to i2, or 0x0 otherwise.

`<=` Symbol

i1 i2 → i

OPCODE: 31

Pushes 0x1 on the stack if i1 is less than or equal to i2, or 0x0 otherwise.

Boolean Logic Symbols

`and` Symbol

i1 i2 → i

OPCODE: 32

Pushes 0x1 on the stack if both i1 and i2 are non-zero integers, or 0x0 otherwise.

`or` Symbol

i1 i2 → i

OPCODE: 33

Pushes 0x1 on the stack if either i1 or i2 is a non-zero integer, or 0x0 otherwise.

`not` Symbol

i → i

OPCODE: 34

Pushes 0x1 on the stack if i is zero, or 0x0 otherwise.

`xor` Symbol

i1 i2 → i

OPCODE: 35

Pushes 0x1 on the stack if i1 and i2 are different, or 0x0 otherwise.

Type Checking and Conversion Symbols

`int` Symbol

s → i

OPCODE: 36

Converts the string s representing a hexadecimal integer to an integer value and pushes it on the stack.

`str` Symbol

i → s

OPCODE: 37

Converts the integer i to a string representing a hexadecimal integer and pushes it on the stack.

`dec` Symbol

i → s

OPCODE: 38

Converts the integer i to a string representing a decimal integer and pushes it on the stack.

`hex` Symbol

s → i

OPCODE: 39

Converts the string s representing a decimal integer to an integer value and pushes it on the stack.

`ord` Symbol

s → i

OPCODE: 3a

Pushes the ASCII value of the string s on the stack.

If s is longer than 1 character or if it is not representable using an ASCII code between 0x0 and 0x7f, 0xffffffff is pushed on the stack.

`chr` Symbol

i → s

OPCODE: 3b

Pushes the ASCII character represented by the integer i on the stack.

If i is not between 0x0 and 0x7f, an empty string is pushed on the stack.
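
The behavior described for ord and chr can be mirrored in Python as follows. This is a sketch based only on the descriptions above; the function names are hypothetical, and returning -1 for an empty string is an assumption, since that case is not covered by the text.

```python
def hex_ord(s: str) -> int:
    """Return the ASCII code of a one-character string, or -1 (0xffffffff)."""
    if len(s) == 1 and ord(s) <= 0x7F:
        return ord(s)
    return -1          # 0xffffffff when read as a 32-bit two's complement value


def hex_chr(i: int) -> str:
    """Return the ASCII character with code i, or an empty string."""
    if 0x0 <= i <= 0x7F:
        return chr(i)
    return ""


print(hex_ord("A"))         # 65
print(hex_ord("AB"))        # -1
print(repr(hex_chr(0x41)))  # 'A'
print(repr(hex_chr(0x80)))  # ''
```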

`type` Symbol

a → s

OPCODE: 3c

Pushes the type of the literal a on the stack (integer, string, quotation, native-symbol, user-symbol, invalid, or unknown).

List (Strings and Quotations) Symbols

`cat` Symbol

(s1 s2|q1 q2) → (s|q)

OPCODE: 3d

Pushes the result of the concatenation of two strings or two quotations on the stack.

`len` Symbol

(s|q) → i

OPCODE: 3e

Pushes the length of a string or a quotation on the stack.

`get` Symbol

(s|q) i → a

OPCODE: 3f

Pushes the ith item of a string or a quotation on the stack.

`index` Symbol

(s a|q a) → i

OPCODE: 40

Pushes the index of the first occurrence of the literal a in a string or a quotation on the stack. If a is not found, 0xffffffff is pushed on the stack.

`join` Symbol

q s1 → s2

OPCODE: 41

Assuming that q is a quotation containing only strings, pushes the string s2 obtained by joining each element of q together using s1 as a delimiter.

`split` Symbol

s1 s2 → q

OPCODE: 42

Pushes a quotation q containing the strings obtained by splitting s1 using s2 as a delimiter.

`sub` Symbol

s1 s2 s3 → s4

OPCODE: 43

Pushes the string s4 obtained by replacing the first occurrence of s2 in s1 by s3.

`map` Symbol

q1 q2 → q3

OPCODE: 44

Dequotes quotation q1 and applies it to each item of quotation q2 to obtain a new quotation q3.

Input/Output Symbols

`puts` Symbol

a →

OPCODE: 45

Prints a to standard output, followed by a new line.

`warn` Symbol

a →

OPCODE: 46

Prints a to standard error, followed by a new line.

`print` Symbol

a →

OPCODE: 47

Prints a to standard output.

`gets` Symbol

→ s

OPCODE: 48

Reads a line from standard input and pushes it on the stack as a string.

File Symbols

`read` Symbol

s1 → (s2|q)

OPCODE: 49

Reads the content of the file s1 and pushes it on the stack as a string, if the file is in textual format, or as a quotation of integers representing bytes, if the file is in binary format.

`write` Symbol

(s1|q) s2 →

OPCODE: 4a

Writes the string s1 or the array of integers representing bytes q to the file s2.

`append` Symbol

(s1|q) s2 →

OPCODE: 4b

Appends the string s1 or the array of integers representing bytes q to the file s2.

Shell Symbols

`args` Symbol

→ q

OPCODE: 4c

Pushes the command line arguments as a quotation on the stack.

`exit` Symbol

i →

OPCODE: 4d

Exits the program with the exit code i.

`exec` Symbol

s → i

OPCODE: 4e

Executes the string s as a shell command, and pushes the command return code on the stack.

`run` Symbol

s → q

OPCODE: 4f

Executes the string s as a shell command, capturing its output and errors. It pushes a quotation on the stack containing the following items:

the exit code of the command as an integer
the standard output of the command as a string
the standard error of the command as a string
