The Tutorial
Hey!
So you want to learn hex huh? Well, you are in the right place.
hex is...
- ...tiny: a single executable, just a few hundred KB in size.
- ...minimalist: just integers, strings, arrays, and symbols. No statements, no expressions, and (almost) no variables.
- ...concatenative: think reverse Polish notation, postfix syntax, stack-based.
- ...slightly-esoteric. It definitely has quirks: hexadecimal-only integers, global-only symbols, ...things like that.
In a word: magical! Hence the name. Really. And the hexadecimal integer thing of course, that too.
Setting things up
Now that you know what hex is, you have to get it, then you can run it. Or not really, you can just go here and play with it.
If you got hex, it's just a single executable. You can run it with options, or without, as explained on the Get page.
You can even double-click it, especially if you are on Windows. That will bring up the REPL, which well... reads from input, evaluates what you enter, prints the result (or better, the first item on The Stack), and then loops again.
Working with things
hex is tiny. I may have said that already. It is also simple. As such, it doesn't really have fancy things like objects, or... floating-point numbers, for example.
Instead, it focuses on making do with just a few things, or better, data types.
It understands integers like 27 or -19, except that if you type any of those in the REPL (go on, try it!) and press ENTER, you'll get something like this:
[error] Invalid symbol: 27
Right, because hex only understands integers in hexadecimal format.
Now, if you type 0x1b instead... well, at least it doesn't complain, right? You can try entering 0xffffffed now (that's -19 in hexadecimal format using two's complement), and that works too.
Now what just happened is that you pushed two values (integers, even) on The Stack (more on this later). Since you have two numbers on The Stack already, you may as well enter + to add them up, and that gives you:
0x8
Jolly good.
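To recap, here is the same little session written out in one go, with comments (anything after a ; is a comment). It is only a sketch of what was entered above, nothing new:

```hex
0x1b        ; 27 in hexadecimal
0xffffffed  ; -19 in hexadecimal, using two's complement
+           ; adds the two integers and leaves 0x8 on The Stack
```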
Now... + is actually a symbol; and symbols... well, they do tricks, those symbols, every time you try to push them on the stack. For example, + takes the first two items on the stack, adds them together, and puts the result on the stack.
You can now enter "eight" in the REPL. See the double quotes? That's a string. Strings work in the same way as in other programming languages. Nothing weird, I promise. Of course strings can only be delimited via double quotes, not single, angular, circular, or whatever quotes. Just double quotes.
Next... let's see. You can type : (which is another symbol), and... nothing happens! Or better, nothing gets pushed back on The Stack. : is a greedy, selfish symbol that just eats a value (any literal) and a string, and doesn't put anything back on The Stack.
Now type eight (with no quotes) and press ENTER:
0x8
Aha! It turns out that our : friend works for The Registry. The Registry likes to keep things for itself. Values don't just get pushed and popped from The Registry, no sir! It ain't like The Stack. Once you are in, you are in, and you can't get out that easily (unless you are freed).
Clear? No? Well, you'll get there kid, eventually.
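If it helps, here is a second, hedged sketch of the same idea using a string instead of an integer. The name greeting is made up for this example; the only assumption is that : accepts a string as its value, which follows from it eating any literal:

```hex
"Hello!" "greeting" :   ; store the string "Hello!" in The Registry as greeting
greeting                ; pushes "Hello!" on The Stack
greeting                ; pushes it again: the value is still in The Registry
```

Unlike items on The Stack, the stored value is not used up when you ask for it.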
What's missing? Let's see, we talked about integers, strings, and even a little bit about symbols... Ah! Right: quotations, of course!
A quotation is a fancy name for an array, or a list. In hex quotations have no internal separators between items, and are delimited by ~~square~~ round brackets, like this:
(0x1 "two" three !)
Oh my! You really can put anything in quotations, right? Assuming that it's a valid literal, a known native symbol (like !), or a syntactically-valid user-defined symbol (even if it doesn't exist, like three). You can even nest a quotation in another quotation, and another, and another...
...And nothing will happen by the way: quotations sit neatly on the stack like other literals. Until some pesky symbol decides to dequote them, like ., which strips a quotation and puts all its items on the stack! Oi! The naughty boy!
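A tiny sketch of that, using only symbols you have already met. The quotation here is just an illustration:

```hex
(0x3 0x2 +)   ; a quotation: it just sits on The Stack, nothing is added yet
.             ; dequotes it: 0x3 and 0x2 are pushed, then + adds them, leaving 0x5
```

Until the dot shows up, the + inside the quotation is just an item in a list like any other.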
The Stack
We had to mention The Stack earlier, it was unavoidable. See, The Stack is where the magic happens! But what is it, you ask? Well, let's try a simple example and try to use hex to subtract 3 from 5, and take it reeeally slow.
Fiiiirst we start a hex REPL.
Then, we enter 0x5 and press ENTER. 0x5 gets pushed on The Stack, like this:
+-----------+
| 0x5 |
+-----------+
Then, we enter 0x3, which gets pushed on The Stack too. Now there are two items on The Stack, like this:
+-----------+
| 0x3 |
+-----------+
| 0x5 |
+-----------+
Great, and finally, we are going to push the symbol - on the stack, because that's how postfix notation (a.k.a. Reverse Polish Notation) works: first the operands, and then the operators.
Anyhow, what happens to The Stack now? Waait... wait...
*
* - *
+-----------+
| 0x3 |
+-----------+
| 0x5 |
+-----------+
...magic! Real quick, - takes two items from The Stack, performs the subtraction, aaaand pushes the result back on The Stack, that now looks like this:
+-----------+
| 0x2 |
+-----------+
Symbols ain't that bad after all. And yes, The Stack is AWESOME! Did you know that if you use postfix notation you will NEVER ever need to use parentheses when performing math operations to tweak operator precedence? No? Let's try it. Let's calculate (3 + 2) * 7:
First, the sum, right? so:
0x3 0x2 +
...then we simply add the multiplication, and so we have it:
0x3 0x2 + 0x7 *
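To see why no parentheses are needed, here is the same program entered one item at a time, with the contents of The Stack noted in the comments (listed from bottom to top). It is just a walkthrough of the line above:

```hex
0x3   ; The Stack: 0x3
0x2   ; The Stack: 0x3 0x2
+     ; The Stack: 0x5
0x7   ; The Stack: 0x5 0x7
*     ; The Stack: 0x23 (that's 35 in decimal)
```

The sum is already sitting on The Stack as a single value by the time * comes along, so the order of operations takes care of itself.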
If we take this further, you can use The Stack as an accumulator for your program state, and never, ever use a variable.
Whaaaaaaat?
Yeah, mind blown. That's how purists of concatenative programming languages would write programs huh? Well, the problem with it is that programs written like that tend to become a wee bit hard to read (but definitely not for purists of concatenative programming languages).
Sooooo that's why next to The Stack, we also have... The Registry!
The Registry
The Registry knows everything. It is the place that contains all the definitions of all hex symbols, both the 64 native symbols that are built-in, and also any symbol that you may want to create.
The one thing to remember about The Registry is that there is only one. You can't have more than one, no sir, so anything you put in there will become available anywhere within a hex program. Yes, you read that right:
every symbol in hex is global
Let that sink in.
Sure, it's not the best design in the world, but it has the advantage of being a very simple thing to implement and use. You have to know it, and you have to be very careful with it.
Now... to add a new symbol to The Registry, you use the : symbol. That can also be used to overwrite existing symbols with new values, but not native symbols.
Say we want to teach hex some Roman numerals... we could do this:
0x1 "I" :
0x2 "II" :
0x3 "III" :
0x4 "IV" :
; ...
Then, you could use them like ordinary symbols:
I IV + ; Pushes 0x5 on the stack
If you don't need a symbol anymore, you can use the # symbol to free it from The Registry. See? Simple.
Of course The Registry is smart enough to stop you from freeing native symbols!
So if you try to remove +...
"+" #
...you'll get:
[error] Cannot free native symbol '+'
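Putting the pieces together, here is a hedged sketch of the whole life of a user-defined symbol: added, used, overwritten, and freed. The symbol V is made up for this example:

```hex
0x5 "V" :    ; add V to The Registry
V V +        ; use it like any other symbol: pushes 0xa (5 + 5)
0x6 "V" :    ; overwrite V with a new (and wrong!) value
"V" #        ; free V: it is no longer in The Registry
```

After the last line, entering V should presumably be rejected as an unknown symbol, much like 27 was at the beginning of this tutorial.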
Hexxing all the way!
And there you have it! Now technically you know everything you need about hex, and you can go off hexxing away on your own! Off you go then!
...What? You don't even know how to implement a loop or a condition? But it's all the same! It's always values pushed and popped from the stack using symbols!
Say you want to print the size of each file in the current directory, for example.
Let's start with creating our own symbol to read the contents of the current directory and put them in a quotation:
("ls" run 0x1 get "\n" split) "ls" :
Basically, when dequoted, this quotation will:
- Run the ls command (it will work on *nix at least) using the run symbol.
- Get the content of the standard output of the command result via get.
- Now you have a single string with all the directory contents, so you want to just split it by "\n" to get the file names in a quotation.
Let's try it in the REPL:
ls .
Remember the dot? . is that naughty boy that likes stripping quotations! It removes their parentheses and pushes all the items on The Stack, essentially executing your logic.
That should show you something like this:
("CHANGELOG.md" "LICENSE" "Makefile" "README.md" "example.hex" "hex" "hex.js" "hex.vim" "hex.wasm" "releases" "scripts" "src" "web")
...at least that's what it shows on my machine in the hex directory.
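If you want to watch each piece of the ls quotation do its job, you could also enter the three steps one at a time in the REPL. This is only a sketch: the comments assume that run leaves a quotation describing the command result on The Stack, with the standard output at index 0x1, which is what the steps described above imply:

```hex
"ls" run     ; runs the ls command and pushes a quotation with its result
0x1 get      ; picks the item at index 0x1: the standard output, as one string
"\n" split   ; splits that string on newlines into a quotation of file names
```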
How do we print each of those along with their file size? The keyword here is each, and here's the full program:
ls .
(dup ": " cat print
read len dec puts)
each
Woah, that's a lot. Let's break it down:
- First we execute our quotation to get the list of all the file names.
- Then we push a quotation on The Stack. This code will be executed by each for each item in the previous quotation (the list of files).
- Inside that, first we use dup to duplicate the current item. We'll need one to print, adding a ": " after it via cat.
- Then we can use print to print file name and colon without a new line.
- Then there's still going to be a duplicate of the file name on the stack, which we are going to use to calculate its size by reading its contents via read.
- Then len measures the length of what was read, so now... you'll get a nice hexadecimal number corresponding to the size in bytes (and zero for directories). You can use dec to convert it to a string in decimal representation...
- ...and finally print it with a new line at the end, using puts.
If everything went well, you should get something like this printed on the screen:
CHANGELOG.md: 124
LICENSE: 1070
Makefile: 711
README.md: 211
hex.js: 79923
hex.vim: 2881
releases: 0
scripts: 0
src: 0
web: 0
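If the full program feels like a lot at once, here is a smaller sketch of each on its own, with no files involved. The list of numbers is made up for this example:

```hex
(0x1 0x2 0x3)   ; the quotation to walk through
(dec puts)      ; the code to run for each item
each            ; should print 1, 2 and 3, each on its own line
```

For every item, each pushes it on The Stack and then dequotes the second quotation, so dec and puts always find the current item waiting for them.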
Phew... that was fun, wasn't it?!
Hopefully you should have the basics down by now. If you are not yet bored after all this, then I recommend going over the Specification which is known to cure insomnia and get you to sleep by the time you get to the middle of the Native Symbol Reference.
Sweet dreams, and happy hexxing!