The Tutorial
Hey!
So you want to learn hex huh? Well, you are in the right place.
hex is...
- ...tiny: a single executable, just a few hundred KB in size.
- ...minimalist: just integers, strings, arrays, and symbols. No statements, no expressions, and (almost) no variables.
- ...concatenative: think reverse Polish notation, postfix syntax, stack-based.
- ...slightly-esoteric. It definitely has quirks: hexadecimal-only integers, global-only symbols, ...things like that.
In a word: magical! Hence the name. Really. And the hexadecimal integer thing of course, that too.
Setting things up
Now that you know what hex is, you have to get it, then you can run it. Or not really, you can just go here and play with it.
If you got hex, it's just a single executable. You can run it with options, or without, as explained on the Get page.
You can even double-click it, especially if you are on Windows. That will bring up the REPL, which well... reads from input, evaluates what you enter, prints the result (or better, the first item on The Stack), and then loops again.
Working with things
hex is tiny. I may have said that already. It is also simple. As such, it doesn't really have fancy things like objects, or... floating-point numbers, for example.
Instead, it focuses on making do with just a few things, or better, data types.
It understands integers like 27 or -19, except that if you type any of those in the REPL (go on, try it!) and press ENTER, you'll get something like this:
ERROR: Invalid symbol: 27
Right, because hex only understands integers in hexadecimal format.
Now, if you type 0x1b instead... well, at least it doesn't complain, right? You can try entering 0xffffffed now (that's -19 in hexadecimal format using two's complement), and that works too.
Now what just happened is that you pushed two values (integers, even) on The Stack (more on this later). Since you have two numbers on The Stack already, you may as well enter + to add them up, and that gives you:
0x8
Jolly good.
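If the two's complement bit feels like sleight of hand, here is the same sum double-checked in Python. It is only a sketch: it assumes 32-bit integers (which is what 0xffffffed standing for -19 implies), and the helper name to_signed is made up for illustration.

```python
# Assumption: integers are 32 bits wide, as 0xffffffed == -19 suggests.
MASK = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed two's complement integer."""
    value &= MASK
    return value - 0x100000000 if value & 0x80000000 else value


a = 0x1B        # 27
b = 0xFFFFFFED  # the bit pattern for -19
total = (a + b) & MASK  # the addition wraps around at 32 bits

print(to_signed(b))  # -19
print(hex(total))    # 0x8
```

The carry out of the 32nd bit simply falls off the end, which is why 27 plus "a very large number" lands on 0x8.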
Now... + is actually a symbol; and symbols... well, they do tricks, those symbols, every time you try to push them on the stack.
For example, + takes the first two items on the stack, adds them together, and puts the result on the stack.
You can now enter "eight" in the REPL. See the double quotes? That's a string. Strings work in the same way as in other programming languages. Nothing weird, I promise. Of course strings can only be delimited via double quotes, not single, angular, circular, or whatever quotes. Just double quotes.
Next... let's see. You can type : (which is another symbol), and... nothing happens!
Or better, nothing gets pushed back on The Stack. : is a greedy, selfish symbol that just eats a value (any literal) and a string, and doesn't put anything back on The Stack.
Now type eight (with no quotes) and press ENTER:
0x8
Aha! It turns out that our : friend works for The Registry. The Registry likes to keep things for itself. Values don't just get pushed and popped from The Registry, no sir! It ain't like The Stack. Once you are in, you are in, and you can't get out that easily (unless you are freed).
Clear? No? Well, you'll get there kid, eventually.
What's missing? Let's see, we talked about integers, strings, and even a little bit about symbols... Ah! Right: quotations, of course!
A quotation is a fancy name for an array, or a list. In hex quotations have no internal separators between items, and are delimited by round brackets (not square ones), like this:
(0x1 "two" three !)
Oh my! You really can put anything in quotations, right? Assuming that it's a valid literal, a known native symbol (like !), or a syntactically-valid user-defined symbol (even if it doesn't exist, like three). You can even nest a quotation in another quotation, and another, and another...
...And nothing will happen by the way: quotations sit neatly on the stack like other literals. Until some pesky symbol decides to dequote them, like ., which strips a quotation and puts all its items on the stack! Oi! The naughty boy!
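To picture what "sitting neatly on the stack" and "dequoting" mean, here is a toy model in Python. It is a sketch, not how hex is implemented: The Stack is a plain list, a quotation is a nested list, and the function dequote is a made-up stand-in for the . symbol. It only deals with literals; in real hex, symbols coming out of a quotation would do their tricks as they get pushed.

```python
# Toy model: The Stack is a list, a quotation is a nested list.
stack = []


def push(item):
    stack.append(item)


def dequote():
    """Rough stand-in for '.': pop a quotation and push its items in order."""
    quotation = stack.pop()
    for item in quotation:
        push(item)


push([0x1, "two", [0x3]])  # the whole quotation is a single stack item
print(len(stack))          # 1

dequote()
print(stack)               # [1, 'two', [3]]
```

Note that the nested quotation comes out still wrapped: only the outer layer gets stripped.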
The Stack
We had to mention The Stack earlier, it was unavoidable. See, The Stack is where the magic happens! But what is it, you ask? Well, let's try a simple example and try to use hex to subtract 3 from 5, and take it reeeally slow.
Fiiiirst we start a hex REPL.
Then, we enter 0x5 and press ENTER. 0x5 gets pushed on The Stack, like this:
+-----------+
|    0x5    |
+-----------+
Then, we enter 0x3, which gets pushed on The Stack as well. Now there are two items on The Stack, like this:
+-----------+
|    0x3    |
+-----------+
|    0x5    |
+-----------+
Great, and finally, we are going to push the symbol - on the stack, because that's how postfix notation (a.k.a. Reverse Polish Notation) works: first the operands, and then the operators.
Anyhow, what happens to The Stack now? Waait... wait...
      *
   *  -  *
+-----------+
|    0x3    |
+-----------+
|    0x5    |
+-----------+
...magic! Real quick, - takes two items from The Stack, performs the subtraction, aaaand pushes the result back on The Stack, that now looks like this:
+-----------+
|    0x2    |
+-----------+
Symbols ain't that bad after all. And yes, The Stack is AWESOME! Did you know that if you use postfix notation you will NEVER ever need to use parentheses when performing math operations to tweak operator precedence? No? Let's try it. Let's calculate (3 + 2) * 7:
First, the sum, right? so:
0x3 0x2 +
...then we simply add the multiplication, and so we have it:
0x3 0x2 + 0x7 *
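If you want to convince yourself that no parentheses are needed, here is a tiny postfix evaluator sketched in Python. It is a toy under stated assumptions, not the hex interpreter: it only knows hexadecimal integers and three operators, and the names OPERATORS and evaluate are made up for this example.

```python
import operator

# Only three operators, just enough for the examples in this tutorial.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(program):
    """Evaluate a whitespace-separated postfix program; return the final stack."""
    stack = []
    for token in program.split():
        if token in OPERATORS:
            right = stack.pop()  # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(int(token, 16))  # everything else is a hex integer
    return stack


print(hex(evaluate("0x3 0x2 + 0x7 *")[-1]))  # 0x23
print(hex(evaluate("0x5 0x3 -")[-1]))        # 0x2
```

0x23 is 35 in decimal, so (3 + 2) * 7 checks out, and the order of the tokens alone decided what got computed first.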
If we take this further, you can use The Stack as an accumulator for your program state, and never, ever use a variable.
Whaaaaaaat?
Yeah, mind blown. That's how purists of concatenative programming languages would write programs huh? Well, the problem with it is that programs written like that tend to become a wee bit hard to read (but definitely not for purists of concatenative programming languages).
Sooooo that's why next to The Stack, we also have... The Registry!
The Registry
The Registry knows everything. It is the place that contains all the definitions of all hex symbols, both the 64 native symbols that are built-in, and also any symbol that you may want to create.
The one thing to remember about The Registry is that there is only one. You can't have more than one, no sir, so anything you put in there will become available anywhere within a hex program. Yes you read it right:
every symbol in hex is global
Let that sink in.
Sure, it's not the best design in the world, but it has the advantage of being a very simple thing to implement and use. You have to know it, and you have to be very careful with it.
Now... to add a new symbol to The Registry, you use the : symbol. That can also be used to overwrite existing symbols with new values, but not native symbols.
Say we want to teach hex some Roman numerals... we could do this:
0x1 "I" :
0x2 "II" :
0x3 "III" :
0x4 "IV" :
; ...
Then, you could use them like ordinary symbols:
I IV + ; Pushes 0x5 on the stack
If you don't need a symbol anymore, you can use the # symbol to free it from The Registry. See? Simple.
Of course The Registry is smart enough to stop you from freeing native symbols!
So if you try to remove +...
"+" #
...you'll get:
ERROR: Cannot free native symbol '+'
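If it helps, you can think of The Registry as one big global dictionary with a bouncer at the door. The Python sketch below models just that. Everything in it is an assumption made for illustration: the names store and free, the short list of native symbols, the wording of the "overwrite" error (the tutorial only shows the one for freeing), and what happens when you free a symbol that isn't there.

```python
# Toy model of The Registry: one global dictionary, nothing more.
NATIVE_SYMBOLS = {"+", "-", "*", ":", "#"}  # just a handful of the native ones
registry = {}


def store(value, name):
    """Rough stand-in for ':': user symbols can be overwritten, native ones cannot."""
    if name in NATIVE_SYMBOLS:
        # Hypothetical message: the tutorial does not show the real one.
        raise ValueError(f"Cannot overwrite native symbol '{name}'")
    registry[name] = value


def free(name):
    """Rough stand-in for '#'."""
    if name in NATIVE_SYMBOLS:
        raise ValueError(f"Cannot free native symbol '{name}'")
    registry.pop(name, None)  # assumption: freeing an unknown symbol is ignored


store(0x1, "I")
store(0x4, "IV")
print(hex(registry["I"] + registry["IV"]))  # 0x5

free("I")
print("I" in registry)  # False

try:
    free("+")
except ValueError as error:
    print(f"ERROR: {error}")  # ERROR: Cannot free native symbol '+'
```

Since there is a single dictionary, any name you store is visible everywhere, which is exactly the "every symbol is global" trade-off described above.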
Hexxing all the way!
And there you have it! Now technically you know everything you need about hex, and you can go off hexxing away on your own! Off you go then!
...What? You don't even know how to implement a loop or a condition? But it's all the same! It's always values pushed and popped from the stack using symbols!
Alright, let's do something actually useful. I know: let's implement a new operator to compute the factorial of an integer! You never know when you'll need a factorial these days.
Here goes:
(
  "_fact_n" :
  (_fact_n 0x0 <=)
    (0x1)
    (_fact_n _fact_n 0x1 - fact *)
  if
  "_fact_n" #
) "fact" ::
0x5 fact dec puts ; Prints 120
Woah! That was a mouthful, wasn't it? Before breaking it down, look at the very end of the program: see that ::? That's the symbol to store operator symbols in The Registry. Operator symbols are defined using a quotation, but unlike ordinary quotations (stored using :), they will be immediately dequoted when pushed on the stack. In other words, our fact operator symbol will behave exactly like one of the built-in native symbols.
Let's see what is happening inside the quotation:
- First, we are storing a symbol _fact_n in the registry. Wait, but there's no value? Correct, the value will be provided when the symbol is used, so like 0x5 fact. It's like saying that we are expecting a value on the stack (and here we are assuming it's an integer, but that's ok for this example).
- Then three (!) quotations and the symbol if will be pushed on the stack. Yep, you got that right: that's a good old if clause. The first quotation is the condition to be checked, then the then quotation, and finally the else quotation. Note how we can recursively call the fact operator that we are just defining... mind-blowing, I am sure.
- Finally, we need to free the temporary symbol _fact_n using #. It is always good practice to do so, otherwise The Registry will be littered with stale symbols nobody uses anymore... quite a sorry sight, believe me! Plus they take up precious memory.
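If the three-quotation if still looks alien, here is roughly the same shape in Python. Treat it as a sketch: hex_if is a made-up helper, quotations are modelled as functions that only run when called, and n is an ordinary local parameter here rather than a global symbol like _fact_n.

```python
def hex_if(condition, then_branch, else_branch):
    """Toy 'if': run the condition, then run exactly one of the two branches."""
    if condition():
        return then_branch()
    return else_branch()


def fact(n):
    return hex_if(
        lambda: n <= 0,           # (_fact_n 0x0 <=)
        lambda: 1,                # (0x1)
        lambda: n * fact(n - 1),  # (_fact_n _fact_n 0x1 - fact *)
    )


print(fact(5))  # 120
```

The branches are wrapped so that nothing inside them happens until if picks one, which is the same reason quotations just sit on the stack until something dequotes them.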
OK! Now you gotta have had enough.
What?! Still no loops huh? Go read about it in the specification: the while symbol... takes two quotations... one for the condition and the other for the body of the loop. Really not rocket science once you get used to The Stack and The Registry.
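Fine, fine, here is the general idea anyway, modelled in Python rather than hex. It is only a sketch of the "two quotations" shape described above: hex_while, condition and body are made-up names, and the details of the real symbol are in the specification.

```python
def hex_while(condition, body):
    """Toy 'while': keep running the body for as long as the condition holds."""
    while condition():
        body()


counter = {"i": 0}  # plays the part of a symbol stored in The Registry
printed = []        # collects what a real program would print


def condition():
    return counter["i"] < 3


def body():
    printed.append(counter["i"])
    counter["i"] += 1


hex_while(condition, body)
print(printed)  # [0, 1, 2]
```

The condition is evaluated afresh before every round, so the body had better change something the condition looks at, or you will be looping for a while.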
And this concludes our brief but dense tour of hex, the slightly-esoteric concatenative programming language. I hope you enjoyed our time together. If not, tough luck, no refunds, go learn a lisp next time instead! ;)
Happy hexxing!