Ray Tracing That Was Anything *But* a Weekend — in Forth
2025-10-10

## Result
[kirisaki/forth-raytracing: An implementation of ray tracing in Forth.](https://github.com/kirisaki/forth-raytracing)
## The Beginning
There’s a well-known challenge called [*Ray Tracing in One Weekend*](https://raytracing.github.io/).
The idea is simple — write a minimal ray tracer over a weekend.
I once implemented it in [Rust](https://github.com/kirisaki/rust-raytracing).
This time, I decided to try it in **Forth**. *Am I insane?*
## Why It’s Madness
Writing a “weekend ray tracer” in Forth is borderline insanity.
- There’s no garbage collector
- No concept of ownership
- You manipulate raw memory directly
- There are no variables
  - Well, kind of, but they’re very peculiar
  - In gforth, `locals|` gives you none for floating-point numbers
- ~~No namespaces~~ There are wordlists after all (added on 2025-10-11)
- No types
  - You push values onto a stack, call a word (Forth’s term for a function), and the word leaves its result on the stack
  - Naturally, stack underflows and overflows happen all the time

It’s a parade of “no”s.
With no compile-time checks, it’s even harsher than C. Maybe slightly better than assembly.
So why write in such a language?
*Because the stack was there.*
## Still, It Works If You’re Patient
Forth is quirky, but once you understand how it works, it’s not too bad.
For example, here’s how you’d generate random numbers. Standard Forth has no random-number word, so you have to write your own.
```forth
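\ Generate a random number: takes the PRNG state, leaves the new state and an output value
: rand ( u -- u u )
  \ xorshift64* PRNG
  dup 0= if drop 1 then        \ the state must never be zero
  dup 12 rshift xor
  dup 25 lshift xor
  dup 27 rshift xor
  dup 2685821657736338717 *    \ keep the state, multiply a copy to get the output
;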
```
The `( u -- u u )` part is just a comment.
It means: “Given one unsigned integer, return two unsigned integers.”
It’s purely descriptive — there’s no enforcement — but if you test carefully, it works fine.
You can also use variables.
```forth
\ Create a new ray
: ray-new ( origin direction pool -- ray-addr )
locals| p d o |
p pool-alloc
dup r-origin o swap vec3-move
dup r-direction d swap vec3-move
;
```
`locals| ... |` declares local variables *in the order they’re popped from the stack*.
So they’re bound in the **reverse order** of the comment. This causes frequent bugs.
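To make the ordering concrete, here is a tiny sketch. `sub-demo` is a made-up word for illustration, not something from the repository.

```forth
\ Hypothetical demo word, not from the repository.
: sub-demo ( a b -- a-b )
  locals| b a |   \ b takes the top of the stack, a the item below it
  a b - ;

7 3 sub-demo .    \ prints 4
```

Writing `locals| a b |` instead would bind `a` to 3 and `b` to 7, and the word would quietly compute `b - a` with no complaint from the compiler.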
Also, gforth’s floating-point stack doesn’t support `locals| ... |`, so you have to mentally trace every stack operation yourself.
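As a small illustration of what that tracing looks like, here is a sketch using only the standard floating-point stack words. `flen2` is a hypothetical helper, not a word from the repository, and the comments record the float stack after each step.

```forth
\ Hypothetical helper, not from the repository.
: flen2 ( F: x y -- x*x+y*y )
  fdup f*     \ F: x y*y
  fswap       \ F: y*y x
  fdup f*     \ F: y*y x*x
  f+ ;        \ F: x*x+y*y

3e 4e flen2 f.   \ 3*3 + 4*4
```

With two values this is manageable. With the three components of a vector plus a couple of temporaries, the comments become the only thing keeping you sane.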
Once you understand these nuances — and memory management — you can get by.
I thought I’d be done by the weekend.
## Leak, Leak, Leak
“*Let the OS handle memory management*,” I thought.
After all, it’s a one-shot program, not a long-running server.
The result? **300 MB of memory leaks per second.**
Even with 128 GB of RAM, it was gone in no time.
The final challenge — rendering a scene full of spheres — couldn’t even finish.
Thus began the hell of leak hunting. But the leaks refused to die.
At last, I made a decision:
— *Start over, and implement proper memory management.*
## Bringing Order to Chaos
Even if I rewrote it, calling `allocate` (Forth’s `malloc`) and `free` constantly didn’t seem ideal.
I asked ChatGPT for advice — it confirmed the performance concern — and suggested using an **arena allocator**.
The idea is simple: allocate a large memory region (the arena), then manage memory manually with custom `allocate` and `free`.
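A minimal sketch of the arena idea looks something like this. It is not the repository's actual code: every word name and the two-cell header layout are assumptions made for illustration.

```forth
\ Hypothetical names throughout; a sketch, not the repository's code.
\ Arena layout: cell 0 = capacity in bytes, cell 1 = bytes used, then the data.

: arena-new ( capacity -- arena )
  dup 2 cells + allocate throw   \ header plus data in one heap block
  tuck !                         \ store the capacity
  0 over cell+ ! ;               \ nothing used yet

: arena-alloc ( size arena -- addr )
  swap aligned swap              \ round the size up to a cell boundary
  locals| a size |
  a cell+ @ size +  a @ >  abort" arena exhausted"
  a 2 cells +  a cell+ @ +       \ address of the next free byte
  size a cell+ +! ;              \ bump the used counter

: arena-reset ( arena -- )  0 swap cell+ ! ;   \ drop everything at once
: arena-free  ( arena -- )  free throw ;       \ return the block to the heap

1024 arena-new constant demo-arena    \ one big block up front
3 cells demo-arena arena-alloc drop   \ carve out room for three cells
demo-arena arena-reset                \ reuse the whole arena
```

Allocation is just an addition and a bounds check, which is why it is so much cheaper than going to the heap every time. The price is that nothing is released until the whole arena is reset.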
I also introduced a **memory pool**.
In the arena, memory is simply handed out sequentially and never reclaimed piece by piece, so space held by dead objects is wasted.
The pool, on the other hand, manages fixed-size chunks in a linked list, reusing freed blocks efficiently.
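Here is a rough sketch of such a free-list pool. The stack effect of `pool-alloc` follows the call in `ray-new` earlier, but the bodies, the other word names, and the one-cell header are assumptions, not the repository's code. It also assumes the block size is a multiple of the cell size and at least one cell.

```forth
\ Hypothetical sketch, not the repository's code.
\ Pool header: one cell holding the head of the free list (0 when empty).
\ A free block keeps the address of the next free block in its first cell.

: pool-new ( block-size count -- pool )
  locals| count bsize |
  bsize count * cell+ allocate throw   \ header plus all blocks
  0 over !                             \ start with an empty free list
  count 0 ?do
    dup cell+ i bsize * +              \ ( pool block )
    over @ over !                      \ block.next = head
    over !                             \ head = block
  loop ;

: pool-alloc ( pool -- addr )
  dup @ dup 0= abort" pool exhausted"  \ ( pool block )
  dup @ rot ! ;                        \ head = block.next, leave block

: pool-free ( addr pool -- )
  2dup @ swap !                        \ block.next = head
  ! ;                                  \ head = block
```

Both operations touch only the head of the list, so they take constant time, and a freed block is the first one handed out again.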
After about a week of rewriting with arenas and pools, I finally generated the image above.
**Victory was mine.**
## Remaining Challenges
It worked — but not without issues.
### It’s Slow
Rendering the 1024×576 image took **four hours**.
There’s zero optimization — the code runs exactly as written.
Naïve implementations take time. Lots of it.
### Inconsistent Calling Conventions
Because I designed everything ad hoc, the argument order is inconsistent across functions.
While not unique to Forth, this is especially painful here:
Forth passes arguments by *stack position*, so changing the argument order later means reworking the stack shuffling at every call site, which is a nightmare.
Also, passing four different memory pools around is ridiculous — I should’ve just made a struct.
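Something like the following would have done the job, using the Forth-2012 structure words that gforth provides. The field names are placeholders I made up here, not the names of the actual four pools.

```forth
\ Hypothetical layout; the field names are placeholders.
begin-structure render-ctx
  field: ctx-vec-pool
  field: ctx-ray-pool
  field: ctx-hit-pool
  field: ctx-material-pool
end-structure

: ctx-init ( vec ray hit mat ctx -- )
  locals| c mat hit ray vec |
  vec c ctx-vec-pool !
  ray c ctx-ray-pool !
  hit c ctx-hit-pool !
  mat c ctx-material-pool ! ;

create demo-ctx render-ctx allot   \ reserve room for one context
```

A word that needs a pool then takes a single `ctx` argument and fetches what it wants, for example `ctx ctx-ray-pool @`, instead of dragging four addresses across the stack.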
## Conclusion
Writing Forth was painful — but also deeply enjoyable.
It gave me the same low-level control as C, with a Lisp-like expressiveness.
If you’re curious, give it a try. It’s worth it.