Ray Tracing That Was Anything *But* a Weekend — in Forth
Result
kirisaki/forth-raytracing: An implementation of ray tracing in Forth.
The Beginning
There’s a well-known challenge called Ray Tracing in One Weekend. The idea is simple — write a minimal ray tracer over a weekend.
I once implemented it in Rust.
This time, I decided to try it in Forth. Am I insane?
Why It’s Madness
Writing a “weekend ray tracer” in Forth is borderline insanity.
- There’s no garbage collector
- No concept of ownership
- You manipulate raw memory directly
- There are no variables
  - Well, kind of, but they’re very peculiar
  - In gforth, there are literally none for floating-point numbers
- No namespaces (correction added on 2025-10-11: there are wordlists after all)
- No types
- You push values onto a stack, call a word (Forth’s term for a function), and push the result back onto the stack
- Naturally, stack underflows and overflows happen all the time
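To make the push, call, push cycle from the list concrete, here is a minimal sketch. The words `square` and `add3` are made up for illustration and are not from the ray tracer.

```forth
\ Hypothetical example words, not part of the ray tracer.
: square ( n -- n*n )  dup * ;      \ duplicate the top item, multiply the pair
: add3   ( a b c -- sum )  + + ;    \ consumes three items, leaves one

3 4 + square .     \ pushes 3 and 4, adds them, squares the 7: prints 49
1 2 3 add3 .       \ prints 6

\ Nothing checks the stack comment. With only two items on the stack,
\ "1 2 add3" would underflow at run time:
\ 1 2 add3
```

The stack comments are the only documentation of what each word expects, which is why a miscounted argument shows up as an underflow rather than a compile error.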
It’s a parade of “no”s.
With no compile-time checks, it’s even harsher than C. Maybe slightly better than assembly.
So why write in such a language?
Because the stack was there.
Still, It Works If You’re Patient
Forth is quirky, but once you understand how it works, it’s not too bad.
For example, here’s how you’d generate random numbers — since there’s no standard library, you have to write your own.
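\ Generate a random number
: rand ( u -- u u )
\ xorshift64* PRNG: returns the new state and a random value
dup 0= if drop 1 then \ the state must never be zero
dup 12 rshift xor
dup 25 lshift xor
dup 27 rshift xor
dup 2685821657736338717 * \ keep the state, multiply a copy for the output
;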
The ( u -- u u ) part is just a comment.
It means: “Given one unsigned integer, return two unsigned integers.”
It’s purely descriptive — there’s no enforcement — but if you test carefully, it works fine.
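As a rough sketch of how such a word might be used, the generator state can live in a variable and the output can be scaled into a float in [0, 1). This assumes 64-bit cells. `rand` is repeated here so the block runs on its own, with a `dup` before the final multiply so that it leaves two values as its stack comment says. The names `seed`, `next-rand`, and `frand01` are my own and may not match the repository.

```forth
\ Assumes a 64-bit gforth. seed, next-rand and frand01 are hypothetical names.
variable seed   12345 seed !

: rand ( u -- u u )              \ new state, then the random output
  dup 0= if drop 1 then          \ the state must never be zero
  dup 12 rshift xor
  dup 25 lshift xor
  dup 27 rshift xor
  dup 2685821657736338717 * ;    \ keep the state, multiply a copy

: next-rand ( -- u )
  seed @ rand                    ( state' out )
  swap seed ! ;                  \ store the new state, leave the output

: frand01 ( -- ) ( F: -- r )     \ r is in [0, 1)
  next-rand 11 rshift            \ keep the top 53 bits
  0 d>f                          \ unsigned cell -> double -> float
  9007199254740992e f/ ;         \ divide by 2^53
```

Keeping the state in one variable means callers never have to juggle it on the stack, at the cost of a single global stream.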
You can also use variables.
\ Create a new ray
: ray-new ( origin direction pool -- ray-addr )
locals| p d o |
p pool-alloc
dup r-origin o swap vec3-move
dup r-direction d swap vec3-move
;
locals| ... | declares local variables in the order they’re popped from the stack.
So they’re bound in the reverse order of the comment. This causes frequent bugs.
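A small sketch of the binding order, using throwaway words that are not from the ray tracer:

```forth
\ Hypothetical words that only demonstrate the binding order of locals| .
: sub-right ( a b -- a-b )
  locals| b a |      \ b takes the top of the stack, a the item below it
  a b - ;

: sub-wrong ( a b -- a-b )
  locals| a b |      \ names written in comment order: a and b are swapped
  a b - ;

10 3 sub-right .     \ prints 7
10 3 sub-wrong .     \ prints -7
```

Both definitions compile without complaint, so the only way to catch the second one is to test it.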
Also, gforth’s floating-point stack doesn’t support locals| ... |, so you have to mentally trace every stack operation yourself.
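For a sense of what that tracing looks like, here is a sketch of a squared vector length computed entirely on the floating-point stack. The word `v3-len2` is made up for this example, and the comment on each line is the stack picture I have to keep in my head.

```forth
\ Hypothetical word: squared length of a vector whose three components
\ are on the floating-point stack.
: v3-len2 ( F: x y z -- len2 )
  fdup f*          \ F: x y z*z
  fswap            \ F: x z*z y
  fdup f*          \ F: x z*z y*y
  f+               \ F: x z*z+y*y
  fswap            \ F: z*z+y*y x
  fdup f*          \ F: z*z+y*y x*x
  f+ ;             \ F: x*x+y*y+z*z

1e 2e 2e v3-len2 f.    \ 1 + 4 + 4
```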
Once you understand these nuances — and memory management — you can get by. I thought I’d be done by the weekend.
Leak, Leak, Leak
“Let the OS handle memory management,” I thought. After all, it’s a one-shot program, not a long-running server.
The result? 300 MB of memory leaks per second. Even with 128 GB of RAM, it was gone in no time. The final challenge — rendering a scene full of spheres — couldn’t even finish. Thus began the hell of leak hunting. But the leaks refused to die. At last, I made a decision:
— Start over, and implement proper memory management.
Bringing Order to Chaos
Even if I rewrote it, calling malloc (allocate) and free constantly didn’t seem ideal.
I asked ChatGPT for advice — it confirmed the performance concern — and suggested using an arena allocator.
The idea is simple: allocate a large memory region (the arena), then manage memory manually with custom allocate and free.
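A minimal bump-allocator sketch of that idea follows. It is not the code from the repository: the three-cell header layout and the names `arena-new`, `arena-alloc`, `arena-reset`, and `arena-free` are assumptions made for illustration.

```forth
\ Hypothetical arena header: [ base | size | used ], one cell each.
: a-base ( arena -- addr ) ;
: a-size ( arena -- addr ) cell+ ;
: a-used ( arena -- addr ) 2 cells + ;

: arena-new ( size -- arena )
  dup allocate throw             ( size base )
  3 cells allocate throw         ( size base arena )
  tuck a-base !                  ( size arena )
  tuck a-size !                  ( arena )
  0 over a-used ! ;              \ nothing handed out yet

: arena-alloc ( n arena -- addr )
  swap aligned swap              \ round the request up to whole cells
  2dup a-used @ +                ( n arena used+n )
  over a-size @ > abort" arena exhausted"
  dup a-base @ over a-used @ +   ( n arena addr )
  >r a-used +! r> ;              \ bump the offset, return the block

: arena-reset ( arena -- ) 0 swap a-used ! ;   \ "frees" everything at once
: arena-free  ( arena -- ) dup a-base @ free throw  free throw ;
```

In this sketch the only system calls to allocate and free happen when the arena is created and destroyed; every allocation in between is a pointer bump.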
I also introduced a memory pool.
In the arena, memory is simply allocated sequentially, so space that is no longer in use is wasted rather than reclaimed.
The pool, on the other hand, manages fixed-size chunks in a linked list, reusing freed blocks efficiently.
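The free-list idea can be sketched like this. Again, this is an illustration under my own assumptions, not the repository's implementation: the two-cell header and the word `pool-new` are hypothetical, and fresh chunks come straight from allocate here to keep the block self-contained.

```forth
\ Hypothetical pool header: [ free-list head | chunk size ].
: p-free ( pool -- addr ) ;
: p-size ( pool -- addr ) cell+ ;

: pool-new ( chunk-size -- pool )
  aligned 1 cells max            \ a free chunk must be able to hold a link
  2 cells allocate throw         ( size pool )
  tuck p-size !                  ( pool )
  0 over p-free ! ;              \ free list starts empty

: pool-alloc ( pool -- addr )
  dup p-free @ ?dup if           ( pool chunk )
    dup @ rot p-free !           \ pop: the chunk's link becomes the new head
  else
    p-size @ allocate throw      \ list empty: fetch a fresh chunk
  then ;

: pool-free ( addr pool -- )
  2dup p-free @ swap !           \ chunk's first cell points at the old head
  p-free ! ;                     \ chunk becomes the new head
```

Because every chunk has the same size, a freed chunk can satisfy the very next request, so nothing is wasted between a free and the following alloc.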
After about a week of rewriting with arenas and pools, I finally generated the image above. Victory was mine.
Remaining Challenges
It worked — but not without issues.
It’s Slow
Rendering the 1024×576 image took four hours. There’s zero optimization — the code runs exactly as written. Naïve implementations take time. Lots of it.
Inconsistent Calling Conventions
Because I designed everything ad hoc, the argument order is inconsistent across words. This isn’t unique to Forth, but it is especially painful here: a word takes its arguments in whatever order they sit on the stack, so changing the argument order later means reshuffling the stack at every call site, which is a nightmare. Also, passing four different memory pools around is ridiculous; I should’ve just made a struct.
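A sketch of what that struct could look like, assuming a Forth that provides the Forth-2012 words begin-structure and field: (recent gforth versions do). The four field names are invented for the example, since which pools the ray tracer actually uses is not spelled out here.

```forth
\ Hypothetical bundle of four pools; all field names are made up.
begin-structure /pools
  field: pools-vec       \ pool for vectors
  field: pools-ray       \ pool for rays
  field: pools-hit       \ pool for hit records
  field: pools-mat       \ pool for materials
end-structure

create scene-pools  /pools allot

\ A word now takes one address instead of four separate pools.
: ray-pool ( pools -- pool )  pools-ray @ ;
```

With this shape, adding a fifth pool changes the structure definition and nothing about how the argument is passed around.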
Conclusion
Writing Forth was painful — but also deeply enjoyable. It gave me the same low-level control as C, with a Lisp-like expressiveness. If you’re curious, give it a try. It’s worth it.