Daemon Engine
A custom C++20 game engine with V8 JavaScript scripting, DirectX 11 rendering with bloom pipeline, multithreaded JobSystem, FMOD 3D spatial audio, TCP/UDP networking, and a publish/subscribe event system.
Overview
Building a custom game engine wasn’t a choice I made — it was part of SMU Guildhall’s curriculum. Each assignment introduced a different engine subsystem: rendering, audio, networking, input, jobs. The value of building from scratch is that you study each system in isolation before they become entangled. But for most of the process, the design decisions were already made for me by the assignment spec.
That changed when I built DaemonAgent and embedded Google’s V8 JavaScript engine into the engine. Suddenly I was making real architectural decisions: how to decouple C++ rendering from JavaScript game logic, how to expose engine APIs across a language boundary, and how to let the C++ side render as fast as it wants while external agents modify JavaScript at their own pace. DaemonAgent is now one of the agents in my AGENTS thesis project, which means this engine isn’t just a learning exercise anymore: it’s a runtime that AI agents write code against.
The engine has been in active development since August 2024 and powers multiple shipped projects: DaemonWindows, Daemon Chess, Daemonstein, and DaemonStarship, each exercising different subsystem combinations from 3D rendering to multi-window desktop gameplay.
Architecture
The engine follows a layered subsystem architecture. Game-layer code (either C++ or JavaScript) depends on mid-level subsystems (renderer, audio, network, scripting), which in turn depend on core services (jobs, events, resources, math). All subsystems are owned and lifecycle-managed by the Engine singleton. The most important flow is the scripting bridge: JavaScript game logic calls into C++ engine APIs through V8 bindings, while the EventSystem keeps everything else decoupled.
| Module | Description |
|---|---|
| Renderer | DirectX 11 with bloom/blur pipeline, 3 light types, max 8 lights, embedded HLSL shaders |
| ScriptSubsystem | V8 JavaScript runtime with hot-reload, Chrome DevTools, configurable 256 MB heap |
| AudioSubsystem | FMOD 3D spatial audio with multi-listener support for splitscreen |
| JobSystem | Generic + I/O worker threads with bitfield-based job type matching |
| NetworkSubsystem | TCP/UDP sockets with WebSocket support via WebSocket++ |
| ResourceSubsystem | Async loading via JobSystem, 7 resource types (textures, shaders, models, etc.) |
| EventSystem | Publish/subscribe with NamedStrings arguments for decoupled communication |
| DevConsole | Command system with multiple display modes, ImGui integration |
Design Decisions
Why I Embedded V8 Instead of Lua or Python for Scripting
The goal wasn’t just to add scripting to the engine. I wanted hot-reload capability on game logic so that the C++ rendering side could run at full speed while something else, whether a human or an AI agent, modifies JavaScript files without restarting the engine. That “something else” turned out to be DaemonAgent, one of the agents in my AGENTS thesis.
Lua is the traditional choice for game engine scripting, and Python has a massive ecosystem. But V8 gave me three things neither could: ES6 module support out of the box, Chrome DevTools debugging on port 9229 (breakpoints, step-through, live inspection for free), and an async runtime model that maps naturally to the decoupled architecture I needed. The tradeoff is that V8 is heavy. Getting the precompiled binaries set up was genuinely painful, and the memory footprint is larger than Lua. But once the async generic command bridge between C++ and JavaScript was working, the performance was better than the tightly coupled alternative, because the two runtimes don’t block each other.
Why Bitfield-Based Job Type Matching Instead of Separate Queues
The JobSystem splits worker threads into generic and I/O types. Every submitted job carries a type bitfield, and each worker has its own type mask. A worker claims the first queued job whose bitfield shares at least one bit with its mask (a bitwise AND that comes out non-zero), so an I/O worker only picks up I/O jobs while generic workers handle compute tasks. This keeps I/O-bound work from starving CPU-bound tasks.
The obvious alternative is separate queues per type, or a priority system. I was actually planning to implement that, but held off because of the overhead concern. A single queue with bitfield matching is dead simple: one mutex, one scan, one atomic check. Separate queues mean separate locks, and a priority system adds sorting or heap operations on every enqueue. For the scale of projects this engine runs, the bitfield approach gives me clean type separation without the complexity tax. If I hit a workload where the linear scan becomes a bottleneck, I’ll revisit, but that hasn’t happened yet.
Challenges
Finding DirectX 11 Memory Leaks Without Clear Symptoms
The hardest bug category in this engine wasn’t logic errors — it was DX11 memory leaks. The symptom is subtle: the engine runs fine for minutes, then frame times start creeping up, or VRAM usage climbs in Task Manager without any obvious trigger. There’s no crash, no error message, just slow degradation.
My first assumption was always that I’d forgotten to release a COM object somewhere obvious, like a render target or shader resource view. But DX11 leaks are rarely that straightforward. The real causes were things like creating a new blend state every frame instead of caching it, or holding a reference to a swap chain buffer after a resize. The debugging process was methodical but slow: enable the D3D debug layer, read the object leak report at shutdown, cross-reference the allocation IDs with the codebase, and trace the lifetime of each leaked object. There’s no shortcut. You just have to be disciplined about COM reference counting and verify your assumptions about who owns what.
Building the C++/JavaScript Async Generic Command System
The hardest feature to build was the decoupled async command system between C++ and JavaScript. The ScriptSubsystem was initially created with non-generic JSON commands, which meant every time I wanted to expose a new C++ API to JavaScript, I had to write a dedicated handler with manual type marshalling. It was a boilerplate nightmare that didn’t scale.
The fix was to build a generic command bridge: C++ registers callable functions with type metadata, and the V8 binding layer automatically marshals arguments and return values based on that metadata. Commands are async by default, so JavaScript can fire a request and continue executing while C++ processes it on its own thread. The difficulty wasn’t any single piece — it was getting the type conversion, the async lifecycle, and the error propagation all working together without introducing race conditions or silent type mismatches.
Debugging Multithreaded JobSystem Issues
When developing the JobSystem, the hardest part wasn’t the implementation — it was debugging it. Multithreaded bugs are intermittent by nature. A job might complete correctly 99 times and deadlock on the 100th because of a specific interleaving of thread scheduling that you can’t reproduce on demand.
The debugging process required stepping thread by thread in Visual Studio, inspecting each worker’s state, and mentally reconstructing the execution order that led to the problem. Breakpoints change timing, which changes behavior, which means the act of debugging can hide the bug. I learned to rely more on logging with thread IDs and timestamps than on interactive debugging, and to write small isolated test cases that stress specific interleavings rather than trying to reproduce issues in the full engine.
Code
The JobSystem’s claim mechanism is the core of how work gets distributed. Worker threads don’t pull from separate queues. They scan a single shared queue under one mutex and claim the first job whose type bitfield overlaps with their own. This is simpler than maintaining per-type queues while still preventing I/O-bound work from starving compute tasks. The bitwise AND check is the key line: it’s what makes a generic worker skip I/O jobs and vice versa, without a switch or if-chain over job categories.
```cpp
bool JobSystem::ClaimJobFromQueue(Job*& out_job,
                                  WorkerThreadType const workerType)
{
    std::scoped_lock lock(m_jobQueuesMutex);
    for (auto it = m_queuedJobs.begin(); it != m_queuedJobs.end(); ++it)
    {
        Job* job = *it;
        // Bitfield match: worker type must overlap with job type
        if ((job->GetJobType() & workerType) != 0)
        {
            m_queuedJobs.erase(it);
            m_executingJobs.push_back(job);
            out_job = job;
            return true;
        }
    }
    out_job = nullptr;
    return false;
}
```
Technical Specifications
| Component | Technology |
|---|---|
| Language | C++20 (MSVC) |
| Graphics | DirectX 11, HLSL (Blinn-Phong, Bloom/Blur) |
| Scripting | V8 JavaScript Engine, Chrome DevTools (port 9229) |
| Audio | FMOD (3D spatial, multi-listener) |
| Networking | TCP/UDP (WinSock2), WebSocket++ |
| Build System | MSBuild (Visual Studio) |
| Third-Party | ImGui, nlohmann/json, TinyXML2, stb, OpenSSL, Noise |
| Platform | Windows (x64) |
Related Projects
Game Dev
Daemon Libra
A 2D top-down tank shooter with heat-map AI pathfinding, procedural map generation via worm algorithms, and bouncing bullet physics, built on a custom C++ engine.
Game Dev
Daemon Starship
A 2D space shooter with fixed-size entity pools, dual-radius collision, and 5 progressive enemy waves, built on a custom C++ engine.
Game Dev
Daemon Windows
A multi-window action game that turns the Windows desktop into a shrinking battlefield, with wave-based survival, a shop upgrade system, and real Win32 windows as game objects.