Mr. Phil Games’ Blog

Posts for Tag: parity testing

Stellar Throne Devlog #10 — From Bug Fixes to Breakthroughs

Yesterday marked a massive surge of progress on Stellar Throne, my 4X strategy game. For those new to the project, Stellar Throne is built with Godot for the UI client and Zig for the high-performance simulation engine. After validating that the dual-engine architecture was correctly preserving visual data, yesterday’s focus was all about systematically fixing critical bugs and implementing major missing systems.

Fixing Fleet Movement: The Multi-Hop Breakthrough

The day began with a curious issue — fleets would stop at their first waypoint instead of continuing to their final destination. If I ordered a fleet to travel across five star systems, it would jump to the first and then just… stop.

There were two culprits:

  1. The Zig simulation backend was missing multi-hop continuation logic.

  2. The has_planned_route flag wasn’t being properly set during deserialization.

I fixed both by adding continuation logic to the Zig fleet arrival handler and correcting route flag inference from the route array. Fleets now correctly travel across all waypoints to reach distant colonies.
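
As a rough illustration of those two fixes, here is a minimal Python sketch (the real code is Zig and GDScript, and is not shown in this post). The Fleet fields other than has_planned_route, and the function names, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Fleet:
    # Hypothetical fields; only has_planned_route is named in the post.
    system_id: int
    route: list = field(default_factory=list)  # remaining waypoints
    has_planned_route: bool = False


def fleet_from_dict(data: dict) -> Fleet:
    """Deserialize a fleet, inferring the route flag from the route array."""
    route = list(data.get("route", []))
    return Fleet(
        system_id=data["system_id"],
        route=route,
        has_planned_route=len(route) > 0,
    )


def on_fleet_arrival(fleet: Fleet) -> None:
    """Arrival handler: hop to the next waypoint and keep the route alive."""
    if not fleet.route:
        fleet.has_planned_route = False
        return
    fleet.system_id = fleet.route.pop(0)
    # Continuation logic: the fleet keeps travelling while waypoints remain.
    fleet.has_planned_route = len(fleet.route) > 0


def travel(fleet: Fleet) -> list:
    """Drive the fleet until its route is exhausted; return visited systems."""
    visited = []
    while fleet.has_planned_route:
        on_fleet_arrival(fleet)
        visited.append(fleet.system_id)
    return visited
```

With a five-system route, travel() visits every waypoint in order instead of stopping at the first one.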


Event Manager Revival: “Salvage the Past” Returns

Next, I discovered that the “Salvage the Past” event — which provides narrative context and a small resource bonus at game start — wasn’t firing.
The cause? The Event Manager was still trying to load from a deprecated JSON file instead of the new TOML configuration system. After migrating all configs months ago, this subsystem had simply never been updated.

I replaced the hardcoded loader with a TOML-based one through the Config Manager, added automatic format conversion (snake_case → camelCase), and fixed an autoload order issue that caused Event Manager to initialize too early. The fix was simple but pivotal: moving Config Manager earlier in the autoload sequence. Events now trigger properly again.
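
The key conversion step could be sketched like this in Python. It assumes the TOML file has already been parsed into nested dictionaries; the sample keys and values are made up for illustration and are not taken from the game's config.

```python
def snake_to_camel(name: str) -> str:
    """Convert a snake_case key to camelCase (leading underscores not handled)."""
    head, *rest = name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def convert_keys(value):
    """Recursively rename dictionary keys, leaving values untouched."""
    if isinstance(value, dict):
        return {snake_to_camel(k): convert_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(item) for item in value]
    return value


# Hypothetical event entry, as it might look after TOML parsing.
raw_event = {
    "event_id": "salvage_the_past",
    "trigger_turn": 1,
    "resource_bonus": {"minerals_amount": 50},
}
converted = convert_keys(raw_event)
```

Only keys are renamed. String values such as "salvage_the_past" stay as they are, which matters if they are used as IDs elsewhere.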


Research System Overhaul: Persistence, Costs, and Effects

The research system was next — and it was a tangle. Research progress wasn’t persisting correctly after saving and loading, and techs were completing at incorrect cost values. Digging in revealed:

  • A serialization key mismatch (active_tech vs. active_tech_id)

  • Corrupted deserialization from nested research dictionaries

  • Over-accumulating progress past 100%

  • Flat serialization structures incompatible with the Godot side

I rebuilt the serialization/deserialization layer and restructured ResearchState, SimEmpire, and TurnSimulator to restore consistent persistence.

But research completion costs were still wrong — some techs completed at arbitrary thresholds. The problem was broken tier-matching logic: Zig was inferring tech costs from tiers instead of fetching them from the TOML configuration. After replacing it with proper ID-based lookups, research now respects real costs.
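
To make the difference concrete, here is a small Python sketch of an ID-based cost lookup combined with a clamp that stops progress from accumulating past 100%. The tech IDs and costs are invented for the example; the real values live in the TOML configuration.

```python
# Hypothetical cost table, standing in for values loaded from TOML.
TECH_COSTS = {"laser_weapons": 120, "fusion_drives": 300}


def cost_for(tech_id: str) -> int:
    """Look the cost up by ID; a missing ID raises KeyError instead of guessing."""
    return TECH_COSTS[tech_id]


def add_research(progress: int, points: int, tech_id: str):
    """Return (new_progress, completed), never exceeding the tech's real cost."""
    cost = cost_for(tech_id)
    new_progress = min(progress + points, cost)
    return new_progress, new_progress >= cost
```

Failing loudly on an unknown ID is a design choice in this sketch: it surfaces config mismatches instead of silently falling back to a tier-based estimate.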

Finally, I discovered that tech effects weren’t being applied when research completed. The Zig backend had a “simplified” completeTech() function that unlocked techs but didn’t apply effects or send notifications. I implemented a temporary workaround in the TurnSimulationService, post-processing newly completed techs and calling the full Godot-side applyTechnologyEffects() method.

The workaround, along with a plan for a proper long-term solution, is documented in TODO_TECH_EFFECTS.md, which estimates 15–20 hours of implementation work.


Planning for Parity: Building a Structured Roadmap

After the bug marathon, I shifted to long-term planning. The Zig backend currently includes 10 simplified systems — construction, combat, AI, events, and more. To bring full parity with Godot, I created a detailed roadmap across five new documents:

  1. Zig Implementation Plan — 13-week roadmap with system-by-system breakdowns, totaling 360–450 estimated hours.

  2. Zig Parity Roadmap — Executive summary showing 8 full systems, 10 simplified, and a move from 65% tolerance to near 0% parity.

  3. Parity Test Examples — Deterministic test specifications for construction, combat, AI, and events using exact RNG seeding.

  4. Zig Parity Progress — Session tracking, milestone checklists, and velocity metrics.

  5. Session Workflow Guide — Practical start/during/end checklists for ongoing development.
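
For readers unfamiliar with seeded tests like those in the Parity Test Examples document, here is a toy Python sketch. The combat rule is invented and far simpler than anything in the game; the point is only that a fixed seed makes a random process repeatable.

```python
import random


def resolve_combat(attacker_ships: int, defender_ships: int, seed: int):
    """Toy combat loop: each round a coin flip removes one ship from a side."""
    rng = random.Random(seed)  # private generator, so no global state leaks in
    while attacker_ships > 0 and defender_ships > 0:
        if rng.random() < 0.5:
            defender_ships -= 1
        else:
            attacker_ships -= 1
    return attacker_ships, defender_ships


first = resolve_combat(5, 5, seed=42)
second = resolve_combat(5, 5, seed=42)
```

Running the same seed twice yields the same outcome. Note that cross-engine parity needs more than a shared seed: both engines would also have to use the same random number algorithm and draw values in the same order.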


Phase Five Complete: Construction System Implemented

With the roadmap ready, I dove into Phase Five — Construction.

This system enables both turn-based building construction and production-based shipbuilding in Zig. The work included:

  • A new SimConstruction.zig (with ConstructionOrder and ConstructionQueue)

  • Build and ship queues added to SimGalaxy

  • Construction processing added to TurnSimulator

Each construction order tracks 13 fields — progress, turns, costs, and more — while queues handle FIFO operations. The logic now supports both building completion and ship production (logged for now, pending full ship creation parity).
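
A stripped-down Python sketch of the order/queue idea follows. It uses four illustrative fields rather than the 13 in the real ConstructionOrder, and the field and method names are assumptions, not the Zig API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ConstructionOrder:
    item_type: str
    turns_remaining: int = 0   # used by turn-based buildings
    production_cost: int = 0   # used by production-based ships
    production_spent: int = 0

    def is_complete(self) -> bool:
        return (self.turns_remaining <= 0
                and self.production_spent >= self.production_cost)


class ConstructionQueue:
    """FIFO queue: only the order at the front makes progress each turn."""

    def __init__(self):
        self._orders = deque()

    def enqueue(self, order: ConstructionOrder) -> None:
        self._orders.append(order)

    def process_turn(self, production: int = 0) -> list:
        """Advance the front order; return item types finished this turn."""
        if not self._orders:
            return []
        front = self._orders[0]
        if front.turns_remaining > 0:
            front.turns_remaining -= 1
        front.production_spent = min(
            front.production_cost, front.production_spent + production
        )
        if front.is_complete():
            return [self._orders.popleft().item_type]
        return []
```

In this sketch a building finishes when its turn counter reaches zero, while a ship finishes once enough production has been spent on it.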

The system took 8 hours, about 33% under the low end of the 12–16 hour estimate.
Current test results:

  • 339 Zig unit tests: ✅ Passing

  • 739/740 GUT integration tests: ✅ Passing

Zig parity stands at 1/7 phases complete (14%).


Construction Bugs & Fixes

The new construction system exposed three deeper issues:

  1. Disappearing Orders — Planets lacked system_id after deserialization. Fixed by setting it during StarSystem.from_dict().

  2. “Unknown Building” Labels — Type strings weren’t deserializing properly. Implemented a temporary queue-preservation workaround.

  3. Wrong Building Creation — Empty item types caused incorrect building spawns. Fixed via state preservation until full parity logic is in place.


Smarter Documentation: Resume Management Automation

Finally, I overhauled documentation. The massive RESUME.md file had grown to over 2,000 lines, bloating context windows. I:

  • Archived sessions 1–59 to a separate file

  • Reduced RESUME.md to 400 lines (80% reduction)

  • Created two new Claude agents:

    • Resume Writer Agent — auto-archives when context exceeds 150K tokens or on “done for today”

    • Resume Reader Agent — summarizes state at session start

Now, context is lean, searchable, and structured for long-term development.


Technical Summary

Yesterday’s session generated 13 commits and over 1,000 lines of code changes:

  • Fleet multi-hop fix — 34 lines

  • Event system TOML migration — 63 lines

  • Autoload dependency fix — 1 line

  • Research persistence & cost fixes — 128 lines

  • Tech effect workaround — 53 lines

  • Planning documents — 2,000+ lines (5 new files)

  • Construction system — 454 lines

  • Construction bug fixes — 108 lines

  • Resume refactor — 1,500 archived lines + 2 Claude agents


Current Status

✅ Fleets now travel through multi-hop routes.
✅ Events trigger properly at game start.
✅ Research completes at correct costs and applies tech effects.
✅ Construction system is fully implemented in Zig.
✅ One of seven parity phases complete (14%).
✅ Documentation system streamlined and automated.

This marks a shift from “fixing bugs in the Zig backend” to “systematically implementing full system parity.” The new roadmap provides the structure to reach 100% Zig simulation parity over the next 13 weeks.

Next up: Phase Nine — Combat Resolution, estimated at 24–32 hours.


You can follow ongoing updates and technical breakdowns at mrphilgames.com.
Thanks for reading — and for joining me on this journey to build Stellar Throne from the ground up.

— MrPhil

Stellar Throne Devlog #8: Validation, Parity, and Precision

Monday was a critical day of validation and bug hunting for Stellar Throne, my 4X strategy game built with Godot for the UI client and Zig for the high-performance simulation engine. After enabling the Zig backend in production over the weekend, my goal was to make absolutely sure both engines stayed perfectly in sync.


Dual-Engine Validation

The big picture: the Zig simulation now runs about fifty-two times faster than the original Godot implementation. But speed means nothing if the results aren’t identical. Monday’s focus was validation, testing, and fixing any divergence between the two engines.

I started the day tracking down a fleet position serialization bug. When fleets moved between star systems, their positions were stored correctly in Zig, but once the data returned to Godot for rendering, the interpolation broke — fleets would jump around the screen instead of moving smoothly along their paths.

The culprit was the serialization layer. Zig and Godot were using different coordinate formats. I standardized the data structure and updated the movement interpolation code to properly handle the incoming data. With that fix, fleet movement became smooth and visually accurate again.


Building Systematic Validation

Ad-hoc bug fixes weren’t enough; I needed systematic validation. I built a comprehensive serialization parity test framework to catch any mismatch before it caused a production bug.

The new test infrastructure compares both engines after every operation. It serializes game state from Godot to JSON, deserializes it in Zig, runs a simulation step, then serializes Zig’s results back to JSON and compares every field — ensuring no corruption or drift between engines.
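
The comparison step can be pictured with a short Python sketch. This is not the actual test framework: it walks two JSON-like structures and reports the path of every field that differs, and the state layout shown is hypothetical.

```python
import json


def diff_states(expected, actual, path=""):
    """Return the paths of all fields that differ between two JSON-like values."""
    mismatches = []
    if isinstance(expected, dict) and isinstance(actual, dict):
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else key
            if key not in expected or key not in actual:
                mismatches.append(child)  # field missing on one side
            else:
                mismatches.extend(diff_states(expected[key], actual[key], child))
    elif isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            mismatches.append(path)
        else:
            for i, (e, a) in enumerate(zip(expected, actual)):
                mismatches.extend(diff_states(e, a, f"{path}[{i}]"))
    elif expected != actual:
        mismatches.append(path)
    return mismatches


# Hypothetical game state; a JSON round trip should change nothing.
state = {"turn": 3, "empires": [{"id": 1, "minerals": 100}]}
round_tripped = json.loads(json.dumps(state))
```

Reporting paths such as empires[0].minerals, rather than a single pass/fail, makes it much quicker to see which field drifted.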

This framework immediately revealed several subtle issues:

  • Inconsistent resource formatting across empires

  • Rounding errors in planet population data due to floating-point differences

  • Missing serialized fields in certain edge cases

I spent the rest of the day tightening the serialization code. Every numeric field now includes explicit precision validation. Optional fields have null checks. Missing legacy data gets default values. The result is a much more defensive and resilient serialization layer.
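
A defensive reader along those lines might look like the following Python sketch. The field names, the default values, and the six-decimal rounding are all assumptions made for illustration.

```python
import math


def read_planet(data: dict) -> dict:
    """Deserialize a planet record defensively."""
    # Missing legacy data gets a default value.
    population = data.get("population", 0.0)
    # Optional fields may be present but null.
    if population is None:
        population = 0.0
    population = float(population)
    # Reject values that should never reach the simulation.
    if not math.isfinite(population) or population < 0:
        raise ValueError(f"invalid population: {population!r}")
    return {
        "name": data.get("name") or "Unknown",
        # Explicit precision: both sides round to the same number of decimals.
        "population": round(population, 6),
    }
```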


Achieving Population Growth Parity

Once the parity tests stabilized, a new challenge surfaced: population growth parity. The tests passed for most systems, but population numbers were diverging slightly between engines.

The root cause turned out to be integer math in Zig where floating-point calculations were required. After switching the intermediate values to floating-point, both engines produced identical results down to the last decimal place — restoring confidence in one of the game’s most important mechanics.
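
Here is a small Python sketch of how that kind of divergence can arise. The logistic-style growth formula is a stand-in, not the game's actual rule; what matters is that the integer version truncates an intermediate value.

```python
def growth_integer(pop: int, cap: int, rate_percent: int) -> int:
    """Integer-only math: the headroom percentage is truncated early."""
    headroom_percent = 100 - (pop * 100) // cap
    return pop * rate_percent * headroom_percent // 10_000


def growth_float(pop: int, cap: int, rate_percent: int) -> float:
    """Same formula with floating-point intermediates."""
    headroom = 1.0 - pop / cap
    return pop * (rate_percent / 100.0) * headroom


int_result = growth_integer(1234, 5000, 3)
float_result = growth_float(1234, 5000, 3)
```

For a population of 1,234 with a cap of 5,000 and a 3% rate, the integer version returns 28 while the floating-point version returns roughly 27.88. A gap that small per turn is enough to fail a field-by-field parity check.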


Fixing Test Infrastructure Race Conditions

The deeper I tested, the more subtle issues emerged. The asynchronous turn-processing code in the test framework was causing race conditions: some tests were checking results before a turn had actually finished.

I fixed this by adding proper sequencing and awaits. Each turn now fully completes before validation begins, eliminating false failures and making the test results rock-solid.
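
The same class of bug is easy to reproduce with Python's asyncio, used here purely as an analogy for the GDScript test code. The function names are hypothetical.

```python
import asyncio


async def process_turn(state: dict) -> None:
    await asyncio.sleep(0)  # stands in for asynchronous turn processing
    state["turn"] += 1


async def racy_check(state: dict) -> bool:
    """Bug: validates before the scheduled turn has had a chance to run."""
    expected = state["turn"] + 1
    task = asyncio.create_task(process_turn(state))
    ok = state["turn"] == expected  # checked too early
    await task
    return ok


async def sequenced_check(state: dict) -> bool:
    """Fix: await the turn to completion, then validate."""
    expected = state["turn"] + 1
    await process_turn(state)
    return state["turn"] == expected


racy_ok = asyncio.run(racy_check({"turn": 1}))
sequenced_ok = asyncio.run(sequenced_check({"turn": 1}))
```

The racy version reports a failure even though the turn logic is correct, which is exactly the kind of false failure described above.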


Floating-Point Precision in Max Population Calculations

Another precision issue appeared in maximum population calculations. Planets calculate their population limits from habitability scores. Godot used single-precision floats, while Zig used double-precision. Over long playthroughs, these tiny discrepancies added up.

I standardized both engines to use double-precision floating-point for all population math. Now, population caps match exactly — no more subtle drift.
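
The effect of mixing precisions can be shown with a short Python sketch. Python floats are double precision, so single precision is simulated by round-tripping through a 4-byte value; the base capacity and the cap formula are hypothetical.

```python
import struct


def to_f32(x: float) -> float:
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


BASE_CAPACITY = 1_000_000_000  # hypothetical base population capacity


def max_population(habitability: float) -> int:
    """Hypothetical cap: base capacity scaled by the habitability score."""
    return int(BASE_CAPACITY * habitability)


cap_double = max_population(0.1)
cap_single = max_population(to_f32(0.1))
```

A habitability score of 0.1 is not exactly representable in either format, and the single-precision approximation is slightly larger, so the two caps come out different.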


Debugging Star Rendering

The day ended with a star rendering issue. After running turns through the Zig simulation, a few stars failed to render correctly on the galaxy map. The investigation pointed to a coordinate transformation problem in the rendering pipeline. This one’s still in progress, but it’s the next problem on my list.


Technical Breakdown

  • Ten commits completed

  • Three hundred twenty-seven lines added for the serialization parity framework

  • Eighty-one lines changed for population growth fixes across four files

  • Six functions updated to standardize floating-point precision

  • Fifty-three lines added to improve asynchronous test sequencing

The parity suite now runs fifteen validation scenarios, covering empire resources, fleet positions, planet populations, star system states, and cross-turn consistency. Every test executes both engines side by side and compares outputs field by field.


Current Status

The Zig backend is now fully running in production with comprehensive parity validation.
✔ Population growth matches exactly
✔ Serialization round-trips are verified
✔ Fleet movement is smooth and accurate
⚙️ Star rendering issue still under investigation

This validation work might not be flashy, but it’s essential. The dual-engine architecture only works if both engines produce identical results. These parity tests prove that the fifty-two-times performance gain isn’t coming at the cost of correctness.


Next Steps

  • Finish debugging the star rendering issue

  • Add new parity tests for edge cases like empire bankruptcy and fleet combat

  • Begin profiling the Zig simulation to find remaining performance bottlenecks

  • Explore new simulation features that take advantage of the speed boost


Thanks for following along with the development of Stellar Throne.

You can find more updates and technical breakdowns at MrPhilGames.com.

Stellar Throne Devlog #6 — Zig Parity and Production Deployment

Hey everyone, MrPhil here with another devlog update!

Today marks a critical production deployment milestone for Stellar Throne, my 4X strategy game set in a post-empire galaxy. For those new here, I’m building Stellar Throne with a dual-engine architecture — using Godot for the UI client and Zig for the high-performance simulation backend.

This week’s focus: validating that the Zig simulation engine isn’t just fast — but production-ready.


⚙️ Context: The Zig Engine Hits 52.7× Performance

After two weeks of development, the Zig simulation engine is complete — delivering a verified 52.7× speedup over the original Godot implementation. Every one of the 1,069 tests passes cleanly.

The question was no longer “Does it work?” but “Can I ship it?” And even more importantly: “Will it stay stable under real-world stress?”


🧪 Parity Testing and Stress Scenarios

To answer that, I built a comprehensive parity testing system — a harness that runs the exact same turn through both the Godot and Zig engines, then compares every field in the game state to ensure they produce identical results.

Test scenarios ranged from:

  • 10 empires with 10 systems each

  • Up to 100 empires with 250 star systems

That’s when things got interesting — and the bugs started surfacing.


🐛 Three Critical Bugs (and How I Fixed Them)

1. Negative Resource Spirals

Zig’s resource validation could let values run away when a colony’s food production went below zero: the stockpile spiraled further negative each turn instead of clamping to zero.
Fix: Added strict validation to prevent negative resource values during calculations.

2. Fleet Upkeep Mismatch

The Godot version used a flat-rate upkeep (10 Energy + 5 Minerals per fleet), but Zig calculated per-ship upkeep — leading to wildly inconsistent costs.
Fix: Standardized both engines to use the flat-rate formula for base upkeep while keeping per-ship maintenance for hull and weapon costs.

3. Maintenance Persistence Loss

Maintenance costs were being overwritten during stat recalculation, resetting values to zero.
Fix: Preserved maintenance data during the recalculation phase.

After fixing these, I expanded the stress tests — pushing edge cases like 200+ fleets per empire, negative production chains, and zero-population planets. That led to 11 parity fixes across both engines — covering resources, fleets, population growth, and infrastructure costs.
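
Two of those fixes, clamping and flat-rate base upkeep, can be sketched together in Python. The 10 Energy + 5 Minerals figures come from the post; the resource names as dictionary keys and the function names are assumptions, and per-ship maintenance is left out.

```python
# Flat-rate base upkeep per fleet, as described above.
FLEET_UPKEEP = {"energy": 10, "minerals": 5}


def base_upkeep(fleet_count: int) -> dict:
    """Base upkeep scales with the number of fleets, not ships."""
    return {res: amount * fleet_count for res, amount in FLEET_UPKEEP.items()}


def apply_turn(stockpile: dict, production: dict, fleet_count: int) -> dict:
    """Apply production and upkeep, clamping every resource at zero."""
    upkeep = base_upkeep(fleet_count)
    result = {}
    for res, amount in stockpile.items():
        net = amount + production.get(res, 0) - upkeep.get(res, 0)
        result[res] = max(0, net)  # never carry a negative value forward
    return result


after = apply_turn(
    {"energy": 15, "minerals": 20, "food": 3},
    {"food": -10},
    fleet_count=2,
)
```

Clamping at the point of calculation means a negative stockpile can never feed into the next turn's math.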


🚀 Deployment Roadblocks and Race Conditions

With parity achieved, I turned to production deployment — enabling the Zig backend automatically during game initialization.

Two blockers emerged:

  1. Initialization Timing
    The Zig backend tried to serialize game state before empires existed — crashing when the empires array was still empty.
    Fix: Corrected property references (galaxy.galaxy_width → galaxy.width) and restructured initialization order.

  2. Async Signal Race Condition
    The system awaited a signal after it was emitted — a classic async race.
    Fix: Stored the signal reference before starting the turn, ensuring the listener was connected in time.
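
The signal race can be modeled in Python with a tiny non-latching signal class. This is an analogy for Godot's signals, not Godot code, and every name in it is hypothetical.

```python
import asyncio


class Signal:
    """Minimal non-latching signal: only waiters registered before emit hear it."""

    def __init__(self):
        self._waiters = []

    def wait(self):
        future = asyncio.get_running_loop().create_future()
        self._waiters.append(future)
        return future

    def emit(self, value) -> None:
        waiters, self._waiters = self._waiters, []
        for future in waiters:
            future.set_result(value)


async def run_turn(turn_done: Signal) -> None:
    turn_done.emit("turn complete")  # fires before the caller resumes


async def racy(timeout: float = 0.05):
    """Bug: starts waiting only after the signal has already fired."""
    signal = Signal()
    await run_turn(signal)
    try:
        return await asyncio.wait_for(signal.wait(), timeout)
    except asyncio.TimeoutError:
        return None


async def safe():
    """Fix: register the waiter first, then start the turn."""
    signal = Signal()
    pending = signal.wait()
    await run_turn(signal)
    return await pending


racy_result = asyncio.run(racy())
safe_result = asyncio.run(safe())
```

In the racy version the emit is simply missed. The timeout only keeps the example from hanging; without it the wait would never finish.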

To prevent future issues, I implemented:

  • Helper functions in GameScreenCore and GameInitializer

  • Debug logging in TurnProcessor

  • 343-line deployment guide detailing rollback, monitoring, and troubleshooting procedures


🧩 Deployment Decision: Disabled by Default

After reviewing everything, I made the cautious call to keep the Zig backend disabled by default for now.
The timing between empire creation and backend initialization is delicate — and I’d rather avoid subtle production bugs.

That said, the system is fully integrated. Manual enablement is available for testers. The performance boost is real, and parity is 100% confirmed.


📈 Final Stats

  • 20 commits in one day

  • 4,243 new lines of parity infrastructure across 13 files

  • 343-line deployment guide completed

  • 11 parity bugs fixed

  • All 1,069 tests passing

  • 52.7× performance improvement validated

The Zig simulation backend is production-ready, complete with fallback to Godot if anything fails.


🧭 Next Steps

  1. Verify signal-timing fixes in live game sessions

  2. Test deployment in extended campaign scenarios

  3. Decide whether to enable Zig automatically or keep manual control

Either way, the hard part’s done — the Zig engine is live-ready and just one config flag away.


Thanks for following along — and as always, you can find more at MrPhilGames.com.

See you in the next devlog!