Mr. Phil Games’ Blog

✅ Godot Port Complete (Mostly!)

Today marked a huge milestone — the Godot port of Stellar Throne is functionally complete. I wrapped up the final phases of integration, testing, and UI polishing. Most of the core systems are now running in Godot, including fleet travel, the research system (96 techs with tier unlocking), fog-of-war rendering, and responsive detail panels for stars, planets, and fleets.
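As a toy illustration of how tier-gated research like this might work (a hypothetical Python sketch, not the game's actual Zig or Godot code — class and constant names are invented):

```python
# Hypothetical sketch of tier-gated tech unlocking, loosely modeled on
# "96 techs with tier unlocking". Names are illustrative, not the
# project's real data model.

UNLOCK_THRESHOLD = 3  # techs finished in tier N before tier N+1 opens

class TechTree:
    def __init__(self):
        self.researched = set()  # set of (tier, tech_index) pairs

    def researched_in_tier(self, tier):
        return sum(1 for t, _ in self.researched if t == tier)

    def tier_unlocked(self, tier):
        # Tier 0 is always open; later tiers gate on the previous tier.
        return tier == 0 or self.researched_in_tier(tier - 1) >= UNLOCK_THRESHOLD

    def research(self, tier, index):
        if not self.tier_unlocked(tier):
            raise ValueError(f"tier {tier} is still locked")
        self.researched.add((tier, index))

tree = TechTree()
for i in range(UNLOCK_THRESHOLD):
    tree.research(0, i)          # finish three tier-0 techs
print(tree.tier_unlocked(1))     # True
```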

While the port isn't 100% complete — I estimate about 20–30% of the original Zig/SDL2 functionality still needs to be rebuilt — the trade-offs feel worth it. I’ve gained a cleaner UI, smoother navigation tools, and access to Godot’s powerful export pipeline (WebGL, platform builds), editor-driven UI, and runtime debugging.

Bug hunting continues. I’ve already addressed missing header fonts, fleet icon visibility, and notification behavior. Button layouts have been reorganized for better clarity, and I'm finally seeing visual polish emerge from the workflow.

From here, it’s about rebuilding what was lost and extending forward from a stronger foundation.

📦 Stellar Throne: Porting Progress & Platform Trade-Offs

Today the Godot port made huge strides across core systems:

  • ✅ Phase 3.6: Combat & Military Systems

  • ✅ Phase 3.7: Strategic Layer

  • ✅ Phase 3.8: Advanced Features

  • ✅ Phase 4: UI System

  • 🔧 Phase 5: Integration & Testing (in progress)

One big shift I noticed today: debugging with Godot + Claude introduces a workflow tax. Since Claude can’t see the Godot runtime logs directly, I now have to manually screenshot errors and paste them in. This adds friction compared to my Zig workflow, where Claude could read the logs, compile the project, and even suggest fixes without manual input.

Trade-offs becoming clear:

Godot
  • + WebGL, cross-platform support
  • + GUI tools and live UI previews
  • - Slower iteration (need to reload manually)
  • - No direct access to logs

Zig
  • + Easier automated debugging
  • + Claude can build & run tests directly
  • - No built-in deploy pipeline
  • - No visual UI editor

Overall, I still feel optimistic about the switch. Once the core pipeline is smoothed out and testing integrated, the benefits of rapid deployment and visual editing may outweigh the added friction.

Tomorrow I’ll focus on fixing the save/load integration and begin real-world gameplay tests inside the Godot port.

🚀 Stellar Throne Enters Alpha: Polish, Audio, and the Godot Port Begins

Today marked a turning point in Stellar Throne’s development — I’ve officially reached what feels like an Alpha milestone. The core gameplay loop is playable, the systems are largely in place, and while many features are still rudimentary or placeholder, the structure of the game exists end-to-end.

I kicked off Phase 7: Polish the Polish and Phase 8: Audio & Advanced Visual Effects, but I’ve quickly realized that I’m bottlenecked by asset limitations — both sound and visuals. It’s probably time to bring on an Art Director to elevate the game’s look and feel with professional guidance.

The other big development? I’ve begun porting Stellar Throne to Godot. Claude estimated about a week of work, and I’ve already completed the foundational port, galaxy rendering, and core logic migration. Using Godot opens up several exciting possibilities:

  • WebGL support for playtesting on itch.io

  • A proper UI editor (rather than building interfaces through AI prompts)

  • Streamlined asset integration and animation pipelines

This is all made possible thanks to the accelerated development AI provides. What would normally take months has taken weeks — and I can feel the momentum building.

🎮 Stellar Throne Devlog — Phase 6 Complete, Diplomacy & Polish Begin

Today marked a major milestone: Phase 6 is complete! Galaxy Generation is now finalized, and I've officially kicked off both the AI Systems and Diplomacy phases. The first AI behaviors are in motion, and the foundations of diplomatic postures are coming online.

I also posted the latest newsletter and fixed a tricky bug with the fleet-splitting UI, where ship selection wasn’t responding. That one was subtle but satisfying to squash.

I’ve started on Phase 7 — “Polish the Polish” — where I refine interaction details, feedback animations, and overall game feel. It’s still early, but I’m excited to start layering clarity and visual cohesion.

One experiment didn’t pan out as hoped: subagents in Claude. While conceptually appealing, they aren’t automatically leveraged in workflows, so their value is limited unless I manually invoke them. Still, they help keep the context clean.

Newsletter: 🧠 Agentic Game Development

This week’s newsletter dives into the development philosophy behind Stellar Throne — and how I’m building it with AI as a true collaborator, not just a code generator.

I call it Agentic Game Development:
A workflow where AI helps me plan, design, and prototype entire game systems — from diplomacy AI to tactical ground combat — by working from goals, not syntax.

In the post, I cover:

  • A full weekly dev update (event-driven managers, new HUD actions, galaxy anomalies, bug fixes)

  • What “agentic development” means and how I use it daily

  • Real examples of prompts, AI responses, and what I build on top of them

  • The pitfalls of working with AI — and how I mitigate them

  • Why this workflow lets me design faster and focus on what matters

👉 Read the full post here: https://mrphilgames.substack.com/p/agentic-game-development?r=ifm45

Whether you're curious about AI dev workflows or just want a behind-the-scenes look at Stellar Throne, I think you’ll find this one interesting.

Refactoring, Regression Fixes, and the Case Against Unit Tests?

Today was a busy one for Stellar Throne. I pushed through a long list of regression fixes and finally got unit test failures down to a single case related to game save versioning. I also redirected all tests to a separate test_saves directory to avoid interference with the main game data — something Claude had been missing.

UI got some much-needed love today too:

  • Top-bar buttons for Research, Diplomacy, Ship Build, Construction, Fleet Management, and Bombardment are all back.

  • You can now double-click planets to jump into construction, and double-click stars to access the shipyard.

  • I also fixed UI quirks like not being able to select the last item in a panel, and persistent notifications not returning if a project wasn’t selected.

Combat bugs are getting squashed as well. Battles between AI empires no longer hang the game, and turn processing now has a wait animation to replace the “Pause” message.

But the biggest lesson today? I'm questioning the ROI on unit tests. While they’ve helped Claude stay aligned in theory, in practice they rarely catch production issues. The real failsafes are still my manual playthroughs and design audits. Something I'll have to think about.

CLAUDE.md Audit Continues, Test Suite Nears Green

Today was focused on improving test stability and aligning the project with the standards outlined in CLAUDE.md. I got most of the unit tests passing — only 9 remain, and they look solvable once the audit-driven changes are complete.

The CLAUDE.md audit is yielding real architectural benefits. I migrated several core managers to the new event-based system, simplifying responsibilities and improving modularity. Each change feels like it’s setting the foundation for long-term scalability, especially as we head into more complex systems like AI behavior and diplomacy.
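The shape of that event-based migration can be sketched roughly like this (a minimal Python sketch — the bus, manager, and event names are illustrative, not the project's actual Zig code):

```python
# Minimal event-bus sketch of moving managers to an event-based system.
# EventBus, ColonyManager, and "colony_founded" are invented names.

from collections import defaultdict

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event, handler):
        self._subscribers[event].append(handler)

    def publish(self, event, payload=None):
        # Fan the event out to every subscribed handler.
        for handler in self._subscribers[event]:
            handler(payload)

class ColonyManager:
    def __init__(self, bus):
        self.colonies = []
        bus.subscribe("colony_founded", self.on_colony_founded)

    def on_colony_founded(self, name):
        self.colonies.append(name)

bus = EventBus()
manager = ColonyManager(bus)
bus.publish("colony_founded", "New Terra")  # manager reacts with no direct call
```

The payoff is the one described above: publishers never hold references to the managers, so responsibilities stay narrow and new subscribers can be added without touching existing code.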

Still no movement on Phase 6 tasks (Galaxy Generation and AI), but I'm confident that clearing this audit hurdle will unlock progress across the board.

Test Breakdown and Recovery Mode

Today was another deep dive into testing infrastructure and project consistency. I extended the test runner to generate a summary of passing, failing, and skipped tests — giving Claude a better diagnostic view of the project’s state. Unfortunately, things didn’t go smoothly: some tests stopped running altogether, and memory leaks began to appear in others.
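A pass/fail/skip summary like the one added to the runner could be structured along these lines (a Python sketch of the idea — the real runner lives in the project's Zig codebase, and these names are invented):

```python
# Sketch of a test-runner summary that gives a quick diagnostic view:
# how many tests passed, failed, or were skipped. Names are illustrative.

from dataclasses import dataclass, field

@dataclass
class TestSummary:
    passed: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    skipped: list = field(default_factory=list)

    def record(self, name, outcome):
        # outcome is one of "passed", "failed", "skipped"
        getattr(self, outcome).append(name)

    def report(self):
        return (f"{len(self.passed)} passed, "
                f"{len(self.failed)} failed, "
                f"{len(self.skipped)} skipped")

summary = TestSummary()
summary.record("test_fleet_split", "passed")
summary.record("test_save_version", "failed")
print(summary.report())  # 1 passed, 1 failed, 0 skipped
```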

After some investigation, I added further safeguards to the test counting and registration system, aiming to prevent silent failures. It’s still unclear whether the breakdown was due to the verification changes or to Claude's recent outages and instability. Either way, the process highlighted how fragile automated workflows can become when a foundational tool starts misbehaving.

Alongside this, I continued auditing the codebase against the rules in CLAUDE.md, identifying more inconsistencies in style and structure that likely compounded the testing failures.

Tomorrow, I’ll need to finish the audit, track down the failing tests, and work through a handful of feature regressions — particularly around UI routing and AI turn logic. It’s a bit of a recovery mode kind of day.

Test Coverage, CLAUDE Compliance, and a Safer Codebase

Today’s focus was on strengthening Stellar Throne's internal validation pipeline and ensuring the project continues to scale cleanly under AI-assisted development.


✅ What Got Done

  • ✅ Improved Unit Test Coverage
    Identified gaps in existing test coverage, especially from earlier Claude-generated files. Added new test cases for key game systems, including colony construction, combat state transitions, and shipyard logic.

  • ✅ Test Counting & Verification System
    Built a utility that compares registered test modules against the actual file count. This prevents “orphaned” tests — where Claude writes a test but forgets to register it. It’s now part of the daily validation workflow.

  • ✅ CLAUDE.md Audit
    Completed a pass over the codebase to verify:

    • Naming conventions (camelCase)

    • File structure adherence

    • Token-safe module boundaries

    • AI prompt formatting patterns
      CLAUDE.md remains aligned with the project as of today.
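The orphaned-test check from the verification utility above can be sketched like this (a Python sketch — the real registry and file layout belong to the project, so the names here are placeholders):

```python
# Sketch of the orphaned-test check: compare the test files that exist on
# disk against the modules the runner actually registers. The registry
# contents and filenames are illustrative, not the project's real layout.

from pathlib import Path

REGISTERED_TESTS = {"test_colony", "test_combat", "test_shipyard"}

def find_orphaned_tests(test_dir):
    """Return test modules that exist on disk but were never registered."""
    on_disk = {p.stem for p in Path(test_dir).glob("test_*.py")}
    return on_disk - REGISTERED_TESTS
```

Any non-empty result means a test was written but will never run — exactly the silent gap the daily validation workflow is meant to catch.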


💡 What I Learned

The test coverage illusion was subtle but dangerous — having tests written but not run creates a false sense of safety. Fixing this now saves major debugging headaches later. Also, AI output policies need reinforcement: it's easy for a helper like Claude to silently skip crucial steps unless expectations are made explicit and checked automatically.

The Case of the Missing Unit Tests

Today’s development revealed an important oversight: while Claude had been generating unit tests for various systems, they weren’t being added to the test runner. As a result, many validations were silently unused — giving a false sense of coverage.


🧪 What Got Done

  • ✅ Improved Unit Test Coverage:

    • Identified and integrated previously orphaned tests

    • Updated the main test runner to explicitly include all test modules
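An explicit registry is the simplest fix for this failure mode. Here is a hypothetical Python sketch (the test names and registry are invented; the real runner is part of the project's own codebase):

```python
# Illustrative sketch: a runner with one explicit registry, so a test that
# is written but not listed here is loudly visible as unregistered.

def test_colony_growth():
    assert 2 + 2 == 4       # stand-in for a real game-logic check

def test_fleet_merge():
    assert [1] + [2] == [1, 2]

# The single source of truth: every test must be listed here to run.
REGISTERED = [test_colony_growth, test_fleet_merge]

def run_all():
    results = {}
    for test in REGISTERED:
        try:
            test()
            results[test.__name__] = "passed"
        except AssertionError:
            results[test.__name__] = "failed"
    return results

print(run_all())
```

The design choice is deliberate: auto-discovery hides omissions, while a hand-maintained list makes "written but never executed" a visible diff in review.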


🔍 What I Learned

AI can produce great utility code, but it doesn’t always connect the dots — especially when it comes to integration. Even if a file looks complete, it may not be registered in the larger test framework. Always verify that generated tests are executed, not just written.