Modules, Interrupts, Memory Leaks

The Road to Early Access basically looks like the Road to El Dorado, except the musical numbers are worse. Right now I am still in a land where I have to actually sit down and do all the stuff I’ve been putting off over the course of the project. Two things that fall into this category are “putting modules in the corner of a building” and “interrupts.” Oh, and “fixing the memory leaks.”

In The Colonies there are unfortunate occasions wherein real estate looks bigger on the outside than it is on the inside. (See also: “Mad Architects Of The Frontier And Their Terrible Creations”.)

Since the start of the project, putting a module in the corner of a building would blow up the game. There were a number of things causing problems here. First off, the code that inserts the small “module footer” forming the foundation of a module was blowing up if you put it in the corner of a map. This has been fixed. Second, when we finished a module’s foundation, we would delete the edges on the floor plan and merge the leftmost and rightmost edges together. Obviously, this is not the right thing to do if a foundation piece is at the leftmost, or rightmost, edge of a blueprint – you end up collapsing two edges together that are at 90 degrees. So this has been fixed for the outside of modules. The inside of modules still breaks, because there is some funny business going on with how we construct interiors that I have yet to track down – but we’re making progress on long-standing issues.

Carpentry; Ground-steaks & tripe. Or: Scenes of common work on The Frontier.

Whatever is going on here, trust us, it’s way more difficult than it looks.

Interrupts are a bit more significant: the notion that a character, while doing a task (“haul goods to stockpile”), can be interrupted because some other job (“not being on fire”) takes precedence. This turned out to be not so bad to implement: every frame, we look at all the jobs that are mandatorily evaluated and that have a higher interrupt precedence than the character’s current job. If a job has a higher interrupt precedence AND a higher utility, we abort the current job and switch to that one. The bigger problem is aborting the current job cleanly, which has meant going through the codebase and annotating everything to make sure it cleans up nicely. This has led to us fixing a plague of random crashes caused by, say, posting a job to harvest a cabbage because the job was interrupted, despite the cabbage already having been harvested.
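
For the curious, the per-frame check described above might look roughly like the following. This is a minimal sketch and not the game's actual code: the `Job` struct, its field names, the `onAbort` cleanup hook, and the rule of picking the highest-utility qualifier are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical job record; every name here is illustrative, not the game's API.
struct Job {
    std::string name;
    int interruptPrecedence;        // higher values may interrupt lower ones
    double utility;                 // how much the character wants this job right now
    bool mandatoryEvaluation;       // true if the job is re-checked every frame
    std::function<void()> onAbort;  // cleanup hook, e.g. dispose of a temporary tool
};

// Returns the candidate that should pre-empt `current`, or nullptr if none does.
// A candidate qualifies only if it is mandatorily evaluated AND has both a
// higher interrupt precedence and a higher utility than the current job.
// Choosing the highest-utility qualifier is an assumption of this sketch.
const Job* findInterrupt(const Job& current, const std::vector<Job>& candidates) {
    const Job* best = nullptr;
    for (const Job& c : candidates) {
        if (!c.mandatoryEvaluation) continue;
        if (c.interruptPrecedence <= current.interruptPrecedence) continue;
        if (c.utility <= current.utility) continue;
        if (best == nullptr || c.utility > best->utility) best = &c;
    }
    return best;
}

// Called once per frame. Runs the current job's cleanup, then switches to the winner.
// Returns true if an interrupt happened.
bool updateInterrupts(Job& current, const std::vector<Job>& candidates) {
    const Job* winner = findInterrupt(current, candidates);
    if (winner == nullptr) return false;
    // Invariant: the winner always outranks the job it replaces.
    assert(winner->interruptPrecedence > current.interruptPrecedence);
    if (current.onAbort) current.onAbort();  // clean up before switching
    current = *winner;
    return true;
}
```

The point of the `onAbort` hook in this sketch is that every job carries its own cleanup, so an interrupted job cannot leave a temporary tool or a stale job posting behind.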

Finally, there is the issue of the memory leaks. We were leaking memory like a stuck pig, and this is going to get technical. Last night I threw a wobbly, for lack of a better term, and started poking about in the code base. I tracked the following memory leaks down by virtue of “disabling stuff and seeing what happens”:

  • particle systems were not cleaning themselves up correctly, and were leaking memory on deletion (fixed)
  • gossiping characters would leak memory due to a base class not having a virtual destructor (and several variations on the theme of “a base class not having a virtual destructor”)
  • memory allocated during a range-based for loop in the pathfinding code would not be released correctly when we performed an early-out: this might be an MSVC bug or simply something about the C++11 spec I don’t understand, I don’t know, but it looks like calling return to abort a function from within a range-based for causes memory to not be freed from the heap. Which is odd.
  • we never recorded that a skin for the user interface had been loaded, so we re-loaded all of the UI art every time we needed to draw a new UI widget. This got very bad, again, when gossiping characters needed an icon and we’d immediately load an enormous texture.
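
Three of these leak patterns can be sketched in a few lines of C++. This is a hedged illustration, not the game's code: `Buffer`, `Behaviour`, `Gossip`, `findPath`, and `SkinCache` are hypothetical names, and the live-allocation counter exists only to make the behaviour observable. On the range-based for case: the post does not show the original code, but in standard C++ a `return` inside the loop does destroy automatic objects, so a leak there usually comes from a raw `new` with no matching `delete` on the early-out path. Owning the allocation with `std::unique_ptr` sidesteps the question either way.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Counts live allocations so each fix can be checked (illustrative only).
static int g_liveBuffers = 0;

struct Buffer {
    Buffer() { ++g_liveBuffers; }
    ~Buffer() { --g_liveBuffers; assert(g_liveBuffers >= 0); }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};

// 1. A base class deleted through a base pointer needs a virtual destructor;
//    without it, destroying a Gossip via Behaviour* is undefined behaviour
//    and in practice tends to skip the derived members.
struct Behaviour {
    virtual ~Behaviour() = default;
};

struct Gossip : Behaviour {
    std::unique_ptr<Buffer> icon{new Buffer};
};

// 2. Scratch memory owned by a unique_ptr is released on every exit path,
//    including an early return from inside a range-based for.
bool findPath(const std::vector<int>& nodes, int goal) {
    for (int node : nodes) {
        std::unique_ptr<Buffer> scratch(new Buffer);
        if (node == goal) return true;  // early-out: scratch is still destroyed
    }
    return false;
}

// 3. Remember what has been loaded so each skin is loaded exactly once.
class SkinCache {
public:
    const Buffer& get(const std::string& name) {
        auto it = skins_.find(name);
        if (it == skins_.end()) {
            ++loads_;  // stands in for the expensive texture load
            it = skins_.emplace(name, std::unique_ptr<Buffer>(new Buffer)).first;
        }
        return *it->second;
    }
    int loads() const { return loads_; }

private:
    std::unordered_map<std::string, std::unique_ptr<Buffer>> skins_;
    int loads_ = 0;
};
```

In this sketch, asking `SkinCache` for the same skin twice performs one load, and every `Buffer` is gone once its owner goes out of scope.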

At this point, with the game thread paused, the memory footprint is perfectly consistent from frame to frame: the renderer is, seemingly, leak-free. Turning on the game thread results in a game that does not leak memory until some time in Day 2 of the simulation. I decided I needed a better tool to figure out where memory leaks were coming from, and I found it in GlowCode, a handy piece of software that I had never heard of before. If you are the sort of person tasked with optimizing things and fixing memory leaks, add GlowCode to your shopping list alongside Telemetry by RAD Game Tools. It is amazing – point it at your software, profile it in real time at an interactive (ish) framerate, and see memory leaks, in real time, as you run your software. I am a convert, and they are getting my money.

(On a side note: the gold standard for this used to be Rational Purify, now owned by IBM. Attempting to download a free trial of Purify from IBM resulted in an error page, repeatedly, until I somehow found a download for the EXE by just plugging “Free Trial Rational Purify” into Google until it returned the correct link; at this point, I discovered that they do not support Microsoft Visual C++ 2012 or 2013. Does anybody know why IBM hasn’t died yet?)

Not a picture of a memory leak. But look at those creepy fishpeople! They sure are angry. Pew pew!

GlowCode pointed me at the next error that I hadn’t figured out: there is a very large memory leak in our code to handle events, which not only triggers every frame but seemingly has something to do with a large block of code labelled “FIXME”, and I cannot figure out, for the life of me, what it does. I will probably just delete it. We also have a smaller memory leak allocating memory from grid positions when sending data from one Lua game object to another, but I’m going to leave puzzling that one out to Micah.

For those interested, here is an internal changelog for the upcoming Revision 21. This is basically a summary of all the work that we have done this week, so you get some idea of how much we can get done in a week.

  • FIXED: when paperwork, sweeping, and butchering are interrupted, the temporary tool is properly disposed of
  • “chop planks” uses carpentry icon
  • Buildings now have more or less consistent interior wall-top fill textures
  • we now have a series of smaller gibs to go along with the larger gibs
  • added dev button to spawn a bottle of whisky
  • added proper icon for hunting job
  • added in special fishperson melee attack animation
  • hunting job now has an icon
  • FIXED: “Interrupted Tidy Shop to Do Paperwork”
  • FIXED: Crashing game opening work crews
  • FIXED: Middle class single beds def had an extra space which removed them from game.
  • bigger booms added to SK grenade attack
  • FIXED: death of animals no longer triggers tragedy music (as tragic as it may be to us vegans -dgb)
  • butchery spawns gibs (now we need a good way to clean these up)
  • FIXED: “nullMessage” crash when a job destroys an item then restarts the job (I hope?)
  • FIXED: steam knight now operates again
  • FIXED: modules can now be placed on the corners of buildings, and the exterior of the building will render correctly. (The interior will still be broken; however, it is Harmless and affects Only the Visuals. Still working on this.)
  • Enormous, nine-stage animation pipeline for stew and bread preparation
  • FIXED: butchering crash
  • FIXED: “Return XYZ” to Stockpile Spam in jobs panel
  • added “spawn aurochs” dev button w/ cute icon
  • butchering now uses hunting knife (and improved? animation timing)
  • sounds attached to various tool use animations
  • made gamestart way less generous
  • FIXED: cannibalism now requires a HUMAN corpse, not any corpse
  • color of MC single bed adjusted slightly
  • LC cot is now 3×1 module, not 2×1 module
  • fishpeople get special attack animation
  • FIXED: memory leak in particle system
  • FIXED: memory leak in animation system
  • FIXED: memory leak loading skins
  • FIXED: memory leak in pathfinding system
  • FIXED: basically, just memory leaks
  • removed “nuts” and “screws” from workcrew name text list because, although hilarious, we’re not in the business of double entendres
  • FIXED: death now aborts FSMs; should help with characters getting stuck due to melee combat w/ dead entities

So, yeah. Productivity. The next thing to do – for me, at least – is to finish the save game code. Those of you who have played Dungeons of Dredmor know that save games are, shall we say, my Achilles heel – and a very hard problem in general. Hopefully we don’t have the same hideous problems that we had with Dredmor’s saves – the one lesson I learned from Dredmor is “make your save game format not binary.”

Have fun saving all this ridiculous clutter Nicholas!

 

Posted in Clockwork Empires

14 Responses to “Modules, Interrupts, Memory Leaks”

  1. Headjack says:

    Stockpile Spam for a cold winter night.

  2. wootah says:

    I lost many a night’s sleep over a Dredmor character that was eaten by save file crashes. Luckily, CE isn’t rogue-like, so there isn’t a reason to destroy the save files on load, which means that at the very least it will be easy for early access players to reproduce/avoid crashes.

  3. Nathan says:

    IBM is mainly doing server stuff now, so that’s why all their products for normal people seem unmaintained – by and large, they are.

    Will there ever be documentation on the Dredmor save format? If you’re using non-binary saves for CE, we shouldn’t need documentation for them, but Dredmor is still a mystery.

  4. William says:

    So looking forward to the early access. Godspeed gentlemen!

  5. Nonya says:

    IBM is still in business because they make ludicrous money with expensive consulting contracts to big businesses and govt who bought their ridiculously overpriced server hardware and / or software.

    When the bugs and omissions in your software become a revenue stream because you just get to send consultants to the tune of > 1000 USD a day to ‘customise your enterprise-grade COTS solution’, the motivation to make your product lines work straight out of the box mysteriously vanishes.

    Odd.

    Oh also, this game keeps looking better and better. Glad to hear you addressed some seemingly fundamental causes of bugs, it beats fixing them at the surface.

  6. Programmer Art? says:

    Just out of curiosity, is there a reason you’ve chosen MSVC over the likes of mingw (Do you develop in Visual Studio?)? Here’s hoping Clockwork Empires will be playable on Linux distros. 🙂

  7. Robert says:

    Why not binary save games?

    • If anything is off by one byte, the file won’t load. With XML, things can be quite off and the loader might still limp through it.

      • Not to mention that XML has a decent amount of library support, so the low-level stuff can be easily avoided. That being said, I’m not sure it doesn’t increase load times, but for saves in CE I doubt that will be a big difference.

        Also, why are you allocating memory to the heap in a range-based for? I suppose I mainly find that confusing because if you have C++11/14 it would seem more natural to just use unique_ptr/shared_ptr in that situation than a new. That or set up an std container before the loop and just populate it within the loop using stack allocations and push_back/emplace.

  8. Dan Thompson says:

    Interesting MSVC range-based for loop bug(!?).

    I suspect that, depending on your LuaJIT interface to the C++ engine, and also depending on how many game objects you’re able to ffi.new in your Lua scripts, your serialize/deserialize process is going to be as fun of an adventure as it is for me 🙂

    I’d love a more technical post about the save code details in the future when you win that particular battle!

    Keep up the good work.

  9. Vitellozzo says:

    Please work hard on saves, my Dredmor characters aren’t playable anymore because of save corruption.

  10. Cutter says:

    Yeah, and IBM still makes a metric TON of money on supporting legacy hardware and software for governments, airports, and all manner of industry. I did a year long contract for them once at the Vancouver IA and you’d be amazed at how many antiques are still in use around that place – or at least there were a decade ago and probably still are. If they don’t need to be nimble in adopting new tech, they’re not.

    Anyway, I love the opacity on the walls in those shots. Any chance you could make that a standard feature or at least an option? I’d like that a lot more than an occlusion bubble.

  11. jethro says:

    Obviously if you have a system that uses xml you should then use xml for everything else. At least when modders go to hack your saves they won’t be using hex edit.

  12. Abculatter_2 says:

    Just want to say that I found the, “…seemingly has something to do with a large block of code labelled “FIXME”, and I cannot figure out, for the life of me, what it does.” line to be incredibly hilarious.

