Devblog 30: The Stopwatch Strikes Back

Month Nineteen

We often worry about time. Usually it’s a gnawing anxiety that we aren’t spending hours wisely, or that we might be a few minutes late to something important. But have you ever worried about a millisecond? One friend of mine is a professional athlete, and even he measures success in seconds. Not once have I heard him explain that he plans to improve by half a millisecond. Nor would I take him seriously if he did.

And yet, here we are, worrying about milliseconds. It all began when we realised that a group of one hundred units was slowing the game down to 10-20 Frames Per Second (FPS). Considering that a hundred units was benchmarked last year running at 130-160 FPS, this was somewhat concerning.

So Jamie and I began our investigation, starting with a bit of pair programming. Unity’s profiling tools provide a visual breakdown of CPU usage, which shows roughly where to start looking. This can be confirmed by simply ripping features out and seeing whether performance recovers. Unfortunately, the vast majority of the time is spent combing through the code with C#’s Stopwatch class, which lets developers measure how long something has taken in milliseconds (thousandths of a second).
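
For anyone unfamiliar with it, here is a minimal sketch of the kind of timing wrapper we mean. Stopwatch itself lives in System.Diagnostics; the TickTimer class and its method name are ours for illustration only:

    using System.Diagnostics;

    public static class TickTimer
    {
        // Runs a suspect piece of simulation code and reports how long it took.
        public static void Measure(string label, System.Action suspectCode)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();

            suspectCode();   // the code under investigation

            stopwatch.Stop();
            // ElapsedTicks gives sub-millisecond resolution, which matters
            // when the differences being hunted are 0.1ms or smaller.
            double milliseconds = stopwatch.ElapsedTicks * 1000.0 / Stopwatch.Frequency;
            UnityEngine.Debug.Log($"{label} took {milliseconds:F3} ms");
        }
    }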

Optimisation can be summarised using an old environmentalist slogan: ‘reduce, reuse, recycle’. The biggest problem is code that repeats work. This often creeps in during initial development, when it may not be obvious how time-critical an operation is. Outcomes are especially bad when the same list is iterated over multiple times, so the most important step is to reduce the number of loops.
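
As a toy illustration of ‘reduce’ (the IUnit type and its methods are hypothetical, not our simulation code), two passes over the same list can usually be collapsed into one:

    using System.Collections.Generic;

    // Hypothetical unit type with two per-tick jobs.
    public interface IUnit
    {
        void UpdateMovement();
        void UpdateTargeting();
    }

    public static class TickLoop
    {
        // Before: the list is traversed twice per tick.
        public static void TwoPasses(List<IUnit> units)
        {
            foreach (IUnit unit in units) { unit.UpdateMovement(); }
            foreach (IUnit unit in units) { unit.UpdateTargeting(); }
        }

        // After: one traversal does both jobs.
        public static void OnePass(List<IUnit> units)
        {
            foreach (IUnit unit in units)
            {
                unit.UpdateMovement();
                unit.UpdateTargeting();
            }
        }
    }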

Recycling via caching is good practice, but it is also complex. Generally, the smaller the object, the more efficient it is to reuse. In one example, we found it was twice as fast to cache (and reuse) a four-item array than to create a new array each time. Large collections, however, are expensive to reset, because clearing them is an O(n) operation, so it may be better to create a new collection instead. The only way you’ll know is by testing. Programmers must remember that these decisions are never free; they are always a trade: spending more memory for less CPU usage, or vice versa. Discarding more objects also makes the ‘garbage collector’ work harder to reclaim old resources, which is itself a non-trivial cost.
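
Here is a minimal sketch of that small-array case, using hypothetical names; the buffer is allocated once and overwritten on every call rather than being recreated:

    public class DirectionBuffer
    {
        // Allocated once and reused; in our tests this was roughly twice as
        // fast as writing "new int[4]" on every call, and it creates no garbage.
        private readonly int[] cached = new int[4];

        public int[] Fill(int north, int east, int south, int west)
        {
            cached[0] = north;
            cached[1] = east;
            cached[2] = south;
            cached[3] = west;
            return cached;
        }
    }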

Another somewhat obvious surprise concerned deep references. The more references code has to chase, especially references external to the object, the more expensive the code becomes. It may not be immediately obvious when this is happening, but the simplest way to think about it is to count the dots in an expression: more dots, more references, more expense. Caching references locally may not seem worthwhile, but it often is. Consider the following example from within a loop:

            int newWorldX = Self.Coordinate.worldX + direction.worldX;
            int newWorldY = Self.Coordinate.worldY + direction.worldY;
            int newWorldZ = Self.Coordinate.worldZ + direction.worldZ;

That doesn’t seem too bad. But let’s cache the Coordinate object:

            Coordinate selfCoordinate = Self.Coordinate;
            int newWorldX = selfCoordinate.worldX + direction.worldX;
            int newWorldY = selfCoordinate.worldY + direction.worldY;
            int newWorldZ = selfCoordinate.worldZ + direction.worldZ;

Now, count the dots: nine and seven respectively. It doesn’t seem like much, but that change improved performance by between 0.1 and 0.2ms. Of course, a superior solution would be to avoid recalculating new coordinates entirely, which requires Coordinate objects to cache their neighbours. With modern hardware, that may be an acceptable cost in memory.
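
A rough sketch of what that neighbour caching might look like; the field names are illustrative rather than our actual Coordinate class:

    public class Coordinate
    {
        public int worldX, worldY, worldZ;

        // Populated once when the map is generated, then reused every tick:
        // memory is spent so that neither the arithmetic nor the dots above
        // need to be evaluated during movement.
        public Coordinate[] neighbours;
    }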

Now, some of you will read 0.1ms and think I’ve gone mad (or perhaps just descended further into madness). A hundred microseconds, Richard, really? One phrase used by programmers is a helpful guide: “premature optimisation is the root of all evil” (commonly attributed to Sir Tony Hoare). However, in our case this isn’t premature; we really do need the simulation code to be as performant as possible. With careful attention to detail (reducing, reusing, recycling), we have improved the median time it takes per tick to simulate 100 units [at rest] from 8ms to 2.5ms. The biggest improvement was probably fixing spatial partitioning, which was returning far too many neighbours. But that alone could never be enough, and every line of code involved in unit movement has to be examined with our reliable friend the stopwatch.
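
For readers unfamiliar with spatial partitioning, here is a simplified grid-based sketch (not our exact implementation): units are bucketed into cells, and a neighbour query only inspects the block of cells around the origin rather than every unit on the map.

    using System.Collections.Generic;

    public class SpatialGrid
    {
        private readonly float cellSize;
        private readonly Dictionary<(int, int), List<int>> cells =
            new Dictionary<(int, int), List<int>>();

        public SpatialGrid(float cellSize)
        {
            this.cellSize = cellSize;
        }

        private (int, int) CellOf(float x, float y)
        {
            return ((int)System.Math.Floor(x / cellSize),
                    (int)System.Math.Floor(y / cellSize));
        }

        public void Insert(int unitId, float x, float y)
        {
            var key = CellOf(x, y);
            if (!cells.TryGetValue(key, out List<int> bucket))
            {
                bucket = new List<int>();
                cells[key] = bucket;
            }
            bucket.Add(unitId);
        }

        // Only the nine cells surrounding the query point are inspected,
        // so distant units are never returned as neighbours.
        public List<int> NearbyUnits(float x, float y)
        {
            var (cx, cy) = CellOf(x, y);
            var result = new List<int>();
            for (int dx = -1; dx <= 1; dx++)
            {
                for (int dy = -1; dy <= 1; dy++)
                {
                    if (cells.TryGetValue((cx + dx, cy + dy), out List<int> bucket))
                    {
                        result.AddRange(bucket);
                    }
                }
            }
            return result;
        }
    }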

Further optimisation requires fewer calculations per second. In Chris Park’s excellent development blog post, ‘Optimizing 30,000+ Ships in Realtime in C#’, he provides many wise tips and tricks. The most important is probably that a ship’s rotation is only calculated four times a second, and players didn’t notice the difference. At any rate, our simulation doesn’t need to recalculate rotation every tick. This will push performance further below 2.5ms, so a sub-1ms result looks achievable.
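
A sketch of what that looks like in a tick-based simulation; the names and the assumed 20-ticks-per-second rate are ours for illustration:

    public class UnitRotation
    {
        // At an assumed 20 simulation ticks per second, updating every fifth
        // tick gives the four-updates-per-second cadence described above.
        private const int TicksPerRotationUpdate = 5;

        public void OnSimulationTick(int tickNumber)
        {
            if (tickNumber % TicksPerRotationUpdate == 0)
            {
                RecalculateFacing();
            }
            // On all other ticks the previous facing is simply reused.
        }

        private void RecalculateFacing()
        {
            // ...the expensive rotation maths lives here...
        }
    }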

In other news, I am delighted to report that Norn Industries’ application for funding from local government (The Pixel Mill) has been successful. This funding will allow us to hire another programmer, and to extend pre-publisher development into 2022. Success would have been impossible without Rory Clifford’s help; the Pixel Mill has been an invaluable resource. Hiring is a relief, as Jamie and I need another programmer with games development experience. As Norn Industries is a small organisation, our hiring policy is like the British Army’s Special Air Service: you can’t sign up; we will ask you.

Devblog 27: Programming, Leadership, and Professionalism

Month Sixteen

It is alleged that Stalin once said “quantity has a quality all its own”. While this applies to total war on the Eastern Front, it most certainly does not apply to code: more is not better.

Programming is like writing a novel: a good writer does not publish their first draft. Completion is far more than just getting to the end of the story; it requires incremental improvement. Complex ideas have to be written well, otherwise the reader can miss the point. If the reader is as merciless and pedantic as a software compiler (which interprets code logically), those misunderstandings will result in faults, euphemistically referred to as “bugs”.

Bad code is a moral failing. Too often, IT professionals are obsessed with delivery at any cost. Too often, managers understand ‘done’ to mean that the task has just been completed and appears to work. The client probably won’t read the code, so what does it matter if it’s a bit rough around the edges?

Bad code kills people. Bad code costs billions of dollars. Bad code means broken promises. To be more precise: bad bosses, bad managers, and bad programmers are responsible for these things. In September 2017 The Atlantic published ‘The Coming Software Apocalypse’, citing examples of awful code and dire consequences. Perhaps the most ominous case involved litigation against Toyota:

In September 2007, Jean Bookout was driving on the highway with her best friend in a Toyota Camry when the accelerator seemed to get stuck. When she took her foot off the pedal, the car didn’t slow down. She tried the brakes but they seemed to have lost their power. As she swerved toward an off-ramp going 50 miles per hour, she pulled the emergency brake. The car left a skid mark 150 feet long before running into an embankment by the side of the road. The passenger was killed. Bookout woke up in a hospital a month later.

The incident was one of many in a nearly decade-long investigation into claims of so-called unintended acceleration in Toyota cars. Toyota blamed the incidents on poorly designed floor mats, “sticky” pedals, and driver error, but outsiders suspected that faulty software might be responsible. The National Highway Traffic Safety Administration enlisted software experts from NASA to perform an intensive review of Toyota’s code. After nearly 10 months, the NASA team hadn’t found evidence that software was the cause—but said they couldn’t prove it wasn’t.

It was during litigation of the Bookout accident that someone finally found a convincing connection. Michael Barr, an expert witness for the plaintiff, had a team of software experts spend 18 months with the Toyota code, picking up where NASA left off. Barr described what they found as “spaghetti code,” programmer lingo for software that has become a tangled mess. Code turns to spaghetti when it accretes over many years, with feature after feature piling on top of, and being woven around, what’s already there; eventually the code becomes impossible to follow, let alone to test exhaustively for flaws.

Using the same model as the Camry involved in the accident, Barr’s team demonstrated that there were more than 10 million ways for key tasks on the onboard computer to fail, potentially leading to unintended acceleration.* They showed that as little as a single bit flip—a one in the computer’s memory becoming a zero or vice versa—could make a car run out of control. The fail-safe code that Toyota had put in place wasn’t enough to stop it. “You have software watching the software,” Barr testified. “If the software malfunctions and the same program or same app that is crashed is supposed to save the day, it can’t save the day because it is not working.”

So how does games development fit into this unethical mess? Snugly. CD Projekt’s greatly anticipated ‘Cyberpunk 2077’ was released half-baked in late 2020, full of errors and incomplete features. Remarkably, this was after three delays and months of staff overtime. The result was an investor rebellion, with CD Projekt’s executives blamed for the company’s stock plunging 57% (roughly $6.2bn wiped off its market value).

While it’s unlikely that any errors in our game will result in fatalities or lost billions, that’s beside the point. It is critically important to ensure that our work is the best it can be. Code isn’t finished until it has been written, tested, and refactored (redrafted) multiple times. It must be as lean and legible as possible.

Towards this end, Jamie and I have completed our first great purge (review) of everything we have written. Choosing to spend time on this kind of attention to detail is critical; it protects against future losses by creating a system which is easy to maintain.

Norn Industries may only be a small company, but it is nonetheless my legal responsibility, and I have to balance delivering a good product within budget with treating my staff with the dignity and respect they deserve. This is possible only by pursuing the highest standards of craftsmanship and professionalism.

It is important that we do not overpromise or suffer ‘feature creep’, as the creative process is as much about building relationships as it is about building products. Game designers want to implement new ideas, but every idea has a financial cost. Creating the best possible experience with the fewest possible parts has to be the project’s guiding philosophy; it is the most realistic design principle. Elegant innovation needs good working practices.

Part of the legal and moral responsibility of leadership is setting a good example and criticising bad behaviour. The games industry is infected with bad working practices, tolerated only because of immature beliefs about work and imbalanced relationships between employers and employees, which enable exploitation. Leaders are responsible for the welfare of others; to exploit that position is disgraceful and inexcusable. Exploiting customers is no better, perhaps best exemplified by the industry’s fondness for ‘loot boxes’. Loot boxes are gambling services sold to children, so for EA to tell the UK Parliament they are ethical “surprise mechanics” is outrageous. It is also outrageous that the bosses of industry giants, like Activision Blizzard’s CEO Bobby Kotick, boast of record profits and increase their own remuneration, all while laying off workers and keeping others on the breadline.

Unfortunately, this behaviour is part of a broader trend. One 2016 report published by the London School of Economics found that headhunters at the top ten recruitment firms, which place 70-90% of British chief executives, described executive remuneration as “absurdly high”:

Headhunters claimed that, for every appointment of a CEO, another 100 people could have filled the role just as ably, and that many chosen for top jobs were “mediocre”.

[…]

One headhunter said: “I think there are an awful lot of FTSE 100 CEOs who are pretty mediocre.” Another added: “I think that the wage drift over the past 10 years, or the salary drift, has been inexcusable, incomprehensible, and it is very serious for the social fabric of the country.”

There is no excuse for bad working practices or labour exploitation, especially from people who should know better and who can definitely afford better. There is also no excuse for bad code.

Devblog 25: Beetles! Economics!

Month Fourteen

When Jamie said “the internet is a mess and I hate it”, I suspected he wasn’t enjoying network programming. Frustrations over IPv4 versus IPv6 aside, he has completed the core networking task and is happy for now. Unfortunately for Jamie, there’s more mess to come: matchmaking with strangers over the internet is a different sort of problem from connecting with friends over LAN or WAN. But we’re in a good place now, which wouldn’t have been possible without his hard work.

Speaking of hard work, we have more concept art: Michal has finished the first harvester. This beetle will run about the battlefield gathering resources for the player. I’m not sure whether it is technically a crab, given that its claws look more like arms. Regardless, we are very happy with it.

Beetle harvester is very beetle.

When it comes to in-game economics, the RTS genre already offers a diverse set of examples. As some of you may have guessed, the existence of a dedicated harvester implies that Command & Conquer (C&C) is an influence. But there are ways our design will depart from the classic formulas to provide novel choices.

While there’s nothing wrong with Age of Empires’ villagers or StarCraft’s SCVs, I wanted an economic model which provides more opportunities for drama. The low cost and fragility of villager-type gatherers makes their untimely demise probable when the enemy shows up. In C&C, the question of whether the harvester will survive is more tense, because the harvester is more expensive, carries more resources, and is much tougher. The decision to build a harvester is more serious, and losing one is also a big deal.

However, if the player wants, they can task soldiers with gathering resources. The downsides are the aforementioned fragility, and the fact that a harvester is far more efficient at the job. The point of this choice isn’t to balance gather rates between soldiers and harvesters, but to give the player a way to do something when they don’t have access to a harvester.

Our design philosophy is about providing meaningful choice: players must be able to respond when things aren’t going to plan. C&C and AoE both give players options when they start losing; in each you can run away and somehow rebuild, because the map is a canvas upon which strategy is painted. That experience is dramatic and fun, especially when a comeback turns the game around. It feels natural to want to create a mutant hybrid of those two influences, and that’s exactly what we are doing.

Devblog 24: Meet the Androids

Month Thirteen

Jamie and I have almost completed our two largest challenges: networking and pathfinding respectively. The next tasks will be technically simpler, and will provide the core gameplay loops necessary for a Real Time Strategy experience: base building, resource gathering, and so on. Jamie’s work in particular has allowed us to finally upgrade our engine from Unity 2019.3 to Unity 2020.2. We had held off because more recent versions of Unity removed their old networking libraries, which was something of a problem while we didn’t have a replacement.

Delightfully, the art pipeline has come together. Our concept artist (Michal Kus) is delivering final concepts, which are handed to our 3D artist, while we look for an animator. With that in mind, it’s time to reveal some of the soldiers you’ll be able to command!

From the start, I didn’t want a game which was constrained by realism. My love of science fiction is clearly showing, as I concluded that in a sufficiently advanced future there would be no vehicles on the battlefield. Why make a tank which has wheels? Legs are better. As such, each android soldier is the personification of a specific combat role. Amphibians are basic, mammals are fast, lizards are tough, and birds have long range.

These abstractions don’t make perfect biological sense (lizards can be very fast!), but I was inspired as much by the primitive power of anthropomorphic art as by the idea that vehicles should walk. Considering that one of the oldest pieces of art in human history is the 40,000-year-old ‘lion man’, anthropomorphic art seems to have held spiritual significance for most of human history, including the famous animal-headed gods of the ancient Egyptian pantheon. This design also allows us to make infantry types visually recognisable.

Of course, it’s all well and good to say ‘let’s make a lizard man’, but it is more difficult to know exactly how one should look. Over the last year Michal and I discussed and explored different ideas. There had to be a sweet spot somewhere between biological and mechanical aesthetics: less humanoid designs would be more visually distinct, but they would also lose something of their humanity, which serves a real function. Perhaps you could say my inner Egyptologist won that debate.

Art will lag behind programming, but that won’t stop us from achieving our gameplay objectives soon enough.

Devblog 23: Blue January

Month Twelve

Jamie and I are back to work after a Christmas break. This means there isn’t much to report, but we have been making progress on lockstep, networking, pathfinding, and flocking. These systems are complex and have proved challenging; it’s the most difficult work either of us has done. Nevertheless, we are fast approaching a basic level of completeness, upon which other functionality can safely be added.

It has been a year since work on the project began. Reflecting upon that, much has been done and much is still left to do. After we have finished the aforementioned concerns, the next tasks will involve the implementation of base building, economy, and art integration.

Other non-trivial tasks on the to-do list involve deterministic raycasting and computer player intelligence. The latter will be modelled on human attention and emotional states, so that the computer player behaves more like a human, providing a more interesting experience.