Devblog 30: The Stopwatch Strikes Back

Month Nineteen

We often worry about time. Usually it’s a gnawing anxiety that we aren’t spending hours wisely, or that we might be a few minutes late to something important. But have you ever worried about a millisecond? One friend is a professional athlete, and even he measures success in seconds. Not once have I heard him explain that he plans to improve by half a millisecond. Nor would I take him seriously if he did.

And yet, here we are, worrying about milliseconds. It all began when we realised that a group of one hundred units was slowing the game down to 10-20 Frames Per Second (FPS). Considering that a hundred units was benchmarked last year running at 130-160 FPS, this was somewhat concerning.

So Jamie and I began our investigation, at first with a bit of pair programming. Unity has profiling tools, which provide a visual breakdown of CPU usage, showing where we must start. This can be confirmed by simply ripping features out, and seeing if that helps. Unfortunately, the vast majority of time is spent combing through the code using C#’s Stopwatch class (System.Diagnostics.Stopwatch). This allows developers to see how long something has taken in milliseconds (thousandths of a second).
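For anyone who hasn't used it, here is a minimal sketch of that kind of stopwatch timing. The helper name MeasureMilliseconds is made up for illustration and is not from our codebase; Stopwatch itself is the standard .NET class.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Hypothetical helper: runs the action once and returns the elapsed
    // wall-clock time in milliseconds.
    public static double MeasureMilliseconds(Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        // Elapsed.TotalMilliseconds keeps the fractional part, which matters
        // when the thing being measured takes well under a millisecond.
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

A single reading is noisy, so in practice it is worth timing the same code many times and looking at the median rather than trusting one run.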

Optimisation can be summarised using an old environmentalist slogan: ‘reduce, reuse, recycle’. The biggest problem is when code repeats itself. This often occurs during initial development, when it may not be obvious just how time critical an operation is. Outcomes are especially bad when iterating over the same list multiple times, so the most important thing is to reduce the number of loops.
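As a rough illustration of reducing loops, the sketch below gathers two statistics from a list of unit health values, first with one loop per statistic and then with a single loop. The class, method names, and the health list are all invented for this example.

```csharp
using System.Collections.Generic;

public static class LoopFusionSketch
{
    // Two passes: the list is walked once per statistic.
    public static (int alive, int totalHealth) TwoPasses(List<int> unitHealth)
    {
        int alive = 0;
        foreach (int health in unitHealth)
        {
            if (health > 0) alive++;
        }

        int totalHealth = 0;
        foreach (int health in unitHealth)
        {
            totalHealth += health;
        }

        return (alive, totalHealth);
    }

    // One pass: both statistics are gathered while walking the list once.
    public static (int alive, int totalHealth) OnePass(List<int> unitHealth)
    {
        int alive = 0;
        int totalHealth = 0;
        for (int i = 0; i < unitHealth.Count; i++)
        {
            int health = unitHealth[i];
            if (health > 0) alive++;
            totalHealth += health;
        }
        return (alive, totalHealth);
    }
}
```

Both methods return the same answer; whether the single pass is measurably faster for a given list is something only the stopwatch can confirm.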

Recycling code via caching is good practice, but also complex. Generally, the smaller the object, the more efficient it is to reuse. In one example, we found it was twice as fast to cache (and reuse) an array with four items instead of creating a new array each time. Large collections however are expensive to reset, because the operation is O(n), meaning it may be better to create a new collection instead. The only way you’ll know is by testing. Programmers must remember that these decisions are never free, and are always a trade: spending more memory for less CPU usage, or vice versa. Discarding more items makes the ‘garbage collector’ work harder to delete old resources, which is itself a non-trivial cost.
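To make the small-array case concrete, here is a sketch assuming a flat grid where a cell's four neighbours sit at fixed index offsets. The NeighbourFinder class is hypothetical, and edge-of-grid checks are left out for brevity.

```csharp
public sealed class NeighbourFinder
{
    private readonly int width;

    // Allocated once and overwritten on every call, so the hot path
    // creates no garbage for the collector to clean up.
    private readonly int[] cached = new int[4];

    public NeighbourFinder(int width)
    {
        this.width = width;
    }

    // Allocates a fresh four-item array on every call.
    public int[] NeighboursAllocating(int index)
    {
        return new[] { index - 1, index + 1, index - width, index + width };
    }

    // Reuses the same array; callers must not hold on to the result
    // across calls, because the next call overwrites it.
    public int[] NeighboursReused(int index)
    {
        cached[0] = index - 1;
        cached[1] = index + 1;
        cached[2] = index - width;
        cached[3] = index + width;
        return cached;
    }
}
```

The trade is exactly the one described above: the reused version holds on to a little memory permanently in exchange for doing less work per call, and it introduces the risk of a caller reading stale values.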

Another somewhat obvious surprise was about deep references. The more times code must call references, especially ones external to the object, the more expensive referencing becomes. It may not be immediately obvious that this is happening, but the simplest way of thinking about it is counting the number of dots in a function. More dots, more references, more expense. Caching references locally may not seem worthwhile, but often is. Consider the following example from within a loop:

            int newWorldX = Self.Coordinate.worldX + direction.worldX;
            int newWorldY = Self.Coordinate.worldY + direction.worldY;
            int newWorldZ = Self.Coordinate.worldZ + direction.worldZ;

That doesn’t seem too bad. But let’s cache the Coordinate object:

            Coordinate selfCoordinate = Self.Coordinate;
            int newWorldX = selfCoordinate.worldX + direction.worldX;
            int newWorldY = selfCoordinate.worldY + direction.worldY;
            int newWorldZ = selfCoordinate.worldZ + direction.worldZ;

Now, count the dots. Nine and seven respectively. Doesn’t seem like much, but that change improved performance by between 0.1 and 0.2ms. Of course, a superior solution would be to avoid calculating new coordinates each time entirely, which requires Coordinate objects to cache their neighbours. With modern computing, that may be an acceptable cost in terms of memory.
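A sketch of what caching neighbours might look like is below. It uses a simplified two-dimensional grid and invented names (Cell, GridBuilder) rather than our real Coordinate class, so treat it as an illustration of the idea only.

```csharp
public sealed class Cell
{
    public readonly int X;
    public readonly int Y;

    // Indexed by direction: 0 = east, 1 = west, 2 = north, 3 = south.
    // An entry stays null where the cell sits on the edge of the grid.
    public readonly Cell[] Neighbours = new Cell[4];

    public Cell(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class GridBuilder
{
    private static readonly int[] Dx = { 1, -1, 0, 0 };
    private static readonly int[] Dy = { 0, 0, 1, -1 };

    // Builds the grid once and links every cell to its neighbours, so
    // movement code can follow a reference instead of adding coordinates.
    public static Cell[,] Build(int width, int height)
    {
        var cells = new Cell[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                cells[x, y] = new Cell(x, y);

        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                for (int d = 0; d < 4; d++)
                {
                    int nx = x + Dx[d];
                    int ny = y + Dy[d];
                    bool inside = nx >= 0 && nx < width && ny >= 0 && ny < height;
                    if (inside) cells[x, y].Neighbours[d] = cells[nx, ny];
                }

        return cells;
    }
}
```

The memory cost is four references per cell, paid once when the map is built.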

Now, some of you will read 0.1ms and think I’ve gone mad (perhaps just descending further into madness). A hundred microseconds Richard, really? One phrase used by programmers is a helpful guide: “premature optimisation is the root of all evil” (Donald Knuth, though often credited to Sir Tony Hoare). However, in our case, this isn’t premature and we really do need simulation code to be as performant as possible. With careful attention to detail (reducing, reusing, recycling) we have improved the median time it takes per tick to simulate 100 units [at rest] from 8ms to 2.5ms. The biggest improvement was probably fixing spatial partitioning, which was returning far too many neighbours. But that could never be enough, and every line of code involved in unit movement has to be examined with our reliable friend the stopwatch.

Further optimisation requires fewer calculations per second. In Chris Park’s excellent development blog post, ‘Optimizing 30,000+ Ships in Realtime in C#’, he provides many wise tips and tricks. The most important is probably that a ship’s rotation is only calculated four times a second, and players didn’t notice the difference. At any rate, the simulation doesn’t need to calculate rotation every tick. This will further improve performance from 2.5ms, and so achieving a sub-1ms result is possible.
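One way to skip work on most ticks is a simple counter, sketched below. The ThrottledRotation class and its members are hypothetical, and copying the target heading stands in for whatever rotation maths the real simulation would do.

```csharp
public sealed class ThrottledRotation
{
    private readonly int ticksPerUpdate;
    private int ticksSinceUpdate;

    public float Heading { get; private set; }
    public int UpdateCount { get; private set; }

    public ThrottledRotation(int ticksPerUpdate)
    {
        this.ticksPerUpdate = ticksPerUpdate;
        // Start one short of the threshold so the very first tick updates.
        ticksSinceUpdate = ticksPerUpdate - 1;
    }

    // Called every simulation tick, but only does the work on every
    // ticksPerUpdate-th call.
    public void Tick(float targetHeading)
    {
        ticksSinceUpdate++;
        if (ticksSinceUpdate < ticksPerUpdate) return;

        ticksSinceUpdate = 0;
        Heading = targetHeading; // stand-in for the real rotation calculation
        UpdateCount++;
    }
}
```

Assuming, purely for the sake of arithmetic, a simulation running at 20 ticks per second, a ticksPerUpdate of 5 gives four rotation updates per second.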

In other news, I am delighted to report that Norn Industries’ application for funding from local government (The Pixel Mill) has been successful. This funding will allow us to hire another programmer, and to extend pre-publisher development into 2022. Success would have been impossible without Rory Clifford’s help; the Pixel Mill has been an invaluable resource. Hiring is a relief, as Jamie and I need another programmer with games development experience. As Norn Industries is a small organisation, our hiring policy is like the British Army’s Special Air Service: you can’t sign up, we will ask you.

Devblog 29: Manifest Good Vibes (With Good UI)

Month Eighteen

The User Interface (UI) is a critical component of any digital product, whether a website, mobile app, or game. It must be inviting and intuitive, and sometimes what is necessary goes beyond functionality. As an immersive experience, games need to create an ambience for the player. Disco Elysium does a wonderful job of this, combining a pastel backdrop with soft atmospheric music. But we are here to talk about our UI… and not games I’m currently playing.

The old UI was basic, but served its purpose. Buttons were generated by bespoke code at runtime, rendered on screen with absolute values, such as position, width and height. The problem was that this doesn’t scale across different monitor resolutions. Now that we are hurtling towards the end of the core development phase, we need something better. We also have no art assets for the menus at this time, as the team are focused on other jobs. So my task was to prepare the UI for being ‘skinned’ later with some colourful images and animations.

This task was divided into three phases: research (as always), the menu system (pre-game) and the GUI (in-game). Usually in software development the design and development of a UI would be delegated to a dedicated technical artist who has specialist knowledge. Richard has me, without specialist knowledge.

Before creating any designs, I browsed the UIs of games I’ve played, as well as other games in the RTS genre. I researched a lot of start screens and tried to spot the differences between the good and the bad. After some digging (and over fifty open Chrome tabs) it started to become obvious which ones simply worked and which ones didn’t.

Classic and modern RTS games have significant differences: modern games like Northgard adopt minimalism, showing information only when necessary, while older games make panels of information permanent fixtures. We wanted to make the menus as minimalist as possible, much like Northgard or Disco Elysium. Just simple and pretty menus which allow the player effortless interaction.

Before and after. Next step is to jazz up the buttons, add some animation and a logo.

We wanted to make sure that everything would be accessible exactly where the player expects. Eventually we’ll enable the player to move these elements, such as the minimap, to wherever their heart so desires, affording a personalised UI experience. But for now we must establish a generic layout, such as an actions interface positioned at the bottom of the screen. This space will dynamically populate with actions, based upon the player’s selection. I also added unit counters along the top, so that at a glance players know how many of each unit type they possess.

All of the objects shown were created using Unity’s UI tools. It took a few tutorials before I understood the different components. For example, my first attempt at the dynamic action panel involved programmatically creating and moving buttons. After some irritation with that not working, I came across Unity’s “GridLayoutGroup” component, which unsurprisingly did what I wanted. One caveat though: it did not change the size of child elements (i.e. action buttons) to fit the space provided. So a small custom script was needed to supplement this component.
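The sizing maths behind such a script can be sketched in plain C#, independent of Unity. This is not our actual script: the function name and parameters are assumptions, and it simply picks the largest square button that fits a given number of columns and rows.

```csharp
using System;

public static class GridFitSketch
{
    // Returns the largest square cell side that lets columns x rows cells,
    // separated by the given spacing, fit inside the panel.
    public static float FitCellSize(
        float panelWidth, float panelHeight, int columns, int rows, float spacing)
    {
        float byWidth = (panelWidth - spacing * (columns - 1)) / columns;
        float byHeight = (panelHeight - spacing * (rows - 1)) / rows;

        // The tighter of the two limits wins; never return a negative size.
        return Math.Max(0f, Math.Min(byWidth, byHeight));
    }
}
```

In Unity, the result would typically be assigned to both components of the GridLayoutGroup’s cellSize whenever the panel is resized.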

This task did not require a perfect, final, or even pretty UI, as we have begun an iterative process, to refine the design until it performs without issue. For now, the foundations have been laid, making future work much easier.

Devblog 28: The Fog

Month Seventeen

Unlike John Carpenter’s movie, this blog post will not be about Ghost Pirates.

Chess players have total knowledge of the game state. They know where their opponent’s pieces are and can plan accordingly. In RTS games it is important that this is not the case. Players must be allowed time to build their armies and defences in secret; a match could be over in minutes if everyone knew exactly where each other’s bases are and what they are doing (looking at you grenadiers). Not knowing where the enemy is creates suspense and rewards players who bother to scout.

The way we can achieve this is to produce something called ‘fog of war’. This fog covers the entire game world and is dispersed by troops as they traverse the field. Think of it as the Mist from the movie of the same name. The characters are our units, and we want them to be able to see only a certain distance ahead.

There are two ways to create fog of war: one is computed on the CPU and the other on the GPU. I chose to implement the former, as it seemed easier to understand and implement. Programmers love to follow the KISS principle. No, that does not mean putting on makeup and listening to ‘Rock And Roll All Nite’.

Sometimes we end up having to redo our work. I began by creating the fog, which was basically a black plane sitting above the game world, and corresponded to an array of fog tiles. Next I created a script that would be added to each unit. This script adds the unit to a list of ‘fog effectors’, and while the unit is moving, finds the intersection on the fog plane between our camera and the unit. We would then use this point as the centre of our circle of dispersal. This circle was then used to gather coordinates and compare them to a list of fog tiles; if the tiles were ‘fogged’ they were added to a list.
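Stripped of the camera maths, the tile-gathering step looks roughly like the sketch below. It is a simplified reconstruction rather than the original script: FogGrid and its members are invented names, and the centre is assumed to already be a tile coordinate.

```csharp
using System;
using System.Collections.Generic;

public sealed class FogGrid
{
    private readonly bool[,] fogged;

    public int Width { get; }
    public int Height { get; }

    public FogGrid(int width, int height)
    {
        Width = width;
        Height = height;
        fogged = new bool[width, height];

        // The whole map starts covered in fog.
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                fogged[x, y] = true;
    }

    public bool IsFogged(int x, int y) => fogged[x, y];

    // Clears every fogged tile within the circle and returns the tiles
    // that were newly cleared by this call.
    public List<(int x, int y)> Disperse(int centreX, int centreY, int radius)
    {
        var cleared = new List<(int x, int y)>();
        int radiusSquared = radius * radius;

        // Only visit the bounding square of the circle, clamped to the map.
        int minX = Math.Max(0, centreX - radius);
        int maxX = Math.Min(Width - 1, centreX + radius);
        int minY = Math.Max(0, centreY - radius);
        int maxY = Math.Min(Height - 1, centreY + radius);

        for (int x = minX; x <= maxX; x++)
        {
            for (int y = minY; y <= maxY; y++)
            {
                int dx = x - centreX;
                int dy = y - centreY;
                if (dx * dx + dy * dy > radiusSquared) continue; // outside circle
                if (!fogged[x, y]) continue;                     // already clear

                fogged[x, y] = false;
                cleared.Add((x, y));
            }
        }
        return cleared;
    }
}
```

Each call visits on the order of radius squared tiles and allocates a new list, and that cost is paid once per moving unit.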

As you can see, it’s a good thing we have actual artists drawing the characters for the game.

At first this process worked quite well. The system cycled through each effector and ‘dispersed’ the appropriate fog tiles. There was one problem though. With RTS games it’s quite common to have hundreds of units on screen at once. This test proved fatal. Frame rate dropped to around 1.7 frames per second. Which is not playable. At all. We had to do better!

After some research, and a fair few forum posts later, I found that a change of plan would be in our best interest. This was implemented in a similar way, only the computation was delegated to the hardware through the use of shaders. A map object was created containing the size and position of the game world. This was then passed into the ‘drawer’ which set properties for the shaders to do their job of covering the map in a glorious fog. Luckily I was able to get quite a bit of assistance with the shaders, as they are dark boxes of mysticism that only a wizard can make sense of.

Each team in a match is assigned a fog of war object, with only the local player’s fog rendered on screen. This lets our determinism system incorporate fog checks for other players’ units, and includes a shared fog for team-mates.

Raycasting is magical.

This solution also uses the position of the unit as it moves to determine what to disperse. This time we incorporated unit ‘line of sight’ (LOS), allowing us to create a flashlight effect, and stop dispersal when the unit is obstructed by a wall, for example.
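Our version relies on raycasting and shaders, but the idea of an obstructed line of sight can be shown on a plain tile grid. The sketch below swaps in Bresenham’s line algorithm on the CPU; the class name and the walls array are assumptions for illustration, and the caller is expected to pass coordinates that are inside the grid.

```csharp
using System;

public static class LineOfSightSketch
{
    // Walks the tiles from (x0, y0) towards (x1, y1) using Bresenham's line
    // algorithm. Returns false if a wall tile lies strictly between the two
    // points. The start tile is ignored, and the target may itself be a wall,
    // since a unit can see the wall that blocks it.
    public static bool HasLineOfSight(bool[,] walls, int x0, int y0, int x1, int y1)
    {
        int dx = Math.Abs(x1 - x0);
        int dy = Math.Abs(y1 - y0);
        int stepX = x0 < x1 ? 1 : -1;
        int stepY = y0 < y1 ? 1 : -1;
        int error = dx - dy;

        int x = x0;
        int y = y0;
        while (x != x1 || y != y1)
        {
            bool isStart = x == x0 && y == y0;
            if (!isStart && walls[x, y]) return false;

            int doubled = 2 * error;
            if (doubled > -dy) { error -= dy; x += stepX; }
            if (doubled < dx) { error += dx; y += stepY; }
        }
        return true;
    }
}
```

Fog dispersal would then only clear a tile when this check passes, which is what produces the flashlight shape behind walls.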

We also use shaders to create the shape around the unit for fog dispersal. This shape is also cached so it does not need to be recreated every tick.

With this completed, and the great purge [refactor] (which Richard alluded to last month) coming to a conclusion, we are now in a position to create the UI. I have told Richard this will take hours of playing games, er, research before I can get cracking.

LOS with obstructions.

Devblog 27: Programming, Leadership, and Professionalism

Month Sixteen

It is alleged that Stalin once said “quantity has a quality all its own”. While this applies to total war on the eastern front, it most certainly does not apply to code: more is not better.

Programming is like writing novels: a good writer does not publish their first draft. Completion is far more than just getting to the end of the story, requiring incremental improvement. Complex ideas have to be written well, otherwise the reader can miss the point. If the reader is as merciless and pedantic as a software compiler (which interprets code logically) those misunderstandings will result in faults, euphemistically referred to as “bugs”.

Bad code is a moral failing. Too often IT professionals are obsessed with delivery at any cost. Too often managers understand ‘done’ to mean that the task has just been completed and appears to work. The client probably won’t read the code, so what does it matter if it’s a bit rough around the edges?

Bad code kills people. Bad code costs billions of dollars. Bad code means broken promises. To be more precise: bad bosses, bad managers, and bad programmers are responsible for these things. In September 2017 The Atlantic published ‘The Coming Software Apocalypse’, citing examples of awful code and dire consequences. Perhaps the most ominous case involved litigation against Toyota:

In September 2007, Jean Bookout was driving on the highway with her best friend in a Toyota Camry when the accelerator seemed to get stuck. When she took her foot off the pedal, the car didn’t slow down. She tried the brakes but they seemed to have lost their power. As she swerved toward an off-ramp going 50 miles per hour, she pulled the emergency brake. The car left a skid mark 150 feet long before running into an embankment by the side of the road. The passenger was killed. Bookout woke up in a hospital a month later.

The incident was one of many in a nearly decade-long investigation into claims of so-called unintended acceleration in Toyota cars. Toyota blamed the incidents on poorly designed floor mats, “sticky” pedals, and driver error, but outsiders suspected that faulty software might be responsible. The National Highway Traffic Safety Administration enlisted software experts from NASA to perform an intensive review of Toyota’s code. After nearly 10 months, the NASA team hadn’t found evidence that software was the cause—but said they couldn’t prove it wasn’t.

It was during litigation of the Bookout accident that someone finally found a convincing connection. Michael Barr, an expert witness for the plaintiff, had a team of software experts spend 18 months with the Toyota code, picking up where NASA left off. Barr described what they found as “spaghetti code,” programmer lingo for software that has become a tangled mess. Code turns to spaghetti when it accretes over many years, with feature after feature piling on top of, and being woven around, what’s already there; eventually the code becomes impossible to follow, let alone to test exhaustively for flaws.

Using the same model as the Camry involved in the accident, Barr’s team demonstrated that there were more than 10 million ways for key tasks on the onboard computer to fail, potentially leading to unintended acceleration.* They showed that as little as a single bit flip—a one in the computer’s memory becoming a zero or vice versa—could make a car run out of control. The fail-safe code that Toyota had put in place wasn’t enough to stop it. “You have software watching the software,” Barr testified. “If the software malfunctions and the same program or same app that is crashed is supposed to save the day, it can’t save the day because it is not working.”

So how does games development fit into this unethical mess? Snugly. CD Projekt’s greatly anticipated ‘Cyberpunk 2077’ was released half-baked in late 2020, full of errors and incomplete features. Remarkably, this was after three delays and months of staff overtime. This led to an investor rebellion, with CD Projekt’s executives blamed for company stock plunging 57% (a $6.2bn loss).

While it’s unlikely that any errors in our game will result in fatalities or lost billions, that’s beside the point. It is critically important to ensure that our work is the best it can be. Code isn’t finished until it is written, tested, and refactored (redrafted) multiple times. It must be as lean and legible as possible.

Towards this end, Jamie and I have completed our first great purge (review) of everything we have written. Choosing to spend time paying attention to detail is critical, and protects against future losses by creating a system which is easy to maintain.

Norn Industries may only be a small company, but it is nonetheless my legal responsibility, and I have to balance delivering a good product within budget while treating my staff with the dignity and respect they deserve. This is possible only by pursuing the highest standards of craftsmanship and professionalism.

It is important that we do not overpromise or suffer ‘feature creep’, as the creative process is as much about building relationships as products. Game designers want to implement new ideas, but every idea has a financial cost. Creating the best possible experience with the fewest possible parts has to be the project’s guiding philosophy, as this is the most realistic design principle. Elegant innovation needs good working practices.

Part of the legal and moral responsibility of leadership is setting good examples and criticising bad behaviour. The games industry is infected with bad working practices, tolerated only because of immature beliefs about work, and imbalanced relationships between employers and employees which enable exploitation. Leaders are responsible for the welfare of others; to exploit this position is disgraceful and inexcusable. Exploiting customers is no better, perhaps best exemplified by the industry’s fondness for ‘loot boxes’. Loot boxes are gambling services sold to children, so for EA to tell the UK Parliament they are ethical “surprise mechanics” is outrageous. It is also outrageous that industry figures like Activision’s CEO Bobby Kotick boast of record profits and increased CEO remuneration, all while laying off workers and keeping others on the breadline.

Unfortunately, this behaviour is part of a broader trend. One 2016 report published by the London School of Economics found that the top ten recruitment firms responsible for placing 70-90% of British chief executives described executive remuneration as “absurdly high”:

Headhunters claimed that, for every appointment of a CEO, another 100 people could have filled the role just as ably, and that many chosen for top jobs were “mediocre”.

[…]

One headhunter said: “I think there are an awful lot of FTSE 100 CEOs who are pretty mediocre.” Another added: “I think that the wage drift over the past 10 years, or the salary drift, has been inexcusable, incomprehensible, and it is very serious for the social fabric of the country.”

There is no excuse for bad working practices or labour exploitation, especially from people who should know better and who can definitely afford better. There is also no excuse for bad code.

Devblog 26: From Block to Reptile – A Unit’s Tale

Month Fifteen

Currently Richard and I are working on buildings and ‘fog of war’ respectively, and we will share these things with you. But in the meantime, thanks to the hard work of Michal, Arek, and Ayse, our art production line is up and running.

So if you’re wondering how we go from angry geometric shapes to android animals you’re in luck. Game objects all follow the same design flow:

Once Michal has finished with the sketch, Arek takes over and creates the 3D model, and the team discuss improvements, such as whether the character is correctly proportioned, or if anyone can see potential animation issues due to the model’s dimensions.

Once everyone is happy the model is finalised and coloured. This is when the character starts to come to life.

Fully coloured reptile in all its T-posing glory.

Next up is Ayse, who adds life to the character through animation. Each character requires several animations to make the game immersive. No one wants T-posing characters floating around; that’s the stuff of nightmares!

Generally each character needs: idle, walk, run, shoot, melee, and death animations. So while Ayse got to work creating those, I got to work on the last part of the process: importing and integrating art into Unity. This was a welcome break from the fog of war system which is proving… interesting…

My first task was to figure out how to replace our geometric shapes with these new models. Adding the .FBX files to the project was easy; these files include the model, rig, and materials. But I soon realised that the colour wasn’t right on the final in-engine product. After discussion with the art team I had to switch the project from the default legacy renderer that Unity uses to URP (Universal Render Pipeline).

There are lots of different rendering options in Unity, but we settled on URP, as it is meant to be quick to learn and easy to use. More powerful options, like HDRP, weren’t required as our models are low poly. Also, URP has ‘shader graphs’. These are fantastic once you get past the initial learning curve.

After a few tutorials we were able to fully integrate the character model in the game, including team colours.

Animation State Machine

Once the animations were complete it was time to add them to the character. This is done in Unity by creating an animation state machine, which basically has a list of Boolean values that you set in code through your animation controller. Create the state machine, add the animation, and link it to the character, and most of the time that’s all that’s required. But sometimes there are bizarre outcomes, such as the death animation perpetually rotating the character. Although if you add the Dead or Alive classic “You Spin Me Round” it becomes hilarious.