Game Design, Programming and running a one-man games business…

Coding vs Software Engineering

This is a topic I feel strongly about, but at the same time I am very aware that it’s very difficult to get across in text, because it’s not something you can really illustrate with a single line of code, or a witty cartoon or a small diagram, so I may go on a bit here…

I have been looking at code *not written by me*, and also talking to friends learning some new stuff who are also working with other people’s code, and have been reading a book on this topic, so my head is full of opinions on the topic of coding versus software engineering. Let me first explain the difference.

‘Coding’ is the skill of understanding syntax and principles of how programming works, and slapping together a bunch of code that makes something happen. This is not *that hard*, and in fact yes, you can buy totally serious books that claim to teach you C or C++ in 21 days or less, which is laughable…but yes it does allow you to write code that compiles without errors and does the thing you want it to do.

‘Software Engineering’ is like coding, but much much HARDER. Mostly it’s about the scalability and long term usability of what you code. Code may ‘work’ in the same way that replacing a key component of an old car with a coat-hanger or a piece of string may *work*, but it’s likely going to go wrong at some point, nobody else will understand what it is or how it works, and when you try to scale it up, everything may completely fall to bits.

Software engineering is a pain because the best way to really get good at it is probably just experience of writing very large programs again and again and again, with different people, on different platforms, with different requirements, and having people criticize your code, or finding bugs in it, or having to revisit it five or ten years later to fix stuff.

The problem is that to 99% of people, and even 95% of coders, the difference between coding and software engineering is actually REALLY hard to spot. Because many coders are managed (especially in the games industry) by non-coders, they aren’t even encouraged to get good at software engineering, because frankly the boss doesn’t know what it is.

When you are working as a coder, in crunch, at a game studio with deadlines, generally speaking the boss wants result X by date Y. The big problem is that result X is really shoddily defined. If ‘compiles and runs and the QA team couldn’t make it crash’ is the criterion, then LOL, yeah done easily mate. Unfortunately anything beyond that level of skill goes unrewarded, because it’s REALLY hard to spot.

Luckily I’ve worked for some very clever coders. My first coding boss (at Elixir) was Dave Silver, who is now a mega-celeb in the world of AI at DeepMind. My second coding boss was James Brown (Lionhead), who now spends his time replicating Conway’s Game of Life using Lego for some reason. Both of them were very clever, and I’m a better coder for working under them. I learned a lot from them, not about *code* (which you can get from a book) but about software engineering.

If you haven’t already read ‘Code Complete’ I really recommend that you do. It’s excellent and is probably step one on the path to this stuff. The next things you should do are to work on a BIG project with other coders, and also work on the complete ‘project lifecycle’. This means, you start off with nothing, and finish when the project has shipped, and gone through multiple updates, ports and patches. Only then do you really know if the architecture choices you made at the start are correct.

A fairly simple blog-post style tip on this stuff concerns feature/syntax use and what I call the ‘gunslinger’ attitude. Take this line of C++:

X = X +1

Pretty much anyone (coder or not) can tell you that this adds 1 to the value of X. You can also write this

X += 1

Which does the same thing actually, and theoretically is very very very slightly faster because X is only evaluated once. However, it’s dark times indeed if in 2020 we can’t expect a compiler to realize this and do that sort of thing for us. Let’s get a bit more vague…

float fInitA = InitA > 0 ? ( float )InitA : 1.f;

WTF? Now I am a C++ coder, so I can understand this… but I have to actually engage my brain to do so, which slows me down. It’s not immediately intuitive to my half-asleep brain exactly what’s going on here and… It really does not have to be written this way. You can just do this:

float fInitA = (float)InitA;
if(InitA <= 0)
{
  fInitA = 1.f;
}

And OH MY GOD THE HORROR, it’s 5 lines of code instead of one. My god. What a n00b. Obviously this idiot doesn’t know about the C++ ternary operator and its syntax. The fool!

And yet it’s actually readable, and much easier to debug because it’s multiple lines allowing for breakpoints. The longer simpler version here is much BETTER code. And that’s generally IMHO a principle that you can stick with. The trouble is, some coders adopt a ‘gunslinger’ attitude where they are presumably living out dreams of alpha-male dominance through writing the most complex obfuscated mess imaginable. Believe it or not your job as a coder is to write CLEAR and MAINTAINABLE code. You do not get fined for every line you use, and you do not earn points for confusing the people working with you.
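To make the comparison concrete, here is a minimal sketch that wraps both styles in functions so they can be checked against each other. The function names are made up for illustration. The thing to watch is the boundary: the ternary falls back to 1 for zero as well as for negative values, so the long-hand test has to do the same.

```cpp
#include <cassert>

// Compact version: one line, one ternary.
float InitCompact(int InitA)
{
    return InitA > 0 ? (float)InitA : 1.f;
}

// Verbose version: same behaviour, but every step gets its own line,
// so a breakpoint can sit on the fallback assignment alone.
float InitVerbose(int InitA)
{
    float fInitA = (float)InitA;
    if (InitA <= 0) // zero and negatives both fall back to 1
    {
        fInitA = 1.f;
    }
    return fInitA;
}

// Hypothetical helper: true if both versions agree for every value in [lo, hi].
bool VersionsAgree(int lo, int hi)
{
    assert(lo <= hi);
    for (int i = lo; i <= hi; ++i)
    {
        if (InitCompact(i) != InitVerbose(i))
        {
            return false;
        }
    }
    return true;
}
```

Calling VersionsAgree(-100, 100) is a cheap way to confirm that the readable version has not quietly changed the behaviour at the edge case.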

There is a very ‘macho’ culture in programming, built around showing off, and using obscure stuff that you just learned. This is nuts. Just because you learn how to use a certain feature/function/syntax does not mean you HAVE to use it everywhere. I’ve worked with coders like this. It’s a nightmare.

It’s a worthy goal to write code that someone who isn’t even a programmer can look at and go “errr… I think I can see what you are doing here.”. This is because really GOOD code is code that can be understood by someone you have never met, five years later when a bug has been found and they need to work out if it’s in that function or not. If you are writing a tiny program that’s only 1,000 lines of code and nobody else will ever see it, and you will never edit it then…ok maybe you can hack it together, but a proper software engineer always writes code that can be maintained.

Programming is a HUGE topic, and to get good at it, to get REALLY good at it takes an entire lifetime. I started coding aged 11, which is 39 years ago. I think I’m pretty good at C++ now, but not an expert, and it’s the only language I’m comfortable with. The internet and its many youtube vids and forums have spawned an attitude that you can learn to code one summer, or during lockdown, and…yeah not really. You can learn to hack stuff together by copying and pasting from stackoverflow…but that’s really not proper software engineering.

It’s worth saying I’m not exactly at the end of the journey myself yet either. The code for Democracy 4 is *not perfect* by any means. Some bits are hacky, there were some fundamental design decisions I made about 15 years ago with the basis of my GUI library that are embarrassing but still there (of COURSE buttons should be a subclass of window you idiot!), but overall my code gets better with each game.

I coded about 5 games before I realized that having a decent separation and naming convention to keep GUI and Simulation code entirely separate was a worthy thing! I probably coded 8 games before I had a rock-solid translation-management system that meant not a single line of text exists in code. It took me maybe 10 games to get threading to work safely, and maybe another 2 until I had a rock-solid and highly-optimized multi-threading system. I didn’t really start to use the power of macros for about 10 games. I’ve only just (in the last 2 games) really got my code for setting up configurable color palettes to be usable.

I had most of the technical knowledge to do all of that stuff about 15 years ago, but to do it *well* and to know how to arrange things, and to set them up to be re-usable, optimized, stable, and readable… that’s what those extra fifteen years were spent doing.

The VAST majority of comments you read online about programming, especially games programming, are written by coders, not software engineers. They suffer a lot from the delusion that they have mastered code, because (as is natural) they don’t know what they don’t know. It’s REALLY hard from a distance to spot the software engineers from the coders, but in my experience the amount of time they have been in the industry, and the number of large completed projects is a really good sign.

A final way of spotting the difference: If a lot of someone’s code has been copied and pasted from stackoverflow or pastebin then… yeah. That’s not a software engineer.

Rethinking the game dev productivity gap

It’s really only in the last six months I’ve realized this, and I’ve been an indie for twenty+ years and coding for 39 years, so yeah…this took a while to sink in.

I am frustrated on a CONSTANT basis by the lack of productivity of almost everybody in the universe. I am especially irritated by the low productivity of most people in game development, and most indie devs. I almost never read about the development schedule of a game, (mostly through post-mortems, interviews or chatting to actual humans), without being shocked at how long it took to do stuff.

For most of the time, I have attributed this to an attitude. I work pretty much every day, and for most of the day, although my schedule these days is deliberately lighter than the early years. I’m prone to going out for lunch or to coffee shops, but then I’m prone to working all day Saturday and Sunday, so YMMV. I also often reply to forum posts, youtube posts, blog posts and emails in the evenings from my laptop. I’m often thinking about code when I’m not writing it.

Because of this, I find talking to people with a less work-centric attitude to be infuriating. It boggles my mind how long it takes most devs to add what seem like easy and simple features to games. I am constantly told that I am woefully inefficient because I don’t use unity, but still seem easily capable of working faster in terms of adding features & content than the very people who berate me for not using such productive tools.

So yup, I often think such people are just lazy. Or do not have the same attitude as me, or do not realize just HOW HARD it is to compete in this industry. In other words I think that their mindset is less focused, and it’s a personal weakness on their part, because yup…I’m a bit obsessed.

But now… I’m thinking there are two other things that explain the disparity better.

First thing: Lack of distractions. I have 3 cats, and live with my wife and these 3 relatively-low-maintenance pets, but no kids. I have a hobby of playing the guitar, which I make myself do a bit each day, but that’s it. I am not having to take time out to walk the dog, pick kids up from school, drop kids at school, answer questions from kids, sort out other stuff for kids, walk the dog again, and so on. My wife is a writer, so has the same introverted ‘happy to be alone with a keyboard’ daytime work schedule as me.

Nobody ever phones me, unless it’s an elderly relative. I have a call screener device to prevent phone spam, and we live in the middle of nowhere. Nobody knocks on our door trying to sell us anything. There is very little noise. It’s the perfect set up for zero distractions. If you possibly can do ANYTHING to reduce the distractions in your day, do it.

The second thing: experience.

This is the big one. I’ve been coding for 39 years. That’s an AGE. When I first started learning computer programming, this person was US president:

[Image: Jimmy Carter]

Yup, exactly.

That means any silly mistake you can make when designing code…I’ve done it thirty times. 95% of my conversations with fellow devs when I’ve hit a bug go like this:

“Could be a memory-bounds issue…?”

Me: “Nope”

“Could it be that you deleted the object?”

Me: “Nope”

“…Maybe it’s a multi-threaded synch issue?”

Me: “Nope”

…and so on.

Now that sounds super arrogant, like I think I’m the bee’s knees at C++. Actually I am not. I am not that good an all-round programmer *at all*. I am VERY good at learning in excruciating detail about the elements of C++ that I use, and nothing else. Because I work for myself I have no marketable need to be an all-rounder. I don’t need to learn ‘agile’ or ‘scrum’ or ‘.NET’ or RubyOnRails or whatever the hell jobs ask for this week/month/year. It’s irrelevant to me, so I can be VERY good at VERY few things. This is hugely efficient.

Plus… again, trying to put my arrogance in context here… language proficiency is language proficiency, whether it’s English or C++. C++ is way less forgiving than English, but still…how good at English were you when you had been speaking it for just five years…versus thirty years? Hardly an exact comparison I know, but I think it’s a good mental exercise. I get better at C++ every year, but in a way that is not exactly how you would think:

I do NOT know more ‘clever tricks’ than a newcomer to C++. I do NOT have a better memory of the syntax of C++ than a newcomer. I do NOT type *that* much faster. I do not make use of a wider range of the standard C library than anybody else. I don’t do any of those things. What I *do* better, is that I have just learned from my mistakes.

A lot of mistakes.

I used to take the odd coding test in job interviews back in the day. These tests are good for one reason: to see if the candidate has any clue about syntax. That’s pretty much it. The amount of code involved is too small for the test to show anything else.

The trouble with C++ is that it attracts hotshot coders. These are people who think a super-complex algorithm, or the algorithm that uses the most clever combination of features will somehow get them more sex/money. This is predictable and sad, but not useful in terms of real productivity.

The best code is the combination of three things:

Simplicity, Performance, Readability.

A lot of really, really good code looks fairly boring, because boring is often simple, fast and readable. The worst possible insult you can get from a senior/lead programmer with experience is this:

“That looks a bit over-engineered”

It’s truly a damning insult, but you only really realize how insulting it is after about thirty years of writing crappy code. I wish I knew of an easy way to help people fast-forward those thirty years and develop the skills you have at the end of it, without those thirty years, but I don’t think I can. The only advice I can offer is this:

  1. Write as much code as you can. Not over-engineered nonsense, but just code a lot. Put the hours in. At least the thousand obviously, but likely way, way more.
  2. Get a job with a really experienced coder and ask for criticism of your code. Only someone who works with you all the time will read enough of your code to really give you structural, high level advice on why your code sucks.
  3. Read ‘Code Complete’ at least twice, if you have not done so already.
  4. Get cats not dogs. Cats don’t need a walk.

Hope that helps someone :D

Website optimization in 2020

Sooo… in a random moment of surfing a few months ago I encountered an article on the webp format and how it was faster, and how it was a Google thing, and they therefore wanted you to use it. I knew I had a server move coming up (long story) so delayed worrying about it until now…

Basically webp is like a super-amazing improved replacement for PNG that is MUCH more efficient. Full details here, but for example one of the files I converted to webp went from 942k to 189k which is not to be sneezed at. I still cannot tell ANY difference when I look at both images. Sadly wordpress is too useless to upload webp, but here is one embedded:

…and here is the png:

…exactly.

So with this in mind, I replaced some of the larger images on the Production Line webpage with webp equivalents to speed up the loading. This IS WORTH DOING, but it’s also worth remembering that some Luddites may be using stupidly old browsers that cannot cope with webp, and you need to also have the option of a png for these people. You can do this with some magic modern html like so:

<picture>
	<source type="image/webp" srcset="images/thumb.webp">
	<img src="images/thumb.png" width="1000" height="563">
</picture>

That basically says ‘show this webp image, unless you don’t have any idea WTF that is, in which case here is an old fashioned png’. All of the attributes for your image (width, height and so on) still go on the img tag.

That got me a nice speed bump, but some tests done both with Google’s site checker and also the popular web speed test showed I was mainly slowed down by third party stuff, specifically the humble bundle widget and youtube embeds. (I embed 2 large youtube videos on that page). This is annoying, but after a lot of fiddling I found a reliable way to get around the slow youtube stuff.

What I did was have 2 identical sized elements on the page for each video. One a ‘panel’ and the other a ‘vid preview’, which was basically a big thumbnail made by me (webp obviously) with a fake play button to simulate youtube. The code in the actual page body looks like this:

<div id="panel">
<table width="100%" align="center" cellpadding="0" cellspacing="0">
	<tr>
	<td align="center" width="1000" height="563">
	<iframe id="trailer_youtube" width="1000" height="563" src="" frameborder="0" allowfullscreen></iframe>
	</td>
	</tr>
</table>	
</div>
				
<div id="vidprev">
<table width="100%" align="center" cellpadding="0" cellspacing="0" onclick="myFunction()">
	<tr>
	<td align="center" width="1000" height="563">				
		<picture>
		<source type="image/webp" srcset="images/thumb.webp">
		<img src="images/thumb.png" width="1000" height="563">
		</picture>						
	</td>
	</tr>
</table>	
</div>

In practice what this does is say ‘here is an embedded iframe called ‘trailer_youtube’ with NO source. And here in the same place is a big phat image. BTW if we get clicked call myFunction()’.

Then at the top of the page in the header we add some code:

<style>
#panel, .flip {
  font-size: 16px;
  text-align: center;
  color: white;
  margin: auto;
  z-index:1;
}
#vidprev
{
z-index:2;
}
#panel {
  display: none;
}
</style>

…which sets the z index (bottom to top stacking) of the two panels, and then we need some actual code for when the thumbnail is clicked on, also in the header:

<script>
function myFunction() {
  document.getElementById("trailer_youtube").src = "https://www.youtube.com/embed/IhGTKBAC94c";
  document.getElementById("panel").style.display = "block";
  document.getElementById("vidprev").style.display = "none";
}
</script>

…that code basically grabs the youtube panel, sets it visible, and assigns it a proper valid youtube link, handily deferring any connecting to youtube.com until we need to. It also hides the thumbnail. The result is a MUCH faster page load (roughly half the time).

In addition, I used some javascript called ‘lazysizes’, to make the loading of some items lower down the page asynchronous, so they won’t even get loaded until the visitor scrolls down. Source:

<picture>
 <source type="image/webp" data-srcset="images/resources.webp" class="lazyload">
<img data-src="images/resources.png"  class="lazyload">
</picture>

and that requires an extra include:

<script src="./js/lazysizes.min.js" async></script>

The result is pretty good, and raises Google’s estimation of my site speed quite a chunk. That will be good for SEO with Google, and they are basically the only search engine that counts so…yay :D. Here is the full waterfall chart:

Production Line DLC#2 COMING SOON

So we now have an official coming soon page for the new Production Line expansion pack (Design variety pack). Here it is in all its amazing html glory:

https://store.steampowered.com/app/1174730/Production_Line__Design_Variety_Pack/

Obviously the most exciting part of the page is the ‘add to wish-list’ button, which I thoroughly encourage people to click, as gossip among indie devs is that having high wish-list numbers converts into Valve sending you nicer chocolate at Christmas, or something like that (I forget the details). Actually the best thing for me to do is probably embed the steam widget thingy:

I have no idea why there is a scrollbar on that widget. I think it’s safe to blame the mess that is wordpress…

Anyway at the moment the store page is not translated into each language but I’m getting that done now. All the actual content is done, and tested in game, and the new cars look lovely. It’s purely cosmetic, so don’t yell at me if you can’t afford it for ruining game balance or whatever. I read that Epic’s cosmetic DLC earns them a bazillion dollars and I’d like to retire eventually (ha!.. will never happen), so somewhere in that paragraph is my reasoning for adding content to the game…

On less business-y levels… I’m tracking down some ultra-rare but annoying Production Line bugs right now. One is a thing where very, very rarely, sounds stop streaming, or the music stops (after a good number of hours). I am digging into this, but it’s super hard to pin down.

Another bug is related to an error message in logs (which is now harmless…but bugs me) relating to shaders, and some visual artifacting. I discovered that the two different systems I was using to set and unset shaders may potentially have come into conflict, so I fixed that abominable code architecture by ensuring the game only has one possible system for turning shaders on and off, and hopefully now there can never be a conflict or a shader ‘stuck’ on. This will all be in the next patch, just before the DLC release.
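As an illustration only (this is not the actual Production Line code, and every name in it is invented), one way to guarantee there is a single route for switching shaders is a small scope guard. It is the only thing allowed to change the ‘current shader’ state, and it always puts back the previous shader when it goes out of scope:

```cpp
#include <cassert>

// Hypothetical stand-in for whatever identifies a shader in the engine.
// 0 means "no shader bound".
using ShaderId = int;

// The one and only place that records which shader is active.
class ShaderState
{
public:
    static ShaderId Current() { return Slot(); }

private:
    friend class ShaderScope; // only ShaderScope may change the shader
    static void Set(ShaderId id) { Slot() = id; }
    static ShaderId& Slot()
    {
        static ShaderId current = 0;
        return current;
    }
};

// Turns a shader on for the lifetime of the object, then restores
// whatever was active before, so a shader cannot be left 'stuck' on.
class ShaderScope
{
public:
    explicit ShaderScope(ShaderId id) : previous(ShaderState::Current())
    {
        assert(id != 0);
        ShaderState::Set(id);
    }
    ~ShaderScope() { ShaderState::Set(previous); }

    ShaderScope(const ShaderScope&) = delete;
    ShaderScope& operator=(const ShaderScope&) = delete;

private:
    ShaderId previous;
};
```

With this shape, drawing code declares a ShaderScope at the top of a block and never calls an explicit ‘off’ function, which removes the chance of two systems disagreeing about what is currently set.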

Oh…and expect more Democracy 4 update goodness soon

Speeding up Production Line (large factories)

I thought I might try another of those ‘live blog’ entries where I go through my thought processes as I do some code profiling to speed up the route-finding slowdowns on super-huge Production Line factories. Here is the map I’m analyzing usage on.

I’m loading the map, waiting 20 seconds, then making a change to the resource route layout, then letting it run another 20 seconds, and taking a look at the function level profiling snapshot using AQTime. This save game has 8,994 slots in it (which is massive) and uses a custom map.

Here is the function breakdown for the GeneratePath() code which seems to be a major cause of any framerate issues when changing routes:

I probably need a refresher in my mind as to how this code works so… Let’s look at when I change a route by (for example) placing down a new conveyor. This makes the following calls:

			SIM_GetResourceConveyors()->RefreshExits();
			SIM_GetResourcesRoutes()->PurgeAllImpossible();
			SIM_GetResourcesRoutes()->InvalidateRoutes();
			SIM_GetResourceObjects()->OnRouteAddition();

RefreshExits() is trivial, taking on average 0.002ms, so not a problem. PurgeAllImpossible() doesn’t show up in the profiler, but it basically sets a lot of flags to be false, and calls a sub function that also is presumably too quick to show up, so likely not a culprit.

InvalidateRoutes() is likely the culprit. It tells every slot, and every stockpile that it needs to begin verifying its current cached route table in case something has changed.

Finally OnRouteAddition() goes through every in-transit resource object and tells them they need to verify their route (over the next two seconds) in case a newer route is available, or the current one just got deleted. This is also too quick to show up.

So it looks like all that route invalidation is the slowdown. But why? The actual function takes 0.2ms, which is slowish…but not noticeable when a frame is about 16.7ms (60 frames per second). It’s the delay over the next few seconds that causes the problems… Basically every slot with a stockpile calls PrepareToVerifyBays()…

void SIM_PlaceableSlot::PrepareToVerifyBays()
{
	BayVerificationPending = true;
	int limit = BAY_VERIFICATION_INTERVAL;
	if (APP_GetPerformance()->DoRoutesLimitFramerate())
	{
		limit *= 2;
	}
	BayVerificationTimer = GRandom::RandomChoice(limit);
}

So in English, this code tells the slot that it needs to verify its nearest import bays, and to do so at some random time over the next BAY_VERIFICATION_INTERVAL frames (about 2 seconds at 60 frames per second). A globally calculated value works out whether we have a poor framerate while we are verifying bays, and if so notes that given this PC, this map etc… we need to allocate twice as much time as usual (4 seconds) to verify those bays…

Thus we have maybe 4 seconds max (4,000ms) over which to spread the effect of all this route re-calculation. That might mean up to 4 seconds of lag, which may seem a short time, but if you are watching a resource item getting delivered in the game, and it travels down the wrong route for more than four seconds…you would be perplexed, so it’s a reasonable target. How to make it faster?
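Here is a rough, self-contained sketch of that ‘spread the work over a window of frames’ idea. It is a toy model rather than the game’s code: the Slot struct and SimulateFrames() are invented for illustration, std::mt19937 stands in for GRandom, and it assumes RandomChoice(limit) returns a value from 0 up to limit - 1.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Stand-in for a slot; only the two fields set by PrepareToVerifyBays() are modelled.
struct Slot
{
    bool BayVerificationPending = false;
    int BayVerificationTimer = 0;
};

// Assumption: GRandom::RandomChoice(limit) returns an integer in [0, limit).
// std::mt19937 is used here purely as a substitute.
void PrepareToVerifyBays(Slot& slot, int limit, std::mt19937& rng)
{
    assert(limit > 0);
    std::uniform_int_distribution<int> pick(0, limit - 1);
    slot.BayVerificationPending = true;
    slot.BayVerificationTimer = pick(rng);
}

// Runs 'limit' frames and returns how many slots were verified on each frame.
std::vector<int> SimulateFrames(std::vector<Slot>& slots, int limit)
{
    std::vector<int> verifiedPerFrame(limit, 0);
    for (int frame = 0; frame < limit; ++frame)
    {
        for (Slot& slot : slots)
        {
            if (!slot.BayVerificationPending)
            {
                continue;
            }
            if (slot.BayVerificationTimer == 0)
            {
                // The expensive route verification would happen here.
                slot.BayVerificationPending = false;
                ++verifiedPerFrame[frame];
            }
            else
            {
                --slot.BayVerificationTimer;
            }
        }
    }
    return verifiedPerFrame;
}
```

With 2,200 slots and a 240 frame window (4 seconds at 60 frames per second) that averages out at roughly nine verifications per frame instead of 2,200 in a single frame. The total work is unchanged, it is just smeared out, which is why cutting the number of verifications is the next thing to look at.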

Bays are verified every frame. A check happens every frame to see if a verification is pending, and if it is, a function gets called. That function is pretty huge and it’s called SIM_ProductionSlot::RecalculateNearestBay(). In my profiled sample it was called 2,200 times and on average takes 2.8ms. OH MY GOD. Almost all that time is spent inside SIM_ResourceConveyorManager::GetNearestBays(). This function itself does some fancy multithreading, spinning off tasks to a lot of threads (8 on my PC) each of which processes a list of routes, eventually calling SIM_PathFinderManager::FindPath(). This gets complex so let’s look at a multithreaded analysis of it in VTune:

This actually looks pretty efficient and packed, so I think that the real answer to speeding this up is not speeding up the actual route-verification, but culling the cases in which I actually even bother verifying a route. For example…Here is a simple map with production slot stockpiles at A and D, and import bays at B,C,E,F…

Both A and D keep track of the shortest route along conveyor belts to the 2 nearest import bays. Note not every tile is a route, only each one marked with conveyors, so we have the following.

Now let’s say I add a new conveyor belt tile to improve the routing from A to its two nearest bays…

Now as mere humans, it’s pretty clear to us that as soon as possible, we need to get A to recalculate those two routes it has to its nearest bays, because there is now a shorter route from A to B (but actually not C). It’s also pretty clear that D’s routes (to E and F) are totally unaffected. Right now… my algorithm is dumb as fuck, because it tells A and D ‘OMG something changed! redo everything!’, which is a massive waste of resources. The problem is…How exactly does a better algorithm look?

This is one of those things that sounds simple but actually is not. If I learned computer science at school I’d likely be better at this, but I’m self taught. Here is my current thought process:

The route from D to E is 4 tiles. The distance from D to the new tile (pointed at by the arrow) is greater than 4. Thus there is NO way that my route from D to E can possibly be affected by this change. Thus I can leave D as a slot that keeps its old fashioned routes.

So an algorithm that made this optimization would be something like:

For Each Slot
	bool bcheck = false
	int max_dist_bay = GetMaxDistToBay()
	For Each changed tile
		If CrudeDistanceToTile <= max_dist_bay
			bcheck = true
	If bcheck
		PrepareToVerifyBays()
So it’s worth trying to see if that helps…
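For anyone who wants to play with the idea, here is a minimal C++ sketch of that cull. It only covers the ‘tile added’ case. The types and function names are made up for illustration, and it assumes the ‘crude distance’ is the Manhattan distance between tiles, which can never be longer than a real conveyor route between them as long as conveyors only link orthogonally adjacent tiles.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

struct Tile
{
    int x;
    int y;
};

// Hypothetical slot record: where it is, and the length (in tiles) of the
// longest of its cached routes to its nearest import bays.
struct SlotInfo
{
    Tile pos;
    int maxDistToBay;
};

// Manhattan distance: a cheap lower bound on any conveyor route between
// two tiles, assuming conveyors only connect orthogonally adjacent tiles.
int CrudeDistance(const Tile& a, const Tile& b)
{
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}

// True if any changed tile is close enough that a route passing through it
// could be no longer than the slot's current worst cached route.
bool NeedsVerification(const SlotInfo& slot, const std::vector<Tile>& changed)
{
    assert(slot.maxDistToBay >= 0);
    for (const Tile& tile : changed)
    {
        if (CrudeDistance(slot.pos, tile) <= slot.maxDistToBay)
        {
            return true;
        }
    }
    return false;
}

// Returns the indices of the slots that still need their routes re-checked.
std::vector<int> SlotsToVerify(const std::vector<SlotInfo>& slots,
                               const std::vector<Tile>& changed)
{
    std::vector<int> result;
    for (int i = 0; i < (int)slots.size(); ++i)
    {
        if (NeedsVerification(slots[i], changed))
        {
            result.push_back(i);
        }
    }
    return result;
}
```

The test is deliberately conservative: it can flag a slot that turns out not to need a recalculation, but for an added tile it should never skip one that does, because a route through the new tile cannot be shorter than the straight-line tile count to reach it.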

…And I tried it! I even wrote some debug output to verify that the number of slots that got verified each time was below 100%. Something I didn’t realize was that obviously for changes taking place right in the middle of a map, the majority of the map ends up being covered by this algorithm, as the distance for slots to the center of the map is often about the same as the distance to the nearest importer. However, when making changes nearer the edges and corners of the map, a way smaller percentage of the slots need verifying, sometimes just 2-4% of them.

Even if I just assume that we average 50% of slots being verified instead of 100%, this represents a 2x speedup in the game during these events, which should be super noticeable. Of course the only way to be sure this feeds through to user experience is to stick the new algorithm on a togglable switch and watch the frame rate counter and…

YES

It boosts the framerate a LOT. Obviously it’s still spread out over the same amount of time, but the total amount of recalculating needed is way lower so… obviously we get a big framerate boost.

This is NOT a perfect optimised way to do it, but you can bet it’s already roughly twice as fast, which is great. I need a better way to work out what slots are needlessly recalculating, AND I need to ensure I can use similar techniques on situations that cause a recalc by other means, such as when stuff is deleted but…it’s definite progress.

I’ll probably work on the deletion case tomorrow, do some more extensive testing, then try it out in the 1.80 preview build (unstable beta) on steam.