What is an ECS?
Traditionally in game development, you would follow an inheritance approach to problems. A Goblin inherits from a Monster which inherits from an Actor. A Shopkeeper inherits from a Human which also inherits from an Actor. The Actor class contains a function called Render() which knows how to render an Actor, so for every Goblin you can call Goblin.Render() and for every Shopkeeper you can call Shopkeeper.Render().
There are two main problems with this approach. The first is the problem of flexibility. If you decide that you want to visit a town of friendly goblins in the game, and you have Goblin Shopkeepers, your inheritance tree gets messed up. You have all of the shopkeeping functionality in the Shopkeeper class (selling, bartering, whatever), but your Goblin Shopkeeper can’t inherit from Shopkeeper because that would make the Goblin Shopkeeper a Human. Without a doubt, inheritance has its place in software development, but in gameplay programming it can cause problems.
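To make the flexibility problem concrete, here is a minimal C++ sketch of the hierarchy described above. The class bodies, the name string, and the Sell() method are hypothetical stand-ins for illustration, not code from any real engine.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Base class: every Actor knows how to render itself.
// Render() returns a description instead of drawing, to keep the sketch small.
class Actor {
public:
    explicit Actor(std::string name) : name_(std::move(name)) {}
    virtual ~Actor() = default;
    std::string Render() const { return "Rendering " + name_; }

private:
    std::string name_;
};

class Monster : public Actor {
public:
    using Actor::Actor;  // reuse Actor's constructor
};

class Human : public Actor {
public:
    using Actor::Actor;
};

class Goblin : public Monster {
public:
    Goblin() : Monster("Goblin") {}
};

class Shopkeeper : public Human {
public:
    Shopkeeper() : Human("Shopkeeper") {}
    // Hypothetical shopkeeping behavior that lives only in this branch of the tree.
    std::string Sell(const std::string& item) const { return "Sold " + item; }
};

// The shopkeeping code is locked into the Human branch of the tree:
static_assert(std::is_base_of<Human, Shopkeeper>::value,
              "anything deriving from Shopkeeper is also a Human");
static_assert(!std::is_base_of<Monster, Shopkeeper>::value,
              "Shopkeeper shares nothing with the Monster branch");
```

A GoblinShopkeeper that derives from Goblin has no Sell(), and one that derives from Shopkeeper is a Human. Deriving from both would give it two Actor base subobjects unless the hierarchy is reworked to use virtual inheritance, which is exactly the sort of tree surgery this approach tends to force on you.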
The second problem is a misuse of the cache. In games, you commonly iterate over a set of objects many times per second, running methods on them every frame. For example, your physics system might iterate over all objects that are subject to physics and call Object.Integrate(dt), updating their position, velocity, and acceleration. So traditionally you’d have one big object that contains all of its state, including the state needed for physics, and you’d call the Integrate() function on every object that needs to be updated. In each object’s Integrate() method, you access the object’s position, velocity, and acceleration member variables. When you access position, it’s pulled into a cache line along with nearby member variables. Some of those nearby member variables will be useful (velocity and acceleration), while others will not be. This is a huge waste of the cache, and in an age where the performance bottleneck is often the time it takes for data to get from main memory to the CPU, it’s a big deal.
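As a rough illustration of that waste, here is a sketch of such a "big object". The non-physics members (name, health, gold) and the 64-byte cache line size are assumptions made for the example. 64 bytes is common on current desktop CPUs but not universal.

```cpp
#include <cstddef>

struct Vec3 {
    float x, y, z;
};

// A traditional "everything in one object" layout.
struct GameObject {
    // State the physics update actually needs.
    Vec3 position;
    Vec3 velocity;
    Vec3 acceleration;

    // Hypothetical unrelated state that sits right next to it in memory.
    char name[64];
    int health;
    int gold;

    // Simple Euler step: acceleration changes velocity, velocity changes position.
    void Integrate(float dt) {
        velocity.x += acceleration.x * dt;
        velocity.y += acceleration.y * dt;
        velocity.z += acceleration.z * dt;
        position.x += velocity.x * dt;
        position.y += velocity.y * dt;
        position.z += velocity.z * dt;
    }
};

// Assumed cache line size, for illustration only.
constexpr std::size_t kAssumedCacheLineBytes = 64;

// Each object spans more than one cache line, yet Integrate() only
// touches the three Vec3 members at the front.
static_assert(sizeof(GameObject) > kAssumedCacheLineBytes,
              "the object is larger than a single cache line");
```

When you walk an array of these objects, every step drags the name, health, and gold bytes through the cache even though the physics code never reads them.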
The tide has been shifting toward component-based design to solve the first problem. Looking at Unity, for example, all of the game objects are component-based. You start with a blank object that has only the default required Transform component, and you add more components to give the object functionality. But that hasn’t solved the second problem.
The second problem is solved by keeping all of the data that will be iterated upon regularly packed tightly into memory so that an entire cache line’s worth of data can be loaded at once, and when the next item is iterated upon, its data is already in the cache. This is done by defining components as Plain Old Data (POD), essentially a simple struct with only the relevant data included. To continue the physics example, you might have Transform with position, RigidBody with velocity and acceleration, and Gravity with the gravitational acceleration g.
The physics system would then iterate over all “objects” that “contain” these three components, pulling in only the data it cares about into the cache.
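The three components from the physics example might look something like this in C++. The Vec3 helper, the exact fields, and the 64-byte cache line figure are assumptions for the sketch; a real engine would likely carry more data (rotation and scale in Transform, for instance).

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical three-float vector used by the components below.
struct Vec3 {
    float x, y, z;
};

// Components are plain structs: data only, no behavior.
struct Transform {
    Vec3 position;
};

struct RigidBody {
    Vec3 velocity;
    Vec3 acceleration;
};

struct Gravity {
    float g;  // gravitational acceleration applied to the entity
};

// Check at compile time that the components really are plain old data.
static_assert(std::is_trivial<Transform>::value &&
                  std::is_standard_layout<Transform>::value,
              "Transform should be POD");
static_assert(std::is_trivial<RigidBody>::value &&
                  std::is_standard_layout<RigidBody>::value,
              "RigidBody should be POD");
static_assert(std::is_trivial<Gravity>::value &&
                  std::is_standard_layout<Gravity>::value,
              "Gravity should be POD");

// Assumed cache line size; common on desktop CPUs but not guaranteed.
constexpr std::size_t kAssumedCacheLineBytes = 64;

// How many whole Transforms fit in one assumed cache line.
constexpr std::size_t kTransformsPerLine =
    kAssumedCacheLineBytes / sizeof(Transform);
```

On a typical platform with 4-byte floats, a Transform is 12 bytes, so several of them arrive with each cache line and the next few iterations find their data already loaded.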
In reality, the traditional concept of the “object” is gone. Instead we have an Entity which is simply an ID. It doesn’t “contain” anything. Instead the ID is used as an index into an array of components. An array is contiguous in memory, which makes it a natural choice of data structure here. So the physics system might have a list of all entities that have a Transform, RigidBody, and Gravity component, and use the entity’s ID as an index into the Transform array, into the RigidBody array, and into the Gravity array.
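Put together, the idea can be sketched as follows. Everything here (the World struct, kMaxEntities, the PhysicsSystem layout, gravity acting along the y axis) is a hypothetical, simplified arrangement chosen to show the indexing, not a complete ECS.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// An entity is nothing but an ID.
using Entity = std::uint32_t;
constexpr std::size_t kMaxEntities = 1024;  // assumed fixed upper bound

struct Vec3 { float x, y, z; };
struct Transform { Vec3 position; };
struct RigidBody { Vec3 velocity; Vec3 acceleration; };
struct Gravity { float g; };  // assumed here to act along the y axis

// One contiguous array per component type, indexed by entity ID.
struct World {
    std::array<Transform, kMaxEntities> transforms{};
    std::array<RigidBody, kMaxEntities> rigidBodies{};
    std::array<Gravity, kMaxEntities> gravities{};
};

// A system holds the logic plus the list of entities it cares about.
struct PhysicsSystem {
    // Entities that have a Transform, a RigidBody, and a Gravity component.
    std::vector<Entity> entities;

    void Update(World& world, float dt) const {
        for (Entity e : entities) {
            assert(e < kMaxEntities);
            // The same ID indexes all three component arrays.
            Transform& t = world.transforms[e];
            RigidBody& rb = world.rigidBodies[e];
            const Gravity& gravity = world.gravities[e];

            rb.velocity.x += rb.acceleration.x * dt;
            rb.velocity.y += (rb.acceleration.y + gravity.g) * dt;
            rb.velocity.z += rb.acceleration.z * dt;

            t.position.x += rb.velocity.x * dt;
            t.position.y += rb.velocity.y * dt;
            t.position.z += rb.velocity.z * dt;
        }
    }
};
```

To use it, you would write a component value into each array at the entity’s index, push the entity’s ID onto the system’s list, and call Update(world, dt) once per frame. One caveat of indexing directly by ID is that the arrays have gaps wherever an entity lacks a component, so a fuller implementation would probably keep each array densely packed and map IDs to packed indices.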
So conceptually it’s all pretty simple. An Entity is an ID. A Component is a struct of data. A System is the logic that operates on the components.