How do we distribute the load of running core game play functionality across multiple processes to support thousands of concurrent players?
We are developing the server for a massively multiplayer online game with a client-server architecture. The game design seeks to create an immersive play experience by enabling thousands of players to interact with each other in a shared virtual world.
- The game design depends on having many players online at the same time.
- The game design does not require the virtual world to simulate a single, large and unbounded open space that permits total freedom of movement and play.
- The game design may specify the use of ephemeral maps, where small groups of players choose to share in scripted experiences.
- The game design targets player population ranges for each map based on map size, features, player character traffic patterns, and play activities expected to occur on each map.
- The game server uses a distributed deployment architecture.
- The game server uses the Distributed Network Connections pattern to manage game client connections.
- The development plan favors technical simplicity and ease of implementation over flexible configuration and dynamic scalability at runtime.
- The long-term product plan calls for incremental enhancement of the game through adding content and increasing the size of the virtual world.
- The business plan accepts that operating and support costs may increase in proportion to content expansion and player population.
Develop a map-centric game server strategy that concentrates core game play activity with the maps in which it occurs. Do this by creating two server types, area server and world server. The game’s server cluster contains many area server processes connected to a single central world server process. One or more connection servers generally manage client network connections, but are not part of this pattern.
The area servers contain the maps that make up the virtual world and control the game behavior that occurs on them.
A data structure called a map template defines the terrain, geometry and other content for a given map. At run time, an area server creates one or more map instances from a given map template. These map instances provide a context within which the core game play for the map executes. Each area server may host multiple instances of one or more map templates.
Large maps designed as common spaces for many players usually exist as singletons. That is, only one instance of that map exists in the entire game. The area server for one of these maps creates its instance at server startup, and keeps it running perpetually.
Smaller maps exist for running scripted experiences with smaller groups of players. These maps are ephemeral. The area server creates instances of these maps on demand, and shuts them down when no longer needed.
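The relationship between map templates, map instances, and the two map lifecycles can be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular game; the class names, the `singleton` flag, and the method names are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class MapTemplate:
    """Static definition of a map (terrain, geometry, content elided)."""
    name: str
    singleton: bool  # hypothetical flag: True for large shared maps


@dataclass
class MapInstance:
    """Runtime context in which core game play for one map executes."""
    instance_id: int
    template: MapTemplate
    players: set = field(default_factory=set)


class AreaServer:
    """Hosts instances of the map templates assigned to it."""

    def __init__(self, templates):
        self._ids = count(1)
        self.templates = {t.name: t for t in templates}
        self.instances = {}
        # Singleton maps are created at startup and kept running.
        for template in templates:
            if template.singleton:
                self._spawn(template)

    def _spawn(self, template):
        instance = MapInstance(next(self._ids), template)
        self.instances[instance.instance_id] = instance
        return instance

    def create_instance(self, template_name):
        """Create an ephemeral instance on demand."""
        template = self.templates[template_name]
        if template.singleton:
            raise ValueError(f"{template_name} is a singleton map")
        return self._spawn(template)

    def shutdown_instance(self, instance_id):
        """Shut down an ephemeral instance that is no longer needed."""
        instance = self.instances[instance_id]
        if instance.template.singleton:
            raise ValueError("singleton maps run perpetually")
        if instance.players:
            raise RuntimeError("instance still has players")
        del self.instances[instance_id]
```

In this sketch, one area server process can host a singleton town map alongside any number of ephemeral dungeon instances created from the same template.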
When a player’s character moves between maps in the virtual world, the game moves the character’s state from the old map instance to the new one. If the map instances are in different area server processes, the origin area server saves and unloads the state, and the destination area server loads it into the new map instance.
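A rough sketch of that handoff is shown below. It assumes character state is a plain dictionary serialized as JSON, which is a choice made purely for illustration; a real server would likely use its own persistence layer. `InstanceStub` and the function names are hypothetical.

```python
import json


class InstanceStub:
    """Stripped-down stand-in for a map instance (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.characters = {}  # character id -> state dict


def save_and_unload(origin, character_id):
    """Origin side: remove the character and return its serialized state."""
    state = origin.characters.pop(character_id)
    return json.dumps(state)


def load(destination, character_id, payload):
    """Destination side: restore the character into the new map instance."""
    destination.characters[character_id] = json.loads(payload)


def transfer(origin, destination, character_id):
    """Move a character between instances, possibly across processes."""
    payload = save_and_unload(origin, character_id)
    # In a real cluster the payload would cross a process boundary here.
    load(destination, character_id, payload)
```

After `transfer` returns, the character exists only in the destination instance, so exactly one area server owns its state at any time.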
The world server serves as a central controller for area servers. It assigns map templates to area servers, notifies area servers when to create map instances, and coordinates movement of player characters between map instances and their area servers.
The world server is the central authority on a player character’s location within both the virtual world and the area server process nodes that make up the game server cluster. When a player character moves between area servers, the world server updates the routing data used by the connection servers so that client messages reach the correct area server.
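The world server's bookkeeping might look something like the sketch below. It assumes the world server pushes route updates directly to each connection server; the pattern itself does not prescribe how routing data is propagated, and every name here is hypothetical.

```python
class ConnectionServer:
    """Forwards client messages to whichever area server owns the character."""

    def __init__(self):
        self.routes = {}  # character id -> area server id

    def set_route(self, character_id, area_server_id):
        self.routes[character_id] = area_server_id


class WorldServer:
    """Central authority on where each player character is located."""

    def __init__(self, connection_servers):
        self.connection_servers = connection_servers
        self.instance_hosts = {}  # map instance id -> area server id
        self.locations = {}       # character id -> map instance id

    def register_instance(self, instance_id, area_server_id):
        self.instance_hosts[instance_id] = area_server_id

    def move_character(self, character_id, instance_id):
        """Record the new location and update routes if the host changed."""
        new_host = self.instance_hosts[instance_id]
        old_instance = self.locations.get(character_id)
        old_host = self.instance_hosts.get(old_instance)
        self.locations[character_id] = instance_id
        if new_host != old_host:
            for server in self.connection_servers:
                server.set_route(character_id, new_host)
        return new_host
```

Routes change only when a move crosses area servers, so movement between two instances hosted in the same process costs the connection servers nothing.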
We have an MMO game server that distributes the core game play load across multiple server processes to support thousands of connected clients at one time. The game can be made to perform reasonably well, but will meet obstacles to growing the player base.
All game play activity is bound to specific map instances. This means that map instances must be allocated to specific area server processes using a mix that must be determined before they are created or populated. Getting the right mix will require attaching metadata to map templates to classify them based on intended load and/or population characteristics. Developing and maintaining this metadata will take careful performance testing, measurement, and analysis, much of which must be done by hand.
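As one illustration of how such metadata could drive allocation, the sketch below assumes each map template carries a single hand-measured load weight and spreads templates across area servers with a simple greedy heuristic (heaviest first, onto the least-loaded server). Both the weight and the heuristic are assumptions made for the example, not a method this pattern prescribes.

```python
import heapq


def assign_templates(template_loads, server_ids):
    """Greedily assign templates to area servers by estimated load.

    template_loads: dict of template name -> estimated load weight
    server_ids: list of area server identifiers
    """
    heap = [(0, server_id) for server_id in server_ids]
    heapq.heapify(heap)
    assignment = {server_id: [] for server_id in server_ids}
    # Place the heaviest templates first; break ties by name for determinism.
    ordered = sorted(template_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, load in ordered:
        total, server_id = heapq.heappop(heap)
        assignment[server_id].append(name)
        heapq.heappush(heap, (total + load, server_id))
    return assignment
```

The result is fixed before any instance is created, which mirrors the constraint above: a poor weight estimate cannot be corrected at runtime without changing the metadata and redeploying.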
Scaling the game to higher player concurrency requires adding more map instances, and most likely, more area server processes. This works well for ephemeral maps, which may have an unbounded number of instances. However, few options exist for handling player load in shared singleton maps, where higher player load will have the greatest impact.
One option is to create more shared maps, but that imposes a content production burden and takes time. It’s the obvious choice for major content releases or expansions, however.
Another option is to relax the singleton constraint on shared maps, adding more instances when load requires it. This comes at the cost of a degraded player experience, as players will have to be aware of which instance they are in when grouping or during other social interaction. Technical solutions exist to mitigate this, but they won’t eliminate it entirely.
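A minimal sketch of relaxing the singleton constraint is shown below. It assumes a fixed population cap per copy of the map and fills existing copies before opening a new one; the cap and the function name are hypothetical.

```python
def place_player(instances, player_id, cap):
    """Put a player into the first copy of a shared map that has room.

    instances: list of sets of player ids, one set per copy of the map
    Returns the index of the copy the player was placed in.
    """
    for index, players in enumerate(instances):
        if len(players) < cap:
            players.add(player_id)
            return index
    # Every existing copy is full, so load requires another instance.
    instances.append({player_id})
    return len(instances) - 1
```

The returned index is exactly the detail that leaks into the player experience: two friends who receive different indexes are on the same map but cannot see each other.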
A final option for handling higher player load is sharding. This essentially means creating one or more complete copies of the entire game server. Each copy is an entity unto itself. With sharded game servers, players must select which shard they want to join at login and create characters that they can use only on that shard. This limits player mobility across shards and presents an obstacle to social game play. Options exist for addressing these issues, but they add yet more development, operating, and customer support costs.
MMO development is difficult and costly, requiring large teams and complex project management. Development timelines commonly run two years or longer. Meaningful feedback on game design often doesn’t materialize until late in preproduction if you’re lucky or good, and sometimes not until closed beta if you’re not.
Reducing the development timeline is one of the most common strategies for reducing the risk of failure to ship an MMO title. Whether or not it’s the best choice depends on many factors outside the scope of this article. That said, it is sometimes effective, if enough of those other factors line up favorably.
One of the ways to cut development time is to reduce the up-front technical complexity of the project. This pattern attempts to do that by simplifying the design and implementation of core game systems, keeping most of them within a single type of server, the area server.
For the most part, the pattern succeeds in simplifying development when compared to more highly distributed architectures. While this pattern still qualifies as a distributed deployment architecture, it minimizes the number of distributed server types. It also limits most code changes to the area server once the supporting server implementations stabilize.
This produces several benefits for developers:
- Developers of server-side game systems work mostly in the same code, allowing easier sharing of knowledge, code, and responsibilities.
- The overall server architecture is easier to understand and communicate than highly distributed architectures. This reduces the time it takes for new developers to become productive.
- Compared to more highly distributed architectures, there is usually less need for certain complex programming paradigms such as multithreading and asynchronous processing.
- Debugging and testing are often much easier when most of the functionality is in the same process.
These benefits come at the price of long-term flexibility and growth. However, for most projects, the long-term scenario is moot if the game fails to ship within expectations.
This pattern is probably a good one to use when:
- The project is (one of) the first MMO projects for the development team, studio, or management.
- The development budget is very aggressive or limited.
- Team staffing emphasizes hiring game system programmers over engineers with experience in building scalable servers.
- The long-term product strategy must defer to overcoming near-term shipping risk.
- Richard Garriott’s Tabula Rasa
- Hero Engine (an MMO server framework) http://hewiki.heroengine.com/wiki/Area_Server, http://hewiki.heroengine.com/wiki/World_Server
- Face of Mankind http://www.mmorpg.com/blogs/FaceOfMankind/052013/25185_A-Journey-Into-MMO-Server-Architecture