Common monitoring scenarios - Amazon GameLift Servers
Common monitoring scenarios

Deep-dive performance investigation

Scenario: A host or instance shows degraded performance caused by specific processes or game sessions

Investigation steps:

  • Access the Instance Performance dashboard.

  • Review "Top N Memory Consuming Game Sessions" table to identify which processes contribute the most to instance memory consumption.

  • Review "Top N CPU Consuming Game Sessions" table to identify which processes contribute the most to instance CPU utilization.

  • Choose a Game Session link to open the detailed metrics for that session.

  • Analyze server timings (Server Delta Time, Server Tick Rate, Server Tick Time, Server World Tick Time) to identify performance bottlenecks.
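The "Top N" tables can also be reproduced offline from exported metrics. The following is a minimal sketch, assuming per-session samples are available as dictionaries; the field names `memory_mb` and `cpu_pct` and the session IDs are hypothetical and are not dashboard or API names.

```python
from heapq import nlargest

def top_n_sessions(sessions, metric, n=3):
    """Return the n sessions with the highest value for the given metric."""
    return nlargest(n, sessions, key=lambda s: s[metric])

# Hypothetical sample records; field names are illustrative only.
sessions = [
    {"id": "gs-a", "memory_mb": 512, "cpu_pct": 20.0},
    {"id": "gs-b", "memory_mb": 2048, "cpu_pct": 65.0},
    {"id": "gs-c", "memory_mb": 1024, "cpu_pct": 80.0},
]

top_memory = [s["id"] for s in top_n_sessions(sessions, "memory_mb", n=2)]
top_cpu = [s["id"] for s in top_n_sessions(sessions, "cpu_pct", n=2)]
```

Ranking memory and CPU separately mirrors the two dashboard tables: a session can dominate one resource without dominating the other.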

Game server crash investigation

Scenario: A game session has crashed and you need to determine the root cause

Investigation steps:

  • Access the Server Performance dashboard for the crashed game session.

  • Check Memory Usage (Units) and Physical Memory Usage (%) to determine whether the crash was caused by an out-of-memory condition.

  • Review CPU Usage (%) to identify if CPU overload caused the crash.

  • Analyze Network I/O (Bytes) and Network I/O (Packets) to determine if network bandwidth problems contributed to the crash.

  • Examine Packet Loss percentage to identify network-related issues.
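The checks above can be sketched as a simple triage over the last metric sample recorded before the crash. This is an illustrative sketch only: the field names and the default limits (95% memory, 95% CPU, 5% packet loss) are assumptions chosen for the example, not values defined by Amazon GameLift Servers.

```python
def likely_crash_causes(last_sample, mem_limit_pct=95.0,
                        cpu_limit_pct=95.0, loss_limit_pct=5.0):
    """List resource signals that were at or above their limit before a crash."""
    causes = []
    if last_sample["physical_memory_pct"] >= mem_limit_pct:
        causes.append("out of memory")
    if last_sample["cpu_pct"] >= cpu_limit_pct:
        causes.append("CPU overload")
    if last_sample["packet_loss_pct"] >= loss_limit_pct:
        causes.append("network degradation")
    # No signal means the metrics alone do not explain the crash.
    return causes or ["no resource signal"]

# Hypothetical last sample before the crash.
sample = {"physical_memory_pct": 98.2, "cpu_pct": 41.0, "packet_loss_pct": 0.3}
causes = likely_crash_causes(sample)
```

A result of "no resource signal" suggests that the metrics alone do not explain the crash and that other evidence is needed.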

Investigate player-reported issues

Scenario: Players report lag or interruption during gameplay

Investigation steps:

  • Access the Server Performance dashboard for the affected game session.

  • Review Server Tick Time and Server World Tick Time to identify delays in game updates.

  • Check Server Tick Rate to ensure consistent server update frequency.

  • Analyze CPU Usage (%) to identify processing bottlenecks.

  • Review Memory Usage metrics to identify memory-related performance issues.

  • Check Network I/O metrics and Packet Loss to identify network bottlenecks.
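One way to turn tick metrics into a lag signal is to compare each tick time against the time budget implied by the target tick rate. The sketch below assumes tick times are exported in milliseconds and that the game targets a fixed tick rate; the 30 Hz target and the sample values are hypothetical.

```python
def tick_budget_ms(target_tick_rate_hz):
    """Time available per tick, in milliseconds, at the target tick rate."""
    return 1000.0 / target_tick_rate_hz

def fraction_over_budget(tick_times_ms, target_tick_rate_hz):
    """Share of ticks that took longer than the per-tick budget."""
    budget = tick_budget_ms(target_tick_rate_hz)
    over = sum(1 for t in tick_times_ms if t > budget)
    return over / len(tick_times_ms)

# Hypothetical tick times (ms) for a server targeting 30 ticks per second.
tick_times = [20.0, 25.0, 40.0, 50.0]
share_late = fraction_over_budget(tick_times, target_tick_rate_hz=30)
```

A high share of late ticks during the reported time window points to server-side processing as the cause, rather than the network.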

Identify performance changes in different game server builds

Scenario: You want to measure how game performance changes across different server builds

Investigation steps:

  • Compare Server Tick Time metrics between different builds to measure processing efficiency changes.

  • Analyze Server Tick Rate consistency across builds to identify performance regressions.

  • Review Server World Tick Time to measure game world update performance changes.

  • Compare Memory Usage patterns between builds to identify memory optimization improvements or regressions.

  • Monitor CPU Usage trends to assess computational efficiency changes.
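A build comparison like the one above can be sketched by comparing the median of a metric between a baseline build and a candidate build. The 10% tolerance and the sample tick times are assumptions for illustration; pick a tolerance that matches the normal variance of your game.

```python
from statistics import median

def relative_change(baseline, candidate):
    """Relative change of the candidate median versus the baseline median."""
    base = median(baseline)
    return (median(candidate) - base) / base

def is_regression(baseline, candidate, tolerance=0.10):
    """True when the candidate median is worse by more than the tolerance."""
    return relative_change(baseline, candidate) > tolerance

# Hypothetical Server Tick Time samples (ms) from two builds.
build_a = [10.0, 12.0, 11.0, 10.0, 12.0]
build_b = [13.0, 14.0, 12.0, 13.0, 15.0]
regressed = is_regression(build_a, build_b)
```

The median is used here because it is less sensitive to occasional spikes than the mean. The same comparison applies to memory or CPU samples.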

Detect delays and slowness in gameplay

Scenario: You need to monitor server responsiveness and game update speed

Investigation steps:

  • Monitor Server Tick Time to measure how fast the server processes each update cycle.

  • Track Server Tick Rate to ensure consistent game state updates per second.

  • Analyze Server World Tick Time to measure game world update speed, which directly affects the player experience.

  • Set up alerts for Server Delta Time variations to detect inconsistent server performance.
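As an illustration of what an alert on Server Delta Time variation might evaluate, the sketch below computes the coefficient of variation (standard deviation divided by mean) over a window of samples. The 0.2 threshold and the sample values are assumptions for the example, not recommended settings.

```python
from statistics import mean, pstdev

def delta_time_jitter(delta_times_ms):
    """Coefficient of variation of delta time over a window of samples."""
    return pstdev(delta_times_ms) / mean(delta_times_ms)

def should_alert(delta_times_ms, max_jitter=0.2):
    """True when delta time varies more than the allowed relative jitter."""
    return delta_time_jitter(delta_times_ms) > max_jitter

# Hypothetical windows of Server Delta Time samples (ms).
steady = [33.0, 33.0, 34.0, 33.0]
spiky = [33.0, 33.0, 100.0, 33.0]
```

Here `should_alert(steady)` is false and `should_alert(spiky)` is true. Because the measure is relative, the same threshold works for servers that run at different tick rates.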

Benchmarking different game scenarios

Scenario: You want to identify how different game scenarios affect server performance

Investigation steps:

  • Compare server performance metrics across different player counts to understand scaling impact.

  • Analyze performance differences between game modes using Server Tick Time and CPU Usage metrics.

  • Monitor Memory Usage patterns across different game scenarios to identify resource-intensive features.

  • Track Network I/O metrics to understand bandwidth requirements for different gameplay scenarios.

  • Use the Instance Performance dashboard to identify which game scenarios produce the highest resource-consuming game sessions.
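The player-count comparison above can be sketched by grouping exported samples by player count and averaging a metric for each group. The `(player_count, tick_time_ms)` tuple layout and the values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def mean_tick_time_by_player_count(samples):
    """Average tick time for each player count, ordered by player count."""
    grouped = defaultdict(list)
    for player_count, tick_time_ms in samples:
        grouped[player_count].append(tick_time_ms)
    return {count: mean(values) for count, values in sorted(grouped.items())}

# Hypothetical (player_count, tick_time_ms) samples from benchmark runs.
samples = [(50, 20.0), (10, 8.0), (50, 24.0), (10, 10.0)]
scaling = mean_tick_time_by_player_count(samples)
```

The same grouping works with game mode or map as the key instead of player count.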

High resource utilization response

Scenario: Unusual resource spikes (CPU >85%, Memory >90%)

Investigation steps:

Identify affected resources

  • Use the DescribeGameSessionDetails API to list game sessions in the fleet.

  • Filter by game session status if needed.

  • Document affected instances.
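The following sketch shows one way to collect every page of DescribeGameSessionDetails results. It takes the client as an argument so that it has no import dependencies. It assumes a boto3-style GameLift client whose `describe_game_session_details` method accepts `FleetId`, `StatusFilter`, and `NextToken` and returns a `GameSessionDetails` list; verify these names against the current API reference before relying on them. The function name is hypothetical.

```python
def list_game_session_details(client, fleet_id, status_filter=None):
    """Collect all pages of DescribeGameSessionDetails for one fleet."""
    params = {"FleetId": fleet_id}
    if status_filter:
        # Optional status filter, for example "ACTIVE".
        params["StatusFilter"] = status_filter
    details = []
    while True:
        page = client.describe_game_session_details(**params)
        details.extend(page.get("GameSessionDetails", []))
        token = page.get("NextToken")
        if not token:
            return details
        # Request the next page with the token from the previous one.
        params["NextToken"] = token
```

With boto3 installed and credentials configured, `boto3.client("gamelift")` would typically supply the client. Passing it in also makes the function easy to exercise with a stub.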

Analyze resource usage

  • Review Instance Overview dashboard.

  • Compare utilization across fleet.

  • Check historical patterns.
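The thresholds from this scenario (CPU above 85%, memory above 90%) can be applied to exported instance metrics to build the list of affected instances. This is a minimal sketch; the record layout and instance IDs are hypothetical.

```python
CPU_LIMIT_PCT = 85.0      # CPU threshold from the scenario
MEMORY_LIMIT_PCT = 90.0   # memory threshold from the scenario

def flag_instances(instances):
    """Return IDs of instances above either the CPU or the memory threshold."""
    flagged = []
    for inst in instances:
        if inst["cpu_pct"] > CPU_LIMIT_PCT or inst["memory_pct"] > MEMORY_LIMIT_PCT:
            flagged.append(inst["instance_id"])
    return flagged

# Hypothetical fleet snapshot.
fleet = [
    {"instance_id": "i-001", "cpu_pct": 92.0, "memory_pct": 60.0},
    {"instance_id": "i-002", "cpu_pct": 40.0, "memory_pct": 55.0},
    {"instance_id": "i-003", "cpu_pct": 70.0, "memory_pct": 93.5},
]
affected = flag_instances(fleet)
```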

Monitor game server impact

  • Check Server Performance metrics.

  • Review tick times and packet loss.

  • Monitor for memory leaks.
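Memory that rises steadily over a session's lifetime is a common sign of a leak. One simple heuristic, sketched below, is the least-squares slope of memory usage against sample index. The 1 MB-per-sample threshold is an arbitrary assumption for the example and depends on your sampling interval.

```python
def slope_per_sample(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def looks_like_leak(memory_mb, min_slope_mb=1.0):
    """True when memory grows by at least min_slope_mb per sample on average."""
    return slope_per_sample(memory_mb) >= min_slope_mb

# Hypothetical memory samples (MB) taken at a fixed interval.
growing = [100.0, 102.0, 104.0, 106.0]
stable = [100.0, 101.0, 100.0, 101.0]
```

Here `looks_like_leak(growing)` is true and `looks_like_leak(stable)` is false. A slope-based check needs at least two samples and works best over long windows, where short-lived allocations average out.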

Resolution steps

  • Download session logs.

  • Address build issues.

  • Monitor improvements.

Game server crash analysis

Scenario: Multiple error-status game sessions across fleet

Investigation steps:

Initial assessment

  • Access Fleet Overview dashboard.

  • Review crashed session table.

  • Note patterns in timing/location.
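Patterns in timing and location can be surfaced by counting crashed sessions per location and per hour. The sketch below assumes the crashed-session table has been exported to records with hypothetical `location` and `ended_at` (ISO 8601, no time zone suffix) fields.

```python
from collections import Counter
from datetime import datetime

def crash_patterns(crashes):
    """Count crashed sessions by location and by hour of day."""
    by_location = Counter(c["location"] for c in crashes)
    by_hour = Counter(datetime.fromisoformat(c["ended_at"]).hour for c in crashes)
    return by_location, by_hour

# Hypothetical export of crashed sessions.
crashes = [
    {"location": "us-west-2", "ended_at": "2024-05-01T14:05:00"},
    {"location": "us-west-2", "ended_at": "2024-05-01T14:40:00"},
    {"location": "eu-west-1", "ended_at": "2024-05-01T09:15:00"},
]
by_location, by_hour = crash_patterns(crashes)
```

A cluster in one location suggests an infrastructure or regional cause. A cluster in one time window that spans locations is more consistent with a build or load problem.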

Performance analysis

  • Check server timing metrics.

  • Review resource utilization.

  • Monitor network performance.

Infrastructure review

  • Verify fleet capacity.

  • Check instance health.

  • Review scaling policies.

Resolution path

  • Analyze server logs.

  • Review code optimization.

  • Implement fixes.

Fleet capacity optimization

Scenario: Game launch or benchmark study

Analysis steps:

Resource utilization

  • Filter by location.

  • Review P50/P95/P99 metrics.

  • Analyze usage patterns.
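For reference, P50, P95, and P99 are percentiles: the value at or below which 50%, 95%, or 99% of samples fall. The sketch below uses the nearest-rank definition, which is one of several common conventions and may differ slightly from how the dashboards compute these values.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical CPU utilization samples: the integers 1 through 100.
cpu_samples = list(range(1, 101))
p50 = percentile(cpu_samples, 50)
p95 = percentile(cpu_samples, 95)
p99 = percentile(cpu_samples, 99)
```

A large gap between P50 and P99 indicates that a few instances or sessions run much hotter than the typical one, which matters more for capacity planning than the average does.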

Instance type analysis

  • Compare performance by type.

  • Identify scaling candidates.

  • Document utilization patterns.

Optimization actions

  • Adjust scaling policies.

  • Modify instance types.

  • Update fleet configuration.