League of Legends Statistical Analysis - An Introduction

6/1/2015

League of Legends is one of the most popular games currently being played, with many sites citing it as the most-played PC game, and it has a healthy international audience.  For the last several months I’ve been tracking statistical data on League of Legends featured games (the games highlighted in the client so that other players can spectate some of the most skilled players live) on a routine schedule, gathering over 18,000 data entries from games in the North American region.  You can view the data I’ve captured in the spreadsheet linked below, where each patch has its own sheet and the statistics and probabilities interpreted from the data are kept separate.

This post serves as an introduction to the methodologies, the findings, and the general purposes behind the project, along with the conclusions I was able to draw.  As such, it is general in focus rather than specific to any one patch or its findings.  Because League of Legends is a continuously evolving game, follow-up posts will go into detail about the changes between patches, the findings from them, and what conclusions can be drawn from the data.

Goals

This statistical analysis sets out to answer the following questions and hypotheses:

  1. Are both team sides in League of Legends (blue and purple) equally likely to win, all other factors being equal?
  2. How much does capturing key objectives first influence the chance of winning a given game? Similarly, does capturing one key objective first influence the chances of taking other key objectives, and by what margin? Key objectives are defined here as:
  • Player Kill (First Blood)
  • Tower Capture
  • Inhibitor Destroyed
  • Dragon Slain
  • Baron Slain

Data Sets

The data set is pulled from Riot Games’ developer REST API by polling the featured games at a fixed interval (currently every fifteen minutes).  Featured games are chosen by an automated broadcaster service in the League of Legends client, which selects games from the top players in the Masters and Challenger skill brackets.  This serves two purposes: 1) to accommodate the limited request rates demanded by the Riot Developer Agreement, and 2) to avoid Elo skew when comparing champion data and other factors in a match.  By selecting only top-level players, I hope to minimize skew from an uneven playing field, such as instances where one player under-performs so severely that the opposing team gains a significant advantage.  These variables can lead to great changes in the outcome of a game, and the purpose of this project is to analyze static game data as opposed to individual player skill.
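As a rough sketch, the collection loop looks something like the following.  The endpoint URL, API key, and response field names here are hypothetical stand-ins; the exact featured-games route depends on the current version of the Riot API.

```python
import time
import requests

# Hypothetical values: substitute the real featured-games route and your own key.
API_KEY = "YOUR_RIOT_API_KEY"
FEATURED_URL = "https://na.api.pvp.net/observer-mode/rest/featured"

POLL_INTERVAL = 15 * 60  # fifteen minutes, matching the collection schedule
seen_game_ids = set()

def record_game(game):
    """Stub: write one row per game to the spreadsheet/data store."""
    print(game.get("gameId"), game.get("platformId"), game.get("gameMode"))

def poll_featured_games():
    """Fetch the currently featured games and record any not yet seen."""
    resp = requests.get(FEATURED_URL, params={"api_key": API_KEY})
    resp.raise_for_status()
    for game in resp.json().get("gameList", []):  # "gameList" is an assumed field name
        if game.get("gameId") not in seen_game_ids:
            seen_game_ids.add(game.get("gameId"))
            record_game(game)

while True:
    poll_featured_games()
    time.sleep(POLL_INTERVAL)
```

Polling on a slow, fixed schedule keeps the collector inside the Riot Developer Agreement’s rate limits while still catching most featured games before they rotate out.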

The data sets for the project will exist in the following spreadsheet:

https://docs.google.com/spreadsheets/d/1A8A0CHbcgTdOl1ZyX7jSnCf73b3SzZ-CdJirdabaTyQ/edit?usp=sharing

Sheets within this spreadsheet exist for data captured on a single game patch, as well as sheets for statistics per champion and probabilities for the global data set. 

Approach

For the games listed, I’ve collected several thousand data entries, one for each game logged between two opposing sides of five players each.  The data sets are limited to Summoner’s Rift ranked play, to eliminate outlier variables introduced by other maps and game modes.

Equality Between Sides

To determine equality between game sides, I compare the percentage of wins for both Blue and Purple in any given patch and compute the 95% and 99% confidence intervals of these values, knowing that the ideal win percentage for each side should be 50%.  I then decide whether to reject or accept this hypothesis on both a universal (all patches) and individual (per patch) scale.
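A minimal sketch of that check, using a normal approximation to the binomial; the win and game counts below are placeholders for illustration, not the real data:

```python
import math

# z-scores for two-sided 95% and 99% confidence levels
Z_SCORES = {0.95: 1.96, 0.99: 2.576}

def side_balance_test(blue_wins, total_games, confidence=0.95):
    """Confidence interval for the blue-side win rate, plus whether the
    balanced-sides hypothesis (p = 0.5) survives it."""
    p_hat = blue_wins / total_games
    margin = Z_SCORES[confidence] * math.sqrt(p_hat * (1 - p_hat) / total_games)
    low, high = p_hat - margin, p_hat + margin
    balanced = low <= 0.5 <= high  # fail to reject H0 if 0.5 lies inside
    return p_hat, (low, high), balanced

# Placeholder counts, not actual patch data
p_hat, (low, high), balanced = side_balance_test(blue_wins=1650, total_games=3000)
print(f"blue win rate {p_hat:.3f}, CI [{low:.3f}, {high:.3f}], balanced: {balanced}")
```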

Key Objective Influence

To determine the relationship between key objectives and win percentage, I’ve taken the global game data set (recent patches have not made substantial changes to the map) and drawn relationship trees between each objective and the count of team wins.  Then, using Bayes’ theorem and the statistics for all games, I can draw conclusions about the probability and influence a given objective has on the win percentage and on other objectives.

Findings

Equality Between Sides

The proposed hypothesis is that the probability of winning a game is equal for either side of the match; therefore:
$$H_0: P(\text{Blue wins}) = P(\text{Purple wins}) = 0.5$$
With my data set I found the following statistics on win percentages per patch:
[Table: observed win percentages and sample sizes per patch for Blue and Purple]
To determine whether or not the hypothesis is correct, I will use hypothesis testing at the 95% and 99% confidence levels (roughly the second and third standard deviations).  Each individual data sample and the global data sample will be tested separately.
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}, \qquad z_{0.025} \approx 1.96 \;(95\%), \quad z_{0.005} \approx 2.58 \;(99\%)$$

[Table: per-patch win percentages with their 95% and 99% confidence intervals]
Because the hypothesized 50% win rate falls below both the 95% and 99% confidence intervals in all cases except the 5.7.2 patch, we reject the initial hypothesis that the game is evenly balanced between both sides, all other factors being equal.

To address the 5.7.2 discrepancy, one immediate explanation may be the smaller sample size compared to the other patches.  This results in an artificially enlarged confidence interval for this patch, which will gradually narrow (assuming the probability stays the same) as data continues to flow in.  Looking at the patch notes for 5.7.2, nothing immediately stands out as a change to the layout of the map.  With more data in the future, we can confirm whether this patch remains consistent with the balanced-teams hypothesis.

The implications of this data are a huge factor in the overall gameplay of League of Legends.  Although the map design does everything in its power to maintain symmetry, the slight differences in objectives may account for a good portion of the differentiation in win ratios.  For example, one of the key objectives in the game, the Dragon, is strategically better positioned for the purple team than for the blue team.  The game balances this by putting a tougher but more rewarding objective, the Baron, on the opposite side of the map, where the blue team has the advantage.  Although in a perfect setting these two objectives would balance out, the fact that they are different objectives that become important at different times of the game will net different results for each team.

Another reason for the discrepancy may be the resulting lane positions and the current “meta” (the prevailing style of play) that the game has adopted for teamwork.  In current games, teams field one Fighter in the top-left lane, a Mage in the middle lane, a Jungler who roams the unoccupied areas of the map and helps lanes in need, and both a Support and a Marksman in the bottom-right lane.  Each team matches this arrangement, but because the lanes themselves are only semi-symmetrical (symmetrical from a flipped perspective), the disadvantages faced by the Purple Team’s top lane are the same disadvantages faced by the Blue Team’s bottom lane.  This means that two players face the disadvantage on the blue team, while only one faces it on the purple team.

There are other factors that may contribute to the discrepancy in win percentages, but this interesting dynamic pokes holes in the balance otherwise purported by Riot Games and the player base as a whole, which typically looks at global data rather than data in specific skill brackets.

Key Objective Influence

For determining influence between objectives on the map, I’ve computed relationships between the objectives at four distinct levels: binary, tertiary, quaternary, and quinary.  The data set omits data collected under the 5.4.1 patch due to missing values in some critical objective fields that were added in later patches.  The total sample size for this set is 5,061 data points.
Each computation is performed using the following conditional probability equation:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
For example, I calculate the relationship between First Blood and Winner as follows:
$$P(\text{Winner} \mid \text{First Blood}) = \frac{P(\text{Winner} \cap \text{First Blood})}{P(\text{First Blood})}$$
Because one team or the other will always get first blood, we can substitute the denominator with 1. Therefore:
$$P(\text{Winner} \mid \text{First Blood}) = P(\text{Winner} \cap \text{First Blood})$$
For comparisons in Tertiary and higher relationships, the effective formula is:
$$P(\text{Winner} \mid O_1 \cap O_2 \cap \cdots \cap O_n) = \frac{P(\text{Winner} \cap O_1 \cap O_2 \cap \cdots \cap O_n)}{P(O_1 \cap O_2 \cap \cdots \cap O_n)}$$
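As a sketch, these conditional probabilities can be computed directly from the per-team game rows.  The field names below ("won", "first_blood", and so on) are hypothetical stand-ins for however the spreadsheet columns are actually labeled:

```python
# Each record is one game from one team's perspective; the field names
# are hypothetical stand-ins for the real spreadsheet columns.
games = [
    {"won": True,  "first_blood": True,  "first_tower": True},
    {"won": False, "first_blood": True,  "first_tower": False},
    {"won": True,  "first_blood": False, "first_tower": True},
    # ... ~5,000 more rows in the real data set
]

def conditional_win_rate(games, objective):
    """P(Winner | objective) = count(won AND objective) / count(objective)."""
    with_objective = [g for g in games if g.get(objective)]
    if not with_objective:
        return None
    wins = sum(1 for g in with_objective if g["won"])
    return wins / len(with_objective)

for objective in ("first_blood", "first_tower"):
    print(objective, conditional_win_rate(games, objective))
```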

Binary Relationships (All Objectives)

[Tables: binary conditional probabilities between each pairing of First Blood, First Tower, First Inhibitor, First Dragon, First Baron, and Winner]
When looking at singular, binary relationships between values, it becomes clear that certain objectives do indeed affect the outcome of other objectives more than others.  For example, the First Inhibitor, an objective which comes late in the game and often clinches the winning team’s supremacy over the losing team, has a large effect on the outcome of the match.  These values appear to be related in terms of time: a normal game progresses with an immediate First Blood, followed by either a First Tower or a First Dragon depending on a team’s composition, with the Baron and Inhibitors becoming more important later in the game.  This follows suit in the data, where we expect and see these groupings.  The later objectives similarly have a larger impact on the game, although surprisingly the Baron seems to be less impactful than initially hypothesized.

The red sections indicate relationships that are weak or do not correlate with the corresponding column.  For example, First Blood, First Tower, and First Dragon have little effect on whether or not a team gets First Baron.

Tertiary, Quaternary, and Quinary Relationships (Winner)

For all other data points I factor in only the chance of winning with each separate pairing.  Below are tables that list the number of games won with the given pairing, the number of games lost with it, and the total number of games out of all games that contained that pairing (from either Blue or Purple).  These counts satisfy the conditional probability equation as follows:

$$P(\text{Winner} \mid \text{pairing}) = \frac{\text{games won with pairing}}{\text{total games with pairing}}$$

[Tables: games won, games lost, and win percentage for each tertiary, quaternary, and quinary objective pairing]
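Continuing the earlier sketch, the higher-order pairings can be enumerated with itertools.combinations; the objective field names remain hypothetical:

```python
from itertools import combinations

OBJECTIVES = ("first_blood", "first_tower", "first_inhibitor",
              "first_dragon", "first_baron")

def pairing_win_rates(games, size):
    """Win/loss counts and win percentage for every objective grouping of
    the given size (3 = tertiary, 4 = quaternary, 5 = quinary)."""
    results = {}
    for combo in combinations(OBJECTIVES, size):
        with_combo = [g for g in games if all(g.get(o, False) for o in combo)]
        if with_combo:
            wins = sum(1 for g in with_combo if g["won"])
            results[combo] = (wins, len(with_combo) - wins, wins / len(with_combo))
    return results

# e.g. all tertiary pairings from the rows gathered above
for combo, (won, lost, rate) in pairing_win_rates(games, 3).items():
    print(combo, won, lost, f"{rate:.2%}")
```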
When looking more in depth at the relationships between different-sized groupings of objectives and a team’s victory, the obvious becomes clear: as a team gains more objectives, its win percentage goes up dramatically.  However, the rate at which objectives change the win percentage again depends on how important those objectives are and on their ability to “snowball” a team to victory.

We see that the Inhibitor across the board has the largest impact on win percentage, but other interesting data points appear as well.  First Baron and First Inhibitor, as expected, are the two strongest indicators of a win, and when expanding into the tertiary and quaternary sets we see the relationships increase in the same fashion.  What’s more interesting is that First Tower, a very early objective often not seen as significant, correlates with some of the higher win percentages in these groupings, making it worth examining further.  Combined with the binary relationships posted above, we see that the Dragon has statistically high relevance, on par with the Inhibitor, which would not be intuitive to most players.

As stated before, a good portion of this could be due to “snowballing” in the early segments of the game.  Although first blood is an early way to solidify your lane, it doesn’t have the same relevance as destroying the first tower and carrying that momentum forward for your entire team.  This is something to consider when competing in matches: letting enemies take your tower could result in them snowballing out of control.

What's Next

Having looked at the patches from this cycle from a conglomerate viewpoint, and drawing from the conclusions outlined above, the next step is to break down each patch individually and highlight the changes from one patch to another: to look into reasons why a given champion was unfair during a patch, or why it was picked more often.  Using this information we can start to see trends forming and move toward legitimate, quantifiable reasons why a given champion needs a design change, based on quantitative testing.

Stay tuned for the next update on this project, which will highlight the most recent patch and which champions were identified as "Unfair" according to my analysis. I will use the same methodology but look at particular cases, namely the champions played in each game.