This paper investigates the problem of identifying an RTS game player from their playing style. More specifically, we use machine learning algorithms from the WEKA toolkit to learn to identify a StarCraft II player from features extracted from game replays. Results reveal that boosted J48 and Random Forest decision trees perform best at identifying a player from replay data. For a particular player, the results also help us identify the most frequently used strategy against different opponent types and the player's strengths and weaknesses. We believe that these results will help us design better RTS game AI.
1. Introduction

Models of player behavior in real-time strategy games are important for the AI community. If we can learn to recognize a player from the way the player plays the game, we are learning a player model, and we can use this model to devise counter-strategies that beat the player. On the other hand, especially if the player is good, we can learn strong winning strategies from analyzing the player's gameplay. Either way, player modeling helps us develop better gameplay strategies, and AI has used board and card games for research since the early fifties, when Samuel worked on developing a checkers player []. In this paper, we focus on using machine learning techniques to identify a professional StarCraft II player from a database of game replays. StarCraft II is one of the most popular multiplayer real-time strategy games [], with professional player leagues and large databases of professional and other game replays available on the Internet. Figure 1 shows a typical game interface of StarCraft II. In South Korea, there are twelve professional StarCraft II teams, with top players making six-figure salaries, and the average professional gamer earns more than the average Korean.
RTS games involve spatial reasoning, resource management, and strategic and tactical thinking. A player has to build up an economy to obtain enough resources to generate and support a strong military that will defeat the opponent. Any advances in AI approaches to designing RTS game players will have industrial, military, and social applications. Our overall research goal is to develop competent RTS game players, and this paper represents initial research in this direction. We are interested in analyzing professional games to see if professional players share common play characteristics (styles). That is, can we model a specific StarCraft II player from the player's replays? How does this player's style compare with another's? What are a player's strengths and weaknesses? In our research, we explore the use of supervised machine learning techniques to identify a StarCraft II player from game replays. Specifically, we use machine learning algorithms from the WEKA toolkit on features extracted from StarCraft II replays to learn to identify a specific professional StarCraft II player. Preliminary results show that our prediction accuracy on a testing set can be as high as 87 percent using a random forest with one hundred trees. Other analysis with an entropy-minimizing decision tree indicates that the player always tries to maximize economic resources early in the game.
The remainder of this paper is organized as follows. Section 2 describes related work on RTS games and player modeling. The next section describes our methodology and the features used for player identification. Section 4 contains the results of this research. Finally, the last section provides conclusions and discusses future work.
2. Related Work

StarCraft II was released in 2010 and, being a relatively new game, has not been used much for scientific research. Michael Whidby implemented a Python game for studying scouting efficiency in different leagues of one-versus-one ladder play in StarCraft II [4]. His results, for a specific kind of scouting, show that players in higher leagues scout more than players in lower leagues.
However, StarCraft: Brood War, the predecessor to StarCraft II, has often been used for research in the AI community. Ji-Lung Hsieh and Chuen-Tsai Sun applied a case-based reasoning approach to train their system to learn and predict player strategies [5]. Ben G. Weber and Michael Mateas present a data mining approach to opponent modeling in StarCraft [6]. They applied various machine learning algorithms to detect an opponent's strategy in historical game replays and to predict the opponent's strategic actions and their timing.
There is much player identification research in other games. Jan Van Looy and Cedric Courtois studied player identification in online games [7]. They focused on massively multiplayer online games (MMOGs), and their research did not use game data; rather, it was based on a group of volunteer gamers whose preferences for their avatars' appearance were gathered through survey questions.
Some work has been done on extracting features from replay files. SC2Gears [12] provides a StarCraft II replay parsing service that converts a binary replay file into an XML-structured file which we can easily understand. Gabriel Synnaeve and Pierre Bessiere worked on extracting the complete game state from a recorded StarCraft replay file by rerunning the replay and recording the complete game state through the Brood War API (BWAPI) framework [8]. This approach can access the full game state in every frame of the game and removes the limitation inherent in replay files, which only record the actions of each player. However, StarCraft II does not have such an interface, so we cannot access its complete game state yet. We therefore only use the data from player actions in the StarCraft II replay file as parsed by the SC2Gears parsing service.
3. Methodology

3.1 Data Collection

One of the challenges of identifying a player is gathering enough game replays of one specific player versus other players. A StarCraft II replay is a file that saves all the action events logged during a game. These replays reflect the players' thinking and decision making at every stage of the game. We therefore believe that we can infer play style and useful patterns from replays. Many websites focus on collecting and sharing game replays, and most replays are from professional tournaments including MLG [9], IPL [10], GSL [11], and other pro leagues. It is therefore possible to collect a representative set of replays for specific professional players. In this early work, we focus only on one-versus-one games, because this is the most popular game type for professional matches. There are three different races (Terran, Protoss, and Zerg) that a player can choose; each race is entirely different from the others in terms of structures and units, and thus in play styles. In this research, we select a Protoss player because Protoss players typically have very different strategies against each of the other races. Most Protoss players prefer early attacks when their opponent is Protoss, a more balanced game against Zerg, and a much more economically focused game against Terran.
We gathered more than 450 replays from SC2Rep.com [1] and GameReplays.org [2]. Half of the games include our specific player and half do not. However, these replays come from fans of the game rather than official sources, so the data is noisy. For example, some Zerg-versus-Zerg replays in the collection are mislabeled as Protoss versus Zerg. We cleaned these noisy data manually. The final set of game replay files, broken down by the race our Protoss player faced, is shown in Table 1.
StarCraft II replay files are stored in a binary format by the game. We need to parse them into a format that we can understand and use. Several websites provide parsing services, and the SC2Gears parsing service [12] is one of the most popular. We therefore developed a web tool that converts replays to XML-structured files using SC2Gears. We extract all user interface actions from the parsed replay files. However, complete game state information is not available, because some state information is generated by the game engine and not saved in replay files. BWAPI for StarCraft: Brood War can obtain the complete game state, but StarCraft II does not have such an interface yet. Table 2 shows a subset of an example game log for a Protoss player parsed from one replay file; a minimal parsing sketch follows the table.
Frame(Time) | Player | Action | Object |
3296 | Player 1 | Build | Pylon |
3588 | Player 2 | Build | Supply Depot |
3625 | Player 1 | Train | Probe |
4804 | Player 2 | Train | SCV |
5638 | Player 1 | Select | Hotkey 1 |
6208 | Player 2 | Build | Barracks |
7543 | Player 1 | Attack | Target position |
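To make the parsing step concrete, the following is a minimal sketch of loading one parsed replay into rows like those in Table 2. The XML element and attribute names are assumptions chosen for illustration; the actual SC2Gears output schema may differ.

    import xml.etree.ElementTree as ET

    # Load one SC2Gears-parsed replay into (frame, player, action, object) rows like Table 2.
    # The "action" element name and its attributes are assumptions made for this sketch.
    def load_actions(xml_path):
        tree = ET.parse(xml_path)
        actions = []
        for node in tree.getroot().iter("action"):       # hypothetical element name
            actions.append((
                int(node.get("frame", 0)),               # game frame of the command
                node.get("player", ""),                  # issuing player
                node.get("type", ""),                    # e.g. Build, Train, Select, Attack
                node.get("object", ""),                  # e.g. Pylon, Probe, Hotkey 1
            ))
        actions.sort(key=lambda row: row[0])             # keep chronological order
        return actions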
3.2 Extract Features
The goal of our representation is to capture as much game information as possible, as well as the unique aspects of one specific player compared to other players, so that we can distinguish the player from others based on these unique characteristics. Our feature vector covers three parts of the game information. The first part is general game information, which includes game length, winner, and actions per minute (APM). The second part represents the changing state of the game and covers how many units or structures are built in each three-minute time slice. The last part records the key frames for the first build or use of each building, unit, and unit ability. This information indicates an important part of the strategy the player used. Formally, our features are represented by Equation 1, where one index ranges over units, structures, upgrades, and abilities, and the other index identifies the three-minute time slice.
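As an illustration of this representation, the sketch below builds the per-slice counts and first-use key frames from the action list produced by the earlier parsing sketch. The frame-to-minute conversion constant and the number of slices are assumptions, since they are not stated above.

    # Feature extraction following the description above: per-three-minute-slice counts of
    # what the player builds or uses, plus the frame of first use. FRAMES_PER_MINUTE and
    # SLICES are assumed values; load_actions() is the parsing sketch shown earlier.
    FRAMES_PER_MINUTE = 960        # assumed frame-to-time conversion
    SLICES = 7                     # assumed number of three-minute slices kept per game

    def extract_features(actions, player, tracked_objects):
        features = {}
        for obj in tracked_objects:                       # units, structures, upgrades, abilities
            for s in range(SLICES):
                features["%s_count_slice_%d" % (obj, s)] = 0
            features["%s_first_frame" % obj] = -1         # -1 means never built or used
        for frame, who, act, obj in actions:
            if who != player or obj not in tracked_objects:
                continue
            s = min(frame // (3 * FRAMES_PER_MINUTE), SLICES - 1)
            features["%s_count_slice_%d" % (obj, s)] += 1
            if features["%s_first_frame" % obj] < 0:
                features["%s_first_frame" % obj] = frame
        return features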
In our early experiments, we noticed one very simple distinguishing feature for players. Actions per minute (APM) alone identifies specific players with high reliability. However, this dominating feature only shows how fast a player clicks the keyboard and mouse and tells us nothing about how they played the game. Therefore, we did not include APM in the rest of our work. In the end, we collected 230 features from each replay file.
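Although we drop APM from the feature set, it is straightforward to compute from the parsed action log; the sketch below reuses the load_actions() helper and the assumed FRAMES_PER_MINUTE constant introduced earlier.

    # APM for one player: commands issued divided by game length in minutes.
    def actions_per_minute(actions, player):
        if not actions:
            return 0.0
        last_frame = max(frame for frame, _, _, _ in actions)   # approximate game length
        minutes = last_frame / float(FRAMES_PER_MINUTE)
        own = sum(1 for _, who, _, _ in actions if who == player)
        return own / minutes if minutes > 0 else 0.0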
3.3 Evaluation

Given our research aim, our approach tries to achieve maximum classification performance on the training and testing sets. We apply various classification and prediction algorithms from the WEKA toolkit from the University of Waikato [3] to explore multiple machine learning approaches to identifying our player. WEKA is a powerful machine learning package that includes many prediction and classification algorithms, as well as preprocessing, regression, and statistical tools. We applied the following techniques: J48 decision trees, artificial neural networks (ANN), AdaBoost, and Random Forest.
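Our experiments use WEKA itself; purely as an illustration of the experimental setup, the sketch below runs rough scikit-learn stand-ins for WEKA's J48, ANN, AdaBoost, and Random Forest with the ten-fold cross-validation used in Section 4. X (one feature vector per replay) and y (1 if our player played the game, 0 otherwise) are assumed to come from the extraction step above.

    # Rough scikit-learn analogue of the WEKA experiments; these estimators are stand-ins,
    # not the classifiers the paper actually used.
    from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier

    def evaluate(X, y):
        models = {
            "J48-like tree": DecisionTreeClassifier(),
            "ANN": MLPClassifier(max_iter=2000),
            "AdaBoost": AdaBoostClassifier(n_estimators=100),
            "Random Forest": RandomForestClassifier(n_estimators=100),  # 100 trees, as in the abstract
        }
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=10)   # ten-fold cross-validation
            print("%-15s mean accuracy %.3f" % (name, scores.mean()))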
Different features represent different play styles and player preferences. Some players prefer to expand quickly, and the number of workers built in the first three minutes turns out to be more important than the time of appearance of the first Stalker (a military unit). Some features may not be used at all in some strategies. For example, our player never used Phoenixes (a flying unit) when playing against Terran, which means that all features related to the Phoenix contribute nothing to identifying this player. Each feature has different importance for different opponents as well as for different opponent strategies. Since very large decision trees are difficult to analyze, we needed to select the more important features out of the total 230 features in each scenario. In addition, discarding noisy and obfuscatory features increases classification accuracy. We therefore used WEKA's attribute evaluator to choose the features with the best predictive ability. This helped us find out which features are more important than others in identifying a specific player. Table 3 shows the best seven attributes for Protoss versus Protoss games; a small feature-ranking sketch follows the table.
1 | Worker count built in first 3 minutes |
2 | First time of using Chrono Boost |
3 | Blink count in 6 to 9 minutes |
4 | First Gateway built time |
5 | First Pylon built time |
6 | Zealot count built in first 3 minutes |
7 | Observer count built in 9 to 12 minutes |
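The selection above comes from WEKA's attribute evaluator; the sketch below shows an analogous information-based ranking using scikit-learn's mutual_info_classif, which is only a stand-in for WEKA's evaluator. X, y, and feature_names are assumed to come from the earlier extraction sketch.

    # Illustrative feature ranking; mutual information is a stand-in for the WEKA
    # attribute evaluator actually used in this work.
    from sklearn.feature_selection import mutual_info_classif

    def top_features(X, y, feature_names, k=7):
        scores = mutual_info_classif(X, y)                # how informative each feature is about the label
        ranked = sorted(zip(scores, feature_names), reverse=True)
        return [name for _, name in ranked[:k]]           # e.g. k=7, as in Table 3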
4. Results

Several machine learning algorithms were applied with the WEKA toolkit to obtain high identification performance. We use ten-fold cross-validation, and all results are reported on the test set.
4.1 Player Identification Performance
Experiments were conducted to obtain the best identification results. The first experiment uses all 230 features with the machine learning algorithms from Subsection 3.3. J48 achieves its highest accuracy with all features on PVP, lower accuracy on PVT, and the lowest on PVZ. The reason we get better results on PVP is that the PVP dataset is the largest, as shown in Table 1, while the PVZ dataset is the smallest. Since the PVP data coverage is better, our classifiers learn better and get better results. ANN does better than J48 on all three matchups. We also applied boosting and voting to improve identification performance, and AdaBoost and Random Forest give us even better results: the best PVP performance comes from AdaBoost, the best PVT performance from ANN, and the best PVZ performance from Random Forest. Figure 3 shows the results of J48, ANN, Random Forest, and AdaBoost with all 230 features.
These algorithms respond differently to different numbers of features. The reason for the difference is that a small subset of the features turns out to be good enough to build a tree that identifies the player, but a style the player used only three or four times may not be represented in a single tree. Therefore, among the decision tree methods, AdaBoost and Random Forest get better results than J48. For the ANN, we have enough input and hidden nodes, so the more attributes we provide as input, the better the results.
We also conducted experiments with only the best attributes chosen by WEKA's attribute selection. We obtained the seven best attributes for PVP (see Table 3). We can now see that the number of workers built in the first three minutes is very important in identifying this player, and the first time he uses Chrono Boost also differs markedly from other professional players. With only the best features, Random Forest gets the best results in all three categories: 87.6 percent on PVP, 83.3 percent on PVT, and 77.3 percent on PVZ. The reason is that each random tree in the forest selects a few features at random, and the pre-selected best features carry more information than a random handful drawn from all 230 features.
Figure 4 compares the accuracy of using all attributes and using only the best attributes for each algorithm and race. The neural network has higher accuracy when given all attributes, while the other algorithms have higher accuracy when given only the best attributes. PVZ shows that using only the best attributes performs better when the dataset is limited.
There are several reasons why we cannot reach 100 percent identification accuracy. Our approach uses historical game replays to extract features that help us identify a player's play style. If there is a strategy that this player and other players both prefer to use, and all the details they follow are the same, then the machine learning algorithms cannot tell the difference and cannot identify him from that strategy. This happens most often among top professional players, because they all want to copy the latest and best strategies. Another reason is that the player sometimes chooses an uncommon tactic that he used only once or twice; there are too few such cases for the machine learning algorithms to learn and recognize him. One misclassified case we examined was a tournament match in which the player used a cannon rush: the game ended in four minutes, and the entire build order and set of structures fit only that tactic, so all of the algorithms predicted wrongly in that case.
4.2 Rules and Strategies

The advantage of the J48 decision tree is not only that it is fast, but also that it can easily be converted into rules that people can understand, whereas ANN, AdaBoost, and Random Forest models are hard for people to interpret. The best four rules for different opponents were derived from the J48 trees. From the rules that identify this player, we can extract his unique "characteristics" relative to other professional players. The following is an example rule for PVP generated by the J48 decision tree: if a game's data satisfies this rule, the prediction is positive for this player; otherwise, it is negative (a sketch of the rule appears below).
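The exact rule is not reproduced here; the sketch below reconstructs its general shape from the thresholds discussed in the next paragraph and the 25-worker baseline mentioned in the Conclusion. The cut points and feature names (which follow the earlier extraction sketch; a Probe is the Protoss worker) are assumptions, not the tree's verbatim output.

    # Hedged reconstruction of the shape of the PVP rule; thresholds are assumptions taken
    # from the surrounding text, not the exact J48 output.
    def matches_example_rule(f):
        return (f["Probe_count_slice_0"] > 25          # more workers than others in the first 3 minutes
                and f["Gateway_first_frame"] < 6320    # first Gateway built early
                and f["Blink_count_slice_2"] < 12)     # used Blink sparingly in minutes 6 to 9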
From the rule, we can see that this player builds more workers than others in the first three minutes, he always builds his first Gateway earlier than frame 6320, and he uses Blink fewer than twelve times in minutes six to nine, while some other players in the training set use it more than twelve times. From such rules we can therefore infer strategies and play styles that distinguish him from other professional players.
We can also infer his strategy preferences against different opponents from the rules. His strategy versus Protoss differs from his strategy versus Terran and Zerg. The first time he uses Chrono Boost in PVZ is later than in PVP; the reason is that he prefers early pressure when his opponent is Zerg, so he shifts Chrono Boost from producing workers to producing army units. For PVT, the decision tree is spread across more leaves than for PVP and PVZ, which indicates that he has more strategic choices when playing against Terran; no one or two strategies dominate in PVT.
5. Conclusion and Future Work

In this paper we have presented our method for identifying a specific player from his replay history and achieved good performance. This shows that we can reliably identify a professional player among other professional players based only on his playing style, and it indicates that a professional gamer has his own distinctive playing style. The features extracted from replays capture a player's unique characteristics and the differences between him and other professional gamers.
The results also help us reveal the important features of a good player. We know that maximizing the economy is important in the early game, and a baseline could be building 25 workers in the first three minutes. Therefore, we can use this improved understanding to design a better AI player that plays like the professional player, and perhaps even defeats him with counter-strategies.