From a previous post:

One of the problems with most of the games people use to measure AIs is that the outcome of a contest is subjective. The metric that determines the winner is not transitive, so the ordering is not stable when new participants are added, and the order in which games are judged affects the outcome.

Consider the simple case where A always beats B, B always beats C, but C always beats A. If not all possible games are played, say only A vs. B and then B vs. C, then C is rated the loser simply because it played last and never got its game against A. A statistical sampling would most likely find a winner and a loser where no such relationship actually exists.
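To make the A/B/C case concrete, here is a minimal sketch. The `BEATS` table, the `winner` and `win_counts` helpers, and the two schedules are made up for illustration; they are not taken from any real contest engine.

```python
from collections import Counter

# Hypothetical fixed results: each (winner, loser) pair always ends this way.
BEATS = {("A", "B"), ("B", "C"), ("C", "A")}

def winner(p, q):
    """Return the winner of p vs q under the cyclic results above."""
    return p if (p, q) in BEATS else q

def win_counts(schedule, players=("A", "B", "C")):
    """Count wins per player over a list of (p, q) matches."""
    counts = Counter({p: 0 for p in players})
    for p, q in schedule:
        counts[winner(p, q)] += 1
    return dict(counts)

partial = [("A", "B"), ("B", "C")]   # C never gets to play A
full = partial + [("C", "A")]        # every pairing played once

print(win_counts(partial))  # {'A': 1, 'B': 1, 'C': 0}
print(win_counts(full))     # {'A': 1, 'B': 1, 'C': 1}
```

With the partial schedule C looks like the weakest player; with the full schedule all three are tied, so the "loser" was produced by which games happened to be played.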

Next, think about what happens even when all possible outcomes of player vs. player on all possible maps are actually counted. The way we determine the winners is to sort the players by number of wins. If this were a good objective metric, we would expect that adding just one more player to the mix would slot that player in at his level of ability, while the relative positions of the other players would stay the same or shift by one. But this is not the case with head-to-head games like Galcon. Consider a bot that beats the top-rated player 100 percent of the time and beats the other players zero percent of the time. When this bot is inserted into the list, the top-rated player suddenly has losses equal to the number of games played against the new bot, while the other bots gain wins, and thus the order is unstable. The metric therefore depends on the exact set of players and is not an objective relationship.
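A rough sketch of that instability, with made-up numbers: three bots play 10 games per pairing, then a hypothetical spoiler bot X is added that always beats P1 and always loses to everyone else. The bot names, the win counts, and the `rank_by_wins` helper are assumptions for illustration only.

```python
def rank_by_wins(wins):
    """wins[(p, q)] is the number of games p won against q.
    Return the players sorted by total wins, most wins first."""
    totals = {}
    for (p, q), n in wins.items():
        totals[p] = totals.get(p, 0) + n
        totals.setdefault(q, 0)  # make sure players with no wins still appear
    return sorted(totals, key=lambda p: totals[p], reverse=True)

# Hypothetical results, 10 games per pairing.
base = {
    ("P1", "P2"): 6, ("P2", "P1"): 4,
    ("P1", "P3"): 7, ("P3", "P1"): 3,
    ("P2", "P3"): 8, ("P3", "P2"): 2,
}

# Hypothetical spoiler X: beats P1 every time, loses to everyone else.
spoiler = {
    ("X", "P1"): 10, ("P1", "X"): 0,
    ("X", "P2"): 0, ("P2", "X"): 10,
    ("X", "P3"): 0, ("P3", "X"): 10,
}

print(rank_by_wins(base))                  # ['P1', 'P2', 'P3']
print(rank_by_wins({**base, **spoiler}))   # ['P2', 'P3', 'P1', 'X']
```

Nothing about P1, P2, or P3 changed between the two runs, yet P1 drops from first to third just because X joined the pool.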

TLDR: sorting non-transitive games is like using a faulty compare function in a sort routine.
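The sort analogy can be tried directly. This is only a sketch: `compare` is a deliberately non-transitive comparison built from the same hypothetical A/B/C cycle, fed to Python's standard `sorted` through `functools.cmp_to_key`.

```python
from functools import cmp_to_key
from itertools import permutations

# Hypothetical cyclic results: A beats B, B beats C, C beats A.
BEATS = {("A", "B"), ("B", "C"), ("C", "A")}

def compare(p, q):
    """Non-transitive comparison: the winner of p vs q sorts first."""
    if p == q:
        return 0
    return -1 if (p, q) in BEATS else 1

def distinct_orderings(players=("A", "B", "C")):
    """Sort every input permutation and collect the distinct results."""
    return {
        tuple(sorted(perm, key=cmp_to_key(compare)))
        for perm in permutations(players)
    }

# A consistent compare function would yield exactly one ordering here.
print(distinct_orderings())
```

With a valid compare function the set would contain a single ordering. With the cyclic one, the "sorted" result depends on the order the players were listed in, which is the same effect as the ranking depending on who played whom and when.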

Next:

There are many ways to subvert the Elo judging system. A simple one is for multiple bots to collude to boost the rating of another. Note that it is not necessary to be able to beat bocsimako's bot in order to be successful! All you need to do is enter a large number of bots which feed wins up the chain to a master bot; I won't explain how to do it here, but I will claim it is possible to do without detection and with as much influence on rank as you wish. In particular, the matching system, which attempts to pigeon-hole bots into a rank, magnifies the ability of colluders to do this.

There are other ways to game this system...

Why have I not done this? Because I am not a cheater. There are, however, vast masses of people who would cheat joyfully; they do it in online games and even form syndicates that swarm games and destroy them by exploiting stuff like this.

Finally:

Why is fighting bots head-to-head so important to you? Why do you need the games to be adversarial?

Statistics: Posted by krokkrok — Wed Dec 01, 2010 7:54 pm

