[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 483: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 112: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/bbcode.php on line 112: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead
[phpBB Debug] PHP Warning: in file /includes/functions.php on line 4586: Cannot modify header information - headers already sent by (output started at /includes/functions.php:3765)
[phpBB Debug] PHP Warning: in file /includes/functions.php on line 4588: Cannot modify header information - headers already sent by (output started at /includes/functions.php:3765)
[phpBB Debug] PHP Warning: in file /includes/functions.php on line 4589: Cannot modify header information - headers already sent by (output started at /includes/functions.php:3765)
[phpBB Debug] PHP Warning: in file /includes/functions.php on line 4590: Cannot modify header information - headers already sent by (output started at /includes/functions.php:3765)
AI Challenge Forums • View topic - Any ideas how to make ranking less random?

It is currently Tue Dec 12, 2017 9:35 pm Advanced search

Any ideas how to make ranking less random?

Random stuff about the contest, posts that don't fit in the other forums.

Re: Any ideas how to make ranking less random?

Postby dimkadimon » Wed Dec 01, 2010 4:36 am

There is also some non-trivial clustering happening in the current system. Bots with similar strategies end up with a similar rating and hence are more likely to play each other. This only brings the clusters closer together. In order to break the clusters apart, bots should be paired with other bots that are further away. I am sure that some of the bots in the top 10 can lose to bots that are ranked 100-300, but unfortunately such matches are very rare. I've noticed these effects with my bot. My bot has a high chance of losing against weaker bots, but at the same time it can beat strong bots. After the first day of the final tournament my bot was ranked around 340. After two days it was around 200 and now, after 3 days it is 130.
dimkadimon
Major-General
 
Posts: 263
Joined: Wed Oct 06, 2010 11:34 pm
Location: Adelaide, Australia

Re: Any ideas how to make ranking less random?

Postby Fritzlein » Wed Dec 01, 2010 5:42 am

Clustering can indeed happen, but when ratings are calculated with BayesElo it isn't a problem. It is true that a group of identical bots could become tightly linked to each other by their results, but that means that when a bot within the group loses to a bot outside the group, the whole group moves down together. Similarly when a bot within the group beats a bot outside the group, the whole group moves up together. In short, what happens to a group of identical bots is basically what should happen to that group.

The least random, most accurate ranking would be created if every bot played every other bot on every map in the set. If this full round-robin were played, then there could be no concern about intransitivity conflicting with Elo ratings, since the Elo ratings would rank players exactly the same way as round-robin won-loss records. Unfortunately, resource constraints make a full round-robin impossible, so the decision before us is what subset of games to play.

The first set of games we want to throw out is mismatches. These games give very little information per game. Yes, it is true that an 800-point rating gap doesn't always mean a 100:1 chance of winning. However, if the true chance of winning is 97:4 instead, i.e. an implied 554-point Elo gap, we have played 101 games to discover a score difference of only 3 victories.

Contrast this to a close match where the ratings are supposedly only 118 points apart, for an implied winning ratio of 67:34. In the previous example our rating estimate was off by 246 points and that changed the score by only 3 victories. But here if the estimate is off by only 59 elo, i.e. if the true gap is 59 points, that changes the score by 8 victories to 59:42. We have played the same 101 games in light of a smaller rating error and have more change in score to show for it. That's good efficiency.

We all know that actual game results will not be transitive in the way that Elo ratings imply, but to get the most bang for the buck in detecting that intransitivity, we should not play a lot of mismatches. Instead we should play lots of close matches to make the most efficient use of server resources.

The second set of games we want to throw out is repeated pairings. It might turn out that for whatever reason a lower-rated bot scores well against a particular higher-rated bot. When this pairing occurs, it is a little bit of bad luck for the higher-rated bot and a little bit of good luck for the lower-rated bot. In a round-robin format, this kind of good and bad luck will even out over all, but if we allow repeated pairings before all bots have had a chance to play all other bots, we will amplify the luck inherent in opponent selection, thus making the final rankings more random. That's inefficiency.

Unfortunately, these two principles of efficient pairing are contradictory. If we avoid repeat pairings, we will be forced to have big mismatches. If we avoid big mismatches, we will be forced to have repeat pairings.

My opinion is that taking an extreme stance in either direction is suboptimal. The most efficient use of limited server resources will be to find some sort of balance. For example, one could absolutely avoid repeat pairings until that avoidance forced us into mismatches of 500 Elo or more. If (and only if) the only non-repeat pairings are outside of 500 Elo, then repeat pairings should be allowed.

My speculation on efficient pairing is quite separate from the issue of focusing server resources on the higher-ranked bots later in the tournament. The gradually shrinking field that has been implemented is a good idea independent of exactly how the remaining players in the field are paired.

Finally, as Janzert said, this is a very hard measurement problem. The fact that places #2 through #5 are churning doesn't mean the tournament is doing a bad job. Given two bots that are in truth 20 Elo points apart, you can let them play 100 times, and the better one is only expected to be ahead 53:47. That's only 0.6 standard deviations from equal. Or, to put it another way, the wrong bot will be ahead about 27% of the time even after 100 games. Obviously the more games you play, the better the estimate becomes, and the more the ratings settle down. You won't see nearly as much churning on the final day, after the top bots already have hundreds of games played each.
Fritzlein
Colonel
 
Posts: 81
Joined: Thu Feb 18, 2010 9:20 pm

Re: Any ideas how to make ranking less random?

Postby krokkrok » Wed Dec 01, 2010 6:14 pm

The answer to this problem is easy. Pick contests that can be perfectly and transparently measured instead of games where the results are a product of the matching algorithm. As it stands, the matching algorithm could be manipulated to put almost any bot at any position you wanted.

A really smart AI would spend its time gaming the judge rather than its opponents... :twisted:

That we tolerate this possibility is a real shame.
krokkrok
Major
 
Posts: 38
Joined: Mon Sep 13, 2010 2:48 am

Re: Any ideas how to make ranking less random?

Postby george » Wed Dec 01, 2010 7:04 pm

krokkrok - you are kidding, right?
this galcon-type game is excellent for testing AIs. - what game is perfectly measured? - chess isn't - maybe a running race? if you want an adverserial game, it's likely to be hard to rank.
sure there is noise in the rankings, ELO isn't quite the best measure because of higher chance of losing against bots ranked much lower, but it's small enough.

and hey, if you had a way to beat bocsimacko's bot, by somehow 'gaming the judge' why didn't you submit it?
george
Lieutenant
 
Posts: 16
Joined: Mon Mar 08, 2010 4:19 pm

Re: Any ideas how to make ranking less random?

Postby krokkrok » Wed Dec 01, 2010 7:54 pm

I am not joking.
You claim "this galcon-type game is excellent for testing AIs". Can you back this up?

From a previous post:
One of the problems with most of the games people use to measure AI's is that the outcome of a contest is subjective. The metric which determines the winner is not transitive so the ordering is not stable when new participants are added and the order of judgments impacts the outcome.

Consider the simple case where A beats B, B beats C, but C beats A, all the time. If all possible games are not played, the judgement will of A plays B, B plays C will have the outcome that C is rated the loser simply because it played last. A statistical sampling would most likely find a winner and a loser where that relationship does not actually exist.

Next think about what happens, even in the case where all possible outcomes of players vs player on all possible maps are actually counted. The way we determine the winners is to sort the players by number of wins. If this is a good objective metric, we would expect that if we added just one more player to the mix, that player would be slotted at his level of ability, and the relative position of other players would remain the same or different by one. But this is not the case with head-to-head games like Galcon: Consider the case where there is a bot that can beat top rated player %100 percent of the time, and other players zero percent of the time. When this player is inserted into the list, top rated player suddenly has losses equal to the number of games played against the new bot, while other bots have wins, and thus, the order is unstable. Thus, this metric is subjective to the exact set of players, and is not an objective relationship.


TLDR: sorting non-transitive games is like using a faulty compare function in a sort routine.

Next:
There are many ways to subvert the ELO judging system. One simple one is where multiple bots collude to boost the rating of another. Note that it is not necessary to be able to beat bocsimako's bot in order to be successful! All you need to do is enter a large number of bots which feed wins up the chain to a master bot; I won't explain how to do it here, but I will claim it is possible to do without detection and with as much influence on rank as you wish. In particular, the matching system which attempts to pigeon-hole bots into a rank magnifies the ability of colluders to do this.
There are other ways to game this system...
Why have I not done this? Because I am not a cheater. There are however vast masses of people who would cheat joyfully; they do it in online games and even form syndicates that swarm games and destroy them by exploiting stuff like this.

Finally:
Why is fighting bots head-to-head so important to you? Why do you need the games to be adversarial?
krokkrok
Major
 
Posts: 38
Joined: Mon Sep 13, 2010 2:48 am

Re: Any ideas how to make ranking less random?

Postby george » Wed Dec 01, 2010 9:09 pm

hi krokkrok,
well, i used (or tried using) heuristics, nnets, minimisations including poor man's GA, cheap alpha-beta. the problem required tricks of this type for me to do well. i'd say that these are good uses of AI. at any rate, the problem was rich enough that complicated algos (see eg iouri's writeup) were needed.

again, to take a well-known example, chess has the same issues regarding transitivity. i'm prepared to live with this minor issue.

ok, i see your point on collusion between bots. i had ignored that. my guess is it's not happening here - hey, i'm currently in the first 10 (only just though) and haven't needed to cheat to get there. many of the top bots have their code available - why don't you play with them on your own machine, and see that they win their games fairly?

why do i need games to be adverserial? well, do many things, some of which (eg orienteering, rockclimbing, project euler) are not adverserial. but adverserial games can be fun, and they add extra richness as you have to directly interact with, and react to, opponents.

have you looked at http://projecteuler.net, or http://kaggle.com - maybe you'd be happier in that area?
george
Lieutenant
 
Posts: 16
Joined: Mon Mar 08, 2010 4:19 pm

Re: Any ideas how to make ranking less random?

Postby bocsimacko » Thu Dec 02, 2010 12:56 pm

bocsimacko
Major
 
Posts: 36
Joined: Wed Feb 10, 2010 10:06 pm

Re: Any ideas how to make ranking less random?

Postby george » Thu Dec 02, 2010 3:14 pm

me? i will do, havent got round to it yet though. possibly the things i tried, but failed, to use are the more interesting.
mainly, i had a 'choose moves' function that was quick, and could be called a few hundred times.

i ran this with a few choices of parameters (normal, normal but do nothing first go, aggressive, timid, etc) vs an 'opponent' with 2 sets of parameters (normal, normal BUT do nothing on first go), for 30 turns. then i picked my best results against the worst the 'opponent' could do.

i briefly looked at iouri's writeup, and yours, much of what they said seemed similar.
i'll write something soonish.

btw, congrats on winning, and so convincingly. i was almost sure you were using that monte-carlo method your blog mentions - i couldnt see how you could be so good using just methods i had. thanks for your writeup - i might even try reading the code too .)
george
Lieutenant
 
Posts: 16
Joined: Mon Mar 08, 2010 4:19 pm

Re: Any ideas how to make ranking less random?

Postby mogron » Thu Dec 02, 2010 6:34 pm

krokkrok, I do not get your complaining. There are AI competitions that are much closer to your ideals - like the IPC, or some branches of RoboCup Rescue, and more, I'm sure. I think many people enjoy the adversarial nature of this competitions problems, and don't really care about exact objective rankings, as long as the winner seems to be justified.
mogron
Lieutenant
 
Posts: 18
Joined: Mon Sep 20, 2010 9:07 am

Re: Any ideas how to make ranking less random?

Postby Fritzlein » Fri Dec 03, 2010 12:16 am

Fritzlein
Colonel
 
Posts: 81
Joined: Thu Feb 18, 2010 9:20 pm

PreviousNext

Return to Misc

Who is online

Users browsing this forum: No registered users and 2 guests

cron