I am not sure about the rating system used in the contest. Two things seem strange to me.
1) It seems to me that the most recent games count much more than games played earlier. Suppose two opponents play only each other, 1000 games in total: does winning the first 700 games guarantee the better final rating?
2) When a game ends in a draw, the lower-rated player gains rating while the higher-rated player loses it. This is probably OK, as long as games that end because of server issues are not counted (are they?).
This way, if xiathis drew his last few games, his rating could drop below that of the second-best bot.
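
To make both points concrete, here is a minimal sketch that assumes the contest uses a standard Elo-style update. I don't know the actual formula, so the K-factor of 32, the 400-point scale, and the function names are my assumptions and not the contest's code.

```python
def expected(ra, rb):
    """Expected score of the player rated ra against the player rated rb (Elo logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))


def elo_update(ra, rb, score_a, k=32.0):
    """Return the new ratings after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss) for player A."""
    delta = k * (score_a - expected(ra, rb))
    return ra + delta, rb - delta


def play(results, ra=1500.0, rb=1500.0, k=32.0):
    """Apply a sequence of results (from A's point of view) one game at a time."""
    for score_a in results:
        ra, rb = elo_update(ra, rb, score_a, k)
    return ra, rb


# Issue 2: a draw between unequal players moves points to the lower-rated one.
higher, lower = elo_update(1700.0, 1500.0, 0.5)

# Issue 1: the same 7 wins and 3 losses, in two different orders.
wins_first = play([1] * 7 + [0] * 3)
wins_last = play([0] * 3 + [1] * 7)
```

Under these assumptions, `higher` ends up below 1700 and `lower` above 1500, and A's final rating in `wins_last` is higher than in `wins_first`, even though the overall score is the same.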
I don't know how the rating system should work.
I would consider keeping a pairwise score history for each pair of players and updating ratings only to reflect changes in those scores.
It seems to me that such a system could be set up so that, with only two players, the order of games does not matter.
I am not sure how it would work for more players, but I hope it would add stability to the system. It surely is an interesting topic for further study ...
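
For the two-player case, the idea could be sketched roughly like this. It is only one possible reading of it: the class name `PairwiseHistory`, the `prior_games` smoothing, and the use of the inverse Elo curve to turn a score fraction into a rating gap are all my assumptions.

```python
import math
from collections import defaultdict


class PairwiseHistory:
    """Keeps only accumulated totals per pair, so the order of games cannot matter."""

    def __init__(self):
        # (a, b) with a < b  ->  [points scored by a, games played]
        self.totals = defaultdict(lambda: [0.0, 0])

    def record(self, a, b, score_a):
        """Record one game; score_a is 1 (win), 0.5 (draw) or 0 (loss) for player a."""
        if a > b:
            a, b, score_a = b, a, 1.0 - score_a
        entry = self.totals[(a, b)]
        entry[0] += score_a
        entry[1] += 1

    def rating_gap(self, a, b, prior_games=2.0):
        """Rating of a minus rating of b, derived from the pair's totals only.

        prior_games adds imaginary drawn games (an assumption) so that a
        perfect score does not give an infinite gap.
        """
        flip = a > b
        if flip:
            a, b = b, a
        points, games = self.totals.get((a, b), (0.0, 0))
        p = (points + prior_games / 2.0) / (games + prior_games)
        gap = 400.0 * math.log10(p / (1.0 - p))
        return -gap if flip else gap


# The same 7 wins and 3 losses for A, recorded in two different orders.
h1, h2 = PairwiseHistory(), PairwiseHistory()
for s in [1] * 7 + [0] * 3:
    h1.record("A", "B", s)
for s in [0] * 3 + [1] * 7:
    h2.record("A", "B", s)
```

Here `h1.rating_gap("A", "B")` and `h2.rating_gap("A", "B")` come out identical, because only the totals are used. For more than two players the ratings would have to be fitted to all pairwise totals at once, which this sketch does not attempt.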