Darhuuk: I ended up training my network on between 100k and 300k example situations taken from xathis' games (version 1). When I tried using much smaller example sets (like 10k), I ran into overfitting: the validation set error was much larger than the training set error. When I tried using all the data I'd extracted from xathis' games (900k examples), I couldn't do the backpropagation in a reasonable amount of time. I converged to a solution within 200-500 iterations, and was consistently able to predict about 60% of xathis' moves on the validation set. Sometimes 58%, sometimes 62%, depending on my parameters. In the end I went with one of the neural nets that had seen the most data, which hit 58% on both the training and validation sets.
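The train-vs-validation comparison described above can be sketched in a few lines. This is not the actual network or feature encoding (the post doesn't give either); it's a toy softmax classifier over five moves on synthetic data, just to show the mechanics of holding out a validation set and reading the gap between the two accuracies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the extracted game situations: each row is a
# feature vector for the local board state, the label is one of five
# moves (N, E, S, W, stay). Shapes and sizes here are invented.
n_train, n_val, n_feat, n_moves = 2000, 500, 20, 5
W_true = rng.normal(size=(n_feat, n_moves))
X = rng.normal(size=(n_train + n_val, n_feat))
y = (X @ W_true + rng.normal(scale=2.0, size=(n_train + n_val, n_moves))).argmax(axis=1)

# Hold out a validation set so overfitting shows up as a gap between
# training accuracy and validation accuracy.
X_tr, y_tr = X[:n_train], y[:n_train]
X_va, y_va = X[n_train:], y[n_train:]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def accuracy(W, X, y):
    return float(((X @ W).argmax(axis=1) == y).mean())

# Plain batch gradient descent on a softmax (multinomial logistic)
# model; a few hundred iterations, in the same ballpark as the
# 200-500 iterations mentioned above.
W = np.zeros((n_feat, n_moves))
for it in range(300):
    P = softmax(X_tr @ W)
    P[np.arange(n_train), y_tr] -= 1.0   # gradient of cross-entropy: P - Y
    W -= 0.1 * (X_tr.T @ P) / n_train

train_acc = accuracy(W, X_tr, y_tr)
val_acc = accuracy(W, X_va, y_va)
print(f"train {train_acc:.2f}  validation {val_acc:.2f}")
```

With 2000 training examples the two numbers stay close; shrink `n_train` and the training accuracy climbs while validation accuracy drops, which is exactly the symptom described with the 10k example sets.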
This number was disappointing to me initially, but my most important discovery was that when I trained on only one-vs-one situations, which I assumed would be simpler, the percentage dropped to around 50%... because in a 1-vs-1 scenario, there are often two moves that are equally favorable...
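A tiny simulation makes the ~50% ceiling concrete: if the expert breaks a genuine tie at random, then even a predictor that knows both best moves and commits to one of them can only match the expert about half the time. The move names here are made up for illustration.

```python
import random

random.seed(1)

# Two moves are equally favorable; the expert's recorded move is an
# arbitrary tie-break, so matching it exactly is a coin flip.
trials = 10000
hits = 0
for _ in range(trials):
    expert_move = random.choice(["left", "right"])  # expert's arbitrary tie-break
    predicted = "left"                              # predictor's fixed tie-break
    hits += predicted == expert_move

acc = hits / trials
print(acc)  # hovers around 0.5
```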
I'm delighted to report that my neural network produces at least some apparently coordinated behavior, like walling off the mazes as in
http://tcpants.com/replay.2984 and occasionally surrounding groups of enemy ants before engulfing them. It does tend to err on the cautious side (a characteristic of xathis v1, I believe)... but all in all, it's not terrible. To clarify: as soon as my ants get within roughly 4 tiles of the enemy, the neural net assumes full control; I'm not doing anything else except checking for and avoiding collisions.
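The hand-off described above amounts to a small dispatch rule. In this sketch the grid, the helper functions, and the move encoding are all invented; only the two ideas from the post are real: the neural net takes over within roughly 4 tiles of an enemy, and every move is checked against already-claimed tiles to avoid collisions.

```python
COMBAT_RADIUS = 4  # roughly 4 tiles, as in the post

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def neural_net_move(ant, enemies):
    # Placeholder for the trained network: just step toward the nearest enemy.
    ex, ey = min(enemies, key=lambda e: manhattan(ant, e))
    dx = (ex > ant[0]) - (ex < ant[0])
    dy = 0 if dx else (ey > ant[1]) - (ey < ant[1])
    return (ant[0] + dx, ant[1] + dy)

def default_move(ant):
    # Placeholder for the non-combat logic (roaming, food, hills): drift east.
    return (ant[0] + 1, ant[1])

def choose_move(ant, enemies, occupied):
    """Neural net takes over near enemies; otherwise the normal logic
    runs. Either way, never step onto a tile already claimed this turn."""
    near_combat = any(manhattan(ant, e) <= COMBAT_RADIUS for e in enemies)
    move = neural_net_move(ant, enemies) if near_combat else default_move(ant)
    if move in occupied:
        return ant            # stay put rather than collide
    occupied.add(move)
    return move

occupied = set()
print(choose_move((0, 0), [(2, 0)], occupied))    # in combat range -> net's move
print(choose_move((10, 10), [(2, 0)], occupied))  # far away -> default logic
```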
Right now, I'm tweaking all the non-neural-net parameters and trying out different combinations of weights for roaming, enemy hill attack, backing up my friends in trouble, and so on...
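One common shape for that kind of weight tweaking is to score each candidate move as a weighted sum over the objectives and pick the maximum. The objective names below mirror the knobs mentioned above (roaming, hill attack, backing up friends), but the weights and numbers are made up for illustration; the post doesn't say how the combination is actually computed.

```python
# Hypothetical weights; tuning these is the "tweaking" described above.
WEIGHTS = {"roam": 1.0, "hill_attack": 3.0, "backup_friend": 2.0}

def score_move(features):
    """features maps objective name -> how well this move serves it (0..1)."""
    return sum(WEIGHTS[k] * features.get(k, 0.0) for k in WEIGHTS)

candidates = {
    "north": {"roam": 0.8},
    "east":  {"hill_attack": 0.6, "roam": 0.1},
    "south": {"backup_friend": 0.9},
}
best = max(candidates, key=lambda m: score_move(candidates[m]))
print(best)  # east: 3.0*0.6 + 1.0*0.1 = 1.9 beats 0.8 and 1.8
```

Raising or lowering a single weight shifts the whole bot's priorities, which is why this stage tends to be trial-and-error against real games.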
/Claes