psb217 wrote: One of the largest jumps in my bot's apparent performance came with the introduction of variable weighting for each vertex within a player's Voronoi region. Basically, I counted each vertex with 1 neighbor in the region as 0 points, each vertex with 2 neighbors as 1 point, and so on. I thought of this as representing the overall "robustness" of the region each player controlled; it implicitly includes the "move to the least-neighbored neighbor" heuristic while somewhat maximizing the degrees of freedom within a controlled region.
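That weighting scheme is simple enough to sketch in a few lines. Here's a rough Python illustration, assuming the region is given as a set of (x, y) grid cells (the representation and function name are mine, not the original bot's):

```python
def region_robustness(region):
    """Score a Voronoi region: each cell contributes
    (number of orthogonal neighbors also in the region) - 1 points,
    so a dead-end cell with one neighbor is worth 0, a cell with
    two neighbors is worth 1, and so on.

    `region` is a set of (x, y) cells -- a hypothetical representation.
    """
    score = 0
    for (x, y) in region:
        neighbors = sum(
            (x + dx, y + dy) in region
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        )
        # Clamp so an isolated cell (0 neighbors) scores 0, not -1.
        score += max(neighbors - 1, 0)
    return score
```

Note how this rewards bulky, well-connected regions over long corridors: a 2x2 block outscores a 3-cell corridor even though it has only one more cell.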
This is one of the reasons I believe my bot did so well, and I totally failed to mention it until now. For every square in the region a given player controls, I count both the square itself and its number of edges, with different weights. I wasn't sure what the best weights were, so I took a machine-learning approach: I wrote a utility (just a variant of my existing bot) which analyzes an arbitrary game when fed in end-to-start, plus a perl utility (a pastebin link was posted on this forum) to fetch an arbitrary game id from the contest site, optionally replaying the game backwards. My analysis tool found the point at which the players "separated", figured out the maximum fillable space in each region (ignoring how well the actual players filled it), and then output some statistics for every turn prior to that point. The goal was to find, given certain indicators about the controlled space, what the statistically likely endgame looks like.
I never came up with anything smarter than a simple, very approximate linear model: the difference in endgame moves after separation is roughly K1 * (player 1 fillable squares - player 2 fillable squares) + K2 * (player 1 sum of edges - player 2 sum of edges) + K3 * (irrelevant factor). "Fillable squares" is roughly 2*min(red_count, black_count), plus or minus an adjustment for the starting/ending squares, where red and black are the checkerboard colors of the squares in the region. "Sum of edges" is the total number of neighbors within the region. The irrelevant factor counted nodes likely to become articulation vertices. After fitting the linear model on ~12000 games played by top-100 players with a quick and dirty perl script (model1.pl... did I remember to commit that? I'll check), I found K1 to be about 0.050, K2 about 0.150, and K3 less than 0.009, so I consider the third factor pretty much irrelevant. I use integer scores, so I multiplied through by 1000; that was my heuristic before separation, with 10000*(difference of fillable area) as the heuristic after separation.
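Plugging in the fitted constants, the integer-valued evaluation described above might look something like this (function and parameter names are mine, for illustration):

```python
def pre_separation_score(p1_fillable, p1_edge_sum, p2_fillable, p2_edge_sum):
    """Integer evaluation before the players separate.

    Uses the fitted coefficients scaled by 1000 to stay in integers:
    K1 = 0.050 -> 50, K2 = 0.150 -> 150. The third term (likely
    articulation vertices) is dropped as statistically irrelevant.
    """
    return (50 * (p1_fillable - p2_fillable)
            + 150 * (p1_edge_sum - p2_edge_sum))


def post_separation_score(p1_fillable, p2_fillable):
    """After separation only the fillable-area difference matters,
    weighted heavily so it dominates the pre-separation scale."""
    return 10000 * (p1_fillable - p2_fillable)
```

The 10000 multiplier keeps any post-separation area advantage strictly larger than plausible pre-separation score differences, so the search never trades real territory for the softer connectivity score.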
I was kind of surprised that the sum of edges, whose raw value was already about three times larger, also got about three times more weight than the number of squares, but that's likely a peculiarity of the least-squares estimation, since the two are obviously highly correlated. I wanted to do some more statistical work on board evaluation but didn't really have time. The model is pretty sketchy, to be honest: the graph of the raw data is a huge, barely correlated blob. But picking numbers based on data is still a lot better than picking them randomly.
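For anyone curious, the fitting step itself is just ordinary least squares. A minimal sketch in Python/NumPy (the original used a perl script over ~12000 real games; the feature matrix here is synthetic, generated from known coefficients purely to illustrate the mechanics):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # stand-in for the game sample

# Columns: fillable-square diff, edge-sum diff, articulation-vertex diff.
# Real features would be correlated (squares vs. edges); synthetic ones
# here are independent, so the fit is better behaved than the real one.
X = rng.normal(size=(n, 3))
true_k = np.array([0.050, 0.150, 0.0])
y = X @ true_k + rng.normal(scale=0.01, size=n)  # noisy endgame-move diff

# Least-squares fit recovers K1, K2, K3.
k, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With correlated features, as in the real data, the individual coefficients become unstable even when the combined prediction stays reasonable, which is consistent with the surprising K1/K2 split described above.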