After competing in serveral Bot Contests in CodinGame, I started to learn meta-heuristics(aka modern heuristics). I implemented an evolutionary algorithm (lamba, mu) in the Coders Strike Back multi-game(ranked 55 at time of writing this post). I find that designing the evaluation function is most tough and also creative part when we adopt a meta-heuristic based approach.
In this post, I will share some insights gained from crafting the evaluation function in the CSB game.
Meta-heuristics (such as genetic algorithms, hill-climbing, simulated annealing) are used for problems where you don’t know how to find a good solution, but if shown a candidate solution, you can give it a grade.
To grade a solution, we use Evaulation Function.
Most of time, we can have mutliple stages in a game such as early game, middle game and late game. A good candidate solution in the early game is not necessarily a good one in late game. In addition to stages, we can also stand in differnt positions such as winning, losing positions. If you are winning, you may want to enforce your position by being more defensive. On the contrary, if you are losing, it’s better to risk all your resources to reverse that situation.
Therefore, avoid applying the same and universal evaluation function all the time. Make it situational.
Don’t use huge numbers because we are not good at reason about huge numbers.
For example, we want to evaluate the distance between two objects. They can be tens of thousands meters far away from eatch other. How to transform the distance to small numbers?
My favorite function is the expotention function with a small base. For instance, Math.pow(0.999, 1234) ~= 0.3. It feels much better to compare 0.3 with 0.9 instead of comparing 15123 to 53492.
Don’t just look at the formula. Open your debugger, see and feel the numbers to make sure they are relavant. Feeling those numbers helps a lot to tune your formula.
Instead of using positive numbers for bonus and negative numbers for malus, always think in terms of positive score. Only in the end, subtract malus from bonus. This helps to you to stick with one reference system.
There are still many areas that I need to explore such as how to weight different coeffcients, how to design multi-objective functions. I will add more lessons here when I play and win more games.