Season 12 of the Top Chess Engine Championship saw the arrival of NN engines, modeled after DeepMind's AlphaZero findings, into public computer chess events. In that season LCZero ran on CPU only. In Season 13 it was given a GPU, and just one season later the neural network won its first major title.
Besides LCZero, one more neural network based engine entered TCEC: Scorpio NN by Daniel Shawul. And even though it did not make it past Division 4, it marked a trend in computer chess: people had recognized the potential value and advantages of neural networks in chess.
Season 15 is about to start, and one more neural network based engine is entering TCEC. It is a combined project by Adam Treat, author of Allie, and Mark Jordan, author of Leelenstein. The new engine, called Allie+Stein, qualifies as a unique engine under the TCEC rules and will start its quest for the top positions and its climb up the ladder from Division 4.
Here is an extensive interview with the authors Adam Treat and Mark Jordan.
Your engine Allie+Stein will be the new neural network in TCEC, making its debut this Season 15. Welcome to the Top Chess Engine Championship!
Adam Treat: Allie is a very new chess engine, but represents a lot of hard work over the last several months, so I’m hoping TCEC can provide an opportunity to see how she stacks up against a host of more established engines. Combining Allie with the Leelenstein network will also be interesting given that both introduce new avenues of research in NN chess engines. I am new to the chess programming community and culture, and so I am excited to participate and learn through TCEC.
Mark Jordan: I am excited to see how far an engine that uses supervised learning (SL) exclusively can go and I hope there will be more strong SL networks to compare against for benchmarking average performance and maximum performance of the method.
So far there have been two unique NNs in TCEC – LCZero and Scorpio. What makes your engine the third unique NN engine by the TCEC rules?
AT: Like Leela, Allie is based on the same concepts and algorithms that were introduced by DeepMind in the AlphaZero paper(s), but her code is original and contains an alternative implementation of those ideas. You can think of Allie as a young cousin of Leela that can utilize the same networks produced by the Lc0 project or other compatible networks. The Leelenstein network is also a novelty in that it introduces supervised learning into the TCEC competition. Finally, Allie+Stein will be using MCTS for the beginning portions of the tournament, but I’m hoping to switch to AlphaBeta search during later rounds… if she makes it that far :)
MJ: Allie+Stein is a completely newly produced engine and neural network, thus easily satisfying two out of the three conditions for uniqueness. It is possible that I will eventually rewrite the training scripts completely with some more new ideas in the future. Currently, the training includes some changes to the Leela training scripts.
Can you share more on how the Allie+Stein engine is being trained?
MJ: Training started from a randomly initialized network and consisted of mostly CCRL computer games (I used all of them available), some weak Leela t30 games, and some games from other experiments I was able to gather. These were all about 100 elo weaker than t10. I tried several new learning techniques using these same games. My goals have mostly been focused on NN learning experimentation. I compared the learning schedule and optimizer that Leela used to different ones from some academic papers, and that seemed to improve performance, but the result was still about 50 elo weaker than the best Leela nets. So I replaced all the weak t30 games with many more recent ones, using all of January’s games, while still keeping the CCRL games in the training window, and continuing on several more of my cyclical learning rate cycles.
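The cyclical learning rate schedule mentioned here can be sketched in a few lines. The following is a minimal illustration of a triangular cycle (the learning rate ramps from a base value up to a maximum and back down over each cycle); the specific parameter values and the triangular shape are assumptions for illustration, not taken from the actual Allie+Stein training scripts.

```python
def cyclical_lr(step, base_lr=1e-4, max_lr=0.4, cycle_len=10000):
    """Triangular cyclical learning rate (illustrative values).

    The rate rises linearly from base_lr to max_lr over the first half
    of each cycle, then falls back to base_lr over the second half.
    """
    cycle_pos = (step % cycle_len) / cycle_len   # position in [0, 1)
    tri = 1.0 - abs(2.0 * cycle_pos - 1.0)       # 0 at cycle ends, 1 at midpoint
    return base_lr + (max_lr - base_lr) * tri
```

Restarting the cycle repeatedly (rather than decaying the rate monotonically) is a common way to shake a network out of shallow minima when new data, such as fresher t30 games, is added to the training window.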
And how about search?
AT: To begin the tournament, Allie will perform MCTS based search with absolute FPU, where new nodes start off with a win percentage of -1. The search is modeled after DeepMind's papers. As I said above, I’m hoping to switch to AlphaBeta for the long term direction of the project. I’ve experimented with many, many ways of doing this with the networks generated by the Lc0 project, and I think I’ve hit upon a way to achieve the depths required to maintain elo level with MCTS based search, but it is not ready yet. In the future, I imagine we’ll see a lot of experimentation with different variations of search (MCTS, AB) + eval (handwritten, NN) in computer chess engines. I am hoping to be a part of that and to contribute to the shared pool of knowledge.
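The "absolute FPU" idea above can be illustrated with a small sketch of PUCT child selection: an unvisited child is scored as if it were a loss (Q = -1), instead of inheriting its parent's value estimate. This is a simplified sketch only; the field names, data layout, and c_puct value are assumptions, not Allie's actual code.

```python
import math

def select_child(children, c_puct=1.5, fpu_value=-1.0):
    """PUCT selection with absolute first-play urgency (FPU).

    Each child is a dict with 'prior' (policy probability), 'visits',
    and 'value_sum'. Unvisited children get Q = fpu_value (-1, a loss),
    so exploration is driven entirely by the prior-weighted U term.
    """
    total_visits = sum(ch["visits"] for ch in children)
    sqrt_total = math.sqrt(max(1, total_visits))
    best, best_score = None, -math.inf
    for ch in children:
        if ch["visits"] == 0:
            q = fpu_value  # absolute FPU: assume unexplored move is lost
        else:
            q = ch["value_sum"] / ch["visits"]
        u = c_puct * ch["prior"] * sqrt_total / (1 + ch["visits"])
        if q + u > best_score:
            best, best_score = ch, q + u
    return best
```

The practical effect is that the search only tries an unvisited move when its policy prior is large enough to overcome the pessimistic Q, which keeps the tree narrow and deep.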
What is the strength of Allie+Stein? What division do you expect to reach? Do you think stronger versions will come out as the season progresses and thus have a better shot at the Premier Division?
MJ: I think the neural network seems to scale pretty well with more nodes. In bullet testing it loses to Stockfish quite a bit, but seems to hold its own in tournaments with more time. I believe it has potential to get to division 1, but will require some more work to better utilize multiple GPUs and support tablebases. I have some ideas to use tablebases in learning as well that could end up making the neural network Premier Division material. But for this season I will be happy even with just getting to Division 2.
AT: This is a question I get quite often. Most people familiar with chess engines know that it is extremely hard to compare strengths other than through head-to-head competition with a set of match rules/controls in place. That is exactly what TCEC provides: a levelish playing field and a set of machines/rules agreed upon beforehand to determine the relative strength of different algorithms and their implementations. So with a huge grain of salt I’ll say that on my own rather meager hardware I have seen head-to-head matches between Allie and Lc0 at short time controls with the same network show a relative difference of 50-100 elo in Lc0’s favor. Obviously, that is with only one GPU. Throw in the advanced multi-GPU hardware here at TCEC and that is a very big new variable. Add the other engines and we have another large unknown variable. Add in that we’ll be using a supervised learning network, and that is yet another big variable. Then we have the fact that Allie will start off the tournament without TB support. Just lots and lots of variables. So, OK, I expect she will be able to advance out of Div 4, but anything is possible. I’ll be happy if she is able to get winning positions and mate consistently in them :)
Compared to the current top NN engine, Lc0, do you expect your approach to have a better future?
MJ: The beauty of my approach of using only existing games is that I can train whole new networks from scratch in a week or two to try a new idea, something that would take the Lc0 project several months, even with 100x or 1000x more compute than I have. I hope to eventually show that some of my ideas are good by how strong the network is, and Lc0 can try them to become stronger as well. And I will continue to adopt good ideas from there too. The Lc0 project is also a great resource for people who need many millions of strong games, as it already has many more games available than all of CCRL even though it is much younger. And sadly fishtest doesn’t store and host all of its games. So I expect to see some great symbiosis.
AT: I expect that whatever advances are made by one engine will (with time and effort) be incorporated into other engines. As they should be! We are all standing on the shoulders of the pioneers in computer science and chess engine programming that came before. I do think that AlphaBeta is a superior search method and that there is no reason an AB+(eval method) engine can’t compete favorably with an MCTS+(eval method) engine, regardless of the eval method. But this is just my theory and worth very little until proven. Only the future will tell.
What are your thoughts on the current hardware balance at TCEC? Are you happy with the hardware your engine is going to play on?
MJ: There are always debates about hardware balance, and I think a wide range of power levels is somewhat fair; as long as the specifications and NPS numbers are published and maintained throughout the tournament, it is a reasonable tournament data point. The balance seems to be within a factor of two of fair, whether measured by purchase price, watts, or total cost of ownership, which I think is as close as it can possibly get to pleasing everyone. But it does leave room for future debate and improvement. 2x the power or cost doesn’t translate to 2x the nodes, so the real elo difference is probably not enough to change what division any engine ends up in, in most cases, even if it changes a result or two. The format is not guaranteed to find 20-30 elo differences anyway, so I don’t see it as a big issue except during boring games :)
AT: Considering that I’ve never personally tested Allie on this level hardware, sure I’m happy. I just hope she scales well enough.
Stockfish has been dominant at TCEC for almost a year now.
AT: Stockfish is the strongest engine until proven (convincingly) otherwise. I have great respect for all the developers who work on Stockfish (and other engines for that matter) and think the community of chess programmers is pretty collegial.
MJ: I very much support non-private engines, so I am glad to see that there has never been a private engine to win the Superfinal. And even more glad that it seems that free and even open-source projects can win. I am glad such excellent chess is truly accessible to all.
Alexander Lyashuk from Lc0 shared in an interview that he expects at least 5 NNs to appear and to dominate computer chess. Do you share this vision? Do you expect one of those NN engines to be Allie+Stein?
AT: I do think it is only a matter of time before NN eval is shown to beat handwritten eval regardless of the search method. Right now, if you limit the number of nodes – hands down – any of the NNs will beat the traditional handwritten eval engines. This is just a fact. Still, I have great respect for the ingenuity of those writing the traditional engines. I hope Allie+Stein will be a meaningful engine in terms of helping to advance the state of the art in computer chess programming in the near future.
MJ: As I mentioned above, I am excited to see more projects, and I hope there will be enough of them that we have to pick the most exciting and unique ones. We can use them to develop more tests to determine how unique they are in their ideas about openings and in general play, and which ones are truly beyond average. Of course I hope one of those beyond-average engines is mine, so the top 5 sounds nice!