With a 99.41% win rate against human Xiangqi players, has AI really beaten humans this time?

This time, AI beat humans again.

A research team led by Huawei Cloud AI CTO Dai Zonghong and Peking University AI Institute Assistant Professor Yang Yaodong has developed an algorithm that beats human opponents at Xiangqi (Chinese chess) with a 99.41% win rate. It is called JiangJun (Chinese for "general").

The related research paper, titled “JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games”, has been published on the preprint website arXiv.

Playing against opponents and improving through repeated trial and error is the usual way that reinforcement-learning agents evolve. In recent years, because real-world settings usually involve several agents at once, researchers have extended their focus from single-agent to multi-agent problems.

Multi-agent reinforcement learning has achieved remarkable success across many games, including Hide and Seek, Go, StarCraft II, Dota 2, and Stratego.

However, algorithms such as AlphaZero and AlphaGo, which train mainly against the most recent version of their opponent, may fail to win consistently or to reach the desired solution in games with non-transitive structure. Although this problem has been studied intensively in imperfect-information games, it has received much less attention in perfect-information games.
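To make the non-transitivity problem concrete, here is a minimal toy sketch. It is not how AlphaZero or JiangJun is implemented: the three strategies and their win probabilities are hypothetical and form a rock-paper-scissors cycle. It shows that an agent that always switches to the best answer against its most recent opponent keeps going around in circles instead of settling on one strong strategy.

```python
# Toy win-probability table (hypothetical): WIN_PROB[a][b] is the chance
# that strategy a beats strategy b. A beats B, B beats C, C beats A.
WIN_PROB = {
    "A": {"A": 0.5, "B": 0.9, "C": 0.1},
    "B": {"A": 0.1, "B": 0.5, "C": 0.9},
    "C": {"A": 0.9, "B": 0.1, "C": 0.5},
}


def best_response(opponent):
    """Return the strategy with the highest win probability against `opponent`."""
    return max(WIN_PROB, key=lambda s: WIN_PROB[s][opponent])


def train_against_latest(start, steps):
    """Repeatedly switch to the best response against the most recent strategy."""
    history = [start]
    for _ in range(steps):
        history.append(best_response(history[-1]))
    return history


history = train_against_latest("A", 6)
print(history)  # ['A', 'C', 'B', 'A', 'C', 'B', 'A']
```

After three steps the agent is back where it started: each new strategy beats the previous one, yet no strategy beats all the others.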

Perfect-information game: a game in which every participant can see the complete state of the game, including all earlier moves, such as chess or Xiangqi.

Imperfect-information game: a game in which at least one participant lacks part of that information, such as Stratego.

Overcoming non-transitivity in perfect-information games remains an open research problem. Recent work has focused on using Policy Space Response Oracles (PSRO) algorithms to find Nash equilibria, but these methods had not been explored in perfect-information games.

The accessibility of Xiangqi makes it an excellent subject for studying board games and non-transitive game geometry. The study examines the geometric structure of Xiangqi in depth, using a large dataset of more than 10,000 human game records to reveal pronounced non-transitivity in the middle region of the transitive dimension, that is, among players of intermediate strength.

To address non-transitivity, the researchers proposed the JiangJun algorithm. Unlike AlphaZero's self-play scheme, it selects training opponents according to a Nash response over the existing population of players.
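The idea of picking opponents from a Nash distribution rather than always facing the latest one can be sketched as follows. This is an illustration under stated assumptions, not the JiangJun implementation: the payoff matrix is hypothetical, and fictitious play is used here only as a simple stand-in solver for the equilibrium of the population game.

```python
import random

# Hypothetical antisymmetric payoff matrix for a population of four policies:
# PAYOFF[i][j] > 0 means policy i tends to beat policy j.
# Policies 0-2 form a cycle; policy 3 loses to all of them.
PAYOFF = [
    [0.0, 1.0, -1.0, 1.0],
    [-1.0, 0.0, 1.0, 1.0],
    [1.0, -1.0, 0.0, 1.0],
    [-1.0, -1.0, -1.0, 0.0],
]


def nash_by_fictitious_play(payoff, iterations=20000):
    """Approximate a symmetric Nash mixture of a zero-sum population game."""
    n = len(payoff)
    counts = [1.0] * n  # how often each policy has been chosen so far
    for _ in range(iterations):
        total = sum(counts)
        mix = [c / total for c in counts]
        # Expected payoff of each policy against the current empirical mixture.
        values = [sum(payoff[i][j] * mix[j] for j in range(n)) for i in range(n)]
        best = max(range(n), key=lambda i: values[i])
        counts[best] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def sample_opponent(mix, rng):
    """Draw the index of a training opponent according to the Nash mixture."""
    return rng.choices(range(len(mix)), weights=mix, k=1)[0]


nash = nash_by_fictitious_play(PAYOFF)
rng = random.Random(0)
opponents = [sample_opponent(nash, rng) for _ in range(10)]
print([round(p, 2) for p in nash])
print(opponents)
```

In this toy population the three cyclic policies each receive roughly a third of the probability mass, while the dominated policy receives almost none, so training time is spent on the opponents that matter.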

The JiangJun algorithm consists of two basic modules: the MCTS Actor and the Populationer. Together, these components use Monte Carlo Tree Search (MCTS) to approximate a Nash equilibrium within the population of players.

The effectiveness of the JiangJun algorithm was evaluated across a range of metrics. The researchers built a training framework that used up to 90 V100 GPUs on the Huawei Cloud ModelArts platform to train JiangJun to master-level play.

Multiple metrics, including relative population performance, visualization of the Nash distribution, and a low-dimensional visualization of the game landscape along its two main embedding dimensions, together confirm that JiangJun handles the non-transitivity of Xiangqi effectively.

JiangJun also clearly outperformed contemporary algorithms in head-to-head play, winning more than 85% of games against a standard AlphaZero Xiangqi agent and 96.40% against a behavior-cloning Xiangqi agent. In the exploitability evaluation, where a lower number is better, a near-optimal response won only 8.41% of games against JiangJun, compared with 25.53% against the standard AlphaZero Xiangqi agent, which indicates that JiangJun is much closer to the optimal strategy.
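The exploitability measure above asks how well the best possible counter-strategy does against an agent. As a rough sketch, assuming a small hypothetical win-probability table instead of a trained best-response agent as in the paper, the calculation looks like this:

```python
# Hypothetical table: WIN_PROB[r][j] is the chance that response r beats
# strategy j. The three strategies form a rock-paper-scissors cycle.
WIN_PROB = [
    [0.5, 0.9, 0.1],
    [0.1, 0.5, 0.9],
    [0.9, 0.1, 0.5],
]


def best_response_win_rate(win_prob, mix):
    """Highest win rate any single response achieves against the mixture `mix`."""
    n = len(win_prob)
    return max(
        sum(win_prob[r][j] * mix[j] for j in range(n))
        for r in range(n)
    )


# An agent that always plays strategy 0 is easy to exploit.
pure = best_response_win_rate(WIN_PROB, [1.0, 0.0, 0.0])
# An agent that mixes evenly over the cycle cannot be beaten more than half the time.
uniform = best_response_win_rate(WIN_PROB, [1 / 3, 1 / 3, 1 / 3])
print(round(pure, 2), round(uniform, 2))  # 0.9 0.5
```

The lower the best-response win rate, the harder the agent is to exploit, which is the sense in which 8.41% is a better result than 25.53%.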

The researchers also built a Xiangqi mini program on WeChat, which collected more than 7,000 game records between JiangJun and human opponents over a six-month period. According to these records, JiangJun won 99.41% of its games against human opponents.

Beyond a win rate of nearly 100%, case studies of various endgames show that JiangJun also responds flexibly to the complexity of Xiangqi endgames.

The JiangJun algorithm marks a notable achievement for AI in Xiangqi. To tackle non-transitivity in perfect-information games, the research team combined Nash responses with Monte Carlo Tree Search, bringing a new way of thinking to the game. The algorithm not only achieves a remarkable win rate but also demonstrates the ability of AI to handle complex and uncertain problems.

Reference Links:

https://arxiv.org/abs/2308.04719

https://openreview.net/forum?id=MMsyqXIJuk

https://sites.google.com/view/jiangjun-site/

Author: Hazel Yan
