GPT-4 reasoning gets more human-like! The Chinese Academy of Sciences proposes "Thought Propagation": analogical thinking beats CoT, plug and play

Original source: Xinzhiyuan

Image source: Generated by Unbounded AI

Nowadays, giant neural network models such as GPT-4 and PaLM have emerged, demonstrating remarkable few-shot learning capabilities.

Given simple prompts, they can reason over text, write stories, answer questions, and write code.

However, LLMs still lose to humans on complex, multi-step reasoning tasks, struggling to make progress.

To address this, researchers from the Chinese Academy of Sciences and Yale University proposed a new framework, "Thought Propagation" (TP), which enhances LLM reasoning through analogical thinking.

Paper address:

Thought Propagation is inspired by human cognition: when we encounter a new problem, we often compare it with similar problems we have already solved and borrow their strategies.

The core of the method is therefore to have the LLM explore "analogous" problems related to the input before solving the input problem itself.

The solutions to these analogous problems can then either be reused directly or mined for insights that guide planning.

Thought Propagation thus offers a fresh angle on the inherent limits of LLM reasoning, letting large models solve problems by analogy, as humans do.

LLMs lose to humans at multi-step reasoning

LLMs are clearly good at basic prompted reasoning, but they still struggle with complex multi-step problems such as optimization and planning.

Humans, on the other hand, draw on intuition from similar experiences to solve new problems.

Large models cannot do this due to their inherent limitations.

Because an LLM's knowledge comes entirely from patterns in its training data, it does not truly understand language or concepts. As statistical models, LLMs therefore struggle with complex compositional generalization.

Most importantly, LLMs lack systematic reasoning ability and cannot, like humans, reason step by step through a challenging problem.

In addition, large models' reasoning is local and "short-sighted," making it hard for them to find optimal solutions or keep long reasoning chains consistent.

In short, the shortcomings of large models in mathematical proof, strategic planning and logical reasoning mainly stem from two core issues:

**- Inability to reuse insights from previous experience.**

Humans accumulate reusable knowledge and intuition from practice that help them solve new problems. In contrast, an LLM approaches each problem "from scratch" and does not borrow from previous solutions.

**- Compounding errors in multi-step reasoning.**

Humans monitor their own chains of reasoning and revise early steps when necessary. But mistakes an LLM makes early in its reasoning get amplified, because they lead all subsequent reasoning down the wrong path.

These weaknesses seriously hinder the application of LLMs to complex challenges that require global optimization or long-term planning.

To address this, the researchers proposed a brand-new solution: Thought Propagation.

TP Framework

Through analogical thinking, LLM can reason more like humans.

According to the researchers, reasoning from scratch cannot reuse insights from similar solved problems, and errors accumulate during the intermediate reasoning stages.

"Thought spreading" can explore similar problems related to the input problem and get inspiration from solutions to similar problems.

The figure below compares Thought Propagation (TP) with representative existing techniques: given an input problem p, IO, CoT, and ToT all reason from scratch to arrive at a solution s.

Specifically, TP includes three stages:

**1. Propose similar problems:** The LLM is prompted to generate a set of problems analogous to the input problem, guiding the model to retrieve potentially relevant prior experience.

**2. Solve the similar problems:** The LLM solves each analogous problem using an existing prompting technique, such as CoT.

**3. Summarize the solutions:** There are two approaches: directly infer a new solution to the input problem from the analogous solutions, or derive a high-level plan or strategy by comparing the analogous solutions against the input problem (a minimal sketch of the full pipeline follows below).
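To make the three stages concrete, here is a minimal Python sketch of a one-layer TP pipeline. The `llm` function is a hypothetical stand-in for any text-completion API call, and the prompt templates are illustrative, not the paper's exact prompts.

```python
# Minimal sketch of a one-layer Thought Propagation (TP) pipeline.
# `llm` is a hypothetical stand-in for any text-completion API call;
# the prompt templates below are illustrative, not the paper's exact prompts.

def llm(prompt: str) -> str:
    """Placeholder: route this to the model API of your choice."""
    raise NotImplementedError

def propose_analogous_problems(problem: str, n: int = 3) -> list[str]:
    # Stage 1: prompt the model for problems analogous to the input.
    prompt = (f"Propose {n} problems analogous to the following problem, "
              f"one per line:\n{problem}")
    return llm(prompt).strip().splitlines()[:n]

def solve_with_cot(problem: str) -> str:
    # Stage 2: solve a problem with any existing prompting method (CoT here).
    return llm(f"Solve the following problem step by step:\n{problem}")

def aggregate(problem: str, analogies: list[str], solutions: list[str]) -> str:
    # Stage 3: reuse the analogous solutions directly, or distill them into
    # a plan, to produce (and refine) a solution for the input problem.
    hints = "\n\n".join(f"Analogous problem: {a}\nIts solution: {s}"
                        for a, s in zip(analogies, solutions))
    return llm(f"{hints}\n\nDrawing on the insights above, solve this problem, "
               f"refining any initial answer if the analogies suggest a "
               f"better one:\n{problem}")

def thought_propagation(problem: str) -> str:
    analogies = propose_analogous_problems(problem)
    solutions = [solve_with_cot(a) for a in analogies]
    return aggregate(problem, analogies, solutions)
```

Because every stage is just another prompt, the inner `solve_with_cot` step can be swapped for IO, ToT, or any other prompting method.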

This lets large models reuse prior experience and heuristics, and also cross-check their initial reasoning against analogical solutions to refine it.

It is worth noting that Thought Propagation is model-agnostic: the single problem-solving step inside it can use any prompting method.

The key novelty of the method is eliciting analogical thinking from the LLM to guide complex reasoning.

Whether "thinking communication" can make LLM more like a human depends on the actual results.

The researchers from the Chinese Academy of Sciences and Yale evaluated the method on three tasks:

**- Shortest-path reasoning:** Finding the best path between nodes in a graph requires global planning and search. Even on simple graphs, standard techniques fail.

**- Creative writing:** Generating coherent, creative stories is an open-ended challenge. Given high-level outline prompts, LLMs often lose consistency or logic.

**- LLM agent planning:** LLM agents interacting with text environments struggle with long-horizon strategies; their plans often drift or get stuck in loops.

Shortest path reasoning

In the shortest-path reasoning task, existing methods fail even on problems that look easy.

Although the graph in (a) is very simple, because inference starts from scratch these methods lead the LLM to suboptimal solutions (b, c), or even to repeatedly revisiting an intermediate node (d).

The following is an example of combining TP and ToT.

ToT (b) cannot solve the problem in (a) because errors accumulate in its intermediate reasoning steps. Drawing on solutions to similar problems, TP (c) refines the initial suboptimal solution and eventually finds the optimal one.

Compared with the baselines, TP improves performance on the shortest-path task by a significant 12%, generating optimal and valid shortest paths.

In addition, TP achieves the lowest OLR of all methods, meaning the valid paths it generates are the closest in length to the optimal path.
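For this task the ground truth is easy to compute, so an LLM's answer can be checked against a classical algorithm. Below is a small sketch using breadth-first search on an unweighted graph; the length-ratio metric at the end is an assumed reading of OLR (generated path length divided by optimal length, lower is better), not necessarily the paper's exact definition.

```python
from collections import deque

def shortest_path(graph: dict[int, list[int]], src: int, dst: int) -> list[int]:
    """Ground-truth shortest path on an unweighted graph via BFS."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk parents back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return []  # no path exists

def length_ratio(llm_path: list[int], optimal: list[int]) -> float:
    # Assumed OLR-style metric: generated length / optimal length
    # (lower is better; 1.0 means the LLM's path is already optimal).
    return (len(llm_path) - 1) / (len(optimal) - 1)

# Toy example: the LLM proposes 0-1-2-4, while the optimal path is 0-3-4.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 4], 3: [0, 4], 4: [2, 3]}
optimal = shortest_path(graph, 0, 4)        # [0, 3, 4]
print(length_ratio([0, 1, 2, 4], optimal))  # 1.5
```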

The researchers also studied how the number of TP layers affects complexity and performance on the shortest-path task.

Across settings, the token cost of 1-layer TP is similar to that of ToT, yet 1-layer TP achieves very competitive performance at finding the optimal shortest path.

The performance gain of 1-layer TP over 0-layer TP (i.e., IO) is also substantial, while Figure 5(a) shows the increased token cost incurred by 2-layer TP.
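The "layers" refer to how deep the analogy expansion recurses: a k-layer TP solves each analogous problem with a (k-1)-layer TP, bottoming out at plain prompting, which is why 0-layer TP reduces to IO and why token cost grows quickly with depth. A hedged sketch, reusing the hypothetical helpers from the pipeline sketch above:

```python
def thought_propagation_layered(problem: str, layers: int) -> str:
    # Layer 0 degenerates to a single plain prompting step (IO/CoT).
    if layers == 0:
        return solve_with_cot(problem)
    analogies = propose_analogous_problems(problem)
    # Each analogous problem is itself solved with one fewer TP layer,
    # so token cost grows roughly geometrically with depth.
    solutions = [thought_propagation_layered(a, layers - 1) for a in analogies]
    return aggregate(problem, analogies, solutions)
```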

Creative Writing

Table 2 below shows the performance of TP and the baselines with GPT-3.5 and GPT-4. TP beats the baselines on coherence, and in user studies it increased human preference for its creative writing by 13%.

LLM agent planning

In the third task evaluation, the researchers used the ALFWorld game suite to instantiate the LLM agent planning task in 134 environments.

TP raises the task completion rate in LLM agent planning by 15%, demonstrating the advantage of reflecting on solutions to similar tasks when formulating a plan.

These experimental results show that Thought Propagation generalizes across a variety of reasoning tasks and performs well on all of them.

The key to enhancing LLM reasoning

The "thought propagation" model provides a new technology for complex LLM reasoning.

Analogical thinking is a hallmark of human problem-solving, bringing a range of systematic advantages such as more efficient search and error correction.

By prompting for analogical thinking, LLMs can likewise better overcome their own weaknesses, such as the lack of reusable knowledge and cascading local errors.

However, there are some limitations to these findings.

Efficiently generating useful analogous problems is not easy, long chains of analogical reasoning can become unwieldy, and controlling and coordinating multi-step reasoning chains remains difficult.

However, "thought propagation" still provides us with an interesting method by creatively solving the reasoning flaws of LLM.

With further development, analogical thinking may make LLM reasoning even more powerful, pointing the way toward more human-like reasoning in large language models.

About the authors

Ran He

He is a professor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, and at the University of Chinese Academy of Sciences. He is an IAPR Fellow and an IEEE Senior Member.

Previously, he received his bachelor's and master's degrees from Dalian University of Technology, and his PhD from the Institute of Automation, Chinese Academy of Sciences in 2009.

His research interests include biometrics (face recognition and synthesis, iris recognition, person re-identification), representation learning (network pre-training with weakly/self-supervised or transfer learning), and generative learning (generative models, image generation, image translation).

He has published more than 200 papers in international journals and conferences, including famous international journals such as IEEE TPAMI, IEEE TIP, IEEE TIFS, IEEE TNN, and IEEE TCSVT, as well as top international conferences such as CVPR, ICCV, ECCV, and NeurIPS.

He serves on the editorial boards of IEEE TIP, IEEE TBIOM, and Pattern Recognition, and has served as an area chair for international conferences including CVPR, ECCV, NeurIPS, ICML, ICPR, and IJCAI.

Junchi Yu

Junchi Yu is a fourth-year PhD student at the Institute of Automation, Chinese Academy of Sciences, supervised by Professor Ran He.

He previously interned at Tencent AI Lab, working with Dr. Tingyang Xu, Dr. Yu Rong, Dr. Yatao Bian, and Professor Junzhou Huang. He is currently a visiting student in the Department of Computer Science at Yale University, advised by Professor Rex Ying.

His goal is to develop Trustworthy Graph Learning (TwGL) methods with good interpretability and transferability, and to explore their applications in biochemistry.
