Researchers from Sea AI Lab, UCAS, NUS, and SJTU introduce FlowReasoner: a query-level meta-agent for personalized system generation

LLM-based multi-agent systems, characterized by planning, reasoning, tool use, and memory capabilities, form the foundation for applications such as chatbots, code generation, mathematics, and robotics. However, these systems face significant challenges because they are designed manually, which leads to high human labor costs and limited scalability. Graph-based approaches attempt to automate workflow design by representing workflows as networks, but their structural complexity limits scalability. State-of-the-art approaches represent multi-agent systems as programming code and use advanced LLMs as meta-agents to optimize workflows, but they focus on task-level solutions that produce a single system per task type. This one-size-fits-all approach cannot adapt automatically to individual user queries.
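The difference between task-level and query-level systems can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `Workflow` type, the function names, and the toy rule-based `toy_meta_agent` are assumptions, not the interface of any system named in this article. A real meta-agent would be an LLM that writes the workflow, not a keyword check.

```python
from typing import Callable, Dict

# Hypothetical type: a workflow maps a user query to an answer string.
Workflow = Callable[[str], str]


def task_level_system(task_workflows: Dict[str, Workflow], task: str, query: str) -> str:
    """Task-level: one fixed workflow per task type, reused for every query."""
    return task_workflows[task](query)


def query_level_system(meta_agent: Callable[[str], Workflow], query: str) -> str:
    """Query-level: the meta-agent builds a fresh workflow for each query."""
    workflow = meta_agent(query)
    return workflow(query)


def fixed_code_workflow(query: str) -> str:
    """Toy stand-in for a workflow that was optimized once for all coding queries."""
    return f"generate: {query}"


def toy_meta_agent(query: str) -> Workflow:
    """Toy stand-in for a meta-agent: picks steps by inspecting the query."""
    steps = ["generate"]
    if "test" in query.lower():
        steps.append("run_tests")      # add a testing step only when asked
    if len(query.split()) > 8:
        steps.insert(0, "plan")        # plan first for longer requests

    def workflow(q: str) -> str:
        return " -> ".join(steps) + f": {q}"

    return workflow


if __name__ == "__main__":
    q = "Write a function and test it"
    print(task_level_system({"code": fixed_code_workflow}, "code", q))
    print(query_level_system(toy_meta_agent, q))
```

The task-level path returns the same pipeline shape for every coding query, while the query-level path changes its steps depending on what the query asks for.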

LLM-based multi-agent systems are the basis for a variety of real-world applications, including code intelligence, computer use, and deep research. In these systems, LLM-based agents equipped with planning capabilities, database access, and tool function calls collaborate to achieve promising performance. Early approaches focused on optimizing prompts or hyperparameters through evolutionary algorithms to automate agent profiling. ADAS introduced code representations of agents and workflows, with a meta-agent that generates the workflows. Separately, OpenAI advanced reasoning in LLMs by developing the o1 model. Models such as QwQ, QvQ, DeepSeek, and Kimi followed suit with o1-like reasoning architectures, and OpenAI's o3 model achieved promising results on the ARC-AGI benchmark.
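To make "workflows represented as code" concrete, here is a minimal sketch of a workflow written as an ordinary Python function. The names `Worker`, `generate_then_review`, and `RecordingWorker` are hypothetical, and the prompts are placeholders; this is not ADAS's actual representation, only an illustration of why code is a convenient medium for a meta-agent to emit and edit.

```python
from typing import Callable, List

# Hypothetical type: a worker is any callable that maps a prompt to a completion.
Worker = Callable[[str], str]


def generate_then_review(worker: Worker, query: str, rounds: int = 1) -> str:
    """A workflow expressed as code: draft once, then critique and revise."""
    draft = worker(f"Solve: {query}")
    for _ in range(rounds):
        critique = worker(f"Critique this answer: {draft}")
        draft = worker(f"Revise using the critique [{critique}]: {draft}")
    return draft


class RecordingWorker:
    """Stub worker for local runs: records prompts instead of calling an LLM."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"out{len(self.prompts)}"


if __name__ == "__main__":
    worker = RecordingWorker()
    print(generate_then_review(worker, "reverse a string", rounds=2))
    print(len(worker.prompts), "worker calls")
```

Because the workflow is plain code, changing its structure (more rounds, an extra agent, a different order) is a small text edit, and the worker behind it can be swapped without touching the workflow itself.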

Researchers from Sea AI Lab (Singapore), the University of Chinese Academy of Sciences, the National University of Singapore, and Shanghai Jiao Tong University have proposed FlowReasoner, a meta-agent that automates the creation of query-level multi-agent systems, generating one customized system per user query. The researchers distilled DeepSeek R1 to supply the basic reasoning ability needed to create multi-agent systems, and then enhanced it through reinforcement learning with external execution feedback. A multi-purpose reward mechanism was developed to optimize training across three key dimensions: performance, complexity, and efficiency. This enables FlowReasoner to generate a personalized multi-agent system through deliberative reasoning for each unique user query.
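One way to picture a reward that balances performance, complexity, and efficiency is a weighted combination of execution feedback signals. The sketch below is an assumption made for illustration: the article does not give the actual formula, so the linear form, the field names, and the weight values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical signals collected by running a generated workflow."""
    pass_rate: float  # performance: fraction of tests passed, in [0, 1]
    num_steps: int    # complexity proxy: number of workflow steps
    llm_calls: int    # efficiency proxy: number of worker-model calls


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,         # assumed weight, not from the source
    w_complexity: float = 0.05,  # assumed weight, not from the source
    w_efficiency: float = 0.02,  # assumed weight, not from the source
) -> float:
    """Reward correct answers; penalize larger and costlier workflows."""
    return (
        w_perf * fb.pass_rate
        - w_complexity * fb.num_steps
        - w_efficiency * fb.llm_calls
    )


if __name__ == "__main__":
    simple = ExecutionFeedback(pass_rate=1.0, num_steps=2, llm_calls=3)
    bloated = ExecutionFeedback(pass_rate=1.0, num_steps=8, llm_calls=12)
    # With equal pass rates, the simpler and cheaper workflow scores higher.
    print(multi_purpose_reward(simple) > multi_purpose_reward(bloated))
```

Under this kind of scoring, two workflows that solve the query equally well are separated by their size and cost, which pushes training toward compact systems.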

The researchers selected three datasets for a detailed evaluation across different code generation scenarios: BigCodeBench for engineering-oriented tasks, and HumanEval and MBPP for algorithmic challenges. FlowReasoner was evaluated against three categories of baselines:

  • Direct invocation of a single standalone LLM
  • Manually designed workflows with human-crafted reasoning strategies, including Self-Refine, LLM-Debate, and LLM-Blender
  • Automated workflow optimization methods such as AFlow, ADAS, and MaAS, which build workflows through search or optimization

Both o1-mini and GPT-4o-mini are used as worker models for the manually designed workflows. FlowReasoner is implemented in two variants based on DeepSeek-R1-Distill-Qwen (7B and 14B parameters) and uses o1-mini as its worker model.

FlowReasoner-14B outperformed all competing methods, with an overall improvement of 5 percentage points over the strongest baseline, MaAS. It also exceeds the performance of its underlying worker model, o1-mini, by a margin of 10%. These results indicate that the workflow-based reasoning framework is effective at improving code generation accuracy. To evaluate generalization, the researchers replaced the o1-mini worker with models such as Qwen2.5-Coder, Claude, and GPT-4o-mini while keeping the meta-agent fixed as either FlowReasoner-7B or FlowReasoner-14B. FlowReasoner shows notable transferability, maintaining consistent performance across different worker models on the same tasks.

In this paper, the researchers propose FlowReasoner, a query-level meta-agent designed to automate the creation of a personalized multi-agent system for each individual user query. FlowReasoner combines external execution feedback with reinforcement learning, using a multi-purpose reward focused on performance, complexity, and efficiency, to generate optimized workflows without relying on complex search algorithms or carefully designed search sets. The approach reduces human labor costs and improves scalability by enabling more adaptive and efficient multi-agent systems that dynamically optimize their structure for specific user queries instead of relying on a fixed workflow for an entire task category.



Sajjad Ansari is a final-year undergraduate student at IIT Kharagpur. As a technology enthusiast, he explores the practical applications of AI, focusing on understanding AI technologies and their real-world impact. He aims to explain complex AI concepts in a clear and accessible way.
