ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis¶
Conference: ACL 2026
arXiv: 2507.00642
Code: None
Area: LLM-assisted hardware design
Keywords: High-Level Synthesis, LLM-assisted design, Multi-agent, Pragma optimization, Automatic debugging
TL;DR¶
ChatHLS proposes a multi-agent HLS design framework featuring two core components: HLSTuner (QoR-aware reasoning for pragma selection) and HLSFixer (a debugging framework enhanced by hierarchical feedback). Combined with a self-evolving error case expansion mechanism (VODA), it significantly outperforms baselines in HLS-C generation success rates and hardware performance optimization.
Background & Motivation¶
Background: High-Level Synthesis (HLS) accelerates hardware design by compiling C/C++ descriptions into RTL hardware descriptions. The success of LLMs in code generation has inspired research into applying them to HLS.
Limitations of Prior Work: (1) HLS-specific data is scarce, with existing datasets rarely exposing synthesizability constraints, pragma selection rationales, and QoR correlations; (2) The combinatorial explosion of the pragma tuning space makes manual optimization extremely time-consuming; (3) General-purpose LLMs struggle to identify and correct HLS-specific compatibility errors.
Key Challenge: HLS design requires simultaneous optimization of functional correctness and hardware efficiency, but existing LLMs lack an understanding of hardware constraints and pragma semantics.
Goal: Build an automated HLS design, optimization, and debugging framework.
Key Insight: Multi-agent collaboration + Specialized fine-tuning + Self-evolving data augmentation.
Core Idea: Enable LLMs to understand the causal relationship between pragmas and hardware performance through QoR-aware reasoning, and accurately diagnose HLS errors via a reasoning-to-instruction approach.
Method¶
Overall Architecture¶
The framework consists of two major phases: (1) HLS-C Generation Phase: the LLM generates HLS-C code, and HLSTuner selects and inserts pragmas; (2) HLS-C Debugging Phase: HLSFixer parses tool feedback for error diagnosis and repair, while VODA expands the error case library.
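As a rough illustration of the "selects and inserts pragmas" step, the sketch below places pragma directives directly after labeled loop headers in an HLS-C source string. The `insert_pragmas` helper, the label-to-pragma mapping, and the label-based matching are assumptions made for illustration, not the paper's implementation. The pragma spellings follow Vitis HLS conventions, which is also an assumption about the target toolchain.

```python
import re

def insert_pragmas(source: str, pragmas: dict[str, list[str]]) -> str:
    """Insert pragma lines right after each labeled loop header.

    `pragmas` maps a loop label (hypothetical convention: `LABEL: for (...) {`)
    to the directives to place at the top of that loop body.
    """
    out = []
    for line in source.splitlines():
        out.append(line)
        # Match lines such as "    ADD_LOOP: for (int i = 0; ...) {"
        match = re.match(r"^(\s*)(\w+)\s*:\s*for\b.*\{\s*$", line)
        if match and match.group(2) in pragmas:
            indent = match.group(1) + "    "
            out.extend(f"{indent}#pragma HLS {p}" for p in pragmas[match.group(2)])
    return "\n".join(out)

HLS_C = """\
void vadd(int a[64], int b[64], int c[64]) {
    ADD_LOOP: for (int i = 0; i < 64; i++) {
        c[i] = a[i] + b[i];
    }
}"""

# Hypothetical tuner output: pipeline the loop and unroll it by 4.
tuned = insert_pragmas(HLS_C, {"ADD_LOOP": ["PIPELINE II=1", "UNROLL factor=4"]})
print(tuned)
```

In the framework described here, the choice of directives would come from HLSTuner's QoR-aware reasoning; the fixed dictionary above only stands in for that decision.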
Key Designs¶
- HLSTuner (QoR-aware Pragma Optimization):
  - Function: Automates HLS pragma selection, configuration, and insertion.
  - Mechanism: Taking source HLS-C and initial QoR as input, it analyzes the causal chain of pragma changes \(\rightarrow\) hardware architecture changes \(\rightarrow\) performance changes via QoR-aware reasoning. NSGA-II is used to generate diverse optimized designs, and optimized CoT generated by a teacher model is used for supervised training.
  - Design Motivation: To enable the LLM to understand the relationship between pragmas and QoR, rather than just mechanically inserting pragmas.
- HLSFixer (Hierarchical Feedback Debugging Framework):
  - Function: Resolves HLS-specific compilation and synthesis errors.
  - Mechanism: Decouples debugging into error identification, diagnosis, and repair. An analysis LLM extracts error information from HLS tool feedback and generates repair instructions, which a repair LLM then executes. For errors outside the training distribution, an LLM-as-a-Judge system evaluates outputs from multiple perspectives.
  - Design Motivation: A reasoning-to-instruction approach is more controllable and interpretable than end-to-end repair.
- VODA (Self-evolving Data Augmentation):
  - Function: Continuously expands the error case library.
  - Mechanism: Automatically captures newly encountered error cases during the ChatHLS workflow to further strengthen the debugging capabilities of HLSFixer.
  - Design Motivation: To address the long-tail distribution of HLS-specific errors.
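The decoupled debugging loop and the case-capture step can be sketched as follows. Everything here is hypothetical scaffolding: `run_hls_tool`, `analyze`, and `repair` are stand-in callables for the HLS toolchain, the analysis LLM, and the repair LLM, and the names `ErrorCase` and `debug_loop` are invented for this sketch rather than taken from the paper. The LLM-as-a-Judge path and any filtering of which cases count as "new" are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ErrorCase:
    code: str         # HLS-C that failed
    feedback: str     # raw tool message
    instruction: str  # repair instruction produced by the analysis step

def debug_loop(
    code: str,
    run_hls_tool: Callable[[str], Optional[str]],  # error text, or None on success
    analyze: Callable[[str, str], str],            # (code, feedback) -> instruction
    repair: Callable[[str, str], str],             # (code, instruction) -> new code
    case_library: list,
    max_rounds: int = 3,
) -> tuple:
    """Reasoning-to-instruction loop: diagnose first, then repair from the instruction."""
    for _ in range(max_rounds):
        feedback = run_hls_tool(code)
        if feedback is None:
            return code, True
        instruction = analyze(code, feedback)
        # VODA-style capture (simplified): record every failing case seen.
        case_library.append(ErrorCase(code, feedback, instruction))
        code = repair(code, instruction)
    return code, run_hls_tool(code) is None

# Toy stand-ins so the sketch runs without an HLS tool or an LLM.
def toy_tool(code):
    if "malloc" in code:
        return "error: dynamic memory allocation is not synthesizable"
    return None

def toy_analyze(code, feedback):
    return "replace the malloc'd buffer with a fixed-size array"

def toy_repair(code, instruction):
    return code.replace("int *buf = malloc(64 * sizeof(int));", "int buf[64];")

library = []
fixed, ok = debug_loop(
    "int *buf = malloc(64 * sizeof(int));", toy_tool, toy_analyze, toy_repair, library
)
print(ok, len(library))
```

Keeping the instruction as an explicit intermediate value is what makes each repair inspectable, and it is also what gets stored in the case library for later training.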
Loss & Training¶
HLSTuner is trained with supervised fine-tuning on CoT generated by a teacher model for the diverse optimized designs that NSGA-II produces. HLSFixer employs decoupled reasoning-to-instruction fine-tuning.
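NSGA-II ranks candidates by Pareto dominance, so the diverse designs it yields are, roughly, non-dominated trade-offs between objectives. The sketch below shows only that dominance filter, not the full algorithm (no crossover, mutation, or crowding distance). The two objectives (latency in cycles, LUT count) and all candidate numbers are made up for illustration and do not come from the paper.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(designs):
    """Keep only designs that no other design dominates."""
    return {
        name: qor
        for name, qor in designs.items()
        if not any(dominates(other, qor) for other in designs.values())
    }

# Hypothetical (latency_cycles, lut_count) per pragma configuration.
candidates = {
    "baseline": (4096, 1200),
    "pipeline_ii1": (1030, 1900),
    "unroll4": (1100, 2600),           # worse than pipeline_ii1 on both objectives
    "pipeline+unroll4": (270, 5200),
}

front = pareto_front(candidates)
print(sorted(front))
```

Designs that survive this filter represent distinct latency/area trade-offs, which is the kind of variety a teacher model could then explain in CoT form.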
Key Experimental Results¶
Main Results¶
- ChatHLS achieves a 32.6% improvement in debugging relative to Gemini-3-pro.
- The HLS-C generation success rate is increased by 41.8%.
- It achieves a 3.3× performance gain compared to RAG-based methods.
Key Findings¶
- QoR-aware reasoning significantly outperforms simple code-to-code mapping.
- Hierarchical feedback debugging is more effective than end-to-end repair.
- The VODA self-evolution mechanism continuously improves debugging performance.
Highlights & Insights¶
- QoR-aware reasoning allows LLMs to "understand" hardware constraints rather than simply generating code.
- The decoupled reasoning-to-instruction debugging method offers high interpretability and reliability.
Limitations & Future Work¶
- Currently targeted at specific HLS toolchains, which may limit applicability to other EDA tools.
- Building the training data (NSGA-II design-space search followed by teacher-model CoT generation) involves high computational costs.
- Future research could explore end-to-end Reinforcement Learning (RL) as an alternative to supervised fine-tuning.
Related Work & Insights¶
- Compared to template-based methods like HeteroRefactor and HeteroGen, ChatHLS is more flexible and does not require predefined templates.
- Compared to RAG methods, specialized fine-tuning provides more precise and structured domain knowledge.
Rating¶
- Novelty: ⭐⭐⭐⭐ QoR-aware reasoning and self-evolving debugging are novel designs.
- Experimental Thoroughness: ⭐⭐⭐⭐ Evaluated against multiple benchmarks and baselines.
- Writing Quality: ⭐⭐⭐⭐ Detailed framework descriptions and clear flowcharts.