ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis¶
Conference: ACL 2026 | arXiv: 2507.00642 | Code: None | Area: LLM-Assisted Hardware Design
Keywords: High-Level Synthesis, LLM-Assisted Design, Multi-Agent, Pragma Optimization, Automated Debugging
TL;DR¶
ChatHLS is a multi-agent HLS design framework with two core components: HLSTuner, which performs QoR-aware reasoning for pragma selection, and HLSFixer, a hierarchical feedback-enhanced debugging framework. Combined with VODA, a self-evolving error-case augmentation mechanism, ChatHLS achieves significant gains over baselines in both HLS-C generation success rate and hardware performance optimization.
Background & Motivation¶
Background: High-Level Synthesis (HLS) raises the abstraction level of hardware design by compiling C/C++ into register-transfer-level hardware descriptions. The success of LLMs in code generation has stimulated research interest in applying them to HLS.
Limitations of Prior Work: (1) HLS-specific data is scarce, and existing datasets rarely expose synthesizability constraints, pragma selection rationale, or QoR correlations; (2) the combinatorially explosive pragma tuning space makes manual optimization extremely time-consuming; (3) general-purpose LLMs struggle to identify and correct HLS-specific compatibility errors.
Key Challenge: HLS design requires simultaneously optimizing functional correctness and hardware efficiency, yet existing LLMs lack understanding of hardware constraints and pragma semantics.
Goal: Construct an automated framework for HLS design, optimization, and debugging.
Key Insight: Multi-agent collaboration + specialized fine-tuning + self-evolving data augmentation.
Core Idea: Enable LLMs to understand the causal relationship between pragmas and hardware performance through QoR-aware reasoning, and accurately diagnose HLS errors via a reasoning-to-instruction approach.
Method¶
Overall Architecture¶
The framework consists of two major phases: (1) HLS-C Generation — an LLM generates HLS-C code, and HLSTuner selects and inserts pragmas; (2) HLS-C Debugging — HLSFixer parses tool feedback for error diagnosis and repair, while VODA expands the error case library.
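The two-phase workflow above can be sketched as a generate–tune–debug loop. This is an illustrative Python sketch, not the paper's implementation: the function names (`generate_hls_c`, `tune_pragmas`, `run_hls_tool`, `fix_errors`) and their stubbed bodies are assumptions standing in for LLM and HLS-tool calls.

```python
# Hypothetical sketch of ChatHLS's two-phase workflow. All stage functions
# are illustrative stubs; in the real system they wrap LLM agents and an
# HLS toolchain.

def generate_hls_c(spec):
    # Phase 1a: an LLM drafts HLS-C code from the design spec (stubbed).
    return f"// HLS-C for: {spec}"

def tune_pragmas(code):
    # Phase 1b: HLSTuner selects and inserts pragmas (stubbed).
    return "#pragma HLS PIPELINE II=1\n" + code

def run_hls_tool(code):
    # Invoke the HLS toolchain; return (success, feedback). Stubbed.
    return True, "synthesis ok"

def fix_errors(code, feedback):
    # Phase 2: HLSFixer diagnoses tool feedback and repairs the code (stubbed).
    return code

def chathls_flow(spec, max_debug_iters=3):
    code = tune_pragmas(generate_hls_c(spec))
    for _ in range(max_debug_iters):
        ok, feedback = run_hls_tool(code)
        if ok:
            return code
        code = fix_errors(code, feedback)  # VODA would also log new errors here
    return code

print(chathls_flow("8x8 matrix multiply").splitlines()[0])
```

The debug loop bounds repair attempts; in the real framework each failed iteration also feeds VODA's error-case library.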
Key Designs¶
- HLSTuner (QoR-Aware Pragma Optimization):
  - Function: Automates HLS pragma selection, configuration, and insertion.
  - Mechanism: Takes the source HLS-C code and its initial QoR as input, then analyzes the causal chain of pragma changes → hardware architecture changes → performance changes via QoR-aware reasoning. NSGA-II is used to generate diverse optimized designs, and a teacher model generates optimization CoTs for supervised training.
  - Design Motivation: Enable LLMs to understand the relationship between pragmas and QoR, rather than mechanically inserting pragmas.
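The NSGA-II step can be illustrated with a toy multi-objective search over pragma configurations. This is a minimal sketch in the spirit of the method, not the paper's setup: the `qor_model` cost function and the unroll/pipeline design space are invented stand-ins for real QoR reports, and only the non-dominated (Pareto) filter of NSGA-II is shown.

```python
# Toy multi-objective pragma exploration: enumerate pragma configurations,
# score two objectives (latency, area) with a made-up cost model, and keep
# the non-dominated (Pareto-optimal) set, as NSGA-II would.

from itertools import product

def qor_model(unroll, pipeline):
    # Hypothetical cost model: unrolling and pipelining cut latency
    # but cost more resources.
    latency = (100 // unroll) // (2 if pipeline else 1)
    area = 10 * unroll + (15 if pipeline else 0)
    return latency, area

def dominates(a, b):
    # a dominates b if it is no worse in every objective and better in one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

configs = list(product([1, 2, 4, 8], [False, True]))  # (unroll factor, pipeline?)
scored = {c: qor_model(*c) for c in configs}
pareto = [c for c in configs
          if not any(dominates(scored[o], scored[c]) for o in configs if o != c)]

for unroll, pipeline in pareto:
    print(f"unroll={unroll} pipeline={pipeline} -> (latency, area) {scored[(unroll, pipeline)]}")
```

Each surviving configuration is a distinct latency/area trade-off; in ChatHLS such diverse designs seed the teacher model's optimization CoTs.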
- HLSFixer (Hierarchical Feedback Debugging Framework):
  - Function: Resolves HLS-specific compilation and synthesis errors.
  - Mechanism: Decouples debugging into error identification, diagnosis, and repair. An analysis LLM extracts error information from HLS tool feedback and generates repair instructions, which a repair LLM then executes. For out-of-distribution errors, an LLM-as-a-Judge system performs multi-perspective evaluation.
  - Design Motivation: The reasoning-to-instruction approach is more controllable and interpretable than end-to-end repair.
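The reasoning-to-instruction decoupling can be sketched as two stages with a structured diagnosis in between. In the paper both stages are LLMs; here they are rule-based stubs so the control flow is runnable, and the error-log format is an invented example.

```python
# Illustrative sketch of HLSFixer-style decoupled debugging: an analysis
# stage turns raw tool feedback into a structured diagnosis plus a repair
# instruction; a separate repair stage executes that instruction.

import re

def analyze(tool_log):
    # Analysis stage: diagnose and emit an instruction, not a patch.
    m = re.search(r"unsupported dynamic allocation at line (\d+)", tool_log)
    if m:
        return {"error": "dynamic_alloc", "line": int(m.group(1)),
                "instruction": "Replace malloc with a fixed-size static array."}
    return {"error": "unknown",
            "instruction": "Escalate to LLM-as-a-Judge review."}

def repair(code_lines, diagnosis):
    # Repair stage: execute the instruction produced by the analysis stage.
    if diagnosis["error"] == "dynamic_alloc":
        i = diagnosis["line"] - 1
        code_lines[i] = "static int buf[256];  // was: malloc-based allocation"
    return code_lines

log = "ERROR: unsupported dynamic allocation at line 2"
code = ["void top() {",
        "int *buf = (int*)malloc(256*sizeof(int));",
        "}"]
fixed = repair(code, analyze(log))
print(fixed[1])
```

Because the diagnosis is an explicit intermediate artifact, it can be inspected, logged, or overridden, which is the controllability/interpretability advantage the paper claims over end-to-end repair.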
- VODA (Self-Evolving Data Augmentation):
  - Function: Continuously expands the error case library.
  - Mechanism: Automatically captures newly encountered error cases during the ChatHLS workflow and uses them to further reinforce HLSFixer's debugging capability.
  - Design Motivation: Address the long-tail distribution of HLS errors.
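A self-evolving error-case library in the spirit of VODA can be sketched as a deduplicated store keyed by a normalized error signature. The signature scheme (digit-stripping plus hashing) and the class interface below are assumptions for illustration, not the paper's mechanism.

```python
# Hedged sketch of a self-evolving error-case library: each debugging
# episode that surfaces an unseen error signature is added as a
# (error, buggy, fixed) triple for later fine-tuning.

import hashlib

class ErrorCaseLibrary:
    def __init__(self):
        self.cases = {}

    def signature(self, error_msg):
        # Strip volatile details (line numbers) before hashing so the same
        # error class maps to a single library entry.
        normalized = "".join(ch for ch in error_msg if not ch.isdigit())
        return hashlib.sha256(normalized.encode()).hexdigest()[:12]

    def capture(self, error_msg, buggy_code, fixed_code):
        sig = self.signature(error_msg)
        if sig not in self.cases:  # only novel (long-tail) errors grow the set
            self.cases[sig] = {"error": error_msg,
                               "buggy": buggy_code, "fixed": fixed_code}
            return True
        return False

lib = ErrorCaseLibrary()
lib.capture("unsupported pointer cast at line 12", "bad1", "good1")
new = lib.capture("unsupported pointer cast at line 47", "bad2", "good2")
print(len(lib.cases), new)  # second capture is a duplicate signature
```

Deduplicating by error class rather than raw message keeps the library focused on genuinely new long-tail errors rather than repeats of known ones.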
Loss & Training¶
HLSTuner employs supervised fine-tuning using CoTs generated by NSGA-II and a teacher model. HLSFixer uses decoupled fine-tuning via the reasoning-to-instruction approach.
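The shape of one HLSTuner SFT record might look like the following. The field names and contents are assumptions chosen to illustrate the idea that the target is a QoR-grounded chain of thought rather than bare pragma text; this is not the paper's dataset schema.

```python
# Illustrative (assumed) schema for one QoR-aware SFT record: the input
# pairs HLS-C code with its initial QoR, and the target is a teacher-
# generated reasoning chain that justifies the pragma choice.

import json

record = {
    "input": {
        "hls_c": "for (int i = 0; i < N; i++) acc += a[i] * b[i];",
        "initial_qor": {"latency_cycles": 1024, "dsp": 1},
    },
    "target_cot": (
        "The loop is a reduction; unrolling by 8 exposes parallelism, "
        "raising DSP use but cutting latency roughly 8x. "
        "Insert: #pragma HLS UNROLL factor=8"
    ),
}
print(json.dumps(record, indent=2))
```

Training on such records teaches the model the pragma → architecture → QoR causal chain, rather than a code-to-code mapping.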
Key Experimental Results¶
Main Results¶
- ChatHLS achieves a 32.6% improvement in debugging over Gemini-3-pro.
- HLS-C generation success rate improves by 41.8%.
- Achieves a 3.3× performance gain over RAG-based methods.
Key Findings¶
- QoR-aware reasoning significantly outperforms simple code-to-code mapping.
- Hierarchical feedback debugging is more effective than end-to-end repair.
- The VODA self-evolving mechanism continuously improves debugging capability.
Highlights & Insights¶
- QoR-aware reasoning enables LLMs to "understand" hardware rather than simply generate code.
- The reasoning-to-instruction decoupled debugging approach offers strong interpretability.
Limitations & Future Work¶
- The framework targets a specific HLS toolchain and may not generalize to other EDA tools.
- The NSGA-II-based CoT generation process incurs high computational cost.
- Future work may explore end-to-end RL training as an alternative to supervised fine-tuning.
Related Work & Insights¶
- Compared to template-based approaches such as HeteroRefactor and HeteroGen, ChatHLS is more flexible and requires no predefined templates.
- Compared to RAG-based methods, specialized fine-tuning provides more precise domain knowledge.
Rating¶
- Novelty: ⭐⭐⭐⭐ QoR-aware reasoning and self-evolving debugging are novel designs.
- Experimental Thoroughness: ⭐⭐⭐⭐ Multiple benchmarks and baselines are compared.
- Writing Quality: ⭐⭐⭐⭐ Framework description is thorough with clear pipeline diagrams.