
ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis

Conference: ACL 2026 arXiv: 2507.00642 Code: None Area: LLM-Assisted Hardware Design Keywords: High-Level Synthesis, LLM-Assisted Design, Multi-Agent, Pragma Optimization, Automated Debugging

TL;DR

ChatHLS proposes a multi-agent HLS design framework built around two core components: HLSTuner, which performs QoR-aware reasoning for pragma selection, and HLSFixer, a hierarchical feedback-enhanced debugging framework. Combined with VODA, a self-evolving error-case augmentation mechanism, it achieves significant improvements over baselines in both HLS-C generation success rate and hardware performance optimization.

Background & Motivation

Background: High-Level Synthesis (HLS) accelerates hardware design by compiling C/C++ into register-transfer-level hardware descriptions, raising the level of design abstraction. The success of LLMs in code generation has spurred research interest in applying them to HLS.
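For readers new to HLS-C, the following toy kernel (illustrative, not taken from the paper) shows the kind of pragma-annotated C code such frameworks target, assuming AMD Vitis-style directives:

```cpp
// A toy HLS-C kernel. The #pragma HLS directives tell the HLS tool how
// to map the loop and arrays onto hardware; a plain C++ compiler
// simply ignores them, so the code still compiles and runs on a CPU.
const int N = 1024;

void vec_scale(const int in[N], int out[N], int k) {
#pragma HLS ARRAY_PARTITION variable=in cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=out cyclic factor=4
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1  // start one new loop iteration per cycle
        out[i] = in[i] * k;
    }
}

int main() {
    static int a[N], b[N];
    vec_scale(a, b, 3);  // host-side functional test
    return 0;
}
```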

Limitations of Prior Work: (1) HLS-specific data is scarce, and existing datasets rarely expose synthesizability constraints, pragma selection rationale, or QoR correlations; (2) the combinatorially explosive pragma tuning space makes manual optimization extremely time-consuming; (3) general-purpose LLMs struggle to identify and correct HLS-specific compatibility errors.

Key Challenge: HLS design requires simultaneously optimizing functional correctness and hardware efficiency, yet existing LLMs lack understanding of hardware constraints and pragma semantics.

Goal: Construct an automated framework for HLS design, optimization, and debugging.

Key Insight: Multi-agent collaboration + specialized fine-tuning + self-evolving data augmentation.

Core Idea: Enable LLMs to understand the causal relationship between pragmas and hardware performance through QoR-aware reasoning, and accurately diagnose HLS errors via a reasoning-to-instruction approach.

Method

Overall Architecture

The framework consists of two major phases: (1) HLS-C Generation — an LLM generates HLS-C code, and HLSTuner selects and inserts pragmas; (2) HLS-C Debugging — HLSFixer parses tool feedback for error diagnosis and repair, while VODA expands the error case library.
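As a rough sketch of this two-phase loop, the following uses hypothetical stubs in place of the LLM agents and HLS tool calls; none of these function names come from the paper:

```cpp
#include <iostream>
#include <optional>
#include <string>

// Stand-ins for the ChatHLS agents and tools. Real implementations
// would prompt LLMs and invoke an HLS compiler such as Vitis HLS.
std::string generate_hls_c(const std::string& spec) { return "/* HLS-C for: " + spec + " */"; }
std::string tune_pragmas(const std::string& code) { return code + "\n// pragmas inserted by HLSTuner"; }
std::optional<std::string> synthesize(const std::string&) {
    return std::nullopt;  // nullopt = success; otherwise the error log
}
std::string fix_errors(const std::string& code, const std::string&) { return code; }

int main() {
    // Phase 1: HLS-C generation, then QoR-aware pragma optimization.
    std::string code = tune_pragmas(generate_hls_c("vector scaling kernel"));

    // Phase 2: debugging loop driven by HLS tool feedback (HLSFixer);
    // VODA would also record each new error case along the way.
    for (int attempt = 0; attempt < 5; ++attempt) {
        auto error_log = synthesize(code);
        if (!error_log) break;                // synthesizable: done
        code = fix_errors(code, *error_log);  // diagnose, instruct, repair
    }
    std::cout << code << "\n";
}
```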

Key Designs

  1. HLSTuner (QoR-Aware Pragma Optimization):

    • Function: Automates HLS pragma selection, configuration, and insertion.
    • Mechanism: Takes the source HLS-C and its initial QoR as input, then uses QoR-aware reasoning to analyze the causal chain of pragma changes → hardware architecture changes → performance changes. NSGA-II (a multi-objective genetic algorithm) is used to generate diverse optimized designs, and a teacher model generates optimization CoTs for supervised training.
    • Design Motivation: Enable LLMs to understand the relationship between pragmas and QoR rather than mechanically inserting pragmas (see the first sketch after this list).
  2. HLSFixer (Hierarchical Feedback Debugging Framework):

    • Function: Resolves HLS-specific compilation and synthesis errors.
    • Mechanism: Decouples debugging into error identification, diagnosis, and repair. An analysis LLM extracts error information from HLS tool feedback and generates repair instructions, which a repair LLM then executes. For out-of-distribution errors, an LLM-as-a-Judge system performs multi-perspective evaluation.
    • Design Motivation: The reasoning-to-instruction approach is more controllable and interpretable than end-to-end repair (see the second sketch after this list).
  3. VODA (Self-Evolving Data Augmentation):

    • Function: Continuously expands the error case library.
    • Mechanism: Automatically captures newly encountered error cases during the ChatHLS workflow and uses them to further reinforce HLSFixer's debugging capability.
    • Design Motivation: Address the long-tail distribution of HLS errors (see the third sketch after this list).
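To make HLSTuner's causal chain concrete, here is a hypothetical before/after transformation of the kind it would emit, with the pragma → architecture → QoR reasoning written as comments (Vitis-style directives; the cycle counts are illustrative, not reported results):

```cpp
// Before: no pragmas. The tool schedules one multiply-add at a time,
// so latency is on the order of 256 * II cycles.
void axpy_before(const int a[256], const int b[256], int out[256], int k) {
    for (int i = 0; i < 256; ++i)
        out[i] = a[i] * k + b[i];
}

// After: an HLSTuner-style rewrite. The reasoning chain:
//   PIPELINE II=1      -> iterations overlap        -> ~256 cycles
//   UNROLL factor=4    -> 4 parallel multiply-adds  -> ~64 cycles
//   ARRAY_PARTITION x4 -> 4 memory ports per array  -> the unrolled
//                         body is not starved by single-port BRAM
// i.e., pragma change -> architecture change -> QoR change.
void axpy_after(const int a[256], const int b[256], int out[256], int k) {
#pragma HLS ARRAY_PARTITION variable=a cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=b cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=out cyclic factor=4
    for (int i = 0; i < 256; ++i) {
#pragma HLS PIPELINE II=1
#pragma HLS UNROLL factor=4
        out[i] = a[i] * k + b[i];
    }
}
```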
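Next, a minimal sketch of HLSFixer's reasoning-to-instruction hand-off, with a deterministic rule standing in for the analysis LLM; the struct, function names, and error text are assumptions, not the paper's API:

```cpp
#include <iostream>
#include <string>

// The structured hand-off between the analysis and repair models:
// the analysis side never edits code, it only emits an instruction.
struct RepairInstruction {
    std::string error_class;  // identified error category
    std::string diagnosis;    // why the HLS tool rejected the code
    std::string instruction;  // concrete edit for the repair model
};

// Deterministic stand-in for the analysis step: map a tool log line
// to a repair instruction. A real system would prompt an LLM here.
RepairInstruction analyze(const std::string& tool_log) {
    if (tool_log.find("dynamic memory") != std::string::npos) {
        return {"hls-compatibility",
                "heap allocation is not synthesizable in HLS-C",
                "Replace malloc/free with a fixed-size static array."};
    }
    return {"unknown", "out-of-distribution error",
            "Escalate to the LLM-as-a-Judge evaluation path."};
}

int main() {
    RepairInstruction r =
        analyze("ERROR: [HLS] unsupported dynamic memory allocation");
    std::cout << r.error_class << ": " << r.instruction << "\n";
}
```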
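Finally, a sketch of the VODA capture step under the same caveats: each newly resolved error is stored as a training triple and later used to reinforce HLSFixer:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical VODA-style error-case library (names not from the paper).
struct ErrorCase {
    std::string broken_code;   // HLS-C that failed synthesis
    std::string tool_log;      // the HLS tool's error feedback
    std::string verified_fix;  // repair that passed re-synthesis
};

class ErrorCaseLibrary {
    std::vector<ErrorCase> cases_;
public:
    // Called from the debugging loop whenever a new error is resolved;
    // the accumulated cases periodically become fine-tuning data for
    // HLSFixer, closing the self-evolving loop.
    void capture(ErrorCase c) { cases_.push_back(std::move(c)); }
    std::size_t size() const { return cases_.size(); }
};
```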

Loss & Training

HLSTuner is trained with supervised fine-tuning on optimization CoTs that a teacher model generates from NSGA-II search results. HLSFixer is fine-tuned in a decoupled fashion, with separate data for the analysis and repair models, following the reasoning-to-instruction approach.
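As a hedged illustration, the two decoupled training sets might be shaped roughly as follows; the record fields are assumptions for exposition, not the paper's schema:

```cpp
#include <string>

// Hypothetical record shapes for the decoupled fine-tuning data.
// HLSTuner: (code, QoR) -> optimization CoT + pragma-annotated code.
struct TunerExample {
    std::string source_code;  // unoptimized HLS-C
    std::string initial_qor;  // e.g. latency / resource report
    std::string target_cot;   // teacher-generated reasoning
    std::string target_code;  // NSGA-II-derived optimized design
};

// HLSFixer: the analysis and repair models see different pairs.
struct AnalysisExample {
    std::string broken_code, tool_log;  // inputs
    std::string target_instruction;     // reasoning-to-instruction output
};
struct RepairExample {
    std::string broken_code, instruction;  // inputs
    std::string target_fixed_code;         // repaired HLS-C
};
```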

Key Experimental Results

Main Results

  • ChatHLS achieves a 32.6% improvement in debugging over Gemini-3-pro.
  • HLS-C generation success rate improves by 41.8%.
  • Achieves a 3.3× performance gain over RAG-based methods.

Key Findings

  • QoR-aware reasoning significantly outperforms simple code-to-code mapping.
  • Hierarchical feedback debugging is more effective than end-to-end repair.
  • The VODA self-evolving mechanism continuously improves debugging capability.

Highlights & Insights

  • QoR-aware reasoning enables LLMs to "understand" hardware rather than simply generate code.
  • The reasoning-to-instruction decoupled debugging approach offers strong interpretability.

Limitations & Future Work

  • The framework targets a specific HLS toolchain and may not generalize to other EDA tools.
  • The NSGA-II-based CoT generation process incurs high computational cost.
  • Future work may explore end-to-end RL training as an alternative to supervised fine-tuning.
  • Compared to template-based approaches such as HeteroRefactor and HeteroGen, ChatHLS is more flexible and requires no predefined templates.
  • Compared to RAG-based methods, specialized fine-tuning provides more precise domain knowledge.

Rating

  • Novelty: ⭐⭐⭐⭐ QoR-aware reasoning and self-evolving debugging are novel designs.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Multiple benchmarks and baselines are compared.
  • Writing Quality: ⭐⭐⭐⭐ Framework description is thorough with clear pipeline diagrams.