ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis

Conference: ACL 2026
arXiv: 2507.00642
Code: None
Area: LLM-assisted hardware design
Keywords: High-Level Synthesis, LLM-assisted design, Multi-agent, Pragma optimization, Automatic debugging

TL;DR

ChatHLS proposes a multi-agent HLS design framework featuring two core components: HLSTuner (QoR-aware reasoning for pragma selection) and HLSFixer (a debugging framework enhanced by hierarchical feedback). Combined with a self-evolving error case expansion mechanism (VODA), it significantly outperforms baselines in HLS-C generation success rates and hardware performance optimization.

Background & Motivation

Background: High-Level Synthesis (HLS) accelerates hardware design by compiling C/C++ descriptions into register-transfer-level (RTL) hardware. The success of LLMs in code generation has inspired research into applying them to HLS.

Limitations of Prior Work: (1) HLS-specific data is scarce, and existing datasets rarely expose synthesizability constraints, pragma selection rationales, or QoR correlations; (2) the pragma tuning space explodes combinatorially, which makes manual optimization extremely time-consuming; (3) general-purpose LLMs struggle to identify and correct HLS-specific compatibility errors.

Key Challenge: HLS design requires simultaneous optimization of functional correctness and hardware efficiency, but existing LLMs lack an understanding of hardware constraints and pragma semantics.

Goal: Build an automated HLS design, optimization, and debugging framework.

Key Insight: Multi-agent collaboration + Specialized fine-tuning + Self-evolving data augmentation.

Core Idea: Enable LLMs to understand the causal relationship between pragmas and hardware performance through QoR-aware reasoning, and accurately diagnose HLS errors via a reasoning-to-instruction approach.

Method

Overall Architecture

The framework consists of two major phases: (1) HLS-C Generation Phase: the LLM generates HLS-C code, and HLSTuner selects and inserts pragmas; (2) HLS-C Debugging Phase: HLSFixer parses tool feedback for error diagnosis and repair, while VODA expands the error case library.
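
The two phases can be sketched as a small control loop. This is only an illustration under stated assumptions: the callables `generate`, `tune`, `check`, and `fix`, the `ErrorCaseLibrary` class, and the `max_rounds` retry budget are all hypothetical names introduced here, and the summary does not say how many repair rounds ChatHLS actually runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ErrorCaseLibrary:
    """Hypothetical stand-in for the error case library that VODA expands."""
    cases: List[dict] = field(default_factory=list)

    def capture(self, code: str, feedback: str) -> None:
        # Store the failing code together with the tool feedback it produced.
        self.cases.append({"code": code, "feedback": feedback})


def run_workflow(
    spec: str,
    generate: Callable[[str], str],         # LLM: specification -> HLS-C
    tune: Callable[[str], str],             # HLSTuner: HLS-C -> HLS-C with pragmas
    check: Callable[[str], Optional[str]],  # HLS tool: feedback text, or None on success
    fix: Callable[[str, str], str],         # HLSFixer: (code, feedback) -> repaired code
    library: ErrorCaseLibrary,
    max_rounds: int = 3,                    # assumed retry budget, not from the paper
) -> Optional[str]:
    # Phase 1: generate HLS-C, then select and insert pragmas.
    code = tune(generate(spec))
    # Phase 2: repair using tool feedback, recording every failure on the way.
    for _ in range(max_rounds):
        feedback = check(code)
        if feedback is None:
            return code
        library.capture(code, feedback)
        code = fix(code, feedback)
    # One last check after the final repair attempt.
    return code if check(code) is None else None
```

Passing the components in as plain callables keeps the sketch independent of any particular model or HLS toolchain; each one can be replaced by a stub when experimenting with the control flow.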

Key Designs

  1. HLSTuner (QoR-aware Pragma Optimization):

    • Function: Automates HLS pragma selection, configuration, and insertion.
    • Mechanism: Given the source HLS-C and its initial QoR, HLSTuner applies QoR-aware reasoning to the causal chain pragma changes \(\rightarrow\) hardware architecture changes \(\rightarrow\) performance changes. NSGA-II generates diverse optimized designs, and CoT rationales generated by a teacher model are used for supervised training.
    • Design Motivation: To enable the LLM to understand the relationship between pragmas and QoR, rather than just mechanically inserting pragmas.
  2. HLSFixer (Hierarchical Feedback Debugging Framework):

    • Function: Resolves HLS-specific compilation and synthesis errors.
    • Mechanism: Decouples debugging into error identification, diagnosis, and repair. An analysis LLM extracts error information from HLS tool feedback and generates repair instructions, which a repair LLM then executes. For errors outside the training distribution, an LLM-as-a-Judge system evaluates outputs from multiple perspectives.
    • Design Motivation: A reasoning-to-instruction approach is more controllable and interpretable than end-to-end repair.
  3. VODA (Self-evolving Data Augmentation):

    • Function: Continuously expands the error case library.
    • Mechanism: Automatically captures newly encountered error cases during the ChatHLS workflow to further strengthen the debugging capabilities of HLSFixer.
    • Design Motivation: To address the long-tail distribution of HLS-specific errors.
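
To make the role of NSGA-II in HLSTuner more concrete, the sketch below shows the non-domination test that multi-objective searches of this kind are built on. It is not the full NSGA-II algorithm (there is no crowding distance, crossover, or mutation), and the choice of latency and resource usage as the two objectives, the `Design` type, and all numbers are illustrative assumptions rather than values from the paper.

```python
from typing import Dict, List, NamedTuple


class Design(NamedTuple):
    pragmas: Dict[str, int]  # hypothetical knobs, e.g. a loop unroll factor
    latency: float           # assumed objective, lower is better
    resource: float          # assumed objective, lower is better


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.latency <= b.latency and a.resource <= b.resource
    better = a.latency < b.latency or a.resource < b.resource
    return no_worse and better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Made-up QoR numbers, for illustration only.
candidates = [
    Design({"unroll": 1}, latency=1000.0, resource=10.0),
    Design({"unroll": 4}, latency=300.0, resource=35.0),
    Design({"unroll": 8}, latency=280.0, resource=90.0),
    Design({"unroll": 2}, latency=600.0, resource=40.0),
]
front = pareto_front(candidates)
```

With these toy numbers the `unroll=2` candidate is dropped, because `unroll=4` is both faster and smaller; the remaining three designs each represent a different latency/resource trade-off, which is the kind of diversity a set of optimized designs is meant to provide.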

Loss & Training

HLSTuner is trained with supervised fine-tuning on the optimized designs found by NSGA-II together with CoT rationales generated by a teacher model. HLSFixer is fine-tuned with the decoupled reasoning-to-instruction scheme.
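
A decoupled reasoning-to-instruction step might look like the sketch below: one component turns code plus tool feedback into explicit repair instructions, a second applies them, and each stage yields its own training record. The `RepairInstruction` fields, the record layout, the feedback string, and the rule-based `toy_analyze` / `toy_repair` stand-ins are all assumptions made for illustration; in ChatHLS both stages are LLMs, and the actual data format is not given in this summary. The toy case uses dynamic memory allocation, a construct HLS tools generally cannot synthesize.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class RepairInstruction:
    error_type: str  # what kind of problem was diagnosed
    location: str    # where in the code it occurs
    action: str      # what the repair stage should do


Analyzer = Callable[[str, str], List[RepairInstruction]]
Repairer = Callable[[str, List[RepairInstruction]], str]


def debug_step(
    code: str, feedback: str, analyze: Analyzer, repair: Repairer
) -> Tuple[List[RepairInstruction], str]:
    """Diagnose first, then repair by following the explicit instructions."""
    instructions = analyze(code, feedback)
    return instructions, repair(code, instructions)


def to_sft_records(
    code: str, feedback: str, instructions: List[RepairInstruction], repaired: str
) -> List[Dict[str, str]]:
    """Build one (assumed) training record per stage."""
    plan = "\n".join(f"[{i.error_type}] {i.location}: {i.action}" for i in instructions)
    return [
        {"role": "analysis", "input": f"{code}\n---\n{feedback}", "target": plan},
        {"role": "repair", "input": f"{code}\n---\n{plan}", "target": repaired},
    ]


# Toy rule-based stand-ins for the two LLM stages.
BROKEN = "int *buf = malloc(64 * sizeof(int));"
FIXED = "int buf[64];"


def toy_analyze(code: str, feedback: str) -> List[RepairInstruction]:
    if "malloc" in code:
        return [RepairInstruction("dynamic_memory", "malloc call", "use a fixed-size array")]
    return []


def toy_repair(code: str, instructions: List[RepairInstruction]) -> str:
    for ins in instructions:
        if ins.error_type == "dynamic_memory":
            code = code.replace(BROKEN, FIXED)
    return code


feedback = "hypothetical tool message: dynamic allocation is not synthesizable"
instructions, repaired = debug_step(BROKEN, feedback, toy_analyze, toy_repair)
records = to_sft_records(BROKEN, feedback, instructions, repaired)
```

Because the intermediate instructions are explicit text, they can be inspected or scored on their own, which is one way to read the claim that this approach is more controllable and interpretable than end-to-end repair.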

Key Experimental Results

Main Results

  • ChatHLS improves debugging performance by 32.6% relative to Gemini-3-pro.
  • The HLS-C generation success rate increases by 41.8%.
  • ChatHLS achieves a 3.3× performance gain over RAG-based methods.

Key Findings

  • QoR-aware reasoning significantly outperforms simple code-to-code mapping.
  • Hierarchical feedback debugging is more effective than end-to-end repair.
  • The VODA self-evolution mechanism continuously improves debugging performance.

Highlights & Insights

  • QoR-aware reasoning allows LLMs to "understand" hardware constraints rather than simply generating code.
  • The decoupled reasoning-to-instruction debugging method offers high interpretability and reliability.

Limitations & Future Work

  • Currently targeted at specific HLS toolchains, which may limit applicability to other EDA tools.
  • Building the CoT training data is computationally expensive because it depends on NSGA-II search.
  • Future research could explore end-to-end Reinforcement Learning (RL) as an alternative to supervised fine-tuning.

Related Work

  • Compared to template-based methods like HeteroRefactor and HeteroGen, ChatHLS is more flexible and does not require predefined templates.
  • Compared to RAG methods, specialized fine-tuning provides more precise and structured domain knowledge.

Rating

  • Novelty: ⭐⭐⭐⭐ QoR-aware reasoning and self-evolving debugging are novel designs.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Evaluated against multiple benchmarks and baselines.
  • Writing Quality: ⭐⭐⭐⭐ Detailed framework descriptions and clear flowcharts.