
ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis

Conference: ACL 2026 arXiv: 2507.00642 Code: None Area: LLM-Assisted Hardware Design Keywords: High-Level Synthesis, LLM-Assisted Design, Multi-Agent, Pragma Optimization, Automated Debugging

TL;DR

ChatHLS proposes a multi-agent HLS design framework built around two core components: HLSTuner, which performs QoR-aware reasoning for pragma selection, and HLSFixer, a hierarchical feedback-enhanced debugging framework. Combined with VODA, a self-evolving error-case augmentation mechanism, it achieves significant improvements over baselines in both HLS-C generation success rate and hardware performance optimization.

Background & Motivation

Background: High-Level Synthesis (HLS) accelerates hardware design by compiling C/C++ into register-transfer-level hardware descriptions, raising the level of design abstraction. The success of LLMs in code generation has spurred research interest in applying them to HLS.
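For readers new to HLS-C, the following toy kernel (illustrative, not taken from the paper) shows the kind of pragma-annotated C code such frameworks target, assuming AMD Vitis-style directives:

```cpp
// A toy HLS-C kernel. The #pragma HLS directives tell the HLS tool how
// to map the loop and arrays onto hardware; a plain C++ compiler
// simply ignores them, so the code still compiles and runs on a CPU.
const int N = 1024;

void vec_scale(const int in[N], int out[N], int k) {
#pragma HLS ARRAY_PARTITION variable=in cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=out cyclic factor=4
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1  // start one new loop iteration per cycle
        out[i] = in[i] * k;
    }
}

int main() {
    static int a[N], b[N];
    vec_scale(a, b, 3);  // host-side functional test
    return 0;
}
```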

Limitations of Prior Work: (1) HLS-specific data is scarce, and existing datasets rarely expose synthesizability constraints, pragma selection rationale, or QoR correlations; (2) the combinatorially explosive pragma tuning space makes manual optimization extremely time-consuming; (3) general-purpose LLMs struggle to identify and correct HLS-specific compatibility errors.

Key Challenge: HLS design requires simultaneously optimizing functional correctness and hardware efficiency, yet existing LLMs lack understanding of hardware constraints and pragma semantics.

Goal: Construct an automated framework for HLS design, optimization, and debugging.

Key Insight: Multi-agent collaboration + specialized fine-tuning + self-evolving data augmentation.

Core Idea: Enable LLMs to understand the causal relationship between pragmas and hardware performance through QoR-aware reasoning, and accurately diagnose HLS errors via a reasoning-to-instruction approach.

Method

Overall Architecture

The framework consists of two major phases: (1) HLS-C Generation — an LLM generates HLS-C code, and HLSTuner selects and inserts pragmas; (2) HLS-C Debugging — HLSFixer parses tool feedback for error diagnosis and repair, while VODA expands the error case library.
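As a rough sketch of this two-phase loop, the following uses hypothetical stubs in place of the LLM agents and HLS tool calls; none of these function names come from the paper:

```cpp
#include <iostream>
#include <optional>
#include <string>

// Stand-ins for the ChatHLS agents and tools. Real implementations
// would prompt LLMs and invoke an HLS compiler such as Vitis HLS.
std::string generate_hls_c(const std::string& spec) { return "/* HLS-C for: " + spec + " */"; }
std::string tune_pragmas(const std::string& code) { return code + "\n// pragmas inserted by HLSTuner"; }
std::optional<std::string> synthesize(const std::string&) {
    return std::nullopt;  // nullopt = success; otherwise the error log
}
std::string fix_errors(const std::string& code, const std::string&) { return code; }

int main() {
    // Phase 1: HLS-C generation, then QoR-aware pragma optimization.
    std::string code = tune_pragmas(generate_hls_c("vector scaling kernel"));

    // Phase 2: debugging loop driven by HLS tool feedback (HLSFixer);
    // VODA would also record each new error case along the way.
    for (int attempt = 0; attempt < 5; ++attempt) {
        auto error_log = synthesize(code);
        if (!error_log) break;                // synthesizable: done
        code = fix_errors(code, *error_log);  // diagnose, instruct, repair
    }
    std::cout << code << "\n";
}
```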

Key Designs

  1. HLSTuner (QoR-Aware Pragma Optimization):

    • Function: Automates HLS pragma selection, configuration, and insertion.
    • Mechanism: Takes the source HLS-C and its initial QoR as input, then uses QoR-aware reasoning to analyze the causal chain of pragma changes → hardware architecture changes → performance changes. NSGA-II (a multi-objective genetic algorithm) is used to generate diverse optimized designs, and a teacher model generates optimization CoTs for supervised training.
    • Design Motivation: Enable LLMs to understand the relationship between pragmas and QoR rather than mechanically inserting pragmas (see the first sketch after this list).
  2. HLSFixer (Hierarchical Feedback Debugging Framework):

    • Function: Resolves HLS-specific compilation and synthesis errors.
    • Mechanism: Decouples debugging into error identification, diagnosis, and repair. An analysis LLM extracts error information from HLS tool feedback and generates repair instructions, which a repair LLM then executes. For out-of-distribution errors, an LLM-as-a-Judge system performs multi-perspective evaluation.
    • Design Motivation: The reasoning-to-instruction approach is more controllable and interpretable than end-to-end repair (see the second sketch after this list).
  3. VODA (Self-Evolving Data Augmentation):

    • Function: Continuously expands the error case library.
    • Mechanism: Automatically captures newly encountered error cases during the ChatHLS workflow and uses them to further reinforce HLSFixer's debugging capability.
    • Design Motivation: Address the long-tail distribution of HLS errors (see the third sketch after this list).
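To make HLSTuner's causal chain concrete, here is a hypothetical before/after transformation of the kind it would emit, with the pragma → architecture → QoR reasoning written as comments (Vitis-style directives; the cycle counts are illustrative, not reported results):

```cpp
// Before: no pragmas. The tool schedules one multiply-add at a time,
// so latency is on the order of 256 * II cycles.
void axpy_before(const int a[256], const int b[256], int out[256], int k) {
    for (int i = 0; i < 256; ++i)
        out[i] = a[i] * k + b[i];
}

// After: an HLSTuner-style rewrite. The reasoning chain:
//   PIPELINE II=1      -> iterations overlap        -> ~256 cycles
//   UNROLL factor=4    -> 4 parallel multiply-adds  -> ~64 cycles
//   ARRAY_PARTITION x4 -> 4 memory ports per array  -> the unrolled
//                         body is not starved by single-port BRAM
// i.e., pragma change -> architecture change -> QoR change.
void axpy_after(const int a[256], const int b[256], int out[256], int k) {
#pragma HLS ARRAY_PARTITION variable=a cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=b cyclic factor=4
#pragma HLS ARRAY_PARTITION variable=out cyclic factor=4
    for (int i = 0; i < 256; ++i) {
#pragma HLS PIPELINE II=1
#pragma HLS UNROLL factor=4
        out[i] = a[i] * k + b[i];
    }
}
```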
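Next, a minimal sketch of HLSFixer's reasoning-to-instruction hand-off, with a deterministic rule standing in for the analysis LLM; the struct, function names, and error text are assumptions, not the paper's API:

```cpp
#include <iostream>
#include <string>

// The structured hand-off between the analysis and repair models:
// the analysis side never edits code, it only emits an instruction.
struct RepairInstruction {
    std::string error_class;  // identified error category
    std::string diagnosis;    // why the HLS tool rejected the code
    std::string instruction;  // concrete edit for the repair model
};

// Deterministic stand-in for the analysis step: map a tool log line
// to a repair instruction. A real system would prompt an LLM here.
RepairInstruction analyze(const std::string& tool_log) {
    if (tool_log.find("dynamic memory") != std::string::npos) {
        return {"hls-compatibility",
                "heap allocation is not synthesizable in HLS-C",
                "Replace malloc/free with a fixed-size static array."};
    }
    return {"unknown", "out-of-distribution error",
            "Escalate to the LLM-as-a-Judge evaluation path."};
}

int main() {
    RepairInstruction r =
        analyze("ERROR: [HLS] unsupported dynamic memory allocation");
    std::cout << r.error_class << ": " << r.instruction << "\n";
}
```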
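Finally, a sketch of the VODA capture step under the same caveats: each newly resolved error is stored as a training triple and later used to reinforce HLSFixer:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical VODA-style error-case library (names not from the paper).
struct ErrorCase {
    std::string broken_code;   // HLS-C that failed synthesis
    std::string tool_log;      // the HLS tool's error feedback
    std::string verified_fix;  // repair that passed re-synthesis
};

class ErrorCaseLibrary {
    std::vector<ErrorCase> cases_;
public:
    // Called from the debugging loop whenever a new error is resolved;
    // the accumulated cases periodically become fine-tuning data for
    // HLSFixer, closing the self-evolving loop.
    void capture(ErrorCase c) { cases_.push_back(std::move(c)); }
    std::size_t size() const { return cases_.size(); }
};
```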

Loss & Training

HLSTuner is trained with supervised fine-tuning on optimization CoTs that a teacher model generates from NSGA-II search results. HLSFixer is fine-tuned in a decoupled fashion, with separate data for the analysis and repair models, following the reasoning-to-instruction approach.
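As a hedged illustration, the two decoupled training sets might be shaped roughly as follows; the record fields are assumptions for exposition, not the paper's schema:

```cpp
#include <string>

// Hypothetical record shapes for the decoupled fine-tuning data.
// HLSTuner: (code, QoR) -> optimization CoT + pragma-annotated code.
struct TunerExample {
    std::string source_code;  // unoptimized HLS-C
    std::string initial_qor;  // e.g. latency / resource report
    std::string target_cot;   // teacher-generated reasoning
    std::string target_code;  // NSGA-II-derived optimized design
};

// HLSFixer: the analysis and repair models see different pairs.
struct AnalysisExample {
    std::string broken_code, tool_log;  // inputs
    std::string target_instruction;     // reasoning-to-instruction output
};
struct RepairExample {
    std::string broken_code, instruction;  // inputs
    std::string target_fixed_code;         // repaired HLS-C
};
```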

Key Experimental Results

Main Results

  • ChatHLS achieves a 32.6% improvement in debugging over Gemini-3-pro.
  • HLS-C generation success rate improves by 41.8%.
  • Achieves a 3.3× performance gain over RAG-based methods.

Key Findings

  • QoR-aware reasoning significantly outperforms simple code-to-code mapping.
  • Hierarchical feedback debugging is more effective than end-to-end repair.
  • The VODA self-evolving mechanism continuously improves debugging capability.

Highlights & Insights

  • QoR-aware reasoning enables LLMs to "understand" hardware rather than simply generate code.
  • The reasoning-to-instruction decoupled debugging approach offers strong interpretability.

Limitations & Future Work

  • The framework targets a specific HLS toolchain and may not generalize to other EDA tools.
  • The NSGA-II-based CoT generation process incurs high computational cost.
  • Future work may explore end-to-end RL training as an alternative to supervised fine-tuning.
  • Compared to template-based approaches such as HeteroRefactor and HeteroGen, ChatHLS is more flexible and requires no predefined templates.
  • Compared to RAG-based methods, specialized fine-tuning provides more precise domain knowledge.

Rating

  • Novelty: ⭐⭐⭐⭐ QoR-aware reasoning and self-evolving debugging are novel designs.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Multiple benchmarks and baselines are compared.
  • Writing Quality: ⭐⭐⭐⭐ Framework description is thorough with clear pipeline diagrams.