Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion

Conference: CVPR 2026 | arXiv: 2509.17704 | Code: MorvanLi/ND-CNPFuse | Area: Interpretability
Keywords: Multi-focus image fusion, coupled neural P systems, neurodynamics, decision map, spiking mechanism

TL;DR

This paper proposes ND-CNPFuse, which performs neurodynamical analysis of coupled neural P (CNP) systems to establish constraint relationships between network parameters and input signals, preventing abnormal sustained neuronal firing. The method generates high-quality, interpretable decision maps for multi-focus image fusion (MFIF) without any training.

Background & Motivation

Multi-focus image fusion (MFIF) aims to fuse multiple images of the same scene captured at different focal lengths into a single all-in-focus image. The core challenge lies in generating decision maps with precise boundaries. Existing methods suffer from two categories of problems:

End-to-end deep learning methods: directly generate fused images but struggle to maintain spatial consistency with source images.

Decision-map-based deep learning methods: use networks to predict focused/defocused regions, but internal mechanisms are uninterpretable (black-box), leading to spurious edges and artifacts in decision maps.

Coupled neural P (CNP) systems are biologically inspired neural computation models motivated by the synchronous spiking mechanisms of the mammalian visual cortex, making them naturally suited for distinguishing focused from defocused regions. However, directly applying CNP to MFIF can cause abnormal sustained firing, making spike counts unable to accurately reflect focus differences. This paper addresses the issue by analyzing the dynamics of CNP neurons.

Method

Overall Architecture

ND-CNPFuse consists of three modules: (a) input preprocessing → (b) ND-CNP system for decision map generation → (c) pixel-level fusion. Given a pair of multi-focus source images \(A\) and \(B\), the final fused image is:

\[F(i,j) = A(i,j) \times DM(i,j) + B(i,j) \times (1 - DM(i,j))\]
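Because \(DM\) is binary, the fusion rule maps directly to array arithmetic. A minimal sketch (the function name `fuse` is illustrative, not from the paper):

```python
import numpy as np

def fuse(A: np.ndarray, B: np.ndarray, DM: np.ndarray) -> np.ndarray:
    """Pixel-level fusion: take A where the decision map is 1, B where it is 0.

    F(i,j) = A(i,j) * DM(i,j) + B(i,j) * (1 - DM(i,j))
    """
    return A * DM + B * (1.0 - DM)
```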

Key Designs

  1. CNP Neuron Dynamics Analysis: Each CNP neuron contains three memory units—feeding input unit \(U\), linking input unit \(V\), and dynamic threshold unit \(T\)—updated as follows:

    • \(U(t) = \alpha U(t-1) + I + K(t-1)\) (leaky accumulation of the external input \(I\) and coupling signal \(K\))
    • \(V(t) = \sum_{n=0}^{t-1} K(n) \beta^{t-n-1}\) (accumulation of neighborhood coupling signals)
    • \(T(t) = \lambda \frac{1-\gamma^{t-1}}{1-\gamma}\) (threshold grows with iterations)

The core theorem (Theorem 4) derives a closed-form condition for sustained firing: \(I > \frac{\lambda(1-\alpha)(1-\beta)}{(1-\gamma)(1-\beta+\text{sum}(W))} - \text{sum}(W)\). Corollary 1 follows: the external input must not exceed this bound, and all parameters can be configured automatically from the input image, with no manual tuning required.

  2. SML Input Preprocessing: Sum-Modified Laplacian (SML) preprocessing converts raw pixel values into richer focus-feature signals, avoiding the restricted neuronal firing that raw intensities can cause. Ablation studies show that SML has only a minor effect on overall performance but helps with noisy inputs.

  3. Spike-Count-Based Decision Map Generation: Two ND-CNP systems \(\Phi_A\) and \(\Phi_B\) take the preprocessed \(A\) and \(B\) as inputs, run to the maximum iteration count, and output spike matrices \(SM_A\) and \(SM_B\). Firing counts \(F_A\) and \(F_B\) are accumulated within coupling radius \(r\), and the decision map comes from a direct comparison: \(DM(i,j) = \begin{cases} 1, & F_A(i,j) > F_B(i,j) \\ 0, & \text{otherwise} \end{cases}\) Focused regions produce more spikes (consistent with human visual perception), so the whole process needs no post-processing and is fully interpretable.
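The SML preprocessing and spike-count comparison can be sketched as follows. The SML formulation below is the standard one (the paper's exact window sizes are assumptions), the helper names are illustrative, and the spike matrices are taken as given, produced by the ND-CNP runs:

```python
import numpy as np

def _window_sum(x: np.ndarray, r: int) -> np.ndarray:
    """Sum x over a (2r+1)x(2r+1) window around each pixel (edge-padded)."""
    h, w = x.shape
    p = np.pad(x, r, mode="edge")
    out = np.zeros((h, w))
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def sum_modified_laplacian(img: np.ndarray, radius: int = 1) -> np.ndarray:
    """Standard SML focus measure: modified Laplacian magnitude, summed over
    a (2*radius+1)^2 neighborhood. Window sizes here are assumptions."""
    img = img.astype(np.float64)
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    c = p[1:1 + h, 1:1 + w]
    ml = (np.abs(2 * c - p[0:h, 1:1 + w] - p[2:2 + h, 1:1 + w])   # vertical term
        + np.abs(2 * c - p[1:1 + h, 0:w] - p[1:1 + h, 2:2 + w]))  # horizontal term
    return _window_sum(ml, radius)

def decision_map(SM_A: np.ndarray, SM_B: np.ndarray, r: int = 16) -> np.ndarray:
    """Compare spike counts within coupling radius r: DM = 1 where system
    Phi_A fired more, 0 otherwise -- no post-processing."""
    FA = _window_sum(SM_A.astype(np.float64), r)
    FB = _window_sum(SM_B.astype(np.float64), r)
    return (FA > FB).astype(np.uint8)
```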

Loss & Training

This method requires no training whatsoever. All parameters (\(\alpha, \beta, \gamma, \lambda\), etc.) are configured automatically through the neurodynamical analysis. The key hyperparameters are the coupling radius \(r=16\) and iteration count \(t=110\); sensitivity analysis shows that both generalize well across datasets.
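A minimal sketch of the Theorem 4 constraint, assuming the bound formula quoted in the Key Designs section (function names are illustrative). In the paper the relation is used in reverse: parameters are chosen so that the image's inputs stay below the bound.

```python
import numpy as np

def sustained_firing_bound(lam: float, alpha: float, beta: float,
                           gamma: float, sum_w: float) -> float:
    """Closed-form bound from Theorem 4: external inputs I above this value
    cause abnormal sustained firing (decay parameters in (0, 1);
    sum_w = sum of the coupling weights W)."""
    return (lam * (1 - alpha) * (1 - beta)
            / ((1 - gamma) * (1 - beta + sum_w))) - sum_w

def inputs_are_safe(I: np.ndarray, lam: float, alpha: float, beta: float,
                    gamma: float, sum_w: float) -> bool:
    """Corollary 1: every external input must stay at or below the bound."""
    return bool(np.all(I <= sustained_firing_bound(lam, alpha, beta, gamma, sum_w)))
```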

Key Experimental Results

Main Results

Comparisons against 9 state-of-the-art methods on four classical MFIF datasets using 6 evaluation metrics:

| Dataset | Metric | ND-CNPFuse | Prev. SOTA | Rank |
|---|---|---|---|---|
| Lytro | \(Q_{abf}\) | 0.7621 | 0.7613 (PADCDTNP) | 1st |
| Lytro | \(FMI_w\) | 0.5967 | 0.5916 (PADCDTNP) | 1st |
| Lytro | SSIM | 0.8541 | 0.8525 (CCF) | 1st |
| MFFW | \(Q_{abf}\) | 0.7399 | 0.7384 (DMANet) | 1st |
| MFI-WHU | \(FMI_w\) | 0.6268 | 0.6248 (SAMF/DMANet) | 1st |
| Real-MFF | PSNR | 34.2024 | 34.0174 (DMANet) | 1st |

Runtime: 0.41 s (MATLAB) / 0.18 s (C++) on CPU only, faster even than DMANet running on GPU (0.21 s). Energy consumption: \(1.12 \times 10^{-5}\) J per image pair, which is extremely low.

Ablation Study

| Configuration | \(Q_{abf}\) | \(FMI_w\) | SSIM | PSNR | Note |
|---|---|---|---|---|---|
| Without neurodynamical analysis | 0.747 | 0.509 | 0.841 | 25.702 | Baseline CNP system |
| With neurodynamical analysis | 0.762 | 0.597 | 0.854 | 26.990 | +17.29% \(FMI_w\) |
| Without SML | 0.761 | 0.593 | 0.852 | 26.983 | Minimal impact |
| With SML | 0.762 | 0.597 | 0.854 | 26.990 | Slight improvement |

Key Findings

  • Neurodynamical analysis is the core contribution, yielding a 17.29% improvement in \(FMI_w\), indicating substantially better feature information preservation.
  • Decision map visualizations demonstrate that the ND-CNP system generates maps with sharper boundaries and higher accuracy, avoiding the region misclassification seen in the baseline CNP.
  • Parameters \(r\) and \(t\) perform consistently across four datasets, validating the generalizability of the method.

Highlights & Insights

  1. Biologically inspired + theory-driven: Rather than simply applying a neural computation model, this work performs in-depth dynamical analysis and derives closed-form constraint conditions, making the model reliable and practically usable.
  2. Training-free and interpretable: No deep learning training is required; the decision map generation process is based on spike count comparison with clear physical meaning.
  3. Extremely low energy consumption and real-time performance: Real-time fusion runs on CPU alone (C++, 0.18 s) at an energy cost of only about \(10^{-5}\) J, making it suitable for edge deployment.
  4. First theoretical analysis of its kind: This is the first work to study the neurodynamics of CNP systems, opening a new direction for the theoretical understanding of this class of models.

Limitations & Future Work

  1. Numerical improvements are relatively modest; on some metrics, gains over PADCDTNP and DMANet are marginal.
  2. The current method handles only the standard MFIF scenario with two input images; although the appendix extends to multiple images, the complexity analysis is insufficient.
  3. The 110-iteration count is relatively large; it is worth exploring whether adaptive termination strategies could further accelerate inference.
  4. SML preprocessing has limited capacity to handle low-contrast edges (e.g., on the MFFW dataset), preventing optimal SSIM performance.
Relation to Prior Work

  • Relationship to DMANet/PADCDTNP: these methods likewise focus on decision map quality but rely on deep-learning black boxes; ND-CNPFuse provides an interpretable alternative.
  • Relationship to prior CNP works: earlier works proposed CNP systems but relied heavily on manual parameter tuning; this paper resolves the parameter-automation problem through dynamical analysis.
  • Insights: the spike count → focus estimation paradigm can be extended to other tasks requiring pixel-level decisions (e.g., saliency detection, depth estimation), and the neurodynamical constraint analysis framework can be applied to other spiking neural network models.

Rating

  • Novelty: ⭐⭐⭐⭐ First to introduce neurodynamical analysis into CNP systems for image fusion, with solid theoretical contributions.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Four datasets, nine comparison methods, six metrics, and comprehensive ablation studies.
  • Writing Quality: ⭐⭐⭐⭐ Theorem derivations are clear, illustrations are intuitive, and the overall structure is well-organized.
  • Value: ⭐⭐⭐ The fusion domain is relatively niche and numerical gains are limited, but the work establishes a new paradigm for interpretable fusion methods.