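Comprehensive Survey of Spiking Neural Networks for Visual Object Detection: From Biological Mechanisms to State-of-the-Art Applications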

仵赛飞; 张渊; 谢迪; 俞海; 朱江

Abstract: Spiking Neural Networks (SNNs), regarded as the third generation of neural network models, show considerable potential for object detection owing to their biologically inspired spiking dynamics, event-driven asynchronous computation, and low power consumption. This paper systematically reviews SNN methods for visual object detection, covering their biological foundations, neuron models, neural coding strategies, dataset categories, and mainstream algorithmic approaches. For neuron models, it analyzes the trade-off between biological plausibility and computational efficiency across the integrate-and-fire (IF), leaky integrate-and-fire (LIF), Izhikevich, and Hodgkin-Huxley models. For coding mechanisms, it describes input encoding methods such as Poisson encoding and intensity-to-latency encoding, together with rate, temporal, and population decoding strategies. For datasets, it classifies four categories (static, converted neuromorphic, captured neuromorphic, and simulation-generated) and examines the characteristics and limitations of each. For SNN algorithms, it analyzes in detail ANN-to-SNN conversion methods and direct training methods based on surrogate gradients, and compares them in terms of accuracy, energy efficiency, latency, and hardware compatibility. Finally, it outlines future directions for SNNs, including training-hardware co-optimization, new architecture design, multimodal extension, and the toolchain ecosystem, providing a reference for research on and application of low-power, energy-efficient spiking vision systems.
