A Bio-Inspired Neural Network for Detecting Crowd Convergence Behavior Based on Locust Visual Mechanisms
Introduction
Crowd convergence behavior refers to the unique movement pattern that occurs when multiple streams of pedestrians merge at intersections or passageways in public spaces. This phenomenon is a precursor to potential public safety risks such as crowding, pushing, and trampling. Historical incidents, such as the 2001 Akashi pedestrian bridge stampede in Japan and the Mina stampede during the Hajj pilgrimage in Saudi Arabia, highlight the dangers associated with uncontrolled crowd convergence. Detecting and warning about such behavior in real time could help prevent catastrophic accidents. However, traditional computer vision techniques struggle with this task because of the random movement of crowds, severe occlusions, and dynamic lighting conditions in public spaces.
To address this challenge, researchers have turned to biological inspiration. Animals, through millions of years of evolution, have developed highly efficient visual systems capable of processing complex motion patterns. Among these, the visual system of locusts stands out due to its simplicity and effectiveness in detecting approaching threats. The Lobula Giant Movement Detector (LGMD) neuron in locusts, for instance, exhibits strong neural responses to looming objects, making it an ideal candidate for modeling collision avoidance and motion perception.
This paper introduces a bio-inspired artificial neural network called the Crowd Convergence Behavior Detection Neural Network (CCBDNN), which leverages the visual processing mechanisms of locusts and the direction-selective properties of mammalian retinas. Unlike traditional computer vision models that rely on optical flow computation or deep learning training, CCBDNN operates without explicit training and directly processes visual motion cues to detect crowd convergence.
Background and Related Work
Traditional Approaches to Crowd Behavior Analysis
Existing research on crowd behavior can be broadly categorized into computer vision-based methods and pedestrian dynamics models.
Computer Vision-Based Methods: These approaches typically extract motion features such as optical flow or employ deep learning techniques to detect anomalies in crowd behavior. For example, some studies use spatiotemporal convolutional networks to identify abnormal crowd movements, while others combine object detection with tracking algorithms to monitor pedestrian flows. However, these methods suffer from high computational costs, sensitivity to lighting conditions, and the need for extensive training data.
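As a point of reference only (this is not the paper's method), the dense optical-flow features such approaches rely on can be computed with OpenCV's Farneback estimator; the per-pixel, per-frame cost is part of what makes these pipelines expensive in crowded, high-resolution scenes. The video path below is a hypothetical placeholder:

```python
# Illustrative dense optical-flow baseline (OpenCV Farneback), not the paper's method.
import cv2

cap = cv2.VideoCapture("crowd.mp4")   # hypothetical input video
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dense flow: one 2-D displacement vector per pixel, recomputed every frame.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    print(f"mean flow magnitude: {mag.mean():.3f}")
    prev_gray = gray

cap.release()
```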
Pedestrian Dynamics Models: These models simulate crowd behavior using mathematical formulations, such as social force models, to study interactions between individuals in merging flows. While useful for understanding crowd behavior in controlled environments, these models require offline data collection and are impractical for real-time detection in dynamic public spaces.
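For context, the social force model mentioned here is conventionally written (in the standard Helbing and Molnár formulation, not reproduced from this paper) as an equation of motion in which each pedestrian relaxes toward a desired velocity while being repelled by other pedestrians and by walls:

```latex
\frac{d\vec{v}_i}{dt} \;=\; \frac{v_i^{0}\,\vec{e}_i^{\,0} - \vec{v}_i}{\tau_i}
\;+\; \sum_{j \neq i} \vec{f}_{ij}
\;+\; \sum_{W} \vec{f}_{iW}
```

Calibrating the repulsive terms requires trajectory data collected offline, which is the practical limitation for real-time detection noted above.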
Biological Inspiration: Locust Vision and Mammalian Direction Selectivity
The locust visual system provides a compelling framework for motion detection. The LGMD neuron, located in the third neuropil layer of the locust’s visual pathway, responds strongly to approaching objects, enabling rapid collision avoidance. The neural pathway involves sequential processing stages—retina, lamina, medulla, lobula, and lobula plate—where visual signals are progressively refined to detect motion and potential threats.
In mammals, direction-selective neurons in the retina play a crucial role in motion perception. These neurons respond preferentially to movement in specific directions while suppressing responses to opposite directions. Previous bio-inspired models, such as the Direction Selection Neural Network (DSNN), have successfully replicated this mechanism to detect global motion patterns. However, these models are limited to recognizing four primary directions and are unsuitable for detecting localized crowd convergence.
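The DSNN itself is not reproduced here, but the direction-selective principle it builds on (excitation for a preferred direction, suppression for the opposite, null direction) can be illustrated with a correlation-type Hassenstein-Reichardt detector over two neighbouring photoreceptor signals. The function name and toy stimulus below are illustrative choices, not part of either model:

```python
# Minimal correlation-type (Hassenstein-Reichardt) direction detector, for illustration only.
import numpy as np

def reichardt_response(signal_a, signal_b, delay=1):
    """Positive output for motion from A toward B, negative for the opposite direction.

    signal_a, signal_b: luminance time series at two neighbouring photoreceptors.
    delay: temporal delay (in samples) applied to each arm before correlation.
    """
    a_delayed = np.roll(signal_a, delay)
    b_delayed = np.roll(signal_b, delay)
    # Correlate the delayed signal of one arm with the undelayed signal of the other,
    # then subtract the mirror term so the null direction yields the opposite sign.
    return np.mean(a_delayed * signal_b - b_delayed * signal_a)

# A bright edge passing first over A, then over B (preferred direction).
t = np.arange(50)
a = (t == 10).astype(float)
b = (t == 11).astype(float)
print(reichardt_response(a, b))   # > 0: preferred direction
print(reichardt_response(b, a))   # < 0: null direction
```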
The CCBDNN Model
The CCBDNN model integrates insights from locust vision and mammalian direction selectivity to detect crowd convergence behavior. The network consists of five neural layers and nine functional neurons, structured as follows (a schematic code sketch of the full pipeline appears after this list):
- Photoreceptor Layer (R Layer): The R layer captures luminance changes in the visual scene, distinguishing moving pedestrians from the background. Each photoreceptor cell responds to brightness variations above a threshold, generating an initial motion signal.
- Lamina Layer (L Layer): This layer applies Gaussian filtering to smooth the visual input, mimicking the preprocessing stage in locust photoreceptors. Only cells exceeding an activation threshold propagate signals to subsequent layers.
- Medulla Layer (M Layer): The M layer divides the visual field into localized receptive fields, analogous to the ommatidia in a locust’s compound eye. Each receptive field activates only if it contains sufficient motion energy, ensuring that processing focuses on regions with significant crowd activity.
- Lobula Layer (Lo Layer) and Direction-Selective Neurons: The Lo layer computes motion suppression signals, while eight direction-selective neurons (L, LD, D, RD, R, RU, U, LU) extract local motion directions within each receptive field. These neurons are arranged in a circular “direction column” to encode eight primary movement directions.
- Lobula Plate Layer (LP Layer) and CCBD Neuron: The LP layer identifies convergence zones by analyzing directional intersections across receptive fields. When multiple motion directions intersect in a localized region, the LP layer marks it as a potential convergence area. Finally, the CCBD neuron integrates these signals and generates a spike response if the motion energy in the convergence zone exceeds a threshold, indicating crowd convergence behavior.
Experimental Validation
Datasets and Evaluation Metrics
The model was tested on multiple datasets, including PDDA (Pedestrian Dynamics Data Archive) and PETS2009, which contain real-world crowd convergence scenarios. Additional non-convergence behaviors (e.g., random motion, escape, violence) from the UMN, UCSD Pedestrian, and Crowd Violence datasets were used to assess the model’s specificity. Performance was evaluated using precision, recall, and false alarm rate (FAR).
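For reference, these three metrics can be computed from per-interval detection counts as follows; treating FAR as false positives over all negative intervals is an assumption about the paper's bookkeeping, not a quoted definition:

```python
# Standard detection metrics from TP/FP/FN/TN counts over evaluation intervals.
def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def false_alarm_rate(fp, tn):
    # Assumed here to be FP / (FP + TN); the paper may count it differently.
    return fp / (fp + tn) if (fp + tn) else 0.0
```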
Key Findings
- Effectiveness in Detecting Crowd Convergence: CCBDNN successfully identified convergence zones in various scenarios, including T-junctions, Y-junctions, and crosswalks. The LP layer’s membrane potential visualization confirmed accurate localization of merging flows, while the CCBD neuron’s spike responses aligned with ground-truth convergence intervals.
- Robustness to Crowd Density: The model performed well in medium- and high-density crowds but showed reduced sensitivity in low-density scenarios where sparse pedestrian movements did not form coherent flows.
- Specificity to Convergence Behavior: CCBDNN exhibited strong selectivity, producing no false positives for non-convergence behaviors such as random motion, escape, or violence. This confirms that the model responds exclusively to directional convergence rather than general motion energy.
- Comparative Advantage Over Existing Models: When tested against related bio-inspired models (e.g., CDNN for collision detection, CEBDNN for escape behavior), CCBDNN outperformed all alternatives, achieving near-perfect precision and recall in convergence detection. Traditional computer vision methods, which rely on optical flow or deep learning, were either computationally expensive or ineffective in dynamic environments.
Conclusion
The CCBDNN model demonstrates the potential of bio-inspired approaches in addressing complex crowd behavior analysis challenges. By combining locust motion detection mechanisms with mammalian direction selectivity, the network achieves robust, training-free detection of crowd convergence without relying on computationally intensive optical flow methods.
Future work could explore additional biological visual mechanisms to enhance the model’s sensitivity to low-density flows or adapt it for real-time deployment in surveillance systems. The success of CCBDNN also opens avenues for applying similar principles to other crowd behavior recognition tasks, such as bottleneck detection or panic spread analysis.
For further details, refer to the original paper: https://doi.org/10.19734/j.issn.1001-3695.2024.05.0224