Design and Implementation of VRUs Detection and Speed Estimation Using an Uncalibrated Top-View Perspective Camera
Monitoring Vulnerable Road Users (VRUs) in video surveillance systems is crucial for road safety, autonomous driving, and efficient traffic control. However, contextual variables including occlusion, varying background conditions, environmental factors, and illumination changes pose significant challenges to accurate and reliable VRU tracking. This study focuses on enhancing road user detection, tracking, counting, and approximate speed estimation using vision-based techniques, with a special emphasis on cyclists on crosswalks at intersections. It explores the challenges associated with top-view data and proposes integrating YOLOv8 with ByteTrack, with the detector fine-tuned on a top-view dataset, for improved object detection. VRUs are tracked, and a methodology for monitoring them and calculating their speed is introduced, emphasizing the importance of traffic analysis zones. Using limited real data enriched with augmented synthetic datasets, the proposed approach is evaluated in terms of detecting, tracking, counting, and calculating the speed of VRUs. Evaluation on both synthetic and real datasets indicates practical model generalization and data accessibility. The fine-tuned YOLOv8 achieved an Average Precision (AP) of 0.845 on a top-view test dataset. YOLOv8 with ByteTrack processed the video with an inference time of 13.9 ms, which demonstrates efficiency in tracking objects in real-time scenarios. The proposed cyclist speed estimation method was validated against ground-truth data, showing some discrepancies but overall reasonable accuracy.