METHODS FOR IMPLEMENTATION OF STRING OBJECT RECOGNITION RESULTS IN REAL-TIME VIDEO STREAMS

Authors

DOI:

https://doi.org/10.31891/2219-9365-2024-80-41

Keywords:

real-time video stream, string object recognition, text recognition, dynamic programming, video-based OCR, recognition result integration, distortion mitigation

Abstract

This study addresses string object recognition in real-time video streams, with a focus on improving recognition accuracy under dynamic conditions where frames are degraded by defocus, glare, and motion artefacts. It proposes an algorithm that integrates recognition results from multiple video frames using extended result models, which retain the alternative classification options for each object rather than only the top-ranked one. The algorithm aggregates per-frame results using dynamic programming and metrics such as the Generalized Levenshtein Distance. In experiments on the MIDV-500 dataset, the proposed method produces fewer recognition errors than traditional approaches, including the ROVER method, across various text fields. The findings indicate that the algorithm is robust and scalable for applications in document digitization, automated data extraction, and mobile-based text recognition. Future research directions include optimizing computational efficiency, extending the method to multilingual recognition tasks, and validating performance on more diverse datasets to confirm that it generalizes to real-world applications.
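
The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes each frame result is a list of cells, where a cell maps candidate characters to scores that sum to 1. The names `Cell`, `EMPTY`, `cell_distance`, `align`, `combine`, `integrate`, and `decode` are hypothetical, and both the cell metric (half the L1 distance) and the running weighted average are assumptions chosen for simplicity.

```python
from typing import Dict, List, Optional, Tuple

Cell = Dict[str, float]   # one character position: candidate label -> score
EMPTY = ""                # hypothetical label meaning "no character here"
GAP: Cell = {EMPTY: 1.0}  # cell used when one side of the alignment has no character


def cell_distance(a: Cell, b: Cell) -> float:
    """Half the L1 distance between two score maps (0 = identical, 1 = disjoint)."""
    labels = set(a) | set(b)
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in labels)


def align(x: List[Cell], y: List[Cell]) -> Tuple[float, List[Tuple[Optional[int], Optional[int]]]]:
    """Levenshtein-style dynamic programming where substitution cost is cell_distance."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + cell_distance(x[i - 1], GAP)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + cell_distance(GAP, y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + cell_distance(x[i - 1], y[j - 1]),  # match / substitute
                d[i - 1][j] + cell_distance(x[i - 1], GAP),           # x cell has no partner
                d[i][j - 1] + cell_distance(GAP, y[j - 1]),           # y cell has no partner
            )
    # Backtrace: recompute the same expressions to find which move produced each value.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cell_distance(x[i - 1], y[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + cell_distance(x[i - 1], GAP):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return d[n][m], pairs


def combine(acc: List[Cell], frame: List[Cell], w_acc: float, w_frame: float = 1.0) -> List[Cell]:
    """Merge a new frame result into the accumulated result along the optimal alignment."""
    _, pairs = align(acc, frame)
    total = w_acc + w_frame
    merged: List[Cell] = []
    for i, j in pairs:
        a = acc[i] if i is not None else GAP
        b = frame[j] if j is not None else GAP
        labels = set(a) | set(b)
        merged.append({k: (w_acc * a.get(k, 0.0) + w_frame * b.get(k, 0.0)) / total for k in labels})
    return merged


def integrate(frames: List[List[Cell]]) -> List[Cell]:
    """Fold frame results one by one; the accumulator weight is the number of frames seen."""
    acc = frames[0]
    for seen, frame in enumerate(frames[1:], start=1):
        acc = combine(acc, frame, w_acc=float(seen))
    return acc


def decode(cells: List[Cell]) -> str:
    """Take the top-scoring label per cell; EMPTY contributes nothing to the string."""
    return "".join(max(c, key=c.get) for c in cells)


# Three made-up frame results for the same field; the second frame misreads "B" as "8".
f1 = [{"A": 0.9, "4": 0.1}, {"B": 0.6, "8": 0.4}, {"C": 1.0}]
f2 = [{"A": 0.8, "4": 0.2}, {"8": 0.7, "B": 0.3}, {"C": 1.0}]
f3 = [{"A": 1.0}, {"B": 0.9, "8": 0.1}, {"C": 0.9, "G": 0.1}]
result = decode(integrate([f1, f2, f3]))
```

In this toy input the second frame alone would decode to "A8C", but because the alternatives are kept and averaged across frames, the integrated result decodes to "ABC". A voting scheme that keeps only the top-ranked character per frame would discard the score information that makes this recovery possible.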

Published

2024-11-28

How to Cite

GUANXIANG, X., & KOVTUN, V. (2024). METHODS FOR IMPLEMENTATION OF STRING OBJECT RECOGNITION RESULTS IN REAL-TIME VIDEO STREAMS. MEASURING AND COMPUTING DEVICES IN TECHNOLOGICAL PROCESSES, (4), 338–347. https://doi.org/10.31891/2219-9365-2024-80-41