METHODS FOR IMPLEMENTATION OF STRING OBJECT RECOGNITION RESULTS IN REAL-TIME VIDEO STREAMS
DOI:
https://doi.org/10.31891/2219-9365-2024-80-41
Keywords:
real-time video stream, string object recognition, text recognition, dynamic programming, video-based OCR, recognition result integration, distortion mitigation
Abstract
This study addresses the challenge of string object recognition in real-time video streams, particularly focusing on improving recognition accuracy under dynamic conditions with distortions such as defocusing, glare, and motion artefacts. A novel algorithm is proposed that integrates recognition results from multiple video frames using extended result models, considering alternative classification options for each object. The algorithm leverages dynamic programming and advanced metrics, such as the Generalized Levenshtein Distance, to aggregate recognition outcomes effectively. Experimental validation on the MIDV-500 dataset demonstrates the proposed method's superiority over traditional approaches, including the ROVER method, in reducing recognition errors across various text fields. The findings highlight the algorithm's robustness and scalability for applications in document digitization, automated data extraction, and mobile-based text recognition. Future research directions include optimizing computational efficiency, expanding to multilingual recognition tasks, and validating performance on diverse datasets to ensure generalizability for real-world applications.
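The integration idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes that each per-frame result is a sequence of cells, where a cell maps alternative character labels to confidences, and that frames are combined by aligning them with a dynamic-programming edit distance whose costs are distances between cells (one common form of a generalized Levenshtein distance). The names `Cell`, `EMPTY`, `cell_distance`, `align`, `merge`, `integrate`, and `decode`, the half-L1 cell metric, and the running-average weighting are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Cell = Dict[str, float]          # one character position: label -> confidence
EMPTY = ""                       # hypothetical label meaning "no character here"
EMPTY_CELL: Cell = {EMPTY: 1.0}

def cell_distance(a: Cell, b: Cell) -> float:
    """Half the L1 distance between two confidence distributions (assumed metric)."""
    labels = sorted(set(a) | set(b))
    return 0.5 * sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in labels)

def align(x: List[Cell], y: List[Cell]) -> Tuple[float, List[Tuple[Cell, Cell]]]:
    """Edit distance with cell-level costs, plus the optimal list of aligned pairs."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + cell_distance(x[i - 1], EMPTY_CELL)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + cell_distance(EMPTY_CELL, y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + cell_distance(x[i - 1], y[j - 1]),  # match/substitute
                d[i - 1][j] + cell_distance(x[i - 1], EMPTY_CELL),    # cell only in x
                d[i][j - 1] + cell_distance(EMPTY_CELL, y[j - 1]),    # cell only in y
            )
    # Trace back from the bottom-right corner to recover the alignment.
    pairs: List[Tuple[Cell, Cell]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cell_distance(x[i - 1], y[j - 1]):
            pairs.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + cell_distance(x[i - 1], EMPTY_CELL):
            pairs.append((x[i - 1], EMPTY_CELL))
            i -= 1
        else:
            pairs.append((EMPTY_CELL, y[j - 1]))
            j -= 1
    pairs.reverse()
    return d[n][m], pairs

def merge(acc: List[Cell], new: List[Cell], w_acc: float, w_new: float) -> List[Cell]:
    """Align two results and average the aligned cells with the given weights."""
    _, pairs = align(acc, new)
    total = w_acc + w_new
    return [
        {k: (w_acc * a.get(k, 0.0) + w_new * b.get(k, 0.0)) / total
         for k in set(a) | set(b)}
        for a, b in pairs
    ]

def integrate(frames: List[List[Cell]]) -> List[Cell]:
    """Fold per-frame results into one running estimate (equal weight per frame)."""
    acc = frames[0]
    for seen, frame in enumerate(frames[1:], start=1):
        acc = merge(acc, frame, float(seen), 1.0)
    return acc

def decode(cells: List[Cell]) -> str:
    """Pick the top label in every cell; EMPTY contributes no character."""
    return "".join(max(c, key=c.get) for c in cells)

# Toy input (made up for illustration): three noisy readings of the same field.
frames = [
    [{"C": 0.9, "G": 0.1}, {"A": 0.8, "4": 0.2}, {"T": 1.0}],   # reads "CAT"
    [{"C": 1.0}, {"T": 0.9, "I": 0.1}],                          # reads "CT"
    [{"G": 0.6, "C": 0.4}, {"A": 1.0}, {"T": 1.0}],              # reads "GAT"
]
combined = decode(integrate(frames))
```

In this toy input the second frame drops a character and the third misreads the first one, yet the combined result recovers the full string because the alternatives carried in each cell still contribute to the weighted average. Keeping the alternatives, instead of only the top label per position, is what distinguishes this kind of integration from voting over plain strings.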