ROI-SSD: REGION OF INTEREST SINGLE SHOT DETECTOR
DOI: https://doi.org/10.31891/2219-9365-2026-86-41
Keywords: neural networks for object detection, regions of interest
Abstract
This paper proposes ROI-SSD, an extension of the Single Shot Detector (SSD) architecture designed to support object detection on image regions of arbitrary size and aspect ratio. Unlike conventional approaches that process full frames of fixed resolution, the proposed method enables detection on local image fragments. This is achieved through dynamic generation of default bounding boxes and adaptive truncation of the convolutional backbone depending on the scale of the corresponding region of interest.
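As an illustration of the two mechanisms named above, the following minimal Python sketch generates default boxes for a region of arbitrary size and skips backbone levels that are too coarse for it. It is a sketch under stated assumptions, not the paper's implementation: the level table `LEVELS`, the aspect ratios, the truncation rule (drop a level when its stride exceeds the shorter ROI side), and the function names `active_levels` and `default_boxes` are all hypothetical.

```python
import math

# Hypothetical backbone description: (stride, box scale in pixels) per level.
# These values are illustrative assumptions, not taken from the paper.
LEVELS = [(8, 32), (16, 64), (32, 128), (64, 256)]
ASPECT_RATIOS = (1.0, 2.0, 0.5)


def active_levels(roi_w, roi_h, levels=LEVELS):
    """Keep only levels whose stride fits inside the ROI (assumed truncation rule)."""
    shorter_side = min(roi_w, roi_h)
    return [(stride, scale) for stride, scale in levels if stride <= shorter_side]


def default_boxes(roi_w, roi_h, levels=LEVELS, ratios=ASPECT_RATIOS):
    """Return default boxes as (cx, cy, w, h) in ROI pixel coordinates."""
    boxes = []
    for stride, scale in active_levels(roi_w, roi_h, levels):
        # The feature-map size follows from the ROI size instead of being fixed.
        fmap_w = math.ceil(roi_w / stride)
        fmap_h = math.ceil(roi_h / stride)
        for row in range(fmap_h):
            for col in range(fmap_w):
                cx = (col + 0.5) * stride
                cy = (row + 0.5) * stride
                for ratio in ratios:
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    boxes.append((cx, cy, w, h))
    return boxes
```

With these assumed levels, a 64x32 region uses only the three finest levels, while a 300x300 frame uses all four, so the number of default boxes scales with the region rather than with a fixed input resolution.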
Conventional object detectors are typically trained and evaluated on fixed-size images, which leads to a significant degradation in detection accuracy when applied to partial image regions. In this work, it is shown that the primary cause of this degradation is the mismatch between the training data distribution and the actual inference conditions. To address this issue, a progressive multi-resolution training strategy is introduced. This strategy gradually expands the set of input resolutions during training and incorporates cropped regions around objects at later stages.
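A progressive schedule of this kind could be sketched as follows. The stage boundaries, resolutions, crop margin, crop probability, and the names `STAGES`, `stage_for_epoch`, `crop_around`, and `sample_training_view` are hypothetical and chosen only for illustration; the abstract does not specify them.

```python
import random

# Hypothetical schedule: each stage widens the set of input resolutions,
# and only the last stage enables crops around objects.
STAGES = [
    {"until_epoch": 10, "sizes": [(300, 300)], "object_crops": False},
    {"until_epoch": 20, "sizes": [(300, 300), (224, 224), (160, 160)],
     "object_crops": False},
    {"until_epoch": 30, "sizes": [(300, 300), (224, 224), (160, 160), (96, 96)],
     "object_crops": True},
]


def stage_for_epoch(epoch, stages=STAGES):
    """Return the first stage whose end epoch has not been reached yet."""
    for stage in stages:
        if epoch < stage["until_epoch"]:
            return stage
    return stages[-1]


def crop_around(box, img_w, img_h, margin, rng):
    """Random window containing box (x0, y0, x1, y1), clamped to the image."""
    x0, y0, x1, y1 = box
    left = max(0, x0 - rng.randint(0, margin))
    top = max(0, y0 - rng.randint(0, margin))
    right = min(img_w, x1 + rng.randint(0, margin))
    bottom = min(img_h, y1 + rng.randint(0, margin))
    return left, top, right, bottom


def sample_training_view(epoch, img_w, img_h, boxes, rng=random):
    """Pick a (window, input_size) pair for one training sample."""
    stage = stage_for_epoch(epoch)
    if stage["object_crops"] and boxes and rng.random() < 0.5:
        window = crop_around(rng.choice(boxes), img_w, img_h, margin=32, rng=rng)
        # A crop is fed at its own size rather than resized to a preset one.
        size = (window[2] - window[0], window[3] - window[1])
    else:
        window = (0, 0, img_w, img_h)
        size = rng.choice(stage["sizes"])
    return window, size
```

Early epochs see only full frames at a single resolution; later epochs mix in smaller resolutions and, in the final stage, object-centred crops, so the training distribution moves toward the partial-region inference scenario.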
Experimental results demonstrate that the proposed architectural modification retains the detection accuracy of the baseline SSD model on full images. At the same time, models trained only on full images exhibit a substantial drop in performance when applied to cropped regions. The proposed training strategy significantly improves detection accuracy on partial image regions, confirming the importance of aligning training conditions with the target inference scenario.
The results indicate that ROI-SSD provides a practical extension of SSD for detection on arbitrarily sized regions and forms a foundation for ROI-aware processing in applications with constrained computational resources.
License
Copyright (c) 2026 Ярослав ГОЗАК

This work is licensed under a Creative Commons Attribution 4.0 International License.


