YOLO - You only look once 10647 times
Authors: Christian Limberg, Andrew Melnik, Helge Ritter, Helmut Prendinger
(currently under review for publication at the ICONIP conference)
Abstract
In this article we show that the "You Only Look Once" (YOLO) single-stage object detection approach can be compared to a parallel classification of 10647 fixed region proposals. We support this view with two complementary approaches, both showing that each of YOLO's output pixels is attentive to a specific sub-region of previous layers, comparable to a local region proposal. This understanding reduces the conceptual gap between YOLO-like single-stage object detection models, RCNN-like two-stage region-proposal-based models, and ResNet-like image classification models. This page provides interactive exploration tools and exported media for a better visual understanding of the YOLO information processing streams.
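The abstract does not spell out where the figure 10647 comes from. One way to arrive at it, sketched below, assumes a YOLOv3-style configuration that is not stated in the text above: a 416x416 input, three output scales with strides 32, 16 and 8, and 3 anchor boxes per grid cell. The function name and its parameters are hypothetical and serve only as illustration.

```python
def count_yolo_predictions(input_size=416, strides=(32, 16, 8), anchors_per_cell=3):
    """Count the box predictions emitted by a YOLOv3-style detection head.

    Assumed configuration (not taken from the abstract): a square input,
    one output grid per stride, and a fixed number of anchors per grid cell.
    """
    total = 0
    for stride in strides:
        grid = input_size // stride  # grid side length at this scale: 13, 26, 52
        total += grid * grid * anchors_per_cell
    return total


if __name__ == "__main__":
    # (13*13 + 26*26 + 52*52) * 3 = 3549 * 3
    print(count_yolo_predictions())  # prints 10647
```

Under these assumptions, each of the 10647 predictions corresponds to one output position and anchor, which is what the abstract compares to a fixed region proposal. A different input resolution or anchor count would give a different total.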