ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos

Brief description

We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle any number of repetitive segments, each of arbitrary duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method lifts several of the limiting assumptions that existing approaches make about the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
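To make the representation described above concrete, the following is a minimal sketch of building a matrix of pairwise frame distances from per-frame feature vectors. It is an illustration under stated assumptions, not the ReActNet implementation: the function name pairwise_distance_matrix is hypothetical, the Euclidean distance is an assumed choice (the description above does not name the metric), and random vectors stand in for the CNN frame features.

```python
import numpy as np


def pairwise_distance_matrix(features: np.ndarray) -> np.ndarray:
    """Return the N x N matrix of Euclidean distances between N frame features.

    `features` has shape (N, d): one d-dimensional vector per frame.
    The Euclidean metric is an assumption made for this sketch.
    """
    sq_norms = np.sum(features ** 2, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, evaluated for all frame pairs.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (features @ features.T)
    # Clamp tiny negative values caused by floating-point round-off.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.sqrt(sq_dists)


# Stand-in data: random vectors replace real CNN features (assumption).
rng = np.random.default_rng(0)
cycle = rng.normal(size=(5, 16))    # 5 frames of one hypothetical motion cycle
features = np.tile(cycle, (4, 1))   # the same cycle repeated 4 times: 20 frames
D = pairwise_distance_matrix(features)
```

In this toy input, frames that are a whole number of cycles apart have near-zero distance, so the matrix shows low-valued stripes parallel to the main diagonal. Structure of this kind is what a classifier operating on the distance matrix can exploit to decide whether a frame lies in a repetitive segment.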

Sample results

Video with description and experimental results


  • Giorgos Karvounas, Iason Oikonomidis, Antonis A. Argyros
  • This work was partially supported by EU H2020 project Co4Robots (Grant No 731869).

Relevant publications

  • G. Karvounas, I. Oikonomidis, A.A. Argyros, “ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos”, Intelligent Short Video 2019 (ISV 2019 - ICCVW 2019), Seoul, South Korea, October 2019.

The electronic versions of the above publications can be downloaded from my publications page.