Positions

  • 2014 - Present

    Postgraduate scholar - Research assistant

    Computational Vision & Robotics Laboratory,
    Institute of Computer Science, FORTH

  • 2012 - 2013

    Software Engineer

    Computational Vision & Robotics Laboratory,
    Institute of Computer Science, FORTH

  • 2008 - 2010

    Postgraduate scholar - Research assistant

    Computational Vision & Robotics Laboratory,
    Institute of Computer Science, FORTH

Education - Training

  • Ph.D. 2014 - present

    Ph.D. candidate in Computer Science

    Computer Science Department, University of Crete, Greece

  • M.Sc. 2008 - 2010

    Master of Science in Computer Science

    Computer Science Department, University of Crete, Greece

  • Dipl.-Ing. 2002 - 2007

    Diploma in Computer Engineering & Informatics

    School of Engineering, University of Patras, Greece

News

  • July 2019
    I participated in the Roche Continents 2019 program. The invitation is extended to 100 talented Ph.D. candidates from across Europe, and I was honoured to be among the nominees.
  • June 2019
    I was awarded the honorary Maria Michail Manassaki Bequest Fellowship for graduate students for the academic year 2017-2018. The fellowship is granted annually to exceptional Ph.D. candidates of the University of Crete.
  • June 2019
    Our paper "Unsupervised and Explainable Assessment of Video Similarity" was accepted as a spotlight presentation in BMVC 2019, Cardiff. !
    image
    Our joint work with my supervisor Prof. Antonis Argyros was accepted to the British Machine Vision Conference (BMVC 2019), BMVA, Cardiff, UK, September 2019 ! The work is titled in "Unsupervised and Explainable Assessment of Video Similarity"..
  • Mar 2018
    A new project page by Prof. Costas Panagiotakis is available for our joint work on "A Graph-based Approach for Detecting Common Actions in Motion Capture Data and Videos"!
    The page, at https://sites.google.com/site/costaspanagiotakis/research/mucos/, contains the source code, datasets and additional results of the method presented in the paper.
  • Feb 2018
    Our joint work with Dr. Costas Panagiotakis and my supervisor Prof. Antonis Argyros, titled "A Graph-based Approach for Detecting Common Actions in Motion Capture Data and Videos", was accepted for publication in Pattern Recognition, Elsevier, 2018!
  • Feb 2017
    A new project page is available for our work on "Temporal Action Co-segmentation in 3D Motion Capture Data & Videos"!
    The page, at www.ics.forth.gr/cvrl/evaco/, contains the datasets and additional results for the method presented at CVPR 2017.
  • Feb 2017
    Our paper "Temporal Action Co-segmentation in 3D Motion Capture Data & Videos" was accepted at CVPR 2017!
    This is joint work with Dr. Costas Panagiotakis and my supervisor Prof. Antonis Argyros, accepted to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).

Projects

    • Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos (EVACO)

      Given two action sequences, we are interested in the problem of temporal action co-segmentation in 3D motion capture data and videos, that is, spotting/co-segmenting all pairs of sub-sequences that represent the same action within those sequences.

      Project Page    Datasets    PDF    BibTex

      Brief Description

      We propose a totally unsupervised solution to this problem. No a-priori model of the actions is assumed to be available. The number of common sub-sequences may be unknown. The sub-sequences can be located anywhere in the original sequences, may differ in duration, and the corresponding actions may be performed by a different person, in a different style. The unsupervised discovery of common patterns in images and videos is considered an important and unsolved problem in computer vision. We are interested in the temporal aspect of the problem and focus on action sequences (sequences of 3D motion capture data or video data) that contain multiple common actions.

      We treat this type of temporal action co-segmentation as a stochastic optimization problem that is solved by employing Particle Swarm Optimization (PSO). The objective function that is minimized by PSO capitalizes on Dynamic Time Warping (DTW) to compare two action sub-sequences.
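
      As a rough illustration of this formulation (a sketch, not the authors' code), the snippet below defines a DTW-based objective over candidate sub-sequence boundaries and searches it with random sampling as a stand-in for PSO; all data, function names and parameters are illustrative.

      ```python
      # A DTW-based matching cost over candidate sub-sequence boundaries (illustrative only).
      import numpy as np

      def dtw_cost(dist):
          """DTW alignment cost for a precomputed pairwise distance matrix (n x m)."""
          n, m = dist.shape
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  D[i, j] = dist[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m] / (n + m)  # normalize by an upper bound on the warping-path length

      def objective(s1, e1, s2, e2, dist, min_len=5):
          """Cost of the hypothesis that seq1[s1:e1] and seq2[s2:e2] depict the same action."""
          if e1 - s1 < min_len or e2 - s2 < min_len:
              return np.inf  # reject degenerate sub-sequences
          return dtw_cost(dist[s1:e1, s2:e2])

      # Toy data: two 1-D "pose" streams that share a common sine-shaped action.
      rng = np.random.default_rng(0)
      common = np.sin(np.linspace(0, 3 * np.pi, 40))[:, None]
      seq1 = np.vstack([rng.normal(size=(30, 1)), common, rng.normal(size=(20, 1))])
      seq2 = np.vstack([rng.normal(size=(10, 1)), common, rng.normal(size=(40, 1))])
      dist = np.linalg.norm(seq1[:, None, :] - seq2[None, :, :], axis=2)

      # Stand-in for PSO: evaluate random candidate boundaries and keep the best;
      # PSO explores the same search space far more efficiently.
      best_cost, best_bounds = np.inf, None
      for _ in range(300):
          s1, s2 = rng.integers(0, len(seq1) - 5), rng.integers(0, len(seq2) - 5)
          e1, e2 = rng.integers(s1 + 5, len(seq1) + 1), rng.integers(s2 + 5, len(seq2) + 1)
          cost = objective(s1, e1, s2, e2, dist)
          if cost < best_cost:
              best_cost, best_bounds = cost, (s1, e1, s2, e2)
      print(best_cost, best_bounds)  # the lowest-cost boundaries found so far
      ```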

      Contributors

      Konstantinos Papoutsakis, Costas Panagiotakis, Antonis A. Argyros.
      This work has been supported by the EU projects ACANTO and Co4Robots.

      Relevant publications

      K. Papoutsakis, C. Panagiotakis and A.A. Argyros, "Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos", In IEEE Computer Vision and Pattern Recognition (CVPR 2017), IEEE, Honolulu, Hawaii, USA, July 2017

      Sample results




      The electronic versions of the above publications can be downloaded from the publications page.

    • Online segmentation and classification of actions

      We provide a discriminative framework for online simultaneous segmentation and classification of visual actions, which deals effectively with unknown sequences that may interrupt the known sequential patterns.

      Brief Description

      In this work, we provide a discriminative framework for online simultaneous segmentation and classification of visual actions, which deals effectively with unknown sequences that may interrupt the known sequential patterns. To this end we employ Hough transform to vote in a 3D space for the begin point, the end point and the label of the segmented part of the input stream. An SVM is used to model each class and to suggest putative labeled segments on the timeline. To identify the most plausible segments among the putative ones we apply a dynamic programming algorithm, which maximises an objective function for label assignment in linear time. The performance of our method is evaluated on synthetic as well as on real data (Weizmann and Berkeley multimodal human action database). The proposed approach is of comparable accuracy to the state of the art for online stream segmentation and classification and performs considerably better in the presence of previously unseen actions.
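
      The dynamic-programming step can be illustrated with a small sketch (a simplification under stated assumptions, not the published implementation): given putative labeled segments with SVM scores, it selects a non-overlapping subset of maximum total score; all names and scores below are hypothetical.

      ```python
      # Max-score selection of non-overlapping putative segments (weighted interval scheduling).
      from bisect import bisect_right

      def select_segments(segments):
          """segments: (start, end, label, score) tuples; returns a max-score non-overlapping subset."""
          segs = sorted(segments, key=lambda s: s[1])   # sort by end time
          ends = [s[1] for s in segs]
          best = [0.0] * (len(segs) + 1)                # best[i]: optimum over segs[:i]
          prev = [0] * len(segs)                        # last prefix compatible with segment i
          for i, (start, end, label, score) in enumerate(segs):
              prev[i] = bisect_right(ends, start, 0, i)
              best[i + 1] = max(best[i], best[prev[i]] + score)
          chosen, i = [], len(segs)
          while i > 0:                                  # backtrack through the DP table
              start, end, label, score = segs[i - 1]
              if best[prev[i - 1]] + score >= best[i - 1]:
                  chosen.append(segs[i - 1])
                  i = prev[i - 1]
              else:
                  i -= 1
          return list(reversed(chosen)), best[-1]

      # Hypothetical putative segments (start frame, end frame, label, SVM score).
      putative = [(0, 10, "wave", 2.1), (8, 20, "point", 1.4), (12, 25, "stop", 3.0), (26, 40, "wave", 1.9)]
      print(select_segments(putative))  # keeps the three compatible segments, total score 7.0
      ```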

      The outline of the method


      Contributors

      Dimitrios Kosmopoulos, Konstantinos Papoutsakis, Antonis A. Argyros.
      This work has been supported by the EU projects Hobbit and ACANTO.

      Relevant publications

      D. Kosmopoulos, K. Papoutsakis and A.A. Argyros, "A framework for online segmentation and classification
      of modeled actions performed in the context of unmodeled ones", IEEE Transactions on Circuits
      and Systems for Video Technology (TCSVT), IEEE, July 2016. BibTex

      D. Kosmopoulos, K. Papoutsakis and A.A. Argyros, "Online segmentation and classification of modeled
      actions performed in the context of unmodeled ones", In British Machine Vision Conference
      (BMVC 2014), BMVA, Nottingham, UK, September 2014. PDF    BibTex


      Sample results




      The electronic versions of the above publications can be downloaded from the publications page.

    • Gesture recognition for the perceptual support of assistive robots

      We propose a new approach for vision-based gesture recognition to support robust and efficient human robot interaction towards developing socially assistive robots. The considered gestural vocabulary consists of five, user specified hand gestures that convey messages of fundamental importance in the context of human-robot dialogue.

      Brief Description

      We propose a new approach for vision-based gesture recognition to support robust and efficient human robot interaction towards developing socially assistive robots. The considered gestural vocabulary consists of five, user specified hand gestures that convey messages of fundamental importance in the context of human-robot dialogue. Despite their small number, the recognition of these gestures exhibits considerable challenges. Aiming at natural, easy-to-memorize means of interaction, users have identified gestures consisting of both static and dynamic hand configurations that involve different scales of observation (from arms to fingers) and exhibit intrinsic ambiguities. Moreover, the gestures need to be recognized regardless of the multifaceted variability of the human subjects performing them. Recognition needs to be performed online, in continuous video streams containing other irrelevant/unmodeled motions. All the above need to be achieved by analyzing information acquired by a possibly moving RGBD camera, in cluttered environments with considerable light variations. We present a gesture recognition method that addresses the above challenges, as well as promising experimental results obtained from relevant user trials.

      The set of the supported gestures


      Illustration of the supported gestures. The correspondence between gestures and physical actions of hands/arms is as follows: (a) "Yes": A thumb-up hand posture. (b) "No": A sideways waving hand with extended index finger. (c) "Reward": A circular motion of an open palm at a plane parallel to the image plane. (d) "Stop/cancel": A two-handed push-forward gesture. (e) "Help": Two arms in a cross configuration.

      Intermediate results


      Illustration of intermediate results for hand detection. (a) Input RGB frame. (b) Input depth frame It. (c) The binary mask Mt where far-away structures have been suppressed and depth discontinuities of It appear as background pixels. Skeleton points St are shown superimposed (red pixels). (d) A forest of minimum spanning trees is computed based on (c), identifying initial hand hypotheses. Circles represent the palm centers. (e) Checking hypotheses against a simple hand model facilitates the detection of the actual hands, filtering out wrong hypotheses. (f) Another example showing the detection results (wrist, palm, fingers) in a scene with two hands.
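
      The following is a deliberately simplified, hypothetical sketch of the depth preprocessing step only (far-away structures suppressed, remaining connected components kept as initial hand hypotheses); the actual method additionally uses skeleton points, minimum spanning trees and a hand-model check, and all thresholds below are made up.

      ```python
      # Hypothetical depth preprocessing: keep near-camera structures, label blobs as hand hypotheses.
      import numpy as np
      from scipy import ndimage

      def hand_hypotheses(depth, near=0.3, far=1.2, min_area=800):
          """depth: HxW array in meters; returns the binary mask and rough blob centroids."""
          mask = (depth > near) & (depth < far)      # suppress far-away structures
          labels, n = ndimage.label(mask)            # connected components as initial hypotheses
          centroids = []
          for k in range(1, n + 1):
              ys, xs = np.nonzero(labels == k)
              if len(xs) >= min_area:                # discard tiny blobs
                  centroids.append((xs.mean(), ys.mean()))
          return mask, centroids

      # Synthetic example: a flat far background with one near, hand-sized blob.
      depth = np.full((240, 320), 2.5)
      depth[80:160, 100:170] = 0.8
      print(hand_hypotheses(depth)[1])               # one centroid, roughly at (134.5, 119.5)
      ```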

      Contributors

      Damien Michel, Konstantinos Papoutsakis, Antonis A. Argyros.
      This work has been supported by the EU projects Hobbit and ACANTO.

      Relevant publications

      D. Michel, K. Papoutsakis, A.A. Argyros, “Gesture recognition for the perceptual support of assistive robots”,
      International Symposium on Visual Computing (ISVC 2014), Las Vegas, Nevada, USA, Dec. 8-10, 2014. BibTex

      D. Michel, K. Papoutsakis and A.A. Argyros, "Gesture Recognition Apparatuses, Methods and Systems for Human-Machine Interaction",
      United States Patent No 20160078289, Filed: 16 September, 2015, Published: 17 March, 2016. BibTex PDF

      Sample results




      The electronic versions of the above publications can be downloaded from the publications page.

    • Hobbit, a care robot supporting independent living at home

      The Hobbit project combines research from robotics, gerontology, and human-robot interaction to develop a care robot which is capable of fall prevention and detection as well as emergency detection and handling. Moreover, to enable daily interaction with the robot, other functions are added, such as bringing objects, offering reminders, and entertainment. The interaction with the user is based on a multimodal user interface including automatic speech recognition, text-to-speech, gesture recognition, and a graphical touch-based user interface. We performed controlled laboratory user studies with a total of 49 participants (aged 70 plus) in three EU countries (Austria, Greece, and Sweden).

      Brief Description

      In the context of "Hobbit - The Mutual Care Robot", we have researched several approaches towards developing visual competencies for socially assistive robots. We show how we integrated several vision modules using a layered architectural scheme. Our goal is to endow the mobile robot with visual perception capabilities so that it can interact with the users. We present the key modules of independent motion detection, object detection, body localization, person tracking, head pose estimation and action recognition, and we explain how they serve the goal of natural integration of robots in social environments.

      Some of the efficient, real-time vision-based functionalities of Hobbit are listed below:

      • 3D human detection and tracking in a domestic environment.
      • Gesture recognition for a set of 5 predefined gestures for human-robot interaction of elderly users.
      • Vision-based fall detection and recognition of a fallen person on the floor.
      • A robot-based application for physical exercise training of elderly users.

      This work has been supported by the EU project Hobbit, FP7-288146 (STREP).

      Relevant publications

      M. Foukarakis, I. Adami, D. Ioannidi, A. Leonidis, D. Michel, A. Qammaz, K. Papoutsakis, M. Antona and A.A. Argyros, "A Robot-based Application for Physical Exercise Training", In International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE 2016), Scitepress, pp. 45-52, Rome, Italy, April 2016
      BibTex, PDF

      D. Michel, K. Papoutsakis and A.A. Argyros, "Gesture Recognition Apparatuses, Methods and Systems for Human-Machine Interaction", United States Patent No 20160078289, Filed: 16 September, 2015, Published: 17 March, 2016
      BibTex, PDF, DOI

      D. Fischinger, P. Einramhof, K. Papoutsakis, W. Wohlkinger, P. Mayer, P. Panek, S. Hofmann, T. Koertner, A. Weiss, A.A. Argyros and others, "Hobbit, a care robot supporting independent living at home: First prototype and lessons learned", Robotics and Autonomous Systems, Elsevier, vol. 75, no. A, pp. 60-78, January 2016
      BibTex, PDF, DOI

      D. Michel, K.E. Papoutsakis and A.A. Argyros, "Gesture Recognition Supporting the Interaction of Humans with Socially Assistive Robots", In Advances in Visual Computing (ISVC 2014), Springer, pp. 793-804, Las Vegas, Nevada, USA, December 2014.
      BibTex, PDF, DOI

      K. Papoutsakis, P. Padeleris, A. Ntelidakis, S. Stefanou, X. Zabulis, D. Kosmopoulos and A.A. Argyros, "Developing visual competencies for socially assistive robots: the HOBBIT approach", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2013), ACM, pp. 1-7, Rhodes, Greece, May 2013
      BibTex, PDF, DOI

      Sample results






      The electronic versions of the above publications can be downloaded from the publications page.

    • Integrating tracking with fine object segmentation in videos

      We propose a novel method for online, joint object tracking and segmentation in a monocular video captured by a possibly moving camera. Our goal is to integrate tracking and fine segmentation of a single, previously unseen, potentially non-rigid object of unconstrained appearance, given its segmentation in the first frame of an image sequence as the only prior information.

      Brief Description

      To achieve an efficient solution for simultaneous object tracking and fine segmentation, we tightly couple an existing kernel-based object tracking method with Random Walker-based image segmentation. Bayesian inference mediates between tracking and segmentation, enabling effective data fusion of pixel-wise spatial and color visual cues. The fine segmentation of an object at a certain frame provides tracking with reliable initialization for the next frame, closing the loop between the two building blocks of the proposed framework. The effectiveness of the proposed methodology is evaluated experimentally by comparing it to a large collection of state of the art tracking and video-based object segmentation methods on the basis of a data set consisting of several challenging image sequences for which ground truth data is available.
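
      A minimal sketch of the Bayesian fusion idea, under stated assumptions (pre-computed color likelihoods and a Gaussian spatial prior from the tracker): the per-pixel posterior below could then seed a Random Walker segmentation (e.g. skimage.segmentation.random_walker); all names and values are illustrative.

      ```python
      # Per-pixel Bayesian fusion of a color likelihood and a spatial prior (illustrative values).
      import numpy as np

      def object_posterior(color_lik_fg, color_lik_bg, spatial_prior):
          """All inputs are HxW maps; returns P(object | color, position) per pixel."""
          num = color_lik_fg * spatial_prior
          den = num + color_lik_bg * (1.0 - spatial_prior)
          return num / np.clip(den, 1e-8, None)

      # Toy example: a small bright object on a dark background, prior centered on it.
      h, w = 64, 64
      yy, xx = np.mgrid[0:h, 0:w]
      spatial_prior = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 10.0 ** 2))
      inside = (np.hypot(yy - 32, xx - 32) < 8).astype(float)
      color_lik_fg = 0.9 * inside + 0.1 * (1 - inside)   # hypothetical histogram lookups
      color_lik_bg = 0.1 * inside + 0.9 * (1 - inside)
      posterior = object_posterior(color_lik_fg, color_lik_bg, spatial_prior)
      print(posterior[32, 32], posterior[5, 5])          # high inside the object, low far away
      ```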

      The outline of the method


      The outline of the method with sample intermediate results


      Contributors

      Konstantinos Papoutsakis, Antonis A. Argyros.
      This work has been supported by the EU projects GRASP and robohow.cog.

      Relevant publications

      K. Papoutsakis, A.A. Argyros, “Integrating tracking with fine object segmentation”,
      Image and Vision Computing, Volume 31, Issue 10, pp. 771-785, Oct. 2013.
      BibTex, PDF

      K. Papoutsakis, A.A. Argyros, “Object tracking and segmentation in a closed loop”,
      in Proceedings of the International Symposium on Visual Computing, ISVC’2010,
      Las Vegas, USA, Advances in Visual Computing, Lecture Notes in Computer Science.
      BibTex, PDF, DOI

      Sample results




      The electronic versions of the above publications can be downloaded from the publications page.

Publications

    Unsupervised and Explainable Assessment of Video Similarity

    K. Papoutsakis and A.A. Argyros
    Conference Paper: In British Machine Vision Conference (BMVC 2019), BMVA, Cardiff, UK, September 2019

    Abstract

    We propose a novel unsupervised method that assesses the similarity of two videos on the basis of the estimated relatedness of the objects and their behavior, and provides arguments supporting this assessment. A video is represented as a complete undirected action graph that encapsulates information on the types of objects and the way they (inter) act. The similarity of a pair of videos is estimated based on the bipartite Graph Edit Distance (GED) of the corresponding action graphs. As a consequence, on top of estimating a quantitative measure of video similarity, our method establishes spatiotemporal correspondences between objects across videos if these objects are semantically related, if/when they interact similarly, or both. We consider this an important step towards explainable assessment of video and action similarity. The proposed method is evaluated on a publicly available dataset on the tasks of activity classification and ranking and is shown to compare favorably to state of the art supervised learning methods.
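
    A minimal sketch of the graph-matching idea, assuming per-video action graphs are already available (nodes are objects with a category attribute, edges encode interactions); networkx's generic graph edit distance stands in for the paper's bipartite GED approximation, and all graphs below are toy examples.

    ```python
    # Toy action graphs compared with networkx's (generic) graph edit distance.
    import networkx as nx

    def action_graph(objects, interactions):
        """objects: {node_id: category}; interactions: (node_id, node_id) pairs."""
        g = nx.Graph()
        for node_id, category in objects.items():
            g.add_node(node_id, category=category)
        g.add_edges_from(interactions)
        return g

    g1 = action_graph({"o1": "hand", "o2": "cup", "o3": "table"},
                      [("o1", "o2"), ("o2", "o3")])
    g2 = action_graph({"p1": "hand", "p2": "bottle", "p3": "table"},
                      [("p1", "p2"), ("p2", "p3")])

    # Nodes may only be matched when their object categories agree; a smaller
    # distance suggests the two videos involve more closely related objects/behaviors.
    same_category = lambda a, b: a["category"] == b["category"]
    print(nx.graph_edit_distance(g1, g2, node_match=same_category))
    ```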

    A Graph-based Approach for Detecting Common Actions in Motion Capture Data and Videos

    C. Panagiotakis, K. Papoutsakis and A.A. Argyros
    Journal Paper: Pattern Recognition, Elsevier, February 2018

    Abstract

    We present a novel solution to the problem of detecting common actions in time series of motion capture data and videos. Given two action sequences, our method discovers all pairs of similar subsequences, i.e. subsequences that represent the same action. This is achieved in a completely unsupervised manner, i.e., without any prior knowledge of the type of actions, their number and their duration. These common subsequences (commonalities) may be located anywhere in the original sequences, may differ in duration and may be performed under different conditions e.g., by a different actor.

    Evaluating Method Design Options for Action Classification based on Bags of Visual Words

    V. Manousaki, K. Papoutsakis and A.A. Argyros
    Conference Paper: In International Conference on Computer Vision Theory and Applications (VISAPP 2018), Scitepress, January 2018

    Abstract

    The Bags of Visual Words (BoVWs) framework has been applied successfully to several computer vision tasks. In this work we are particularly interested in its application to the problem of action recognition/classification. The key design decisions for a method that follows the BoVWs framework are (a) the visual features to be employed, (b) the size of the codebook to be used for representing a certain action and (c) the classifier applied to the developed representation to solve the classification task. We perform several experiments to investigate a variety of options regarding all the aforementioned design parameters. We also propose a new feature type and we suggest a method that determines automatically the size of the codebook. The experimental results show that our proposals produce results that are competitive to the outcomes of state of the art methods.
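
    A minimal BoVW sketch under stated assumptions (local descriptors already extracted per video), illustrating the three design choices discussed above: the feature type (random stand-in descriptors here), the codebook size, and the classifier; all data and parameters are illustrative.

    ```python
    # Bag-of-Visual-Words pipeline on synthetic per-video descriptors (all values illustrative).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    # 20 "videos", each a variable-length set of 64-D local descriptors around one of two means.
    means = rng.integers(0, 2, 20)
    videos = [rng.normal(loc=float(m), size=(int(rng.integers(50, 120)), 64)) for m in means]
    labels = means.tolist()

    codebook_size = 32                                 # design choice (c): codebook size
    kmeans = KMeans(n_clusters=codebook_size, n_init=10, random_state=0)
    kmeans.fit(np.vstack(videos))                      # learn the visual words on all descriptors

    def bovw_histogram(descriptors):
        words = kmeans.predict(descriptors)            # quantize descriptors to visual words
        hist = np.bincount(words, minlength=codebook_size).astype(float)
        return hist / hist.sum()                       # normalized word histogram per video

    X = np.array([bovw_histogram(v) for v in videos])
    clf = LinearSVC().fit(X, labels)                   # design choice: a linear SVM classifier
    print(clf.score(X, labels))
    ```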

    Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos

    K. Papoutsakis, C. Panagiotakis and A.A. Argyros
    Conference Paper: In IEEE Computer Vision and Pattern Recognition (CVPR), July 2017

    Abstract

    Given two action sequences, we are interested in spotting/co-segmenting all pairs of sub-sequences that represent the same action. We propose a totally unsupervised solution to this problem. No a-priori model of the actions is assumed to be available. The number of common subsequences may be unknown. The sub-sequences can be located anywhere in the original sequences, may differ in duration and the corresponding actions may be performed by a different person, in a different style. We treat this type of temporal action co-segmentation as a stochastic optimization problem that is solved by employing Particle Swarm Optimization (PSO). The objective function that is minimized by PSO capitalizes on Dynamic Time Warping (DTW) to compare two action sequences. Due to the generic problem formulation and solution, the proposed method can be applied to motion capture (i.e., 3D skeletal) data or to conventional RGB video data acquired in the wild. We present extensive quantitative experiments on several standard, ground truthed datasets. The obtained results demonstrate that the proposed method achieves a remarkable increase in co-segmentation quality compared to all tested existing state of the art methods.

    A framework for online segmentation and classification of modeled actions performed in the context of unmodeled ones

    D. Kosmopoulos, K. Papoutsakis and A.A. Argyros
    Journal Paper: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), July 2016

    Abstract

    In this work, we propose a discriminative framework for online simultaneous segmentation and classification of modeled visual actions that can be performed in the context of other, unknown actions. To this end, we employ Hough transform to vote in a 3D space for the begin point, the end point and the label of the segmented part of the input stream. An SVM is used to model each class and to suggest putative labeled segments on the timeline. To identify the most plausible segments among the putative ones we apply a dynamic programming algorithm, which maximises the likelihood for label assignment in linear time. The performance of our method is evaluated on synthetic, as well as on real data (Weizmann, TUM Kitchen, UTKAD and Berkeley multimodal human action databases). Extensive quantitative results obtained on a number of standard datasets demonstrate that the proposed approach is of comparable accuracy to the state of the art for online stream segmentation and classification when all performed actions are known and performs considerably better in the presence of unmodeled actions.

    A Robot-based Application for Physical Exercise Training

    M. Foukarakis, I. Adami, D. Ioannidi, A. Leonidis, D. Michel, A. Qammaz, K. Papoutsakis, M. Antona and A.A. Argyros
    Conference Paper: In International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE 2016), Scitepress, pp. 45-52, Rome, Italy, April 2016

    Abstract

    According to studies, performing physical exercise is beneficial for reducing the risk of falling in the elderly and prolonging their stay at home. In addition, regular exercising helps cognitive function and increases positive behaviour for seniors with cognitive impairment and dementia. In this paper, a fitness application integrated into a service robot is presented. Its aim is to motivate the users to perform physical training by providing relevant exercises and useful feedback on their progress. The application utilizes the robot vision system to track and recognize user movements and activities and supports multimodal interaction with the user. The paper describes the design challenges, the system architecture, the user interface and the human motion capturing module. Additionally, it discusses some results from user testing in laboratory and home-based trials.

    Gesture Recognition Apparatuses, Methods and Systems for Human-Machine Interaction

    D. Michel, K. Papoutsakis and A.A. Argyros
    Patent: United States Patent No 20160078289, Filed: 16 September, 2015, Published: 17 March, 2016

    Abstract

    The GESTURE RECOGNITION APPARATUSES, METHODS AND SYSTEMS FOR HUMAN-MACHINE INTERACTION (GRA) discloses vision-based gesture recognition. GRA can be implemented in any application involving tracking, detection and/or recognition of gestures or motion in general. Disclosed methods and systems consider a gestural vocabulary of a predefined number of user specified static and/or dynamic hand gestures that are mapped with a database to convey messages. In one implementation, the disclosed systems and methods support gesture recognition by detecting and tracking body parts, such as arms, hands and fingers, and by performing spatio-temporal segmentation and recognition of the set of predefined gestures, based on data acquired by an RGBD sensor. In one implementation, a model of the hand is employed to detect hand and finger candidates. At a higher level, hand posture models are defined and serve as building blocks to recognize gestures.

    Hobbit, a care robot supporting independent living at home: First prototype and lessons learned

    D. Fischinger, P. Einramhof, K. Papoutsakis, W. Wohlkinger, P. Mayer, P. Panek, S. Hofmann, T. Koertner, A. Weiss, A.A. Argyros
    Journal Paper: Robotics and Autonomous Systems, Elsevier, vol. 75, no. A, pp. 60-78, January 2016.

    Abstract

    One option to address the challenge of demographic transition is to build robots that enable aging in place. Falling has been identified as the most relevant factor to cause a move to a care facility. The Hobbit project combines research from robotics, gerontology, and human–robot interaction to develop a care robot which is capable of fall prevention and detection as well as emergency detection and handling. Moreover, to enable daily interaction with the robot, other functions are added, such as bringing objects, offering reminders, and entertainment. The interaction with the user is based on a multimodal user interface including automatic speech recognition, text-to-speech, gesture recognition, and a graphical touch-based user interface. We performed controlled laboratory user studies with a total of 49 participants (aged 70 plus) in three EU countries (Austria, Greece, and Sweden). The collected user responses on perceived usability, acceptance, and affordability of the robot demonstrate a positive reception of the robot from its target user group. This article describes the principles and system components for navigation and manipulation in domestic environments, the interaction paradigm and its implementation in a multimodal user interface, the core robot tasks, as well as the results from the user studies, which are also reflected in terms of lessons we learned and we believe are useful to fellow researchers.

    Gesture Recognition Supporting the Interaction of Humans with Socially Assistive Robots

    D. Michel, K.E. Papoutsakis and A.A. Argyros
    Conference Paper: In Advances in Visual Computing (ISVC 2014), Springer, pp. 793-804, Las Vegas, Nevada, USA, December 2014.

    Abstract

    We propose a new approach for vision-based gesture recognition to support robust and efficient human robot interaction towards developing socially assistive robots. The considered gestural vocabulary consists of five, user specified hand gestures that convey messages of fundamental importance in the context of human-robot dialogue. Despite their small number, the recognition of these gestures exhibits considerable challenges. Aiming at natural, easy-to-memorize means of interaction, users have identified gestures consisting of both static and dynamic hand configurations that involve different scales of observation (from arms to fingers) and exhibit intrinsic ambiguities. Moreover, the gestures need to be recognized regardless of the multifaceted variability of the human subjects performing them. Recognition needs to be performed online, in continuous video streams containing other irrelevant/unmodeled motions. All the above need to be achieved by analyzing information acquired by a possibly moving RGBD camera, in cluttered environments with considerable light variations. We present a gesture recognition method that addresses the above challenges, as well as promising experimental results obtained from relevant user trials.

    Online segmentation and classification of modeled actions performed in the context of unmodeled ones

    D.I. Kosmopoulos, K. Papoutsakis and A.A. Argyros
    Conference Paper: In British Machine Vision Conference (BMVC 2014), BMVA, Nottingham, UK, September 2014.

    Abstract

    In this work, we provide a discriminative framework for online simultaneous segmentation and classification of visual actions, which deals effectively with unknown sequences that may interrupt the known sequential patterns. To this end we employ Hough transform to vote in a 3D space for the begin point, the end point and the label of the segmented part of the input stream. An SVM is used to model each class and to suggest putative labeled segments on the timeline. To identify the most plausible segments among the putative ones we apply a dynamic programming algorithm, which maximises an objective function for label assignment in linear time. The performance of our method is evaluated on synthetic as well as on real data (Weizmann and Berkeley multimodal human action database). The proposed approach is of comparable accuracy to the state of the art for online stream segmentation and classification and performs considerably better in the presence of previously unseen actions.

    Integrating tracking with fine object segmentation

    K.E. Papoutsakis and A.A. Argyros
    Journal Paper: Image and Vision Computing, Elsevier, vol. 31, no. 10, pp. 771-785, 2013

    Abstract

    We present a novel method for on-line, joint object tracking and segmentation in a monocular video captured by a possibly moving camera. Our goal is to integrate tracking and fine segmentation of a single, previously unseen, potentially non-rigid object of unconstrained appearance, given its segmentation in the first frame of an image sequence as the only prior information. To this end, we tightly couple an existing kernel-based object tracking method with Random Walker-based image segmentation. Bayesian inference mediates between tracking and segmentation, enabling effective data fusion of pixel-wise spatial and color visual cues. The fine segmentation of an object at a certain frame provides tracking with reliable initialization for the next frame, closing the loop between the two building blocks of the proposed framework. The effectiveness of the proposed methodology is evaluated experimentally by comparing it to a large collection of state of the art tracking and video-based object segmentation methods on the basis of a data set consisting of several challenging image sequences for which ground truth data is available.

    Hobbit-The Mutual Care Robot

    D. Fischinger, P. Einramhof, W. Wohlkinger, K. Papoutsakis, P. Mayer, P. Panek, T. Koertner, S. Hofmann, A.A. Argyros and M. Vincze
    Conference Paper: In IEEE/RSJ International Conference on Intelligent Robots and Systems Workshops (ASROB 2013 - IROSW 2013), Tokyo, Japan, 2013

    Abstract

    One option to face the aging society is to build robots that help old persons to stay longer at home. We present Hobbit, a robot that attempts to let users feel safe at home by preventing and detecting falls. Falling has been identified as the highest risk for older adults of getting injured so badly that they can no longer live independently at home and have to move to a care facility. Hobbit is intended to provide high usability and acceptability for the target user group while, at the same time, needs to be affordable for private customers. The development process so far (1.5 years) included a thorough user requirement analysis, conceptual interaction design, prototyping and implementation of key behaviors, as well as extensive empirical testing with target users in the laboratory. We shortly describe the overall interdisciplinary decision-making and conceptualization of the robot and will then focus on the system itself describing the hardware, basic components, and the robot tasks. Finally, we will summarize the findings of the first empirical test with 49 users in three countries and give an outlook of how the platform will be extended in future.

    Developing visual competencies for socially assistive robots: the HOBBIT approach

    K. Papoutsakis, P. Padeleris, A. Ntelidakis, S. Stefanou, X. Zabulis, D. Kosmopoulos and A.A. Argyros
    Conference Paper: In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2013), ACM, pp. 1-7, Rhodes, Greece, May 2013

    Abstract

    In this paper, we present our approach towards developing visual competencies for socially assistive robots within the framework of the HOBBIT project. We show how we integrated several vision modules using a layered architectural scheme. Our goal is to endow the mobile robot with visual perception capabilities so that it can interact with the users. We present the key modules of independent motion detection, object detection, body localization, person tracking, head pose estimation and action recognition and we explain how they serve the goal of natural integration of robots in social environments.

    Object Tracking and Segmentation in a Closed Loop

    K. Papoutsakis and A.A. Argyros
    Conference Paper: In Advances in Visual Computing (ISVC 2010), Springer, pp. 405-416, Las Vegas, Nevada, USA, November 2010.

    Abstract

    We introduce a new method for integrated tracking and segmentation of a single non-rigid object in a monocular video, captured by a possibly moving camera. A closed-loop interaction between EM-like color-histogram-based tracking and Random Walker-based image segmentation is proposed, which results in reduced tracking drifts and in fine object segmentation. More specifically, pixel-wise spatial and color image cues are fused using Bayesian inference to guide object segmentation. The spatial properties and the appearance of the segmented objects are exploited to initialize the tracking algorithm in the next step, closing the loop between tracking and segmentation. As confirmed by experimental results on a variety of image sequences, the proposed approach efficiently tracks and segments previously unseen objects of varying appearance and shape, under challenging environmental conditions.

Teaching Assistantship



    I have been serving as a teaching assistant during my post-graduate studies in the Computer Science Department, University of Crete, for the following courses:

    • CS-587: Neural Networks and Learning of Hierarchical Representation
    • CS-573: Discrete Optimization Methods
    • CS-472: Computational Vision
    • CS-280: Theory of Computation
    • CS-240: Data Structures
    • CS-118: Discrete Mathematics
    • CS-100: Introduction to Computer Science