Journal Articles
M. Sigalas, M. Pateraki and P. Trahanias
"Full-body pose tracking - the Top View Reprojection approach". IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. PP, no. 99, 2015.
Abstract -- The recent introduction of low-cost depth cameras triggered a number of interesting works, pushing forward the state of the art in human body pose extraction and tracking. However, despite the remarkable progress, many contemporary methods cope inadequately with complex scenarios involving multiple interacting users in the presence of severe inter- and intra-occlusions. In this work, we present a model-based approach for markerless articulated full-body pose extraction and tracking in RGB-D sequences. A cylinder-based model is employed to represent the human body. For each body part a set of hypotheses is generated and tracked over time by a Particle Filter. To evaluate each hypothesis, we employ a novel metric that considers the reprojected Top View of the corresponding body part. The latter, in conjunction with depth information, effectively copes with difficult and ambiguous cases, such as severe occlusions. For evaluation purposes, we conducted several series of experiments using data from a public human action database, as well as our own recordings involving a varying number of interacting users. The performance of the proposed method has been further compared against that of Microsoft's Kinect SDK and NiTE using ground truth information. The results obtained attest to the effectiveness of our approach.
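To make the hypothesis-evaluation step concrete, the following is a minimal Python sketch (not the authors' code) of a particle filter that scores cylinder hypotheses with a toy stand-in for the Top View Reprojection metric; the cylinder parameterization, the scoring function and all names are illustrative assumptions.

import numpy as np

def top_view_score(hypothesis, depth_points):
    # Toy stand-in for the TVR metric: measure how tightly the observed 3D
    # points hug the hypothesised cylinder wall when viewed along its axis.
    center, axis, radius = hypothesis
    axis = axis / np.linalg.norm(axis)
    rel = depth_points - center
    radial = rel - np.outer(rel @ axis, axis)   # components orthogonal to the axis
    dist = np.abs(np.linalg.norm(radial, axis=1) - radius)
    return np.exp(-dist.mean())                 # higher score = better fit

def particle_filter_step(particles, depth_points, motion_noise=0.01):
    # 1. Propagate every hypothesis with simple random-walk dynamics.
    particles = [(c + np.random.randn(3) * motion_noise, a, r)
                 for (c, a, r) in particles]
    # 2. Weight each hypothesis by the (toy) top-view score.
    weights = np.array([top_view_score(p, depth_points) for p in particles])
    weights /= weights.sum()
    # 3. Resample proportionally to the weights.
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return [particles[i] for i in idx]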
@Article{sigalas2015_TVR,
Title = {Full-body Pose Tracking - the Top View Reprojection Approach},
Author = {Sigalas, Markos and Pateraki, Maria and Trahanias, Panos},
Journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
Year = {2015},
Number = {99},
Volume = {PP},
Publisher = {IEEE},
Url = {"http://dx.doi.org/10.1109/TPAMI.2015.2502582"}
}
In Conference Proceedings
M. Sigalas, M. Pateraki and P. Trahanias
"Visual Estimation of Attentive Cues in HRI: The Case of Torso and Head Pose". In Proc. of Computer Vision Systems. Springer International Publishing, 2015. 375-388.
Abstract -- Capturing visual human-centered information is a fundamental input source for effective and successful human-robot interaction (HRI) in dynamic multi-party social settings. Torso and head pose, as forms of nonverbal communication, support the derivation of people's focus of attention, a key variable in the analysis of human behaviour in HRI paradigms encompassing social aspects. Towards this goal, we have developed a model-based approach for torso and head pose estimation to overcome key limitations in free-form interaction scenarios and issues of partial intra- and inter-person occlusions. The proposed approach builds on the concept of Top View Re-projection (TVR) to uniformly treat the respective body parts, modelled as cylinders. For each body part a number of pose hypotheses is sampled from its configuration space. Each pose hypothesis is evaluated against a scoring function, and the hypothesis with the best score yields the assumed pose and the locations of the joints. A refinement step on head pose is applied, based on tracking facial patch deformations, to compute the horizontal off-plane rotation. The overall approach forms one of the core components of a vision system integrated in a robotic platform that supports socially appropriate, multi-party, multimodal interaction in a bartending scenario. Results in the robot's environment during real HRI experiments with a varying number of users attest to the effectiveness of our approach.
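As a rough illustration of the "sample the configuration space, score each hypothesis, keep the winner" scheme described above, the following fragment assumes a hypothetical scoring callable and an invented cylinder parameterization; it is a sketch, not the paper's implementation.

import numpy as np

def sample_pose(rng):
    # One cylinder pose hypothesis: 3D position plus two rotation angles.
    return {"pos": rng.uniform(-1.0, 1.0, size=3),
            "pitch": rng.uniform(-np.pi / 4, np.pi / 4),
            "yaw": rng.uniform(-np.pi, np.pi)}

def best_pose(score, n_hypotheses=500, seed=0):
    # Evaluate n_hypotheses random samples with the supplied scoring
    # function and return the best-scoring one.
    rng = np.random.default_rng(seed)
    return max((sample_pose(rng) for _ in range(n_hypotheses)), key=score)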
@incollection{sigalas2015visual,
title={Visual Estimation of Attentive Cues in HRI: The Case of Torso and Head Pose},
author={Sigalas, Markos and Pateraki, Maria and Trahanias, Panos},
booktitle={Computer Vision Systems},
pages={375--388},
year={2015},
publisher={Springer}
}
M. Sigalas, M. Pateraki and P. Trahanias
"Robust articulated upper body pose tracking under severe occlusions". In Proc. of Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on. IEEE, September 2014.
Abstract -- Articulated human body tracking is one of the most thoroughly examined, yet still challenging, tasks in Human Robot Interaction. The emergence of low-cost real-time depth cameras has greatly pushed forward the state of the art in the field. Nevertheless, the overall performance in complex, real life scenarios remains an open-ended problem, mainly due to the high dimensionality of the problem, the common presence of severe occlusions in the observed scene data, and errors in the segmentation and pose initialization processes. In this paper we propose a novel model-based approach for markerless pose detection and tracking of the articulated upper body of multiple users in RGB-D sequences. The main contribution of our work lies in the introduction and further development of a virtual User Top View, a hypothesized view aligned to the main torso axis of each user, which robustly estimates the 3D torso pose even under severe intra- and inter-personal occlusions, while at the same time removing the requirement of an arbitrary initialization. The extracted 3D torso pose, along with a human arm kinematic model, gives rise to the generation of arm hypotheses, which are tracked via Particle Filters and for which ordered rendering is used to detect possible occlusions and collisions. Experimental results in realistic scenarios, as well as comparative tests against the NiTE user generator middleware using ground truth data, validate the effectiveness of the proposed method.
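The "ordered rendering" idea can be pictured with the following simplified sketch: primitives are rasterised nearest-first into a z-buffer, and pixels already claimed by a closer part count as occlusions of any farther part. The disc-shaped footprints and all parameters are assumptions made for brevity.

import numpy as np

def render_ordered(parts, shape=(120, 160)):
    # parts: list of (part_id, depth, (cy, cx) pixel centre, pixel radius).
    zbuf = np.full(shape, np.inf)
    owner = np.full(shape, -1)
    occluded = {pid: 0 for pid, *_ in parts}
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    for pid, depth, (cy, cx), r in sorted(parts, key=lambda p: p[1]):
        mask = (ys - cy) ** 2 + (xs - cx) ** 2 <= r ** 2
        hidden = mask & (zbuf < depth)       # pixels a nearer part already owns
        occluded[pid] += int(hidden.sum())   # occlusion evidence for this part
        visible = mask & ~hidden
        zbuf[visible] = depth
        owner[visible] = pid
    return owner, occluded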
@inproceedings{sigalas2014robust,
title={Robust articulated upper body pose tracking under severe occlusions},
author={Sigalas, Markos and Pateraki, Maria and Trahanias, Panos},
booktitle={2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014)},
pages={4104--4111},
year={2014},
organization={IEEE}
}
M. Sigalas, M. Pateraki, I. Oikonomidis and P. Trahanias
"Robust Model-Based 3D Torso Pose Estimation in RGB-D Sequences". In Proc. of the 2nd IEEE Workshop on Dynamic Shape Capture and Analysis, 14th International Conference on Computer Vision, Sydney, Australia, December 2013.
Abstract -- Free-form Human Robot Interaction (HRI) in naturalistic environments remains a challenging computer vision task. In this context, the extraction of human-body pose information is of utmost importance. Although the emergence of real-time depth cameras greatly facilitated this task, issues that limit the performance of existing methods in relevant HRI applications still exist. The applicability of current state-of-the-art approaches is constrained by their inherent requirement of an initialization phase prior to deriving body pose information, which in complex, realistic scenarios is often hard, if not impossible. In this work we present a data-driven model-based method for 3D torso pose estimation from RGB-D image sequences, eliminating the requirement of an initialization phase. The detected face of the user steers the initiation of shoulder area hypotheses, based on illumination, scale and pose invariant features on the RGB silhouette. Depth point cloud information is subsequently utilized to approximate the shoulder joints and model the human torso with a set of 3D geometric primitives, and the 3D torso pose is derived via a global optimization scheme. Experimental results in various environments, as well as comparisons against the OpenNI user generator middleware using ground truth data, validate the effectiveness of the proposed method.
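To illustrate the kind of global optimization referred to above, here is a compact sketch that fits a single geometric primitive to a depth point cloud with SciPy's differential evolution. The ellipsoid model, cost function and bounds are assumptions chosen for simplicity, not the paper's exact formulation.

import numpy as np
from scipy.optimize import differential_evolution

def fit_torso(points):
    # points: (N, 3) depth point cloud of the torso region.
    def cost(params):
        cx, cy, cz, rx, ry, rz = params
        # Mean squared deviation of the points from the ellipsoid surface.
        q = ((points[:, 0] - cx) / rx) ** 2 + \
            ((points[:, 1] - cy) / ry) ** 2 + \
            ((points[:, 2] - cz) / rz) ** 2
        return np.mean((q - 1.0) ** 2)
    bounds = [(-1.0, 1.0)] * 3 + [(0.05, 0.6)] * 3   # centre and radii (metres)
    return differential_evolution(cost, bounds, seed=0).x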
@inproceedings{sigalas2013robust,
title={Robust Model-Based {3D} Torso Pose Estimation in {RGB-D} Sequences},
author={Sigalas, Markos and Pateraki, Maria and Oikonomidis, Iason and Trahanias, Panos},
booktitle={Computer Vision Workshops (ICCVW), 2013 IEEE International Conference on},
pages={315--322},
year={2013},
organization={IEEE}
}
M. Giuliani, R. Petrick, M.E. Foster, A. Gaschler, A. Isard, M. Pateraki and M. Sigalas
"Comparing task-based and socially intelligent behaviour in a robot bartender". In Proc. of the 15th ACM on International conference on multimodal interaction. ACM, 2013.
Abstract -- We address the question of whether service robots that interact with humans in public spaces must express socially appropriate behaviour. To do so, we implemented a robot bartender which is able to take drink orders from humans and serve drinks to them. By using a high-level automated planner, we explore two different robot interaction styles: in the task only setting, the robot simply fulfils its goal of asking customers for drink orders and serving them drinks; in the socially intelligent setting, the robot additionally acts in a manner socially appropriate to the bartender scenario, based on the behaviour of humans observed in natural bar interactions. The results of a user study show that the interactions with the socially intelligent robot were somewhat more efficient, but the two implemented behaviour settings had only a small influence on the subjective ratings. However, there were objective factors that influenced participant ratings: the overall duration of the interaction had a positive influence on the ratings, while the number of system order requests had a negative influence. We also found a cultural difference: German participants gave the system higher pre-test ratings than participants who interacted in English, although the post-test scores were similar.
@inproceedings{giuliani2013comparing,
title={Comparing task-based and socially intelligent behaviour in a robot bartender},
author={Giuliani, Manuel and Petrick, Ronald and Foster, Mary Ellen and Gaschler, Andre and Isard, Amy and Pateraki, Maria and Sigalas, Markos},
booktitle={Proceedings of the 15th ACM International Conference on Multimodal Interaction},
pages={263--270},
year={2013},
organization={ACM}
}
M. Pateraki, M. Sigalas, G. Chliveros and P. Trahanias
"Visual human-robot communication in social settings". In Proc. of Workshop on Semantics, Identification and Control of Robot-Human_Environment Interaction, IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, May 2013.
Abstract -- Supporting human-robot interaction (HRI) in dynamic, multi-party social settings relies on a number of input and output modalities for visual human tracking, language processing, high-level reasoning, robot control, etc. Capturing visual human-centered information is a fundamental input source in HRI for effective and successful interaction. The current paper deals with visual processing in dynamic scenes and presents an integrated vision system that combines a number of different cues (such as color, depth and motion) to track and recognize human actions in challenging environments. The overall system comprises a number of vision modules for human identification and tracking, extraction of pose-related information from body and face, identification of a specific set of communicative gestures (e.g. "waving", "pointing") as well as tracking of objects towards identification of manipulative gestures that act on objects in the environment (e.g. "grab glass", "raise bottle"). Experimental results from a bartending scenario, as well as a comparative assessment of a subset of modules, validate the effectiveness of the proposed system.
@inproceedings{pateraki2013visual,
title={Visual human-robot communication in social settings},
author={Pateraki, Maria and Sigalas, Markos and Chliveros, Georgios and Trahanias, Panos},
booktitle={Proceedings of ICRA Workshop on Semantics, Identification and Control of Robot-Human-Environment Interaction},
year={2013}
}
M. Sigalas, H. Baltzakis, and P. Trahanias
"Gesture recognition based on arm tracking for human-robot interaction". In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, October 2010.
Abstract -- In this paper we present a novel approach for hand gesture recognition. The proposed system utilizes upper body part tracking in a 9-dimensional configuration space and two Multi-Layer Perceptron/Radial Basis Function (MLP/RBF) neural network classifiers, one for each arm. Classification is achieved by buffering the trajectory of each arm and feeding it to the MLP neural network, which is trained to distinguish between five gesturing states. The RBF neural network is trained as a predictor for the future gesturing state of the system. By feeding the output of the RBF back to the MLP classifier, we achieve temporal consistency and robustness in the classification results. The proposed approach has been assessed using several video sequences and the results obtained are presented in this paper.
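A minimal sketch of the MLP/RBF feedback loop described above, with both networks abstracted as callables; the actual architectures, window length and feature layout are not given in the abstract and are assumed here.

import numpy as np

N_STATES = 5      # the five gesturing states mentioned in the abstract
WINDOW = 10       # assumed trajectory-buffer length

def classify_sequence(trajectory, mlp_predict, rbf_predict):
    # trajectory: (T, D) arm trajectory in the configuration space.
    # mlp_predict(features) -> probabilities over the N_STATES.
    # rbf_predict(features) -> predicted next-state probabilities.
    predicted_next = np.full(N_STATES, 1.0 / N_STATES)   # uniform prior
    states = []
    for t in range(len(trajectory)):
        window = trajectory[max(0, t - WINDOW + 1):t + 1]
        if len(window) < WINDOW:                         # zero-pad early frames
            pad = np.zeros((WINDOW - len(window), trajectory.shape[1]))
            window = np.vstack([pad, window])
        features = np.concatenate([window.ravel(), predicted_next])
        states.append(int(np.argmax(mlp_predict(features))))
        predicted_next = rbf_predict(features)           # fed back next frame
    return states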
@INPROCEEDINGS{msigalas_iros10,
AUTHOR = {Sigalas, M. and Baltzakis, H. and Trahanias, P.},
MONTH = OCT,
YEAR = 2010,
TITLE = {Gesture recognition based on arm tracking for human-robot interaction},
BOOKTITLE = {Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
pages={5424--5429},
ADDRESS = {Taipei, Taiwan},
HTYPE = {conference}
}
M. Sigalas, H. Baltzakis, and P. Trahanias
"Temporal gesture recognition for human-robot interaction". In Proc. of Mutimodal Human-Robot Interfaces Workshop held within IEEE International Conference on Robotics and Automation (ICRA), Anchorage, Alaska, USA, May 2010.
Abstract -- This paper describes a novel hand gesture recognition system intended to support natural interaction with autonomously navigating robots that guide visitors in museums and exhibition centers. The proposed system utilizes upper body part tracking and two neural network-based classifiers, one for each arm. Tracking is performed in a 9-DoF configuration space and it is facilitated by means of a probabilistic approach which combines particle filters with hidden Markov models in order to enable the simultaneous tracking of several hypotheses for the body orientation and the configuration of each of the two arms.
Given the arm trajectories in the configuration space, classification is performed separately for each arm by means of a combined MLP/RBF neural network structure. The MLP is trained as a standard classifier, while the RBF neural network is trained as a predictor for the future state of the system. By feeding the output of the RBF back to the MLP classifier, we achieve temporal consistency and robustness in the classification results.
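One way to picture the particle filter/hidden Markov model combination mentioned above is to let every particle carry a continuous arm configuration together with a discrete HMM state, as in this hedged sketch; the two-state transition matrix, noise level and likelihood callable are invented for illustration.

import numpy as np

TRANSITIONS = np.array([[0.9, 0.1],     # toy 2-state transition matrix;
                        [0.2, 0.8]])    # assumed values, not the paper's

def pf_hmm_step(particles, likelihood, rng):
    # particles: list of (config ndarray, discrete_state int) pairs.
    moved = []
    for config, state in particles:
        state = rng.choice(len(TRANSITIONS), p=TRANSITIONS[state])  # HMM jump
        config = config + rng.normal(0.0, 0.05, size=config.shape)  # diffusion
        moved.append((config, int(state)))
    weights = np.array([likelihood(c, s) for c, s in moved])
    weights /= weights.sum()
    idx = rng.choice(len(moved), size=len(moved), p=weights)        # resample
    return [moved[i] for i in idx]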
@INPROCEEDINGS{msigalas_icra10,
AUTHOR = {Sigalas, M. and Baltzakis, H. and Trahanias, P.},
MONTH = MAY,
YEAR = 2010,
TITLE = {Temporal gesture recognition for human-robot interaction},
BOOKTITLE = {Proc. Multimodal Human-Robot Interfaces Workshop held within IEEE International Conference on Robotics and Automation (ICRA)},
ADDRESS = {Anchorage, Alaska, USA},
URL = {papers/conferences/msigalas_icra10.pdf},
HTYPE = {conference}
}
M. Sigalas, H. Baltzakis, and P. Trahanias
"Visual tracking of independently moving body and arms". In Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), St. Louis, MO, USA, October 2009.
Abstract -- Tracking of the upper human body is one of the most interesting and challenging research fields in computer vision and comprises an important component used in gesture recognition applications. In this paper a probabilistic approach towards arm and hand tracking is presented. We propose the use of a kinematics model together with a segmentation of the parameter space to cope with the space dimensionality problem. Moreover, the combination of particle filters with hidden Markov models enables the simultaneous tracking of several hypotheses for the body orientation and the configuration of each of the arms.
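The parameter-space segmentation mentioned above can be sketched as follows: instead of searching the full 9-D space jointly, the body orientation is estimated first and each 4-DoF arm is then searched conditioned on it. The scoring callables and the plain random search are stand-ins for the paper's probabilistic machinery.

import numpy as np

def track_decomposed(score_body, score_arm, n=200, seed=0):
    # score_body(theta) and score_arm(side, theta, angles) are assumed
    # user-supplied fitness functions (e.g. image likelihoods).
    rng = np.random.default_rng(seed)
    body = max(rng.uniform(-np.pi, np.pi, size=n), key=score_body)
    arms = {}
    for side in ("left", "right"):
        samples = rng.uniform(-np.pi, np.pi, size=(n, 4))   # 4 joint angles
        arms[side] = max(samples, key=lambda a: score_arm(side, body, a))
    return body, arms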
@INPROCEEDINGS{msigalas_iros09,
AUTHOR = {Sigalas, M. and Baltzakis, H. and Trahanias, P.},
MONTH = OCT,
YEAR = 2009,
TITLE = {Visual tracking of independently moving body and arms},
BOOKTITLE = {Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
ADDRESS = {St. Louis, MO, USA},
URL = {papers/conferences/msigalas_iros09.pdf},
HTYPE = {conference}
}
M. Sigalas, G. Vouros
"Energy efficient area coverage in arbitrary sensor networks". European Workshop on Wireless Sensor Networks (ESWN'06) Poster Session, Zurich, Switzerland, February 2006.
Abstract -- Arbitrary sensor networks comprise randomly deployed sensors that may have different capabilities and capacities and are fully autonomous. This paper deals with static nodes in synchronized networks that have different sensing and communication capabilities and different energy capacities. The paper describes our first results towards a fully localized solution for energy efficient area coverage of arbitrary sensor networks that comprise autonomous nodes. According to the proposed solution a node sleeps when (a) it is not needed for preserving the system's connectivity and (b) its sensing area is covered. This solution works very efficiently for nodes with different sensing and communication abilities, relaxing many of the limitations and assumptions made in other proposals.
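The sleep rule (a)+(b) above translates into a small local test, sketched below. The disc geometry (coverage approximated by point sampling) is an illustrative assumption, and the connectivity check is left to a supplied predicate, as it would come from the protocol's connectivity-maintenance layer.

import numpy as np

def sensing_area_covered(node, awake_neighbours, n_samples=200, seed=0):
    # Approximate rule (b): sample points in the node's sensing disc and
    # check that each falls inside some awake neighbour's sensing disc.
    rng = np.random.default_rng(seed)
    ang = rng.uniform(0.0, 2.0 * np.pi, n_samples)
    rad = node["r_sense"] * np.sqrt(rng.uniform(0.0, 1.0, n_samples))
    pts = node["pos"] + np.c_[rad * np.cos(ang), rad * np.sin(ang)]
    return all(any(np.linalg.norm(p - nb["pos"]) <= nb["r_sense"]
                   for nb in awake_neighbours) for p in pts)

def may_sleep(node, awake_neighbours, still_connected):
    # Rule (a): the network stays connected without this node;
    # rule (b): its sensing disc is covered by awake neighbours.
    return still_connected(node) and sensing_area_covered(node, awake_neighbours)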
@INPROCEEDINGS{msigalas_ewsn06,
AUTHOR = {Sigalas, M. and Vouros, G.},
MONTH = FEB,
YEAR = 2006,
TITLE = {Energy Efficient Area Coverage in Arbitrary Sensor Networks},
ADDRESS = {Zurich, Switzerland},
URL = {papers/conferences/msigalas_ewsn06.pdf},
HTYPE = {conference}
}