Antonis Argyros

Professor, Computer Science Department, University of Crete

Researcher, Institute of Computer Science, FORTH




This is a list of my publications in reverse chronological order. All publication fields (authors, paper title, journal/venue, year, abstract, etc.) are searchable.

The documents listed here are provided to ensure timely dissemination of scholarly and technical work. Copyright and all rights to the listed publications are retained by the authors or by the respective publishers, as noted. These works may not be copied for commercial redistribution, republication, dissemination, or reposting without the explicit permission of the copyright holder.



  1. F. Gouidis, K. Papantoniou, K. Papoutsakis, T. Patkos, A.A. Argyros and D. Plexousakis, "Fusing Domain-Specific Content from Large Language Models into Knowledge Graphs for Enhanced Zero Shot Object State Classification", In AAAI 2024 Spring Symposium on Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge (AAAI-MAKE), also available at CoRR, arXiv, Stanford University, USA, March 2024.
    [Abstract] [BibTeX]

  2. Abstract: Domain-specific knowledge can significantly contribute to addressing a wide variety of vision tasks. However, the generation of such knowledge entails considerable human labor and time costs. This study investigates the potential of Large Language Models (LLMs) in generating and providing domain-specific information through semantic embeddings. To achieve this, an LLM is integrated into a pipeline that utilizes Knowledge Graphs and pre-trained semantic vectors in the context of the Vision-based Zero-shot Object State Classification task. We thoroughly examine the behavior of the LLM through an extensive ablation study. Our findings reveal that the integration of LLM-based embeddings, in combination with general-purpose pre-trained embeddings, leads to substantial performance improvements. Drawing insights from this ablation study, we conduct a comparative analysis against competing models, thereby highlighting the state-of-the-art performance achieved by the proposed approach.
    BibTeX:
    @inproceedings{Gouidis2024b,
      author = {Gouidis, Filippos and Papantoniou, Katerina and Papoutsakis, Konstantinos and Patkos, Theodore and Argyros, Antonis A and Plexousakis, Dimitris},
      title = {Fusing Domain-Specific Content from Large Language Models into Knowledge Graphs for Enhanced Zero Shot Object State Classification},
      booktitle = {AAAI 2024 Spring Symposium on Empowering Machine Learning and Large Language Models with Domain and Commonsense Knowledge (AAAI-MAKE), also available at CoRR, arXiv},
      year = {2024},
      month = {March},
      address = {Stanford University, USA},
      projects =  {I.C.HUMANS}
    }
    
  3. F. Gouidis, K. Papantoniou, K. Papoutsakis, T. Patkos, A. Argyros and D. Plexousakis, "Fusing Domain-Specific Content from Large Language Models into Knowledge Graphs for Enhanced Zero Shot Object State Classification", CoRR, arXiv, March 2024.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  4. Abstract: Domain-specific knowledge can significantly contribute to addressing a wide variety of vision tasks. However, the generation of such knowledge entails considerable human labor and time costs. This study investigates the potential of Large Language Models (LLMs) in generating and providing domain-specific information through semantic embeddings. To achieve this, an LLM is integrated into a pipeline that utilizes Knowledge Graphs and pre-trained semantic vectors in the context of the Vision-based Zero-shot Object State Classification task. We thoroughly examine the behavior of the LLM through an extensive ablation study. Our findings reveal that the integration of LLM-based embeddings, in combination with general-purpose pre-trained embeddings, leads to substantial performance improvements. Drawing insights from this ablation study, we conduct a comparative analysis against competing models, thereby highlighting the state-of-the-art performance achieved by the proposed approach.
    BibTeX:
    @arxivarticle{gouidis2024fusing,
      author = {Filippos Gouidis and Katerina Papantoniou and Konstantinos Papoutsakis and Theodore Patkos and Antonis Argyros and Dimitris Plexousakis},
      title = {Fusing Domain-Specific Content from Large Language Models into Knowledge Graphs for Enhanced Zero Shot Object State Classification},
      journal = {CoRR, arXiv},
      year = {2024},
      month = {March},
      url = {https://arxiv.org/abs/2403.12151},
      doi = {10.48550/arXiv.2403.12151},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2024_03_arxiv_fusing_Gouidis.pdf}
    }
    
  5. G. Galanakis, X. Zabulis and A.A. Argyros, "Nearest neighbor-based data denoising for deep metric learning", In International Conference on Computer Vision Theory and Applications (VISAPP 2024), Scitepress, pp. 595-603, Rome, Italy, February 2024.
    [Abstract] [BibTeX] [DOI] [PDF]

  6. Abstract: The effectiveness of supervised deep metric learning relies on the availability of a correctly annotated dataset, i.e., a dataset where images are associated with correct class labels. The presence of incorrect labels in a dataset disorients the learning process. In this paper, we propose an approach to combat the presence of such label noise in datasets. Our approach operates online, during training and on the batch level. It examines the neighborhood of samples, considers which of them are noisy and eliminates them from the current training step. The neighborhood is established using features obtained from the entire dataset during previous training epochs and therefore is updated as the model learns better data representations. We evaluate our approach using multiple datasets and loss functions, and demonstrate that it is better or comparable to the competition. At the same time, in contrast to the competition, it does not require knowledge of the noise contamination rate of the examined datasets.
    BibTeX:
    @inproceedings{Galanakis2024,
      author = {Galanakis, George and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Nearest neighbor-based data denoising for deep metric learning},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2024)},
      publisher = {Scitepress},
      year = {2024},
      month = {February},
      pages = {595--603},
      address = {Rome, Italy},
      projects =  {I.C.HUMANS,CRAEFT},
      doi = {10.5220/0012383000003660},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2024_02_VISAPP_Galanakis.pdf}
    }
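
    Illustrative sketch (Python): a minimal version of the neighborhood-based filtering idea summarized above, assuming a feature bank and labels accumulated during previous epochs. The function name, k and the agreement threshold are illustrative choices, not the paper's implementation.

      import numpy as np
      from sklearn.neighbors import NearestNeighbors

      def filter_noisy_samples(batch_feats, batch_labels, bank_feats, bank_labels,
                               k=10, agreement=0.5):
          """Keep only batch samples whose k nearest neighbors in the feature bank
          mostly share their label; the remaining samples are treated as label noise
          and excluded from the current training step."""
          nn = NearestNeighbors(n_neighbors=k).fit(bank_feats)
          _, idx = nn.kneighbors(batch_feats)            # (B, k) neighbor indices
          neighbor_labels = bank_labels[idx]             # (B, k)
          frac_same = (neighbor_labels == batch_labels[:, None]).mean(axis=1)
          return frac_same >= agreement                  # boolean keep-mask over the batch

      # toy usage: 200 banked features, a batch of 8
      rng = np.random.default_rng(0)
      bank_feats, bank_labels = rng.normal(size=(200, 64)), rng.integers(0, 5, size=200)
      batch_feats, batch_labels = rng.normal(size=(8, 64)), rng.integers(0, 5, size=8)
      print(filter_noisy_samples(batch_feats, batch_labels, bank_feats, bank_labels))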
    
  7. F. Gouidis, K. Papoutsakis, T. Patkos, A.A. Argyros and D. Plexousakis, "Exploring the Impact of Knowledge Graphs on Zero-Shot Visual Object State Classification", In International Conference on Computer Vision Theory and Applications (VISAPP 2024), Scitepress, pp. 738-749, Rome, Italy, February 2024.
    [Abstract] [BibTeX] [DOI] [PDF]

  8. Abstract: In this work, we explore the potential of using Knowledge Graphs (KGs) to leverage an effective Zero-Shot Learning (ZSL) approach for the task of Object State Classification (OSC) in images. On this problem, the performance of traditional supervised learning methods is hindered mainly by data scarcity, as they attempt to encode the highly varying visual features of a multitude of combinations of objects and object states. The ZSL paradigm indicates a promising alternative to enable the classification of object state categories by leveraging structured semantic descriptions acquired by external commonsense knowledge sources - in our case visually grounded KGs. We formulate an effective ZSL scheme by employing a Transformer-based Graph Neural Network model and a pre-trained vision-based CNN model. We also investigate best practices for both the construction and integration of common-sense semantic information based on KGs. In our extensive experimental evaluation, we relied on 5 different knowledge repositories and 30 KGs, which are constructed semi-automatically via querying known object state classes to retrieve contextual information at different node depths. The performance of vision-language models for ZS-OSC is also assessed. Overall, the obtained results suggest performance improvement for ZS-OSC models on all four different image datasets, while both the size of a KG and the sources utilized for their construction are important for task performance.
    BibTeX:
    @inproceedings{Gouidis2024a,
      author = {Gouidis, Filippos and Papoutsakis, Konstantinos and Patkos, Theodore and Argyros, Antonis A and Plexousakis, Dimitris},
      title = {Exploring the Impact of Knowledge Graphs on Zero-Shot Visual Object State Classification},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2024)},
      publisher = {Scitepress},
      year = {2024},
      month = {February},
      pages = {738--749},
      address = {Rome, Italy},
      projects =  {I.C.HUMANS},
      doi = {10.5220/0012383000003660},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2024_02_VISAPP_Gouidis.pdf}
    }
    
  9. G. Savathrakis and A.A. Argyros, "An Automated Method for the Creation of Oriented Bounding Boxes in Remote Sensing Ship Detection Datasets", In IEEE Winter Conference on Applications of Computer Vision Workshops (MaCVis 2024 - WACVW 2024), IEEE, pp. 830-839, Waikoloa, Hawaii, USA, January 2024.
    [Abstract] [BibTeX] [PDF] [URL]

  10. Abstract: In a variety of maritime applications, the task of accurately detecting ships from remote sensing images is of significant importance. Various object detection algorithms localize objects by identifying either their Horizontal Bounding Boxes (HBBs) or their Oriented Bounding Boxes (OBBs). OBBs provide a far more accurate/tighter localization of object regions as well as their orientation. Several ship detection datasets provide annotations that include both HBBs and OBBs. However, many of them do not include OBB annotations. In this work, we propose a method which takes the ships’ HBB annotations as input, and automatically calculates the corresponding OBBs. The proposed method consists of three main parts, (a) object segmentation that is built upon the Segment-Anything Model (SAM) to calculate object masks based on the information provided by the HBBs, (b) morphological filtering which eliminates possible artifacts stemming from the segmentation process, and (c) contour detection applied to the post-processed masks that are used to compute the optimal OBBs of the target objects. By automating the process of OBB annotation, the proposed method permits the exploitation of existing HBB-annotated datasets to train ship detectors of improved performance. We support this finding by reporting the results of several experiments that involve standard datasets, as well as state of the art object detectors
    BibTeX:
    @inproceedings{Savathrakis2024,
      author = {Savathrakis, Giorgos and Argyros, Antonis A},
      title = {An Automated Method for the Creation of Oriented Bounding Boxes in Remote Sensing Ship Detection Datasets},
      booktitle = {IEEE Winter Conference on Applications of Computer Vision Workshops (MaCVis 2024 - WACVW 2024)},
      publisher = {IEEE},
      year = {2024},
      month = {January},
      pages = {830--839},
      address = {Waikoloa, Hawaii, USA},
      url = {https://openaccess.thecvf.com/content/WACV2024W/MaCVi/html/Savathrakis_An_Automated_Method_for_the_Creation_of_Oriented_Bounding_Boxes_WACVW_2024_paper.html},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2024_01_MaCVi_Savathrakis.pdf}
    }
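
    Illustrative sketch (Python/OpenCV): the post-segmentation steps of the pipeline described above, assuming the SAM step has already produced a binary object mask for the HBB crop; the kernel size and the synthetic mask are illustrative, not the paper's settings.

      import cv2
      import numpy as np

      def obb_from_mask(mask, kernel_size=5):
          """Oriented bounding box from a binary object mask: morphological opening to
          remove segmentation artifacts, contour detection on the cleaned mask, and the
          minimum-area rectangle of the largest contour."""
          kernel = np.ones((kernel_size, kernel_size), np.uint8)
          clean = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
          contours, _ = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
          if not contours:
              return None
          rect = cv2.minAreaRect(max(contours, key=cv2.contourArea))  # ((cx, cy), (w, h), angle)
          return cv2.boxPoints(rect)                                  # the 4 OBB corner points

      # toy usage: a rotated elliptical blob standing in for a segmented ship
      mask = np.zeros((200, 200), np.uint8)
      cv2.ellipse(mask, (100, 100), (70, 20), 30, 0, 360, 255, -1)
      print(obb_from_mask(mask))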
    
  11. K. Bacharidis and A. Argyros, "Repetition-aware Image Sequence Sampling for Recognizing Repetitive Human Actions", In IEEE/CVF International Conference on Computer Vision Workshops (ACVR 2023 - ICCVW 2023), IEEE, pp. 1878-1887, Paris, France, October 2023.
    [Abstract] [BibTeX] [PDF]

  12. Abstract: In the field of video-based human action recognition (HAR), standard hand-crafted and deep learning-based approaches are constrained by the computational and memory requirements of their models and the length of the input sequence that can be processed during learning. Sampling techniques employing a windowed or a random clip cropping have been the simplest and most effective ways to cope with limitations on the maximum possible length of the input sequence. However, such designs do not guarantee that the correct ordering of the action steps is captured, or require several learning iterations. In this work we address this problem for the class of repetitive actions. Specifically, given a temporal segmentation of a repetitive action into its repetitive segments, we propose and develop novel approaches for ranking and selecting/sampling segments so as to improve learning in deep models for HAR. We show that by employing the proposed repetition-aware sampling schemes in state-of-the-art deep models for HAR, the action recognition accuracy is increased. The proposed approach is evaluated on existing datasets as well as on a new dataset that is tailored to the quantitative evaluation of the task at hand. The obtained results reveal how our approach performs in relation to various characteristics of the observed repetitive actions (repetition frequency, their effects on scene objects, etc) and demonstrate the obtained performance improvements..
    BibTeX:
    @inproceedings{Bacharidis23,
      author = {Bacharidis, Konstantinos and Argyros, Antonis},
      title = {Repetition-aware Image Sequence Sampling for Recognizing Repetitive Human Actions},
      booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ACVR 2023 - ICCVW 2023)},
      publisher = {IEEE},
      year = {2023},
      month = {October},
      pages = {1878--1887},
      address = {Paris, France},
      projects =  {VMWARE,I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_ACVR_Bacharidis.pdf}
    }
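
    Illustrative sketch (Python): the simplest form of repetition-aware sampling, assuming the temporal segmentation into repetitions is given; the paper proposes ranking and selection schemes beyond this uniform per-repetition sampling.

      import numpy as np

      def sample_by_repetition(boundaries, frames_per_rep=8):
          """Given the frame indices that bound each repetitive segment, draw the same
          number of uniformly spaced frames from every repetition, so each execution of
          the core motif is represented instead of an arbitrary clip of the video."""
          segments = zip(boundaries[:-1], boundaries[1:])
          picks = [np.linspace(s, e - 1, frames_per_rep).astype(int) for s, e in segments]
          return np.concatenate(picks)

      # toy usage: a 300-frame clip segmented into 3 repetitions
      print(sample_by_repetition(boundaries=[0, 100, 210, 300]))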
    
  13. G. Karvounas, N. Kyriazis, I. Oikonomidis and A. Argyros, "Dynamic Multiview Refinement of 3D Hand Datasets using Differentiable Ray Tracing", In IEEE/CVF International Conference on Computer Vision Workshops (AMFG 2023 - ICCVW 2023), IEEE, pp. 3156-3166, Paris, France, October 2023.
    [Abstract] [BibTeX] [PDF]

  14. Abstract: With the increasing importance of AI applications in the field of 3D estimation of hand state, the quality of the datasets used for training the relevant models is of utmost importance. Especially in the case of datasets consisting of real-world images, the quality of annotations, i.e., how accurately the provided ground truth reflects the true state of the scene, can greatly affect the performance of downstream applications. In this work, we propose a methodology with significant impact on improving ubiquitous 3D hand geometry datasets that contain real images with imperfect annotations. Our approach leverages multi-view imagery, temporal consistency, and a disentangled representation of hand shape, texture, and environment lighting. This allows to refine the hand geometry of existing datasets and also paves the way for texture extraction. Extensive experiments on synthetic and real-world data show that our method outperforms the current state of the art, resulting in more accurate and visually pleasing reconstructions of hand gestures.
    BibTeX:
    @inproceedings{Karvounas2023,
      author = {Karvounas, Giorgos and Kyriazis, Nikolaos and Oikonomidis, Iason and Argyros, Antonis},
      title = {Dynamic Multiview Refinement of 3D Hand Datasets using Differentiable Ray Tracing},
      booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (AMFG 2023 - ICCVW 2023)},
      publisher = {IEEE},
      year = {2023},
      month = {October},
      pages = {3156--3166},
      address = {Paris, France},
      projects =  {VMWARE,I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_AMFG_Karvounas.pdf}
    }
    
  15. V. Manousaki, K. Bacharidis, K. Papoutsakis and A. Argyros, "VLMAH: Visual-Linguistic Modeling of Action History for Effective Action Anticipation", In IEEE/CVF International Conference on Computer Vision Workshops (ACVR 2023 - ICCVW 2023), IEEE, pp. 1917-1927, Paris, France, October 2023.
    [Abstract] [BibTeX] [PDF]

  16. Abstract: Although existing methods for action anticipation have shown considerably improved performance on the predictability of future events in videos, the way they exploit information related to past actions is constrained by time duration and encoding complexity. This paper addresses the task of action anticipation by taking into consideration the history of all executed actions throughout long, procedural activities. A novel approach noted as Visual-Linguistic Modeling of Action History (VLMAH) is proposed that fuses the immediate past in the form of visual features as well as the distant past based on a cost-effective form of linguistic constructs (semantic labels of the nouns, verbs, or actions). Our approach generates accurate near-future action predictions during procedural activities by leveraging information on the long- and short-term past. Extensive experimental evaluation was conducted on three challenging video datasets containing procedural activities, namely the Meccano, the Assembly-101, and the 50Salads. The obtained results validate the importance of incorporating long-term action history for action anticipation and document the significant improvement of the state-of-the-art Top-1 accuracy performance.
    BibTeX:
    @inproceedings{Manousaki2023,
      author = {Manousaki, Victoria and Bacharidis, Kostas and Papoutsakis, Konstantinos and Argyros, Antonis},
      title = {VLMAH: Visual-Linguistic Modeling of Action History for Effective Action Anticipation},
      booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ACVR 2023 - ICCVW 2023)},
      publisher = {IEEE},
      year = {2023},
      month = {October},
      pages = {1917--1927},
      address = {Paris, France},
      projects =  {VMWARE,I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_ACVR_Manousaki.pdf}
    }
    
  17. V. Manousaki and A. Argyros, "Partial Alignment of Time Series for Action and Activity Prediction", In Springer Book of VISAPP 2022, selected revised papers, Springer Nature Switzerland, pp. 89-107, October 2023.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  18. Abstract: The temporal alignment of two complete action/activity sequences has been the focus of interest in many research works. However, the problem of partially aligning an incomplete sequence to a complete one has not been sufficiently explored. Very effective alignment algorithms such as Dynamic Time Warping (DTW) and Soft Dynamic Time Warping (S-DTW) are not capable of handling incomplete sequences. To overcome this limitation the Open-End DTW (OE-DTW) and the Open-Begin-End DTW (OBE-DTW) algorithms were introduced. The OE-DTW has the capability to align sequences with common begin points but unknown ending points, while the OBE-DTW has the ability to align unsegmented sequences. We focus on two new alignment algorithms, namely the Open-End Soft DTW (OE-S-DTW) and the Open-Begin-End Soft DTW (OBE-S-DTW) which combine the partial alignment capabilities of OE-DTW and OBE-DTW with those of Soft DTW (S-DTW). Specifically, these algorithms have the segregational capabilities of DTW combined with the soft-minimum operator of the S-DTW algorithm that results in improved, differentiable alignment in the case of continuous, unsegmented actions/activities. The developed algorithms are well-suited tools for addressing the problem of action prediction. By properly matching and aligning an on-going, incomplete action/activity sequence to prototype, complete ones, we may gain insight in what comes next in the on-going action/activity. The proposed algorithms are evaluated on the MHAD, MHAD101-v/-s, MSR Daily Activities and CAD-120 datasets and are shown to outperform relevant state of the art approaches.
    BibTeX:
    @inproceedings{Manousaki2023b,
      author = {Manousaki, Victoria and Argyros, Antonis},
      title = {Partial Alignment of Time Series for Action and Activity Prediction},
      booktitle = {Springer Book of VISAPP 2022, selected revised papers},
      publisher = {Springer Nature Switzerland},
      year = {2023},
      month = {October},
      pages = {89--107},
      url = {https://link.springer.com/chapter/10.1007/978-3-031-45725-8_5},
      projects =  {VMWARE,I.C.HUMANS},
      doi = {10.1007/978-3-031-45725-8_5},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_VISAPP_Manousaki.pdf}
    }
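
    Illustrative sketch (Python): the boundary relaxation that distinguishes OE-/OBE-DTW from standard DTW, written with the hard minimum; the chapter's OE-S-DTW and OBE-S-DTW variants replace this minimum with the soft-minimum of S-DTW. The 1D toy signals are for illustration only.

      import numpy as np

      def obe_dtw(query, reference, open_begin=True, open_end=True):
          """Partial DTW alignment of a (possibly incomplete) query to a reference.
          open_begin lets the match start anywhere in the reference (zero-cost first row);
          open_end lets it stop anywhere (minimum over the last row of the cost matrix)."""
          Q, R = len(query), len(reference)
          cost = np.full((Q + 1, R + 1), np.inf)
          cost[0, 0] = 0.0
          if open_begin:
              cost[0, :] = 0.0                     # free start anywhere in the reference
          for i in range(1, Q + 1):
              for j in range(1, R + 1):
                  d = abs(query[i - 1] - reference[j - 1])
                  cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
          return cost[Q, 1:].min() if open_end else cost[Q, R]

      # toy usage: an incomplete action (first half) aligned against a full prototype
      proto = np.sin(np.linspace(0, 2 * np.pi, 50))
      partial = proto[:25] + 0.05 * np.random.default_rng(0).normal(size=25)
      print(obe_dtw(partial, proto))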
    
  19. V.C. Nicodemou, I. Oikonomidis and A. Argyros, "RV-VAE: Integrating Random Variable Algebra into Variational Autoencoders", In IEEE/CVF International Conference on Computer Vision Workshops (ViPriors 2023 - ICCVW 2023), IEEE, pp. 196-205, Paris, France, October 2023.
    [Abstract] [BibTeX] [PDF]

  20. Abstract: Among deep generative models, variational autoencoders (VAEs) are a central approach in generating new samples from a learned, latent space while effectively reconstructing input data. The original formulation requires a stochastic sampling operation, implemented via the reparameterization trick, to approximate a posterior latent distribution. In this paper, we introduce a novel approach that leverages the full distributions of encoded input to optimize the model over the entire range of the data, instead of discrete samples. We treat the encoded distributions as continuous random variables and use operations defined by the algebra of random variables during decoding. This approach integrates an innate mathematical prior into the model, helping to improve data efficiency and reduce computational load. Experimental results across different datasets and architectures confirm that this modification enhances VAE-based architectures' performance. Specifically, our approach improves the reconstruction error and generative capabilities of several VAE architectures, as measured by the Fréchet Inception Distance (FID) metric, while exhibiting similar or better training convergence behavior. Our method exemplifies the power of combining deep learning with inductive priors, promoting data efficiency and less reliance on brute-force learning.
    BibTeX:
    @inproceedings{Nicodemou023,
      author = {Nicodemou, Vassilis C and Oikonomidis, Iason and Argyros, Antonis},
      title = {RV-VAE: Integrating Random Variable Algebra into Variational Autoencoders},
      booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ViPriors 2023 - ICCVW 2023)},
      publisher = {IEEE},
      year = {2023},
      month = {October},
      pages = {196--205},
      address = {Paris, France},
      projects =  {VMWARE,I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_VIPRIORS_Nicodemou.pdf}
    }
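
    Illustrative sketch (Python): a toy example of the "algebra of random variables" idea, propagating the mean and variance of a diagonal-Gaussian latent analytically through one linear layer instead of drawing a reparameterized sample. This only illustrates the principle for a linear operation; it is not the authors' decoder.

      import numpy as np

      def linear_rv(mu, var, W, b):
          """For independent z_i ~ N(mu_i, var_i): E[Wz + b] = W mu + b and
          Var[Wz + b] = (W * W) var, so the layer can consume the full distribution."""
          return W @ mu + b, (W ** 2) @ var

      # toy usage: a 4-D latent decoded to 3 outputs
      rng = np.random.default_rng(0)
      mu, var = rng.normal(size=4), rng.uniform(0.1, 0.5, size=4)
      W, b = rng.normal(size=(3, 4)), np.zeros(3)
      print(linear_rv(mu, var, W, b))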
    
  21. A. Qammaz and A. Argyros, "A Unified Approach for Occlusion Tolerant 3D Facial Pose Capture and Gaze Estimation using MocapNETs", In IEEE/CVF International Conference on Computer Vision Workshops (AMFG 2023 - ICCVW 2023), IEEE, pp. 3178-3188, Paris, France, October 2023.
    [Abstract] [BibTeX] [PDF]

  22. Abstract: We tackle the challenging problems of 3D facial capture, head pose and gaze estimation. We do so by extending MocapNET, a highly effective deep learning motion capture framework. By leveraging state-of-the-art RGB/2D joint estimators, the proposed network ensemble converts 2D facial keypoints into a real-time 3D Bio-Vision Hierarchy (BVH) skeleton in an end-to-end fashion, incorporating inverse kinematics computations. Our approach achieves satisfactory performance on benchmark datasets and also architecturally excels in challenging scenarios with significant facial occlusions. Moreover, it runs in real-time on CPU, which makes it an ideal choice for applications requiring low-latency interactions. Overall, our unified approach for facial capture, head pose and gaze estimation provides a robust solution for capturing facial expressions and visual focus, with huge potential in HCI and AR/VR applications. Notably, our approach is naturally integrable with MocapNETs for 3D human body and hands pose estimation, offering one of the few state-of-the-art unified approaches that enable holistic recovery of 3D information regarding human gaze, face, upper/lower body, hands, and feet.
    BibTeX:
    @inproceedings{Qammaz2023b,
      author = {Qammaz, Ammar and Argyros, Antonis},
      title = {A Unified Approach for Occlusion Tolerant 3D Facial Pose Capture and Gaze Estimation using MocapNETs},
      booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (AMFG 2023 - ICCVW 2023)},
      publisher = {IEEE},
      year = {2023},
      month = {October},
      pages = {3178--3188},
      address = {Paris, France},
      projects =  {VMWARE,I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_10_AMFG_Qammaz.pdf}
    }
    
  23. D. Drosakis and A. Argyros, "3D Hand Shape and Pose Estimation based on 2D Hand Keypoints", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2023) (Best Technical Paper Award), ACM, pp. 148-153, Corfu, Greece, July 2023.
    [Abstract] [BibTeX] [DOI] [PDF]

  24. Abstract: We present a method for simultaneous 3D hand shape and pose estimation on a single RGB image frame. Specifically, our method fits the MANO 3D hand model to 2D hand keypoints. Fitting is achieved based on a novel 2D objective function that exploits anatomical joint limits, combined with a shape regularization term on the MANO hand model, jointly optimizing the 3D shape and pose of the hand in a single frame. In a series of quantitative experiments on well-established datasets annotated with ground truth, we show that it is possible to obtain reconstructions that are competitive and, in some cases, superior to existing 3D hand pose estimation approaches.
    BibTeX:
    @inproceedings{Drosakis2023,
      author = {Drosakis, Drosakis and Argyros, Antonis},
      title = {3D Hand Shape and Pose Estimation based on 2D Hand Keypoints},
      booktitle = {International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2023) (Best Technical Paper Award)},
      publisher = {ACM},
      year = {2023},
      month = {July},
      pages = {148--153},
      address = {Corfu, Greece},
      projects =  {I.C.HUMANS,SignGuide},
      doi = {10.1145/3594806.3594838},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_07_PETRA_Drosakis.pdf}
    }
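
    Illustrative sketch (Python): the general form of the single-frame fitting objective described above, i.e., a 2D keypoint reprojection error plus a joint-limit penalty and a shape regularizer. The project_joints function, parameter dimensions and weights are placeholders standing in for the MANO layer and camera model, not the paper's formulation.

      import numpy as np
      from scipy.optimize import minimize

      rng = np.random.default_rng(0)
      A = rng.normal(size=(42, 48 + 10))          # hypothetical linear "MANO + camera" stand-in

      def project_joints(theta, beta):
          """Map pose (48-D) and shape (10-D) parameters to 21 2D keypoints (42 values)."""
          return A @ np.concatenate([theta, beta])

      observed_2d = project_joints(0.1 * rng.normal(size=48), np.zeros(10))  # synthetic target

      def objective(params, w_limits=1.0, w_shape=0.01):
          theta, beta = params[:48], params[48:]
          reproj = np.sum((project_joints(theta, beta) - observed_2d) ** 2)  # 2D keypoint term
          limits = np.sum(np.maximum(np.abs(theta) - 2.0, 0.0) ** 2)         # crude joint-limit penalty
          return reproj + w_limits * limits + w_shape * np.sum(beta ** 2)    # + shape regularization

      res = minimize(objective, x0=np.zeros(58), method="L-BFGS-B")
      print(res.fun)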
    
  25. A. Qammaz and A. Argyros, "Compacting MocapNET-based 3D Human Pose Estimation via Dimensionality Reduction", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2023), ACM, pp. 306-312, Corfu, Greece, July 2023.
    [Abstract] [BibTeX] [DOI] [PDF]

  26. Abstract: MocapNETs are state of the art Neural Network (NN) ensembles that estimate 3D human pose based on visual input in the form of an RGB image. They do so by deriving a 3D Bio Vision Hierarchy (BVH) skeleton from estimated 2D human body joint projections. BVH output makes MocapNETs directly compatible with a large variety of 3D graphics engines, where virtual avatars can be directly animated from RGB sources and off-the-shelf webcam input. MocapNETs have satisfactory accuracy and state of the art computational performance that, however, prior to this work was not sufficient for their deployment on embedded devices. In this paper we explore dimensionality reduction via the use of Principal Components Analysis (PCA) as a means to optimize their size and make them applicable to mobile and edge devices. PCA allows (a) reduction of input dimensionality, (b) fine-grained control over the variance covered by the maintained dimensions and, (c) drastic reduction of the total number of model/network parameters without compromising regression accuracy. Extensive experiments on the CMU BVH dataset provide insight on the effective receptive fields for densely connected networks. Moreover, PCA-based dimensionality reduction results in a 35% smaller NN compared to the baseline and derives BVH skeletons without accuracy degradation. As such, the proposed compact NN solution becomes deployable on the Raspberry Pi 4 ARM CPU @ 23Hz.
    BibTeX:
    @inproceedings{Qammaz2023,
      author = {Qammaz, Ammar and Argyros, Antonis},
      title = {Compacting MocapNET-based 3D Human Pose Estimation via Dimensionality Reduction},
      booktitle = {International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2023)},
      publisher = {ACM},
      year = {2023},
      month = {July},
      pages = {306--312},
      address = {Corfu, Greece},
      projects =  {BonsApps,I.C.HUMANS,SignGuide},
      doi = {10.1145/3594806.3594841},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_07_PETRA_Qammaz.pdf}
    }
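
    Illustrative sketch (Python): input compaction with PCA before a dense regression network, keeping enough components to cover a target fraction of the variance. The array sizes, joint count and the 95% figure are illustrative, not the paper's configuration.

      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      X = rng.normal(size=(10000, 2 * 25))         # flattened 2D joint vectors, e.g. 25 joints x (x, y)

      # Fit PCA once on the training inputs, keeping components that explain 95% of the variance.
      pca = PCA(n_components=0.95, svd_solver="full").fit(X)
      X_small = pca.transform(X)                   # compact inputs for a much smaller dense network
      print(X.shape, "->", X_small.shape, "kept components:", pca.n_components_)

      # At inference time, each incoming 2D pose is projected with the same transform.
      z = pca.transform(rng.normal(size=(1, 2 * 25)))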
    
  27. F. Gouidis, T. Patkos, A.A. Argyros and D. Plexousakis, "Leveraging Knowledge Graphs for Zero-Shot Object-agnostic State Classification", CoRR, arXiv, July 2023.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  28. Abstract: We investigate the problem of Object State Classification (OSC) as a zero-shot learning problem. Specifically, we propose the first Object-agnostic State Classification (OaSC) method that infers the state of a certain object without relying on the knowledge or the estimation of the object class. In that direction, we capitalize on Knowledge Graphs (KGs) for structuring and organizing knowledge, which, in combination with visual information, enable the inference of the states of objects in object/state pairs that have not been encountered in the method's training set. A series of experiments investigate the performance of the proposed method in various settings, against several hypotheses and in comparison with state of the art approaches for object attribute classification. The experimental results demonstrate that the knowledge of an object class is not decisive for the prediction of its state. Moreover, the proposed OaSC method outperforms existing methods in all datasets and benchmarks by a great margin.
    BibTeX:
    @arxivarticle{gouidis2023arx,
      author = {Gouidis, Filippos and Patkos, Theodore and Argyros, Antonis A and Plexousakis, Dimitris},
      title = {Leveraging Knowledge Graphs for Zero-Shot Object-agnostic State Classification},
      journal = {CoRR, arXiv},
      year = {2023},
      month = {July},
      url = {https://arxiv.org/abs/2307.12179},
      doi = {10.48550/arXiv.2307.12179},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_07_arxiv_aosc_Gouidis.pdf}
    }
    
  29. N. Vasilikopoulos, N. Kolotouros, A. Tsoli and A.A. Argyros, "TAPE: Temporal Attention-based Probabilistic human pose and shape Estimation", In Scandinavian Conference on Image Analysis (SCIA 2023), also available at CoRR, arXiv, Springer, pp. 318-431, Kittilä, Finland, April 2023.
    [Abstract] [BibTeX] [DOI] [PDF]

  30. Abstract: Reconstructing 3D human pose and shape from monocular videos is a well-studied but challenging problem. Common challenges include occlusions, the inherent ambiguities in the 2D to 3D mapping and the computational complexity of video processing. Existing methods ignore the ambiguities of the reconstruction and provide a single deterministic estimate for the 3D pose. In order to address these issues, we present a Temporal Attention based Probabilistic human pose and shape Estimation method (TAPE) that operates on an RGB video. More specifically, we propose to use a neural network to encode video frames to temporal features using an attention-based neural network. Given these features, we output a per-frame but temporally-informed probability distribution for the human pose using Normalizing Flows. We show that TAPE outperforms state-of-the-art methods in standard benchmarks and serves as an effective video-based prior for optimizationbased human pose and shape estimation.
    BibTeX:
    @inproceedings{Vasilikopoulos2023,
      author = {Vasilikopoulos, Nikolaos and Kolotouros, Nikos and Tsoli, Aggeliki and Argyros, Antonis A},
      title = {TAPE: Temporal Attention-based Probabilistic human pose and shape Estimation},
      booktitle = {Scandinavian Conference on Image Analysis (SCIA 2023), also available at CoRR, arXiv},
      publisher = {Springer},
      year = {2023},
      month = {April},
      pages = {318--431},
      address = {Kittilä, Finland},
      projects =  {I.C.HUMANS},
      doi = {10.1007/978-3-031-31438-4_28},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_04_SCIA_Vasilikopoulos.pdf}
    }
    
  31. N. Vasilikopoulos, N. Kolotouros, A. Tsoli and A.A. Argyros, "TAPE: Temporal Attention-based Probabilistic human pose and shape Estimation", CoRR, arXiv, April 2023.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  32. Abstract: Reconstructing 3D human pose and shape from monocular videos is a well-studied but challenging problem. Common challenges include occlusions, the inherent ambiguities in the 2D to 3D mapping and the computational complexity of video processing. Existing methods ignore the ambiguities of the reconstruction and provide a single deterministic estimate for the 3D pose. In order to address these issues, we present a Temporal Attention based Probabilistic human pose and shape Estimation method (TAPE) that operates on an RGB video. More specifically, we propose to use a neural network to encode video frames to temporal features using an attention-based neural network. Given these features, we output a per-frame but temporally-informed probability distribution for the human pose using Normalizing Flows. We show that TAPE outperforms state-of-the-art methods in standard benchmarks and serves as an effective video-based prior for optimization-based human pose and shape estimation. Code is available at: https: //github.com/nikosvasilik/TAPE.
    BibTeX:
    @arxivarticle{vasilikopoulos2023arx,
      author = {Vasilikopoulos, Nikolaos and Kolotouros, Nikos and Tsoli, Aggeliki and Argyros, Antonis A},
      title = {TAPE: Temporal Attention-based Probabilistic human pose and shape Estimation},
      journal = {CoRR, arXiv},
      year = {2023},
      month = {April},
      url = {https://arxiv.org/abs/2305.00181},
      doi = {10.48550/arXiv.2305.00181},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_04_arxiv_tape_Vasilikopoulos.pdf}
    }
    
  33. S. Panagou, M. Sileo, K. Papoutsakis, F. Fruggiero, A. Qammaz and A.A. Argyros, "Complexity based investigation in collaborative assembly scenarios via non intrusive techniques", Procedia Computer Science, Special issue, 4th International Conference on Industry 4.0 and Smart Manufacturing (ISM 2022), Elsevier, vol. 217, pp. 478-485, 2023.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

    34. Abstract: Human and robot collaboration in assembly tasks is an integral part in modern manufactories. Robots provide advantages in both process and productivity with their repeatability and usability in different tasks, while human operators provide flexibility and can act as safeguards. However, process complexity increases which can lower the overall quality. Increased complexity can negatively influence decision making due to cognitive load on human operators, which can lead to lower quality, be it product, process or human work. Moreover, it can lead to safety risks, human-system error and accidents. In this work, we present the preliminary results on an experiment performed with student-participants, based on an assembly task. The experiment was set up to emulate an industrial assembly, and data collection was performed through qualitative and non-intrusive quantitative methods. Questionnaires were used to assess perceptual task complexity and cognitive load, while a stereo camera provided recordings for after-task analysis on process errors and human work quality based on a 3D skeleton-based human pose estimation and tracking method. The aim of the study is to investigate causes of errors and implications on quality. Future direction of the work is discussed.
    BibTeX:
    @article{Panagou2023,
      author = {Panagou, Sotirios and Sileo, Monica and Papoutsakis, Konstantinos and Fruggiero, Fabio and Qammaz, Ammar and Argyros, Antonis A},
      title = {Complexity based investigation in collaborative assembly scenarios via non intrusive techniques},
      journal = {Procedia Computer Science, Special issue, 4th International Conference on Industry 4.0 and Smart Manufacturing (ISM 2022)},
      publisher = {Elsevier},
      year = {2023},
      volume = {217},
      pages = {478--485},
      address = {Austria and online},
      url = {https://www.sciencedirect.com/science/article/pii/S1877050922023213},
      projects =  {SUSTAGE},
      doi = {10.1016/j.procs.2022.12.243},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_11_ISM_Panagou.pdf}
    }
    
  35. X. Zabulis, N. Partarakis, I. Demeridou, P. Doulgeraki, E. Zidianakis, A. Argyros, M. Theodoridou, Y. Marketakis, C. Meghini, V. Bartalesi, N. Pratelli, C. Holz, P. Streli, M. Meier, M.K. Seidler, L. Werup, P.F. Sichani, S. Manitsaris, G. Senteri, A. Dubois, C. Ringas, A. Ziova, E. Tasiopoulou, D. Kaplanidi, D. Arnaud, P. Hee, G. Canavate, M.-A. Benvenuti and J. Krivokapic, "A Roadmap for Craft Understanding, Education, Training, and Preservation", Heritage, vol. 6, no. 7, pp. 5305-5328, 2023.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  36. Abstract: A roadmap is proposed that defines a systematic approach for craft preservation and its evaluation. The proposed roadmap aims to deepen craft understanding so that blueprints of appropriate tools that support craft documentation, education, and training can be designed while achieving preservation through the stimulation and diversification of practitioner income. In addition to this roadmap, an evaluation strategy is proposed to validate the efficacy of the developed results and provide a benchmark for the efficacy of craft preservation approaches. The proposed contribution aims at the catalyzation of craft education and training with digital aids, widening access and engagement to crafts, economizing learning, increasing exercisability, and relaxing remoteness constraints in craft learning.
    BibTeX:
    @article{Zabulis2023,
      author = {Zabulis, Xenophon and Partarakis, Nikolaos and Demeridou, Ioanna and Doulgeraki, Paraskevi and Zidianakis, Emmanouil and Argyros, Antonis and Theodoridou, Maria and Marketakis, Yannis and Meghini, Carlo and Bartalesi, Valentina and Pratelli, Nicolò and Holz, Christian and Streli, Paul and Meier, Manuel and Seidler, Matias Katajavaara and Werup, Laura and Sichani, Peiman Fallahian and Manitsaris, Sotiris and Senteri, Gavriela and Dubois, Arnaud and Ringas, Chistodoulos and Ziova, Aikaterini and Tasiopoulou, Eleana and Kaplanidi, Danai and Arnaud, David and Hee, Patricia and Canavate, Gregorio and Benvenuti, Marie-Adelaide and Krivokapic, Jelena},
      title = {A Roadmap for Craft Understanding, Education, Training, and Preservation},
      journal = {Heritage},
      year = {2023},
      volume = {6},
      number = {7},
      pages = {5305--5328},
      url = {https://www.mdpi.com/2571-9408/6/7/280},
      projects =  {CRAEFT},
      doi = {10.3390/heritage6070280},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2023_journal_Heritage_CRAEFT.pdf}
    }
    
  37. X. Zabulis, N. Partarakis, A. Argyros, A. Tsoli, A. Qammaz, I. Adami, P. Doulgeraki, E. Karuzaki, A. Chatziantoniou, N. Patsiouras, E. Stefanidi, Z. Stefanidi, A. Rigaki, M. Doulgeraki, A. Patakos, S. Manitsaris, A. Glushkova, B.E.O. Padilla, D. Menychtas, D. Makrygiannis, C. Meghini, V. Bartalesi, D. Metilli, N. Magnenat-Thalmann, E. Bakas, N. Cadi, D. van Dijk, P. de Sterke, M. Wippoo, M. van der Vaart, C. Ringas, M. Fasoula, E. Tasiopoulou, D. Kaplanidi, L. Pannese, V. Nitti, C. Cuenca, A.-L. Carre, A. Dubois, H. Hauser, C. Beisswenger, D. Blatt, I. Neumann and U. Denter, "The Mingei Handbook on Heritage Craft representation and preservation", Zenodo, 2022.
    [BibTeX] [DOI] [PDF] [URL]

  38. BibTeX:
    @book{zabulis2022,
      author = {Xenophon Zabulis and Nikolaos Partarakis and Antonis Argyros and Aggeliki Tsoli and Ammar Qammaz and Ilia Adami and Paraskevi Doulgeraki and Effie Karuzaki and Antonios Chatziantoniou and Nikolaos Patsiouras and Evropi Stefanidi and Zinovia Stefanidi and Anastasia Rigaki and Maria Doulgeraki and Antreas Patakos and Sotiris Manitsaris and Alina Glushkova and Brenda Elizabeth Olivas Padilla and Dimitrios Menychtas and Dimitrios Makrygiannis and Carlo Meghini and Valentina Bartalesi and Daniele Metilli and Nadia Magnenat-Thalmann and Evangelia Bakas and Nedjma Cadi and Dick van Dijk and Pam de Sterke and Meia Wippoo and Merel van der Vaart and Chris Ringas and Maria Fasoula and Eleana Tasiopoulou and Danae Kaplanidi and Lucia Pannese and Vito Nitti and Catherine Cuenca and Anne-Laure Carre and Arnaud Dubois and Hansgeorg Hauser and Cynthia Beisswenger and Dieter Blatt and Ilka Neumann and Ulrike Denter},
      title = {The Mingei Handbook on Heritage Craft representation and preservation},
      publisher = {Zenodo},
      year = {2022},
      month = {October},
      projects =  {MINGEI},
      doi = {10.5281/zenodo.7267365},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_10_book_mingeihandbook.pdf}
    }
    
  39. K. Bacharidis and A.A. Argyros, "Cross-domain Learning in Deep HAR Models via Natural Language Processing on Action Labels", In Advances in Visual Computing (ISVC 2022), Springer, San Diego, USA, October 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  40. Abstract: Nowadays, deep learning approaches lead the state-of-the-art scores in human activity recognition (HAR). However, the supervised nature of these approaches still relies heavily on the size and the quality of the available training datasets. The complexity of activities of existing HAR video datasets ranges from simple coarse actions, such as sitting, to complex activities, consisting of multiple actions with subtle variations in appearance and execution. For the latter, the available datasets rarely contain adequate data samples. In this paper, we propose an approach to exploit the action-related information in action label sentences to combine HAR datasets that share a sufficient amount of actions with high linguistic similarity in their labels. We evaluate the effect of inter- and intra-dataset label linguistic similarity rate in the process of a cross-dataset knowledge distillation. In addition, we propose a deep neural network design that enables joint learning and leverages, for each dataset, the additional training data from the other dataset, for actions with high linguistic similarity. Finally, in a series of quantitative and qualitative experiments, we show that our approach improves the performance for both datasets, compared to a single dataset learning scheme.
    BibTeX:
    @inproceedings{Bacharidis2022a,
      author = {Bacharidis, Konstantinos and Argyros, Antonis A},
      title = {Cross-domain Learning in Deep HAR Models via Natural Language Processing on Action Labels},
      booktitle = {Advances in Visual Computing (ISVC 2022)},
      publisher = {Springer},
      year = {2022},
      month = {October},
      address = {San Diego, USA},
      projects =  {I.C.HUMANS},
      doi = {10.1007/978-3-031-20713-6_27},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_10_isvc_bacharidis.pdf}
    }
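
    Illustrative sketch (Python): measuring linguistic similarity between action labels of two datasets so that highly similar labels can be paired for joint training. TF-IDF cosine similarity is used here as a simplified stand-in for the paper's NLP-based similarity; the labels and threshold are made up.

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.metrics.pairwise import cosine_similarity

      labels_a = ["open the bottle", "pour water into glass", "cut the cucumber"]
      labels_b = ["opening a bottle", "pouring water", "slicing a tomato"]

      # Embed all label sentences in a shared vocabulary and compare across datasets.
      vec = TfidfVectorizer().fit(labels_a + labels_b)
      sim = cosine_similarity(vec.transform(labels_a), vec.transform(labels_b))

      # Pairs above a similarity threshold would be treated as shared actions.
      for i, la in enumerate(labels_a):
          j = sim[i].argmax()
          print(f"{la!r} <-> {labels_b[j]!r}  similarity={sim[i, j]:.2f}")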
    
  41. V. Manousaki, K. Papoutsakis and A.A. Argyros, "Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations", In Advances in Visual Computing (ISVC 2022), also available at CoRR, arXiv, Springer, San Diego, USA, October 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  42. Abstract: We present a novel approach for the visual prediction of human-object interactions in videos. Rather than forecasting the human and object motion or the future hand-object contact points, we aim at predicting (a) the class of the on-going human-object interaction and (b) the class(es) of the next active object(s) (NAOs), i.e., the object(s) that will be involved in the interaction in the near future as well as the time the interaction will occur. Graph matching relies on the efficient Graph Edit distance (GED) method. The experimental evaluation of the proposed approach was conducted using two well-established video datasets that contain human-object interactions, namely the MSR Daily Activities and the CAD120. High prediction accuracy was obtained for both action prediction and NAO forecasting.
    BibTeX:
    @inproceedings{Manousaki2022a,
      author = {Manousaki, Victoria and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations},
      booktitle = {Advances in Visual Computing (ISVC 2022), also available at CoRR, arXiv},
      publisher = {Springer},
      year = {2022},
      month = {October},
      address = {San Diego, USA},
      projects =  {I.C.HUMANS},
      doi = {10.1007/978-3-031-20713-6_23},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_10_isvc_gtf.pdf}
    }
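
    Illustrative sketch (Python): matching a partially observed activity graph to prototype graphs with graph edit distance and predicting the prototype with the smallest distance. The toy graphs and networkx's exact GED are only an illustration of the idea; the paper uses a richer graph-based activity representation and an efficient GED computation.

      import networkx as nx

      def activity_graph(interactions):
          """Tiny activity graph: nodes are hands/objects, edges are observed interactions."""
          g = nx.Graph()
          for a, b in interactions:
              g.add_node(a, label=a)
              g.add_node(b, label=b)
              g.add_edge(a, b)
          return g

      prototypes = {
          "make_cereal": activity_graph([("hand", "bowl"), ("hand", "milk"), ("milk", "bowl")]),
          "stack_objects": activity_graph([("hand", "box"), ("box", "box2")]),
      }
      ongoing = activity_graph([("hand", "bowl"), ("hand", "milk")])   # incomplete observation

      same_label = lambda n1, n2: n1["label"] == n2["label"]
      scores = {name: nx.graph_edit_distance(ongoing, proto, node_match=same_label)
                for name, proto in prototypes.items()}
      print(min(scores, key=scores.get), scores)                       # predicted on-going activity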
    
  43. S. Oprea, P. Martinez-Gonzalez, A. Garcia-Garcia, J.A. Castro-Vargas, S. Orts-Escolano, J. Garcia-Rodriguez and A. Argyros, "A Review on Deep Learning Techniques for Video Prediction", IEEE Transactions on Pattern Analysis and Machine Intelligence, also available at CoRR, arXiv, IEEE, vol. 44, pp. 2806-2826, June 2022.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  44. Abstract: The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction. Defined as a self-supervised learning task, video prediction represents a suitable framework for representation learning, as it demonstrated potential capabilities for extracting meaningful representations of the underlying patterns in natural videos. Motivated by the increasing interest in this task, we provide a review on the deep learning methods for prediction in video sequences. We firstly define the video prediction fundamentals, as well as mandatory background concepts and the most used datasets. Next, we carefully analyze existing video prediction models organized according to a proposed taxonomy, highlighting their contributions and their significance in the field. The summary of the datasets and methods is accompanied with experimental results that facilitate the assessment of the state of the art on a quantitative basis. The paper is summarized by drawing some general conclusions, identifying open research challenges and by pointing out future research directions.
    BibTeX:
    @article{oprea2020b,
      author = {Sergiu Oprea and Pablo Martinez-Gonzalez and Alberto Garcia-Garcia and John Alejandro Castro-Vargas and Sergio Orts-Escolano and Jose Garcia-Rodriguez and Antonis Argyros},
      title = {A Review on Deep Learning Techniques for Video Prediction},
      journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence, also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2022},
      month = {June},
      volume = {44},
      pages = {2806--2826},
      url = {https://doi.org/10.1109/TPAMI.2020.3045007},
      projects =  {FORTH},
      doi = {10.1109/TPAMI.2020.3045007},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_journal_PAMI_videoprediction.pdf}
    }
    
  45. D. Kosmopoulos, C. Constantinopoulos, M. Trigka, D. Papazachariou, K. Antzakas, V. Lampropoulou, A. Argyros, I. Oikonomidis, A. Roussos, N. Partarakis, G. Papagiannakis, K. Grigoriadis, A. Koukouvou and A. Moneda, "Museum Guidance in Sign Language: the SignGuide project vision", In av-cult Workshop, International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2022), ACM, Corfu, Greece, June 2022.
    [Abstract] [BibTeX] [PDF]

  46. Abstract: We present an overview of the SignGuide project. Its main goal is to develop a prototype interactive museum guide system for deaf visitors using mobile devices that will be able to receive visitors’ questions in their native (sign language) with regard to the exhibits, and to provide additional content also in sign language using an avatar or video, utilizing techniques from the field of computer vision and machine learning. The paper presents the basic ideas and technologies involved as well as some preliminary results.
    BibTeX:
    @inproceedings{Kosmopoulos2022a,
      author = {Kosmopoulos, Dimitrios and Constantinopoulos, Constantinos and Trigka, Maria and Papazachariou, Dimitrios and Antzakas, Klimis and Lampropoulou, Venetta and Argyros, Antonis and Oikonomidis, Iason and Roussos, Anastasios and Partarakis, Nikolaos and Papagiannakis, Georgios and Grigoriadis, Konstandinos and Koukouvou, Angeliki and Moneda, Angeliki},
      title = {Museum Guidance in Sign Language: the SignGuide project vision},
      booktitle = {av-cult Workshop, International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2022)},
      publisher = {ACM},
      year = {2022},
      month = {June},
      address = {Corfu, Greece},
      projects =  {SignGuide},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_06_PETRA_SignGuide.pdf}
    }
    
  47. P. Padeleris and A.A. Argyros, "PE-former: Pose Estimation Transformer", In International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI 2022), also available at CoRR, arXiv, LNCS, pp. 3-14, Paris, France, June 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  48. Abstract: Vision transformer architectures have been demonstrated to work very effectively for image classification tasks. Efforts to solve more challenging vision tasks with transformers rely on convolutional backbones for feature extraction. In this paper we investigate the use of a pure transformer architecture (i.e., one with no CNN backbone) for the problem of 2D body pose estimation. We evaluate two ViT architectures on the COCO dataset. We demonstrate that using an encoder-decoder transformer architecture yields state of the art results on this estimation problem.
    BibTeX:
    @inproceedings{Padeleris22a,
      author = {Padeleris, Paschalis and Argyros, Antonis A},
      title = {PE-former: Pose Estimation Transformer},
      booktitle = {International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI 2022), also available at CoRR, arXiv},
      publisher = {LNCS},
      year = {2022},
      month = {June},
      pages = {3--14},
      address = {Paris, France},
      projects =  {FORTH},
      doi = {10.1007/978-3-031-09282-4_1},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_07_ICPRAI_PEformer.pdf}
    }
    
  49. F. Gouidis, T. Patkos, A.A. Argyros and D. Plexousakis, "Detecting Object States vs Detecting Objects: A New Dataset and a Quantitative Experimental Study", In International Conference on Computer Vision Theory and Applications (VISAPP 2022), also available at CoRR, arXiv, Scitepress, pp. 590-600, Online, February 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  50. Abstract: The detection of object states in images (State Detection - SD) is a problem of both theoretical and practical importance and it is tightly interwoven with other important computer vision problems, such as action recognition and affordance detection. It is also highly relevant to any entity that needs to reason and act in dynamic domains, such as robotic systems and intelligent agents. Despite its importance, up to now, the research on this problem has been limited. In this paper, we attempt a systematic study of the SD problem. First, we introduce the Object State Detection Dataset (OSDD), a new publicly available dataset consisting of more than 19,000 annotations for 18 object categories and 9 state classes. Second, using a standard deep learning framework used for Object Detection (OD), we conduct a number of appropriately designed experiments, towards an in-depth study of the behavior of the SD problem. This study enables the setup of a baseline on the performance of SD, as well as its relative performance in comparison to OD, in a variety of scenarios. Overall, the experimental outcomes confirm that SD is harder than OD and that tailored SD methods need to be developed for addressing effectively this significant problem.
    BibTeX:
    @inproceedings{Gouidis2022a,
      author = {Gouidis, Filippos and Patkos, Theodore and Argyros, Antonis A and Plexousakis, Dimitris},
      title = {Detecting Object States vs Detecting Objects: A New Dataset and a Quantitative Experimental Study},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2022), also available at CoRR, arXiv},
      publisher = {Scitepress},
      year = {2022},
      month = {February},
      pages = {590--600},
      address = {Online},
      projects =  {SOCOLA},
      doi = {10.5220/0010882300003124},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_02_VISAPP_Gouidis.pdf}
    }
    
  51. G. Lydakis, I. Oikonomidis, D. Kosmopoulos and A.A. Argyros, "Exploitation of noisy automatic data annotation and its application to hand posture classification", In International Conference on Computer Vision Theory and Applications (VISAPP 2022), Scitepress, pp. 632-641, Online, February 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  52. Abstract: The success of deep learning in recent years relies on the availability of large amounts of accurately annotated training data. In this work, we investigate a technique for utilizing automatically annotated data in classification problems. Using a small number of manually annotated samples, and a large set of data that feature automatically created, noisy labels, our approach trains a Convolutional Neural Network (CNN) in an iterative manner. The automatic annotations are combined with the predictions of the network in order to gradually expand the training set. In order to evaluate the performance of the proposed approach, we apply it to the problem of hand posture recognition from RGB images. We compare the results of training a CNN classifier with and without the use of our technique. Our method yields a significant increase in average classification accuracy, and also decreases the deviation in class accuracies, thus indicating the validity and the usefulness of the proposed approach.
    BibTeX:
    @inproceedings{Lydakis2022,
      author = {Lydakis, Giorgos and Oikonomidis, Iason and Kosmopoulos, Dimitrios and Argyros, Antonis A},
      title = {Exploitation of noisy automatic data annotation and its application to hand posture classification},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2022)},
      publisher = {Scitepress},
      year = {2022},
      month = {February},
      pages = {632--641},
      address = {Online},
      projects =  {HEALTHSIGN},
      doi = {10.5220/0010882300003124},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_02_VISAPP_Lydakis.pdf}
    }
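
    Illustrative sketch (Python): the iterative training-set expansion described above, with a simple classifier standing in for the CNN and synthetic data standing in for images; the agreement/confidence rule and thresholds are illustrative, not the paper's criterion.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      X_manual, y_manual = rng.normal(size=(100, 16)), rng.integers(0, 3, size=100)   # manual labels
      X_auto, y_auto = rng.normal(size=(2000, 16)), rng.integers(0, 3, size=2000)     # noisy automatic labels

      X_train, y_train = X_manual.copy(), y_manual.copy()
      pool = np.ones(len(X_auto), dtype=bool)                  # pool of not-yet-accepted auto samples

      for it in range(3):
          clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
          proba = clf.predict_proba(X_auto[pool])
          pred, conf = clf.classes_[proba.argmax(axis=1)], proba.max(axis=1)
          # Accept pool samples whose confident prediction agrees with the automatic label.
          accept = np.flatnonzero(pool)[(pred == y_auto[pool]) & (conf > 0.5)]
          X_train = np.vstack([X_train, X_auto[accept]])
          y_train = np.concatenate([y_train, y_auto[accept]])
          pool[accept] = False
          print(f"iteration {it}: added {len(accept)} samples, training set size {len(X_train)}")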
    
  53. V. Manousaki and A.A. Argyros, "Segregational Soft Dynamic Time Warping and its Application to Action Prediction", In International Conference on Computer Vision Theory and Applications (VISAPP 2022), Scitepress, pp. 226-235, Online, February 2022.
    [Abstract] [BibTeX] [DOI] [PDF]

  54. Abstract: Aligning the execution of complete actions captured in segmented videos has been a problem explored by Dynamic Time Warping (DTW) and Soft Dynamic Time Warping (S-DTW) algorithms. The limitation of these algorithms is that they cannot align unsegmented actions, i.e., actions that appear between other actions. This limitation is mitigated by the use of two existing DTW variants, namely the Open-End DTW (OE-DTW) and the Open-Begin-End DTW (OBE-DTW). OE-DTW is designed for aligning actions of known begin point but unknown end point, while OBE-DTW handles continuous, completely unsegmented actions with unknown begin and end points. In this paper, we combine the merits of S-DTW with those of OE-DTW and OBE-DTW. In that direction, we propose two new DTW variants, the Open-End Soft DTW (OE-S-DTW) and the Open-Begin-End Soft DTW (OBE-S-DTW). The superiority of the proposed algorithms lies in the combination of the soft-minimum operator and the relaxation of the boundary constraints of S-DTW, with the segregational capabilities of OE-DTW and OBE-DTW, resulting in better and differentiable action alignment in the case of continuous, unsegmented videos. We evaluate the proposed algorithms on the task of action prediction on standard datasets such as MHAD, MHAD101-v/-s, MSR Daily Activities and CAD-120. Our experimental results show the superiority of the proposed algorithms to existing video alignment methods.
    BibTeX:
    @inproceedings{Manousaki2022a,
      author = {Manousaki, Victoria and Argyros, Antonis A},
      title = {Segregational Soft Dynamic Time Warping and its Application to Action Prediction},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2022)},
      publisher = {Scitepress},
      year = {2022},
      month = {February},
      pages = {226--235},
      address = {Online},
      projects =  {I.C.HUMANS,ELIDEK},
      doi = {10.5220/0010882300003124},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_02_VISAPP_Manousaki.pdf}
    }
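    For context on the building blocks combined in this paper, the following is a minimal, generic formulation of the S-DTW soft-minimum operator and of an open-begin-end boundary relaxation; the exact OE-S-DTW and OBE-S-DTW recursions are those defined in the paper and may differ in detail.

    % Soft-minimum operator of S-DTW (smoothing parameter gamma > 0):
    \[ \operatorname{softmin}_{\gamma}(a_1,\dots,a_k) = -\gamma \log \sum_{i=1}^{k} e^{-a_i/\gamma} \]
    % Generic soft-DTW recursion between a prototype x_{1:n} and a stream y_{1:m},
    % with local cost d(.,.):
    \[ r_{i,j} = d(x_i, y_j) + \operatorname{softmin}_{\gamma}\big(r_{i-1,j-1},\, r_{i-1,j},\, r_{i,j-1}\big) \]
    % Open-begin-end relaxation: the prototype may start and end anywhere in the stream,
    % so the first row is free and the alignment cost is read off the entire last row:
    \[ r_{0,j} = 0 \;\; \forall j, \qquad \mathrm{cost} = \operatorname{softmin}_{\gamma}\big(r_{n,1},\dots,r_{n,m}\big) \]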
    
  55. K. Bacharidis and A. Argyros, "Exploiting the Nature of Repetitive Actions for Their Effective and Efficient Recognition", Frontiers in Computer Science, Frontiers, vol. 4, 2022.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  56. Abstract: In the field of human action recognition (HAR), the recognition of actions with large duration is hindered by the memorization capacity limitations of the standard probabilistic and recurrent neural network (R-NN) approaches that are used for temporal sequence modeling. The simplest remedy is to employ methods that reduce the input sequence length, by performing window sampling, pooling, or key-frame extraction. However, due to the nature of the frame selection criteria or the employed pooling operations, the majority of these approaches do not guarantee that the useful, discriminative information is preserved. In this work, we focus on the case of repetitive actions. In such actions, a discriminative, core execution motif is maintained throughout each repetition, with slight variations in execution style and duration. Additionally, scene appearance may change as a consequence of the action. We exploit those two key observations on the nature of repetitive actions to build a compact and efficient representation of long actions by maintaining the discriminative sample information and removing redundant information which is due to task repetitiveness. We show that by partitioning an input sequence based on repetition and by treating each repetition as a discrete sample, HAR models can achieve an increase of up to 4% in action recognition accuracy. Additionally, we investigate the relation between the dataset and action set attributes with this strategy and explore the conditions under which the utilization of repetitiveness for input sequence sampling, is a useful preprocessing step in HAR. Finally, we suggest deep NN design directions that enable the effective exploitation of the distinctive action-related information found in repetitiveness, and evaluate them with a simple deep architecture that follows these principles.
    BibTeX:
    @article{Bacharidis2022a,
      author = {Bacharidis, Konstantinos and Argyros, Antonis},
      title = {Exploiting the Nature of Repetitive Actions for Their Effective and Efficient Recognition},
      journal = {Frontiers in Computer Science},
      publisher = {Frontiers},
      year = {2022},
      volume = {4},
      url = {https://www.frontiersin.org/article/10.3389/fcomp.2022.806027},
      projects =  {I.C.HUMANS},
      doi = {10.3389/fcomp.2022.806027},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_journal_frontiers_Bacharidis.pdf}
    }
    
  57. H. Hauser, C. Beisswenger, N. Partarakis, X. Zabulis, I. Adami, E. Zidianakis, A. Patakos, N. Patsiouras, E. Karuzaki, M. Foukarakis, A. Tsoli, A. Qammaz, A. Argyros, N. Cadi, E. Baka, N.M. Thalmann, B. Olivias, D. Makrygiannis, A. Glushkova, S. Manitsaris, V. Nitti and L. Panesse, "Multimodal Narratives for the Presentation of Silk Heritage in the Museum", Heritage, MDPI, vol. 5, no. 1, pp. 461-487, 2022.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  58. Abstract: In this paper, a representation based on digital assets and semantic annotations is established for Traditional Craft instances, in a way that captures their socio-historic context and preserves both their tangible and intangible Cultural Heritage dimensions. These meaningful and documented experiential presentations are delivered to the target audience through narratives that address a range of uses, including personalized storytelling, interactive Augmented Reality (AR), augmented physical artifacts, Mixed Reality (MR) exhibitions, and the Web. The provided engaging cultural experiences have the potential to have an impact on interest growth and tourism, which can support Traditional Craft communities and institutions. A secondary impact is the attraction of new apprentices through training and demonstrations that guarantee long-term preservation. The proposed approach is demonstrated in the context of textile manufacturing as practiced by the community of the Haus der Seidenkultur, a former silk factory that was turned into a museum where the traditional craft of Jacquard weaving is still practiced.
    BibTeX:
    @article{Partarakis2022,
      author = {Hauser, Hansgeorg and Beisswenger, Cynthia and Partarakis, Nikolaos and Zabulis, Xenophon and Adami, Ilia and Zidianakis, Emmanouil and Patakos, Andreas and Patsiouras, Nikolaos and Karuzaki, Effie and Foukarakis, Michalis and Tsoli, Aggeliki and Qammaz, Ammar and Argyros, Antonis and Cadi, Nedjma and Baka, Evangelia and Thalmann, Nadia Magnenat and Olivias, Brenda and Makrygiannis, Dimitrios and Glushkova, Alina and Manitsaris, Sotirios and Nitti, Vito and Panesse, Lucia},
      title = {Multimodal Narratives for the Presentation of Silk Heritage in the Museum},
      journal = {Heritage},
      publisher = {MDPI},
      year = {2022},
      volume = {5},
      number = {1},
      pages = {461--487},
      url = {https://www.mdpi.com/2571-9408/5/1/27},
      projects =  {MINGEI},
      doi = {10.3390/heritage5010027},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_journal_mingei.pdf}
    }
    
  59. V. Manousaki, K. Papoutsakis and A. Argyros, "Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations", CoRR, Arxiv, 2022.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  60. Abstract: We present a novel approach for the visual prediction of human-object interactions in videos. Rather than forecasting the human and object motion or the future hand-object contact points, we aim at predicting (a) the class of the on-going human-object interaction and (b) the class(es) of the next active object(s) (NAOs), i.e., the object(s) that will be involved in the interaction in the near future as well as the time the interaction will occur. Graph matching relies on the efficient Graph Edit distance (GED) method. The experimental evaluation of the proposed approach was conducted using two well-established video datasets that contain human-object interactions, namely the MSR Daily Activities and the CAD120. High prediction accuracy was obtained for both action prediction and NAO forecasting.
    BibTeX:
    @arxivarticle{Manousaki2022arxiv,
      author = {Manousaki, Victoria and Papoutsakis, Konstantinos and Argyros, Antonis},
      title = {Graphing the Future: Activity and Next Active Object Prediction using Graph-based Activity Representations},
      journal = {CoRR, Arxiv},
      year = {2022},
      url = {https://arxiv.org/abs/2209.05194},
      projects =  {I.C.HUMANS},
      doi = {10.48550/ARXIV.2209.05194},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2022_09_arxiv_gtf.pdf}
    }
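    As a toy illustration of the graph-matching step named in the abstract above (the construction of the activity graphs here is invented purely for the example), graph edit distance between an ongoing-activity graph and prototype graphs can be computed with networkx:

    # Toy sketch: pick the prototype activity whose graph is closest, in graph edit
    # distance, to the graph of the ongoing activity. Nodes and edges are placeholders.
    import networkx as nx

    def activity_graph(objects, interactions):
        g = nx.Graph()
        for o in objects:
            g.add_node(o, label=o)
        g.add_edges_from(interactions)
        return g

    prototypes = {
        "drink": activity_graph(["hand", "cup", "mouth"], [("hand", "cup"), ("cup", "mouth")]),
        "call":  activity_graph(["hand", "phone", "ear"], [("hand", "phone"), ("phone", "ear")]),
    }
    ongoing = activity_graph(["hand", "cup"], [("hand", "cup")])  # partially observed activity

    same_label = lambda a, b: a["label"] == b["label"]
    ged = {name: nx.graph_edit_distance(ongoing, proto, node_match=same_label)
           for name, proto in prototypes.items()}
    print("predicted activity:", min(ged, key=ged.get), ged)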
    
  61. F. Gouidis, T. Patkos, A.A. Argyros and D. Plexousakis, "Detecting Object States vs Detecting Objects: A New Dataset and a Quantitative Experimental Study", CoRR, Arxiv, December 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  62. Abstract: The detection of object states in images (State Detection - SD) is a problem of both theoretical and practical importance and it is tightly interwoven with other important computer vision problems, such as action recognition and affordance detection. It is also highly relevant to any entity that needs to reason and act in dynamic domains, such as robotic systems and intelligent agents. Despite its importance, up to now, the research on this problem has been limited. In this paper, we attempt a systematic study of the SD problem. First, we introduce the Object State Detection Dataset (OSDD), a new publicly available dataset consisting of more than 19,000 annotations for 18 object categories and 9 state classes. Second, using a standard deep learning framework used for Object Detection (OD), we conduct a number of appropriately designed experiments, towards an in-depth study of the behavior of the SD problem. This study enables the setup of a baseline on the performance of SD, as well as its relative performance in comparison to OD, in a variety of scenarios. Overall, the experimental outcomes confirm that SD is harder than OD and that tailored SD methods need to be developed for addressing effectively this significant problem.
    BibTeX:
    @arxivarticle{Gouidis2021b,
      author = {Gouidis, Filippos and Patkos, Theodore and Argyros, Antonis A and Plexousakis, Dimitris},
      title = {Detecting Object States vs Detecting Objects: A New Dataset and a Quantitative Experimental Study},
      journal = {CoRR, Arxiv},
      year = {2021},
      month = {December},
      url = {https://arxiv.org/abs/2112.08281},
      projects =  {SOCOLA},
      doi = {10.48550/arXiv.2112.08281},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_12_arxiv_Gouidis.pdf}
    }
    
  63. P. Panteleris and A. Argyros, "PE-former: Pose Estimation Transformer", CoRR, Arxiv, December 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  64. Abstract: Vision transformer architectures have been demonstrated to work very effectively for image classification tasks. Efforts to solve more challenging vision tasks with transformers rely on convolutional backbones for feature extraction. In this paper we investigate the use of a pure transformer architecture (i.e., one with no CNN backbone) for the problem of 2D body pose estimation. We evaluate two ViT architectures on the COCO dataset. We demonstrate that using an encoder-decoder transformer architecture yields state of the art results on this estimation problem.
    BibTeX:
    @arxivarticle{Padeleris2021,
      author = {Paschalis Panteleris and Antonis Argyros},
      title = {PE-former: Pose Estimation Transformer},
      journal = {CoRR, Arxiv},
      year = {2021},
      month = {December},
      url = {https://arxiv.org/abs/2112.04981},
      projects =  {FORTH},
      doi = {10.48550/arXiv.2112.04981},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_12_arxiv_Padeleris.pdf}
    }
    
  65. P. Schillinger, S. García, A. Makris, K. Roditakis, M. Logothetis, K. Alevizos, W. Ren, P. Tajvar, P. Pelliccione, A. Argyros, K.J. Kyriakopoulos and D.V. Dimarogonas, "Adaptive heterogeneous multi-robot collaboration from formal task specifications", Robotics and Autonomous Systems, vol. 145, November 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  66. Abstract: Efficiently coordinating different types of robots is an important enabler for many commercial and industrial automation tasks. Here, we present a distributed framework that enables a team of heterogeneous robots to dynamically generate actions from a common, user-defined goal specification. In particular, we discuss the integration of various robotic capabilities into a common task allocation and planning formalism, as well as the specification of expressive, temporally-extended goals by non-expert users. Models for task allocation and execution both consider non-deterministic outcomes of actions and thus, are suitable for a wide range of real-world tasks including formally specified reactions to online observations. One main focus of our paper is to evaluate the framework and its integration of software modules through a number of experiments. These experiments comprise industry-inspired scenarios as motivated by future real-world applications. Finally, we discuss the results and learnings for motivating practically relevant, future research questions.
    BibTeX:
    @article{SCHILLINGER2021103866,
      author = {Philipp Schillinger and Sergio García and Alexandros Makris and Konstantinos Roditakis and Michalis Logothetis and Konstantinos Alevizos and Wei Ren and Pouria Tajvar and Patrizio Pelliccione and Antonis Argyros and Kostas J. Kyriakopoulos and Dimos V. Dimarogonas},
      title = {Adaptive heterogeneous multi-robot collaboration from formal task specifications},
      journal = {Robotics and Autonomous Systems},
      year = {2021},
      month = {November},
      volume = {145},
      url = {https://www.sciencedirect.com/science/article/pii/S0921889021001512},
      projects =  {CO4ROBOTS},
      doi = {10.1016/j.robot.2021.103866},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_journal_RAS_adaptive_co4robots.pdf}
    }
    
  67. G. Karvounas, N. Kyriazis, I. Oikonomidis, A. Tsoli and A.A. Argyros, "Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing", In British Machine Vision Conference (BMVC 2021), BMVA, Virtual, UK, November 2021.
    [Abstract] [BibTeX] [PDF]

  68. Abstract: The amount and quality of datasets and tools available in the research field of hand pose and shape estimation act as evidence to the significant progress that has been made. We find that there is still room for improvement in both fronts, and even beyond. Even the datasets of the highest quality, reported to date, have shortcomings in annotation. There are tools in the literature that can assist in that direction and yet they have not been considered, so far. To demonstrate how these gaps can be bridged, we employ such a publicly available, multi-camera dataset of hands (InterHands), and perform effective image-based refinement to improve on the imperfect ground truth annotations, yielding a better dataset. The image-based refinement is achieved through raytracing, a method that has not been employed so far to relevant problems and is hereby shown to be superior to the approximative alternatives that have been employed in the past. To tackle the lack of reliable ground truth, we resort to realistic synthetic data, to show that the improvement we induce is indeed significant, qualitatively, and quantitatively, too.
    BibTeX:
    @inproceedings{Karvounas2021,
      author = {Karvounas, Giorgos and Kyriazis, Nikolaos and Oikonomidis, Iason and Tsoli, Aggeliki and Argyros, Antonis A},
      title = {Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing},
      booktitle = {British Machine Vision Conference (BMVC 2021)},
      publisher = {BMVA},
      year = {2021},
      month = {November},
      address = {Virtual, UK},
      projects =  {I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_11_BMVC_Karvounas.pdf}
    }
    
  69. A. Qammaz and A.A. Argyros, "Towards Holistic Real-time Human 3D Pose Estimation using MocapNETs", In British Machine Vision Conference (BMVC 2021), BMVA, Virtual, UK, November 2021.
    [Abstract] [BibTeX] [PDF]

  70. Abstract: In this work, we extend a method originally devised for 3D body pose estimation to tackle the 3D hand pose estimation task. Due to its compositionality and compact Bio Vision Hierarchy (BVH) output, the resulting method can be combined with the original body 3D pose estimation method. This is achieved based on a novel neural network architecture combining key design characteristics of DenseNets, ResNets and MocapNETs trainable to accommodate both bodies and hands. The resulting method is assessed quantitatively in well-established hand and body pose estimation datasets. The obtained results show that the proposed enhancements result in competitive performance for hands, as well as on accuracy and performance benefits for the original body estimation task. Moreover, we show qualitatively that due to its real-time performance and easy deployment using off-the-shelf webcam equipped PCs, the proposed solution can become a valuable perceptual building block supporting a variety of applications.
    BibTeX:
    @inproceedings{Qammaz2021b,
      author = {Qammaz, Ammar and Argyros, Antonis A},
      title = {Towards Holistic Real-time Human 3D Pose Estimation using MocapNETs},
      booktitle = {British Machine Vision Conference (BMVC 2021)},
      publisher = {BMVA},
      year = {2021},
      month = {November},
      address = {Virtual, UK},
      projects =  {I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_11_BMVC_Qammaz.pdf}
    }
    
  71. D. Bautembach, I. Oikonomidis and A. Argyros, "Even Faster SNN Simulation with Lazy+Event-driven Plasticity and Shared Atomics", In High Performance Extreme Computing (HPEC 2021), also available at CoRR, arXiv, September 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  72. Abstract: We present two novel optimizations that accelerate clock-based spiking neural network (SNN) simulators. The first one targets spike timing dependent plasticity (STDP). It combines lazy- with event-driven plasticity and efficiently facilitates the computation of pre- and post-synaptic spikes using bitfields and integer intrinsics. It offers higher bandwidth than event-driven plasticity alone and achieves a 1.5x-2x speedup over our closest competitor. The second optimization targets spike delivery. We partition our graph representation in a way that bounds the number of neurons that need be updated at any given time which allows us to perform said update in shared memory instead of global memory. This is 2x-2.5x faster than our closest competitor. Both optimizations represent the final evolutionary stages of years of iteration on STDP and spike delivery inside "Spice" (/spaIk/), our state of the art SNN simulator. The proposed optimizations are not exclusive to our graph representation or pipeline but are applicable to a multitude of simulator designs. We evaluate our performance on three well-established models and compare ourselves against three other state of the art simulators.
    BibTeX:
    @inproceedings{Bautembach2021b,
      author = {Dennis Bautembach and Iason Oikonomidis and Antonis Argyros},
      title = {Even Faster SNN Simulation with Lazy+Event-driven Plasticity and Shared Atomics},
      booktitle = {High Performance Extreme Computing (HPEC 2021), also available at CoRR, arXiv},
      year = {2021},
      month = {September},
      url = {https://arxiv.org/abs/2107.04092},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_09_HPEC_Bautembach.pdf}
    }
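    The bitfield idea mentioned in the abstract (recent spike history stored as bits, so that deferred plasticity updates reduce to integer population counts) can be sketched in a few lines; the word width, the windowing and the weight-update rule below are illustrative assumptions, not the simulator's actual data layout.

    # Minimal sketch: per-neuron spike history kept as a 32-bit bitfield, updated every
    # step but consumed lazily; popcounts replace per-timestep plasticity bookkeeping.
    WORD = 32

    def push_spike(history, fired):
        """Shift one new timestep (bit 0 = most recent) into every neuron's bitfield."""
        return [((h << 1) | int(f)) & ((1 << WORD) - 1) for h, f in zip(history, fired)]

    def lazy_update(pre_hist, post_hist, weight, lr=0.01):
        """Deferred, coarse plasticity update driven by spike-count statistics (made up)."""
        pre, post = bin(pre_hist).count("1"), bin(post_hist).count("1")  # integer popcounts
        return weight + lr * (pre * post - 0.5 * pre)

    pre_hist, post_hist = [0, 0], [0, 0]
    for t in range(WORD):
        pre_hist = push_spike(pre_hist, [t % 3 == 0, t % 5 == 0])
        post_hist = push_spike(post_hist, [t % 4 == 0, t % 7 == 0])
    print(f"updated weight: {lazy_update(pre_hist[0], post_hist[1], weight=0.2):.4f}")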
    
  73. V. Manousaki, K. Papoutsakis and A.A. Argyros, "Action Prediction During Human-Object Interaction based on DTW and Early Fusion of Human and Object Representations", In International Conference on Vision Systems (ICVS 2021), Springer, pp. 169-179, Vienna, Austria, September 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  74. Abstract: Action prediction is defined as the inference of an action label while the action is still ongoing. Such a capability is extremely useful for early response and further action planning. In this paper, we consider the problem of action prediction in scenarios involving humans interacting with objects. We formulate an approach that builds time series representations of the performance of the humans and the objects. Such a representation of an ongoing action is then compared to prototype actions. This is achieved by a Dynamic Time Warping (DTW)-based time series alignment framework which identifies the best match between the ongoing action and the prototype ones. Our approach is evaluated quantitatively on three standard benchmark datasets. Our experimental results reveal the importance of the fusion of human- and object-centered action representations in the accuracy of action prediction. Moreover, we demonstrate that the proposed approach achieves significantly higher action prediction accuracy compared to competitive methods.
    BibTeX:
    @inproceedings{Manousaki2021,
      author = {Manousaki, Victoria and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Action Prediction During Human-Object Interaction based on DTW and Early Fusion of Human and Object Representations},
      booktitle = {International Conference on Vision Systems (ICVS 2021)},
      publisher = {Springer},
      year = {2021},
      month = {September},
      pages = {169--179},
      address = {Vienna, Austria},
      url = {http://www.scitepress.org/PublicationsDetail.aspx?ID=x1JKg2Ydp4w=&t=1},
      projects =  {ELIDEK},
      doi = {10.1007/978-3-030-87156-7_14},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_09_ICVS_Manousaki.pdf}
    }
    
  75. D. Bautembach, I. Oikonomidis and A. Argyros, "Multi-GPU SNN Simulation with Static Load Balancing", In IEEE International Joint Conference of Neural Networks (IJCNN 2021), also available at CoRR, arXiv, July 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  76. Abstract: We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user's and maintainer's side. This is made possible by designing our pipeline around "work queues" which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
    BibTeX:
    @inproceedings{Bautembach2021b,
      author = {Dennis Bautembach and Iason Oikonomidis and Antonis Argyros},
      title = {Multi-GPU SNN Simulation with Static Load Balancing},
      booktitle = {IEEE International Joint Conference of Neural Networks (IJCNN 2021), also available at CoRR, arXiv},
      year = {2021},
      month = {July},
      url = {https://arxiv.org/abs/2102.04681},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_02_arxiv_bautembach.pdf}
    }
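    The "work queue" organization highlighted in the abstract (pipeline stages that communicate only through queues of pending items, so each stage touches exactly the work produced by the previous one) can be illustrated with a small sketch; the stage names, threshold and synapse table are invented for the example and do not reflect the simulator's internals.

    # Sketch of two pipeline stages connected by a work queue: stage 1 enqueues only
    # the neurons that fired, stage 2 processes exactly that set.
    from collections import deque

    def update_neurons(potentials, threshold=1.0):
        """Stage 1: advance neuron state and enqueue the indices of neurons that fired."""
        queue = deque()
        for i, v in enumerate(potentials):
            if v >= threshold:
                potentials[i] = 0.0   # reset after the spike
                queue.append(i)       # hand the spike over to the next stage
        return queue

    def deliver_spikes(queue, synapses, potentials):
        """Stage 2: consume the queue, touching only post-synaptic targets of actual spikes."""
        while queue:
            src = queue.popleft()
            for dst, w in synapses.get(src, []):
                potentials[dst] += w

    potentials = [1.2, 0.3, 1.5, 0.9]
    synapses = {0: [(1, 0.4)], 2: [(1, 0.2), (3, 0.3)]}
    deliver_spikes(update_neurons(potentials), synapses, potentials)
    print(potentials)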
    
  77. S. Oprea, G. Karvounas, P. Martinez-Gonzalez, N. Kyriazis, S. Orts-Escolano, I. Oikonomidis, A. Garcia-Garcia, A. Tsoli, J. Garcia-Rodriguez and A. Argyros, "H-GAN: the power of GANs in your Hands", In IEEE International Joint Conference of Neural Networks (IJCNN 2021), also available at CoRR, arXiv, IEEE, pp. 1-8, July 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  78. Abstract: We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach implementing multi-scale perceptual discriminators. It is designed to translate synthetic images of hands to the real domain. Synthetic hands provide complete ground-truth annotations, yet they are not representative of the target distribution of real-world data. We strive to provide the perfect blend of a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we improve synthetic hands' appearance to approximate the statistical distribution underlying a collection of real images of hands. H-GAN tackles not only cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated on a qualitative and quantitative basis improving previous works. Furthermore, we successfully apply the generated images to the hand classification task.
    BibTeX:
    @inproceedings{Oprea2021b,
      author = {Sergiu Oprea and Giorgos Karvounas and Pablo Martinez-Gonzalez and Nikolaos Kyriazis and Sergio Orts-Escolano and Iason Oikonomidis and Alberto Garcia-Garcia and Aggeliki Tsoli and Jose Garcia-Rodriguez and Antonis Argyros},
      title = {H-GAN: the power of GANs in your Hands},
      booktitle = {IEEE International Joint Conference of Neural Networks (IJCNN 2021), also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2021},
      month = {July},
      pages = {1--8},
      url = {https://arxiv.org/abs/2103.15017},
      projects =  {FORTH},
      doi = {10.1109/IJCNN52387.2021.9534144},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_03_arxiv_oprea.pdf}
    }
    
  79. D. Bautembach, I. Oikonomidis and A. Argyros, "Even Faster SNN Simulation with Lazy+Event-driven Plasticity and Shared Atomics", CoRR, Arxiv, July 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  80. Abstract: We present two novel optimizations that accelerate clock-based spiking neural network (SNN) simulators. The first one targets spike timing dependent plasticity (STDP). It combines lazy- with event-driven plasticity and efficiently facilitates the computation of pre- and post-synaptic spikes using bitfields and integer intrinsics. It offers higher bandwidth than event-driven plasticity alone and achieves a 1.5x-2x speedup over our closest competitor. The second optimization targets spike delivery. We partition our graph representation in a way that bounds the number of neurons that need be updated at any given time which allows us to perform said update in shared memory instead of global memory. This is 2x-2.5x faster than our closest competitor. Both optimizations represent the final evolutionary stages of years of iteration on STDP and spike delivery inside "Spice" (/spaIk/), our state of the art SNN simulator. The proposed optimizations are not exclusive to our graph representation or pipeline but are applicable to a multitude of simulator designs. We evaluate our performance on three well-established models and compare ourselves against three other state of the art simulators.
    BibTeX:
    @arxivarticle{db2021,
      author = {Dennis Bautembach and Iason Oikonomidis and Antonis Argyros},
      title = {Even Faster SNN Simulation with Lazy+Event-driven Plasticity and Shared Atomics},
      journal = {CoRR, Arxiv},
      year = {2021},
      month = {July},
      url = {https://arxiv.org/abs/2107.04092},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_07_arxiv_Bautembach.pdf}
    }
    
  81. G. Karvounas, N. Kyriazis, I. Oikonomidis, A. Tsoli and A. Argyros, "Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing", CoRR, Arxiv, July 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  82. Abstract: The amount and quality of datasets and tools available in the research field of hand pose and shape estimation act as evidence to the significant progress that has been made. We find that there is still room for improvement in both fronts, and even beyond. Even the datasets of the highest quality, reported to date, have shortcomings in annotation. There are tools in the literature that can assist in that direction and yet they have not been considered, so far. To demonstrate how these gaps can be bridged, we employ such a publicly available, multi-camera dataset of hands (InterHand2.6M), and perform effective image-based refinement to improve on the imperfect ground truth annotations, yielding a better dataset. The image-based refinement is achieved through raytracing, a method that has not been employed so far to relevant problems and is hereby shown to be superior to the approximative alternatives that have been employed in the past. To tackle the lack of reliable ground truth, we resort to realistic synthetic data, to show that the improvement we induce is indeed significant, qualitatively, and quantitatively, too.
    BibTeX:
    @arxivarticle{gkarv2021,
      author = {Giorgos Karvounas and Nikolaos Kyriazis and Iason Oikonomidis and Aggeliki Tsoli and Antonis Argyros},
      title = {Multi-view Image-based Hand Geometry Refinement using Differentiable Monte Carlo Ray Tracing},
      journal = {CoRR, Arxiv},
      year = {2021},
      month = {July},
      url = {http://arxiv.org/abs/2107.05509},
      projects =  {I.C.HUMANS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_07_arxiv_Karvounas.pdf}
    }
    
  83. M. Logothetis, G. Karras, K. Alevizos, C. Verginis, P. Roque, K. Roditakis, A. Makris, S. García, P. Schillinger, A.D. Fava, P. Pelliccione, A. Argyros, K. Kyriakopoulos and D. Dimarogonas, "A Decentralized Framework for Efficient Cooperation of Heterogeneous Robotic Agents", IEEE Robotics and Automation Magazine, IEEE, vol. 28, no. 2, pp. 74-87, June 2021.
    [Abstract] [BibTeX] [DOI] [PDF]

  84. Abstract: In social and industrial facilities of the future like hospitals, hotels, and warehouses, teams of robots will be deployed to assist humans accomplish everyday tasks such as object handling, transportation, or pickup and delivery operations. In such a context, different robots (e.g., mobile platforms, static manipulators, or mobile manipulators) with different actuation, manipulation, and perception capabilities must be coordinated in order to achieve various complex tasks (e.g. cooperative parts assembly in automotive industry, or loading and unloading of palettes in warehouses) that require collaborative actions with each other and with human operators (Figure 1). The efficient supervision and coordination of a heterogeneous system mandates a decentralized framework that integrates high-level task-planning, low-level motion planning and control, and robust real-time sensing of the robots dynamic environment. Decentralization in multi-agent robotic systems is of utmost importance, since it provides flexibility, scalability and fault-tolerance capabilities. In this work, we present the architecture of the decentralized framework developed within the context of EU Project Co4Robots and its application in a multi-tasking collaboration scenario involving various heterogeneous robots and humans.
    BibTeX:
    @article{Logothetis21,
      author = {Michalis Logothetis and George Karras and Konstantinos Alevizos and Christos Verginis and Pedro Roque and Konstantinos Roditakis and Alexandros Makris and Sergio García and Philipp Schillinger and Alessandro Di Fava and Patrizio Pelliccione and Antonis Argyros and Kostas Kyriakopoulos and Dimos Dimarogonas},
      title = {A Decentralized Framework for Efficient Cooperation of Heterogeneous Robotic Agents},
      journal = {IEEE Robotics and Automation Magazine},
      publisher = {IEEE},
      year = {2021},
      month = {June},
      volume = {28},
      number = {2},
      pages = {74--87},
      projects =  {CO4ROBOTS},
      doi = {10.1109/MRA.2021.3064761},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_journal_RAM_co4robots.pdf}
    }
    
  85. K. Roditakis, A. Makris and A.A. Argyros, "Towards Improved and Interpretable Action Quality Assessment with Self-Supervised Alignment", In HumanInteract Workshop, in conjunction with International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2021), ACM, Corfu, Greece, June 2021.
    [Abstract] [BibTeX] [DOI] [PDF]

  86. Abstract: Action Quality Assessment (AQA) is a video understanding task aiming at the quantification of the execution quality of an action. One of the main challenges in relevant, deep learning-based approaches is the collection of training data annotated by experts. Current methods perform fine-tuning on pre-trained backbone models and aim to improve performance by modelling the subjects and the scene. In this work we consider embeddings extracted using a self-supervised training method based on a differential cycle consistency loss between sequences of actions. These are shown to improve the state-of-the-art without the need for additional annotations or scene modelling. The same embeddings are also used to temporally align the sequences prior to quality assessment which further increases the accuracy, provides robustness to variance in execution speed and enables us to provide fine-grained interpretability of the assessment score. The experimental evaluation of the method on the MTL-AQA dataset, demonstrates significant accuracy gain compared to the state-of-the-art baselines which grows even more when the action execution sequences are not well aligned.
    BibTeX:
    @inproceedings{Roditakis2021,
      author = {Roditakis, Konstantinos and Makris, Alexandros and Argyros, Antonis A},
      title = {Towards Improved and Interpretable Action Quality Assessment with Self-Supervised Alignment},
      booktitle = {HumanInteract Workshop, in conjunction with International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2021)},
      publisher = {ACM},
      year = {2021},
      month = {June},
      address = {Corfu, Greece},
      projects =  {FORTH},
      doi = {10.1145/3453892.3461624},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_06_PETRA_AQA-Roditakis.pdf}
    }
    
  87. D. Bautembach, I. Oikonomidis and A. Argyros, "Multi-GPU SNN Simulation with Static Load Balancing", CoRR, arXiv, April 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  88. Abstract: We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user's and maintainer's side. This is made possible by designing our pipeline around "work queues" which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
    BibTeX:
    @arxivarticle{Bautembach2021c,
      author = {Dennis Bautembach and Iason Oikonomidis and Antonis Argyros},
      title = {Multi-GPU SNN Simulation with Static Load Balancing},
      journal = {CoRR, arXiv},
      year = {2021},
      month = {April},
      url = {https://arxiv.org/abs/2102.04681},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_02_arxiv_bautembach.pdf}
    }
    
  89. S. Oprea, G. Karvounas, P. Martinez-Gonzalez, N. Kyriazis, S. Orts-Escolano, I. Oikonomidis, A. Garcia-Garcia, A. Tsoli, J. Garcia-Rodriguez and A. Argyros, "H-GAN: the power of GANs in your Hands", CoRR, arXiv, March 2021.
    [Abstract] [BibTeX] [PDF] [URL]

  90. Abstract: We present HandGAN (H-GAN), a cycle-consistent adversarial learning approach implementing multi-scale perceptual discriminators. It is designed to translate synthetic images of hands to the real domain. Synthetic hands provide complete ground-truth annotations, yet they are not representative of the target distribution of real-world data. We strive to provide the perfect blend of a realistic hand appearance with synthetic annotations. Relying on image-to-image translation, we improve synthetic hands' appearance to approximate the statistical distribution underlying a collection of real images of hands. H-GAN tackles not only cross-domain tone mapping but also structural differences in localized areas such as shading discontinuities. Results are evaluated on a qualitative and quantitative basis improving previous works. Furthermore, we successfully apply the generated images to the hand classification task.
    BibTeX:
    @arxivarticle{Oprea21,
      author = {Sergiu Oprea and Giorgos Karvounas and Pablo Martinez-Gonzalez and Nikolaos Kyriazis and Sergio Orts-Escolano and Iason Oikonomidis and Alberto Garcia-Garcia and Aggeliki Tsoli and Jose Garcia-Rodriguez and Antonis Argyros},
      title = {H-GAN: the power of GANs in your Hands},
      journal = {CoRR, arXiv},
      year = {2021},
      month = {March},
      url = {https://arxiv.org/abs/2103.15017},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_03_arxiv_oprea.pdf}
    }
    
  91. K. Bacharidis and A.A. Argyros, "Extracting Action Hierarchies from Action Labels and their Use in Deep Action Recognition", In IEEE International Conference on Pattern Recognition (ICPR 2020), pp. 339-346, January 2021.
    [Abstract] [BibTeX] [DOI] [PDF]

  92. Abstract: Human activity recognition is a fundamental and challenging task in computer vision. Its solution can support multiple and diverse applications in areas including but not limited to smart homes, surveillance, daily living assistance, Human-Robot Collaboration (HRC), etc. In realistic conditions, the complexity of human activities ranges from simple coarse actions, such as sitting or standing up, to more complex activities that consist of multiple actions with subtle variations in appearance and motion patterns. A large variety of existing datasets target specific action classes, with some of them being coarse and others being fine-grained. In all of them, a description of the action and its complexity is manifested in the action label sentence. As the action/activity complexity increases, so do the label sentence size and the amount of action-related semantic information contained in this description. In this paper, we propose an approach to exploit the information content of these action labels to formulate a coarse-to-fine action hierarchy based on linguistic label associations, and investigate the potential benefits and drawbacks. Moreover, in a series of quantitative and qualitative experiments, we show that the exploitation of this hierarchical organization of action classes in different levels of granularity improves the learning speed and overall performance of a range of baseline and mid-range deep architectures for human action recognition (HAR).
    BibTeX:
    @inproceedings{Bacharidis2020b,
      author = {Konstantinos Bacharidis and Antonis A. Argyros},
      title = {Extracting Action Hierarchies from Action Labels and their Use in Deep Action Recognition},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2020)},
      year = {2021},
      month = {January},
      pages = {339--346},
      projects =  {Co4Robots},
      doi = {10.1109/ICPR48806.2021.9412033},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_01_ICPR_Bacharidis.pdf}
    }
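    One very simple way to derive a coarse-to-fine grouping from action label sentences, in the spirit of the abstract above (the linguistic label associations used in the paper are richer than this), is to group fine-grained labels under a shared head word:

    # Toy sketch: a two-level action hierarchy obtained by grouping fine-grained label
    # sentences under their leading verb. The labels are examples, not from a dataset.
    from collections import defaultdict

    labels = ["open door", "open fridge", "close door",
              "pour water", "pour milk", "close fridge"]

    hierarchy = defaultdict(list)
    for label in labels:
        coarse = label.split()[0]          # head word acts as the coarse class
        hierarchy[coarse].append(label)    # the full sentence remains the fine class

    for coarse, fine in hierarchy.items():
        print(f"{coarse}: {fine}")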
    
  93. A. Qammaz and A.A. Argyros, "Occlusion-tolerant and personalized 3D human pose estimation in RGB images", In IEEE International Conference on Pattern Recognition (ICPR 2020), pp. 6904-6911, January 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  94. Abstract: We introduce a real-time method that estimates the 3D human pose directly in the popular BVH format, given estimations of the 2D body joints in RGB images. Our contributions include: (a) A novel and compact 2D pose representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body to an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All the above yield a 33% accuracy improvement on the H3.6M dataset compared to the baseline MocapNET method while maintaining real-time performance (70 fps in CPU-only execution).
    BibTeX:
    @inproceedings{Qammaz2020,
      author = {Ammar Qammaz and Antonis A. Argyros},
      title = {Occlusion-tolerant and personalized 3D human pose estimation in RGB images},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2020)},
      year = {2021},
      month = {January},
      pages = {6904--6911},
      url = {http://users.ics.forth.gr/~argyros/res_mocapnet_II.html},
      projects =  {Co4Robots},
      doi = {10.1109/ICPR48806.2021.9411956},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_01_ICPR_Qammaz.pdf},
      videolink = {https://youtu.be/Jgz1MRq-I-k}
    }
    
  95. M.A. Eid, A.M. Michelin, G. Korres, S. Ba'ara, H. Assadi, H. Alsuradi, R. Sayegh and A. Argyros, "FaceGuard: A Wearable System to Avoid Face Touching", Frontiers in Robotics and AI, Frontiers, vol. 8, pp. 47, 2021.
    [Abstract] [BibTeX] [DOI] [PDF]

  96. Abstract: Most people touch their faces unconsciously, for instance to scratch an itch or to rest one’s chin in their hands. To reduce the spread of the novel coronavirus (COVID-19), public health officials recommend against touching one’s face, as the virus is transmitted through mucous membranes in the mouth, nose and eyes. Students, office workers, medical personnel and people on trains were found to touch their faces between 9 and 23 times per hour. This paper introduces FaceGuard, a system that utilizes machine learning to predict hand movements that result in touching the face, and provides sensory feedback to stop the user from touching the face. The system utilizes an inertial measurement unit (IMU) to obtain features that characterize hand movement involving face touching. A Convolutional Neural Network (CNN) based prediction model is developed and trained with data from 4800 trials recorded from 40 participants. Training data is collected for hand movements involving face touching during various everyday activities such as sitting, standing, or walking. Results showed that while the average time needed to touch the face is 1200 ms, a prediction accuracy of more than 92% is achieved with less than 550 ms of IMU data. As for the sensory response, the paper presents a psychophysical experiment to compare the response time for three sensory feedback modalities, namely visual, auditory, and vibrotactile. Results demonstrate that the response time is significantly smaller for vibrotactile feedback (427.3 ms) compared to visual (561.70 ms) and auditory (520.97 ms). Furthermore, the success rate (to avoid face touching) is also statistically higher for vibrotactile and auditory feedback compared to visual feedback. These results demonstrate the feasibility of predicting a hand movement and providing timely sensory feedback within less than a second in order to avoid face touching.
    BibTeX:
    @article{Eid21,
      author = {Mohamad Ahmad Eid and Allan Michael Michelin and Georgios Korres and Sara Ba'ara and Hadi Assadi and Haneen Alsuradi and Rony Sayegh and Antonis Argyros},
      title = {FaceGuard: A Wearable System to Avoid Face Touching},
      journal = {Frontiers in Robotics and AI},
      publisher = {Frontiers},
      year = {2021},
      volume = {8},
      pages = {47},
      projects =  {FORTH},
      doi = {10.3389/frobt.2021.612392},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_journal_FiRAI_FaceGuard.pdf}
    }
    
  97. C. Hernandez-Matas, X. Zabulis and A.A. Argyros, "Retinal Image Registration as a Tool for Supporting Clinical Applications", Computer Methods and Programs in Biomedicine, Elsevier, vol. 199, pp. 105900, 2021.
    [Abstract] [BibTeX] [DOI] [PDF]

  98. Abstract: Background and Objective: The study of small vessels allows for the analysis and diagnosis of diseases with strong vasculopathy. This type of vessels can be observed non-invasively in the retina via fundoscopy. The analysis of these vessels can be facilitated by applications built upon Retinal Image Registration (RIR), such as mosaicing, Super Resolution (SR) or eye shape estimation. RIR is challenging due to possible changes in the retina across time, the utilization of diverse acquisition devices with varying properties, or the curved shape of the retina. Methods: We employ the Retinal Image Registration through Eye Modelling and Pose Estimation (REMPE) framework, which simultaneously estimates the cameras’ relative poses, as well as eye shape and orientation to develop RIR applications and to study their effectiveness. Results: We assess quantitatively the suitability of the REMPE framework towards achieving SR and eye shape estimation. Additionally, we provide indicative results demonstrating qualitatively its usefulness in the context of longitudinal studies, mosaicing, and multiple image registration. Besides the improvement over registration accuracy, demonstrated via registration applications, the most important novelty presented in this work is the eye shape estimation and the generation of 3D point meshes. This has the potential for allowing clinicians to perform measurements on 3D representations of the eye, instead of doing so in 2D images that contain distortions induced because of the projection on the image space. Conclusions: RIR is very effective in supporting applications such as SR, eye shape estimation, longitudinal studies, mosaicing and multiple image registration. Its improved registration accuracy compared to the state of the art translates directly in improved performance when supporting the aforementioned applications.
    BibTeX:
    @article{Matas2020b,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Retinal Image Registration as a Tool for Supporting Clinical Applications},
      journal = {Computer Methods and Programs in Biomedicine},
      publisher = {Elsevier},
      year = {2021},
      volume = {199},
      pages = {105900},
      projects =  {REVAMMAD},
      doi = {10.1016/j.cmpb.2020.105900},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_journal_CMPB.pdf}
    }
    
  99. P. Panteleris, D. Michel and A. Argyros, "Towards Augmented Reality in Museums: Evaluation of Design Choices for 3D Object Pose Estimation", Frontiers in Virtual Reality, Frontiers, vol. 2, pp. 23, 2021.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  100. Abstract: The solutions to many computer vision problems, including that of 6D object pose estimation, are dominated nowadays by the explosion of the learning-based paradigm. In this paper, we investigate 6D object pose estimation in a practical, real-world setting in which a mobile device (smartphone/tablet) needs to be localized in front of a museum exhibit, in support of an augmented-reality application scenario. In view of the constraints and the priorities set by this particular setting, we consider an appropriately tailored classical as well as a learning-based method. Moreover, we develop a hybrid method that consists of both classical and learning-based components. All three methods are evaluated quantitatively on a standard, benchmark dataset, but also on a new dataset that is specific to the museum guidance scenario of interest.
    BibTeX:
    @article{Padeler21,
      author = {Paschalis Panteleris and Damien Michel and Antonis Argyros},
      title = {Towards Augmented Reality in Museums: Evaluation of Design Choices for 3D Object Pose Estimation},
      journal = {Frontiers in Virtual Reality},
      publisher = {Frontiers},
      year = {2021},
      volume = {2},
      pages = {23},
      url = {https://www.frontiersin.org/article/10.3389/frvir.2021.649784},
      projects =  {MUSLEARN},
      doi = {10.3389/frvir.2021.649784},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2021_journal_FiVR_6Dpose.pdf}
    }
    
  101. G. Galanakis, X. Zabulis and A.A. Argyros, "Unsupervised domain adaptation for person re-identification with few and unlabeled target data", In Advances in Visual Computing (ISVC 2020), Springer, pp. 357-373, October 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  102. Abstract: Existing, fully supervised methods for person re-identification (ReID) require annotated data acquired in the target domain in which the method is expected to operate. This includes the IDs as well as images of persons in that domain. This is an obstacle in the deployment of ReID methods in novel settings. For solving this problem, semi-supervised or even unsupervised ReID methods have been proposed. Still, due to their assumptions and operational requirements, such methods are not easily deployable and/or prove less performant to novel domains/settings, especially those related to small person galleries. In this paper, we propose a novel approach for person ReID that alleviates these problems. This is achieved by proposing a completely unsupervised method for fine tuning the ReID performance of models learned in prior, auxiliary domains, to new, completely different ones. The proposed model adaptation is achieved based on only few and unlabeled target persons' data. Extensive experiments investigate several aspects of the proposed method in an ablative study. Moreover, we show that the proposed method is able to improve considerably the performance of state-of-the-art ReID methods in state-of-the-art datasets.
    BibTeX:
    @inproceedings{Galanakis2020,
      author = {Galanakis, George and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Unsupervised domain adaptation for person re-identification with few and unlabeled target data},
      booktitle = {Advances in Visual Computing (ISVC 2020)},
      publisher = {Springer},
      year = {2020},
      month = {October},
      pages = {357--373},
      url = {https://link.springer.com/chapter/10.1007/978-3-030-64559-5_28},
      projects =  {FORTH},
      doi = {10.1007/978-3-030-64559-5_28},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_10_ISVC_Galanakis.pdf}
    }
    
  103. K. Bacharidis and A.A. Argyros, "Improving Deep Learning Approaches for Human Activity Recognition based on Natural Language Processing of Action Labels", In IEEE International Joint Conference of Neural Networks (IJCNN 2020), Special session on Machine Learning and Deep Learning Methods applied to Vision and Robotics (MLDLMVR), July 2020.
    [Abstract] [BibTeX] [DOI] [PDF]

  104. Abstract: Human activity recognition has always been an appealing research topic in computer vision due to its theoretic interest and vast range of applications. In recent years, machine learning has dominated computer vision and human activity recognition research. Supervised learning methods and especially deep learning-based ones are considered to provide the best solutions for this task, achieving state-of-the-art results. However, the performance of deep learning-based approaches depends greatly on the modelling capabilities of the spatio-temporal neural network architecture and the learning goals of the training process. Moreover, the design complexity is task-dependent. In this paper, we show that we can exploit the information contained in the label description of action classes (action labels) to extract information regarding their similarity which can then be used to steer the learning process and improve the activity recognition performance. Moreover, we experimentally verify that the adopted strategy can be useful in both single and multi-stream architectures, providing better scalability on the training of the network in more complex datasets featuring activity classes with larger intra- and inter-class similarities.
    BibTeX:
    @inproceedings{Bacharidis2020a,
      author = {Konstantinos Bacharidis and Antonis A. Argyros},
      title = {Improving Deep Learning Approaches for Human Activity Recognition based on Natural Language Processing of Action Labels},
      booktitle = {IEEE International Joint Conference of Neural Networks (IJCNN 2020), Special session on Machine Learning and Deep Learning Methods applied to Vision and Robotics (MLDLMVR)},
      year = {2020},
      month = {July},
      projects =  {Co4Robots},
      doi = {10.1109/IJCNN48605.2020.9207397},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_07_IJCNN_Bacharidis.pdf}
    }
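    A crude stand-in for the label-similarity signal described above (the paper's natural language processing of the label sentences is more sophisticated) is plain word overlap between two action labels:

    # Toy sketch: Jaccard word overlap between action label sentences as a rough
    # similarity signal; the example labels are invented.
    def label_similarity(a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb)

    labels = ["open the fridge", "open the door", "pour water into a glass"]
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            print(f"{a!r} vs {b!r}: {label_similarity(a, b):.2f}")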
    
  105. D. Bautembach, I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Faster and Simpler SNN Simulation with Work Queues", In IEEE International Joint Conference of Neural Networks (IJCNN 2020), also available at CoRR, arXiv, pp. 1-8, July 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  106. Abstract: We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user’s and maintainer’s side. This is made possible by designing our pipeline around “work queues” which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
    BibTeX:
    @inproceedings{Bautembach2020,
      author = {Dennis Bautembach and Iason Oikonomidis and Nikolaos Kyriazis and Antonis A. Argyros},
      title = {Faster and Simpler SNN Simulation with Work Queues},
      booktitle = {IEEE International Joint Conference of Neural Networks (IJCNN 2020), also available at CoRR, arXiv},
      year = {2020},
      month = {July},
      pages = {1--8},
      url = {http://arxiv.org/abs/1912.07423},
      projects =  {Co4Robots},
      doi = {10.1109/IJCNN48605.2020.9206752},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_07_IJCNN_Bautembach.pdf}
    }
    
  107. V. Nicodemou, I. Oikonomidis, G. Tzimiropoulos and A.A. Argyros, "Learning to Infer the Depth Map of a Hand from its Color Image", In IEEE International Joint Conference of Neural Networks (IJCNN 2020), Special session on Machine Learning and Deep Learning Methods applied to Vision and Robotics (MLDLMVR), also available at CoRR, arXiv, July 2020.
    [Abstract] [BibTeX] [PDF] [URL]

  108. Abstract: We propose the first approach to the problem of inferring the depth map of a human hand based on a single RGB image. We achieve this with a Convolutional Neural Network (CNN) that employs a stacked hourglass model as its main building block. Intermediate supervision is used in several outputs of the proposed architecture in a staged approach. To aid the process of training and inference, hand segmentation masks are also estimated in such an intermediate supervision step, and used to guide the subsequent depth estimation process. In order to train and evaluate the proposed method we compile and make publicly available HandRGBD, a new dataset of 20,601 views of hands, each consisting of an RGB image and an aligned depth map. Based on HandRGBD, we explore variants of the proposed approach in an ablative study and determine the best performing one. The results of an extensive experimental evaluation demonstrate that hand depth estimation from a single RGB frame can be achieved with an accuracy of 22mm, which is comparable to the accuracy achieved by contemporary low-cost depth cameras. Such a 3D reconstruction of hands based on RGB information is valuable as a final result in its own right, but also as an input to several other hand analysis and perception algorithms that require depth input. Essentially, in such a context, the proposed approach bridges the gap between RGB and RGBD, by making all existing RGBD-based methods applicable to RGB input.
    BibTeX:
    @inproceedings{Nicodemou2020b,
      author = {Vassilis Nicodemou and Iason Oikonomidis and George Tzimiropoulos and Antonis A. Argyros},
      title = {Learning to Infer the Depth Map of a Hand from its Color Image},
      booktitle = {IEEE International Joint Conference of Neural Networks (IJCNN 2020), Special session on Machine Learning and Deep Learning Methods applied to Vision and Robotics (MLDLMVR), also available at CoRR, arXiv},
      year = {2020},
      month = {July},
      url = {https://arxiv.org/abs/1812.02486},
      projects =  {Co4Robots,HealthSign},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_12_arxiv_nicodemou.pdf}
    }
    
  109. G. Styliaras, C. Constantinopoulos, P. Panteleris, D. Michel, N. Pantzou, K. Papavasileiou, K. Tzortzi, A. Argyros and D. Kosmopoulos, "The MuseLearn platform: personalized content for museum visitors assisted by vision-based recognition and 3D pose estimation of exhibits", In 16th International Conference on Artificial Intelligence Applications and Innovations (AIAI 2020), Springer, pp. 439-451, June 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  110. Abstract: MuseLearn is a platform that enhances the presentation of the exhibits of a museum with multimedia-rich content that is adapted and recommended for certain visitor profiles and plays back on their mobile devices. The platform consists mainly of a content management system that stores and prepares multimedia material for the presentation of exhibits; a recommender system that monitors objectively the visitor’s behavior so that it can further adapt the content to their needs; and a pose estimation system that identifies an exhibit and links it to the additional content that is prepared for it. We present the systems and the initial results for a selected set of exhibits in Herakleidon Museum, a museum holding temporary exhibitions mainly about ancient Greek technology. The initial evaluation that we presented is encouraging for all systems. Thus, the plan is to use the developed systems for all museum exhibits as well as to enhance their functionality.
    BibTeX:
    @inproceedings{styliaras2020,
      author = {G Styliaras and C Constantinopoulos and P Panteleris and D Michel and N Pantzou and K Papavasileiou and K Tzortzi and Antonis Argyros and D Kosmopoulos},
      title = {The MuseLearn platform: personalized content for museum visitors assisted by vision-based recognition and 3D pose estimation of exhibits},
      booktitle = {16th International Conference on Artificial Intelligence Applications and Innovations (AIAI 2020)},
      publisher = {Springer},
      year = {2020},
      month = {June},
      volume = {583},
      pages = {439--451},
      url = {https://doi.org/10.1007/978-3-030-49161-1_37},
      projects =  {MuseLearn},
      doi = {10.1007/978-3-030-49161-1_37},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_06_AIAI_muselearn.pdf}
    }
    
  111. G. Park, A.A. Argyros, J. Lee and W. Woo, "3D Hand Tracking in the Presence of Excessive Motion Blur", IEEE Transactions on Visualization and Computer Graphics (TVCG), IEEE, vol. 26, no. 5, pp. 1891-1901, May 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [VIDEO]

  112. Abstract: We present a sensor-fusion method that exploits a depth camera and a gyroscope to track the articulation of a hand in the presence of excessive motion blur. In case of slow and smooth hand motions, the existing methods estimate the hand pose fairly accurately and robustly, despite challenges due to the high dimensionality of the problem, self-occlusions, uniform appearance of hand parts, etc. However, the accuracy of hand pose estimation drops considerably for fast-moving hands because the depth image is severely distorted due to motion blur. Moreover, when hands move fast, the actual hand pose is far from the one estimated in the previous frame, therefore the assumption of temporal continuity on which tracking methods rely, is not valid. In this paper, we track fast-moving hands with the combination of a gyroscope and a depth camera. As a first step, we calibrate a depth camera and a gyroscope attached to a hand so as to identify their time and pose offsets. Following that, we fuse the rotation information of the calibrated gyroscope with model-based hierarchical particle filter tracking. A series of quantitative and qualitative experiments demonstrate that the proposed method performs more accurately and robustly in the presence of motion blur, when compared to state of the art algorithms, especially in the case of very fast hand rotations.
    BibTeX:
    @article{Park2020,
      author = {Park, Gabyong and Argyros, Antonis A and Lee, Juyoung and Woo, Woontack},
      title = {3D Hand Tracking in the Presence of Excessive Motion Blur},
      journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)},
      publisher = {IEEE},
      year = {2020},
      month = {May},
      volume = {26},
      number = {5},
      pages = {1891--1901},
      doi = {10.1109/TVCG.2020.2973057},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_journal_TVCG_HandTrackingMotionBlur.pdf},
      videolink = {https://youtu.be/wSRF1kYAKxc}
    }
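
    The fusion idea of the entry above can be conveyed with a small Python sketch (hypothetical names and parameters, using scipy's Rotation; not the paper's implementation): the gyroscope's angular velocities are integrated over the inter-frame interval and the resulting rotation seeds the tracker's pose hypotheses, so that very fast rotations remain within the search range:

    # Illustrative sketch: gyroscope-predicted rotation used to seed particle-filter
    # hypotheses for the next frame.
    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def integrate_gyro(omegas, timestamps):
        """omegas: (N, 3) angular velocities [rad/s]; timestamps: (N,) seconds."""
        delta = R.identity()
        for i in range(1, len(timestamps)):
            dt = timestamps[i] - timestamps[i - 1]
            delta = R.from_rotvec(omegas[i] * dt) * delta
        return delta  # rotation accumulated between the two depth frames

    def propose_hypotheses(prev_rotation, gyro_delta, n=64, jitter_deg=5.0):
        """Scatter rotation hypotheses around the gyro-predicted pose."""
        predicted = gyro_delta * prev_rotation
        noise = R.from_rotvec(np.deg2rad(jitter_deg) * np.random.randn(n, 3))
        return [r * predicted for r in noise]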
    
  113. D. Kosmopoulos, I. Oikonomidis, K. Konstantinopoulos, N. Arvanitis, K. Antzakas, A. Bifis, G. Lydakis, A. Roussos and A.A. Argyros, "Towards a Visual Sign Language Dataset for Home Care Services", In IEEE International Conference on Automatic Face and Gesture Recognition (FG'2020), IEEE, pp. 622-626, Buenos Aires, Argentina, May 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  114. Abstract: We present our work towards creating a dataset, which is intended to be used for the implementation of a home care services system for the deaf. The dataset includes recorded realistic scenarios of interactions between deaf patients and psychiatrists in their native sign language. The scenarios allow for contextualized representations, in contrast to typical datasets presenting isolated signs or sentences. It includes continuous videos in RGB and depth, which are challenging to analyze and closely resemble real-life scenarios. The research on representation of signs is supported by providing the hand shapes and trajectories for every video using hand and skeleton models, as well as facial features. Furthermore, the dataset may be used for the study of emotional context in Sign Language, since such conversations are typically emotionally charged.
    BibTeX:
    @inproceedings{Kosmopoulos2020,
      author = {Dimitrios Kosmopoulos and Iason Oikonomidis and Konstantinos Konstantinopoulos and Nikolaos Arvanitis and Klimis Antzakas and Aristidis Bifis and Georgios Lydakis and Anastasios Roussos and Antonis A. Argyros},
      title = {Towards a Visual Sign Language Dataset for Home Care Services},
      booktitle = {IEEE International Conference on Automatic Face and Gesture Recognition (FG'2020)},
      publisher = {IEEE},
      year = {2020},
      month = {May},
      pages = {622--626},
      address = {Buenos Aires, Argentina},
      url = {https://doi.ieeecomputersociety.org/10.1109/FG47880.2020.00099},
      projects =  {HealthSign},
      doi = {10.1109/FG47880.2020.00099},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_05_FG_healthsign.pdf}
    }
    
  115. F. Gouidis, A. Vassiliades, T. Patkos, A.A. Argyros, N. Bassiliades and D. Plexousakis, "A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning", In AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice, (AAAI-MAKE), also available at CoRR, arXiv, March 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  116. Abstract: Object perception is a fundamental sub-field of Computer Vision, covering a multitude of individual areas and having contributed high-impact results. While Machine Learning has been traditionally applied to address related problems, recent works also seek ways to integrate knowledge engineering in order to expand the level of intelligence of the visual interpretation of objects, their properties and their relations with their environment. In this paper, we attempt a systematic investigation of how knowledge-based methods contribute to diverse object perception tasks. We review the latest achievements and identify prominent research directions.
    BibTeX:
    @inproceedings{gouidis2020,
      author = {Gouidis, Filippos and Vassiliades, Alexandros and Patkos, Theodore and Argyros, Antonis A and Bassiliades, Nick and Plexousakis, Dimitris},
      title = {A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning},
      booktitle = {AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice, (AAAI-MAKE), also available at CoRR, arXiv},
      year = {2020},
      month = {March},
      url = {https://arxiv.org/abs/1912.11861},
      doi = {10.48550/arXiv.1912.11861},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_03_AAAI_MAKE_Review_Gouidis.pdf}
    }
    
  117. C. Hernandez-Matas, X. Zabulis and A.A. Argyros, "REMPE: Registration of Retinal Images Through Eye Modelling and Pose Estimation", IEEE Journal of Biomedical and Health Informatics, IEEE, vol. 24, no. 12, pp. 3362-3373, 2020.
    [Abstract] [BibTeX] [DOI] [PDF]

  118. Abstract: Objective: In-vivo assessment of small vessels can promote accurate diagnosis and monitoring of diseases related to vasculopathy, such as hypertension and diabetes. The eye provides a unique, open, and accessible window for directly imaging small vessels in the retina with non-invasive techniques, such as fundoscopy. In this context, accurate registration of retinal images is of paramount importance in the comparison of vessel measurements from original and follow-up examinations, which is required for monitoring the disease and its treatment. At the same time, retinal registration exhibits a range of challenges due to the curved shape of the retina and the modification of imaged tissue across examinations. Thereby, the objective is to improve the state-of-the-art in the accuracy of retinal image registration. Method: In this work, a registration framework that simultaneously estimates eye pose and shape is proposed. Corresponding points in the retinal images are utilized to solve the registration as a 3D pose estimation. Results: The proposed framework is evaluated quantitatively and shown to outperform state-of-the-art methods in retinal image registration for fundoscopy images. Conclusion: Retinal image registration methods based on eye modelling allow for more accurate registration than conventional methods. Significance: This is the first method to perform retinal image registration combined with eye modelling. The method improves the state-of-the-art in accuracy of retinal registration for fundoscopy images, quantitatively evaluated in benchmark datasets annotated with ground truth. The implementation of the registration method has been made publicly available.
    BibTeX:
    @article{Matas2020,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Argyros, Antonis A},
      title = {REMPE: Registration of Retinal Images Through Eye Modelling and Pose Estimation},
      journal = {IEEE Journal of Biomedical and Health Informatics},
      publisher = {IEEE},
      year = {2020},
      volume = {24},
      number = {12},
      pages = {3362--3373},
      projects =  {REVAMMAD},
      doi = {10.1109/JBHI.2020.2984483},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_journal_JBHI_REMPE.pdf}
    }
    
  119. V.C. Nicodemou, I. Oikonomidis and A.A. Argyros, "Single Shot 3D Hand Pose Estimation Using Radial Basis Function Networks Trained on Synthetic Data", Pattern Analysis and Applications, Springer, vol. 23, no. 1, pp. 415-428, 2020.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  120. Abstract: In this work we present a novel framework to perform single shot hand pose estimation using depth data as input. The method follows a coarse-to-fine strategy and employs several Radial Basis Function Networks (RBFNs) that are trained on a large synthetic dataset. At runtime, firstly, an initialization RBFN is used to provide a rough estimation of the hand's 3D pose. Subsequently, several specialized RBFNs are employed to improve that initial estimation in an iterative refinement scheme. To train the RBFNs we select a set of hand poses from a real world sequence that are as diverse as possible. We use this representative set, along with a dense sampling of all possible rotations, as a seed to generate a large synthetic training set. The method is parallelizable taking advantage of the inherent data-parallelism of RBFNs. Furthermore, the method requires few real-world data and virtually no manual annotation. We perform a quantitative evaluation of our method on a testing sequence of our own. Furthermore, we present quantitative and qualitative results on a public dataset that is commonly used to evaluate hand pose estimation and tracking methods. We show that our approach achieves promising results in all cases.
    BibTeX:
    @article{Nicodemou2020,
      author = {Nicodemou, Vassilis C and Oikonomidis, Iason and Argyros, Antonis A},
      title = {Single Shot 3D Hand Pose Estimation Using Radial Basis Function Networks Trained on Synthetic Data},
      journal = {Pattern Analysis and Applications},
      publisher = {Springer},
      year = {2020},
      volume = {23},
      number = {1},
      pages = {415--428},
      url = {https://doi.org/10.1007/s10044-019-00801-7},
      projects =  {CO4ROBOTS,HEALTHSIGN},
      doi = {10.1007/s10044-019-00801-7},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_journal_PAA_Nicodemou.pdf}
    }
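
    A minimal, generic RBF-network regressor in the spirit of the entry above (illustrative only; the centers, widths and weights are assumed to have been trained offline, in the paper on synthetic data) could look as follows in Python:

    # Minimal RBF-network regression sketch: Gaussian activations over learned centers
    # followed by a linear readout that maps a depth-image feature vector to hand-pose
    # parameters.
    import numpy as np

    def rbfn_predict(x, centers, widths, weights, bias):
        """x: (D,) input; centers: (K, D); widths: (K,); weights: (K, P); bias: (P,)."""
        d2 = np.sum((centers - x) ** 2, axis=1)          # squared distance to each center
        phi = np.exp(-d2 / (2.0 * widths ** 2))          # Gaussian RBF activations
        return phi @ weights + bias                      # predicted pose parameters

    # Coarse-to-fine use: a first RBFN gives a rough pose, then specialized RBFNs
    # (one per region of the pose space) iteratively refine it.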
    
  121. C. Panagiotakis and A.A. Argyros, "Region-based Fitting of Overlapping Ellipses and its Application to Cells Segmentation", Image and Vision Computing, Elsevier, vol. 93, pp. 103810, 2020.
    [Abstract] [BibTeX] [DOI] [PDF]

  122. Abstract: We present RFOVE, a region-based method for approximating an arbitrary 2D shape with an automatically determined number of possibly overlapping ellipses. RFOVE is completely unsupervised, operates without any assumption or prior knowledge on the object’s shape and extends and improves the Decremental Ellipse Fitting Algorithm (DEFA) [1]. Both RFOVE and DEFA solve the multi-ellipse fitting problem by performing model selection that is guided by the minimization of the Akaike Information Criterion on a suitably defined shape complexity measure. However, in contrast to DEFA, RFOVE minimizes an objective function that allows for ellipses with higher degree of overlap and, thus, achieves better ellipse-based shape approximation. A comparative evaluation of RFOVE with DEFA on several standard datasets shows that RFOVE achieves better shape coverage with simpler models (less ellipses). As a practical exploitation of RFOVE, we present its application to the problem of detecting and segmenting potentially overlapping cells in fluorescence microscopy images. Quantitative results obtained in three public datasets (one synthetic and two with more than 4000 actual stained cells) show the superiority of RFOVE over the state of the art in overlapping cells segmentation.
    BibTeX:
    @article{PangiotakisArgyros2020,
      author = {Panagiotakis, Costas and Argyros, Antonis A},
      title = {Region-based Fitting of Overlapping Ellipses and its Application to Cells Segmentation},
      journal = {Image and Vision Computing},
      publisher = {Elsevier},
      year = {2020},
      volume = {93},
      pages = {103810},
      projects =  {FORTH},
      doi = {10.1016/j.imavis.2019.09.001},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_journal_IMAVIS_EllipsesCellSegmentation.pdf}
    }
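
    As a loose analogue of the AIC-guided multi-ellipse model selection discussed above (not the RFOVE objective itself), one can fit Gaussian mixtures with an increasing number of components to the foreground pixels and keep the AIC-optimal model, each Gaussian component standing in for one possibly overlapping ellipse:

    # Loose analogue of AIC-guided multi-ellipse model selection (illustrative only;
    # the RFOVE objective and complexity measure differ).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def fit_ellipses(binary_mask, max_ellipses=6):
        ys, xs = np.nonzero(binary_mask)
        pts = np.column_stack([xs, ys]).astype(float)
        best_model, best_aic = None, np.inf
        for k in range(1, max_ellipses + 1):
            gmm = GaussianMixture(n_components=k, covariance_type="full").fit(pts)
            aic = gmm.aic(pts)
            if aic < best_aic:
                best_model, best_aic = gmm, aic
        # Means give ellipse centers; eigen-decomposition of each covariance gives
        # the axes and orientation of the corresponding ellipse.
        return best_model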
    
  123. S. Oprea, P. Martinez-Gonzalez, A. Garcia-Garcia, J.A. Castro-Vargas, S. Orts-Escolano, J. Garcia-Rodriguez and A. Argyros, "A Review on Deep Learning Techniques for Video Prediction", CoRR, arXiv, 2020.
    [Abstract] [BibTeX] [PDF] [URL]

  124. Abstract: The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction. Defined as a self-supervised learning task, video prediction represents a suitable framework for representation learning, as it demonstrated potential capabilities for extracting meaningful representations of the underlying patterns in natural videos. Motivated by the increasing interest in this task, we provide a review on the deep learning methods for prediction in video sequences. We firstly define the video prediction fundamentals, as well as mandatory background concepts and the most used datasets. Next, we carefully analyze existing video prediction models organized according to a proposed taxonomy, highlighting their contributions and their significance in the field. The summary of the datasets and methods is accompanied with experimental results that facilitate the assessment of the state of the art on a quantitative basis. The paper is summarized by drawing some general conclusions, identifying open research challenges and by pointing out future research directions.
    BibTeX:
    @arxivarticle{oprea2020,
      author = {Sergiu Oprea and Pablo Martinez-Gonzalez and Alberto Garcia-Garcia and John Alejandro Castro-Vargas and Sergio Orts-Escolano and Jose Garcia-Rodriguez and Antonis Argyros},
      title = {A Review on Deep Learning Techniques for Video Prediction},
      journal = {CoRR, arXiv},
      year = {2020},
      url = {http://arxiv.org/abs/2004.05214},
      projects =  {FORTH},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2020_04_arxiv_videoprediction.pdf}
    }
    
  125. M. Bajones, D. Fischinger, A. Weiss, P.D.L. Puente, D. Wolf, M. Vincze, T. Körtner, M. Weninger, K. Papoutsakis, D. Michel, A. Qammaz, P. Panteleris, M. Foukarakis, I. Adami, D. Ioannidi, A. Leonidis, M. Antona, A. Argyros, P. Mayer, P. Panek, H. Eftring and S. Frennert, "Results of Field Trials with a Mobile Service Robot for Older Adults in 16 Private Households", ACM Transactions on Human-Robot Interaction, ACM, vol. 9, no. 2, pp. 10:1-10:27, December 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  126. Abstract: In this article, we present results obtained from field trials with the Hobbit robotic platform, an assistive, social service robot aiming at enabling prolonged independent living of older adults in their own homes. Our main contribution lies within the detailed results on perceived safety, usability, and acceptance from field trials with autonomous robots in real homes of older users. In these field trials, we studied how 16 older adults (75 plus) lived with autonomously interacting service robots over multiple weeks. Robots have been employed for periods of months previously in home environments for older people, and some have been tested with manipulation abilities, but this is the first time a study has tested a robot in private homes that provided the combination of manipulation abilities, autonomous navigation, and non-scheduled interaction for an extended period of time. This article aims to explore how older adults interact with such a robot in their private homes. Our results show that all users interacted with Hobbit daily, rated most functions as working well, and reported that they believe that Hobbit will be part of future elderly care. We show that Hobbit’s adaptive behavior approach towards the user increasingly eased the interaction between the users and the robot. Our trials reveal the necessity to move into actual users’ homes, as only there we encounter real-world challenges and demonstrate issues such as misinterpretation of actions during non-scripted human-robot interaction.
    BibTeX:
    @article{bajones2019,
      author = {Bajones, Markus and Fischinger, David and Weiss, Astrid and Puente, Paloma De La and Wolf, Daniel and Vincze, Markus and Körtner, Tobias and Weninger, Markus and Papoutsakis, Konstantinos and Michel, Damien and Qammaz, Ammar and Panteleris, Paschalis and Foukarakis, Michalis and Adami, Ilia and Ioannidi, Danae and Leonidis, Asterios and Antona, Margherita and Argyros, Antonis and Mayer, Peter and Panek, Paul and Eftring, Håkan and Frennert, Susanne},
      title = {Results of Field Trials with a Mobile Service Robot for Older Adults in 16 Private Households},
      journal = {ACM Transactions on Human-Robot Interaction},
      publisher = {ACM},
      year = {2019},
      month = {December},
      volume = {9},
      number = {2},
      pages = {10:1--10:27},
      address = {New York, NY, USA},
      url = {http://doi.acm.org/10.1145/3368554},
      projects =  {HOBBIT},
      doi = {10.1145/3368554},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_journal_12_HRI_bajones.pdf}
    }
    
  127. D. Bautembach, I. Oikonomidis, N. Kyriazis and A. Argyros, "Faster and Simpler SNN Simulation with Work Queues", CoRR, arXiv, December 2019.
    [Abstract] [BibTeX] [PDF] [URL]

  128. Abstract: We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user's and maintainer's side. This is made possible by designing our pipeline around "work queues" which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
    BibTeX:
    @arxivarticle{1912.07423,
      author = {Dennis Bautembach and Iason Oikonomidis and Nikolaos Kyriazis and Antonis Argyros},
      title = {Faster and Simpler SNN Simulation with Work Queues},
      journal = {CoRR, arXiv},
      year = {2019},
      month = {December},
      url = {http://arxiv.org/abs/1912.07423},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_12_arxiv_snn_bautembach.pdf}
    }
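
    The work-queue idea can be conveyed with a toy, clock-driven leaky integrate-and-fire loop in Python (illustrative only; the paper's GPU simulator is far more elaborate), where the spike-detection stage communicates with the synaptic-delivery stage exclusively through a queue of spiking neuron indices:

    # Toy clock-driven LIF simulation organized around a "work queue" of spiking
    # neuron indices passed between pipeline stages.
    import numpy as np
    from collections import deque

    N, STEPS = 1000, 200
    v = np.zeros(N)                       # membrane potentials
    weights = 0.05 * np.random.randn(N, N)
    threshold, decay = 1.0, 0.95
    spike_queue = deque()                 # interface between stage 1 and stage 2

    for t in range(STEPS):
        # Stage 1: integrate input current, detect spikes, enqueue their indices.
        v = decay * v + 0.1 * np.random.rand(N)
        spiked = np.nonzero(v >= threshold)[0]
        v[spiked] = 0.0
        spike_queue.append(spiked)
        # Stage 2: deliver synaptic currents only for the enqueued (spiking) neurons.
        while spike_queue:
            for pre in spike_queue.popleft():
                v += weights[pre]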
    
  129. F. Gouidis, A. Vassiliades, T. Patkos, A.A. Argyros, N. Bassiliades and D. Plexousakis, "A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning", CoRR, arXiv, December 2019.
    [Abstract] [BibTeX] [PDF] [URL]

  130. Abstract: Object perception is a fundamental sub-field of Computer Vision, covering a multitude of individual areas and having contributed high-impact results. While Machine Learning has been traditionally applied to address related problems, recent works also seek ways to integrate knowledge engineering in order to expand the level of intelligence of the visual interpretation of objects, their properties and their relations with their environment. In this paper, we attempt a systematic investigation of how knowledge-based methods contribute to diverse object perception tasks. We review the latest achievements and identify prominent research directions.
    BibTeX:
    @arxivarticle{gouidis,
      author = {Gouidis, Filippos and Vassiliades, Alexandros and Patkos, Theodore and Argyros, Antonis A and Bassiliades, Nick and Plexousakis, Dimitris},
      title = {A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning},
      journal = {CoRR, arXiv},
      year = {2019},
      month = {December},
      url = {https://arxiv.org/abs/1912.11861},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_12_arxiv_Review_Gouidis.pdf}
    }
    
  131. C. Hernandez-Matas, A.A. Argyros and X. Zabulis, "Retinal image preprocessing, enhancement, and registration", Computational Retinal Image Analysis, Academic Press, pp. 59-77, 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  132. Abstract: Preprocessing and enhancement is a prerequisite for a wide range of retinal image analysis methods. The goals of such tasks are to improve images and facilitate their subsequent analysis. Registration of retinal images enables the generation of images of higher definition retinal mosaics and facilitates the comparison of images from different examinations. The above processes contribute significantly to the screening and diagnosis of a wide range of diseases. This chapter reviews preprocessing, enhancement, and registration techniques for the modalities of fundus and tomographic imaging of the human eye.
    BibTeX:
    @incollection{Hernandez-Matas2019,
      author = {Hernandez-Matas, Carlos and Argyros, Antonis A and Zabulis, Xenophon},
      title = {Retinal image preprocessing, enhancement, and registration},
      booktitle = {Computational Retinal Image Analysis},
      publisher = {Academic Press},
      year = {2019},
      month = {November},
      pages = {59--77},
      projects =  {REVAMMAD},
      doi = {10.1016/B978-0-08-102816-2.00004-6},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_11_book_retinalimagepreprocessing.pdf}
    }
    
  133. G. Karvounas, I. Oikonomidis and A.A. Argyros, "ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos", In IEEE International Conference on Computer Vision Workshops (ISV 2019 - ICCVW 2019), also available at CoRR, arXiv, IEEE, Seoul, S. Korea, October 2019.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  134. Abstract: We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method raises several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
    BibTeX:
    @inproceedings{Karvounas019,
      author = {Karvounas, Giorgos and Oikonomidis, Iason and Argyros, Antonis A},
      title = {ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos},
      booktitle = {IEEE International Conference on Computer Vision Workshops (ISV 2019 - ICCVW 2019), also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2019},
      month = {October},
      address = {Seoul, S. Korea},
      url = {http://users.ics.forth.gr/~argyros/res_reactnet.html},
      projects =  {Co4Robots},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_10_ISV_reactnet.pdf},
      videolink = {https://youtu.be/pPqg1lMkuaQ}
    }
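
    The input representation used by ReActNet, the matrix of pairwise frame distances, is straightforward to compute from per-frame CNN embeddings; a short numpy sketch (illustrative, not the paper's code) follows. Repetitive segments show up as periodic diagonal structure in this matrix:

    import numpy as np

    def frame_distance_matrix(embeddings):
        """embeddings: (T, D) array, one CNN feature vector per video frame."""
        sq = np.sum(embeddings ** 2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
        return np.sqrt(np.maximum(d2, 0.0))   # (T, T) self-distance matrix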
    
  135. A. Tsoli and A.A. Argyros, "Patch-based reconstruction of a textureless deformable 3D surface from a single RGB image", In IEEE International Conference on Computer Vision Workshops (GMDL 2019 - ICCVW 2019), IEEE, pp. 4034-4043, Seoul, S. Korea, October 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  136. Abstract: We propose a deep learning method for reconstructing a textureless deformable 3D surface from a single RGB image, under various lighting conditions. One of the challenges when training a neural network to predict the shape of a deformable object is that the object exhibits such a great deal of shape variation that it is essentially impractical to have a training set consisting of all possible deformations the object may realize. However, different areas of the deformable object may exhibit similar types of deformations, e.g. similar wrinkles might appear in different areas on the surface of a cloth. Motivated by this, we propose learning local models of shape variation from image patches that we then combine into a global reconstruction of the observed object. Initially, we divide the input image into overlapping patches and a zero-mean depth map as well as a normal map are estimated for each patch using deep learning. Stitching of depth maps is performed by finding the optimal translation of each patch depth map along the viewing direction of the camera and averaging the depth predictions of neighboring patches at their overlapping areas. Stitching of normal maps is performed by normalizing and averaging the normal predictions of neighboring patches at their overlapping areas. Finally, bilateral filtering is performed on the stitched depth and normal maps in order to perform fine-scale smoothing at the regions around patch boundaries. We show increased accuracy compared to previous work even in the presence of limited training data and more effective generalization to unseen objects.
    BibTeX:
    @inproceedings{Tsoli2019,
      author = {Tsoli, Aggeliki and Argyros, Antonis A},
      title = {Patch-based reconstruction of a textureless deformable 3D surface from a single RGB image},
      booktitle = {IEEE International Conference on Computer Vision Workshops (GMDL 2019 - ICCVW 2019)},
      publisher = {IEEE},
      year = {2019},
      month = {October},
      pages = {4034--4043},
      address = {Seoul, S. Korea},
      url = {https://doi.org/10.1109/ICCVW.2019.00498},
      projects =  {Co4Robots},
      doi = {10.1109/ICCVW.2019.00498},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_10_GMDL_deformable.pdf}
    }
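
    The depth-map stitching step described above can be sketched in Python (hypothetical names; the canvas is assumed to hold running sums and per-pixel hit counts): each zero-mean patch prediction is shifted along the viewing direction by the offset that best matches the already-stitched overlap, and the final depth is the per-pixel average:

    import numpy as np

    def stitch_patch(canvas, counts, patch_depth, y, x):
        """canvas/counts: running depth sums and hit counts; patch placed at (y, x)."""
        h, w = patch_depth.shape
        region = canvas[y:y + h, x:x + w]
        hits = counts[y:y + h, x:x + w]
        overlap = hits > 0
        # Optimal translation along the camera axis = mean depth discrepancy in the overlap.
        offset = np.mean(region[overlap] / hits[overlap] - patch_depth[overlap]) if overlap.any() else 0.0
        region += patch_depth + offset
        hits += 1
        return canvas, counts   # final depth map = canvas / counts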
    
  137. C. Panagiotakis and A.A. Argyros, "A two-stage approach for commonality-based temporal localization of periodic motions", In International Conference on Computer Vision Systems (ICVS 2019), Springer, Thessaloniki, Greece, September 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  138. Abstract: We present an unsupervised method for the detection of all temporal segments of videos or motion capture data that correspond to periodic motions. The proposed method is based on the detection of similar segments (commonalities) in different parts of the input sequence and employs a two-stage approach that operates on the matrix of pairwise distances of all input frames. In the first stage, strong commonalities in different parts of the input are detected, while in the second stage a hysteresis-thresholding type of operation fine-tunes the initial detections. The quantitative evaluation of the proposed method on three standard ground-truth-annotated datasets (two video datasets, one 3D human motion capture dataset) demonstrates its improved performance in comparison to existing approaches.
    BibTeX:
    @inproceedings{Panagiotakis2019,
      author = {Panagiotakis, Costas and Argyros, Antonis A},
      title = {A two-stage approach for commonality-based temporal localization of periodic motions},
      booktitle = {International Conference on Computer Vision Systems (ICVS 2019)},
      publisher = {Springer},
      year = {2019},
      month = {September},
      address = {Thessaloniki, Greece},
      url = {https://doi.org/10.1007/978-3-030-34995-0_33},
      projects =  {CO4ROBOTS},
      doi = {10.1007/978-3-030-34995-0_33},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_09_ICVS_periodicity.pdf}
    }
    
  139. K. Papoutsakis and A.A. Argyros, "Unsupervised and Explainable Assessment of Video Similarity", In British Machine Vision Conference (BMVC 2019), BMVA, Cardiff, UK, September 2019.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  140. Abstract: We propose a novel unsupervised method that assesses the similarity of two videos on the basis of the estimated relatedness of the objects and their behavior, and provides arguments supporting this assessment. A video is represented as a complete undirected action graph that encapsulates information on the types of objects and the way they (inter) act. The similarity of a pair of videos is estimated based on the bipartite Graph Edit Distance (GED) of the corresponding action graphs. As a consequence, on top of estimating a quantitative measure of video similarity, our method establishes spatiotemporal correspondences between objects across videos if these objects are semantically related, if/when they interact similarly, or both. We consider this an important step towards explainable assessment of video and action similarity. The proposed method is evaluated on a publicly available dataset on the tasks of activity classification and ranking and is shown to compare favorably to state of the art supervised learning methods.
    BibTeX:
    @inproceedings{Papoutsakis2019,
      author = {Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Unsupervised and Explainable Assessment of Video Similarity},
      booktitle = {British Machine Vision Conference (BMVC 2019)},
      publisher = {BMVA},
      year = {2019},
      month = {September},
      address = {Cardiff, UK},
      url = {http://users.ics.forth.gr/~argyros/res_videosimilarity.html},
      projects =  {CO4ROBOTS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_09_BMVC_videosimilarity.pdf},
      videolink = {https://youtu.be/QUHYa72fFt0}
    }
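
    For small action graphs, the graph-edit-distance comparison can be illustrated with networkx (the paper uses a fast bipartite GED approximation, whereas the call below computes GED exactly; the graph contents are made up for the example):

    import networkx as nx

    def action_graph(objects, interactions):
        # Nodes carry object labels, edges carry interaction descriptors.
        g = nx.Graph()
        for node_id, label in objects.items():
            g.add_node(node_id, label=label)
        for (a, b), relation in interactions.items():
            g.add_edge(a, b, relation=relation)
        return g

    g1 = action_graph({0: "hand", 1: "cup"}, {(0, 1): "grasp"})
    g2 = action_graph({0: "hand", 1: "mug", 2: "table"}, {(0, 1): "grasp", (1, 2): "on"})

    dist = nx.graph_edit_distance(
        g1, g2,
        node_match=lambda a, b: a["label"] == b["label"],
        edge_match=lambda a, b: a["relation"] == b["relation"],
    )
    print("video dissimilarity (GED):", dist)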
    
  141. E.-O. Porfyrakis, A. Makris and A.A. Argyros, "3D Hand Tracking by Employing Probabilistic Principal Component Analysis to Model Action Priors", In International Conference on Computer Vision Systems (ICVS 2019), Springer, pp. 531-541, Thessaloniki, Greece, September 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  142. Abstract: Research in vision-based 3D hand tracking targets primarily the scenario in which a bare hand performs unconstrained motion in front of a camera system. Nevertheless, in several important application domains, augmenting the hand with color information so as to facilitate the tracking process constitutes an acceptable alternative. With this observation in mind, in this work we propose a modification of a state of the art method [12] for markerless 3D hand tracking, that takes advantage of the richer observations resulting from a colored glove. We do so by modifying the 3D hand model employed in the aforementioned hypothesize-and-test method as well as the objective function that is minimized in its optimization step. Quantitative and qualitative results obtained from a comparative evaluation of the baseline method to the proposed approach confirm that the latter achieves a remarkable increase in tracking accuracy and robustness and, at the same time, reduces drastically the associated computational costs.
    BibTeX:
    @inproceedings{Porfyrakis2019,
      author = {Porfyrakis, Emmanouil-Olof and Makris, Alexandros and Argyros, Antonis A},
      title = {3D Hand Tracking by Employing Probabilistic Principal Component Analysis to Model Action Priors},
      booktitle = {International Conference on Computer Vision Systems (ICVS 2019)},
      publisher = {Springer},
      year = {2019},
      month = {September},
      pages = {531--541},
      address = {Thessaloniki, Greece},
      url = {https://doi.org/10.1007/978-3-030-34995-0_48},
      projects =  {CO4ROBOTS},
      doi = {10.1007/978-3-030-34995-0_48},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_09_ICVS_handposemotionpriors.pdf},
      videolink = {https://youtu.be/L09qeohuJ9k}
    }
    
  143. A. Qammaz and A.A. Argyros, "MocapNET: Ensemble of SNN Encoders for 3D Human Pose Estimation in RGB Images", In British Machine Vision Conference (BMVC 2019), BMVA, Cardiff, UK, September 2019.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  144. Abstract: We present MocapNET, an ensemble of SNN encoders that estimates the 3D human body pose based on 2D joint estimations extracted from monocular RGB images. MocapNET provides an efficient divide and conquer strategy for supervised learning. It outputs skeletal information directly into the BVH format which can be rendered in real-time or imported without any additional processing in most popular 3D animation software. The proposed architecture achieves very fast (100Hz) 3D human pose estimations using only CPU processing.
    BibTeX:
    @inproceedings{Qammaz2019,
      author = {Qammaz, Ammar and Argyros, Antonis A},
      title = {MocapNET: Ensemble of SNN Encoders for 3D Human Pose Estimation in RGB Images},
      booktitle = {British Machine Vision Conference (BMVC 2019)},
      publisher = {BMVA},
      year = {2019},
      month = {September},
      address = {Cardiff, UK},
      url = {http://users.ics.forth.gr/~argyros/res_mocapnet.html},
      projects =  {CO4ROBOTS,MINGEI},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_09_BMVC_mocapnet.pdf},
      videolink = {https://youtu.be/fH5e-KMBvM0}
    }
    
  145. C. Constantinopoulos, D. Kosmopoulos, A.A. Argyros, I. Oikonomidis, V. Lampropoulou, K. Antzakas, C. Panagopoulos, A. Menychtas and C. Theoharatos, "The HealthSign project, current state and future activities", In IEEE International Conference on Information, Intelligence, Systems and Applications (IISA 2019), IEEE, Patras, Greece, July 2019.
    [Abstract] [BibTeX] [PDF]

  146. Abstract: This paper presents the HealthSign project, which deals with the problem of sign language recognition with focus on medical interaction scenarios. The deaf user will be able to communicate in his native sign language with a physician. The continuous signs will be translated to text and presented to the physician. Similarly, the speech will be recognized and presented as text to the deaf users. Two alternative versions of the system will be developed, one doing the recognition on a server, and another one doing the recognition on a mobile device.
    BibTeX:
    @inproceedings{Constantinopoulos2019,
      author = {Constantinos Constantinopoulos and Dimitrios Kosmopoulos and Antonis A. Argyros and Iason Oikonomidis and V Lampropoulou and Klimis Antzakas and C Panagopoulos and A Menychtas and C Theoharatos},
      title = {The HealthSign project, current state and future activities},
      booktitle = {IEEE International Conference on Information, Intelligence, Systems and Applications (IISA 2019)},
      publisher = {IEEE},
      year = {2019},
      month = {July},
      address = {Patras, Greece},
      projects =  {HealthSign},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_07_IISA_healthsign.pdf}
    }
    
  147. F. Gouidis, P. Panteleris, I. Oikonomidis and A.A. Argyros, "Accurate Hand Keypoint Localization on Mobile Devices", In Machine Vision Applications (MVA 19), also available at CoRR, arXiv, Tokyo, Japan, May 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  148. Abstract: We present a novel approach for 2D hand keypoint localization from regular color input. The proposed approach relies on an appropriately designed Convolutional Neural Network (CNN) that computes a set of heatmaps, one per hand keypoint of interest. Extensive experiments with the proposed method compare it against state of the art approaches and demonstrate its accuracy and computational performance on standard, publicly available datasets. The obtained results demonstrate that the proposed method matches or outperforms the competing methods in accuracy, but clearly outperforms them in computational efficiency, making it a suitable building block for applications that require hand keypoint estimation on mobile devices.
    BibTeX:
    @inproceedings{Gouidis2019,
      author = {Filippos Gouidis and Paschalis Panteleris and Iason Oikonomidis and Antonis A. Argyros},
      title = {Accurate Hand Keypoint Localization on Mobile Devices},
      booktitle = {Machine Vision Applications (MVA 19), also available at CoRR, arXiv},
      year = {2019},
      month = {May},
      address = {Tokyo, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_2Dhandkeypoints.html},
      projects =  {Co4Robots,HealthSign},
      doi = {10.23919/MVA.2019.8758059},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_05_MVA_hand2Dkeypoints.pdf},
      videolink = {https://youtu.be/vl-n3At-gp8}
    }
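
    The decoding step shared by heatmap-based keypoint detectors such as the one above (illustrative sketch, not the paper's network) simply takes the argmax of each heatmap and rescales it to image coordinates:

    import numpy as np

    def decode_heatmaps(heatmaps, image_size):
        """heatmaps: (K, h, w), one per hand keypoint; image_size: (H, W)."""
        K, h, w = heatmaps.shape
        H, W = image_size
        coords = np.zeros((K, 2))
        for k in range(K):
            idx = np.argmax(heatmaps[k])
            y, x = np.unravel_index(idx, (h, w))
            coords[k] = [x * W / w, y * H / h]   # (u, v) in image pixels
        return coords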
    
  149. A. Makris and A.A. Argyros, "Robust 3D Human Pose Estimation Guided by Filtered Subsets of Body Keypoints", In Machine Vision Applications (MVA 19), Tokyo, Japan, May 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  150. Abstract: We propose a novel hybrid human 3D body pose estimation method that uses RGBD input. The method relies on a deep neural network to get an initial 2D body pose. Using depth information from the sensor, a set of 2D landmarks on the body are transformed in 3D. Then, a multiple hypothesis tracker uses the obtained 2D and 3D body landmarks to estimate the 3D body pose. In order to safeguard from observation errors, each human pose hypothesis considered by the tracker is constructed using a gradient descent optimization scheme that is applied to a subset of the body landmarks. Landmark selection is driven by a set of geometric constraints and temporal continuity criteria. The resulting 3D poses are evaluated by an objective function that calculates densely the discrepancy between the 3D structure of the rendered 3D human body model and the actual depth observed by the sensor. The quantitative experiments show the advantages of the proposed method over a baseline that directly uses all landmark observations for the optimization, as well as over other recent 3D human pose estimation approaches.
    BibTeX:
    @inproceedings{Makris2019,
      author = {Makris, Alexandros and Argyros, Antonis A},
      title = {Robust 3D Human Pose Estimation Guided by Filtered Subsets of Body Keypoints},
      booktitle = {Machine Vision Applications (MVA 19)},
      year = {2019},
      month = {May},
      address = {Tokyo, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_robustbodypose.html},
      projects =  {Co4Robots},
      doi = {10.23919/MVA.2019.8757907},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_05_MVA_bodypose.pdf},
      videolink = {https://youtu.be/jvLzGpnniWc}
    }
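
    Lifting the 2D landmarks to 3D with the sensor's depth map is standard pinhole back-projection; a short sketch (the intrinsics fx, fy, cx, cy are assumed known) is:

    import numpy as np

    def backproject(keypoints_2d, depth_map, fx, fy, cx, cy):
        """keypoints_2d: (J, 2) pixel coordinates; returns (J, 3) points in meters."""
        pts = np.zeros((len(keypoints_2d), 3))
        for j, (u, v) in enumerate(keypoints_2d):
            z = depth_map[int(round(v)), int(round(u))]   # depth at the landmark, in meters
            pts[j] = [(u - cx) * z / fx, (v - cy) * z / fy, z]
        return pts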
    
  151. G. Galanakis, X. Zabulis and A.A. Argyros, "Novelty Detection for Person Re-identification in an Open World", In International Conference on Computer Vision Theory and Applications (VISAPP 2019), Scitepress, pp. 401-411, Prague, Czech Republic, February 2019.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  152. Abstract: A fundamental assumption in most contemporary person re-identification research, is that all query persons that need to be re-identified belong to a closed gallery of known persons, i.e., they have been observed and a representation of their appearance is available. For several real-world applications, this closed-world assumption does not hold, as image queries may contain people that the re-identification system has never observed before. In this work, we remove this constraining assumption. To do so, we introduce a novelty detection mechanism that decides whether a person in a query image exists in the gallery. The re-identification of persons existing in the gallery is easily achieved based on the person representation employed by the novelty detection mechanism. The proposed method operates on a hybrid person descriptor that consists of both supervised (learnt) and unsupervised (hand-crafted) components. A series of experiments on public, state of the art datasets and in comparison with state of the art methods shows that the proposed approach is very accurate in identifying persons that have not been observed before and that this has a positive impact on re-identification accuracy.
    BibTeX:
    @inproceedings{Galanakis2019,
      author = {Galanakis, George and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Novelty Detection for Person Re-identification in an Open World},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2019)},
      publisher = {Scitepress},
      year = {2019},
      month = {February},
      pages = {401--411},
      address = {Prague, Czech Republic},
      url = {http://insticc.org/node/TechnicalProgram/visigrapp/presentationDetails/73683},
      projects =  {Co4Robots},
      doi = {10.5220/0007368304010411},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_02_VISAPP_galanakis.pdf}
    }
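
    The open-world decision itself can be illustrated with a minimal nearest-neighbor sketch (hypothetical names; the paper's novelty detector and hybrid descriptor are considerably more involved):

    # A query is "novel" if its descriptor is farther than a threshold from every
    # gallery descriptor; otherwise it is re-identified as the nearest gallery person.
    import numpy as np

    def reidentify(query, gallery, gallery_ids, novelty_threshold):
        """query: (D,); gallery: (N, D); gallery_ids: (N,) person identities."""
        dists = np.linalg.norm(gallery - query, axis=1)
        nearest = int(np.argmin(dists))
        if dists[nearest] > novelty_threshold:
            return None                      # unseen person: not in the gallery
        return gallery_ids[nearest]          # closed-world case: nearest-neighbor match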
    
  153. G. Karvounas, I. Oikonomidis and A. Argyros, "ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos", CoRR, arXiv, 2019.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  154. Abstract: We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method raises several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
    BibTeX:
    @arxivarticle{1910.06096,
      author = {Giorgos Karvounas and Iason Oikonomidis and Antonis Argyros},
      title = {ReActNet: Temporal Localization of Repetitive Activities in Real-World Videos},
      journal = {CoRR, arXiv},
      year = {2019},
      url = {http://arxiv.org/abs/1910.06096},
      projects =  {Co4Robots},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2019_10_arxiv_reactnet.pdf},
      videolink = {https://youtu.be/pPqg1lMkuaQ}
    }
    
  155. T.-H. Pham, N. Kyriazis, A.A. Argyros and A. Kheddar, "Hand-Object Contact Force Estimation From Markerless Visual Tracking", IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, vol. 40, no. 12, pp. 2883-2896, December 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  156. Abstract: We consider the problem of estimating realistic contact forces during manipulation, backed with ground-truth measurements, using vision alone. Interaction forces are usually measured by mounting force transducers onto the manipulated objects or the hands. Those are costly, cumbersome, and alter the objects’ physical properties and their perception by the human sense of touch. Our work establishes that interaction forces can be estimated in a cost-effective, reliable, non-intrusive way using vision. This is a complex and challenging problem. Indeed, in multi-contact, a given motion can generally be caused by an infinity of possible force distributions. To alleviate the limitations of traditional models based on inverse optimization, we collect and release the first large-scale dataset on manipulation kinodynamics as 3.2 hours of synchronized force and motion measurements under 193 object-grasp configurations. We learn a mapping between high-level kinematic features based on the equations of motion and the underlying manipulation forces using recurrent neural networks (RNN). The RNN predictions are consistently refined using physics-based optimization through second-order cone programming (SOCP). We show that our method can successfully capture interaction forces compatible with both the observations and the way humans intuitively manipulate objects, using a single RGB-D camera.
    BibTeX:
    @article{Pham2017,
      author = {Pham, Tu-Hoa and Kyriazis, Nikolaos and Argyros, Antonis A and Kheddar, Abderrahmane},
      title = {Hand-Object Contact Force Estimation From Markerless Visual Tracking},
      journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
      publisher = {IEEE},
      year = {2018},
      month = {December},
      volume = {40},
      number = {12},
      pages = {2883--2896},
      url = {http://users.ics.forth.gr/~argyros/res_fsv.html},
      projects =  {ROBOHOW,WEARHAP},
      doi = {10.1109/TPAMI.2017.2759736},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_journal_PAMI_fsv.pdf},
      videolink = {https://youtu.be/NhNV3tCcbd0}
    }
    
  157. D. Bautembach, I. Oikonomidis and A. Argyros, "Filling the Joints: Completion and Recovery of Incomplete 3D Human Poses", Technologies, MDPI, vol. 6, no. 4, October 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  158. Abstract: We present a comparative study of three matrix completion and recovery techniques based on matrix inversion, gradient descent, and Lagrange multipliers, applied to the problem of human pose estimation. 3D human pose estimation algorithms may exhibit noise or may completely fail to provide estimates for some joints. A post-process is often employed to recover the missing joints’ locations from the remaining ones, typically by enforcing kinematic constraints or by using a prior learned from a database of natural poses. Matrix completion and recovery techniques fall into the latter category and operate by filling-in missing entries of a matrix whose available/non-missing entries may be additionally corrupted by noise. We compare the performance of three such techniques in terms of the estimation error of their output as well as their runtime, in a series of simulated and real-world experiments. We conclude by recommending use cases for each of the compared techniques.
    BibTeX:
    @article{Bautembach2018b,
      author = {Bautembach, Dennis and Oikonomidis, Iason and Argyros, Antonis},
      title = {Filling the Joints: Completion and Recovery of Incomplete 3D Human Poses},
      journal = {Technologies},
      publisher = {MDPI},
      year = {2018},
      month = {October},
      volume = {6},
      number = {4},
      url = {http://www.mdpi.com/2227-7080/6/4/97},
      projects =  {co4robots,ACANTO},
      doi = {10.3390/technologies6040097},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_journal_technologies_dennis.pdf}
    }
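
    Of the three compared families, the gradient-descent flavor of matrix completion is the simplest to sketch (illustrative Python, not the paper's code): factor the pose matrix as a low-rank product, fit only the observed joint coordinates, and read the missing ones from the reconstruction:

    import numpy as np

    def complete_matrix(M, observed, rank=3, lr=0.01, iters=2000):
        """M: (m, n) with zeros (or any finite values) at missing entries; observed: boolean mask."""
        m, n = M.shape
        rng = np.random.default_rng(0)
        U, V = 0.1 * rng.standard_normal((m, rank)), 0.1 * rng.standard_normal((n, rank))
        for _ in range(iters):
            R = (U @ V.T - M) * observed      # residual only on the known entries
            U, V = U - lr * (R @ V), V - lr * (R.T @ U)
        return U @ V.T                        # missing joints filled by the low-rank model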
    
  159. C. Panagiotakis, G. Karvounas and A.A. Argyros, "Unsupervised Detection of Periodic Segments in Videos", In IEEE International Conference on Image Processing (ICIP 2018), pp. 923-927, Athens, Greece, October 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  160. Abstract: We present a solution to the problem of discovering all periodic segments of a video and of estimating their period in a completely unsupervised manner. These segments may be located anywhere in the video, may differ in duration, speed, period and may represent unseen motion patterns of any type of objects (e.g., humans, animals, machines, etc). The proposed method capitalizes on earlier research on the problem of detecting common actions in videos, also known as commonality detection or video co-segmentation. The proposed method has been evaluated quantitatively and in comparison to a baseline, power-spectrum-based approach, on two ground-truth-annotated datasets (MHAD202-v, PERTUBE). From those, PERTUBE has been compiled specifically for the purposes of this study and includes a collection of YouTube videos that have been shot in the wild, with several periodic segments. The results of this evaluation demonstrate that the proposed method outperforms the baseline considerably, especially in the more challenging PERTUBE dataset.
    BibTeX:
    @inproceedings{Panagiotakis2018a,
      author = {Panagiotakis, Costas and Karvounas, Giorgos and Argyros, Antonis A.},
      title = {Unsupervised Detection of Periodic Segments in Videos},
      booktitle = {IEEE International Conference on Image Processing (ICIP 2018)},
      year = {2018},
      month = {October},
      pages = {923--927},
      address = {Athens, Greece},
      url = {http://www.ics.forth.gr/cvrl/pd},
      projects =  {ACANTO,Co4Robots},
      doi = {10.1109/ICIP.2018.8451336},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_10_ICIP_periodicity.pdf}
    }
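
    The power-spectrum baseline that the paper compares against can be sketched in a few lines (illustrative; the proposed commonality-based method is different): estimate the dominant period of a 1D per-frame motion signal from its FFT:

    import numpy as np

    def dominant_period(signal, fps):
        """signal: (T,) per-frame motion measure (e.g., mean optical-flow magnitude);
        returns the dominant period in seconds, or None if no periodicity is found."""
        x = signal - np.mean(signal)
        power = np.abs(np.fft.rfft(x)) ** 2
        freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
        power[0] = 0.0                        # ignore the DC component
        k = int(np.argmax(power))
        return 1.0 / freqs[k] if freqs[k] > 0 else None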
    
  161. C. Panagiotakis and A.A. Argyros, "Cell Segmentation via Region-based Ellipse Fitting", In IEEE International Conference on Image Processing (ICIP 2018), pp. 2426-2430, Athens, Greece, October 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  162. Abstract: We present a region based method for segmenting and splitting images of cells in an automatic and unsupervised manner. The detection of cell nuclei is based on Bradley's method. False positives are automatically identified and rejected based on shape and intensity features. Additionally, the proposed method is able to automatically detect and split touching cells. To do so, we employ a variant of a region based multi-ellipse fitting method (DEFA) that makes use of constraints on the area of the split cells. The quantitative assessment of the proposed method has been based on two challenging public datasets. This experimental study shows that the proposed method clearly outperforms existing methods for segmenting fluorescence microscopy images.
    BibTeX:
    @inproceedings{Panagiotakis2018b,
      author = {Panagiotakis, Costas and Argyros, Antonis A.},
      title = {Cell Segmentation via Region-based Ellipse Fitting},
      booktitle = {IEEE International Conference on Image Processing (ICIP 2018)},
      year = {2018},
      month = {October},
      pages = {2426--2430},
      address = {Athens, Greece},
      url = {https://sites.google.com/site/costaspanagiotakis/research/cs},
      projects =  {none},
      doi = {10.1109/ICIP.2018.8451852},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_10_ICIP_cells.pdf}
    }
    
  163. I. Oikonomidis, G. Garcia-Hernando, A. Yao, A.A. Argyros, V. Lepetit and T.-K. Kim, "HANDS18: Methods, Techniques and Applications for Hand Observation", In European Conference on Computer Vision Workshops (HANDS 2018 - ECCVW 2018), also available at CoRR, arXiv, Springer, pp. 302-312, Munich, Germany, September 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  164. Abstract: This report outlines the proceedings of the Fourth International Workshop on Observing and Understanding Hands in Action (HANDS 2018). The fourth instantiation of this workshop attracted significant interest from both academia and the industry. The program of the workshop included regular papers that are published as the workshop's proceedings, extended abstracts, invited posters, and invited talks. Topics of the submitted works and invited talks and posters included novel methods for hand pose estimation from RGB, depth, or skeletal data, datasets for special cases and real-world applications, and techniques for hand motion re-targeting and hand gesture recognition. The invited speakers are leaders in their respective areas of specialization, coming from both industry and academia. The main conclusions that can be drawn are the turn of the community towards RGB data and the maturation of some methods and techniques, which in turn has led to increasing interest for real-world applications.
    BibTeX:
    @inproceedings{Oikonomhands2018,
      author = {Oikonomidis, Iason and Garcia-Hernando, Guillermo and Yao, Angela and Argyros, Antonis A. and Lepetit, Vincent and Kim, Tae-Kyun},
      title = {HANDS18: Methods, Techniques and Applications for Hand Observation},
      booktitle = {European Conference on Computer Vision Workshops (HANDS 2018 - ECCVW 2018), also available at CoRR, arXiv},
      publisher = {Springer},
      year = {2018},
      month = {September},
      pages = {302--312},
      address = {Munich, Germany},
      url = {http://arxiv.org/abs/1810.10818},
      projects =  {CO4ROBOTS},
      doi = {10.1007/978-3-030-11024-6_20},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_09_ECCVW_hands18.pdf}
    }
    
  165. A. Tsoli and A.A. Argyros, "Joint 3D tracking of a deformable object in interaction with a hand", In European Conference on Computer Vision (ECCV 2018), Springer, pp. 504-520, September 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  166. Abstract: We present a novel method that is able to track a complex deformable object in interaction with a hand. This is achieved by formulating and solving an optimization problem that jointly considers the hand, the deformable object and the hand/object contact points. The optimization evaluates several hand/object contact configuration hypotheses and adopts the one that results in the best fit of the object's model to the available RGBD observations in the vicinity of the hand. Thus, the hand is not treated as a distractor that occludes parts of the deformable object, but as a source of valuable information. Experimental results on a dataset that has been developed specifically for this new problem illustrate the superior performance of the proposed approach against relevant, state of the art solutions.
    BibTeX:
    @inproceedings{Tsoli2018,
      author = {Tsoli, Aggeliki and Argyros, Antonis A},
      title = {Joint 3D tracking of a deformable object in interaction with a hand},
      booktitle = {European Conference on Computer Vision (ECCV 2018)},
      publisher = {Springer},
      year = {2018},
      month = {September},
      pages = {504--520},
      url = {https://www.ics.forth.gr/cvrl/deformable_interaction/},
      projects =  {Co4Robots},
      doi = {10.1007/978-3-030-01264-9_30},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_ECCV_deformable_interaction.pdf},
      videolink = {https://youtu.be/JSOIy3D_5I0}
    }
    
  167. C. Panagiotakis, K. Papoutsakis and A.A. Argyros, "A Graph-based Approach for Detecting Common Actions in Motion Capture Data and Videos", Pattern Recognition, Elsevier, vol. 79, pp. 1-11, July 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  168. Abstract: We present a novel solution to the problem of detecting common actions in time series of motion capture data and videos. Given two action sequences, our method discovers all pairs of similar subsequences, i.e. subsequences that represent the same action. This is achieved in a completely unsupervised manner, i.e., without any prior knowledge of the type of actions, their number and their duration. These common subsequences (commonalities) may be located anywhere in the original sequences, may differ in duration and may be performed under different conditions e.g., by a different actor. The proposed method performs a very efficient graph-based search on the matrix of pairwise frame distances of the two sequences. This search is also supported by an objective function that captures the trade-off between the similarity of the common subsequences and their lengths. The proposed method has been evaluated quantitatively on challenging datasets and in comparison to state of the art approaches. The obtained results demonstrate that the proposed method outperforms the state of the art methods both in the quality of the obtained solutions and in computational performance.
    BibTeX:
    @article{Panagiotakis2018,
      author = {Panagiotakis, Costas and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {A Graph-based Approach for Detecting Common Actions in Motion Capture Data and Videos},
      journal = {Pattern Recognition},
      publisher = {Elsevier},
      year = {2018},
      month = {July},
      volume = {79},
      pages = {1--11},
      url = {https://www.sciencedirect.com/science/article/pii/S0031320318300499},
      projects =  {ACANTO,CO4ROBOTS},
      doi = {10.1016/j.patcog.2018.02.001},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_journal_PR_mucos.pdf}
    }
    
  169. M. Bajones, D. Fischinger, A. Weiss, D. Wolf, T. Kortner, M. Weninger, K. Papoutsakis, D. Michel, A. Qammaz, P. Panteleris, M. Foukarakis, I. Adami, D. Ioannidi, A. Leonidis, M. Antona, A.A. Argyros, P.-M. Mayer, P. Panek, H. Eftring, S. Frennert, M. Vincze and P.D.L. Puente, "Hobbit - Providing Fall Detection and Prevention for the Elderly in the Real World", Journal of Robotics, Hindawi, pp. 1-20, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  170. Abstract: In this article we present the robot developed within the Hobbit project, a socially assistive service robot aiming at the challenge of enabling prolonged independent living of elderly people in their own homes. We present the second prototype (Hobbit PT2) in terms of hardware and functionality improvements following first user studies. Our main contribution lies within the description of all components developed within the Hobbit project leading to autonomous operation of 371 days during field trials in Austria, Greece and Sweden. In these field trials we studied how 18 elderly users (aged 75 plus) lived with the autonomously interacting service robot over multiple weeks. To the best of our knowledge, this is the first time a multi-functional, low-cost service robot equipped with a manipulator was studied and evaluated for several weeks under real-world conditions. We show that Hobbit’s adaptive approach towards the user increasingly eased the interaction between the users and Hobbit. We provide lessons learned regarding the need for adaptive behavior coordination, support during emergency situations and clear communication of robotic actions and their consequences for fellow researchers who are developing an autonomous, low-cost service robot designed to interact with their users in domestic contexts. Our trials show the necessity to move out into actual user homes, as only there we encounter issues such as misinterpretation of actions during non-scripted Human-Robot interaction.
    BibTeX:
    @article{Bajones2018,
      author = {Bajones, Markus and Fischinger, David and Weiss, Astrid and Wolf, Daniel and Kortner, Tobias and Weninger, Markus and Papoutsakis, Konstantinos and Michel, Damien and Qammaz, Ammar and Panteleris, Paschalis and Foukarakis, Michalis and Adami, Ilia and Ioannidi, Danai and Leonidis, Asterios and Antona, Margherita and Argyros, Antonis A and Mayer, Peter-Michael and Panek, Paul and Eftring, Håkan and Frennert, Susanne and Vincze, Markus and Puente, Paloma De La},
      title = {Hobbit - Providing Fall Detection and Prevention for the Elderly in the Real World},
      journal = {Journal of Robotics},
      publisher = {Hindawi},
      year = {2018},
      month = {June},
      pages = {1--20},
      url = {https://www.hindawi.com/journals/jr/2018/1754657},
      projects =  {HOBBIT},
      doi = {10.1155/2018/1754657},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_journal_JoR_HOBBIT.pdf}
    }
    
  171. D. Bautembach, I. Oikonomidis and A.A. Argyros, "A Comparative Study of Matrix Completion and Recovery Techniques for Human Pose Estimation", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2018), ACM, pp. 23-30, Corfu, Greece, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  172. Abstract: We present a comparative study of three matrix completion and recovery techniques, applied to the problem of human pose estimation. Human pose estimation algorithms may exhibit estimation noise or may completely fail to provide estimates for some joints. A post-process is often employed to recover the missing joints' locations from the remaining ones, typically by enforcing kinematic constraints or by using a prior learned from a database of natural poses. Matrix completion and recovery techniques fall into the latter category and operate by filling-in missing entries of a matrix whose available/non-missing entries may be additionally corrupted by noise. We compare the performance of three such techniques in terms of the estimation error of their output as well as their runtime under varying parameters. We conclude by recommending use cases for each of the compared techniques.
    BibTeX:
    @inproceedings{Bautembach2018,
      author = {Bautembach, Dennis and Oikonomidis, Iason and Argyros, Antonis A},
      title = {A Comparative Study of Matrix Completion and Recovery Techniques for Human Pose Estimation},
      booktitle = {International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2018)},
      publisher = {ACM},
      year = {2018},
      month = {June},
      pages = {23--30},
      address = {Corfu, Greece},
      url = {http://doi.acm.org/10.1145/3197768.3197791},
      projects =  {Co4Robots, ACANTO},
      doi = {10.1145/3197768.3197791},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_06_PETRA_Bautembach.pdf}
    }
    
  173. D. Kosmopoulos, A.A. Argyros, C. Theoharatos, V. Lambropoulou, C. Panagopoulos and I. Maglogiannis, "The HealthSign Project: Vision and Objectives", In SmartCities Workshop, International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2018), ACM, pp. 502-506, Corfu, Greece, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  174. Abstract: This paper presents the HealthSign project, which deals with the problem of sign language recognition with focus on medical interaction scenarios. The deaf user will be able to communicate in his native sign language with a physician. The continuous signs will be translated to text and presented to the physician. Similarly, the speech will be recognized and presented as text to the deaf users. Two alternative versions of the system will be developed, one doing the recognition on a server, and another one doing the recognition on a mobile device.
    BibTeX:
    @inproceedings{Kosmopoulos2018,
      author = {Kosmopoulos, Dimitrios and Argyros, Antonis A and Theoharatos, C and Lambropoulou, V and Panagopoulos, C and Maglogiannis, Ilias},
      title = {The HealthSign Project: Vision and Objectives},
      booktitle = {SmartCities Workshop, International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2018)},
      publisher = {ACM},
      year = {2018},
      month = {June},
      pages = {502--506},
      address = {Corfu, Greece},
      url = {http://doi.acm.org/10.1145/3197768.3201547},
      projects =  {HealthSign},
      doi = {10.1145/3197768.3201547},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_06_PETRA_SmartCities_HealthSign.pdf}
    }
    
  175. A. Qammaz, S. Kosta, N. Kyriazis and A.A. Argyros, "Distributed Real-Time Generative 3D Hand Tracking using Edge GPGPU Acceleration", In ACM International Conference on Mobile Systems, Applications, and Services (ACM MobiSys 2018), ACM, pp. 540, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF]

  176. Abstract: This work demonstrates a real-time 3D hand tracking application that runs via computation offloading. The proposed framework enables the application to run on low-end mobile devices such as laptops and tablets, despite the fact that they lack the sufficient hardware to perform the required computations locally. The network connection takes the place of a GPGPU accelerator and sharing resources with a larger workstation becomes the acceleration mechanism. The unique properties of a generative optimizer are examined and constitute a challenging use-case, since the requirement for real-time performance makes it very latency-sensitive.
    BibTeX:
    @inproceedings{Qammaz2018b,
      author = {Qammaz, Ammar and Kosta, Sokol and Kyriazis, Nikolaos and Argyros, Antonis A.},
      title = {Distributed Real-Time Generative 3D Hand Tracking using Edge GPGPU Acceleration},
      booktitle = {ACM International Conference on Mobile Systems, Applications, and Services (ACM MobiSys 2018)},
      publisher = {ACM},
      year = {2018},
      month = {June},
      pages = {540},
      projects =  {RAPID, Co4Robots},
      doi = {10.1145/3210240.3211112},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_06_mobisys.pdf}
    }
    
  177. S. Timotheatos, S. Piperakis, A.A. Argyros and P. Trahanias, "Vision based Horizon Detection for UAV Navigation", In 27th International Conference on Robotics in Alpe-Adria Danube Region (RAAD 2018), pp. 181-189, Patras, Greece, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF]

  178. Abstract: In this paper, we present a novel framework for horizon line (HL) detection that can be effectively used for Unmanned Air Vehicle (UAV) navigation. Our scheme is based on a Canny edge and a Hough detector along with an optimization step performed by a Particle Swarm optimization (PSO) algorithm. The PSO's objective function is based on a variation of the Bag of Words (BOW) method to effectively consider multiple image descriptors and facilitate efficient computation times. More specifically, the image descriptors employed are Lab color features, texture features and SIFT features. We demonstrate the effectiveness and robustness of the proposed novel horizon line detector in multiple image sets captured under real world conditions. First, we experimentally compare the proposed scheme with the Hough HL detector and a deep learning HL estimator, a prominent example of line detection, and demonstrate a significant boost in accuracy. Furthermore, since from the horizon line the UAV roll and pitch angles can be derived, this scheme can be used for UAV navigation. To this end, we compare the horizon computed roll and pitch angles to the IMU ones obtained with a complementary filter, to further validate our approach.
    BibTeX:
    @inproceedings{Timotheatos2018,
      author = {Timotheatos, Stavros and Piperakis, Stylianos and Argyros, Antonis A and Trahanias, Panos},
      title = {Vision based Horizon Detection for UAV Navigation},
      booktitle = {27th International Conference on Robotics in Alpe-Adria Danube Region (RAAD 2018)},
      year = {2018},
      month = {June},
      pages = {181--189},
      address = {Patras, Greece},
      projects =  {none},
      doi = {10.1007/978-3-030-00232-9_19},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_06_RAAD_Timotheatos.pdf}
    }
    
  179. S. Yuan, G. Garcia-Hernando, B. Stenger, G. Moon, J. Chang, K. Lee, P. Molchanov, J. Kautz, S. Honari, L. Ge, J. Yuan, X. Chen, G. Wang, F. Yang, K. Akiyama, Y. Wu, Q. Wan, M. Madadi, S. Escalera, S. Li, D. Lee, I. Oikonomidis, A. Argyros and T. Kim, "Depth-based 3D Hand Pose Estimation: From Current Achievements to Future Goals", In IEEE Computer Vision and Pattern Recognition (CVPR 2018), also available at CoRR, arXiv, IEEE, pp. 2636-2645, Salt Lake City, Utah, USA, June 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  180. Abstract: In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And, what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. We analyze the performance of different CNN structures with regard to hand shape, joint visibility, view point and articulation distributions. Our findings include: (1) isolated 3D hand pose estimation achieves low mean errors (10 mm) in the view point range of [70, 120] degrees, but it is far from being solved for extreme view points; (2) 3D volumetric representations outperform 2D CNNs, better capturing the spatial structure of the depth data; (3) Discriminative methods still generalize poorly to unseen hand shapes; (4) While joint occlusions pose a challenge for most methods, explicit modeling of structure constraints can significantly narrow the gap between errors on visible and occluded joints.
    BibTeX:
    @inproceedings{Yuan2018,
      author = {Yuan, S. and Garcia-Hernando, G. and Stenger, B. and Moon, G. and Chang, J.Y. and Lee, K.M. and Molchanov, P. and Kautz, J. and Honari, S. and Ge, L. and Yuan, J. and Chen, X. and Wang, G. and Yang, F. and Akiyama, K. and Wu, Y. and Wan, Q. and Madadi, M. and Escalera, S. and Li, S. and Lee, D. and Oikonomidis, I. and Argyros, A. and Kim, T.K.},
      title = {Depth-based 3D Hand Pose Estimation: From Current Achievements to Future Goals},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2018), also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2018},
      month = {June},
      pages = {2636--2645},
      address = {Salt Lake City, Utah, USA},
      url = {https://arxiv.org/abs/1712.03917},
      projects =  {Co4Robots},
      doi = {10.1109/CVPR.2018.00279},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_06_CVPR_HandPose.pdf}
    }
    
  181. A. Qammaz, S. Kosta, N. Kyriazis and A. Argyros, "On the Feasibility of Real-Time 3D Hand Tracking using Edge GPGPU Acceleration", CoRR, Arxiv, April 2018.
    [Abstract] [BibTeX] [PDF] [URL]

  182. Abstract: This paper presents the case study of a non-intrusive porting of a monolithic C++ library for real-time 3D hand tracking, to the domain of edge-based computation. Towards a proof of concept, the case study considers a pair of workstations, a computationally powerful and a computationally weak one. By wrapping the C++ library in a Java container and by capitalizing on a Java-based offloading infrastructure that supports both CPU and GPGPU computations, we are able to establish automatically the required server-client workflow that best addresses the resource allocation problem in the effort to execute from the weak workstation. As a result, the weak workstation can perform well at the task, despite lacking the sufficient hardware to do the required computations locally. This is achieved by offloading computations which rely on GPGPU, to the powerful workstation, across the network that connects them. We show the edge-based computation challenges associated with the information flow of the ported algorithm, demonstrate how we cope with them, and identify what needs to be improved for achieving even better performance.
    BibTeX:
    @arxivarticle{1804.11256,
      author = {Ammar Qammaz and Sokol Kosta and Nikolaos Kyriazis and Antonis Argyros},
      title = {On the Feasibility of Real-Time 3D Hand Tracking using Edge GPGPU Acceleration},
      journal = {CoRR, Arxiv},
      year = {2018},
      month = {April},
      url = {http://arxiv.org/abs/1804.11256},
      projects =  {RAPID, Co4Robots},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_04_arxiv_1804.11256.pdf}
    }
    
  183. P. Panteleris, I. Oikonomidis and A.A. Argyros, "Using a single RGB frame for real time 3D hand pose estimation in the wild", In IEEE Winter Conference on Applications of Computer Vision (WACV 2018), also available at CoRR, arXiv, IEEE, pp. 436-445, Lake Tahoe, NV, USA, March 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  184. Abstract: We present a method for the real-time estimation of the full 3D pose of one or more human hands using a single commodity RGB camera. Recent work in the area has displayed impressive progress using RGBD input. However, since the introduction of RGBD sensors, there has been little progress for the case of monocular color input. We capitalize on the latest advancements of deep learning, combining them with the power of generative hand pose estimation techniques to achieve real-time monocular 3D hand pose estimation in unrestricted scenarios. More specifically, given an RGB image and the relevant camera calibration information, we employ a state-of-the-art detector to localize hands. Given a crop of a hand in the image, we run the pretrained network of OpenPose for hands to estimate the 2D location of hand joints. Finally, non-linear least-squares minimization fits a 3D model of the hand to the estimated 2D joint positions, recovering the 3D hand pose. Extensive experimental results provide comparison to the state of the art as well as qualitative assessment of the method in the wild.
    BibTeX:
    @inproceedings{Panteleris2018,
      author = {Panteleris, Paschalis and Oikonomidis, Iason and Argyros, Antonis A},
      title = {Using a single RGB frame for real time 3D hand pose estimation in the wild},
      booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV 2018), also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2018},
      month = {March},
      pages = {436--445},
      address = {Lake Tahoe, NV, USA},
      url = {http://users.ics.forth.gr/~argyros/res_rgbmonohand.html},
      projects =  {Co4Robots, WEARHAP},
      doi = {10.1109/WACV.2018.00054},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_03_WACV_rgbmonohand.pdf},
      videolink = {https://youtu.be/VoWAmtga9fg}
    }
    
  185. A. Qammaz, D. Michel and A.A. Argyros, "A Hybrid Method for 3D Pose Estimation of Personalized Human Body Models", In IEEE Winter Conference on Applications of Computer Vision (WACV 2018), IEEE, pp. 456-465, Lake Tahoe, NV, USA, March 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  186. Abstract: We propose a new hybrid method for 3D human body pose estimation based on RGBD data. We treat this as an optimization problem that is solved using a stochastic optimization technique. The solution to the optimization problem is the pose parameters of a human model that register it to the available observations. Our method can make use of any skinned, articulated human body model. However, we focus on personalized models that can be acquired easily and automatically based on existing human scanning and mesh rigging techniques. Observations consist of the 3D structure of the human (measured by the RGBD camera) and the body joints locations (computed based on a discriminative, CNN-based component). A series of quantitative and qualitative experiments demonstrate the accuracy and the benefits of the proposed approach. In particular, we show that the proposed approach achieves state of the art results compared to competitive methods and that the use of personalized body models improves significantly the accuracy in 3D human pose estimation.
    BibTeX:
    @inproceedings{Qammaz2018,
      author = {Qammaz, Ammar and Michel, Damien and Argyros, Antonis A},
      title = {A Hybrid Method for 3D Pose Estimation of Personalized Human Body Models},
      booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV 2018)},
      publisher = {IEEE},
      year = {2018},
      month = {March},
      pages = {456--465},
      address = {Lake Tahoe, NV, USA},
      url = {http://users.ics.forth.gr/~argyros/res_personalizedHumanPose.html},
      projects =  {Co4Robots, ACANTO},
      doi = {10.1109/WACV.2018.00056},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_03_WACV_personalizedHumanPose.pdf},
      videolink = {https://youtu.be/SCgpIIaRIuI}
    }
    
  187. V. Manousaki, K. Papoutsakis and A.A. Argyros, "Evaluating Method Design Options for Action Classification based on Bags of Visual Words", In International Conference on Computer Vision Theory and Applications (VISAPP 2018), Scitepress, pp. 185-192, Madeira, Portugal, January 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  188. Abstract: The Bags of Visual Words (BoVWs) framework has been applied successfully to several computer vision tasks. In this work we are particularly interested on its application to the problem of action recognition/classification. The key design decisions for a method that follows the BoVWs framework are (a) the visual features to be employed, (b) the size of the codebook to be used for representing a certain action and (c) the classifier applied to the developed representation to solve the classification task. We perform several experiments to investigate a variety of options regarding all the aforementioned design parameters. We also propose a new feature type and we suggest a method that determines automatically the size of the codebook. The experimental results show that our proposals produce results that are competitive to the outcomes of state of the art methods.
    BibTeX:
    @inproceedings{Manousaki2018,
      author = {Manousaki, Victoria and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Evaluating Method Design Options for Action Classification based on Bags of Visual Words},
      booktitle = {International Conference on Computer Vision Theory and Applications (VISAPP 2018)},
      publisher = {Scitepress},
      year = {2018},
      month = {January},
      pages = {185--192},
      address = {Madeira, Portugal},
      url = {http://www.scitepress.org/PublicationsDetail.aspx?ID=x1JKg2Ydp4w=&t=1},
      projects =  {Co4Robots,ACANTO},
      doi = {10.5220/0006544201850192},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_01_VISAPP_manousaki.pdf}
    }
    
  189. I. Oikonomidis, G. Garcia-Hernando, A. Yao, A. Argyros, V. Lepetit and T.-K. Kim, "HANDS18: Methods, Techniques and Applications for Hand Observation", CoRR, arXiv, 2018.
    [Abstract] [BibTeX] [PDF] [URL]

  190. Abstract: This report outlines the proceedings of the Fourth International Workshop on Observing and Understanding Hands in Action (HANDS 2018). The fourth instantiation of this workshop attracted significant interest from both academia and the industry. The program of the workshop included regular papers that are published as the workshop's proceedings, extended abstracts, invited posters, and invited talks. Topics of the submitted works and invited talks and posters included novel methods for hand pose estimation from RGB, depth, or skeletal data, datasets for special cases and real-world applications, and techniques for hand motion re-targeting and hand gesture recognition. The invited speakers are leaders in their respective areas of specialization, coming from both industry and academia. The main conclusions that can be drawn are the turn of the community towards RGB data and the maturation of some methods and techniques, which in turn has led to increasing interest for real-world applications.
    BibTeX:
    @arxivarticle{1810.10818,
      author = {Iason Oikonomidis and Guillermo Garcia-Hernando and Angela Yao and Antonis Argyros and Vincent Lepetit and Tae-Kyun Kim},
      title = {HANDS18: Methods, Techniques and Applications for Hand Observation},
      journal = {CoRR, arXiv},
      year = {2018},
      url = {http://arxiv.org/abs/1810.10818},
      projects =  {CO4ROBOTS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_10_arxiv_hands18.pdf}
    }
    
  191. V.C. Nicodemou, I. Oikonomidis, G. Tzimiropoulos and A. Argyros, "Learning to Infer the Depth Map of a Hand from its Color Image", CoRR, arXiv, 2018.
    [Abstract] [BibTeX] [PDF] [URL]

  192. Abstract: We propose the first approach to the problem of inferring the depth map of a human hand based on a single RGB image. We achieve this with a Convolutional Neural Network (CNN) that employs a stacked hourglass model as its main building block. Intermediate supervision is used in several outputs of the proposed architecture in a staged approach. To aid the process of training and inference, hand segmentation masks are also estimated in such an intermediate supervision step, and used to guide the subsequent depth estimation process. In order to train and evaluate the proposed method we compile and make publicly available HandRGBD, a new dataset of 20,601 views of hands, each consisting of an RGB image and an aligned depth map. Based on HandRGBD, we explore variants of the proposed approach in an ablative study and determine the best performing one. The results of an extensive experimental evaluation demonstrate that hand depth estimation from a single RGB frame can be achieved with an accuracy of 22mm, which is comparable to the accuracy achieved by contemporary low-cost depth cameras. Such a 3D reconstruction of hands based on RGB information is valuable as a final result in its own right, but also as an input to several other hand analysis and perception algorithms that require depth input. Essentially, in such a context, the proposed approach bridges the gap between RGB and RGBD, by making all existing RGBD-based methods applicable to RGB input.
    BibTeX:
    @arxivarticle{1812.02486,
      author = {Vassilis C. Nicodemou and Iason Oikonomidis and Georgios Tzimiropoulos and Antonis Argyros},
      title = {Learning to Infer the Depth Map of a Hand from its Color Image},
      journal = {CoRR, arXiv},
      year = {2018},
      url = {http://arxiv.org/abs/1812.02486},
      projects =  {Co4Robots,HealthSign},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_12_arxiv_nicodemou.pdf}
    }
    
  193. F. Gouidis, P. Panteleris, I. Oikonomidis and A. Argyros, "Accurate Hand Keypoint Localization on Mobile Devices", CoRR, arXiv, 2018.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  194. Abstract: We address the problem of temporal localization of repetitive activities in a video, i.e., the problem of identifying all segments of a video that contain some sort of repetitive or periodic motion. To do so, the proposed method represents a video by the matrix of pairwise frame distances. These distances are computed on frame representations obtained with a convolutional neural network. On top of this representation, we design, implement and evaluate ReActNet, a lightweight convolutional neural network that classifies a given frame as belonging (or not) to a repetitive video segment. An important property of the employed representation is that it can handle repetitive segments of arbitrary number and duration. Furthermore, the proposed training process requires a relatively small number of annotated videos. Our method raises several of the limiting assumptions of existing approaches regarding the contents of the video and the types of the observed repetitive activities. Experimental results on recent, publicly available datasets validate our design choices, verify the generalization potential of ReActNet and demonstrate its superior performance in comparison to the current state of the art.
    BibTeX:
    @arxivarticle{1812.08028,
      author = {Filippos Gouidis and Paschalis Panteleris and Iason Oikonomidis and Antonis Argyros},
      title = {Accurate Hand Keypoint Localization on Mobile Devices},
      journal = {CoRR, arXiv},
      year = {2018},
      url = {http://arxiv.org/abs/1812.08028},
      projects =  {Co4Robots},
      doi = {10.48550/arXiv.1812.08028},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_12_arxiv_gouidis.pdf},
      videolink = {https://youtu.be/pPqg1lMkuaQ}
    }
    
  195. D. Kosmopoulos, K. Papoutsakis and A.A. Argyros, "A framework for online segmentation and classification of modeled actions performed in the context of unmodeled ones", IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 27, pp. 2578-2590, December 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  196. Abstract: In this work, we propose a discriminative framework for online simultaneous segmentation and classification of modeled visual actions that can be performed in the context of other, unknown actions. To this end, we employ Hough transform to vote in a 3D space for the begin point, the end point and the label of the segmented part of the input stream. An SVM is used to model each class and to suggest putative labeled segments on the timeline. To identify the most plausible segments among the putative ones we apply a dynamic programming algorithm, which maximises the likelihood for label assignment in linear time. The performance of our method is evaluated on synthetic, as well as on real data (Weizmann, TUM Kitchen, UTKAD and Berkeley multimodal human action databases). Extensive quantitative results obtained on a number of standard datasets demonstrate that the proposed approach is of comparable accuracy to the state of the art for online stream segmentation and classification when all performed actions are known and performs considerably better in the presence of unmodeled actions.
    BibTeX:
    @article{Kosmopoulos2017,
      author = {Kosmopoulos, Dimitrios and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {A framework for online segmentation and classification of modeled actions performed in the context of unmodeled ones},
      journal = {IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)},
      year = {2017},
      month = {December},
      volume = {27},
      pages = {2578--2590},
      url = {http://users.ics.forth.gr/~argyros/res_actions.html},
      projects =  {HOBBIT,ROBOHOW,ACANTO},
      doi = {10.1109/TCSVT.2016.2589678},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_12_IEEETransCSVT_action_recognition.pdf}
    }
    
  197. P. Panteleris and A.A. Argyros, "Back to RGB: 3D tracking of hands and hand-object interactions based on short-baseline stereo", In IEEE International Conference on Computer Vision Workshops (HANDS 2017 - ICCVW 2017), also available at CoRR, arXiv, IEEE, pp. 575-584, Venice, Italy, October 2017.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  198. Abstract: We present a novel solution to the problem of 3D tracking of the articulated motion of human hand(s), possibly in interaction with other objects. The vast majority of contemporary relevant work capitalizes on depth information provided by RGBD cameras. In this work, we show that accurate and efficient 3D hand tracking is possible, even for the case of RGB stereo. A straightforward approach for solving the problem based on such input would be to first recover depth and then apply a state of the art depth-based 3D hand tracking method. Unfortunately, this does not work well in practice because the stereo-based, dense 3D reconstruction of hands is far less accurate than the one obtained by RGBD cameras. Our approach bypasses 3D reconstruction and follows a completely different route: 3D hand tracking is formulated as an optimization problem whose solution is the hand configuration that maximizes the color consistency between the two views of the hand. We demonstrate the applicability of our method for real time tracking of a single hand, of a hand manipulating an object and of two interacting hands. The method has been evaluated quantitatively on standard datasets and in comparison to relevant, state of the art RGBD-based approaches. The obtained results demonstrate that the proposed stereo-based method performs equally well to its RGBD-based competitors, and in some cases, it even outperforms them.
    BibTeX:
    @inproceedings{Panteleris2017,
      author = {Panteleris, Paschalis and Argyros, Antonis A},
      title = {Back to RGB: 3D tracking of hands and hand-object interactions based on short-baseline stereo},
      booktitle = {IEEE International Conference on Computer Vision Workshops (HANDS 2017 - ICCVW 2017), also available at CoRR, arXiv},
      publisher = {IEEE},
      year = {2017},
      month = {October},
      pages = {575--584},
      address = {Venice, Italy},
      url = {http://openaccess.thecvf.com/content_ICCV_2017_workshops/w11/html/Panteleris_Back_to_RGB_ICCV_2017_paper.html},
      projects =  {Co4Robots,WEARHAP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_10_HANDS_StereoHandTracking.pdf},
      videolink = {https://youtu.be/yf3TKCjLD00}
    }
    
  199. K. Roditakis, A. Makris and A.A. Argyros, "Generative 3D Hand Tracking with Spatially Constrained Pose Sampling", In British Machine Vision Conference (BMVC 2017), BMVA, London, UK, September 2017.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  200. Abstract: We present a method for 3D hand tracking that exploits spatial constraints in the form of end effector (fingertip) locations. The method follows a generative, hypothesize-and-test approach and uses a hierarchical particle filter to track the hand. In contrast to state of the art methods that consider spatial constraints in a soft manner, the proposed approach enforces constraints during the hand pose hypothesis generation phase by sampling in the Reachable Distance Space (RDS). This sampling produces hypotheses that respect both the hands' dynamics and the end effector locations. The data likelihood term is calculated by measuring the discrepancy between the rendered 3D model and the available observations. Experimental results on challenging, ground truth-annotated sequences containing severe hand occlusions demonstrate that the proposed approach outperforms the state of the art in hand tracking accuracy.
    BibTeX:
    @inproceedings{Roditakis2017,
      author = {Roditakis, Konstantinos and Makris, Alexandros and Argyros, Antonis A},
      title = {Generative 3D Hand Tracking with Spatially Constrained Pose Sampling},
      booktitle = {British Machine Vision Conference (BMVC 2017)},
      publisher = {BMVA},
      year = {2017},
      month = {September},
      address = {London, UK},
      url = {http://users.ics.forth.gr/~argyros/res_handRDS.html},
      projects =  {WEARHAP, CO4ROBOTS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_09_BMVC_RDSRoditak.pdf},
      videolink = {https://youtu.be/DdXA-fslgpI}
    }
    
  201. C. Hernandez-Matas, X. Zabulis and A.A. Argyros, "An Experimental Evaluation of the Accuracy of Keypoints-based Retinal Image Registration", In IEEE Engineering in Medicine and Biology Conference (EMBC 2017), IEEE, pp. 377-381, Jeju Island, S. Korea, July 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  202. Abstract: This work regards an investigation of the accuracy of a state-of-the-art, keypoint-based retinal image registration approach, as to the type of keypoint features used to guide the registration process. The employed registration approach is a local method that incorporates the notion of a 3D retinal surface imaged from different viewpoints and has been shown, experimentally, to be more accurate than competing approaches. The correspondences obtained between SIFT, SURF, Harris-PIIFD and vessel bifurcations are studied, either individually or in combinations. The combination of SIFT features with vessel bifurcations was found to perform better than other combinations or any individual feature type, alone. The registration approach is also comparatively evaluated against representative methods of the state-of-the-art in retinal image registration, using a benchmark dataset that covers a broad range of cases regarding the overlap of the acquired images and the anatomical characteristics of the imaged retinas.
    BibTeX:
    @inproceedings{Hernandez-Matas2017,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Argyros, Antonis A},
      title = {An Experimental Evaluation of the Accuracy of Keypoints-based Retinal Image Registration},
      booktitle = {IEEE Engineering in Medicine and Biology Conference (EMBC 2017)},
      publisher = {IEEE},
      year = {2017},
      month = {July},
      pages = {377--381},
      address = {Jeju Island, S. Korea},
      url = {https://doi.org/10.1109/EMBC.2017.8036841},
      projects =  {REVAMMAD},
      doi = {10.1109/EMBC.2017.8036841},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_07_EMBC_cues.pdf}
    }
    
  203. K. Papoutsakis, C. Panagiotakis and A.A. Argyros, "Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos", In IEEE Computer Vision and Pattern Recognition (CVPR 2017), IEEE, pp. 2146-2155, Honolulu, Hawaii, USA, July 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  204. Abstract: Given two action sequences, we are interested in spotting/co-segmenting all pairs of sub-sequences that represent the same action. We propose a totally unsupervised solution to this problem. No a-priori model of the actions is assumed to be available. The number of common subsequences may be unknown. The sub-sequences can be located anywhere in the original sequences, may differ in duration and the corresponding actions may be performed by a different person, in different style. We treat this type of temporal action co-segmentation as a stochastic optimization problem that is solved by employing Particle Swarm Optimization (PSO). The objective function that is minimized by PSO capitalizes on Dynamic Time Warping (DTW) to compare two action sequences. Due to the generic problem formulation and solution, the proposed method can be applied to motion capture (i.e., 3D skeletal) data or to conventional RGB video data acquired in the wild. We present extensive quantitative experiments on several standard, ground truthed datasets. The obtained results demonstrate that the proposed method achieves a remarkable increase in co-segmentation quality compared to all tested existing state of the art methods.
    BibTeX:
    @inproceedings{Papoutsakis2017,
      author = {Papoutsakis, Konstantinos and Panagiotakis, Costas and Argyros, Antonis A},
      title = {Temporal Action Co-Segmentation in 3D Motion Capture Data and Videos},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2017)},
      publisher = {IEEE},
      year = {2017},
      month = {July},
      pages = {2146--2155},
      address = {Honolulu, Hawaii, USA},
      url = {http://www.ics.forth.gr/cvrl/evaco/},
      projects =  {ACANTO, Co4Robots},
      doi = {10.1109/CVPR.2017.231},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_07_CVPR_cosegmentation.pdf},
      videolink = {https://youtu.be/YH3QdthVDvQ}
    }
    
  205. D. Michel, A. Qammaz and A.A. Argyros, "Markerless 3D Human Pose Estimation and Tracking based on RGBD Cameras: an Experimental Evaluation", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2017), ACM, pp. 115-122, Rhodes, Greece, June 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  206. Abstract: We present a comparative experimental evaluation of three methods that estimate the 3D position, orientation and articulation of the human body from markerless visual observations obtained by RGBD cameras. The evaluated methods are representatives of three broad 3D human pose estimation/tracking methods. Specifically, the first is the discriminative approach adopted by OpenNI. The second is a hybrid approach that depends on the input of two synchronized and extrinsically calibrated RGBD cameras. Finally, the third one is a recently developed generative method that depends on input provided by a single RGBD camera. The experimental evaluation of these methods has been based on a publicly available data set that is annotated with ground truth. The obtained results expose the characteristics of the three methods and provide evidence that can guide the selection of the most appropriate one depending on the requirements of a certain application domain.
    BibTeX:
    @inproceedings{Michel2017,
      author = {Michel, Damien and Qammaz, Ammar and Argyros, Antonis A},
      title = {Markerless 3D Human Pose Estimation and Tracking based on RGBD Cameras: an Experimental Evaluation},
      booktitle = {International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2017)},
      publisher = {ACM},
      year = {2017},
      month = {June},
      pages = {115--122},
      address = {Rhodes, Greece},
      url = {http://users.ics.forth.gr/~argyros/res_fhbt.html},
      projects =  {WEARHAP, ACANTO},
      doi = {10.1145/3056540.3056543},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_06_PETRA_skeletons.pdf},
      videolink = {http://cvrlcode.ics.forth.gr/projects/fhbt/}
    }
    
  207. C. Hernandez-Matas, X. Zabulis, A. Triantafyllou, P. Anyfanti and A.A. Argyros, "Retinal Image Registration under the Assumption of a Spherical Eye", Computerized Medical Imaging and Graphics, Special Issue on Ophthalmic Medical Image Analysis, Elsevier, vol. 55, pp. 95-105, January 2017.
    [Abstract] [BibTeX] [DOI] [PDF]

  208. Abstract: We propose a method for registering a pair of retinal images. The proposed approach employs point correspondences and assumes that the human eye has a spherical shape. The image registration problem is formulated as a 3D pose estimation problem, solved by estimating the rigid transformation that relates the views from which the two images were acquired. Given this estimate, each image can be warped upon the other so that pixels with the same coordinates image the same retinal point. Extensive experimental evaluation shows improved accuracy over state of the art methods, as well as robustness to noise and spurious keypoint matches. Experiments also indicate the method's applicability to the comparative analysis of images from different examinations that may exhibit changes and its applicability to diagnostic support.
    BibTeX:
    @article{Hernandez-Matas2017b,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Triantafyllou, Areti and Anyfanti, Panagiota and Argyros, Antonis A},
      title = {Retinal Image Registration under the Assumption of a Spherical Eye},
      journal = {Computerized Medical Imaging and Graphics, Special Issue on Ophthalmic Medical Image Analysis},
      publisher = {Elsevier},
      year = {2017},
      month = {January},
      volume = {55},
      pages = {95--105},
      projects =  {REVAMMAD},
      doi = {10.1016/j.compmedimag.2016.06.006},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_journal_SMIG_CarlosSpherical.pdf}
    }
    
  209. C. Hernandez-Matas, X. Zabulis, A. Triantafyllou, P. Anyfanti, S. Douma and A.A. Argyros, "FIRE: Fundus Image Registration Dataset", Journal for Modeling in Ophthalmology, Kugler Publications, vol. 1, no. 4, pp. 16-28, 2017.
    [Abstract] [BibTeX] [PDF] [URL]

  210. Abstract: Purpose: Retinal image registration is a useful tool for medical professionals. However, accuracy evaluation of registration methods has not been consistently assessed in the literature. To address that, a dataset comprised of retinal image pairs annotated with ground truth and an evaluation protocol for registration methods is proposed. Methods: The dataset comprises 134 retinal fundus image pairs. These pairs are classified into three categories, according to characteristics that are relevant to indicative registration applications. Such characteristics are the degree of overlap between images and the presence/absence of anatomical differences. Ground truth in the form of corresponding image points and a protocol to evaluate registration accuracy are provided. Results: Using the aforementioned protocol it is shown that FIRE enables quantitative and comparative evaluation of retinal registration methods under a variety of conditions. Conclusion: This work enables the fair comparison of retinal registration methods. It also helps researchers to select the registration method that is most appropriate given a specific target use.
    BibTeX:
    @article{Hernandez-Matas2017a,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Triantafyllou, Areti and Anyfanti, Panagiota and Douma, S and Argyros, Antonis A},
      title = {FIRE: Fundus Image Registration Dataset},
      journal = {Journal for Modeling in Ophthalmology},
      publisher = {Kugler Publications},
      year = {2017},
      volume = {1},
      number = {4},
      pages = {16--28},
      url = {http://www.ics.forth.gr/cvrl/fire/},
      projects =  {REVAMMAD},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_journal_JMO_firedataset.pdf}
    }
    
  211. P. Panteleris and A. Argyros, "Back to RGB: 3D tracking of hands and hand-object interactions based on short-baseline stereo", CoRR, arXiv, 2017.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  212. Abstract: We present a novel solution to the problem of 3D tracking of the articulated motion of human hand(s), possibly in interaction with other objects. The vast majority of contemporary relevant work capitalizes on depth information provided by RGBD cameras. In this work, we show that accurate and efficient 3D hand tracking is possible, even for the case of RGB stereo. A straightforward approach for solving the problem based on such input would be to first recover depth and then apply a state of the art depth-based 3D hand tracking method. Unfortunately, this does not work well in practice because the stereo-based, dense 3D reconstruction of hands is far less accurate than the one obtained by RGBD cameras. Our approach bypasses 3D reconstruction and follows a completely different route: 3D hand tracking is formulated as an optimization problem whose solution is the hand configuration that maximizes the color consistency between the two views of the hand. We demonstrate the applicability of our method for real time tracking of a single hand, of a hand manipulating an object and of two interacting hands. The method has been evaluated quantitatively on standard datasets and in comparison to relevant, state of the art RGBD-based approaches. The obtained results demonstrate that the proposed stereo-based method performs equally well to its RGBD-based competitors, and in some cases, it even outperforms them.
    BibTeX:
    @arxivarticle{1705.05301,
      author = {Paschalis Panteleris and Antonis Argyros},
      title = {Back to RGB: 3D tracking of hands and hand-object interactions based on short-baseline stereo},
      journal = {CoRR, arXiv},
      year = {2017},
      url = {http://arxiv.org/abs/1705.05301},
      projects =  {Co4Robots,WEARHAP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2017_10_HANDS_StereoHandTracking.pdf},
      videolink = {https://youtu.be/yf3TKCjLD00}
    }
    
  213. P. Panteleris, I. Oikonomidis and A. Argyros, "Using a single RGB frame for real time 3D hand pose estimation in the wild", CoRR, arXiv, 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  214. Abstract: We present a method for the real-time estimation of the full 3D pose of one or more human hands using a single commodity RGB camera. Recent work in the area has displayed impressive progress using RGBD input. However, since the introduction of RGBD sensors, there has been little progress for the case of monocular color input. We capitalize on the latest advancements of deep learning, combining them with the power of generative hand pose estimation techniques to achieve real-time monocular 3D hand pose estimation in unrestricted scenarios. More specifically, given an RGB image and the relevant camera calibration information, we employ a state-of-the-art detector to localize hands. Given a crop of a hand in the image, we run the pretrained network of OpenPose for hands to estimate the 2D location of hand joints. Finally, non-linear least-squares minimization fits a 3D model of the hand to the estimated 2D joint positions, recovering the 3D hand pose. Extensive experimental results provide comparison to the state of the art as well as qualitative assessment of the method in the wild.
    BibTeX:
    @arxivarticle{1712.03866,
      author = {Paschalis Panteleris and Iason Oikonomidis and Antonis Argyros},
      title = {Using a single RGB frame for real time 3D hand pose estimation in the wild},
      journal = {CoRR, arXiv},
      year = {2017},
      url = {http://arxiv.org/abs/1712.03866},
      projects =  {Co4Robots, WEARHAP},
      doi = {10.1109/WACV.2018.00054},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2018_03_WACV_rgbmonohand.pdf},
      videolink = {https://youtu.be/VoWAmtga9fg}
    }
    
  215. S. Yuan, G. Garcia-Hernando, B. Stenger, G. Moon, J.Y. Chang, K.M. Lee, P. Molchanov, J. Kautz, S. Honari, L. Ge, J. Yuan, X. Chen, G. Wang, F. Yang, K. Akiyama, Y. Wu, Q. Wan, M. Madadi, S. Escalera, S. Li, D. Lee, I. Oikonomidis, A. Argyros and T.-K. Kim, "Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals", CoRR, Arxiv, 2017.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  216. Abstract: In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And, what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. We analyze the performance of different CNN structures with regard to hand shape, joint visibility, view point and articulation distributions. Our findings include: (1) isolated 3D hand pose estimation achieves low mean errors (10 mm) in the view point range of [70, 120] degrees, but it is far from being solved for extreme view points; (2) 3D volumetric representations outperform 2D CNNs, better capturing the spatial structure of the depth data; (3) Discriminative methods still generalize poorly to unseen hand shapes; (4) While joint occlusions pose a challenge for most methods, explicit modeling of structure constraints can significantly narrow the gap between errors on visible and occluded joints.
    BibTeX:
    @arxivarticle{1712.03917,
      author = {Shanxin Yuan and Guillermo Garcia-Hernando and Bjorn Stenger and Gyeongsik Moon and Ju Yong Chang and Kyoung Mu Lee and Pavlo Molchanov and Jan Kautz and Sina Honari and Liuhao Ge and Junsong Yuan and Xinghao Chen and Guijin Wang and Fan Yang and Kai Akiyama and Yang Wu and Qingfu Wan and Meysam Madadi and Sergio Escalera and Shile Li and Dongheui Lee and Iason Oikonomidis and Antonis Argyros and Tae-Kyun Kim},
      title = {Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals},
      journal = {CoRR, Arxiv},
      year = {2017},
      url = {http://arxiv.org/abs/1712.03917},
      projects =  {Co4Robots},
      doi = {10.1109/CVPR.2018.00279},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2018_06_CVPR_HandPose.pdf}
    }
    
  217. S. Ciotti, E. Battaglia, I. Oikonomidis, A. Makris, A. Tsoli, A. Bicchi, A.A. Argyros and M. Bianchi, "Synergy-driven Performance Enhancement of Vision-based 3D Hand Pose Reconstruction", In International Conference on Wireless Mobile Communication and Healthcare (MobiHealth 2016), special session on advances in soft wearable technology for mobile-health, EAI, pp. 328-336, Milan, Italy, November 2016.
    [Abstract] [BibTeX] [DOI] [PDF]

  218. Abstract: In this work we propose, for the first time, to improve the performance of a hand pose reconstruction (HPR) technique from RGBD camera data, which is affected by self-occlusions, leveraging upon postural synergy information, i.e., a priori information on how humans most commonly use and shape their hands in everyday life tasks. More specifically, in our approach, we ignore joint angle values estimated with low confidence through a vision-based HPR technique and fuse synergistic information with such incomplete measures. Preliminary experiments are reported showing the effectiveness of the proposed integration.
    BibTeX:
    @inproceedings{Ciotti2016,
      author = {Ciotti, Simone and Battaglia, Edoardo and Oikonomidis, Iason and Makris, Alexandros and Tsoli, Aggeliki and Bicchi, Antonio and Argyros, Antonis A and Bianchi, Matteo},
      title = {Synergy-driven Performance Enhancement of Vision-based 3D Hand Pose Reconstruction},
      booktitle = {International Conference on Wireless Mobile Communication and Healthcare (MobiHealth 2016), special session on advances in soft wearable technology for mobile-health},
      publisher = {EAI},
      year = {2016},
      month = {November},
      pages = {328--336},
      address = {Milan, Italy},
      projects =  {WEARHAP},
      doi = {10.1007/978-3-319-58877-3},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_11_MobiHealth_synergies.pdf}
    }
    
  219. A. Tsoli and A.A. Argyros, "Tracking deformable surfaces that undergo topological changes using an RGB-D camera", In International Conference on 3D Vision (3DV 2016), pp. 333-341, Stanford University, California, USA, October 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  220. Abstract: We present a method for 3D tracking of deformable surfaces with dynamic topology, for instance a paper that undergoes cutting or tearing. Existing template-based methods assume a template of fixed topology. Thus, they fail in tracking deformable objects that undergo topological changes. In our work, we employ a dynamic template (3D mesh) whose topology evolves based on the topological changes of the observed geometry. Our tracking framework deforms the defined template based on three types of constraints (a) the surface of the template has to be registered to the 3D shape of the tracked surface, (b) the template deformation should respect feature (SIFT) correspondences between selected pairs of frames and, (c) the lengths of the template edges should be preserved. The latter constraint is relaxed when an edge is found to lie on a “geometric gap”, that is, when a significant depth discontinuity is detected along this edge. The topology of the template is updated on the fly by removing overstretched edges that lie on a geometric gap. The proposed method has been evaluated quantitatively and qualitatively in both synthetic and real sequences of monocular RGB-D views of surfaces that undergo various types of topological changes. The obtained results show that our approach tracks effectively objects with evolving topology and outperforms state of the art methods in tracking accuracy.
    BibTeX:
    @inproceedings{Tsoli2016,
      author = {Tsoli, Aggeliki and Argyros, Antonis A},
      title = {Tracking deformable surfaces that undergo topological changes using an RGB-D camera},
      booktitle = {International Conference on 3D Vision (3DV 2016)},
      year = {2016},
      month = {October},
      pages = {333--341},
      address = {Stanford University, California, USA},
      url = {http://www.ics.forth.gr/cvrl/tearing},
      projects =  {WEARHAP,ROBOHOW},
      doi = {10.1109/3DV.2016.42},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_10_3dv_tearing.pdf},
      videolink = {https://youtu.be/Hxa7nKUvsso}
    }
    
  221. G. Karvounas, I. Oikonomidis and A.A. Argyros, "Localizing Periodicity in Time Series and Videos", In British Machine Vision Conference (BMVC 2016), BMVA, York, UK, September 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  222. Abstract: Periodicity detection is a problem that has received a lot of attention, thus several important tools exist to analyse purely periodic signals. However, in many real world scenarios (time series, videos of human activities, etc) periodic signals appear in the context of non-periodic ones. In this work we propose a method that, given a time series representing a periodic signal that has a non-periodic prefix and tail, estimates the start, the end and the period of the periodic part of the signal. We formulate this as an optimization problem that is solved based on evolutionary optimization techniques. Quantitative experiments on synthetic data demonstrate that the proposed method is successful in localizing the periodic part of a signal and exhibits robustness in the presence of noisy measurements. Also, it does so even when the periodic part of the signal is too short compared to its non-periodic prefix and tail. We also provide quantitative and qualitative results obtained from the application of the proposed method to the problem of unsupervised localization and segmentation of periodic activities in real world videos.
    BibTeX:
    @inproceedings{Karvounas2016,
      author = {Karvounas, Giorgos and Oikonomidis, Iason and Argyros, Antonis A},
      title = {Localizing Periodicity in Time Series and Videos},
      booktitle = {British Machine Vision Conference (BMVC 2016)},
      publisher = {BMVA},
      year = {2016},
      month = {September},
      address = {York, UK},
      url = {http://users.ics.forth.gr/~argyros/res_periodicity.html},
      projects =  {ACANTO},
      doi = {10.5220/0005800300450052},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_09_bmvc_periodicity.pdf},
      videolink = {https://youtu.be/2oa0Y9znH6g}
    }
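
    The optimization problem described by Karvounas et al. (BMVC 2016, above) can be mimicked on a toy signal: score a candidate (start, end, period) by how well the windowed signal repeats with that period, and search the three parameters with an evolutionary optimizer. The cost function and constants below are simplified stand-ins chosen for this sketch, not the paper's objective.

      import numpy as np
      from scipy.optimize import differential_evolution

      def periodicity_cost(params, x):
          # Penalize mismatch between x[t] and x[t + T] inside [start, end); mildly favor long windows.
          start, end, period = (int(round(v)) for v in params)
          if period < 2 or start < 0 or end > len(x) or end - start < 2 * period:
              return 1e6  # infeasible candidate
          seg = x[start:end]
          mismatch = np.mean((seg[:-period] - seg[period:]) ** 2)
          return mismatch + 0.001 * (len(x) - (end - start))

      # Synthetic signal: non-periodic prefix, a sine of period 25 samples, non-periodic tail.
      rng = np.random.default_rng(0)
      x = np.concatenate([rng.normal(0, 1, 120),
                          np.sin(2 * np.pi * np.arange(250) / 25.0),
                          rng.normal(0, 1, 80)])

      bounds = [(0, len(x) - 1), (1, len(x)), (2, 100)]  # start, end, period
      result = differential_evolution(periodicity_cost, bounds, args=(x,), seed=0)
      print("estimated (start, end, period):", [int(round(v)) for v in result.x])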
    
  223. C. Hernandez-Matas, X. Zabulis and A.A. Argyros, "Retinal Image Registration Through Simultaneous Camera Pose and Eye Shape Estimation", In IEEE Engineering in Medicine and Biology Conference (EMBC 2016), IEEE, pp. 3247-3251, Orlando, Florida, USA, August 2016.
    [Abstract] [BibTeX] [DOI] [PDF]

  224. Abstract: In this paper, a retinal image registration method is proposed. The approach utilizes keypoint correspondences and assumes that the human eye has a spherical or ellipsoidal shape. The image registration problem amounts to solving a camera 3D pose estimation problem and, simultaneously, an eye 3D shape estimation problem. The camera pose estimation problem is solved by estimating the relative pose between the views from which the images were acquired. The eye shape estimation problem parameterizes the shape and orientation of an ellipsoidal model for the eye. Experimental evaluation shows 17.91% reduction of registration error and 47.52% reduction of the error standard deviation over state of the art methods.
    BibTeX:
    @inproceedings{Hernandez-Matas2016,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Retinal Image Registration Through Simultaneous Camera Pose and Eye Shape Estimation},
      booktitle = {IEEE Engineering in Medicine and Biology Conference (EMBC 2016)},
      publisher = {IEEE},
      year = {2016},
      month = {August},
      pages = {3247--3251},
      address = {Orlando, Florida, USA},
      projects =  {REVAMMAD},
      doi = {10.1109/EMBC.2016.7591421},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_08_embc_ellipsoidal.pdf}
    }
    
  225. G. Park, A.A. Argyros and W. Woo, "Efficient 3D Hand Tracking in Articulation Subspaces for the Manipulation of Virtual Objects", In Computer Graphics International (CGI 2016), ACM, pp. 33-36, Hersonissos, Crete, Greece, June 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [VIDEO]

  226. Abstract: We propose an efficient method for model-based 3D tracking of hand articulations observed from an egocentric viewpoint that aims at supporting the manipulation of virtual objects. Previous model-based approaches optimize non-convex objective functions defined in the 26 Degrees of Freedom (DoFs) space of possible hand articulations. In our work, we decompose this space into six lower dimensional spaces (6 DoFs for the palm and 4 DoFs for each finger). We also label each finger with a Gaussian model that is propagated between successive image frames. As confirmed by a number of experiments, this divide-and-conquer approach tracks hand articulations more accurately than existing model-based approaches. At the same time, real time performance is achieved without the need of GPGPU processing. Additional experiments show that the proposed approach is preferable for supporting the accurate manipulation of virtual objects in VR/AR scenarios.
    BibTeX:
    @inproceedings{Park2016,
      author = {Park, Gabyong and Argyros, Antonis A and Woo, Woontack},
      title = {Efficient 3D Hand Tracking in Articulation Subspaces for the Manipulation of Virtual Objects},
      booktitle = {Computer Graphics International (CGI 2016)},
      publisher = {ACM},
      year = {2016},
      month = {June},
      pages = {33--36},
      address = {Hersonissos, Crete, Greece},
      projects =  {WEARHAP},
      doi = {10.1145/2949035.2949044},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_06_CGI_park.pdf},
      videolink = {https://youtu.be/9sU2HAvnCKQ}
    }
    
  227. C. Panagiotakis and A.A. Argyros, "Parameter-free modelling of 2D shapes with ellipses", Pattern Recognition, Elsevier, vol. 53, pp. 259-275, May 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  228. Abstract: Our goal is to represent a given 2D shape with an automatically determined number of ellipses, so that the total area covered by the ellipses is equal to the area of the original shape without any assumption or prior knowledge about the object structure. To solve this interesting theoretical problem, first we employ the skeleton of the 2D shape which provides important information on the parameters of the ellipses that could approximate the original shape. For a given number of such ellipses, the hard Expectation-Maximization (EM) algorithm is employed to maximise the shape coverage under the equal area constraint. Different models (i.e., solutions involving different numbers of ellipses) are evaluated based on the Akaike Information Criterion (AIC). This considers a novel, entropy-based shape complexity measure that balances the model complexity and the model approximation error. In order to minimise the AIC criterion, two variants are proposed and evaluated: (a) the augmentative method that gradually increases the number of considered ellipses starting from a single one and, (b) the decremental method that decreases the number of ellipses starting from a large, automatically defined set. The obtained quantitative results on more than 4,000 2D shapes included in standard as well as in custom datasets, quantify the performance of the proposed methods and illustrate that their solutions agree with human intuition.
    BibTeX:
    @article{Panagiotakis2016,
      author = {Panagiotakis, Costas and Argyros, Antonis A},
      title = {Parameter-free modelling of 2D shapes with ellipses},
      journal = {Pattern Recognition},
      publisher = {Elsevier},
      year = {2016},
      month = {May},
      volume = {53},
      pages = {259--275},
      url = {https://sites.google.com/site/costaspanagiotakis/research/EFA},
      projects =  {WEARHAP},
      doi = {10.1016/j.patcog.2015.11.004},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_journal_pr_ellipses.pdf}
    }
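
    The model-selection idea in Panagiotakis and Argyros (Pattern Recognition 2016, above) — adding components only while the information criterion improves — can be illustrated loosely with scikit-learn's Gaussian mixtures and their built-in AIC. This analogy ignores the paper's equal-area constraint and entropy-based complexity term; it only shows AIC-driven selection of the number of components.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      # Points drawn from two elongated blobs, a crude stand-in for the pixels of a 2D shape.
      rng = np.random.default_rng(1)
      points = np.vstack([rng.normal([0, 0], [3.0, 0.5], size=(400, 2)),
                          rng.normal([8, 4], [0.5, 2.5], size=(400, 2))])

      # Augmentative strategy in spirit: try an increasing number of components, keep the AIC-best.
      best_k, best_aic = None, np.inf
      for k in range(1, 7):
          aic = GaussianMixture(n_components=k, random_state=0).fit(points).aic(points)
          if aic < best_aic:
              best_k, best_aic = k, aic
      print("AIC-selected number of components:", best_k)  # 2 for this toy example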
    
  229. P. Panteleris and A.A. Argyros, "Monitoring and Interpreting Human Motion to Support Clinical Applications of a Smart Walker", In Workshop on Human Motion Analysis for Healthcare Applications (HMAHA 2016), IET, London, UK, May 2016.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  230. Abstract: We present two proof-of-concept applications of human motion perception technologies for automating clinical tests performed by the FriWalk smart walker. FriWalk is a robotic walker currently being developed in the context of the EU H2020 project ACANTO that is designed to operate in a personal or a clinical mode. The goal of the clinical FriWalk is to support/automate clinical tests and rehabilitation of patients with mobility problems.
    BibTeX:
    @inproceedings{Panteleris2016,
      author = {Panteleris, Paschalis and Argyros, Antonis A},
      title = {Monitoring and Interpreting Human Motion to Support Clinical Applications of a Smart Walker},
      booktitle = {Workshop on Human Motion Analysis for Healthcare Applications (HMAHA 2016)},
      publisher = {IET},
      year = {2016},
      month = {May},
      address = {London, UK},
      url = {http://users.ics.forth.gr/~argyros/res_fhbt.html},
      projects =  {ACANTO},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_05_IETWorkshop_acanto.pdf},
      videolink = {https://youtu.be/YkyJsIW7FVM}
    }
    
  231. M. Foukarakis, I. Adami, D. Ioannidi, A. Leonidis, D. Michel, A. Qammaz, K. Papoutsakis, M. Antona and A.A. Argyros, "A Robot-based Application for Physical Exercise Training", In International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE 2016), Scitepress, pp. 45-52, Rome, Italy, April 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  232. Abstract: According to studies, performing physical exercise is beneficial for reducing the risk of falling in the elderly and prolonging their stay at home. In addition, regular exercising helps cognitive function and increases positive behaviour for seniors with cognitive impairment and dementia. In this paper, a fitness application integrated into a service robot is presented. Its aim is to motivate the users to perform physical training by providing relevant exercises and useful feedback on their progress. The application utilizes the robot vision system to track and recognize user movements and activities and supports multimodal interaction with the user. The paper describes the design challenges, the system architecture, the user interface and the human motion capturing module. Additionally, it discusses some results from user testing in laboratory and home-based trials.
    BibTeX:
    @inproceedings{Foukarakis2016,
      author = {Foukarakis, Michael and Adami, Ilia and Ioannidi, Danae and Leonidis, Asterios and Michel, Damien and Qammaz, Ammar and Papoutsakis, Konstantinos and Antona, Margherita and Argyros, Antonis A},
      title = {A Robot-based Application for Physical Exercise Training},
      booktitle = {International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE 2016)},
      publisher = {Scitepress},
      year = {2016},
      month = {April},
      pages = {45--52},
      address = {Rome, Italy},
      url = {http://users.ics.forth.gr/~argyros/res_fhbt.html},
      projects =  {HOBBIT},
      doi = {10.5220/0005800300450052},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_04_ICT4AWE_fitness.pdf}
    }
    
  233. R. Jimenez, M. Patino, V. Vianello, I. Brondino, R. Vilaca, J. Teixeira, M. Biscaia, G. Drossis, D. Michel, C. Birliraki, G. Margetis, A. Argyros, C. Stephanidis, L. Sgaglione, G. Papale, G. Mazzeo, F. Campanile, M. Sole, V. Muntés-Mulero, D. Solans, A. Huelamo, P. Kranas, D. Varvarigou, V. Moulos and F. Aisopos, "Scalable and Efficient Big Data Analytics - The LeanBigData Approach", In European Space project on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges, SciTePress, pp. 92-111, Rome, Italy, April 2016.
    [Abstract] [BibTeX] [DOI] [PDF]

  234. Abstract: One of the major problems in enterprise data management lies in the separation between operational databases and data warehouses. This separation is motivated by the different capabilities of OLTP and OLAP data management systems. Due to this separation, copies from the operational databases to the data warehouses must be performed periodically. These copies are performed by a process called Extract-Transform-Load (ETL) that turns out to amount to 80% of the budget of performing business analytics. LeanBigData's main goal has been to address this major pain point by providing a real-time big data platform offering both functions, OLTP and OLAP, in a single data management solution. The way to achieve this goal has been to leverage an ultra-scalable OLTP database, LeanXcale, and develop a new OLAP engine that works directly over the operational data. The platform is based on a novel storage engine that provides extreme levels of efficiency. The platform also has an integrated parallel-distributed CEP that scales the processing of streaming data and that can be combined with the processing of data at rest in the new OLTP+OLAP database to address a wide variety of data management problems. LeanBigData has a bigger vision and aims at providing an end-to-end analytics platform. This platform provides a visual workbench that enables data scientists to discover new insights. The platform is also enriched with a subsystem that performs anomaly detection and root cause analysis, works with the newly developed system, and enables this analysis to be performed over streaming data. The LeanBigData platform has been validated by four real-world use case scenarios: cloud data centre monitoring, fraud detection in direct debit operations, sentiment analysis in social networks, and targeted advertisement.
    BibTeX:
    @inproceedings{lbd2016,
      author = {Ricardo Jimenez and Marta Patino and Valerio Vianello and Ivan Brondino and Ricardo Vilaca and Jorge Teixeira and Miguel Biscaia and Giannis Drossis and Damien Michel and Chryssi Birliraki and George Margetis and Antonis Argyros and Constantine Stephanidis and Luigi Sgaglione and Gaetano Papale and Giavanni Mazzeo and Ferdinando Campanile and Marc Sole and Victor Muntés-Mulero and David Solans and Alberto Huelamo and Pavlos Kranas and Dora Varvarigou and Vrettos Moulos and Fotis Aisopos},
      title = {Scalable and Efficient Big Data Analytics - The LeanBigData Approach},
      booktitle = {European Space project on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges},
      publisher = {SciTePress},
      year = {2016},
      month = {April},
      pages = {92--111},
      address = {Rome, Italy},
      projects =  {LeanBigData},
      doi = {10.5220/0007903100920111},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_04_lbd.pdf}
    }
    
  235. D. Michel and A.A. Argyros, "Apparatuses, methods and systems for recovering a 3-dimensional skeletal model of the human body", United States Patent No 20160086350, Filed: 22 September, 2015, Published: 24 March, 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  236. Abstract: The ARS offers tracking, estimation of position, orientation and full articulation of the human body from marker-less visual observations obtained by a camera, for example an RGBD camera. An ARS may provide hypotheses of the 3D configuration of body parts or the entire body from a single depth frame. The ARS may also propagate estimations of the 3D configuration of body parts and the body by mapping or comparing data from the previous frame and the current frame. The ARS may further compare the estimations and the hypotheses to provide a solution for the current frame. An ARS may select, merge, refine, and/or otherwise combine data from the estimations and the hypotheses to provide a final estimation corresponding to the 3D skeletal data and may apply the final estimation data to capture parameters associated with a moving or still body.
    BibTeX:
    @patent{Michel2016,
      author = {Michel, Damien and Argyros, Antonis A}, 
      title = {Apparatuses, methods and systems for recovering a 3-dimensional skeletal model of the human body}, 
      nationality = {United States}, 
      number = {20160086350}, 
      year = {2016}, 
      month = {March}, 
      day = {24}, 
      yearfiled = {2015}, 
      monthfiled = {September}, 
      dayfiled = {22}, 
      url = {http://users.ics.forth.gr/~argyros/res_fhbt.html}, 
      projects =  {WEARHAP,ROBOHOW,ACANTO},
      doi = {http://www.freepatentsonline.com/y2016/0086350.html},
      pdflink = {http://www.freepatentsonline.com/20160086350.pdf},
      videolink = {https://youtu.be/ZKlC9PA1IDg}
    }
    
  237. D. Michel, K. Papoutsakis and A.A. Argyros, "Gesture Recognition Apparatuses, Methods and Systems for Human-Machine Interaction", United States Patent No 20160078289, Filed: 16 September, 2015, Published: 17 March, 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  238. Abstract: The GESTURE RECOGNITION APPARATUSES, METHODS AND SYSTEMS FOR HUMAN-MACHINE INTERACTION (GRA) discloses vision-based gesture recognition. GRA can be implemented in any application involving tracking, detection and/or recognition of gestures or motion in general. Disclosed methods and systems consider a gestural vocabulary of a predefined number of user specified static and/or dynamic hand gestures that are mapped with a database to convey messages. In one implementation, the disclosed systems and methods support gesture recognition by detecting and tracking body parts, such as arms, hands and fingers, and by performing spatio-temporal segmentation and recognition of the set of predefined gestures, based on data acquired by an RGBD sensor. In one implementation, a model of the hand is employed to detect hand and finger candidates. At a higher level, hand posture models are defined and serve as building blocks to recognize gestures.
    BibTeX:
    @patent{Michel2016a,
      author = {Michel, Damien and Papoutsakis, Konstantinos and Argyros, Antonis A}, 
      title = {Gesture Recognition Apparatuses, Methods and Systems for Human-Machine Interaction}, 
      nationality = {United States}, 
      number = {20160078289}, 
      year = {2016}, 
      month = {March}, 
      day = {17}, 
      yearfiled = {2015}, 
      monthfiled = {September}, 
      dayfiled = {16}, 
      url = {http://users.ics.forth.gr/~argyros/res_gesturesforHRI.html}, 
      projects =  {ROBOHOW,HOBBIT,ACANTO},
      doi = {http://www.freepatentsonline.com/y2016/0078289.html},
      pdflink = {http://www.freepatentsonline.com/20160078289.pdf},
      videolink = {https://youtu.be/eIIzgjG2V7A}
    }
    
  239. D. Fischinger, P. Einramhof, K. Papoutsakis, W. Wohlkinger, P. Mayer, P. Panek, S. Hofmann, T. Koertner, A. Weiss, A.A. Argyros and others, "Hobbit, a care robot supporting independent living at home: First prototype and lessons learned", Robotics and Autonomous Systems, Elsevier, vol. 75, no. A, pp. 60-78, January 2016.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  240. Abstract: One option to address the challenge of demographic transition is to build robots that enable aging in place. Falling has been identified as the most relevant factor to cause a move to a care facility. The Hobbit project combines research from robotics, gerontology, and human–robot interaction to develop a care robot which is capable of fall prevention and detection as well as emergency detection and handling. Moreover, to enable daily interaction with the robot, other functions are added, such as bringing objects, offering reminders, and entertainment. The interaction with the user is based on a multimodal user interface including automatic speech recognition, text-to-speech, gesture recognition, and a graphical touch-based user interface. We performed controlled laboratory user studies with a total of 49 participants (aged 70 plus) in three EU countries (Austria, Greece, and Sweden). The collected user responses on perceived usability, acceptance, and affordability of the robot demonstrate a positive reception of the robot from its target user group. This article describes the principles and system components for navigation and manipulation in domestic environments, the interaction paradigm and its implementation in a multimodal user interface, the core robot tasks, as well as the results from the user studies, which are also reflected in terms of lessons we learned and we believe are useful to fellow researchers.
    BibTeX:
    @article{Fischinger2016,
      author = {Fischinger, David and Einramhof, Peter and Papoutsakis, Konstantinos and Wohlkinger, Walter and Mayer, Peter and Panek, Paul and Hofmann, Stefan and Koertner, Tobias and Weiss, Astrid and Argyros, Antonis A and others},
      title = {Hobbit, a care robot supporting independent living at home: First prototype and lessons learned},
      journal = {Robotics and Autonomous Systems},
      publisher = {Elsevier},
      year = {2016},
      month = {January},
      volume = {75},
      number = {A},
      pages = {60--78},
      url = {http://users.ics.forth.gr/~argyros/res_gesturesforHRI.html},
      projects =  {HOBBIT},
      doi = {http://www.sciencedirect.com/science/article/pii/S0921889014002140},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2016_journal_RAS_HOBBIT.pdf}
    }
    
  241. N. Kyriazis and A.A. Argyros, "3D Tracking of Hands Interacting with Several Objects", In IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015), IEEE, Santiago, Chile, November 2015.
    [BibTeX] [PDF] [URL] [VIDEO]

  242. BibTeX:
    @inproceedings{Kyriazis2015,
      author = {Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {3D Tracking of Hands Interacting with Several Objects},
      booktitle = {IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {November},
      address = {Santiago, Chile},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {ROBOHOW,WEARHAP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_12_OUI_mbv.pdf},
      videolink = {https://youtu.be/SCOtBdhDMKg}
    }
    
  243. P. Panteleris, N. Kyriazis and A.A. Argyros, "Recovering 3D models of manipulated objects through 3D tracking of hand-object interaction", In IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015), IEEE, Santiago, Chile, November 2015.
    [BibTeX] [PDF] [URL] [VIDEO]

  244. BibTeX:
    @inproceedings{Panteleris2015,
      author = {Panteleris, Paschalis and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Recovering 3D models of manipulated objects through 3D tracking of hand-object interaction},
      booktitle = {IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {November},
      address = {Santiago, Chile},
      url = {http://users.ics.forth.gr/~argyros/res_handunknownobject.html},
      projects =  {ROBOHOW},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_12_OUI_padeler.pdf},
      videolink = {https://youtu.be/9r43PtJ0Fwg}
    }
    
  245. T.-H. Pham, A. Kheddar, A. Qammaz and A.A. Argyros, "Capturing and Reproducing Hand-Object Interactions Through Vision-Based Force Sensing", In IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015), IEEE, Santiago, Chile, November 2015.
    [BibTeX] [PDF] [URL] [VIDEO]

  246. BibTeX:
    @inproceedings{Pham2015,
      author = {Pham, Tu-Hoa and Kheddar, Abderrahmane and Qammaz, Ammar and Argyros, Antonis A},
      title = {Capturing and Reproducing Hand-Object Interactions Through Vision-Based Force Sensing},
      booktitle = {IEEE International Conference on Computer Vision Workshops (OUI 2015 - ICCVW 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {November},
      address = {Santiago, Chile},
      url = {http://users.ics.forth.gr/~argyros/res_fsv.html},
      projects =  {ROBOHOW},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_12_OUI_FSV.pdf},
      videolink = {https://youtu.be/C4k-FPWM1t0}
    }
    
  247. N. Kyriazis, I. Oikonomidis, P. Panteleris, D. Michel, A. Qammaz, A. Makris, K. Tzevanidis, P. Douvantzis, K. Roditakis and A.A. Argyros, "A Generative Approach to Tracking Hands and Their Interaction with Objects", In Man-Machine Interactions 4 - International Conference on Man-Machine Interactions (ICMMI 2015), Springer, pp. 19-28, Kocierz, Poland, October 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  248. Abstract: Markerless 3D tracking of hands in action or in interaction with objects provides rich information that can be used to interpret a number of human activities. In this paper, we review a number of relevant methods we have proposed. All of them focus on hands, objects and their interaction and follow a generative approach. The major strength of such an approach is the straightforward fashion in which arbitrarily complex priors can be easily incorporated towards solving the tracking problem and their capability to generalize to greater and/or different domains. The proposed generative approach is implemented in a single, unified computational framework.
    BibTeX:
    @inproceedings{Kyriazis2015a,
      author = {Kyriazis, Nikolaos and Oikonomidis, Iason and Panteleris, Paschalis and Michel, Damien and Qammaz, Ammar and Makris, Alexandros and Tzevanidis, Konstantinos and Douvantzis, Petros and Roditakis, Konstantinos and Argyros, Antonis A},
      title = {A Generative Approach to Tracking Hands and Their Interaction with Objects},
      booktitle = {Man-Machine Interactions 4 - International Conference on Man-Machine Interactions (ICMMI 2015)},
      publisher = {Springer},
      year = {2015},
      month = {October},
      pages = {19--28},
      address = {Kocierz, Poland},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {ROBOHOW,WEARHAP},
      doi = {10.1007/978-3-319-23437-3_2},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_10_ICMMI_keynote.pdf}
    }
    
  249. K. Paliouras and A.A. Argyros, "Towards the Automatic Definition of the Objective Function for Model-Based 3D Hand Tracking", In Man-Machine Interactions 4 - International Conference on Man-Machine Interactions (ICMMI 2015), Springer, pp. 353-363, Kocierz, Poland, October 2015.
    [Abstract] [BibTeX] [DOI] [PDF]

  250. Abstract: Recently, model-based approaches have produced very promising results to the problems of 3D hand tracking. The current state of the art method recovers the 3D position, orientation and 20 DOF articulation of a human hand from markerless visual observations obtained by an RGB-D sensor. Hand pose estimation is formulated as an optimization problem, seeking for the hand model parameters that minimize an objective function that quantifies the discrepancy between the appearance of hand hypotheses and the actual hand observation. The design of such a function is a complicated process that requires a lot of prior experience with the problem. In this paper we automate the definition of the objective function in such optimization problems. First, a set of relevant, candidate image features is computed. Then, given synthetic data sets with ground truth information, regression analysis is used to combine these features in an objective function that seeks to maximize optimization performance. Extensive experiments study the performance of the proposed approach based on various dataset generation strategies and feature selection techniques.
    BibTeX:
    @inproceedings{Paliouras2015,
      author = {Konstantinos Paliouras and Argyros, Antonis A},
      title = {Towards the Automatic Definition of the Objective Function for Model-Based 3D Hand Tracking},
      booktitle = {Man-Machine Interactions 4 - International Conference on Man-Machine Interactions (ICMMI 2015)},
      publisher = {Springer},
      year = {2015},
      month = {October},
      pages = {353--363},
      address = {Kocierz, Poland},
      projects =  {ROBOHOW},
      doi = {10.1007/978-3-319-23437-3_30},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_10_ICMMI_auto_objfun.pdf}
    }
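
    The regression step of Paliouras and Argyros (ICMMI 2015, above) — combining candidate discrepancy features into a single objective that predicts how far a hand hypothesis is from the ground truth — can be sketched with an ordinary linear regression on synthetic data. The features, weights and noise model below are invented for illustration; the paper studies several feature-selection and dataset-generation strategies.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(0)

      # Each row: candidate discrepancy features for one (hypothesis, observation) pair,
      # e.g. depth difference, silhouette mismatch, edge distance (placeholder names).
      features = rng.uniform(0, 1, size=(2000, 3))
      # Hypothetical ground-truth pose error that the learned objective should reproduce.
      pose_error = (2.0 * features[:, 0] + 0.5 * features[:, 1] + 0.1 * features[:, 2]
                    + rng.normal(0, 0.05, 2000))

      objective = LinearRegression().fit(features, pose_error)
      print("learned feature weights:", objective.coef_)

      # Using the learned objective: the hypothesis with the lowest predicted error is preferred.
      hypotheses = rng.uniform(0, 1, size=(5, 3))
      print("best hypothesis index:", int(np.argmin(objective.predict(hypotheses))))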
    
  251. A. Makris and A.A. Argyros, "Model-based 3D Hand Tracking with on-line Shape Adaptation", In British Machine Vision Conference (BMVC 2015), BMVA, pp. 77-1, Swansea, UK, September 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  252. Abstract: One of the shortcomings of the existing model-based 3D hand tracking methods is the fact that they consider a fixed hand model, i.e. one with fixed shape parameters. In this work we propose an online model-based method that tackles jointly the hand pose tracking and the hand shape estimation problems. The hand pose is estimated using a hierarchical particle filter. The hand shape is estimated by fitting the shape model parameters over the observations in a frame history. The candidate shapes required by the fitting framework are obtained by optimizing the shape parameters independently in each frame. Extensive experiments demonstrate that the proposed method tracks the pose of the hand and estimates its shape parameters accurately, even under heavy noise and inaccurate shape initialization.
    BibTeX:
    @inproceedings{Makris2015,
      author = {Makris, Alexandros and Argyros, Antonis A},
      title = {Model-based 3D Hand Tracking with on-line Shape Adaptation},
      booktitle = {British Machine Vision Conference (BMVC 2015)},
      publisher = {BMVA},
      year = {2015},
      month = {September},
      pages = {77--1},
      address = {Swansea, UK},
      url = {http://users.ics.forth.gr/~argyros/res_handadaptation.html},
      projects =  {ROBOHOW},
      doi = {10.5244/C.29.77},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_BMVC_adaptivehandtracking.pdf},
      videolink = {https://youtu.be/4dgwoKkDSn8}
    }
    
  253. P. Panteleris, N. Kyriazis and A.A. Argyros, "3D Tracking of Human Hands in Interaction with Unknown Objects", In British Machine Vision Conference (BMVC 2015), BMVA, pp. 123-1, Swansea, UK, September 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  254. Abstract: The analysis and the understanding of object manipulation scenarios based on computer vision techniques can be greatly facilitated if we can gain access to the full articulation of the manipulating hands and the 3D pose of the manipulated objects. Currently, there exist methods for tracking hands in interaction with objects whose 3D models are known. There are also methods that can reconstruct 3D models of objects that are partially observable in each frame of a sequence. However, to the best of our knowledge, no method can track hands in interaction with unknown objects. In this paper we propose such a method. Experimental results show that hand tracking can be achieved with an accuracy that is comparable to the one obtained by methods that assume knowledge of the object models. Additionally, as a by-product, the proposed method delivers accurate 3D models of the manipulated objects.
    BibTeX:
    @inproceedings{Panteleris2015a,
      author = {Panteleris, Paschalis and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {3D Tracking of Human Hands in Interaction with Unknown Objects},
      booktitle = {British Machine Vision Conference (BMVC 2015)},
      publisher = {BMVA},
      year = {2015},
      month = {September},
      pages = {123--1},
      address = {Swansea, UK},
      url = {http://users.ics.forth.gr/~argyros/res_handunknownobject.html},
      projects =  {ROBOHOW},
      doi = {10.5244/C.29.123},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_BMVC_unknownobjects.pdf},
      videolink = {https://youtu.be/9r43PtJ0Fwg}
    }
    
  255. G. Poier, K. Roditakis, S. Schulter, D. Michel, H. Bischof and A.A. Argyros, "Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties", In British Machine Vision Conference (BMVC 2015), also available at CoRR, arXiv, BMVA, pp. 182-1, Swansea, UK, September 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  256. Abstract: Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or missing anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach for hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. This way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from track losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.
    BibTeX:
    @inproceedings{Poier2015,
      author = {Poier, Georg and Roditakis, Konstantinos and Schulter, Samuel and Michel, Damien and Bischof, Horst and Argyros, Antonis A},
      title = {Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties},
      booktitle = {British Machine Vision Conference (BMVC 2015), also available at CoRR, arXiv},
      publisher = {BMVA},
      year = {2015},
      month = {September},
      pages = {182--1},
      address = {Swansea, UK},
      url = {https://lrs.icg.tugraz.at/research/hybridhape/},
      projects =  {WEARHAP},
      doi = {10.5244/C.29.182},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_BMVC_hybrid.pdf}
    }
    
  257. A. Qammaz, N. Kyriazis and A.A. Argyros, "Boosting the Performance of Model-based 3D Tracking by Employing Low Level Motion Cues", In British Machine Vision Conference (BMVC 2015), BMVA, pp. 144-1, Swansea, UK, September 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  258. Abstract: 3D tracking of objects and hands in an object manipulation scenario is a very interesting computer vision problem with a wide variety of applications ranging from consumer electronics to robotics and medicine. Recent advances in this research topic allow for 3D tracking of complex scenarios involving bimanual manipulation of several rigid objects using commodity hardware and with high accuracy. The problem with these approaches is that they treat tracking as a search problem whose dimensionality increases with the number of objects in the scene. This fact typically limits the number of the tracked objects and/or the processing framerate. In this paper we present a method that utilizes simple low level motion cues for dynamically assigning computational resources to parts of the scene where they are actually required. In a series of experiments, we show that this simple idea improves tracking performance dramatically at a cost of only a minor degradation of tracking accuracy.
    BibTeX:
    @inproceedings{Qammaz2015,
      author = {Qammaz, Ammar and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Boosting the Performance of Model-based 3D Tracking by Employing Low Level Motion Cues},
      booktitle = {British Machine Vision Conference (BMVC 2015)},
      publisher = {BMVA},
      year = {2015},
      month = {September},
      pages = {144--1},
      address = {Swansea, UK},
      url = {http://users.ics.forth.gr/~argyros/res_ect.html},
      projects =  {ROBOHOW,WEARHAP},
      doi = {10.5244/C.29.144},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_BMVC_fect.pdf},
      videolink = {https://youtu.be/nPru6PpWrK4}
    }
    
  259. C. Hernandez-Matas, X. Zabulis and A.A. Argyros, "Retinal Image Registration Based on Keypoint Correspondences, Spherical Eye Modeling and Camera Pose Estimation", In IEEE Engineering in Medicine and Biology Conference (EMBC 2015), IEEE, pp. 5650-5654, Milan, Italy, August 2015.
    [Abstract] [BibTeX] [DOI] [PDF]

  260. Abstract: In this work, an image registration method for two retinal images is proposed. The proposed method utilizes keypoint correspondences and assumes a spherical model of the eye. Image registration is treated as a pose estimation problem, which requires estimation of the rigid transformation that relates the two images. Using this estimate, one image can be warped so that it is registered to the coordinate frame of the other. Experimental evaluation shows improved accuracy over state-of-the-art approaches as well as robustness to noise and spurious keypoint correspondences. Experiments also indicate the method's applicability to diagnostic image enhancement and comparative analysis of images from different examinations.
    BibTeX:
    @inproceedings{Hernandez-Matas2015,
      author = {Hernandez-Matas, Carlos and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Retinal Image Registration Based on Keypoint Correspondences, Spherical Eye Modeling and Camera Pose Estimation},
      booktitle = {IEEE Engineering in Medicine and Biology Conference (EMBC 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {August},
      pages = {5650--5654},
      address = {Milan, Italy},
      projects =  {REVAMMAD},
      doi = {10.1109/EMBC.2015.7319674},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_EMBC_RetinalImageRegistration.pdf}
    }
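
    Hernandez-Matas et al. (EMBC 2015, above) solve a relative camera pose under a spherical eye model; as a much simpler stand-in, the surrounding keypoint-matching and warping machinery can be sketched with OpenCV and a planar homography. This is not the paper's pose/shape estimation, and the file names are placeholders.

      import cv2
      import numpy as np

      def register_pair(fixed_path, moving_path):
          # Toy registration: SIFT keypoints + ratio-test matching + RANSAC homography + warping.
          # The paper instead estimates a relative 3D camera pose over a spherical eye model;
          # the homography here is only a planar approximation for illustration.
          fixed = cv2.imread(fixed_path, cv2.IMREAD_GRAYSCALE)
          moving = cv2.imread(moving_path, cv2.IMREAD_GRAYSCALE)

          sift = cv2.SIFT_create()
          kf, df = sift.detectAndCompute(fixed, None)
          km, dm = sift.detectAndCompute(moving, None)

          matcher = cv2.BFMatcher()
          good = [m for m, n in matcher.knnMatch(dm, df, k=2) if m.distance < 0.75 * n.distance]

          src = np.float32([km[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
          dst = np.float32([kf[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
          H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

          # Warp the moving image into the coordinate frame of the fixed one.
          return cv2.warpPerspective(moving, H, fixed.shape[::-1])

      # registered = register_pair("exam1.png", "exam2.png")   # hypothetical file names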
    
  261. K. Roditakis and A.A. Argyros, "Quantifying the Effect of a Colored Glove in the 3D Tracking of a Human Hand", In International Conference on Computer Vision Systems (ICVS 2015), Springer, pp. 404-414, Copenhagen, Denmark, July 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [VIDEO]

  262. Abstract: Research in vision-based 3D hand tracking targets primarily the scenario in which a bare hand performs unconstrained motion in front of a camera system. Nevertheless, in several important application domains, augmenting the hand with color information so as to facilitate the tracking process constitutes an acceptable alternative. With this observation in mind, in this work we propose a modification of a state of the art method [12] for markerless 3D hand tracking, that takes advantage of the richer observations resulting from a colored glove. We do so by modifying the 3D hand model employed in the aforementioned hypothesize-and-test method as well as the objective function that is minimized in its optimization step. Quantitative and qualitative results obtained from a comparative evaluation of the baseline method to the proposed approach confirm that the latter achieves a remarkable increase in tracking accuracy and robustness and, at the same time, reduces drastically the associated computational costs.
    BibTeX:
    @inproceedings{Roditakis2015,
      author = {Konstantinos Roditakis and Argyros, Antonis A},
      title = {Quantifying the Effect of a Colored Glove in the 3D Tracking of a Human Hand},
      booktitle = {International Conference on Computer Vision Systems (ICVS 2015)},
      publisher = {Springer},
      year = {2015},
      month = {July},
      pages = {404--414},
      address = {Copenhagen, Denmark},
      projects =  {WEARHAP},
      doi = {10.1007/978-3-319-20904-3_36},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_07_ICVS_colorglove.pdf},
      videolink = {https://youtu.be/9nkHIgFYtzE}
    }
    
  263. A. Makris, N. Kyriazis and A.A. Argyros, "Hierarchical particle filtering for 3d hand tracking", In IEEE Computer Vision and Pattern Recognition Workshops (HANDS 2015 - CVPRW 2015), IEEE, pp. 8-17, Boston, USA, June 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  264. Abstract: We present a fast and accurate 3D hand tracking method which relies on RGB-D data. The method follows a model based approach using a hierarchical particle filter variant to track the model's state. The filter estimates the probability density function of the state's posterior. As such, it has increased robustness to observation noise and compares favourably to existing methods that can be trapped in local minima resulting in track losses. The data likelihood term is calculated by measuring the discrepancy between the rendered 3D model and the observations. Extensive experiments with real and simulated data show that hand tracking is achieved at a frame rate of 90fps with less than 10mm average error using a GPU implementation, thus comparing favourably to the state of the art in terms of both speed and tracking accuracy.
    BibTeX:
    @inproceedings{Makris2015a,
      author = {Makris, Alexandros and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Hierarchical particle filtering for 3d hand tracking},
      booktitle = {IEEE Computer Vision and Pattern Recognition Workshops (HANDS 2015 - CVPRW 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {June},
      pages = {8--17},
      address = {Boston, USA},
      url = {http://users.ics.forth.gr/~argyros/res_hmf.html},
      projects =  {WEARHAP, ROBOHOW},
      doi = {10.1109/CVPRW.2015.7301343},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_06_HANDS_hmf.pdf},
      videolink = {https://youtu.be/DR8YUOAM3QM}
    }
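
    The filtering backbone used by Makris et al. (HANDS/CVPRW 2015, above) — predict, weight by an observation likelihood, resample — is shown below as a plain bootstrap particle filter on a 1D toy state. The hierarchical decomposition over the hand parameters and the rendering-based likelihood of the paper are not reproduced here.

      import numpy as np

      rng = np.random.default_rng(0)

      def bootstrap_particle_filter(observations, n_particles=500, motion_std=0.3, obs_std=0.5):
          # Generic bootstrap particle filter on a scalar state (illustrative only).
          particles = rng.normal(0.0, 1.0, n_particles)
          estimates = []
          for z in observations:
              particles += rng.normal(0.0, motion_std, n_particles)      # predict (random walk)
              weights = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)  # observation likelihood
              weights /= weights.sum()
              estimates.append(np.sum(weights * particles))              # posterior mean estimate
              idx = rng.choice(n_particles, size=n_particles, p=weights)  # resample
              particles = particles[idx]
          return np.array(estimates)

      # Toy tracking problem: follow a slowly drifting state from noisy observations.
      true_state = np.cumsum(rng.normal(0, 0.2, 100))
      observed = true_state + rng.normal(0, 0.5, 100)
      tracked = bootstrap_particle_filter(observed)
      print("mean absolute tracking error:", np.mean(np.abs(tracked - true_state)))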
    
  265. T.-H. Pham, A. Kheddar, A. Qammaz and A.A. Argyros, "Towards force sensing from vision: Observing hand-object interactions to infer manipulation forces", In IEEE Computer Vision and Pattern Recognition (CVPR 2015), IEEE, pp. 2810-2819, Boston, USA, June 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  266. Abstract: We present a novel, non-intrusive approach for estimating contact forces during hand-object interactions relying solely on visual input provided by a single RGB-D camera. We consider a manipulated object with known geometrical and physical properties. First, we rely on model-based visual tracking to estimate the object's pose together with that of the hand manipulating it throughout the motion. Following this, we compute the object's first and second order kinematics using a new class of numerical differentiation operators. The estimated kinematics is then instantly fed into a second-order cone program that returns a minimal force distribution explaining the observed motion. However, humans typically apply more forces than mechanically required when manipulating objects. Thus, we complete our estimation method by learning these excessive forces and their distribution among the fingers in contact. We provide a full validity analysis of the proposed method by evaluating it based on ground truth data from additional sensors such as accelerometers, gyroscopes and pressure sensors. Experimental results show that force sensing from vision (FSV) is indeed feasible.
    BibTeX:
    @inproceedings{Pham2015a,
      author = {Pham, Tu-Hoa and Kheddar, Abderrahmane and Qammaz, Ammar and Argyros, Antonis A},
      title = {Towards force sensing from vision: Observing hand-object interactions to infer manipulation forces},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2015)},
      publisher = {IEEE},
      year = {2015},
      month = {June},
      pages = {2810--2819},
      address = {Boston, USA},
      url = {http://users.ics.forth.gr/~argyros/res_fsv.html},
      projects =  {ROBOHOW},
      doi = {10.1109/CVPR.2015.7298898},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_06_CVPR_fsv.pdf},
      videolink = {https://youtu.be/C4k-FPWM1t0}
    }
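
    A heavily simplified version of the "minimal force distribution" step of Pham et al. (CVPR 2015, above): given the object's mass and estimated acceleration, distribute the required net contact force over the fingers with minimum total squared force. The paper instead solves a second-order cone program with torque balance and friction-cone constraints and additionally learns the human "excess" forces; the quantities below are illustrative.

      import numpy as np

      def minimal_forces(mass, acceleration, n_contacts, g=np.array([0.0, 0.0, -9.81])):
          # Illustrative only: least-norm per-contact forces satisfying the Newton equation
          # sum_i f_i = m * (a - g); no torque balance or friction cones, unlike the paper's SOCP.
          A = np.hstack([np.eye(3)] * n_contacts)          # [I3 | I3 | ...] @ f = m * (a - g)
          b = mass * (acceleration - g)
          f, *_ = np.linalg.lstsq(A, b, rcond=None)        # minimum-norm solution
          return f.reshape(n_contacts, 3)

      # Example: a 0.4 kg object held with three fingers while accelerating upward at 1 m/s^2.
      forces = minimal_forces(mass=0.4, acceleration=np.array([0.0, 0.0, 1.0]), n_contacts=3)
      print(forces)              # each contact carries an equal share of the net force
      print(forces.sum(axis=0))  # equals m * (a - g) = [0, 0, 4.324]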
    
  267. L. Palopoli, A.A. Argyros, J. Birchbauer, A. Colombo, D. Fontanelli, A. Legay, A. Garulli, A. Giannitrapani, D. Macii, F. Moro and others, "Navigation assistance and guidance of older adults across complex public spaces: the DALi approach", Intelligent Service Robotics, Springer, vol. 8, no. 2, pp. 77-92, April 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  268. Abstract: The Devices for Assisted Living (DALi) project is a research initiative sponsored by the European Commission under the FP7 programme aiming for the development of a robotic device to assist people with cognitive impairments in navigating complex environments. The project revisits the popular paradigm of the walker enriching it with sensing abilities (to perceive the environment), with cognitive abilities (to decide the best path across the space) and with mechanical, visual, acoustic and haptic guidance devices (to guide the person along the path). In this paper, we offer an overview of the developed system and describe in detail some of its most important technological aspects.
    BibTeX:
    @article{Palopoli2015,
      author = {Palopoli, Luigi and Argyros, Antonis A and Birchbauer, Josef and Colombo, Alessio and Fontanelli, Daniele and Legay, Axel and Garulli, Andrea and Giannitrapani, Antonello and Macii, David and Moro, Federico and others},
      title = {Navigation assistance and guidance of older adults across complex public spaces: the DALi approach},
      journal = {Intelligent Service Robotics},
      publisher = {Springer},
      year = {2015},
      month = {April},
      volume = {8},
      number = {2},
      pages = {77--92},
      url = {http://users.ics.forth.gr/~argyros/res_slammot.html},
      projects =  {DALI},
      doi = {10.1007/s11370-015-0169-y},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_journal_IJSR_DALi.pdf}
    }
    
  269. D. Michel, C. Panagiotakis and A.A. Argyros, "Tracking the articulated motion of the human body with two RGBD cameras", Machine Vision Applications, Springer, vol. 26, no. 1, pp. 41-54, 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  270. Abstract: We present a model-based, top-down solution to the problem of tracking the 3D position, orientation and full articulation of the human body from markerless visual observations obtained by two synchronized RGBD cameras. Inspired by recent advances to the problem of model-based hand tracking Oikonomidis et al. (Efficient Model-based 3D Tracking of Hand Articulations using Kinect, 2011), we treat human body tracking as an optimization problem that is solved using stochastic optimization techniques. We show that the proposed approach outperforms in accuracy state of the art methods that rely on a single RGBD camera. Thus, for applications that require increased accuracy and can afford the extra-complexity introduced by the second sensor, the proposed approach constitutes a viable solution to the problem of markerless human motion tracking. Our findings are supported by an extensive quantitative evaluation of the method that has been performed on a publicly available data set that is annotated with ground truth.
    BibTeX:
    @article{Michel2015,
      author = {Michel, Damien and Panagiotakis, Costas and Argyros, Antonis A},
      title = {Tracking the articulated motion of the human body with two RGBD cameras},
      journal = {Machine Vision Applications},
      publisher = {Springer},
      year = {2015},
      volume = {26},
      number = {1},
      pages = {41--54},
      url = {http://users.ics.forth.gr/~argyros/res_humanbody_two_kinects.html},
      projects =  {HOBBIT,ERASITECHNIS},
      doi = {10.1007/s00138-014-0651-0},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_journal_MVA_humanbody.pdf},
      videolink = {https://youtu.be/n5irgHVuFwc}
    }
    
  271. G. Poier, K. Roditakis, S. Schulter, D. Michel, H. Bischof and A.A. Argyros, "Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties", CoRR, Arxiv, 2015.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  272. Abstract: Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or missing anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach for hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. This way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from track losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.
    BibTeX:
    @arxivarticle{1510.08039,
      author = {Georg Poier and Konstantinos Roditakis and Samuel Schulter and Damien Michel and Horst Bischof and Antonis A. Argyros},
      title = {Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties},
      journal = {CoRR, Arxiv},
      year = {2015},
      url = {http://arxiv.org/abs/1510.08039},
      projects =  {WEARHAP},
      doi = {10.5244/C.29.182},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_09_BMVC_hybrid.pdf}
    }
    
  273. D. Michel, K.E. Papoutsakis and A.A. Argyros, "Gesture Recognition Supporting the Interaction of Humans with Socially Assistive Robots", In Advances in Visual Computing (ISVC 2014), Springer, pp. 793-804, Las Vegas, Nevada, USA, December 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  274. Abstract: We propose a new approach for vision-based gesture recognition to support robust and efficient human robot interaction towards developing socially assistive robots. The considered gestural vocabulary consists of five, user specified hand gestures that convey messages of fundamental importance in the context of human-robot dialogue. Despite their small number, the recognition of these gestures exhibits considerable challenges. Aiming at natural, easy-to-memorize means of interaction, users have identified gestures consisting of both static and dynamic hand configurations that involve different scales of observation (from arms to fingers) and exhibit intrinsic ambiguities. Moreover, the gestures need to be recognized regardless of the multifaceted variability of the human subjects performing them. Recognition needs to be performed online, in continuous video streams containing other irrelevant/unmodeled motions. All the above need to be achieved by analyzing information acquired by a possibly moving RGBD camera, in cluttered environments with considerable light variations. We present a gesture recognition method that addresses the above challenges, as well as promising experimental results obtained from relevant user trials.
    BibTeX:
    @inproceedings{Michel2014,
      author = {Michel, Damien and Konstantinos E. Papoutsakis and Argyros, Antonis A},
      title = {Gesture Recognition Supporting the Interaction of Humans with Socially Assistive Robots},
      booktitle = {Advances in Visual Computing (ISVC 2014)},
      publisher = {Springer},
      year = {2014},
      month = {December},
      pages = {793--804},
      address = {Las Vegas, Nevada, USA},
      url = {http://users.ics.forth.gr/~argyros/res_gesturesforHRI.html},
      projects =  {HOBBIT},
      doi = {10.1007/978-3-319-14249-4_76},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_12_ISVC_gestures.pdf},
      videolink = {https://youtu.be/eIIzgjG2V7A}
    }
    
  275. D.I. Kosmopoulos, K. Papoutsakis and A.A. Argyros, "Online segmentation and classification of modeled actions performed in the context of unmodeled ones", In British Machine Vision Conference (BMVC 2014), BMVA, Nottingham, UK, September 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  276. Abstract: In this work, we provide a discriminative framework for online simultaneous segmentation and classification of visual actions, which deals effectively with unknown sequences that may interrupt the known sequential patterns. To this end we employ Hough transform to vote in a 3D space for the begin point, the end point and the label of the segmented part of the input stream. An SVM is used to model each class and to suggest putative labeled segments on the timeline. To identify the most plausible segments among the putative ones we apply a dynamic programming algorithm, which maximises an objective function for label assignment in linear time. The performance of our method is evaluated on synthetic as well as on real data (Weizmann and Berkeley multimodal human action database). The proposed approach is of comparable accuracy to the state of the art for online stream segmentation and classification and performs considerably better in the presence of previously unseen actions.
    BibTeX:
    @inproceedings{Kosmopoulos2014,
      author = {Kosmopoulos, Dimitrios I and Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Online segmentation and classification of modeled actions performed in the context of unmodeled ones},
      booktitle = {British Machine Vision Conference (BMVC 2014)},
      publisher = {BMVA},
      year = {2014},
      month = {September},
      address = {Nottingham, UK},
      url = {http://users.ics.forth.gr/~argyros/res_actions.html},
      projects =  {HOBBIT},
      doi = {10.5244/C.28.95},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_09_BMVC_actions.pdf},
      videolink = {https://youtu.be/LxIiTFDavpg}
    }
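
    The segment-selection step of Kosmopoulos et al. (BMVC 2014, above) — choosing the most plausible non-overlapping labeled segments among the putative ones — can be illustrated with the classic weighted-interval-scheduling dynamic program. The segment boundaries, labels and scores below are made up, and this generic DP is only a stand-in for the paper's label-assignment objective.

      from bisect import bisect_right

      def select_segments(segments):
          # segments: list of (begin, end, label, score); returns the maximum total score and a
          # maximum-score subset of non-overlapping segments (generic weighted interval scheduling).
          segs = sorted(segments, key=lambda s: s[1])   # sort by end point
          ends = [s[1] for s in segs]
          best = [0.0] * (len(segs) + 1)                # best[i] = best score using the first i segments
          choice = [None] * (len(segs) + 1)
          for i, (b, e, label, score) in enumerate(segs, start=1):
              j = bisect_right(ends, b, 0, i - 1)       # number of earlier segments ending at or before b
              take = best[j] + score
              if take > best[i - 1]:
                  best[i], choice[i] = take, (j, (b, e, label, score))
              else:
                  best[i], choice[i] = best[i - 1], None
          picked, i = [], len(segs)                     # backtrack the selected segments
          while i > 0:
              if choice[i] is None:
                  i -= 1
              else:
                  j, seg = choice[i]
                  picked.append(seg)
                  i = j
          return best[-1], picked[::-1]

      # Putative segments proposed by per-class detectors (begin, end, label, score) -- illustrative values.
      putative = [(0, 40, "wave", 2.0), (30, 70, "walk", 3.5), (60, 90, "sit", 1.5), (75, 120, "walk", 2.5)]
      print(select_segments(putative))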
    
  277. C. Panagiotakis, A.A. Argyros and D. Michel, "Temporal segmentation and seamless stitching of motion patterns for synthesizing novel animations of periodic dances", In IEEE International Conference on Pattern Recognition (ICPR 2014), IEEE, pp. 1892-1897, Stockholm, Sweden, August 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  278. Abstract: In this paper, we present an efficient algorithm for synthesizing novel, arbitrarily long animations of periodic dances. The input to the proposed method is motion capture data acquired from markerless visual observations of a human performing a periodic dance. The provided human motion capture data are temporally segmented into the constituent periodic motion patterns. These are further organized in a motion graph that also represents possible transitions among them. Finally, an efficient algorithm exploits this representation to come up with a previously unseen sequence of motion patterns that are stitched seamlessly into a novel, realistic dance animation. Several experiments have been conducted with real recordings of Greek folk dances. The obtained results are very promising and indicate the efficacy of the proposed approach, as well as its tolerance to dynamic and noisy human motion capture input.
    BibTeX:
    @inproceedings{Panagiotakis2014,
      author = {Panagiotakis, Costas and Argyros, Antonis A and Michel, Damien},
      title = {Temporal segmentation and seamless stitching of motion patterns for synthesizing novel animations of periodic dances},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2014)},
      publisher = {IEEE},
      year = {2014},
      month = {August},
      pages = {1892--1897},
      address = {Stockholm, Sweden},
      url = {https://sites.google.com/site/costaspanagiotakis/research/dancer},
      projects =  {ERASITECHNIS,HOBBIT},
      doi = {10.1109/ICPR.2014.331},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_08_ICPR_danceanimation.pdf},
      videolink = {https://youtu.be/SYKjGpV_dN0}
    }
    
  279. N. Kyriazis and A.A. Argyros, "Scalable 3D Tracking of Multiple Interacting Objects", In IEEE Computer Vision and Pattern Recognition (CVPR 2014), IEEE, pp. 3430-3437, Columbus, Ohio, USA, June 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  280. Abstract: We consider the problem of tracking multiple interacting objects in 3D, using RGBD input and by considering a hypothesize-and-test approach. Due to their interaction, objects to be tracked are expected to occlude each other in the field of view of the camera observing them. A naive approach would be to employ a Set of Independent Trackers (SIT) and to assign one tracker to each object. This approach scales well with the number of objects but fails as occlusions become stronger due to their disjoint consideration. The solution representing the current state of the art employs a single Joint Tracker (JT) that accounts for all objects simultaneously. This directly resolves ambiguities due to occlusions but has a computational complexity that grows geometrically with the number of tracked objects. We propose a middle ground, namely an Ensemble of Collaborative Trackers (ECT), that combines best traits from both worlds to deliver a practical and accurate solution to the multi-object 3D tracking problem. We present quantitative and qualitative experiments with several synthetic and real world sequences of diverse complexity. Experiments demonstrate that ECT manages to track far more complex scenes than JT at a computational time that is only slightly larger than that of SIT.
    BibTeX:
    @inproceedings{Kyriazis2014,
      author = {Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Scalable 3D Tracking of Multiple Interacting Objects},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2014)},
      publisher = {IEEE},
      year = {2014},
      month = {June},
      pages = {3430--3437},
      address = {Columbus, Ohio, USA},
      url = {http://users.ics.forth.gr/~argyros/res_ect.html},
      projects =  {ROBOHOW, WEARHAP},
      doi = {10.1109/CVPR.2014.438},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_06_cvpr_ect.pdf},
      videolink = {https://youtu.be/SCOtBdhDMKg}
    }
    
  281. I. Oikonomidis, M.I.A. Lourakis and A.A. Argyros, "Evolutionary Quasi-random Search for Hand Articulations Tracking", In IEEE Computer Vision and Pattern Recognition (CVPR 2014), IEEE, pp. 3422-3429, Columbus, Ohio, USA, June 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  282. Abstract: We present a new method for tracking the 3D position, global orientation and full articulation of human hands. Following recent advances in model-based, hypothesize-and-test methods, the high-dimensional parameter space of hand configurations is explored with a novel evolutionary optimization technique specifically tailored to the problem. The proposed method capitalizes on the fact that samples from quasi-random sequences such as the Sobol have low discrepancy and exhibit a more uniform coverage of the sampled space compared to random samples obtained from the uniform distribution. The method has been tested for the problems of tracking the articulation of a single hand (27D parameter space) and two hands (54D space). Extensive experiments have been carried out with synthetic and real data, in comparison with state of the art methods. The quantitative evaluation shows that for cases of limited computational resources, the new approach achieves a speed-up of four (single hand tracking) and eight (two hands tracking) without compromising tracking accuracy. Interestingly, the proposed method is preferable compared to the state of the art either in the case of limited computational resources or in the case of more complex (i.e., higher dimensional) problems, thus improving the applicability of the method in a number of application domains.
    BibTeX:
    @inproceedings{Oikonomidis2014,
      author = {Oikonomidis, Iason and Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Evolutionary Quasi-random Search for Hand Articulations Tracking},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2014)},
      publisher = {IEEE},
      year = {2014},
      month = {June},
      pages = {3422--3429},
      address = {Columbus, Ohio, USA},
      url = {http://users.ics.forth.gr/~argyros/res_sobolhandtracking.html},
      projects =  {WEARHAP, ROBOHOW},
      doi = {10.1109/CVPR.2014.437},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_06_cvpr_sobol.pdf},
      videolink = {https://youtu.be/3yvaFuX09xY}
    }
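    A hedged SciPy sketch of the core observation in the entry above: quasi-random (Sobol) samples cover a high-dimensional space more uniformly, i.e. with lower discrepancy, than i.i.d. uniform samples, which makes them attractive for sampling candidate hand poses. The 27-D space and sample counts are illustrative, not the paper's setup.

    import numpy as np
    from scipy.stats import qmc

    dim, m = 27, 8                       # 27-D single-hand pose space, 2**8 = 256 samples
    sobol = qmc.Sobol(d=dim, scramble=True, seed=0)
    quasi = sobol.random_base2(m=m)      # low-discrepancy samples in [0, 1]^27
    rng = np.random.default_rng(0)
    plain = rng.random((2 ** m, dim))    # plain uniform random samples for comparison

    # Lower discrepancy indicates a more uniform coverage of the unit hypercube.
    print("Sobol  discrepancy:", qmc.discrepancy(quasi))
    print("Random discrepancy:", qmc.discrepancy(plain))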
    
  283. D. Michel, X. Zabulis and A.A. Argyros, "Shape from interaction", Machine Vision and Applications, Springer, vol. 25, no. 4, pp. 1077-1087, May 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  284. Abstract: We present “shape from interaction” (SfI), an approach to the problem of acquiring 3D representations of rigid objects through observing the activity of a human who handles a tool. SfI relies on the fact that two rigid objects cannot share the same physical space. The 3D reconstruction of the unknown object is achieved by tracking the known 3D tool and by carving out the space it occupies as a function of time. Due to this indirection, SfI reconstructs rigid objects regardless of their material and appearance properties and proves particularly useful for the cases of textureless, transparent, translucent, refractive and specular objects for which there exists no practical vision-based 3D reconstruction method. Additionally, object concavities that are not directly observable can also be reconstructed. The 3D tracking of the tool is formulated as an optimization problem that is solved based on visual input acquired by a multicamera system. Experimental results from a prototype implementation of SfI support qualitatively and quantitatively the effectiveness of the proposed approach.
    BibTeX:
    @article{Michel2014a,
      author = {Michel, Damien and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Shape from interaction},
      journal = {Machine Vision and Applications},
      publisher = {Springer},
      year = {2014},
      month = {May},
      volume = {25},
      number = {4},
      pages = {1077--1087},
      url = {http://users.ics.forth.gr/~argyros/res_sfi.html},
      projects =  {ROBOHOW},
      doi = {10.1007/s00138-014-0602-9},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_journal_MVA_sfi.pdf},
      videolink = {https://youtu.be/uZQQkGTk6-k}
    }
    
  285. P. Panteleris and A.A. Argyros, "Vision-Based SLAM and Moving Objects Tracking for the Perceptual Support of a Smart Walker Platform", In European Conference on Computer Vision Workshops (HAU3D 2014 - ECCVW 2014), Springer, pp. 407-423, Zurich, Switzerland, January 2014.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  286. Abstract: The problems of vision-based detection and tracking of independently moving objects, localization and map construction are highly interrelated, in the sense that the solution of any of them provides valuable information to the solution of the others. In this paper, rather than trying to solve each of them in isolation, we propose a method that treats all of them simultaneously. More specifically, given visual input acquired by a moving RGBD camera, the method detects independently moving objects and tracks them in time. Additionally, the method estimates the camera (ego)motion and the motion of the tracked objects in a coordinate system that is attached to the static environment, a map of which is progressively built from scratch. The loose assumptions that the method adopts with respect to the problem parameters make it a valuable component for any robotic platform that moves in a dynamic environment and requires simultaneous tracking of moving objects, egomotion estimation and map construction. The usability of the method is further enhanced by its robustness and its low computational requirements that permit real time execution even on low-end CPUs.
    BibTeX:
    @inproceedings{Panteleris2014,
      author = {Paschalis Panteleris and Argyros, Antonis A},
      title = {Vision-Based SLAM and Moving Objects Tracking for the Perceptual Support of a Smart Walker Platform},
      booktitle = {European Conference on Computer Vision Workshops (HAU3D 2014 - ECCVW 2014)},
      publisher = {Springer},
      year = {2014},
      month = {January},
      pages = {407--423},
      address = {Zurich, Switzerland},
      url = {http://users.ics.forth.gr/~argyros/res_slammot.html},
      projects =  {ACANTO},
      doi = {10.1007/978-3-319-16199-0_29},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2014_09_ACVR_slammot.pdf},
      videolink = {https://youtu.be/RnKFCypUk9U}
    }
    
  287. S. Escalera, J. Gonzàlez, X. Baró, M. Reyes, I. Guyon, V. Athitsos, H. Escalante, L. Sigal, A.A. Argyros, C. Sminchisescu and others, "ChaLearn multi-modal gesture recognition 2013: grand challenge and workshop summary", In ACM International Conference on Multimodal Interaction (ICMI 2013), ACM, pp. 365-368, Sydney, Australia, December 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  288. Abstract: We organized a Grand Challenge and Workshop on Multi-Modal Gesture Recognition. The MMGR Grand Challenge focused on the recognition of continuous natural gestures from multi-modal data (including RGB, Depth, user mask, Skeletal model, and audio). We made available a large labeled video database of 13,858 gestures from a lexicon of 20 Italian gesture categories recorded with a Kinect™ camera. More than 54 teams participated in the challenge and a final error rate of 12% was achieved by the winner of the competition. Winners of the competition published their work in the workshop of the Challenge. The MMGR Workshop was held at ICMI conference 2013, Sydney. A total of 9 relevant papers with basis on multi-modal gesture recognition were accepted for presentation. This includes multi-modal descriptors, multi-class learning strategies for segmentation and classification in temporal data, as well as relevant applications in the field, including multi-modal Social Signal Processing and multi-modal Human Computer Interfaces. Five relevant invited speakers participated in the workshop: Profs. Leonid Sigal from Disney Research, Antonis A. Argyros from FORTH, Institute of Computer Science, Cristian Sminchisescu from Lund University, Richard Bowden from University of Surrey, and Stan Sclaroff from Boston University. They summarized their research in the field and discussed past, current, and future challenges in Multi-Modal Gesture Recognition.
    BibTeX:
    @inproceedings{Escalera2013,
      author = {Escalera, Sergio and Gonzàlez, Jordi and Baró, Xavier and Reyes, Miguel and Guyon, Isabelle and Athitsos, Vassilis and Escalante, Hugo and Sigal, Leonid and Argyros, Antonis A and Sminchisescu, Cristian and others},
      title = {ChaLearn multi-modal gesture recognition 2013: grand challenge and workshop summary},
      booktitle = {ACM International Conference on Multimodal Interaction (ICMI 2013)},
      publisher = {ACM},
      year = {2013},
      month = {December},
      pages = {365--368},
      address = {Sydney, Australia},
      projects =  {none},
      doi = {10.1145/2522848.2532597},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_12_icmi_mmgr.pdf}
    }
    
  289. D. Fischinger, P. Einramhof, W. Wohlkinger, K. Papoutsakis, P. Mayer, P. Panek, T. Koertner, S. Hofmann, A.A. Argyros and M. Vincze, "Hobbit-The Mutual Care Robot", In IEEE/RSJ International Conference on Intelligent Robots and Systems Workshops (ASROB 2013 - IROSW 2013), IEEE, Tokyo, Japan, November 2013.
    [Abstract] [BibTeX] [PDF]

  290. Abstract: One option to face the aging society is to build robots that help old persons to stay longer at home. We present Hobbit, a robot that attempts to let users feel safe at home by preventing and detecting falls. Falling has been identified as the highest risk for older adults of getting injured so badly that they can no longer live independently at home and have to move to a care facility. Hobbit is intended to provide high usability and acceptability for the target user group while, at the same time, needs to be affordable for private customers. The development process so far (1.5 years) included a thorough user requirement analysis, conceptual interaction design, prototyping and implementation of key behaviors, as well as extensive empirical testing with target users in the laboratory. We shortly describe the overall interdisciplinary decision-making and conceptualization of the robot and will then focus on the system itself describing the hardware, basic components, and the robot tasks. Finally, we will summarize the findings of the first empirical test with 49 users in three countries and give an outlook of how the platform will be extended in future.
    BibTeX:
    @inproceedings{Fischinger2013,
      author = {Fischinger, David and Einramhof, Peter and Wohlkinger, Walter and Papoutsakis, Konstantinos and Mayer, Peter and Panek, Paul and Koertner, Tobias and Hofmann, Stefan and Argyros, Antonis A and Vincze, Markus},
      title = {Hobbit-The Mutual Care Robot},
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems Workshops (ASROB 2013 - IROSW 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {November},
      address = {Tokyo, Japan},
      projects =  {HOBBIT},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_11_asrob_hobbit.pdf}
    }
    
  291. P. Padeleris, X. Zabulis and A.A. Argyros, "Multicamera tracking of multiple humans based on colored visual hulls", In IEEE Conference on Emerging Technologies & Factory Automation (ETFA 2013), IEEE, pp. 1-8, Cagliari, Italy, September 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  292. Abstract: Detecting, localizing and tracking humans within an industrial environment are three tasks which are of central importance towards achieving automation in workplaces and intelligent environments. This is because unobtrusive, real-time and reliable person tracking provides valuable input to solving problems such as workplace surveillance and event/activity recognition and, also, contributes to safety and optimized use of resources. This paper presents a passive approach to the problem of person tracking that is based on a network of conventional color cameras. The proposed approach exhibits robustness to challenging conditions that are encountered in industrial environments due to illumination artifacts, occlusions and the highly dynamic nature of the observed scenes. The multiple views of the environment that the system employs are used to obtain a volumetric representation of the humans within it, in real-time. Although human tracking can be achieved based solely on such a volumetric representation, in demanding scenes, this information is not enough to recover from tracking failures. Thus, in this work, we collect and update a representation of the color appearance of the persons in the environment. The combination of volumetric and color information reinforces tracking robustness, even when a person is not visible by any of the cameras for extended time intervals. The proposed approach has been extensively evaluated in comparison with an existing state of the art method and pertinent results are reported.
    BibTeX:
    @inproceedings{Padeleris2013,
      author = {Padeleris, Pashalis and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Multicamera tracking of multiple humans based on colored visual hulls},
      booktitle = {IEEE Conference on Emerging Technologies & Factory Automation, (ETFA 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {September},
      pages = {1--8},
      address = {Cagliari, Italy},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {DALI},
      doi = {10.1109/ETFA.2013.6647982},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_09_etfa_coloredhulls.pdf}
    }
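    A minimal sketch of the visual hull computation that the entry above builds on, assuming calibrated pinhole cameras given as 3x4 projection matrices and binary foreground silhouettes: a voxel survives if it projects inside the silhouette of every view. The colored-hull modeling described in the paper is omitted here.

    import numpy as np

    def visual_hull(voxels, projections, silhouettes):
        """voxels: (N,3) points; projections: list of 3x4 matrices; silhouettes: list of HxW bool."""
        voxels = np.asarray(voxels, dtype=float)
        homog = np.hstack([voxels, np.ones((len(voxels), 1))])   # homogeneous coordinates, (N,4)
        keep = np.ones(len(voxels), dtype=bool)
        for P, sil in zip(projections, silhouettes):
            uvw = homog @ P.T                                    # project into this view, (N,3)
            u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
            v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
            h, w = sil.shape
            inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
            vote = np.zeros(len(voxels), dtype=bool)
            vote[inside] = sil[v[inside], u[inside]]             # does the voxel hit foreground?
            keep &= vote
        return voxels[keep]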
    
  293. P. Douvantzis, I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Dimensionality Reduction for Efficient Single Frame Hand Pose Estimation", In International Conference on Computer Vision Systems (ICVS 2013), Springer, pp. 143-152, St. Petersburg, Russia, July 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  294. Abstract: Model based approaches for the recovery of the 3D position, orientation and full articulation of the human hand have a number of attractive properties. One bottleneck towards their practical exploitation is their computational cost. To a large extent, this is determined by the large dimensionality of the problem to be solved. In this work we exploit the fact that the parametric joints space representing hand configurations is highly redundant. Thus, we employ Principal Component Analysis (PCA) to learn a lower dimensional space that describes compactly and effectively the human hand articulation. The reduced dimensionality of the resulting space leads to a simpler optimization problem, so model-based approaches require less computational effort to solve it. Experiments demonstrate that the proposed approach achieves better accuracy in hand pose recovery compared to a state of the art baseline method using only 1/4 of the latter’s computational budget.
    BibTeX:
    @inproceedings{Douvantzis2013,
      author = {Douvantzis, Petros and Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Dimensionality Reduction for Efficient Single Frame Hand Pose Estimation},
      booktitle = {International Conference on Computer Vision Systems (ICVS 2013)},
      publisher = {Springer},
      year = {2013},
      month = {July},
      pages = {143--152},
      address = {St. Petersburg, Russia},
      projects =  {WEARHAP},
      doi = {10.1007/978-3-642-39402-7_15},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_07_icvs_dimreduction.pdf}
    }
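    A small sketch of the dimensionality-reduction idea in the entry above, with assumed data and a placeholder cost: PCA is fit on example 26-D hand configurations and the (here: naive) search runs in the reduced space, mapping candidates back to the full joint space only for scoring.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    poses = rng.normal(size=(2000, 26))          # stand-in for a corpus of 26-D hand poses
    pca = PCA(n_components=8).fit(poses)         # compact subspace of hand articulation

    def objective(full_pose):                    # hypothetical stand-in for the model-based cost
        return float(np.linalg.norm(full_pose - poses[0]))

    candidates_low = rng.normal(size=(512, 8))   # search happens in 8-D instead of 26-D
    candidates_full = pca.inverse_transform(candidates_low)
    best = candidates_full[np.argmin([objective(p) for p in candidates_full])]
    print("best candidate cost:", objective(best))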
    
  295. P. Koutlemanis, X. Zabulis, A. Ntelidakis and A.A. Argyros, "Foreground detection with a moving RGBD camera", In Advances in Visual Computing (ISVC 2013), Springer, pp. 216-227, Rethymnon, Crete, Greece, July 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  296. Abstract: A method for foreground detection in data acquired by a moving RGBD camera is proposed. The background scene is initially captured in a reference model. An initial estimation of camera motion is provided by a conventional point cloud registration approach of matched keypoints between the captured scene and the reference model. This initial solution is then refined based on a top-down, model based approach that evaluates candidate camera poses in a Particle Swarm Optimization framework. To evaluate a candidate pose, the method renders color and depth images of the model according to this pose and computes a dissimilarity score of the rendered images to the currently captured ones. This score is based on the direct comparison of color, depth, and surface geometry between the acquired and rendered images, while allowing for outliers due to the potential occurrence of foreground objects, or newly imaged surfaces. Extended quantitative and qualitative experimental results confirm that the proposed method produces significantly more accurate foreground segmentation maps compared to the conventional, baseline feature-based approach.
    BibTeX:
    @inproceedings{Koutlemanis2013,
      author = {Koutlemanis, Panayotis and Zabulis, Xenophon and Ntelidakis, Antonios and Argyros, Antonis A},
      title = {Foreground detection with a moving RGBD camera},
      booktitle = {Advances in Visual Computing (ISVC 2013)},
      publisher = {Springer},
      year = {2013},
      month = {July},
      pages = {216--227},
      address = {Rethymnon, Crete, Greece},
      projects =  {WEARHAP},
      doi = {10.1007/978-3-642-41914-0_22},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_07_isvc_fgbg.pdf}
    }
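    A short sketch of the render-and-compare scoring described above, under the assumption that a depth rendering of the background model is available for each candidate camera pose: large residuals are truncated so that genuine foreground pixels do not dominate the dissimilarity. Thresholds and names are illustrative.

    import numpy as np

    def pose_dissimilarity(captured_depth, rendered_depth, tau=0.05):
        """Mean truncated absolute depth difference (in meters) over pixels valid in both maps."""
        valid = (captured_depth > 0) & (rendered_depth > 0)
        diff = np.abs(captured_depth[valid] - rendered_depth[valid])
        return float(np.minimum(diff, tau).mean())

    def foreground_mask(captured_depth, rendered_depth, tau=0.05):
        """Pixels significantly closer to the camera than the background model are foreground."""
        return (captured_depth > 0) & (rendered_depth - captured_depth > tau)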
    
  297. I. Oikonomidis, N. Kyriazis, K. Tzevanidis and A.A. Argyros, "Tracking hand articulations: Relying on 3D visual hulls versus relying on multiple 2D cues", In IEEE International Symposium on Ubiquitous Virtual Reality (ISUVR 2013), IEEE, pp. 7-10, Daejeon, South Korea, July 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  298. Abstract: We present a method for articulated hand tracking that relies on visual input acquired by a calibrated multicamera system. A state-of-the-art result on this problem has been presented in [12]. In that work, hand tracking is formulated as the minimization of an objective function that quantifies the discrepancy between a hand pose hypothesis and the observations. The objective function treats the observations from each camera view in an independent way. We follow the same general optimization framework but we choose to employ the visual hull [10] as the main observation cue, which results from the integration of information from all available views prior to optimization. We investigate the behavior of the resulting method in extensive experiments and in comparison with that of [12]. The obtained results demonstrate that for low levels of noise contamination, regardless of the number of cameras, the two methods perform comparably. The situation changes when noisy observations or as few as two cameras with short baselines are employed. In these cases, the proposed method is more accurate than that of [12]. Thus, the proposed method is preferable in real-world scenarios with noisy observations obtained from easy-to-deploy, stereo camera setups.
    BibTeX:
    @inproceedings{Oikonomidis2013a,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Tzevanidis, Konstantinos and Argyros, Antonis A},
      title = {Tracking hand articulations: Relying on 3D visual hulls versus relying on multiple 2D cues},
      booktitle = {IEEE International Symposium on Ubiquitous Virtual Reality (ISUVR 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {July},
      pages = {7--10},
      address = {Daejeon, South Korea},
      projects =  {ROBOHOW,WEARHAP},
      doi = {10.1109/ISUVR.2013.13},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_07_isuvr_handvisualhull.pdf}
    }
    
  299. C. Panagiotakis, A. Holzapfel, D. Michel and A.A. Argyros, "Beat Synchronous Dance Animation Based on Visual Analysis of Human Motion and Audio Analysis of Music Tempo", In Advances in Visual Computing (ISVC 2013), Springer, pp. 118-127, Rethymnon, Crete, Greece, July 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  300. Abstract: We present a framework that generates beat synchronous dance animation based on the analysis of both visual and audio data. First, the articulated motion of a dancer is captured based on markerless visual observations obtained by a multicamera system. We propose and employ a new method for the temporal segmentation of such motion data into the periods of dance. Next, we use a beat tracking algorithm to estimate the pulse related to the tempo of a piece of music. Given an input music that is of the same genre as the one corresponding to the visually observed dance, we automatically produce a beat synchronous dance animation of a virtual character. The proposed approach has been validated with extensive experiments performed on a data set containing a variety of traditional Greek/Cretan dances and the corresponding music.
    BibTeX:
    @inproceedings{Panagiotakis2013,
      author = {Panagiotakis, Costas and Holzapfel, Andre and Michel, Damien and Argyros, Antonis A},
      title = {Beat Synchronous Dance Animation Based on Visual Analysis of Human Motion and Audio Analysis of Music Tempo},
      booktitle = {Advances in Visual Computing (ISVC 2013)},
      publisher = {Springer},
      year = {2013},
      month = {July},
      pages = {118--127},
      address = {Rethymnon, Crete, Greece},
      url = {https://sites.google.com/site/costaspanagiotakis/research/dancer},
      projects =  {ERASITECHNIS},
      doi = {10.1007/978-3-642-41939-3_12},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_07_isvc_dancer.pdf},
      videolink = {https://youtu.be/SYKjGpV_dN0}
    }
    
  301. N. Kyriazis and A.A. Argyros, "Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis", In IEEE Computer Vision and Pattern Recognition (CVPR 2013), IEEE, pp. 9-16, Portland, Oregon, USA, June 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  302. Abstract: In several hand-object(s) interaction scenarios, the change in the objects' state is a direct consequence of the hand's motion. This has a straightforward representation in Newtonian dynamics. We present the first approach that exploits this observation to perform model-based 3D tracking of a table-top scene comprising passive objects and an active hand. Our forward modelling of 3D hand-object(s) interaction regards both the appearance and the physical state of the scene and is parameterized over the hand motion (26 DoFs) between two successive instants in time. We demonstrate that our approach manages to track the 3D pose of all objects and the 3D pose and articulation of the hand by only searching for the parameters of the hand motion. In the proposed framework, covert scene state is inferred by connecting it to the overt state, through the incorporation of physics. Thus, our tracking approach treats a variety of challenging observability issues in a principled manner, without the need to resort to heuristics.
    BibTeX:
    @inproceedings{Kyriazis2013,
      author = {Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {June},
      pages = {9--16},
      address = {Portland, Oregon, USA},
      url = {http://users.ics.forth.gr/~argyros/res_singleactorhypothesis.html},
      projects =  {ROBOHOW},
      doi = {10.1109/CVPR.2013.9},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_06_cvpr_singleactor.pdf},
      videolink = {https://youtu.be/0RCsQPXeHRQ}
    }
    
  303. K. Papoutsakis, P. Padeleris, A. Ntelidakis, S. Stefanou, X. Zabulis, D. Kosmopoulos and A.A. Argyros, "Developing visual competencies for socially assistive robots: the HOBBIT approach", In International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2013), ACM, pp. 1-7, Rhodes, Greece, May 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [VIDEO]

  304. Abstract: In this paper, we present our approach towards developing visual competencies for socially assistive robots within the framework of the HOBBIT project. We show how we integrated several vision modules using a layered architectural scheme. Our goal is to endow the mobile robot with visual perception capabilities so that it can interact with the users. We present the key modules of independent motion detection, object detection, body localization, person tracking, head pose estimation and action recognition and we explain how they serve the goal of natural integration of robots in social environments.
    BibTeX:
    @inproceedings{Papoutsakis2013a,
      author = {Papoutsakis, Konstantinos and Padeleris, Pashalis and Ntelidakis, Antonis and Stefanou, Stefanos and Zabulis, Xenophon and Kosmopoulos, Dimitrios and Argyros, Antonis A},
      title = {Developing visual competencies for socially assistive robots: the HOBBIT approach},
      booktitle = {International Conference on Pervasive Technologies Related to Assistive Environments (PETRA 2013)},
      publisher = {ACM},
      year = {2013},
      month = {May},
      pages = {1--7},
      address = {Rhodes, Greece},
      projects =  {HOBBIT, DALI},
      doi = {10.1145/2504335.2504395},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_05_petra_hobbit.pdf},
      videolink = {https://youtu.be/LxIiTFDavpg}
    }
    
  305. M. Patel, C.H. Ek, N. Kyriazis, A.A. Argyros, J.V. Miró and D. Kragic, "Language for learning complex human-object interactions", In IEEE International Conference on Robotics and Automation (ICRA 2013), IEEE, pp. 4997-5002, Karlsruhe, Germany, May 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  306. Abstract: In this paper we use a Hierarchical Hidden Markov Model (HHMM) to represent and learn complex activities/tasks performed by humans/robots in everyday life. Action primitives are used as a grammar to represent complex human behaviour and learn the interactions and behaviour of humans/robots with different objects. The main contribution is the use of a probabilistic model capable of representing behaviours at multiple levels of abstraction to support the proposed hypothesis. The hierarchical nature of the model allows decomposition of the complex task into simple action primitives. The framework is evaluated with data collected for tasks of everyday importance performed by a human user.
    BibTeX:
    @inproceedings{Patel2013,
      author = {Patel, Mitesh and Ek, Carl Henrik and Kyriazis, Nikolaos and Argyros, Antonis A and Miró, Jaime Valls and Kragic, Danica},
      title = {Language for learning complex human-object interactions},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {May},
      pages = {4997--5002},
      address = {Karlsruhe, Germany},
      projects =  {ROBOHOW, GRASP},
      doi = {10.1109/ICRA.2013.6631291},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_05_icra_mitesh.pdf}
    }
    
  307. D. Song, N. Kyriazis, I. Oikonomidis, C. Papazov, A.A. Argyros, D. Burschka and D. Kragic, "Predicting human intention in visual observations of hand/object interactions", In IEEE International Conference on Robotics and Automation (ICRA 2013), IEEE, pp. 1608-1615, Karlsruhe, Germany, May 2013.
    [Abstract] [BibTeX] [DOI] [PDF]

  308. Abstract: The main contribution of this paper is a probabilistic method for predicting human manipulation intention from image sequences of human-object interaction. Predicting intention amounts to inferring the imminent manipulation task when a human hand is observed to have stably grasped the object. Inference is performed by means of a probabilistic graphical model that encodes object grasping tasks over the 3D state of the observed scene. The 3D state is extracted from RGB-D image sequences by a novel vision-based, markerless hand-object 3D tracking framework. To deal with the high-dimensional state-space and mixed data types (discrete and continuous) involved in grasping tasks, we introduce a generative vector quantization method using mixture models and self-organizing maps. This yields a compact model for encoding of grasping actions, capable of handling uncertain and partial sensory data. Experimentation showed that the model trained on simulated data can provide a potent basis for accurate goal-inference with partial and noisy observations of actual real-world demonstrations. We also show a grasp selection process, guided by the inferred human intention, to illustrate the use of the system for goal-directed grasp imitation.
    BibTeX:
    @inproceedings{Song2013,
      author = {Song, Dan and Kyriazis, Nikolaos and Oikonomidis, Iason and Chavdar Papazov and Argyros, Antonis A and Burschka, Darius and Kragic, Danica},
      title = {Predicting human intention in visual observations of hand/object interactions},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 2013)},
      publisher = {IEEE},
      year = {2013},
      month = {May},
      pages = {1608--1615},
      address = {Karlsruhe, Germany},
      projects =  {GRASP},
      doi = {10.1109/ICRA.2013.6630785},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_05_icra_song.pdf}
    }
    
  309. X. Zabulis, D. Grammenos, T. Sarmis, K. Tzevanidis, P. Padeleris, P. Koutlemanis and A.A. Argyros, "Multicamera human detection and tracking supporting natural interaction with large-scale displays", Machine Vision and Applications, Springer, vol. 24, no. 2, pp. 319-336, February 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  310. Abstract: This paper presents a computer vision system that supports non-instrumented, location-based interaction of multiple users with digital representations of large-scale artifacts. The proposed system is based on a camera network that observes multiple humans in front of a very large display. The acquired views are used to volumetrically reconstruct and track the humans robustly and in real time, even in crowded scenes and challenging human configurations. Given the frequent and accurate monitoring of humans in space and time, a dynamic and personalized textual/graphical annotation of the display can be achieved based on the location and the walk-through trajectory of each visitor. The proposed system has been successfully deployed in an archaeological museum, offering its visitors the capability to interact with and explore a digital representation of an ancient wall painting. This installation permits an extensive evaluation of the proposed system in terms of tracking robustness, computational performance and usability. Furthermore, it proves that computer vision technology can be effectively used to support non-instrumented interaction of humans with their environments in realistic settings.
    BibTeX:
    @article{Zabulis2013,
      author = {Zabulis, Xenophon and Grammenos, Dimitris and Sarmis, Thomas and Tzevanidis, Konstantinos and Padeleris, Pashalis and Koutlemanis, Panagiotis and Argyros, Antonis A},
      title = {Multicamera human detection and tracking supporting natural interaction with large-scale displays},
      journal = {Machine Vision and Applications},
      publisher = {Springer},
      year = {2013},
      month = {February},
      volume = {24},
      number = {2},
      pages = {319--336},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ,GRASP},
      doi = {10.1007/s00138-012-0408-6},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_XX_journal_mva_macrografia.pdf},
      videolink = {https://youtu.be/x9KTfZafZBA}
    }
    
  311. D. Grammenos, X. Zabulis, D. Michel, P. Padeleris, T. Sarmis, G. Georgalis, P. Koutlemanis, K. Tzevanidis, A.A. Argyros, M. Sifakis and others, "A prototypical interactive exhibition for the archaeological museum of Thessaloniki", International Journal of Heritage in the Digital Era, SAGE Publications, vol. 2, no. 1, pp. 75-99, 2013.
    [Abstract] [BibTeX] [PDF] [URL]

  312. Abstract: In 2010, the Institute of Computer Science of the Foundation for Research and Technology-Hellas (ICS FORTH) and the Archaeological Museum of Thessaloniki (AMTh) collaborated towards the creation of a special exhibition of prototypical interactive systems with subjects drawn from ancient Macedonia, named “Macedonia from fragments to pixels”. The exhibition comprises seven interactive systems based on the research outcomes of ICS FORTH’s Ambient Intelligence Programme. Up to the summer of 2012, more than 165,000 people have visited it. The paper initially provides some background information, including related previous research work, and then illustrates and discusses the development process that was followed for creating the exhibition. Subsequently, the technological and interactive characteristics of the project outcomes (i.e., the interactive systems) are analysed and the complementary evaluation approaches followed are briefly described. Finally, some conclusions stemming from the project are highlighted.
    BibTeX:
    @article{Grammenos2013,
      author = {Grammenos, Dimitris and Zabulis, Xenophon and Michel, Damien and Padeleris, Pashalis and Sarmis, Thomas and Georgalis, Giannis and Koutlemanis, Panagiotis and Tzevanidis, Konstantinos and Argyros, Antonis A and Sifakis, M and others},
      title = {A prototypical interactive exhibition for the archaeological museum of Thessaloniki},
      journal = {International Journal of Heritage in the Digital Era},
      publisher = {SAGE Publications},
      year = {2013},
      volume = {2},
      number = {1},
      pages = {75--99},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2013_journal_JHDE_macrografia.pdf}
    }
    
  313. K.E. Papoutsakis and A.A. Argyros, "Integrating tracking with fine object segmentation", Image and Vision Computing, Elsevier, vol. 31, no. 10, pp. 771-785, 2013.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  314. Abstract: We present a novel method for on-line, joint object tracking and segmentation in a monocular video captured by a possibly moving camera. Our goal is to integrate tracking and fine segmentation of a single, previously unseen, potentially non-rigid object of unconstrained appearance, given its segmentation in the first frame of an image sequence as the only prior information. To this end, we tightly couple an existing kernel-based object tracking method with Random Walker-based image segmentation. Bayesian inference mediates between tracking and segmentation, enabling effective data fusion of pixel-wise spatial and color visual cues. The fine segmentation of an object at a certain frame provides tracking with reliable initialization for the next frame, closing the loop between the two building blocks of the proposed framework. The effectiveness of the proposed methodology is evaluated experimentally by comparing it to a large collection of state of the art tracking and video-based object segmentation methods on the basis of a data set consisting of several challenging image sequences for which ground truth data is available.
    BibTeX:
    @article{Papoutsakis2013,
      author = {Papoutsakis, Konstantinos E and Argyros, Antonis A},
      title = {Integrating tracking with fine object segmentation},
      journal = {Image and Vision Computing},
      publisher = {Elsevier},
      year = {2013},
      volume = {31},
      number = {10},
      pages = {771--785},
      url = {http://users.ics.forth.gr/~argyros/res_trackingsegmentation.html},
      projects =  {ROBOHOW},
      doi = {10.1016/j.imavis.2013.07.008},
      pdflink = {http://dx.doi.org/10.1016/j.imavis.2013.07.008},
      videolink = {https://youtu.be/n_z6SY3UYB0}
    }
    
  315. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Tracking the Articulated Motion of Human Hands in 3D", ERCIM News, no. 95, 2013.
    [BibTeX] [PDF] [URL]

  316. BibTeX:
    @periodical{Oikonomidis2013,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Tracking the Articulated Motion of Human Hands in 3D},
      journal = {ERCIM News},
      year = {2013},
      number = {95},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {ROBOHOW,WEARHAP},
      pdflink = {http://ercim-news.ercim.eu/en95/special/tracking-the-articulated-motion-of-human-hands-in-3d}
    }
    
  317. D. Grammenos, X. Zabulis, D. Michel, P. Padeleris, T. Sarmis, G. Georgalis, P. Koutlemanis, K. Tzevanidis, A.A. Argyros, M. Sifakis, P. Adam-Veleni and C. Stephanidis, "Macedonia from Fragments to Pixels: A Permanent Exhibition of Interactive Systems at the Archaeological Museum of Thessaloniki", In International Conference on Progress in Cultural Heritage Preservation (EuroMed 2012), Springer, pp. 602-609, Lemesos, Cyprus, October 2012.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  318. Abstract: The theme of this paper is an exhibition of prototypical interactive systems with subjects drawn from ancient Macedonia, named "Macedonia from fragments to pixels". Since 2010, the exhibition has been hosted by the Archaeological Museum of Thessaloniki and is open daily to the general public. Up to now, more than 165,000 people have visited it. The exhibition comprises 7 interactive systems which are based on some research outcomes of the Ambient Intelligence Programme of the Institute of Computer Science, Foundation for Research and Technology - Hellas. The digital content of these systems includes objects from the Museum’s permanent collection and from Macedonia.
    BibTeX:
    @inproceedings{Grammenos2012,
      author = {Grammenos, Dimitris and Zabulis, Xenophon and Michel, Damien and Padeleris, Pashalis and Sarmis, Thomas and Georgalis, Giannis and Koutlemanis, Panayotis and Tzevanidis, Konstantinos and Argyros, Antonis A and Michalis Sifakis and Polyxeni Adam-Veleni and Stephanidis, Constantine},
      title = {Macedonia from Fragments to Pixels: A Permanent Exhibition of Interactive Systems at the Archaeological Museum of Thessaloniki},
      booktitle = {International Conference on Progress in Cultural Heritage Preservation (EuroMed 2012)},
      publisher = {Springer},
      year = {2012},
      month = {October},
      pages = {602--609},
      address = {Lemesos, Cyprus},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      doi = {10.1007/978-3-642-34234-9_62},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_10_euromed_ami.pdf}
    }
    
  319. S. Stefanou and A.A. Argyros, "Efficient Scale and Rotation Invariant Object Detection Based on HOGs and Evolutionary Optimization Techniques", In Advances in Visual Computing (ISVC 2012), Springer, pp. 220-229, Rethymnon, Crete, Greece, July 2012.
    [Abstract] [BibTeX] [DOI] [PDF]

  320. Abstract: Object detection and localization in an image can be achieved by representing an object as a Histogram of Oriented Gradients (HOG). HOGs have proven to be robust object descriptors. However, to achieve accurate object localization, one must take a sliding window approach and evaluate the similarity of the descriptor over all possible windows in an image. In case that search should also be scale and rotation invariant, the exhaustive consideration of all possible HOG transformations makes the method impractical due to its computational complexity. In this work, we first propose a variant of an existing rotation invariant HOG-like descriptor. We then formulate object detection and localization as an optimization problem that is solved using the Particle Swarm Optimization (PSO) method. A series of experiments demonstrates that the proposed approach results in very large performance gains without sacrificing object detection and localization accuracy.
    BibTeX:
    @inproceedings{Stefanou2012,
      author = {Stefanou, Stefanos and Argyros, Antonis A},
      title = {Efficient Scale and Rotation Invariant Object Detection Based on HOGs and Evolutionary Optimization Techniques},
      booktitle = {Advances in Visual Computing (ISVC 2012)},
      publisher = {Springer},
      year = {2012},
      month = {July},
      pages = {220--229},
      address = {Rethymnon, Crete, Greece},
      projects =  {HOBBIT},
      doi = {10.1007/978-3-642-33179-4_22},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_07_isvc_hogpso.pdf}
    }
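    A hedged sketch of the search formulation above, assuming grayscale images: the HOG descriptor of the target is compared against HOG descriptors of candidate windows parameterized by (x, y, scale), and the resulting objective is minimized. A plain random search stands in for the Particle Swarm Optimization used in the paper; all names are illustrative.

    import numpy as np
    from skimage.feature import hog
    from skimage.transform import resize

    def hog_distance(template_hog, image, x, y, scale, patch=64):
        size = int(round(patch * scale))
        window = image[y:y + size, x:x + size]
        if window.shape != (size, size):
            return np.inf                                  # window falls outside the image
        desc = hog(resize(window, (patch, patch)), pixels_per_cell=(8, 8))
        return float(np.linalg.norm(desc - template_hog))

    def detect(image, template, trials=500, patch=64, seed=0):
        rng = np.random.default_rng(seed)
        t_hog = hog(resize(template, (patch, patch)), pixels_per_cell=(8, 8))
        best, best_cost = None, np.inf
        for _ in range(trials):                            # PSO would explore this space instead
            scale = rng.uniform(0.5, 2.0)
            size = int(round(patch * scale))
            x = int(rng.integers(0, max(1, image.shape[1] - size)))
            y = int(rng.integers(0, max(1, image.shape[0] - size)))
            cost = hog_distance(t_hog, image, x, y, scale, patch)
            if cost < best_cost:
                best, best_cost = (x, y, scale), cost
        return best, best_cost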
    
  321. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Tracking the articulated motion of two strongly interacting hands", In IEEE Computer Vision and Pattern Recognition (CVPR 2012), IEEE, pp. 1862-1869, Providence, Rhode Island, USA, June 2012.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  322. Abstract: We propose a method that relies on markerless visual observations to track the full articulation of two hands that interact with each other in a complex, unconstrained manner. We formulate this as an optimization problem whose 54-dimensional parameter space represents all possible configurations of two hands, each represented as a kinematic structure with 26 Degrees of Freedom (DoFs). To solve this problem, we employ Particle Swarm Optimization (PSO), an evolutionary, stochastic optimization method with the objective of finding the two-hands configuration that best explains observations provided by an RGB-D sensor. To the best of our knowledge, the proposed method is the first to attempt and achieve the articulated motion tracking of two strongly interacting hands. Extensive quantitative and qualitative experiments with simulated and real world image sequences demonstrate that an accurate and efficient solution of this problem is indeed feasible.
    BibTeX:
    @inproceedings{Oikonomidis2012,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Tracking the articulated motion of two strongly interacting hands},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2012)},
      publisher = {IEEE},
      year = {2012},
      month = {June},
      pages = {1862--1869},
      address = {Providence, Rhode Island, USA},
      url = {http://users.ics.forth.gr/~argyros/res_twohands.html},
      projects =  {GRASP,ROBOHOW},
      doi = {10.1109/CVPR.2012.6247885},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_06_cvpr_twohands.pdf},
      videolink = {https://youtu.be/e3G9soCdIbc}
    }
    
  323. P. Padeleris, X. Zabulis and A.A. Argyros, "Head pose estimation on depth data based on Particle Swarm Optimization", In IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2012), IEEE, pp. 42-49, Providence, Rhode Island, USA, June 2012.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  324. Abstract: We propose a method for human head pose estimation based on images acquired by a depth camera. During an initialization phase, a reference depth image of a human subject is obtained. At run time, the method searches the 6-dimensional pose space to find a pose from which the head appears identical to the reference view. This search is formulated as an optimization problem whose objective function quantifies the discrepancy of the depth measurements between the hypothesized views and the reference view. The method is demonstrated in several data sets including ones with known ground truth and comparatively evaluated with respect to state of the art methods. The obtained experimental results show that the proposed method outperforms existing methods in accuracy and tolerance to occlusions. Additionally, compared to the state of the art, it handles head pose estimation in a wider range of head poses.
    BibTeX:
    @inproceedings{Padeleris2012,
      author = {Padeleris, Pashalis and Zabulis, Xenophon and Argyros, Antonis A},
      title = {Head pose estimation on depth data based on Particle Swarm Optimization},
      booktitle = {IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2012)},
      publisher = {IEEE},
      year = {2012},
      month = {June},
      pages = {42--49},
      address = {Providence, Rhode Island, USA},
      url = {http://users.ics.forth.gr/~argyros/res_rgbdheadpose.html},
      projects =  {DALI,HOBBIT,AMIPROJ},
      doi = {10.1109/CVPRW.2012.6239236},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_06_hau3d_headpose.pdf},
      videolink = {https://youtu.be/BFxzyagDF9A}
    }
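    A compact, generic Particle Swarm Optimization loop, sketched to show the hypothesize-and-test search described above; depth_discrepancy is a placeholder for an objective that would render the reference head at a 6-D pose hypothesis and compare it to the observed depth map, and all bounds and constants are assumptions.

    import numpy as np

    def pso_minimize(objective, lower, upper, particles=32, iters=60, w=0.72, c1=1.5, c2=1.5, seed=0):
        rng = np.random.default_rng(seed)
        x = rng.uniform(lower, upper, size=(particles, len(lower)))   # pose hypotheses
        v = np.zeros_like(x)                                          # particle velocities
        pbest = x.copy()
        pbest_cost = np.array([objective(p) for p in x])
        gbest = pbest[np.argmin(pbest_cost)].copy()                   # global best hypothesis
        for _ in range(iters):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = np.clip(x + v, lower, upper)
            cost = np.array([objective(p) for p in x])
            improved = cost < pbest_cost
            pbest[improved], pbest_cost[improved] = x[improved], cost[improved]
            gbest = pbest[np.argmin(pbest_cost)].copy()
        return gbest, float(pbest_cost.min())

    def depth_discrepancy(pose):                                      # placeholder objective
        target = np.array([0.0, 0.0, 0.8, 0.1, -0.2, 0.0])            # (x, y, z, yaw, pitch, roll)
        return float(np.sum((pose - target) ** 2))

    best_pose, cost = pso_minimize(depth_discrepancy,
                                   lower=np.array([-1.0, -1.0, 0.3, -1.5, -1.5, -1.5]),
                                   upper=np.array([1.0, 1.0, 2.0, 1.5, 1.5, 1.5]))
    print("recovered pose:", np.round(best_pose, 3), "cost:", cost)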
    
  325. N. Kyriazis, I. Oikonomidis and A.A. Argyros, "A GPU-powered Computational Framework for Efficient 3D Model-based Vision", In International Conference on Cognitive Systems (COGSYS 2012), Vienna, Austria, February 2012.
    [BibTeX] [PDF] [URL]

  326. BibTeX:
    @inproceedings{Kyriazis2012,
      author = {Kyriazis, Nikolaos and Oikonomidis, Iason and Argyros, Antonis A},
      title = {A GPU-powered Computational Framework for Efficient 3D Model-based Vision},
      booktitle = {International Conference on Cognitive Systems (COGSYS 2012)},
      year = {2012},
      month = {February},
      address = {Vienna, Austria},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {GRASP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_02_cogsys_kyriazis.pdf}
    }
    
  327. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Efficient Model-based Tracking of the Articulated Motion of Hands", In International Conference on Cognitive Systems (COGSYS 2012), Vienna, Austria, February 2012.
    [BibTeX] [PDF] [URL]

  328. BibTeX:
    @inproceedings{Oikonomidis2012a,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Efficient Model-based Tracking of the Articulated Motion of Hands},
      booktitle = {International Conference on Cognitive Systems (COGSYS 2012)},
      year = {2012},
      month = {February},
      address = {Vienna, Austria},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {GRASP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2012_02_cogsys_oikonom.pdf}
    }
    
  329. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Full DOF tracking of a hand interacting with an object by modeling occlusions and physical constraints", In IEEE International Conference on Computer Vision (ICCV 2011), IEEE, pp. 2088-2095, Barcelona, Spain, November 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  330. Abstract: Due to occlusions, the estimation of the full pose of a human hand interacting with an object is much more challenging than pose recovery of a hand observed in isolation. In this work we formulate an optimization problem whose solution is the 26-DOF hand pose together with the pose and model parameters of the manipulated object. Optimization seeks for the joint hand-object model that (a) best explains the incompleteness of observations resulting from occlusions due to hand-object interaction and (b) is physically plausible in the sense that the hand does not share the same physical space with the object. The proposed method is the first that solves efficiently the continuous, full-DOF, joint hand-object tracking problem based solely on markerless multicamera input. Additionally, it is the first to demonstrate how hand-object interaction can be exploited as a context that facilitates hand pose estimation, instead of being considered as a complicating factor. Extensive quantitative and qualitative experiments with simulated and real world image sequences as well as a comparative evaluation with a state-of-the-art method for pose estimation of isolated hands, support the above findings.
    BibTeX:
    @inproceedings{Oikonomidis2011a,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Full DOF tracking of a hand interacting with an object by modeling occlusions and physical constraints},
      booktitle = {IEEE International Conference on Computer Vision (ICCV 2011)},
      publisher = {IEEE},
      year = {2011},
      month = {November},
      pages = {2088--2095},
      address = {Barcelona, Spain},
      url = {http://users.ics.forth.gr/~argyros/res_hope.html},
      projects =  {GRASP},
      doi = {10.1109/ICCV.2011.6126483},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_11_iccv_hope.pdf},
      videolink = {https://youtu.be/N3ffgj1bBGw}
    }
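    A brief sketch of an interpenetration penalty in the spirit of the physical-plausibility term above, under the assumption that both the hand and the manipulated object are approximated by sets of spheres given as NumPy arrays; the paper's full objective also scores how well a joint hand-object hypothesis explains the partially occluded observations.

    import numpy as np

    def penetration_penalty(hand_centers, hand_radii, obj_centers, obj_radii):
        """Sum of pairwise sphere overlaps between the hand model and the object model."""
        d = np.linalg.norm(hand_centers[:, None, :] - obj_centers[None, :, :], axis=2)
        overlap = (hand_radii[:, None] + obj_radii[None, :]) - d     # positive where spheres intersect
        return float(np.clip(overlap, 0.0, None).sum())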
    
  331. D. Grammenos, X. Zabulis, D. Michel, T. Sarmis, G. Georgalis, K. Tzevanidis, A.A. Argyros and C. Stephanidis, "Design and Development of Four Prototype Interactive Edutainment Exhibits for Museums", In Universal Access in Human-Computer Interaction Context Diversity UAHCI 2011, Held as Part of HCI International 2011, pp. 173-182, Orlando, Florida, USA, July 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  332. Abstract: This paper describes the outcomes stemming from the work of a multidisciplinary R&D project of ICS FORTH, aiming to explore and experiment with novel interactive museum exhibits, and to assess their utility, usability and potential impact. More specifically, four interactive systems are presented in this paper which have been integrated, tested and evaluated in a dedicated, appropriately designed, laboratory space. The paper also discusses key issues stemming from experience and observations in the course of qualitative evaluation sessions with a large number of participants.
    BibTeX:
    @inproceedings{Grammenos2011,
      author = {Grammenos, Dimitris and Zabulis, Xenophon and Michel, Damien and Sarmis, Thomas and Georgalis, Giannis and Tzevanidis, Konstantinos and Argyros, Antonis A and Stephanidis, Constantine},
      title = {Design and Development of Four Prototype Interactive Edutainment Exhibits for Museums},
      booktitle = {Universal Access in Human-Computer Interaction Context Diversity UAHCI 2011, Held as Part of HCI International 2011},
      year = {2011},
      month = {July},
      pages = {173--182},
      address = {Orlando, Florida, USA},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      doi = {10.1007/978-3-642-21666-4_20},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_07_hcii_ami.pdf}
    }
    
  333. D. Grammenos, X. Zabulis, D. Michel and A.A. Argyros, "Augmented Reality Interactive Exhibits in Cartographic Heritage: An implemented case-study open to the general public", International web journal on sciences and technologies affined to history of cartography and maps, ePerimetron, vol. 6, no. 2, pp. 57-67, April 2011.
    [Abstract] [BibTeX] [PDF] [URL]

  334. Abstract: This paper presents the application of the PaperView system in the domain of cartographic heritage. PaperView is a multi-user augmented-reality system for supplementing physical surfaces with digital information, through the use of pieces of plain paper that act as personal, location-aware, interactive screens. By applying the proposed method of reality augmentation in the cartographic heritage domain, the system provides the capability of retrieving multimedia information about areas of interest, overlaying information on a 2D or 3D (i.e., scale model) map, as well as comparing different versions of a single map. The technologies employed are presented, along with the interactive behavior of the system, which was instantiated and tested in three setups: (i) a map of Macedonia, Greece, including ancient Greek cities with archeological interest; (ii) a glass case containing a scale model and (iii) a part of Rigas Velestinlis’ Charta. The first two systems are currently installed and available to the general public at the Archaeological Museum of Thessaloniki, Greece, as part of a permanent exhibition of interactive systems.
    BibTeX:
    @article{Grammenos2011a,
      author = {Grammenos, Dimitris and Zabulis, Xenophon and Michel, Damien and Argyros, Antonis A},
      title = {Augmented Reality Interactive Exhibits in Cartographic Heritage: An implemented case-study open to the general public},
      journal = {International web journal on sciences and technologies affined to history of cartography and maps},
      publisher = {ePerimetron},
      year = {2011},
      month = {April},
      volume = {6},
      number = {2},
      pages = {57--67},
      address = {Hague, Netherlands},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_04_ica_cartographic.pdf}
    }
    
  335. D. Grammenos, D. Michel, X. Zabulis and A.A. Argyros, "PaperView: augmenting physical surfaces with location-aware digital information", In ACM Tangible, Embedded, and Embodied Interaction (TEI 2011), ACM, pp. 57-60, Funchal, Portugal, January 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  336. Abstract: A frequent need of museums is to provide visitors with context-sensitive information about exhibits in the form of maps, or scale models. This paper suggests an augmented-reality approach for supplementing physical surfaces with digital information, through the use of pieces of plain paper that act as personal, location-aware, interactive screens. The technologies employed are presented, along with the interactive behavior of the system, which was instantiated and tested in the form of two prototype setups: a wooden table covered with a printed map and a glass case containing a scale model. The paper also discusses key issues stemming from experience and observations in the course of qualitative evaluation sessions.
    BibTeX:
    @inproceedings{Grammenos2011b,
      author = {Grammenos, Dimitris and Michel, Damien and Zabulis, Xenophon and Argyros, Antonis A},
      title = {PaperView: augmenting physical surfaces with location-aware digital information},
      booktitle = {ACM Tangible, Embedded, and Embodied Interaction (TEI 2011)},
      publisher = {ACM},
      year = {2011},
      month = {January},
      pages = {57--60},
      address = {Funchal, Portugal},
      url = {http://users.ics.forth.gr/~argyros/res_paperview.html},
      projects =  {AMIPROJ},
      doi = {10.1145/1935701.1935713},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_01_tei_paperview.pdf},
      videolink = {https://youtu.be/ZxqUEYxc5FA?list=PL51573060F0131D04}
    }
    
  337. D. Michel, I. Oikonomidis and A.A. Argyros, "Scale invariant and deformation tolerant partial shape matching", Image and Vision Computing, Elsevier, vol. 29, no. 7, pp. 459-469, 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  338. Abstract: We present a novel approach to the problem of establishing the best match between an open contour and a part of a closed contour. At the heart of the proposed scheme lies a novel shape descriptor that also permits the quantification of local scale. Shape descriptors are computed along open or closed contours in a spatially non-uniform manner. The resulting ordered collections of shape descriptors constitute the global shape representation. A variant of an existing Dynamic Time Warping (DTW) matching technique is proposed to handle the matching of shape representations. Due to the properties of the employed shape descriptor, sampling scheme and matching procedure, the proposed approach performs partial shape matching that is invariant to Euclidean transformations, starting point as well as to considerable shape deformations. Additionally, the problem of matching closed-to-closed contours is naturally treated as a special case. Extensive experiments on benchmark datasets but also in the context of specific applications, demonstrate that the proposed scheme outperforms existing methods for the problem of partial shape matching and performs comparably to methods for full shape matching.
    BibTeX:
    @article{Michel2011,
      author = {Michel, Damien and Oikonomidis, Iason and Argyros, Antonis A},
      title = {Scale invariant and deformation tolerant partial shape matching},
      journal = {Image and Vision Computing},
      publisher = {Elsevier},
      year = {2011},
      volume = {29},
      number = {7},
      pages = {459--469},
      url = {http://users.ics.forth.gr/~argyros/res_partialshapematching.html},
      projects =  {GRASP},
      doi = {10.1016/j.imavis.2011.01.008},
      pdflink = {http://dx.doi.org/10.1016/j.imavis.2011.01.008},
      videolink = {https://youtu.be/_N41CP0f0Mg}
    }
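    A rough illustration of the matching idea in the entry above: the paper aligns ordered sequences of shape descriptors with a Dynamic Time Warping (DTW) variant so that an open contour can be matched against part of a closed one. The Python sketch below is only a brute-force stand-in under assumptions of mine (function names, a plain Euclidean descriptor distance, exhaustive cyclic starting points); it ignores the paper's non-uniform sampling and its more efficient matching procedure.
      import numpy as np

      def dtw_cost(a, b):
          # Classic O(n*m) DTW between two descriptor sequences a (n x d) and b (m x d).
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  d = np.linalg.norm(a[i - 1] - b[j - 1])
                  D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m]

      def match_open_to_closed(open_desc, closed_desc, part_len):
          # Try every cyclic starting point on the closed contour and keep the
          # window of part_len descriptors with the lowest DTW cost to the open
          # contour; returns (cost, start index on the closed contour).
          open_desc, closed_desc = np.asarray(open_desc), np.asarray(closed_desc)
          m = len(closed_desc)
          best = (np.inf, -1)
          for s in range(m):
              window = closed_desc[[(s + k) % m for k in range(part_len)]]
              best = min(best, (dtw_cost(open_desc, window), s))
          return best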
    
  339. K. Tzevanidis and A.A. Argyros, "Unsupervised learning of background modeling parameters in multicamera systems", Computer Vision and Image Understanding, Academic Press, vol. 115, no. 1, pp. 105-116, 2011.
    [Abstract] [BibTeX] [DOI] [PDF]

  340. Abstract: Background modeling algorithms are commonly used in camera setups for foreground object detection. Typically, these algorithms need adjustment of their parameters towards achieving optimal performance in different scenarios and/or lighting conditions. This is a tedious process requiring considerable effort by expert users. In this work we propose a novel, fully automatic method for the tuning of foreground detection parameters in calibrated multicamera systems. The proposed method requires neither user intervention nor ground truth data. Given a set of such parameters, we define a fitness function based on the consensus built from the multicamera setup regarding whether points belong to the scene foreground or background. The maximization of this fitness function through Particle Swarm Optimization leads to the adjustment of the foreground detection parameters. Extensive experimental results confirm the effectiveness of the adopted approach.
    BibTeX:
    @article{Tzevanidis2011,
      author = {Tzevanidis, Konstantinos and Argyros, Antonis A},
      title = {Unsupervised learning of background modeling parameters in multicamera systems},
      journal = {Computer Vision and Image Understanding},
      publisher = {Academic Press},
      year = {2011},
      volume = {115},
      number = {1},
      pages = {105--116},
      projects =  {GRASP},
      doi = {10.1016/j.cviu.2010.09.003},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_01_journal_cviu_background.pdf}
    }
    
  341. N. Kyriazis, I. Oikonomidis and A.A. Argyros, "Binding Computer Vision to Physics Based Simulation: The Case Study of a Bouncing Ball", In British Machine Vision Conference (BMVC 2011), BMVA, pp. 1-11, Dundee, UK, 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  342. Abstract: A dynamic scene and, therefore, its visual observations are invariably determined by the laws of physics. We demonstrate an illustrative case where physical explanation, as a vision prior, is not a commodity but a necessity. By considering the problem of ball motion estimation we show how physics-based simulation in conjunction with visual processes can lead to the reduction of the visual input required to infer physical attributes of the observed world. Even further, we show that the proposed methodology manages to reveal certain physical attributes of the observed scene that are difficult or even impossible to extract by other means. A series of experiments on synthetic data as well as experiments with image sequences of an actual ball, support the validity of the proposed approach. The use of generic tools and the top-down nature of the proposed approach make it general enough to be a likely candidate for handling even more complex problems in larger contexts.
    BibTeX:
    @inproceedings{Kyriazis2011,
      author = {Kyriazis, Nikolaos and Oikonomidis, Iason and Argyros, Antonis A},
      title = {Binding Computer Vision to Physics Based Simulation: The Case Study of a Bouncing Ball},
      booktitle = {British Machine Vision Conference (BMVC 2011)},
      publisher = {BMVA},
      year = {2011},
      pages = {1--11},
      address = {Dundee, UK},
      url = {http://users.ics.forth.gr/~argyros/res_bouncingball.html},
      projects =  {GRASP},
      doi = {10.5244/C.25.43},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_09_bmvc_bouncing_ball.pdf},
      videolink = {https://youtu.be/Lr5wq5It4io}
    }
    
  343. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Efficient model-based 3D tracking of hand articulations using Kinect", In British Machine Vision Conference (BMVC 2011), BMVA, pp. 1-11, Dundee, UK, 2011.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  344. Abstract: We present a novel solution to the problem of recovering and tracking the 3D position, orientation and full articulation of a human hand from markerless visual observations obtained by a Kinect sensor. We treat this as an optimization problem, seeking for the hand model parameters that minimize the discrepancy between the appearance and 3D structure of hypothesized instances of a hand model and actual hand observations. This optimization problem is effectively solved using a variant of Particle Swarm Optimization (PSO). The proposed method does not require special markers and/or a complex image acquisition setup. Being model based, it provides continuous solutions to the problem of tracking hand articulations. Extensive experiments with a prototype GPU-based implementation of the proposed method demonstrate that accurate and robust 3D tracking of hand articulations can be achieved in near real-time (15Hz).
    BibTeX:
    @inproceedings{Oikonomidis2011,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Efficient model-based 3D tracking of hand articulations using Kinect},
      booktitle = {British Machine Vision Conference (BMVC 2011)},
      publisher = {BMVA},
      year = {2011},
      volume = {1},
      number = {2},
      pages = {1--11},
      address = {Dundee, UK},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {GRASP},
      doi = {10.5244/C.25.101},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_09_bmvc_kinect_hand_tracking.pdf},
      videolink = {https://youtu.be/Fxa43qcm1C4}
    }
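    The core computational tool in the Kinect hand-tracking entry above is Particle Swarm Optimization (PSO) over the hand-model parameters. The sketch below shows only a generic, canonical PSO loop; the objective is a placeholder of mine (the paper's actual objective renders 26-DOF hand hypotheses and scores them against the observed depth and skin maps), and all names and default values here are illustrative assumptions, not the authors' implementation.
      import numpy as np

      def pso_minimize(objective, dim, n_particles=64, n_generations=40,
                       lo=-1.0, hi=1.0, w=0.72, c1=1.5, c2=1.5, seed=0):
          # Canonical PSO: particles move through parameter space, attracted to
          # their personal best and to the global best position found so far.
          rng = np.random.default_rng(seed)
          x = rng.uniform(lo, hi, (n_particles, dim))   # positions (pose hypotheses)
          v = np.zeros_like(x)                          # velocities
          pbest, pbest_f = x.copy(), np.array([objective(p) for p in x])
          gbest = pbest[np.argmin(pbest_f)].copy()
          for _ in range(n_generations):
              r1, r2 = rng.random(x.shape), rng.random(x.shape)
              v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
              x = np.clip(x + v, lo, hi)
              f = np.array([objective(p) for p in x])
              better = f < pbest_f
              pbest[better], pbest_f[better] = x[better], f[better]
              gbest = pbest[np.argmin(pbest_f)].copy()
          return gbest, float(pbest_f.min())

      # Toy usage with a stand-in objective (a real one would score a rendered
      # hand hypothesis against the Kinect observation):
      best_pose, best_score = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=26)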
    
  345. R. Dillmann, T. Asfour and A.A. Argyros, "Intelligent and Cognitive Systems-Introduction to the Special Theme", ERCIM News, no. 84, pp. 12-13, 2011.
    [BibTeX] [PDF] [URL]

  346. BibTeX:
    @periodical{Dillmann2011,
      author = {Dillmann, Rudiger and Asfour, Tamim and Argyros, Antonis A},
      title = {Intelligent and Cognitive Systems-Introduction to the Special Theme},
      journal = {ERCIM News},
      year = {2011},
      number = {84},
      pages = {12--13},
      url = {http://users.ics.forth.gr/~argyros},
      projects =  {none},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_01_ercimnews_cognitive_systems.pdf}
    }
    
  347. X. Zabulis, D. Grammenos, A.A. Argyros, M. Sifakis and C. Stephanidis, "Macedonia: From Fragments to Pixels", ERCIM News, no. 86, 2011.
    [BibTeX] [PDF] [URL]

  348. BibTeX:
    @periodical{Zabulis2011,
      author = {Zabulis, Xenophon and Grammenos, Dimitris and Argyros, Antonis A and Sifakis, Michalis and Stephanidis, Constantine},
      title = {Macedonia: From Fragments to Pixels},
      journal = {ERCIM News},
      year = {2011},
      number = {86},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      pdflink = {http://ercim-news.ercim.eu/en86/special/macedonia-from-fragments-to-pixels}
    }
    
  349. N. Kyriazis, I. Oikonomidis and A.A. Argyros, "A GPU-powered computational framework for efficient 3D model-based vision", FORTH-ICS, TR-420, 2011.
    [Abstract] [BibTeX] [PDF] [URL]

  350. Abstract: We present a generic computational framework that exploits GPU processing to cope with the significant computational requirements of a class of model-based vision problems. We study the structure of this class of problems and map the involved processes to contemporary GPU architectures. The proposed framework has been validated through its application to various instances of the problem of model-based 3D hand tracking. We show that through the exploitation of this framework near real-time performance is achieved in problems that are prohibitively expensive to solve on CPU-only architectures. Additional experiments performed in various GPU architectures demonstrate the scalability of the approach and the distribution of the execution time among the involved processes.
    BibTeX:
    @techreport{Kyriazis2011a,
      author = {Kyriazis, Nikolaos and Oikonomidis, Iason and Argyros, Antonis A},
      title = {A GPU-powered computational framework for efficient 3D model-based vision},
      school = {FORTH-ICS},
      year = {2011},
      number = {TR-420},
      url = {http://cvrlcode.ics.forth.gr/handtracking/},
      projects =  {GRASP},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2011_07_tr420_gpuarch.pdf}
    }
    
  351. K. Papoutsakis and A.A. Argyros, "Object Tracking and Segmentation in a Closed Loop", In Advances in Visual Computing (ISVC 2010), Springer, pp. 405-416, Las Vegas, Nevada, USA, November 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  352. Abstract: We introduce a new method for integrated tracking and segmentation of a single non-rigid object in an monocular video, captured by a possibly moving camera. A closed-loop interaction between EM-like color-histogram-based tracking and Random Walker-based image segmentation is proposed, which results in reduced tracking drifts and in fine object segmentation. More specifically, pixel-wise spatial and color image cues are fused using Bayesian inference to guide object segmentation. The spatial properties and the appearance of the segmented objects are exploited to initialize the tracking algorithm in the next step, closing the loop between tracking and segmentation. As confirmed by experimental results on a variety of image sequences, the proposed approach efficiently tracks and segments previously unseen objects of varying appearance and shape, under challenging environmental conditions.
    BibTeX:
    @inproceedings{Papoutsakis2010,
      author = {Papoutsakis, Konstantinos and Argyros, Antonis A},
      title = {Object Tracking and Segmentation in a Closed Loop},
      booktitle = {Advances in Visual Computing (ISVC 2010)},
      publisher = {Springer},
      year = {2010},
      month = {November},
      pages = {405--416},
      address = {Las Vegas, Nevada, USA},
      url = {http://users.ics.forth.gr/~argyros/res_trackingsegmentation.html},
      projects =  {GRASP},
      doi = {10.1007/978-3-642-17289-2_39},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_12_isvc_trackingsegmentation.pdf}
    }
    
  353. X. Zabulis, T. Sarmis, K. Tzevanidis, P. Koutlemanis, D. Grammenos and A.A. Argyros, "A Platform for Monitoring Aspects of Human Presence in Real-Time", In Advances in Visual Computing (ISVC 2010), Springer, pp. 584-595, Las Vegas, Nevada, USA, November 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  354. Abstract: In this paper, the design and implementation of a hardware/software platform for parallel and distributed multiview vision processing is presented. The platform is focused at supporting the monitoring of human presence in indoor environments. Its architecture is focused at increased throughput through process pipelining as well as at reducing communication costs and hardware requirements. Using this platform, we present efficient implementations of basic visual processes such as person tracking, textured visual hull computation and head pose estimation. Using the proposed platform multiview visual operations can be combined and third-party ones integrated, to ultimately facilitate the development of interactive applications that employ visual input. Computational performance is benchmarked comparatively to state of the art and the efficacy of the approach is qualitatively assessed in the context of already developed applications related to interactive environment.
    BibTeX:
    @inproceedings{Zabulis2010,
      author = {Zabulis, Xenophon and Sarmis, Thomas and Tzevanidis, Konstantinos and Koutlemanis, Panayotis and Grammenos, Dimitris and Argyros, Antonis A},
      title = {A Platform for Monitoring Aspects of Human Presence in Real-Time},
      booktitle = {Advances in Visual Computing (ISVC 2010)},
      publisher = {Springer},
      year = {2010},
      month = {November},
      pages = {584--595},
      address = {Las Vegas, Nevada, USA},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      doi = {10.1007/978-3-642-17274-8_57},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_12_isvc_humanpresence.pdf}
    }
    
  355. K. Tzevanidis, X. Zabulis, T. Sarmis, P. Koutlemanis, N. Kyriazis and A.A. Argyros, "From Multiple Views to Textured 3D Meshes: A GPU-Powered Approach", In European Conference on Computer Vision Workshops (CVGPU 2010 - ECCVW 2010), Springer, pp. 384-397, Heraklion, Crete, Greece, September 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  356. Abstract: We present work on exploiting modern graphics hardware towards the real-time production of a textured 3D mesh representation of a scene observed by a multicamera system. The employed computational infrastructure consists of a network of four PC workstations each of which is connected to a pair of cameras. One of the PCs is equipped with a GPU that is used for parallel computations. The result of the processing is a list of texture mapped triangles representing the reconstructed surfaces. In contrast to previous works, the entire processing pipeline (foreground segmentation, 3D reconstruction, 3D mesh computation, 3D mesh smoothing and texture mapping) has been implemented on the GPU. Experimental results demonstrate that an accurate, high resolution, texture-mapped 3D reconstruction of a scene observed by eight cameras is achievable in real time.
    BibTeX:
    @inproceedings{Tzevanidis2010,
      author = {Tzevanidis, Konstantinos and Zabulis, Xenophon and Sarmis, Thomas and Koutlemanis, Panayotis and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {From Multiple Views to Textured 3D Meshes: A GPU-Powered Approach},
      booktitle = {European Conference on Computer Vision Workshops (CVGPU 2010 - ECCVW 2010)},
      publisher = {Springer},
      year = {2010},
      month = {September},
      pages = {384--397},
      address = {Heraklion, Crete, Greece},
      url = {http://users.ics.forth.gr/~argyros/res_gpu3Drec.html},
      projects =  {GRASP,AMIPROJ},
      doi = {10.1007/978-3-642-35740-4_30},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_09_cvgpu_3Dreconstruction.pdf},
      videolink = {https://youtu.be/n0KC7wL_D_Q}
    }
    
  357. X. Zabulis, D. Grammenos, T. Sarmis, K. Tzevanidis and A.A. Argyros, "Exploration of Large-scale Museum Artifacts through Non-instrumented, Location-based, Multi-user Interaction", In International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage (VAST 2010), pp. 155-162, Paris, France, September 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

    Abstract: This paper presents a system that supports the exploration of digital representations of large-scale museum artifacts through non-instrumented, location-based interaction. The system employs a state-of-the-art computer vision system, which localizes and tracks multiple visitors. The artifact is presented on a wall-sized projection screen and it is visually annotated with text and images according to the location as well as walkthrough trajectories of the tracked visitors. The system is evaluated in terms of computational performance, localization accuracy, tracking robustness and usability.
    BibTeX:
    @inproceedings{Zabulis2010a,
      author = {Zabulis, Xenophon and Grammenos, Dimitris and Sarmis, Thomas and Tzevanidis, Konstantinos and Argyros, Antonis A},
      title = {Exploration of Large-scale Museum Artifacts through Non-instrumented, Location-based, Multi-user Interaction},
      booktitle = {International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage (VAST 2010)},
      year = {2010},
      month = {September},
      pages = {155--162},
      address = {Paris, France},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      doi = {10.2312/VAST/VAST10/155-162},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_09_vast_localization.pdf}
    }
    
  359. I. Oikonomidis, N. Kyriazis and A.A. Argyros, "Markerless and Efficient 26-DOF Hand Pose Recovery", In Asian Conference on Computer Vision (ACCV 2010), Springer, pp. 744-757, Queenstown, New Zealand, April 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  360. Abstract: We present a novel method that, given a sequence of synchronized views of a human hand, recovers its 3D position, orientation and full articulation parameters. The adopted hand model is based on properly selected and assembled 3D geometric primitives. Hypothesized configurations/poses of the hand model are projected to different camera views and image features such as edge maps and hand silhouettes are computed. An objective function is then used to quantify the discrepancy between the predicted and the actual, observed features. The recovery of the 3D hand pose amounts to estimating the parameters that minimize this objective function which is performed using Particle Swarm Optimization. All the basic components of the method (feature extraction, objective function evaluation, optimization process) are inherently parallel. Thus, a GPU-based implementation achieves a speedup of two orders of magnitude over the case of CPU processing. Extensive experimental results demonstrate qualitatively and quantitatively that accurate 3D pose recovery of a hand can be achieved robustly at a rate that greatly outperforms the current state of the art.
    BibTeX:
    @inproceedings{Oikonomidis2010,
      author = {Oikonomidis, Iason and Kyriazis, Nikolaos and Argyros, Antonis A},
      title = {Markerless and Efficient 26-DOF Hand Pose Recovery},
      booktitle = {Asian Conference on Computer Vision (ACCV 2010)},
      publisher = {Springer},
      year = {2010},
      month = {April},
      pages = {744--757},
      address = {Queenstown, New Zealand},
      url = {http://users.ics.forth.gr/~argyros/res_3Dhandpose.html},
      projects =  {GRASP},
      doi = {10.1007/978-3-642-19318-7_58},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_11_ACCV_3Dhandpose.pdf}
    }
    
  361. D. Michel, A.A. Argyros and M.I.A. Lourakis, "Horizon matching for localizing unordered panoramic images", Computer Vision and Image Understanding, Elsevier, vol. 114, no. 2, pp. 274-285, 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  362. Abstract: There is currently an abundance of vision algorithms which, provided with a sequence of images that have been acquired from sufficiently close successive 3D locations, are capable of determining the relative positions of the viewpoints from which the images have been captured. However, very few of these algorithms can cope with unordered image sets. This paper presents an efficient method for recovering the position and orientation parameters corresponding to the viewpoints of a set of panoramic images for which no a priori order information is available, along with certain structure information regarding the imaged environment. The proposed approach assumes that all images have been acquired from a constant height above a planar ground and operates sequentially, employing the Levenshtein distance to deduce the spatial proximity of image viewpoints and thus determine the order in which images should be processed. The Levenshtein distance also provides matches between imaged points, from which their corresponding environment points can be reconstructed. Image matching with the aid of the Levenshtein distance forms the crux of an iterative process that alternates between image localization from multiple reconstructed points and point reconstruction from multiple image projections, until all views have been localized. Periodic refinement of the reconstruction with the aid of bundle adjustment, distributes the reconstruction error among images. The approach is demonstrated on several unordered sets of panoramic images obtained in indoor environments.
    BibTeX:
    @article{Michel2010,
      author = {Michel, Damien and Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Horizon matching for localizing unordered panoramic images},
      journal = {Computer Vision and Image Understanding},
      publisher = {Elsevier},
      year = {2010},
      volume = {114},
      number = {2},
      pages = {274--285},
      url = {http://users.ics.forth.gr/~argyros/res_pan_slam.html},
      projects =  {none},
      doi = {10.1016/j.cviu.2009.03.006},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_02_journal_cviu_omni_localization.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/reconsProgress2frames.avi}
    }
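    The localization pipeline in the entry above ranks panoramas by the Levenshtein (edit) distance between feature strings extracted from their horizons. How those strings are built is specific to the paper and is not shown; the snippet below is just the standard dynamic-programming edit distance such a ranking would rely on (function and variable names are mine).
      def levenshtein(a, b):
          # Standard O(len(a)*len(b)) edit distance with unit costs.
          prev = list(range(len(b) + 1))
          for i, x in enumerate(a, 1):
              cur = [i]
              for j, y in enumerate(b, 1):
                  cur.append(min(prev[j] + 1,              # deletion
                                 cur[j - 1] + 1,           # insertion
                                 prev[j - 1] + (x != y)))  # substitution / match
              prev = cur
          return prev[-1]

      # e.g. levenshtein("kitten", "sitting") == 3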
    
  363. V. Papadourakis and A.A. Argyros, "Multiple objects tracking in the presence of long-term occlusions", Computer Vision and Image Understanding, Elsevier, vol. 114, no. 7, pp. 835-846, 2010.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  364. Abstract: We present a robust object tracking algorithm that handles spatially extended and temporally long object occlusions. The proposed approach is based on the concept of “object permanence” which suggests that a totally occluded object will re-emerge near its occluder. The proposed method does not require prior training to account for differences in the shape, size, color or motion of the objects to be tracked. Instead, the method automatically and dynamically builds appropriate object representations that enable robust and effective tracking and occlusion reasoning. The proposed approach has been evaluated on several image sequences showing either complex object manipulation tasks or human activity in the context of surveillance applications. Experimental results demonstrate that the developed tracker is capable of handling several challenging situations, where the labels of objects are correctly identified and maintained over time, despite the complex interactions among the tracked objects that lead to several layers of occlusions.
    BibTeX:
    @article{Papadourakis2010,
      author = {Papadourakis, Vasilis and Argyros, Antonis A},
      title = {Multiple objects tracking in the presence of long-term occlusions},
      journal = {Computer Vision and Image Understanding},
      publisher = {Elsevier},
      year = {2010},
      volume = {114},
      number = {7},
      pages = {835--846},
      url = {http://users.ics.forth.gr/~argyros/res_occlusions.html},
      projects =  {GRASP},
      doi = {10.1016/j.cviu.2010.02.003},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2010_07_journal_cviu_occlusions.pdf},
      videolink = {https://youtu.be/MiLah8Q0TWk}
    }
    
  365. D. Grammenos, X. Zabulis, A.A. Argyros and C. Stephanidis, "FORTH-ICS Internal RTD Programme Ambient Intelligence and Smart Environments", In European Conference on Ambient Intelligence (AMI 2009), Salzburg, Austria, November 2009.
    [Abstract] [BibTeX] [PDF]

  366. Abstract: This paper introduces the horizontal, interdisciplinary, cross-thematic RTD Programme in the field of Ambient Intelligence which has recently been initiated by the Institute of Computer Science of the Foundation for Research and Technology Hellas, aiming to contribute towards the creation and provision of pioneering human-centric AmI technologies and smart environments.
    BibTeX:
    @inproceedings{Grammenos2009,
      author = {Grammenos, Dimitris and Zabulis, Xenophon and Argyros, Antonis A and Stephanidis, Constantine},
      title = {FORTH-ICS Internal RTD Programme Ambient Intelligence and Smart Environments},
      booktitle = {European Conference on Ambient Intelligence (AMI 2009)},
      year = {2009},
      month = {November},
      address = {Salzburg, Austria},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_11_AmI09_AmI.pdf}
    }
    
  367. I. Oikonomidis and A.A. Argyros, "Deformable 2D Shape Matching Based on Shape Contexts and Dynamic Programming", In Advances in Visual Computing (ISVC 2009), Springer, pp. 460-469, Las Vegas, Nevada, USA, November 2009.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  368. Abstract: This paper presents a method for matching closed, 2D shapes (2D object silhouettes) that are represented as an ordered collection of shape contexts [1]. Matching is performed using a recent method that computes the optimal alignment of two cyclic strings in sub-cubic runtime. Thus, the proposed method is suitable for efficient, near real-time matching of closed shapes. The method is qualitatively and quantitatively evaluated using several datasets. An application of the method for joint detection in human figures is also presented.
    BibTeX:
    @inproceedings{Oikonomidis2009,
      author = {Oikonomidis, Iason and Argyros, Antonis A},
      title = {Deformable 2D Shape Matching Based on Shape Contexts and Dynamic Programming},
      booktitle = {Advances in Visual Computing (ISVC 2009)},
      publisher = {Springer},
      year = {2009},
      month = {November},
      pages = {460--469},
      address = {Las Vegas, Nevada, USA},
      url = {http://users.ics.forth.gr/~argyros/res_shapematching.html},
      projects =  {GRASP},
      doi = {10.1007/978-3-642-10520-3_43},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_11_isvc_shape_matching.pdf}
    }
    
  369. I. Cheng, D. Michel, A.A. Argyros and A. Basu, "A HIMI model for collaborative multi-touch multimedia education", In Proceedings of the 2009 workshop on Ambient media computing, ACM, pp. 3-12, Beijing, China, October 2009.
    [Abstract] [BibTeX] [PDF]

  370. Abstract: Educational testing and learning have evolved from using standard True/False, fill-in-the-blank and multiple choice on paper to more visually enriched formats using interactive multimedia content on digital displays. However, traditional educational application interfaces are primarily mouse-driven, which prevents multiple users working simultaneously. Although touch-based displays have emerged and inspired new developments, they are mainly used in simple tasks. In this paper we show how the multi-touch technology can be extended to collaborative learning and testing at a larger scale, using an existing education implementation for illustration. We propose a Human-Intention-Machine-Interpretation (HIMI) model, which applies a graph-based approach to recognize hand gestures and interpret user intentions. Our focus is not to build a new multitouch system but to make use of the existing multi-touch technology to enhance learning performance. The HIMI model not only facilitates natural interactions using hand movements on simple tasks, but also supports complex collaborative operations. Our contribution lies in embedding the multi-touch technology in multimedia education, providing a multi-user learning and testing environment which would not have been possible using traditional input devices. We formalize a conceptual model to uniquely interpret user intentions via touch states, state transitions and transition associations. We also propose a set of hand gestures for working with multimedia educational items. User evaluations are conducted to show the feasibility of the proposed hand gestures.
    BibTeX:
    @inproceedings{Cheng2009,
      author = {Cheng, Irene and Michel, Damien and Argyros, Antonis A and Basu, Anup},
      title = {A HIMI model for collaborative multi-touch multimedia education},
      booktitle = {Proceedings of the 2009 workshop on Ambient media computing},
      publisher = {ACM},
      year = {2009},
      month = {October},
      pages = {3--12},
      address = {Beijing, China},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_10_amc09_eblackboard_education.pdf}
    }
    
  371. X. Zabulis, T. Sarmis and A.A. Argyros, "3D Head Pose Estimation from Multiple Distant Views", In British Machine Vision Conference (BMVC 2009), BMVA, pp. 1-12, London, UK, September 2009.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  372. Abstract: A method for human head pose estimation in multicamera environments is proposed. The method computes the textured visual hull of the subject and unfolds the texture of the head on a hypothetical sphere around it, whose parameterization is iteratively rotated so that the face eventually occurs on its equator. This gives rise to a spherical image, in which face detection is simplified, because exactly one frontal face is guaranteed to appear in it. In this image, the face center yields two components of pose (yaw, pitch), while the third (roll) is retrieved from the orientation of the major symmetry axis of the face. Face detection applied on the original images reduces the required iterations and anchors tracking drift. The method is demonstrated and evaluated in several data sets, including ones with known ground truth. Experimental results show that the proposed method is accurate and robust to distant imaging, despite the low-resolution appearance of subjects.
    BibTeX:
    @inproceedings{Zabulis2009a,
      author = {Zabulis, Xenophon and Sarmis, Thomas and Argyros, Antonis A},
      title = {3D Head Pose Estimation from Multiple Distant Views},
      booktitle = {British Machine Vision Conference (BMVC 2009)},
      publisher = {BMVA},
      year = {2009},
      month = {September},
      volume = {1},
      number = {2},
      pages = {1--12},
      address = {London, UK},
      url = {http://www.ics.forth.gr/cvrl/headpose/},
      projects =  {none},
      doi = {10.5244/C.23.118},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_09_bmvc_headpose.pdf},
      videolink = {https://youtu.be/x3GUYs9snBs}
    }
    
  373. D. Grammenos, Y. Georgalis, N. Partarakis, X. Zabulis, T. Sarmis, S. Kartakis, P. Tourlakis, A.A. Argyros and C. Stephanidis, "Rapid Prototyping of an AmI-Augmented Office Environment Demonstrator", In Human-Computer Interaction. Ambient, Ubiquitous and Intelligent Interaction, Springer, pp. 397-406, San Diego, CA, USA, July 2009.
    [Abstract] [BibTeX] [DOI] [PDF]

  374. Abstract: This paper presents the process and tangible outcomes of a rapid prototyping activity towards the creation of a demonstrator, showcasing the potential use and effect of Ambient Intelligence technologies in a typical office environment. In this context, the hardware and software components used are described, as well as the interactive behavior of the demonstrator. Additionally, some conclusions stemming from the experience gained are presented, along with pointers for future research and development work.
    BibTeX:
    @inproceedings{Grammenos2009a,
      author = {Grammenos, Dimitris and Georgalis, Yannis and Partarakis, Nikolaos and Zabulis, Xenophon and Sarmis, Thomas and Kartakis, Sokratis and Tourlakis, Panagiotis and Argyros, Antonis A and Stephanidis, Constantine},
      title = {Rapid Prototyping of an AmI-Augmented Office Environment Demonstrator},
      booktitle = {Human-Computer Interaction. Ambient, Ubiquitous and Intelligent Interaction},
      publisher = {Springer},
      year = {2009},
      month = {July},
      pages = {397--406},
      address = {San Diego, CA, USA},
      projects =  {AMIPROJ},
      doi = {10.1007/978-3-642-02580-8_43},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_07_hci_smartoffice.pdf}
    }
    
  375. G. López-Nicolás, M. Sfakiotakis, D.P. Tsakiris, A.A. Argyros, C. Sagüés and J.J. Guerrero, "Visual homing for undulatory robotic locomotion", In IEEE International Conference on Robotics and Automation (ICRA 2009), IEEE, pp. 2629-2636, Kobe, Japan, May 2009.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  376. Abstract: This paper addresses the problem of vision-based closed-loop control for undulatory robots. We present an image-based visual servoing scheme, which drives the robot to a desired location specified by a target image, without explicitly estimating its pose. Instead, the control relies on the computation of the epipolar geometry between the current and target images. We analyze controllability and stability of the proposed control scheme, which is validated by simulation studies using the SIMUUN computational tools. Preliminary experiments, involving the Nereisbot undulatory robotic prototype, are also presented.
    BibTeX:
    @inproceedings{Lopez-Nicolas2009,
      author = {Gonzalo López-Nicolás and Sfakiotakis, Michael and Tsakiris, Dimitris P and Argyros, Antonis A and Sagüés, Carlos and Guerrero, José Jesús},
      title = {Visual homing for undulatory robotic locomotion},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 2009)},
      publisher = {IEEE},
      year = {2009},
      month = {May},
      pages = {2629--2636},
      address = {Kobe, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_undulatory.html},
      doi = {10.1109/ROBOT.2009.5152432},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_05_icra_homing.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/videoUndulatoryDivx6.avi}
    }
    
  377. D. Michel, A.A. Argyros, D. Grammenos, X. Zabulis and T. Sarmis, "Building a Multi-Touch Display Based on Computer Vision Techniques", In Machine Vision Applications (MVA 2009), pp. 74-77, Hiyoshi Campus, Keio University, Japan, May 2009.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

    Abstract: We present the development of a multi-touch display based on computer vision techniques. The developed system is built upon low cost, off-the-shelf hardware components and a careful selection of computer vision techniques. The resulting system is capable of detecting and tracking several objects that may move freely on the surface of a wide projection screen. It also provides additional information regarding the detected and tracked objects, such as their orientation, their full contour, etc. All of the above are achieved robustly, in real time and regardless of the visual appearance of what may be independently projected on the projection screen. We also present indicative results from the exploitation of the developed system in three application scenarios and discuss directions for further research.
    BibTeX:
    @inproceedings{Michel2009,
      author = {Michel, Damien and Argyros, Antonis A and Grammenos, Dimitris and Zabulis, Xenophon and Sarmis, Thomas},
      title = {Building a Multi-Touch Display Based on Computer Vision Techniques},
      booktitle = {Machine Vision Applications (MVA 2009)},
      year = {2009},
      month = {May},
      pages = {74--77},
      address = {Hiyoshi Campus, Keio University, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_multitouch.html},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_05_mva_smartboard.pdf},
      videolink = {https://youtu.be/l6y2clf73-8?list=PL51573060F0131D04}
    }
    
  379. X. Zabulis, T. Sarmis, D. Grammenos and A.A. Argyros, "A Multicamera Vision System Supporting the Development of Wide-Area Exertainment Applications", In Machine Vision Applications (MVA 2009), pp. 269-272, Keio University, Japan, May 2009.
    [Abstract] [BibTeX] [PDF] [URL]

  380. Abstract: In this paper, the application of computer vision techniques to the localization of multiple persons in a relatively wide gaming terrain is presented. Multiple views are employed both for terrain coverage, but most importantly, for treatment of occlusions. Through the appropriate selection of lightweight operations and acceleration strategies, an adequate frame rate is achieved despite the large volume of input data. The resulting system is employed in the development of multiplayer entertainment applications, which are demonstrated and evaluated.
    BibTeX:
    @inproceedings{Zabulis2009b,
      author = {Zabulis, Xenophon and Sarmis, Thomas and Grammenos, Dimitris and Argyros, Antonis A},
      title = {A Multicamera Vision System Supporting the Development of Wide-Area Exertainment Applications},
      booktitle = {Machine Vision Applications (MVA 2009)},
      year = {2009},
      month = {May},
      pages = {269--272},
      address = {Keio University, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_05_mva_exertainment.pdf}
    }
    
  381. H. Baltzakis and A.A. Argyros, "Propagation of Pixel Hypotheses for Multiple Objects Tracking", In Advances in Visual Computing (ISVC 2009), Springer, pp. 140-149, Las Vegas, Nevada, USA, January 2009.
    [Abstract] [BibTeX] [DOI] [PDF]

  382. Abstract: In this paper we propose a new approach for tracking multiple objects in image sequences. The proposed approach differs from existing ones in important aspects of the representation of the location and the shape of tracked objects and of the uncertainty associated with them. The location and the speed of each object is modeled as a discrete time, linear dynamical system which is tracked using Kalman filtering. Information about the spatial distribution of the pixels of each tracked object is passed on from frame to frame by propagating a set of pixel hypotheses, uniformly sampled from the original object’s projection to the target frame using the object’s current dynamics, as estimated by the Kalman filter. The density of the propagated pixel hypotheses provides a novel metric that is used to associate image pixels with existing object tracks by taking into account both the shape of each object and the uncertainty associated with its track. The proposed tracking approach has been developed to support face and hand tracking for human-robot interaction. Nevertheless, it is readily applicable to a much broader class of multiple objects tracking problems.
    BibTeX:
    @inproceedings{Baltzakis2009,
      author = {Baltzakis, Haris and Argyros, Antonis A},
      title = {Propagation of Pixel Hypotheses for Multiple Objects Tracking},
      booktitle = {Advances in Visual Computing (ISVC 2009)},
      publisher = {Springer},
      year = {2009},
      month = {January},
      pages = {140--149},
      address = {Las Vegas, Nevada, USA},
      projects =  {GRASP},
      doi = {10.1007/978-3-642-10520-3_13},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_11_isvc_tracking.pdf}
    }
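    In the entry above, each tracked object's location and speed follow a discrete-time linear dynamical system estimated with a Kalman filter. The sketch below shows only that standard constant-velocity predict/update cycle; the matrices, noise levels and names are illustrative assumptions of mine, and the paper's pixel-hypothesis propagation built on top of the filter is not shown.
      import numpy as np

      def make_cv_model(dt=1.0, q=1e-2, r=1.0):
          # Constant-velocity model for an image track: state = [x, y, vx, vy].
          F = np.array([[1, 0, dt, 0],
                        [0, 1, 0, dt],
                        [0, 0, 1, 0],
                        [0, 0, 0, 1]], dtype=float)   # state transition
          H = np.array([[1, 0, 0, 0],
                        [0, 1, 0, 0]], dtype=float)   # only (x, y) is observed
          Q = q * np.eye(4)                            # process noise (assumed)
          R = r * np.eye(2)                            # measurement noise (assumed)
          return F, H, Q, R

      def kalman_step(x, P, z, F, H, Q, R):
          # One predict/update cycle of the standard Kalman filter.
          x_pred = F @ x
          P_pred = F @ P @ F.T + Q
          S = H @ P_pred @ H.T + R                     # innovation covariance
          K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
          x_new = x_pred + K @ (z - H @ x_pred)
          P_new = (np.eye(len(x)) - K @ H) @ P_pred
          return x_new, P_new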
    
  383. X. Zabulis, H. Baltzakis and A.A. Argyros, "Vision-Based Hand Gesture Recognition for Human-Computer Interaction", The Universal Access Handbook, LEA, pp. 34.1-34.30, 2009.
    [BibTeX] [DOI] [PDF]

  384. BibTeX:
    @incollection{Zabulis2009,
      author = {Zabulis, Xenophon and Baltzakis, Haris and Argyros, Antonis A},
      title = {Vision-Based Hand Gesture Recognition for Human-Computer Interaction},
      booktitle = {The Universal Access Handbook},
      publisher = {LEA},
      year = {2009},
      pages = {34.1--34.30},
      projects =  {MUSCLE,XENIOS,INDIGO},
      doi = {10.1201/9781420064995-c34},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_06_book_hci_gestures.pdf}
    }
    
  385. M.I.A. Lourakis and A.A. Argyros, "SBA: A software package for generic sparse bundle adjustment", ACM Transactions on Mathematical Software (TOMS), ACM, vol. 36, no. 1, 2009.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  386. Abstract: Bundle adjustment constitutes a large, nonlinear least-squares problem that is often solved as the last step of feature-based structure and motion estimation computer vision algorithms to obtain optimal estimates. Due to the very large number of parameters involved, a general purpose least-squares algorithm incurs high computational and memory storage costs when applied to bundle adjustment. Fortunately, the lack of interaction among certain subgroups of parameters results in the corresponding Jacobian being sparse, a fact that can be exploited to achieve considerable computational savings. This article presents sba, a publicly available C/C++ software package for realizing generic bundle adjustment with high efficiency and flexibility regarding parameterization.
    BibTeX:
    @article{Lourakis2009,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {SBA: A software package for generic sparse bundle adjustment},
      journal = {ACM Transactions on Mathematical Software (TOMS)},
      publisher = {ACM},
      year = {2009},
      volume = {36},
      number = {1},
      url = {http://users.ics.forth.gr/~lourakis/sba/},
      doi = {10.1145/1486525.1486527},
      pdflink = {http://doi.acm.org/10.1145/1486525.1486527}
    }
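    The sparsity exploited by the SBA package in the entry above comes from the fact that each reprojection residual depends on a single camera and a single 3D point. In the usual bundle-adjustment notation (mine, not necessarily the paper's), the Gauss-Newton normal equations split into camera and structure blocks, and the structure block can be eliminated cheaply via the Schur complement because it is block-diagonal:
      \begin{pmatrix} \mathbf{U} & \mathbf{W} \\ \mathbf{W}^{\top} & \mathbf{V} \end{pmatrix}
      \begin{pmatrix} \boldsymbol{\delta}_a \\ \boldsymbol{\delta}_b \end{pmatrix}
      =
      \begin{pmatrix} \boldsymbol{\varepsilon}_a \\ \boldsymbol{\varepsilon}_b \end{pmatrix},
      \qquad
      \left(\mathbf{U} - \mathbf{W}\mathbf{V}^{-1}\mathbf{W}^{\top}\right)\boldsymbol{\delta}_a
      = \boldsymbol{\varepsilon}_a - \mathbf{W}\mathbf{V}^{-1}\boldsymbol{\varepsilon}_b,
      \qquad
      \boldsymbol{\delta}_b = \mathbf{V}^{-1}\left(\boldsymbol{\varepsilon}_b - \mathbf{W}^{\top}\boldsymbol{\delta}_a\right)
    Here delta_a stacks the camera-parameter updates and delta_b the 3D point updates; V consists of one small block per point, so its inversion is inexpensive, which is where the "considerable computational savings" mentioned in the abstract come from.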
    
  387. M. Vincze, M. Zillich, W. Ponweiser, V. Hlavác, J. Matas, S. Obdrzálek, H. Buxton, J. Howell, K. Sage, A.A. Argyros, C. Eberst and G. Umgeher, "Integrated vision system for the semantic interpretation of activities where a person handles objects", Computer Vision and Image Understanding, Elsevier, vol. 113, no. 6, pp. 682-692, 2009.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  388. Abstract: Interpretation of human activity is primarily known from surveillance and video analysis tasks and concerned with the persons alone. In this paper we present an integrated system that gives a natural language interpretation of activities where a person handles objects. The system integrates low-level image components such as hand and object tracking, detection and recognition, with high-level processes such as spatio-temporal object relationship generation, posture and gesture recognition, and activity reasoning. A task-oriented approach focuses processing to achieve near real-time and to react depending on the situation context.
    BibTeX:
    @article{Vincze2009,
      author = {Vincze, Markus and Zillich, Michael and Ponweiser, Wolfgang and Hlavác, Václav and Matas, Jiri and Obdrzálek, Stepán and Buxton, Hilary and Howell, Jonathan and Sage, Kingsley and Argyros, Antonis A and Eberst, Christof and Umgeher, Gerald},
      title = {Integrated vision system for the semantic interpretation of activities where a person handles objects},
      journal = {Computer Vision and Image Understanding},
      publisher = {Elsevier},
      year = {2009},
      volume = {113},
      number = {6},
      pages = {682--692},
      url = {http://users.ics.forth.gr/~argyros},
      projects =  {ACTIPRET},
      doi = {10.1016/j.cviu.2008.10.008},
      pdflink = {http://dx.doi.org/10.1016/j.cviu.2008.10.008},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/ActipretVideo.mpg}
    }
    
  389. T. Sarmis, X. Zabulis and A.A. Argyros, "A Software Platform for the Acquisition and Online Processing of Images in a Camera Network", ERCIM News, no. 76, 2009.
    [BibTeX] [PDF] [URL]

  390. BibTeX:
    @periodical{Sarmis2009,
      author = {Sarmis, Thomas and Zabulis, Xenophon and Argyros, Antonis A},
      title = {A Software Platform for the Acquisition and Online Processing of Images in a Camera Network},
      journal = {ERCIM News},
      year = {2009},
      number = {76},
      url = {http://users.ics.forth.gr/~argyros/res_humanpresence.html},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_01_ercimnews_sensorweb.pdf}
    }
    
  391. T. Sarmis, X. Zabulis and A.A. Argyros, "A checkerboard detection utility for intrinsic and extrinsic camera cluster calibration", FORTH-ICS, TR-397, 2009.
    [BibTeX] [PDF] [URL]

  392. BibTeX:
    @techreport{Sarmis2009a,
      author = {Sarmis, Thomas and Zabulis, Xenophon and Argyros, Antonis A},
      title = {A checkerboard detection utility for intrinsic and extrinsic camera cluster calibration},
      school = {FORTH-ICS},
      year = {2009},
      number = {TR-397},
      url = {http://users.ics.forth.gr/~argyros},
      projects =  {AMIPROJ},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2009_08_tr397_calibration.pdf}
    }
    
  393. X. Zabulis, A.A. Argyros and D.P. Tsakiris, "Lumen detection for capsule endoscopy", In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2008), IEEE, pp. 3921-3926, Nice, France, September 2008.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  394. Abstract: In this paper, two visual cues are proposed, to be exploited for the navigation of active endoscopic capsules within the gastrointestinal (GI) tract. These cues consist of the detection and tracking of the lumen and of an illumination highlight in capsule endoscopy (CE) images. The proposed approach aims at developing vision algorithms which are robust with respect to the challenging imaging conditions encountered in the GI tract and the great variability of the acquired images. Cases where no or more than one lumens exists, are also detected. The proposed approach extends the state-of-the-art in lumen detection, and is demonstrated for in-vivo video sequences acquired from endoscopic capsules.
    BibTeX:
    @inproceedings{Zabulis2008,
      author = {Zabulis, Xenophon and Argyros, Antonis A and Tsakiris, Dimitris P},
      title = {Lumen detection for capsule endoscopy},
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2008)},
      publisher = {IEEE},
      year = {2008},
      month = {September},
      pages = {3921--3926},
      address = {Nice, France},
      url = {http://users.ics.forth.gr/~argyros/res_lumen.html},
      projects =  {VECTOR},
      doi = {10.1109/IROS.2008.4650969},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_09_iros_lumen_detection.pdf},
      videolink = {https://youtu.be/q6OW2QGGRCQ}
    }
    
  395. P. Nillius, J. Sullivan and A.A. Argyros, "Shading models for illumination and reflectance invariant shape detectors", In IEEE Computer Vision and Pattern Recognition (CVPR 2008), IEEE, pp. 1-8, Anchorage, Alaska, USA, June 2008.
    [Abstract] [BibTeX] [DOI] [PDF]

    Abstract: Many objects have smooth surfaces of a fairly uniform color, thereby exhibiting shading patterns that reveal information about their shape, an important clue to the nature of the object. This paper explores extracting this information from images, by creating shape detectors based on shading.
    BibTeX:
    @inproceedings{Nillius2008,
      author = {Nillius, Peter and Sullivan, Josephine and Argyros, Antonis A},
      title = {Shading models for illumination and reflectance invariant shape detectors},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2008)},
      publisher = {IEEE},
      year = {2008},
      month = {June},
      pages = {1--8},
      address = {Anchorage, Alaska, USA},
      doi = {10.1109/CVPR.2008.4587773},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_06_cvpr_shape_detectors.pdf}
    }
    
  397. H. Baltzakis, A.A. Argyros, M.I.A. Lourakis and P.E. Trahanias, "Tracking of Human Hands and Faces through Probabilistic Fusion of Multiple Visual Cues", In International Conference on Computer Vision Systems (ICVS 2008), Springer, pp. 33-42, Santorini, Greece, May 2008.
    [Abstract] [BibTeX] [DOI] [PDF]

  398. Abstract: This paper presents a new approach for real time detection and tracking of human hands and faces in image sequences. The proposed method builds upon our previous research on color-based tracking and extends it towards building a system capable of distinguishing between human hands, faces and other skin-colored regions in the image background. To achieve these goals, the proposed approach allows the utilization of additional information cues including motion information given by means of a background subtraction algorithm, and top-down information regarding the formed image segments such as their spatial location, velocity and shape. All information cues are combined under a probabilistic framework which furnishes the proposed approach with the ability to cope with uncertainty due to noise. The proposed approach runs in real time on a standard, personal computer. The presented experimental results, confirm the effectiveness of the proposed methodology and its advantages over previous approaches.
    BibTeX:
    @inproceedings{Baltzakis2008,
      author = {Baltzakis, Haris and Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E},
      title = {Tracking of Human Hands and Faces through Probabilistic Fusion of Multiple Visual Cues},
      booktitle = {International Conference on Computer Vision Systems (ICVS 2008)},
      publisher = {Springer},
      year = {2008},
      month = {May},
      pages = {33--42},
      address = {Santorini, Greece},
      projects =  {MUSCLE,XENIOS,INDIGO},
      doi = {10.1007/978-3-540-79547-6_4},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_05_icvs_tracking_hands_faces.pdf}
    }
    
  399. J. Romero, D. Kragic, V. Kyrki and A.A. Argyros, "Dynamic time warping for binocular hand tracking and reconstruction", In IEEE International Conference on Robotics and Automation (ICRA 2008), IEEE, pp. 2289-2294, Pasadena, California, USA, May 2008.
    [Abstract] [BibTeX] [DOI] [PDF]

  400. Abstract: We show how matching and reconstruction of contour points can be performed using dynamic time warping (DTW) for the purpose of 3D hand contour tracking. We evaluate the performance of the proposed algorithm in object manipulation activities and perform comparison with the iterative closest point (ICP) method.
    BibTeX:
    @inproceedings{Romero2008,
      author = {Romero, Javier and Kragic, Danica and Kyrki, Ville and Argyros, Antonis A},
      title = {Dynamic time warping for binocular hand tracking and reconstruction},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 2008)},
      publisher = {IEEE},
      year = {2008},
      month = {May},
      pages = {2289--2294},
      address = {Pasadena, California, USA},
      doi = {10.1109/ROBOT.2008.4543555},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_05_icra_dynamic_time_warping.pdf}
    }
    
  401. C. Stephanidis, A.A. Argyros, D. Grammenos and X. Zabulis, "Pervasive Computing@ICS-FORTH", In Workshop Pervasive Computing@Home, International Conference on Pervasive Computing, pp. 119-124, Sydney, Australia, May 2008.
    [BibTeX] [PDF]

  402. BibTeX:
    @inproceedings{Stephanidis2008,
      author = {Stephanidis, Constantine and Argyros, Antonis A and Grammenos, Dimitris and Zabulis, Xenophon},
      title = {Pervasive Computing@ICS-FORTH},
      booktitle = {Workshop Pervasive Computing@Home, International Conference on Pervasive Computing},
      year = {2008},
      month = {May},
      pages = {119--124},
      address = {Sydney, Australia},
      projects =  {ACTIPRET},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_05_pervasive08_icsami.pdf}
    }
    
  403. K. Sage, J. Howell, H. Buxton and A.A. Argyros, "Learning temporal structure for task based control", Image and Vision Computing, Elsevier, vol. 26, no. 1, pp. 39-52, 2008.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

    Abstract: We present an extension for variable length Markov models (VLMMs) to allow for modelling of continuous input data and show that the generative properties of these VLMMs are a powerful tool for dealing with real world tracking issues. We explore methods for addressing the temporal correspondence problem in the context of a practical hand tracker, which is essential to support expectation in task-based control using these behavioural models. The hand tracker forms a part of a larger multi-component distributed system, providing 3-D hand position data to a gesture recogniser client. We show how the performance of such a hand tracker can be improved by using feedback from the gesture recogniser client. In particular, feedback based on the generative extrapolation of the recogniser’s internal models is shown to help the tracker deal with mid-term occlusion. We also show that VLMMs can be used as a means to inform the prior in an expectation maximisation (EM) process used for joint spatial and temporal learning of image features.
    BibTeX:
    @article{Sage2008,
      author = {Sage, Kingsley and Howell, Jonathan and Buxton, Hilary and Argyros, Antonis A},
      title = {Learning temporal structure for task based control},
      journal = {Image and Vision Computing},
      publisher = {Elsevier},
      year = {2008},
      volume = {26},
      number = {1},
      pages = {39--52},
      url = {http://users.ics.forth.gr/~argyros/res_colortracking.html},
      projects =  {ACTIPRET},
      doi = {10.1016/j.imavis.2005.08.010},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2008_01_journal_ivc_vlmm.pdf}
    }
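    The behavioural models in the entry above are variable-length Markov models (VLMMs); the paper extends them to continuous inputs and uses their generative predictions as feedback for the tracker. As a rough reminder of the base idea only, here is a minimal discrete VLMM with longest-matching-context prediction (all names are mine, and none of the paper's continuous-input extension is shown).
      from collections import defaultdict

      class VLMM:
          # Minimal discrete VLMM: store next-symbol counts for every context up
          # to max_order; back off to the longest trained context when predicting.
          def __init__(self, max_order=3):
              self.max_order = max_order
              self.counts = defaultdict(lambda: defaultdict(int))

          def train(self, sequence):
              for i, sym in enumerate(sequence):
                  for k in range(self.max_order + 1):
                      if i - k < 0:
                          break
                      self.counts[tuple(sequence[i - k:i])][sym] += 1

          def predict(self, history):
              for k in range(min(self.max_order, len(history)), -1, -1):
                  ctx = tuple(history[len(history) - k:])
                  if ctx in self.counts:
                      nxt = self.counts[ctx]
                      total = sum(nxt.values())
                      return {s: c / total for s, c in nxt.items()}
              return {}

      # Toy usage on a symbolic stream:
      m = VLMM(max_order=2)
      m.train(list("abcabcabd"))
      print(m.predict(list("ab")))   # mostly 'c', occasionally 'd'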
    
  405. D. Michel, A.A. Argyros and M.I.A. Lourakis, "Localizing Unordered Panoramic Images Using the Levenshtein Distance", In IEEE International Conference on Computer Vision Workshops (OMNIVIS 2007 - ICCVW 2007), IEEE, pp. 1-7, Rio de Janeiro, Brazil, October 2007.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  406. Abstract: This paper proposes a feature-based method for recovering the relative positions of the viewpoints of a set of panoramic images for which no a priori order information is available, along with certain structure information regarding the imaged environment. The proposed approach operates incrementally, employing the Levenshtein distance to deduce the spatial proximity of image viewpoints and thus determine the order in which images should be processed. The Levenshtein distance also provides matches between images, from which their underlying environment points can be recovered. Recovered points that are visible in multiple views permit the localization of more views which in turn allow the recovery of more points. The process repeats until all views have been localized. Periodic refinement of the reconstruction with the aid of bundle adjustment, distributes the reconstruction errors among images. The method is demonstrated on several unordered sets of panoramic images obtained in an indoor environment.
    BibTeX:
    @inproceedings{Michel2007,
      author = {Michel, Damien and Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Localizing Unordered Panoramic Images Using the Levenshtein Distance},
      booktitle = {IEEE International Conference on Computer Vision Workshops (OMNIVIS 2007 - ICCVW 2007)},
      publisher = {IEEE},
      year = {2007},
      month = {October},
      pages = {1--7},
      address = {Rio de Janeiro, Brazil},
      url = {http://users.ics.forth.gr/~argyros/res_pan_slam.html},
      doi = {10.1109/ICCV.2007.4409200},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_10_omnivis_panoramic_slam_levenshtein.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/reconsProgress2frames.avi}
    }
    
  407. M.I.A. Lourakis and A.A. Argyros, "Enforcing Scene Constraints in Single View Reconstruction", In Eurographics 2007, pp. 45-48, Prague, Czech Republic, September 2007.
    [Abstract] [BibTeX] [DOI] [PDF]

  408. Abstract: Three-dimensional reconstruction from a single view is an under-constrained process that relies critically upon the availability of prior knowledge about the imaged scene. This knowledge is assumed to be supplied by a user in the form of geometric constraints such as coplanarity, parallelism, perpendicularity, etc, based on his/her interpretation of the scene. In the presence of noise, however, most of the existing methods yield reconstructions that only approximately satisfy the supplied geometric constraints. This paper proposes a novel single view reconstruction method that provides reconstructions which exactly satisfy all user-supplied constraints. This is achieved by first obtaining a preliminary reconstruction and then refining it in an extendable, constrained optimization framework.
    BibTeX:
    @inproceedings{Lourakis2007a,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Enforcing Scene Constraints in Single View Reconstruction},
      booktitle = {Eurographics 2007},
      year = {2007},
      month = {September},
      pages = {45--48},
      address = {Prague, Czech Republic},
      projects =  {MUSCLE,XENIOS,INDIGO},
      doi = {10.2312/egs.20071030},
      pdflink = {http://dx.doi.org/10.2312/egs.20071030}
    }
    
  409. M.I.A. Lourakis and A.A. Argyros, "Accurate constraint-based modeling from a single perspective view", In Computer Graphics International (CGI 2007), Petropolis, RJ, Brazil, May 2007.
    [Abstract] [BibTeX] [PDF]

  410. Abstract: Three-dimensional reconstruction from a single view is an under-constrained process that relies critically upon the availability of prior knowledge about the imaged scene. This knowledge is assumed to be supplied by a user in the form of geometric constraints such as coplanarity, parallelism, perpendicularity, etc, based on his/her interpretation of the scene. In the presence of noise, however, most of the existing methods yield reconstructions that only approximately satisfy the supplied geometric constraints. This paper proposes a novel single view reconstruction method that provides reconstructions which exactly satisfy all user-supplied constraints. This is achieved by first obtaining a preliminary reconstruction and then refining it in an extendable, constrained optimization framework.
    BibTeX:
    @inproceedings{Lourakis2007b,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Accurate constraint-based modeling from a single perspective view},
      booktitle = {Computer Graphics International (CGI 2007)},
      year = {2007},
      month = {May},
      address = {Petropolis, RJ, Brazil},
      projects =  {REPAINTER},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_05_cgi_SVR.pdf}
    }
    
  411. A.A. Argyros, "Building computers that can see", Kathimerini, The Economist (Greek newspaper), February 2007.
    [BibTeX] [PDF]

  412. BibTeX:
    @periodical{Argyros2007a,
      author = {Argyros, Antonis A},
      title = {Building computers that can see},
      booktitle = {Kathimerini, The Economist (Greek newspaper)},
      year = {2007},
      month = {February},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_02_kathimerini_compvis.pdf}
    }
    
  413. A.A. Argyros, G. Bártfai, C. Eitzinger, Z. Kemény, B.C. Csáji, L. Kék, M.I.A. Lourakis, W. Reisner, W. Sandrisser, T. Sarmis and others, "Smart sensor based vision system for automated processes", Emerging Technologies, Robotics and Control Systems, International Society for Advanced Research, ISBN: 978-88-901928-9-5, 2007.
    [Abstract] [BibTeX] [PDF] [URL]

  414. Abstract: A new approach is proposed for vision-based sensing and processing for process control and monitoring of automated processes. The proposed approach relies on a number of binary logical sensors defined over specific regions of interest in the viewed scene. On top of these elementary sensors, temporal and logical aggregation mechanisms realize hierarchies of compound logical functions, able to detect complex events. Finally, scenario verification mechanisms are employed to monitor the occurrence order and timing of expected and actual events. The proposed framework has been tested and validated in an application involving monitoring of automated processes, demonstrating that the proposed approach provides a promising concept of vision-based event detection. The described framework is being implemented on the Bi-i standalone cellular vision system which has the potential of replacing several conventional sensors used for process control and fault detection in automation.
    BibTeX:
    @incollection{Argyros2007,
      author = {Argyros, Antonis A and Bártfai, Gusztáv and Eitzinger, Christian and Kemény, Zsolt and Csáji, Balázs Csanád and Kék, László and Lourakis, Manolis I A and Reisner, W and Sandrisser, W and Sarmis, Thomas and others},
      title = {Smart sensor based vision system for automated processes},
      booktitle = {Emerging Technologies, Robotics and Control Systems},
      publisher = {International Society for Advanced Research, ISBN: 978-88-901928-9-5},
      year = {2007},
      volume = {2},
      projects =  {MULTISENS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_07_journal_internationalsar_multisens.pdf}
    }
    
  415. A.A. Argyros, G. Bártfai, C. Eitzinger, Z. Kemény, B.C. Csáji, L. Kék, M.I.A. Lourakis, W. Reisner, W. Sandrisser, T. Sarmis and others, "Smart sensor based vision system for automated processes", International Journal of Factory Automation, Robotics and Soft Computing, Thomson Scientific Journal, vol. 3, pp. 118-123, 2007.
    [Abstract] [BibTeX] [PDF] [URL]

  416. Abstract: A new approach is proposed for vision-based sensing and processing for process control and monitoring of automated processes. The proposed approach relies on a number of binary logical sensors defined over specific regions of interest in the viewed scene. On top of these elementary sensors, temporal and logical aggregation mechanisms realize hierarchies of compound logical functions, able to detect complex events. Finally, scenario verification mechanisms are employed to monitor the occurrence order and timing of expected and actual events. The proposed framework has been tested and validated in an application involving monitoring of automated processes, demonstrating that the proposed approach provides a promising concept of vision-based event detection. The described framework is being implemented on the Bi-i standalone cellular vision system which has the potential of replacing several conventional sensors used for process control and fault detection in automation.
    BibTeX:
    @article{Argyros2007b,
      author = {Argyros, Antonis A and Bártfai, Gusztáv and Eitzinger, Christian and Kemény, Zsolt and Csáji, Balázs Csanád and Kék, László and Lourakis, Manolis I A and Reisner, W and Sandrisser, W and Sarmis, Thomas and others},
      title = {Smart sensor based vision system for automated processes},
      journal = {International Journal of Factory Automation, Robotics and Soft Computing},
      publisher = {Thomson Scientific Journal},
      year = {2007},
      volume = {3},
      pages = {118--123},
      url = {http://users.ics.forth.gr/~argyros/res_multisens.html},
      projects =  {MULTISENS},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_07_journal_internationalsar_multisens.pdf}
    }
    
  417. M.I.A. Lourakis and A.A. Argyros, "Refining Single View Calibration With the Aid of Metric Scene Properties", Journal of WSCG, Václav Skala-UNION Agency, vol. 15, no. 1-3, pp. 129-134, 2007.
    [Abstract] [BibTeX] [PDF]

  418. Abstract: Intrinsic camera calibration using a single image is possible provided that certain geometric objects such as orthogonal vanishing points and metric homographies can be estimated from the image and give rise to adequate constraints on the sought calibration parameters. In doing so, however, any additional metric information that might be available for the imaged scene is not always straightforward to accommodate. This paper puts forward a method for incorporating into the calibration procedure metric scene properties.
    BibTeX:
    @article{Lourakis2007,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Refining Single View Calibration With the Aid of Metric Scene Properties},
      journal = {Journal of WSCG},
      publisher = {Václav Skala-UNION Agency},
      year = {2007},
      volume = {15},
      number = {1-3},
      pages = {129--134},
      projects =  {REPAINTER},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2007_xx_journal_wscg_svc.pdf}
    }
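    As background (standard single-view calibration relations, stated here only for context and not as the paper's specific refinement): with \omega = (\mathbf{K}\mathbf{K}^{\top})^{-1} denoting the image of the absolute conic, a pair of vanishing points \mathbf{v}_i, \mathbf{v}_j of orthogonal scene directions constrains the intrinsics \mathbf{K} through

      \[
        \mathbf{v}_i^{\top}\, \omega \, \mathbf{v}_j = 0,
        \qquad \omega = (\mathbf{K}\mathbf{K}^{\top})^{-1},
      \]

    and a metric plane-to-image homography \mathbf{H} = [\mathbf{h}_1\; \mathbf{h}_2\; \mathbf{h}_3] contributes the further linear constraints \mathbf{h}_1^{\top}\omega\,\mathbf{h}_2 = 0 and \mathbf{h}_1^{\top}\omega\,\mathbf{h}_1 = \mathbf{h}_2^{\top}\omega\,\mathbf{h}_2. Additional metric scene information can then be folded into the same estimation, which is the refinement discussed in the entry above.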
    
  419. T. Sarmis, A.A. Argyros, M.I.A. Lourakis and K. Hatzopoulos, "Robust and efficient event detection for the monitoring of automated processes", In International conference on visual information engineering (VIE 2006), pp. 454-459, Bangalore, India, September 2006.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  420. Abstract: We present a new approach for the detection of events in image sequences. Our method relies on a number of logical sensors that can be defined over specific regions of interest in the viewed scene. These sensors measure time varying image properties that can be attributed to primitive events of interest. Thus, the logical sensors can be viewed as a means to transform image data to a set of symbols that can assist event detection and activities interpretation. On top of these elementary sensors, temporal and logical aggregation mechanisms are used to define hierarchies of progressively more complex sensors, able to detect events having more complex semantics. Finally, scenario verification mechanisms are employed to achieve process monitoring, by checking whether events occur according to a predetermined order. The proposed framework has been tested and validated in an application involving monitoring of automated processes. The obtained results demonstrate that the proposed approach, despite its simplicity, provides a promising framework for vision based event detection in the context of such applications.
    BibTeX:
    @inproceedings{Sarmis2006,
      author = {Sarmis, Thomas and Argyros, Antonis A and Lourakis, Manolis I A and Hatzopoulos, Kostas},
      title = {Robust and efficient event detection for the monitoring of automated processes},
      booktitle = {International conference on visual information engineering (VIE 2006)},
      year = {2006},
      month = {September},
      pages = {454--459},
      address = {Bangalore, India},
      url = {http://users.ics.forth.gr/~argyros/res_multisens.html},
      projects =  {MULTISENS},
      doi = {10.1049/cp:20060573},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_09_vie06_vbls.pdf}
    }
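    A minimal, hypothetical sketch of the "logical sensor" idea summarized above; the region representation, the threshold and the event names are illustrative assumptions, not the system's actual interface.
    Python sketch:
      import numpy as np

      def roi_sensor(diff_image, roi, threshold=10.0):
          # Binary sensor: fires when the mean frame difference inside a region of interest exceeds a threshold.
          y0, y1, x0, x1 = roi
          return float(diff_image[y0:y1, x0:x1].mean()) > threshold

      def all_of(*sensor_values):
          # Logical aggregation of elementary sensors into a compound sensor.
          return all(sensor_values)

      def scenario_ok(observed_events, expected_order):
          # Scenario verification: expected events must occur as an ordered subsequence of the observed ones.
          remaining = iter(observed_events)
          return all(event in remaining for event in expected_order)

      # Hypothetical usage on a synthetic frame-difference image.
      diff = np.zeros((120, 160)); diff[20:40, 30:60] = 50.0
      s_load = roi_sensor(diff, (20, 40, 30, 60))     # activity at an assumed loading area
      s_exit = roi_sensor(diff, (80, 100, 100, 140))  # activity at an assumed exit area
      print(all_of(s_load, not s_exit), scenario_ok(["load", "press", "eject"], ["load", "eject"]))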
    
  421. A.A. Argyros and M.I.A. Lourakis, "Binocular Hand Tracking and Reconstruction Based on 2D Shape Matching", In IEEE International Conference on Pattern Recognition (ICPR 2006), IEEE, pp. 207-210, Hong Kong, August 2006.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  422. Abstract: This paper presents a method for real-time 3D hand tracking in images acquired by a calibrated, possibly moving stereoscopic rig. The proposed method consists of a collection of techniques that enable the modeling and detection of hands, their temporal association in image sequences, the establishment of hand correspondences between stereo images and the 3D reconstruction of their contours. Building upon our previous research on color-based, 2D skin-color tracking, the 3D hand tracker is developed through the coupling of the results of two 2D skin-color trackers that run independently on the two video streams acquired by a stereoscopic system. The proposed method runs in real time on a conventional Pentium 4 processor when operating on 320×240 images. Representative experimental results are also presented.
    BibTeX:
    @inproceedings{Argyros2006,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Binocular Hand Tracking and Reconstruction Based on 2D Shape Matching},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2006)},
      publisher = {IEEE},
      year = {2006},
      month = {August},
      volume = {1},
      pages = {207--210},
      address = {Hong Kong},
      url = {http://users.ics.forth.gr/~argyros/res_colortracking.html},
      projects =  {ACTIPRET,MUSCLE},
      doi = {10.1109/ICPR.2006.327},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_08_icpr_3dfht.pdf}
    }
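    A minimal sketch of the standard linear (DLT) triangulation step implied by the 3D reconstruction of matched hand contours above; P1 and P2 are assumed to be the 3x4 projection matrices of the calibrated stereo rig and x1, x2 a pair of matched pixel coordinates. This is generic textbook triangulation, not the paper's implementation.
    Python sketch:
      import numpy as np

      def triangulate(P1, P2, x1, x2):
          # Build the homogeneous linear system A X = 0 from both views.
          A = np.vstack([x1[0] * P1[2] - P1[0],
                         x1[1] * P1[2] - P1[1],
                         x2[0] * P2[2] - P2[0],
                         x2[1] * P2[2] - P2[1]])
          _, _, vt = np.linalg.svd(A)
          X = vt[-1]              # null vector of A (homogeneous 3D point)
          return X[:3] / X[3]     # inhomogeneous 3D point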
    
  423. M.I.A. Lourakis and A.A. Argyros, "Chaining Planar Homographies for Fast and Reliable 3D Plane Tracking", In IEEE International Conference on Pattern Recognition (ICPR 2006), IEEE, pp. 582-586, Hong Kong, August 2006.
    [Abstract] [BibTeX] [DOI] [PDF]

  424. Abstract: This paper addresses the problem of tracking a 3D plane over a sequence of images acquired by a free moving camera, a task that is of central importance to a wide variety of vision tasks. A feature-based method is proposed which given a triplet of consecutive images and a plane homography between the first two of them, estimates the homography induced by the same plane between the second and third images, without requiring the plane to be segmented from the rest of the scene. Thus, the proposed method operates by "chaining" (i.e. propagating) across frames the image-to-image homographies due to some 3D plane. The chaining operation represents projective space using a "plane + parallax" decomposition, which permits the combination of constraints arising from all available point matches, regardless of whether they actually lie on the tracked 3D plane or not. Experimental results are also provided
    BibTeX:
    @inproceedings{Lourakis2006a,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Chaining Planar Homographies for Fast and Reliable 3D Plane Tracking},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2006)},
      publisher = {IEEE},
      year = {2006},
      month = {August},
      volume = {1},
      pages = {582--586},
      address = {Hong Kong},
      doi = {10.1109/ICPR.2006.358},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_08_icpr_planetrack.pdf}
    }
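    For context (standard relations that homography chaining rests on, not a restatement of the paper's estimator): homographies induced by the same 3D plane compose across views, and the "plane + parallax" decomposition expresses each image point through the plane homography plus an epipolar parallax term:

      \[
        \mathbf{H}_{1 \to 3} \simeq \mathbf{H}_{2 \to 3}\, \mathbf{H}_{1 \to 2},
        \qquad
        \mathbf{x}' \simeq \mathbf{H}\,\mathbf{x} + \gamma\, \mathbf{e}',
      \]

    where \simeq denotes equality up to scale, \mathbf{e}' is the epipole in the second image and the scalar parallax \gamma vanishes for points on the reference plane. Points off the plane therefore still constrain the propagated homography through this decomposition, which is why the tracked plane need not be segmented from the rest of the scene.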
    
  425. A.A. Argyros and M.I.A. Lourakis, "Vision-Based Interpretation of Hand Gestures for Remote Control of a Computer Mouse", In European Conference on Computer Vision Workshops (HCI 2006 - ECCVW 2006), Springer, pp. 40-51, Graz, Austria, January 2006.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  426. Abstract: This paper presents a vision-based interface for controlling a computer mouse via 2D and 3D hand gestures. The proposed interface builds upon our previous work that permits the detection and tracking of multiple hands that can move freely in the field of view of a potentially moving camera system. Dependable hand tracking, combined with fingertip detection, facilitates the definition of simple and, therefore, robustly interpretable vocabularies of hand gestures that are subsequently used to enable a human operator convey control information to a computer system. Two such vocabularies are defined, implemented and validated. The first one depends only on 2D hand tracking results while the second also makes use of 3D information. As confirmed by several experiments, the proposed interface achieves accurate mouse positioning, smooth cursor movement and reliable recognition of gestures activating button events. Owing to these properties, our interface can be used as a virtual mouse for controlling any Windows application.
    BibTeX:
    @inproceedings{Argyros2006a,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Vision-Based Interpretation of Hand Gestures for Remote Control of a Computer Mouse},
      booktitle = {European Conference on Computer Vision Workshops (HCI 2006 - ECCVW 2006)},
      publisher = {Springer},
      year = {2006},
      month = {January},
      pages = {40--51},
      address = {Graz, Austria},
      url = {http://users.ics.forth.gr/~argyros/res_virtualmouse.html},
      projects =  {ACTIPRET,MUSCLE},
      doi = {10.1007/11754336_5},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_05_hci_virtualmouse.pdf},
      videolink = {https://youtu.be/O9DUsRtgigk}
    }
    
  427. K.E. Bekris, A.A. Argyros and L.E. Kavraki, "Exploiting panoramic vision for bearing-only robot homing", Imaging beyond the pinhole camera, Springer, pp. 229-251, 2006.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  428. Abstract: Omni-directional vision allows for the development of techniques for mobile robot navigation that have minimum perceptual requirements. In this work, we focus on robot navigation algorithms that do not require range information or metric maps of the environment. More specifically, we present a homing strategy that enables a robot to return to its home position after executing a long path. The proposed strategy relies on measuring the angle between pairs of features extracted from panoramic images, which can be achieved accurately and robustly. In the heart of the proposed homing strategy lies a novel, local control law that enables a robot to reach any position on the plane by exploiting the bearings of at least three landmarks of unknown position, without making assumptions regarding the robot’s orientation and without making use of a compass. This control law is the result of the unification of two other local control laws which guide the robot by monitoring the bearing of landmarks and which are able to reach complementary sets of goal positions on the plane. Long-range homing is then realized through the systematic application of the unified control law between automatically extracted milestone positions connecting the robot’s current position to the home position. Experimental results, conducted both in a simulated environment and on a robotic platform equipped with a panoramic camera validate the employed local control laws as well as the overall homing strategy. Moreover, they show that panoramic vision can assist in simplifying the perceptual processes required to support robust and accurate homing behaviors.
    BibTeX:
    @incollection{Bekris2006,
      author = {Bekris, Kostas E and Argyros, Antonis A and Kavraki, Lydia E},
      title = {Exploiting panoramic vision for bearing-only robot homing},
      booktitle = {Imaging beyond the pinhole camera},
      publisher = {Springer},
      year = {2006},
      pages = {229--251},
      doi = {10.1007/978-1-4020-4894-4_12},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_xx_book_dagstuhl_robot_homing.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/snapshot.mpg}
    }
    
  429. E. Tzamali, G. Akoumianakis, A.A. Argyros and Y.J. Stephanedes, "Improved design for vision-based incident detection in transportation systems using real-time view transformations", Journal of transportation engineering, American Society of Civil Engineers, vol. 132, no. 11, pp. 837-844, 2006.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  430. Abstract: Advances in machine vision techniques have led to algorithms and integrated systems that can be applied in transportation engineering to improve surveillance and control. Despite these advances, certain problems in the effective integration of machine-vision based systems at complex intersections and complex freeway sections still remain. These are related to increasing system performance in the identification, analysis, and detection of the traffic state in real time. This work examines the feasibility of providing transformed visual input to existing machine-vision based systems, in order to gain increased efficiency and cost effectiveness of integrated transportation systems. Two transformations are developed, homography-based transformation and panoramic image reprojection. Homography-based transformation operates on video of the road scene, provided by classical cameras, and seeks to transform any view to a top-down view. This transforms the three-dimensional problem of image analysis for, e.g., road event detection to a two-dimensional one. Panoramic image reprojection employs panoramic cameras to reduce required hardware, and the complexity and cost incurred in obtaining the desired road view. The image reprojection technique allows the reconstruction of undistorted, perspectively correct views from panoramic images in real time. Tests at sites in Spain, the United Kingdom, and Greece are performed on-line and off-line in combination with operating machine-vision based incident detection systems. Test results indicate that the two methods simplify the input provided to machine vision, and reduce the workload and amount of hardware in implementing complex machine-vision based systems for incident detection. Both modules can be integrated into incident detection systems to improve their overall efficiency and ease of application.
    BibTeX:
    @article{Tzamali2006,
      author = {Tzamali, Eleftheria and Akoumianakis, George and Argyros, Antonis A and Stephanedes, Yorgos J},
      title = {Improved design for vision-based incident detection in transportation systems using real-time view transformations},
      journal = {Journal of transportation engineering},
      publisher = {American Society of Civil Engineers},
      year = {2006},
      volume = {132},
      number = {11},
      pages = {837--844},
      url = {http://users.ics.forth.gr/~argyros/res_pan_geom.html},
      projects =  {PRIME},
      doi = {10.1061/(ASCE)0733-947X(2006)132:11(837)},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2015_journal_TE_Tzamali_PRIME.pdf}
    }
    
  431. M.I.A. Lourakis and A.A. Argyros, "Exploiting the Sparseness of Bundle Adjustment for Efficient 3D Reconstruction", In European Conference on Computer Vision Workshops (CIMCV 2006 - ECCVW 2006), Graz, Austria, 2006.
    [Abstract] [BibTeX] [PDF]

  432. Abstract: Bundle adjustment amounts to a large, nonlinear least squares optimization problem that is often used as the last step of feature-based structure and motion estimation vision algorithms to obtain optimal estimates. Due to the very large number of parameters involved, a general purpose implementation of a nonlinear least squares algorithm incurs high computational and memory storage costs when applied to bundle adjustment. Fortunately, the lack of interaction among certain sub-groups of parameters results in the corresponding Jacobian exhibiting a sparse block structure. In this paper we outline the mathematics of sba, our publicly available software package for generic bundle adjustment that exploits sparseness to achieve considerable computational savings. In addition, we provide experimental results demonstrating that sba can efficiently handle large bundle adjustment problems which would be intractable with general purpose nonlinear least squares implementations.
    BibTeX:
    @inproceedings{Lourakis2006,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Exploiting the Sparseness of Bundle Adjustment for Efficient 3D Reconstruction},
      booktitle = {European Conference on Computer Vision Workshops (CIMCV 2006 - ECCVW 2006)},
      year = {2006},
      address = {Graz, Austria},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2006_05_cimcv_sba.pdf}
    }
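    As background for the sparseness being exploited (standard bundle-adjustment notation, not sba's internal code): each Levenberg-Marquardt iteration solves the augmented normal equations, whose matrix has a block structure because 3D points interact only with the cameras that observe them:

      \[
        (\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I})\,\boldsymbol{\delta} = \mathbf{J}^{\top}\boldsymbol{\epsilon},
        \qquad
        \mathbf{J}^{\top}\mathbf{J} =
        \begin{bmatrix}
          \mathbf{U} & \mathbf{W} \\
          \mathbf{W}^{\top} & \mathbf{V}
        \end{bmatrix},
      \]

    with \mathbf{U} block-diagonal over cameras and \mathbf{V} block-diagonal over points. Eliminating the point parameters via the Schur complement of the \mu-augmented blocks, \mathbf{U}^{*} - \mathbf{W}\mathbf{V}^{*-1}\mathbf{W}^{\top}, leaves a reduced system in the camera unknowns only, which is the source of the computational savings discussed in the entry above.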
    
  433. M.I.A. Lourakis and A.A. Argyros, "Is Levenberg-Marquardt the Most Efficient Optimization Algorithm for Implementing Bundle Adjustment?", In IEEE International Conference on Computer Vision (ICCV 2005), IEEE, pp. 1526-1531, Beijing, China, October 2005.
    [Abstract] [BibTeX] [DOI] [PDF]

  434. Abstract: In order to obtain optimal 3D structure and viewing parameter estimates, bundle adjustment is often used as the last step of feature-based structure and motion estimation algorithms. Bundle adjustment involves the formulation of a large scale, yet sparse minimization problem, which is traditionally solved using a sparse variant of the Levenberg-Marquardt optimization algorithm that avoids storing and operating on zero entries. This paper argues that considerable computational benefits can be gained by substituting the sparse Levenberg-Marquardt algorithm in the implementation of bundle adjustment with a sparse variant of Powell's dog leg non-linear least squares technique. Detailed comparative experimental results provide strong evidence supporting this claim
    BibTeX:
    @inproceedings{Lourakis2005a,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Is Levenberg-Marquardt the Most Efficient Optimization Algorithm for Implementing Bundle Adjustment?},
      booktitle = {IEEE International Conference on Computer Vision (ICCV 2005)},
      publisher = {IEEE},
      year = {2005},
      month = {October},
      volume = {2},
      pages = {1526--1531},
      address = {Beijing, China},
      doi = {10.1109/ICCV.2005.128},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_10_iccv_levenberg.pdf}
    }
    
  435. M.I.A. Lourakis and A.A. Argyros, "Fast trifocal tensor estimation using virtual parallax", In IEEE International Conference on Image Processing (ICIP 2005), IEEE, pp. 169-172, Genoa, Italy, September 2005.
    [Abstract] [BibTeX] [DOI] [PDF]

  436. Abstract: We present a computationally efficient method for estimating the trifocal tensor corresponding to three images acquired by a freely moving camera. The proposed method represents projective space through a "plane + parallax" decomposition and employs a novel technique for estimating the homographies induced by a virtual 3D plane between successive image pairs. Knowledge of these homographies allows the corresponding camera projection matrices to be expressed in a common projective frame and, therefore, to be recovered directly. The trifocal tensor can then be recovered in a straightforward manner from the estimated projection matrices. Sample experimental results demonstrate that the method performs considerably faster compared to a state of the art method, without a serious loss in accuracy.
    BibTeX:
    @inproceedings{Lourakis2005b,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Fast trifocal tensor estimation using virtual parallax},
      booktitle = {IEEE International Conference on Image Processing (ICIP 2005)},
      publisher = {IEEE},
      year = {2005},
      month = {September},
      volume = {2},
      pages = {169--172},
      address = {Genoa, Italy},
      doi = {10.1109/ICIP.2005.1529999},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_09_icip_tensor.pdf}
    }
    
  437. A.A. Argyros, K.E. Bekris, S.C. Orphanoudakis and L.E. Kavraki, "Robot Homing by Exploiting Panoramic Vision", Autonomous Robots, Springer, vol. 19, no. 1, pp. 7-25, July 2005.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  438. Abstract: We propose a novel, vision-based method for robot homing, the problem of computing a route so that a robot can return to its initial “home” position after the execution of an arbitrary “prior” path. The method assumes that the robot tracks visual features in panoramic views of the environment that it acquires as it moves. By exploiting only angular information regarding the tracked features, a local control strategy moves the robot between two positions, provided that there are at least three features that can be matched in the panoramas acquired at these positions. The strategy is successful when certain geometric constraints on the configuration of the two positions relative to the features are fulfilled. In order to achieve long-range homing, the features’ trajectories are organized in a visual memory during the execution of the “prior” path. When homing is initiated, the robot selects Milestone Positions (MPs) on the “prior” path by exploiting information in its visual memory. The MP selection process aims at picking positions that guarantee the success of the local control strategy between two consecutive MPs. The sequential visit of successive MPs successfully guides the robot even if the visual context in the “home” position is radically different from the visual context at the position where homing was initiated. Experimental results from a prototype implementation of the method demonstrate that homing can be achieved with high accuracy, independent of the distance traveled by the robot. The contribution of this work is that it shows how a complex navigational task such as homing can be accomplished efficiently, robustly and in real-time by exploiting primitive visual cues. Such cues carry implicit information regarding the 3D structure of the environment. Thus, the computation of explicit range information and the existence of a geometric map are not required.
    BibTeX:
    @article{Argyros2005a,
      author = {Argyros, Antonis A and Bekris, Kostas E. and Orphanoudakis, Stelios C and Kavraki, Lydia E},
      title = {Robot Homing by Exploiting Panoramic Vision},
      journal = {Autonomous Robots},
      publisher = {Springer},
      year = {2005},
      month = {July},
      volume = {19},
      number = {1},
      pages = {7--25},
      url = {http://users.ics.forth.gr/~argyros/res_pan_homing.html},
      doi = {10.1007/s10514-005-0603-7},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_xx_journal_ar_homing.pdf}
    }
    
  439. P.E. Trahanias, W. Burgard, A.A. Argyros, D. Hähnel, H. Baltzakis, P. Pfaff and C. Stachniss, "TOURBOT and WebFAIR: Web-operated mobile robots for tele-presence in populated exhibitions", IEEE Robotics and Automation Magazine, IEEE, vol. 12, no. 2, pp. 77-89, June 2005.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  440. Abstract: This paper presents a number of techniques that are needed for realizing Web-operated mobile robots. These techniques include effective map-building capabilities, a method for obstacle avoidance based on a combination of range and visual information, and advanced Web and onboard robot interfaces. In addition to video streams, the system provides high-resolution virtual reality visualizations that also include the people in the vicinity of the robot. This increases the flexibility of the interface and simultaneously allows a user to understand the navigation actions of the robot. The techniques described in this article have been successfully deployed within the EU-funded projects TOURBOT and WebFAIR, which aimed to develop interactive tour-guided robots able to serve Web as well as on-site visitors. Technical developments in the framework of these projects have resulted in robust and reliable systems that have been demonstrated and validated in real-world conditions. Equally important, the system setup time has been drastically reduced, facilitating its porting to new environments.
    BibTeX:
    @article{Trahanias2005,
      author = {Trahanias, Panos E and Burgard, Wolfram and Argyros, Antonis A and Hähnel, Dirk and Baltzakis, Haris and Pfaff, Patrick and Stachniss, Cyrill},
      title = {TOURBOT and WebFAIR: Web-operated mobile robots for tele-presence in populated exhibitions},
      journal = {IEEE Robotics and Automation Magazine},
      publisher = {IEEE},
      year = {2005},
      month = {June},
      volume = {12},
      number = {2},
      pages = {77--89},
      url = {http://users.ics.forth.gr/~argyros/res_robotsinexhibitions.html},
      projects =  {TOURBOT,WEBFAIR},
      doi = {10.1109/MRA.2005.1458329},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_xx_journal_ram_tourbot_webfair.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/TourbotInBrief.mpg}
    }
    
  441. A.A. Argyros and M.I.A. Lourakis, "Tracking Multiple Colored Blobs with a Moving Camera", In IEEE Computer Vision and Pattern Recognition (CVPR 2005), IEEE, pp. 1178, San Diego, CA, USA, June 2005.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  442. Abstract: This paper concerns a method for tracking multiple blobs exhibiting certain color distributions in images acquired by a possibly moving camera. The method encompasses a collection of techniques that enable modeling and detecting the blobs possessing the desired color distribution(s), as well as inferring their temporal association across image sequences. Appropriately colored blobs are detected with a Bayesian classifier, which is bootstrapped with a small set of training data. Then, an online iterative training procedure is employed to refine the classifier using additional training images. Online adaptation of color probabilities is used to enable the classifier to cope with illumination changes. Tracking over time is realized through a novel technique, which can handle multiple colored blobs. Such blobs may move in complex trajectories and occlude each other in the field of view of a possibly moving camera, while their number may vary over time. A prototype implementation of the developed system running on a conventional Pentium IV processor at 2.5 GHz operates on 320×240 live video in real time (30Hz). It is worth pointing out that currently, the cycle time of the tracker is determined by the maximum acquisition frame rate that is supported by our IEEE 1394 camera, rather than the latency introduced by the computational overhead for tracking blobs.
    BibTeX:
    @inproceedings{Argyros2005b,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Tracking Multiple Colored Blobs with a Moving Camera},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2005)},
      publisher = {IEEE},
      year = {2005},
      month = {June},
      volume = {2},
      pages = {1178},
      address = {San Diego, CA, USA},
      url = {http://users.ics.forth.gr/~argyros/res_colortracking.html},
      projects =  {ACTIPRET},
      doi = {10.1109/CVPR.2005.348},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_06_vpcvpr_blobtracking.pdf}
    }
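    A minimal, hypothetical sketch of a histogram-based Bayesian colour classifier of the kind summarized above; the bin count, the prior and the two-channel colour representation are illustrative assumptions, not the tracker's actual parameters.
    Python sketch:
      import numpy as np

      BINS = 32  # chrominance histogram resolution (assumed)

      def train(target_pixels, background_pixels):
          # Normalised 2D colour histograms approximating P(c | target) and P(c | background).
          rng = [[0, 256], [0, 256]]
          h_t, _, _ = np.histogram2d(target_pixels[:, 0], target_pixels[:, 1], BINS, rng)
          h_b, _, _ = np.histogram2d(background_pixels[:, 0], background_pixels[:, 1], BINS, rng)
          return h_t / max(h_t.sum(), 1.0), h_b / max(h_b.sum(), 1.0)

      def target_probability(c, p_c_t, p_c_b, prior=0.4):
          # Posterior P(target | c) for a two-channel colour value c, via Bayes' rule.
          i, j = int(c[0]) * BINS // 256, int(c[1]) * BINS // 256
          num = p_c_t[i, j] * prior
          den = num + p_c_b[i, j] * (1.0 - prior)
          return num / den if den > 0 else 0.0
    A pixel would then be labelled as belonging to a blob when this posterior exceeds a chosen threshold; online adaptation, as described above, amounts to periodically re-estimating the histograms from newly classified pixels.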
    
  443. M.I.A. Lourakis and A.A. Argyros, "Camera Matchmoving in Unprepared, Unknown Environments", In IEEE Computer Vision and Pattern Recognition (CVPR 2005), IEEE, pp. 1190, San Diego, CA, USA, June 2005.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  444. Abstract: Camera matchmoving is an application involving synthesis of real scenes and artificial objects, in which the goal is to insert computer-generated graphical 3D objects into live-action footage depicting unmodeled, arbitrary scenes. This work addresses the problem of tracking the 3D motion of a camera in space, using only the images it acquires while moving freely in unmodeled, arbitrary environments. A novel feature-based method for camera tracking has been developed, intended to facilitate tracking in online, time-critical applications such as video see-through augmented reality and vision-based control. In contrast to several existing techniques, which are designed to operate in a batch, offline mode, assuming that the whole video sequence to be tracked is available before tracking commences, our method operates on images incrementally, as they are being acquired.
    BibTeX:
    @inproceedings{Lourakis2005c,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Camera Matchmoving in Unprepared, Unknown Environments},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2005)},
      publisher = {IEEE},
      year = {2005},
      month = {June},
      volume = {2},
      pages = {1190},
      address = {San Diego, CA, USA},
      url = {http://users.ics.forth.gr/~lourakis/camtrack/},
      projects =  {LIFEPLUS},
      doi = {10.1109/CVPR.2005.96},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_06_vpcvpr_matchmoving.pdf}
    }
    
  445. A.A. Argyros and M.I.A. Lourakis, "Tracking skin-colored objects in real-time", Cutting Edge Robotics, Advanced Robotic Systems International, pp. 77-90, 2005.
    [Abstract] [BibTeX] [PDF] [URL]

  446. Abstract: We present a methodology for tracking multiple skin-colored objects in a monocular image sequence. The proposed approach encompasses a collection of techniques that allow the modeling, detection and temporal association of skin-colored objects across image sequences. A non-parametric model of skin color is employed. Skin-colored objects are detected with a Bayesian classifier that is bootstrapped with a small set of training data and refined through an off-line iterative training procedure. By using on-line adaptation of skin-color probabilities the classifier is able to cope with considerable illumination changes. Tracking over time is achieved by a novel technique that can handle multiple objects simultaneously. Tracked objects may move in complex trajectories, occlude each other in the field of view of a possibly moving camera and vary in number over time. A prototype implementation of the developed system operates on 320x240 live video in real time (28Hz), running on a conventional Pentium IV processor. Representative experimental results from the application of this prototype to image sequences are also presented.
    BibTeX:
    @incollection{Argyros2005,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Tracking skin-colored objects in real-time},
      booktitle = {Cutting Edge Robotics},
      publisher = {Advanced Robotic Systems International},
      year = {2005},
      pages = {77--90},
      projects =  {ACTIPRET},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_xx_book_Cutting_Edge_Robotics_Book_2D_tracking.pdf}
    }
    
  447. M.I.A. Lourakis and A.A. Argyros, "Efficient, causal camera tracking in unprepared environments", Computer Vision and Image Understanding, Elsevier, vol. 99, no. 2, pp. 259-290, 2005.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  448. Abstract: This paper addresses the problem of tracking the 3D pose of a camera in space, using the images it acquires while moving freely in unmodeled, arbitrary environments. A novel feature-based approach for camera tracking is proposed, intended to facilitate tracking in on-line, time-critical applications such as video see-through augmented reality. In contrast to several existing methods which are designed to operate in a batch, off-line mode, assuming that the whole video sequence to be tracked is available before tracking commences, the proposed method operates on images incrementally. At its core lies a feature-based 3D plane tracking technique, which permits the estimation of the homographies induced by a virtual 3D plane between successive image pairs. Knowledge of these homographies allows the corresponding projection matrices encoding camera motion to be expressed in a common projective frame and, therefore, to be recovered directly, without estimating 3D structure. Projective camera matrices are then upgraded to Euclidean and used for recovering structure, which is in turn employed for refining the projection matrices through local resectioning. The proposed approach is causal, is tolerant to erroneous and missing feature matches, does not require modifications of the environment and has computational requirements that permit a near real-time implementation. Extensive experimental results demonstrating the performance of the approach on several image sequences are included.
    BibTeX:
    @article{Lourakis2005,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Efficient, causal camera tracking in unprepared environments},
      journal = {Computer Vision and Image Understanding},
      publisher = {Elsevier},
      year = {2005},
      volume = {99},
      number = {2},
      pages = {259--290},
      url = {http://users.ics.forth.gr/~lourakis/camtrack/},
      projects =  {LIFEPLUS},
      doi = {10.1016/j.cviu.2005.02.001},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2005_xx_journal_cviu_camera_tracking.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/CameraTracker.avi}
    }
    
  449. A.A. Argyros, D.P. Tsakiris and C. Groyer, "Biomimetic centering behavior [mobile robots with panoramic sensors]", IEEE Robotics and Automation Magazine, IEEE, vol. 11, no. 4, pp. 21-30, December 2004.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  450. Abstract: A reactive robotic centering behavior based on panoramic vision is presented. It is inspired by the way insects exploit visual information in analogous navigation tasks. By employing a panoramic camera, the development of the centering behavior is simplified both from a theoretical and from an implementation point of view. The proposed method relies on the extraction of primitive visual information from appropriately selected areas of a panoramic visual field and its direct use in the control law. Experimental results from an implementation of this method on a robotic platform demonstrate a centering behavior which can be achieved in real-time and with high accuracy. The proposed technique circumvents the need to address complex problems of 3D structure estimation and the resulting control laws were shown to possess the required stability properties.
    BibTeX:
    @article{Argyros2004a,
      author = {Argyros, Antonis A and Tsakiris, Dimitris P and Groyer, Cedric},
      title = {Biomimetic centering behavior [mobile robots with panoramic sensors]},
      journal = {IEEE Robotics and Automation Magazine},
      publisher = {IEEE},
      year = {2004},
      month = {December},
      volume = {11},
      number = {4},
      pages = {21--30},
      url = {http://users.ics.forth.gr/~argyros/res_biomimeticcentering.html},
      doi = {10.1109/MRA.2004.1371612},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_12_journal_ram_biomimetic_centering.pdf},
      videolink = {http://users.ics.forth.gr/~argyros/support/imgvideo/CorridorFollowing.avi}
    }
    
  451. M.I.A. Lourakis, A.A. Argyros and K. Marias, "A graph-based approach to corner matching using mutual information as a local similarity measure", In IEEE International Conference on Pattern Recognition (ICPR 2004), IEEE, pp. 827-830, Cambridge, UK, August 2004.
    [Abstract] [BibTeX] [DOI] [PDF]

  452. Abstract: Corner matching constitutes a fundamental vision problem that serves as a building block of several important applications. The common approach to dealing with this problem starts by ranking potential matches according to their affinity, which is assessed with the aid of window-based intensity similarity measures. Then, actual matches are established by optimizing global criteria involving all potential matches. This paper puts forward a novel approach for solving the corner matching problem that uses mutual information as a window similarity measure, combined with graph matching techniques for determining a matching of corners that is globally optimal. Experimental results illustrate the effectiveness of the approach.
    BibTeX:
    @inproceedings{Lourakis2004a,
      author = {Lourakis, Manolis I A and Argyros, Antonis A and Marias, Kostas},
      title = {A graph-based approach to corner matching using mutual information as a local similarity measure},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2004)},
      publisher = {IEEE},
      year = {2004},
      month = {August},
      volume = {2},
      pages = {827--830},
      address = {Cambridge, UK},
      doi = {10.1109/ICPR.2004.1334386},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_08_icpr_corner_matching.pdf}
    }
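    A minimal sketch of mutual information as a window similarity measure, the role it plays in the matching scheme above; the window size and the number of grey-level bins are assumptions for illustration, not the paper's settings.
    Python sketch:
      import numpy as np

      def mutual_information(window_a, window_b, bins=16):
          # Mutual information of the grey levels of two equally sized image windows.
          joint, _, _ = np.histogram2d(window_a.ravel(), window_b.ravel(),
                                       bins=bins, range=[[0, 256], [0, 256]])
          pxy = joint / joint.sum()
          px = pxy.sum(axis=1, keepdims=True)   # marginal of window A
          py = pxy.sum(axis=0, keepdims=True)   # marginal of window B
          nonzero = pxy > 0                     # avoid log(0)
          return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))
    Higher values indicate statistically more dependent windows; such scores can rank candidate corner correspondences before a global (e.g. graph-based) assignment decides the final matching.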
    
  453. M.I.A. Lourakis and A.A. Argyros, "Vision-based camera motion recovery for augmented reality", In Computer Graphics International (CGI 2004), pp. 569-576, Hersonissos, Crete, Greece, June 2004.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  454. Abstract: We address the problem of tracking the 3D position and orientation of a camera, using the images it acquires while moving freely in unmodeled, arbitrary environments. This task has a broad spectrum of useful applications in domains such as augmented reality and video post production. Most of the existing methods for vision-based camera tracking are designed to operate in a batch, off-line mode, assuming that the whole video sequence to be tracked is available before tracking commences. Typically, such methods operate noncausally, processing video frames backwards and forwards in time as they see fit. Furthermore, they resort to optimization in very high dimensional spaces, a process that is computationally intensive. For these reasons, batch methods are inapplicable to tracking in online, time-critical applications such as video see-through augmented reality. This paper puts forward a novel feature-based approach for camera tracking. The proposed approach operates on images continuously as they are acquired, has realistic computational requirements and does not require modifications of the environment. Sample experimental results demonstrating the feasibility of the approach on video images are also provided
    BibTeX:
    @inproceedings{Lourakis2004b,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Vision-based camera motion recovery for augmented reality},
      booktitle = {Computer Graphics International (CGI 2004)},
      year = {2004},
      month = {June},
      pages = {569--576},
      address = {Hersonissos, Crete, Greece},
      url = {http://users.ics.forth.gr/~lourakis/camtrack/},
      projects =  {LIFEPLUS},
      doi = {10.1109/CGI.2004.1309266},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_06_cgi_camera_tracking.pdf}
    }
    
  455. K.E. Bekris, A.A. Argyros and L.E. Kavraki, "Angle-based methods for mobile robot navigation: reaching the entire plane", In IEEE International Conference on Robotics and Automation (ICRA 2004), IEEE, pp. 2373-2378, Barcelona, Spain, April 2004.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  456. Abstract: Popular approaches for mobile robot navigation involve range information and metric maps of the workspace. For many sensors, however, such as cameras and wireless hardware, the angle between two features or beacons is easier to measure. With these sensors' features in mind, we initially present a control law, which allows a robot with an omni-directional sensor to reach a subset of the plane by monitoring the angles of only three landmarks. By analyzing the law's properties, a second law has been developed that reaches the complementary set of points. The two methods are then combined in a path planning framework that reaches any possible goal configuration in a planar obstacle-free workspace with three landmarks. The proposed framework could be used together with other techniques, such as obstacle avoidance and topological maps to improve the efficiency of autonomous navigation. Experiments have been conducted on a robotic platform using a panoramic camera that exhibits the effectiveness and accuracy of the proposed techniques. This work provides evidence that navigational tasks can be performed using only a small number of primitive sensor cues and without the explicit computation of range information.
    BibTeX:
    @inproceedings{Bekris2004,
      author = {Bekris, Kostas E and Argyros, Antonis A and Kavraki, Lydia E.},
      title = {Angle-based methods for mobile robot navigation: reaching the entire plane},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 2004)},
      publisher = {IEEE},
      year = {2004},
      month = {April},
      volume = {3},
      pages = {2373--2378},
      address = {Barcelona, Spain},
      url = {http://users.ics.forth.gr/~argyros/res_anglebasednavigation.html},
      doi = {10.1109/ROBOT.2004.1307416},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_04_icra_angle_based_navigation.pdf}
    }
    
  457. A.A. Argyros and M.I.A. Lourakis, "Real-time tracking of multiple skin-colored objects with a possibly moving camera", In European Conference on Computer Vision (ECCV 2004), Springer, pp. 368-379, Prague, Czech Republic, January 2004.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  458. Abstract: This paper presents a method for tracking multiple skin-colored objects in images acquired by a possibly moving camera. The proposed method encompasses a collection of techniques that enable the modeling and detection of skin-colored objects as well as their temporal association in image sequences. Skin-colored objects are detected with a Bayesian classifier which is bootstrapped with a small set of training data. Then, an off-line iterative training procedure is employed to refine the classifier using additional training images. On-line adaptation of skin-color probabilities is used to enable the classifier to cope with illumination changes. Tracking over time is realized through a novel technique which can handle multiple skin-colored objects. Such objects may move in complex trajectories and occlude each other in the field of view of a possibly moving camera. Moreover, the number of tracked objects may vary in time. A prototype implementation of the developed system operates on 320x240 live video in real time (28Hz) on a conventional Pentium 4 processor. Representative experimental results from the application of this prototype to image sequences are also provided.
    BibTeX:
    @inproceedings{Argyros2004b,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Real-time tracking of multiple skin-colored objects with a possibly moving camera},
      booktitle = {European Conference on Computer Vision (ECCV 2004)},
      publisher = {Springer},
      year = {2004},
      month = {January},
      pages = {368--379},
      address = {Prague, Czech Republic},
      url = {http://users.ics.forth.gr/~argyros/res_colortracking.html},
      projects =  {ACTIPRET},
      doi = {10.1007/978-3-540-24672-5_29},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_05_eccv_hand_tracking_2d.pdf},
      videolink = {https://youtu.be/m4qv9rK8k9s}
    }
    
  459. A.A. Argyros and M.I.A. Lourakis, "Three-dimensional tracking of multiple skin-colored regions by a moving stereoscopic system", Applied Optics, [New York: Optical Society of America], 1962-, vol. 43, no. 2, pp. 366-378, 2004.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  460. Abstract: A system that performs three-dimensional (3D) tracking of multiple skin-colored regions (SCRs) in images acquired by a calibrated, possibly moving stereoscopic rig is described. The system consists of a collection of techniques that permit the modeling and detection of SCRs, the determination of their temporal association in monocular image sequences, the establishment of their correspondence between stereo images, and the extraction of their 3D positions in a world-centered coordinate system. The development of these techniques has been motivated by the need for robust, near-real-time tracking performance. SCRs are detected by use of a Bayesian classifier that is trained with the aid of a novel technique. More specifically, the classifier is bootstrapped with a small set of training data. Then, as new images are being processed, an iterative training procedure is employed to refine the classifier. Furthermore, a technique is proposed to enable the classifier to cope with changes in illumination. Tracking of SCRs in time as well as matching of SCRs in the images of the employed stereo rig is performed through computationally inexpensive and robust techniques. One of the main characteristics of the skin-colored region tracker (SCRT) instrument is its ability to report the 3D positions of SCRs in a world-centered coordinate system by employing a possibly moving stereo rig with independently verging CCD cameras. The system operates on images of dimensions 640x480 pixels at a rate of 13 Hz on a conventional Pentium 4 processor at 1.8 GHz. Representative experimental results from the application of the SCRT to image sequences are also provided.
    BibTeX:
    @article{Argyros2004,
      author = {Argyros, Antonis A and Lourakis, Manolis I A},
      title = {Three-dimensional tracking of multiple skin-colored regions by a moving stereoscopic system},
      journal = {Applied Optics},
      publisher = {Optical Society of America},
      year = {2004},
      volume = {43},
      number = {2},
      pages = {366--378},
      url = {http://users.ics.forth.gr/~argyros/res_colortracking.html},
      projects =  {ACTIPRET},
      doi = {10.1364/AO.43.000366},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_04_journal_aoip_vol43_no2_pp366-378_3d_skin_tracking.pdf}
    }
    
  461. M.I.A. Lourakis and A.A. Argyros, "The Design and Implementation of a Generic Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm", FORTH-ICS, TR-340, 2004.
    [Abstract] [BibTeX] [PDF] [URL]

  462. Abstract: Bundle adjustment using the Levenberg-Marquardt minimization algorithm is almost invariably used as the last step of every feature-based structure and motion estimation vision algorithm to obtain optimal 3D structure and viewing parameter estimates. However, due to the large number of unknowns contributing to the minimized reprojection error, a general purpose implementation of the Levenberg-Marquardt algorithm incurs high computational costs when applied to the problem of bundle adjustment. Fortunately, the lack of interaction among parameters for different 3D points and cameras in multiple view reconstruction results in the under-lying normal equations exhibiting a sparse block structure, which can be exploited to gain considerable computational benefits. This paper presents the design and explains the use of sba, a publicly available C/C++ software package for generic bundle adjustment based on the sparse Levenberg-Marquardt algorithm.
    BibTeX:
    @techreport{Lourakis2004,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {The Design and Implementation of a Generic Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm},
      school = {FORTH-ICS},
      year = {2004},
      number = {TR-340},
      url = {http://users.ics.forth.gr/~lourakis/sba/},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2004_08_tr340_forth_sba.pdf}
    }
    
  463. H. Baltzakis, A.A. Argyros and P.E. Trahanias, "Fusion of laser and visual data for robot motion planning and collision avoidance", Machine Vision and Applications, Springer, vol. 15, no. 2, pp. 92-100, December 2003.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  464. Abstract: In this paper, a method for inferring scene structure information based on both laser and visual data is proposed. Common laser scanners employed in contemporary robotic systems provide accurate range measurements, but only in 2D slices of the environment. On the other hand, vision is capable of providing dense 3D information of the environment. The proposed fusion scheme combines the accuracy of laser sensors with the broad visual fields of cameras toward extracting accurate scene structure information. Data fusion is achieved by validating 3D structure assumptions formed according to 2D range scans of the environment, through the exploitation of visual information. The proposed methodology is applied to robot motion planning and collision avoidance tasks by using a suitably modified version of the vector field histogram algorithm. Experimental results confirm the effectiveness of the proposed methodology.
    BibTeX:
    @article{Baltzakis2003,
      author = {Baltzakis, Haris and Argyros, Antonis A and Trahanias, Panos E},
      title = {Fusion of laser and visual data for robot motion planning and collision avoidance},
      journal = {Machine Vision and Applications},
      publisher = {Springer},
      year = {2003},
      month = {December},
      volume = {15},
      number = {2},
      pages = {92--100},
      url = {http://users.ics.forth.gr/~argyros},
      doi = {10.1007/s00138-003-0133-2},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/2003_10_journal_mva_vol15_pp92-100_fusion_laser_vision.pdf}
    }
    
  465. W. Burgard, P.E. Trahanias, D. Hähnel, M. Moors, D. Schulz, H. Baltzakis and A.A. Argyros, "Tele-Presence in Populated Exhibitions Through Web-Operated Mobile Robots", Autonomous Robots, Springer, vol. 15, no. 3, pp. 299-316, November 2003.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  466. Abstract: This paper presents techniques that facilitate mobile robots to be deployed as interactive agents in populated environments such as museum exhibitions or trade shows. The mobile robots can be tele-operated over the Internet and, this way, provide remote access to distant users. Throughout this paper we describe several key techniques that have been developed in this context. To support safe and reliable robot navigation, techniques for environment mapping, robot localization, obstacle detection and people-tracking have been developed. To support the interaction of both web and on-site visitors with the robot and its environment, appropriate software and hardware interfaces have been employed. By using advanced navigation capabilities and appropriate authoring tools, the time required for installing a robotic tour-guide in a museum or a trade fair has been drastically reduced. The developed robotic systems have been thoroughly tested and validated in the real-world conditions offered in the premises of various sites. Such demonstrations ascertain the functionality of the employed techniques, establish the reliability of the complete systems, and provide useful evidence regarding the acceptance of tele-operated robotic tour-guides by the broader public.
    BibTeX:
    @article{Burgard2003,
      author = {Burgard, Wolfram and Trahanias, Panos E and Hähnel, Dirk and Moors, Mark and Schulz, Dirk and Baltzakis, Haris and Argyros, Antonis A},
      title = {Tele-Presence in Populated Exhibitions Through Web-Operated Mobile Robots},
      journal = {Autonomous Robots},
      publisher = {Springer},
      year = {2003},
      month = {November},
      volume = {15},
      number = {3},
      pages = {299--316},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {WEBFAIR,TOURBOT},
      doi = {10.1023/A:1026272605502},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2003_xx_journal_ar_vol15_no2_pp299-316_robots_in_exhibitions.pdf}
    }
    
  467. P.E. Trahanias, W. Burgard, D. Haehnel, M. Moors, D. Schulz, H. Baltzakis and A.A. Argyros, "Interactive tele-presence in exhibitions through web-operated robots", In International Conference on Advanced Robotics (ICAR 2003), invited session on Robotics and Art, pp. 1253-1258, Coimbra, Portugal, June 2003.
    [Abstract] [BibTeX] [PDF] [URL]

  468. Abstract: The current paper presents techniques that facilitate mobile robots to be deployed as interactive agents in populated environments, such as museum exhibitions or trade shows. The mobile robots can be tele-operated over the Internet and this way provide remote access to distant users. Throughout this paper we describe several key techniques that have been developed in the relevant projects. They include robust mapping and localization, people-tracking and advanced visualizations for Web users. The developed robotic systems have been installed and operated in the premises of various sites. Use of the above techniques, as well as appropriate authoring tools, has resulted in drastic reduction in the installation times. Additionally, the systems were thoroughly tested and validated in real-world conditions. Such demonstrations ascertain the functionality and reliability of our methods and provide evidence as of the operation of the complete systems.
    BibTeX:
    @inproceedings{Trahanias2003,
      author = {Trahanias, Panos E and Burgard, Wolfram and Haehnel, Dirk and Moors, Mark and Schulz, Dirk and Baltzakis, Haris and Argyros, Antonis A},
      title = {Interactive tele-presence in exhibitions through web-operated robots},
      booktitle = {International Conference on Advanced Robotics (ICAR 2003), invited session on Robotics and Art},
      year = {2003},
      month = {June},
      pages = {1253--1258},
      address = {Coimbra, Portugal},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {WEBFAIR,TOURBOT},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2003_06_icar_exhibitions.pdf}
    }
    
  469. M.I.A. Lourakis, S.V. Tzurbakis, A.A. Argyros and S.C. Orphanoudakis, "Feature transfer and matching in disparate stereo views through the use of plane homographies", IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, vol. 25, no. 2, pp. 271-276, February 2003.
    [Abstract] [BibTeX] [DOI] [PDF]

  470. Abstract: Many vision tasks rely upon the identification of sets of corresponding features among different images. This paper presents a method that, given some corresponding features in two stereo images, matches them with features extracted from a second stereo pair captured from a distant viewpoint. The proposed method is based on the assumption that the viewed scene contains two planar surfaces and exploits geometric constraints that are imposed by the existence of these planes to first transfer and then match image features between the two stereo pairs. The resulting scheme handles point and line features in a unified manner and is capable of successfully matching features extracted from stereo pairs that are acquired from considerably different viewpoints. Experimental results are presented, which demonstrate that the performance of the proposed method compares favorably to that of epipolar and tensor-based approaches.
    BibTeX:
    @article{Lourakis2003a,
      author = {Lourakis, Manolis I A and Tzurbakis, Stavros V and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Feature transfer and matching in disparate stereo views through the use of plane homographies},
      journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
      publisher = {IEEE},
      year = {2003},
      month = {February},
      volume = {25},
      number = {2},
      pages = {271--276},
      doi = {10.1109/TPAMI.2003.1177157},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2003_02_journal_pami_feature_matching.pdf}
    }
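    
    Note (illustrative, not from the paper): once a plane homography H between two views is available, points transfer as x' ~ H x and lines as l' ~ H^-T l. The Python sketch below shows just this standard transfer step; estimating the two plane homographies and verifying the transferred features, which is where the paper's contribution lies, is not shown. Function names are assumptions.
    
      import numpy as np
      
      def transfer_points(H, pts):
          """Map 2D points through a plane-induced homography H (3x3).
      
          pts : (N, 2) array of pixel coordinates in the first view.
          Returns the (N, 2) transferred coordinates in the second view.
          """
          pts_h = np.hstack([pts, np.ones((len(pts), 1))])   # to homogeneous coordinates
          mapped = (H @ pts_h.T).T
          return mapped[:, :2] / mapped[:, 2:3]               # back to inhomogeneous
      
      def transfer_line(H, line):
          """Lines transform with the inverse transpose: l' ~ H^-T l."""
          lp = np.linalg.inv(H).T @ np.asarray(line, dtype=float)
          return lp / np.linalg.norm(lp[:2])                  # normalise (a, b, c)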
    
  471. S.C. Orphanoudakis, A.A. Argyros and M. Vincze, "Towards a cognitive vision methodology: understanding and interpreting activities of experts", ERCIM News, no. 53, 2003.
    [BibTeX] [PDF] [URL] [VIDEO]

  472. BibTeX:
    @periodical{Orphanoudakis2003,
      author = {Orphanoudakis, Stelios C and Argyros, Antonis A and Vincze, Markus},
      title = {Towards a cognitive vision methodology: understanding and interpreting activities of experts},
      journal = {ERCIM News},
      year = {2003},
      number = {53},
      url = {http://users.ics.forth.gr/ argyros/res_colortracking.html},
      projects =  {ACTIPRET},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2003_04_report_ercim_actipret.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/ActipretVideo.mpg}
    }
    
  473. K. Bekris, A.A. Argyros and L. Kavraki, "Angle-Based Methods for Mobile Robot Navigation: Reaching the Entire Plane", Rice University, TR03-426, 2003.
    [BibTeX] [URL] [VIDEO]

  474. BibTeX:
    @techreport{Bekris2003,
      author = {Bekris, Kostas and Argyros, Antonis A and Kavraki, Lydia},
      title = {Angle-Based Methods for Mobile Robot Navigation: Reaching the Entire Plane},
      school = {Rice University},
      year = {2003},
      number = {TR03-426},
      address = {Houston, Texas, USA},
      url = {http://users.ics.forth.gr/ argyros/res_anglebasednavigation.html},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/snapshot.mpg}
    }
    
  475. M.I.A. Lourakis and A.A. Argyros, "Efficient 3D Camera Matchmoving Using Markerless, Segmentation-Free Plane Tracking", FORTH-ICS, TR-324, 2003.
    [Abstract] [BibTeX] [PDF] [URL]

  476. Abstract: We address the problem of tracking the position and orientation of a camera in 3D space, using the images it acquires while moving freely in unmodeled, arbitrary environments. This task has a broad spectrum of useful applications in domains such as augmented reality and video post production. Most of the existing methods for camera tracking are designed to operate in a batch, off-line mode, assuming that the whole video sequence to be tracked is available before tracking commences. Typically, such methods operate non-causally, processing video frames backwards and forwards in time as they see fit. Furthermore, they resort to optimization in very high dimensional spaces, a process that is computationally intensive. For these reasons, batch methods are inapplicable to tracking in on-line, time-critical applications such as video see-through augmented reality. This paper puts forward a novel feature-based approach for camera tracking. The proposed approach operates continuously as images are acquired, has realistic computational requirements and does not require modifications of the environment. At its core lies a novel, feature-based 3D plane tracking technique, which permits the estimation of the homographies induced by a virtual 3D plane between successive image pairs. Knowledge of these homographies allows the corresponding projection matrices encoding camera motion to be expressed in a common projective frame and, therefore, to be recovered directly. Projective camera matrices are then upgraded to Euclidean and used for recovering 3D structure, which is in turn employed for refining the projection matrices through local bundle adjustment. Sample experimental results demonstrating the feasibility of the approach on several image sequences are also provided.
    BibTeX:
    @techreport{Lourakis2003,
      author = {Lourakis, Manolis I A and Argyros, Antonis A},
      title = {Efficient 3D Camera Matchmoving Using Markerless, Segmentation-Free Plane Tracking},
      school = {FORTH-ICS},
      year = {2003},
      number = {TR-324},
      url = {http://users.ics.forth.gr/ lourakis/camtrack/},
      projects =  {LIFEPLUS},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2003_09_tr324_forth_camera_tracking.pdf}
    }
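    
    Note (illustrative, not from the paper): the approach relies on homographies induced by a (virtual) 3D plane between successive frames. The sketch below, which assumes OpenCV's cv2.findHomography, merely estimates inter-frame homographies with RANSAC and chains them so every frame is related to the first; the segmentation-free virtual-plane tracking, the projective-to-Euclidean upgrade and the local bundle adjustment described in the abstract are not reproduced.
    
      import numpy as np
      import cv2
      
      def chain_plane_homographies(point_tracks):
          """Chain plane-induced homographies over a sequence.
      
          point_tracks : list of (pts_prev, pts_next) pairs, each an (N, 2) float32
          array of matched points between successive frames that are consistent
          with a single plane.  Returns the list of 3x3 homographies H_{0->k}.
          """
          H_acc = np.eye(3)
          chained = [H_acc.copy()]
          for pts_prev, pts_next in point_tracks:
              # RANSAC homography between consecutive frames (H may be None on failure).
              H, mask = cv2.findHomography(pts_prev, pts_next, cv2.RANSAC, 3.0)
              H_acc = H @ H_acc                      # compose: frame 0 -> current frame
              chained.append(H_acc / H_acc[2, 2])    # keep a canonical scale
          return chained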
    
  477. D.P. Tsakiris, A.A. Argyros and C. Groyer, "Experiments in Corridor Following with Nonholonomic Mobile Robots Equipped with Panoramic Cameras", Institute of Computer Science - FORTH, TR-318, 2003.
    [BibTeX] [URL] [VIDEO]

  478. BibTeX:
    @techreport{Tsakiris2003,
      author = {Tsakiris, Dimitris P and Argyros, Antonis A and Groyer, Cedric},
      title = {Experiments in Corridor Following with Nonholonomic Mobile Robots Equipped with Panoramic Cameras},
      school = {Institute of Computer Science - FORTH},
      year = {2003},
      number = {TR-318},
      url = {http://users.ics.forth.gr/ argyros/res_biomimeticcentering.html},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/CorridorFollowing.avi}
    }
    
  479. W. Burgard, P. Trahanias, D. Hähnel, M. Moors, D. Schulz, H. Baltzakis and A.A. Argyros, "TOURBOT and WebFAIR: Web-operated mobile robots for tele-presence in populated exhibitions", In IEEE/RSJ International Conference on Intelligent Robots and Systems Workshops (IROSW 2002), IEEE, pp. 1-10, Lausanne, Switzerland, October 2002.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  480. Abstract: The current paper presents techniques that facilitate mobile robots to be deployed as interactive agents in populated environments, such as museum exhibitions or trade shows. The mobile robots can be tele-operated over the Internet and this way provide remote access to distant users. Throughout this paper we describe several key techniques that have been developed in the relevant projects. They include robust mapping and localization, people-tracking and advanced visualizations for Web users. The developed robotic systems have been installed and operated in the premises of various sites. Use of the above techniques, as well as appropriate authoring tools, has resulted in drastic reduction in the installation times. Additionally, the systems were thoroughly tested and validated in real-world conditions. Such demonstrations ascertain the functionality and reliability of our methods and provide evidence as of the operation of the complete systems.
    BibTeX:
    @inproceedings{Burgard2002,
      author = {Burgard, Wolfram and Trahanias, Panos and Hähnel, Dirk and Moors, Mark and Schulz, Dirk and Baltzakis, Haris and Argyros, Antonis A},
      title = {TOURBOT and WebFAIR: Web-operated mobile robots for tele-presence in populated exhibitions},
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems Workshops (IROSW 2002)},
      publisher = {IEEE},
      year = {2002},
      month = {October},
      pages = {1--10},
      address = {Lausanne, Switzerland},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {WEBFAIR,TOURBOT},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2002_10_iros_exhibitions.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/TourbotInBrief.mpg}
    }
    
  481. G.D. Kazazakis and A.A. Argyros, "Fast positioning of limited-visibility guards for the inspection of 2D workspaces", In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002), IEEE, pp. 2843-2848, Lausanne, Switzerland, October 2002.
    [Abstract] [BibTeX] [DOI] [PDF]

  482. Abstract: This paper presents a novel method for deciding the locations of "guards" required to visually inspect a given 2D workspace. The decided guard positions can then be used as control points in the path of a mobile robot that autonomously inspects a workspace. It is assumed that each of the guards (or the mobile robot that visits the guard positions in some order) is equipped with a panoramic camera of 360 degrees field of view. However, the camera has limited visibility, in the sense that it can observe with sufficient detail objects that are not further than a predefined visibility range. The method seeks to efficiently produce solutions that contain the smallest possible number of guards. Experimental results demonstrate that the proposed method is computationally efficient and that, although suboptimal, decides a small number of guards.
    BibTeX:
    @inproceedings{Kazazakis2002,
      author = {Kazazakis, Giorgos D. and Argyros, Antonis A},
      title = {Fast positioning of limited-visibility guards for the inspection of 2D workspaces},
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2002)},
      publisher = {IEEE},
      year = {2002},
      month = {October},
      volume = {3},
      pages = {2843--2848},
      address = {Lausanne, Switzerland},
      doi = {10.1109/IRDS.2002.1041701},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2002_10_inspection_panoramic.pdf}
    }
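    
    Note (illustrative, not from the paper): guard placement of this kind is closely related to set cover. The Python sketch below shows a naive greedy placement over a grid of free cells using only the limited visibility radius; occlusion handling and the paper's efficiency-oriented machinery are omitted, and all names and parameters are assumptions.
    
      import numpy as np
      
      def greedy_guards(free_cells, radius):
          """Greedy, set-cover style placement of limited-range guards.
      
          free_cells : (N, 2) array of free-space cell coordinates to be inspected.
          radius     : visibility range of the 360-degree sensor.
          Returns guard positions chosen among the free cells.  Occlusions are
          ignored here; a fuller implementation would restrict each guard's
          coverage to the cells it actually sees.
          """
          d = np.linalg.norm(free_cells[:, None, :] - free_cells[None, :, :], axis=2)
          covers = d <= radius                      # covers[i, j]: guard i sees cell j
          uncovered = np.ones(len(free_cells), dtype=bool)
          guards = []
          while uncovered.any():
              gain = (covers & uncovered[None, :]).sum(axis=1)
              best = int(np.argmax(gain))           # candidate covering most remaining cells
              guards.append(free_cells[best])
              uncovered &= ~covers[best]
          return guards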
    
  483. H. Baltzakis, A.A. Argyros and P.E. Trahanias, "Fusion of range and visual data for the extraction of scene structure information", In IEEE International Conference on Pattern Recognition (ICPR 2002), IEEE, pp. 7-11, Quebec City, Canada, August 2002.
    [Abstract] [BibTeX] [DOI] [PDF]

  484. Abstract: In this paper a method for inferring 3D structure information based on both range and visual data is proposed. Data fusion is achieved by validating assumptions formed according to 2D range scans of the environment, through the exploitation of visual information. The proposed method is readily applicable to robot navigation tasks providing significant advantages over existing methods.
    BibTeX:
    @inproceedings{Baltzakis2002,
      author = {Baltzakis, Haris and Argyros, Antonis A and Trahanias, Panos E},
      title = {Fusion of range and visual data for the extraction of scene structure information},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2002)},
      publisher = {IEEE},
      year = {2002},
      month = {August},
      volume = {4},
      pages = {7--11},
      address = {Quebec City, Canada},
      projects =  {WEBFAIR,ACTIPRET},
      doi = {10.1109/ICPR.2002.1047388},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2002_08_icpr_fusion_range_vision.pdf}
    }
    
  485. A.A. Argyros, P. Georgiadis, P.E. Trahanias and D.P. Tsakiris, "Semi-autonomous navigation of a robotic wheelchair", Journal of Intelligent and Robotic Systems, Kluwer Academic Publishers, vol. 34, no. 3, pp. 315-329, 2002.
    [Abstract] [BibTeX] [DOI] [URL] [VIDEO]

  486. Abstract: The present work considers the development of a wheelchair for people with special needs, which is capable of navigating semi-autonomously within its workspace. This system is expected to prove useful to people with impaired mobility and limited fine motor control of the upper extremities. Among the implemented behaviors of this robotic system are the avoidance of obstacles, the motion in the middle of the free space and the following of a moving target specified by the user (e.g., a person walking in front of the wheelchair). The wheelchair is equipped with sonars, which are used for distance measurement in preselected critical directions, and with a panoramic camera with a 360 degree field of view, which is used for following a moving target. After suitably processing the color sequence of the panoramic images using the color histogram of the desired target, the orientation of the target with respect to the wheelchair is determined, while its distance is determined by the sonars. The motion control laws developed for the system use the sensory data and take into account the non-holonomic kinematic constraints of the wheelchair, in order to guarantee certain desired features of the closed-loop system, such as stability. Moreover, they are as simplified as possible to minimize implementation requirements. An experimental prototype has been developed at ICS-FORTH, based on a commercially-available wheelchair. The sensors, the computing power and the electronics needed for the implementation of the navigation behaviors and of the user interfaces (touch screen, voice commands) were developed as add-on modules and integrated with the wheelchair.
    BibTeX:
    @article{Argyros2002,
      author = {Argyros, Antonis A and Georgiadis, Pantelis and Trahanias, Panos E and Tsakiris, Dimitris P},
      title = {Semi-autonomous navigation of a robotic wheelchair},
      journal = {Journal of Intelligent and Robotic Systems},
      publisher = {Kluwer Academic Publishers},
      year = {2002},
      volume = {34},
      number = {3},
      pages = {315--329},
      url = {http://users.ics.forth.gr/ argyros/res_pan_track.html},
      doi = {10.1023/A:1016371922451},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/ChairFollowsPerson.mpg}
    }
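    
    Note (illustrative, not from the paper): the target-following behavior determines the target's orientation from the color histogram response in a panoramic image. The Python sketch below gives one hedged reading of that step, histogram back-projection on an unwrapped (cylindrical) panoramic hue image followed by picking the strongest column; the actual tracker, the sonar-based distance estimation and the control laws are not shown, and all names are assumptions.
    
      import numpy as np
      
      def target_bearing(hue_img, target_hist, n_bins=32):
          """Estimate the bearing of a colour-defined target in a panoramic image.
      
          hue_img     : (H, W) array of hue values in [0, 1); image columns are
                        assumed to map linearly to viewing directions over 360 deg.
          target_hist : (n_bins,) normalised hue histogram of the target.
          Returns the bearing (rad) of the column with the strongest response.
          """
          bins = np.minimum((hue_img * n_bins).astype(int), n_bins - 1)
          backproj = target_hist[bins]                 # histogram back-projection
          column_score = backproj.sum(axis=0)          # accumulate evidence per column
          col = int(np.argmax(column_score))
          return 2.0 * np.pi * col / hue_img.shape[1]  # column index -> angle over 360 deg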
    
  487. K. Bekris, K. Hatzopoulos, G. Kazazakis, G. Kontolemakis, M. Masvoula, N. Tsivourakis, A.A. Argyros and P. Trahanias, "PYTHEAS: an integrated robotic system with autonomous navigation capabilities", Image Processing and Applications, Image Processing & Communications Journal, Special issue on Intelligent Sensing, De Gruyter publications, vol. 8, no. 2, pp. 81-92, 2002.
    [Abstract] [BibTeX] [PDF]

  488. Abstract: In this paper we present PYTHEAS, an integrated robotic software system that supports autonomous navigation capabilities. These include localization, workspace mapping, path planning and tracking, and obstacle avoidance. PYTHEAS enables mapping of an unknown indoor environment by exploiting sensory information extracted from a laser scanner. Based on this acquired environment representation, the system is able to navigate autonomously in the mapped workspace, avoiding at the same time dynamic obstacles, such as moving persons or other objects. The developed competences are coupled in an integrated system, which can be controlled through a user-friendly interface over the web. Experimental results demonstrate the ability of the developed system to map complicated environments and support navigation in dynamic worlds.
    BibTeX:
    @article{Bekris2002,
      author = {Bekris, Kostas and Hatzopoulos, Kostas and Kazazakis, Giorgos and Kontolemakis, Giorgos and Masvoula, M and Tsivourakis, Nikos and Argyros, Antonis A and Trahanias, Panos},
      title = {PYTHEAS: an integrated robotic system with autonomous navigation capabilities},
      journal = {Image Processing and Applications, Image Processing & Communications Journal, Special issue on Intelligent Sensing},
      publisher = {De Gruyter publications},
      year = {2002},
      volume = {8},
      number = {2},
      pages = {81--92},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2002_xx_journal_pcj_vol8_no2_pp81-92_pytheas.pdf}
    }
    
  489. M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Detecting Planes In An Uncalibrated Image Pair.", In British Machine Vision Conference (BMVC 2002), BMVA, pp. 587-596, Cardiff, UK, 2002.
    [Abstract] [BibTeX] [DOI] [PDF]

  490. Abstract: Plane detection is a prerequisite to a wide variety of vision tasks. This paper proposes a novel method that exploits results from projective geometry to automatically detect planes using two images. Using a set of point and line features that have been matched between images, the method exploits the fact that every pair of a 3D line and a 3D point defines a plane and utilizes an iterative voting scheme for identifying coplanar subsets of the employed feature set. The method does not require camera calibration, circumvents the 3D reconstruction problem, is robust to the existence of mismatched features and is applicable either to stereo or motion sequence images. Sample results from the application of the proposed method to real imagery are also provided.
    BibTeX:
    @inproceedings{Lourakis2002,
      author = {Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Detecting Planes In An Uncalibrated Image Pair.},
      booktitle = {British Machine Vision Conference (BMVC 2002)},
      publisher = {BMVA},
      year = {2002},
      pages = {587--596},
      address = {Cardiff, UK},
      doi = {10.5244/C.16.57},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2002_09_bmvc_plane_detection.pdf}
    }
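    
    Note (illustrative, not from the paper): the paper detects planes directly from two uncalibrated images, exploiting the fact that a 3D point and a 3D line span a plane and voting for coplanar feature subsets. The Python sketch below only illustrates that point-plus-line voting idea in an explicitly 3D setting, which the paper deliberately avoids (it works through homographies without 3D reconstruction); it is a conceptual toy under those assumptions, not the published algorithm.
    
      import numpy as np
      
      def detect_plane_by_voting(points, lines, tol=0.02):
          """Vote for the plane supported by the most features.
      
          points : (N, 3) array of 3D points.
          lines  : list of (p0, d) pairs: a point on a 3D line and its unit direction.
          Every (point, line) pair spans a candidate plane; the candidate with the
          most points within `tol` of it wins.
          """
          best_plane, best_votes = None, -1
          for x in points:
              for p0, d in lines:
                  normal = np.cross(d, x - p0)           # normal of the plane through the line and x
                  if np.linalg.norm(normal) < 1e-9:      # degenerate: the point lies on the line
                      continue
                  normal = normal / np.linalg.norm(normal)
                  dist = np.abs((points - p0) @ normal)  # point-to-plane distances
                  votes = int((dist < tol).sum())
                  if votes > best_votes:
                      best_plane, best_votes = (p0, normal), votes
          return best_plane, best_votes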
    
  491. M. Roussou, P.E. Trahanias, G. Giannoulis, G. Kamarinos, A.A. Argyros, D.P. Tsakiris, P. Georgiadis, W. Burgard, D. Haehnel, A. Cremers and others, "Experiences from the Use of a Robotic Avatar in a Museum Setting", In Proceedings of the 2001 conference on Virtual reality, archaeology, and cultural heritage, ACM, pp. 153-160, New York, USA, December 2001.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  492. Abstract: Access to cultural exhibits is a central issue in museums and exhibition galleries that is recently approached under a new, technological perspective. Although the cultural industries' practices in the cases of museums and cultural exhibits have remained practically unchanged for long, in recent years we are witnessing a gradual adoption of media-technologies in various aspects, such as collections archiving and digital document preservation, media- and Web-presentation, graphical animations, etc. Lately, Internet and Web-based technologies have been employed for providing access, mostly to images of exhibited objects. In few cases, the incorporation of higher-end technology, such as virtual reality, artificial intelligence, or robotics, is explored. In this paper we present such an effort, the TOURBOT project (an acronym for TOUr-guide RoBOT), which emphasizes the development of alternative ways for interactive museum telepresence, essentially through the use of robotic "avatars", and comment on the experience gained from its use in a museum setting.
    BibTeX:
    @inproceedings{Roussou2001,
      author = {Roussou, Maria and Trahanias, Panos E and Giannoulis, George and Kamarinos, George and Argyros, Antonis A and Tsakiris, Dimitris P and Georgiadis, Pantelis and Burgard, Wolfram and Haehnel, Dirk and Cremers, Armin and others},
      title = {Experiences from the Use of a Robotic Avatar in a Museum Setting},
      booktitle = {Proceedings of the 2001 conference on Virtual reality, archaeology, and cultural heritage},
      publisher = {ACM},
      year = {2001},
      month = {December},
      pages = {153--160},
      address = {New York, USA},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {TOURBOT},
      doi = {10.1145/584993.585017},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_12_vast_tourbot.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/TourbotInBrief.mpg}
    }
    
  493. A.A. Argyros, P. Georgiadis, P.E. Trahanias and D.P. Tsakiris, "Semi-Autonomous Navigation of a Robotic Wheelchair", In KTISIVIOS Panhellenic Conference in Robotics and Automation, Santorini, Greece, July 2001.
    [BibTeX] [PDF] [URL]

  494. BibTeX:
    @inproceedings{Argyros2001a,
      author = {Argyros, Antonis A and Georgiadis, Pantelis and Trahanias, Panos E and Tsakiris, Dimitris P},
      title = {Semi-Autonomous Navigation of a Robotic Wheelchair},
      booktitle = {KTISIVIOS Panhellenic Conference in Robotics and Automation},
      year = {2001},
      month = {July},
      address = {Santorini, Greece},
      url = {http://users.ics.forth.gr/ argyros/res_pan_track.html},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_07_ktisivios_wheelchair.pdf}
    }
    
  495. K. Bekris, K. Hatzopoulos, G. Kazazakis, G. Kontolemakis, M. Masvoula, N. Tsivourakis, A.A. Argyros and P.E. Trahanias, "PYTHEAS: an integrated robotic system with autonomous navigation capabilities", In KTISIVIOS Panhellenic Conference in Robotics and Automation, Santorini, Greece, July 2001.
    [BibTeX] [PDF]

  496. BibTeX:
    @inproceedings{Bekris2001,
      author = {Bekris, Kostas and Hatzopoulos, Kostas and Kazazakis, Giorgos and Kontolemakis, Giorgos and Masvoula, M and Tsivourakis, Nikos and Argyros, Antonis A and Trahanias, Panos E},
      title = {PYTHEAS: an integrated robotic system with autonomous navigation capabilities},
      booktitle = {KTISIVIOS Panhellenic Conference in Robotics and Automation},
      year = {2001},
      month = {July},
      address = {Santorini, Greece},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_07_ktisivios_pytheas.pdf}
    }
    
  497. G. Giannoulis, M. Coliou, G.S. Kamarinos, M. Roussou, P.E. Trahanias, A.A. Argyros, D.P. Tsakiris, A. Cremers, D. Schulz, W. Burgard and others, "Enhancing museum visitor access through robotic avatars connected to the web", In Museums and the Web, pp. 14-17, Seattle, USA, March 2001.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  498. Abstract: Access to cultural exhibits is a central issue in museums and exhibition galleries that is recently approached under a new, technological perspective. Although the cultural industries' practices in the cases of museums and cultural exhibits have remained practically unchanged for long, in recent years we are witnessing a gradual adoption of media-technologies in various aspects, such as collections archiving and digital document preservation, media- and Web-presentation, graphical animations, etc. The advent of such technologies contributes towards providing media-rich presentations of cultural exhibits and consequently offering better services to museum visitors. Lately, Internet and Web-based technologies have been employed for providing access, mostly to images of exhibited objects. With current technology, such access is limited due to the non-interactive nature of pre-recorded images or videos and the difficulty in constant updating of the sites when there is change in the content. In few cases, the incorporation of higher-end technology, such as virtual reality, artificial intelligence, or robotics, is explored. Some science museums, large "edutainment" venues, and recreation parks have traditionally been the ones to embrace new media first, by employing fascinating and sophisticated interactive installations and presenting up-to-date results on the creative use of technology. In this paper we present such an effort, the TOURBOT project, which emphasizes the development of alternative ways for interactive museum telepresence, essentially through the use of robotic "avatars". TOURBOT, an acronym for TOUr-guide RoBOT, represents a collaboration between museums, technology an interactive tour-guide robot able to provide individual access to museums' exhibits and cultural heritage over the Internet. TOURBOT operates as the user's surrogate persona (avatar) in the museum by accepting commands over the web that direct it to move in its physical workspace and visit specific exhibits. In other words, the imaged scene of the museum and the exhibits is communicated over the Internet to the remote visitor. As a result the user enjoys a personalized tele-presence to the museum, being able to choose the exhibits to visit, as well as the preferred viewing conditions (point of view, distance to the exhibit, resolution, etc). In addition to remote interaction with the robot, TOURBOT can also act as a flexible, on-site museum guide to visitors that are physically present. By interacting with the tour-guide robot, museum visitors have the ability to individually exploit the expertise stored in the robot, which can react flexibly to their requirements. It can, for example, offer dedicated tours of specific focus to exhibitions or alternatively give overview tours. As a side effect of this concept, museum visitors get acquainted with new, cutting-edge technology by easily interacting with a complex robotic system. Therefore, technological advances are seamlessly assimilated in everyday activities. This approach to cultural heritage access presents a high degree of novelty as well as a number of technical and conceptual issues and challenges. This paper discusses these issues while analyzing the expected benefits and expectations from visitors, the community, and the museums.
    BibTeX:
    @inproceedings{Giannoulis2001,
      author = {Giannoulis, George and Coliou, Mandy and Kamarinos, George S and Roussou, Maria and Trahanias, Panos E and Argyros, Antonis A and Tsakiris, Dimitris P and Cremers, Armin and Schulz, Dirk and Burgard, Wolfram and others},
      title = {Enhancing museum visitor access through robotic avatars connected to the web},
      booktitle = {Museums and the Web},
      year = {2001},
      month = {March},
      pages = {14--17},
      address = {Seattle, USA},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {TOURBOT},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_12_mw_tourbot.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/TourbotInBrief.mpg}
    }
    
  499. A.A. Argyros, C. Bekris and S.C. Orphanoudakis, "Robot homing based on panoramic vision", FORTH-ICS, TR-287, March 2001.
    [Abstract] [BibTeX] [PDF] [URL]

  500. Abstract: In robotics, homing can be defined as that behavior, which enables a robot to return to its initial (home) position, after traveling a certain distance along an arbitrary path. Odometry has traditionally been used for the implementation of such a behavior, but it has been shown to be an unreliable source of information. In this work, a novel method for visual homing is proposed, based on a panoramic camera. As the robot departs from its initial position, it tracks characteristic features of the environment (corners). As soon as homing is activated, the robot selects intermediate target positions on the original path. These intermediate positions are then visited sequentially, until the home position is reached. For the robot to move between two consecutive intermediate positions, it is only required to establish correspondence among at least three corners. This correspondence is obtained through a feature tracking mechanism. The proposed homing scheme is based on the extraction of very low-level sensory information, namely the bearing angles of corners, and has been implemented on a robotic platform. Experimental results show that the proposed scheme achieves homing with a remarkable accuracy, which is not affected by the distance traveled by the robot.
    BibTeX:
    @techreport{Argyros2001b,
      author = {Argyros, Antonis A and Bekris, Costas and Orphanoudakis, Stelios C},
      title = {Robot homing based on panoramic vision},
      school = {FORTH-ICS},
      year = {2001},
      month = {March},
      number = {TR-287},
      url = {http://users.ics.forth.gr/ argyros/res_pan_homing.html},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_03_tr287_forth_robot_homing_panoramic_vision.pdf}
    }
    
  501. A.A. Argyros, K.E. Bekris and S.C. Orphanoudakis, "Robot homing based on corner tracking in a sequence of panoramic images", In IEEE Computer Vision and Pattern Recognition (CVPR 2001), pp. 3-10, Kauai, Hawaii, USA, 2001.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  502. Abstract: In robotics, homing can be defined as that behavior which enables a robot to return to its initial (home) position, after traveling a certain distance along an arbitrary path. Odometry has traditionally been used for the implementation of such a behavior, but it has been shown to be an unreliable source of information. In this work, a novel method for visual homing is proposed, based on a panoramic camera. As the robot departs from its initial position, it tracks characteristic features of the environment (corners). As soon as homing is activated, the robot selects intermediate target positions on the original path. These intermediate positions (IPs) are then visited sequentially, until the home position is reached. For the robot to move between two consecutive IPs, it is only required to establish correspondence among at least three corners. This correspondence is obtained through a feature tracking mechanism. The proposed homing scheme is based on the extraction of very low-level sensory information, namely the bearing angles of corners, and has been implemented on a robotic platform. Experimental results show that the proposed scheme achieves homing with a remarkable accuracy, which is not affected by the distance traveled by the robot.
    BibTeX:
    @inproceedings{Argyros2001,
      author = {Argyros, Antonis A and Bekris, Kostas E and Orphanoudakis, Stelios C},
      title = {Robot homing based on corner tracking in a sequence of panoramic images},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 2001)},
      year = {2001},
      volume = {2},
      pages = {3--10},
      address = {Kauai, Hawaii, USA},
      url = {http://users.ics.forth.gr/ argyros/res_pan_homing.html},
      doi = {10.1109/CVPR.2001.990917},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_12_cvpr_homing_panoramic.pdf}
    }
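    
    Note (illustrative, not from the paper): the homing behavior relies on nothing more than the bearing angles of tracked corners. The Python sketch below shows the classical average-landmark-vector (ALV) heuristic, a different but related bearing-only homing rule, simply to illustrate how a home direction can be recovered from bearings alone; the paper's scheme (intermediate target positions reached from correspondences of at least three corners) is not reproduced, and the function name and frame assumptions are illustrative.
    
      import numpy as np
      
      def home_direction(bearings_now, bearings_home):
          """Bearing-only homing heuristic (average landmark vector).
      
          bearings_now, bearings_home : bearings (rad) of the same landmarks as
          seen from the current and from the home position, expressed in a common
          (compass-aligned) reference frame.
          Returns a unit vector pointing approximately from the current position
          towards home (accuracy improves when landmarks surround the robot).
          """
          v_now = np.stack([np.cos(bearings_now), np.sin(bearings_now)], axis=1).mean(axis=0)
          v_home = np.stack([np.cos(bearings_home), np.sin(bearings_home)], axis=1).mean(axis=0)
          h = v_now - v_home            # ALV difference points roughly towards home
          return h / np.linalg.norm(h)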
    
  503. C. Balas, G. Themelis, A. Papadakis, E. Vasgiouraki, A.A. Argyros, E. Koumantakis, A. Tosca and E. Helidonis, "A novel hyper-spectral imaging system: application on in-vivo detection and grading of cervical precancers and of pigmented skin lesions", In IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2001), IEEE, Kauai, Hawaii, USA, 2001.
    [Abstract] [BibTeX] [PDF]

  504. Abstract: We present a novel Hyper-Spectral Imaging (HySI) System capable of acquiring and real time displaying of 5nm spectral images, with 2nm tuning step, in the range of 400nm-1000nm. Synchronized spectral scanning and image storing enables the collection of a stack of calibrated spectral images, from which a fully resolved spectrum per image pixel can be calculated and displayed. We also present results from a pilot use of the developed HySI system as a research tool, in an attempt to develop novel, non-invasive diagnostic methods for the detection and grading of cervical precancers and of pigmented skin lesions. In the case of cervical diagnosis we have succeeded to detect in vivo, quantitatively assess and map alterations in tissue structure and functionality, associated with progress of the disease, with high sensitivity and specificity. In the case of pigmented skin areas, near infrared spectral analysis and imaging of melanin-rich spots show that they become transparent at certain imaging wavelengths and that the transparency wavelength increases with the melanin content and the lesion’s depth. This information can be used for the in vivo quantitative assessment of the latter, which is considered to be of great diagnostic and predictive value.
    BibTeX:
    @inproceedings{Balas2001,
      author = {Balas, Costas and Themelis, George and Papadakis, A and Vasgiouraki, E and Argyros, Antonis A and Koumantakis, E and Tosca, A and Helidonis, E},
      title = {A novel hyper-spectral imaging system: application on in-vivo detection and grading of cervical precancers and of pigmented skin lesions},
      booktitle = {IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2001)},
      publisher = {IEEE},
      year = {2001},
      address = {Kauai, Hawaii, USA},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2001_12_cvbvs_hyperspectral.pdf}
    }
    
  505. M.I.A. Lourakis, S.V. Tzurbakis, A.A. Argyros and S.C. Orphanoudakis, "Using geometric constraints for matching disparate stereo views of 3D scenes containing planes", In IEEE International Conference on Pattern Recognition (ICPR 2000), IAPR, pp. 419-422, Barcelona, Spain, September 2000.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  506. Abstract: Several vision tasks rely upon the availability of sets of corresponding features among images. This paper presents a method which, given some corresponding features in two stereo images, addresses the problem of matching them with features extracted from a second stereo pair captured from a distant viewpoint. The proposed method is based on the assumption that the viewed scene contains two planar surfaces and exploits geometric constraints that are imposed by the existence of these planes to predict the location of image features in the second stereo pair. The resulting scheme handles point and line features in a unified manner and is capable of successfully matching features extracted from stereo pairs acquired from considerably different viewpoints. Experimental results from a prototype implementation demonstrate the effectiveness of the approach.
    BibTeX:
    @inproceedings{Lourakis2000,
      author = {Lourakis, Manolis I A and Tzurbakis, Stavros V and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Using geometric constraints for matching disparate stereo views of 3D scenes containing planes},
      booktitle = {IEEE International Conference on Pattern Recognition (ICPR 2000)},
      publisher = {IAPR},
      year = {2000},
      month = {September},
      volume = {1},
      pages = {419--422},
      address = {Barcelona, Spain},
      url = {http://users.ics.forth.gr/ argyros/res_matching.html},
      doi = {10.1109/ICPR.2000.905366},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2000_09_icpr_feature_transfer_and_matching.pdf}
    }
    
  507. D.P. Tsakiris and A.A. Argyros, "Corridor following by mobile robots equipped with panoramic cameras", In IEEE Mediterranean Conference on Control and Automation (MED 2000), IEEE, Patras, Greece, July 2000.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  508. Abstract: The present work considers corridor–following maneuvers for non-holonomic mobile robots, guided by sensory data acquired by panoramic cameras. The panoramic vision system provides information from an environment with textured walls to the motion control system, which drives the robot along a corridor. Panoramic cameras have a 360 degrees visual field, a capability that the proposed control methods exploit. In our sensor–based control scheme, optical flow information from several distinct viewing directions in the entire field of view of the panoramic camera is used directly in the control loop, without the need for state reconstruction. The interest of this lies in the fact that the optical flow information is not sufficient to reconstruct the state of the system, it is however sufficient for the proposed control law to accomplish the desired task. Driving the robot along a corridor amounts to the asymptotic stabilization of a subsystem of the robot’s kinematics and the proposed control schemes are shown to achieve this goal.
    BibTeX:
    @inproceedings{Tsakiris2000,
      author = {Tsakiris, Dimitris P and Argyros, Antonis A},
      title = {Corridor following by mobile robots equipped with panoramic cameras},
      booktitle = {IEEE Mediterranean Conference on Control and Automation (MED 2000)},
      publisher = {IEEE},
      year = {2000},
      month = {July},
      address = {Patras, Greece},
      url = {http://users.ics.forth.gr/ argyros/res_biomimeticcentering.html},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2000_07_med_corridor_following_panoramic.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/CorridorFollowing.avi}
    }
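    
    Note (illustrative, not from the paper): the controller uses optical flow measured in lateral viewing directions of the panoramic camera directly in the loop. The Python sketch below shows only the naive flow-balancing heuristic behind such centering behaviors (steer away from the side with the larger flow); the paper's control laws come with a stability analysis for the nonholonomic kinematics that this toy rule does not have, and the gains and names below are assumptions.
    
      import numpy as np
      
      def centering_control(flow_left, flow_right, v_forward=0.3, gain=0.8):
          """Optical-flow balancing for corridor centring.
      
          flow_left, flow_right : arrays of optical-flow magnitudes measured in
          lateral viewing directions on the left and right side of the robot.
          When the robot drifts towards one wall, that side's flow grows; steering
          away from the larger flow pushes the robot back towards the middle.
          Returns (linear velocity, angular velocity) commands.
          """
          imbalance = np.mean(flow_right) - np.mean(flow_left)
          omega = gain * imbalance      # positive omega turns left, away from the right wall
          return v_forward, omega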
    
  509. D.P. Tsakiris and A.A. Argyros, "Nonholonomic mobile robots equipped with panoramic cameras: Corridor following", FORTH-ICS, vol. 26, TR-272, June 2000.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  510. Abstract: The present work considers corridor-following maneuvers for mobile robots with nonholonomic constraints, guided by sensory data acquired by panoramic cameras. The panoramic vision system provides information from an environment with textured walls to the motion control system, which drives the robot along a corridor. Panoramic cameras have a 360 degrees visual field, a capability that the proposed control methods attempt to exploit. We consider two types of sensor-based controllers: one is a path-following state feedback control law where the state of the robot inside the corridor is reconstructed from the visual data; in the other, optical flow information from several distinct "looking" directions in the field of view of the panoramic camera is used directly in the control loop, without the need for state reconstruction. The interest of the second type of controllers lies in the fact that this optical flow information is not sufficient to reconstruct the state of the system, it is however sufficient for the proposed control law to accomplish the desired task. Driving the robot along a corridor amounts to the asymptotic stabilization of a subsystem of the robot's kinematics and the proposed control schemes are shown to achieve this goal.
    BibTeX:
    @techreport{Tsakiris2000a,
      author = {Tsakiris, Dimitris P and Argyros, Antonis A},
      title = {Nonholonomic mobile robots equipped with panoramic cameras: Corridor following},
      school = {FORTH-ICS},
      year = {2000},
      month = {June},
      volume = {26},
      number = {TR-272},
      url = {http://users.ics.forth.gr/ argyros/res_biomimeticcentering.html},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2000_06_tr272_forth_corridor_following_panoramic.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/CorridorFollowing.avi}
    }
    
  511. P.E. Trahanias, A.A. Argyros, D.P. Tsakiris, A. Cremers, D. Schulz, W. Burgard, D. Haehnel, V. Savvaides, G. Giannoulis, M. Coliou and others, "Tourbot-interactive museum tele-presence through robotic avatars", In International WWW Conference, Culture track, Amsterdam, Netherlands, May 2000.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  512. Abstract: TOURBOT, the acronym of a project entitled “Interactive Museum Tele-presence Through Robotic Avatars”, represents an EU-IST funded activity aiming at developing alternative ways for interactive museum tele-presence [1]. In this paper we present the project framework, with emphasis on the project goals, approach and innovations, as well as the expected benefits and results. The overall goal of TOURBOT is the development of an interactive tour-guide robot able to provide individual access to museums’ exhibits and cultural heritage over the Internet. TOURBOT operates as the user’s avatar in the museum (i.e. as a remote “representative” of the user, able to carry out actions and transmit information), by accepting commands over the Web that direct it to move in its workspace and visit specific exhibits; besides, TOURBOT can also act as a flexible, on-site museum guide. More specifically, the TOURBOT objectives are: (1) to develop a robotic avatar with advanced navigation capabilities that will be able to move autonomously in the museum’s premises, (2) to implement appropriate Web interfaces to the robotic avatar that will realize distant-user’s telepresence, i.e. facilitate scene observation through the avatar’s eyes, (3) to facilitate personalized and realistic observation of the museum exhibits, and (4) to enable on-site, interactive museum tour-guides.
    BibTeX:
    @inproceedings{Trahanias2000,
      author = {Trahanias, Panos E and Argyros, Antonis A and Tsakiris, Dimitris P and Cremers, Armin and Schulz, Dirk and Burgard, Wolfram and Haehnel, Dirk and Savvaides, Vassilis and Giannoulis, George and Coliou, Mandy and others},
      title = {Tourbot-interactive museum tele-presence through robotic avatars},
      booktitle = {International WWW Conference, Culture track},
      year = {2000},
      month = {May},
      volume = {7},
      address = {Amsterdam, Netherlands},
      url = {http://users.ics.forth.gr/ argyros/res_robotsinexhibitions.html},
      projects =  {TOURBOT},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2000_05_www9_tourbot.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/TourbotInBrief.mpg}
    }
    
  513. M.I.A. Lourakis, S.V. Tzurbakis, A.A. Argyros and S.C. Orphanoudakis, "Using Geometric Constraints for Matching Disparate Stereo Views of 3D Scenes Containing Planes", FORTH-ICS, TR-268, February 2000.
    [Abstract] [BibTeX] [PDF] [URL]

  514. Abstract: Several vision tasks rely upon the availability of sets of corresponding features among images. This paper presents a method which, given some corresponding features in two stereo images, addresses the problem of matching them with features extracted from a second stereo pair captured from a distant viewpoint. The proposed method is based on the assumption that the viewed scene contains two planar surfaces and exploits geometric constraints that are imposed by the existence of these planes to predict the location of image features in the second stereo pair. The resulting scheme handles point and line features in a unified manner and is capable of successfully matching features extracted from stereo pairs acquired from considerably different viewpoints. Experimental results from a prototype implementation demonstrate the effectiveness of the approach.
    BibTeX:
    @techreport{Lourakis2000a,
      author = {Lourakis, Manolis I A and Tzurbakis, Stavros V and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Using Geometric Constraints for Matching Disparate Stereo Views of 3D Scenes Containing Planes},
      school = {FORTH-ICS},
      year = {2000},
      month = {February},
      number = {TR-268},
      url = {http://users.ics.forth.gr/ argyros/res_matching.html},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/2000_02_tr268_forth_feature_matching_transfer.pdf}
    }
    
  515. A.A. Argyros and F. Bergholm, "Combining central and peripheral vision for reactive robot navigation", In IEEE Computer Vision and Pattern Recognition (CVPR 1999), IEEE, pp. 646-651, Fort Collins, Colorado, USA, 1999.
    [Abstract] [BibTeX] [DOI] [PDF] [URL] [VIDEO]

  516. Abstract: In this paper we present a new method for vision-based, reactive robot navigation that enables a robot to move in the middle of the free space by exploiting both central and peripheral vision. The robot employs a forward-looking camera for central vision and two side-looking cameras for sensing the periphery of its visual field. The developed method combines the information acquired by this trinocular vision system and produces low-level motor commands that keep the robot in the middle of the free space. The approach follows the purposive vision paradigm in the sense that vision is not studied in isolation but in the context of the behaviors that the system is engaged in, as well as the environment and the robot's motor capabilities. It is demonstrated that by taking into account these issues, vision processing can be drastically simplified, still giving rise to quite complex behaviors. The proposed method does not make strict assumptions about the environment, requires very low level information to be extracted from the images, produces a robust robot behavior and is computationally efficient. Results obtained by both simulations and from a prototype on-line implementation demonstrate the effectiveness of the method.
    BibTeX:
    @inproceedings{Argyros1999,
      author = {Argyros, Antonis A and Bergholm, Fredrik},
      title = {Combining central and peripheral vision for reactive robot navigation},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 1999)},
      publisher = {IEEE},
      year = {1999},
      volume = {2},
      pages = {646--651},
      address = {Fort Collins, Colorado, USA},
      url = {http://users.ics.forth.gr/ argyros/res_bees.html},
      doi = {10.1109/CVPR.1999.784994},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1999_06_cvpr_flow_balancing_trinocular.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/Centering3cameras.mpg}
    }
    
  517. A.A. Argyros, "Reactive Robot Navigation: A Purposive Approach", In Proceedings of the EU TMR Networks Conference, Graz, Austria, May 1998.
    [BibTeX] [PDF] [URL] [VIDEO]

  518. BibTeX:
    @inproceedings{Argyros1998a,
      author = {Argyros, Antonis A},
      title = {Reactive Robot Navigation: A Purposive Approach},
      booktitle = {Proceedings of the EU TMR Networks Conference},
      year = {1998},
      month = {May},
      address = {Graz, Austria},
      url = {http://users.ics.forth.gr/ argyros/res_bees.html},
      projects =  {VIRGO},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1998_05_eu-tmrconf_flow_balancing_trinocular.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/Centering3cameras.mpg}
    }
    
  519. M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Independent 3D motion detection using residual parallax normal flow fields", In IEEE International Conference on Computer Vision (ICCV 1998), IEEE, pp. 1012-1017, Santa Barbara, CA, USA, January 1998.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  520. Abstract: This paper considers a specific problem of visual perception of motion, namely the problem of visual detection of independent 3D motion. Most of the existing techniques for solving this problem rely on restrictive assumptions about the environment, the observer's motion, or both. Moreover, they are based on the computation of a dense optical flow field, which amounts to solving the ill-posed correspondence problem. In this work independent motion detection is formulated as a problem of robust parameter estimation applied to the visual input acquired by a rigidly moving observer. The proposed method automatically selects a planar surface in the scene and the residual planar parallax normal flow field with respect to the motion of this surface is computed at two successive time instants. The two resulting normal flow fields are then combined in a linear model. The parameters of this model are related to the parameters of self-motion (ego-motion) and their robust estimation leads to a segmentation of the scene based on 3D motion. The method avoids a complete solution to the correspondence problem by selectively matching subsets of image points and by employing normal flow fields. Experimental results demonstrate the effectiveness of the proposed method in detecting independent motion in scenes with large depth variations and unrestricted observer motion.
    BibTeX:
    @inproceedings{Lourakis1998,
      author = {Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Independent 3D motion detection using residual parallax normal flow fields},
      booktitle = {IEEE International Conference on Computer Vision (ICCV 1998)},
      publisher = {IEEE},
      year = {1998},
      month = {January},
      pages = {1012--1017},
      address = {Santa Barbara, CA, USA},
      url = {http://users.ics.forth.gr/ argyros/res_imd.html},
      doi = {10.1109/ICCV.1998.710840},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1998_01_iccv_imd_residual_parallax.pdf}
    }
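    
    Note (illustrative, not from the paper): the method works with normal flow fields, i.e. the component of image motion along the intensity gradient, which can be computed directly from spatiotemporal derivatives. The Python sketch below shows just that computation; the selection of a reference plane, the residual planar parallax and the robust estimation of the linear model are not reproduced, and the function name is an assumption.
    
      import numpy as np
      
      def normal_flow(I_prev, I_next, eps=1e-3):
          """Normal flow from spatiotemporal image derivatives.
      
          Under brightness constancy, Ix*u + Iy*v + It = 0; only the flow component
          along the intensity gradient is recoverable from this single constraint:
          n = -It * grad(I) / |grad(I)|^2.  Returns (nx, ny), the normal flow field.
          """
          I_prev = I_prev.astype(float)
          I_next = I_next.astype(float)
          Iy, Ix = np.gradient(I_prev)          # spatial derivatives (axis 0 = rows = y)
          It = I_next - I_prev                  # temporal derivative
          mag2 = Ix**2 + Iy**2 + eps            # small eps avoids division by zero
          scale = -It / mag2
          return scale * Ix, scale * Iy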
    
  521. A.A. Argyros, P.E. Trahanias and S.C. Orphanoudakis, "Robust regression for the detection of independent 3D motion by a binocular observer", Real-Time Imaging, Academic Press, vol. 4, no. 2, pp. 125-141, 1998.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  522. Abstract: A method is proposed for the visual detection of objects that move independently of the observer in a 3D dynamic environment. Many of the existing techniques for solving this problem are based on 2D motion models, which is equivalent to assuming that all the objects in a scene are at a constant depth from the observer. Although such methods perform well if this assumption holds, they may give erroneous results when applied to scenes with large depth variations. Additionally, many of the existing techniques rely on the computation of optical flow, which amounts to solving the ill-posed correspondence problem. In this paper, independent 3D motion detection is formulated using 3D models and is approached as a problem of robust regression applied to visual input acquired by a binocular, rigidly moving observer. Similar analysis is applied both to the stereoscopic data taken by a non-calibrated stereoscopic system and to the motion data obtained from successive frames in time. Least Median of Squares (LMedS) estimation is applied to stereoscopic data to produce maps of image regions characterized by a dominant depth. LMedS is also applied to the motion data that are related to the points at the dominant depth, to segment the latter with respect to 3D motion. In contrast to the methods that rely on 2D models, the proposed method performs accurately, even in the case of scenes with large depth variations. Both stereo and motion processing is based on the normal flow field, which (in contrast to the optical flow field) can be robustly computed from the spatiotemporal derivatives of the image intensity function. Although parts of the proposed scheme have non-trivial computational requirements, computations can be expedited by various ways which are discussed in detail. This is also demonstrated by an on-board implementation of the method on a mobile robotic platform. The method has been evaluated using synthetic as well as real data. Sample results show the effectiveness and robustness of the proposed scheme.
    BibTeX:
    @article{Argyros1998,
      author = {Argyros, Antonis A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Robust regression for the detection of independent 3D motion by a binocular observer},
      journal = {Real-Time Imaging},
      publisher = {Academic Press},
      year = {1998},
      volume = {4},
      number = {2},
      pages = {125--141},
      url = {http://users.ics.forth.gr/ argyros/res_imd.html},
      doi = {10.1006/rtim.1997.0055},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1998_04_journal_rti_imd.pdf}
    }
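    
    Note (illustrative, not from the paper): the segmentation is obtained with Least Median of Squares (LMedS) estimation. The Python sketch below shows a generic LMedS fit of a toy linear model by random minimal sampling, only to illustrate the estimator; the paper applies LMedS to models relating normal flow and uncalibrated stereo data to egomotion, which is not shown, and the helper name is an assumption.
    
      import numpy as np
      
      def lmeds_line(x, y, n_trials=500, rng=None):
          """Least Median of Squares fit of y = a*x + b.
      
          Repeatedly fit the model to minimal random samples (2 points) and keep
          the fit whose median squared residual over all data is smallest; unlike
          least squares, this tolerates up to roughly 50% outliers.
          """
          rng = np.random.default_rng(rng)
          best_params, best_med = None, np.inf
          for _ in range(n_trials):
              i, j = rng.choice(len(x), size=2, replace=False)
              if x[i] == x[j]:
                  continue                              # degenerate sample, skip
              a = (y[j] - y[i]) / (x[j] - x[i])
              b = y[i] - a * x[i]
              med = np.median((y - (a * x + b)) ** 2)   # median, not sum, of residuals
              if med < best_med:
                  best_params, best_med = (a, b), med
          return best_params, best_med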
    
  523. A.A. Argyros and F. Bergholm, "Reactive Robot Navigation Based on a Combination of Central and Peripheral Vision", In Computer Vision and Mobile Robotics Workshop (CVMR 1998), Santorini, Greece, 1998.
    [Abstract] [BibTeX] [PDF] [URL] [VIDEO]

  524. Abstract: In this paper, we present a new method for vision-based, reactive robot navigation that enables a robot to move in the middle of the free space by exploiting both central and peripheral vision. The system employs a forward-looking camera for central vision and two side-looking cameras for sensing the periphery of the robot’s visual field. The developed method combines the information acquired by this trinocular vision system and produces low-level motor commands that keep the robot in the middle of the free space. The approach follows the purposive vision paradigm in the sense that vision is not studied in isolation but in the context of the behaviors that the system is engaged as well as the environment and the motor capabilities of the robot. It is demonstrated that by taking into account these issues, vision processing can be drastically simplified, still giving rise to quite rich behaviors. The advantages of the method is that it does not make strict assumptions about the environment, it requires very low level information to be extracted from the images, it produces a robust robot behavior and it is computationally very efficient. Results obtained by both simulations and from a prototype on-line implementation demonstrate the effectiveness of the method.
    BibTeX:
    @inproceedings{Argyros1998b,
      author = {Argyros, Antonis A and Bergholm, Fredrik},
      title = {Reactive Robot Navigation Based on a Combination of Central and Peripheral Vision},
      booktitle = {Computer Vision and Mobile Robotics Workshop (CVMR 1998)},
      year = {1998},
      address = {Santorini, Greece},
      url = {http://users.ics.forth.gr/ argyros/res_bees.html},
      projects =  {VIRGO},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1998_09_cvmr_flow_balancing_trinocular.pdf},
      videolink = {http://users.ics.forth.gr/ argyros/support/imgvideo/Centering3cameras.mpg}
    }
    
  525. M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Independent 3D Motion Detection Using Residual Parallax Normal Flow Fields", FORTH-ICS, TR-206, August 1997.
    [BibTeX] [PDF] [URL]

  526. BibTeX:
    @techreport{Lourakis1997,
      author = {Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Independent 3D Motion Detection Using Residual Parallax Normal Flow Fields},
      school = {FORTH-ICS},
      year = {1997},
      month = {August},
      number = {TR-206},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1997_08_tr206_forth_imd_using_parallax_normal_flow.pdf}
    }
    
  527. A.A. Argyros and S.C. Orphanoudakis, "Independent 3D motion detection based on depth elimination in normal flow fields", In IEEE Computer Vision and Pattern Recognition (CVPR 1997), IEEE, pp. 672-677, San Juan, Puerto Rico, USA, June 1997.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  528. Abstract: This paper considers a specific problem of visual perception of motion, namely the problem of visual detection of independent 3D motion. Most of the existing techniques for solving this problem rely on restrictive assumptions about the environment, the observer's motion, or both. Moreover they are based on the computation of optical flow, which amounts to solving the ill-posed correspondence problem. In this work, independent motion detection is formulated as robust parameter estimation applied to the visual input acquired by a binocular rigidly moving observer. Depth and motion measurements are combined in a linear model. The parameters of this model are related to the parameters of self-motion (egomotion) and the parameters of the stereoscopic configuration of the observer. The robust estimation of this model leads to a segmentation of the scene based on 3D motion. The method avoids the correspondence problem by employing only normal flow fields. Experimental results demonstrate the effectiveness of this method in detecting independent motion in scenes with large depth variations, without any constraints imposed on observer motion.
    BibTeX:
    @inproceedings{Argyros1997a,
      author = {Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Independent 3D motion detection based on depth elimination in normal flow fields},
      booktitle = {IEEE Computer Vision and Pattern Recognition (CVPR 1997)},
      publisher = {IEEE},
      year = {1997},
      month = {June},
      pages = {672--677},
      address = {San Juan, Puerto Rico, USA},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html},
      doi = {10.1109/CVPR.1997.609398},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1997_06_cvpr_imd_depth_elimination_normal_flow.pdf}
    }
    
  529. P.E. Trahanias, M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Navigational support for robotic wheelchair platforms: an approach that combines vision and range sensors", In IEEE International Conference on Robotics and Automation (ICRA 1997), IEEE, pp. 1265-1270, Albuquerque, New Mexico, USA, April 1997.
    [Abstract] [BibTeX] [DOI] [PDF]

  530. Abstract: An approach towards providing advanced navigational support to robotic wheelchair platforms is presented. In order to avoid any modifications to the environment, we propose an approach that employs computer vision techniques which facilitate space perception and navigation. Computer vision has not been introduced to date in rehabilitation robotics, since the former is not mature enough to meet the needs of this sensitive application. However, in the proposed approach, stable techniques are exploited that facilitate reliable, automatic navigation to any point in the visible environment. Preliminary results obtained from its implementation on a laboratory robotic platform indicate its usefulness and flexibility.
    BibTeX:
    @inproceedings{Trahanias1997a,
      author = {Trahanias, Panos E and Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Navigational support for robotic wheelchair platforms: an approach that combines vision and range sensors},
      booktitle = {IEEE International Conference on Robotics and Automation (ICRA 1997)},
      publisher = {IEEE},
      year = {1997},
      month = {April},
      volume = {2},
      pages = {1265--1270},
      address = {Albuquerque, New Mexico, USA},
      doi = {10.1109/ROBOT.1997.614311},
      pdflink = {http://users.ics.forth.gr/ argyros/mypapers/1997_04_icra_wheelchair_navigation.pdf}
    }
    
  531. A.A. Argyros and S.C. Orphanoudakis, "Detecting Independently Moving Objects by Eliminating Depth in Normal Flow Fields", FORTH-ICS, TR-189, 1997.
    [Abstract] [BibTeX] [PDF] [URL]

  532. Abstract: This paper considers a specific problem of visual perception of motion, namely the problem of visual detection of independent 3D motion. Most of the existing techniques for solving this problem rely on restrictive assumptions about the environment, the observer’s motion, or both. Moreover, they are based on the computation of optical flow, which amounts to solving the ill-posed correspondence problem. In this work, independent motion detection is formulated as robust parameter estimation applied to the visual input acquired by a binocular, rigidly moving observer. Depth and motion measurements are combined in a linear model. The parameters of this model are related to the parameters of self-motion (egomotion) and the parameters of the stereoscopic configuration of the observer. The robust estimation of this model leads to a segmentation of the scene based on 3D motion. The method avoids the correspondence problem by employing only normal flow fields. Experimental results demonstrate the effectiveness of this method in detecting independent motion in scenes with large depth variations, without any constraints imposed on observer motion.
    BibTeX:
    @techreport{Argyros1997,
      author = {Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Detecting Independently Moving Objects by Eliminating Depth in Normal Flow Fields},
      school = {FORTH-ICS},
      year = {1997},
      number = {TR-189},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1997_03_tr189_forth_imd_depth_elimination_normal_flows.pdf}
    }
    
  533. P.E. Trahanias, S.C. Orphanoudakis, A.A. Argyros, J. Arnspang, F. Bergholm, K. Chandrinos, K. Henriksen, J. Hertzberg, L. Gaga, C.B. Madsen and J. Santos-Victor, "Analysis of Current Approaches in Automated Vision-based Navigation", FORTH-ICS, 1997.
    [BibTeX] [PDF] [URL]

  534. BibTeX:
    @techreport{Trahanias1997,
      author = {Trahanias, Panos E and Orphanoudakis, Stelios C and Argyros, Antonis A and Arnspang, Jens and Bergholm, Fredrik and Chandrinos, Kostas and Henriksen, Knud and Hertzberg, Joachim and Gaga, Lena and Madsen, Claus B and Santos-Victor, Jose},
      title = {Analysis of Current Approaches in Automated Vision-based Navigation},
      school = {FORTH-ICS},
      year = {1997},
      url = {http://users.ics.forth.gr/~argyros},
      projects =  {VIRGO},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1997_07_virgo_report_vision_based_navigation.pdf}
    }
    
  535. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Fast Visual Detection of Changes in 3D Motion", In Machine Vision Applications (MVA 1996), pp. 216-219, Tokyo, Japan, November 1996.
    [Abstract] [BibTeX] [PDF]

  536. Abstract: A method is proposed for the fast detection of objects that maneuver in the visual field of a monocular observer. Such cases are common in natural environments where the 3D motion parameters of certain objects (e.g. animals) change considerably over time. The approach taken conforms with the theory of purposive vision, according to which vision algorithms should solve many, specific problems under loose assumptions. The method can effectively answer two important questions: (a) whether the observer has changed his 3D motion parameters, and (b) in case that the observer has constant 3D motion, whether there are any maneuvering objects (objects with non-constant 3D motion parameters) in his visual field. Essentially, the method relies on a pointwise comparison of two normal flow fields which can be robustly computed from three successive frames. Thus, it by-passes the ill-posed problem of optical flow computation. Experimental results demonstrate the effectiveness and robustness of the proposed scheme. Moreover, the computational requirements of the method are extremely low, making it a likely candidate for real-time implementation.
    BibTeX:
    @inproceedings{Argyros1996b,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Fast Visual Detection of Changes in 3D Motion},
      booktitle = {Machine Vision Applications (MVA 1996)},
      year = {1996},
      month = {November},
      pages = {216--219},
      address = {Tokyo, Japan},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_11_mva_imd_changes.pdf}
    }
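
    The core test described above is a pointwise comparison of two normal flow fields computed from three successive frames: under constant 3D motion the fields should roughly agree, and localized disagreement flags a maneuvering object. The Python sketch below illustrates this with assumed thresholds and a simple "mostly flagged means the observer itself maneuvered" heuristic; it is not the paper's actual decision rule.
    Sketch (Python):
    import numpy as np

    def maneuver_mask(nf_prev, nf_curr, rel_thresh=0.3, eps=1e-6):
        # nf_prev, nf_curr: H x W arrays of normal-flow magnitudes computed from
        # frames (t-2, t-1) and (t-1, t).  Under constant 3D motion of observer
        # and scene the two fields should roughly agree; pixels whose relative
        # change exceeds rel_thresh are flagged.  A mostly-flagged image is
        # attributed to a change in the observer's own motion (illustrative rule).
        diff = np.abs(nf_curr - nf_prev)
        scale = np.maximum(np.abs(nf_prev), np.abs(nf_curr)) + eps
        flagged = diff / scale > rel_thresh
        egomotion_changed = flagged.mean() > 0.5
        return flagged, egomotion_changed

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        prev = rng.normal(1.0, 0.05, size=(64, 64))
        curr = prev.copy()
        curr[20:30, 20:30] *= 2.0     # a small patch changes its 3D motion
        mask, ego = maneuver_mask(prev, curr)
        print(int(mask.sum()), "pixels flagged; egomotion changed:", ego)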
    
  537. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Qualitative detection of 3D motion discontinuities", In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1996), IEEE, pp. 1630-1637, Osaka, Japan, November 1996.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  538. Abstract: This paper presents a method for the detection of objects that move independently of the observer in a 3D dynamic scene. Independent motion detection is achieved through processing of stereoscopic image sequences acquired by a binocular, rigidly moving observer. A weak assumption is made about the observer's motion (egomotion), namely that the directions of the translational and rotational components of egomotion are constant in small image patches. This assumption facilitates the extraction of qualitative information about depth from motion, while additional qualitative depth information is independently computed from image stereo pairs acquired by the binocular vision system. Robust regression in the form of least median of squares estimation is applied within each image patch to test for consistency between the depth functions computed from motion and stereo. Possible inconsistencies signal the presence of independently moving objects. In contrast to other existing approaches for independent motion detection, which are based on the ill-posed problem of optical flow computation, the proposed method relies on normal flow fields for both stereo and motion processing. By exploiting local constraints of qualitative nature, the problem of independent motion detection is approached directly, without relying on a solution to the general structure from motion problem. Experimental results indicate that the proposed method is both effective and robust.
    BibTeX:
    @inproceedings{Argyros1996c,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Qualitative detection of 3D motion discontinuities},
      booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1996)},
      publisher = {IEEE},
      year = {1996},
      month = {November},
      volume = {3},
      pages = {1630--1637},
      address = {Osaka, Japan},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html},
      doi = {10.1109/IROS.1996.569030},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_11_iros_imd_motion_discontinuities.pdf}
    }
    
  539. G. Sandini, A.A. Argyros, E. Auffret, P. Dario, B. Dierickx, F. Ferrari, H. Frowein, C. Guerin, L. Hermans, A. Manganas and others, "Image-based personal communication using an innovative space-variant CMOS sensor", In IEEE International Workshop on Robot and Human Communication, 1996, IEEE, pp. 158-163, Tsukuba, Japan, November 1996.
    [Abstract] [BibTeX] [DOI] [PDF]

  540. Abstract: This paper reports the result of IBIDEM, a collaborative project supported by the European Union under the Technology Initiative for Elderly and Disabled People-TIDE initiative. The goal of the project has been to build a prototype of a videophone, connected to standard PSTN lines, that can be used by hearing impaired persons in face-to-face communication. The most innovative content of the project has been the design, fabrication and use of a new generation of space-variant visual sensor characterized by a spatial resolution decreasing linearly with distance from the geometric center of the sensor's chip. This sampling strategy allows, with a limited number of pixels and consequently a high frame rate, transmission of high resolution information for “speech-reading” and a wide field of view for facial expressions and gestures.
    BibTeX:
    @inproceedings{Sandini1996,
      author = {Sandini, Giulio and Argyros, Antonis A and Auffret, E and Dario, Paolo and Dierickx, B and Ferrari, F and Frowein, H and Guerin, C and Hermans, L and Manganas, Andreas and others},
      title = {Image-based personal communication using an innovative space-variant CMOS sensor},
      booktitle = {IEEE International Workshop on Robot and Human Communication, 1996},
      publisher = {IEEE},
      year = {1996},
      month = {November},
      pages = {158--163},
      address = {Tsukuba, Japan},
      projects =  {IBIDEM},
      doi = {10.1109/ROMAN.1996.568790},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1992_07_tr048_forth_load_redistribution_in_vision.pdf}
    }
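
    The space-variant sampling described above, with resolution decreasing with distance from the sensor centre, can be pictured as a retina-like tessellation whose receptive fields grow with eccentricity, so that few samples cover a wide field of view while the centre stays densely sampled. The Python sketch below generates such a tessellation with assumed, generic parameters; it does not describe the actual IBIDEM chip geometry.
    Sketch (Python):
    import math

    def space_variant_samples(height, width, n_rings=32, n_sectors=64, fovea_radius=4.0):
        # Retina-like tessellation: ring radii grow geometrically from the fovea to
        # the image border, and the receptive-field size grows with eccentricity.
        # Returns (row, col, receptive-field radius) triples.
        cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
        max_r = min(cy, cx)
        samples = []
        for i in range(n_rings):
            r = fovea_radius * (max_r / fovea_radius) ** (i / (n_rings - 1))
            rf = max(1.0, math.pi * r / n_sectors)
            for j in range(n_sectors):
                theta = 2.0 * math.pi * j / n_sectors
                samples.append((cy + r * math.sin(theta), cx + r * math.cos(theta), rf))
        return samples

    if __name__ == "__main__":
        pts = space_variant_samples(480, 640)
        print(len(pts), "receptive fields instead of", 480 * 640, "uniform pixels")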
    
  541. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Qualitative Detection of 3D Motion Discontinuities", FORTH-ICS, TR-177, October 1996.
    [BibTeX] [PDF] [URL]

  542. BibTeX:
    @techreport{Argyros1996d,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Qualitative Detection of 3D Motion Discontinuities},
      school = {FORTH-ICS},
      year = {1996},
      month = {October},
      number = {TR-177},
      url = {http://users.ics.forth.gr/~argyros},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_11_iros_imd_motion_discontinuities.pdf}
    }
    
  543. P.E. Trahanias, M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Vision-Based Assistive Navigation for Robotic Wheelchair Platforms", FORTH-ICS, TR-178, October 1996.
    [Abstract] [BibTeX] [PDF] [URL]

  544. Abstract: In this paper we present an approach towards providing advanced navigational capabilities to robotic wheelchair platforms. Contemporary methods that are employed in robotic wheelchairs are based on the information provided by range sensors and its appropriate exploitation by means of obstacle avoidance techniques. However, since range sensors cannot support a detailed environment representation, these methods fail to provide advanced navigational assistance, unless the environment is appropriately regulated (e.g. with the introduction of beacons). In order to avoid any modifications to the environment, we propose an alternative approach that employs computer vision techniques which facilitate space perception and navigation. Computer vision has not been introduced to date in rehabilitation robotics, since the former is not mature and reliable enough to meet the needs of this sensitive application. However, in the proposed approach, stable techniques are exploited that facilitate reliable, automatic navigation to any point in the visible environment. This greatly enhances the mobility of the elderly and disabled, without requiring them to exercise fine motor control. Preliminary results obtained from the implementation of this approach on a laboratory robotic platform indicate its usefulness and flexibility.
    BibTeX:
    @techreport{Trahanias1996,
      author = {Trahanias, Panos E and Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Vision-Based Assistive Navigation for Robotic Wheelchair Platforms},
      school = {FORTH-ICS},
      year = {1996},
      month = {October},
      number = {TR-178},
      url = {http://users.ics.forth.gr/~argyros},
      projects =  {VIRGO},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_10_tr178_forth_robotic_wheechair.pdf}
    }
    
  545. A.A. Argyros, "Visual Detection of Independent 3D Motion by a Moving Observer", Ph.D. Thesis, Computer Science Department, University of Crete, October 1996.
    [BibTeX] [URL]

  546. BibTeX:
    @phdthesis{Argyros1996a,
      author = {Argyros, Antonis A},
      title = {Visual Detection of Independent 3D Motion by a Moving Observer},
      school = {Computer Science Department, University of Crete},
      year = {1996},
      month = {October},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html}
    }
    
  547. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Independent 3D Motion Detection through Robust Regression in Depth Layers", In British Machine Vision Conference (BMVC 1996), BMVA, pp. 535-544, Edinburgh, UK, September 1996.
    [Abstract] [BibTeX] [DOI] [PDF] [URL]

  548. Abstract: This paper presents a novel method for the detection of objects that move independently of the observer in a 3D dynamic environment. Independent 3D motion detection is formulated as a problem of robust regression applied to visual input acquired by a binocular, rigidly moving observer. The qualitative analysis of images acquired by a parallel stereo configuration yields a segmentation of a scene into depth layers. A depth layer consists of points of the 3D space for which depth variations are small compared to the distance from the observer. Robust regression is applied to each depth layer in order to segment the latter into coherently moving regions. Finally, a combination stage is applied across all layers in order to come up with an integrated view of independent motion in the whole 3D scene. In contrast to other existing approaches for independent motion detection which are based on the ill-posed problem of optical flow computation, the proposed method relies on normal flow fields for both stereo and motion processing. Experimental results show the effectiveness and robustness of the proposed scheme, which is capable of discriminating independent 3D motion in scenes with large depth variations.
    BibTeX:
    @inproceedings{Argyros1996e,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Independent 3D Motion Detection through Robust Regression in Depth Layers},
      booktitle = {British Machine Vision Conference (BMVC 1996)},
      publisher = {BMVA},
      year = {1996},
      month = {September},
      pages = {535--544},
      address = {Edinburgh, UK},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html},
      doi = {10.5244/C.10.1},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_09_bmvc_imd_in_depth_layers.pdf}
    }
    
  549. P.E. Trahanias, M.I.A. Lourakis, A.A. Argyros and S.C. Orphanoudakis, "Vision-Based Assistive Navigation for Robotic Wheelchair Platforms", In Machine Perception Applications, pp. 43-57, Graz, Austria, September 1996.
    [Abstract] [BibTeX] [PDF]

  550. Abstract: In this paper we present an approach towards providing advanced navigational capabilities to robotic wheelchair platforms. Contemporary methods that are employed in robotic wheelchairs are based on the information provided by range sensors and its appropriate exploitation by means of obstacle avoidance techniques. However, since range sensors cannot support a detailed environment representation, these methods fail to provide advanced navigational assistance, unless the environment is appropriately regulated (e.g. with the introduction of beacons). In order to avoid any modifications to the environment, we propose an alternative approach that employs computer vision techniques which facilitate space perception and navigation. Computer vision has not been introduced to date in rehabilitation robotics, since the former is not mature and reliable enough to meet the needs of this sensitive application. However, in the proposed approach, stable techniques are exploited that facilitate reliable, automatic navigation to any point in the visible environment. This greatly enhances the mobility of the elderly and disabled, without requiring them to exercise fine motor control. Preliminary results obtained from the implementation of this approach on a laboratory robotic platform indicate its usefulness and flexibility.
    BibTeX:
    @inproceedings{Trahanias1996a,
      author = {Trahanias, Panos E and Lourakis, Manolis I A and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Vision-Based Assistive Navigation for Robotic Wheelchair Platforms},
      booktitle = {Machine Perception Applications},
      year = {1996},
      month = {September},
      pages = {43--57},
      address = {Graz, Austria},
      projects =  {VIRGO},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_09_IAPR_TC-8_assistive_navigation.pdf}
    }
    
  551. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Independent 3D Motion Detection Through Robust Regression in Depth Layers", FORTH-ICS, TR-159, February 1996.
    [Abstract] [BibTeX] [URL]

  552. Abstract: This paper presents a novel method for the detection of objects that move independently of the observer in a 3D dynamic environment. Independent 3D motion detection is formulated as a problem of robust regression applied to visual input acquired by a binocular, rigidly moving observer. The qualitative analysis of images acquired by a parallel stereo configuration yields a segmentation of a scene into depth layers. A depth layer consists of points of the 3D space for which depth variations are small compared to the distance from the observer. Robust regression is applied to each depth layer in order to segment the latter into coherently moving regions. Finally, a combination stage is applied across all layers in order to come up with an integrated view of independent motion in the whole 3D scene. In contrast to other existing approaches for independent motion detection which are based on the ill-posed problem of optical flow computation, the proposed method relies on normal flow fields for both stereo and motion processing. Experimental results show the effectiveness and robustness of the proposed scheme, which is capable of discriminating independent 3D motion in scenes with large depth variations.
    BibTeX:
    @techreport{Argyros1996f,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Independent 3D Motion Detection Through Robust Regression in Depth Layers},
      school = {FORTH-ICS},
      year = {1996},
      month = {February},
      number = {TR-159},
      url = {http://users.ics.forth.gr/~argyros/res_imd.html}
    }
    
  553. A.A. Argyros, M.I.A. Lourakis, P.E. Trahanias and S.C. Orphanoudakis, "Real-time Detection of Maneuvering Objects by a Monocular Observer", FORTH-ICS, TR-160, 1996.
    [Abstract] [BibTeX] [PDF] [URL]

  554. Abstract: A method is proposed for real-time detection of objects that maneuver in the visual field of a monocular observer. Such cases are common in natural environments where the 3D motion parameters of certain objects (e.g. animals) change considerably over time. The approach taken conforms with the theory of purposive vision, according to which vision algorithms should solve many, specific problems under loose assumptions. The method can effectively answer two important questions: (a) whether the observer has changed his 3D motion parameters, and (b) in case that the observer has constant 3D motion, whether there are any maneuvering objects (objects with non-constant 3D motion parameters) in his visual field. The approach is direct in the sense that the structure from motion problem - which can only be solved under restrictive assumptions - is avoided. Essentially, the method relies on a pointwise comparison of two normal flow fields which can be robustly computed from three successive frames. Thus, it by-passes the ill-posed problem of optical flow computation. Experimental results demonstrate the effectiveness and robustness of the proposed scheme. Moreover, the computational requirements of the method are extremely low, making it a likely candidate for real-time implementation.
    BibTeX:
    @techreport{Argyros1996,
      author = {Argyros, Antonis A and Lourakis, Manolis I A and Trahanias, Panos E and Orphanoudakis, Stelios C},
      title = {Real-time Detection of Maneuvering Objects by a Monocular Observer},
      school = {FORTH-ICS},
      year = {1996},
      number = {TR-160},
      url = {http://users.ics.forth.gr/~argyros},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1996_02_tr160_forth_imd_maneuvering_object_detection.pdf}
    }
    
  555. A.A. Argyros, "Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks", M.Sc. Thesis, Computer Science Department, University of Crete, December 1992.
    [BibTeX] [DOI] [URL]

  556. BibTeX:
    @mastersthesis{Argyros1992a,
      author = {Argyros, Antonis A},
      title = {Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks},
      school = {Computer Science Department, University of Crete},
      year = {1992},
      month = {December},
      address = {Heraklion, Crete, Greece},
      url = {http://users.ics.forth.gr/~argyros},
      doi = {http://elocus.lib.uoc.gr/dlib/8/9/0/metadata-dlib-1992argyros.tkl}
    }
    
  557. A.A. Argyros and S.C. Orphanoudakis, "Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks", In ERCIM Workshop on Parallel Architectures for Computer Vision, pp. 91-101, Hersonissos, Crete, Greece, October 1992.
    [BibTeX]

  558. BibTeX:
    @inproceedings{Argyros1992b,
      author = {Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks},
      booktitle = {ERCIM Workshop on Parallel Architectures for Computer Vision},
      year = {1992},
      month = {October},
      pages = {91--101},
      address = {Hersonissos, Crete, Greece}
    }
    
  559. A.A. Argyros and S.C. Orphanoudakis, "Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks", In Dartmouth Advanced Graduate Studies Symposium on Parallel Computing (DAGS/PC 1992), pp. 162-175, Dartmouth College, New Hampshire, USA, June 1992.
    [BibTeX] [PDF]

  560. BibTeX:
    @inproceedings{Argyros1992c,
      author = {Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks},
      booktitle = {Dartmouth Advanced Graduate Studies Symposium on Parallel Computing (DAGS/PC 1992)},
      year = {1992},
      month = {June},
      pages = {162--175},
      address = {Dartmouth College, New Hampshire, USA},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1992_07_tr048_forth_load_redistribution_in_vision.pdf}
    }
    
  561. A.A. Argyros and S.C. Orphanoudakis, "Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks", FORTH-ICS, TR-48, 1992.
    [Abstract] [BibTeX] [PDF] [URL]

  562. Abstract: Parallelism can be exploited to handle the enormous computational requirements of many vision applications. However, the computational power offered by multiprocessor architectures cannot be fully harnessed to achieve the desired speedup. This is primarily due to the unbalanced distribution of computational load among the processors of a parallel architecture. Furthermore, in parallel implementations of image analysis tasks, what constitutes computational load and the load balancing requirements of specific implementations are often difficult to define in a systematic way. In this paper, we consider the load balancing requirements of parallel implementations of intermediate level vision tasks on distributed memory parallel architectures. The computational characteristics of such tasks are briefly discussed and an appropriate definition of computational load is adopted. The primary implication of this definition for load balancing is that load entities to be redistributed are allowed to have nonuniform computational cost. An existing algorithm, which assumes uniform cost loads, and two modifications of this algorithm, which handle the nonuniform cost loads encountered in parallel implementations of intermediate level vision tasks, are described. These algorithms have been implemented on the iPSC/2 hypercube and their performance has been evaluated using simulated load conditions, as well as in the context of a simple object recognition system. Results on load balancing accuracy and total execution time are presented and discussed. Algorithmic performance has also been compared with the cases of optimal load distribution and no load redistribution. This work emphasizes the importance of understanding the requirements and difficulties of load redistribution in parallel image processing applications.
    BibTeX:
    @techreport{Argyros1992,
      author = {Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Load Redistribution Algorithms for Parallel Implementations of Intermediate Level Vision Tasks},
      school = {FORTH-ICS},
      year = {1992},
      number = {TR-48},
      address = {FORTH-ICS},
      url = {http://users.ics.forth.gr/~argyros},
      pdflink = {http://users.ics.forth.gr/~argyros/mypapers/1992_07_tr048_forth_load_redistribution_in_vision.pdf}
    }
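
    The difficulty emphasized above is that the load entities to be redistributed have nonuniform computational costs. One simple way to picture the balancing problem is a greedy longest-processing-time-first assignment, sketched below in Python; this is only an illustration of nonuniform-cost balancing, not one of the hypercube algorithms evaluated in the report.
    Sketch (Python):
    import heapq

    def redistribute(load_costs, n_procs):
        # Greedy longest-processing-time-first assignment of load entities with
        # nonuniform costs: repeatedly give the costliest remaining entity to the
        # currently least-loaded processor.
        heap = [(0.0, p) for p in range(n_procs)]     # (assigned load so far, processor id)
        heapq.heapify(heap)
        assignment = {p: [] for p in range(n_procs)}
        for idx, cost in sorted(enumerate(load_costs), key=lambda ic: -ic[1]):
            total, p = heapq.heappop(heap)            # least-loaded processor
            assignment[p].append(idx)
            heapq.heappush(heap, (total + cost, p))
        totals = {p: sum(load_costs[i] for i in items) for p, items in assignment.items()}
        return assignment, totals

    if __name__ == "__main__":
        # e.g. costs of the image regions / feature groups produced by low-level processing
        costs = [9.0, 7.5, 7.0, 3.0, 2.5, 2.0, 1.0, 0.5]
        assignment, totals = redistribute(costs, 3)
        print(totals)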
    
  563. A. Damianakis, A.A. Argyros and S.C. Orphanoudakis, "Parallel Implementations of Image Analysis Tasks", In Panhellenic Conference on Informatics, Athens, Greece, May 1991.
    [BibTeX]

  564. BibTeX:
    @inproceedings{Damianakis1991,
      author = {Damianakis, Adam and Argyros, Antonis A and Orphanoudakis, Stelios C},
      title = {Parallel Implementations of Image Analysis Tasks},
      booktitle = {Panhellenic Conference on Informatics},
      year = {1991},
      month = {May},
      address = {Athens, Greece}
    }