Professor Aljosa Smolic presents keynote at IEEE GEM 2018!


Professor Aljosa Smolic presented his keynote titled Content Creation for AR, VR, and Free Viewpoint Video at IEEE GEM 2018, held in Galway from 15 to 17 August 2018.

The IEEE GEM 2018 conference is a platform for disseminating innovative research and development work on game, entertainment, and media technologies, applying lessons learned, and developing new ideas through audience interaction. Participation from all sectors, including academia, industry, and government, is welcome. The conference brings together researchers, developers, industry, and government partners for formal and informal engagement with, and examination of, emergent features of computer game development for entertainment, for learning and teaching, for serious purposes, and for societal impact.

V-SENSE attends SIGGRAPH 2018 and EXPRESSIVE 2018!

Our colleague Dr. Matis Hudon attended SIGGRAPH 2018 in Vancouver, from 12 to 16 August, and EXPRESSIVE 2018 in Victoria, from 17 to 19 August, on our behalf.

SIGGRAPH 2018 is a five-day immersion into the latest innovations in CG, Animation, VR, Games, Digital Art, Mixed Reality and Emerging Technologies.

EXPRESSIVE 2018, co-located with SIGGRAPH 2018, aims to fuse three symposia centered on expressive aspects of computer graphics:

Computational Aesthetics (CAe) integrates the bridging aspects of computer science, philosophy, psychology, and the fine, applied & performing arts.

Non-Photorealistic Animation and Rendering (NPAR) investigates computational techniques for visual communication.

Sketch-Based Interfaces and Modeling (SBIM) explores sketching as an input modality for modeling and interaction.

Thank you to both SIGGRAPH 2018 and EXPRESSIVE 2018 for showcasing such creative innovation!

Professor Smolic presents keynote speech at the 10th International Conference on Quality of Multimedia Experience (QoMEX 2018)

In May, Professor Smolic was invited to present a keynote speech at the 10th International Conference on Quality of Multimedia Experience (QoMEX 2018) in Sardinia, Italy, titled Content Creation for AR, VR, and Free Viewpoint Video.

Augmented reality (AR) and virtual reality (VR) are among the most important technology trends today. Major industry players are making huge investments, and vibrant activity can be observed in both the start-up scene and academia. The elements of the ecosystem seem mature enough for broad adoption and success; however, the availability of compelling content may become a limiting factor. This talk addresses this content gap for AR/VR and presents solutions developed by the V-SENSE team, i.e. 3D reconstruction of dynamic real-world scenes and their interactive visualization in AR/VR.

Dr. Rafael Pagés presents Closing the Content Gap for VR and AR at FMX 2018, Stuttgart

On 27th April, Dr. Rafael Pagés presented Closing the Content Gap for VR and AR at FMX 2018, Stuttgart.

Although Virtual and Augmented Reality are already impacting global business, most of the content consumed nowadays is based either on 360 video or on synthetic models created by 3D artists. Many experts are calling AR a new mass medium, but before this becomes a reality, new ways of capturing reality need to become available. Free-viewpoint and volumetric video are emerging technologies that will bring reality capture for VR/AR content creation closer to everyone, and a few highly innovative companies are working towards this.

Dr. Rafael Pagés is CEO and Co-founder of Volograms, a technology startup on a mission to bring reality capture closer to everyone. Volograms is a Trinity College Dublin spin-off, where Rafael worked as a Postdoctoral Research Fellow within the V-SENSE team, under the supervision of Prof. Aljosa Smolic. His research interests include 3D reconstruction, free-viewpoint video, VR/AR, computer vision, and image processing.

Volograms is a technology startup on a mission to bring reality capture closer to everyone. Our technology takes a set of videos captured from different viewpoints and transforms them into volumetric holograms ("volograms") that can be enjoyed in Virtual and Augmented Reality. Our system works with different camera configurations, in outdoor or indoor scenarios, and can even generate content from videos captured with handheld consumer devices.

Seminar presentation by new V-SENSE Postdoctoral Research Fellow, Dr. Aakanksha Rana




Dr. Aakanksha Rana, V-SENSE Postdoctoral Research Fellow.

Date & Time:
4pm, Wednesday, 9th May

Venue: Large Conference Room, O’Reilly Institute (ORI LCR)

Title: High Dynamic Range Image Analysis


High Dynamic Range (HDR) imaging captures a wider dynamic range and color gamut, enabling us to draw on the subtle yet discriminating details present in both the extremely dark and the extremely bright areas of a scene. This property is of potential interest for computer vision algorithms, whose performance degrades substantially when scenes are captured using traditional low dynamic range (LDR) imagery. While such algorithms have been designed exhaustively for traditional LDR images, little work has been done so far in the context of HDR content. In this talk, I will present a quantitative and qualitative analysis of HDR imagery for such task-specific algorithms.

The seminar will begin by identifying the most natural and important questions around using HDR content for low-level feature extraction, a task of fundamental importance for many high-level applications such as stereo vision, localization, matching, and retrieval.

A performance evaluation study on a proposed dataset will show how different HDR-based modalities enhance algorithm performance with respect to LDR.

Then, three learning-based methodologies will be introduced, aimed at optimally mapping HDR content to enhance the efficiency of local feature extraction at each of its stages: detection, description, and final matching. By spatially adapting a given filter using a regression-based approach, three models are learned to adaptively map the HDR content, bringing invariance to luminance transformations at all of the aforementioned stages.
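The intuition behind mapping HDR content before feature extraction can be illustrated with a toy sketch. This is not the regression-based models from the talk, just a plain logarithmic mapping: compressing luminance logarithmically makes gradients in very dark and very bright regions comparable, which is the kind of luminance invariance a feature detector needs.

```python
import numpy as np

def log_map(hdr, eps=1e-6):
    """Illustrative logarithmic mapping of linear HDR luminance to [0, 1].

    A toy stand-in for the learned, spatially adaptive mappings described
    in the talk; it only shows why mapping matters for gradient-based
    feature detection.
    """
    log_l = np.log(hdr + eps)
    return (log_l - log_l.min()) / (log_l.max() - log_l.min())

def gradient_energy(img):
    """Mean gradient magnitude -- a crude proxy for detectable features."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.hypot(gx, gy)))

# Synthetic HDR scene: a dim textured region next to a very bright one.
rng = np.random.default_rng(0)
dark = 0.01 + 0.005 * rng.random((64, 64))      # subtle texture, low luminance
bright = 1000.0 + 500.0 * rng.random((64, 64))  # subtle texture, high luminance
hdr = np.hstack([dark, bright])

# Naive linear scaling crushes the dark region's texture...
linear = hdr / hdr.max()
# ...while the log mapping keeps gradients in both regions usable.
mapped = log_map(hdr)

print(gradient_energy(linear[:, :64]), gradient_energy(mapped[:, :64]))
```

The learned models in the talk go further by adapting the mapping spatially, but the underlying goal is the same: keep local gradients informative across the full luminance range.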

Finally, I will present deep-learning-based generic tone mapping operators (DeepTMOs) designed to cater to desired perceptual characteristics over a wide spectrum of linear input HDR images. Based on conditional Generative Adversarial Networks (cGANs), three end-to-end models are proposed to produce realistic and artifact-free high-resolution images. While hand-crafted classical TMOs are designed to work for specific scenarios, our model generalizes well over a larger range of HDR contents by modelling the underlying distribution of all available tone mapping outputs.
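For contrast, a classical hand-crafted global TMO fits in a few lines. The sketch below implements a simplified version of Reinhard's well-known global operator (it is not part of DeepTMO); its fixed compressive curve illustrates why hand-crafted operators suit specific scenarios rather than adapting to content the way a learned model can.

```python
import numpy as np

def reinhard_global(luminance, key=0.18, eps=1e-6):
    """Reinhard's global tone mapping operator (simplified).

    Scales the scene to a target 'key' using the log-average luminance,
    then compresses with L / (1 + L). Output lies in [0, 1).
    """
    l = np.asarray(luminance, dtype=np.float64)
    log_avg = np.exp(np.mean(np.log(l + eps)))  # log-average luminance
    scaled = key * l / log_avg                  # exposure adjustment
    return scaled / (1.0 + scaled)              # fixed compressive curve

hdr = np.array([0.01, 1.0, 100.0, 10000.0])     # linear HDR luminances
ldr = reinhard_global(hdr)
print(ldr)  # monotone, bounded in [0, 1)
```

The curve is monotone and bounded regardless of content, which is exactly the rigidity a cGAN-based operator tries to overcome by modelling the distribution of tone-mapped outputs.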


Aakanksha Rana received an M.Engg. in multimedia technologies and a Ph.D. in signal and image processing from Télécom ParisTech, France, in 2014 and 2018, respectively. During her master's, she was an intern in the exploratory research group at the Technicolor Rennes Research & Innovation Center (2014) and in the AYIN group, INRIA Sophia Antipolis (2013). Her Ph.D. focused mainly on High Dynamic Range (HDR) image analysis for low-level computer vision and perceptual applications. Her areas of research broadly include computer vision, deep learning, high dynamic range imaging, and the analysis of satellite imagery.

You may contact Aakanksha at


Summary of other V-SENSE seminars – dates for your diary:

Professor Giuseppe Valenzise, CentraleSupelec Université Paris-Sud
Date: Monday, 14th May
Time: 2pm
Venue: Large Conference Room, O’Reilly Institute (LCR ORI)

Dr Iman Zolanvari, new V-SENSE Research Fellow
Date: Wed, 16th May
Time: 12.30pm
Venue: Large Conference Room, O’Reilly Institute (LCR ORI)


School Seminar Series presented by V-SENSE guest speaker Professor Giuseppe Valenzise!

Date: Monday, 14th May
Time: 2pm
Venue: Large Conference Room, O’Reilly Institute (LCR ORI)

Title: Blind Quality Estimation by Disentangling Perceptual and Noisy Features in High Dynamic Range Images

Abstract: High Dynamic Range (HDR) image visual quality assessment in the absence of a reference image is challenging. This research topic has not been adequately studied, largely due to the high cost of HDR display devices. Nevertheless, HDR imaging technology has attracted increasing attention because it provides more realistic content, consistent with what the Human Visual System perceives. We propose a new No-Reference Image Quality Assessment (NR-IQA) model for HDR data based on convolutional neural networks. The proposed model is able to detect visual artifacts in a distorted HDR image without any reference, taking perceptual masking effects into consideration. The error and perceptual masking values are measured separately, yet sequentially, and then processed by a mixing function to predict the perceived quality of the distorted image. Instead of using simple stimuli and psychovisual experiments, perceptual masking effects are computed from a set of annotated HDR images during our training process. Experimental results demonstrate that our proposed NR-IQA model can predict HDR image quality as accurately as state-of-the-art full-reference IQA methods.
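The two-branch idea can be sketched abstractly: one quantity measures local error, another measures perceptual masking, and a mixing function combines them so that errors in strongly masked (visually busy) regions contribute less to the final score. The mixing function below is a hypothetical placeholder for illustration only; in the actual model both quantities are estimated by CNNs and the combination is learned from annotated data.

```python
import numpy as np

def mixed_quality(error_map, masking_map, alpha=1.0):
    """Toy mixing function: errors are attenuated where masking is high.

    error_map   -- per-pixel estimated distortion magnitude
    masking_map -- per-pixel perceptual masking strength (>= 0)
    Returns a scalar quality score in (0, 1], higher = better.
    """
    visible_error = error_map / (1.0 + alpha * masking_map)
    return float(1.0 / (1.0 + np.mean(visible_error)))

err = np.full((8, 8), 0.5)           # the same distortion everywhere
flat_region = np.zeros((8, 8))       # no masking: errors fully visible
busy_region = np.full((8, 8), 4.0)   # strong masking: errors hidden

print(mixed_quality(err, flat_region))  # lower predicted quality
print(mixed_quality(err, busy_region))  # higher quality, identical error
```

The point of separating the two measurements is visible even in this caricature: identical distortion yields different perceived quality depending on the masking of the region it falls in.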

Bio: Giuseppe Valenzise completed a master's degree and a Ph.D. in Information Technology at the Politecnico di Milano, Italy, in 2007 and 2011, respectively. In 2012, he joined the French Centre National de la Recherche Scientifique (CNRS) as a permanent researcher, first at the Laboratoire Traitement et Communication de l’Information (LTCI), Télécom ParisTech, and from 2016 at the Laboratoire des Signaux et Systèmes (L2S), CentraleSupelec Université Paris-Sud. His research interests span different fields of image and video processing, including high dynamic range imaging, video quality assessment, single and multi-view video coding, and applications of machine learning to image and video analysis. He is co-author of more than 70 research publications and of several award-winning papers, and is the recipient of the EURASIP Early Career Award 2018. Dr. Valenzise serves as Associate Editor for IEEE Transactions on Circuits and Systems for Video Technology as well as for Elsevier Signal Processing: Image Communication, and he is a member of the MMSP and IVMSP technical committees of the IEEE Signal Processing Society for the term 2018-2020.

Professor Smolic elected Fellow of Trinity College Dublin

Congratulations to our team leader and PI, Professor Smolic, on his election to a Professorial Fellowship at Trinity College Dublin! We are very proud of his achievement.

Students and academics gathered in Trinity College Dublin’s Front Square to applaud the announcement of new Scholars and Fellows read out by the Provost of Trinity, Dr Patrick Prendergast. The College community, friends and families celebrated their wonderful achievement.

This year there are 73 Scholars, 16 Fellows and two Honorary Fellows.

“We are so proud of all our students and academics today who have become new Fellows and Scholars. This is always a special day for all of us when we celebrate their hard work and academic prowess,” said Provost, Dr Patrick Prendergast.

Seminar presentation by new V-SENSE member Dr. Rogerio da Silva!

Dr. Rogerio da Silva, V-SENSE Postdoctoral Research Fellow

Date & Time: 4pm, Thursday, 22nd March.

Venue: Large Conference Room, O’Reilly Institute (ORI LCR)

Title:  Creating Partly Autonomous Expressive Virtual Actors for Computer Animation


Autonomous digital actors represent the next stage in the animation industry's search for novel processes for authoring character-based animations. In this research, we drew up a list of requirements for a proposed autonomous agent architecture for digital actors. The purpose was to suggest improvements to current digital-actor technology and to the way "believable" characters are used by the game and animation industries. Our solution considers three main layers in terms of the skills autonomous actors should display: first, they should be able to interpret script representations autonomously; second, a deliberation phase implements an agent architecture to work out suitable ways of enacting the previously interpreted script; and third, these enactments are translated into animation commands suitable for a given animation engine. Although determining the best process for creating autonomous digital actors remains an open question, we believe that this research provides a better understanding of some of its components and can lead towards the development of the first fully functional autonomous digital actor.
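The three layers described above can be caricatured as a tiny pipeline; every function name and data structure here is hypothetical, invented purely to make the layering concrete.

```python
# Hypothetical sketch of the three-layer architecture:
# script interpretation -> deliberation -> animation commands.

def interpret_script(script_lines):
    """Layer 1: parse a script representation into abstract directions."""
    return [{"actor": a, "action": act}
            for a, act in (line.split(":", 1) for line in script_lines)]

def deliberate(direction):
    """Layer 2: the agent decides *how* to enact a direction."""
    style = "emphatic" if "!" in direction["action"] else "neutral"
    return {**direction, "style": style}

def to_animation_commands(enactment):
    """Layer 3: translate an enactment into engine-level commands."""
    return (f"PLAY({enactment['actor'].strip()}, "
            f"{enactment['action'].strip(' !')}, {enactment['style']})")

script = ["Alice: waves!", "Bob: sits"]
commands = [to_animation_commands(deliberate(d))
            for d in interpret_script(script)]
print(commands)
```

The value of the layering is that each stage can be improved independently: a richer script parser, a more expressive deliberation agent, or a different target animation engine.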

V-SENSE research seminar presented by new V-SENSE team member, Dr. Pan Gao!

Presenter: Dr. Pan Gao, Postdoctoral Research Fellow in V-SENSE.

Date & Time: 4pm, Wednesday, 7th March.

Venue: Large Conference Room, O’Reilly Institute (ORI LCR)


Title: Analysis of Packet-Loss-Induced Distortion in Decoded Depth Map and Its Application to Error Control for 3-D Visual Communications


With the growing demand for more immersive experiences, depth-map-based 3-D video representation has gained increasing popularity in both industry and academia. With it, an arbitrary number of views can be synthesized at the receiver from the transmitted textures and depths via depth-image-based rendering technology. As the depth map only provides geometry information for view synthesis, rather than being viewed by end users, packet-loss errors in the depth map introduce a type of error (geometry error) significantly different from ordinary intensity errors: they lead to unexpected holes and overlaps in the rendered virtual views, substantially deteriorating the overall quality of the views presented to users. It is therefore desirable to provide an in-depth theoretical analysis of how depth errors affect the rendered views, and to design corresponding practical error resilience and protection mechanisms for 3-D visual communications.

In this talk, we will first present an analytical distortion model that estimates the distortion in the decoded depth map caused by packet loss and mathematically analyses its adverse effect on the rendered views. In this model, the depth errors caused by random packet losses are estimated through a recursive function, without requiring prior knowledge of their probability distribution. Further, the proposed model takes into consideration low-complexity filtering operations, e.g. the interpolation invoked for fractional-pixel motion-compensated prediction. In particular, the expected view synthesis distortion is characterized in the frequency domain using a new approach that combines the power spectral densities of the reconstructed texture image and the channel errors. Simulation results quantitatively and qualitatively demonstrate that the proposed analytic model is capable of modelling the depth-error-induced synthesis distortion. The talk will then elaborate on how the developed distortion model is applied to error control in real 3-D video coding, enhancing the robustness of 3-D visual communications. In order to keep backward compatibility with existing standards, we design an approach based on rate-distortion-optimal joint texture and depth coding mode selection, without modifications to the bit-stream syntax. To overcome the adjacent-block interdependency induced by the warping operation in synthesis, we develop a dynamic programming method that locates the optimal solution in a computationally feasible way. Further, we extend the Lagrange minimization method to the more general variable-block-size prediction case, where the optimal quadtree structure and the combined coding modes are jointly determined using a specially designed multi-level dual trellis.
Experimental results also demonstrate the advantage of the proposed error-resilient algorithm, combined with the proposed distortion model, in improving both the objective and subjective reconstruction quality of 3-D visual communication.
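To see why a depth error is a geometry error rather than an intensity error, consider the warping step of depth-image-based rendering for a horizontally shifted virtual camera: the disparity is d = f·b/z, so an error in the transmitted depth z displaces the pixel in the synthesized view instead of merely changing its value. A minimal numeric sketch (illustrative values, not taken from the talk):

```python
def disparity(focal_px, baseline_m, depth_m):
    """Horizontal disparity (in pixels) for a rectified camera pair."""
    return focal_px * baseline_m / depth_m

f, b = 1000.0, 0.1        # focal length in pixels, baseline in metres
true_depth = 2.0          # metres
corrupted_depth = 2.5     # depth after a packet-loss-induced error

d_true = disparity(f, b, true_depth)       # 50.0 px
d_err = disparity(f, b, corrupted_depth)   # 40.0 px
print(abs(d_true - d_err))  # prints 10.0: the pixel lands 10 px away
```

A 25% depth error here shifts the warped pixel by 10 pixels, producing exactly the holes and overlaps the abstract describes, which is why depth losses demand a dedicated distortion model rather than an intensity-error one.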

Welcome Pan!

V-SENSE research seminar presented by new V-SENSE Research Fellow, Dr. Emin Zerman!

Please join us to welcome our new V-SENSE Research Fellow, Dr. Emin Zerman!

Dr. Emin Zerman, V-SENSE Research Fellow

Venue: Large Conference Room, O’Reilly Institute (ORI LCR)

Date & Time:
4pm, Thursday, 22nd February 2018


Title: Assessment and Analysis of High Dynamic Range Video Quality


In the last decade, high dynamic range (HDR) image and video technology has gained a lot of attention, especially within the multimedia community. Recent technological advancements have made the acquisition, compression, and reproduction of HDR content easier, which has led to the commercialization of HDR displays and the popularization of HDR content. In this context, measuring the quality of HDR content plays a fundamental role in improving the content distribution chain as well as its individual parts, such as compression and display. However, HDR visual quality assessment presents new challenges with respect to the standard dynamic range (SDR) case. Some of these challenges are the new viewing conditions introduced by the reproduction of HDR content, e.g. the increase in brightness and contrast, the estimation of objective HDR content quality, and the acquisition of subjective HDR content quality.

In this talk, I will present some of the solutions we propose for these problems. In order to understand the effects of the increased brightness and contrast, we analyze the effects of display rendering using the built-in rendering scheme of the SIM2 HDR display and another display rendering scheme that we developed. Our rendering algorithm reproduces and estimates the emitted luminance of HDR images accurately, which also allows us to analyze the effect of accurate luminance estimation on objective HDR quality assessment. In order to evaluate the performance of existing full-reference (FR) objective HDR image quality metrics, we gather five different HDR image quality databases with their MOS values and fuse them by aligning their MOS values. We then analyze the HDR image quality metrics using statistical evaluation methods. Additionally, we propose a new method for evaluating metric discriminability based on a novel classification approach. Motivated by the need to fuse several different quality databases, we propose to use the pairwise comparison (PC) methodology with scaling, since it is much more intuitive and subjects can decide more easily and quickly. In order to increase the scaling performance and to reduce cross-content variance as well as confidence intervals, we propose to include cross-content comparisons in the PC experiments.
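The scaling step of a pairwise comparison experiment can be illustrated with a textbook Bradley-Terry fit: given a matrix counting how often each condition was preferred over each other condition, iterative maximum-likelihood estimation recovers a one-dimensional quality scale. This sketch omits the cross-content comparisons the talk proposes:

```python
def bradley_terry(wins, iters=200):
    """Iterative MLE for Bradley-Terry scores from a win-count matrix.

    wins[i][j] = number of times condition i was preferred over j.
    Returns scores normalised to sum to 1 (higher = preferred more often).
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            w_i = sum(wins[i])  # total wins of condition i
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new_p.append(w_i / denom if denom > 0 else p[i])
        total = sum(new_p)
        p = [x / total for x in new_p]  # fix the scale's free constant
    return p

# Three conditions: A clearly beats B and C; B slightly beats C.
wins = [[0, 9, 8],
        [1, 0, 6],
        [2, 4, 0]]
scores = bradley_terry(wins)
print(scores)  # scores[0] > scores[1] > scores[2]
```

Cross-content comparisons, as proposed in the talk, add extra edges to this comparison graph, which is what tightens the confidence intervals of the fitted scale.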


Emin Zerman received his B.Sc. degree (2011) and M.Sc. degree (2013) in Electrical and Electronics Engineering from Middle East Technical University, Turkey, and his Ph.D. degree (2018) in Signals and Images from Télécom ParisTech, France. During his master's studies, he also worked as a research and teaching assistant. Although his Ph.D. studies focused mainly on HDR video quality assessment, his research interests include human visual perception, computer vision, video compression and transmission, and video processing in a more general sense.

Welcome to the team, Emin!