2017
Ozcinar, Cagri; Abreu, Ana De; Knorr, Sebastian; Smolic, Aljosa Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems Conference Forthcoming The 19th IEEE International Symposium on Multimedia (ISM2017), Taichung, Taiwan, Forthcoming. @conference{OzcinarISM2017, title = {Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems}, author = {Cagri Ozcinar and Ana De Abreu and Sebastian Knorr and Aljosa Smolic}, year = {2017}, date = {2017-12-20}, booktitle = {The 19th IEEE International Symposium on Multimedia (ISM2017)}, address = {Taichung, Taiwan}, keywords = {}, pubstate = {forthcoming}, tppubtype = {conference} }
Croci, Simone; Knorr, Sebastian; Smolic, Aljosa Saliency-Based Sharpness Mismatch Detection For Stereoscopic Omnidirectional Images Inproceedings Forthcoming 14th European Conference on Visual Media Production, London, UK, Forthcoming. @inproceedings{Croci2017a, title = {Saliency-Based Sharpness Mismatch Detection For Stereoscopic Omnidirectional Images}, author = {Simone Croci and Sebastian Knorr and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/10/2017_CVMP_Saliency-Based-Sharpness-Mismatch-Detection-For-Stereoscopic-Omnidirectional-Images.pdf}, year = {2017}, date = {2017-12-11}, booktitle = {14th European Conference on Visual Media Production}, address = {London, UK}, abstract = {In this paper, we present a novel sharpness mismatch detection (SMD) approach for stereoscopic omnidirectional images (ODI) for quality control within the post-production workflow, which is the main contribution. In particular, we applied a state-of-the-art SMD approach, which was originally developed for traditional HD images, and extended it to stereoscopic ODIs. A new efficient method for patch extraction from ODIs was developed based on the spherical Voronoi diagram of equidistant points evenly distributed on the sphere. The subdivision of the ODI into patches allows an accurate detection and localization of regions with sharpness mismatch. A second contribution of the paper is the integration of saliency into our SMD approach. In this context, we introduce a novel method for the estimation of saliency maps from viewport data of head-mounted displays (HMD). Finally, we demonstrate the performance of our SMD approach with data collected from a subjective test with 17 participants.}, keywords = {}, pubstate = {forthcoming}, tppubtype = {inproceedings} }
Croci, Simone; Knorr, Sebastian; Goldmann, Lutz; Smolic, Aljosa A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency Inproceedings Forthcoming International Conference on 3D Immersion, Brussels, Belgium, Forthcoming. @inproceedings{Croci2017b, title = {A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency}, author = {Simone Croci and Sebastian Knorr and Lutz Goldmann and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/10/2017_IC3D_A-FRAMEWORK-FOR-QUALITY-CONTROL-IN-CINEMATIC-VR-BASED-ON-VORONOI-PATCHES-AND-SALIENCY.pdf}, year = {2017}, date = {2017-12-11}, booktitle = {International Conference on 3D Immersion}, address = {Brussels, Belgium}, abstract = {In this paper, we present a novel framework for quality control in cinematic VR (360-video) based on Voronoi patches and saliency, which can be used in post-production workflows. Our approach first extracts patches in stereoscopic omnidirectional images (ODI) using the spherical Voronoi diagram. The subdivision of the ODI into patches allows an accurate detection and localization of regions with artifacts. Further, we introduce saliency in order to weight detected artifacts according to the visual attention of end-users. Then, we propose different artifact detection and analysis methods for sharpness mismatch detection (SMD), color mismatch detection (CMD) and disparity distribution analysis. In particular, we took two state-of-the-art approaches for SMD and CMD, which were originally developed for conventional planar images, and extended them to stereoscopic ODIs. Finally, we evaluated the performance of our framework with a dataset of 18 ODIs for which saliency maps were obtained from a subjective test with 17 participants.}, keywords = {}, pubstate = {forthcoming}, tppubtype = {inproceedings} }
Alain, Martin; Smolic, Aljosa Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering Inproceedings IEEE International Workshop on Multimedia Signal Processing (MMSP 2017), 2017. @inproceedings{Alain2017, title = {Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering}, author = {Martin Alain and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/08/LFBM5D_MMSP_camera_ready-1.pdf}, year = {2017}, date = {2017-10-16}, booktitle = {IEEE International Workshop on Multimedia Signal Processing (MMSP 2017)}, abstract = {In this paper, we propose to extend the state-of-the-art BM3D image denoising filter to light fields, and we denote our method LFBM5D. We take full advantage of the 4D nature of light fields by creating disparity compensated 4D patches which are then stacked together with similar 4D patches along a 5th dimension. We then filter these 5D patches in the 5D transform domain, obtained by cascading a 2D spatial transform, a 2D angular transform, and a 1D transform applied along the similarities. Furthermore, we propose to use the shape-adaptive DCT as the 2D angular transform to be robust to occlusions. Results show a significant improvement in synthetic noise removal compared to state-of-the-art methods, for both light fields captured with a lenslet camera or a gantry. Experiments on Lytro Illum camera noise removal also demonstrate a clear improvement of the light field quality.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} }
O’Dwyer, Néill; Johnson, Nicholas; Bates, Enda; Pagés, Rafael; Ondrej, Jan; Amplianitis, Konstantinos; Monaghan, David; Smolic, Aljosa Virtual Play in Free-viewpoint Video: Reinterpreting Samuel Beckett for Virtual Reality Inproceedings Forthcoming 16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE Xplore digital library, Forthcoming. @inproceedings{ODwyer2017b, title = {Virtual Play in Free-viewpoint Video: Reinterpreting Samuel Beckett for Virtual Reality}, author = {Néill O’Dwyer and Nicholas Johnson and Enda Bates and Rafael Pagés and Jan Ondrej and Konstantinos Amplianitis and David Monaghan and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/08/VARCI17_Beckett_V-SENSE_final.pdf}, year = {2017}, date = {2017-10-14}, booktitle = {16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR)}, publisher = {IEEE Xplore digital library}, abstract = {Since the early years of the twenty-first century, the performing arts have been party to an increasing number of digital media projects that bring renewed attention to questions about, on the one hand, new working processes involving capture and distribution techniques, and on the other hand, how particular works, with bespoke hardware and software, can exert an efficacy over how work is created by the artist/producer or received by the audience. The evolution of author/audience criteria demands that digital arts practice modify its aesthetic and storytelling strategies to types that are more appropriate to communicating ideas over interactive digital networks, wherein AR/VR technologies are rapidly becoming the dominant interface. This project explores these redefined criteria through a reimagining of Samuel Beckett's Play (1963) for digital culture. This paper offers an account of the working processes and the aesthetic and technical considerations that guide artistic decisions, and of how we attempt to place the overall work within the state of the art.}, keywords = {}, pubstate = {forthcoming}, tppubtype = {inproceedings} }
Ozcinar, Cagri; Abreu, Ana De; Smolic, Aljosa Viewport-aware adaptive 360° video streaming using tiles for virtual reality Inproceedings 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017. @inproceedings{Ozcinar2017, title = {Viewport-aware adaptive 360° video streaming using tiles for virtual reality}, author = {Cagri Ozcinar and Ana De Abreu and Aljosa Smolic}, url = {https://www.researchgate.net/publication/316990176_VIEWPORT-AWARE_ADAPTIVE_360_VIDEO_STREAMING_USING_TILES_FOR_VIRTUAL_REALITY}, year = {2017}, date = {2017-09-30}, booktitle = {2017 IEEE International Conference on Image Processing (ICIP)}, address = {Beijing, China}, abstract = {360° video is attracting an increasing amount of attention in the context of Virtual Reality (VR). Owing to its very high-resolution requirements, existing professional streaming services for 360° video suffer from severe drawbacks. This paper introduces a novel end-to-end streaming system from encoding to displaying, to transmit 8K resolution 360° video and to provide an enhanced VR experience using Head Mounted Displays (HMDs). The main contributions of the proposed system are tiling, integration of the MPEG-Dynamic Adaptive Streaming over HTTP (DASH) standard, and viewport-aware bitrate level selection. Tiling and adaptive streaming enable the proposed system to deliver very high-resolution 360° video at good visual quality. Further, the proposed viewport-aware bitrate assignment selects an optimum DASH representation for each tile in a viewport-aware manner. The quality performance of the proposed system is verified in simulations with varying network bandwidth using realistic view trajectories recorded from user experiments. Our results show that the proposed streaming system compares favorably to existing methods in terms of PSNR and SSIM inside the viewport. Our streaming system is available as an open-source library.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} }
Monroy, Rafael; Lutz, Sebastian; Chalasani, Tejo; Smolic, Aljosa SalNet360: Saliency Maps for omni-directional images with CNN Unpublished 2017. @unpublished{Monroy2017, title = {SalNet360: Saliency Maps for omni-directional images with CNN}, author = {Rafael Monroy and Sebastian Lutz and Tejo Chalasani and Aljosa Smolic}, url = {https://arxiv.org/abs/1709.06505}, year = {2017}, date = {2017-09-19}, abstract = {The prediction of Visual Attention data from any kind of media is of valuable use to content creators and can be used to efficiently drive encoding algorithms. With the current trend in the Virtual Reality (VR) field, adapting known techniques to this new kind of media is starting to gain momentum. In this paper, we present an architectural extension to any Convolutional Neural Network (CNN) to fine-tune traditional 2D saliency prediction to Omnidirectional Images (ODIs) in an end-to-end manner. We show that each step in the proposed pipeline works towards making the generated saliency map more accurate with respect to ground truth data.}, keywords = {}, pubstate = {published}, tppubtype = {unpublished} }
Knorr, Sebastian; Croci, Simone; Smolic, Aljosa A Modular Scheme for Artifact Detection in Stereoscopic Omni-Directional Images Inproceedings Irish Machine Vision and Image Processing Conference, Maynooth, Ireland, 2017. @inproceedings{Knorr2017, title = {A Modular Scheme for Artifact Detection in Stereoscopic Omni-Directional Images}, author = {Sebastian Knorr and Simone Croci and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/07/imvip2017_knorr_final.pdf}, year = {2017}, date = {2017-08-30}, booktitle = {Irish Machine Vision and Image Processing Conference}, address = {Maynooth, Ireland}, abstract = {With the release of new head-mounted displays (HMDs) and new omni-directional capture systems, 360-degree video is one of the latest and most powerful trends in immersive media, with an increasing potential for the next decades. However, creating 360-degree content in 3D in particular is still an error-prone task with many limitations to overcome. This paper describes the critical aspects of 3D content creation for 360-degree video. In particular, conflicts of depth cues and binocular rivalry are reviewed in detail, as these cause eye fatigue, headache, and even nausea. Both the reasons for the appearance of these conflicts and how to detect some of them by objective image analysis methods are detailed in this paper. The latter is the main contribution of this paper and part of the authors' long-term research roadmap towards a comprehensive framework for artifact detection and correction in 360-degree videos. Experimental results demonstrate the performance of the proposed approaches in terms of objective measures and visual feedback. Finally, the paper concludes with a discussion and future work.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} }
Chen, Yang; Alain, Martin; Smolic, Aljosa Fast and Accurate Optical Flow based Depth Map Estimation from Light Fields Inproceedings Irish Machine Vision and Image Processing Conference (Received the Best Paper Award), 2017. @inproceedings{Yang2017, title = {Fast and Accurate Optical Flow based Depth Map Estimation from Light Fields}, author = {Yang Chen and Martin Alain and Aljosa Smolic}, url = {https://v-sense.scss.tcd.ie/wp-content/uploads/2017/07/Fast-and-Accurate-Optical-Flow-based-Depth-Map-Estimation-from-Light-Fields-5.pdf}, year = {2017}, date = {2017-08-30}, booktitle = {Irish Machine Vision and Image Processing Conference (Received the Best Paper Award)}, abstract = {Depth map estimation is a crucial task in computer vision, and new approaches have recently emerged taking advantage of light fields, as this new imaging modality captures much more information about the angular direction of light rays compared to common approaches based on stereoscopic images or multi-view. In this paper, we propose a novel depth estimation method from light fields based on existing optical flow estimation methods. The optical flow estimator is applied on a sequence of images taken along an angular dimension of the light field, which produces several disparity map estimates. Considering both accuracy and efficiency, we choose the feature flow method as our optical flow estimator. Thanks to its spatio-temporal edge-aware filtering properties, the different disparity map estimates that we obtain are very consistent, which allows a fast and simple aggregation step to create a single disparity map, which can then be converted into a depth map. Since the disparity map estimates are consistent, we can also create a depth map from each disparity estimate, and then aggregate the different depth maps in 3D space to create a single dense depth map.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} }
O'Dwyer, Néill Reconsidering movement: the performativity of digital drawing techniques in computational performance Journal Article Theatre and Performance Design, 3 (1-2), pp. 68-83, 2017, ISSN: 2332-2551. @article{o2017reconsidering, title = {Reconsidering movement: the performativity of digital drawing techniques in computational performance}, author = {Néill O'Dwyer}, editor = {Jane Collins and Arnold Aronson}, url = {http://www.tandfonline.com/doi/full/10.1080/23322551.2017.1320087}, doi = {10.1080/23322551.2017.1320087}, issn = {2332-2551}, year = {2017}, date = {2017-06-23}, journal = {Theatre and Performance Design}, volume = {3}, number = {1-2}, pages = {68-83}, abstract = {This article is concerned with investigating the aesthetic repercussions of the emergence of computer-vision techniques and their subsequent integration into drawing processes. The inception of this technology has opened possibilities for corporeal drawing techniques that engage the entire body – not just the hand and eye – and this technique is finding its perfect home in digitally engaged dance practice where it is now widely used in scenographic processes, including interactive digital projections and sonification. This article analyses an example of computer-vision-aided drawing, entitled as·phyx·i·a (2015), by a New York-based collective, in order to discuss profound sociocultural reconfigurations occasioned by the invention of a new technology. It will be demonstrated that digital media’s ability to mobilise choreography as a totalised, corporeal mode of drawing – for example in dance and movement practices – represents an avant-garde, experimental and cutting-edge terrain, in which the new technologies of inscription are gathered in innovative ways that fundamentally challenge dominant paradigms of representation. In this regard, I maintain that the innovation at the heart of such projects troubles dominant cultural programmes in a way that directs a questioning at the very heart of how we construct knowledge.}, keywords = {}, pubstate = {published}, tppubtype = {article} }
Abreu, Ana De; Ozcinar, Cagri; Smolic, Aljosa Look around you: saliency maps for omnidirectional images in VR applications Inproceedings 9th International Conference on Quality of Multimedia Experience (QoMEX), 2017. @inproceedings{AnaDeAbreuCagriOzcinar2017, title = {Look around you: saliency maps for omnidirectional images in VR applications}, author = {Ana De Abreu and Cagri Ozcinar and Aljosa Smolic}, url = {https://www.researchgate.net/publication/317184829_Look_around_you_Saliency_maps_for_omnidirectional_images_in_VR_applications}, year = {2017}, date = {2017-05-31}, booktitle = {9th International Conference on Quality of Multimedia Experience (QoMEX)}, abstract = {Understanding visual attention has always been a topic of great interest in the graphics, image/video processing, robotics and human-computer interaction communities. By understanding salient image regions, compression, transmission and rendering algorithms can be optimized. This is particularly important for omnidirectional images (ODIs) viewed with a head-mounted display (HMD), where only a fraction of the captured scene, namely the viewport, is displayed at a time. In order to predict salient image regions, saliency maps are estimated either by using an eye tracker to collect eye fixations during subjective tests or by using computational models of visual attention. However, eye tracking developments for ODIs are still in the early stages, and although a large list of saliency models is available, no particular attention has been dedicated to ODIs. Therefore, in this paper, we consider the problem of estimating saliency maps for ODIs viewed with HMDs when the use of an eye tracker device is not possible. We collected viewport data of 32 participants for 21 ODIs and propose a method to transform the gathered data into saliency maps. The obtained saliency maps are compared in terms of the image exposition time used to display each ODI in the subjective tests. Then, motivated by the equator bias tendency in ODIs, we propose a post-processing method, namely FSM, to adapt current saliency models to the requirements of ODIs. We show that the use of FSM on current models improves their performance by up to 20%. The developed database and testbed are publicly available with this paper.}, keywords = {}, pubstate = {published}, tppubtype = {inproceedings} }