2018
Gao, Pan; Ozcinar, Cagri; Smolic, Aljosa Optimization of Occlusion-Inducing Depth Pixels in 3-D Video Coding Conference IEEE International Conference on Image Processing (ICIP 2018), Athens, Greece, 2018. @conference{Gao2018,
title = {Optimization of Occlusion-Inducing Depth Pixels in 3-D Video Coding},
author = {Pan Gao and Cagri Ozcinar and Aljosa Smolic},
url = {https://arxiv.org/abs/1805.03105},
year = {2018},
date = {2018-10-07},
booktitle = {IEEE International Conference on Image Processing (ICIP 2018)},
organization = {Athens, Greece},
abstract = {The optimization of occlusion-inducing depth pixels in depth map coding has received little attention in the literature, since their associated texture pixels are occluded in the synthesized view and their effect on the synthesized view is considered negligible. However, the occlusion-inducing depth pixels still need to consume the bits to be transmitted, and will induce geometry distortion that inherently exists in the synthesized view. In this paper, we propose an efficient depth map coding scheme specifically for the occlusion-inducing depth pixels by using allowable depth distortions. Firstly, we formulate a problem of minimizing the overall geometry distortion in the occlusion subject to the bit rate constraint, for which the depth distortion is properly adjusted within the set of allowable depth distortions that introduce the same disparity error as the initial depth distortion. Then, we propose a dynamic programming solution to find the optimal depth distortion vector for the occlusion. The proposed algorithm can improve the coding efficiency without alteration of the occlusion order. Simulation results confirm the performance improvement compared to other existing algorithms.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
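A hedged restatement of the optimisation problem sketched in the abstract, in our own notation rather than the paper's: the geometry distortion over the set of occlusion-inducing depth pixels is minimised under a bit-rate budget, with each depth distortion restricted to the allowable set that yields the same disparity error as the initial distortion,

\[
\min_{\{\Delta z_i\}} \sum_{i \in \mathcal{O}} D_{\mathrm{geo}}(\Delta z_i)
\quad \text{s.t.} \quad \sum_{i \in \mathcal{O}} R_i(\Delta z_i) \le R_c,
\qquad \Delta z_i \in \mathcal{A}\big(\Delta z_i^{0}\big),
\]

where \mathcal{O} is the occlusion region, R_c the rate constraint and \mathcal{A}(\Delta z_i^{0}) the set of allowable depth distortions that induce the same disparity error as the initial distortion \Delta z_i^{0}; the paper's dynamic-programming solution searches over such depth distortion vectors.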
Hudon, Matis; Grogan, Mairéad; Pagés, Rafael; Smolic, Aljosa Deep Normal Estimation for Automatic Shading of Hand-Drawn Characters Workshop The 3rd Geometry Meets Deep Learning Workshop in association with ECCV 2018, 2018. @workshop{HudonGMDL2018,
title = {Deep Normal Estimation for Automatic Shading of Hand-Drawn Characters},
author = {Matis Hudon and Mairéad Grogan and Rafael Pagés and Aljosa Smolic },
editor = {ECCV workshop proceedings},
url = {https://v-sense.scss.tcd.ie:443/research/vfx/deep-normal-estimation-for-automatic-shading-of-hand-drawn-characters/
https://www.scss.tcd.ie/~hudonm/pdf/DeepNormals_eccv_GMDL.pdf},
year = {2018},
date = {2018-09-14},
booktitle = {The 3rd Geometry Meets Deep Learning Workshop in association with ECCV 2018},
abstract = {We present a new fully automatic pipeline for generating shading effects on hand-drawn characters. Our method takes as input a single digitized sketch of any resolution and outputs a dense normal map estimation suitable for rendering without requiring any human input. At the heart of our method lies a deep residual, encoder-decoder convolutional network. The input sketch is first sampled using several equally sized 3-channel windows, with each window capturing a local area of interest at 3 different scales. Each window is then passed through the previously trained network for normal estimation. Finally, network outputs are arranged together to form a full-size normal map of the input sketch. We also present an efficient and effective way to generate a rich set of training data. Resulting renders offer a rich quality without any effort from the 2D artist. We show both quantitative and qualitative results demonstrating the effectiveness and quality of our network and method.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
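The tiling strategy summarised in the abstract (equally sized windows whose three channels see the same location at three different scales, passed through the network and re-assembled into a full-size normal map) can be illustrated with a small sketch. Window size, scales, stride and the overlap averaging are our own illustrative choices, and the trained network itself is left out; a hypothetical predictor mapping each (win, win, 3) window to a (win, win, 3) normal estimate would sit between the two helpers:

import numpy as np

def sample_multiscale_windows(sketch, win=256, scales=(1, 2, 4), stride=128):
    """Cut a single-channel sketch into equally sized windows whose three
    channels capture the same location at three different scales."""
    h, w = sketch.shape
    pad = win * max(scales)                  # generous border so every crop fits
    padded = np.pad(sketch, pad, mode='edge')
    windows, positions = [], []
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            cy, cx = y + win // 2 + pad, x + win // 2 + pad   # window centre
            channels = []
            for s in scales:
                half = win * s // 2
                crop = padded[cy - half:cy + half, cx - half:cx + half]
                channels.append(crop[::s, ::s])               # subsample to win x win
            windows.append(np.stack(channels, axis=-1))
            positions.append((y, x))
    return np.array(windows), positions

def assemble_normal_map(shape, window_normals, positions, win=256):
    """Arrange per-window network outputs into a full-size normal map,
    averaging wherever neighbouring windows overlap."""
    acc = np.zeros(shape + (3,))
    cnt = np.zeros(shape + (1,))
    for n, (y, x) in zip(window_normals, positions):
        y1, x1 = min(y + win, shape[0]), min(x + win, shape[1])
        acc[y:y1, x:x1] += n[:y1 - y, :x1 - x]
        cnt[y:y1, x:x1] += 1
    return acc / np.maximum(cnt, 1)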
Lutz, Sebastian; Amplianitis, Konstantinos; Smolic, Aljosa AlphaGAN: Generative adversarial networks for natural image matting Conference British Machine Vision Conference (BMVC 2018), 2018. @conference{Lutz2018,
title = {AlphaGAN: Generative adversarial networks for natural image matting},
author = {Sebastian Lutz and Konstantinos Amplianitis and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/07/AlphaGAN_arxiv.pdf},
year = {2018},
date = {2018-09-03},
booktitle = {British Machine Vision Conference (BMVC 2018)},
journal = {British Machine Vision Conference (BMVC 2018)},
abstract = {We present the first generative adversarial network (GAN) for natural image matting. Our novel generator network is trained to predict visually appealing alphas with the addition of the adversarial loss from the discriminator that is trained to classify well-composited images. Further, we improve existing encoder-decoder architectures to better deal with the spatial localization issues inherited in convolutional neural networks (CNN) by using dilated convolutions to capture global context information without downscaling feature maps and losing spatial information. We present state-of-the-art results on the alphamatting online benchmark for the gradient error and give comparable results in others. Our method is particularly well suited for fine structures like hair, which is of great importance in practical matting applications, e.g. in film/TV production.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
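The dilated-convolution idea mentioned in the abstract, capturing global context without downscaling the feature maps, can be sketched as a small PyTorch module; the channel count and dilation rates are illustrative and do not reproduce the paper's architecture:

import torch
import torch.nn as nn

class DilatedContextBlock(nn.Module):
    """Parallel dilated convolutions that enlarge the receptive field while
    keeping the spatial resolution of the feature map unchanged."""
    def __init__(self, channels=256, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in rates                     # padding = dilation keeps the size
        ])
        self.fuse = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x):
        ctx = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        return self.fuse(ctx)

# usage: the feature map keeps its spatial size
feats = torch.randn(1, 256, 80, 80)
print(DilatedContextBlock()(feats).shape)      # torch.Size([1, 256, 80, 80])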
Ghosal, Koustav; Prasad, Mukta; Smolic, Aljosa A Geometry-Sensitive Approach for Photographic Style Classification Conference Irish Machine Vision and Image Processing Conference 2018 (IMVIP), 2018. @conference{Ghosal2018,
title = {A Geometry-Sensitive Approach for Photographic Style Classification},
author = {Koustav Ghosal and Mukta Prasad and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/08/IMVIP_2018_paper_2-2.pdf, Paper
https://github.com/V-Sense/A-Geometry-Sensitive-Approach-for-Photographic-Style-Classification.git, Code},
year = {2018},
date = {2018-08-31},
booktitle = {Irish Machine Vision and Image Processing Conference 2018 (IMVIP)},
abstract = {Photographs are characterized by different compositional attributes like the Rule of Thirds, depth of field, vanishing-lines etc. The presence or absence of one or more of these attributes contributes to the overall artistic value of an image. In this work, we analyze the ability of deep learning based methods to learn such photographic style attributes. We observe that although a standard CNN learns the texture and appearance based features reasonably well, its understanding of global and geometric features is limited by two factors. First, the data-augmentation strategies (cropping, warping, etc.) distort the composition of a photograph and affect the performance. Secondly, the CNN features, in principle, are translation-invariant and appearance-dependent. But some geometric properties important for aesthetics, e.g. the Rule of Thirds (RoT), are position-dependent and appearance-invariant. Therefore, we propose a novel input representation which is geometry-sensitive, position-cognizant and appearance-invariant. We further introduce a two-column CNN architecture that performs better than the state-of-the-art (SoA) in photographic style classification. From our results, we observe that the proposed network learns both the geometric and appearance-based attributes better than the SoA.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
Croci, Simone; Grogan, Mairead; Knorr, Sebastian; Smolic, Aljosa Colour Correction for Stereoscopic Omnidirectional Images Conference Irish Machine Vision and Image Processing Conference (IMVIP 2018), 2018. @conference{croci2018b,
title = {Colour Correction for Stereoscopic Omnidirectional Images},
author = {Simone Croci and Mairead Grogan and Sebastian Knorr and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/08/IMVIP_2018_paper_9.pdf
https://v-sense.scss.tcd.ie:443/?p=5169&preview=true},
year = {2018},
date = {2018-08-29},
booktitle = {Irish Machine Vision and Image Processing Conference (IMVIP 2018)},
abstract = {Stereoscopic omnidirectional images (ODI) when viewed with a head-mounted display are a way to generate an immersive experience. Unfortunately, their creation is not an easy process, and different problems can be present in the ODI that can reduce the quality of experience. A common problem is colour mismatch, which occurs when the colours of the objects in the scene are different between the two stereoscopic views. In this paper we propose a novel method for the correction of colour mismatch based on the subdivision of ODIs into patches, where local colour correction transformations are fitted and then globally combined. The results presented in the paper show that the proposed method is able to reduce the colour mismatch in stereoscopic ODIs.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
Grogan, Mairéad; Hudon, Matis; McCormack, Daniel; Smolic, Aljosa Automatic Palette Extraction for Image Editing Inproceedings In: Irish Machine Vision and Image Processing Conference, Belfast, 2018. @inproceedings{Grogan2018,
title = {Automatic Palette Extraction for Image Editing},
author = {Mairéad Grogan and Matis Hudon and Daniel McCormack and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/09/IMVIP18_palette.pdf},
year = {2018},
date = {2018-08-29},
booktitle = {Irish Machine Vision and Image Processing Conference},
address = {Belfast},
abstract = {Interactive palette based colour editing applications have grown in popularity in recent years, but while
many methods propose fast palette extraction techniques, they typically rely on the user to define the number
of colours needed. In this paper, we present an approach that extracts a small set of representative colours
from an image automatically, determining the optimal palette size without user interaction. Our iterative
technique assigns a vote to each pixel in the image based on how close they are in colour space to the colours
already in the palette. We use a histogram to divide the colours into bins and determine which colour occurs
most frequently in the image but is far away from all of the palette colours, and we add this colour to the
palette. This process continues until all pixels in the image are well represented by the palette. Comparisons
with existing methods show that our colour palettes compare well to other state of the art techniques, while
also computing the optimal number of colours automatically at interactive speeds. In addition, we showcase
how our colour palette performs when used in image editing applications such as colour transfer and layer
decomposition.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
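The iterative extraction loop described in the abstract can be illustrated with a short sketch; the histogram resolution, the distance-weighted vote and the stopping threshold are our own illustrative choices, not the paper's parameters:

import numpy as np

def extract_palette(pixels, bins=16, stop_dist=0.2):
    """pixels: (N, 3) array of RGB values in [0, 1]. Grows a palette until
    every occupied histogram bin is close to some palette colour."""
    # histogram over an RGB grid: mean colour and pixel count per occupied bin
    idx = np.minimum((pixels * bins).astype(int), bins - 1)
    keys = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    uniq, inv, counts = np.unique(keys, return_inverse=True, return_counts=True)
    bin_colour = np.zeros((len(uniq), 3))
    np.add.at(bin_colour, inv, pixels)
    bin_colour /= counts[:, None]

    palette = []
    while True:
        if palette:
            d = np.linalg.norm(bin_colour[:, None] - np.array(palette)[None], axis=2)
            dist = d.min(axis=1)              # distance to the nearest palette colour
            if dist.max() < stop_dist:        # every bin is already well represented
                break
        else:
            dist = np.ones(len(uniq))         # first pick: most frequent bin
        votes = counts * dist                 # frequent AND far from the current palette
        palette.append(bin_colour[np.argmax(votes)])
    return np.array(palette)

For an (H, W, 3) float image with values in [0, 1], extract_palette(img.reshape(-1, 3)) returns the palette colours in the order they were added.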
Hudon, Matis; Pagés, Rafael; Grogan, Mairéad; Ondrej, Jan; Smolic, Aljosa 2D Shading for Cel Animation Conference Expressive ’18: The Joint Symposium on Computational Aesthetics and Sketch Based Interfaces and Modeling and Non-Photorealistic Animation and Rendering, ACM, 2018, ISBN: 978-1-4503-5892-7/18/08. @conference{HudonExpressive2018,
title = {2D Shading for Cel Animation},
author = {Matis Hudon and Rafael Pagés and Mairéad Grogan and Jan Ondrej and Aljosa Smolic },
editor = {ACM},
url = {https://v-sense.scss.tcd.ie:443/research/vfx/2d-shading-for-cel-animation/
https://www.scss.tcd.ie/~hudonm/pdf/Expressive18.pdf
https://www.scss.tcd.ie/~hudonm/videos/Expressive2018.mp4
},
doi = {10.1145/3229147.3229148},
isbn = {978-1-4503-5892-7/18/08},
year = {2018},
date = {2018-08-17},
booktitle = {Expressive ’18: The Joint Symposium on Computational Aesthetics and Sketch Based Interfaces and Modeling and Non-Photorealistic Animation and Rendering},
publisher = {ACM},
abstract = {We present a semi-automatic method for creating shades and self-shadows in cel animation. Besides producing attractive images, shades and shadows provide important visual cues about depth, shapes, movement, and lighting of the scene. In conventional cel animation, shades and shadows are drawn by hand. As opposed to previous approaches, this method does not rely on a complex 3D reconstruction of the scene: its key advantages are simplicity and ease of use. The tool was designed to stay as close as possible to the natural 2D creative environment and therefore provides an intuitive and user-friendly interface. Our system creates shading based on hand-drawn objects or characters, given very limited guidance from the user. The method employs simple yet very efficient algorithms to create shading directly out of drawn strokes. We evaluate our system through a subjective user study and provide qualitative comparison of our method versus existing professional tools and state of the art.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
O’Dwyer, Néill; Johnson, Nicholas; Pagés, Rafael; Ondřej, Jan; Amplianitis, Konstantinos; Bates, Enda; Monaghan, David; Smolic, Aljoša Beckett in VR: exploring narrative using free viewpoint video Inproceedings In: Proceedings of SIGGRAPH '18, ACM SIGGRAPH, New York, NY, USA, 2018, ISBN: 978-1-4503-5817-0. @inproceedings{O'Dwyer2018,
title = {Beckett in VR: exploring narrative using free viewpoint video},
author = {Néill O’Dwyer and Nicholas Johnson and Rafael Pagés and Jan Ondřej and Konstantinos Amplianitis and Enda Bates and David Monaghan and Aljoša Smolic},
url = {https://dl.acm.org/citation.cfm?doid=3230744.3230774},
doi = {10.1145/3230744.3230774},
isbn = {978-1-4503-5817-0 },
year = {2018},
date = {2018-08-12},
booktitle = {Proceedings of SIGGRAPH '18},
number = {2},
publisher = {ACM SIGGRAPH},
address = {New York, NY, USA},
organization = {ACM SIGGRAPH},
abstract = {This poster describes a reinterpretation of Samuel Beckett's theatrical text Play for virtual reality (VR). It is an aesthetic reflection on practice that follows up on a technical project description submitted to ISMAR 2017 [O'Dwyer et al. 2017]. Actors are captured in a green screen environment using free-viewpoint video (FVV) techniques, and the scene is built in a game engine, complete with binaural spatial audio and six degrees of freedom of movement. The project explores how ludic qualities in the original text help elicit the conversational and interactive specificities of the digital medium. The work affirms the potential for interactive narrative in VR, opens new experiences of the text, and highlights the reorganisation of the author-audience dynamic.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Ozcinar, Cagri; Smolic, Aljosa Visual Attention in Omnidirectional Video for Virtual Reality Applications Inproceedings In: 10th International Conference on Quality of Multimedia Experience (QoMEX 2018) , 2018. @inproceedings{Ozcinar2018,
title = {Visual Attention in Omnidirectional Video for Virtual Reality Applications},
author = {Cagri Ozcinar and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/research/3dof/visual-attention-in-omnidirectional-video-for-virtual-reality-applications/
https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/05/OmniAttention2018.pdf},
year = {2018},
date = {2018-05-29},
booktitle = {10th International Conference on Quality of Multimedia Experience (QoMEX 2018) },
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Monroy, Rafael; Lutz, Sebastian; Chalasani, Tejo; Smolic, Aljosa SalNet360: Saliency Maps for omni-directional images with CNN Journal Article In: Signal Processing: Image Communication, 2018, ISSN: 0923-5965. @article{Monroy_SalNet2018,
title = {SalNet360: Saliency Maps for omni-directional images with CNN},
author = {Rafael Monroy and Sebastian Lutz and Tejo Chalasani and Aljosa Smolic},
url = {https://arxiv.org/abs/1709.06505
https://github.com/V-Sense/salnet360
https://v-sense.scss.tcd.ie:443/research/3dof/salnet360-saliency-maps-for-omni-directional-images-with-cnn/},
doi = {10.1016/j.image.2018.05.005},
issn = {0923-5965},
year = {2018},
date = {2018-05-12},
journal = {Signal Processing: Image Communication},
abstract = {The prediction of Visual Attention data from any kind of media is of valuable use to content creators and used to efficiently drive encoding algorithms. With the current trend in the Virtual Reality (VR) field, adapting known techniques to this new kind of media is starting to gain momentum. In this paper, we present an architectural extension to any Convolutional Neural Network (CNN) to fine-tune traditional 2D saliency prediction to Omnidirectional Images (ODIs) in an end-to-end manner. We show that each step in the proposed pipeline works towards making the generated saliency map more accurate with respect to ground truth data.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Pagés, Rafael; Amplianitis, Konstantinos; Monaghan, David; Ondrej, Jan; Smolic, Aljosa Affordable Content Creation for Free-Viewpoint Video and VR/AR Applications Journal Article In: Journal of Visual Communication and Image Representation, vol. 53, pp. 192-201, 2018. @article{pages2018affordable,
title = {Affordable Content Creation for Free-Viewpoint Video and VR/AR Applications},
author = {Rafael Pagés and Konstantinos Amplianitis and David Monaghan and Jan Ondrej and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/research/6dof/affordable-content-creation-for-free-viewpoint-video-and-vr-ar-applications/},
doi = {10.1016/j.jvcir.2018.03.012},
year = {2018},
date = {2018-05-01},
journal = {Journal of Visual Communication and Image Representation},
volume = {53},
pages = {192-201},
abstract = {We present a scalable pipeline for Free-Viewpoint Video (FVV) content creation, considering also visualisation in Augmented Reality (AR) and Virtual Reality (VR). We support a range of scenarios where there may be a limited number of handheld consumer cameras, but also demonstrate how our method can be applied in professional multi-camera setups. Our novel pipeline extends many state-of-the-art techniques (such as structure-from-motion, shape-from-silhouette and multi-view stereo) and incorporates bio-mechanical constraints through 3D skeletal information as well as efficient camera pose estimation algorithms. We introduce multi-source shape-from-silhouette (MS-SfS) combined with fusion of different geometry data as crucial components for accurate reconstruction in sparse camera settings. Our approach is highly flexible and our results indicate suitability either for affordable content creation for VR/AR or for interactive FVV visualisation where a user can choose an arbitrary viewpoint or sweep between known views using view synthesis.
},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
O’Dwyer, Néill; Johnson, Nicholas Virtual Play: Beckettian Experiments in Virtual Reality Journal Article In: Contemporary Theatre Review, vol. 28.1, 2018. @article{BeckettianExperimentsinVirtualReality,
title = {Virtual Play: Beckettian Experiments in Virtual Reality},
author = {Néill O’Dwyer and Nicholas Johnson},
url = {https://www.contemporarytheatrereview.org/2018/beckettian-experiments-in-virtual-reality/
},
year = {2018},
date = {2018-02-21},
journal = {Contemporary Theatre Review},
volume = {28.1},
abstract = {The past ten years have seen extensive experimentation with Beckett and new technological media at Trinity College Dublin. Research projects have included the stage adaptation and installation of a teleplay (Ghost Trio, 2007), the HD digital video exploration of two teleplays (Abstract Machines, 2010, including new versions of …but the clouds… and Nacht und Träume), and numerous smaller projects involving audio and video within the remit of “fundamental research” at the Samuel Beckett Laboratory (2013–present). The most recent project, Virtual Play, explores Beckett’s Play (1963) within FVV (free-viewpoint video), a form of user-centred VR (virtual reality). This project, reflecting interdisciplinary and cross-faculty collaboration between the V-SENSE project (within the School of Computer Science and Statistics) and the School of Creative Arts, has made high-impact contributions in both FVV research and Beckett Studies, and has now been recognised at European level, receiving first prize at the 2017 New European Media Awards.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
2017
Ozcinar, Cagri; Abreu, Ana De; Knorr, Sebastian; Smolic, Aljosa Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems Inproceedings In: The 19th IEEE International Symposium on Multimedia (ISM 2017), Taichung, Taiwan, 2017. @inproceedings{OzcinarISM2017,
title = {Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems},
author = { Cagri Ozcinar and Ana De Abreu and Sebastian Knorr and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2018/02/ISM_2017_pcopy.pdf
http://ieeexplore.ieee.org/document/8241580/
https://arxiv.org/pdf/1711.03362.pdf
https://www.researchgate.net/publication/320274287_Estimation_of_Optimal_Encoding_Ladders_for_Tiled_360_VR_Video_in_Adaptive_Streaming_Systems?tab=overview},
year = {2017},
date = {2017-12-11},
booktitle = {The 19th IEEE International Symposium on Multimedia (ISM 2017)},
address = {Taichung, Taiwan},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Croci, Simone; Knorr, Sebastian; Smolic, Aljosa Saliency-Based Sharpness Mismatch Detection For Stereoscopic Omnidirectional Images Inproceedings In: 14th European Conference on Visual Media Production, London, UK, 2017. @inproceedings{Croci2017a,
title = {Saliency-Based Sharpness Mismatch Detection For Stereoscopic Omnidirectional Images},
author = {Simone Croci and Sebastian Knorr and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/10/2017_CVMP_Saliency-Based-Sharpness-Mismatch-Detection-For-Stereoscopic-Omnidirectional-Images.pdf},
doi = {https://doi.org/10.1145/3150165.3150168},
year = {2017},
date = {2017-12-11},
booktitle = {14th European Conference on Visual Media Production},
address = {London, UK},
abstract = {In this paper, we present a novel sharpness mismatch detection (SMD) approach for stereoscopic omnidirectional images (ODI) for quality control within the post-production workflow, which is the main contribution. In particular, we applied a state of the art SMD approach, which was originally developed for traditional HD images, and extended it to stereoscopic ODIs. A new efficient method for patch extraction from ODIs was developed based on the spherical Voronoi diagram of equidistant points evenly distributed on the sphere. The subdivision of the ODI into patches allows an accurate detection and localization of regions with sharpness mismatch. A second contribution of the paper is the integration of saliency into our SMD approach. In this context, we introduce a novel method for the estimation of saliency maps from viewport data of head-mounted displays (HMD). Finally, we demonstrate the performance of our SMD approach with data collected from a subjective test with 17 participants.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
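The patch-extraction step, a spherical Voronoi diagram over points evenly distributed on the sphere, can be sketched with SciPy; the Fibonacci-spiral seed placement and the number of patches are illustrative assumptions rather than the paper's exact construction:

import numpy as np
from scipy.spatial import SphericalVoronoi

def fibonacci_sphere(n):
    """n approximately evenly distributed unit vectors (Fibonacci spiral)."""
    i = np.arange(n) + 0.5
    phi = np.arccos(1.0 - 2.0 * i / n)         # polar angle
    theta = np.pi * (1.0 + 5 ** 0.5) * i       # azimuth with golden-angle spacing
    return np.stack([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)], axis=1)

def patch_of_pixel(lon, lat, seeds):
    """Assign an equirectangular pixel (longitude/latitude in radians) to its
    Voronoi cell, i.e. to the nearest seed on the sphere."""
    p = np.array([np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)])
    return int(np.argmax(seeds @ p))           # max dot product = min great-circle distance

seeds = fibonacci_sphere(64)                   # 64 patches, illustrative
cells = SphericalVoronoi(seeds, radius=1.0)    # cell vertices/regions if needed
cells.sort_vertices_of_regions()
print(patch_of_pixel(np.deg2rad(45), np.deg2rad(10), seeds))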
Croci, Simone; Knorr, Sebastian; Goldmann, Lutz; Smolic, Aljosa A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency Inproceedings In: International Conference on 3D Immersion, Brussels, Belgium, 2017. @inproceedings{Croci2017b,
title = {A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency},
author = {Simone Croci and Sebastian Knorr and Lutz Goldmann and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/10/2017_IC3D_A-FRAMEWORK-FOR-QUALITY-CONTROL-IN-CINEMATIC-VR-BASED-ON-VORONOI-PATCHES-AND-SALIENCY.pdf},
year = {2017},
date = {2017-12-11},
booktitle = {International Conference on 3D Immersion},
address = {Brussels, Belgium},
abstract = {In this paper, we present a novel framework for quality control in cinematic VR (360-video) based on Voronoi patches and saliency which can be used in post-production workflows. Our approach first extracts patches in stereoscopic omnidirectional images (ODI) using the spherical Voronoi diagram. The subdivision of the ODI into patches allows an accurate detection and localization of regions with artifacts. Further, we introduce saliency in order to weight detected artifacts according to the visual attention of end-users. Then, we propose different artifact detection and analysis methods for sharpness mismatch detection (SMD), color mismatch detection (CMD) and disparity distribution analysis. In particular, we took two state of the art approaches for SMD and CMD, which were originally developed for conventional planar images, and extended them to stereoscopic ODIs. Finally, we evaluated the performance of our framework with a dataset of 18 ODIs for which saliency maps were obtained from a subjective test with 17 participants.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Grogan, Mairead; Dahyot, Rozenn; Smolic, Aljosa User Interaction for Image Recolouring using L2 Inproceedings In: 14th European Conference on Visual Media Production, London, UK, 2017. @inproceedings{Grogan2017a,
title = {User Interaction for Image Recolouring using L2},
author = {Mairead Grogan and Rozenn Dahyot and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/11/CVMP17PaletteRecolouring.pdf},
year = {2017},
date = {2017-12-11},
booktitle = {14th European Conference on Visual Media Production},
address = {London, UK},
abstract = {Recently, an example based colour transfer approach proposed modelling the colour distributions of a palette and target image using Gaussian Mixture Models, and registering them by minimising
the robust L2 distance between the mixtures. In this paper we propose to extend this approach to allow for user interaction. We present two interactive recolouring applications, the first allowing
the user to select colour correspondences between a target and palette image, while the second palette based application allows the user to edit a palette of colours to determine the image recolouring. We modify the L2 based cost function to improve results when an interactive interface is used, and take measures to ensure that even when minimal input is given by the user, good colour transfer results are created. Both applications are available through a web interface and qualitatively assessed against recent recolouring techniques},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
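For reference, the L2 distance between two Gaussian mixtures mentioned above has a standard closed form (generic notation; the modified, interaction-aware cost of the paper adds terms not shown here). For f = \sum_i \pi_i \mathcal{N}(\cdot \mid \mu_i, \Sigma_i) and g = \sum_j \omega_j \mathcal{N}(\cdot \mid \nu_j, \Gamma_j),

\[
L_2(f,g) = \int \big(f(x) - g(x)\big)^2 \, dx
= \sum_{i,i'} \pi_i \pi_{i'} \, \mathcal{N}(\mu_i \mid \mu_{i'}, \Sigma_i + \Sigma_{i'})
- 2 \sum_{i,j} \pi_i \omega_j \, \mathcal{N}(\mu_i \mid \nu_j, \Sigma_i + \Gamma_j)
+ \sum_{j,j'} \omega_j \omega_{j'} \, \mathcal{N}(\nu_j \mid \nu_{j'}, \Gamma_j + \Gamma_{j'}),
\]

which follows from \int \mathcal{N}(x \mid \mu_1, \Sigma_1) \, \mathcal{N}(x \mid \mu_2, \Sigma_2) \, dx = \mathcal{N}(\mu_1 \mid \mu_2, \Sigma_1 + \Sigma_2).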
Alain, Martin; Smolic, Aljosa Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering Inproceedings In: IEEE International Workshop on Multimedia Signal Processing (MMSP 2017) - Top 10% Paper Award, 2017. @inproceedings{Alain2017,
title = {Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering},
author = {Martin Alain and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/08/LFBM5D_MMSP_camera_ready-1.pdf},
year = {2017},
date = {2017-10-16},
booktitle = {IEEE International Workshop on Multimedia Signal Processing (MMSP 2017) - Top 10% Paper Award},
abstract = {In this paper, we propose to extend the state-of-the-art BM3D image denoising filter to light fields, and we denote our method LFBM5D.
We take full advantage of the 4D nature of light fields by creating disparity compensated 4D patches which are then stacked together with similar 4D patches along a 5th dimension.
We then filter these 5D patches in the 5D transform domain, obtained by cascading a 2D spatial transform, a 2D angular transform, and a 1D transform applied along the similarities.
Furthermore, we propose to use the shape-adaptive DCT as the 2D angular transform to be robust to occlusions.
Results show a significant improvement in synthetic noise removal compared to state-of-the-art methods, for both light fields captured with a lenslet camera or a gantry.
Experiments on Lytro Illum camera noise removal also demonstrate a clear improvement of the light field quality.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
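The cascaded 5D transform described in the abstract (a 2D spatial transform, a 2D angular transform, then a 1D transform along the similarity dimension, with collaborative filtering in that domain) can be sketched with separable DCTs. The hard threshold, the stack shape and the plain DCT standing in for the paper's shape-adaptive angular DCT are illustrative simplifications:

import numpy as np
from scipy.fft import dctn, idctn

def filter_5d_stack(stack, threshold=0.1):
    """stack: (K, V, U, Y, X) group of K similar 4D patches, with (V, U) the
    angular and (Y, X) the spatial patch dimensions. Filters the group by
    hard thresholding in a cascaded 5D transform domain."""
    spec = dctn(stack, axes=(3, 4), norm='ortho')   # 2D spatial transform
    spec = dctn(spec, axes=(1, 2), norm='ortho')    # 2D angular transform
    spec = dctn(spec, axes=(0,), norm='ortho')      # 1D transform along similarities
    spec[np.abs(spec) < threshold] = 0.0            # collaborative hard thresholding
    out = idctn(spec, axes=(0,), norm='ortho')
    out = idctn(out, axes=(1, 2), norm='ortho')
    return idctn(out, axes=(3, 4), norm='ortho')

# usage: a stack of 8 similar 4D patches with 3x3 angular and 8x8 spatial size
print(filter_5d_stack(np.random.rand(8, 3, 3, 8, 8)).shape)   # (8, 3, 3, 8, 8)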
O’Dwyer, Néill; Johnson, Nicholas; Bates, Enda; Pagés, Rafael; Ondrej, Jan; Amplianitis, Konstantinos; Monaghan, David; Smolic, Aljosa Virtual Play in Free-viewpoint Video: Reinterpreting Samuel Beckett for Virtual Reality Inproceedings In: 16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 262-267, IEEE Xplore digital library, 2017. @inproceedings{ODwyer2017b,
title = {Virtual Play in Free-viewpoint Video: Reinterpreting Samuel Beckett for Virtual Reality},
author = {Néill O’Dwyer and Nicholas Johnson and Enda Bates and Rafael Pagés and Jan Ondrej and Konstantinos Amplianitis and David Monaghan and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/08/VARCI17_Beckett_V-SENSE_final.pdf},
doi = {10.1109/ISMAR-Adjunct.2017.87},
year = {2017},
date = {2017-10-14},
booktitle = {16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR)},
pages = {262-267},
publisher = {IEEE Xplore digital library},
abstract = {Since the early years of the twenty-first century, the performing arts have been party to an increasing number of digital media projects that bring renewed attention to questions about, on one hand, new working processes involving capture and distribution techniques, and on the other hand, how particular works—with bespoke hard and software—can exert an efficacy over how work is created by the artist/producer or received by the audience. The evolution of author/audience criteria demand that digital arts practice modify aesthetic and storytelling strategies, to types that are more appropriate to communicating ideas over interactive digital networks, wherein AR/VR technologies are rapidly becoming the dominant interface. This project explores these redefined criteria through a reimagining of Samuel Beckett's Play (1963) for digital culture. This paper offers an account of the working processes, the aesthetic and technical considerations that guide artistic decisions and how we attempt to place the overall work in the state of the art.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Mahmalat, Samir; Aydın, Tunç; Smolic, Aljosa Pipelines for HDR Video Coding Based on Luminance Independent Chromaticity Preprocessing Journal Article In: IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1-1, 2017, ISSN: 1558-2205. @article{Mahmalat2017,
title = {Pipelines for HDR Video Coding Based on Luminance Independent Chromaticity Preprocessing},
author = {Samir Mahmalat and Tunç Aydın and Aljosa Smolic},
url = {http://ieeexplore.ieee.org/document/8054690/},
doi = {10.1109/TCSVT.2017.2758268},
issn = {1558-2205 },
year = {2017},
date = {2017-10-02},
journal = { IEEE Transactions on Circuits and Systems for Video Technology},
volume = {PP},
number = {99},
pages = {1 - 1},
abstract = {We consider the chromaticity in high dynamic range video coding and show the advantages of a constant luminance color space for encoding. For this, we introduce two constant luminance HDR video coding pipelines, which convert the source video to linear Y u′v′. A content dependent scaling of the chromaticity components serves as color quality parameter. This reduces perceivable color artifacts while remaining fully compatible to core HEVC or other video coding standards. One of the pipelines further combines the scaling with a dedicated chromaticity transform to optimize the representation of the chromaticity components for encoding. We validate both pipelines with subjective user studies in addition to an objective comparison to other state-of-the-art methods. The user studies show a significant improvement in perceived color quality at medium to high compression rates without sacrificing luminance quality compared to current standard coding pipelines. The objective evaluation suggests that both pipelines perform at least comparable to current state-of-the-art methods.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
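For reference, the luminance-independent chromaticity coordinates behind a linear Y u′v′ representation are the CIE 1976 UCS coordinates computed from tristimulus values (X, Y, Z); the content-dependent scaling of u′ and v′ used as the colour quality parameter in the paper is not reproduced here:

\[
u' = \frac{4X}{X + 15Y + 3Z}, \qquad v' = \frac{9Y}{X + 15Y + 3Z},
\]

with the luminance Y carried as a separate component, so that quantising the chromaticity channels does not disturb luminance.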
Ozcinar, Cagri; Abreu, Ana De; Smolic, Aljosa Viewport-aware adaptive 360° video streaming using tiles for virtual reality Inproceedings In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 2174-2178, Beijing, China, 2017, ISSN: 2381-8549. @inproceedings{Ozcinar2017,
title = {Viewport-aware adaptive 360° video streaming using tiles for virtual reality},
author = { Cagri Ozcinar and Ana De Abreu and Aljosa Smolic},
url = {https://www.researchgate.net/publication/316990176_VIEWPORT-AWARE_ADAPTIVE_360_VIDEO_STREAMING_USING_TILES_FOR_VIRTUAL_REALITY
http://ieeexplore.ieee.org/document/8296667/
https://arxiv.org/pdf/1711.02386.pdf
},
doi = {10.1109/ICIP.2017.8296667},
issn = {2381-8549},
year = {2017},
date = {2017-09-30},
booktitle = {2017 IEEE International Conference on Image Processing (ICIP)},
pages = {2174-2178},
address = {Beijing, China},
abstract = {360° video is attracting an increasing amount of attention in the context of Virtual Reality (VR). Owing to its very high-resolution requirements, existing professional streaming services for 360° video suffer from severe drawbacks. This paper introduces a novel end-to-end streaming system from encoding to displaying, to transmit 8K resolution 360° video and to provide an enhanced VR experience using Head Mounted Displays (HMDs). The main contributions of the proposed system are about tiling, integration of the MPEG-Dynamic Adaptive Streaming over HTTP (DASH) standard, and viewport-aware bitrate level selection. Tiling and adaptive streaming enable the proposed system to deliver very high-resolution 360° video at good visual quality. Further, the proposed viewport-aware bitrate assignment selects an optimum DASH representation for each tile in a viewport-aware manner. The quality performance of the proposed system is verified in simulations with varying network bandwidth using realistic view trajectories recorded from user experiments. Our results show that the proposed streaming system compares favorably to existing methods in terms of PSNR and SSIM inside the viewport. Our streaming system is available as an open source library.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
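The viewport-aware bitrate level selection summarised in the abstract can be illustrated with a simple greedy sketch; the per-tile viewport weights, the example encoding ladder and the bandwidth budget are illustrative, and the paper's actual optimisation is not reproduced here:

def select_representations(tile_weights, ladders, budget_kbps):
    """Greedy viewport-aware choice of one DASH representation per tile.
    tile_weights: viewport overlap per tile (higher = more visible),
    ladders: per-tile bitrates in kbps (ascending), budget_kbps: bandwidth budget."""
    choice = [0] * len(ladders)                    # start every tile at the lowest level
    spent = sum(levels[0] for levels in ladders)
    while True:
        best, best_gain, best_cost = None, 0.0, 0
        for t, levels in enumerate(ladders):
            if choice[t] + 1 < len(levels):
                cost = levels[choice[t] + 1] - levels[choice[t]]
                gain = tile_weights[t] / cost      # visible benefit per extra kbps
                if spent + cost <= budget_kbps and gain > best_gain:
                    best, best_gain, best_cost = t, gain, cost
        if best is None:                           # no affordable upgrade left
            return choice
        choice[best] += 1
        spent += best_cost

# usage: four tiles, the first two inside the predicted viewport
print(select_representations([1.0, 0.8, 0.1, 0.1], [[200, 800, 2000]] * 4, 3000))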
Knorr, Sebastian; Croci, Simone; Smolic, Aljosa A Modular Scheme for Artifact Detection in Stereoscopic Omni-Directional Images Inproceedings In: Irish Machine Vision and Image Processing Conference, Maynooth, Ireland, 2017. @inproceedings{Knorr2017,
title = {A Modular Scheme for Artifact Detection in Stereoscopic Omni-Directional Images},
author = { Sebastian Knorr and Simone Croci and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/07/imvip2017_knorr_final.pdf
},
year = {2017},
date = {2017-08-30},
booktitle = {Irish Machine Vision and Image Processing Conference},
address = {Maynooth, Ireland},
abstract = {With the release of new head-mounted displays (HMDs) and new omni-directional capture systems, 360-degree video is one of the latest and most powerful trends in immersive media, with an increasing potential for the next decades. However, especially creating 360-degree content in 3D is still an error-prone task with many limitations to overcome. This paper describes the critical aspects of 3D content creation for 360-degree video. In particular, conflicts of depth cues and binocular rivalry are reviewed in detail, as these cause eye fatigue, headache, and even nausea. Both the reasons for the appearance of the conflicts and how to detect some of these conflicts by objective image analysis methods are detailed in this paper. The latter is the main contribution of this paper and part of the long-term research roadmap of the authors in order to provide a comprehensive framework for artifact detection and correction in 360-degree videos. Then, experimental results demonstrate the performance of the proposed approaches in terms of objective measures and visual feedback. Finally, the paper concludes with a discussion and future work.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Chen, Yang; Alain, Martin; Smolic, Aljosa Fast and Accurate Optical Flow based Depth Map Estimation from Light Fields Inproceedings In: Irish Machine Vision and Image Processing Conference (Received the Best Paper Award), 2017. @inproceedings{Yang2017,
title = {Fast and Accurate Optical Flow based Depth Map Estimation from Light Fields},
author = {Yang Chen and Martin Alain and Aljosa Smolic},
url = {https://v-sense.scss.tcd.ie:443/research/light-fields/depth-map-estimation-in-light-field-images/
https://v-sense.scss.tcd.ie:443/wp-content/uploads/2017/07/Fast-and-Accurate-Optical-Flow-based-Depth-Map-Estimation-from-Light-Fields-5.pdf},
doi = {https://doi.org/10.25546/95672},
year = {2017},
date = {2017-08-30},
booktitle = {Irish Machine Vision and Image Processing Conference (Received the Best Paper Award)},
abstract = {Depth map estimation is a crucial task in computer vision, and new approaches have recently emerged taking advantage of light fields, as this new imaging modality captures much more information about the angular direction of light rays compared to common approaches based on stereoscopic images or multi-view. In this paper, we propose a novel depth estimation method from light fields based on existing optical flow estimation methods. The optical flow estimator is applied on a sequence of images taken along an angular dimension of the light field, which produces several disparity map estimates. Considering both accuracy and efficiency, we choose the feature flow method as our optical flow estimator. Thanks to its spatio-temporal edge-aware filtering properties, the different disparity map estimates that we obtain are very consistent, which allows a fast and simple aggregation step to create a single disparity map, which can then be converted into a depth map. Since the disparity map estimates are consistent, we can also create a depth map from each disparity estimate, and then aggregate the different depth maps in the 3D space to create a single dense depth map.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Depth map estimation is a crucial task in computer vision, and new approaches have recently emerged taking advantage of light fields, as this new imaging modality captures much more information about the angular direction of light rays compared to common approaches based on stereoscopic images or multi-view. In this paper, we propose a novel depth estimation method from light fields based on existing optical flow estimation methods. The optical flow estimator is applied on a sequence of images taken along an angular dimension of the light field, which produces several disparity map estimates. Considering both accuracy and efficiency, we choose the feature flow method as our optical flow estimator. Thanks to its spatio-temporal edge-aware filtering properties, the different disparity map estimates that we obtain are very consistent, which allows a fast and simple aggregation step to create a single disparity map, which can then be converted into a depth map. Since the disparity map estimates are consistent, we can also create a depth map from each disparity estimate, and then aggregate the different depth maps in the 3D space to create a single dense depth map. |
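As a rough illustration of the pipeline this abstract describes (flow estimation between the central view and its neighbours along one angular dimension, per-view disparity estimates, a simple aggregation step, and disparity-to-depth conversion), here is a minimal sketch. The paper uses a feature-flow estimator; the Farneback estimator, the median aggregation and the focal-length/baseline placeholders below are stand-in assumptions, not the published implementation.

# Illustrative sketch of optical-flow-based disparity estimation from a light
# field: flow is computed between the central view and each neighbouring view
# along one angular dimension, normalised by the angular baseline, and
# aggregated with a median before conversion to depth.
import cv2
import numpy as np

def lightfield_disparity(views, center_idx):
    """views: list of grayscale images taken along one angular dimension."""
    center = views[center_idx]
    estimates = []
    for i, view in enumerate(views):
        if i == center_idx:
            continue
        flow = cv2.calcOpticalFlowFarneback(
            center, view, None,
            pyr_scale=0.5, levels=4, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        baseline_steps = i - center_idx   # angular distance in view units
        # Horizontal flow normalised to a per-unit-baseline disparity estimate.
        estimates.append(flow[..., 0] / baseline_steps)
    # Consistent estimates allow a simple aggregation step (median here).
    return np.median(np.stack(estimates), axis=0)

def disparity_to_depth(disparity, focal_px=1000.0, baseline_mm=1.0):
    """Standard pinhole relation; focal length and baseline are placeholders."""
    return focal_px * baseline_mm / np.maximum(np.abs(disparity), 1e-6)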
O'Dwyer, Néill Reconsidering movement: the performativity of digital drawing techniques in computational performance Journal Article In: Theatre and Performance Design, vol. 3, no. 1-2, pp. 68-83, 2017, ISSN: 2332-2551. @article{o2017reconsidering,
title = {Reconsidering movement: the performativity of digital drawing techniques in computational performance},
author = { Néill O'Dwyer},
editor = {Jane Collins and Arnold Aronson},
url = {http://www.tandfonline.com/doi/full/10.1080/23322551.2017.1320087},
doi = {10.1080/23322551.2017.1320087},
issn = {2332-2551},
year = {2017},
date = {2017-06-23},
journal = {Theatre and Performance Design},
volume = {3},
number = {1-2},
pages = {68-83},
abstract = {This article is concerned with investigating the aesthetic repercussions of the emergence of computer-vision techniques and their subsequent integration into drawing processes. The inception of this technology has opened possibilities for corporeal drawing techniques that engage the entire body – not just the hand and eye – and this technique is finding its perfect home in digitally engaged dance practice where it is now widely used in scenographic processes, including interactive digital projections and sonification. This article analyses an example of computer-vision-aided drawing, entitled as·phyx·i·a (2015), by a New York-based collective, in order to discuss profound sociocultural reconfigurations occasioned by the invention of a new technology. It will be demonstrated that digital media’s ability to mobilise choreography as a totalised, corporeal mode of drawing – for example in dance and movement practices – represents an avant-garde, experimental and cutting-edge terrain, in which the new technologies of inscription are gathered in innovative ways that fundamentally challenge dominant paradigms of representation. In this regard, I maintain that the innovation at the heart of such projects troubles dominant cultural programmes in a way that directs a questioning at the very heart of how we construct knowledge.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
This article is concerned with investigating the aesthetic repercussions of the emergence of computer-vision techniques and their subsequent integration into drawing processes. The inception of this technology has opened possibilities for corporeal drawing techniques that engage the entire body – not just the hand and eye – and this technique is finding its perfect home in digitally engaged dance practice where it is now widely used in scenographic processes, including interactive digital projections and sonification. This article analyses an example of computer-vision-aided drawing, entitled as·phyx·i·a (2015), by a New York-based collective, in order to discuss profound sociocultural reconfigurations occasioned by the invention of a new technology. It will be demonstrated that digital media’s ability to mobilise choreography as a totalised, corporeal mode of drawing – for example in dance and movement practices – represents an avant-garde, experimental and cutting-edge terrain, in which the new technologies of inscription are gathered in innovative ways that fundamentally challenge dominant paradigms of representation. In this regard, I maintain that the innovation at the heart of such projects troubles dominant cultural programmes in a way that directs a questioning at the very heart of how we construct knowledge. |
Abreu, Ana De; Ozcinar, Cagri; Smolic, Aljosa Look around you: saliency maps for omnidirectional images in VR applications Inproceedings In: 9th International Conference on Quality of Multimedia Experience (QoMEX), 2017. @inproceedings{AnaDeAbreuCagriOzcinar2017,
title = {Look around you: saliency maps for omnidirectional images in VR applications},
author = { Ana De Abreu and Cagri Ozcinar and Aljosa Smolic},
url = {https://www.researchgate.net/publication/317184829_Look_around_you_Saliency_maps_for_omnidirectional_images_in_VR_applications},
year = {2017},
date = {2017-05-31},
booktitle = {9th International Conference on Quality of Multimedia Experience (QoMEX)},
abstract = {Understanding visual attention has always been a topic of great interest in the graphics, image/video processing, robotics and human-computer interaction communities. By understanding salient image regions, compression, transmission and rendering algorithms can be optimized. This is particularly important in omnidirectional images (ODIs) viewed with a head-mounted display (HMD), where only a fraction of the captured scene, namely the viewport, is displayed at a time. In order to predict salient image regions, saliency maps are estimated either by using an eye tracker to collect eye fixations during subjective tests or by using computational models of visual attention. However, eye tracking developments for ODIs are still in the early stages and, although a large number of saliency models are available, no particular attention has been dedicated to ODIs. Therefore, in this paper, we consider the problem of estimating saliency maps for ODIs viewed with HMDs when the use of an eye tracker device is not possible. We collected viewport data of 32 participants for 21 ODIs and propose a method to transform the gathered data into saliency maps. The obtained saliency maps are compared in terms of the image exposition time used to display each ODI in the subjective tests. Then, motivated by the equator bias tendency in ODIs, we propose a post-processing method, namely FSM, to adapt current saliency models to ODI requirements. We show that the use of FSM on current models improves their performance by up to 20%. The developed database and testbed are publicly available with this paper.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Understanding visual attention has always been a topic of great interest in the graphics, image/video processing, robotics and human-computer interaction communities. By understanding salient image regions, compression, transmission and rendering algorithms can be optimized. This is particularly important in omnidirectional images (ODIs) viewed with a head-mounted display (HMD), where only a fraction of the captured scene, namely the viewport, is displayed at a time. In order to predict salient image regions, saliency maps are estimated either by using an eye tracker to collect eye fixations during subjective tests or by using computational models of visual attention. However, eye tracking developments for ODIs are still in the early stages and, although a large number of saliency models are available, no particular attention has been dedicated to ODIs. Therefore, in this paper, we consider the problem of estimating saliency maps for ODIs viewed with HMDs when the use of an eye tracker device is not possible. We collected viewport data of 32 participants for 21 ODIs and propose a method to transform the gathered data into saliency maps. The obtained saliency maps are compared in terms of the image exposition time used to display each ODI in the subjective tests. Then, motivated by the equator bias tendency in ODIs, we propose a post-processing method, namely FSM, to adapt current saliency models to ODI requirements. We show that the use of FSM on current models improves their performance by up to 20%. The developed database and testbed are publicly available with this paper. |
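The abstract describes FSM as a post-processing step that injects an equator bias into existing saliency models. The sketch below illustrates that general idea by weighting an equirectangular saliency map with a Gaussian latitude prior and renormalising it; the prior shape and its width are assumptions, not the published FSM definition.

# Illustrative equator-bias post-processing for an equirectangular saliency map,
# in the spirit of the FSM idea described in the abstract. The Gaussian latitude
# prior and its width are assumptions, not the published FSM formulation.
import numpy as np

def equator_biased_saliency(saliency, sigma_deg=25.0):
    """saliency: 2-D array in equirectangular layout (rows = latitude)."""
    h, _ = saliency.shape
    # Latitude of each row, from +90 deg (top) to -90 deg (bottom).
    latitudes = np.linspace(90.0, -90.0, h)
    prior = np.exp(-0.5 * (latitudes / sigma_deg) ** 2)   # peaks at the equator
    weighted = saliency * prior[:, None]
    total = weighted.sum()
    return weighted / total if total > 0 else weighted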
Forthcoming
|
Rana, Aakanksha; Singh, Praveer; Valenzise, Giuseppe; Dufaux, Frederic; Komodakis, Nikos; Smolic, Aljosa Deep Tone Mapping Operator for High Dynamic Range Images Journal Article Forthcoming In: IEEE Transactions on Image Processing, Forthcoming. @article{Forthcoming,
title = {Deep Tone Mapping Operator for High Dynamic Range Images},
author = {Aakanksha Rana and Praveer Singh and Giuseppe Valenzise and Frederic Dufaux and Nikos Komodakis and Aljosa Smolic},
journal = {IEEE Transactions on Image Processing},
abstract = {A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is quintessential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subjective-quality tone-mapped output. In this paper, we address this problem by proposing a fast, parameter-free and scene-adaptable deep tone mapping operator (DeepTMO) that yields a high-resolution, high-subjective-quality tone-mapped output. Based on a conditional generative adversarial network (cGAN), DeepTMO not only learns to adapt to vast scenic content (e.g., outdoor, indoor, human, structures, etc.) but also tackles HDR-related scene-specific challenges such as contrast and brightness, while preserving the fine-grained details. We explore four possible combinations of generator-discriminator architectural designs to specifically address some prominent issues in HDR-related deep-learning frameworks, such as blurring, tiling patterns and saturation artifacts. By exploring different influences of scales, loss functions and normalization layers under a cGAN setting, we conclude by adopting a multi-scale model for our task. To further leverage the large-scale availability of unlabeled HDR data, we train our network by generating targets using an objective HDR quality metric, namely the Tone Mapping Image Quality Index (TMQI). We demonstrate results both quantitatively and qualitatively, and show that our DeepTMO generates high-resolution, high-quality output images over a large spectrum of real-world scenes. Finally, we evaluate the perceived quality of our results by conducting a pair-wise subjective study, which confirms the versatility of our method.},
keywords = {},
pubstate = {forthcoming},
tppubtype = {article}
}
A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is quintessential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subjective-quality tone-mapped output. In this paper, we address this problem by proposing a fast, parameter-free and scene-adaptable deep tone mapping operator (DeepTMO) that yields a high-resolution, high-subjective-quality tone-mapped output. Based on a conditional generative adversarial network (cGAN), DeepTMO not only learns to adapt to vast scenic content (e.g., outdoor, indoor, human, structures, etc.) but also tackles HDR-related scene-specific challenges such as contrast and brightness, while preserving the fine-grained details. We explore four possible combinations of generator-discriminator architectural designs to specifically address some prominent issues in HDR-related deep-learning frameworks, such as blurring, tiling patterns and saturation artifacts. By exploring different influences of scales, loss functions and normalization layers under a cGAN setting, we conclude by adopting a multi-scale model for our task. To further leverage the large-scale availability of unlabeled HDR data, we train our network by generating targets using an objective HDR quality metric, namely the Tone Mapping Image Quality Index (TMQI). We demonstrate results both quantitatively and qualitatively, and show that our DeepTMO generates high-resolution, high-quality output images over a large spectrum of real-world scenes. Finally, we evaluate the perceived quality of our results by conducting a pair-wise subjective study, which confirms the versatility of our method. |