Join our team – we are hiring! New Research Fellow in Creative Technologies!

The post-holder will research, develop, pilot and demonstrate a set of professional tools and techniques for making content ‘smarter’, so that it is fully adaptive in a broad, unprecedented manner: adaptive to context (facilitating re-use), to purpose (among or within industries), to the user (improving the viewing experience), and to the production environment (so that it is ‘future-proof’). The approach is based on research into computer animation; automated classification and tagging using deep learning and semantic labelling to describe and draw inferences; and the development of tools for automated asset transformation, smart animation, storage and retrieval. These new technologies and tools will show that vast reductions in cost and increases in efficiency are possible, facilitating the production of more content, of higher quality and creativity, to the benefit of the competitiveness of the European creative industries.

All important information is here!

User Interaction for Image Recolouring using L2


Recently, an example-based colour transfer approach proposed modelling the colour distributions of a palette image and a target image with Gaussian Mixture Models, registering them by minimising the robust L2 distance between the mixtures. In this work we propose to extend this approach to allow for user interaction. We present two interactive recolouring applications: the first allows the user to select colour correspondences between a target and palette image, while the second, palette-based application allows the user to edit a palette of colours to determine the image recolouring. We modify the L2-based cost function to improve results when an interactive interface is used, and take measures to ensure that good colour transfer results are produced even when the user provides minimal input. Both applications are available through a web interface and are qualitatively assessed against recent recolouring techniques.
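For readers unfamiliar with the underlying cost, the L2 distance between two Gaussian mixtures has a closed form via the standard Gaussian product integral. Below is a minimal Python sketch of that core computation; the function and variable names are ours, and the actual cost function in the paper adds further modifications to support interaction.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Closed form: integral of N(x; m1, S1) * N(x; m2, S2) dx = N(m1; m2, S1 + S2)
def gauss_overlap(m1, S1, m2, S2):
    return multivariate_normal.pdf(m1, mean=m2, cov=S1 + S2)

def gmm_l2_sq(w_f, mu_f, cov_f, w_g, mu_g, cov_g):
    """Squared L2 distance between GMMs f and g: integral of (f - g)^2 dx."""
    def cross(wa, ma, ca, wb, mb, cb):
        return sum(wa[i] * wb[j] * gauss_overlap(ma[i], ca[i], mb[j], cb[j])
                   for i in range(len(wa)) for j in range(len(wb)))
    return (cross(w_f, mu_f, cov_f, w_f, mu_f, cov_f)
            - 2.0 * cross(w_f, mu_f, cov_f, w_g, mu_g, cov_g)
            + cross(w_g, mu_g, cov_g, w_g, mu_g, cov_g))

# Example: two single-component GMMs in RGB space (means in [0, 1]^3)
w = [1.0]
mu1, mu2 = [np.array([0.2, 0.3, 0.4])], [np.array([0.25, 0.3, 0.4])]
cov = [np.eye(3) * 0.01]
print(gmm_l2_sq(w, mu1, cov, w, mu2, cov))
```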

The video below shows our interactive recolouring tools in action:

Video: User Interaction for Image Recolouring using L2

Links to both online demos coming soon!

References

[1] Mairéad Grogan, Rozenn Dahyot and Aljosa Smolic, “User Interaction for Image Recolouring using L2,” in Proceedings of the Conference on Visual Media Production (CVMP ’17), London, UK, December 2017.

Seminar presentation by Professor Julián Cabrera Quesada, Universidad Politécnica de Madrid

Title:

Stochastic Optimal Control of HTTP Adaptive Streaming


Abstract:

HTTP Adaptive Streaming (HAS) is becoming a key technology for audiovisual broadcasting over IP networks. It has been adopted and developed by major vendors such as Microsoft, Apple and Adobe, and the creation of an MPEG standard (MPEG-DASH) has also contributed to its success in multimedia broadcasting. Important IP content providers such as Netflix, Amazon and HBO use this technology for their video-on-demand services, and traditional IPTV providers such as Movistar TV are also moving to it for their live broadcasting services.

One of the key elements in HAS technology is the player at the client side, which has to make decisions in order to provide the best possible video quality. These decisions have to consider the dynamic network conditions, the device features, and the user's profile and preferences. In this talk, the behaviour of the player will be described and formulated as a Markov decision problem, and solutions based on stochastic dynamic programming and reinforcement learning will be presented.
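As a concrete (and deliberately simplified) illustration of such a formulation, the sketch below implements tabular Q-learning for quality selection. The state encoding, action set and reward terms are invented stand-ins for this example, not the formulation presented in the talk.

```python
import random
from collections import defaultdict

# Toy tabular Q-learning for HAS bitrate selection. States, actions and
# reward are simplified stand-ins: a state is a (buffer bucket, throughput
# bucket) pair, an action is a representation index.
QUALITIES = [0, 1, 2, 3]          # available representation indices
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = defaultdict(float)            # Q[(state, action)] -> value, defaults to 0

def choose(state):
    # epsilon-greedy policy over quality levels
    if random.random() < EPS:
        return random.choice(QUALITIES)
    return max(QUALITIES, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # standard one-step Q-learning backup
    best_next = max(Q[(next_state, a)] for a in QUALITIES)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def reward_fn(quality, rebuffer_s, prev_quality):
    # reward picked quality, penalise stalls and abrupt quality switches
    return quality - 4.0 * rebuffer_s - 0.5 * abs(quality - prev_quality)

# One interaction step with illustrative values
s, prev_q = (3, 2), 1
a = choose(s)
r = reward_fn(a, rebuffer_s=0.0, prev_quality=prev_q)
update(s, a, r, next_state=(2, 2))
```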

Short Bio:

V-SENSE is delighted to welcome Professor Julián Cabrera Quesada as Visiting Professor until July 2018. Professor Cabrera Quesada is Associate Professor at the Department of Signals, Systems and Radiocommunications of the Telecommunication School of the Universidad Politécnica de Madrid (UPM) and a researcher in the Image Processing Group (Grupo de Tratamiento de Imágenes). He lectures in Digital Image Processing, Transmission Systems, Digital Television, Video Coding, Audiovisual Communications, and Reinforcement Learning. He has participated in more than 25 research projects funded by European programmes, Spanish national programmes and private companies. His current research interests cover several topics related to audiovisual communications: advanced video coding for UHD, 3D and multiview scenarios; depth estimation and coding; subjective video quality assessment for multiview and VR360 video; and optimisation of adaptive streaming techniques. He is also working on the application of deep learning approaches to depth estimation and 3D reconstruction.

Event information:

12-1pm, Tuesday, 24th Oct 2017
Large Conference Room, O’Reilly Institute

A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency

Abstract:

In this paper, we present a novel framework for quality control in cinematic VR (360-degree video) based on Voronoi patches and saliency, which can be used in post-production workflows. Our approach first extracts patches from stereoscopic omnidirectional images (ODIs) using the spherical Voronoi diagram. The subdivision of the ODI into patches allows accurate detection and localization of regions with artifacts. Further, we introduce saliency in order to weight detected artifacts according to the visual attention of end-users. We then propose different artifact detection and analysis methods for sharpness mismatch detection (SMD), color mismatch detection (CMD) and disparity distribution analysis. In particular, we took two state-of-the-art approaches for SMD and CMD, originally developed for conventional planar images, and extended them to stereoscopic ODIs. Finally, we evaluated the performance of our framework on a dataset of 18 ODIs for which saliency maps were obtained from a subjective test with 17 participants.
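As a rough illustration of the patch extraction step, the Python sketch below partitions an equirectangular ODI into spherical Voronoi patches using SciPy. The number of patches and the Fibonacci-lattice seeding are our assumptions for the example, not necessarily the patch layout used in the paper.

```python
import numpy as np
from scipy.spatial import SphericalVoronoi

def fibonacci_sphere(n):
    # n roughly uniform unit vectors on the sphere (golden-angle spiral)
    i = np.arange(n)
    phi = np.arccos(1.0 - 2.0 * (i + 0.5) / n)      # polar angle
    theta = np.pi * (1.0 + 5.0**0.5) * i            # azimuth
    return np.stack([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)], axis=1)

def voronoi_patches(height, width, n_patches=32):
    """Label each pixel of an equirectangular ODI with its Voronoi patch."""
    centers = fibonacci_sphere(n_patches)
    sv = SphericalVoronoi(centers)   # sv.vertices / sv.regions give boundaries
    # Direction vector of every equirectangular pixel
    lon = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)
    dirs = np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1).reshape(-1, 3)
    # A pixel lies in the Voronoi cell of its nearest generator,
    # i.e. the one with maximum dot product
    labels = (dirs @ centers.T).argmax(axis=1).reshape(height, width)
    return labels, sv
```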

Results:

Sharpness Mismatch Detection

Color Mismatch Detection

Depth Analysis

Publication:

Simone Croci, Sebastian Knorr, Lutz Goldmann, Aljosa Smolic
A Framework for Quality Control in Cinematic VR Based on Voronoi Patches and Saliency
International Conference on 3D Immersion, Brussels, Belgium, Dec. 11-12, 2017

V-SENSE at the Intermedial Beckett Symposium, Saturday 14 October 2017, 11am – 7pm

Samuel Beckett changed the theatre forever by using the new media of his time. Since his death in 1989, the analogue stage and screen technologies of the 20th century have given way to various forms of digital telepresence, and experiments in translating Beckett across media abound. In partnership with the Dublin Theatre Festival and the Trinity Long Room Hub, The Trinity Centre for Beckett Studies will curate a day of presentations, conversations, and lectures by leading experts and artists to discuss the impact of intermedial performance, contemporary art, and Beckett’s legacy.

Schedule:
11:00-1:00: Beckett and Theories of Intermediality
Anna McMullan — “Samuel Beckett: Intermedial Legacies”
David Houston Jones — “Samuel Beckett: Face, Installation, Embodiment”
Panel Discussants: Matthew Causey, Derval Tubridy, Catherine Laws
Chair: Nicholas Johnson

1:00-2:00: Lunch (provided)
Virtual Play installation in the Hoey Ideas Space

2:00-3:30: Beckett in Virtual Reality
Panel discussion with the members of the V-SENSE research project
Aljosa Smolic, Néill O’Dwyer, Enda Bates, Nicholas Johnson

3:30-4:00: Coffee (provided)

4:00-6:00: Beckett and Practices of Intermediality
Derval Tubridy — “Intermediality, Agency, Diversity”
Catherine Laws — “Beckett, Music, and the Intermedial”
Discussants: Ciaran Clarke, Angela Butler, Anna McMullan
Chair: Julie Bates

6:00 PM: Launch of the Trinity Centre for Beckett Studies
Jane Ohlmeyer, director of the Trinity Long Room Hub
Sam Slote, director of the Trinity Centre for Beckett Studies

The symposium is hosted by Dr Nicholas Johnson, Prof David Houston Jones, Dr Catherine Laws, Prof Anna McMullan, Dr Sam Slote, and Dr Derval Tubridy. Kindly supported by interdisciplinary seed funding from the Trinity Long Room Hub, and in partnership with the Dublin Theatre Festival, the School of Creative Arts, and V-SENSE (SFI-funded project held by Prof. Aljosa Smolic).

This event is free but does require registration. You can register for this event here.

Campus Location: Trinity Long Room Hub
Accessibility: Yes
Room: Neill Lecture Theatre
Event Type: Alumni, Arts and Culture, Conferences, Lectures and Seminars, Public
Type of Event: One-time event
Audience: Undergrad, Postgrad, Alumni, Faculty & Staff, Public
Cost: Free (but registration is required)


Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering

In this paper, we propose to extend the state-of-the-art BM3D image denoising filter to light fields, and we denote our method LFBM5D. We take full advantage of the 4D nature of light fields by creating disparity-compensated 4D patches which are then stacked together with similar 4D patches along a 5th dimension. We then filter these 5D patches in the 5D transform domain, obtained by cascading a 2D spatial transform, a 2D angular transform, and a 1D transform applied along the similarities. Furthermore, we propose to use the shape-adaptive DCT as the 2D angular transform to be robust to occlusions.
Results show a significant improvement in synthetic noise removal compared to state-of-the-art methods, for light fields captured with either a lenslet camera or a gantry.
Experiments on Lytro Illum camera noise removal also demonstrate a clear improvement in light field quality.
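As an illustration of the transform-domain filtering at the heart of the method, here is a minimal Python sketch of 5D hard thresholding on an already-formed patch stack. A plain separable DCT stands in for the cascaded spatial/angular/similarity transforms (the paper uses a shape-adaptive DCT on the angular dimensions), and the patch grouping, aggregation and Wiener filtering steps are omitted.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Sketch of the 5D transform-domain hard-thresholding step of a BM3D-style
# light field filter. `stack` holds similar disparity-compensated 4D patches
# along the 5th axis: (patch_y, patch_x, ang_v, ang_u, similar).
def filter_5d_stack(stack, sigma, thr=2.7):
    coeffs = dctn(stack, norm="ortho")            # separable 5D transform
    coeffs[np.abs(coeffs) < thr * sigma] = 0.0    # hard thresholding
    return idctn(coeffs, norm="ortho")            # back to the pixel domain
```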

This paper was given a Top 10% Paper Award at the MMSP 2017 conference held in Luton, UK.

Implementation

The C/C++ source code will be available soon!

Additional results

Visual results complementing the paper are shown below.
Tables 1 and 2 then give the average results presented in the paper; the corresponding detailed results are shown in Tables 3 and 4.

Visual results

The videos below (click on a σ value to start a video) show side-by-side comparisons of noisy and denoised light fields for different noise levels (σ is the standard deviation of the white Gaussian noise). The sub-aperture image currently displayed is highlighted in the top-left corner of each video. Note that some videos may exhibit encoding artifacts.

Below, we show results for lenslet camera (Lytro) noise removal.

Average PSNR results

The ΔPSNR lines in Tables 1 and 2 correspond to the PSNR gap between the proposed approach and the best state-of-the-art method. Values highlighted in bold correspond to the best performing method for a given noise level.

Table 1 – Average denoising performance in PSNR (dB) for the EPFL dataset (Lytro Illum).

Method             σ=10     σ=20     σ=30     σ=40     σ=50
HF4D               31.070   25.798   22.607   20.338   18.586
BM3D               35.421   32.852   31.357   30.247   29.321
BM3D EPI           36.088   33.476   31.905   30.712   29.671
VBM4D              36.075   33.522   31.923   30.674   29.630
VBM4D EPI          36.129   33.510   31.925   30.719   29.721
LFBM5D 1st step    34.388   32.810   31.684   30.743   29.911
LFBM5D 2nd step    36.503   34.214   32.868   31.843   30.987
ΔPSNR              0.374    0.692    0.943    1.124    1.266

Table 2 – Average denoising performance in PSNR (dB) for the Stanford dataset (gantry).

Method             σ=10     σ=20     σ=30     σ=40     σ=50
HF4D               30.577   25.432   22.213   19.921   18.164
BM3D               38.805   35.268   33.126   31.560   30.267
BM3D EPI           38.895   35.645   33.489   31.896   30.594
VBM4D              39.269   35.588   33.212   31.436   30.019
VBM4D EPI          38.697   35.604   33.558   32.026   30.809
LFBM5D 1st step    39.340   35.817   33.377   31.496   29.971
LFBM5D 2nd step    40.389   37.772   36.031   34.671   33.511
ΔPSNR              1.120    2.127    2.473    2.645    2.701

Detailed PSNR results

Table 3 – Denoising performance in PSNR for each light field in the EPFL dataset (Lytro Illum). Columns, in order: Ankylosaurus & Diplodocus 1, Bikes, Color Chart 1, Danger de Mort, Desktop, Flowers, Fountain & Vincent 2, Friends 1, ISO Chart 12, Magnets 1, Stone Pillar Outside, Vespa.
HF4D
σ=10 30.881 30.940 31.038 31.460 31.160 31.302 30.791 30.987 30.901 30.943 30.973 31.463
σ=20 25.383 25.622 25.957 26.081 26.196 26.071 25.654 25.933 25.478 25.393 25.688 26.116
σ=30 22.047 22.459 22.775 22.953 23.072 22.965 22.478 22.832 22.139 22.006 22.556 22.999
σ=40 19.672 20.247 20.489 20.758 20.816 20.766 20.200 20.614 19.756 19.583 20.345 20.810
σ=50 17.846 18.553 18.715 19.069 19.063 19.064 18.426 18.890 17.925 17.721 18.634 19.121
BM3D
σ=10 36.094 34.760 35.711 35.039 35.706 34.250 34.308 35.518 34.948 36.044 33.928 38.750
σ=20 34.604 31.834 33.640 32.082 33.034 30.996 31.373 32.804 32.537 34.485 30.870 36.310
σ=30 33.661 30.175 32.426 30.368 31.528 29.182 29.734 31.231 31.027 33.405 29.319 34.226
σ=40 32.865 28.968 31.456 29.136 30.432 27.910 28.559 30.083 29.876 32.452 28.310 32.919
σ=50 32.140 27.939 30.606 28.140 29.531 26.893 27.574 29.135 28.912 31.564 27.565 31.849
BM3D EPI
σ=10 36.131 35.589 34.977 36.097 37.061 35.831 34.637 36.763 35.065 36.286 35.284 39.332
σ=20 34.754 32.635 32.726 33.077 33.894 32.534 31.818 34.141 32.440 34.881 32.280 36.532
σ=30 33.853 30.979 31.256 31.275 31.903 30.607 30.199 32.494 30.966 33.921 30.765 34.638
σ=40 32.951 29.783 30.126 29.839 30.781 29.188 29.032 31.192 29.963 32.937 29.658 33.094
σ=50 32.032 28.779 29.102 28.542 29.866 27.992 28.126 30.079 29.219 31.936 28.637 31.746
VBM4D
σ=10 35.966 35.628 35.401 36.008 36.464 35.967 34.742 36.631 35.326 36.101 35.467 39.195
σ=20 34.219 32.772 33.302 33.049 33.869 32.776 32.104 33.914 33.040 34.421 32.483 36.310
σ=30 33.036 31.065 32.014 31.266 32.242 30.840 30.557 32.241 31.549 33.190 30.714 34.358
σ=40 32.013 29.792 30.947 29.941 30.965 29.406 29.391 30.965 30.350 32.103 29.388 32.831
σ=50 31.074 28.762 29.999 28.868 29.891 28.282 28.430 29.900 29.316 31.117 28.327 31.592
VBM4D EPI
σ=10 36.374 35.692 35.515 36.059 36.467 35.648 34.638 36.685 35.334 36.326 35.537 39.271
σ=20 34.303 32.865 33.081 33.175 33.860 32.708 31.892 34.057 32.885 34.330 32.589 36.376
σ=30 33.025 31.185 31.618 31.446 32.317 30.966 30.300 32.445 31.391 33.058 30.918 34.433
σ=40 31.978 29.951 30.467 30.172 31.131 29.689 29.119 31.225 30.242 32.006 29.721 32.927
σ=50 31.045 28.960 29.502 29.145 30.129 28.678 28.161 30.203 29.280 31.081 28.767 31.697
LFBM5D 1st step
σ=10 35.269 33.123 33.287 34.717 34.300 34.016 32.885 34.820 33.792 35.167 34.173 37.103
σ=20 34.333 31.594 32.098 32.660 32.773 32.097 31.117 33.280 31.997 34.435 31.745 35.597
σ=30 33.690 30.415 31.343 31.207 31.652 30.567 29.958 32.071 30.791 33.772 30.367 34.370
σ=40 33.024 29.459 30.688 30.011 30.725 29.266 29.080 31.089 29.851 33.073 29.393 33.259
σ=50 32.334 28.616 30.037 28.993 29.881 28.221 28.366 30.166 29.055 32.290 28.577 32.396
LFBM5D 2nd step
σ=10 35.854 36.275 35.263 36.868 36.767 36.813 34.941 37.406 35.661 36.062 36.213 39.910
σ=20 34.791 33.577 33.612 33.954 34.538 33.690 32.334 34.958 33.423 35.200 32.942 37.549
σ=30 34.216 32.002 32.736 32.295 33.192 31.779 30.873 33.482 31.975 34.565 31.272 36.030
σ=40 33.667 30.851 32.028 31.037 32.156 30.352 29.872 32.382 30.910 33.938 30.143 34.773
σ=50 33.086 29.908 31.390 30.027 31.292 29.245 29.106 31.463 30.040 33.267 29.246 33.769
Table 4 – Denoising performance in PSNR for each light field in the Stanford dataset (gantry). Columns, in order: Amethyst, Bracelet, Chess, Eucalyptus Flowers, Jelly Beans, Lego Bulldozer, Lego Knights, Lego Truck, Tarot Cards and Crystal Ball (Large Angle), Tarot Cards and Crystal Ball (Small Angle), The Stanford Bunny, Treasure Chest.
HF4D
σ=10 31.374 30.464 31.362 29.466 31.330 29.850 30.352 31.315 29.080 31.050 31.361 29.913
σ=20 25.862 25.201 25.816 25.175 25.468 25.243 25.210 26.000 24.738 25.430 25.578 25.463
σ=30 22.543 21.864 22.576 22.154 22.035 22.181 21.966 22.813 21.660 22.088 22.189 22.492
σ=40 20.200 19.508 20.284 19.908 19.671 19.967 19.646 20.570 19.396 19.756 19.818 20.325
σ=50 18.406 17.735 18.525 18.143 17.901 18.259 17.867 18.845 17.650 17.989 18.017 18.633
BM3D
σ=10 37.549 38.438 40.829 35.177 44.865 37.429 40.135 38.394 38.154 37.914 39.814 36.962
σ=20 33.964 34.565 37.542 31.246 41.867 33.814 36.666 34.917 34.578 34.267 36.744 33.047
σ=30 31.925 32.290 35.398 29.093 39.696 31.680 34.489 32.796 32.445 32.099 34.788 30.816
σ=40 30.518 30.645 33.777 27.600 37.985 30.156 32.872 31.259 30.884 30.511 33.320 29.195
σ=50 29.422 29.271 32.412 26.354 36.548 28.934 31.566 30.037 29.586 29.175 32.137 27.764
BM3D EPI
σ=10 38.915 38.879 41.029 35.688 43.871 37.076 38.430 39.171 36.684 37.955 41.338 37.703
σ=20 36.234 35.472 38.871 31.760 41.685 33.122 34.473 36.180 32.283 34.970 39.329 33.358
σ=30 34.481 32.174 37.207 29.398 40.229 30.714 32.317 33.970 29.493 33.040 37.754 31.090
σ=40 31.200 31.430 35.894 27.748 39.080 29.031 31.015 32.307 27.596 31.557 36.484 29.412
σ=50 29.875 28.992 34.800 26.529 38.127 27.814 30.117 31.003 26.232 30.331 35.305 27.997
VBM4D
σ=10 38.561 38.997 41.004 36.531 43.941 38.334 40.253 39.164 37.906 38.433 40.225 37.882
σ=20 34.974 35.156 37.391 32.917 40.098 34.558 36.525 35.580 34.190 34.830 36.736 34.094
σ=30 32.657 32.748 34.921 30.679 37.411 32.142 34.066 33.211 31.893 32.551 34.476 31.792
σ=40 30.944 30.959 33.041 29.020 35.341 30.370 32.212 31.440 30.192 30.853 32.763 30.099
σ=50 29.592 29.526 31.535 27.704 33.667 28.980 30.729 30.035 28.834 29.494 31.384 28.755
VBM4D EPI
σ=10 38.417 38.139 41.133 34.468 44.278 36.959 39.320 38.809 36.960 37.853 40.927 37.097
σ=20 35.772 34.897 38.521 31.829 41.131 33.521 36.227 35.460 33.350 35.077 38.127 33.340
σ=30 34.016 32.810 36.608 29.999 38.950 31.355 34.077 33.236 31.105 33.233 36.245 31.061
σ=40 32.708 31.263 35.095 28.634 37.267 29.785 32.443 31.556 29.496 31.851 34.832 29.384
σ=50 31.668 30.033 33.864 27.561 35.901 28.556 31.175 30.202 28.241 30.749 33.706 28.054
LFBM5D 1st step
σ=10 38.847 38.907 40.797 36.710 43.926 38.630 40.351 39.385 37.043 38.432 40.826 38.229
σ=20 35.529 35.037 37.415 33.399 40.399 35.026 36.972 35.873 33.355 35.228 36.787 34.786
σ=30 33.190 32.515 35.162 31.306 37.294 32.624 34.781 33.479 30.940 32.856 33.911 32.469
σ=40 31.368 30.857 33.466 29.658 34.400 30.828 33.038 31.664 29.122 30.773 32.078 30.702
σ=50 29.862 29.587 32.046 28.174 32.257 29.321 31.588 30.176 27.749 29.021 30.743 29.129
LFBM5D 2nd step
σ=10 39.593 39.884 42.158 37.263 45.636 39.603 41.598 40.357 38.354 39.344 41.881 39.003
σ=20 37.020 36.988 40.045 34.332 43.826 36.762 38.936 37.852 35.371 36.673 39.383 36.071
σ=30 35.257 35.305 38.437 32.575 42.281 34.878 37.167 36.062 33.618 35.007 37.560 34.227
σ=40 33.874 34.077 37.056 31.286 41.016 33.438 35.761 34.592 32.301 33.708 36.090 32.858
σ=50 32.685 33.020 35.851 30.142 39.920 32.155 34.598 33.343 31.269 32.615 34.837 31.693


Related publications

2017

Alain, Martin; Smolic, Aljosa

Light Field Denoising by Sparse 5D Transform Domain Collaborative Filtering

IEEE International Workshop on Multimedia Signal Processing (MMSP 2017), 2017.


Virtual Play, after Samuel Beckett

Virtual Play is a reinterpretation of Samuel Beckett’s ground-breaking 1963 text, Play, with a view to engaging a 21st-century viewership that is increasingly accessing content via virtual reality technologies. It is the inaugural creative arts/cultural project by V-SENSE under its creative technologies remit. V-SENSE is a leading computer science research group at Trinity College Dublin championing, among other things, the development of 3D reconstruction techniques and of Virtual Reality (VR), Augmented Reality (AR) and Mixed Reality (MR) technologies for the creative cultural industries. The project was conceived to demonstrate how VR content can be produced both cheaply and expertly, and thereby challenges the notion that sophisticated VR content is exclusively the domain of wealthy institutes and production houses. The core technology enabling this novel type of creative production, i.e. VR/AR content creation based on 3D reconstruction techniques, has been developed by V-SENSE researchers David Monaghan, Jan Ondrej, Konstantinos Amplianitis and Rafael Pagés.

Under the guidance of resident V-SENSE digital media artist Néill O’Dwyer (Producer), this virtual reality response to Play attempts to push the limits of possibility in consumable video and film by eliciting the new power of digital interactive technologies, in order to respond to Samuel Beckett’s deep engagement with the stage technologies of his day.

A central goal of the project is to address an ongoing concern in the creative cultural sector: how to handle narrative progression in an interactive immersive environment. We believe that placing the viewer (audience) at the centre of the storytelling process assimilates them more fully into the virtual world and empowers them to explore, discover and decode the story, as opposed to passively watching and listening. The gaming sector has harnessed this effectively using procedural graphics and animation, but film and video have struggled to engage with the problem in terms of audio-visual capture techniques. As such, this project investigates these new narrative possibilities for interactive, immersive environments.

To investigate this problem, V-SENSE enlisted the expertise of Samuel Beckett scholar Nicholas Johnson (Director), who is Assistant Professor in Trinity College’s Department of Drama and secretary and co-director of the newly established Trinity Centre for Beckett Studies. Nick brings to the project a wealth of knowledge about the complexities, technicalities and nuances of staging Beckett productions. The project is complementary to his ongoing work with the Samuel Beckett Laboratory and the forthcoming research project Intermedial Beckett, and will feed into research questions in contemporary Beckett Studies and the methodologies of interdisciplinary practice-as-research. Three professional actors – Colm Gleeson, Caitlin Scott and Maeve O’Mahony – round out the Drama team. All three are trusted collaborators of Nicholas Johnson and have experience with Beckett texts in performance. A high degree of precision from the actors is crucial to the success of the project, because of the difficulty of post-producing video footage captured on multiple devices.

In terms of the mise en scène, the strategy consists of constructing a 3D reinterpretation of Beckett’s scene and characters, which he describes as ‘lost to age and aspect’ (Beckett, 1963), using bespoke 3D reconstruction techniques. The actors are recorded against a green screen using a multiple-camera setup. Their foreground masks are extracted from the background using novel segmentation algorithms, and these masks are combined to create a dynamic, photo-realistic 3D reconstruction of every actor in the scene. These reconstructions are then imported into a game engine and combined with virtual set elements to create the immersive VR/AR/MR experience. The game engine software is also used to implement the rules and conditions that define the user interaction and behaviour.

To enrich the immersive nature of the scene, V-SENSE has also drawn in Enda Bates (Sound Designer), a lecturer on the Music Media Technologies (MMT) masters programme in the Department of Electrical and Electronic Engineering. Enda’s work with the Spatial Audio Research Group in Trinity and the ongoing Trinity360 project concerns the use and production of spatial audio with 6 degrees of freedom (6DoF) for Virtual Reality, Augmented Reality and 360 video. He deploys Ambisonic audio and spatial audio SDKs for game engines to give the user a perception of depth, distance and audio directivity in the virtual world. How the audio can be implemented for volumetric video capture is the main focus of Enda’s contribution to the research project: can this be achieved by synthesising different directivity patterns, or is it also necessary to set up a comparable volumetric audio capture? The audio is also a central triggering device for drawing the user’s attention, in order to progress the narrative at certain spatio-temporal junctures.

The project represents an important milestone for the V-SENSE research project as a whole, because it is the inaugural artistic-cultural experiment under the creative technologies remit, as defined by Prof. Aljosa Smolic in his procurement of funding from Science Foundation Ireland (SFI). It represents a significant effort, not only within the research group, by drawing together discrete research areas within computer science, but also in the college as a whole, because it engenders interdisciplinary collaboration across the departments of Computer Science, Drama, and Electrical and Electronic Engineering.

2017

O’Dwyer, Néill; Johnson, Nicholas; Bates, Enda; Pagés, Rafael; Ondrej, Jan; Amplianitis, Konstantinos; Monaghan, David; Smolic, Aljosa

Virtual Play in Free-viewpoint Video: Reinterpreting Samuel Beckett for Virtual Reality

16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE Xplore digital library, Forthcoming.


Depth Map Estimation from Light Field Images

Depth map estimation is a crucial task in computer vision, and new approaches have recently emerged that take advantage of light fields, as this imaging modality captures much more information about the angular direction of light rays than common approaches based on stereoscopic or multi-view images.
We propose a novel depth estimation method from light fields based on existing optical flow estimation methods.
The optical flow estimator is applied on a sequence of images taken along an angular dimension of the light field, which produces several disparity map estimates.
Considering both accuracy and efficiency, we choose the feature flow method as our optical flow estimator.
Thanks to its spatio-temporal edge-aware filtering properties, the different disparity map estimates we obtain are very consistent, which allows a fast and simple aggregation step to create a single disparity map that can then be converted into a depth map.
Since the disparity map estimates are consistent, we can also create a depth map from each disparity estimate, and then aggregate the different depth maps in the 3D space to create a single dense depth map.
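The sketch below illustrates the overall pipeline in Python, with OpenCV's Farnebäck optical flow standing in for the feature flow estimator used in the paper, and a simple per-pixel median in place of the paper's aggregation step; the baseline and focal length are placeholder values.

```python
import numpy as np
import cv2

# `views` is a list of grayscale sub-aperture images taken along a single
# angular row of the light field. Flow between neighbouring views gives a
# disparity estimate per view pair.
def disparity_maps(views):
    maps = []
    for a, b in zip(views[:-1], views[1:]):
        flow = cv2.calcOpticalFlowFarneback(a, b, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        maps.append(flow[..., 0])   # horizontal component = disparity
    return maps

def aggregate(maps):
    # Simple aggregation: per-pixel median across the consistent estimates
    return np.median(np.stack(maps), axis=0)

def depth_from_disparity(disparity, baseline=1.0, focal=1000.0):
    # Standard pinhole relation: depth = baseline * focal / disparity
    return baseline * focal / np.maximum(np.abs(disparity), 1e-6)
```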

This paper received the Jonathan Campbell Best Paper Award at The Irish Machine Vision and Image Processing Conference, 2017.

Related publications

2017

Chen, Yang; Alain, Martin; Smolic, Aljosa

Fast and Accurate Optical Flow based Depth Map Estimation from Light Fields

Irish Machine Vision and Image Processing Conference (Received the Best Paper Award), 2017.


Video Coding and Streaming for Virtual Reality

With technical advances in virtual reality (VR) devices, the media technology field is evolving toward immersive VR video experiences. To this end, 360-degree video, which modern head-mounted displays can render at a sufficiently high frame rate and resolution, is attracting increasing attention in the context of VR video applications.

Given the significant growth of consumer demand for VR video, delivery of 360-degree video is one of the most important fields requiring cost-effective solutions to achieve widespread proliferation of VR technology. Considering its data-intensive representation and the best-effort nature of the Internet, delivery of 360-degree video requires advanced video coding and streaming techniques to offer an enhanced VR experience.

In the V-SENSE project, we are investigating novel compression and streaming techniques for 360-degree video.
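As a toy illustration of the idea behind viewport-aware tiled streaming (not the method of the publications below), the sketch greedily assigns higher bitrates to the tiles a user is most likely to view, under a total bandwidth budget. The tile weights, bitrate ladder and budget are invented for the example.

```python
# Greedy bitrate allocation for viewport-aware tiled 360-degree streaming:
# tiles likely to be viewed get upgraded first, within a bandwidth budget.
def allocate(tiles, ladder, budget):
    """tiles: {tile_id: viewport weight}; ladder: ascending bitrates (kbit/s)."""
    choice = {t: 0 for t in tiles}                 # start at the lowest rung
    spent = len(tiles) * ladder[0]
    while True:
        # Tiles that can still be upgraded without exceeding the budget
        upgradable = [(tiles[t], t) for t in tiles
                      if choice[t] + 1 < len(ladder)
                      and spent + ladder[choice[t] + 1] - ladder[choice[t]] <= budget]
        if not upgradable:
            return choice
        _, t = max(upgradable)                     # most-viewed tile first
        spent += ladder[choice[t] + 1] - ladder[choice[t]]
        choice[t] += 1

# Example: 4 tiles, weight = predicted probability of being in the viewport
tiles = {"front": 0.6, "left": 0.2, "right": 0.15, "back": 0.05}
print(allocate(tiles, ladder=[200, 500, 1000, 2000], budget=4000))
```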

2017

Ozcinar, Cagri; Abreu, Ana De; Knorr, Sebastian; Smolic, Aljosa

Estimation of optimal encoding ladders for tiled 360° VR video in adaptive streaming systems

The 19th IEEE International Symposium on Multimedia (ISM2017), Taichung, Taiwan, Forthcoming.


Ozcinar, Cagri; Abreu, Ana De; Smolic, Aljosa

Viewport-aware adaptive 360° video streaming using tiles for virtual reality

2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017.
