Understanding the Geometry of Photographic Images Using Deep Learning
18th October 2018
Proposed by Koustav Ghosal, Sebastian Lutz
Email: {ghosalk / lutzs} at scss.tcd.ie
The goal of this project is to explore neural network architectures suitable for understanding the geometric properties of a photographic image. Geometric properties here refer to the arrangement of subjects within the image: how and where the subjects are positioned, as opposed to appearance-based properties such as colour and texture.
Figure 1: Image courtesy of www.dpchallenge.com
While CNNs are very good at capturing the appearance and texture of natural images, their understanding of geometry is limited [2]. This is attributed to their translation-invariant filters [2]. In other words, to a CNN the right image in Figure 1(d) is a regular face, because the network fails to account for the positions of objects. Recently proposed architectures such as Capsule Networks [1] and CoordConv [3] address this problem.
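To make the coordinate idea behind [3] concrete, the sketch below builds two extra input channels holding each pixel's row and column position, scaled to [-1, 1]. A convolution applied to an image with these channels appended can, in principle, condition on where a feature occurs. This is a minimal pure-Python illustration rather than the implementation from [3]; the function name `coord_channels` and the handling of single-pixel axes are assumptions made for this example.

```python
def coord_channels(height, width):
    """Return (rows, cols): two height x width grids with values in [-1, 1].

    Hypothetical helper for illustration; not taken from the CoordConv code.
    """
    def scale(index, size):
        # Map indices 0..size-1 linearly onto [-1, 1].
        # Assumption: an axis with a single pixel maps to 0.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    # rows[i][j] depends only on the row index i.
    rows = [[scale(i, height) for _ in range(width)] for i in range(height)]
    # cols[i][j] depends only on the column index j.
    cols = [[scale(j, width) for j in range(width)] for _ in range(height)]
    return rows, cols


rows, cols = coord_channels(3, 5)
print(rows[0][0], cols[0][0])  # top-left pixel
print(rows[2][4], cols[2][4])  # bottom-right pixel
```

In a deep learning framework the two grids would be concatenated with the image along the channel axis before the first convolution.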
Geometry is nonetheless a crucial component of a photographic image, and several styles of photographic composition depend on it. For example, Frame within a Frame (Figure 1(a)), Spiral/Circular framing (Figure 1(b)) and the Rule of Thirds (Figure 1(c)) are strategies adopted by photographers that depend mainly on how the subjects are positioned in the photograph.
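As an illustration of how such a positional attribute might be quantified, the sketch below scores a subject's position against the Rule of Thirds by measuring the distance from the subject's centre to the nearest of the four intersections of the thirds lines. This is only a simple hand-crafted heuristic for intuition, not a method proposed in this project; the function name `thirds_distance` and the assumption that the subject centre is given in coordinates normalised to [0, 1] are choices made for this example.

```python
import math

# The four intersections of the lines that divide the frame into thirds,
# in normalised image coordinates (0 = left/top, 1 = right/bottom).
THIRDS_POINTS = [(x, y) for x in (1 / 3, 2 / 3) for y in (1 / 3, 2 / 3)]


def thirds_distance(cx, cy):
    """Distance from a subject centre (cx, cy) to the nearest thirds intersection.

    Hypothetical helper: smaller values mean the subject sits closer to a
    Rule of Thirds intersection.
    """
    return min(math.hypot(cx - px, cy - py) for px, py in THIRDS_POINTS)


centred = thirds_distance(0.5, 0.5)        # subject in the middle of the frame
on_point = thirds_distance(1 / 3, 2 / 3)   # subject exactly on an intersection
print(centred > on_point)
```

A learned model would have to infer the subject location from pixels before any such score could be applied, which is where position-aware architectures become relevant.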
The tasks in this project will include, but are not limited to:
- Review literature about geometric deep learning and geometry-modelling techniques for photographic images.
- Identify some common geometric attributes in photography and create a dataset.
- Propose a new approach or extend an existing one and evaluate the algorithm on the dataset.
References
1. Sabour, S., Frosst, N. and Hinton, G.E., 2017. Dynamic routing between capsules. In Advances in Neural Information Processing Systems (pp. 3856-3866).
2. Ghosal, K., Prasad, M. and Smolic, A., A Geometry-Sensitive Approach for Photographic Style Classification.
3. Liu, R., Lehman, J., Molino, P., Such, F.P., Frank, E., Sergeev, A. and Yosinski, J., 2018. An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution. arXiv preprint arXiv:1807.03247.