Deep-learning to Predict Perceived Differences to a Hidden Reference Image

Pacific Graphics 2019


Mojtaba Bemana1     Joachim Keinert 2     Karol Myszkowski1     Michel Bätz 2     Matthias Ziegler 2     Hans-Peter Seidel1     Tobias Ritschel3

1 MPI Informatik

2 Fraunhofer IIS

3 University College London

Abstract

Image metrics predict the perceived per-pixel difference between a reference image and its degraded (e.g., re-rendered) version. In several important applications, the reference image is not available, and image metrics cannot be applied. We devise a neural network architecture and training procedure that predicts the MSE, SSIM, or VGG16 image difference from the distorted image alone, without observing the reference. This is enabled by two insights: the first is to inject sufficiently many undistorted natural image patches, which are available in arbitrary amounts and are known to have no perceivable difference to themselves; this avoids false positives. The second is to balance the learning, carefully ensuring that all image errors are equally likely; this avoids false negatives. Surprisingly, we observe that the resulting no-reference metric can, subjectively, even perform better than the reference-based one, as it has to become robust against misalignments. We evaluate the effectiveness of our approach in an image-based rendering context, both quantitatively and qualitatively. Finally, we demonstrate two applications which reduce light field capture time and provide guidance for interactive depth adjustment.
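As a rough illustration of the training procedure described above, the following minimal PyTorch-style sketch trains a convolutional predictor on pairs of distorted images and their hidden references, while injecting clean patches with an all-zero target map into each batch. The names (predictor, distorted_loader, clean_loader) and the choice of per-pixel MSE as the target metric are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn.functional as F

def training_step(predictor, optimizer, distorted_loader, clean_loader):
    # 1) Supervised pairs: distorted image A with its (hidden) clean reference B.
    A, B = next(iter(distorted_loader))
    target = (A - B).pow(2).mean(dim=1, keepdim=True)   # per-pixel MSE map, the "GT response"

    # 2) Inject undistorted natural patches: their target response is exactly zero,
    #    which suppresses false positives on clean content.
    clean = next(iter(clean_loader))
    zero_target = torch.zeros(clean.shape[0], 1, *clean.shape[2:])

    inputs  = torch.cat([A, clean], dim=0)
    targets = torch.cat([target, zero_target], dim=0)

    # 3) The network never sees B; it predicts the metric map from the input alone.
    #    (In the paper, sampling is additionally balanced so that all error
    #    magnitudes are equally likely; omitted here for brevity.)
    pred = predictor(inputs)
    loss = F.l1_loss(pred, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()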

Materials

Interactive Comparisons


Distorted image A

Hidden reference image B

Distorted image A is the input to our method and to the reference metric; click the image to reveal the differences. Hidden reference image B is the clean version of A, without artifacts, and is not an input to our method.

Our response on A

GT response on A

Our response on B

Our response on A is our attempt to emulate the GT response on A (shown to its right); ideally, structure and magnitude match. The GT response on A is the original metric's response on the pair (A, B); we try to reproduce its structure without seeing B. Our response on B, the clean image, should report no artifacts, since there are none; ideally the output is blue, i.e., zero, without false positives.
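The three maps in this row relate as in the following sketch, where predictor and metric are assumed placeholder names for our network and a full-reference metric (e.g., per-pixel MSE or SSIM), not the paper's actual code.

import torch

def evaluate_row(predictor, metric, A, B):
    """Produce the three maps shown in this row (illustrative helper)."""
    with torch.no_grad():
        ours_A = predictor(A)   # our no-reference response, computed from A alone
        gt_A   = metric(A, B)   # the full-reference metric response on the pair (A, B)
        ours_B = predictor(B)   # our response on the clean reference B; should be ~zero
    return ours_A, gt_A, ours_B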

Our perceptualized response on A

Perceptualized metric response on A

GT user annotation on A

Our perceptualized response on A is the same as our response on A above, but after a linear fit to user responses; ideally it is close to the user annotations, while its similarity to the perceptualized metric response is less relevant. The perceptualized metric response on A is the same as the metric response on A above, again after a linear fit to user responses, and should also be close to the user annotations. The GT user annotation on A is collected from 10 subjects.
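The "linear fit to user responses" can be thought of as a simple least-squares re-mapping of the response, as in the NumPy sketch below. It assumes the response and the annotation are arrays of the same shape; the exact fitting protocol (per image, per metric, or per dataset) is not specified here.

import numpy as np

def perceptualize(response, annotation):
    """Linearly re-map a metric response to best match user annotations (least squares)."""
    a, b = np.polyfit(response.ravel(), annotation.ravel(), deg=1)  # annotation ~ a * response + b
    return a * response + b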

Our response on shift(B)

Metric response on shift(B), B

An image shift does not create errors, so a successful method should produce a null response on the clean, shifted image B; for our method, this is often the case. A classic metric comparing shift(B) against the unshifted B should likewise report no errors, but this is often not the case (a sketch covering both shift checks follows the next row).

Our response on shift(A)

Metric response on shift(A), A

In contrast, when an image with errors is shifted, the errors must not vanish but remain as true positives, so this map should look like a shifted copy of the non-shifted error map. Repeating this with a classic metric also produces a shifted response.
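A minimal sketch of both shift checks (this row and the one above), assuming predictor is our no-reference network and metric a classic full-reference metric operating on NumPy arrays; the shift amount, tolerance, and wrap-around border handling are arbitrary choices for illustration.

import numpy as np

def shift(img, dx=4, dy=0):
    """Shift an H x W (x C) image by whole pixels (wrapping at the border)."""
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

def shift_checks(predictor, metric, A, B, tol=1e-2):
    """Sanity checks for the two shift rows (illustrative only)."""
    # 1) A shifted clean image is still clean: our response on shift(B) should be ~0 ...
    clean_ok = np.mean(np.abs(predictor(shift(B)))) < tol
    # ... whereas a classic metric comparing shift(B) against B often reports spurious errors.
    classic_shift_B = metric(shift(B), B)

    # 2) Shifting a distorted image moves, but must not remove, its errors:
    #    our response on shift(A) should be a shifted copy of our response on A.
    shifted_ok = np.allclose(predictor(shift(A)), shift(predictor(A)), atol=tol)

    return clean_ok, shifted_ok, classic_shift_B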

Histogram of the metric response values.

Citation

Mojtaba Bemana, Joachim Keinert, Karol Myszkowski, Michel Bätz, Matthias Ziegler, Hans-Peter Seidel, Tobias Ritschel
Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image
Computer Graphics Forum (Proc. Pacific Graphics 2019)

@article{Bemana2019,
	author  = {Mojtaba Bemana and Joachim Keinert and Karol Myszkowski and Michel Bätz and
	           Matthias Ziegler and Hans-Peter Seidel and Tobias Ritschel},
	title   = {Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image},
	journal = {Computer Graphics Forum (Proc. Pacific Graphics)},
	year    = {2019},
	volume  = {38},
	number  = {7},
}

Acknowledgements

This work was partly supported by the Fraunhofer-Max Planck cooperation program within the framework of the German Pact for Research and Innovation (PFI) and by a Google AR/VR Research Award. We thank Fraunhofer IIS, Stanford, Google, and Technicolor for providing the light field datasets.


