Image Blending and Compositing
The goal of this project is twofold: exploring computational photography in the frequency domain and in the gradient domain. In the frequency domain, I create hybrid images, implement Gaussian and Laplacian stacks, and blend images seamlessly using multiresolution blending; in the gradient domain, I blend objects from one image into another using Poisson blending.
Sharpening an image works by applying a Gaussian (low-pass) filter to the original image, subtracting the blurred result from the original to extract its high frequencies, then adding those high frequencies back, scaled by some fraction alpha. The result is a sharper image in which edges are more prominent.
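This unsharp-masking procedure fits in a few lines. A minimal sketch, assuming grayscale float images in [0, 255]; the function name and parameter defaults are my own, not from the project code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(image, sigma=2.0, alpha=0.5):
    """Unsharp masking: add back a scaled copy of the high frequencies."""
    low = gaussian_filter(image.astype(float), sigma)  # low-pass (blurred) copy
    high = image - low                                 # high frequencies = original - blur
    return np.clip(image + alpha * high, 0, 255)
```

Larger alpha exaggerates edges further, at the cost of amplifying noise.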
We can sharpen the blurry image of my childhood dog (left) with this technique (right). Brings back good memories.
Using an approach described in the SIGGRAPH 2006 paper by Oliva, Torralba, and Schyns, I create hybrid images, which are static images that change in interpretation as a function of the viewing distance. The basic idea is that high frequencies tend to dominate perception when available, but at a distance, only the low frequency (smooth) part of the signal can be seen.
Take a look at the above image of Trump, then move farther away from your screen, or look at the scaled down image on the right. Putin, in disguise.
By blending the high frequency portion of one image (Trump) with the low frequency portion of another (Putin), we get a hybrid image. We can take the Fourier transform and display a detailed plot of frequencies in each image.
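In code, a hybrid image is just a high-pass of one image added to a low-pass of the other, and the frequency plots come from a log-magnitude Fourier spectrum. A rough sketch (grayscale float images; the cutoff sigmas are placeholders that would need tuning per image pair):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_high, im_low, sigma_high=5.0, sigma_low=8.0):
    """High frequencies of im_high plus low frequencies of im_low."""
    high = im_high - gaussian_filter(im_high, sigma_high)  # high-pass
    low = gaussian_filter(im_low, sigma_low)               # low-pass
    return high + low

def log_spectrum(image):
    """Centered log-magnitude Fourier spectrum, for visualization."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(image))) + 1e-8)
```

Viewing `log_spectrum` of each input and of the hybrid produces frequency plots like the ones shown here.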
Shown above are the high frequencies of the Trump image (left), the low frequencies of the Putin image (middle), and the combined hybrid frequencies (right).
Changes of Shanghai over time. Before (left), and after (right).
Shown above are the high frequencies of the before image (left), the low frequencies of the after image (middle), and the combined hybrid frequencies (right).
For another visualization, I include the high frequency (left) and low frequency (right) filter applied to each image above.
Sometimes, the frequencies in the two images result in a hybrid image failure. I tried combining high-resolution NASA photographs of the Earth and Moon, but the high frequencies of the Moon could not dominate the rich detail and contrast of our pale blue dot.
Source images shown above.
Combined hybrid image (above). High frequencies from the Moon are barely noticeable.
Shown above are the high frequency plot of the Moon image (left) and the low frequency plot of the Earth image (right). As we can see, the Fourier transform of the Moon produced a radial blur of frequencies, rather than the concentrated horizontal and vertical lines seen in the previous hybrid images.
Fourier transforms allow us to see the rich frequencies in each image. While the hybrid image technique of layering high and low frequencies provides a functional way of creating images that change with viewing distance, Gaussian filters are far from an ideal low-pass filter. It remains necessary to find compatible image pairs, such as the politicians, where size/shape, contrast, and alignment yield a high quality hybrid image. Additional preprocessing techniques could be added, but the simplicity of the hybrid approach is compelling, especially considering the results.
In this part, I construct Gaussian and Laplacian stacks, which are a way of visualizing different frequency bands in an image. A Gaussian stack is generated by repeatedly applying a low-pass filter to the image, so each successive level retains lower and lower frequencies. Conversely, a Laplacian stack stores frequency bands from high to low: each level is the difference between the corresponding Gaussian level and the next, blurrier one, and the final level is the blurriest Gaussian residual.
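The construction can be sketched as follows (grayscale float images; unlike a pyramid, no downsampling happens between levels, and the helper names and sigma are my own):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(image, levels=5, sigma=2.0):
    """Repeatedly blur (no downsampling): each level keeps lower frequencies."""
    stack = [image.astype(float)]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels=5, sigma=2.0):
    """Differences of adjacent Gaussian levels; the last level is the residual."""
    g = gaussian_stack(image, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
```

A useful sanity check: summing all levels of the Laplacian stack telescopes back to the original image.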
By visualizing the Gaussian and Laplacian stacks, we can analyze the structure contained in each layer. Shown below are Gaussian stacks (top) and Laplacian stacks (bottom) for the Lincoln and Gala, Mona Lisa, and Starry Night.
I also calculate the Gaussian and Laplacian stacks for an image of a galaxy.
Now we can get on to the exciting business of blending two images together! Using a multiresolution blending technique described in the 1983 paper by Burt and Adelson, we compute a gentle seam (image spline) between two images separately at each band of the image frequencies, resulting in a much smoother seam.
At a high level: read in two images and a mask; construct a Gaussian stack of the mask, which provides a soft transition between the images; build a Laplacian stack of each input image; then, at every level, combine the two Laplacian layers weighted by the corresponding mask layer from the Gaussian stack, and sum the results.
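These steps can be sketched roughly as below (grayscale float images, mask values in [0, 1]; the function names and sigma are my own choices, not the project's exact code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(image, levels, sigma):
    stack = [image.astype(float)]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(image, levels, sigma):
    g = gaussian_stack(image, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

def multires_blend(im1, im2, mask, levels=5, sigma=2.0):
    """Blend each frequency band with a progressively softer mask, then sum."""
    l1 = laplacian_stack(im1, levels, sigma)
    l2 = laplacian_stack(im2, levels, sigma)
    gm = gaussian_stack(mask, levels, sigma)   # softer transition at each level
    return sum(gm[i] * l1[i] + (1 - gm[i]) * l2[i] for i in range(levels))
```

Because the mask gets blurrier at deeper levels, low frequencies blend over a wide seam while high frequencies blend over a narrow one, which is what hides the transition.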
Apple + Orange = Oraple
Sun + Earth
Adding a gradient to the mask yields a scorched Earth effect, pictured below. I've also included images of the gradient mask used in the Oraple and Scorched Earth images, and the black and white mask used in the Sun + Earth image above.
Scorched Earth, Mask
She had a galaxy in her eyes, a universe in her mind
We can also use an irregular mask shown above (right) to create an interesting multiresolution blend.
Terrifying. I've also included the Gaussian and Laplacian stacks below.
Here I input the images in the wrong order, but hey, it's cool.
In this part of the project, we take advantage of the fact that the human eye is perceptually tuned to gradients, or changes, rather than absolute values. Using gradient domain fusion, objects or textures are blended from a source image into a target image.
First, we check that theory aligns with practice by reconstructing an image given only the gradients of the original image and a single pixel value from it. We denote the intensity of the source image at (x,y) as s(x,y) and the values of the image to solve for as v(x,y). For each pixel, then, we have three objectives:
minimize ( v(x+1,y)-v(x,y) - (s(x+1,y)-s(x,y)) )^2, so that the x-gradients of v closely match the x-gradients of s.
minimize ( v(x,y+1)-v(x,y) - (s(x,y+1)-s(x,y)) )^2, so that the y-gradients of v closely match the y-gradients of s.
minimize ( v(1,1)-s(1,1) )^2, so that the top-left corners of the two images share the same color.
This formulates the problem as a system of linear equations Ax=b, where x holds the vectorized pixels, b holds the corresponding source gradients (plus the known corner value), and A is a sparse matrix whose rows encode the x- and y-gradient constraints for each pixel, with a final row pinning the top-left corner. Solving with least squares reconstructs the original image.
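A sketch of this formulation (grayscale images; the function name and the explicit loops are mine, and a real implementation would vectorize the matrix construction for speed):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def reconstruct(s):
    """Recover an image from its x/y gradients plus one known pixel value."""
    h, w = s.shape
    idx = lambda y, x: y * w + x       # map pixel (y, x) to unknown column
    rows, b = [], []
    for y in range(h):                 # x-gradient constraints
        for x in range(w - 1):
            rows.append([(idx(y, x + 1), 1), (idx(y, x), -1)])
            b.append(s[y, x + 1] - s[y, x])
    for y in range(h - 1):             # y-gradient constraints
        for x in range(w):
            rows.append([(idx(y + 1, x), 1), (idx(y, x), -1)])
            b.append(s[y + 1, x] - s[y, x])
    rows.append([(idx(0, 0), 1)])      # pin the top-left corner
    b.append(s[0, 0])
    A = lil_matrix((len(rows), h * w))
    for r, coeffs in enumerate(rows):
        for c, coef in coeffs:
            A[r, c] = coef
    v = lsqr(A.tocsr(), np.array(b, dtype=float))[0]
    return v.reshape(h, w)
```

Since the system is consistent, the least squares residual is essentially zero and the reconstruction matches the original up to floating-point error.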
Pictured above is the original image (left) and reconstructed image (right). Pretty cool! The calculated error is small:
With the gradients of an image, we can use Poisson blending to find values for target pixels that maximally preserve the gradient of a source region without changing the background.
Given the pixel intensities of the source image s and the target image t, we solve for new intensity values v within the source region S, specified by a mask. This can be formulated as a least squares problem:

minimize the sum over i in S, j in N(i) inside S of ( (v(i)-v(j)) - (s(i)-s(j)) )^2, plus the sum over i in S, j in N(i) outside S of ( (v(i)-t(j)) - (s(i)-s(j)) )^2

Here each i is a pixel in S, and each j is a neighbor of i (one of the four surrounding pixels); when j falls outside S, its value is fixed to the target intensity t(j). We want to preserve the gradients of the source image while seamlessly blending into the target image.
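The objective can be assembled pixel by pixel into a sparse least squares system. A minimal illustration of the formulation, not the exact project code (grayscale float images, boolean mask; naming is my own):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def poisson_blend(s, t, mask):
    """Solve for v inside the mask: gradients match s, border blends into t."""
    h, w = s.shape
    coords = [(y, x) for y in range(h) for x in range(w) if mask[y, x]]
    index = {p: k for k, p in enumerate(coords)}   # pixel -> unknown column
    rows, b = [], []
    for (y, x) in coords:
        for (ny, nx) in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            grad = s[y, x] - s[ny, nx]             # source gradient to preserve
            if (ny, nx) in index:                  # neighbor is also an unknown
                rows.append([(index[(y, x)], 1), (index[(ny, nx)], -1)])
                b.append(grad)
            else:                                  # neighbor is fixed target background
                rows.append([(index[(y, x)], 1)])
                b.append(grad + t[ny, nx])
    A = lil_matrix((len(rows), len(coords)))
    for r, coeffs in enumerate(rows):
        for c, coef in coeffs:
            A[r, c] = coef
    v = lsqr(A.tocsr(), np.array(b, dtype=float))[0]
    out = t.astype(float)
    for (y, x), k in index.items():
        out[y, x] = v[k]
    return out
```

Only pixels inside the mask are unknowns, which keeps the system small; the boundary rows are what pull the solved region toward the target's colors.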
Below I include the source and target image, the blended image with the source pixels directly copied into the target region, and the final blended result.
Source image of a plane (left), target image of the Bay Area from a hike I did around Mt. Tam (right).
Naive copy and paste (pictured above).
Poisson blend. Now we have a plane doing an emergency landing in a less than ideal location.
Below are two more results for Poisson blending, including one that doesn't work so well (failure example).
Pictured above: a photo I took in Venice (left) and an orca (right).
An orca reaching up to kiss a Venetian in a gondola.
The problem above is that the gradient of the source image is rough (there is a lot of variation), while the gradient of the target image is smooth. The resulting blend tries to spread gradients across the two images, which generates noticeable artifacts. The scale is also definitely off, but it does look like a giant parrot barely hanging on to the side of a cliff, so that's cool.
Below I compare Laplacian pyramid blending from Part 1 with the Poisson blending techniques. Shown below is multiresolution blending (left) and Poisson blending (right).
Multiresolution blending is preferable if we are trying to preserve the source's colors and pixel intensities; Poisson blending, albeit slower, produces a better blend, with fewer noticeable intensity changes. Here the image on the right is preferable because it also preserves the pupil and its color.
This project was a lot of fun! Interacting with these visualizations made the code much more engaging to work with. While editing the images was inherently time consuming, generating cool optical illusions made up for the painstaking bugs and misalignments.
- CS 194-26 Course Staff at UC Berkeley
- Professor Alexei (Alyosha) Efros