In this project, I colorized very old black-and-white photographs. The colors came from a combination of three images, each taken with a different color filter, giving us the three RGB color channels needed for a color photo. The difficult part of this project was aligning these three images, as they were not taken from the exact same position, nor aligned perfectly when they were uploaded together. In this overview, I will walk you through the approach that I took to combine these images and the problems I encountered along the way.
The first approach that I took was the simplest one. First, I split the image into three equal parts. Then I performed an exhaustive search over a -15 to 15 pixel window and kept the best alignment of the two channels. To determine the best alignment, I implemented two different image metrics: sum of squared distances (SSD) and normalized cross correlation (NCC). The results from these metrics are in the results section below.
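The exhaustive search and the two metrics can be sketched as follows. This is a minimal sketch, not the exact implementation: the function names are my own, and I negate NCC so that both metrics can be minimized by the same loop.

```python
import numpy as np

def ssd(a, b):
    # Sum of squared distances: lower is better.
    return np.sum((a - b) ** 2)

def ncc(a, b):
    # Normalized cross correlation: higher is better, so it is negated
    # here so both metrics can be minimized the same way.
    a = a - a.mean()
    b = b - b.mean()
    return -np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def align(channel, reference, metric=ssd, window=15):
    # Exhaustively try every shift in [-window, window]^2, keep the best.
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = metric(shifted, reference)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

On a channel that is a pure translation of the reference, both metrics should recover the exact offset; they only start to disagree on real scans with borders and noise.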
Normalized cross correlation did not work very well, while sum of squared distances produced some good-looking results. Across many more images, however, it became clear that some of them need a much more rigorous metric to determine the best alignment.
Additionally, this implementation was very slow on the larger TIFF files, which sometimes also require larger search windows. To solve this problem, I implemented search with an image pyramid. This meant that I first searched a small window using a heavily scaled-down image (1/16 or 1/32; I didn't notice a big difference between the two). Then, at each step down the image pyramid, I searched a small area around the previously determined best alignment. This gave a very large speedup, making the conversion for most large images take about 7 seconds.
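The coarse-to-fine search can be sketched like this. It is a simplified sketch under my own assumptions: the downscaling is naive subsampling (a real implementation would blur first), and the per-level search is the same exhaustive SSD loop, inlined here so the block stands alone.

```python
import numpy as np

def exhaustive_search(channel, reference, center, window):
    # Exhaustive SSD search in a small window around a starting shift.
    best_shift, best_score = center, np.inf
    cy, cx = center
    for dy in range(cy - window, cy + window + 1):
        for dx in range(cx - window, cx + window + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downscale(img, factor):
    # Naive subsampling; blurring first would be more robust.
    return img[::factor, ::factor]

def pyramid_align(channel, reference, levels=5, window=4):
    shift = (0, 0)
    for level in range(levels - 1, -1, -1):
        factor = 2 ** level
        # A shift found at the coarser level doubles when moving one
        # level down the pyramid, so recenter the window there.
        center = (shift[0] * 2, shift[1] * 2)
        shift = exhaustive_search(downscale(channel, factor),
                                  downscale(reference, factor),
                                  center, window)
    return shift
```

The speedup comes from the window staying small at every level: instead of one huge search on the full-resolution image, we do a handful of tiny searches.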
The original problem that I encountered with aligning images was buggy code. Without much of a way to test, I struggled to figure out whether my alignment was bad because of a bug in my code or just because my metrics were not good enough to properly align images. After a lot of frustration, I finally did the wise thing and created a test image with known offsets that I could verify my code against.
Although there are some very promising results, as shown above, there are also some images that fail with the same metrics. The reason varies with each image, so I will talk about a couple of specific cases.
For this image, I believe that the streaks on the bottom throw off my alignment algorithm. The red streak is quite large and distinct, which causes the SSD to report a high error if this streak is not aligned. Clearly, we do not care whether it is aligned, but we are not the ones determining the offsets. To test this theory, I cropped the lady image by 50 pixels on each side for each channel and ran the alignment again. As you can see, this improved the alignment a lot, and the red light streak is much smaller.
For the other images that failed, I observed that they had the same strange borders or image degradations that we saw with the lady. Here is another example: the top channel has a very dark border that throws off the alignment of the train.
The following approaches were attempts at solving the issues described above.
My first attempt at fixing this was to naively weight the pixels so that the SSD would return a higher error when the center was misaligned, instead of biasing towards the borders. I did this by creating an array with very large values in the center of the image and low ones around the borders. This did not work well, as it operated under the assumption that the center is always the most important part of the image.
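A center-weighted SSD along these lines can be sketched as follows; the particular falloff (a separable linear ramp from 1 at the center to 0 at the edges) is my own choice, not necessarily the one used.

```python
import numpy as np

def center_weights(shape):
    # Large weights in the middle of the image, small near the borders.
    h, w = shape
    y = np.abs(np.linspace(-1, 1, h))[:, None]
    x = np.abs(np.linspace(-1, 1, w))[None, :]
    return (1 - y) * (1 - x)  # 1 at the center, 0 at the edges

def weighted_ssd(a, b, weights):
    # Misalignment near the center is penalized more than at the borders.
    return np.sum(weights * (a - b) ** 2)
```

The failure mode described above follows directly from the shape of `center_weights`: whenever the subject sits off-center, the weights emphasize the wrong region.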
My second attempt was to weight the pixels using the Canny edge detector, which causes edges in the image to be weighted much higher than everything else. This did not directly solve the border problem, but it did emphasize the important parts of the image, since edges were detected around people and other significant objects.
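The edge-weighted metric can be sketched like this. In practice the edge map would come from a Canny detector (e.g. `skimage.feature.canny`), but the sketch accepts any boolean edge map, and the specific weight values are my own assumption.

```python
import numpy as np

def edge_weights(edges, edge_weight=10.0, base_weight=1.0):
    # Pixels on detected edges count `edge_weight` times more in the SSD.
    return np.where(edges, edge_weight, base_weight)

def edge_weighted_ssd(a, b, edges):
    return np.sum(edge_weights(edges) * (a - b) ** 2)
```

Because edges trace people and objects rather than flat regions, this pulls the metric's attention toward content the viewer actually cares about.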
Instead of trying to improve the image metrics further, I opted to implement automatic border cropping in an attempt to clean up the images and hopefully get better alignments. I tried two different methods.
My first attempt used Canny edge detection. However, I found it difficult to determine the best way to use the information gleaned from this detector. It outputs an array the same size as the image, with a boolean value for each pixel: edge or no edge. This array was very noisy, and it was not clear whether a detection was a border (i.e., what we wanted to crop) or just an edge in the scene (i.e., very useful information). To circumvent this, I only searched over the first or last 1/8th of the image when looking for its border. Within this window, I took the column with the maximum number of edge detections and cropped the image at that column. This did not give good results; the crops often removed important parts of the image instead of the borders. This is why I instead used the edge detector for pixel weighting as described above, where it seemed much more useful.
To determine where the actual borders of the image were, I instead simplified my edge detection by taking the gradient only along the horizontal direction. This meant taking the value Image[j, k + 1] - Image[j, k - 1] for each pixel. This was very successful in detecting only the left and right borders of the image, and nothing else. Here is an example:
To determine where the edges were, I counted every gradient above 150 as an edge and everything else as not an edge. Then I summed each column and took the argmax over the left half and the argmax over the right half. Here is a plot of the column sums as a reference for the utility of this method.
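The gradient-and-argmax border finder can be sketched as follows. The threshold of 150 comes from the text and assumes 8-bit pixel values; the function name and the exact index bookkeeping are my own.

```python
import numpy as np

def find_vertical_borders(img, threshold=150):
    # Horizontal central difference: img[j, k+1] - img[j, k-1].
    grad = np.abs(img[:, 2:].astype(int) - img[:, :-2].astype(int))
    # Count strong gradients in each column...
    column_sums = (grad > threshold).sum(axis=0)
    mid = column_sums.shape[0] // 2
    # ...and take the strongest column in each half as the border.
    left = int(np.argmax(column_sums[:mid]))
    right = int(mid + np.argmax(column_sums[mid:]))
    # +1 converts from gradient coordinates back to image columns.
    return left + 1, right + 1
```

Only long vertical transitions accumulate large column sums, which is why diagonal and horizontal edges in the scene barely register in the plot.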
As you can see, it is very clear where the left and right borders are, and these calculations are not distracted by the other non-vertical edges in the scene. All of the images in the results section use this border removal algorithm, but here are before-and-after images to show the real effect of this method.
[before and after images]
The last thing that I tried was to detect the horizontal borders and determine a better way to split the image into three separate channels. However, this did not work at all. The edges that were detected were in the center of the image and rarely on the horizontal dividing line. Looking at more images, these separation points are often not very distinct; some of the borders are just light grey, which made them very difficult to detect with Canny edge detection or by taking a vertical gradient of the image. So I reverted to the original separation method.
All Example Images
[example images]
Here are the results of my algorithm on a few examples of my own choosing, downloaded from the Prokudin-Gorskii collection.
[additional example images]