Introduction
A image montage is an image made up of a number of smaller photographs. Together, the smaller photographs look like the one single bigger photograph. An example of a image montage is this movie poster from "The Truman Show:"
The image montage zoomer will be a screensaver which will display an image montage on the screen. The screensaver will then zoom into that image montage "forever." More specifically, as the zoomer magnifies one part of the image montage, one of the smaller pictures will grow until it becomes visible that the smaller picture is another image montage, and so on.
The idea behind the image montage zoomer came from a discussion with Charless Fowlkes where he proposed a video version of the zoomer involving video clips from the point of view of someone running through a forest. The simpler image montage zoomer idea came from a later discussion. We will be working on this project jointly, and this page serves as a place for me to write down my thoughts in order to help me clarify them.
Issues
- One obvious plentiful and possibly interesting source of data for the image montage zoomer would be a library of personal digital photos. JPEG decompression takes about 1 second per 2 megapixel photo on my P-III 500 MHz computer. Assuming 1000 library photos used to create the montages, it would take more than 16 minutes to load all of the images from their original compressed state.
- Consider a representation of each image in our collection as a 16-bit 256 × 256 bitmap. Each such image would consume 128KB of memory. A collection of 1000 such images would consume 125MB. It is not reasonable to store all of the images in physical memory.
- As the window "zooms" forever into the first image, the corners of the rectangle represented by the edges of the screen will trace out a path through the original image. That path should be at least C1 continuous (continuous first derivative) to avoid the perception of abrupt changes in motion. It is possible that C2 (continuous second derivative) continuity will be needed. We will need to pick zoom targets in advance in order to calculate such C1 and/or C2 paths.
Image Montage Architecture
In this section, we describe the workings of the system.
General Algorithm and Definitions
(not written yet)
OpenGL Rendering [proposed]
OpenGL can handle transparency and scaling of images in a platform independent, hardware accelerated manner, which makes it an attractive option for rendering the montages. This section describes the use of the OpenGL library to implement the Image Montage Zoomer.
We create a scene containing the following coplanar elements.
|
![]() Our scene. NUMTILESX = 4, NUMTILESY = 4 |
We start initially with Rectangle #1 filling the screen. After some time, we have smoothly shrunk and moved the displayed area such that Rectangle #2 is filling the screen.
At this point, we do the following steps:
- Reset the projection back to its initial settings, such that Rectangle #1 is again filling the screen.
- Take the texture that was mapped onto Rectangle #2 and map it onto Rectangle #1. Do the same for each of the rectangles in Montage #2 and Montage #1.
- Pick a new image in Montage #1 to zoom at.
- Move Rectangle #2 and Montage #2 to juxtapose the new zoom target.
- Download
Issue: This last set of steps seems will be significantly more expensive than simply re-calculating the projection window. This may cause a momentary pause in the animation each time a new image reaches the point of filling the screen. It may be possible to smooth out these operations by adding in a few new textures and completely transparent rectangles in their new locations each animation frame, instead of waiting until we have zoomed all the way into Rectangle #2. Alternatively, the motion path of the window could be a function of wall clock time such that the lengthy calculation simply causes the motion of the screen to jump ahead rather than momentarily pause.
Image Cache
One obvious plentiful and possibly interesting source of data for the image montage zoomer would be a library of personal digital photos. Consider a library of 1000 2 megapixel photos, stored with "high quality" JPEG compression. On a P-III 500 MHz computer, decompressing one such library would take more than 15 minutes at 1 second per photo. A 15 minute startup time for a screensaver is clearly unacceptable, so it will not be possible to load all of the images at startup.
Furthermore, consider a representation of each of the images as a 16-bit 256 × 256 bitmap. Each such image would consume 128KB of memory. 1000 such images would consume 125MB. Storing all of the images in memory, even at this low resolution, would consume too much physical memory.
To rectify these issues, we introduce a three-tier image information cache. The tiers of the cache are as follows.
- Mosaic Summary
- Small Texture
- Full Texture
Other parts of the application can request either the Mosaic Summary, the Small Texture, or the Full Texture from the Image Cache. If said information is already available, the Image Cache simply returns it. However, if the information has not yet been calculated, the Image Cache will cacluate the information on demand. The calculated version is then stored internally (as descrbied below) to service future requests.
Tier 1: Mosaic Summary
This is a summary, stored in memory, of the color information in the image, used in the Montage Algorithm.
The Mosaic Summary is represented as a single triple {R, G, B}. R is calculated by averaging all of the red components in each of the pixels in the Small Texture. Similarly, G is the average green component, and B is the average blue component.
Tier 2: Small Texture
The Small Texture is the Full Texture image, scaled down to a width of TEXTURE_WIDTH / NUM_TILES_X and a height of TEXTURE_HEIGHT / NUM_TILES_Y. It is used for the images in the montage that will never be displayed at high resolution.
The Small Texture tier is stored in memory for quick access. If the maximum collection size that we wish to support makes this infeasible, it may be possible to move some or all of the Small Texture tier to the disk, in uncompressed format.
Tier 3: Full Texture
This is the original image scaled to the width of TEXTURE_WIDTH and the height of TEXTURE_HEIGHT, ignoring the original aspect ratio. This image is stored in compressed format on the disk in a special cache directory.
Motion Path Calculation
The motion path problem can be viewed as follows:
The screen saver is generating an infinite sequence of points in 2-D space corresponding to the corners of each of the images that it is zooming toward. A motion path will need to be calculated based on this sequence of points which:
- must pass through every point, in order
- will ideally have a continuous second derivative such that the motion does not change direction suddenly
- must be computable without the entire sequence of points. obviously the motion path will never be calculated if we must produce the entire infinite sequence first
- can be plotted for points whose distances apart along the motion path are a geometric series.
One option is to drop requirement 2 and use bezier curves constrained to be C1 continuous. They look simple and quick to calculate and C1 continuity might be good enough for our purposes.
Software Rendering
Not all computers will be equipped with an OpenGL compatible card. Thus we implement a all-software rendering routine in case it is not possible to run the program at full speed using OpenGL. This will allow anyone to get at least an idea of what the software does.We visualize our scene as containing the following elements:
|
![]() Our scene. NUMTILESX = 4, NUMTILESY = 4 |
Currently, the software renderer currently only renders Rectangle #1 and Montage #1. We calculate, for each frame, a "virtual" rectangle, within the coordinate space of Rectangle #1, which will represent the edges of the physical screen.
We then scan convert the entire screen into Rectangle #1 and Montage #1 coordinates. if the destination coordinate is changing by at most 1 for every change in screen coordinates, we use Bresenham's line drawing algorithm. However, if the destination coordinate changes by more than 1 for every change in screen coordinates, we use a modified version of Bresenham's Algorithm derived as follows.
Let Δx = x2 - x1
Let Δy = y2 - y1
Let m = Δy / Δx
e = 0
y = y1
for x = x1 to x2 do
plot (x, y)
if (e + frac(m) < 0.5)
e = e + frac(m)
y = y + int(m)
else
e = e + frac(m) - 1
y = y + int(m) + 1
fi
done
If Δy and Δx are integers, we know that:
= frac(Δy / Δx)
= (Δy % Δx)/(Δx)
and
= int(Δy / Δx)
=Δy / Δx
We convert the if guard to integer math by multiplying by 2 × Δx and making the replacements: e' = eΔx, r = (Δy % Δx)
e + (Δy % Δx)/(Δx) < 0.5
eΔx + (Δy % Δx) < 0.5 Δx
2 (eΔx + (Δy % Δx)) < Δx
2 (e' + (Δy % Δx)) < Δx
2 (e' + f) < Δx
We make the same replacements in the update calculations for e.
e Δx = e Δx + frac(m) Δx
e' = e' + frac(m) Δx
e' = e' + (Δy % Δx)/(Δx) Δx
e' = e' + (Δy % Δx)
e' = e' + f
e Δx = e Δx + frac(m) Δx - Δx
e' = e' + frac(m) Δx - Δ x
e' = e' + (Δy % Δx)/(Δx) Δx - Δx
e' = e' + (Δy % Δx) - Δx
e' = e' + f - Δx
Which gives us the modified Bresenham's algorithm as follows:
e' = 0
y = y1
for x = x1 to x2 do
plot (x, y)
e' = e' + f
y = y + dy/dx
if ((e' << 1) >= dx)
e = e - dx
y = y + 1
fi
done
Montage Algorithm
Start with a set of images S. Each image in the set has:- A name, which is unique to that image
- A width, a positive integer number
- A height, a positive integer number
- width × height pixel color values
We also define a "mosaic summary" and a "summary closeness" function. We may wish to explore different possible "mosaic summary" and "summary closeness" functions at a later time. For now, for the purposes of describing this algorithm, we pick mosaic summary and summary closeness as follows:
The "mosaic summary" is the average color of an image, calculated by taking the arithmetic mean of all of the color values in the image.
The "summary closeness" function is the perceptual distance between two "mosaic summary" colors.
The algorithm has three steps:
- Calculation of "mosaic summaries" for all of the images in S
- Calculation of sub-"mosaic summaries" for different sub-images in S
- Matching of sub-"mosaic summaries" to full image "mosaic summaries" using the "summary closeness" function
Calculation of "mosaic summaries"
Calculation of sub-"mosaic summaries"
Matching using "summary closeness"
References
- Enlightenment Project. Imlib2. http://www.enlightenment.org/pages/imlib2.html
- Flanagan, Colin. The Bresenham Line-Drawing Algorithm. http://www.cs.helsinki.fi/group/goa/mallinnus/lines/bresenh.html
- Gnome Project. GdkPixbuf. http://developer.gnome.org/arch/imaging/gdkpixbuf.html
- Google search on Natural Splines. http://www.google.com/search?q=natural+splines
- ImageMagick.org ImageMagick. http://www.imagemagick.org/
- Microsoft Corporation. Handling Screen Savers. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/shellcc/platform/shell/programmersguide/shell_adv/scrnsave.asp
- Perry, Matt. Fireflies Screensaver. http://somewhere.fscked.org/fireflies/
- Riemersma, Thiadmer. Colour metric. http://compuphase.com/cmetric.htm
- Silvers, Robert. http://photomosaic.com/
- Troebs, Michael. Algorithms to create Image Montages. May 30, 2000. http://www.stud.uni-hannover.de/~michaelt/juggle/Algorithms.pdf.
- Troebs, Michael. Juggle. http://www.stud.uni-hannover.de/~michaelt/juggle/
- Tugrul Galatali. Really Slick Screensavers Port to GLX. http://rss-glx.sourceforge.net/


