Large Format Digital

H. G. Dietz

Department of Electrical and Computer Engineering
Center for Visualization & Virtual Environments
University of Kentucky, Lexington, KY 40506-0046

Original August 22, 2012, Latest Update May 21, 2013

This document should be cited using something like the bibtex entry:

author={Henry Gordon Dietz},
title={{Large Format Digital}},
institution={University of Kentucky},
howpublished={Aggregate.Org online technical report},

In photography, large format has most often meant recording an image that is about 4 by 5 inches -- the 4x5 format. There are three major types of large-format cameras commonly available: monorail, field camera, and press camera (see 4x5 Cameras: a round-up for a summary of what's available). Monorail "view" cameras are typically designed to be very solid, very adjustable, and not very portable; they are great for making the highest quality images in controlled conditions, using ground glass to focus before inserting film. Field "flatbed" cameras mount the bellows on a fold-out flat bed, making adjustments a little harder, but allowing the camera to be folded into a compact form... although you'll still be carrying a tripod to use one. Press cameras are essentially field cameras with optical viewfinders, rangefinders, and flash attachments so you can take a photo fairly quickly without a tripod... although from seeing them in old movies you'd think they can't be used without a cigar in your mouth. There are also copy cameras, pinhole cameras, and other special-purpose cameras using standard large-format film.

However, this paper is mostly about the film -- and what it can be replaced with. The idea of replacing it is very old. Long ago, 4x5 cameras often were used with glass plates. Holders for 4x5 "cut film" have been standardized for a lot longer than I've been alive. My dad had a Polaroid film holder for his 4x5 so he could get both a negative and an instant print for checking the image. There were also various types of roll-film backs... and in the 1970s, I made an alternative back using parts from a Minolta MC extension tube to mount my 135-format camera (a Minolta SRT101) on a Burke & James 4x5. Well, about 8 years ago, I made a digital film holder with USB output using the sensor board from a $14 webcam. That's what you see in the photo above and is described further in Big Old Camera, tiny New Sensor. In 2012, I published an instructable on making a Large Format Adapter For Your Mirrorless Camera.

I know -- turning a grand old 4x5 into a tilt/shift bellows for a mirrorless body, or worse still into a small-sensor webcam, seems rather demented. It isn't very practical. Combining the B&J's Kodak Ektar 127mm f/4.7 lens with that tiny sensor gives something that resembles a powerful telescope more than a webcam. Image quality isn't exactly great either. However, it's not a complete blur... which means there's a lot of resolution that could be captured with a sensor that actually is 4x5.

In fairness, I should admit that 4x5 does not really make 4"x5" images, but slightly smaller. We'll ignore that difference here because it is largely a matter of precise dimensions of the film holder... and we'll be replacing that with a digital sensor anyway. Perhaps more interesting is the fact that my Burke & James 4x5, including lens, is much lighter and occupies a smaller total volume than the average "full frame" DSLR with one of the high-end zoom lenses that folks are usually using on them. The fact is that getting large-format coverage out of a lens tends to mean the lens is relatively thin (to avoid internal vignetting), so large format cameras often have comparatively tiny and very lightweight lenses. The point is that a large format camera and sensor will not fit in your pocket, but it can be less to carry than a high-end DSLR kit.

Of Diffraction And Airy Disks...

We all know that photons behave like waves, which means that no matter how wonderful a camera is, the wave nature of light will limit resolution by diffraction. An excellent discussion of diffraction appears in Melles Griot Technical Guide, Vol 2, Issue 2. Fraunhofer diffraction of a uniformly-illuminated circular aperture yields a central bright spot (the Airy disk) with a diameter of 2.44*wavelength*aperture, where aperture is the f/number of the lens. The diameter of the aperture and lens focal length do not play any independent role.

There is a little complication, however, in that the image is not being formed by a single wavelength. Most sensor technologies used in digital cameras respond to any of a wide range of photon wavelengths, and these response curves are manipulated by imposing a Bayer pattern of red, green, and blue filters over pixels. The filters actually pass disturbingly wide and overlapping portions of the spectrum, as can be seen in the Adjusted RGB Curves of various cameras. There is significant response from about 375nm to beyond 950nm. The blue-filtered sensels have peak sensitivity around 450nm with a second, smaller, peak in the NIR around 830nm. Green-filtered sensels peak around 530nm and have relatively low NIR sensitivity -- this is close to what one would expect, as 550nm is often quoted as the reference green wavelength. Red-filtered sensels peak around 600nm, and slowly decay at longer wavelengths.

Plugging-in the relevant wavelengths, we get the following table of spot sizes in um:

Wavelength              f/1  f/1.4    f/2  f/2.8    f/4  f/5.6    f/8   f/11   f/16   f/22   f/32   f/45
375nm (UV limit)        0.9    1.3    1.8    2.6    3.7    5.2    7.3   10.3   14.6   20.7   29.2   41.3
450nm (Blue)            1.1    1.6    2.2    3.1    4.4    6.2    8.8   12.4   17.5   24.8   35.1   49.6
530nm (Green)           1.3    1.8    2.6    3.7    5.2    7.3   10.3   14.6   20.7   29.2   41.3   58.4
600nm (Red)             1.5    2.1    2.9    4.1    5.9    8.3   11.7   16.5   23.4   33.1   46.8   66.1
950nm (NIR limit)       2.3    3.3    4.6    6.6    9.3   13.1   18.5   26.2   37.0   52.4   74.1  104.7

This table is somewhat disconcerting in that it suggests diffraction is limiting resolution on many digital cameras. For example, the Sony NEX-7's 24MP sensor has sensels that are 3.9um across -- which the table suggests will certainly be showing diffraction effects by f/5.6. In fact, for NIR even f/2 would be showing diffraction effects.
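As a rough cross-check of the table, the spot-size formula can be evaluated directly. The following is a minimal Python sketch, not the method used to generate the table. The function names are made up for illustration, and nominal f/numbers are used, so results may differ from the table in the last digit.

```python
def airy_spot_um(wavelength_nm, f_number):
    """Diameter of the central diffraction spot in um: 2.44 * wavelength * f/number."""
    return 2.44 * (wavelength_nm / 1000.0) * f_number


def f_number_where_spot_fills_sensel(sensel_um, wavelength_nm):
    """f/number at which the spot diameter equals the sensel width."""
    return sensel_um / (2.44 * (wavelength_nm / 1000.0))


if __name__ == "__main__":
    # Assumed example: a 3.9um sensel (the NEX-7 figure quoted above) in green light.
    print(round(airy_spot_um(530, 5.6), 1))                      # 7.2
    print(round(f_number_where_spot_fills_sensel(3.9, 530), 1))  # 3.0
```

Under these assumptions, the green-light spot already matches a 3.9um sensel near f/3, which is consistent with diffraction effects being clearly present by f/5.6.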

However, the table isn't quite as tough a limit as one might expect. Resolution involves distinguishing nearby points from each other. Although it is somewhat arbitrary, the Rayleigh Criterion is that two adjacent spots are just resolved when the peak of one falls on the first dark ring of the other -- that is, when their centers are separated by half the spot diameter given above. Of course, there is no reason to assume these spots would be aligned with the sensel grid, so the need for Nyquist-rate sampling still applies... essentially doubling the acceptable spot size relative to the sensel size. Then there is also the spatial patterning of the Bayer filter....

It also is far from clear that classical wave theory -- the theory behind the above table -- offers a valid model of the ultimate limit on resolution. In the manufacture of circuits in modern VLSI chips, a variety of optical processes are used to create features with sizes much smaller than the wavelength of the photons used. In 2011, Intel was producing 22nm features in their chips. In 2013, they'll have 14nm features. An Intel "R&D pipeline" slide projects details as fine as 5nm within the next few years. There is also a not-yet-fully-understood phenomenon called Extraordinary Optical Transmission (EOT) in which it has been observed that a regularly-spaced set of subwavelength apertures in a metallic film can allow much more light to be transmitted than classical theory predicts... which may be relevant when considering the regular array of sensel apertures.

In summary, it seems theoretically possible that there's plenty of resolution, but what is there really?

How Much Resolution Is There?

Well, it's pretty common to see folks on the internet saying that shooting 4x5 film and scanning it is a cost-effective way to get quality comparable to that of the best, most expensive, medium-format digital backs. For example, one test showed that a 39MP medium-format digital back made images not quite as good as drum-scanned 4x5 film. However, that's not really the right comparison, because we're not talking about using film for the 4x5....

If film isn't the limiting factor, most likely lens resolution is. So, how good is resolution of a typical 4x5 lens? One list of large-format lens resolutions is here. Here is a fairly large table of 4x5 lenses with center, middle, and edge line pair per mm resolutions at various apertures. Although most lenses are fairly good independent of age, lens scores did range from 10 to 85. The average performance was 54.3 center, 51.2 middle, and 39.3 edge. Let's call that 50. At the apertures typically used on 4x5 cameras, that is very close to the theoretical diffraction limit.

If the pattern is perfectly aligned, resolving a line pair would take two pixels (one dark, one light). However, there is no reason to assume such an alignment exists, in which case Nyquist turns this into four pixels. Thus, we want each pixel to be 1/200mm wide -- or 5um (five microns), which is neither unusually large nor unusually small compared to pixel sizes on commodity sensors. On a 4x5 sensor, it would mean 20,320x25,400 pixels, or a total of 516,128,000 pixels. That's essentially half a gigapixel.

Of course, that 516MP image does not have sharp single-pixel-wide detail. The Nyquist adjustment basically means one could argue that real resolution would only be 129,032,000 pixels. Let's say that again: only 129MP. Although there are higher resolution scanning digital backs, the highest resolution simultaneous capture back widely marketed captures 80MP on a medium format camera (for which few, if any, lenses are going to come close to resolving such fine detail). Thus, we're still talking about significantly higher useful resolution than current digital systems.

Just for fun, let's assume you used one of the highest resolving 4x5 lenses instead. 85 line pairs per mm would translate to a pixel 2.9um on a side. That's a tad small by current standards, but would not require any new fabrication technology. The resulting 4x5 pixel count would be just shy of 1.5GP. In case that seems like too much of an increase for going from 50 to 85 line pairs per mm, recall that the increase happens in two dimensions, not just one. Of course, at the Nyquist limit for the best lenses, although spatial positioning of detail can be accurate at the pixel level, no lens would be rendering single-pixel-wide detail.
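The arithmetic in this section is easy to reproduce. Here is a small Python sketch, assuming a sensor that is exactly 4 by 5 inches (which, as noted earlier, is slightly generous) and using a hypothetical helper name, sensor_pixels, that is not from any library.

```python
MM_PER_INCH = 25.4


def sensor_pixels(lp_per_mm, pixels_per_line_pair=4, width_in=4.0, height_in=5.0):
    """Return (pitch_um, columns, rows, total_pixels) for a nominal 4x5 inch sensor."""
    pixels_per_mm = lp_per_mm * pixels_per_line_pair
    pitch_um = 1000.0 / pixels_per_mm
    columns = round(width_in * MM_PER_INCH * pixels_per_mm)
    rows = round(height_in * MM_PER_INCH * pixels_per_mm)
    return pitch_um, columns, rows, columns * rows


if __name__ == "__main__":
    print(sensor_pixels(50))       # average lens, four pixels per line pair
    print(sensor_pixels(50, 2))    # same lens, two pixels per line pair
    print(sensor_pixels(85))       # one of the best lenses
```

With 50 line pairs per mm this gives a 5um pitch and 20,320 x 25,400 = 516,128,000 pixels. At two pixels per line pair it gives 129,032,000. With 85 line pairs per mm it gives a pitch of about 2.9um and 1,491,609,920 pixels.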

It's All A Blur

Without single-pixel-wide detail, images will look pretty soft when viewed close-up at full resolution... right? Well, not exactly.

One of the biggest problems with digital cameras is Moire patterns. Detail near the resolution limit can beat against the regular spatial sampling frequency of the pixels, causing false patterns to appear in the captured image. Most digital cameras include anti-alias filters to deliberately "blur" the image so that Moire is reduced; these are really low-pass filters removing spatial frequencies near or above the pixel sampling frequency. The really cool thing about sampling at or beyond the Nyquist rate for your lens is that Moire naturally cannot occur -- there is no need for an anti-alias filter.
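The beating described above can be illustrated with the standard frequency-folding formula for a sampled sinusoid. This is only a one-dimensional Python sketch, assuming the detail is a pure sinusoidal pattern. The function name is hypothetical, and real Moire also depends on the CFA layout and the sensel aperture.

```python
def alias_frequency(detail_cycles_per_mm, samples_per_mm):
    """Apparent spatial frequency after sampling a pure sinusoid (frequency folding)."""
    nearest_multiple = round(detail_cycles_per_mm / samples_per_mm) * samples_per_mm
    return abs(detail_cycles_per_mm - nearest_multiple)


if __name__ == "__main__":
    # 50 cycles/mm detail on 200 sensels/mm: recorded at its true frequency.
    print(alias_frequency(50, 200))  # 50
    # The same detail on only 80 sensels/mm: a false 30 cycles/mm pattern appears.
    print(alias_frequency(50, 80))   # 30
```

As long as the lens delivers nothing finer than half the sampling frequency, the recorded frequency equals the true one and no false pattern is produced.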

Actually, there's more to this. The microlens arrays used on most digital cameras are usually thought of as increasing fill factor. Any sensor pixel (sensel) is going to have some portion of its area not sensitive to light because there must be a small gap between sensels to ensure that charge doesn't leak from one to another. Most sensel designs lose additional area for wires and other circuitry; minimizing the area lost to wires is a lot of what rear illumination is about. However, microlens arrays also play a role in averaging-in detail that is a fraction of the pixel size, thus allowing a lighter anti-alias filter to be used. With our Nyquist-rate sensel spacing, a microlens array would only increase sensitivity -- so it's not really needed. Not needing microlenses is a very good thing because they add cost and introduce various types of image quality defects by making sensels overly sensitive to ray angles and potentially causing significant color shifts due to the color-dependent index of refraction of the microlens. After all, microlenses are very simple lenses, not highly-corrected compound structures.

While we're talking about the filter stack on the sensor, do we still need that Bayer CFA (color filter array)? Well, certainly we don't if monochrome images are what we want. Nearly all commodity cameras capture color, but the Leica M-Monochrom is a nice proof that somebody thinks there is a high-end market for monochrome cameras. If we do want color, there are lots of ways to get it -- not just CFAs. For example, the pixels themselves could be made to have distinctive color sensitivity profiles so that color information can be extracted; that's essentially what the Foveon sensel stacks do. In any case, thanks to our sensels being spaced according to Nyquist, an anti-alias filter shouldn't be needed in order to ensure that color information doesn't fall between pixels of the right color.

What Do We Need All Those Pixels For?

Hopefully the above has given a good technical motivation for trying to build a 4x5 digital camera, but what do we need all those pixels for?

Sometimes, you need to print really large. Building-scale murals are not going to look so great coming from a mere 36MP DSLR. My guess is that this is a lot of why 4x5 film cameras are still being made and sold.

The ability to zoom in by cropping is probably more often useful. Aside from recording events, artwork, manufacturing processes, etc. in extreme detail, such high resolution would make this the ultimate surveillance camera. Those ridiculous enlargements of details they seem to show in every episode of CSI would actually work when you have half a gigapixel. Using a fisheye lens may make such a camera a particularly powerful surveillance tool.

Low light imaging is another good application of this technology. Although individual sensels would not have any better noise characteristics than similar-sized sensels in a lower-resolution camera, being able to digitally combine pixels to produce a lower-noise image is a very powerful technology. In fact, much of the noise seen in low-light images is not due to the sensel, but literally to the granularity of light. Photon shot noise, the statistical variation in how many photons of each wavelength hit a sensel over time, is becoming increasingly significant. By combining data for many sensels, a larger area -- and hence more photons -- can be averaged. In addition, the fact that each sensel's data is separately available allows more sophisticated analysis of the variation in photon arrival rates, so techniques from simple median filtering to more sophisticated processing can be used instead of simply numerically averaging across sensels.
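The benefit of combining sensels can be estimated with a short Python sketch. It assumes ideal Poisson photon statistics with independent noise in each sensel and ignores read noise and other sensor noise sources. The function names are illustrative only.

```python
import math


def shot_noise_snr(mean_photons):
    """SNR of a Poisson photon count: mean / sqrt(mean) = sqrt(mean)."""
    return math.sqrt(mean_photons)


def binned_snr(mean_photons_per_sensel, sensels_combined):
    """SNR after summing the counts of several sensels with independent shot noise."""
    return math.sqrt(mean_photons_per_sensel * sensels_combined)


if __name__ == "__main__":
    print(shot_noise_snr(100))   # one sensel collecting 100 photons on average
    print(binned_snr(100, 16))   # a 4x4 block of such sensels summed together
```

Under these assumptions, a single sensel averaging 100 photons has a shot-noise SNR of 10, while summing a 4x4 block raises it to 40. In other words, the SNR grows with the square root of the number of sensels combined.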
