We have finished the module on modeling physically-based lighting and materials, and now we're turning to imaging. We'll start with a few imaging topics, transition to animation and physical simulation, and then come back to more imaging topics (image processing, sensors, etc.).
This note begins our discussion of cameras and lenses: how to use real optical models of cameras and lenses to derive path tracing extensions that simulate images with convincing depth of field.
Coming back to this scene, one thing we noticed is that there appears to be some depth of field: the image is quite sharp at the edges of the cup, but blurred behind the cup. Depth of field is so visually characteristic that if we don't include it in our rendered images, they don't look like real photographs!
Our eyes also have a finite aperture and optics much like those in our camera model. A simple demonstration: hold your hand in front of your face and focus on it, and notice the blurred background (then shift your focus to the background and watch your hand blur instead). Depth of field is not just in our photographs; it is part of our constant visual perception, and it's a core visual aspect of photorealistic rendering.
To begin, let's take a look at cameras, separate from graphics and ray tracing, as real optical objects, and at how images are formed inside them.
This is a cross-section of a fairly fancy camera. The lens is not a single piece of glass; it is a stack of individual lens elements. Light comes in through the lens and hits a mirror, which reflects it up to the viewfinder. When the capture button is pressed, the mirror pops out of the way and the light hits the image sensor.
The pinhole camera model is a good model of what's happening inside a real camera, with one additional complication introduced by the lens.
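Before adding the lens, here is a minimal sketch of pinhole camera ray generation, of the kind we've used in our ray tracers so far (in Python; the function and parameter names are just for illustration). Every ray passes through a single point, so every scene point maps to exactly one image point and there is no depth of field at all.

```python
import numpy as np

def pinhole_camera_ray(x, y, width, height, fov_y_deg):
    """Generate a camera ray through pixel (x, y) for an ideal pinhole camera.

    All rays share a single origin (the pinhole), so the image is perfectly
    sharp everywhere, regardless of how far away objects are.
    """
    aspect = width / height
    tan_half = np.tan(np.radians(fov_y_deg) / 2.0)

    # Map the pixel center to a point on a virtual image plane at z = -1
    # in camera space (camera at the origin, looking down -z).
    px = (2.0 * (x + 0.5) / width - 1.0) * tan_half * aspect
    py = (1.0 - 2.0 * (y + 0.5) / height) * tan_half

    origin = np.zeros(3)                      # the pinhole / center of projection
    direction = np.array([px, py, -1.0])
    return origin, direction / np.linalg.norm(direction)
```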
![Pinhole camera (bottom) and lens camera (top)](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/54fb0243-4eda-466d-b754-e66f29e3e922/Untitled.png)

BOTTOM: light from a man passes through a pinhole and forms an image. Physically, the rays that pass through the pinhole produce an inverted picture; with a single pinhole, the picture is dim and noisy.

TOP: all the rays emanating from the tip of the man's cigar, spread across the whole area of the camera lens, converge back to a single point on the image.
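The top panel is exactly the behavior the path tracing extension needs to reproduce: rays spread over a finite aperture, all converging at the plane of focus. Here is a sketch of the standard thin-lens approximation (not the full multi-element lens in the cross-section above), continuing the Python example; the names and the rejection-sampling of the aperture are simplifications for illustration.

```python
import numpy as np

def thin_lens_camera_ray(pinhole_dir, lens_radius, focal_distance, rng):
    """Turn a pinhole camera ray (starting at the camera origin, looking down -z)
    into a thin-lens ray that produces depth of field.

    lens_radius    : aperture radius (0 recovers the pinhole model)
    focal_distance : distance to the plane that is in perfect focus
    """
    # The point the original ray hits on the plane of focus stays fixed;
    # everything on that plane images sharply regardless of aperture size.
    t = focal_distance / -pinhole_dir[2]
    focus_point = t * pinhole_dir

    # Sample a point on the circular lens aperture (a concentric-disk mapping
    # would be lower-variance; rejection sampling keeps the sketch short).
    while True:
        u, v = rng.uniform(-1.0, 1.0, size=2)
        if u * u + v * v <= 1.0:
            break
    lens_point = np.array([u, v, 0.0]) * lens_radius

    direction = focus_point - lens_point
    return lens_point, direction / np.linalg.norm(direction)
```

With `lens_radius = 0` this degenerates to the original pinhole ray; as the aperture grows, points off the plane of focus are smeared over the image, which is exactly the depth-of-field blur behind the cup in the scene above.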
In a wave-optics sense, lenses also push out the diffraction limit: a larger aperture diffracts light less, so the larger the lens, the sharper the in-focus parts of the image can be (and the blurrier the out-of-focus background becomes).
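To make "pushes out the diffraction limit" concrete, the Rayleigh criterion (a standard result, stated here for reference) gives the smallest resolvable angular separation for a circular aperture of diameter $D$ at wavelength $\lambda$:

$$
\theta_{\min} \approx 1.22\,\frac{\lambda}{D}
$$

Doubling the aperture diameter halves $\theta_{\min}$, which is why a larger lens can, in principle, resolve finer detail.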
The shutter exposes a sensor (and pixels on the sensor) for a precise duration.
In a mechanical shutter design, one shutter curtain slides down to expose the sensor, and then a second curtain slides down from the top to cover it again. While the sensor is exposed, each pixel on its surface accumulates the light arriving at it (the irradiance) to form the image.
There are other ways to set the duration of light integration (mechanical, electronic, etc.).
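To make "adding up the light" precise, the quantity each pixel records during the exposure (a standard formulation, written here for reference) is the time-integrated irradiance over the interval the shutter is open:

$$
H(x, y) \;=\; \int_{t_0}^{t_0 + \Delta t} E(x, y, t)\, dt
$$

where $E(x, y, t)$ is the irradiance arriving at the pixel and $\Delta t$ is the shutter duration.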
Each pixel on the image sensor surface isn't just a bare photosite; it has a colored filter in front of it, so the light integrated at that pixel is tinted reddish, greenish, or blueish. We then apply demosaicking algorithms to reconstitute full R/G/B values at every pixel and construct the final image.
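As a concrete, deliberately simple sketch of that reconstruction step, here is bilinear demosaicking of an RGGB Bayer mosaic in Python. The layout, function name, and kernel choice are assumptions for illustration; real camera pipelines use more sophisticated, edge-aware algorithms.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(mosaic):
    """Bilinear demosaic of an RGGB Bayer mosaic (H x W float array) into H x W x 3 RGB."""
    H, W = mosaic.shape
    r_mask = np.zeros((H, W)); r_mask[0::2, 0::2] = 1.0   # red photosites
    b_mask = np.zeros((H, W)); b_mask[1::2, 1::2] = 1.0   # blue photosites
    g_mask = 1.0 - r_mask - b_mask                        # green photosites

    # Kernels that average the nearest available samples of each color.
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    R = convolve(mosaic * r_mask, k_rb, mode='mirror')
    G = convolve(mosaic * g_mask, k_g,  mode='mirror')
    B = convolve(mosaic * b_mask, k_rb, mode='mirror')
    return np.stack([R, G, B], axis=-1)
```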
When we take a photo and look at the final image, we see only a portion of the world: that's the field of view. Things outside the boundary of the picture are outside the field of view.
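For the pinhole and thin-lens setups above, the field of view follows directly from the sensor size and the focal length (a standard relation, stated here for reference):

$$
\mathrm{FOV} \;=\; 2\,\arctan\!\left(\frac{w}{2f}\right)
$$

where $w$ is the sensor width (or height) and $f$ is the focal length. For example, a 36 mm-wide full-frame sensor behind a 50 mm lens gives a horizontal field of view of about $2\arctan(36/100) \approx 39.6^\circ$.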