Two bits of Computational Photography: Bit Depth and Color Depth
The theoretical background needed for carrying out image-editing tasks and processes.
How is digital information stored? And, what interests us here, how is optical information specifically stored in digital form? We are going to think a bit mathematically; please relax, because this kind of mathematical thinking is simple.
Our arithmetical system is decimal. This means that it is based on the logic of ten. We represent 'none' with the symbol '0', calling it zero. If we want to depict one, we use the symbol '1'. Adding one more to one, we call it 'two' and write it down as '2'. And so on, until we reach '9' and our ten symbols run out. If we want one more, we write a '1' followed by a '0'... and so we have 10. Quite smart, isn't it?
1
2
3
...
9
10
Then, guess what? Going on in the same fashion, 11 follows, 12 after that... and when the 2-digit phase has ended, we have accomplished 100 different numbers, from 0 to 99. Wow!
Then, guess what? The 3-digit phase starts, and by its end we have accomplished 1000 different numbers, from 0 to 999. Going on in the same fashion, we can represent any number, from 0 to infinity. Wow!
With 1 digit we can depict 10 values. With 2 digits we can depict 100 (10^2 = 10 X 10) values. With 3 digits we can depict 1000 (10^3 = 10 X 10 X 10) values. Are you following me? Surely! Such simple things! This is exactly the arithmetical system you use every day: the decimal system.
This is our human system. But what happens inside computers? Well, a computer is just a machine, "stupid" enough to be capable of "perceiving" not 10 options but only 2: YES or NO. Why? Because the machine can only detect electricity. If it detects it, it says "yes"; if not, it says "no". So in a computer we can store information not in decimal but only in binary form. Our human arithmetical system has 10 symbols, while a computer's system has 2 symbols: 0 and 1. That is all there is inside any digital device: numerous numbers made up of 0s and 1s. Just that. These 0s and 1s are the so-called bits: the smallest possible unit of information.
But wait, how can we store big numbers with only two symbols? In absolutely the same fashion as in the decimal system. Let us compare numbers in the binary and the decimal system:
binary   decimal
------   -------
0000     0000
0001     0001
0010     0002
0011     0003
0100     0004
0101     0005
0110     0006
0111     0007
1000     0008
1001     0009
...      ...
Decimal 255 is binary 11111111, consisting of eight 1s. This means that in the binary system, with 8 digits, we can describe 256 values, from 0 to 255 inclusive.
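Let me verify this with a couple of lines of Python (the language I will use for all the little sketches in this post):

    print(format(255, '08b'))   # '11111111': decimal 255 written with 8 binary digits
    print(int('11111111', 2))   # 255: and back again
    print(2 ** 8)               # 256: the number of values 8 bits can hold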
We accept that 256 different grades/tones, from absolute black (0) to absolute white (255), are enough to describe a so-called black-and-white photograph, given the discriminating ability of the human eye. In other words, every virtual dot can be assigned a value in the dynamic range 0-255. Transferring the concept to the binary arithmetic system, we need a bit depth of 8 to describe a so-called black-and-white photograph. In other words, we basically need 8 bits (0s or 1s) to describe each of the dots comprising a digital black-and-white image.
While in traditional film photography the physical dots that comprise the image are actually the grains of the film, in digital photography the virtual dots are called pixels. So in order to store the information of a digital image, we need three pieces of information. Let us consider, for example, an image with dimensions of 800 X 600 pixels:
- the width - it is 800
- the height - it is 600
- and, for every one of these 480,000 (800 X 600) pixels, a number with 8 binary digits.
A color photograph is yielded by three basic colors. To describe every pixel, instead of one number in the range 0-255, we need three such numbers. This means that we need a color depth of 24 (= 3 X 8) to describe every pixel: for the values of 1 color we need 8 bits, so for the values of 3 colors we need 24 bits. A black-and-white photograph has a dynamic range of 256 (2^8) tones; a color photograph has a dynamic range of 16,777,216 (2^24) colors.
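The arithmetic of the 800 X 600 example, as a few lines of Python:

    width, height = 800, 600
    pixels = width * height
    print(pixels)           # 480000 pixels in total
    print(pixels * 1)       # 480000 bytes for a black-and-white image (8 bits per pixel)
    print(pixels * 3)       # 1440000 bytes for a color image (24 bits per pixel)
    print(2 ** 8, 2 ** 24)  # 256 tones versus 16777216 colors

(These are raw, uncompressed sizes; actual file formats usually compress this information.)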
The concept of color depth reflects the accuracy with which the optical information is digitally captured and stored. Together with resolution, as we will see, color depth affects the file size: the bigger the color depth, the bigger the file, since more information is stored for every pixel.
More on bit depth
The fundamental concept of bit depth applies to everything in our digital world. It applies to your screen: a typical monitor has a bit depth of 32. It applies to your video/graphics card: high-end cards for video, games and other graphics-demanding applications are 48-bit; theoretically you could do the job of image editing even on an old 24-bit card, if you work by default with 24-bit color images. And it applies to your machine, both at the hardware and the software level: contemporary operating systems are 64-bit, but theoretically even an old Windows XP 32-bit system is enough to do your job with color images up to the level of 32 bits.
Vector data vs bitmap data. Antialiasing.
There is a distinction among digital images according to their structure: vector images and bitmap images. "Vectors" is the short term for the former, while bitmap images are also called pixel images, raster images, or simply rasters.
Vectors
Vectors might also be called "graphic item images". In these digital files, the information is stored in the form of geometric items, that is to say lines, curves, circles, ellipses, rectangles, fills and so on. Fills are not strictly geometric, but the relevant file-format standards cover them too: lines and borders can carry information about width, and fills can carry colors, gradients and the like. Every geometric shape or text (note that text is also geometric information) can be scaled losslessly: you can have it as a small icon or oversize it with no problem. A vector image file first defines the canvas size as a rectangle, and then every geometric item it contains includes its relative position on the canvas. These images are used for designs and logos. A vector image does not have detail, only shapes.
The file size for vectors is very small, because the information is stored in the form of mathematical descriptions. For example, for a rectangle we only need the two dimensions and a position; for a circle, the center and the radius.
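A toy sketch of the contrast in Python: the circle below is fully described by a handful of numbers, while rasterizing the same circle onto a canvas stores a color for every single pixel.

    # Vector form: four stored items describe the whole circle.
    circle = {"cx": 200, "cy": 200, "r": 150, "fill": (255, 0, 0)}

    # Raster form: the same circle on a 400 X 400 canvas, one color per pixel.
    width, height = 400, 400
    canvas = [[(255, 0, 0) if (x - 200) ** 2 + (y - 200) ** 2 <= 150 ** 2
               else (255, 255, 255)
               for x in range(width)]
              for y in range(height)]

    print(len(circle))         # 4 stored items for the vector description
    print(width * height * 3)  # 480000 color values for the raster version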
Bitmaps
On the other side we have the bitmaps or, as they are aptly called, pixel images. No geometric information here, just dots, many many dots: virtual dots, the famous pixels. Every dot carries color information. A one-color image has one channel, as said; a full-color image has three channels, or possibly even more, as we will see. There are two fundamental features of these images: image size and resolution. Image size seems quite straightforward: if an image has a size of 100 X 100, it has 10,000 pixels in total (100 X 100), and if both dimensions double, to 200 X 200, its surface multiplies by 4, and so do the number of pixels and the file size (4 = 2 X 2). Compared with vectors, the information to be stored in bitmaps is huge, out of the necessity of storing color information for all the pixels one by one.
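The quadrupling is easy to check in Python:

    for side in (100, 200, 400):
        print(side, 'X', side, '=', side * side, 'pixels')
    # each doubling of both sides multiplies the pixel count, and the raw size, by 4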
The second fundamental feature of a bitmap is resolution, a concept expressing the detail of the optical information. As we will see in a while, image size and resolution relate to each other and they both affect optical quality. Bitmaps are the images of detail and images for detail - just right for digital photography.
So vectors are used for logos and business cards, when we want just clean shapes, while bitmaps are just right for the detail that is so much appreciated in photography. As for applications: for vectors there is the open-source Inkscape, which handles SVG files (an open standard), and on the proprietary side the well-known Illustrator by Adobe, which makes AI files. For rasters we have the open-source GIMP, which is our choice here, with its XCF files, and on the proprietary side the well-known Photoshop. An example of a mixed program handling both vectors and bitmaps is CorelDRAW, well known among graphic designers.
Antialiasing
Now let us see an actual problem that appears when vector data is converted to pixel data. Say that I have this capital B.

If I increase the magnification (that is to say, the projection on my screen, not the item's size itself), the perpendicular sides of the letter seem OK, but the edges of the curves do not appear clear. The same problem comes up on all sides if I rotate the item. Edges "break", revealing unwanted zig-zag pixelated patterns; this is actually a sampling-frequency issue. Image editing programs take this into account with an automated countermeasure called antialiasing, adding intermediate pixels for a smoother interpolation.
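Here is a minimal sketch of one form of antialiasing, supersampling: each output pixel is tested several times against the shape, and the fraction of "hits" becomes a gray level, instead of a hard black-or-white decision. (Real editors use more refined filters; this only shows the idea.)

    SIZE, FACTOR = 32, 4          # output size in pixels, subsamples per axis
    CX = CY = SIZE * FACTOR / 2   # disc center on the high-resolution grid
    R = SIZE * FACTOR * 0.4       # disc radius

    def coverage(px, py):
        """Fraction of a pixel's subsamples that fall inside the disc."""
        hits = 0
        for sy in range(FACTOR):
            for sx in range(FACTOR):
                x = px * FACTOR + sx + 0.5
                y = py * FACTOR + sy + 0.5
                if (x - CX) ** 2 + (y - CY) ** 2 <= R ** 2:
                    hits += 1
        return hits / FACTOR ** 2

    row = [round(coverage(px, SIZE // 2) * 255) for px in range(SIZE)]
    print(row)  # the values ramp from 0 to 255 at the edges instead of jumping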
Image Size and Resolution: two basic properties of a bitmap.
A pixel image is a rectangle consisting of virtual dots which represent visible dots. The density of the dots is expressed as resolution, a numeric expression of the detail of the optical information. It usually comes in dpi (dots/inch) or ppi (pixels/inch): a number of pixels per natural length unit (and, by extension, per surface, since surface is length times length).
OK? Let us shift to image size. This is the numeric expression of the "natural" dimensions of the image (width X height). But "natural" is not physical, of course; it is actually virtual, it is in our mind, it is just information, in the same sense as when I called the image a "rectangle" above. With a physical size in mind, image size is given in length units (inches, mm, cm, etc.). But it is also given in pixels, because natural length and resolution are interconnected: the two simply go together.
For example, say "the resolution is 300dpi". If I decrease it to 150dpi, the image dimensions automatically double, presupposing the image itself stays the same, so that the number of pixels remains constant. This is what happens if I do not want to reduce the number of the virtual dots. While the size of a physical image is clearly expressed in length units, image size in pixels is virtual, since a pixel is virtual: it is just information.
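In Python terms:

    width_px = 3000                  # the pixel count is fixed by the source
    for dpi in (300, 150):
        inches = width_px / dpi
        print(dpi, 'dpi ->', inches, 'in =', round(inches * 2.54, 1), 'cm')
    # halving the resolution doubles the physical size; the pixels stay the same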
How is image resolution determined? By the quality of its source. In photography this is the camera's sensor, which converted an optical image into the corresponding digital information; for a scanned image, it is the scanner's sensor.
So just be careful with image scaling! When increasing resolution, do not worry, no data is lost; but when decreasing resolution, do not decrease the dimensions at the same time, so that you do not lose detail - except, of course, when it is intentional, and we will see why it could be.
Some practical examples of resolution values.
If I intend to print my image, 300dpi is generally fine. Theoretically it goes as high as 1200 for supposedly high-quality prints on high-end machinery, but 300 is normally good. Keep in mind that, in order to retain the best possible quality, the resolution given by the initial source must be retained. Another case is scanning. If I want to print at a 1:1 scale, that is to say at the image's natural size, I will follow the 300dpi rule; but for a bigger print, I need a higher resolution. Say I have an old photo to be reproduced: if the photo is 10cm long and I want a 10cm print, I scan at 300dpi; if I want a 20cm print, I choose 600dpi.
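The scanning rule as a tiny Python function (the function name is my own, just for illustration):

    def scan_dpi(original_cm, print_cm, print_dpi=300):
        """Resolution to scan at, so the enlarged print still lands at print_dpi."""
        return print_dpi * (print_cm / original_cm)

    print(scan_dpi(10, 10))  # 300.0 for a 1:1 reproduction
    print(scan_dpi(10, 20))  # 600.0: doubling the print size doubles the scan dpi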
A third case is using an image in digital media, that is to say on screens (for example, on the internet). Theoretically, one would think the 96dpi rule should be followed, since 96dpi is the typical screen resolution: defining a bigger resolution would gain nothing in optical quality, just store redundant information, and keeping file size low matters on the internet in general, to save space and bandwidth. But this way of thinking is theoretical and does not apply in practice, because on digital screens the resolution value simply does not matter: a screen renders an image on a pixel-for-pixel basis, taking only the image size in pixels into account.
An introduction to Color and Color Models
Color can be considered in various ways, which are referred to as color models. Every color model is based on basic colors, the components which make up the color channels of an image.
The RGB color model
The first interesting color model is RGB, the so-called additive model. It is also called the model of the "colors of light", because by mixing its three basic colors we effectively achieve the full color spectrum; and it is also called "true color", because it represents the maximum number of colors a monitor can yield. The RGB color model has 3 color components: Red, Green, Blue (R, G, B). RGB is additive in the sense that the effective radiation entering our eye is composed of the three partial radiations of R, G and B. The distinctive feature of each of these three primary colors is simply its degree of lightness (intensity). Each of the three is emitted by a respective physical structure on the monitor. On a computer monitor these tiny light structures are very difficult to see, but if you look closely at a television screen (better if the TV shows a static image), you can see that every "light pixel" of the screen consists of three light elements, a red one, a green one and a blue one, each of them emitting light at a specific amount.
Each of the three color components is represented digitally with 8 bits. RGB is a 24-bit model in total; that is to say, every pixel is described with 24 bits, so all the possible colors are 2^24 = 16,777,216. Or, considering the channels at the level of the image as a whole: 256 values for every channel means 256 X 256 X 256 = 16,777,216 different color combinations. This is the default dynamic range of a monitor. Quite enough for our eyes to see all colors!

0R+0G+0B means that no light is emitted, no light at all, just darkness: we get a black dot on the screen. 255R+255G+255B means that all channels are at full intensity: we get a white dot on the screen. If all three values of R, G and B are equal, no color is stronger than the others, so we see only tones of gray, within a dynamic range of 256 possible grays (0-255). This is what is typically called black-and-white, but more on that in a while.
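In code, a pixel is just such a triplet, and neutrality is simply the equality of the three values:

    black = (0, 0, 0)        # no light emitted from any channel
    white = (255, 255, 255)  # all three channels at full intensity
    gray  = (128, 128, 128)  # equal channels: a neutral mid gray

    def is_neutral(rgb):
        """True when no channel dominates, i.e. the pixel is a pure gray."""
        r, g, b = rgb
        return r == g == b

    print(is_neutral(gray), is_neutral((229, 205, 229)))  # True False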
The CMYK color model
CMYK is the subtractive color model. It follows the opposite logic for composing color, which is what matters for printing or painting, that is to say in all applications where a physical coloring material is laid on a surface.
In the previous image the background was black, because on a screen absence of color means darkness, while in the last image the background is white, because on paper absence of color means white paper. CMYK has three color components: Cyan, Magenta and Yellow. When a color image is printed, three inks (C, M, Y) are laid onto the paper in different amounts, from 0% (no ink) to 100% (full ink). If you have a color printer, you know the three cartridges: the cyan ink, the magenta ink, and the yellow ink. But there is one more, the black ink. K stands for the black ink, the letter K being used so as not to be confused with the B of Blue in RGB. While not contributing to color directly, black is used for making dark colors really black when needed. This needs a bit of explanation now.
The human eye perceives color in a given way. When we stare at an object, our eye receives reflected light, in contrast with the light emitted by light sources, such as a monitor. The eye, together with the brain, reacts in specific ways and we perceive colors. A surface reflects part of the light it receives, according to its nature, and absorbs the rest. We see only the reflected part.
Now the funny thing is that our eye sees in RGB. When I see a red object, I perceive it as such because the green and the blue are absorbed and I receive the red. You might ask why, for example, the combination of 229 red + 205 green + 229 blue yields a specific pink. The answer is a long story, and it has to do with the way our eye sees and the way the brain interprets the optical stimulus. This comes from ages of evolution of our species: we just happen to see in a specific way. In our eye there are light receptors for red, for green and for blue; the brain receives their signals, processes them, and presents them to us as colors and tints. In reality there is no color in nature, just a continuous light spectrum whose visible part spans varying wavelengths. The eye receives a wave and, together with the brain, turns that specific signal into the specific conclusion that we see a specific pink.
Green + blue makes cyan, and cyan decreases when red increases; in turn, when cyan is added, red decreases. It is really a matter of balance, and this is why we say that CMYK is subtractive. If you add yellow, blue retreats, and vice versa; if you add green, magenta retreats, and vice versa. That said, you might think it would be rational that if you mixed the basic inks equally and fully, then red, green and blue would all be absorbed, so you would see nothing, just dark. This is another long story, having to do again with the way the human eye sees and with the imperfections of real inks: no, you would not see black, you would see a dark brown. This is the reason black is added as an extra in this color model: when you print a color photo that has black in it, you do not want those parts to come out dark brown, so the printer injects black ink. This is also the reason why classic typography still keeps a dedicated black ink for the letters in books: spending three inks to make black would be a waste. We use one black ink, and the work is done. It also solves another technical problem, the alignment of the films: if black were yielded with three inks, the letters would never be perfectly edged; with one black ink we are done.
CMYK has 4 color channels, so it has a color depth of 32 in total (4 channels X 8 bits each).
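Here is a naive, textbook-style RGB-to-CMYK conversion in Python, only to show the subtractive logic and how the shared darkness moves into the K channel; real conversions are color-managed and device-dependent, so treat this as a sketch:

    def rgb_to_cmyk(r, g, b):
        """Naive RGB -> CMYK, with no color management."""
        c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
        k = min(c, m, y)            # pull the shared darkness into black ink
        if k == 1:                  # pure black: avoid division by zero
            return 0.0, 0.0, 0.0, 1.0
        return tuple((x - k) / (1 - k) for x in (c, m, y)) + (k,)

    print(rgb_to_cmyk(255, 0, 0))  # red   -> (0.0, 1.0, 1.0, 0.0)
    print(rgb_to_cmyk(0, 0, 0))    # black -> all the work goes to the K channel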
The Indexed model
In the Indexed color model, as implied by its name, there is a logic of indexing. In the old times of the internet it served as a kind of simplification of RGB. Color information is not stored at the pixel level; instead there is a table, a palette, of 256 values, so we need a color depth of only 8, and we get a small file size. In those times the internet was slow, and image files had to be really small in order to download fast. The cost, of course, was poorer image quality: where RGB has some 17 million colors, this way we had only 256. There were some standards for the palette; usually the computer, with an algorithm, picked the image's colors, created averaged colors to build the palette, and then mapped every pixel to one of the 256 table values. There is one important thing to note here: for a typical black-and-white photo this was perfectly enough. If there is no need for color, the image is perfectly yielded with 256 tones from the indexed palette.
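As a sketch, this is how an indexed version of an image could be produced with the Pillow library in Python (assuming Pillow is installed and that a file named photo.png exists):

    from PIL import Image

    img = Image.open('photo.png').convert('RGB')
    # Build an adaptive 256-color palette and store one palette index per
    # pixel, instead of three full bytes per pixel.
    indexed = img.convert('P', palette=Image.ADAPTIVE, colors=256)
    indexed.save('photo_indexed.png')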
The Grayscale model
The Grayscale color model is not about grays. Black-and-white photography is not about black and white, either. Do you feel confused? You should! Alright, let us make things somewhat clear. Watch your step, please! Let me consider an RGB digital image. Normally there are many pixels with different colors; technically speaking, there is a certain number of pixels, each represented by a certain combination of three values (for R, G, B). Fine. Now say I edit the image so that every pixel gets equal values on its three color components. What would you call this new image? Technically speaking, it should be called something like "monochrome", that is to say, a one-color image. That would be the right thing to say. But alas, it is called black-and-white! Like it or not, this is the way it is. Like it or not, this is history.
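A naive way to equalize the channels is to give each one the average of the three; real conversions usually weight the channels according to their perceived luminosity, so this is only a sketch:

    def to_gray(r, g, b):
        """Naive 'black-and-white': every channel gets the same (average) value."""
        v = round((r + g + b) / 3)
        return (v, v, v)

    print(to_gray(229, 205, 229))  # (221, 221, 221): our pink becomes a gray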

Look at the two photos above. You can call the first one a "color photo"; no problem, this is quite reasonable, it is indeed an image with many colors in it, so you might well call it that. No objection. How about the second? They call it black-and-white. Objection! It probably has black, and probably white as well, but not necessarily: it may even have neither of the two! What it does not have is color, so it would be better called a "no-color photo". Or, say, a "one-color photo"; that would be interesting, because this one color could actually be any color. For example, look at the following two images:

When photo No. 2 is projected on your screen, the light structures of your monitor emit an equal amount of light for each of the components that make up every pixel. You can be absolutely certain of that, because otherwise you would see color, and I am sure you do not see any color! What you perceive as photo No. 2 might be the result of either of the two following situations:
- case 1: all pixels have equal values on their color components, or
- case 2: the image has only one color channel.
Now let me edit the image so as to give it a color depth of 1, that is, a dynamic range of just two values: every pixel is either 0 or 1. What would you call the following image?

I would call it black-and-white. Do you agree? I bet you do, after all this mess! If you do not, please take a closer look and you will concede I am right: it really does have only black and white, I swear!
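The reduction sketched in Python: every tone collapses to pure black or pure white around a threshold (the value 128 is my arbitrary choice):

    def to_1bit(v, threshold=128):
        """Color depth of 1: a gray value collapses to pure black or pure white."""
        return 255 if v >= threshold else 0

    print([to_1bit(v) for v in (0, 100, 127, 128, 200, 255)])
    # [0, 0, 0, 255, 255, 255]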
It is needless to say, but it must be said: from the technical point of view, the important thing is to understand the structure of a digital image, whatever that image may be, beyond words and terms that may be more or less precise or confusing.
Printing color
When we work on image editing, we work in the RGB color model. After all this discussion this is absolutely reasonable, simply because we work on monitors, and monitors have light-emitting structures that follow the RGB logic. Theoretically, you should convert an image to CMYK as your last action just before printing. This is actually what happened in the good old typography, where the last stage of prepress was the conversion of color to CMYK. Nowadays, with contemporary digital printers, you do not need to do this conversion yourself: the printer does it automatically, on the fly.