A few scanning tips

www.scantips.com

Image File Formats - JPG, TIF, PNG, GIF
Which to use?

The most common image file formats, the most important for cameras, printing, scanning, and internet use, are JPG, TIF, PNG, and GIF.

Best file types for these general purposes:

Properties
  Photographic images: Photos are continuous tones, 24 bit color or 8 bit Gray, no text, few lines and edges
  Graphics, including logos or line art: Graphics are often solid colors, up to 256 colors, with text or lines and sharp edges

For Unquestionable Best Quality
  Photographic images: TIF or PNG (lossless compression and no JPG artifacts)
  Graphics, including logos or line art: PNG or TIF (lossless compression, and no JPG artifacts)

Smallest File Size
  Photographic images: JPG with a higher Quality factor can be decent.
  Graphics, including logos or line art: TIF LZW or GIF or PNG (graphics/logos without gradients normally permit indexed color of 2 to 16 colors for smallest file size)

Maximum Compatibility (PC, Mac, Unix)
  Photographic images: TIF or JPG
  Graphics, including logos or line art: TIF or GIF

Worst Choice
  Photographic images: 256 color GIF is very limited color, and is a larger file than 24 bit JPG
  Graphics, including logos or line art: JPG compression adds artifacts, smears text and lines and edges

These are not the only choices, but they are good and reasonable choices.
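
As a rough sketch, the choices in the table above can be written as a simple lookup. The names here (RECOMMENDATIONS, suggest_formats, and the "photo" / "graphics" and goal labels) are made up for illustration and do not come from any program; the format choices are only the ones listed above.

```python
# Hypothetical lookup built from the table above: (image kind, goal) -> formats.
RECOMMENDATIONS = {
    ("photo", "best quality"): ["TIF", "PNG"],
    ("photo", "smallest file"): ["JPG"],
    ("photo", "compatibility"): ["TIF", "JPG"],
    ("graphics", "best quality"): ["PNG", "TIF"],
    ("graphics", "smallest file"): ["TIF LZW", "GIF", "PNG"],
    ("graphics", "compatibility"): ["TIF", "GIF"],
}

# The "Worst Choice" row of the same table.
WORST_CHOICE = {"photo": "GIF", "graphics": "JPG"}


def suggest_formats(kind, goal):
    """Return the formats the table lists for this kind of image and goal."""
    return RECOMMENDATIONS[(kind, goal)]


print(suggest_formats("photo", "smallest file"))    # ['JPG']
print(suggest_formats("graphics", "best quality"))  # ['PNG', 'TIF']
print(WORST_CHOICE["graphics"])                     # JPG
```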

Web pages require JPG or GIF or PNG image types, because that is all that browsers can show. On the web, JPG is the best choice (smallest file, with quality being less important than size) for photo images, and GIF is common for graphic images. GIF was designed for modems by CompuServe, for earliest 8 bit video, and so GIF contains no printing dpi information, and is out of date for 24 bit photos now, but GIF still works quite well for video graphics on the internet.

Other than the web, TIF file format is the undisputed leader when best quality is required (when anything less than maximum quality is not acceptable). So TIF is very common in commercial or professional printing environments. High Quality JPG can be pretty good too, but don't ruin them by making them too small. If the goal is high quality, then only consider making JPG large instead, and plan your work so that you save them only one or two times. Adobe RGB color space may be OK for your home printer and profiles, but if you send your pictures out to be printed, the mass market printing labs normally only accept JPG files and only process sRGB color space.

Difference in photo and graphics images

Photo images have continuous tones, meaning that adjacent pixels often have very similar colors, for example, a blue sky might have many shades of blue in it. Normally this is 24 bit RGB color, or 8 bit grayscale, and a typical color photo may contain perhaps 100,000 colors, out of the possible set of 16 million colors in 24 bit RGB color.

Graphic images are normally not continuous tone (gradients are possible in graphics, but are not seen very often). Graphics are drawings, not photos, and they use relatively few colors, perhaps fewer than 16 colors in the entire image. In a color graphic cartoon, the entire sky will be only one shade of blue where a photo might have dozens of shades. Or a map for example is graphics, maybe 4 or 5 map colors plus 2 or 3 colors of text, plus blue water and white paper, often fewer than 16 colors overall. These few colors are well suited for Indexed Color. Normally the edges in graphics do not use anti-aliasing - which would add numerous shades (graphics use high resolution to smooth jaggies instead). Scanners have three modes to create the image: color (for all color work), grayscale (like B&W photos), and lineart. Line art is a special case, only two colors (black or white, with no gray), for example clip art, fax, and of course text. However low resolution line art (like cartoons on the web) is often better as grayscale, to add anti-aliasing to hide the jaggies.

JPG files are very small files for continuous tone photo images, but JPG is poor for graphics. JPG requires 24 bit color or 8 bit grayscale, and the JPG artifacts are most noticeable in the hard edges of graphics or text. GIF files (and other indexed color files) are good for graphics, but are poor for photos (too few colors possible). However, graphics are normally not 24 bit color anyway. Formats like TIF and PNG can be used either way, 24 bit or indexed color - these file types have different internal modes to accommodate either type optimally.

What does JPG Quality "Losses" mean?   What are JPG artifacts?

JPG Quality seems slightly improved from the old days, but artifacts certainly still exist. Next are five images, each 150x80 pixels in size, saved as JPG with Quality settings of 4 to 10, and also as PNG (lossless compression). TIF would appear the same as PNG (both are lossless), but browsers cannot show TIF files. All were saved here with Adobe Photoshop CS5 menu File - Save As.

These images are simple "graphics", where relatively many pixels are all the same color value, whereas photo images are vastly more complex, with very many more "colors" (continuous tones) and many more unique pixels, relatively few of which are the same color. However, this use of graphics (here) makes the effect of JPG artifacts around sharp edges easier to see and recognize. You should learn to recognize JPG artifacts. There are a couple of types of JPG artifacts: 8x8 pixel blocks (of the same one color in the block) in smooth featureless areas, and harsh artifacts around sharp edges. Dirty colors vs pure colors is one result. Viewing at 3x actual size helps to learn to see JPG artifacts. They are real, regardless of whether you are aware of them or not.

Repeated next are the exact SAME five files, but now with browser instructions to enlarge them to 3x size here, for a larger, better look at the artifacts (but interpolation is necessarily a blurring operation).

This is how the images came out of the JPG files (all were pristine going in). There are of course diminishing differences, but the PNG just looks sort of "pure" (lossless).

Quality 4 might sometimes be "good enough" for some web page images. Quality 10 may Not be good enough to archive your prized best pictures, at least not for mine. See following JPG pages.

BTW, excessive USM sharpening is another factor that can also cause false edges.

This is what is meant by JPG Quality losses. Losses of quality, just not as good to look at. Still same count of pixels, still three bytes per pixel when opened again, etc, but now with some altered pixels, called artifacts... like dirty pixels. Remember, artifacts accumulate and get worse every time you edit and save the file as JPG again. And this is not repairable, so don't do that, be aware, and devise a better plan.

JPG Quality 10 is pretty good indeed, but it is still JPG. JPG is used where small file size is more important than absolute image quality, like web pages or email, or small memory cards. And Quality 8 to 10 may be "good enough" for most "viewing" uses, EXCEPT there is the distinction that no JPG is good for editing, and then saving repeated times - which accumulates and compounds the JPG artifacts each time saved as JPG. Your images are either important or not, but if we take short cut liberties, experience knows the time will always come when it will matter to you. We cannot undo JPG damage.

There are ifs and buts, difficult to quantify. There are lossless methods to rotate or flip JPG images without uncompressing and recompressing (Irfanview plugins offer this), so there are no additional losses in those special cases. Applications like Photoshop take heroic pains to try not to recompress image areas with no change (when possible). But bottom line, saving JPG again is a little like Russian Roulette, every time may not get you, but there are large risks, depending on how you value the life of your image.

Some people really don't notice the difference, and so about anything is "good enough" if they can still recognize the people in the pictures. There are other more critical photographers with the view that even the best is barely sufficient for THEIR images. We really cannot have the same discussion with both. I lean towards the latter group. I really don't see any reason to intentionally select less than maximum quality. Internet transmission speed probably is a consideration, but today, disk space is no reason to compromise my image quality, certainly not for my original copies. Indeed, there are better ways to go about it.

The Best Plan

JPG is not so terrible if at higher Quality level, when done only once or twice. The big deal is repeated saves as JPG, which compounds more artifacts every time.

For those who critically care about their images, the best plan, and actually, the easy way, is to always keep and archive your unedited original image as the best master you have. Otherwise, you can never get it back, so never write over it - always keep that original intact (whatever it is, especially if it is a JPG from the camera - and the camera should of course be set to create the finest possible image it can.)

Then when editing (saving only to a copy, always preserving the original intact), always save your in-work image ONLY to a lossless format (TIF or PNG), EVERY time, UNTIL your last necessary FINAL one JPG save (at high quality level). For example, Photoshop, and the free editors Irfanview and Faststone (see Google), have batch modes to copy many files from JPG to TIF in one easy operation. This would be pointless unless you are intending to edit them, because this TIF step will NOT remove any existing JPG artifacts, the data will of course still contain those original JPG artifacts - but TIF Saves will not add any more. Computers and disks are big and fast and inexpensive today, this larger file size is a small issue today (and that is simply how big your data is). Then as lossless TIF files, you can edit away, red eye and color adjustments and cropping and resampling, saving with abandon (just NOT as JPG), until that needed One Final JPG Save. So now, the total is only two saves as JPG (original camera save, and this one final save, is two), which is worse than one, but much better than six. Using camera RAW images would eliminate the first, and offer other advantages too.

And then, if any further need to edit it again comes up, discard this second JPG (as an expendable copy), and start over from your better archived master you kept. Avoid using any image saved repeated times as JPG. JPG is lossy, which means we do not get back the same quality we put in. There are more losses every time. But with lossless formats (PNG or TIF LZW), it does not matter if you save a jillion times. But it matters if saving to JPG. An extra unthinking save as JPG is not a good plan. If it overwrites your only original copy, it is a terrible plan, you can never get it back.
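
To make the idea concrete, here is a toy model only. It is NOT JPG compression: the "lossy save" below simply snaps each value to the nearest multiple of 8, and the "edit" is a small brightness bump. Every name and number in it (lossy_save, brighten, edit_session, the sample values) is an assumption made up for illustration. It compares saving lossy after every edit against doing all the edits first and saving lossy once at the end.

```python
def lossy_save(pixels, step=8):
    """Toy stand-in for a lossy save: snap each value to the nearest multiple of step."""
    return [min(255, round(p / step) * step) for p in pixels]


def brighten(pixels, amount=3):
    """A small hypothetical edit: raise every value slightly."""
    return [min(255, p + amount) for p in pixels]


def edit_session(pixels, edits, save_lossy_each_time):
    """Apply several edits, then make one final lossy save."""
    work = list(pixels)
    for _ in range(edits):
        work = brighten(work)
        if save_lossy_each_time:
            work = lossy_save(work)  # a loss is baked in after every edit
    return lossy_save(work)


def max_error(a, b):
    """Largest difference between corresponding values."""
    return max(abs(x - y) for x, y in zip(a, b))


original = [100, 101, 102, 103, 150, 200]
edits = 6

# What the edits would give with no lossy step at all.
ideal = list(original)
for _ in range(edits):
    ideal = brighten(ideal)

repeated = edit_session(original, edits, save_lossy_each_time=True)
single = edit_session(original, edits, save_lossy_each_time=False)

print(max_error(repeated, ideal))  # error after six lossy saves
print(max_error(single, ideal))    # error after one final lossy save
```

In this toy, the single final save can never be off by more than half a step, while the save-every-time path drifts much further from the intended result. Real JPG behaves differently in detail, but the planning lesson is the same.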

Batch processing: Speaking of the philosophy of expendable JPG, and assuming you keep a high quality lossless archive, then there are easy ways to run off batches of expendable JPG for one-use viewing, temporary JPG copies sized for monitor viewing, or for HDTV viewing, or to upload to be printed, etc. Several programs have batch modes to do this.

Photoshop has its menu File - Scripts - Image Processor. Can read Camera RAW images, including their previous processing instructions (White Balance, Exposure, Cropping, etc).

Faststone editor has its menu Tools - Batch Convert Selected Images. Free from the internet.

Irfanview editor has its menu Files - Batch Conversion/Rename. Free from the internet.

Of these two, Faststone may be the better editor, and Irfanview may be the better viewer.

We always resample using "Preserve Aspect Ratio", so generally, if you want, say, 1800x1200 pixel size, you can specify the larger target dimension twice, 1800x1800, and then regardless of whether the batch contains mixed landscape or portrait shapes, the largest dimension will be 1800, as appropriate.
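
The arithmetic behind this trick is small enough to sketch. The function name fit_within is hypothetical (it is not a menu option or API of any of these editors); it just computes the output size when an image is scaled to fit inside a box while preserving its aspect ratio.

```python
def fit_within(width, height, box_w, box_h):
    """Scale (width, height) to fit inside the box, preserving aspect ratio."""
    scale = min(box_w / width, box_h / height)
    return round(width * scale), round(height * scale)


# Same 1800x1800 box for both orientations:
print(fit_within(3000, 2000, 1800, 1800))  # (1800, 1200) landscape
print(fit_within(2000, 3000, 1800, 1800))  # (1200, 1800) portrait
```

Because the box is square, whichever dimension is larger ends up at 1800, so one setting handles a mixed batch.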

Or Photoshop has its Resize to Fit option, and Faststone has its Switch Width with Height option to do this more overtly. Computer speed today makes it trivial to just run off whatever you want at the moment, and then discard those expendable JPG after that one use. You always have your high quality lossless archive copy, and can do this at will.

Basics

A very major factor regarding image file format choice is whether the data compression scheme is lossless or lossy. Lossless means the pixels come back out of the file exactly as you write them in (no JPG artifacts) - any minor color change is simply unthinkable. Whereas Lossy compression (JPG) means there are intentional approximations made for expediency and smaller storage size, and the data (pixel colors) comes back out close, but not exactly what went in. For a rough example of the lossy concept, imagine part of the image (maybe the sky) is shown as four shades of blue, all close to the same, but not quite the same. Lossless compression MUST save all four shades of blue, no matter how slight the difference. Lossy compression might assume all four areas are the same one shade of blue, and one value stores much smaller than four, but you only get one shade back next time you look - a Quality difference. This may be undetectable in a High Quality JPG. A little less quality may often be "close enough" for many purposes, but is not fully precise - it is "lossy", not "lossless". We do get back ALL of the pixels of course, but some colors can be subtly changed, and the "loss" is in regard to the precision of the data, called Quality. In images, the visible difference is called JPG artifacts, sometimes even creating false "detail", shapes we can see - when some pixels are noticeably not the color they were expected to be. Repeat, the losses of image data we are speaking about are the colors of pixels - image data is pixels, which are simply "colors", the storage of the three RGB data components. (see What is a Digital Image Anyway?)
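
The four-shades-of-blue idea can be sketched in a few lines. This is only an illustration under stated assumptions: the sample values are made up, zlib stands in for "some lossless compressor", and merge_shades is a hypothetical toy, not how JPG actually works.

```python
import zlib

# Four close shades of "blue", one byte per value (made-up sample data).
sky = bytes([200, 201, 202, 203] * 1000)

# Lossless: zlib must give back every byte exactly as it went in.
restored = zlib.decompress(zlib.compress(sky))


def merge_shades(values, step=8):
    """Toy lossy step: snap each value down to a multiple of step."""
    return bytes((v // step) * step for v in values)


approx = merge_shades(sky)

print(restored == sky)          # True
print(sorted(set(sky)))         # [200, 201, 202, 203]
print(sorted(set(approx)))      # [200]
print(len(approx) == len(sky))  # True
```

The lossy version still has every pixel, but the four shades have become one, and nothing in the saved data can bring the other three back.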

Our digital images are dimensioned in pixels (not bytes, and definitely not inches). And a pixel is simply a color definition, the color that this tiny dot of image sampled area ought to be. Put all those colored dots together, and our brain sees the image. Any common 24-bit RGB image will use three bytes per pixel (if JPG, TIF, PNG ... doesn't matter - RGB data size is three bytes of data per pixel). So - for example- any 10 megapixel camera image data will occupy 3x10 = 30 million bytes, by definition of RGB color. This number is the "data size" (when opened into computer memory for use). A TIF file will be near that size (and is lossless), but JPG is normally compressed very heavily (lossy, not lossless) to store in a JPG file of perhaps 1/10 this size (variable with JPG Quality setting), which is "file size" (not image size and not data size). The example image size is still 10 megapixels (dimensioned width x height, in pixels), and the data size is 30 million bytes, and the JPG file size might be 3 MB. The image will still come out of the JPG file as the same 10 megapixels and the same 30 million bytes when the 3 MB JPG file is opened (and we hope its quality also comes out about the same). Image size (pixels) determines how we can use the image - this use is also dimensioned in pixels. See a summary of digital basics.
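
The data size arithmetic above is easy to check. A minimal sketch, with hypothetical function names, assuming 24-bit RGB (three bytes per pixel) as in the text:

```python
def data_size_bytes(pixel_count, bytes_per_pixel=3):
    """Uncompressed size in memory: pixels x bytes per pixel (3 for 24-bit RGB)."""
    return pixel_count * bytes_per_pixel


def file_to_data_ratio(file_bytes, pixel_count, bytes_per_pixel=3):
    """How large the stored file is compared with the data it opens into."""
    return file_bytes / data_size_bytes(pixel_count, bytes_per_pixel)


ten_megapixels = 10_000_000
print(data_size_bytes(ten_megapixels))                # 30000000
print(data_size_bytes(3000 * 2000))                   # 18000000
print(file_to_data_ratio(3_000_000, ten_megapixels))  # 0.1
```

The file size changes with the format and the JPG Quality setting, but the data size is fixed by the pixel count.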

All photo editor programs like Adobe Photoshop or Adobe Elements support these file formats, which will generally support and store images in the following color modes:

  Color data mode   Bits per pixel
JPG
RGB - 24 bits (8 bit color),
Grayscale - 8 bits
(only these)

JPEG always uses lossy JPG compression, but its degree is selectable, for higher quality and larger files, or lower quality and smaller files. JPG is for photo images, and is the worst possible choice for most graphics or text data.

TIF
Versatile, many formats supported.
Mode: RGB or CMYK or LAB,
8 or 16 bits per color channel, called 8 or 16 bit "color" (24 or 48 bit RGB files).
Grayscale - 8 or 16 bits,
Indexed color - 1 to 8 bits,
Line Art (bilevel)- 1 bit

For TIF files, most programs allow either no compression or LZW compression (LZW is lossless, but is less effective for color images). Adobe Photoshop also provides JPG or ZIP compression in TIF files (but which greatly reduces third party compatibility of TIF files). "Document programs" allow CCITT G3 or G4 compression for 1 bit text (Fax is G3 or G4 TIF files), which is lossless and tremendously effective (small). Many specialized image file types (like camera RAW files) are TIF file format, but using special proprietary data tags.

24 bits is called 8 bit color, three 8-bit bytes for RGB (256x256x256 = 16.7 million colors maximum.)
Or 48 bits is called 16 bit color, three 16-bit words (65536x65536x65536 = trillions of colors conceptually)
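
These color counts are just powers of two. A small sketch (the function name is made up for illustration):

```python
def color_count(bits_per_channel, channels=3):
    """Number of distinct colors for a given bit depth per RGB channel."""
    return (2 ** bits_per_channel) ** channels


print(color_count(8))   # 16777216 (24 bit RGB, about 16.7 million)
print(color_count(16))  # 281474976710656 (48 bit RGB, about 281 trillion)
```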

PNG
RGB - 24 or 48 bits (called 8 bit or 16 bit "color"),
Grayscale - 8 or 16 bits,
Indexed color - 1 to 8 bits,
Line Art (bilevel) - 1 bit

PNG uses ZIP compression which is lossless, and slightly more effective compression than TIF LZW. For photo data, PNG is somewhat smaller files than TIF LZW, but larger files than JPG (however PNG is lossless, and JPG is not.) PNG is a newer format than the others, designed to be both versatile and royalty free, back when the patent for LZW compression was disputed for GIF and TIF files.

GIF
Indexed color - 1 to 8 bits (8 bit indexes, limiting to only 256 colors maximum.)

GIF is an online video image, it contains no dpi information for printing. Designed by CompuServe for online images in the days of dialup and 8 bit indexed computer video, whereas other file formats can be 24 bits now. However, GIF is still great for web use of graphics containing only a few colors, when it is a small lossless file, much smaller and better than JPG for this.

GIF uses lossless LZW compression. (for Indexed Color, see second page at GIF link at page bottom).
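
For a rough picture of what indexed color means, here is a sketch in which each pixel stores a small index into a palette instead of a full 3-byte RGB value. The names to_indexed and bits_per_index are hypothetical, and this ignores how any real file format lays out its palette.

```python
def to_indexed(pixels, max_colors=256):
    """Split a list of RGB tuples into a palette and one index per pixel."""
    palette = []
    lookup = {}
    indices = []
    for color in pixels:
        if color not in lookup:
            if len(palette) == max_colors:
                raise ValueError("too many colors for an 8 bit index")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices


def bits_per_index(color_count):
    """Smallest index width (in bits) that can number this many colors."""
    return max(1, (color_count - 1).bit_length())


blue, white, black = (0, 0, 255), (255, 255, 255), (0, 0, 0)
pixels = [blue, blue, white, black, blue, white]

palette, indices = to_indexed(pixels)
print(palette)                       # [(0, 0, 255), (255, 255, 255), (0, 0, 0)]
print(indices)                       # [0, 0, 1, 2, 0, 1]
print(bits_per_index(len(palette)))  # 2
print(bits_per_index(16), bits_per_index(256))  # 4 8
```

A graphic with 16 colors needs only 4 bits per pixel before any compression, which is why few-color graphics make such small files.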

Note that if your image size is say 3000x2000 pixels, then this is 3000x2000 = 6 million pixels (6 megapixels). Assuming this 6 megapixel image data is RGB color and 24 bits (or 3 bytes per pixel of RGB color information), then the size of this image data is 6 million x 3 bytes RGB = 18 million bytes. That is simply how large your image data is (see more). Then file compression like JPG or LZW can make the file smaller, but when you open the image in computer memory for use, the JPG may not still have the same image quality, but it is always still 3000x2000 pixels and 18 million bytes. This is simply how large your 6 megapixel RGB image data is (megapixels x 3 bytes per pixel).

Summary

If we want:

To save 16 bit data (48 bit color), we must use a file that supports it... TIF or PNG, but not JPG.

To save 1 bit data (line art), or an indexed color palette, we must use a file that supports it... TIF or PNG or GIF, but not JPG. The differences are that TIF cannot be shown by a browser. GIF is indexed color only, and cannot save a dpi number for printing resolution. PNG does have a few additional exotic options (Alpha channel), rarely used, and not universally compatible.

The highest quality of file compression (lossless), then that is TIF LZW or PNG, but not JPG.

The smallest possible file size, without much concern for image quality, then that is JPG.

PNG and TIF LZW are lossless compression, so their file size reduction is not as extreme as the wild heroics JPG can dream up. In general, selecting lower JPG Quality gives a smaller worse file, higher JPG Quality gives a larger better file. Your 10 megapixel RGB image data is three bytes per pixel, or 30 million bytes. Your JPG file size might be only 5-15% of that, literally. TIF LZW might be 65-80%, and PNG might be 50-65% (very rough ballpark for 24 bit color images). We cannot predict sizes precisely because compression always varies with image detail. Blank areas, like sky and walls, compress much smaller than extremely detailed areas like a tree full of leaves. But the JPG file can be much smaller, because JPG is not required to recover the original image intact, losses are acceptable. Whereas, the only goal of PNG and TIF LZW is to be 100% lossless, which means the file is not as heroically small, but there is never any concern about compression quality with PNG or TIF LZW. They still do impressive amounts of file size compression, because the RGB image data is actually three bytes per pixel.
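
That compression varies with image detail is easy to see with any lossless compressor. The sketch below uses zlib as a stand-in (an assumption for illustration, it is not a TIF or PNG writer), with a perfectly flat area and random noise as the two extremes. Real image areas fall somewhere in between.

```python
import random
import zlib

SIZE = 30_000  # bytes of sample data, like 10,000 RGB pixels

flat = bytes([180]) * SIZE  # a "blank wall": every value the same
rng = random.Random(0)      # fixed seed so the run is repeatable
detailed = bytes(rng.randrange(256) for _ in range(SIZE))  # extreme "detail"


def compressed_percent(data):
    """Lossless round trip; returns compressed size as a percent of the original."""
    packed = zlib.compress(data)
    assert zlib.decompress(packed) == data  # nothing was lost
    return 100 * len(packed) / len(data)


print(f"flat area:     {compressed_percent(flat):.1f}% of original size")
print(f"detailed area: {compressed_percent(detailed):.1f}% of original size")
```

The flat data shrinks to a tiny fraction, while the noise hardly shrinks at all, yet both come back exactly as they went in.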

The most common image file formats, the most important for general purposes today, are JPG, TIF, PNG and GIF. These are not the only choices of course, but they are good and reasonable choices for general purposes. Newer formats like JPEG 2000 never acquired popular usage, and are not supported by web browsers, and so are not the most compatible choice.

We all have our notions, but RAW files are popular indeed, from most DSLR cameras. When we normally take any digital picture, the camera has a RAW sensor, but processes and outputs the image as a JPG file. But often we can choose to output the original RAW image instead, to defer that JPG step until later. We cannot view or use that RAW file any way other than to process it in computer software and then output a final TIF or JPG image. Postponing this processing offers a few serious advantages, better editing options, and we can bypass all JPG artifacts entirely, until the one final output Save for whatever purpose. RAW allows us to tweak exposure, and defer White Balance decisions until later, when we can see the image first, and judge any trial results. The 12 bit RAW file offers greater range for any of our adjustments.

The Next button will browse through the descriptions on the next pages, or you can use these shortcut links directly:

PNG Format TIF Format JPG Format GIF Format


Copyright © 1997-2010 by Wayne Fulton - All rights are reserved.
