Wednesday, January 27, 2010

TIFF page extraction using JAI (Java Advanced Imaging)

     One of the most important parts of TIFF image development is multi-page TIFF image extraction.  For image processing work, a number of third-party software packages are available in today’s market.  But when it comes to online image processing, you either write your own code for the image operations you require or use third-party software that supports your application.  In my view, relying on third-party software is not a good way to develop your application.  This is the reason I went for our own implementation.
   
     Fields such as DMS (Document Management System) / document image processing, digital photography, defense and intelligence always require image operations.  Multi-page TIFF image extraction is not an important or required part of my DMS project, but knowledge of TIFF image extraction is good for future prospects.  Some code for TIFF extraction is available on the web (it just takes a good Google search).  To implement the imaging part I decided to go with JAI (Java Advanced Imaging).  Further down we will see the advantages of JAI over other imaging APIs.

    At the time of TIFF page extraction, the main thing we need to take care of is output image quality.  For me, extracting the pages from a TIFF image is not critical work, and quality hardly matters for my requirements.  That’s why I only need to extract the pages into another image type such as .jpeg or .png.  I used both types for extraction: JPEG for color images and PNG for black and white (binary/bilevel) images.  Maintaining the quality of the output image means setting the metadata for that image, e.g. the DPI, bit depth, and horizontal and vertical resolutions.  I used the default metadata for the output image, which gives 96 dpi horizontal and vertical resolution.
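
The rule of JPEG for color pages and PNG for bilevel pages can be automated. The following is only a minimal sketch using standard Java classes; the class and method names (OutputTypeChooser, chooseOutputType) are made up for illustration, and it assumes that "bilevel" means exactly 1 bit per pixel.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.RenderedImage;

class OutputTypeChooser {

    /**
     * Returns "png" for bilevel (1 bit per pixel) pages and "jpeg" otherwise.
     * Hypothetical helper, not part of JAI.
     */
    static String chooseOutputType(RenderedImage page) {
        ColorModel colorModel = page.getColorModel();
        // Some RenderedImage implementations may return null here;
        // in that case this sketch falls back to "jpeg".
        boolean bilevel = colorModel != null && colorModel.getPixelSize() == 1;
        return bilevel ? "png" : "jpeg";
    }

    public static void main(String[] args) {
        RenderedImage scan = new BufferedImage(8, 8, BufferedImage.TYPE_BYTE_BINARY);
        RenderedImage photo = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        System.out.println(chooseOutputType(scan));  // prints "png"
        System.out.println(chooseOutputType(photo)); // prints "jpeg"
    }
}
```

The returned string could be passed as the output type when a page is written, so the caller does not have to know in advance whether the TIFF is a color or a black and white scan.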


JAI (Java Advanced Imaging):
Implements a set of core image processing capabilities.
Includes image tiling, regions of interest, and deferred execution.
Offers a set of core image processing operators.
Supports image processing using the Java programming language.

Why JAI?:
JAI offers lots of advantages for developers.
Platform Independent:  JAI applications will run on any machine where a Java Virtual Machine is available.  JAI follows the Java runtime library model, which provides the platform independence.

Object-oriented programming:  The JAI API is object oriented.  It follows object instantiation, the flow of process data, the concept of subclasses and parent classes, etc.

High Performance:  Because the API is device independent, different implementations are possible behind it, including ones optimized for a particular platform.

Distributed Imaging:  JAI is also well suited to client-server imaging.  Remote method invocation (RMI) allows Java code on a client to invoke methods on objects that reside on another machine, without moving those objects.


Classes to Import:
import java.awt.image.RenderedImage;
import java.awt.image.renderable.ParameterBlock;
import javax.media.jai.JAI;
import javax.media.jai.RenderedOp;
import com.sun.media.jai.codec.FileSeekableStream;
import com.sun.media.jai.codec.ImageCodec;
import com.sun.media.jai.codec.ImageDecoder;
import com.sun.media.jai.codec.SeekableStream;
import java.io.File;
import java.io.IOException;

(Sun Microsystems offers a detailed API reference for the classes mentioned above.  Search the Sun site for the related API documentation.)

Source Code:
/*
 * Extracts every page of a multi-page TIFF into a separate image file.
 * tiffFilePath: path of the input TIFF image
 * outputFileType: output image type ("jpeg" for color
 * images and "png" for bilevel images)
 */
public static void extractMultiPageTiff(String tiffFilePath,
    String outputFileType) throws IOException {

    File file = new File(tiffFilePath);
    /*
     * SeekableStream is used for taking input from the file.
     * FileSeekableStream is not a committed part of the JAI API.
     */
    SeekableStream seekableStream = new FileSeekableStream(file);
    try {
        ImageDecoder imageDecoder = ImageCodec.createImageDecoder("tiff",
                seekableStream, null);

        /* count no. of pages available inside input tiff file */
        int count = imageDecoder.getNumPages();

        /*
         * set output folder path: the input path without its
         * extension (a dot in a parent folder name is ignored)
         */
        String outputFolderName = tiffFilePath;
        int lastSeparator = Math.max(tiffFilePath.lastIndexOf('/'),
                tiffFilePath.lastIndexOf('\\'));
        int lastDot = tiffFilePath.lastIndexOf('.');
        if (lastDot > lastSeparator + 1) {
            outputFolderName = tiffFilePath.substring(0, lastDot);
        }
        /*
         * create file object of output folder
         * and make a directory
         */
        File fileObjForOPFolder = new File(outputFolderName);
        fileObjForOPFolder.mkdirs();

        /* file extension matching the output type ("jpeg" -> "jpg") */
        String extension = "jpeg".equalsIgnoreCase(outputFileType)
                ? "jpg" : outputFileType.toLowerCase();

        /* extract each page available inside the input tiff file */
        for (int i = 0; i < count; i++) {
            /*
             * RenderedImage produces the image data
             * of one page in the form of Rasters
             */
            RenderedImage page = imageDecoder.decodeAsRenderedImage(i);
            File fileObj = new File(fileObjForOPFolder,
                    (i + 1) + "." + extension);
            /*
             * ParameterBlock creates a generic
             * interface for parameter passing
             */
            ParameterBlock parameterBlock = new ParameterBlock();
            /* add source of page */
            parameterBlock.addSource(page);
            /* add o/p file path */
            parameterBlock.add(fileObj.toString());
            /* add o/p file type */
            parameterBlock.add(outputFileType);
            /* create output image using JAI filestore */
            RenderedOp renderedOp = JAI.create("filestore",
                    parameterBlock);
            renderedOp.dispose();
        }
    } finally {
        /* release the file handle even if decoding fails */
        seekableStream.close();
    }
}
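
For readers without JAI on the classpath, the same idea can be sketched with the standard javax.imageio API. This is not the JAI method shown above but a swapped-in alternative, and it assumes Java 9 or later, where a TIFF reader ships with the JDK. The class name ImageIoTiffExtractor and the method extractPages are hypothetical names chosen for this sketch.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

class ImageIoTiffExtractor {

    /**
     * Writes each page of the TIFF into outputDir as 1.ext, 2.ext, ...
     * and returns the number of pages written.
     */
    static int extractPages(File tiffFile, File outputDir, String outputFileType)
            throws IOException {
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create folder " + outputDir);
        }
        try (ImageInputStream input = ImageIO.createImageInputStream(tiffFile)) {
            if (input == null) {
                throw new IOException("Cannot open " + tiffFile);
            }
            Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("tiff");
            if (!readers.hasNext()) {
                throw new IOException("No TIFF reader available (Java 9+ assumed)");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(input);
                // true = scan the whole file if needed to count the pages
                int pages = reader.getNumImages(true);
                for (int i = 0; i < pages; i++) {
                    BufferedImage page = reader.read(i);
                    File out = new File(outputDir, (i + 1) + "." + outputFileType);
                    // ImageIO.write returns false when no writer handles the format
                    if (!ImageIO.write(page, outputFileType, out)) {
                        throw new IOException("No writer for " + outputFileType);
                    }
                }
                return pages;
            } finally {
                reader.dispose();
            }
        }
    }
}
```

A call such as ImageIoTiffExtractor.extractPages(new File("scan.tif"), new File("scan"), "png") would write scan/1.png, scan/2.png and so on; the file names here are only examples.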

   
References:
Java Advanced Imaging API: http://java.sun.com/javase/technologies/desktop/media/
Java Advanced Imaging Website: https://jai.dev.java.net/
Programming in JAI (Guide for developers):
http://java.sun.com/products/java-media/jai/forDevelopers/jai1_0_1guide-unc/



Note:  Detailed information on JAI is available on the Sun site. If you have a better idea or know a simpler and better solution than this, please share it in the comments on this post.  Suggestions from your side are always welcome.

Thursday, January 21, 2010

TIFF Development In Java

This blog contains only my own work experience in R&D and development related to TIFF image files using Java technology. It covers only some basic information and operations. This post provides a basic idea of the Tagged Image File Format. I spent a lot of time completing this task, and I think it will be very useful for new Java image developers. Adobe provides everything related to TIFF images (specification, developer resources, libraries etc.); the only thing you need to do is code as per your requirements. Later on I will add the development and implementation part of TIFF extraction and TIFF creation. For the detailed specification, you can visit the official links in the reference section at the end of this document.

TIFF (Tagged Image File Format):
TIFF is one of the most popular and flexible image file formats in use. The format was originally developed by the company Aldus (now part of Adobe Systems). Adobe Systems now holds the copyright to the TIFF specification. TIFF is the standard format in DMS (document management systems), where its most popular compression scheme is CCITT Group IV 2D compression, which supports black and white images.
To save storage capacity in high-volume scanning, documents are scanned in bitonal format (black and white) rather than color or grayscale. TIFF offers a choice of compression schemes, including lossless ones. It is supported by a large number of imaging programs, applications and word processors. It can save a multi-page document to a single TIFF file rather than to separate files. TIFF files are also useful for transporting images from one application to another or from one machine to another, as the format is designed to be independent of any particular hardware or software.

File Format Information:
(Image File Format)
- Code : TIFF
- File Extension : .tiff, .tif
- Media type : image/tiff, image/tiff-fx
- Compression : lossless (e.g. LZW, Deflate, PackBits, CCITT) or lossy (JPEG)
- Pixel types supported : a. 1 – 64 bit integer, signed/unsigned.
                          b. 32/64 bit IEEE floating point.
- Capability : black and white, grayscale, palette color and full color image data
- Developed by : Aldus (now Adobe Systems)

Why TIFF Development:
From the development point of view:
- Flexible and platform-independent image format for development and storage.
- TIFF supports lossless image creation and compression.
- Supported by a large number of image manipulation programs.
- Developers can use their own private tags in TIFF.
- Developers can include their own information inside the TIFF file.
- Saves storage space when a large volume of scanned data must be stored.
- TIFF readers ignore developers’ private tags that they do not recognize.
- Multiple images can be stored within a single file.
- Developers can choose the best color scheme for their requirements.
- No license from Adobe is needed to implement a TIFF reading and writing application.
- Adobe provides the full TIFF specification for download.
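
The point above about storing multiple images within a single file can be sketched in Java as follows. This is only an illustration under stated assumptions: it uses the TIFF writer that ships with javax.imageio in Java 9 or later (not JAI), it picks LZW compression as an example, and the names MultiPageTiffWriter and write are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Iterator;
import java.util.List;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

class MultiPageTiffWriter {

    /** Writes all pages into one TIFF file, one image per page. */
    static void write(List<BufferedImage> pages, File target) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            throw new IOException("No TIFF writer available (Java 9+ assumed)");
        }
        ImageWriter writer = writers.next();
        try {
            // Ask for lossless LZW compression explicitly (example choice).
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionType("LZW");

            // Remove an old file first so no stale bytes remain at its end.
            Files.deleteIfExists(target.toPath());
            try (ImageOutputStream output = ImageIO.createImageOutputStream(target)) {
                if (output == null) {
                    throw new IOException("Cannot open " + target);
                }
                writer.setOutput(output);
                writer.prepareWriteSequence(null);
                for (BufferedImage page : pages) {
                    writer.writeToSequence(new IIOImage(page, null, null), param);
                }
                writer.endWriteSequence();
            }
        } finally {
            writer.dispose();
        }
    }
}
```

Each call to writeToSequence appends one page, so the resulting file holds as many images as the list contains.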

Color space and compression scheme:
TIFF allows for a wide range of different compression schemes and color spaces.
The TIFF ImageWriter supports the following compression types:
CCITT RLE
- Only applicable for Black and White Images (i.e. bilevel/binary images)
- Modified Huffman compression
CCITT T.4
- Only applicable for Black and White Images (i.e. bilevel/binary images)
CCITT T.6
- Only applicable for Black and White Images (i.e. bilevel/binary images)
LZW
- Lossless compression
- Used when image quality must be preserved rather than file size minimized
JPEG
- Lossy compression
- Reduces file size to a minimum
- Applicable for grayscale (1-band) and RGB (3-band) images only
ZLib
- Same as Deflate compression except for the value stored in the TIFF Compression field
PackBits
- Default lossless compression with RGB color space.
Deflate
- Lossless compression (like LZW)
- Used when image quality must be preserved rather than file size minimized
EXIF JPEG
- Regular JPEG compressed image
TIFF_NO_COMPRESSION
- TIFF with no compression.
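
The compression types a TIFF writer accepts can also be queried at run time. The sketch below assumes the TIFF plugin bundled with javax.imageio in Java 9 or later, whose list may not match the one above exactly; TiffCompressionTypes and list are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;

class TiffCompressionTypes {

    /** Returns the compression type names reported by the first TIFF writer found. */
    static List<String> list() {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
        if (!writers.hasNext()) {
            return Collections.emptyList(); // no TIFF plugin installed
        }
        ImageWriter writer = writers.next();
        try {
            ImageWriteParam param = writer.getDefaultWriteParam();
            if (!param.canWriteCompressed()) {
                return Collections.emptyList();
            }
            String[] types = param.getCompressionTypes();
            return types == null ? Collections.<String>emptyList() : Arrays.asList(types);
        } finally {
            writer.dispose();
        }
    }

    public static void main(String[] args) {
        for (String type : list()) {
            System.out.println(type);
        }
    }
}
```

One of the returned names can then be passed to ImageWriteParam.setCompressionType after the compression mode has been set to MODE_EXPLICIT.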

Main Advantages and Disadvantages:
Advantages:
- Lossless Compression
- Platform independent
- Good compression types
- Less space required for storage
Disadvantages:
- Difficult to store
- Large file format
- Ambiguity in some compression types (e.g. JPEG compression)

Reference:
- For TIFF Specification, TIFF Registration and TIFF developer resources:
http://partners.adobe.com/public/developer/tiff/index.html
- LibTIFF Home Page: http://www.remotesensing.org/libtiff/
- For future development: http://www.adobe.com


Note: Please feel free to comment on this topic. Suggestions and questions from your side are always welcome.