Federation for a Free Informational Infrastructure

c42pdf

c42pdf version 0.12 build 2001-12-23
convert CCITT4 multipage 6.0 TIFF to A4 PDF 1.1
name: *C*CITT *4* TIFF images *2* (=to) *PDF*

Last touch of this page: 2 Aug 2006

"Images in general can be quite revolting." -  Ian Stewart, Foundations of Mathematics

Purpose: Until now, this program (which evolved from some amateur experimentation with Thomas Merz's PDFlib v 0.6 library and code) has mainly been used for the fast conversion of medium-sized (1-200 pages) black/white scans, e.g. of journal articles or other scientific documents (sometimes including size reduction to allow for headers to be added in a second step). As the "ecological niche" it inhabits is so tiny, it should be one of the rather fast programs for CCITT4 TIFF->PDF conversion; please respond if you find faster programs.

c42pdf FAQ
G4-compressed-TIFF-to-PDF-conversion FAQ

Download:
The highest number is the newest version. If the binary for your platform is not up to date, compile it yourself and send me a copy.
c42pdf: source code: 0.12, 0.11, 0.10, 0.09, 0.08 (Compilation instructions)
c42pdf: AIX binary 0.12 (RISC System/6000), 0.08 (Rainer Lichtblau)
c42pdf: BSDi binary 0.09 (BSDi 3.1), 0.08 (BSDi 4.0, Ilya Oussov)
c42pdf: FreeBSD binary 0.10 (FreeBSD 4.2/intel), 0.09 (FreeBSD 4.2/intel), 0.08 (FreeBSD 2.2.7, Ilya Oussov)
c42pdf: HPUX binary 0.12, 0.11 (RISC 1.1; Don Verhagen)
c42pdf: Linux binary 0.12, 0.11 (2.2.17), 0.10 (2.2.17), 0.09 (2.0.34), 0.08
c42pdf: Mac OS X binary 0.12, 0.12 hqx file (compiled on OS X 10.2.8 by David Newman; after download, expand with Stuffit Expander and run "chmod a+x c42pdf" to make the extracted program executable)
c42pdf: OpenBSD binary 0.10 (2.60/intel), 0.09 (2.60/intel), 0.08 (OpenBSD 2.80; Gaute Lundal)
c42pdf: Solaris binary 0.10 (SunOS 5.8/ultra4 sparc, John A Kunze), 0.09 (SunOS 5.5.1/sparc), 0.08 (Andreas Gieseler, Rainer Lichtblau)
c42pdf: Windows binary 0.12, 0.11, 0.10, 0.09, 0.08

Features:

- runs from command line
- fast processing, because CCITT4 data is dumped directly to the output and is not uncompressed into memory
- scaling option (-s), free vertical placement of reduced-format images on the page (-b option), cropping option (-c), rotating option (-R)
- can unite list of single TIFF files (-l option)
- can create long sample files from single TIFF images (-r switch)
- distributable under Aladdin Free Public License (see file "copyright")
- no log files, command line options can be viewed via Acrobat Reader's file/document info/general info or by searching raw PDF in a text editor for "Creator:"
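The "directly dumped" feature can be illustrated with a short Python sketch. This is not the c42pdf source (which is C code derived from PDFlib), only a minimal illustration of the principle under some assumptions: the input is one raw Group 4 strip whose pixel width and height are already known, the page is A4 (595x842 points), and photometric interpretation and resolution are ignored. The function name g4_to_pdf and the resource name Im0 are hypothetical. The compressed bytes are copied unchanged into a PDF image stream that declares the CCITTFaxDecode filter, so the viewer does the decoding.

```python
def g4_to_pdf(g4_data, width_px, height_px, page_w=595, page_h=842):
    """Wrap one raw CCITT Group 4 strip in a one-page PDF without decoding it."""
    # Image XObject: /K -1 tells the CCITTFaxDecode filter the data is Group 4.
    image = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace /DeviceGray /BitsPerComponent 1 "
        b"/Filter /CCITTFaxDecode "
        b"/DecodeParms << /K -1 /Columns %d /Rows %d >> /Length %d >>"
        % (width_px, height_px, width_px, height_px, len(g4_data))
    ) + b"\nstream\n" + g4_data + b"\nendstream"

    # Page content: stretch the unit image square over the whole page.
    content = b"q %d 0 0 %d 0 0 cm /Im0 Do Q" % (page_w, page_h)

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %d %d] "
        b"/Resources << /XObject << /Im0 4 0 R >> >> /Contents 5 0 R >>"
        % (page_w, page_h),
        image,
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.1\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(objects) + 1, xref_pos))
    return bytes(out)
```

Since nothing is decompressed, the cost is essentially that of copying the input once, which is consistent with the speed claim above.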

Limitations:

- generated PDF files may not be longer than 1,000 pages; in case this is not enough, there is other software that can do it (see FAQ)
- output PDF has a flat page tree ("linked list")
- only part of TIFF 6.0 specification is used
- only CCITT4 TIFFs are handled (use www.fastio.com's tiff2pdf, pnmtotiff, Tiffkit, ImageMagick or ghostscript to convert from other formats on the command line); for detailed examples and guidance see FAQ
- some functions I rarely use (e.g. nostretch) have not been tested extensively
- no support for multistrip images (for those, version 0.08 hideously produces only blank files and version 0.09 gives error messages)
- version 0.08 is not safe from buffer overflows if used maliciously (the only scenario I could imagine where this could be a problem is when you allow users to convert files with arbitrary filename length via a web CGI interface, which is not a good idea anyway); this has been fixed in version 0.09.
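Two of the limitations above (CCITT4 only, no multistrip images) can be checked before conversion. The following Python sketch is not part of c42pdf; it is a hypothetical pre-flight check that walks the image file directories of a TIFF and reports, for each page, the Compression tag (259, where 4 means Group 4) and the number of entries in StripOffsets (tag 273). The function names scan_tiff and looks_convertible are made up for this example.

```python
import struct

COMPRESSION = 259    # TIFF tag: 4 = CCITT Group 4
STRIP_OFFSETS = 273  # TIFF tag: its count is the number of strips


def scan_tiff(data):
    """Return one (compression, strip_count) tuple per page (IFD)."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None or struct.unpack(order + "H", data[2:4])[0] != 42:
        raise ValueError("not a TIFF file")

    pages = []
    seen = set()
    (offset,) = struct.unpack(order + "I", data[4:8])
    while offset and offset not in seen:  # 'seen' guards against IFD loops
        seen.add(offset)
        (entries,) = struct.unpack_from(order + "H", data, offset)
        compression, strips = 1, 0        # TIFF default: 1 = uncompressed
        for i in range(entries):
            pos = offset + 2 + 12 * i     # each IFD entry is 12 bytes
            tag, _ftype, count = struct.unpack_from(order + "HHI", data, pos)
            if tag == COMPRESSION:
                # SHORT values sit left-justified in the 4-byte value field
                (compression,) = struct.unpack_from(order + "H", data, pos + 8)
            elif tag == STRIP_OFFSETS:
                strips = count
        pages.append((compression, strips))
        (offset,) = struct.unpack_from(order + "I", data,
                                       offset + 2 + 12 * entries)
    return pages


def looks_convertible(data):
    """True if every page is Group 4 compressed and stored as a single strip."""
    return all(c == 4 and s == 1 for c, s in scan_tiff(data))
```

A file for which looks_convertible returns False would first have to be re-saved as single-strip Group 4 by one of the tools named above.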

Usage:

"c42pdf" display summary of options
"c42pdf mytif.tif" make mytif.pdf from mytif.tif
"c42pdf mytif1.tif mytif2.tif" make mytif1.pdf from mytif1.tif and mytif2.tif.
Input files can be (mixed) single page or multipage TIFFs.
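The default output name follows from the first input file: the extension is replaced by ".pdf", and since version 0.12 the name is cut at the last dot rather than the first. A minimal Python sketch of that rule follows (default_output_name is a hypothetical helper, not a function of c42pdf, and how c42pdf treats names without any dot is an assumption here).

```python
import os.path


def default_output_name(first_input):
    """Replace the extension after the last dot with '.pdf'."""
    stem, _ext = os.path.splitext(first_input)  # splits at the last dot
    return stem + ".pdf"
```

With this rule "mytif.tif" becomes "mytif.pdf" and "scan.2001.tif" becomes "scan.2001.pdf"; use "-o" whenever a different name is wanted.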

Options:

-h print documentation

-o output file other than *.pdf, e.g. "c42pdf -o newname.pdf mytif.tif" makes newname.pdf instead of the default output mytif.pdf

-p paper format other than A4 (default); options are "-p B5", "-p l" (US Letter), "-p o" (original size as defined by TIFF image data) or "-p 1190x1684" (width 1190 times height 1684 points, which will e.g. give you A2)
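For custom sizes the "-p" values are in points (1 point = 1/72 inch, 1 inch = 25.4 mm). A small Python sketch of the conversion, with mm_to_points as a hypothetical helper name:

```python
def mm_to_points(mm):
    """Convert millimetres to PDF points (72 points per inch)."""
    return mm * 72.0 / 25.4


# A2 is 420 mm x 594 mm, i.e. about 1190.55 x 1683.78 points,
# which is within one point of the 1190x1684 quoted above.
a2_width, a2_height = mm_to_points(420), mm_to_points(594)
```

The same arithmetic gives about 595 x 842 points for the default A4 (210 mm x 297 mm).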

-s scale other than 1.0 (original), e.g. "-s 0.7" reduces area covered by image to 0.7*0.7, centered unless "-b" option used

-b bottom in points, combine with the "-s" option, e.g. "-s 0.8 -b 0" places 0.8*0.8-shrunken image in the page bottom center. Hint: Use "-s" for adding headers and footers in a second step (e.g. by the commercial Compose toolkit, http://www.ambia.com), use combined "-s -b" to add headers only.
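How "-s" and "-b" interact can be sketched as simple geometry. The Python function below is an illustration under assumptions, not the c42pdf code: it assumes the unscaled image fills the page, that scaling shrinks width and height by the same factor, and that "-b" only replaces the vertical centering. The name place_image is hypothetical.

```python
def place_image(page_w, page_h, scale=1.0, bottom=None):
    """Return (x, y, width, height) of the image box in points."""
    width, height = page_w * scale, page_h * scale
    x = (page_w - width) / 2.0            # always centered horizontally
    if bottom is None:
        y = (page_h - height) / 2.0       # "-s" alone: centered vertically
    else:
        y = float(bottom)                 # "-s" with "-b": fixed lower edge
    return x, y, width, height


# "-s 0.8 -b 0" on A4 (595x842 points)
x, y, w, h = place_image(595, 842, scale=0.8, bottom=0)
header_room = 842 - (y + h)               # space left free at the top
```

Under these assumptions "-s 0.8 -b 0" leaves roughly 168 points at the top of an A4 page for a header, while "-s 0.8" alone splits that space evenly between top and bottom.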

-l read a list of file names delimited by whitespace (blanks, tabs, linefeeds etc.) as input, e.g. "c42pdf -o all.pdf -l mylist.txt" merges the files named in mylist.txt into all.pdf. The "-l" switch must be preceded by the "-o" switch. Hint: make such lists under Windows/DOS by "dir /b *.* > mylist.txt"; under Linux "ls * > mylist.txt"
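Because the list is split on whitespace, a file name that itself contains a blank would be read as two names. The Python sketch below is a hypothetical helper script, not part of c42pdf: it writes a sorted list of the *.tif files in a folder, refuses names containing whitespace, and reads a list back the same way the "-l" description implies (splitting on any whitespace).

```python
from pathlib import Path


def write_list(folder, list_name="mylist.txt", pattern="*.tif"):
    """Write the matching file names, one per line, and return them."""
    names = sorted(p.name for p in Path(folder).glob(pattern))
    bad = [n for n in names if any(ch.isspace() for ch in n)]
    if bad:
        raise ValueError("whitespace in file names: %r" % bad)
    (Path(folder) / list_name).write_text("\n".join(names) + "\n")
    return names


def read_list(list_path):
    """Split a list file on blanks, tabs and linefeeds."""
    return Path(list_path).read_text().split()
```

Restricting the pattern to *.tif also keeps other files in the folder out of the list, which "dir /b *.*" and "ls *" do not.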

-c <cropbox>: (leftxbottomxrightxtop): defines a crop box, e.g. "-c 45x35x45x35" crops thirty-five pixels at page bottom and top and forty-five pixels at page left and right. The information you hide by cropping (such as scanner margins) is not really lost; it can always be recovered by resetting the crop box to lower values in a PDF manipulation program like Acrobat Exchange.
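In a PDF, a crop box is stored as a rectangle (lower-left x, lower-left y, upper-right x, upper-right y). The Python sketch below shows how a "leftxbottomxrightxtop" value could map onto such a rectangle. It assumes the four numbers are margins measured inward from each page edge in the same units as the page size; whether c42pdf computes it exactly this way is not confirmed here, and crop_box is a hypothetical name.

```python
def crop_box(page_w, page_h, spec):
    """Turn 'leftxbottomxrightxtop' margins into (llx, lly, urx, ury)."""
    parts = spec.split("x")
    if len(parts) != 4:
        raise ValueError("expected four values separated by 'x'")
    left, bottom, right, top = (float(v) for v in parts)
    return (left, bottom, page_w - right, page_h - top)
```

For an A4 page of 595x842 points, "45x35x45x35" would then give the rectangle (45, 35, 550, 807); setting all four values back to 0 restores the full page, which is why the cropped margins are hidden rather than lost.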

-r <repetitions>: x times, default: 1: May be useful for creating huge sample files, e.g. "-r 500" repeats each image 500 times.

-R <rotate>: rotate by x degrees

-t <Group3 (three)>: also allow raw conversion of less efficient Group 3 compression

--noflip: override automatic flipping (landscape/portrait) of page orientations

--nostretch: override automatic stretching of images when samples are smaller than paper size

--lockstretch: allow automatic stretching of images without changing the aspect ratio
 

Acknowledgements: This program uses PDFlib v 0.6, http://www.pdflib.com/, (C) 1997-98 Thomas Merz, Aladdin Free Public License applies. Also, major parts of the enclosed code are directly modified from the PDFlib library code (and elegance has been sacrificed for having a ready-to-run application). Development was with the gcc compiler; for details of compilation see the readme file in the source code. I also want to express thanks to Richard Urquhart, Thomas Merz, Hideki Watanabe and Rolf Macht for support and encouragement as well as Rick Lightbody, Rolf from Ars Digita, Wayne Ugorek, Brian Polak, G I Warden, Steve Reich-Rohrwig, Eberhard Opitz, Chad Armond, Antonio Costa Almeida for useful bug/feature reports. Special thanks go to James Y Hope for fixing a nasty resolution bug and hinting at how to implement the G3 option. I apologize that many requests resulted only in fixes to the documentation, for the code is a mess and maybe should better be left untouched like those old COBOL programs...

License: Currently c42pdf inherits the Aladdin Free Public License from PDFlib, and it also has to be distributed with that license (enclosed document "copyright.txt" or http://www.cs.wisc.edu/~ghost/aladdin/doc/Public.html). Basically, you can distribute the program and modify the source code, but commercial distribution, e.g. selling the program, is limited. Commercial *usage* of the program does not require any license, so you do not have to worry about licenses at all if you do not redistribute. Although as of 1999 PDFlib was the most liberally distributed PDF C library, AFPL is more restrictive than other public licenses (GPL, Perl Artistic). As of 2000 a GPLed PDF library has emerged in the Panda project, but I have not found the time to recode. For the ideas and code alterations for c42pdf the author disclaims any copyright restrictions, so in case the underlying PDFlib should at any time be released under a more free license (such as e.g. GPL, Debian, GPLL, Perl Artistic), c42pdf is also distributable under that license by default. From my side, you are also explicitly welcome to recode this program under a more free license; if you do so without loss of functionality you may inherit its name and URL. Feel free to contact me about this.

DISCLAIMER: No legal warranties are given for the usability of this software or any arising damages from its use. Especially, only part of TIFF specification has been used and tested, but I would like to hear if important parts for you are missing.

Contact: For further documentation, source code, problem reports or registration (optional), contact Holger Blasum via http://c42pdf.ffii.org, c42pdf ATT ffii DOTT org. I had no time to check all the options in combination with various formats (other than what I had in everyday usage), so bug reports (best: including TIFF images) or improvements to the documentation are welcome!

History

On 22 July 2004 Joerg Wittner pointed out the bug that c42pdf ignored the 'A4' papersize option when given explicitly; now fixed.

On 12 Dec 2003 fixed overly long strings in usage.c (thanks to a patch by Nelson H. F. Beebe) that caused problems in some compilers. No change of compiled binaries, so no new version number and no need to upgrade.

Bug fix version 0.12 (23 Dec 2001, credits to Francis Bell) ensures that default filenames are guessed following the last dot (not the first) and fixes a buffer overflow issue with filenames longer than 253 characters.

New version 0.11 (9 Nov 2001) addresses some bugs reported by Don Verhagen in the nostretch, noflip and lockstretch options. These are only relevant if you want to use these options; no changes in the core have been made.
Addendum (2001-07-28): version 0.09 fixes some bugs (scaling, buffer overflows, rotating, photometric interpretation) in version 0.08. There is no urgent need to upgrade if you are happy with your working copy. If version 0.09 does not produce the desired output, try using version 0.08 (and please do not hesitate to report bugs in 0.09).
Addendum (2000-06-10): c42pdf can be optimally used where you produce the scans yourself or you can be sure they all come from one source. For blind conversion of whole repositories of legacy documents, use with caution (see limitations).
Addendum (1999-05-27): convergent evolution is a common phenomenon for small niches :-). A similar tiff2pdf had been submitted to pdfzone.com six days before c42pdf. As of today, I still claim c42pdf to be faster (due to less loading overhead for the tinier binary?) and the source (library) is more open, but it would be unfair not to mention the competing fastio.com site for other options, file formats, OS etc.