c42pdf FAQ

Last update: 20 July 2003, Holger Blasum, c42pdf ATT ffii DOTT org

Who uses c42pdf ?

Due to its small size and easy portability, c42pdf is used both by individual end users and by some university document delivery services (here in Germany). To find examples, try a search engine that also indexes PDF metadata.

What competing products are there ?

See the general FAQ.

Why does c42pdf only convert CCITT G4 files (James) ?

c42pdf was written at a time when the optimal compression scheme for black/white scans was CCITT G4 compression, which was supported in both the TIFF and PDF standards. As both TIFF and PDF scanned images consist of huge chunks of binary data embedded in tiny header information, it was an easy task to write a fast converter with high image throughput.

Now the PDF standard has been widened to include Flate compression as an alternative, whereas AFAIK the TIFF standard does not have Flate compression. Flate compression is about 0-15% more efficient (and, BTW, a technically simpler and smarter way). So if you want size-optimized PDF files that will be read and downloaded many times, I recommend not using c42pdf, because it does not produce the best output (alternatives below).

c42pdf does, however, produce "second-best" output (CCITT G4 compression) and is very fast at that (usually at least 10 times faster than any program doing an actual CCITT G4->Flate conversion). If you are a home user or a business and c42pdf works for you, using it could save you the trouble of installing more "professional" alternatives. Usually you will be able to tell your scanner to produce CCITT G4 compressed TIFFs right from the beginning.

As a safety feature, c42pdf by default does not convert other TIFF compression schemes in pass-through mode (although the implementation would be trivial), because that would result in compression about 20-40% less efficient than CCITT G4 in the case of CCITT G3, and typically 1500% less efficient for uncompressed TIFFs. PDFs containing uncompressed image data would seriously disgruntle your users and thwart your friendly system administrator's backup scheme.

There is a -t switch to allow the 20-40% less efficient CCITT G3 compression if you know what you are doing, but there is no way to produce uncompressed images (unless you adapt the source and recompile).
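To see in advance which compression scheme a TIFF uses, one can read the Compression tag (tag 259) from the file's first image file directory. The following is a minimal Python sketch, not taken from the c42pdf source; the function names are made up for illustration, and only the first page of the file is inspected.

```python
import struct

COMPRESSION_TAG = 259        # TIFF "Compression" tag
SHORT = 3                    # TIFF field type: 16-bit unsigned
CCITT_G3, CCITT_G4 = 3, 4    # Compression values for CCITT Group 3 / Group 4


def tiff_compression(data):
    """Return the Compression value of the first IFD in raw TIFF bytes."""
    if data[:2] == b"II":
        endian = "<"         # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"         # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(n_entries):
        # Each IFD entry is 12 bytes: tag, field type, count, value/offset.
        tag, field_type, _count, raw = struct.unpack_from(
            endian + "HHI4s", data, ifd_offset + 2 + 12 * i)
        if tag == COMPRESSION_TAG:
            fmt = "H" if field_type == SHORT else "I"
            return struct.unpack_from(endian + fmt, raw)[0]
    return 1                 # tag absent: TIFF default is "no compression"


def is_ccitt_g4(data):
    """True if the first page is CCITT G4 compressed."""
    return tiff_compression(data) == CCITT_G4
```

For a file on disk, something like `is_ccitt_g4(open("scan.tif", "rb").read())` would do; the file name is only an example.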

Image size of PDFs

This will mainly concern engineering drawings.

By default, c42pdf produces PDF version 1.1. If your image size is larger than 3240x3240 pixels, you will get a warning (at least from c42pdf version 0.13 on).

AFAIK, the PDF spec is fully backward-compatible, so it is possible just to increase the version number to 1.3 or anything else.

Setting the version number to 1.3 extends the size limit from 3240x3240 units to 14,400x14,400 units. The drawback is that people using Acrobat Reader below version 4 may not be able to display the file.
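The two limits can be turned into a small check. This is a hedged Python sketch with made-up function names; it assumes that the "units" are PDF default user space units of 1/72 inch (3240 units = 45 inches, 14,400 units = 200 inches) and that the scan resolution in dpi is known. How c42pdf itself maps pixels to units is not stated here, so the conversion is an assumption.

```python
# (PDF version, maximum page edge in units) as given in the text above.
LIMITS = (("1.1", 3240), ("1.3", 14400))


def pixels_to_units(pixels, dpi):
    """Convert a pixel count to 1/72 inch units (assumed unit size)."""
    return pixels * 72.0 / dpi


def min_pdf_version(width_units, height_units):
    """Return the lowest listed PDF version whose limit fits the page.

    Returns None if the page exceeds even the 1.3 limit.
    """
    longest = max(width_units, height_units)
    for version, limit in LIMITS:
        if longest <= limit:
            return version
    return None
```

For example, an A0 drawing scanned at 300 dpi is about 9933x14043 pixels, which is roughly 2384x3370 units under these assumptions, so its longer edge exceeds the 1.1 limit but fits within the 1.3 limit.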

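c42pdf currently does not have a command line switch for changing the version information, but nonetheless it can be done very easily by editing the version number in the header at the very start of the generated PDF (%PDF-1.1) bytewise, without changing the length of the file.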
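A minimal Python sketch of such a bytewise change is shown here. It is not part of c42pdf; the function names are hypothetical, and it assumes the file starts with a header of the form %PDF-x.y at byte 0. Because the new version string has the same length as the old one, all byte offsets stored inside the PDF stay valid.

```python
import re

# A PDF file starts with "%PDF-" followed by a version such as "1.1".
HEADER = re.compile(rb"%PDF-[0-9]\.[0-9]")


def patch_pdf_version(data, new_version="1.3"):
    """Return a copy of the PDF bytes with the header version replaced."""
    if not re.fullmatch(r"[0-9]\.[0-9]", new_version):
        raise ValueError("version must look like '1.3'")
    if not HEADER.match(data):
        raise ValueError("no %PDF-x.y header at start of data")
    # Bytes 5..7 hold the three version characters; the length is unchanged.
    return data[:5] + new_version.encode("ascii") + data[8:]


def patch_pdf_file(path, new_version="1.3"):
    """Overwrite only the 8 header bytes of the file at 'path' in place."""
    with open(path, "r+b") as f:
        head = patch_pdf_version(f.read(8), new_version)
        f.seek(0)
        f.write(head)
```

Calling `patch_pdf_file("drawing.pdf")` (the file name is only an example) would change the header from %PDF-1.1 to %PDF-1.3 and leave the rest of the file untouched.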

Of course, you can also make these changes to the source and then recompile.

Remark for Windows users: the bundled Windows text editors (edit, notepad and write on XP) are not suitable for bytewise manipulation of PDFs or executables like c42pdf.exe. Use an editor with hexadecimal editing capabilities instead (e.g. WinHex, or the hexl-mode of emacs).

Reference: PDF specifications

Why is A4 default size?

Historically, A4 was what I needed first (for Europe), and I did not want to break the interface later. Use the -p option to adapt.

Why does c42pdf just do 1,000 pages (Chris) ?

And when I view page 800 why is the program so slow in jumping from page 1 to page 800 ?

The old PDFlib version I use stores pages internally in a flat list, and there is a hardcoded limit of 1000 pages. You can work around this by joining multiple shorter files afterwards; see the preceding section on concatenation.
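When preparing such shorter files, the input pages have to be divided into groups that each stay within the limit. A minimal Python sketch, assuming the pages are available as a list of per-page input files (an assumption made for illustration; the function name is made up, and the actual conversion and joining steps are left out):

```python
def plan_batches(pages, max_pages=1000):
    """Split a list of pages into consecutive batches of at most max_pages."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [pages[i:i + max_pages] for i in range(0, len(pages), max_pages)]
```

Each batch would then be converted to its own PDF, and the resulting files joined afterwards.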

This limitation does not apply to any of the other alternatives.

Commercial restrictions (Jimmy) ?

There are no restrictions on the commercial use of c42pdf.

The only thing that you cannot do is sell the program itself or sell it as a program within a package of software. If you want to do that, it is of course possible too, but we would need the permission of Thomas Merz, who wrote a C library that c42pdf uses.

License of c42pdf ?

Currently c42pdf inherits the Aladdin Free Public License (AFPL) from PDFlib, and it has to be distributed with that license (enclosed document "copyright.txt" or http://www.cs.wisc.edu/~ghost/aladdin/doc/Public.html). Basically, you can distribute the program and modify the source code, but commercial distribution, e.g. selling the program, is limited. Commercial usage of the program does not require any license, so you do not have to worry about licenses at all if you do not redistribute.

Although as of 1999 PDFlib was the most liberally distributed PDF C library, the AFPL is more restrictive than other public licenses (GPL, Perl Artistic). As of 2000, a GPL-licensed PDF library has emerged in the Panda project, but I have not found the time to recode.

For the ideas and code alterations for c42pdf, the author disclaims any copyright restrictions, so if the underlying PDFlib is at any time released under a freer license (e.g. GPL, Debian, LGPL, Perl Artistic), c42pdf is also distributable under that license by default. From my side, you are also explicitly welcome to recode this program under a freer license. If you do so without loss of functionality, you may inherit its name and URL; feel free to contact me about this.

Can I register c42pdf ?

There is no need to register. If you send an email to the address given at the beginning of this document with the subject "Registration", you will be notified when bugs are found or new versions come out. Registration is free and you can revoke it. Your email address will not be used for other purposes.

How about a proper implementation of multistrip images (James) ?

> Multistrip presentations seem to occur more often with RGB images
> i.e., those with color mappings.

I think that b/w messages are mostly sent in single strips (for whatever reason), so for the moment I will leave it as is. I tried more than a year ago (but have lost the code) to dump multiple strips as one contiguous chunk into a PDF document and did not succeed, so my guess is that it is not possible, but I may well have run into a stupid error. At least now there is (I hope) a clearer error message in that case.

BTW, one can generate TIFFs of various strip sizes (for testing) via ghostscript:

                gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=tiger.tif \
                        -dMaxStripSize=8192 examples/tiger.ps

or pnmtotiff (eg for incoming faxes):

                cat "$1" | g32pbm | pnmtotiff -g4 -rowsperstrip 10000 > "$1.tiff"
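To check how many strips a test file actually contains, one can read the count of the StripOffsets tag (tag 273) in the first image file directory, since a TIFF stores one offset per strip. A minimal Python sketch, independent of c42pdf and with a made-up function name; it looks at the first page only:

```python
import struct

STRIP_OFFSETS_TAG = 273      # TIFF "StripOffsets" tag: one offset per strip


def count_strips(data):
    """Return the number of strips of the first page in raw TIFF bytes."""
    if data[:2] == b"II":
        endian = "<"         # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"         # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(n_entries):
        # First 8 bytes of each 12-byte IFD entry: tag, field type, count.
        tag, _field_type, count = struct.unpack_from(
            endian + "HHI", data, ifd_offset + 2 + 12 * i)
        if tag == STRIP_OFFSETS_TAG:
            return count
    raise ValueError("no StripOffsets tag in first IFD")
```

A result of 1 for `count_strips(open("tiger.tif", "rb").read())` would indicate a single-strip file; anything larger is a multistrip image.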

Future of c42pdf ?

There are no big plans, as it just does its job and the author hates bloatware, but bugs you report will be corrected and/or documented. Ports you submit will be put on the website.

Acknowledgments:

To Christoph Schulze, John A Kunze, James Y Hope, Hartmut Pilch, Greg Falvo, Jimmy Ngo, Dan Cogliano, Bill Gilchrist for comments and questions.