Friday, September 20, 2013

Preliminary support for Type 3 fonts

I'm happy to announce that preliminary support for Type 3 fonts has been added to pdf2htmlEX. For now, two simple test PDFs from PDF.js are handled correctly:

https://github.com/mozilla/pdf.js/blob/master/test/pdfs/simpletype3font.pdf


https://github.com/mozilla/pdf.js/blob/master/test/pdfs/issue3188.pdf


This is actually one of the features I have most wanted to implement since the very beginning. Another is generating background images in SVG, a preliminary version of which has also just been added.

Both features rely on CairoOutputDev from poppler, which in turn relies on cairo and freetype. It might actually be possible to eliminate the dependency on freetype, but I don't want to touch those files, so that merging upstream changes stays easy in the future. Anyway, it seems that poppler itself depends on freetype, so it's no big deal.


To enable this feature, you need the latest source code from git. Add `-DENABLE_SVG=ON` to cmake, and `--process-type3=1` when running pdf2htmlEX.


The current idea is, for each Type 3 font, to dump each glyph into an SVG image and then combine them into a font with FontForge. This was inspired by FontCustom: I learned that FontForge can import SVG glyphs by reading FontCustom's code.
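
To make the "combine them into a font" step concrete, here is a minimal sketch using FontForge's Python bindings. It is not the code in pdf2htmlEX, which is written in C++; the function name `build_font_from_svgs` and the `glyph_svgs` mapping are hypothetical, and the `fontforge` module is passed in as a parameter so the sketch can be read without FontForge installed.

```python
def build_font_from_svgs(ff, glyph_svgs, out_path):
    """Combine per-glyph SVG dumps into a single font file.

    ff         -- the imported `fontforge` module
    glyph_svgs -- hypothetical mapping: code point -> path of that glyph's SVG
    out_path   -- where to write the font; the extension selects the format
    """
    font = ff.font()                            # start from a new, empty font
    for codepoint, svg_path in sorted(glyph_svgs.items()):
        glyph = font.createChar(codepoint)      # create the slot for this code point
        glyph.importOutlines(svg_path)          # load the SVG drawing as its outline
    font.generate(out_path)                     # write the combined font
    return font
```

With FontForge's Python module available, it could be called as `import fontforge; build_font_from_svgs(fontforge, {0x41: "A.svg"}, "type3.ttf")`. Advance widths and the font matrix are deliberately left out here.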

Each glyph is drawn on a 100x100 canvas. Although SVG is a vector format, the canvas size still matters: CairoOutputDev thickens thin strokes (for printing purposes?), which might ruin the font. There are also cases where sampled raster images are stored in the SVG file, which is probably cairo's behaviour due to the limitations of SVG. In such cases, 100x100 might not be large enough for a font.

The size is defined as GLYPH_DUMP_EM_SIZE in font.cc. I tried setting it to 1000, and the quality for `issue3188.pdf` did improve; but for some other PDF files, the values in the SVG files can become so large that FontForge complains they cannot be stored in 16-bit fields. Or maybe this is a limitation of TTF, and I'd better switch to another font format.
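
The trade-off can be illustrated with a bit of arithmetic. The sketch below assumes the limit in question is the signed 16-bit range used for TrueType glyph coordinates, and that glyph coordinates are given in em units (1.0 = one em); the helper `fits_int16` is hypothetical and is not how FontForge performs its check.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a signed 16-bit field

def fits_int16(coords_in_em, em_size):
    """Return True if every coordinate, scaled to font units, fits in int16.

    coords_in_em -- coordinates in em units (an assumption of this sketch)
    em_size      -- units per em, e.g. 100 or 1000
    """
    return all(INT16_MIN <= round(c * em_size) <= INT16_MAX
               for c in coords_in_em)

# A glyph whose drawing extends to 50 em, e.g. because of an unusual font matrix
coords = [0.5, 50.0]
print(fits_int16(coords, 100))    # 50.0 * 100  = 5000, within range
print(fits_int16(coords, 1000))   # 50.0 * 1000 = 50000, exceeds 32767
```

A larger em size gives more precision per glyph but leaves less headroom before oversized coordinates overflow.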

However, due to the complexity of Type 3 fonts (each glyph is a mini-PDF), and especially of the font matrix, I don't have a perfect solution for every possible case. Right now let me just focus on the `average` cases.
