The PD4ML component architecture consists of two parts: an HTML renderer and a PDF writer. The PDF writer is derived from java.awt.Graphics and adds some extra methods, for example for page breaking.
In practice, the PDF output device can be replaced with any other "device" that extends the java.awt.Graphics class.
As an example, here is code that renders to a BufferedImage:
private BufferedImage htmlToImage( URL path, int width,
        String ttfDir, boolean debug ) {
    // Parse the HTML and lay it out at the requested width
    PD4MLHtmlParser parser = new PD4MLHtmlParser(path, ttfDir, null, null, null,
            false, null, null, debug, null, null, -1, false);
    Document doc = parser.buildDocument();
    doc.layout(width);
    // Cap the image height at 48000 pixels
    int height = doc.getHeight();
    if ( height > 48000 ) {
        height = 48000;
    }
    BufferedImage image =
            new BufferedImage(width, height, BufferedImage.TYPE_4BYTE_ABGR);
    Graphics g = image.getGraphics();
    // Fill the background with white, then paint the document over it
    g.setColor(Color.white);
    g.fillRect(0, 0, width, height);
    doc.paint(0, 0, new Rectangle(0, 0, width, height), g);
    g.dispose();
    return image;
}
The code above renders the HTML as a single image. The following sample splits the HTML layout into smaller portions and produces a set of images (pages):
private Vector htmlToImages( URL path, int width,
        int pageHeight, String ttfDir, boolean debug ) {
    PD4MLHtmlParser parser = new PD4MLHtmlParser(path, ttfDir, null, null, null,
            false, null, null, debug, null, null, -1, false);
    Document doc = parser.buildDocument();
    doc.layout(width);
    int height = doc.getHeight();
    // Round up so that a partially filled last page is included, without
    // adding a blank page when height is an exact multiple of pageHeight
    int pageNum = Math.max(1, (height + pageHeight - 1) / pageHeight);
    Vector result = new Vector();
    for ( int i = 0; i < pageNum; i++ ) {
        BufferedImage image =
                new BufferedImage(width, pageHeight, BufferedImage.TYPE_4BYTE_ABGR);
        Graphics g = image.getGraphics();
        g.setColor(Color.white);
        g.fillRect(0, 0, width, pageHeight);
        // Shift the layout up by i pages so that page i lands in the image
        doc.paint(0, -(pageHeight * i),
                new Rectangle(0, -(pageHeight * i), width, pageHeight * (i+1)), g);
        g.dispose();
        result.addElement(image);
    }
    return result;
}
The resulting BufferedImage objects can easily be converted to GIF, JPEG or PNG. With JAI (Java Advanced Imaging) support they can be converted to virtually any raster image format, including TIFF.
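As a minimal sketch of the conversion step, the standard javax.imageio.ImageIO class can encode a BufferedImage as PNG or JPEG. The class name ImageExport and its methods are illustrative and not part of PD4ML. Because the images above are created with an alpha channel (TYPE_4BYTE_ABGR), and the JDK's built-in JPEG writer generally does not accept images with alpha, the sketch flattens the image onto an opaque white TYPE_INT_RGB copy before JPEG encoding.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Illustrative helper (not part of PD4ML): encodes a rendered image.
public class ImageExport {

    /** Encodes the image as PNG; PNG keeps the alpha channel. */
    public static byte[] toPng(BufferedImage image) throws IOException {
        return encode(image, "png");
    }

    /** Encodes the image as JPEG after flattening it onto opaque white. */
    public static byte[] toJpeg(BufferedImage image) throws IOException {
        int w = image.getWidth();
        int h = image.getHeight();
        BufferedImage rgb = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, w, h);
        g.drawImage(image, 0, 0, null);
        g.dispose();
        return encode(rgb, "jpg");
    }

    private static byte[] encode(BufferedImage image, String format)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(image, format, out)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return out.toByteArray();
    }
}
```

The byte arrays can be written to a file or to a servlet response stream; ImageIO.write also accepts a java.io.File directly if no intermediate buffer is needed.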
The examples above ignore page break tags defined in the HTML. Supporting page breaks requires a more sophisticated technique (which PD4ML also supports).
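The fixed-height slicing used in the second example can be tried out without PD4ML. The sketch below is an illustration only: a tall BufferedImage stands in for the laid-out document, and the class PageSlicer and its slice method are hypothetical names. It shows the rounded-up page count and the negative vertical offset that brings page i into view; it does not handle page break tags.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a tall image plays the role of the laid-out document.
public class PageSlicer {

    /** Splits a tall source image into pages of pageHeight pixels each. */
    public static List<BufferedImage> slice(BufferedImage source, int pageHeight) {
        if (pageHeight <= 0) {
            throw new IllegalArgumentException("pageHeight must be positive");
        }
        int width = source.getWidth();
        int height = source.getHeight();
        // Ceiling division: a partially filled last page still counts
        int pageCount = (height + pageHeight - 1) / pageHeight;
        List<BufferedImage> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            BufferedImage page = new BufferedImage(
                    width, pageHeight, BufferedImage.TYPE_4BYTE_ABGR);
            Graphics2D g = page.createGraphics();
            // White background, visible below the content on the last page
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, pageHeight);
            // Shift the source up by i pages; the image bounds clip the rest
            g.drawImage(source, 0, -(pageHeight * i), null);
            g.dispose();
            pages.add(page);
        }
        return pages;
    }
}
```

With this arithmetic a source that is 25 pixels tall and a page height of 10 gives three pages, the last one half empty, while a 20-pixel source gives exactly two.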