In the not too distant past, I had to implement solutions for generating PDF documents based on dynamic data and a document template defined by the end user. The approach we took was to let the end user create the document layout in MS Word, embedding simple tags to mark the position of dynamic data elements. One article shows how you can save a Word document as XML and specify a stylesheet to be applied when saving. The article introduces such a stylesheet: Word2FO. I decided to give it a spin.
|Published (Last):||5 September 2015|
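The tag-based template idea from the introduction can be illustrated with a small Java sketch. The `${name}` tag syntax, the class name `TemplateTags` and the method `fill` are assumptions made for illustration only; the text does not say what the tags in the Word templates actually looked like.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateTags {
    // Hypothetical tag syntax: ${name}. The real templates may have used something else.
    private static final Pattern TAG = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    // Replaces every known tag with its value; unknown tags are left untouched.
    static String fill(String template, Map<String, String> data) {
        Matcher m = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = data.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("customer", "Jane Doe");
        data.put("total", "42.00");
        System.out.println(fill("Dear ${customer}, your total is ${total}.", data));
    }
}
```

In a real pipeline the substitution would more likely happen on the XML or XSL-FO form of the document than on plain strings, but the principle is the same.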
To interpret XSL-FO documents, you must run them through a formatter, along with other data such as graphics and font metrics, to create a final displayable or printable file.
XSL-FO includes everything needed to paginate and format a document. Some of the formatting supported by XSL-FO, but not by CSS, includes right-to-left and top-to-bottom text, footnotes, margin notes, page numbers in cross-references, and more.
I then open Word and create a simple document with several layout features that are bound to pose a challenge in the conversion to XSL-FO. I then select Save As from the File menu and choose XML as the document type. This brings up an Apply Transform checkbox. When it is checked, the Transform button is enabled.
That was news to me. Well, I select the Word2FO. However, we lost a couple of details, most notably the column layout of the original Word document. A couple of fonts went missing as well: no Arial, or any sans-serif font for that matter, makes it into the PDF.
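Applying a transform on save amounts, in essence, to running an XSLT stylesheet over the document's XML. The following Java sketch shows that step with the JDK's built-in `javax.xml.transform` API. The input XML and the stylesheet are toy stand-ins invented for this example: real Word XML is far richer, and the stylesheet here is not Word2FO.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ApplyTransformSketch {
    // Made-up stand-in for the XML that Word saves.
    static final String SAMPLE_XML = "<doc><para>Hello</para><para>World</para></doc>";

    // Toy stylesheet (an assumption, not Word2FO): wraps each paragraph in an fo:block.
    static final String SAMPLE_XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/doc'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='297mm' page-width='210mm' margin='20mm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='para'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='para'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet over the XML and returns the resulting XSL-FO as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

The result is an XSL-FO document with a page master and one block per paragraph; it would still have to be handed to a formatter such as FOP to become a PDF.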
I will give it a try with the latest beta 0. I have tried with FOP 0. Still no column layout, still no sans-serif font types. The FOP log reports: Substituting with default font.

The Java code that drives FOP imports the following classes:

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Result;
import javax.xml.transform.stream.StreamSource;
import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FOPException;
import org.apache.fop.apps.FormattingResults;
import org.apache.fop.apps.MimeConstants;
Apache FOP Replacement
The last one is most likely too lossy to serve as an example for your requirements, but the first two are adequate. All these converter classes derive from the common base class AbstractWordConverter, which provides a basic framework for Word conversion classes. To convert doc to PDF while keeping all formatting such as tables, images, and alignment, you should therefore also derive a converter class from AbstractWordConverter and, when implementing its abstract methods, take inspiration from the three concrete implementation classes. Just as in the other converter classes, concentrating the PDF-library-specific code in a PdfDocumentFacade class seems like a good idea.
Apache(tm) FOP: Embedding
It also provides limited read-only support for the older Word 6 and Word 95 file formats. For some use cases, especially around text extraction, support is very strong. For others, support may be limited or incomplete, and it may be necessary to dig down into low-level code. Error checking may be missing in places, so it may be possible to accidentally generate invalid files. Enhancements to fix such things are generally very well received! You will need to ensure you include the appropriate jars and their dependencies! Please note that in version 3.
Converting Word documents to XSL-FO (and onwards to PDF)
This list is most likely badly incomplete. Clipping of text and graphics is not supported. Support for TrueType fonts may be added later. AFP has grown in functionality over time and not every environment supports the latest features.