Docx4j for Clojure

By Felix Johnson
Last updated: 12/24/2017

Background

docx4j is a Java library for creating and manipulating Microsoft Word .docx files, which use the Office Open XML (OOXML) file format. This page explores using docx4j from Clojure, a dialect of Lisp that runs on the Java virtual machine. At the time of writing, the current version of docx4j was 3.3.6.

Leiningen configuration

To add docx4j to your project, add [org.docx4j/docx4j "3.3.6"] to the :dependencies vector in project.clj.
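
For context, a minimal project.clj might look like the sketch below. The project name, project version, and Clojure version are placeholders chosen for illustration; only the docx4j coordinate comes from this page.

```clojure
;; project.clj (sketch): everything except the docx4j entry is a placeholder.
(defproject docx4j-example "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [org.docx4j/docx4j "3.3.6"]])
```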

A quick note about logging

docx4j uses the log4j 1.x library to manage logging. If you call the createPackage static method below without configuring logging, warnings such as the following may appear:

log4j:WARN No appenders could be found for logger (org.docx4j.utils.ResourceUtils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

For now, I will ignore log4j configuration, but there are several Clojure-based tutorials available on the Internet.
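
If the warnings are distracting, one possible stopgap is to configure log4j from the REPL before calling docx4j. This is only a sketch, and it assumes log4j 1.2 is on the classpath, as the warnings above suggest. BasicConfigurator attaches a console appender to the root logger, and the second call raises the threshold so that only warnings and errors are printed.

```clojure
(import org.apache.log4j.BasicConfigurator)
(import org.apache.log4j.Level)
(import org.apache.log4j.Logger)

;; Attach a default console appender to the root logger.
(BasicConfigurator/configure)
;; Hide DEBUG and INFO output; keep warnings and errors.
(.setLevel (Logger/getRootLogger) Level/WARN)
```

A log4j.properties file on the classpath is the more usual long-term approach.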

Creating a Hello, World document

For a simple document containing only "Hello, World", we need a WordprocessingMLPackage object. The createPackage static method handles much of the housekeeping for creating an empty WordprocessingMLPackage object.

(import org.docx4j.openpackaging.packages.WordprocessingMLPackage)
(def wordml-package (WordprocessingMLPackage/createPackage))

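This creates an object using default settings. To ensure the paper is US Letter-sized and in portrait mode, use the two-argument version of createPackage. The first argument is the paper size, and the second is a landscape flag, so false selects portrait.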

(import org.docx4j.openpackaging.packages.WordprocessingMLPackage)
(import org.docx4j.model.structure.PageSizePaper)
(def wordml-package (WordprocessingMLPackage/createPackage PageSizePaper/LETTER false))

Now that the package object is created, we can retrieve the main document part and add content to it.

(.addParagraphOfText (.getMainDocumentPart wordml-package) "Hello, world!")

The last step is to save the file.

(.save wordml-package (clojure.java.io/file "hello-world.docx"))

The result is here.

Adding a header and footer

Based on the HeaderFooterCreate.java sample file, we can create a header and footer.

(import org.docx4j.openpackaging.parts.WordprocessingML.HeaderPart)
(def wordml-header (HeaderPart.))

(import org.docx4j.openpackaging.parts.WordprocessingML.FooterPart)
(def wordml-footer (FooterPart.))

The next step is to add the header and footer to the main document, and save the return values so we can update the references later.

(def header-rel (.addTargetPart (.getMainDocumentPart wordml-package) wordml-header))
(def footer-rel (.addTargetPart (.getMainDocumentPart wordml-package) wordml-footer))

Now let's add some simple content. To create and add content, we need a context. There are different context classes available based on the document type, but for this example, I'll use org.docx4j.jaxb.Context. Content is based on the paragraph-run-text model, so the text must be added to a run and the run added to a paragraph. Additionally, content must be wrapped in the appropriate part tag: createHdr() creates the tag for headers, and createFtr() creates the tag for footers.

(import org.docx4j.jaxb.Context)
(def wordml-factory (Context/getWmlObjectFactory))
;; Create the text.
(def header-text (.createText wordml-factory))
;; Set the text value.
(.setValue header-text "This is the header.")
;; Create the run.
(def header-run (.createR wordml-factory))
;; Add the text to the run.
(.add (.getContent header-run) header-text)
;; Create the paragraph.
(def header-para (.createP wordml-factory))
;; Add the run to the paragraph.
(.add (.getContent header-para) header-run)
;; Create the header tag.
(def header-tag (.createHdr wordml-factory))
;; Add the paragraph to the header.
(.add (.getContent header-tag) header-para)
;; Add the header tag to the header part.
(.setJaxbElement wordml-header header-tag)
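
The text, run, and paragraph steps above repeat for every piece of content, so they can be folded into a small function. The sketch below is one way to do that; make-paragraph is a hypothetical name of my own, not part of docx4j.

```clojure
(import org.docx4j.jaxb.Context)

(defn make-paragraph
  "Hypothetical helper: wraps `text` in a Text element, the Text in a
  run (R), and the run in a paragraph (P). Returns the paragraph."
  [text]
  (let [factory (Context/getWmlObjectFactory)
        text-el (doto (.createText factory) (.setValue text))
        run     (.createR factory)
        para    (.createP factory)]
    ;; Add the text to the run, then the run to the paragraph.
    (.add (.getContent run) text-el)
    (.add (.getContent para) run)
    para))
```

With this in place, the header paragraph could be built with (make-paragraph "This is the header.") and added to the header tag as before.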

;; Update the references.
(def sections (.getSections (.getDocumentModel wordml-package)))
;; There is always a last section, but it may not have section properties yet.
(def last-section (.get sections (dec (.size sections))))
(def last-sect-pr
  (or (.getSectPr last-section)
      ;; None found: create the properties and attach them to the document and the section.
      (let [sect-pr (.createSectPr wordml-factory)]
        (.addObject (.getMainDocumentPart wordml-package) sect-pr)
        (.setSectPr last-section sect-pr)
        sect-pr)))
;; Create the header reference.
(def header-reference (.createHeaderReference wordml-factory))
(.setId header-reference (.getId header-rel))
;; Set the type of reference
(import org.docx4j.wml.HdrFtrRef)
(.setType header-reference HdrFtrRef/DEFAULT)
;; Add the reference
(.add (.getEGHdrFtrReferences last-sect-pr) header-reference)

The resulting file, with a header, is here.
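
The footer follows the same pattern, with createFtr and createFooterReference in place of their header counterparts. Here is a sketch that bundles the steps into one function. The name attach-footer! and its argument order are my own invention; the docx4j calls mirror the header code above.

```clojure
(import org.docx4j.wml.HdrFtrRef)

(defn attach-footer!
  "Hypothetical helper mirroring the header steps: fills `footer-part`
  with one paragraph of `text`, then registers a default footer
  reference for relationship `rel` on `sect-pr`. Returns the reference."
  [factory footer-part rel sect-pr text]
  (let [text-el   (doto (.createText factory) (.setValue text))
        run       (.createR factory)
        para      (.createP factory)
        ftr       (.createFtr factory)
        reference (.createFooterReference factory)]
    ;; Content: text -> run -> paragraph -> Ftr tag -> footer part.
    (.add (.getContent run) text-el)
    (.add (.getContent para) run)
    (.add (.getContent ftr) para)
    (.setJaxbElement footer-part ftr)
    ;; Reference: tie the relationship id to the section properties.
    (.setId reference (.getId rel))
    (.setType reference HdrFtrRef/DEFAULT)
    (.add (.getEGHdrFtrReferences sect-pr) reference)
    reference))
```

Using the objects defined earlier on this page, the call and a second save (to a file name picked for illustration) would be:

(attach-footer! wordml-factory wordml-footer footer-rel last-sect-pr "This is the footer.")
(.save wordml-package (clojure.java.io/file "header-footer.docx"))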

The resulting file, with a header and footer, is here.

A touch of style

So far, the document does not apply any styles of its own. Styling is a topic for later investigation, but the first step is to retrieve (or create) the styles part.

(def wordml-styles-part (.getStyleDefinitionsPart (.getMainDocumentPart wordml-package) true))
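
As a starting point for that investigation, the sketch below lists the IDs of the styles the part contains. style-ids is a hypothetical helper name. It assumes that the part's JAXB element is an org.docx4j.wml.Styles object and that its getStyle method returns the list of styles.

```clojure
(import org.docx4j.openpackaging.packages.WordprocessingMLPackage)

(defn style-ids
  "Hypothetical helper: returns a vector of the style IDs defined in
  `pkg`, creating the styles part first if it does not exist."
  [pkg]
  (let [styles-part (.getStyleDefinitionsPart (.getMainDocumentPart pkg) true)
        styles      (.getJaxbElement styles-part)]
    (mapv #(.getStyleId %) (.getStyle styles))))

;; Example: inspect the styles of a freshly created package.
(style-ids (WordprocessingMLPackage/createPackage))
```

If an ID such as "Title" appears in the result, it can be passed as the first argument to the main document part's addStyledParagraphOfText method to add a paragraph with that style.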