Friday, 13 September 2013

How to add comments to a .docx XML

At work, we have a Word document that we constantly have to edit and pass
on to another team to tell them how to perform some tasks. Since I don't
like mindlessly filling out data, and I always look for ways to simplify
the tasks I have to do, I decided to automate this process. After
considering a few methods (such as generating a Word document from scratch
or editing an existing document), I decided to edit the existing document
in place. I have inserted special tags into the document (specifically,
they take the form [SOME_NAME_HERE]). To fill them in, I extract the .docx
to a folder containing all of its XML documents, parse the document.xml
file, and replace each tag with the value I actually need.
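The tag replacement step described above might look roughly like the following Python sketch. It is only an illustration under some assumptions: the function names fill_placeholders and fill_docx are hypothetical, the tags are assumed to be upper-case names in square brackets, and the .docx is rewritten as a zip archive directly instead of being extracted to a folder. Note also that Word can split a tag across several runs in document.xml, in which case a plain text search like this one will not find it.

```python
import re
import zipfile
from xml.sax.saxutils import escape

# Assumed tag shape: [SOME_NAME_HERE] (upper-case letters, digits, underscores).
PLACEHOLDER = re.compile(r"\[([A-Z0-9_]+)\]")


def fill_placeholders(xml_text, values):
    """Replace each [NAME] with the XML-escaped values[NAME].

    Tags with no entry in `values` are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        if name in values:
            return escape(values[name])
        return match.group(0)

    return PLACEHOLDER.sub(substitute, xml_text)


def fill_docx(src_path, dst_path, values):
    """Copy a .docx archive, rewriting only word/document.xml."""
    with zipfile.ZipFile(src_path) as src:
        with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                data = src.read(item.filename)
                if item.filename == "word/document.xml":
                    text = fill_placeholders(data.decode("utf-8"), values)
                    data = text.encode("utf-8")
                dst.writestr(item, data)
```

A call such as fill_docx("template.docx", "output.docx", {"SOME_NAME_HERE": "value"}) would then produce the filled-in copy; both file names here are placeholders.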
During this process, depending on what is actually needed, some sections
of the document have to be removed. So my first thought was to add
comments to the document.xml file. For example:
<!-- INITIAL BUILD ONLY -->
<w:p w:rsidR="00202319" w:rsidRPr="00D00FF5"
w:rsidRDefault="00202319" w:rsidP="00AC0192">
<w:r w:rsidR="00E548A2" w:rsidRPr="00D00FF5">
<w:rPr>
<w:rStyle w:val="emcfontstrong"/>
</w:rPr>
<w:t>Some text here</w:t>
</w:r>
</w:p>
<!-- END INITIAL BUILD ONLY -->
Then, when I generate the output Word document, I would simply remove all
of the sections marked "INITIAL BUILD ONLY" (unless, of course, it is the
initial build).
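As a rough sketch of that removal step, assuming the comment markers are still present in document.xml at the time the script runs, something like the following could work. The function name apply_section and its keep parameter are hypothetical, and the approach treats the XML as plain text instead of parsing it.

```python
import re


def apply_section(xml_text, name, keep):
    """Handle a block delimited by <!-- NAME --> and <!-- END NAME -->.

    If keep is True, only the two marker comments are removed.
    If keep is False, the markers and everything between them are removed.
    """
    marker = re.escape(name)
    if keep:
        # Drop just the opening and closing marker comments.
        pattern = re.compile(r"<!--\s*(?:END )?" + marker + r"\s*-->\s*")
    else:
        # Drop the markers and the content between them (non-greedy).
        pattern = re.compile(
            r"<!--\s*" + marker + r"\s*-->.*?<!--\s*END " + marker + r"\s*-->\s*",
            re.DOTALL,
        )
    return pattern.sub("", xml_text)
```

For example, apply_section(xml_text, "INITIAL BUILD ONLY", keep=False) would strip the paragraph shown above from any build other than the initial one.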
However, the issue I am running into is that when you convert the
extracted files back to a Word document, open it in Word and save it, Word
will "clean up" the document and remove all of the comments I've added.
So, my question is: is there any way to preserve the comments in the
document, or are there any special tags I could add to the XML that would
not be visible during standard viewing or editing of the document, but
would not be removed by Word upon a save?
