Mass conversion of Word documents to Office Open XML

Author cambell | 24.07.2007 | Category WeSay

I needed to convert hundreds of documents from Word .doc files to standalone Office Open XML files. There is an Office File Converter utility for bulk conversions but, as best I can tell, it only produces the zipped .docx variety (see Doug Mahugh’s description). I needed standalone XML files that I could process with an XSLT stylesheet.

After much digging around, I found the Primary Interop Assemblies (PIA) redistributable for Office and figured I could make a simple utility to do the conversion with it. After I installed the PIA, it did not show up in the list of .NET assemblies, which puzzled me until I finally looked in the list of COM components. Sure enough, there it was.

I ended up writing a little utility, WordConvert, that opens a Word document and saves it as whatever type you specify. For multiple files, I just used the FOR command at the command prompt (inside a batch file, double the percent signs):

FOR %f IN (*.doc) DO WordConvert "%f" "%~nf.xml" FlatXML
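For anyone who wants to try the same open-and-save-as idea without the interop assembly, here is a minimal sketch in Python. It is not the WordConvert source. It assumes Word is installed and that the application object comes from the pywin32 package; the names `SAVE_FORMATS`, `output_name` and `convert_document` are made up for illustration, and the numbers are the WdSaveFormat values as I understand them, so check them against your version of Word.

```python
import os

# WdSaveFormat values from the Word object model (assumed here; verify
# against the documentation for your Word version).
SAVE_FORMATS = {
    "FlatXML": 19,      # wdFormatFlatXML: single-file Office Open XML
    "Docx": 12,         # wdFormatXMLDocument: the zipped package
    "Word2003XML": 11,  # wdFormatXML: the older Word 2003 XML
}


def output_name(source_path, extension=".xml"):
    """Swap the file extension, much as %~nf.xml does for files in the current folder."""
    base, _ = os.path.splitext(source_path)
    return base + extension


def convert_document(word_app, source_path, target_path, format_name):
    """Open source_path in Word and save a copy as target_path in the named format."""
    file_format = SAVE_FORMATS[format_name]
    # Word resolves relative paths against its own working folder,
    # so hand it absolute paths.
    doc = word_app.Documents.Open(os.path.abspath(source_path))
    try:
        doc.SaveAs(os.path.abspath(target_path), FileFormat=file_format)
    finally:
        doc.Close(0)  # 0 is wdDoNotSaveChanges: leave the original untouched
```

With pywin32, `word_app` would typically be `win32com.client.Dispatch("Word.Application")`, and a call would look like `convert_document(word_app, "report.doc", output_name("report.doc"), "FlatXML")`.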

Releasing the COM objects proved to be harder than I expected. When I first ran the utility on fifty-odd files, I was left with one instance of WinWord for each time it ran! Eventually I had to force garbage collection twice at the end.
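The underlying problem is that Word keeps running until it is told to quit and every reference to it has been released. The sketch below shows one way to guarantee that in Python; it is not the .NET code I ended up with. It assumes a single Word instance is shared across all the files, and `word_session` and its `factory` parameter are hypothetical names.

```python
import gc
from contextlib import contextmanager


@contextmanager
def word_session(factory):
    """Yield one Word application object and always shut it down afterwards.

    factory is any zero-argument callable that returns the application
    object, for example (with pywin32, an assumption):
        lambda: win32com.client.Dispatch("Word.Application")
    """
    app = factory()
    try:
        yield app
    finally:
        try:
            app.Quit()    # ask WinWord to exit even if a conversion failed
        finally:
            app = None    # drop this function's reference to the COM object
            gc.collect()  # sweep up reference cycles that may still hold wrappers
```

Used as `with word_session(factory) as word:` around the whole batch, this starts Word once and quits it once, so a failed conversion in the middle should not leave a stray WinWord process behind.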
