4 May 2006

Office Files in XML

As a developer, one of the issues I constantly have to tackle is exporting information into Microsoft Excel or Microsoft Word format. Sad to say, most people I deal with can only work using Excel, Word or PowerPoint. In fact, my company actually used PowerPoint to create contracts! All my customers want reports in Excel format. I have seen so many attempts to force data into Excel files (pseudo or not) in the past year.

The easy way out is simply to export to an HTML file but name the file with an extension of .xls for Excel or .doc for Word. When you use the appropriate Office application to open the file, the file is automatically translated. Of course this is less than optimal, especially when the requirements include more advanced features of Office, like graphs, formulas or WordArt.
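
To make the trick concrete, here is a minimal sketch in Python. The function names `rows_to_html_table` and `export_as_fake_xls` and the file name `report.xls` are my own inventions for illustration, not part of any library. All it does is write an ordinary HTML table and give the file an .xls extension, on the assumption that Excel will open and translate it as described above.

```python
from html import escape


def rows_to_html_table(headers, rows):
    """Render headers and rows as a bare HTML document holding one table."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        '<html><head><meta charset="utf-8"></head><body>'
        f'<table border="1"><tr>{head}</tr>{body}</table>'
        "</body></html>"
    )


def export_as_fake_xls(path, headers, rows):
    """Write the HTML to `path`; the caller picks the .xls (or .doc) extension."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(rows_to_html_table(headers, rows))


if __name__ == "__main__":
    # Hypothetical sample data, just to show the call.
    export_as_fake_xls(
        "report.xls",
        ["Customer", "Amount"],
        [["Acme & Co", 1200], ["Globex", 850]],
    )
```

The file is still HTML inside, so anything HTML cannot express (charts, formulas, WordArt) is out of reach with this approach.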

The correct way to do it is to use the Office COM objects to create the document. If you have done any COM programming, you will know that it is a horrible way to code your application. In fact, even Microsoft has been trying to move away from that programming model for the past few years through its .NET Framework.

Having said all that, we can now talk about the topic of this posting: Office files in XML format. If you have been following tech news, one of the biggest complaints about Microsoft Office is the proprietary file format it uses. Many open source projects have been trying to build decent Office applications to challenge Microsoft Office, but the biggest hurdle is exporting to the Microsoft Office file formats to enable interoperability with users of Microsoft Office. As Microsoft guards its file formats zealously, most of these applications can only derive the file format through guesswork.

Not long ago, OpenOffice started supporting an open standard format for office application files known as OpenDocument. With such an open standard format, documents that you create are no longer held hostage by Microsoft and its applications. You no longer have to buy Microsoft Office in order to read the documents that you create. In fact, at least one state in the US recognized the wisdom of supporting an open standard file format, and actually mandated that all documents for the state government must be in OpenDocument format.

Of course this does not sit well with Microsoft. If this were to continue in other states or countries, Microsoft would lose an important weapon for guarding its monopoly in Office software applications. Instead of supporting the OpenDocument standard, Microsoft came up with its own “open standard” format, MS XML. The linked article provides an analytical comparison between OpenDocument and MS XML.

What does all this mean for a humble developer like me? Lots. For one, I can create my own Office files by simply adhering to the open standard MS XML file format, without the need for COM objects or third party tools. Or so I thought…

After reading the article, it became clear that Microsoft is still trying to protect its IP through obscurity and confusion. From what I can see in the article, it will probably be very difficult to create a decent writer that outputs information in the MS XML file format. I will probably end up having to use some tool from Microsoft, which probably won’t be free (M$) or powerful enough to do what I need to do. (Otherwise everyone would be able to create their own Microsoft Office ;P )

We will have to wait and see if the market can force Microsoft to eventually support the OpenDocument standard, though I have serious doubts about it :-(

1 comment:

Zuraffo said...

I think you should go into IT consultancy. Any kind of consultancy can earn a lot of money. Most people will be too stupid to make effective use of your consultancy anyway, so you can rack up the hours. Good money!