Nuts and Bolts of WordML

It seems that Microsoft is making some progress. The community has partially ignored it, for various reasons. Hopefully it is one step on the long walk to a better world.

Fortunately, it is reasonably usable, especially if you can ignore some of the weird functionality and simply benefit from the XML. Here are a few clues that may help you. The first one: just go ahead and try. One important thing I read in the sample chapter of O'Reilly's nice book "Office 2003 XML" is: do not try to understand everything! Just play a bit and do not be scared by the many unknown tags.

One thing to try is to reformat the XML into a nicely indented form, for example with Notepad++ (TextFX -> TextFX HTML Tidy -> Tidy: Reindent XML). It produces nicely readable code, BUT it breaks things in the document, so you must work on a copy. At this point I was wondering whether some web application might help me edit the document somehow (DOM, OracleXML or others).
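If you do not have Notepad++ at hand, the same kind of read-only reindenting can be sketched in a few lines of Python with the standard xml.dom.minidom module. This is only a rough sketch: the function name reindent_xml is made up for this example, and as with the Tidy approach the added whitespace may change the document, so treat the output as a copy for reading, not as something to open in Word again.

```python
from xml.dom import minidom


def reindent_xml(xml_text, indent="  "):
    """Return an indented copy of xml_text, intended for reading only.

    reindent_xml is a hypothetical helper name; the original text is
    not modified, so keep it as the file you actually open in Word.
    """
    dom = minidom.parseString(xml_text)
    # toprettyxml inserts line breaks and indentation between elements.
    return dom.toprettyxml(indent=indent)


if __name__ == "__main__":
    sample = "<doc><body><p><t>Hello</t></p></body></doc>"
    print(reindent_xml(sample))
```

Write the result to a new file next to the original rather than over it, so the untouched document is always available.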

Another point is that Microsoft WordXML, at least in Word 2003, is able to do strange things such as marking the XML as UTF-8 but actually saving it as ISO Latin1, maybe CP497. This is where "recode lat1..utf-8" comes in handy.
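Where recode is not available, a similar repair can be sketched in Python. The sketch assumes that any file which is not valid UTF-8 is really ISO Latin1; that is a guess about what Word wrote, not something the file itself tells you, and the name to_utf8 is made up for this example.

```python
def to_utf8(raw, fallback="latin-1"):
    """Return raw bytes re-encoded as UTF-8.

    Assumption: if the bytes are not valid UTF-8, they are in the
    `fallback` encoding (ISO Latin1 by default). to_utf8 is a
    hypothetical helper name.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    broken = "caf\u00e9".encode("latin-1")  # Latin1 bytes labelled as UTF-8
    print(to_utf8(broken).decode("utf-8"))
```

Read and write the file in binary mode ("rb" / "wb") when using this, so that Python does not apply an encoding of its own on top.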

A lot of this unexpected behaviour has probably been cleaned up in Office 2007, but that version is not yet widely adopted, so you should keep the 2003 version around for some time.

Another strange thing that I discovered, and that is documented somewhere on msdn.microsoft.com, concerns mixing landscape and portrait pages: you have to put … for all sections into … except the last, otherwise you get one surplus page.
