I wanted to get an example document posted so people have a chance to look through the new Office 12 XML formats and see how they compare with the Word 2003 XML format. I took a basic document and saved it in the new format, as well as in Word 2003's XML format. This is still very early code, so a number of the structures could still change, but I'm pretty confident this is close to what the final version will look like. Also, most of the file size is taken up by an embedded picture, so you won't see a significant file size saving with the new format compared to the current binary formats.
You will see right away that it's just pure XML representing the file. I read a post on a blog today where the author mistakenly thought these new formats weren't XML, but instead just XML-based. I guess that would be true if it refers to the fact that we use ZIP as a container, but apart from the ZIP container, everything else is pure XML following the W3C XML 1.0 standard. I still remember when we decided to go with ZIP as the container... it was a pretty straightforward decision. There were already a number of other formats out there using XML and ZIP, so we figured that would be the best way to go if we wanted people to have an easier time working with our files. Using a single flat XML file was never given serious consideration because of the file size bloat. This was especially true for PowerPoint, where presentations often contain tons of pictures, and having to encode those as text to store them in a single XML file just didn't make a lot of sense.
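To make the size argument a bit more concrete, here is a rough Python sketch comparing the two approaches: XML parts plus a raw binary picture inside a ZIP, versus one flat XML file with the picture base64-encoded inline. The part names (`document.xml`, `media/image1.bin`) and the `<package>`/`<binaryData>` wrapper elements are made up for illustration and are not the actual Office 12 names, and the random bytes simply stand in for an already-compressed picture.

```python
import base64
import io
import random
import xml.etree.ElementTree as ET
import zipfile


def build_package(document_xml: bytes, image: bytes) -> bytes:
    """Store the XML and the raw picture bytes as separate parts in a ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("document.xml", document_xml)   # hypothetical part name
        zf.writestr("media/image1.bin", image)      # hypothetical part name
    return buf.getvalue()


def build_flat_xml(document_xml: bytes, image: bytes) -> bytes:
    """Put everything in one XML file; the picture must be base64 text."""
    encoded = base64.b64encode(image)  # 4 output bytes per 3 input bytes
    return (b"<package>" + document_xml +
            b"<binaryData>" + encoded + b"</binaryData></package>")


# Stand-in data: a tiny document and 30,000 pseudo-random bytes that,
# like a JPEG, will not compress any further.
rng = random.Random(0)
document_xml = b"<document><p>Hello</p></document>"
image = bytes(rng.getrandbits(8) for _ in range(30000))

package = build_package(document_xml, image)
flat = build_flat_xml(document_xml, image)

# The XML part comes back out of the ZIP as ordinary, parseable XML.
with zipfile.ZipFile(io.BytesIO(package)) as zf:
    part_names = zf.namelist()
    root = ET.fromstring(zf.read("document.xml"))

print("ZIP package size:", len(package))
print("Flat XML size:   ", len(flat))
```

In this sketch the flat file pays the roughly one-third base64 overhead on every picture, while the ZIP package stores the picture bytes as they are, and any tool that can open a ZIP can still pull out the XML part and parse it.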