Office Open XML/Legacy Implementation
"Office Open XML" is an XML-based file format that has been published as ECMA-376. It is the default file format of Microsoft Office 2007.
There are plans to support this file format in OpenOffice.org for interoperability with Microsoft Office 2007.
There are three major document formats, along with two supporting markup languages:
- WordprocessingML - For word processor documents (file extensions may be docx, docm)
- SpreadsheetML - For spreadsheet documents (file extensions may be xlsx, xlsm)
- PresentationML - For presentation documents (file extensions may be pptx, pptm)
- DrawingML - Used by the other markup languages to represent graphics data.
- VML - A legacy vector markup.
An Office Open XML document is a package that consists of a flat collection of "parts". Each part has a case-insensitive part name made up of a slash-delimited (/) sequence of segment names, such as "/pres/slides/slide1.xml".
In most cases, ZIP is used to package the parts; the term "package" then refers to the ZIP archive, and the parts are the individual files stored in it. The part name in this case is the file path within the archive.
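The mapping between archive entries and part names can be sketched in a few lines of Python. This is only an illustration, not code from the oox module: the helper names list_parts and build_demo_package are hypothetical, and the demo archive holds placeholder content rather than a valid document.

```python
import io
import zipfile


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of the entries in a ZIP-based package, sorted."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        # ZIP stores paths without a leading slash; part names start with one.
        return sorted(
            "/" + info.filename
            for info in archive.infolist()
            if not info.is_dir()
        )


def build_demo_package() -> bytes:
    """Build a tiny in-memory archive (placeholder content, not a real document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("_rels/.rels", "<Relationships/>")
        archive.writestr("pres/slides/slide1.xml", "<sld/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for name in list_parts(build_demo_package()):
        print(name)
```

To inspect a real file, the same function can be fed the bytes of any .docx, .xlsx or .pptx file.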
Each part also has a content type, and /[Content_Types].xml provides the content type of each part within the archive.
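As a rough sketch of how a content type might be looked up, the Python snippet below reads a small sample of /[Content_Types].xml. It assumes the usual layout of that file, where Override elements name individual parts and Default elements apply to file extensions. The function name content_type_of and the SAMPLE data are made up for illustration and are not taken from the oox module.

```python
import xml.etree.ElementTree as ET

# Namespace used by the elements of /[Content_Types].xml.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Illustrative sample only; a real package lists many more entries.
SAMPLE = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/pres/slides/slide1.xml" ContentType="application/vnd.openxmlformats-officedocument.presentationml.slide+xml"/>
</Types>"""


def content_type_of(part_name, content_types_xml):
    """Return the content type of a part, or None if nothing matches."""
    root = ET.fromstring(content_types_xml)
    wanted = part_name.lower()  # part names are case-insensitive

    # An Override for this exact part takes precedence.
    for override in root.findall(CT_NS + "Override"):
        if override.get("PartName", "").lower() == wanted:
            return override.get("ContentType")

    # Otherwise fall back to the Default entry for the file extension.
    last_segment = wanted.rsplit("/", 1)[-1]
    if "." not in last_segment:
        return None
    extension = last_segment.rsplit(".", 1)[1]
    for default in root.findall(CT_NS + "Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


if __name__ == "__main__":
    print(content_type_of("/pres/slides/slide1.xml", SAMPLE))
    print(content_type_of("/_rels/.rels", SAMPLE))
```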
Packages and parts can contain explicit relationships to other parts as well as to external resources. Every explicit relationship has an ID and a type, and relationship types are named using URIs.
The set of explicit relationships for each package or part is stored in a relationship part whose name (or path) follows a specific convention: for example, the relationship part for a part called "/a/b/c.xml" is called "/a/b/_rels/c.xml.rels". As a special case, the relationship part for the package as a whole is called "/_rels/.rels".
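The naming convention and the ID/type pairs can be illustrated with a short Python sketch. The function names relationship_part_name and read_relationships are hypothetical, and SAMPLE_RELS is made-up data. The sketch also reads each relationship's Target attribute, which the text above does not mention.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace used by the elements of a relationship part.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

# Illustrative package-level relationship part ("/_rels/.rels").
SAMPLE_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="pres/presentation.xml"/>
</Relationships>"""


def relationship_part_name(source="/"):
    """Return the relationship part name for a part, or for the package ("/")."""
    if source == "/":
        return "/_rels/.rels"
    directory, base = posixpath.split(source)
    # "/a/b/c.xml" -> "/a/b" + "_rels" + "c.xml.rels"
    return posixpath.join(directory, "_rels", base + ".rels")


def read_relationships(rels_xml):
    """Map each relationship ID to its (type URI, target) pair."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): (rel.get("Type"), rel.get("Target"))
        for rel in root.findall(REL_NS + "Relationship")
    }


if __name__ == "__main__":
    print(relationship_part_name("/a/b/c.xml"))
    print(relationship_part_name())
    print(read_relationships(SAMPLE_RELS))
```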
There is some code in the oox module (OOX) from the Xml project. Its initial version from the CWS xmlfilter02 has been integrated into SRC680_m243. The continuing CWS is xmlfilter03 in SRC680. (view the workspace on EIS)
To fetch the oox code from CVS (assuming CVSROOT is set properly):
cvs co -r cws_src680_xmlfilter03 -d oox xml/oox
One important note: the source file names use the term fragment for what the standard calls a part. For instance, the source file containing the class that handles the workbook part in SpreadsheetML is called workbookfragment.cxx. This convention is used across all application types within the oox module.
Bonsai is also a convenient way to follow the changes.
Word document import is in the writerfilter module.