Extracting Compound Documents

Compound documents are composed of a container document and one or more embedded documents. For example, a Microsoft Word document may contain an embedded Microsoft Excel spreadsheet. The embedded spreadsheet is considered to be embedded one level down from the container document. CloudNine™ Explore supports up to 99 levels of embedded compound documents. For example, if the embedded spreadsheet itself contains an embedded PowerPoint file (second level down) that in turn contains an embedded PDF (third level down), CloudNine™ Explore can extract all four files.
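The nesting levels described above can be pictured as a tree walk with a depth limit. The following is a minimal sketch for illustration only, not CloudNine™ Explore's implementation: the `Document` class, the `walk` function, and the file names are hypothetical, and the only value taken from this page is the 99-level limit.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

MAX_DEPTH = 99  # nesting limit stated on this page


@dataclass
class Document:
    """Hypothetical model of a document and the files embedded in it."""
    name: str
    embedded: List["Document"] = field(default_factory=list)


def walk(doc: Document, max_depth: int = MAX_DEPTH,
         level: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (level, name) for the container and each embedded document."""
    yield level, doc.name
    if level >= max_depth:
        return  # do not descend below the deepest supported level
    for child in doc.embedded:
        yield from walk(child, max_depth, level + 1)


# The example from the paragraph above: Word > Excel > PowerPoint > PDF.
pdf = Document("report.pdf")
ppt = Document("slides.pptx", [pdf])
xls = Document("budget.xlsx", [ppt])
word = Document("memo.docx", [xls])

for level, name in walk(word):
    print(level, name)
```

Level 0 is the container; levels 1 through 3 are the embedded spreadsheet, presentation, and PDF, for four files in total.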


Enabling Extraction of Compound Documents

The Expand compound documents check box on the Analysis tab in the New Case Settings and Edit Case Settings dialog boxes determines whether embedded documents are expanded and indexed separately from their parent document when imported into the CloudNine™ Explore case.

When the Expand compound documents check box is selected, compound documents are expanded, the embedded files are imported as attachments of their parent documents, and the embedded files are searchable in CloudNine™ Explore.

For more information about the Expand compound documents check box and other case settings, see Cases in CloudNine™ Explore.


Supported Embedded File Types

The following file types are supported for extraction from compound documents in CloudNine™ Explore:

Microsoft Word/RTF

Microsoft Excel

Microsoft PowerPoint

Adobe Acrobat PDF

Package*

Microsoft Visio

Microsoft Outlook.FileAttach (Word-authored e-mail with inline attachments, generally stored in RTF)

Microsoft Project


*A Package is a general-purpose type of embedded file. For example, it can be a plain text file or a zip file. Any of the types above may also be embedded as a Package, depending on the software installed when the user embeds the file. For example, if a user embeds an Excel spreadsheet in a Word document on a computer where Excel is not installed, the spreadsheet is embedded as a Package.


Supported Container File Types and Embedded Files

The following table lists common container file types and the embedded file types that CloudNine™ Explore supports for extraction.


Office 95 and earlier versions of Office are not supported by CloudNine™ Explore.

Non-Microsoft Office Formats

Adobe Acrobat (PDF)

Rich text format (RTF)

Office 2007 and above

Excel Spreadsheet (OpenXml)

MS Office Data File (OpenXml)

PowerPoint Presentation (OpenXml)

Word (OpenXml)

Office 2003

Word (xml)

Excel (xml) *Compound documents not supported in this format

Project (xml) *Compound documents not supported in this format

Visio (xml) *Currently not recognized by file engine

Office 2002/XP, Office 2000

Office 97

**Detection of embedded files in these file types is limited to the types of files supported for extraction (see above list).
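
For the Office 2007 and above (OpenXml) container formats, it can help to see where embedded files usually live. OpenXml files are ZIP packages, and Office applications conventionally store embedded objects in an embeddings folder (for example, word/embeddings). The sketch below lists those parts with Python's standard zipfile module. It is an illustration under that assumption, not a description of how CloudNine™ Explore detects embedded files, and the function name `list_embedded_parts` is hypothetical.

```python
import zipfile
from typing import BinaryIO, List, Union


def list_embedded_parts(source: Union[str, BinaryIO]) -> List[str]:
    """Return the names of parts stored under an 'embeddings' folder.

    `source` is a path or a binary file object for an OpenXml file
    (.docx, .xlsx, .pptx). Assumes the conventional folder layout.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(
            name
            for name in package.namelist()
            if "/embeddings/" in name and not name.endswith("/")
        )


# Example usage (the file name is a placeholder):
# for part in list_embedded_parts("memo.docx"):
#     print(part)
```

An empty result only means that nothing was found in the conventional folder; it does not show that a file has no embedded content.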