Difference between revisions of "Documentation/DevGuide/OfficeDev/Filtering Process"

From Apache OpenOffice Wiki
(Loading content)
m (Storing content)
 
(25 intermediate revisions by 5 users not shown)
Line 7: Line 7:
 
|NextPage=Documentation/DevGuide/OfficeDev/Filter
 
|NextPage=Documentation/DevGuide/OfficeDev/Filter
 
}}
 
}}
{{DISPLAYTITLE:Filtering Process}}
+
{{Documentation/DevGuideLanguages|Documentation/DevGuide/OfficeDev/{{SUBPAGENAME}}}}
 +
{{DISPLAYTITLE:Filtering Process}}
 
__NOTOC__
 
__NOTOC__
In {{PRODUCTNAME}} the whole process of loading or saving content is a modular system based on UNO services. Some of them are abstract (like e.g. the <idl>com.sun.star.document.ExtendedTypeDetection</idl> and the filter services) and so allow to bind extendable sets of instances implementing them, others (like e.g. the <idl>com.sun.star.document.TypeDetection</idl> service) are those that define the work flow. As they are exchangeable like any UNO service the whole process of finding and using filters can be changed without any need to change other involved components.
+
In {{AOo}} the whole process of loading or saving content is a modular system based on UNO services. Some of them are abstract (like e.g. the <idl>com.sun.star.document.ExtendedTypeDetection</idl> and the filter services) and so allow to bind extendable sets of instances implementing them, others (like e.g. the <idl>com.sun.star.document.TypeDetection</idl> service) are those that define the work flow. As they are exchangeable like any UNO service the whole process of finding and using filters can be changed without any need to change other involved components.
  
 
===Loading content===
 
===Loading content===
  
The most general way to load content into {{PRODUCTNAME}} is calling the [http://api.openoffice.org/docs/common/ref/com/sun/star/frame/XComponentLoader.html#loadComponentFromURL com.sun.star.frame.XComponentLoader:loadComponentFromURL]() method of a suitable object. Such object may be the <idl>com.sun.star.frame.Desktop</idl> object or any instance of the <idl>com.sun.star.frame.Frame</idl> service. Content will end up in a frame object always, if called at the desktop the method will find or create this frame using some of the passed arguments as described in the API documentation linked above. Then it will forward the call to this frame. Here's a diagram showing the workflow that will be explained in the following parargraphs.
+
The most general way to load content into {{AOo}} is calling the <idlm>com.sun.star.frame.XComponentLoader:loadComponentFromURL</idlm>() method of a suitable object. Such object may be the <idl>com.sun.star.frame.Desktop</idl> object or any instance of the <idl>com.sun.star.frame.Frame</idl> service. Content loaded this way will end up in a frame object always, if called at the desktop the method will find or create this frame using some of the passed arguments as described in the API documentation linked above. Then it will forward the call to this frame. Here's a diagram showing the workflow that will be explained in the following paragraphs.
  
 
[[Image:sequence_diagram_load_url.png|thumb|center|500px|General Filtering Process]]
 
[[Image:sequence_diagram_load_url.png|thumb|center|500px|General Filtering Process]]
  
The content will be passed to the loadComponentFromURL() call as a <idl>com.sun.star.document.MediaDescriptor</idl> service that here is implemented as a Sequence of <idl>com.sun.star.beans.PropertyValue</idl>. In most cases it will contain several properties that allow to create an object implementing <idl>com.sun.star.io.XStream</idl> that can be used to read the content. It also may contain some properties that the code of other objects (filter, model, controller, frame etc.) can use to steer the loading process. If no properties shall be handed over and the file content is specified by a URL only, the URL can be passed as an explicit argument and the MediaDescriptor is empty. To understand how to work with the MediaDescriptor in the implementation of a filter or elsewhere see the [[Documentation/DevGuide/OfficeDev/Handling_Documents#MediaDescriptor| documentation of it]] in the chapter about loading documents.
+
The content will be passed to the <code>loadComponentFromURL()</code> call as a <idl>com.sun.star.document.MediaDescriptor</idl> service that here is implemented as a Sequence of <idl>com.sun.star.beans.PropertyValue</idl>. In most cases it will contain several properties that allow to create an object implementing <idl>com.sun.star.io.XStream</idl> or <idl>com.sun.star.io.XInputStream</idl> that can be used to read the content. It also may contain some properties that the code of other objects (filter, model, controller, frame etc.) can use to steer the loading process. If no properties shall be handed over and the file content is specified by a URL only, the URL can be passed as an explicit argument and the MediaDescriptor can stay empty. To understand how to work with the MediaDescriptor in the implementation of a filter or elsewhere, especially how to retrieve a stream from it, see the [[Documentation/DevGuide/OfficeDev/Handling_Documents#MediaDescriptor| documentation of it]] in the chapter about loading documents.
  
If the stream, provided from the outside or created by the first consumer, is not seekable, every consumer creates one. It creates a buffering stream component that reads in the original stream and provides a seekable stream for all further consumers. This buffered stream can be put into the media descriptor.
+
The component loader uses instances of the <idl>com.sun.star.frame.FrameLoader</idl> or <idl>com.sun.star.frame.SynchronousFrameLoader</idl> services. Which frame loader instance will be used depends on the type of the content. This type must be detected first (see below) based on the TypeDetection configuration that allows to register filters or frame loaders for a particular type. {{AOo}} has a generic Frame Loader service that is used when the detected type has no own frame loader registered but filters. If a custom frame loader is registered for a particular type, it's up to that implementation how the content loading process is carried out and if it uses filters or not. As the current topic is "filters", we will concentrate on the generic frame loader here.
  
 +
{{Note|The <idl>com.sun.star.frame.FrameLoader</idl> service is deprecated. If a custom frame loader is registered, it should be a <idl>com.sun.star.frame.SynchronousFrameLoader</idl> service.}}
  
 +
To load content based on a filter first it must be detected which filter is the right one to use and which document type must be used for this filter to work properly. As basically any content type may be loaded into any available document type and even the same type could be loaded into the same document type in different ways, we could find many registered filters for a particular content type. Finding the right one by evaluating what is passed in the MediaDescriptor is the job of the <idl>com.sun.star.document.TypeDetection</idl> service. The result of this detection will be the name of the content type, the name of the wanted filter and the service name of the document model that shall be the target of the loading process. These results will be placed into the MediaDescriptor so that any code in other objects called later can use that information. By providing either the type name of the content or the document service name in the MediaDescriptor handed over to the component loader the search for a filter can be narrowed down to a subset of filters that match these criteria. By providing a filter name in the MediaDescriptor the detection can even be bypassed completely (the component loader will add the matching type and document service names to the MediaDescriptor though). As the whole process of the Type Detection is completely based on the configuration, it will be described in the [[Documentation/DevGuide/OfficeDev/Configuring_a_Filter_in_OpenOffice.org|chapter about the TypeDetection configuration]].
  
 
+
The next steps will be managed by the generic <idl>com.sun.star.frame.SynchronousFrameLoader</idl> service and hands the target frame over to it. The Frame Loader will create the document of the wanted type using the document service name found in the MediaDescriptor. It will also take the detected filter name and ask the <idl>com.sun.star.document.FilterFactory</idl> service to create the filter and perhaps initialize it with some necessary parameters and ask it for importing the content into the new document (this is described in the chapter [[Documentation/DevGuide/OfficeDev/Filter|about filters]]). If all of this went fine, it will attach the document to the target frame by creating a Controller object for the document model.
Before loading can start, two objects must be found that will work together: the suitable filter and a document model it can load into. As basically any content type may be loaded into any available document model and even the same type could be loaded into said model in different ways, we could have many filters for a particular content type. To find the right one by evaluating what is passed in the MediaDescriptor is the job of the <idl>com.sun.star.document.TypeDetection</idl> service. The result of this detection will be the name of the content type, the name of the wanted filter and the service name of the document model that shall be the target of the loading process. The filter will be created by the <idl>com.sun.star.document.FilterFactory</idl> service.
+
 
+
The <code>TypeDetection</code> also employs the <idl>com.sun.star.document.ExtendedTypeDetection</idl> that examines the given resource and confirms the unique type name determined by <code>TypeDetection</code>. The <code>MediaDescriptor</code> is updated, if necessary, and a unique type name is returned.
+
 
+
Finally, the component loader ensures there is a frame, or creates a new one, if necessary, and asks a frame loader service (<idl>com.sun.star.frame.FrameLoader</idl> or <idl>com.sun.star.frame.SynchronousFrameLoader</idl>) to load the resource into the frame. Its interface <idl>com.sun.star.frame.XFrameLoader</idl> has a method <code>load()</code> that takes a frame, the <code>MediaDescriptor</code> and an event listener, and creates a <idl>com.sun.star.document.ImportFilter</idl> instance at the <code>FilterFactory</code> to load the resource into the given frame. For this purpose, it calls <code>createInstance()</code> with the filter implementation name (such as <code>com.sun.star.comp.Writer.GenericXMLFilter</code>) or <code>createInstanceWithArguments()</code> with the implementation name and additional arguments used to initialize the filter.
+
 
+
Then, the loader calls <code>setTargetDocument()</code> and <code>filter()</code> on the <code>ImportFilter</code> service. The <code>ImportFilter</code> creates its results in the given target document.
+
  
 
===Storing content===
 
===Storing content===
  
A URL or a stream is passed to <code>storeToURL()</code> or <code>storeAsURL()</code> in the interface <idl>com.sun.star.frame.XStorable</idl>, implemented by office documents. The store properties create a media descriptor that is filled with the URL or stream, and the store properties. The <code>TypeDetection</code> provides a unique type name that is used with the <code>FilterFactory</code> to create a <idl>com.sun.star.document.ExportFilter</idl>.
+
A MediaDescriptor is passed to <code>storeToURL()</code> or <code>storeAsURL()</code> in the interface <idl>com.sun.star.frame.XStorable</idl>, implemented by office documents. It will contain several properties that allow to create an object implementing <idl>com.sun.star.io.XStream</idl> or <idl>com.sun.star.io.XOutputStream</idl> that can be used to store the content. It also may contain some properties that give more information about how the storing process should be done. If no properties shall be handed over and the target file is specified by a URL only, the URL can be passed as an explicit argument and the MediaDescriptor can stay empty. To understand how to work with the MediaDescriptor in the implementation of a filter or elsewhere, especially how to retrieve a stream from it, see the [[Documentation/DevGuide/OfficeDev/Handling_Documents#MediaDescriptor| documentation about it]] in the chapter about loading documents.
 
+
The <code>XStorable</code> implementation calls <code>setSourceDocument()</code> and <code>filter()</code> at the filter, which writes the results to the storage specified in the <code>MediaDescriptor</code> passed to <code>filter()</code>.
+
 
+
{{Documentation/Note|Many existing filters are legacy filters. The <code>XStorable</code> implementation does not use the <code>FilterFactory</code> to create them, but triggers filtering by internal calls.}}
+
 
+
In the following, the modules that participate in the loading process are discussed in detail.
+
 
+
=== TypeDetection ===
+
 
+
All content to be loaded must be specified, that is, the type of the content must be well known in {{PRODUCTNAME}}. The type is usually a document type; however, the results of active content (for example, macros) or database content are also described here.
+
 
+
A special service <idl>com.sun.star.document.TypeDetection</idl> is used to accomplish this. It provides an API to associate, for example, a URL or a stream with the extensions well known to {{PRODUCTNAME}}, MIME types or clipboard formats. The resulting value is an internal unique type name used for further operations by using other services, for example, <idl>com.sun.star.frame.FrameLoaderFactory</idl>. This type name can be a part of the already mentioned <code>MediaDescriptor</code>.
+
 
+
It is neither necessary nor useful to replace this service with a custom implementation. It works generically on top of a special configuration. Extending the type detection is done by changing the configuration and is described later. These changes are required whenever new content formats are provided for {{PRODUCTNAME}}, because that is the reason for integrating custom filters into the product.
+
 
+
=== ExtendedTypeDetection ===
+
 
+
Based on the registered types, flat detection is already possible, that is, the assignment of types, for example, to a URL, on the basis of configuration data only. Flat detection cannot always produce a correct result: imagine someone changing the file extension of a text document from .odt to .txt. To ensure correct results, deep detection is needed, that is, the content itself has to be examined. The <idl>com.sun.star.document.ExtendedTypeDetection</idl> service performs this task. It is called a detector. It gets all the information collected on a document and decides which type to assign to it. In the new modular type detection, the detector is a UNO service that registers itself in {{PRODUCTNAME}} and is requested by the generic <code>TypeDetection</code> mechanism, if necessary.
+
 
+
To extend the list of the known content types of {{PRODUCTNAME}}, we suggest implementing a detector component in addition to a filter. It improves the generic detection of {{PRODUCTNAME}} and makes the results more secure.
+
 
+
Inside {{PRODUCTNAME}}, a detector service is called with an already opened stream that is used to find out the content type. If no stream is given, this indicates that someone else is using the service (for example, outside {{PRODUCTNAME}}). In that case the detector may open its own stream by using the URL part of the <code>MediaDescriptor</code>. If the resulting stream is seekable, it should be set inside the descriptor after its position is reset to 0. If the stream is not seekable, it must not be set. Please follow the already mentioned rules for handling streams.
+
 
+
=== FrameLoader ===
+
 
+
Frame loaders load a detected type. A visual component is expected as the result. Such visual components are:
+
 
+
* trivial components only implementing <idl>com.sun.star.awt.XWindow</idl>
+
* simple office components implementing the <idl>com.sun.star.frame.Controller</idl> service
+
* full featured office components implementing the <idl>com.sun.star.document.OfficeDocument</idl> service.
+
::Further details are found in section [[Documentation/DevGuide/OfficeDev/Framework API|Framework API]].
+
 
+
Frame loader services exist in two versions:
+
 
+
* <idl>com.sun.star.frame.FrameLoader</idl> for asynchronous
+
* <idl>com.sun.star.frame.SynchronousFrameLoader</idl> for synchronous load processes.
+
 
+
A frame loader can be looked up or created through another service, <idl>com.sun.star.frame.FrameLoaderFactory</idl>, that is described below. The synchronous version is optional. Both services can be implemented by the same component, but the synchronous version is preferred if it is supported.
+
 
+
There are two ways to extend {{PRODUCTNAME}} to load a new content format:
+
 
+
* implementing a frame loader that uses its own internal mechanism to create the expected visual component, for example, local file access.
+
* implementing a filter that does the same, but is used by a generic frame loader implementation.
+
  
Note that the first method does not work for exporting, because a loader service cannot be used at save time. The way to enable a content format for both import and export is to provide a filter service. A generic frame loader implementation already exists in {{PRODUCTNAME}} that uses all well known registered filters in a uniform way. So the second method is preferred.
+
If the MediaDescriptor contains a type name or a filter name, the suitable export filter will be created using the <code>FilterFactory</code>. If neither of them is provided, the document will be stored with the latest ODF filter.
  
 
{{PDL1}}
 
{{PDL1}}
  
 
[[Category:Documentation/Developer's Guide/Office Development]]
 
[[Category:Documentation/Developer's Guide/Office Development]]

Latest revision as of 14:34, 9 August 2021



In Apache OpenOffice, loading and saving content is a modular process built on UNO services. Some of these services are abstract (for example, com.sun.star.document.ExtendedTypeDetection and the filter services), so an extensible set of implementations can be plugged in; others (for example, the com.sun.star.document.TypeDetection service) define the workflow. Because they are exchangeable like any UNO service, the whole process of finding and using filters can be changed without changing the other components involved.

Loading content

The most general way to load content into Apache OpenOffice is to call the loadComponentFromURL() method of a suitable object. That object may be the com.sun.star.frame.Desktop object or any instance of the com.sun.star.frame.Frame service. Content loaded this way always ends up in a frame object. If the method is called at the desktop, it finds or creates this frame using some of the passed arguments, as described in the API documentation of loadComponentFromURL(), and then forwards the call to that frame. The following diagram shows the workflow that is explained in the paragraphs below.

Figure: General Filtering Process (sequence diagram of loading a URL)

The content is described to the loadComponentFromURL() call by a com.sun.star.document.MediaDescriptor, which is passed as a sequence of com.sun.star.beans.PropertyValue. In most cases it contains several properties that make it possible to create an object implementing com.sun.star.io.XStream or com.sun.star.io.XInputStream from which the content can be read. It may also contain properties that the code of other objects (filter, model, controller, frame etc.) can use to steer the loading process. If no properties need to be handed over and the file content is specified by a URL only, the URL can be passed as an explicit argument and the MediaDescriptor can stay empty. To understand how to work with the MediaDescriptor in the implementation of a filter or elsewhere, especially how to retrieve a stream from it, see its documentation in the chapter about loading documents.
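The shape of such a descriptor can be illustrated with a minimal Python sketch. It does not use the UNO runtime: the PropertyValue class below is only a stand-in that models the Name and Value members of com.sun.star.beans.PropertyValue, and make_descriptor(), get_property() and the file path are hypothetical helpers and values introduced for illustration. ReadOnly, Hidden and FilterName are property names of the MediaDescriptor service.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Tuple


@dataclass
class PropertyValue:
    """Stand-in for com.sun.star.beans.PropertyValue (only Name and Value are modelled)."""
    Name: str
    Value: Any = None


def make_descriptor(**props: Any) -> Tuple[PropertyValue, ...]:
    """Build a MediaDescriptor-like sequence from keyword arguments (hypothetical helper)."""
    return tuple(PropertyValue(name, value) for name, value in props.items())


def get_property(descriptor: Sequence[PropertyValue], name: str, default: Any = None) -> Any:
    """Return the value of the first property called `name`, or `default` if it is absent."""
    for prop in descriptor:
        if prop.Name == name:
            return prop.Value
    return default


# The URL travels as an explicit argument, so the descriptor may stay empty.
url = "file:///tmp/example.odt"  # hypothetical path
empty_descriptor = make_descriptor()

# Optional properties that steer the loading process.
descriptor = make_descriptor(ReadOnly=True, Hidden=True)

print(len(empty_descriptor))
print(get_property(descriptor, "ReadOnly"))
print(get_property(descriptor, "FilterName", "<not set>"))
```

With a running office and a language binding, the same sequence would be built from real PropertyValue structs and passed as the last argument of loadComponentFromURL(URL, TargetFrameName, SearchFlags, Arguments).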

The component loader uses instances of the com.sun.star.frame.FrameLoader or com.sun.star.frame.SynchronousFrameLoader services. Which frame loader is used depends on the type of the content. This type must be detected first (see below), based on the TypeDetection configuration, in which filters or frame loaders can be registered for a particular type. Apache OpenOffice has a generic frame loader service that is used when the detected type has filters registered but no frame loader of its own. If a custom frame loader is registered for a particular type, that implementation decides how the content is loaded and whether it uses filters at all. As the current topic is filters, this section concentrates on the generic frame loader.

Note: The com.sun.star.frame.FrameLoader service is deprecated. If a custom frame loader is registered, it should be a com.sun.star.frame.SynchronousFrameLoader service.

Before content can be loaded through a filter, it must be detected which filter is the right one and which document type that filter needs in order to work properly. Basically any content type may be loaded into any available document type, and even the same content type may be loaded into the same document type in different ways, so many filters can be registered for one content type. Finding the right one by evaluating what is passed in the MediaDescriptor is the job of the com.sun.star.document.TypeDetection service. The result of this detection is the name of the content type, the name of the wanted filter and the service name of the document model that is to be the target of the loading process. These results are placed into the MediaDescriptor so that code in objects called later can use them.

The search can be narrowed: if either the type name of the content or the document service name is provided in the MediaDescriptor handed to the component loader, only filters matching these criteria are considered. If a filter name is provided, the detection is bypassed completely (the component loader still adds the matching type and document service names to the MediaDescriptor). Because Type Detection is based entirely on the configuration, it is described in the chapter about the TypeDetection configuration.
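The decision logic described above can be sketched in plain Python. This is a toy model under stated assumptions, not the office implementation: the two tables stand in for the TypeDetection configuration and list only a couple of illustrative entries, the descriptor is modelled as a dictionary, the keys TypeName and DocumentService are used here as illustrative names for the detection results, and only detection by file extension is modelled (no examination of the content).

```python
from typing import Dict

# Illustrative registry: filter name -> (type name, document service name).
# In the office this information lives in the TypeDetection configuration.
FILTERS = {
    "writer8": ("writer8", "com.sun.star.text.TextDocument"),
    "calc8": ("calc8", "com.sun.star.sheet.SpreadsheetDocument"),
}

# Illustrative detection table: file extension -> type name.
TYPES_BY_EXTENSION = {"odt": "writer8", "ods": "calc8"}


def detect(url: str, descriptor: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the descriptor completed with type, filter and document service."""
    result = dict(descriptor)

    filter_name = result.get("FilterName")
    if filter_name is not None:
        # A filter name bypasses detection; the matching names are still added.
        type_name, service = FILTERS[filter_name]
        result.setdefault("TypeName", type_name)
        result.setdefault("DocumentService", service)
        return result

    # A given type name is trusted; otherwise the type is derived from the URL.
    type_name = result.get("TypeName")
    if type_name is None:
        extension = url.rsplit(".", 1)[-1].lower()
        type_name = TYPES_BY_EXTENSION.get(extension)

    # A given document service narrows the set of candidate filters.
    wanted_service = result.get("DocumentService")
    for name, (filter_type, service) in FILTERS.items():
        if filter_type == type_name and wanted_service in (None, service):
            result.update(FilterName=name, TypeName=filter_type, DocumentService=service)
            return result

    raise LookupError(f"no filter registered for {url!r}")


print(detect("file:///tmp/report.odt", {}))
print(detect("file:///tmp/data.bin", {"FilterName": "calc8"}))
```

The second call shows the bypass: the extension is never looked at, because the caller already named the filter.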

The next steps are managed by the generic com.sun.star.frame.SynchronousFrameLoader service, to which the component loader hands over the target frame. The frame loader creates a document of the wanted type using the document service name found in the MediaDescriptor. It then takes the detected filter name, asks the com.sun.star.document.FilterFactory service to create the filter, initializes it with any necessary parameters, and asks it to import the content into the new document (this is described in the chapter about filters). If all of this succeeds, it attaches the document to the target frame by creating a Controller object for the document model.
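The order of these steps can be sketched with a few stand-in classes. Everything below is a hypothetical toy model written for illustration: Document, ImportFilter, Frame and load() are not UNO objects, the filter is constructed directly instead of through the FilterFactory, and the controller is reduced to a tuple. Only the method names setTargetDocument() and filter() mirror the calls a loader makes on an import filter.

```python
from typing import Dict, Optional, Tuple


class Document:
    """Stand-in for a document model created from its service name."""

    def __init__(self, service_name: str) -> None:
        self.service_name = service_name
        self.content: Optional[str] = None


class ImportFilter:
    """Toy import filter offering setTargetDocument() and filter()."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.target: Optional[Document] = None

    def setTargetDocument(self, document: Document) -> None:
        self.target = document

    def filter(self, descriptor: Dict[str, str]) -> bool:
        if self.target is None:
            return False  # nothing to import into
        self.target.content = f"imported {descriptor['URL']} via {self.name}"
        return True


class Frame:
    """Stand-in for the target frame; the controller is attached on success."""

    def __init__(self) -> None:
        self.controller: Optional[Tuple[str, Document]] = None


def load(frame: Frame, descriptor: Dict[str, str]) -> bool:
    """Mimic the generic frame loader: document, filter, import, controller."""
    document = Document(descriptor["DocumentService"])
    import_filter = ImportFilter(descriptor["FilterName"])  # stands in for the FilterFactory
    import_filter.setTargetDocument(document)
    if not import_filter.filter(descriptor):
        return False  # the frame stays untouched if the import fails
    frame.controller = ("controller", document)
    return True


frame = Frame()
ok = load(frame, {
    "URL": "file:///tmp/example.odt",  # hypothetical path
    "FilterName": "writer8",
    "DocumentService": "com.sun.star.text.TextDocument",
})
print(ok, frame.controller[1].content)
```

The point of the sketch is the ordering: the document exists before the filter runs, and the frame only receives a controller after the filter reports success.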

Storing content

A MediaDescriptor is passed to storeToURL() or storeAsURL() in the interface com.sun.star.frame.XStorable, which is implemented by office documents. It contains several properties that make it possible to create an object implementing com.sun.star.io.XStream or com.sun.star.io.XOutputStream to which the content can be written. It may also contain properties that give more information about how the storing process should be carried out. If no properties need to be handed over and the target file is specified by a URL only, the URL can be passed as an explicit argument and the MediaDescriptor can stay empty. To understand how to work with the MediaDescriptor in the implementation of a filter or elsewhere, especially how to retrieve a stream from it, see its documentation in the chapter about loading documents.

If the MediaDescriptor contains a type name or a filter name, the suitable export filter is created using the FilterFactory. If neither is provided, the document is stored with the latest ODF filter.
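The selection rule for the export filter can be written down as a small Python function. This is a sketch under assumptions: the descriptor is modelled as a dictionary, the key TypeName and the mapping FILTER_FOR_TYPE are illustrative, the default filter is passed in by the caller because it depends on the kind of document, and the precedence of a filter name over a type name when both are present is an assumption of this sketch, not something the text above specifies.

```python
from typing import Dict

# Illustrative mapping from type name to export filter name.
FILTER_FOR_TYPE = {
    "writer8": "writer8",
    "pdf_Portable_Document_Format": "writer_pdf_Export",
}


def choose_export_filter(descriptor: Dict[str, str], default_odf_filter: str) -> str:
    """Pick the export filter name for a store call."""
    if "FilterName" in descriptor:
        # Assumption of this sketch: an explicit filter name wins over a type name.
        return descriptor["FilterName"]
    if "TypeName" in descriptor:
        return FILTER_FOR_TYPE[descriptor["TypeName"]]
    # Neither given: fall back to the ODF filter of the document.
    return default_odf_filter


print(choose_export_filter({}, "writer8"))
print(choose_export_filter({"FilterName": "writer_pdf_Export"}, "writer8"))
print(choose_export_filter({"TypeName": "pdf_Portable_Document_Format"}, "writer8"))
```

For a text document, an empty descriptor therefore leads to the ODF text filter, while naming a filter or a type selects a different export format.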

Content on this page is licensed under the Public Documentation License (PDL).