Difference between revisions of "Performance/load performance implement"

From Apache OpenOffice Wiki

Revision as of 08:59, 7 December 2009

For the analysis, see Performance/Odf_document_load_performance_increase_feasibility_analysis. This document explains the implementation and records some discussion.

Implementation

Collecting data through SAX:

  • The first step is simple and poses no particular problem.
  • The SAX parser collects the valid data into a data structure whose elements can be looked up quickly by position. For now I have chosen a vector.
  • Each element has these basic fields:
    • sal_uInt16 m_prefix;
    • rtl_uString* mp_localName;
    • sal_uInt32 m_local;
    • sal_uInt32 m_distance;
    • sal_uInt32 m_count;
    • and other fields such as namespace / full name / cached token enum ...
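
The record described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in xmloff: the page does not define "m_local", "m_distance" or "m_count", so the sketch assumes they mean the element's own index in the vector, the distance back to its parent's index, and the number of direct subelements. Standard C++ types stand in for sal_uInt16, sal_uInt32 and rtl_uString*, and the names ElementRecord, Collector and collectSample are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flat record; std types stand in for the sal_* / rtl_* types.
struct ElementRecord {
    std::uint16_t m_prefix;     // namespace prefix token
    std::string   m_localName;  // stands in for rtl_uString* mp_localName
    std::uint32_t m_local;      // ASSUMPTION: index of this element in the vector
    std::uint32_t m_distance;   // ASSUMPTION: own index minus parent's index (0 for the root)
    std::uint32_t m_count;      // ASSUMPTION: number of direct subelements
};

// Hypothetical collector driven by SAX-style start/end callbacks.
class Collector {
public:
    void startElement(std::uint16_t prefix, const std::string& localName) {
        const std::uint32_t self = static_cast<std::uint32_t>(m_records.size());
        std::uint32_t distance = 0;
        if (!m_open.empty()) {
            const std::uint32_t parent = m_open.back();
            distance = self - parent;
            ++m_records[parent].m_count;  // the parent gains one direct subelement
        }
        m_records.push_back(ElementRecord{prefix, localName, self, distance, 0});
        m_open.push_back(self);
    }

    void endElement() {
        if (!m_open.empty()) m_open.pop_back();
    }

    const std::vector<ElementRecord>& records() const { return m_records; }

private:
    std::vector<ElementRecord> m_records;  // elements in document order
    std::vector<std::uint32_t> m_open;     // stack of currently open element indices
};

// Sample input: <office:meta><meta:generator/><meta:creation-date/></office:meta>
// The prefix tokens 1 and 2 are arbitrary illustration values.
std::vector<ElementRecord> collectSample() {
    Collector c;
    c.startElement(1, "meta");
    c.startElement(2, "generator");
    c.endElement();
    c.startElement(2, "creation-date");
    c.endElement();
    c.endElement();
    return c.records();
}
```

Because the records sit in one vector in document order, any element can be reached by index without walking a tree of heap-allocated nodes.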

Processing method

  • Using each element's "m_local", "m_distance" and "m_count", we can walk the collected result and find any element's parent and subelements.
  • Processing of every element has two steps:
    • Process the element itself: its attributes and its subelements.
    • Commit the result to the parent element.
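
The lookup of parents and subelements might look like the sketch below. It assumes, since the page does not say, that "m_local" is the element's own index, "m_distance" is the offset back to the parent's index (0 for the root), and "m_count" is the number of direct subelements. Rec, parentOf, childrenOf and sampleTree are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record reduced to the three navigation fields.
struct Rec {
    std::uint32_t m_local;     // ASSUMPTION: own index in the vector
    std::uint32_t m_distance;  // ASSUMPTION: own index minus parent's index, 0 for the root
    std::uint32_t m_count;     // ASSUMPTION: number of direct subelements
};

// Index of the parent; the root is reported as its own parent.
std::uint32_t parentOf(const std::vector<Rec>& v, std::uint32_t i) {
    return i - v[i].m_distance;
}

// Indices of the direct subelements of element i, in document order.
// Scans forward and stops as soon as m_count children have been found.
std::vector<std::uint32_t> childrenOf(const std::vector<Rec>& v, std::uint32_t i) {
    std::vector<std::uint32_t> out;
    for (std::uint32_t j = i + 1; j < v.size() && out.size() < v[i].m_count; ++j) {
        if (j - v[j].m_distance == i) out.push_back(j);
    }
    return out;
}

// Sample: element 0 has children 1 and 3; element 1 has child 2.
std::vector<Rec> sampleTree() {
    return {
        {0, 0, 2},  // root
        {1, 1, 1},  // first child of the root
        {2, 1, 0},  // grandchild, child of element 1
        {3, 3, 0},  // second child of the root
    };
}
```

The parent lookup is a single subtraction, while the child lookup is a forward scan bounded by the size of the subtree.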

[Image: Odfcontext Process Implement.jpg]

      • processElement(): process the element
        • _startElement(): start the element
        • processSubContexts(): if the element is a parent, process its subelements; each subelement is processed in three steps:
          • _createChildContext(): create the child context
          • _processSubContext(): process the subelement
          • _collectSubContext(): collect the subelement's result data
        • _characters(): process the element's content
        • _endElement(): end the element
      • commit(): commit the result data to the parent element
  • An element can be processed in one of three ways: serial, parallel or delayed.
    • Serial processing: every element is processed the same way. When the current element is finished, control returns to its parent, which moves on to its next child if there is one, or otherwise returns to its own parent.
    • Parallel processing: for an element with many subelements, each subelement's "_processSubContext()" is dispatched to a separate worker thread. Once all subelements have finished, "_collectSubContext()" runs serially.
    • Delayed processing: for an element with many subelements, the first subelement is processed immediately and the rest are deferred until the end of document processing.
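
The three strategies can be contrasted in a small sketch. It is a simplification under assumptions, not the xmloff implementation: only the "_processSubContext()" and "_collectSubContext()" steps are modelled, a subelement's result is reduced to a string, and std::async stands in for whatever worker-thread mechanism is actually used. SubElement, Parent, DelayedQueue and the process* functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subelement; "processing" it just produces a string result.
struct SubElement { std::string name; };

// Stand-in for _processSubContext(): independent of every other subelement.
std::string processSubContext(const SubElement& e) { return "<" + e.name + ">"; }

// Hypothetical parent; collectSubContext() stands in for _collectSubContext().
struct Parent {
    std::vector<std::string> collected;
    void collectSubContext(std::string r) { collected.push_back(std::move(r)); }
};

// Serial: process and collect one subelement after another.
Parent processSerial(const std::vector<SubElement>& subs) {
    Parent p;
    for (const auto& s : subs) p.collectSubContext(processSubContext(s));
    return p;
}

// Parallel: process every subelement on its own thread, then collect serially.
Parent processParallel(const std::vector<SubElement>& subs) {
    std::vector<std::future<std::string>> pending;
    for (const auto& s : subs) {
        pending.push_back(std::async(std::launch::async,
                                     [&s] { return processSubContext(s); }));
    }
    Parent p;
    for (auto& f : pending) p.collectSubContext(f.get());  // document order is kept
    return p;
}

// Delayed: work that is postponed until document processing ends.
struct DelayedQueue {
    std::vector<std::function<void()>> tasks;
    void runAll() {
        for (auto& t : tasks) t();
        tasks.clear();
    }
};

// Process the first subelement now and queue the rest.
// subs, p and q must outlive the call to q.runAll().
void processDelayed(const std::vector<SubElement>& subs, Parent& p, DelayedQueue& q) {
    for (std::size_t i = 0; i < subs.size(); ++i) {
        if (i == 0) {
            p.collectSubContext(processSubContext(subs[i]));
        } else {
            q.tasks.push_back([&p, &subs, i] {
                p.collectSubContext(processSubContext(subs[i]));
            });
        }
    }
}
```

In this sketch the parallel variant is only safe because processSubContext() touches no shared state; the collect step stays on one thread so the parent never needs a lock.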

Processing

  • Taken as a whole, the document is processed serially.
    • An ODF document has four basic parts: "meta", "settings", "styles" and "content".
    • The dependency chain is: "content" -> "styles" -> "settings" -> "meta" (I think this poses no problem :-) ).
    • So the order stays as it is now: "meta" -> "settings" -> "styles" -> "content".
  • Within each part, parallel processing is possible.
    • The meta, settings and styles parts can each use "Parallel processing".
    • I think the content part can use either "Parallel processing" or "Delayed processing", since no other part depends on it.
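
The load order follows from the dependency chain, which can be checked with a small depth-first sketch. This is purely illustrative: the importer itself simply loads the four streams in a fixed order, and Deps, visit, loadOrder and odfParts are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each part to the parts it depends on.
using Deps = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: emit a part only after everything it depends on.
void visit(const std::string& part, const Deps& deps,
           std::set<std::string>& done, std::vector<std::string>& order) {
    if (!done.insert(part).second) return;  // already visited
    auto it = deps.find(part);
    if (it != deps.end()) {
        for (const auto& d : it->second) visit(d, deps, done, order);
    }
    order.push_back(part);
}

// Returns the parts in an order that satisfies all dependencies.
// No cycle detection: the chain described on this page has none.
std::vector<std::string> loadOrder(const Deps& deps) {
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const auto& kv : deps) visit(kv.first, deps, done, order);
    return order;
}

// The chain from the text: "content" -> "styles" -> "settings" -> "meta".
Deps odfParts() {
    return {
        {"content",  {"styles"}},
        {"styles",   {"settings"}},
        {"settings", {"meta"}},
        {"meta",     {}},
    };
}
```

For this chain, loadOrder(odfParts()) yields meta, settings, styles, content, which matches the current serial order.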

Difficulty

Meta part

  • The subelements of "<office:meta>" look like "<meta:*>". I think they are independent of one another, so they can be processed in parallel.
  • Currently this part is processed as a DOM object; I do not know why. So for now this part is processed serially.

Settings part

  • Every subelement of "<office:settings>" yields a "beans::PropertyValue". They can be processed in parallel.

Styles part and Content part

  • The objects come from sfx2, sd, sc and sw, which makes this part complex.

Plan

  • Implement most of the sd source code according to this plan.
  • Debug until serial processing works correctly.
  • Try testing parallel processing.
  • Analyze whether delayed processing is feasible.