Performance/load performance implement

From Apache OpenOffice Wiki

Performance Project


For the analysis, see Performance/load_performance_analysis. This document explains the implementation and records some discussion.


Collect data through SAX:

  • The first step is simple and poses no real problem.
  • The SAX parser collects the valid data into a data structure whose elements can be looked up quickly by their location identifier. For now I have chosen a vector.
  • Each element has these basic fields:
    • sal_uInt16 m_prefix;
    • rtl_uString* mp_localName;
    • sal_uInt32 m_local;
    • sal_uInt32 m_distance;
    • sal_uInt32 m_count;
    • and other fields such as namespace / full name / cached token enum ...
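
The collection step above can be sketched roughly as follows. This is only an illustration, not the project's code: the page does not define what "m_local", "m_distance" and "m_count" mean, so the sketch assumes they are the element's own index in the vector, the index offset back to its parent, and the number of descendants stored after it. "ElementCollector", "parentOf" and "childrenOf" are hypothetical names, and std::string stands in for rtl_uString*.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins for the sal types so the sketch compiles on its own.
using sal_uInt16 = std::uint16_t;
using sal_uInt32 = std::uint32_t;

struct Element {
    sal_uInt16  m_prefix;     // namespace prefix id
    std::string m_localName;  // stands in for rtl_uString* mp_localName
    sal_uInt32  m_local;      // ASSUMPTION: own index in the vector
    sal_uInt32  m_distance;   // ASSUMPTION: index offset to the parent (0 for the root)
    sal_uInt32  m_count;      // ASSUMPTION: number of descendants stored after this element
};

// Hypothetical collector fed by SAX start/end callbacks, in document order.
class ElementCollector {
public:
    void startElement(sal_uInt16 prefix, const std::string& localName) {
        const sal_uInt32 self = static_cast<sal_uInt32>(m_elements.size());
        const sal_uInt32 distance = m_open.empty() ? 0 : self - m_open.back();
        m_elements.push_back(Element{prefix, localName, self, distance, 0});
        m_open.push_back(self);
    }

    void endElement() {
        assert(!m_open.empty());
        const sal_uInt32 self = m_open.back();
        m_open.pop_back();
        // Everything appended since this element was opened is a descendant.
        m_elements[self].m_count =
            static_cast<sal_uInt32>(m_elements.size()) - self - 1;
    }

    const std::vector<Element>& elements() const { return m_elements; }

private:
    std::vector<Element>    m_elements;  // flat storage, document order
    std::vector<sal_uInt32> m_open;      // indices of currently open elements
};

// Index of the parent of element i (the root is its own parent here).
inline sal_uInt32 parentOf(const std::vector<Element>& v, sal_uInt32 i) {
    return i - v[i].m_distance;
}

// Indices of the direct subelements of element i.
inline std::vector<sal_uInt32> childrenOf(const std::vector<Element>& v, sal_uInt32 i) {
    std::vector<sal_uInt32> out;
    const sal_uInt32 end = i + v[i].m_count;
    // Skip over each child's own descendants to reach its next sibling.
    for (sal_uInt32 c = i + 1; c <= end; c += v[c].m_count + 1) {
        out.push_back(c);
    }
    return out;
}
```

With this layout, finding a parent is a single subtraction and finding the next sibling is a single addition, which is one way to get the quick lookups described above.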

Processing method

  • We can walk the result element by element, using each element's "m_local", "m_distance" and "m_count" fields to reach its parent element and its subelements.
  • Processing an element takes two steps:
    • Process the element itself: its attributes and its subelements.
    • Commit the resulting information to the parent element.

[Figure: Odfcontext Process Implement.jpg]

      • processElement() : Process the element
        • _startElement() : Start the element
        • processSubContexts() : If this element is a parent, process its subelements; each subelement is processed in three steps.
          • _createChildContext(): Create the child context
          • _processSubContext(): Process the subelement
          • _collectSubContext(): Collect the subelement's result data
        • _characters(): Process the element's content
        • _endElement(): End the element
      • commit(): Commit the result data to the parent element
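
The call sequence listed above might look like the following minimal sketch. Only the method names come from this page; the "Node" struct, the log and the way "_collectSubContext()" calls "commit()" on the child are assumptions made so that the order of calls can be observed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed element; stands in for the data collected through SAX.
struct Node {
    std::string name;
    std::string text;
    std::vector<Node> children;
};

// Hypothetical import context. Each step only writes to a log so that the
// order of the calls is visible.
class Context {
public:
    Context(const Node& node, std::vector<std::string>& log)
        : m_node(node), m_log(log) {}

    void processElement() {
        _startElement();
        processSubContexts();
        _characters();
        _endElement();
    }

    // Hand the finished result to the parent (nullptr for the root).
    void commit(Context* parent) {
        assert(m_ended && "commit() is expected after processElement()");
        m_log.push_back("commit " + m_node.name);
        if (parent != nullptr) parent->m_results.push_back(m_node.name);
    }

    const std::vector<std::string>& results() const { return m_results; }

private:
    void _startElement() { m_log.push_back("start " + m_node.name); }

    void processSubContexts() {
        for (const Node& child : m_node.children) {
            Context sub = _createChildContext(child);
            _processSubContext(sub);
            _collectSubContext(sub);
        }
    }

    Context _createChildContext(const Node& child) { return Context(child, m_log); }
    void _processSubContext(Context& sub) { sub.processElement(); }
    // ASSUMPTION: collecting a subelement means letting it commit to this context.
    void _collectSubContext(Context& sub) { sub.commit(this); }
    void _characters() { m_log.push_back("characters " + m_node.name); }
    void _endElement() { m_ended = true; m_log.push_back("end " + m_node.name); }

    const Node&               m_node;
    std::vector<std::string>& m_log;
    std::vector<std::string>  m_results;  // data committed by subelements
    bool                      m_ended = false;
};
```

Calling processElement() and then commit(nullptr) on a context for an "office:meta" node with two children logs each child's start, characters, end and commit steps before the parent's own characters and end steps.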


An element can be processed in one of three ways: serial, parallel or delayed processing.

Serial processing

  • Every element is processed in the same way. When processing of the current element ends, control returns to its parent. The parent then moves on to its next child if there is one, or otherwise returns to its own parent.

Parallel processing

  • An element with many subelements hands each subelement's "_processSubContext()" to a different worker thread. When all subelements have finished, the "_collectSubContext()" calls run serially.
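
A minimal sketch of this split, using std::async from the C++ standard library rather than whatever threading facility the real implementation would use. "SubResult", "processSubContext" and "processSubContextsParallel" are hypothetical names, and the "work" done per subelement is a placeholder. The sketch assumes that subelements do not share mutable state.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical result of processing one subelement.
struct SubResult {
    std::string name;
    std::size_t value;
};

// Stand-in for _processSubContext(): it touches no shared state, so several
// calls can safely run at the same time. The "work" is a placeholder.
SubResult processSubContext(const std::string& name) {
    return SubResult{name, name.size()};
}

// Process every subelement on its own thread, then collect serially.
std::vector<SubResult> processSubContextsParallel(
        const std::vector<std::string>& subelements) {
    std::vector<std::future<SubResult>> pending;
    pending.reserve(subelements.size());
    for (const std::string& name : subelements) {
        // std::launch::async forces a separate thread for each task.
        pending.push_back(std::async(std::launch::async, processSubContext, name));
    }

    // Stand-in for _collectSubContext(): runs on the calling thread, in
    // document order, once each subelement has finished.
    std::vector<SubResult> collected;
    collected.reserve(pending.size());
    for (std::future<SubResult>& f : pending) {
        assert(f.valid());
        collected.push_back(f.get());
    }
    return collected;
}
```

Because the futures are read in the order they were created, the collected results keep document order even if the worker threads finish in a different order. One thread per subelement is only for illustration; a real implementation would more likely use a bounded pool.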

Delayed processing

  • For an element with many subelements, the first subelement is processed immediately and the others are deferred until the rest of the document has been processed.
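
One possible way to sketch delayed processing is a queue of deferred tasks that runs once at the end of the document. "DelayedProcessor" and its methods are hypothetical names, and the per-subelement work is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: processes the first subelement immediately and
// queues the remaining ones until the document has been processed.
class DelayedProcessor {
public:
    void processSubelements(const std::vector<std::string>& subelements) {
        assert(!m_finished && "document processing has already ended");
        for (std::size_t i = 0; i < subelements.size(); ++i) {
            const std::string name = subelements[i];
            if (i == 0) {
                process(name);  // the first subelement is handled right away
            } else {
                // Capture a copy of the name so the task can run later.
                m_deferred.push_back([this, name] { process(name); });
            }
        }
    }

    // Call once, after the rest of the document has been processed.
    void finishDocument() {
        for (std::function<void()>& task : m_deferred) task();
        m_deferred.clear();
        m_finished = true;
    }

    const std::vector<std::string>& processed() const { return m_processed; }
    std::size_t pending() const { return m_deferred.size(); }

private:
    // Placeholder for the real work done per subelement.
    void process(const std::string& name) { m_processed.push_back(name); }

    std::vector<std::string>           m_processed;
    std::vector<std::function<void()>> m_deferred;
    bool                               m_finished = false;
};
```

The deferred tasks run in their original order, so the final result matches serial processing. Only the point in time at which the later subelements are handled changes.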


  • Seen as a whole, the document is processed serially.
    • An ODF document has four basic parts: "meta", "settings", "styles" and "content".
    • The dependency relation is: "content" -> "styles" -> "settings" -> "meta" (I think this is not a problem :-) )
    • So the processing order stays the same as now: "meta" -> "settings" -> "styles" -> "content".
  • Within each part, parallel processing is possible.
    • The meta, settings and styles parts can each use "Parallel processing".
    • I think the content part can use "Parallel processing" or "Delayed processing", since no other part depends on it.
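
The step from the dependency relation to the processing order can be illustrated with a small depth-first ordering. This is only a sketch: the importer does not necessarily compute the order this way, and "Dependencies", "visit" and "loadOrder" are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// part name -> the parts it depends on
using Dependencies = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: a part is emitted only after everything it depends on.
void visit(const std::string& part, const Dependencies& deps,
           std::set<std::string>& active, std::set<std::string>& done,
           std::vector<std::string>& order) {
    if (done.count(part) != 0) return;
    assert(active.count(part) == 0 && "dependency cycle between parts");
    active.insert(part);
    const auto it = deps.find(part);
    if (it != deps.end()) {
        for (const std::string& dep : it->second) {
            visit(dep, deps, active, done, order);
        }
    }
    active.erase(part);
    done.insert(part);
    order.push_back(part);
}

// Returns an order in which every part comes after the parts it depends on.
std::vector<std::string> loadOrder(const Dependencies& deps) {
    std::set<std::string> active;
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const auto& entry : deps) {
        visit(entry.first, deps, active, done, order);
    }
    return order;
}
```

Given the relation "content" -> "styles" -> "settings" -> "meta", loadOrder returns meta, settings, styles, content, which matches the serial order described above.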


Meta part

  • The subelements of "<office:meta>" look like "<meta:*>". I think they are independent of one another, so they can be processed in parallel.
  • Currently this part is processed as a DOM object; I don't know why. So for now this part is processed serially.

Settings part

  • Every subelement of "<office:settings>" yields a "beans::PropertyValue", so the subelements can be processed in parallel.

Styles part and Content part

  • The objects involved come from sfx2, sd, sc and sw, so these parts are complex.


  • Implement most of the sd source code according to the plan.
  • Debug to confirm that serial processing works correctly.
  • Try to test parallel processing.
  • Try to analyze whether delayed processing is feasible.

Performance/load_performance_implement_2