From Apache OpenOffice Wiki


Investigation and Profiling of Writer Load/Save Performance

We started systematic profiling of the load/save performance on a current milestone (DEV300_m45), using Intel vTune on Windows and cachegrind (with cache simulation) on Linux. On each platform, each of the four documents (see Test Documents below) is profiled for one load and one save operation. Each measurement is done twice. In total, this results in:

2 (platforms/profilers) * 4 (documents) * 2 (load/save) * 2 (measurements) = 32 measurements in total

The analysis of the data should:

  • show whether the profiler output is stable and reproducible
  • show whether there are differences between platforms and profilers (and, for cachegrind, whether cache simulation yields meaningful additional accuracy)
  • identify the hotspots of the current implementation
  • serve as a basis for evaluating the progress made by optimizations

Bird's-Eye View: Callgrind (Contributions to StoreToUrl, DEV300_m45)

Contributions to StoreToUrl
Document         Lib     Method                                     Instructions fetched  Cycle Cost Est.  Instructions fetched (%)  Cycle Cost Est. (%)
ScienceThesis 1  libsfx  SfxBaseModel::StoreAsUrl                   11592161864           12700226484      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            4567386009            5193895809       39.40%                    40.90%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  2336567459            2525183559       20.16%                    19.88%
                 libsfx  SfxMedium::Commit                          4535296593            4794222283       39.12%                    37.75%
                         other                                      152911803             186924833        1.32%                     1.47%
ScienceThesis 2  libsfx  SfxBaseModel::StoreAsUrl                   11144535020           12250598830      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            4575357778            5189916868       41.05%                    42.36%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  1925216412            2116790032       17.27%                    17.28%
                 libsfx  SfxMedium::Commit                          4535327879            4792981289       40.70%                    39.12%
                         other                                      108632951             150910641        0.97%                     1.23%
Manual 1         libsfx  SfxBaseModel::StoreAsUrl                   2019415631            2238108511       100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            1239343887            1399947937       61.37%                    62.55%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  607055773             642691573        30.06%                    28.72%
                 libsfx  SfxMedium::Commit                          140565338             157789035        6.96%                     7.05%
                         other                                      32450633              37679966         1.61%                     1.68%
Manual 2         libsfx  SfxBaseModel::StoreAsUrl                   2043487795            2268710495       100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            1262263271            1428106081       61.77%                    62.95%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  608153207             644497207        29.76%                    28.41%
                 libsfx  SfxMedium::Commit                          141902391             154627511        6.94%                     6.82%
                         other                                      31168926              41479696         1.53%                     1.83%
Spec 1           libsfx  SfxBaseModel::StoreAsUrl                   17995716426           20443306916      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            16713230798           19057592468      92.87%                    93.22%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  619569536             652497286        3.44%                     3.19%
                 libsfx  SfxMedium::Commit                          605712688             663294858        3.37%                     3.24%
                         other                                      121121223             133131403        0.67%                     0.65%
Spec 2           libsfx  SfxBaseModel::StoreAsUrl                   18062701503           20537474953      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            16681477861           19055676231      92.35%                    92.78%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  654539677             685122457        3.62%                     3.34%
                 libsfx  SfxMedium::Commit                          605562742             663544862        3.35%                     3.23%
                         other                                      121121223             133131403        0.67%                     0.65%
MailMerge 1      libsfx  SfxBaseModel::StoreAsUrl                   40777455765           53626856665      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            23118516732           34395921752      56.69%                    64.14%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  17475120469           19021058129      42.85%                    35.47%
                 libsfx  SfxMedium::Commit                          161846924             179849124        0.40%                     0.34%
                         other                                      21971640              30027660         0.05%                     0.06%
MailMerge 2      libsfx  SfxBaseModel::StoreAsUrl                   34409955066           45317878876      100.00%                   100.00%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            22931786501           33059699851      66.64%                    72.95%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  11294919676           12054180246      35.03%                    26.60%
                 libsfx  SfxMedium::Commit                          161097067             173977107        0.47%                     0.38%
                         other                                      22151822              30021672         0.06%                     0.07%

                 Lib     Method                                     Instructions fetched  Cycle Cost Est.
std. deviation   libsfx  SfxBaseModel::StoreAsUrl                   9.31%                 9.21%
                 libsfx  SfxObjectShell::SaveAsOwnFormat            1.17%                 2.53%
                 libsfx  SfxObjectShell::GenerateAndStoreThumbnail  23.04%                23.30%
                 libsfx  SfxMedium::Commit                          0.61%                 2.21%
                         other                                      16.88%                12.56%
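
The percentage columns and the "other" rows in the table can be reproduced from the absolute counts. The following is a minimal sketch of that arithmetic; the function names sharePercent and otherCost are made up for illustration and are not part of any profiler tooling.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Share of one callee's cost in the total cost of StoreAsUrl, in percent.
// (Hypothetical helper name, for illustration only.)
double sharePercent(std::uint64_t part, std::uint64_t total)
{
    return 100.0 * static_cast<double>(part) / static_cast<double>(total);
}

// Cost not covered by the listed callees, i.e. the "other" row.
// (Hypothetical helper name, for illustration only.)
std::uint64_t otherCost(std::uint64_t total, const std::vector<std::uint64_t>& parts)
{
    return total - std::accumulate(parts.begin(), parts.end(), std::uint64_t{0});
}
```

For ScienceThesis 1, sharePercent(4567386009, 11592161864) gives roughly 39.40, and subtracting the three listed callees from the StoreAsUrl total leaves 152911803 instructions, which matches the "other" row.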

Callgrind Save XML-Generation (Contributions To SaveAsOwnFormat DEV300_m45)

The following profiling files have been generated with:

valgrind --tool=callgrind "--toggle-collect=*SaveAsOwnFormat*" ./soffice.bin

Rerunning the save procedure shows that the results are highly reproducible (~1% deviation for SaveAsOwnFormat as a whole).

To analyse these files, open them with kcachegrind or callgrind_annotate.

Callgrind Load XML-Parsing (Contributions To LoadOwnFormat DEV300_m45)

The following profiling files have been generated with:

valgrind --tool=callgrind "--toggle-collect=*LoadOwnFormat*" ./soffice.bin

To analyse these files, open them with kcachegrind or callgrind_annotate.

Benchmarking with vTune

The spreadsheet Media:odfsave.ods lists the top consumers from selected libraries (sw, xo, svl, svt, sal3, sfx2) when saving the ODF specification document.


Implemented Optimizations

Save time index entries

Issue 57008

Issue 57008 relates to index entries at save time. Its fix saves an estimated 10% of the save time.

Conversion of Hyperlinks

Issue 100683

  • Conversion of hyperlinks takes a lot of time. To bring hyperlinks into the correct form and to make them relative to the target URL, the methods from svt's URIHelper are used. This is done for _all_ URLs, independent of the protocol. In the given document the URLs are mostly http, while the document is stored to a file. One approach is to convert only if the protocols are the same. Another approach is to manage the hyperlinks within the document and keep them in the correct form at all times, so that they only have to be converted when the target URL changes (saveAs/storeToURL). On the given system this would save about 4 s of a total save time of 12 s (stopwatch estimate).
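
The first approach (convert only if the protocols are the same) could look roughly like the sketch below. It works on plain std::string and does not use the real URIHelper API; schemeOf and needsRelativeConversion are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns the lower-cased scheme of a URL ("http", "file", ...),
// or an empty string if the text before the first ':' is not a valid scheme.
// (Hypothetical helper, not part of URIHelper.)
std::string schemeOf(const std::string& url)
{
    const std::string::size_type colon = url.find(':');
    if (colon == std::string::npos || colon == 0)
        return std::string();
    std::string scheme = url.substr(0, colon);
    if (!std::isalpha(static_cast<unsigned char>(scheme[0])))
        return std::string();
    for (char& c : scheme)
    {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (!std::isalnum(uc) && c != '+' && c != '-' && c != '.')
            return std::string();
        c = static_cast<char>(std::tolower(uc));
    }
    return scheme;
}

// Cheap guard: a link can only be made relative to the target URL
// if both use the same scheme, so everything else can be skipped early.
// (Hypothetical helper, for illustration only.)
bool needsRelativeConversion(const std::string& linkUrl, const std::string& targetUrl)
{
    const std::string linkScheme = schemeOf(linkUrl);
    return !linkScheme.empty() && linkScheme == schemeOf(targetUrl);
}
```

With such a guard, an http link in a document stored to a file URL would skip the expensive conversion, as would a fragment-only URL such as "#bookmarkname".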

Even worse: issue 50983

  • Saving a file with a lot of fragment URLs "#bookmarkname" to a network share takes a lot of time. For comparison: DEV300 m41 takes about 1:55 min, while os128 takes only 28 s to save the document to a network share.

To make sure file URLs are correctly normalized, the dialog code that inserts links of any kind has to call the normalization. This applies to Insert/Hyperlink, Insert/Picture from File and others.

String Indexed Access of PropertySets

Issue 99568

  • Another rather big part of the processing time is spent accessing the members of the implementations of css::beans::XPropertySet, XPropertyState and XPropertySetInfo. To find the requested element by its name, the methods of SfxItemPropertySet, SfxItemPropertySetInfo etc. iterate over an array of structs that each define a property (SfxItemPropertyMap). This shows in the numbers for SfxItemPropertyMap::getByName, rtl::OUString::equalsAsciiL and SfxItemPropertySetInfo::hasPropertyByName in the svl library.
  • The replacement for SfxItemPropertyMap, which uses a std::hash_map, is ready. After changing a lot of code in the applications as well as in svtools, sfx2, svx and others, I started to compare the load/save times.
  • The result is not as expected. Media:Odfsave_withhash.ods shows that SfxItemPropertyMap::getByName() takes longer than before. The new function takes about 5.3 s in total, about 1.7 s more than its predecessor SfxItemPropertyMap::GetByName() required. Most of the time is spent in the _M_Find<::rtl::OUString> method of the hash_map implementation.
  • One probable reason is that the sorted access to properties eliminated a lot of string comparisons.
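
One possible reading of the last point is that callers ask for property names in sorted order against a map that is itself sorted by name, so a lookup can resume where the previous one stopped. The sketch below illustrates that idea under this assumption. PropertyEntry and resolveSorted are simplified, hypothetical stand-ins and do not reflect the real SfxItemPropertyMap layout or code.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Simplified stand-in for one property definition (hypothetical layout).
struct PropertyEntry
{
    const char* name;
    int         id;
};

// Resolves a sorted list of requested names against a map sorted by name
// in one merge-style pass. Returns one id per requested name, or -1 if the
// name is not in the map. The cursor never moves backwards, so the whole
// batch needs on the order of (map size + request count) string comparisons.
std::vector<int> resolveSorted(const std::vector<PropertyEntry>& sortedMap,
                               const std::vector<const char*>& sortedNames)
{
    std::vector<int> ids;
    ids.reserve(sortedNames.size());
    std::size_t cursor = 0;
    for (const char* name : sortedNames)
    {
        while (cursor < sortedMap.size()
               && std::strcmp(sortedMap[cursor].name, name) < 0)
            ++cursor;
        const bool found = cursor < sortedMap.size()
                           && std::strcmp(sortedMap[cursor].name, name) == 0;
        ids.push_back(found ? sortedMap[cursor].id : -1);
    }
    return ids;
}
```

A hash map, by contrast, pays for hashing the full string plus at least one equality comparison on every lookup, regardless of the order in which names are requested. That might explain part of the measured difference, but this has not been verified here.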

Iteration over Frame Collections

Issue 101084

The methods SwDoc::GetFlyCount and SwDoc::GetFlyNum contribute more than 13% of the instructions to SaveAsOwnFormat for the MailMerge document. The iteration over the frames array is O(n^2). Suggested solution:

  • make the frame collections support XEnumerationAccess
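
The sketch below shows how a quadratic cost can arise from index-based access and why enumeration avoids it. It assumes that each indexed call rescans the array from the start; the real SwDoc code may differ in detail. Frame, getFlyCount, getFlyByIndex and the two collect functions are simplified, hypothetical stand-ins.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for an entry in the frames array (hypothetical).
struct Frame
{
    bool isFly; // true if the entry counts as a fly frame
    int  id;
};

// Counts the fly frames: one full pass, O(n).
std::size_t getFlyCount(const std::vector<Frame>& frames)
{
    std::size_t count = 0;
    for (const Frame& f : frames)
        if (f.isFly)
            ++count;
    return count;
}

// Index-based access: every call rescans from the start, O(n) per call.
const Frame* getFlyByIndex(const std::vector<Frame>& frames, std::size_t index)
{
    std::size_t seen = 0;
    for (const Frame& f : frames)
        if (f.isFly && seen++ == index)
            return &f;
    return nullptr;
}

// Visiting all fly frames by index: n calls of O(n) each, O(n^2) overall.
std::vector<int> collectByIndex(const std::vector<Frame>& frames)
{
    std::vector<int> ids;
    const std::size_t count = getFlyCount(frames);
    for (std::size_t i = 0; i < count; ++i)
        ids.push_back(getFlyByIndex(frames, i)->id);
    return ids;
}

// Enumeration-style access: a single pass, O(n) overall.
std::vector<int> collectByEnumeration(const std::vector<Frame>& frames)
{
    std::vector<int> ids;
    for (const Frame& f : frames)
        if (f.isFly)
            ids.push_back(f.id);
    return ids;
}
```

Both functions return the same ids in the same order; only the number of array visits differs, which matters for documents with many frames such as the MailMerge document.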

Compressed files do not need to be compressed again in Storage

Issue 100722

  • The large contribution of SfxMedium::Commit to StoreAsUrl for the documents "ScienceThesis" and "Manual" in the callgrind analysis is attributed to the pictures in these documents. We are investigating whether it helps to store image files that are already compressed (JPEG, for example) directly in the storage without trying in vain to compress them again.
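
One way to decide whether a stream is worth compressing is to look at its leading bytes. The following is a minimal sketch under that assumption, covering only the JPEG and PNG signatures. ZipMethod, startsWith, isAlreadyCompressed and chooseMethod are hypothetical names and not part of the actual storage implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>
#include <vector>

// How a stream would be written into the package (hypothetical enum).
enum class ZipMethod { Stored, Deflated };

// True if the data begins with the given signature bytes.
bool startsWith(const std::vector<std::uint8_t>& data,
                std::initializer_list<std::uint8_t> magic)
{
    return data.size() >= magic.size()
           && std::equal(magic.begin(), magic.end(), data.begin());
}

// Recognizes two common image formats that are already compressed.
bool isAlreadyCompressed(const std::vector<std::uint8_t>& data)
{
    // JPEG files start with FF D8 FF.
    if (startsWith(data, {0xFF, 0xD8, 0xFF}))
        return true;
    // PNG files start with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
    if (startsWith(data, {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A}))
        return true;
    return false;
}

// Store already compressed data as-is; deflate everything else.
ZipMethod chooseMethod(const std::vector<std::uint8_t>& data)
{
    return isAlreadyCompressed(data) ? ZipMethod::Stored : ZipMethod::Deflated;
}
```

Checking the media type or file extension of the embedded picture would be an alternative to sniffing bytes; which of the two fits the storage code better is left open here.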

Identified Hotspots

Using XMultiPropertySet where XTolerantMultiPropertySet might suffice and be more performant

  • To decide which properties have to be saved, xmloff uses the interface methods css::beans::XPropertyState::getPropertyStates() and css::beans::XMultiPropertySet::getPropertyValues(). It could also use the interface method css::beans::XTolerantMultiPropertySet::getDirectPropertyValuesTolerant(), which is not implemented for Writer's UNO objects.
  • Saving Writer's text content is done by iterating over the paragraphs and, within each paragraph, over the so-called text portions. Text portions are parts of the paragraph that have a single attribute set, text fields, redline portions, inline anchored frames etc. It might make sense to detect their properties at construction time and preset their css::beans::XTolerantMultiPropertySet interface. At the moment a text portion is created, it adds a bookmark to remember its position. The impact on real documents has not yet been checked.
  • A test implementation of XTolerantMultiPropertySet in Writer's text portion objects did not result in increased save speed.

Font Fallback

The huge contribution of GenerateAndStoreThumbnail in the callgrind measurements for some of the documents is attributed to substitution matching for missing fonts. This might be an issue to investigate.

Test Documents

Microsoft Word documents were loaded and saved as odt before profiling.
