Summer of Code 2006

From Apache OpenOffice Wiki
Revision as of 15:17, 31 May 2006 by Jsc (Talk | contribs)

Summer of Code Projects

OpenOffice.org is proud to participate in the Summer of Code initiative sponsored by Google. Some suitable suggestions are listed below. Ideas for other tasks can be sent to the related developer project or to its project lead.

We would be glad to support a few projects. Mentoring capacity is limited, so we depend on your dedication during the preparation of the detailed specification and the description of the outcome.

The application period is May 1 - May 8, 2006. Keep in mind that some preparation may be needed before you are ready to sign up and apply.

For further general questions about the initiative, Google has prepared FAQs for students and mentors.


API / Programmability

Programmability: Dynamic UNOIDL Reference Browser

The API is specified in IDL (Interface Definition Language). IDL allows a language-independent description of the API, and language bindings/bridges allow the same API to be used from any language for which such a binding/bridge exists. However, users of the API often find it difficult to map the language-independent IDL reference documentation to their preferred programming language, e.g. Java or StarBasic. Such a mapping would speed up their daily work: a Java programmer, for example, would like to have a Javadoc-like reference of the API.

The idea of a “Dynamic IDL Reference Browser” is to develop a new concept for documenting IDL types in XML, together with a concept for providing a dynamically created representation for different languages. In detail, the reference browser would dynamically create a piece of XML from the IDL definition and the XML documentation string provided for the IDL type. An appropriate XSL transformation would then convert this XML representation into the required language representation, so that, for example, a Java developer gets the correct Java mapping for an IDL interface type.

The dynamic approach has the advantage that we can (in a second step) analyze the navigation path to a type and show context-specific documentation. That can be very useful for generic types which have different meanings in different contexts. Take, for example, the generic type XEnumeration returned from function F of interface X. The generated documentation for XEnumeration would show exactly the information necessary to understand and use XEnumeration in the context of function F of interface X, for example that the enumeration in this context always contains objects of type Y, and further details for more complex context-dependent cases.
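The pipeline described above (IDL definition plus documentation string, then an XML description, then a language-specific rendering) could look roughly like the following minimal sketch. It is only an illustration: the XML element names, the helper functions `describe_method` and `to_java`, and the sample method are hypothetical, the type table is a small subset, and a plain Python function stands in for the XSL transformation the proposal has in mind.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of an IDL-to-Java type table (assumption: not complete).
IDL_TO_JAVA = {
    "void": "void",
    "boolean": "boolean",
    "short": "short",
    "long": "int",     # UNO IDL long is 32 bits wide
    "hyper": "long",   # UNO IDL hyper is 64 bits wide
    "double": "double",
    "string": "String",
    "any": "Object",
}


def describe_method(name, returns, params, doc):
    """Build the intermediate XML element for one IDL method (hypothetical schema)."""
    method = ET.Element("method", name=name, returns=returns)
    for param_name, param_type in params:
        ET.SubElement(method, "param", name=param_name, type=param_type)
    ET.SubElement(method, "doc").text = doc
    return method


def java_type(idl_type):
    """Map an IDL type name to Java; unknown names (e.g. interfaces) pass through."""
    return IDL_TO_JAVA.get(idl_type, idl_type)


def to_java(method):
    """Render the XML description as a Javadoc-style Java signature."""
    args = ", ".join(
        f"{java_type(p.get('type'))} {p.get('name')}" for p in method.findall("param")
    )
    doc = method.findtext("doc", default="")
    return f"/** {doc} */\n{java_type(method.get('returns'))} {method.get('name')}({args});"


sample = describe_method(
    "getByIndex", "any", [("index", "long")], "Returns the element at the given index."
)
print(to_java(sample))
```

A second renderer (for StarBasic, say) could consume the same XML element, which is the point of keeping the intermediate description language-neutral.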

  • required skills/knowledge: Java, XML, XSLT
Juergen.Schmidt at

Basic IDE: BASIC/UNO Object Browser

The target of this project is to create a UI component that allows browsing through the structure of UNO objects. Unlike the Basic IDE watch window, which has recently been improved to display object properties and their values in a reasonable way, the object browser should show the objects' API structure (properties, methods, interfaces, services). It has to be specified whether this should refer to the Basic API view (Basic types, no explicit distinction of different interfaces) or the UNO view (UNO types, interface oriented), or whether the browser should support two modes, allowing the user to switch between Basic and UNO view. A Basic view is more challenging because it requires Basic-related knowledge and probably new functionality in the Basic project. Strictly speaking, the component is more like a type browser, as the objects' static class/interface structure should be displayed, but unfortunately most information is only available for a "living" object using the XTypeProvider interface and the Reflection and Introspection API. Nevertheless, it would also make sense to support types directly, e.g. to show the new-style UNO services / multiple-inheritance interfaces.

  • required skills/knowledge: Java, C++, StarBASIC, UNO
Andreas.Bregas at


Basic Citation Enhancements

Priority: Critical Difficulty: medium-high

The bibliographic facility in OpenOffice.org needs considerable enhancement to make it competitive with other bibliographic applications. This is a vital task in the enhancement project.

The task is to implement the most basic changes and additions to the Writer core code (the API basic code, and UNO mappings) necessary to implement basic support for:

  • code to read and write the new, and much improved, citation format (a developer, CPH, has already made good progress with this). See Citation XML info design and implementation and implementation discussion.
  • exposing the new code through UNO
  • writing a simple program in OOo Basic or Python to demonstrate the new UNO functions by inserting and displaying the enhanced citations in Writer using the new format.

See our Developer's Page and Florian's blog entry on this topic.

  • required skills/knowledge: good C++ programming skills.
  • required skills/knowledge: experience with object-oriented technologies such as COM/COM+/DCOM and CORBA.
  • required skills/knowledge: some knowledge of, or ability to quickly utilise, the OpenOffice.org API and UNO
  • required skills/knowledge: able to write a small OOo Basic program.


Application advice

David Wilson (dnw at, Florian Reuter (Florian.Reuter at

Convert the Citeproc Bibliographic Formatting Engine to Python or C++

Priority: Important Difficulty: medium

The new Bibliographic formatting engine, Citeproc, currently exists as a set of XSLT 2.0 stylesheets. In addition, a Ruby prototype has been started. The formatting engine needs to be rewritten in C++ or Python (although ultimately it is to be C++) and then integrated into OOo through the UNO bridge and the OOo add-on package manager.

The developer will need to test the code by calling the formatting engine from within Writer (using OOo Basic): passing a list of citation reference IDs to Citeproc, receiving the formatted citation text in return, and inserting it into Writer.

If possible, it would be great if the developer could also utilise the new internal citation structures being developed for Writer for the demonstration.

See our Developer's Page and this article

  • required skills/knowledge: good C++ or Python programming skills.


Application advice

David Wilson (dnw at

Develop a Bibliographic Database design and implement a Ruby On Rails prototype

Priority: Important Difficulty: medium

Assist in the development of a Bibliographic Database and build a Ruby On Rails database management system with Browse, Add, Delete and Edit functions.

OOo Bibliographic project Leader, Bruce D'Arcus, has started working on database designs and now needs a skilled database designer to take on this task. The person would preferably have some Ruby on Rails exposure.

See our developer's wiki page

  • required skills/knowledge: strong database design skills.
  • required skills/knowledge: Ruby On Rails exposure.


Application advice

David Wilson (dnw at


Calc: Show formula syntax in tip help

When a cell formula is being edited, in addition to the function selection above the cell, show some input hints (like the syntax of the function that is edited) in a tip help window below the cell.
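One way to picture the lookup behind such a tip window is the minimal sketch below: given the formula text typed so far and the cursor position, it finds the innermost function call that is still open and returns a syntax string for it. The function names `function_at_cursor` and `syntax_hint` and the small hint table are hypothetical and are not taken from the Calc code base; string literals containing parentheses are deliberately not handled.

```python
import re

# Hypothetical hint table; a real implementation would draw on Calc's function list.
SYNTAX_HINTS = {
    "SUM": "SUM(number1; number2; ...)",
    "IF": "IF(test; then_value; otherwise_value)",
    "ROUND": "ROUND(number; count)",
}

# Matches either an opening parenthesis (optionally preceded by a name) or a closing one.
TOKEN = re.compile(r"([A-Za-z][A-Za-z0-9_.]*)?\(|\)")


def function_at_cursor(formula, cursor):
    """Return the name of the innermost function call still open at the cursor."""
    stack = []
    for match in TOKEN.finditer(formula[:cursor]):
        if match.group(0) == ")":
            if stack:
                stack.pop()
        else:
            # Plain grouping parentheses are pushed with an empty name.
            stack.append((match.group(1) or "").upper())
    for name in reversed(stack):
        if name:
            return name
    return None


def syntax_hint(formula, cursor):
    """Return the hint text to show below the cell, or None if there is none."""
    name = function_at_cursor(formula, cursor)
    return SYNTAX_HINTS.get(name) if name else None


typed = "=IF(A1>0;SUM(B1"
print(syntax_hint(typed, len(typed)))
```

Tracking open parentheses with a stack means the hint switches back to the enclosing function as soon as the inner call is closed.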

  • required skills/knowledge: good C++ expertise
  • recommended skills/knowledge: -
Niklas.Nebel at

Calc: Add resizeable margin on page preview

In the page preview, add the possibility to interactively change page margins by dragging an indicator, instead of using the page style dialog (see

  • required skills/knowledge: good C++ expertise
  • recommended skills/knowledge: -
Niklas.Nebel at

Calc: Extend non-linear optimiser in spreadsheet (solver)

Improve the solver component for Calc. Kohei Yoshida has written a basic 'Solver' component for Calc which handles simple linear and non-linear optimisations of spreadsheet content. It would be useful to extend the set of optimisers to be faster and more stable. Finding appropriately licensed libraries would be ideal, although some candidates may be interested in creating new analytics. This will be an ongoing project, so the preference will be towards maintainable, documented code rather than hard-core performance.

Developers should be able to build OOo and have a background in numeric optimisation.

Skills: C++, Numeric Analysis

Jody Goldberg <>
Kohei Yoshida <>

Database Access

Linking external text tables into OOo databases

Native OOo Base databases (.odb) use the HSQLDB ( database engine. HSQLDB features a mechanism to link external text tables into the database, as if they were a native HSQL table (

The task of this project is to bring this feature to OpenOffice.org. For this, the respective database driver of OOo Base has to be extended with an API to administrate such text table links. Additionally, a user interface needs to be designed and implemented for using this API.

  • required skills/knowledge: C++, relational database concepts
  • recommended skills/knowledge: OOo's component technology (UNO)
  • useful skills/knowledge: HSQLDB text table concepts
Frank.Schoenheit at Sun.COM

Native SQLite driver

SQLite ( is an SQL database used as a lean backend in a number of applications. The task is to write a native SDBC database driver (resp. finalize the existing skeleton: for OpenOffice.org, accessing SQLite files efficiently.

  • required skills/knowledge: C++, relational database concepts
  • recommended skills/knowledge: SQLite API
  • useful skills/knowledge: OOo's component technology (UNO)
Ocke.Janssen at Sun.COM

Embed Derby into OpenOffice.org databases

OpenOffice.org Base features an abstract mechanism to embed database backend files into OOo databases (.odb). Currently, this is implemented for HSQLDB (http:///, which is used as OOo's default database engine.

To allow this feature for other engines, one must:

  • virtualize the engine's file access, so that it re-routes all its file operations through an abstract API.
  • implement this API on the OOo Base side

The project is to do those implementations for Apache Derby database (

  • required skills/knowledge: C++
  • recommended skills/knowledge: relational database concepts
  • useful skills/knowledge: OOo's component technology (UNO)
Ocke.Janssen at Sun.COM

Database driver UI modularization

OpenOffice.org Base follows a component-oriented approach for enabling database access. For this, database drivers are installed in OpenOffice.org, each providing access to a certain (class of) database(s).

While the implementation is quite well modularized at the driver level, the UI implementation can be improved. Currently, there are a lot of places in the code with hard-coded information, such as "database X requires UI option Y".

The goal of this project is to design and implement a reasonable architecture for bringing a driver to the UI. The existing implementation needs to be migrated to this new architecture. As a proof of concept, an existing currently-external driver (e.g. should be modified so that it can be deployed into an OOo installation and makes use of the features of the new architecture.

  • required skills/knowledge: C++
  • recommended skills/knowledge: OOo's component technology (UNO)
  • useful skills/knowledge: OOo's configuration concepts
Frank.Schoenheit at Sun.COM

HSQLDB: single-file backend

HSQLDB (http:/// is the database engine used by OpenOffice.org.

HSQL currently creates a number of adjacent files to store its data, where all files together comprise the whole database. To give the user of Base an "all-in-one-file" database experience, those HSQL files are currently embedded in an OOo-specific container file (the .odb file).

To overcome various disadvantages of this approach, it is desirable that HSQL stores its data in a single, large file. Preliminary code and concepts exist for this, but no final implementation.

This project needs to be worked on in close collaboration with the HSQLDB project, whose owner, Fred Toussi, will act as co-mentor.

  • required skills/knowledge: good Java expertise, relational databases
  • recommended skills/knowledge: HSQLDB architecture

The following project description was provided by Fred Toussi, HSQLDB project owner:

Currently, HSQLDB stores the database information in four separate files. These files are written to using different APIs. Streams are used for some while random access is used for others.

The project’s aim is to allow a single file to be used for all permanent data (temporary session data, or file locking may still use a separate file). So existing files that need to be integrated into one are the .properties, .log, .data and .backup files. The new single file will be accessed only as a random access file.

The project consists of developing an interface between existing code. At the low level, there is already an implementation of a random access file, org.hsqldb.persist.ScaledRAFile. The new interfaces will allow existing file services to use the single random access file.
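To make the idea of several logical files sharing one random access file more concrete, here is a minimal sketch of one possible layout. It is not HSQLDB's design and is not written against the org.hsqldb.persist classes: the class name `SingleFileStore`, the fixed region size and the length-prefix layout are assumptions chosen only to keep the illustration short.

```python
import os
import struct

SECTIONS = ("properties", "log", "data", "backup")  # logical files named in the text
REGION_SIZE = 4096                                  # hypothetical fixed capacity per section
LENGTH = struct.Struct(">I")                        # 4-byte big-endian length prefix


class SingleFileStore:
    """Keeps every logical file in its own fixed region of one random access file."""

    def __init__(self, path):
        create = not os.path.exists(path)
        self._file = open(path, "w+b" if create else "r+b")
        if create:
            # Pre-allocate all regions with zeros, so every length prefix starts at 0.
            self._file.write(bytes(REGION_SIZE * len(SECTIONS)))
            self._file.flush()

    def _offset(self, section):
        return SECTIONS.index(section) * REGION_SIZE

    def write(self, section, payload):
        """Overwrite one section; other sections are left untouched."""
        if len(payload) > REGION_SIZE - LENGTH.size:
            raise ValueError("payload exceeds region capacity")
        self._file.seek(self._offset(section))
        self._file.write(LENGTH.pack(len(payload)) + payload)
        self._file.flush()

    def read(self, section):
        """Return the bytes last written to the section (empty if never written)."""
        self._file.seek(self._offset(section))
        (length,) = LENGTH.unpack(self._file.read(LENGTH.size))
        return self._file.read(length)

    def close(self):
        self._file.close()
```

A caller would open one store and then address each former file by name, for example `store.write("log", b"...")` followed by `store.read("log")`. A real implementation would need growable regions, locking and crash safety, none of which is attempted here.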

All the files in the org.hsqldb.persist package should be studied, with the understanding that the functionality of many of these files, including lock and property saving functionality, will become redundant with the introduction of single file persistence.

The student is expected to study and become proficient in the Java IO packages and the API calls to these packages currently made from HSQLDB, and write the code and test packages.

The mentor will provide regular guidance and help on the design of required interfaces and supervise their implementation.

As this is considered an essential development for HSQLDB, all the work done by the student will be used, and if necessary, modified or improved by other project developers. The student will get due credit and will be provided with work references by the HSQLDB Project Maintainer upon successful completion of the project.

Frank.Schoenheit at Sun.COM

HSQLDB: editable views

HSQLDB, used as OpenOffice.org's database engine, supports views (basically, stored SQL queries).

An advantage of views over queries is that views can be re-used in other queries: you can do a SELECT * FROM foo where foo is the name of an existing view, which is itself a SELECT statement.

A disadvantage of views is that the constituting statement cannot be edited once the view has been created.

The goal of this project is to provide an "Edit View" functionality in Base, for the moment for the HSQLDB backend only. For this, an API needs to be defined which allows editing views. This API has to be implemented in Base's dedicated HSQLDB driver. Additionally, the user interface should respect the existence of this API by offering an "Edit View" item in the context menu of views. Once chosen, Base's query designer is to be started, to modify the SQL statement constituting the view. Saving the work in the designer should change the view's underlying statement.
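At the SQL level, "editing" a view can be approximated by dropping it and re-creating it with the new statement. The sketch below shows this using Python's bundled sqlite3 module purely as a stand-in engine; it is not HSQLDB and not the Base driver API, and the helper name `edit_view` and the sample table are hypothetical. HSQLDB's own syntax and transactional behaviour may differ.

```python
import sqlite3


def edit_view(conn, name, new_select):
    """Replace the SELECT statement behind an existing view.

    Identifiers cannot be bound as SQL parameters, so `name` and
    `new_select` are assumed to come from a trusted source.
    """
    conn.execute(f'DROP VIEW "{name}"')
    conn.execute(f'CREATE VIEW "{name}" AS {new_select}')
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.executemany("INSERT INTO books VALUES (?, ?)", [("A", 1999), ("B", 2005)])

# A view can be queried like a table.
conn.execute("CREATE VIEW recent AS SELECT title FROM books WHERE year > 2000")
before = conn.execute("SELECT * FROM recent").fetchall()

# "Edit" the view: same name, new underlying statement.
edit_view(conn, "recent", "SELECT title FROM books WHERE year > 1990 ORDER BY title")
after = conn.execute("SELECT * FROM recent").fetchall()

print(before, after)
```

One design question a real API would have to settle is what happens to other views or queries that depend on the edited one while it is briefly dropped.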

  • required skills/knowledge: C++
  • recommended skills/knowledge: OpenOffice.org's component model (UNO)
Frank.Schoenheit at Sun.COM


Download / Mirror Management Tool: Bouncer

We have taken the first steps towards using Bouncer for download and mirror management. This tool, developed at the Oregon State University Open Source Lab, would significantly ease the handling of download pages. Some crucial features are still open for Bouncer v3, which you could help to finalize.

Mike Morgan and Lars Lohn


Performance: Fast native implemented OOo Dispatch Loader

  • Implement a faster, small Office loader, independent of OOo libraries, that uses native APIs to show the splash screen. This will speed up UI response time after first Office start by 90 percent.
  • Implement propagation of command line parameters with a native pipe implementation. This will speed up loading of documents after double-clicking them in the OS desktop environment.
  • Remove any dependency on OOo libraries from the loader code, so that it is implemented with native APIs only. This allows the loader to be as small as possible and to run without any bootstrapping, which will reduce I/O overhead and unnecessary code execution at startup.
  • Adjust the pipe implementation in the desktop module to fit the native pipe implementations.

Prototypes of a loader for Windows and Linux are available, implementing the required features on at least one platform.

Hennes.Rohling at


Filter: creating a Visio filter for OOo Draw

OOo currently lacks a Visio import filter, so it would be good to have someone start writing such a filter from scratch, either using the OOo API to create the document directly or producing ODF as the input format to be loaded by the OOo application. Although writing a full-blown filter within the short timeframe seems unrealistic, it should be possible to filter large parts of Visio documents correctly.

  • required skills/knowledge: good C++ expertise, graphics programming
  • recommended skills/knowledge: Visio file format / ODF or OOo API
Sven.Jacobi at Sun.COM

GSL: Port Cairo Canvas to Win32

Cairo canvas is an OpenOffice.org canvas backend which provides beautiful anti-aliased rendering using the 'cairo' library. Today the cairo canvas backend works only on Unix-like systems. The objective is to make it usable on Win32. Cairo already targets Win32, and using it would give a faster, more beautiful slide-show experience.

Building OOo on Win32 is moderately painful, requiring a VC++.Net 2003 compiler, although it's possible that (with some extra work) the (free) VC++ Express Edition could be used.

  • Skills: C, C++, cygwin, Win32 compiling
Radek Doulik (radekdoulik at


Source Code Repository Activity

Provide statistics for CVS repository activity. The first step would be a quick analysis of what types of statistics other projects provide. Together with our requirement to better understand how the code base and groups of committers are evolving, this should give input for preparing some nice stats and graphs about repository activity. Results should be generated with open-source tools, automatically and on a regular basis, for the website. Additionally, an interface to cia would be helpful.
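As a rough idea of the kind of processing involved, the sketch below counts commits per author and per month from revision lines of the form printed by `cvs log`. The sample text and author names are made up, the function name `commit_stats` is hypothetical, and the exact line format is an assumption that should be checked against the output of the CVS version actually in use.

```python
import re
from collections import Counter

# Assumed shape of a revision line: "date: YYYY/MM/DD HH:MM:SS;  author: name;  ..."
REVISION_LINE = re.compile(
    r"^date: (\d{4})[/-](\d{2})[/-]\d{2} [^;]*;\s+author: ([^;]+);"
)


def commit_stats(log_text):
    """Return (commits per author, commits per YYYY-MM month) as Counters."""
    per_author, per_month = Counter(), Counter()
    for line in log_text.splitlines():
        match = REVISION_LINE.match(line)
        if match:
            year, month, author = match.groups()
            per_author[author] += 1
            per_month[f"{year}-{month}"] += 1
    return per_author, per_month


# Made-up sample input for illustration only.
SAMPLE_LOG = """\
revision 1.3
date: 2006/05/30 10:02:11;  author: alice;  state: Exp;  lines: +3 -1
revision 1.2
date: 2006/05/31 15:17:00;  author: bob;  state: Exp;  lines: +10 -0
revision 1.1
date: 2006/06/01 09:00:00;  author: alice;  state: Exp;
"""

authors, months = commit_stats(SAMPLE_LOG)
print(authors.most_common(), sorted(months.items()))
```

The two counters could then feed whatever open-source graphing tool is chosen for the website.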

Stefan Taxhet (stefan.taxhet at


Integrating Autodoc generated IDL documentation into the Visual .NET IDE


Autodoc (, the ( documentation parser, creates a series of HTML documentation pages out of the SDK IDL files (see The generated documentation can be integrated for example into Netbeans. “Integrated” means that from that IDE (integrated development environment) the documentation is accessible via internal indexes and/or context sensitive help. It would be useful to be able to integrate it also into other much used development environments, for example the Visual .NET IDE. Integration into the Visual .NET IDE can be done by providing some additional files that match the MS Help 2.0 format (see for some information).

  1. Enable Autodoc to provide the additionally needed files to integrate the created documentation into the Visual .NET IDE.
  2. Document the mechanism to compile those files and make the Autodoc generated documentation visible within the IDE.
  • Needed: Good C++.
  • Helpful: Knowledge about the SDK, experience with an IDE.
Nikolai.Pretzell at


Writer: Improved Lotus WordPro Import Filter

Currently there is a very basic import filter for WordPro that only imports pure text. It would be desirable to import more content and structure from WordPro documents.

Oliver.Specht at

Writer: Component for guessing the language of a text

Currently the OOo spell checker tries to guess the language of a text with an unknown word by searching for it in a fixed number of dictionaries or thesauri, because checking all available languages would be too much. This limited search is inconvenient if the language used is not amongst the preselected ones. There are already known heuristic approaches to guess the possible language of a text; some of them are available as descriptions, some even as source code. Integrating such a component into OOo would improve working in multi-language documents.
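One of the simplest heuristics of this kind counts how many very common words of each candidate language occur in the text. The following minimal sketch illustrates the principle only: the word lists are tiny hand-picked samples, the function name `guess_language` is hypothetical, and more robust approaches (for instance character n-gram profiles) would be needed for short or mixed-language text.

```python
import re

# Tiny illustrative stop-word samples; real profiles would be far larger.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "et", "les", "des", "est", "un", "une"},
}


def guess_language(text):
    """Return the language code whose stop words occur most often, or None."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("Der Hund ist nicht in dem Haus und die Katze auch nicht"))
```

The guessed language could then be used to choose which dictionary to consult first, instead of a fixed preselection.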

(Possibly useful references: #1, #2)

Thomas.Lange at

Writer: Import Filter for Word Perfect Graphics files

The writerperfect import filter for WordPerfect(tm) documents, based on libwpd (, was integrated into OOo 2.0. libwpd is a standalone library also used by other projects like KOffice or AbiWord. Another standalone library that reads WordPerfect graphics files would improve the interoperability with the WordPerfect Office Suite. An import filter for this file format based on this new standalone library can be embedded into OOo in the same way as libwpd. Please find more information at The base for this work will be the already existing, though unreleased, code of the libwpg library (

  • Being highly comfortable developing in C++ with the Standard Template Library. Keep in mind that the code has to be strictly portable.
  • No need to know the API
Some hints about possible tasks
  1. Convert all vector graphics records of WPG2 format. It is already done for WPG1 and it works quite correctly (some problems with colours due to an undocumented record, but fixable since we have a kind of documentation now).
  2. The parameters of the API callbacks should follow the SVG properties for a given object (ellipse, circle, path, ...) as closely as possible, so that one can use wpg2svg to visualize the conversion result instantly. This will also make it easier to use the libwpg library with other libraries that render SVG, e.g. librsvg. And, last but not least, the converter code could then be used with anything that generates SVG.
  3. Still maintain wpg2raw in order to be able to create a regression test suite.
  4. Create a wpg2odg converter which outputs SAX messages in odg format (handler independent) + write two handlers (very trivial), one for dumping the content.xml into stdout and other for writing a zipped odg file. Because of the OOo document handler, everything should be preferably contained in one single flat content.xml including styles and metadata if converted. One can use styles.xml in the handler that writes the zipped file if we need to set some default styles for third party applications as we do it in the wannabe KWord import filter and in the CVS version of wpd2sxw.
  5. Abstract the stream implementation inside libwpg and make the odg filter core independent of any specific stream abstraction layer; only the handlers may depend on one. Write a sample input stream implementation class using C++ STL input stream classes (get rid of dependency on libgsf).

If this is done in short time, the following things could be done, but optional:

  1. Rewrite the libwpg API so that it is breakage proof like libwpd is currently, possibly using libwpd's public API (WPXProperty, WPXPropertyList, WPXPropertyListVector and WPXString). Or extend the libwpd's API if the original one is not corresponding to our needs (i.e. the SVG property points="X1,Y1 X2,Y2 ... Xn,Yn" could be a problem).
  2. Convert text strings.
  3. Find a way to extract the bitmap part of WPG2, decode it (run-length encoding) convert it to some simple raster format (*.bmp ???) and add it [maybe base64 encoded] into the resulting xml.
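For the bitmap item above, the general shape of "decode run-length data, then embed it base64-encoded" might look like the following minimal sketch. The (count, value) byte-pair scheme used here is a generic textbook form of run-length encoding chosen for illustration; it is an assumption and not the actual WPG2 bitmap encoding, which would have to be taken from the format documentation. The function names are hypothetical.

```python
import base64


def rle_decode(data):
    """Expand (count, value) byte pairs into raw bytes (generic RLE, not WPG2's)."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes (count, value pairs)")
    out = bytearray()
    for i in range(0, len(data), 2):
        count, value = data[i], data[i + 1]
        out.extend(bytes([value]) * count)
    return bytes(out)


def to_base64_text(raw):
    """Encode raw bytes as ASCII text suitable for embedding in XML."""
    return base64.b64encode(raw).decode("ascii")


encoded = bytes([3, 0xFF, 2, 0x00])   # three 0xFF bytes, then two 0x00 bytes
pixels = rle_decode(encoded)
print(to_base64_text(pixels))
```

Converting the decoded pixel data to a simple raster format before encoding, as the item suggests, is left out here.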
Practical guidelines
  1. For a nice integration with OO.o, no STL types should appear in the API. Many distributions use OO.o built against system libraries, and OO.o internally uses stlport. STL and stlport types do not have the same signatures, so things go wrong if one links a library compiled using STL from inside the OO.o build environment.
  2. Put the libwpg API in a separate namespace in order to avoid possible conflicts with whatever other library OO.o could use.
  3. Avoid completely "using namespace std" statement, since it is possible that it will have problems with one or other of the compilers on one of OO.o platforms.
Fridrich Strba (fridrich_strba at openoffice dot org)

Writer: iWork 'Pages' importer

The Apple iWork 'Pages' product stores its files in a .zip file, using an XML format. It would be simple enough to parse this and import at least some of the information: at a minimum raw text, simple styles, tables and images.

There is some documentation for the Apple format at: and we can ask for more information where necessary. It would be necessary for the applicant to have a copy of Pages to generate & compare test files.

Skills: C++, XML

Michael Meeks (mmeeks at
Svante Schubert (Svante.Schubert at

Writer: Grammar Checker

Currently, there is no grammar checker that comes with OpenOffice. This is one of the few things that would prevent people from switching from MS Office. Practically every other word processor (MS Word, WordPerfect, AbiWord...) has a grammar checker, and it's about time that OpenOffice had one too.


Porting: Integrate the native Mac OS X FilePicker into OpenOffice.org (Aqua/X11)

Concerns: OpenOffice.org 2.0.3 or later, for the Mac OS X port (both Aqua and X11 versions)

Integrate the native Mac OS X FilePicker into OpenOffice.org. The OOo FilePicker is already designed as a UNO component. What is needed is a new implementation of the FilePicker component based on the native Mac OS X FilePicker, as has been done on MS Windows, for instance. In the current Mac OS X port of OpenOffice.org, the FilePicker is not native and is less ergonomic.

Possible tasks :

  • Familiarize with the native Mac OS X FilePicker API and the OOo UNO interfaces for the FilePicker
  • Describe current implementation: design, concerned modules, isolate classes and parameters to manage
  • Propose a new design using Apple API
  • Write and test a proof of concept

  • Skills : knowledge of languages C / C++ , using Carbon/Cocoa API
  • Proposed by : Tino Rachui
Tino Rachui ( Tino dot Rachui at Sun dot COM )

Porting: Integrate help into Mac OS X help center (Aqua/X11)

The OpenOffice.org help system is quite independent of the underlying operating system and thus doesn't integrate well into the help system on Mac OS X.

Possible tasks:

Learn how the help system works in OpenOffice.org and how you can call it

Create sample help book for Apple Help

Prepare conversion framework for current XML/HTML based help pages to help book

Integrate provided solution into CVS

  • Skills: C/C++, XML/XSLT, Carbon API
  • Proposed by: Pavel Janík
Pavel Janík ( Pavel at Janik dot cz )
Frank Peters (Frank dot Thomas dot Peters at Sun dot COM)

Porting : Mac OS X Address book integration (Aqua / X11)

Synopsis: OOo is currently integrated with the Mozilla address book but not with the native Mac OS X address book. This is annoying for Mac OS X users. For better system integration it would be desirable to integrate with the Mac OS X address book.

Concerns: OpenOffice.org 2.x for Mac OS X (both Aqua and X11 versions)

Skills: Knowledge of languages C/C++, Mac OS X APIs and Application frameworks like Carbon or Cocoa for instance, knowledge of the Mac OS X address book APIs

Tasks:

  • Familiarize with the Mac OS X address book APIs
  • Familiarize with the current OOo Mozilla address book integration
  • Make a prototype for OOo Mac OS X address book integration

Oliver Braun ( obrmac at openoffice dot org )
Frank Schoenheit ( Frank dot Schoenheit at Sun.COM )

Porting: Implement native font support, using native Apple API (Aqua / X11)

Concerns: OpenOffice.org 2.0.3 or later, for the Mac OS X port (both Aqua and X11 versions)

Subject proposed by : none yet

Tasks :

(1) analyze current implementation : design, concerned modules, isolate classes and parameters

(2) propose a design for the new one, using Apple API

(3) write and test a proof of concept

If enough time: (4) implement the solution, with possible backport to X11 version.

  • Skills : knowledge of languages C / C++ , using Carbon/Cocoa API, knowledge of font systems and font technology
Eric Bachard (ericb at openoffice dot org)

Porting : Mac OS X Spell checker integration (Aqua / X11)

Synopsis: OOo currently uses aspell or hunspell as spell checking components. Mac OS X comes with a built-in spell checker, which should be used by OOo for consistency reasons, i.e. to save the user from having to maintain two dictionaries (OOo/OS X) in parallel.

Concerns: OpenOffice.org 2.x for Mac OS X (both Aqua and X11 versions)

Skills: Knowledge of languages C/C++/Objective C or Java, Mac OS X APIs and Application frameworks like Carbon or Cocoa for instance, knowledge of the Mac OS X spell checking APIs

Tasks:

  • Familiarize with the Mac OS X spell checking APIs
  • Familiarize with the way OOo utilizes spell checkers
  • Make a prototype for OOo Mac OS X spell checker integration

Oliver Braun ( obrmac at openoffice dot org )
Thomas Lange ( Thomas dot Lange at Sun.COM )


Release QA Tracking Tool

Tracking a release with dozens of languages and several platforms can be tedious. This is especially the case if the only used tool is a manually maintained table with colors indicating the status.

A web based tool should help us to keep track of the location and status of the builds as well as the responsible contact. The workflow in place should be supported. The tool should provide an overview, the means to maintain a release including easy status manipulations.

Andre Schnabel (Andre.Schnabel at

Testtool Result Tracking Tool

Many of the QA team members are running automated tests for several releases, snapshots ... Every release should have successfully passed a number of tests. At the moment it is almost impossible to tell which scripts have been run and what the results were.

An online database could be of help here. The idea is to send the result files (or an extract of the results) to this database, so they can be collected and analyzed.

A browser based frontend could help us to keep track of the testtool results.

André Schnabel (Andre.Schnabel at
Helge Delfs (Helge.Delfs at Sun.COM)

VBA interop: Chart API

OpenOffice.org has Excel VBA interoperability support under development, with some exciting preliminary results (see e.g. here). The work is essentially creating and extending simple mapping layers between new VBA-compatible object model APIs and the existing OOo UNO APIs (using UNO).

A significant missing piece here is the Shape API (of which Charting is an extension). Samples of existing wrappers for other objects can be found here: and

  • Skills: C++, Basic/VBA, [UNO]
dev at
Noel Power <npower at>

OOo Project

SoC Sample Suggestion Title


Contact Person (email at domain)