New OOo Mirror Structure

From Apache OpenOffice Wiki

New mirror structure for OpenOffice.org

Ideas for a new structure to distribute OpenOffice.org builds in the mirror network

The number of native-language teams is growing, so install sets of OpenOffice.org (OOo) can be released for more languages. As a result, more and more space is needed on the mirror network to meet this demand from contributors and users. We want to support this demand by providing more languages.


The space required is expected to grow:

  • on /extended to ca. 300 GB (for RCs, Beta, Dev Builds, L10N, ISO files) and
  • on /stable and /localized to ca. 200 GB.


To avoid simply increasing the size of every mirror, an alternative is to split some parts into packages:

  • The /stable part contains en-US builds only. As this is the part of interest to most users, it will stay as it is.
  • The /localized part contains all languages, regardless of whether you want to distribute them in the region you serve. But what about splitting this into packages?

  1. Define a package that contains languages by coherent regions
    We can split all languages into reasonable packages, and you choose which ones you want on your mirror. This way you decide for yourself how much total space you need. When the total space needed grows, you can keep your current size by carrying only a subset of the packages.

  2. Define a package that contains languages by their download rate
    It is also possible to look at the download rate of each language and define packages accordingly: the most-downloaded languages go into one package, the next most-downloaded into another, and so on.

  3. Do you have another idea?
    Of course we do not want to dictate the packages you have to choose. We can discuss the number of languages and/or the total file size of each package, or perhaps a different structure altogether. In the end, you decide for yourself how many languages and how many MB you want to distribute through your servers.

  • The last part is /extended, which contains mostly Beta and RC builds, Developer Snapshots and L10N-relevant files. This part also needs to grow, as we expect larger RCs in the future. For example, the newest OOo 3.3.0 RC 3 is about 78 GB. However, only mirrors that carry the extended set are affected, and these are mostly very powerful servers.
  • Disk space on archive mirrors will also need to be increased, as we want to keep every legacy product release there.


Important:
If you still want to provide the full set of files, nothing changes. Only if you want to distribute part of the set do you have to select the respective packages and change or extend your rsync scripts.
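As an illustration of what extending an rsync script for a partial mirror might look like, here is a minimal sketch that builds one rsync command per subtree for /stable plus a chosen set of packages. The rsync host and module in `MIRROR_SOURCE`, the destination path, and the helper name `build_rsync_commands` are hypothetical placeholders, not values taken from this page. The sketch only prints the commands and does not run them.

```python
from typing import List

# Hypothetical placeholder: the real rsync host/module is not given on this page.
MIRROR_SOURCE = "rsync://rsync.example.org/openoffice"


def build_rsync_commands(packages: List[int], dest: str,
                         source: str = MIRROR_SOURCE) -> List[List[str]]:
    """Return one rsync argument list per subtree that should be mirrored."""
    # /stable is always mirrored; duplicates in the package list are ignored.
    subtrees = ["stable"] + [f"packages/{n}" for n in sorted(set(packages))]
    commands = []
    for subtree in subtrees:
        commands.append([
            "rsync",
            "-a",        # archive mode: recurse, keep times and permissions
            "-H",        # preserve hard links within the transferred tree
            "--delete",  # remove local files that disappeared upstream
            f"{source}/{subtree}/",
            f"{dest}/{subtree}/",
        ])
    return commands


if __name__ == "__main__":
    # Example selection: packages 3, 5 and 8 (any subset works the same way).
    for cmd in build_rsync_commands([3, 5, 8], "/srv/mirror/openoffice"):
        print(" ".join(cmd))
```

The destination's parent directories are assumed to exist already. A mirror that carries the full set would keep its existing script unchanged.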


Please give us your comments on this, either via mail or in this Wiki. We are open to your opinions. Thank you.


Big picture of the new structure

Path Description
/stable/<version>/... en-US builds only
/localized/<language code>/<version>/... localized builds
/extended/... RCs, Beta, Dev Builds, L10N, ISO files
/packages/<package number>/... the numbered packages from the tables below


When will the new structure go live?

Deployment for OOo 3.3.0

The upcoming OOo 3.3.0 release will have full installation sets for up to 41 languages (avg. 166 MB per file) and language packs for up to 70 languages (avg. 20 MB per file), so in total approx. 78 GB.

For this we will create the structure and hard-link the localized builds into the package subdirectories, even if the packages contain only a few languages, or none, at the moment. The advantage is that the package structure is already in place and does not need to change (or at least not much) when the other languages are released.
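The hard-linking step can be sketched as follows. This is only an illustration under assumptions: the layout inside a package directory (`packages/<number>/<language>/...`), the helper name `link_language_into_packages`, and the placeholder file name are hypothetical. The point it shows is that a hard link adds a second directory entry for the same file, so a build that appears in several packages occupies disk space only once.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict, List


def link_language_into_packages(root: Path, language: str,
                                packages: Dict[int, List[str]]) -> List[Path]:
    """Hard-link all files under localized/<language> into each package listing it."""
    created = []
    source_dir = root / "localized" / language
    for number, languages in packages.items():
        if language not in languages:
            continue
        for src in source_dir.rglob("*"):
            if not src.is_file():
                continue
            # Assumed layout: packages/<number>/<language>/<same relative path>
            dst = root / "packages" / str(number) / language / src.relative_to(source_dir)
            dst.parent.mkdir(parents=True, exist_ok=True)
            if not dst.exists():
                os.link(src, dst)  # hard link: same inode, no extra disk space
                created.append(dst)
    return created


def demo() -> bool:
    """Build a throwaway tree and check that the links point at the original file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build = root / "localized" / "de" / "3.3.0" / "example-build.bin"
        build.parent.mkdir(parents=True)
        build.write_bytes(b"placeholder")
        links = link_language_into_packages(
            root, "de", {3: ["cs", "de"], 7: ["de"], 4: ["sv"]})
        return len(links) == 2 and all(os.path.samefile(build, l) for l in links)


if __name__ == "__main__":
    print(demo())
```

Hard links require source and target to live on the same file system, which is assumed here.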

Deployment for the future

In the future we plan to build release files for more languages, so once they have been tested and approved the total size of a release may grow as described above. The package structure will remain the same, only with more files in the subdirectories.


Some statistical sizes

File sizes are calculated on the basis of OOo 3.2.1. All other numbers reflect the expected future state:

Description Size
Current total size of the /stable part (GB) 7
Current total size of the /localized part (GB) 44
Current total size of the /extended part (GB) 90
Average file size of a full install build (MB) 162
Average file size of a langpack build (MB) 21
Full install platforms (#) 11
Langpack platforms (#) 9
Languages full builds (#) 100
Languages langpacks (#) 100
Future total size of the /stable part (GB) 2
Future total size of the /localized part (GB) 200
Future total size of the /extended part (GB) 300
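The future sizes can be roughly reproduced from the averages and platform counts in the table. The sketch below is a back-of-the-envelope check, not the authors' stated method: it assumes every language ships a full install set and a language pack on every platform, and that 1 GB is counted as 1024 MB. Under those assumptions one language comes to about 2 GB, which is consistent with the per-package sizes listed further down.

```python
# Figures taken from the statistics table above.
FULL_INSTALL_MB = 162
FULL_INSTALL_PLATFORMS = 11
LANGPACK_MB = 21
LANGPACK_PLATFORMS = 9

MB_PER_GB = 1024  # assumption: binary gigabytes


def per_language_mb() -> int:
    """Size of one language: full builds plus langpacks on all platforms."""
    return (FULL_INSTALL_MB * FULL_INSTALL_PLATFORMS
            + LANGPACK_MB * LANGPACK_PLATFORMS)


def size_gb(languages: int) -> float:
    """Estimated size in GB for the given number of languages."""
    return languages * per_language_mb() / MB_PER_GB


if __name__ == "__main__":
    print(per_language_mb())     # 1971
    print(round(size_gb(1)))     # 2
    print(round(size_gb(8)))     # 15
    print(round(size_gb(100)))   # 192
```

The estimate for 100 languages lands slightly below the 200 GB quoted for the future /localized part, so that figure appears to be rounded up.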


Possibility 1

The following packages are defined with languages within a common region:

English (US)

ISO code Language Mostly spoken in (region) Size (GB) Languages (#)
en-US English (US) World 2 1
no separate package, already available in /stable


Europe

ISO code Language Mostly spoken in (region) Size (GB) Languages (#)
be-BY Belarusian Eastern Europe
bg Bulgarian Eastern Europe
et Estonian Eastern Europe
hu Hungarian Eastern Europe
lv Latvian Eastern Europe
lt Lithuanian Eastern Europe
ro Romanian Eastern Europe
uk Ukrainian Eastern Europe
Package 1 15 8
br Breton Middle Europe
en-GB English (British) Middle Europe
fr French Middle Europe
ga Irish Middle Europe
gl Galician Middle Europe
oc Occitan Middle Europe
cy Welsh Middle Europe
Package 2 13 7
cs Czech Middle Europe
nl Dutch Middle Europe
eo Esperanto World
de German Middle Europe
pl Polish Middle Europe
sk Slovak Middle Europe
sl Slovenian Middle Europe
Package 3 13 7
da Danish Northern Europe
fi Finnish Northern Europe
is Icelandic Northern Europe
nb Norwegian (Bokmal) Northern Europe
nn Norwegian (Nynorsk) Northern Europe
sv Swedish Northern Europe
Package 4 12 6
ast Asturian Southern Europe
eu Basque Southern Europe
ca Catalan Southern Europe
ca-XV Catalan (Valencian) Southern Europe
it Italian Southern Europe
es Spanish Southern Europe
pt-BR Portuguese (Brazilian) South America
pt Portuguese (European) Southern Europe
Package 5 15 8
sq Albanian Southern Europe
bs Bosnian Southern Europe
hr Croatian Southern Europe
el Greek Southern Europe
mk Macedonian Southern Europe
sr Serbian (Cyrillic) Southern Europe
sh Serbian (Latin) Southern Europe
Package 6 13 7


European Union

The following package is defined with languages spoken in the European Union (source: Wikipedia - Percentage of EU population speaking language)

ISO code Language Mostly spoken in (region) Size (GB) Languages (#)
cs Czech European Union
nl Dutch European Union
en-GB English (British) European Union
fr French European Union
de German European Union
el Greek European Union
it Italian European Union
pl Polish European Union
ru Russian European Union
es Spanish European Union
sv Swedish European Union
Package 7 21 11


Asia

ISO code Language Mostly spoken in (region) Size (GB) Languages (#)
ka Georgian Central Asia
kk Kazakh Central Asia
ru Russian Central Asia
tg Tajik Central Asia
uz Uzbek Central Asia
Package 8 10 5
zh-CN Chinese (simplified) Eastern Asia
zh-TW Chinese (traditional) Eastern Asia
dz Dzongkha Central Asia
ja Japanese Eastern Asia
ko Korean Eastern Asia
mn Mongolian Eastern Asia
ne Nepali Central Asia
bo Tibetan Eastern Asia
ug Uyghur Eastern Asia
Package 9 17 9
as Assamese Indian
brx Bodo Indian
dgo Dogri Indian
gu Gujarati Indian
hi Hindi Indian
kn Kannada Indian
ks Kashmiri Indian
kok Konkani Indian
mai Maithili Indian
Package 10 17 9
ml Malayalam Indian
mr Marathi Indian
or Oriya Indian
pa-IN Punjabi Indian
sa-IN Sanskrit Indian
sd Sindhi Indian
si Sinhala Indian
ta Tamil Indian
te Telugu Indian
Package 11 17 9
ar Arabic Middle East
fa Persian Middle East
he Hebrew Middle East
ku Kurdish Middle East
tr Turkish Middle East
Package 12 10 5
bn Bengali Southern Asia
my Burmese Southern Asia
id Indonesian Southern Asia
km Khmer Southern Asia
th Thai Southern Asia
vi Vietnamese Southern Asia
Package 13 12 6


Africa

ISO code Language Mostly spoken in (region) Size (GB) Languages (#)
om Oromo Eastern Africa
rw Kinyarwanda Eastern Africa
sw-TZ Swahili Eastern Africa
Package 14 6 3
af Afrikaans Southern Africa
en-ZA English (South African) Southern Africa
nr Ndebele Southern Africa
ns Northern Sotho Southern Africa
st Southern Sotho Southern Africa
ss Swazi Southern Africa
ts Tsonga Southern Africa
tn Tswana Southern Africa
ve Venda Southern Africa
xh Xhosa Southern Africa
zu Zulu Southern Africa
Package 15 19 10


Calculation example 1

Now

Description Size Comment
/stable 7 GB en-US builds only
/localized 44 GB all localized languages
/extended 90 GB files for RCs, Beta, Dev Builds, L10N, ISO files


Future

Description Size Comment
/stable 2 GB en-US builds only
/localized 38 GB localized languages from packages 3,5,8 only
/extended 300 GB files for RCs, Beta, Dev Builds, L10N, ISO files
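The 38 GB in the future table is the sum of the sizes of packages 3, 5 and 8 from Possibility 1. A small sketch of that arithmetic follows, using the package sizes from the regional tables. The names `REGIONAL_PACKAGE_GB`, `localized_size_gb` and `mirror_size_gb` are illustrative only.

```python
from typing import Iterable

# Package number -> size in GB, copied from the Possibility 1 tables.
REGIONAL_PACKAGE_GB = {
    1: 15, 2: 13, 3: 13, 4: 12, 5: 15, 6: 13, 7: 21, 8: 10,
    9: 17, 10: 17, 11: 17, 12: 10, 13: 12, 14: 6, 15: 19,
}

FUTURE_STABLE_GB = 2
FUTURE_EXTENDED_GB = 300


def localized_size_gb(selected: Iterable[int]) -> int:
    """Sum the sizes of the selected packages (each package counted once)."""
    return sum(REGIONAL_PACKAGE_GB[n] for n in set(selected))


def mirror_size_gb(selected: Iterable[int], with_extended: bool = False) -> int:
    """Total footprint: /stable, the chosen packages, and optionally /extended."""
    total = FUTURE_STABLE_GB + localized_size_gb(selected)
    if with_extended:
        total += FUTURE_EXTENDED_GB
    return total


if __name__ == "__main__":
    print(localized_size_gb([3, 5, 8]))                   # 38
    print(mirror_size_gb([3, 5, 8]))                      # 40
    print(mirror_size_gb([3, 5, 8], with_extended=True))  # 340
```

The plain sum counts a language twice when it appears in two selected packages (for example German in packages 3 and 7), so it is an upper bound for such selections.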


Possibility 2

The following packages are defined by grouping languages according to their percentage share of downloads:

Rank ISO code Language Percent Size (GB) Languages (#)
1 en-US English (US) 47.05%
Package 1 2 1
2 fr French 13.67%
3 de German 8.82%
4 it Italian 8.40%
5 ja Japanese 5.71%
6 es Spanish 3.49%
7 pl Polish 3.30%
Package 2 12 6
8 ru Russian 2.43%
9 nl Dutch 1.82%
10 sv Swedish 1.14%
11 da Danish 0.92%
12 zh-CN Chinese (simplified) 0.77%
13 zh-TW Chinese (traditional) 0.66%
14 nb Norwegian (Bokmal) 0.59%
15 tr Turkish 0.42%
Package 3 15 8
16 ko Korean 0.25%
17 pt-BR Portuguese (Brazilian) 0.18%
18 el Greek 0.08%
19 gl Galician 0.06%
20 pt Portuguese (European) 0.06%
21 nn Norwegian (Nynorsk) 0.05%
22 vi Vietnamese 0.05%
23 lt Lithuanian 0.04%
Package 4 15 8
24 mk Macedonian 0.01%
25 sh Serbian (Latin) <0.01%
26 ku Kurdish <0.01%
27 sr Serbian (Cyrillic) <0.01%
28 ga Irish <0.01%
29 en-GB English (British) <0.01%
30 et Estonian <0.01%
31 ar Arabic <0.01%
Package 5 15 8


Calculation example 2

Now

Description Size Comment
/stable 7 GB en-US builds only
/localized 44 GB all localized languages
/extended 90 GB files for RCs, Beta, Dev Builds, L10N, ISO files


Future

Description Size Comment
/stable 2 GB en-US builds only
/localized 42 GB localized languages from packages 2,3,4 only
/extended 300 GB files for RCs, Beta, Dev Builds, L10N, ISO files
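With download-based packages, a mirror operator can weigh disk space against the share of downloads served. The sketch below adds up both for a selection, using the sizes and percentages from the Possibility 2 table. Package 5 is left out because most of its shares are only given as "<0.01%". The names `DOWNLOAD_PACKAGES` and `summarize` are illustrative only.

```python
from typing import Iterable, Tuple

# Package number -> size in GB and per-language download shares in percent,
# copied from the Possibility 2 table (package 5 omitted, see text above).
DOWNLOAD_PACKAGES = {
    1: {"size_gb": 2, "shares": [47.05]},
    2: {"size_gb": 12, "shares": [13.67, 8.82, 8.40, 5.71, 3.49, 3.30]},
    3: {"size_gb": 15, "shares": [2.43, 1.82, 1.14, 0.92, 0.77, 0.66, 0.59, 0.42]},
    4: {"size_gb": 15, "shares": [0.25, 0.18, 0.08, 0.06, 0.06, 0.05, 0.05, 0.04]},
}


def summarize(selected: Iterable[int]) -> Tuple[int, float]:
    """Return (total size in GB, total download share in percent)."""
    chosen = set(selected)
    size = sum(DOWNLOAD_PACKAGES[n]["size_gb"] for n in chosen)
    share = sum(sum(DOWNLOAD_PACKAGES[n]["shares"]) for n in chosen)
    return size, round(share, 2)


if __name__ == "__main__":
    print(summarize([2, 3, 4]))     # (42, 52.91)
    print(summarize([1, 2, 3, 4]))  # (44, 99.96)
```

Going by the table's figures, packages 2 to 4 together with the en-US builds in /stable would cover almost all recorded downloads at about 44 GB.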