Education ClassRoom/Previous Logs/tinderboxes

From Apache OpenOffice Wiki
cloph Hi * 10:59
chacha_chaudhry cloph: hi 10:59
cloph I uploaded my slides - link is on the agenda-page, or go straight to http://muenchen-surf.de/lohmaier/misc/ 10:59
* lgodard (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has joined #education.openoffice.org 10:59
ericb2 cloph: FYI, vincent vikram informed me there is a firewall, and some of his students will read the log afterwards 10:59
cloph Not many people here yet... :-) but getting more apparently... 10:59
ericb2 cloph: and they will ask using mail or mailing lists 11:00
ericb2 cloph: even IRC is difficult at some places 11:00
chacha_chaudhry cloph: yes it is ... at some universities 11:00
cloph Just shout "Go" when I should start/when you got the slides :-) 11:02
ericb2 cloph: thanks for your slides 11:03
ericb2 as .pdf : http://muenchen-surf.de/lohmaier/misc/All_about_Tinderbox.pdf 11:03
ericb2 as .odp : http://muenchen-surf.de/lohmaier/misc/All_about_Tinderbox.odp 11:03
ericb2 cloph: we are ready :) you can start when you want 11:03
chacha_chaudhry cloph: Go 11:03
chacha_chaudhry  :) 11:03
cloph OK then - as you already read the agenda, you know what this talk is about: Tinderbox :-) 11:04
cloph You can see how the talk will proceed on the contents slide - but as you can read faster than I can type, I'll not read it to you :-) <flip/> 11:05
cloph If you have questions in the meantime, don't hesitate to interrupt me, feel free to ask without raising your hand first 11:05
cloph The question: "What is tinderbox?" can be answered fairly easily: It is a system that collects build statuses from various sources and displays those in a hopefully clear and easy to understand way. 11:06
cloph This is the basic task that tinderbox has. To reach that goal, it has other features, like integration with bonsai (or other tools). 11:07
chacha_chaudhry sources means ? -- various platforms or OS? 11:07
cloph In that case it means both. 11:07
cloph There are multiple clients that build the code, those clients run on different OS/Platforms, have a different build-setup and build different cws 11:08
cloph So while there might be two builders that run linux, one can use Sun's JDK, the other can use gcj, or one can use gcc 3.4, the other gcc 4.3 - that sometimes can make a big difference. 11:09
cloph For those who don't know what bonsai is: Bonsai is a tool that collects commit-information, it is a more advanced "CVS viewer" - it allows you to query for commits in a given period of time, or associated with a specific tag or file. 11:10
* ericb2 suggests to read : http://wiki.services.openoffice.org/wiki/Education_ClassRoom/Practice#Bonsai_use 11:10
cloph OOo did not develop tinderbox itself, it merely modified Mozilla's tinderbox2 (only slight modifications). Mozilla is a great project when it comes to such stuff (think of Bugzilla and stuff) 11:11
cloph Ah, great :-) 11:11
cloph <flip/> So how's tinderbox used within the OpenOffice.org project? 11:11
cloph Tinderbox provides overview pages of the results, grouped per status of a CWS (more on that later) 11:12
cloph It is a fast way to check: "Will a problematic cws soon hit the Master" (that's how I use those pages at least :-)) - besides those overview pages, it also has status pages for the individual cws, that show more info (also more on that later) 11:13
cloph Tinderbox is also integrated with EIS, in a way that it gets the tag-list (the list of CWS, what milestone they're based on and what cws modules they include) from EIS via SOAP, and queries cvs directly to get hold of the latest milestones. 11:14
cloph Some of you might also have heard of buildbot or termite already - this is a related tool, that also is meant to provide a way to automatically build code on different buildslaves 11:15
cloph In OOo, both real tinderboxslaves as well as (some of) the buildbots report the build status to tinderbox. From a tinderbox point of view, it doesn't matter what system built the code, it doesn't make a difference there. 11:16
cloph <flip/>Pictures say more than a thousand words, so just have a look at some of the examples (in case you could resist the urge to click on one of the URLs :-)) 11:17
cloph The first one shows the overview page that shows the cws in nominated state, as you see, all is green (or yellow). Green is good :-) 11:18
cloph <flip/>as another example a few cws in the new state. You see some are red. Red is bad :-( 11:18
cloph In case you wondered what the other colors mean: <flip/> 11:18
cloph OOo uses the following: There is a "success" status, a "test failed" status (orange), a "build failed" status, a "currently building" status, a "dirty" status and a "fold" status. 11:19
* ericb2 discovering the real sense of Orange color :) 11:20
cloph The test failed status is maybe a little misleading, as it currently is not used to indicate whether some tests failed or not (sorry ericb2 :-)) 11:20
ericb2 cloph: np 11:20
cloph The only buildslave that makes use of that status is the Mac PPC buildslave to indicate when it had to rebuild i18npool multiple times or similar (non-reproducible build failures that can be overcome by just rebuilding the affected modules, a rather special scenario, only affecting the PPC) 11:21
cloph I guess "green" and "red" are self-explanatory. The dirty status can be set thanks to bonsai integration. That way tinderbox knows when commits have been performed after a build was started. So it knows that the results (while valid for the code that was built) don't reflect the current status of the cws anymore. 11:22
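The "dirty" rule described above can be illustrated with a minimal Python sketch. It assumes the build start time and the commit times reported by Bonsai are already available as datetime values; the name `is_dirty` and the sample timestamps are hypothetical and not taken from tinderbox itself.

```python
from datetime import datetime
from typing import Iterable


def is_dirty(build_start: datetime, commit_times: Iterable[datetime]) -> bool:
    """Return True if at least one commit landed after the build started."""
    return any(commit > build_start for commit in commit_times)


# Hypothetical data: a build started at 09:00, commits at 08:30 and 10:15.
build_start = datetime(2008, 5, 7, 9, 0)
commits = [datetime(2008, 5, 7, 8, 30), datetime(2008, 5, 7, 10, 15)]

# The 10:15 commit is newer than the build start, so the result is stale.
stale = is_dirty(build_start, commits)
```

A build result stays valid for the code that was built; the flag only says that the cws has moved on since then.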
cloph The grey status is mainly introduced for the buildbot buildslaves, that don't manage their buildqueue themselves, but are told what to build. That way they can say: "sorry man, I don't want to build that stuff". Mainly because a build-breaker is known already, or the buildslave only wants to build newer milestones and not old cruft. 11:24
cloph <flip/> I already mentioned that Tinderbox is using EIS - the same is true the other way round. EIS uses tinderbox as well, it can show the tinderbox status in the EIS overview pages. 11:25
cloph Unfortunately the default view when browsing to an EIS-CWS page is "overview", and that doesn't show the tinderbox info, but that can be configured by the user (but of course is not possible when just using the "guest" login for convenience) 11:26
cloph <flip/> So far, I only talked about the overview pages, those don't offer that much info as opposed to the real per-cws status pages. 11:27
* ericb2 always clicks "All" button 11:27
cloph Again there is a cross-reference to EIS (the link at the very top will bring you to the corresponding EIS page). 11:27
* cloph has the default (when logged in) set to Tinderbox :-D 11:27
cloph The status page is structured in a table view. At the left you see a time column, next to it a "guilty" column, and after that the columns for the individual buildslaves (be it real tinderboxes like the Fedora and Mac ones in the example, or buildbots like the O3-build and Win-XP2 ones). 11:29
cloph The "Guilty" column lists commits, you can have a look at http://tinderbox.go-oo.org/aquavcl08/status.html for example, that lists the latest commit by ericb2 to the cws 11:30
cloph That info comes from Bonsai. 11:30
cloph You can use either the commit-entry in the "guilty" column or the timeline to query bonsai for what exactly was committed. 11:30
ericb2 good idea to link with Bonsai 11:32
cloph I basically only query bonsai by passing by the tinderbox page of the cws, since that way I don't have to fill in the query form manually, and usually I'm only interested in the commits after the last successful build, the tinderbox page makes that easier (IMHO) 11:32
cloph The header of the buildslave column shows some info about the buildslave, like when the cws was built last, what the average buildtime is (actually mean, not average), how long a current build will still run 11:33
cloph see e.g. http://tinderbox.go-oo.org/iconupdate300u1/status.html 11:33
cloph average buildtime is around 220 minutes, and the result is overdue (that is because I'm currently building another tree outside tinderbox and that costs CPU :-)) 11:34
cloph The box with the status result then is the most important part: That box provides access to the buildlogs and only indicates when a cws was built. 11:35
cloph Some slaves (those maintained by me at least), also specify what patches were applied (for known build-breakers affecting the Master the CWS is based on), and whether some of the abovementioned quirks were needed (in the case on the screenshot, the i18npool problem was hit) 11:36
cloph In case the build was done a while back, you can as well go back in time with the "show previous xxx hours" at the bottom of the overview page. 11:36
cloph (But the actual logs might not be available anymore, they get removed by a cronjob) 11:37
cloph <flip/>So let's assume the build broke (is marked as red) and you want to know why it broke. Click on one of the "l L C" links to open the popup (this might be a bit tricky, since it closes when you hover over another link before reaching the popup, and also when hovering over the "close" link in the popup itself) 11:38
cloph From that popup, there are links to show a brief (Summary) log and the full log. 11:38
cloph In the case for OOo, where full logs can reach 40 to 50 MB (uncompressed), the only sensible way to start is by using the brief log, that only shows lines above and below an "error", and skips the rest. 11:39
ericb2 cloph: when you find an error, what can be done ? Do you send a mail to the dev asking him to fix the problem ? 11:40
cloph The first thing listed on the brief-log page are the tinderbox annotations (more on that later), the most important being the tinderbox-administrator one. This is meant to show the admin that is responsible for the buildbot, the one who can be contacted when there's a problem with the buildslave itself/the person that can be asked for help in reading the log. 11:40
cloph ericb2: Yes, either mail directly, file an issue, comment in EIS, try to reach the dev on IRC. 11:41
cloph What way you choose basically depends on how urgent it is. If the cws is already nominated, do whatever you can to make them aware of the problem :-) 11:41
cloph If it is still in status new, the dev might not care yet, since more changes are to come anyway/it doesn't even build for the developer him/herself 11:42
cloph If you look at the screenshot, you might notice one problem already: For the buildbot buildslaves, not the real administrator of the buildbot is shown, but a general alias, "buildermaster@termite.go-oo.org" - this is a limitation of buildbot currently, and might be solved in future. 11:43
cloph <flip/>Now to the next part. Following the annotations, the detected error messages are listed. 11:43
cloph Note that those lines are not "errors" by themselves, merely lines that /could be/ errors. It is just detecting words like "failed" or "error" in the log and using those to flag a line (of course more elaborate than that, but enough to get the idea) 11:44
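The keyword flagging and the brief log with context lines can be sketched roughly as follows in Python. This is a deliberately simplified stand-in, not tinderbox's actual error parser (which is Perl and more elaborate); the pattern, the function names and the sample log lines are all made up for illustration.

```python
import re

# Hypothetical keyword pattern; the real error parser is more elaborate.
SUSPECT = re.compile(r"\b(error|failed)\b", re.IGNORECASE)


def flag_lines(log_lines):
    """Return the indexes of lines that merely *look* like errors."""
    return [i for i, line in enumerate(log_lines) if SUSPECT.search(line)]


def brief_log(log_lines, context=2):
    """Keep flagged lines plus `context` lines above and below each one.

    Returns (line_number, text) pairs with 1-based line numbers.
    """
    keep = set()
    for i in flag_lines(log_lines):
        keep.update(range(max(0, i - context), min(len(log_lines), i + context + 1)))
    return [(i + 1, log_lines[i]) for i in sorted(keep)]


# Made-up log excerpt: the first line is flagged although nothing is broken.
sample = [
    "checking whether error reporting works... yes",
    "Compiling: foo.cxx",
    "Compiling: bar.cxx",
    "Compiling: baz.cxx",
    "Compiling: qux.cxx",
    "bar.cxx:12: error: 'm_xORB' was not declared in this scope",
    "dmake: Error code 1",
]

for number, text in brief_log(sample, context=1):
    print(number, text)
```

The harmless configure line in the sample is flagged just like the real compiler error, which is why a build can be green and still show a non-zero "error" count.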
cloph As builds usually stop after they hit an error, the error is usually found at the very bottom of the list (more or less, since many buildslaves do parallel builds, so it might be further up a little)<flip/> 11:45
cloph So in the first line that is shown in the next screenshot, just above the buildlog you can find the error that broke this build: "error: "m_xORB" was not declared in (this scope)" 11:46
cloph Click on that link and it will bring you to the line where it appears in the log <flip/> 11:46
cloph There it is, flagged in red, with context above and below. There you also see what I mentioned above: this was a parallel build, so you see lines of other stuff that was compiled interspersed with the module that broke. 11:47
cloph The links on the left are the linenumbers, each line has an html-anchor, so you can link to any line in the log directly, the "Next" links jump to the next "error" in the log. 11:48
cloph I write "error" since the error count that is shown on the colored build-status box always causes confusion: "How can a build be flagged as successful, when there have been 30 errors?" is an often heard question. 11:49
cloph So now that we had a look at the basic functionality of tinderbox and had a look at how to use it, let's switch to the "why" part, why bother? 11:50
cloph I don't know how many of you already built OOo - Just let me say that building OOo takes looooong. OOo is huge, and requires much time (and also diskspace) to build. 11:50
cloph It is very annoying when you start a build in the evening, to start working or testing the build the next morning, only to find out that your build broke after 20 minutes. 11:51
cloph OOo's development model is designed in a way that it should ensure that there's always a usable Master. 11:52
* lgodard has quit ("Leaving.") 11:52
cloph It is split in childworkspaces, cws, where development is focused on a few issues, a few features or a big one, separated from other development activities. So after a while (every week or two weeks), those cws that are done get integrated into a master. The number of cws can be quite high. 11:53
cloph If the master then breaks, you need to investigate: Why does it break? Is it a combination of cws that cause the break, or is one cws just being broken? 11:54
* Lachs (n=Gregor@sd-socks-197.staroffice.de) has joined #education.openoffice.org 11:54
cloph Here's where tinderbox jumps in. It can tell: Look, this cws is flagged red, it caused a build breaker. 11:54
cloph Ideally that cws will not be integrated until the problem is solved, but even when it is, that info can help to find a solution earlier, to find the developer faster who can fix the breaker. 11:55
* lgodard (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has joined #education.openoffice.org 11:55
cloph While release-engineers build the code before they announce the master as ready, they of course only use their setup, and that doesn't reflect what the community builders use. Some use Sun's java, some use gcj, some build with features that are turned off in Sun's configuration, some disable features. Some do excessive multi-processing builds, etc. 11:56
cloph So the goal is: Don't release a master that cannot be built by somebody. 11:57
cloph <flip/>So why does it still happen then? This brings us to the limitations of tinderbox. 11:57
cloph The basic problem is compliance. 11:58
cloph Not all build-breakers are faults in the code. There can be a misconfiguration of the buildslave, there can be a problem with the master that the cws is based on (so the problem is in the master, and not in the changes the developer did in his/her cws), there can be infrastructure problems (anoncvs not up-to-date or not reachable at all) 11:59
cloph Furthermore people are impatient, they want results "immediately" after they committed their stuff. This is of course not possible, building takes three to four hours on fast machines, and of course the build is not started immediately after the commit, since there are other cws to be built as well. 12:00
cloph Another kind of limitation, that is not related to buildability, is the fact that tinderbox doesn't care about whether the produced Office actually works or not, what counts is only "are there build-breakers or not". (the test_failed status already suggests that this is not a limitation of tinderbox, one could actually use a dedicated status for that), the problem is that none of the bots do run tests, that there are/ 12:02
cloph Furthermore running tests also costs time, meaning the build results for the cws would be delayed even further. 12:02
cloph <flip/>Also while the community builders use a variety of build-configurations, tinderbox only covers a very small part of it. 12:03
cloph There just aren't enough buildslaves to cover each and every setup. 12:03
cloph buildslaves also use a fixed set of configure options, so don't detect when stuff breaks in code that is not activated, and because of a limitation of EIS, the buildslaves cannot build cws that introduce a new module to cvs (that module just isn't listed in EIS, the bot cannot know about it) 12:04
cloph Last but not least, fixing a breaker sometimes is a lot easier or only possible when you have access to an affected buildhost, so even if a developer did have a look at the log, he/she might not be able to fix it 12:05
cloph While tinderbox has a way to handle installsets (you can send files or links to installsets), given the size of OOo (140MB for Mac install set for example), it is just impossible to upload every installset that is built by the slaves, and since the tinderbox buildslaves are all self-contained, decide themselves what they build, there is no way to request an installset but by asking the maintainer. 12:07
cloph (Buildbot on the other hand can be used to request an installset) 12:07
cloph <flip/>Now to the recruiting part :-) 12:07
cloph What can be done to help? - well the first one is simple: Provide a buildslave. But of course not everybody has a suitable build-machine or wants to maintain a buildslave, so there are other options as well 12:08
cloph Be a mediator between the results and the developers. Notify them of build-breakers caused by their code (ideally in form of a patch), and maybe even more important: Notify the administrator of the bot when the build-breaker is caused by the bot, not by the code. 12:09
* lgodar1 (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has joined #education.openoffice.org 12:10
cloph The list of "errors" might be cut as well, while it is possible to just whitelist some of the lines, it might actually be more desirable to get rid of the complaining in the first place. 12:10
cloph This is kind of a janitorial task, and can cause a lot of work, but maybe someone wants to tackle it nevertheless :-) 12:11
ericb2 cloph: yes 12:11
cloph Good :-) <flip/> so in order to set up a buildslave, you of course need to know how it actually works 12:12
* ericb2 updated the logs for people who cannot use IRC 12:12
cloph The interaction with the tinderbox system is very simple: The buildslaves just need to send their buildlogs via mail to tinderbox. Nothing more, nothing less. 12:12
cloph Tinderbox then passes the logs through the errorparser to create the brief and full logs and creates the statuspages for the cws. Add the bonsai information to that and tinderbox' job is done. 12:13
cloph <flip/>Of course in order to run a bot, you must be able to build OOo on your system, then automate that process and you have a tinderbox buildslave 12:14
cloph <flip/>You need to pay attention to the mail though 12:15
cloph tinderbox needs to know to what tree (cws) the log belongs, when the build was done, what the outcome was, what buildslave built it, etc. That's what the tinderbox annotations are for. You just put those lines above the actual log. 12:15
cloph And you need to add the mail-header corresponding to the type of message: One with the log in the body: Use X-Tinder: cookie, for logs with gzipped attachment, use X-Tinder: gzookie. 12:16
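As a rough Python sketch of such a mail with the log in the body: the `X-Tinder: cookie` header comes from the talk, but the annotation keys (`tree`, `status`, `administrator`) and the `tinderbox: key: value` line layout are assumptions used for illustration only, so check the setup guide on the wiki for the real names. The addresses and the helper name `build_status_mail` are placeholders.

```python
from email.message import EmailMessage


def build_status_mail(annotations, log_text, sender, recipient):
    """Build a mail with annotation lines above the plain-text build log."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "tinderbox build log"
    msg["X-Tinder"] = "cookie"  # header for logs carried in the mail body
    # Assumed annotation layout: one "tinderbox: key: value" line per entry.
    header = "\n".join(f"tinderbox: {key}: {value}" for key, value in annotations.items())
    msg.set_content(header + "\n\n" + log_text)
    return msg


# Hypothetical annotation names and placeholder addresses.
mail = build_status_mail(
    {"tree": "mycws", "status": "success", "administrator": "admin@example.org"},
    "Compiling: foo.cxx\nBuild finished.",
    "buildslave@example.org",
    "tinderbox@example.org",
)
print(mail["X-Tinder"])
```

Actually sending it would then be a matter of handing the message to `smtplib.SMTP(...).send_message(mail)` with whatever mail server the buildslave uses.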
cloph <flip/>the gzipped logs are one of those customisations applied to OOo's tinderbox installation. Uncompressed logs, as mentioned before can be huge, 40MB and more. 12:17
cloph But those logs compress very, very well. A gzipped log is 2.5 to 3 MB in size only. 12:17
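The gzipped variant could look roughly like this in Python. Only the `X-Tinder: gzookie` header is taken from the talk; whether the annotations belong in the mail body or inside the compressed file is not stated, so putting them in the body is an assumption here, as are the attachment filename, the addresses and the helper name `build_gzipped_mail`.

```python
import gzip
from email.message import EmailMessage


def build_gzipped_mail(annotation_text, log_text, sender, recipient):
    """Build a mail that carries the build log as a gzip attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "tinderbox build log (gzipped)"
    msg["X-Tinder"] = "gzookie"  # header for logs sent as gzipped attachment
    msg.set_content(annotation_text)  # assumption: annotations stay in the body
    compressed = gzip.compress(log_text.encode("utf-8"))
    msg.add_attachment(
        compressed,
        maintype="application",
        subtype="gzip",
        filename="build.log.gz",  # hypothetical filename
    )
    return msg


# Build logs are very repetitive, which is why they compress so well.
log = "Compiling: some/module/file.cxx\n" * 1000
gz_mail = build_gzipped_mail("tinderbox: tree: mycws", log,
                             "buildslave@example.org", "tinderbox@example.org")
attachment = next(gz_mail.iter_attachments())
print(len(log), "bytes raw,", len(attachment.get_content()), "bytes gzipped")
```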
cloph Sending mail can be easily automated with perl (or mutt, or ....) - two modules that I used myself are Mail::Sender that can be installed via CPAN, and SendEmail 12:18
cloph I now suggest SendEmail, since that one supports connections with TLS, as required when using gmail for example, it is a standalone program written in perl and works quite well. 12:19
cloph <flip/>On the buildscript side, the script doesn't need to do much either: It needs to set up the buildtree, apply patches for known breakers (and annotate them if possible), and then finally send the captured log to tinderbox. It is advised that the buildslave doesn't only send the mail when all is finished, but also when it is starting a build, that way people know when a build is running, and when the results can be expected. 12:21
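One build cycle as described above might be structured like this minimal Python sketch. It is not cloph's script (that one is Perl); the step functions are stubs, and the status strings `building`, `success` and `build_failed` are hypothetical labels for the statuses named in the talk.

```python
def run_build(cws, steps, report):
    """Run the build steps for one cws, reporting at start and at the end.

    steps  : list of (name, callable) pairs; each callable takes the cws
             name and returns (ok, output_text)
    report : callable(cws, status, log_text), e.g. one that mails tinderbox
    """
    report(cws, "building", "")  # announce that a build has started
    log_parts = []
    status = "success"
    for name, step in steps:
        ok, output = step(cws)
        log_parts.append(f"== {name} ==\n{output}")
        if not ok:  # stop at the first failing step
            status = "build_failed"
            break
    report(cws, status, "\n".join(log_parts))
    return status


# Stub steps standing in for the real work (checkout, patching, building).
def setup_tree(cws):
    return True, f"checked out {cws}"


def apply_patches(cws):
    return True, "no patches needed"


def build(cws):
    return False, "error: build stopped"


sent = []  # collects what would be mailed to tinderbox
result = run_build(
    "mycws",
    [("setup", setup_tree), ("patch", apply_patches), ("build", build)],
    lambda cws, status, log: sent.append((cws, status)),
)
```

Passing the reporting function in as a parameter keeps the mail handling separate from the build steps, so the loop can be tried out without sending anything.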
cloph Then continue the process, start with the next cws... 12:22
* lgodar1 has quit ("Leaving.") 12:22
* lgodard has quit ("Leaving.") 12:22
* lgodard (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has joined #education.openoffice.org 12:22
cloph So - that basically concludes the presentation. I learned that I type far, far too slowly to stay in the announced time, but since you're still (or again :-)) here, I don't think that really did matter... <flip/> So questions and answers time. Anyone? 12:23
ericb2 cloph: sorry, I was copying/pasting the changes 12:24
ericb2 chacha_chaudhry: questions ? 12:24
ericb2 cloph: I got one: to summarize, if ever I got a machine and can give processor time, how to proceed, what to install ? Where to ask, whom to ask for tips ? 12:25
ericb2 cloph: I noticed the first step is to complete an OpenOffice.org build 12:26
ericb2 cloph: and then, start with tinderbox 12:26
* lgodard (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has left #education.openoffice.org 12:26
cloph (I think I forgot to mention the link to the wiki pages in the presentation: http://wiki.services.openoffice.org/wiki/Tinderbox - here you find links regarding EIS, a link to the RedTinderboxStatusinEIS page (that lists some known false positives), and also a short setup-guide) 12:26
chacha_chaudhry cloph: any client side buildslave clients easy to config? 12:26
cloph ericb2: Yes, the prerequisite is that one is able to build OOo. 12:26
* lgodard (n=lgodard@AGrenoble-152-1-65-106.w86-193.abo.wanadoo.fr) has joined #education.openoffice.org 12:26
cloph chacha_chaudhry: You mean ready-to-use scripts? 12:27
chacha_chaudhry cloph: yes 12:27
cloph I have one that I could make reusable... I use it on Linux and Mac, so it should work for those, and since I use perl, the principle would also work on cygwin (but of course I didn't pay attention regarding paths and stuff) 12:28
ericb2 cloph: how much time per day does it need to maintain a tinderbox ? Do you need to upgrade something from time to time ? 12:29
cloph My scripts listen on a fifo for enqueue requests, you can do "echo mycws > fifo-pipe" to enqueue a build (a cronjob can automate this), clear the queue with "echo dequeue > fifo-pipe" and stop the slave with "echo quit > fifo-pipe" (will wait until build is finished) 12:29
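The fifo protocol described above (a cws name enqueues, `dequeue` clears, `quit` stops) can be modelled with a small Python sketch. cloph's real scripts are Perl and were not published in this log, so the function name `handle_command` and the use of a deque are assumptions.

```python
from collections import deque


def handle_command(queue, line):
    """Apply one line read from the fifo.

    Returns False when the slave should stop, True otherwise.
    """
    word = line.strip()
    if not word:
        return True  # ignore empty lines
    if word == "quit":
        return False  # caller finishes the running build, then exits
    if word == "dequeue":
        queue.clear()  # drop everything that is still waiting
        return True
    queue.append(word)  # anything else is taken as a cws name to build
    return True


pending = deque()
for incoming in ["mycws\n", "othercws\n", "dequeue\n", "mycws\n", "quit\n"]:
    if not handle_command(pending, incoming):
        break
print(list(pending))
```

In a real slave the lines would come from reading the named pipe (for example `for line in open("fifo-pipe")`), which blocks until somebody writes to it.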
cloph ericb2: Ah, good catch. 12:29
cloph A buildslave requires attention every time a new master is released. 12:30
chacha_chaudhry cloph: may you upload it some place ? It would be helpful 12:30
cloph You need to check whether that new milestone built fine on your machine, and if not hunt for the necessary patches/file issues so that the master can be built again. 12:30
cloph chacha_chaudhry: Sure, but I guess I need to clean it up first, it is not very clean code :-) 12:31
chacha_chaudhry cloph: :) Okay 12:31
ericb2 cloph: a word from Vincent Vikram by email, since he does not follow the meeting directly : Please tell cloph that the presentation was good and some time is needed(for me) to absorb it. 12:31
cloph I think I can have it ready on Wednesday or something 12:31
chacha_chaudhry cloph: I can help with perl to improve them though once I set up :) 12:32
cloph Thanks for the offer :-) 12:32
ericb2 cloph: from Vincent again : Please define EIS and CWS as in "Tinderbox is using EIS" and [11:26] <cloph> Unfortunately the default view when browsing to an EIS-CWS page is "overview" 12:33
cloph ericb2/Vincent: Thanks for the feedback - In case there are questions in the next days or weeks, I'm on IRC more or less every evening, and you can use MemoServ to leave a message or write a mail 12:34
cloph EIS is the Environment Information System 12:34
cloph It is a management tool that lists all cws-related information, lists what cws are integrated into what master, etc. 12:34
cloph A http://wiki.services.openoffice.org/wiki/CWS is a childworkspace where development happens. Think of it as a copy of the complete sources (where only a few modules are actually part of the cws itself, the rest is taken from the master) 12:35
* ericb2 would like to add: this second ClassRoom was "tool" oriented, and next will be more code oriented 12:35
cloph More info on EIS can be found here: http://wiki.services.openoffice.org/wiki/EIS 12:35
* cloph didn't know what to focus on, but I guess the code of tinderbox itself (cgi/perl, BTW) wouldn't be too useful :-) 12:36
ericb2 cloph: the order of the classroom takes into consideration the need to know a bit about OpenOffice.org environment, and how to build it 12:37
ericb2 cloph: once done, we'll discover the code 12:38
ericb2 cloph: and your ClassRoom was great, far more interactive than mine 12:39
chacha_chaudhry cloph: Thanks for your efforts slides are very informative. :-) 12:40
ericb2 cloph: thanks a lot for your time, and for your great work ! 12:41
cloph Thanks - I'll add the link to the tinderbox-page in the wiki and reupload :-) 12:41
ericb2 @all : you'll find the complete log at : http://wiki.services.openoffice.org/wiki/Education_ClassRoom/Previous_Logs/tinderboxes 12:41
* chacha_chaudhry has quit ("Ex-Chat") 12:43
ericb2 The next ClassRoom will be Wednesday 21st May , 11:00 Hamburg Paris Time ( same hour, same channel : #education.openoffice.org ) 12:47
ericb2 we will receive Philipp Lohmann, who will present us the gsl project. Let's talk about code ! 12:47
End of TinderBox ClassRoom. See you !!
