Thursday, February 2, 2012

Distributed builds - hudson - Hudson Wiki

Hudson supports a "master/slave" mode, in which the workload of building projects is delegated to multiple "slave" nodes. This allows a single Hudson installation to host a large number of projects, or to provide the different environments needed for builds and tests. This document describes this mode and how to use it.

How does this work?

A "master" is an installation of Hudson. When you weren't using the master/slave support, a master was all you had. Even in the master/slave mode, the role of a master remains the same. It will serve all HTTP requests, and it can still build projects on its own.

Slaves are computers that are set up to build projects for a master. Hudson runs a separate program called a "slave agent" on each slave. There are various ways to start slave agents, but in the end a slave agent and the Hudson master need to establish a bi-directional byte stream (for example, a TCP/IP socket).

Once slaves are registered with a master, the master starts distributing load to them. The exact delegation behavior depends on the configuration of each project. Some projects may choose to "stick" to a particular machine for a build, while others may choose to roam freely between slaves. For people accessing the Hudson website, things work mostly transparently. You can still browse javadoc, see test results, and download build results from the master, without ever noticing that builds were done by slaves.

Follow the Step by step guide to set up master and slave machines to quickly start using distributed builds.

Different ways of starting slave agents

Pick the right method for your environment and for the operating systems that the master and slaves run.

Have master launch slave agent via ssh

Hudson has a built-in SSH client implementation that it can use to talk to a remote sshd and start a slave agent. This is the most convenient and preferred method for Unix slaves, which normally have sshd out of the box. Click Manage Hudson, then Manage Nodes, then click "New Node." In this set up, you'll supply the connection information (the slave host name, user name, and ssh credential). Note that the slave will need the master's public ssh key copied to ~/.ssh/authorized_keys. (This is a decent howto if you need ssh help). Hudson will do the rest of the work by itself, including copying the binary needed for a slave agent and starting/stopping slaves. If your project has external dependencies (like a special ~/.m2/settings.xml, or a special version of java), you'll need to set those up yourself, though. [Where is this documented?]

This is the most convenient set up on Unix.

Have master launch slave agent on Windows

For Windows slaves, Hudson can use the remote management facility built into Windows 2000 or later (WMI+DCOM, to be more specific). In this set up, you'll supply the username and the password of a user who has administrative access to the system, and Hudson will use that to remotely create a Windows service and to remotely start and stop it.

This is the most convenient set up on Windows, but does not allow you to run programs that require display interaction (such as GUI tests).

Note: Unlike with other node configuration types, the node's name is very important here, because it is used as the address of the machine on which to create the service!

Write your own script to launch Hudson slaves

If the above turn-key solutions do not provide the flexibility you need, you can write your own script to start a slave. You place this script on the master, and tell Hudson to run this script whenever it needs to connect to a slave.

Typically, your script uses a remote program execution mechanism like SSH, RSH, or other similar means (on Windows, this could be done by the same protocols through cygwin or tools like psexec), but Hudson doesn't really assume any specific method of connectivity.

What Hudson expects from your script is that, in the end, it has to execute the slave agent program like java -jar slave.jar, on the right computer, and have its stdin/stdout connect to your script's stdin/stdout. For example, a script that does "ssh myslave java -jar ~/bin/slave.jar" would satisfy this.
(The point is that you let Hudson run this command, as Hudson uses this stdin/stdout as the communication channel to the slave agent. Because of this, running this manually from your shell will do you no good).
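As a rough illustration of the contract described above, here is a minimal Python sketch of such a launch script. The function names and the one-argument command line are assumptions made for this example; the only parts taken from the text are the `ssh myslave java -jar ~/bin/slave.jar` command and the requirement that the agent's stdin/stdout end up connected to the script's stdin/stdout.

```python
import os
import sys


def build_launch_command(host, jar_path="~/bin/slave.jar"):
    """Return the argv that starts the slave agent on `host` over ssh."""
    return ["ssh", host, "java", "-jar", jar_path]


def main(argv):
    """Hypothetical entry point: argv[1] is the slave host name."""
    if len(argv) != 2:
        sys.stderr.write("usage: launch_slave.py HOST\n")
        return 2
    command = build_launch_command(argv[1])
    # execvp replaces this process with ssh, so ssh inherits the stdin and
    # stdout that Hudson opened and uses as the channel to the slave agent.
    os.execvp(command[0], command)
```

To use it as a script, one would add a call such as `sys.exit(main(sys.argv))` at the bottom and point the node's launch command at it. Replacing the process with `execvp` (rather than spawning a child and copying bytes) is one simple way to keep the byte stream untouched.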

A copy of slave.jar can be downloaded from http://yourserver:port/jnlpJars/slave.jar . Many people write scripts in such a way that this 160K jar is downloaded during the script, to make sure the consistent version of slave.jar is always used. The SSH Slaves plugin does this automatically, so slaves configured using this plugin always use the correct slave.jar.
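For scripts that fetch slave.jar on every launch, the download step might look like the following sketch. Only the `/jnlpJars/slave.jar` path comes from the text; the function names, the base URL argument, and the destination path are placeholders chosen for this example.

```python
import shutil
import urllib.request


def slave_jar_url(base_url):
    """Build the slave.jar download URL from the Hudson root URL."""
    return base_url.rstrip("/") + "/jnlpJars/slave.jar"


def fetch_slave_jar(base_url, destination):
    """Download slave.jar from the master and save it to `destination`."""
    with urllib.request.urlopen(slave_jar_url(base_url)) as response:
        with open(destination, "wb") as out:
            # Stream the response to disk instead of reading it all at once.
            shutil.copyfileobj(response, out)
```

Calling `fetch_slave_jar("http://yourserver:port", "slave.jar")` before starting the agent would keep the jar in step with the master, at the cost of one small HTTP request per launch.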

Updating slave.jar
Technically speaking, in this set up you should update slave.jar every time you upgrade Hudson to a new version. However, in practice slave.jar changes infrequently enough that it's also practical not to update until you see a fatal problem in start-up.

Launching slaves this way often requires an additional initial set up on slaves (especially on Windows, where a remote login mechanism is not available out of the box), but the benefit of this approach is that when the connection goes bad, you can use Hudson's web interface to re-establish the connection.

Launch slave agent via Java Web Start

Another way of doing this is to start a slave agent through Java Web Start (JNLP). In this approach, you interactively log on to the slave node, open a browser, and open the slave page. You'll then be presented with the JNLP launch icon. When you click it, Java Web Start kicks in and launches a slave agent on the computer where the browser is running.
This mode is convenient when the master cannot initiate a connection to slaves, such as when it runs outside a firewall while the slaves are inside the firewall. On the other hand, if the machine with a slave agent goes down, the master has no way of re-launching it on its own.

On Windows, you can do this manually once, then from the launched JNLP slave agent, you can install it as a Windows service so that you don't need to interactively start the slave from then on.

If you need display interaction (e.g. for GUI tests) on Windows and you have a dedicated (virtual) test machine, this is a suitable option. Create a hudson user account, enable auto-login, and put a shortcut to the JNLP file in the Startup items (after having trusted the slave agent's certificate). This allows one to run tests as a restricted user as well.

Launch slave agent headlessly

This launch mode uses a mechanism very similar to Java Web Start, except that it runs without using GUI, making it convenient for an execution as a daemon on Unix. To do this, configure this slave to be a JNLP slave, take slave.jar as discussed above, and then from the slave, run a command like this:

$ java -jar slave.jar -jnlpUrl http://yourserver:port/computer/slave-name/slave-agent.jnlp 

Make sure to replace "slave-name" with the name of your slave.
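If you generate this command for several slaves, a small helper can assemble it. The sketch below is an illustration only: the function name is made up, and percent-encoding the slave name (for names containing spaces) is an assumption about how the URL should be formed, not something the page states.

```python
from urllib.parse import quote


def jnlp_command(base_url, slave_name, jar_path="slave.jar"):
    """Return the argv for launching a headless JNLP slave agent."""
    # Assumption: characters such as spaces in the slave name are
    # percent-encoded in the path segment of the URL.
    url = "%s/computer/%s/slave-agent.jnlp" % (
        base_url.rstrip("/"),
        quote(slave_name, safe=""),
    )
    return ["java", "-jar", jar_path, "-jnlpUrl", url]
```

The resulting list can be passed to `subprocess.call` or joined into a line for an init script.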

Other Requirements

Also note that the slaves are a kind of a cluster, and operating a cluster (especially a large one or heterogeneous one) is always a non-trivial task. For example, you need to make sure that all slaves have JDKs, Ant, CVS, and/or any other tools you need for builds. You need to make sure that slaves are up and running, etc. Hudson is not a clustering middleware, and therefore it doesn't make this any easier.

Example: Configuration on Unix

This section describes my current set up of Hudson slaves that I use inside Sun for my day job. My master Hudson node is running on a SPARC Solaris box, and I have many SPARC Solaris slaves, Opteron Linux slaves, and a few Windows slaves.

  • Each computer has a user called hudson and a group called hudson. All computers use the same UID and GID. (If you have access to NIS, this can be done more easily.) This is not a Hudson requirement, but it makes the slave management easier.
  • On each computer, /var/hudson directory is set as the home directory of user hudson. Again, this is not a hard requirement, but having the same directory layout makes things easier to maintain.
  • All machines run SSHD. Windows slaves run cygwin sshd.
  • All machines have ntp client installed, and synchronize clock regularly with the same NTP server.
  • Master's /var/hudson has all the build tools beneath it --- a few versions of Ant, Maven, and JDKs. JDKs are native programs, so I have JDK copies for all the architectures I need. The directory structure looks like this:
    /var/hudson
      +- .ssh
      +- bin
      |   +- slave  (more about this below)
      +- workspace (hudson creates this file and store all data files inside)
      +- tools
          +- ant-1.5
          +- ant-1.6
          +- maven-1.0.2
          +- maven-2.0
          +- java-1.4 -> native/java-1.4 (symlink)
          +- java-1.5 -> native/java-1.5 (symlink)
          +- native -> solaris-sparcv9 (symlink; different on each computer)
          +- solaris-sparcv9
          |   +- java-1.4
          |   +- java-1.5
          +- linux-amd64
              +- java-1.4
              +- java-1.5
  • Master's /var/hudson/.ssh has private/public key and authorized_keys so that a master can execute programs on slaves through ssh, by using public key authentication.
  • On master, I have a little shell script that uses rsync to synchronize master's /var/hudson to slaves (except /var/hudson/workspace). I use this to replicate tools on all slaves.
  • /var/hudson/bin/launch-slave is a shell script that Hudson uses to execute jobs remotely. This shell script sets up PATH and a few other things before launching slave.jar. Below is a very simple example script.
    #!/bin/bash
    JAVA_HOME=/opt/SUN/jdk1.6.0_04
    PATH=$PATH:$JAVA_HOME/bin
    export PATH
    java -jar /var/hudson/bin/slave.jar
  • Finally all computers have other standard build tools like svn and cvs installed and available in PATH.
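The rsync step in the list above is not shown in the original. A minimal sketch of how such a command could be assembled is given below; the function name, the `hudson@host` login, and the choice of the `-a` and `--exclude` options are assumptions for illustration, not the author's actual script.

```python
def rsync_command(host, root="/var/hudson", user="hudson"):
    """Return an rsync argv that copies `root` to `host`, skipping workspace."""
    return [
        "rsync",
        "-a",                          # archive mode: keep permissions, symlinks, times
        "--exclude", "/workspace/",    # anchored at the transfer root, i.e. root/workspace
        root + "/",                    # trailing slash: copy the contents of root
        "%s@%s:%s/" % (user, host, root),
    ]
```

Running the returned command once per slave (for example with `subprocess.check_call`) replicates the tools tree while leaving each slave's own workspace alone.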

Scheduling strategy

Some slaves are faster, while others are slow. Some slaves are closer (network wise) to a master, others are far away. So doing a good build distribution is a challenge. Currently, Hudson employs the following strategy:

  1. If a project is configured to stick to one computer, that's always honored.
  2. Hudson tries to build a project on the same computer on which it was previously built.
  3. Hudson tries to move long builds to slaves, because the amount of network interaction between a master and a slave tends to grow logarithmically with the duration of a build (in other words, even if project A takes twice as long to build as project B, it won't require double the network transfer). So this strategy reduces the network overhead.
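The three rules can be read as an ordered set of checks. The sketch below is only an illustration of that ordering, not Hudson's actual scheduler: the `Project` fields, the node name `"master"`, and the 10-minute threshold for a "long" build are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Project:
    name: str
    pinned_node: Optional[str] = None    # rule 1: configured to stick to one computer
    last_built_on: Optional[str] = None  # rule 2: where the previous build ran
    expected_minutes: float = 0.0        # rule 3: rough build duration


def choose_node(project: Project, idle_nodes: List[str],
                long_build_minutes: float = 10.0) -> Optional[str]:
    """Pick a node name from idle_nodes, or None if the build has to wait."""
    # Rule 1: a pinned project only ever runs on its configured node.
    if project.pinned_node is not None:
        return project.pinned_node if project.pinned_node in idle_nodes else None
    # Rule 2: prefer the node that built this project last time.
    if project.last_built_on in idle_nodes:
        return project.last_built_on
    # Rule 3: push long builds onto slaves when one is available.
    slaves = [node for node in idle_nodes if node != "master"]
    if project.expected_minutes >= long_build_minutes and slaves:
        return slaves[0]
    return idle_nodes[0] if idle_nodes else None
```

The point of the ordering is that an explicit pin always wins, affinity to the previous node comes next, and the network-cost heuristic only breaks the remaining ties.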

If you have interesting ideas (or better yet, implementations), please let me know.

Transition from master-only to master/slave

Typically, you start with a master-only installation and then much later you add slaves as your projects grow. When you enable the master/slave mode, Hudson automatically configures all your existing projects to stick to the master node. This is a precaution to avoid disturbing existing projects, since most likely you won't be able to configure slaves correctly without trial and error. After you configure slaves successfully, you need to individually configure projects to let them roam freely. This is tedious, but it allows you to work on one project at a time.

Projects that are newly created on master/slave-enabled Hudson will be by default configured to roam freely.

Master on public network, slaves within firewall

One might consider setting up the Hudson master on the public network (so that people can see it), while leaving the build slaves within the firewall (because having a lot of machines on the internet is expensive.) There are two ways to make it work:

  • Allow port-forwarding from the master to your slaves within the firewall. The port-forwarding should be restricted so that only the master with its known IP can connect to slaves. With this set up in the firewall, as far as Hudson is concerned it's as if the firewall doesn't exist.
  • Use JNLP slaves and have slaves connect to the master, not the other way around. In this case it's the slaves that initiate the connection, so it works correctly with a NAT firewall.

Note that in both cases, once the master is compromised, all your slaves can be easily compromised (in other words, a malicious master can execute arbitrary programs on slaves), so both set ups leave much to be desired in terms of isolating a security breach. The Build Publisher Plugin provides another way of doing this, in a more secure fashion.

Running Multiple Slaves on the Same Machine

It is possible to run multiple slave instances on a Windows machine, and have them installed as separate Windows services so they can start up on system startup. While the correct use of executors largely obviates the need for multiple slave instances on the same machine, there are some unique use cases to consider:

  • You want more configurability between the configured nodes. Say you have one node set to be used as much as possible, and the other node to be used only when needed.
  • You may have multiple Hudson master installations building different things, and so this configuration would allow you to have slaves for more than one master on the same box. That's right, with Hudson you really can serve two masters.

Follow these steps to get multiple slaves working on the same Windows box:

  • Add the first slave node in Hudson and give it its own working dir (e.g. hudson-slave-a).
  • Go to the slave page from the slave box and launch by JNLP, then use the menu to install it as a service instead.
  • Once the service is running, you'll get hudson-slave.exe and hudson-slave.xml in your slave's work dir.
  • Bring up windows services and stop the Hudson Slave service.
  • Open a shell prompt, cd into the slave work dir.
  • First run "hudson-slave.exe uninstall" to uninstall the one that the jnlp-launched app installed. This should remove it from the service list.
  • Now edit hudson-slave.xml. Modify the id and name values so that your multiple slaves are distinct. I called mine hudson-slave-a and Hudson Slave A.
  • Run hudson-slave.exe install and then check the Windows service list to ensure it is there. Start it up, and watch Hudson to see if the slave instance becomes active.
  • Now repeat this process for a second slave, beginning with configuring the new node in the master config.

When you create the second node, it is convenient to use the option to copy an existing node and copy the first node you set up. Then you just tweak the Remote FS Root and a couple of other settings to make it distinct. When you are done you should have two (or more) Hudson slave services in the list of Windows services.
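The hudson-slave.xml edit in the steps above can also be scripted when you repeat it for several slaves. The sketch below assumes the file has a single root element with direct `id` and `name` child elements; that layout is an assumption for illustration, so check your generated file before relying on it. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def rename_service(xml_text, service_id, display_name):
    """Return the service XML with new <id> and <name> values."""
    root = ET.fromstring(xml_text)
    # Assumption: <id> and <name> are direct children of the root element.
    root.find("id").text = service_id
    root.find("name").text = display_name
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree drops XML comments and the XML declaration when re-serializing, so for a one-off change editing the file by hand, as described above, is just as good.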

Troubleshooting tips

Some interesting pages on issues (and resolutions) occurring when using Windows slaves:

Some more general troubleshooting tips:

  1. Every time Hudson launches a program locally/remotely, it prints out the command line to the log file. So when a remote execution fails, log in to the computer that runs the master using the same user account, and try to run the command from your shell. You tend to solve problems quickly this way.
  2. Each slave has a log page showing the communication between the master and the slave agent. This log often shows error reports.
  3. If you use binary-unsafe remoting mechanism like telnet to launch a slave, add the -text option to slave.jar so that Hudson avoids sending binary data over the network.
  4. When the same command runs outside Hudson just fine, make sure you are testing it with the same user account as Hudson runs under. In particular, if you run Hudson master on Windows, consult How to get command prompt as the SYSTEM user.
  5. Feel free to send your trouble to users@hudson.dev.java.net

Other readings

Starting from Software Engineering and Computer Science (forwarded)

From: zhaoce (米高蜥蜴), Board: Java
Subject: Starting from Software Engineering and Computer Science
Posted from: BBS 未名空间站 (Wed Feb 1 15:12:09 2012, US Eastern)

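Today I pushed the graphic-design requirements off to the outsourcing team
So I have a little spare time in the middle of a busy day to type a few words
Let's talk about Java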

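Let me start with the difference between software engineering and computer science
I don't know how other schools set up their majors
But at my school the CS department has only two majors: one is software engineering, the other is traditional CS
CS is not only offered in the CS department; the math department also has a CS major
But the software engineering major exists only in the CS department
Look at the history of software engineering and you will find
that it was born when the computing trade was in a huge crisis and could barely keep developing
An IBM team of around 1000 people already counted as a "super-large" project
That is unimaginable today, when projects with more than 1000 people are everywhere
But back then, that was almost the limit of collaboration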

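In fact, traditional CS is not an applied science like engineering at all; it is mathematics
Because what CS tinkers with is algorithms, which is mathematics. In a real math department
the best people all work on geometry, algebra, and analysis
That is the level of mathematical theory, which is what pure mathematics majors do
The next best go into statistics. Statistics is already applied mathematics; statistics and everything below it is applied mathematics
Math departments have long had the saying that theorists look down on applied people, because the theorists take all the fame
and those left to do applied work are just out for profit. So when you get to statistics, the proportion of women rises sharply
because although the proofs are hard, using the results really is not
Those who can manage neither theory nor statistics go into computational mathematics, which is about algorithms and the like
Later a trade called CS grew out of it, so today's CS came out of that trade
It began as a discipline created to tinker with computation and make computation convenient
So statistics majors still need to study mathematical analysis, whereas by the time you reach computational mathematics and the CS major
mathematical analysis is dropped altogether, because it is no longer used at all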

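Once you are in an applied field, you run into the problems common to every kind of engineering
namely how to collaborate more efficiently. Generally speaking
communication cost is exponentially related to team size
That is, as the team grows, communication cost rises exponentially
The difference in communication cost between 2 people cooperating and 100 people cooperating is not merely a factor of 50; it may well be a factor of e^50
So a 1000-person team easily becomes a "super-large" project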

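Fortunately, human history has produced not only mathematics but also physics, and applied physics gave rise to engineering
So when computing hit this bottleneck, there was plenty of experience to borrow directly
That is how the two majors of computer engineering and software engineering developed
The former sits in the EE department, the latter stays in the CS department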

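So reducing communication cost became the top priority. At this point software engineering had in fact left the traditional CS domain
that is, the domain of mathematics. The software engineering major no longer emphasizes mathematics, but the application of knowledge from other disciplines
Not that mathematics is unimportant, but its importance has receded; it is no longer the most important thing
The most important thing is how to collaborate, so all sorts of new things were born at the same time
such as object orientation and modularization, whose original purpose was to reduce communication cost
Reducing communication cost means: I don't want to know, and don't need to know, what you are doing; I only need to know what you can give me
At most I also handle what to do if what you give me does not meet my requirements
As for how you implemented it? None of my damn business
This is the core difference between software engineering and computer science
Computer science is a science: what it does every day is explore the secrets of the universe. How something is implemented does not matter; what matters is whether it can be done
and why it can be done, so that is very much your business
Software engineering is engineering: it is all about collaboration. The key is not whether something can be done but how to do it better
As for why it can be done, that can be none of your damn business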

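That is why object orientation is so important: it encapsulates other people's work and lowers communication cost
That is why Java is so popular. Of course this is only one reason; another is that its syntax is close to C
Most high-level languages that have ever been popular are C-like
So encapsulation matters, and when you learn Java you should think about how to make use of other people's work
rather than redoing what others have already done, which is extremely stupid and pretentious
Because if you really want to show off, go to the math department, where I'm afraid you will find yourself hopelessly stupid
And if you don't want to show off, don't waste time reading other people's source code, because understanding someone else's source takes longer
than writing one yourself. But do read the documentation and get fluent with javadoc; that is the effort others made to lower communication cost
So whether you are learning Spring or JBoss, don't read the source code; learn directly by working through examples
Read the source when you have time. Of course there is some risk in this: what if something goes wrong?
Errors are part of engineering; there is no such thing as a project without errors
There will always be some. Don't expect to fix everything at once; get most of it done and fix the small remainder slowly
Don't let the overall architecture get bent out of shape; if the general direction is right, small details can wait
Besides, we have a stage called testing. In fact the Java ecosystem is just about the best embodiment of every stage of software engineering
You will find that requirements are actually the hardest part. Once requirements are settled, the later stages become wonderfully pleasant
If the requirements are a mess, just wait for the tears later; more than half the projects on this planet fail
A classic interview question is: which stage of software engineering do you think is the hardest?
The standard answer is always: the earlier the stage, the harder; the later, the easier
Anyone who has done projects knows that when business requirements change, you feel like killing someone
This explains why software engineering emphasizes knowledge from other disciplines rather than mathematics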

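In real life, the most widespread applications of mathematics generally show up first in finance
because banking is a country's lifeline and every trade has to deal with banks
So both statistics and software engineering have corresponding applied disciplines in business: for the former it is actuarial science
for the latter the business school has a major called information system
This major exists specifically to hook up with the software engineering major in IT
What it teaches students every day is how to translate business requirements into a form IT can understand
So learning some business knowledge when you have time is a big help, whether for job hunting or for promotion
So after learning Java, the ideal place to look for a job is also a bank, only not an investment bank
but the commercial banks such as HSBC and Citi. These banks have their own IT departments, and many even have IT subsidiaries
such as 汇丰电子 (HSBC's IT subsidiary)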

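Then there is a core issue here: confidentiality
Imagine what would happen if a bank's core technology were in the hands of a single IT company
Would those shrewd bankers fall for that?
So long ago, countless people pressed Sun, demanding that Sun open the Java source code
Later, when Sun announced that Java was going open source, countless people cheered
As a commercial company Sun was a failure, but as a university research institute it was very competent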

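Now about commercial companies. What is the purpose of a public commercial company?
Simple: to maximize the company's value. That is the standard textbook answer
It is also where the shareholders' interest lies; otherwise nobody would be willing to buy the company's stock
So chasing profit is something a commercial company must do; otherwise the board will simply kick out the CEO
just as it did with Jobs: get out
So if an IT company gained pricing power over a bank, what would it do?
As long as there is profit to be had, it will swing the butcher's knife without hesitation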

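Fortunately, this planet also has something called open source
Thanks to the efforts of these communists, we are no longer at someone else's mercy
You want to charge? I'll switch
But is open source absolutely reliable?
Well, as someone with long-held left-leaning views, I firmly believe one thing
namely that the wisdom of the masses is boundless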

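But there is another group of people on this planet
who feel uneasy all over unless they pay for something, and think it cannot be trusted otherwise
Only after paying for someone else's product do they feel comfortable
In fact the very existence of such baffling, common-sense-defying creatures is a small miracle
Any capitalist who meets one will laugh in his sleep: a fat pig just waiting for slaughter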

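In fact the existence of open source is good for software engineering
because open source rarely reaches into business logic. Have you seen open-source bank front-office software, or open-source actuarial software? It's not that none exists
but the complexity of reality makes it hard to build a universal product for requirements that change so often
Besides, how many IT people understand business? Even if it gets built, how many others can understand it? And of those who can, how many are willing to contribute?
Look at other fields such as logistics, chemical engineering, and transportation, and there is even less
The real impact of open source is on the traditional CS people who build system software
Think how much is already open source: DBs, OSes, and app servers are everywhere
These were once the products that fed CS people for years
How many companies are still making DBs now? How many are making OSes?
Application software is not much better off
The browser, once a money-maker, is something no company charges for any more; the same goes for FTP software, IM chat software...
These are fields most IT people know a little about, and there are plenty of open-source enthusiasts among them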

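That's all for now. Next time I'll talk about why you shouldn't go "learn" certain things, such as .NET
In fact saying you are going to learn it is itself a joke: with products from a single company, the user interface can be made completely foolproof
Learning .NET is as laughable as saying you are going to learn to download porn with BT; many open-source products don't need "learning" either
It's just that open-source products are being made foolproof a bit more slowly, but they are getting there
They are all tools: understand when to use them and how to use them, and the rest...
Quick, show-offs, it's your cue: come tell us how many ways there are to write the character 回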
--

※ Source: WWW 未名空间站 Overseas: mitbbs.com China: mitbbs.cn [FROM: 132.205.]

Multiple Hudson instances on one machine

Hi,

We've installed multiple instances of Hudson on a single Linux server without problems. We:

1) created a new user for each instance of Hudson we required
2) installed a new version of Tomcat and Hudson for each user.

Hudson's home directory is a subdirectory of the user's home directory, so there were no conflicts  :-)

I couldn't say how you'd do this on a Windows Server.

The only problem we had was when we needed to allocate ports when running unit tests. If two different Hudson instances wanted exclusive access to use the same port at the same time then we got false failures of some of our tests.
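One common way to sidestep this kind of clash, offered here as a suggestion rather than something the poster describes, is to let the operating system hand each test run an unused port instead of hard-coding one. A minimal sketch (the function name is hypothetical):

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is unused right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 tells the OS to choose any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

There is still a small window between releasing the port and the test server binding it again, so this reduces collisions between instances but does not rule them out entirely.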

Wednesday, February 1, 2012

How to check if all Hudson jobs have a timeout? « Jan's Blog

Recommending a handy automated build tool, Hudson (so much information here, requesting featured status) - Build and Release Management - SCMLife.com

HudsonIntro.pdf application/pdf Object

RunningBuildbotOnWindows – Buildbot

Buildbot on Windows - Installation Instructions.

Buildbot runs on Windows, as both a slave and a master. The master can run on any convenient platform.

Prerequisites for a slave and for a master

  • Python 2.x. Use the latest 2.x version of Python.
  • PyWin32. Match the version with the python you installed.
  • Twisted. Match the version with the python you installed.
  • Zope.Interface. Install setuptools first, then install this with "C:\python27\scripts\easy_install zope.interface".
  • Buildbot, of course. You need 0.8.2 or later.

Additional Prerequisites for a master

  • Jinja2. Install setuptools first, then install this with "C:\python27\scripts\easy_install Jinja2".
  • PyOpenSSL. Install setuptools first, then install this with "C:\python27\scripts\easy_install PyOpenSSL".

For the remainder of this document we will assume python was installed in "C:\Python27". Please adjust accordingly if you install to a different path.

Install python, pywin32, and Twisted for all users. On Windows Server 2003 it was found that this MUST be done while logged in as the actual "Administrator" account, NOT as a regular user with Administrator privileges, because of the way filesystem permissions are inherited during egg installation. The buildbot service itself need not be, and probably should not be, a privileged account (see below). When running python setup.py, be sure to use the full path to the version of python you expect to run buildbot (It is ok to have multiple versions of python installed). The pywin32 and Twisted installers will be matched for the version of python that will be running buildbot.

Install Buildbot itself with

C:\python27\python setup.py install 

The following steps were successfully used on Windows Server 2003 SP2 (32-bit) using python 2.7 for buildbot 0.8.3p1. They should work for later versions of buildbot. Note the use of an absolute path to the python executable during installation; this is required if you have multiple side-by-side installations of python.

  1. Log in as Administrator directly (ctrl-alt-delete from the login screen). Doing this as any other user, even a user with Administrator privileges, will cause the eggs to be installed with incorrect permissions, and a non-privileged buildbot service will fail to launch.
  2. download the above packages via firefox, for example into c:\Documents and Settings\Administrator\Desktop. IE will try to turn the egg file into a zip file.
  3. run a cmd prompt as Administrator
  4. type cd Desktop
  5. type .\python-2.7.1.msi Click Run, Leave default Install for All Users, click Next, leave default Python27 destination, click Next, click Next, click Finish.
  6. type .\pywin32-214.win32-py2.7.exe Click Next, Leave default of c:\Python27\ click Next, click Next, click Finish.
  7. type .\Twisted-10.2.0.winxp32-py2.7.exe Click Next, Leave default of C:\Python27\, click Next, click Next, click Finish.
  8. type .\setuptools-0.6c11.win32-py2.7.exe Click Next, Leave default of C:\Python27\, click Next, click Next, click Finish.
  9. type c:\Python27\scripts\easy_install.exe zope.interface-3.6.1-py2.7-win32.egg
  10. if you are installing a master, install Jinja2 using the easy_install method from the Jinja2 website.
  11. if you are installing a master, install the correct version of pyOpenSSL from the 2.7 egg, from here https://launchpad.net/pyopenssl . This is necessary for SSL email to work (build failure emails).
  12. if you are installing a master, unzip buildbot-0.8.5.zip in this folder. type cd buildbot-0.8.5\buildbot-0.8.5, type c:\python27\python.exe setup.py install
  13. if you are installing a slave, unzip buildbot-slave-0.8.5.zip in this folder. type cd buildbot-slave-0.8.5\buildbot-slave-0.8.5, type c:\python27\python.exe setup.py install
  14. reboot

Test Your Installation

Verify the slave installation. For example:

  C:\Documents and Settings\Administrator\Desktop>c:\python27\scripts\buildslave --version
  Buildslave version: 0.8.3
  Twisted version: 10.2.0
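If you check several machines, the version line can be compared against the "0.8.2 or later" requirement automatically. The sketch below parses text in the shape of the sample output above; the function names are made up for this example, and any suffix such as "p1" after the numeric version is simply ignored.

```python
import re


def parse_buildslave_version(output):
    """Extract the Buildslave version from `buildslave --version` output."""
    match = re.search(r"Buildslave version:\s*(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=(0, 8, 2)):
    """Return True if the parsed version is at least `minimum`."""
    return version is not None and version >= minimum
```

The text passed in would typically come from running the command with `subprocess.check_output` and decoding the result.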

Create a User

The rest of this document assumes you are running as a non-privileged user. We recommend creating one specifically for your build slave.

Setup

Basic setup under Windows is the same as under other operating systems, except that BuildBot currently cannot manage masters or slaves whose paths have spaces in them. If you absolutely must use a path with spaces, you can map that path to a drive letter and refer to it that way. Run buildslave create-slave to create your slave directory. Edit the info files in the slave directory as appropriate. Then start the buildslave process. Check twistd.log for any error messages.

You could stop here, but any time the command prompt that buildslave is running under closes, your slave will shut down, and this includes reboots for Windows Update! It is therefore highly desirable to install buildbot as a Windows service. When installed as a service, Windows will automatically restart your slave after a reboot.

Service

You'll need to grant your non-privileged user the ability to run services. To do this, run secpol.msc as an Administrator (the author found the easiest way to do this was to right-click a command line prompt and "run as administrator", and then enter secpol.msc in the command line). Then:

  1. Select the "Local Policies" folder
  2. Select the "User Rights Assignment" folder
  3. Double click "Log on as a service"
  4. Use "Add User or Group..." to add your non-privileged user here.

If you don't already have one open, open up a command shell running as administrator. Then use a builtin buildbot script to install a service for buildbot:

c:\python27\python.exe c:\python27\scripts\buildbot_service.py --user YOURDOMAIN\theusername --password thepassword --startup auto install 

Enter the command above as a single line. Note that the "YOURDOMAIN\" part is required. For most setups, you can replace "YOURDOMAIN" with ".", making the full user name ".\theusername" (where theusername is the username of the non-privileged user, and thepassword is that user's login password). Ignore any messages about "Failed to register with the windows firewall".

The next bit of setup involves adding permissions on a certain registry key. However, the registry key does not yet exist because the service has never been started. Under Administrative Tools find Services and run it as an administrator. Then start the "Buildbot" service. Windows will immediately complain that the service shut down immediately; that's fine.

In your administrator command prompt, run "regedit". Use it to navigate to "HKEY_LOCAL_MACHINE\System\CurrentControlSet\services"; in that long list of services you will find a "Buildbot" entry (i.e. folder). Right click the Buildbot folder and select permissions. In the dialog that comes up, add a new user to the list, and then grant that user "Full Control".

Now try again to start the service. Windows will again complain that the service shut down immediately; that's fine. This step is required in order to create the "Parameters" key we'll use in the next step.

Under that "Buildbot" service entry in the registry you will find a "Parameters" key (you may need to hit F5 to refresh so you can see it). There are currently no values under this key. You will need to *create* a 'String Value' under "Parameters" which is named "directories". Set the value of "directories" to be the full path to your slave's configuration directory. This is the same directory where buildbot.tac and twistd.log live.

You will now need to setup a buildbot master and/or slaves into the directory you specified. See Build Bot Tutorial: First Run for more information.

Finally, go back to your "Services" running as administrator and right-click "Buildbot" to start your service -- or just reboot.

Note: any environment variables that need to be set on the slave machine for your bot's build commands to work have to be set in the global System Environment settings or (better) in the unprivileged user's User Environment settings.

In a virtualenv

If you choose to run your service in a virtualenv, there are a few further issues you'll need to take care of.

First, PyWin32 won't install directly into a virtualenv, so you'll have to install it globally.

The service runs an executable installed by PyWin32 to c:\Python26\Lib\site-packages\win32\PythonService.exe. When you're doing the regedit steps above you'll see that path in the ImagePath key. PythonService.exe looks for python.exe in a folder relative to where it is installed, so you'll need to create some symlinks (or on pre-Vista systems, copies). Change that ImagePath to point into your virtualenv, e.g. C:\Users\buildslave\venv\Lib\site-packages\win32\PythonService.exe. Then symlink c:\Python26\Lib\site-packages\win32 into the corresponding place in your virtualenv. Finally, make a symlink for python, which is in your virtualenv's Scripts\ directory, in the root of the virtualenv.

Next, you'll want to update the service's environment so that the virtualenv is in the PATH. This article shows how to do that. Note: any environment variables set here will not propagate to the build commands issued by the bot. Those should be set in the User Environment of the unprivileged user as described above.

Many Buildslaves

Once you have mastered the basics of setting up a buildslave as a Win32 service, you might want to set up more buildslaves on the same system. Do not go to the trouble of cloning your BuildBot service - merely add your other slave configurations to the "directories" string value, separated by semicolons. For example:

directories   REG_SZ   C:\BuildBot\Proj1;C:\BuildBot\Proj2;C:\BuildBot\Proj3 

Consider using VMs to manage your slaves. Create a clean base installation and clone it for each new build.

Restart the service and your new buildslaves will connect.
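When the list of slave directories is generated rather than typed, the "directories" string can be built and sanity-checked in a few lines. This sketch only produces the semicolon-separated value shown above; the function name is hypothetical, and the check for spaces reflects the earlier remark that BuildBot cannot manage paths containing them.

```python
def directories_value(paths):
    """Join slave directories into the semicolon-separated registry value."""
    for path in paths:
        if " " in path:
            # Paths with spaces are not supported, per the Setup section.
            raise ValueError("path contains a space: %r" % path)
        if ";" in path:
            raise ValueError("path contains the separator ';': %r" % path)
    return ";".join(paths)
```

The returned string is what you would paste into (or write to) the "directories" value under the service's "Parameters" key.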

Disabling JIT dialogs on the buildslave

If you are running tests that occasionally fail with exceptions on the buildslave, you will want to disable the Just-In-Time (JIT) debugger that is configured with Visual Studio. When an unhandled exception occurs, the default JIT configuration will launch a dialog that will wait for user confirmation, effectively halting your build.

Follow these steps to disable the JIT:

  1. Run regedit (as Administrator)
  2. DELETE HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug\Debugger
  3. DELETE HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\DbgManagedDebugger
  4. SET HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\DbgJITDebugLaunchSetting to 0x1 (right click value, select Modify, hexadecimal, 1), ok
  5. If you are running a 64 bit version of windows DELETE HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug\Debugger
  6. If you are running a 64 bit version of windows DELETE HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\DbgManagedDebugger
  7. Reboot

These settings are clarified at the following two sites. It should be noted that the registry changes from BOTH sites must be done.

http://msdn.microsoft.com/en-us/library/2ac5yxx6(v=VS.90).aspx

http://msdn.microsoft.com/en-us/library/5hs4b7a6(v=VS.90).aspx

Other Registry Settings

You will probably want to disable the Windows Error Reporting facility on Windows 7. This will prevent the annoying "Report this error" dialog box. Under HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\Windows Error Reporting, add a REG_DWORD named "Disabled" and set it to "1".

Troubleshooting

If you have any issues with the service installation or buildbot-as-a-service startup, you can see the actual problem using Windows' "Event Viewer". It's under "Windows Logs" and then "Applications"; look for problems of type "Error" with a source of "Buildbot". Please report any such errors when asking for help on the mailing list.

The buildbot master.cfg needs the slave information; the master will ignore a slave it doesn't know about.

If the buildbot service fails to start, but the script runs for the same user when logged in, it may be due to permissions that were in effect when the buildbot package and prerequisites were installed. See "Prerequisites" above.

Visual Studio build configs such as "Win32|Any CPU" will need to be escaped thus: "Win32^|Any CPU". Consider saving yourself development time by using python instead of bat and cmd for slave scripts.
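If you do move slave scripts to Python as suggested, the caret escaping can live in one small helper. This is a minimal sketch that handles only the pipe character mentioned above; the function name is made up for the example.

```python
def escape_for_cmd(config):
    """Prefix each pipe with a caret so cmd.exe treats it literally."""
    return config.replace("|", "^|")
```

For example, `escape_for_cmd("Win32|Any CPU")` yields the escaped form shown in the paragraph above.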