<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>ǝɹǝɥ ʇxǝʇ lnɟʇɥƃısuı
</description><title>The Cluster Guy</title><generator>Tumblr (3.0; @theclusterguy)</generator><link>http://theclusterguy.clusterlabs.org/</link><item><title>Pacemaker 1.0.12 Released</title><description>&lt;p&gt;Thanks once again to the efforts of Keisuke MORI from NTT, the latest bug fixes have been back-ported from 1.1 and another instalment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.0 release series is now ready for general consumption.&lt;/p&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 96 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;121 files changed, 8617 insertions(+), 988 deletions(-)&lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;p&gt;Important changes since Pacemaker-1.0.11 include:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;cib: Call gnutls_bye() and shutdown() when disconnecting from remote TLS connections&lt;/li&gt;
&lt;li&gt;cib: Remove disconnected remote connections from mainloop&lt;/li&gt;
&lt;li&gt;crmd: Cancel timers for actions that were pending on dead nodes&lt;/li&gt;
&lt;li&gt;crmd: Do not wait for actions that were pending on dead nodes&lt;/li&gt;
&lt;li&gt;crmd: Ensure we do not attempt to perform action on failed nodes&lt;/li&gt;
&lt;li&gt;PE: Correctly recognise which recurring operations are currently active&lt;/li&gt;
&lt;li&gt;PE: Demote from Master does not clear previous errors&lt;/li&gt;
&lt;li&gt;PE: Ensure restarts due to definition changes cause the start action to be re-issued not probes&lt;/li&gt;
&lt;li&gt;PE: Ensure role is preserved for unmanaged resources&lt;/li&gt;
&lt;li&gt;PE: Ensure unmanaged resources have the correct role set so the correct monitor operation is chosen&lt;/li&gt;
&lt;li&gt;PE: Move master based on failure of colocated group&lt;/li&gt;
&lt;li&gt;pengine: Correctly determine the state of multi-state resources with a partial operation history&lt;/li&gt;
&lt;li&gt;PE: Only allocate master/slave resources once&lt;/li&gt;
&lt;li&gt;Shell: implement -w, --wait option to wait for the transition to finish&lt;/li&gt;
&lt;li&gt;Shell: repair template list command&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;You can also see the &lt;a href="https://github.com/ClusterLabs/pacemaker-1.0/blob/master/ChangeLog"&gt;full changelog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have updated the &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release calendar&lt;/a&gt; and the next 1.0.x release is planned for mid-May 2012.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/ClusterLabs/pacemaker-1.0/tarball/Pacemaker-1.0.12"&gt;source tarball&lt;/a&gt; is also available directly from GitHub.&lt;/p&gt;

&lt;p&gt;Pre-built packages for Pacemaker are available immediately for current &lt;a href="http://www.opensuse.org/"&gt;openSUSE&lt;/a&gt; (12.1, 11.4, 11.3) and &lt;a href="http://fedoraproject.org/"&gt;Fedora&lt;/a&gt; (16, 15, 14) releases as well as &lt;a href="http://fedoraproject.org/wiki/EPEL"&gt;EPEL-5&lt;/a&gt; from the &lt;a href="http://www.clusterlabs.org/rpm/"&gt;ClusterLabs Build Area&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Users of more recent distributions are &lt;strong&gt;encouraged to use the latest 1.1.x release&lt;/strong&gt; - either from the &lt;a href="http://www.clusterlabs.org/rpm-next/"&gt;1.1 Build Area&lt;/a&gt; or from the distribution directly.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/13237273559</link><guid>http://theclusterguy.clusterlabs.org/post/13237273559</guid><pubDate>Thu, 24 Nov 2011 04:43:15 +0100</pubDate></item><item><title>New Version Control System</title><description>&lt;p&gt;Since September, Pacemaker has been using &lt;a href="http://git-scm.com/"&gt;Git&lt;/a&gt; for the 1.1 and devel trees.&lt;/p&gt;

&lt;p&gt;There were some minor technical advantages over &lt;a href="http://mercurial.selenic.com/"&gt;Mercurial&lt;/a&gt; (which I still personally prefer), but mostly the decision was driven by the pain associated with switching between SCMs multiple times a day.&lt;/p&gt;

&lt;p&gt;The majority of development now happens on &lt;a href="https://github.com/ClusterLabs/pacemaker"&gt;GitHub&lt;/a&gt;, which has some great features for &lt;a href="http://github.com/features/projects/codereview"&gt;reviewing patches&lt;/a&gt; and general collaboration.&lt;/p&gt;

&lt;p&gt;The Pacemaker tree is also periodically sync’d to the &lt;a href="http://git.clusterlabs.org/"&gt;Cluster Labs&lt;/a&gt; server in case GitHub is unavailable for any reason.&lt;/p&gt;

&lt;p&gt;For those new to Git, GitHub has many tips for &lt;a href="http://help.github.com/set-up-git-redirect"&gt;setting up&lt;/a&gt; Git, creating a &lt;a href="http://help.github.com/fork-a-repo/"&gt;local copy&lt;/a&gt; of the Pacemaker repo to work in, &lt;a href="http://help.github.com/send-pull-requests/"&gt;submitting your changes&lt;/a&gt; upstream (we use the Fork + Pull Model), and other &lt;a href="http://help.github.com/git-cheat-sheets/"&gt;assorted resources&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Be sure to configure &lt;a href="http://help.github.com/set-your-user-name-email-and-github-token/"&gt;email and user&lt;/a&gt; information so you get credit for your hard work too!&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/11380792939</link><guid>http://theclusterguy.clusterlabs.org/post/11380792939</guid><pubDate>Thu, 13 Oct 2011 04:25:01 +0200</pubDate></item><item><title>New Issue Tracker</title><description>&lt;p&gt;Since it’s clearly not acceptable for our issue tracker to be offline for months at a time, it is time to replace the Bugzilla instance hosted by the Linux Foundation with something else.&lt;/p&gt;

&lt;p&gt;One candidate that came close was the GitHub issue tracker, but alas it doesn’t support attachments.
The end result is that we now have an instance of &lt;a href="http://www.bugzilla.org/"&gt;Bugzilla v4&lt;/a&gt; at:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://bugs.clusterlabs.org"&gt;http://bugs.clusterlabs.org&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bug numbers start at 5000.&lt;br/&gt;
This avoids clashing with older ones and &lt;em&gt;may&lt;/em&gt; enable us to import the old ones if the old instance ever comes back up.
I would advise people to assume this won’t happen and to re-create any unresolved issues.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/11378258276</link><guid>http://theclusterguy.clusterlabs.org/post/11378258276</guid><pubDate>Thu, 13 Oct 2011 03:32:48 +0200</pubDate></item><item><title>Pacemaker 1.0.11 Released</title><description>&lt;p&gt;The latest installment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.0 release series is now ready for general consumption.&lt;/p&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 85 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;500 files changed, 69642 insertions(+), 58270 deletions(-)&lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;p&gt;Thanks once again to the efforts of Keisuke MORI and NTT, the latest bug fixes have been back-ported from 1.1.&lt;/p&gt;

&lt;p&gt;Important changes since Pacemaker-1.0.10 include:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;cib: Repair the processing of updates sent from peer nodes&lt;/li&gt;
&lt;li&gt;crmd: All pending operations should be recorded, even recurring ones with high start delays&lt;/li&gt;
&lt;li&gt;crmd: Bug lf#2509 - Watch for config option changes from the CIB even if we’re not the DC&lt;/li&gt;
&lt;li&gt;crmd: Bug lf#2528 - Introduce a slight delay when creating a transition to allow attrd time to perform its updates&lt;/li&gt;
&lt;li&gt;crmd: Bug lf#2545 - Ensure notify variables are accurate for stop operations&lt;/li&gt;
&lt;li&gt;crmd: Bug lf#2559 - Fail actions that were scheduled for a failed/fenced node&lt;/li&gt;
&lt;li&gt;crmd: Cancel recurring operations while we’re still connected to the lrmd&lt;/li&gt;
&lt;li&gt;crmd: Don’t abort transitions when probes are completed on a node&lt;/li&gt;
&lt;li&gt;crmd: Ensure the CIB is always writable on the DC by removing a timing hole&lt;/li&gt;
&lt;li&gt;crmd: Update failcount for failed promote and demote operations&lt;/li&gt;
&lt;li&gt;PE: Bug lf#2495 - Prevent segfault by validating the contents of ordering sets&lt;/li&gt;
&lt;li&gt;PE: Bug lf#2508 - Correctly reconstruct the status of anonymous cloned groups&lt;/li&gt;
&lt;li&gt;PE: Bug lf#2544 - Prevent unstable clone placement by factoring in the current node’s score before all others&lt;/li&gt;
&lt;li&gt;PE: Bug lf#2554 - target-role alone is not sufficient to promote resources&lt;/li&gt;
&lt;li&gt;PE: Ensure fencing of the DC precedes the STONITH_DONE operation&lt;/li&gt;
&lt;li&gt;PE: Ensure that fencing has completed for stop actions on stonith-dependent resources (lf#2551)&lt;/li&gt;
&lt;li&gt;PE: Prevent clones from being stopped because resources colocated with them cannot be active&lt;/li&gt;
&lt;li&gt;PE: Prevent use-after-free resulting from unintended recursion when choosing a node to promote master/slave resources&lt;/li&gt;
&lt;li&gt;Shell: don’t create empty optional sections (bnc#665131)&lt;/li&gt;
&lt;li&gt;Tools: Bug lf#2528 - Make progress when attrd_updater is called repeatedly within the dampen interval but with the same value&lt;/li&gt;
&lt;li&gt;Tools: Prevent crm_resource commands from being lost due to the use of cib_scope_local&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;You can also see the &lt;a href="http://hg.clusterlabs.org/pacemaker/1.0/raw-file/tip/ChangeLog"&gt;full changelog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As per our &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release calendar&lt;/a&gt;, the next 1.0.x release is planned for mid-September.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://hg.clusterlabs.org/pacemaker/1.0/archive/Pacemaker-1.0.11.tar.bz2"&gt;source tarball&lt;/a&gt; is also available directly from Mercurial.&lt;/p&gt;

&lt;p&gt;Pre-built packages for Pacemaker and its immediate dependencies are available immediately for openSUSE 11.2, 11.3, Fedora-13 and EPEL-5 from the &lt;a href="http://www.clusterlabs.org/rpm/"&gt;ClusterLabs Build Area&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Users of more recent distributions are encouraged to use the latest 1.1.x - either from the &lt;a href="http://www.clusterlabs.org/rpm-next/"&gt;1.1 Build Area&lt;/a&gt; or the distribution directly.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/5129499003</link><guid>http://theclusterguy.clusterlabs.org/post/5129499003</guid><pubDate>Mon, 02 May 2011 13:08:51 +0200</pubDate></item><item><title>Pacemaker 1.1.5 Released</title><description>&lt;p&gt;The latest installment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.1 release series is now ready for general consumption.&lt;/p&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 184 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;605 files changed, 46103 insertions(+), 26417 deletions(-)&lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;p&gt;As well as the usual round of bug fixes (see the &lt;a href="http://hg.clusterlabs.org/pacemaker/1.1/raw-file/tip/ChangeLog"&gt;full changelog&lt;/a&gt;), S.U.S.E. has implemented support for ACLs.
This means that you can now delegate permission to control parts of the cluster (as defined by you) to non-root users.&lt;/p&gt;

&lt;p&gt;ACLs are still disabled by default, but you can read &lt;a href="http://www.clusterlabs.org/doc/acls.html"&gt;their documentation&lt;/a&gt;, provide feedback and decide if it’s something you want to use.&lt;/p&gt;

&lt;p&gt;As per our &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release calendar&lt;/a&gt;, the next 1.1 release is planned for mid-April and 1.0.11 should be available in March depending on how quickly we can get the bugfixes from 1.1 backported.&lt;/p&gt;

&lt;p&gt;Pre-built packages for Pacemaker and its immediate dependencies are available immediately for openSUSE 11.3, Fedora-14 and EPEL-5 from the &lt;a href="http://www.clusterlabs.org/rpm-next/"&gt;ClusterLabs Build Area&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://hg.clusterlabs.org/pacemaker/1.1/archive/Pacemaker-1.1.5.tar.bz2"&gt;source tarball&lt;/a&gt; is also available directly from Mercurial.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/3462561268</link><guid>http://theclusterguy.clusterlabs.org/post/3462561268</guid><pubDate>Wed, 23 Feb 2011 12:36:01 +0100</pubDate></item><item><title>New Logo?</title><description>&lt;p&gt;One unexpected outcome from the recent &lt;a href="http://www.linuxplumbersconf.org/2010/"&gt;Linux Plumbers&lt;/a&gt; conference was the contribution of a new logo to the project by NTT.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lbro32bsb31qzagr8.png" alt="New Logo"/&gt;&lt;/p&gt;

&lt;p&gt;Quite possibly you’re now wondering how this logo relates at all to clustering and the Pacemaker project.
Don’t worry, they came up with a backstory too!&lt;/p&gt;

&lt;p&gt;In various forms of racing there is quite often someone/something setting a benchmark time or speed.
This entity is often referred to as the pace-setter,  &lt;a href="http://en.wikipedia.org/wiki/Pacemaker_%28running%29"&gt;pacemaker&lt;/a&gt;, or colloquially as a “rabbit”.&lt;/p&gt;

&lt;p&gt;The logo is therefore a stylized pair of rabbit ears and the implication is that we’re setting new standards for cluster resource management.&lt;/p&gt;

&lt;p&gt;As well as the logo, NTT also contributed some very professional looking banner images they’d created for a Japanese &lt;a href="http://linux-ha.sourceforge.jp/wp/"&gt;cluster site&lt;/a&gt; they’ve been busy building up.
Even if you can’t speak Japanese, be sure to check out the shiny intro movie on the front page!&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lbrofycDve1qzagr8.jpg" alt="banner red"/&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lbrogrG7ah1qzagr8.jpg" alt="banner white"/&gt;&lt;/p&gt;

&lt;p&gt;I quite like the logo and the message, but I’m interested in the community’s reaction.
I’ve created an &lt;a href="http://polldaddy.com/poll/4074769"&gt;online poll&lt;/a&gt;; be sure to let us know what you think.&lt;/p&gt;

</description><link>http://theclusterguy.clusterlabs.org/post/1551578523</link><guid>http://theclusterguy.clusterlabs.org/post/1551578523</guid><pubDate>Fri, 12 Nov 2010 12:07:00 +0100</pubDate></item><item><title>Pacemaker Release Roundup</title><description>&lt;p&gt;It may have seemed quiet since July,  but things were actually so busy that I couldn’t find the time to publicize our new releases.&lt;/p&gt;

&lt;p&gt;First up, the long awaited 1.0.10 is finally here. 
Thanks once again to the hard work of Keisuke MORI from NTT, 1.0.10 contains all the bug fixes from the recent 1.1.3 and 1.1.4 releases.
You can preview the list of updates with the new online &lt;a href="http://hg.clusterlabs.org/pacemaker/stable-1.0/raw-file/tip/ChangeLog"&gt;change log&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In addition to &lt;a href="http://hg.clusterlabs.org/pacemaker/1.1/raw-file/tip/ChangeLog"&gt;general bugfixes&lt;/a&gt;, the big news in 1.1.3 was the addition of a &lt;a href="http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for"&gt;master control process&lt;/a&gt; and support for cman.
Cman support allows us to run on top of a traditional RHCS cluster stack - replacing just the rgmanager component (more details on this in a subsequent post).&lt;/p&gt;

&lt;p&gt;1.1.3 also introduced a new logging system inspired by the kernel and a PoC from Lars Ellenberg.
It enables us to selectively enable logs for specific files, functions and even individual lines.
Eventually this should result in less being logged by default.&lt;/p&gt;

&lt;p&gt;The successor to 1.1.3 was all about &lt;a href="http://theclusterguy.clusterlabs.org/post/1241986422/large-cluster-performance"&gt;performance&lt;/a&gt;.
In 1.1.4 we managed to speed up the CIB and Policy Engine by about 80% each.
So if you have hundreds of resources, you &lt;em&gt;really&lt;/em&gt; want to be using this version (the changes were far too invasive to consider including in a 1.0 release).&lt;/p&gt;

&lt;p&gt;Packages for all three releases are available from the 
  &lt;a href="http://www.clusterlabs.org/rpm"&gt;rpm&lt;/a&gt; and &lt;a href="http://www.clusterlabs.org/rpm-next"&gt;rpm-next&lt;/a&gt; repositories on clusterlabs.org.&lt;/p&gt;

&lt;p&gt;In other news, I have also recently updated the &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release calendar&lt;/a&gt; for 2011.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/1551292286</link><guid>http://theclusterguy.clusterlabs.org/post/1551292286</guid><pubDate>Fri, 12 Nov 2010 11:02:39 +0100</pubDate></item><item><title>Pacemaker, Heartbeat, Corosync, WTF?</title><description>&lt;p&gt;One question I still get a lot is what all these projects are/do and how they all relate.&lt;/p&gt;

&lt;p&gt;Here is the list of possible components that might make up a &lt;a href="http://www.clusterlabs.org"&gt;Pacemaker&lt;/a&gt; install:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Pacemaker - Resource manager&lt;/li&gt;
&lt;li&gt;Corosync - Messaging layer&lt;/li&gt;
&lt;li&gt;Heartbeat - Also a messaging layer&lt;/li&gt;
&lt;li&gt;Resource Agents - Scripts that know how to control various services&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;em&gt;Pacemaker&lt;/em&gt; is the thing that starts and stops services (like your database or mail server) and contains logic for ensuring both that they’re running, and that they’re &lt;strong&gt;only&lt;/strong&gt; running in one location (to avoid data corruption).&lt;/p&gt;

&lt;p&gt;But it can’t do that without the ability to talk to instances of itself on the other node(s), which is where &lt;a href="http://linux-ha.org"&gt;Heartbeat&lt;/a&gt; and/or &lt;a href="http://corosync.org"&gt;Corosync&lt;/a&gt; come in.&lt;/p&gt;

&lt;p&gt;Think of &lt;em&gt;Heartbeat&lt;/em&gt; and &lt;em&gt;Corosync&lt;/em&gt; as &lt;a href="http://www.freedesktop.org/wiki/Software/dbus"&gt;dbus&lt;/a&gt; but between nodes.
It is a bus that any node can put messages on, knowing that they’ll be received by all its peers.
This bus also ensures that everyone agrees who is (and is not) connected to the bus and tells Pacemaker when that list changes.&lt;/p&gt;

&lt;p&gt;For two nodes Pacemaker could just as easily use sockets, but beyond that the complexity grows quite rapidly and is very hard to get right - so it really makes sense to use existing components that have proven to be reliable.&lt;/p&gt;

&lt;p&gt;You only need one of them though :-)&lt;/p&gt;

&lt;p&gt;Finally, in order to avoid teaching Pacemaker about every possible service that people might want to make highly available, we make use of the &lt;a href="http://opencf.org/home.html"&gt;OCF&lt;/a&gt; standard to hide the details in scripts - which we call &lt;em&gt;Resource Agents&lt;/em&gt;. 
Any series of command-line actions can be easily turned into a resource agent by adding them to an existing &lt;a href="http://hg.clusterlabs.org/pacemaker/1.1/file/tip/extra/resources/Dummy"&gt;template&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, a collection of the most commonly useful ones is made available as part of the &lt;a href="http://www.linux-ha.org/wiki/Resource_Agents"&gt;Resource Agents&lt;/a&gt; project.&lt;/p&gt;

&lt;p&gt;And of course pre-built packages for all these come with most of the popular Linux distributions, including Fedora, openSUSE, SLES &gt;= 10, RHEL &gt;= 6, Debian, and Ubuntu.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/1262495133</link><guid>http://theclusterguy.clusterlabs.org/post/1262495133</guid><pubDate>Thu, 07 Oct 2010 13:32:43 +0200</pubDate></item><item><title>Large Cluster Performance</title><description>&lt;p&gt;Over the last few days, I’ve spent a bunch of time improving Pacemaker’s performance in large clusters.&lt;/p&gt;

&lt;p&gt;This involved profiling the CIB and Policy Engine, identifying and optimizing hotspots and improving algorithm designs.&lt;/p&gt;

&lt;p&gt;Since most of my work is done in virtual machines, it wasn’t possible to use oprofile.
Strictly speaking oprofile worked, but without hardware performance counters the results weren’t very helpful.  I also tried gprof, but that is more about counting calls rather than time spent.&lt;/p&gt;

&lt;p&gt;Eventually I switched to callgrind and when combined with a tool Tim found called &lt;a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot"&gt;Gprof2Dot&lt;/a&gt; and/or kcachegrind, finally got the data I was looking for.&lt;/p&gt;

&lt;p&gt;To do your own profiling, simply set &lt;em&gt;PCMK_callgrind_enabled&lt;/em&gt; to either &lt;em&gt;yes&lt;/em&gt; or to the name of a Pacemaker daemon you wish to profile, e.g. &lt;em&gt;PCMK_callgrind_enabled=cib&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Overall, the CIB (which is the main bottleneck in a large cluster) and the Policy Engine are about 70% faster.&lt;/p&gt;

&lt;p&gt;The improvements will be available when 1.1.4 is released next month, or from our &lt;a href="http://hg.clusterlabs.org/pacemaker/1.1/"&gt;1.1 code repository&lt;/a&gt; right now.&lt;/p&gt;

&lt;p&gt;A summary of the various changes and description of future work is below.
Any assistance in further optimization would be appreciated :-)&lt;/p&gt;

&lt;p&gt;— Andrew&lt;/p&gt;

&lt;h2&gt;PE&lt;/h2&gt;

&lt;p&gt;Use case:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;100 nodes&lt;/li&gt;
&lt;li&gt;100 clones, clone-max=100 (10,000 effective resources)&lt;/li&gt;
&lt;li&gt;100 resource location constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Baseline with probes: 20-30 minutes&lt;br/&gt;
Baseline without probes: 28s&lt;/p&gt;

&lt;h3&gt;Phase 1&lt;/h3&gt;

&lt;p&gt;Use hashtables instead of lists to store the available nodes for a resource.&lt;br/&gt;
New time without probes: 18s&lt;/p&gt;
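
&lt;p&gt;As a rough illustration of why this helps, here is a minimal Python sketch (not Pacemaker’s C code). The node names, scores and function names are made up for the example; the only point is that a list lookup scans every entry while a hash table lookup does not.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
# Hypothetical data: candidate nodes and their scores for one resource.
nodes = [("node-%d" % i, i) for i in range(100)]


def score_from_list(node_list, name):
    # Linear scan: in the worst case every lookup walks the whole list.
    for node_name, score in node_list:
        if node_name == name:
            return score
    return None


# Build the table once; each later lookup is a single hash probe.
node_table = dict(nodes)


def score_from_table(table, name):
    return table.get(name)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With 100 nodes and 10,000 effective resources, the difference between scanning and hashing is multiplied across every resource, which is presumably where the saving came from.&lt;/p&gt;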

&lt;h3&gt;Phase 2&lt;/h3&gt;

&lt;p&gt;Defer creation of deletion, promote and demote constraints until they are needed.&lt;br/&gt;
New time without probes: 13s&lt;/p&gt;

&lt;h3&gt;Phase 3&lt;/h3&gt;

&lt;p&gt;Use g_list_prepend() instead of g_list_append() for the list of ordering constraints.&lt;br/&gt;
New time without probes: 5s&lt;/p&gt;
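
&lt;p&gt;GLib’s g_list_append() has to walk to the end of the list on every call, whereas g_list_prepend() does not. The following Python sketch uses a hand-rolled linked list (an assumption for illustration, not GLib itself) to count how many nodes are walked when building a 1000-element list each way.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def append(head, value):
    # Walk to the tail on every call; returns (head, steps walked).
    new_node = Node(value)
    if head is None:
        return new_node, 0
    steps = 0
    current = head
    while current.next is not None:
        current = current.next
        steps += 1
    current.next = new_node
    return head, steps


def prepend(head, value):
    # Constant time: the new node simply points at the old head.
    return Node(value, head), 0


def build(insert, count):
    # Build a list of `count` items and total the nodes walked.
    head, total_steps = None, 0
    for i in range(count):
        head, steps = insert(head, i)
        total_steps += steps
    return head, total_steps


_, append_steps = build(append, 1000)
_, prepend_steps = build(prepend, 1000)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that prepending produces the list in reverse order, so it is only a drop-in replacement where order does not matter or the list is reversed once at the end.&lt;/p&gt;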

&lt;h3&gt;Phase 4&lt;/h3&gt;

&lt;p&gt;New algorithm for determining which clone instances need probing.&lt;br/&gt;
New time with probes: 31s&lt;/p&gt;

&lt;h3&gt;Future work&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;Further improve the algorithm for determining which resources need to be probed&lt;/li&gt;
&lt;li&gt;Further optimize the algorithm for enforcing ordering constraints&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;CIB&lt;/h2&gt;

&lt;p&gt;The CIB was harder to profile.
Rather than give it one large task to chew through and see how long it took using a few printf’s to provide granularity, I had to run it through a profiler while it was operating in a real cluster and see where most of the time was being spent.&lt;/p&gt;

&lt;h3&gt;Phase 1&lt;/h3&gt;

&lt;p&gt;Remove most uses of cib_msg_copy(), reducing the amount of needless copying.&lt;/p&gt;

&lt;p&gt;Phase speedup: 10%&lt;/p&gt;

&lt;h3&gt;Phase 2&lt;/h3&gt;

&lt;p&gt;Compression costs a LOT, so don’t do it unless we’re hitting message limits.
For now, use 256k as the threshold at which compression kicks in.
The previous limit was 10k; compressing 184 of 1071 messages accounted for 23% of the total CPU used by the CIB.&lt;/p&gt;
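
&lt;p&gt;The idea can be sketched in a few lines of Python. The 256k threshold comes from the text above; the function names are hypothetical, and zlib is used only because it ships with Python, not as a claim about which compressor the CIB uses.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
import zlib

# 256k threshold taken from the text; everything else is illustrative.
COMPRESSION_THRESHOLD = 256 * 1024


def encode_message(payload):
    # Return (is_compressed, data); small payloads are sent untouched.
    if len(payload) >= COMPRESSION_THRESHOLD:
        return True, zlib.compress(payload)
    return False, payload


def decode_message(is_compressed, data):
    # Undo encode_message() on the receiving side.
    return zlib.decompress(data) if is_compressed else data
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The receiver needs to know whether a given message was compressed, which is why the sketch returns a flag alongside the data.&lt;/p&gt;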

&lt;p&gt;Each time we validated the CIB, we were re-reading and re-parsing the RelaxNG schema, which accounted for 28% of the CIB’s CPU usage on the DC.
We now read it once and cache the result for the life of the CIB process.&lt;/p&gt;
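
&lt;p&gt;A minimal Python sketch of the read-once-and-cache pattern follows. The parse function is a stand-in that merely counts how often it is called, and the file name is hypothetical; none of this is the CIB’s actual code.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
import functools

parse_calls = 0


def parse_schema(path):
    # Stand-in for an expensive read-and-parse step (hypothetical).
    global parse_calls
    parse_calls += 1
    return {"path": path}


@functools.lru_cache(maxsize=None)
def get_schema(path):
    # First call parses; later calls reuse the cached object.
    return parse_schema(path)


def validate(document, schema_path):
    # Placeholder check: a real validator would apply the schema here.
    schema = get_schema(schema_path)
    return schema is not None and document is not None


for _ in range(1000):
    validate("cib-document", "example-schema.rng")
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After the loop the schema has been parsed once rather than 1000 times, at the cost of keeping it in memory for the life of the process.&lt;/p&gt;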

&lt;p&gt;Phase speedup: 51%&lt;/p&gt;

&lt;h3&gt;Phase 3&lt;/h3&gt;

&lt;p&gt;Push detection of group and set ordering changes to (the less busy) slave instances.
This detection was costing 15% of the CIB’s total CPU time on the DC.&lt;/p&gt;

&lt;p&gt;Phase speedup: 15%&lt;/p&gt;

&lt;h3&gt;Future work&lt;/h3&gt;

&lt;p&gt;The majority of CPU time spent by the CIB is in post-processing.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Detecting what changed so we can minimize the network load: &lt;em&gt;diff_xml_object&lt;/em&gt;, 35.5% CPU time&lt;/li&gt;
&lt;li&gt;Calculating the current digest so peers can verify the diffs and detect ordering changes: &lt;em&gt;calculate_xml_digest&lt;/em&gt;, 31% CPU time&lt;/li&gt;
&lt;/ul&gt;</description><link>http://theclusterguy.clusterlabs.org/post/1241986422</link><guid>http://theclusterguy.clusterlabs.org/post/1241986422</guid><pubDate>Mon, 04 Oct 2010 13:26:01 +0200</pubDate></item><item><title>Introducing the Pacemaker Master Control Process for Corosync-based Clusters</title><description>&lt;p&gt;The latest addition to the Pacemaker 1.1 series is a master control process (MCP) and associated init script.&lt;/p&gt;

&lt;p&gt;This means that Pacemaker is now started/stopped independently of the messaging layer.
We anticipate that this should result in a simpler and more reliable startup/shutdown procedure when used in combination with Corosync.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Forking inside a multi-threaded process like Corosync causes all sorts of pain.
  This has been problematic for Pacemaker as it needs a number of daemons to 
  be spawned.&lt;/p&gt;
  
  &lt;p&gt;Likewise, Corosync was never designed for staggered shutdown - something
  previously needed in order to prevent the cluster from leaving before Pacemaker 
  could stop all active resources.&lt;/p&gt;
  
  &lt;p&gt;By moving this functionality into the MCP, the whole system should become
  more reliable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It should be noted that when using the MCP, &lt;strong&gt;Corosync will refuse to shut down if Pacemaker is still running&lt;/strong&gt;.
Pacemaker will also naturally fail to start if Corosync isn’t active yet.&lt;/p&gt;

&lt;p&gt;So, starting with 1.1.3, the following Corosync-based options are possible:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;corosync + pacemaker plugin (v0) &lt;/li&gt;
&lt;li&gt;corosync + pacemaker plugin (v1) + mcp&lt;/li&gt;
&lt;li&gt;corosync + cpg + cman + mcp&lt;/li&gt;
&lt;li&gt;corosync + cpg + quorumd + mcp&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;Option ‘1’ corresponds to what people have been using since openais/corosync started being supported.
If Pacemaker starts being supported in RHEL6, it’s probably going to look like option ‘3’. 
Option ‘4’ is what we’re all working towards.&lt;/p&gt;

&lt;p&gt;Anyone having startup or shutdown problems (with Pacemaker 1.1 or 1.0) should immediately move to clusters based on option ‘2’ or ‘3’.&lt;/p&gt;

&lt;p&gt;Both involve the new master control process and therefore benefit from the more reliable startup/shutdown design.&lt;/p&gt;

&lt;p&gt;Additionally, ‘3’ uses CPG for messaging (whereas ‘2’ still uses the plugin, which makes it compatible with nodes running option ‘1’).&lt;/p&gt;

&lt;p&gt;Unfortunately option ‘4’ isn’t fully baked yet; there are still a few kinks in the pacemaker/quorumd interaction to be worked out.
This will happen in the coming months, however any assistance in this process would be highly appreciated.&lt;/p&gt;

&lt;p&gt;To use option ‘2’, simply change: &lt;code&gt;ver: 0&lt;/code&gt; to &lt;code&gt;ver: 1&lt;/code&gt; in the pacemaker service block of &lt;em&gt;corosync.conf&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;To use option ‘3’, you can either:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;use &lt;em&gt;cluster.conf&lt;/em&gt; and &lt;code&gt;service cman start&lt;/code&gt;, or&lt;/li&gt;
&lt;li&gt;add the cman bits to &lt;em&gt;corosync.conf&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using &lt;em&gt;cluster.conf&lt;/em&gt; is the preferred approach.
It’s far easier to maintain and automatically starts the necessary pieces for using GFS2.&lt;/p&gt;

&lt;h3&gt;Alternative 1 - Sample cluster.conf for a two-node cluster&lt;/h3&gt;

&lt;pre&gt;&lt;code&gt;&lt;?xml version="1.0"?&gt;
&lt;cluster config_version="1" name="beekhof"&gt;
    &lt;fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/&gt;
    &lt;clusternodes&gt;
            &lt;clusternode name="pcmk-1" nodeid="1" votes="1"&gt;
                    &lt;fence/&gt;
            &lt;/clusternode&gt;
            &lt;clusternode name="pcmk-2" nodeid="2" votes="1"&gt;
                    &lt;fence/&gt;
            &lt;/clusternode&gt;
    &lt;/clusternodes&gt;
    &lt;cman/&gt;
    &lt;fencedevices/&gt;
    &lt;rm/&gt;
&lt;/cluster&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Alternative 2 - Sample corosync.conf additions for a two-node CMAN cluster&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Be sure to set &lt;code&gt;nodename&lt;/code&gt; appropriately for each host.&lt;/strong&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;cluster {
    name: beekhof

    clusternodes {
            clusternode {
                    votes: 1
                    nodeid: 1
                    name: pcmk-1
            }
            clusternode {
                    votes: 1
                    nodeid: 2
                    name: pcmk-2
            }
    }
    cman {
            expected_votes: 2
            cluster_id: 123
            nodename: pcmk-1
            two_node: 1
            max_queued: 10
    }
}

service {
    name: corosync_cman
    ver: 0
}

quorum {
    provider: quorum_cman
}
&lt;/code&gt;&lt;/pre&gt;</description><link>http://theclusterguy.clusterlabs.org/post/907043024</link><guid>http://theclusterguy.clusterlabs.org/post/907043024</guid><pubDate>Thu, 05 Aug 2010 11:31:19 +0200</pubDate></item><item><title>Pacemaker 1.0.9 Released</title><description>&lt;p&gt;The latest installment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.0 stable series is now ready for general consumption.&lt;/p&gt;

&lt;p&gt;Coinciding with 1.0.9 is a new version of Corosync (1.2.5).
Included in both are some important fixes that should resolve most of the startup issues people have been seeing.
Also included in this release are the fixes for issues reported by &lt;a href="http://valgrind.org/"&gt;Valgrind&lt;/a&gt; and &lt;a href="http://www.coverity.com/products/static-analysis.html"&gt;Coverity&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As per our &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release calendar&lt;/a&gt;, the next 1.0 release is planned for mid-September and 1.1.3 will be available in late July.&lt;/p&gt;

&lt;hr&gt;&lt;p&gt;I’d like to particularly thank Keisuke MORI for his help with this release.
Keisuke-san has taken on the role of Patch Manager for 1.0, so it is because of his hard work that we have backports of all the bugfixes from 1.1 :-)&lt;/p&gt;

&lt;p&gt;This change has enabled me to focus on 1.1 and, I hope, be slightly more responsive to bug reports and questions on the mailing list(s).&lt;/p&gt;

&lt;hr&gt;&lt;p&gt;Pre-built packages for Pacemaker and its immediate dependencies are available now for openSUSE, SLES, Fedora, RHEL and CentOS from the &lt;a href="http://www.clusterlabs.org/rpm/"&gt;ClusterLabs Build Area&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Regular updaters may also have noticed the expanded version scheme used by packages on clusterlabs.org.
The build scripts now automatically bump the version numbers when rebuilding the stack.
This usually occurs when new versions of corosync, cluster-glue or heartbeat come out.&lt;/p&gt;

&lt;p&gt;Versions are now of the form:
    x.y.z-a.b&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;em&gt;x.y.z&lt;/em&gt; is the upstream version (this is the only time the tarball is changed)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;a&lt;/em&gt; indicates the number of spec file changes (i.e. changes to dependencies)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;b&lt;/em&gt; indicates how many times the package has been rebuilt with unchanged tarballs and spec files&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;So the following version: 
    pacemaker-1.0.9-1.4
would mean the fourth rebuild of the initial spec file for the upstream version &lt;em&gt;1.0.9&lt;/em&gt; of Pacemaker.&lt;/p&gt;
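
&lt;p&gt;As a rough illustration of the scheme, here is a minimal Python sketch that splits such a package string into its parts. The helper name &lt;em&gt;parse_package_version&lt;/em&gt; and the field names are made up for this example and are not part of the ClusterLabs build scripts.&lt;/p&gt;

```python
def parse_package_version(package):
    """Split 'name-x.y.z-a.b' into its parts (hypothetical helper)."""
    # The last two dash-separated fields are the upstream version and the
    # release; everything before them is the package name, which may itself
    # contain dashes (e.g. cluster-glue).
    name, upstream, release = package.rsplit("-", 2)
    # The release is 'a.b': spec file revision, then rebuild count.
    spec_revision, rebuild = release.split(".")
    return {
        "name": name,
        "upstream": upstream,
        "spec_revision": int(spec_revision),
        "rebuild": int(rebuild),
    }


info = parse_package_version("pacemaker-1.0.9-1.4")
print(info)
```

&lt;p&gt;For the example above this reports upstream version 1.0.9, spec revision 1 and rebuild 4.&lt;/p&gt;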

&lt;p&gt;Debian users should check for updates in &lt;a href="http://clusterlabs.org/wiki/Install#Debian"&gt;Martin’s repo&lt;/a&gt; over the coming days and Ubuntu fans can visit &lt;a href="https://edge.launchpad.net/~ubuntu-ha-maintainers/+archive/ppa"&gt;LaunchPad&lt;/a&gt; for 8.04 and 9.10 packages.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://hg.clusterlabs.org/pacemaker/stable-1.0/archive/Pacemaker-1.0.9.tar.bz2"&gt;source tarball&lt;/a&gt; is also available directly from Mercurial.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Release Statistics&lt;/h3&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 152 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;266 files changed, 14324 insertions(+), 3842 deletions(-)&lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;h3&gt;Changes of note since Pacemaker-1.0.8&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;High: ais: Ensure the list of active processes sent to clients is always up-to-date&lt;/li&gt;
&lt;li&gt;High: ais: Fix previous commit, actually return a result in get_process_list()&lt;/li&gt;
&lt;li&gt;High: ais: Fix two more uses of getpwnam() in non-thread-safe locations&lt;/li&gt;
&lt;li&gt;High: ais: Look for the correct conf variable for turning on file logging&lt;/li&gt;
&lt;li&gt;High: ais: Need to find a better and thread-safe way to set core_uses_pid. Disable for now.&lt;/li&gt;
&lt;li&gt;High: ais: Use the threadsafe version of getpwnam&lt;/li&gt;
&lt;li&gt;High: cib: Also free query result for xpath operations that return more than one hit&lt;/li&gt;
&lt;li&gt;High: cib: Fix the application of unversioned diffs&lt;/li&gt;
&lt;li&gt;High: cib: Remove old developmental error logging&lt;/li&gt;
&lt;li&gt;High: Core: Bug lf#2414 - Prevent use-after-free reported by valgrind when doing xpath based deletions&lt;/li&gt;
&lt;li&gt;High: Core: Fix memory leak in replace_xml_child() reported by valgrind&lt;/li&gt;
&lt;li&gt;High: Core: fix memory leaks exposed by valgrind&lt;/li&gt;
&lt;li&gt;High: crmd: Bug 2401 - Improved detection of partially active peers&lt;/li&gt;
&lt;li&gt;High: crmd: Bug lf#2379 - Ensure the cluster terminates when the PE is not available&lt;/li&gt;
&lt;li&gt;High: crmd: Bug lf#2414 - Prevent use-after-free of the PE connection after it dies&lt;/li&gt;
&lt;li&gt;High: crmd: Bug lf#2439 - cancel_op() can also return HA_RSCBUSY&lt;/li&gt;
&lt;li&gt;High: crmd: Bug lf#2439 - Handle asynchronous notification of resource deletion events&lt;/li&gt;
&lt;li&gt;High: crmd: Do not allow the target_rc to be misused by resource agents&lt;/li&gt;
&lt;li&gt;High: crmd: Do not ignore action timeouts based on FSA state&lt;/li&gt;
&lt;li&gt;High: crmd: Ensure we don’t get stuck in S_PENDING if we lose an election to someone that never talks to us again&lt;/li&gt;
&lt;li&gt;High: crmd: Fix memory leaks exposed by valgrind&lt;/li&gt;
&lt;li&gt;High: crmd: Remove race condition that could lead to multiple instances of a clone being active on a machine&lt;/li&gt;
&lt;li&gt;High: crmd: Send erase_status_tag() calls to the local CIB when the DC is fenced, since there is no DC to accept them&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#1959 - Fail unmanaged resources should not prevent other services from shutting down&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2383 - Combine failcounts for all instances of an anonymous clone on a host&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2384 - Fix intra-set colocation and ordering&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2403 - Enforce mandatory promotion (colocation) constraints&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2412 - Correctly locate clone instances by their prefix&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2422 - Ordering dependencies on partially active groups not observed properly&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2424 - Use notify operation definition if it exists in the configuration&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2433 - No services should be stopped until probes finish&lt;/li&gt;
&lt;li&gt;High: PE: Do not be so quick to pull the trigger on nodes that are coming up&lt;/li&gt;
&lt;li&gt;High: PE: Fix colocation for interleaved clones&lt;/li&gt;
&lt;li&gt;High: PE: Fix colocation with partially active groups&lt;/li&gt;
&lt;li&gt;High: PE: Fix memory leaks reported by valgrind&lt;/li&gt;
&lt;li&gt;High: PE: Make the current data set a global variable so it does not need to be passed around everywhere&lt;/li&gt;
&lt;li&gt;High: PE: Prevent endless loop when looking for operation definitions in the configuration&lt;/li&gt;
&lt;li&gt;High: PE: Rewrite native_merge_weights() to avoid use-after-free&lt;/li&gt;
&lt;li&gt;High: Shell: always reload status if working with the cluster (bnc#590035)&lt;/li&gt;
&lt;li&gt;High: Tools: crm_mon - fix memory leaks exposed by valgrind&lt;/li&gt;
&lt;li&gt;Medium: ais: Correctly set logfile permissions in all cases&lt;/li&gt;
&lt;li&gt;Medium: ais: create the final directory too for resource agents (bnc#603190)&lt;/li&gt;
&lt;li&gt;Medium: ais: Make sure debug messages make it into the logfiles too&lt;/li&gt;
&lt;li&gt;Medium: Build: Do not enable the -ansi compiler option by default, prevents use of strtoll()&lt;/li&gt;
&lt;li&gt;Medium: cib: Bug lf#2352 - Changes to group order are not detected or broadcast to peers&lt;/li&gt;
&lt;li&gt;Medium: cib: Correctly free the cib contents at signoff when in file-based mode&lt;/li&gt;
&lt;li&gt;Medium: cib: xpath - Allow all hits to be deleted, allow the no_children option to return multiple hits&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug lf#2391 - Ensure important options (notify, unique, etc) are always exposed during resource operations&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug lf#2410 - Do not complain about missing agents during probes of asymmetric clusters&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug lf#2426 - stop-all-resources should not apply to stonith resources&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug lf#2435 - Support colocation sets with negative scores&lt;/li&gt;
&lt;li&gt;Medium: PE: Check for use-of-NULL in dump_node_scores()&lt;/li&gt;
&lt;li&gt;Medium: PE: Do not overwrite existing meta attributes (like timeout) for notify operations&lt;/li&gt;
&lt;li&gt;Medium: PE: Ensure deallocated resources are stopped&lt;/li&gt;
&lt;li&gt;Medium: PE: If there are no compatible peers when interleaving clones, ensure the instance is stopped&lt;/li&gt;
&lt;li&gt;Medium: PE: Ignore colocation weights from clone instances&lt;/li&gt;
&lt;li&gt;Medium: RA: SystemHealth: exit properly when the required software is not installed (bnc#587940)&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not error on missing resource agent with asymmetrical clusters (lf#2410)&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not verify empty configurations (bnc#602711)&lt;/li&gt;
&lt;li&gt;Medium: shell: find hb_delnode in correct directory&lt;/li&gt;
&lt;li&gt;Medium: Shell: observe op_defaults when verifying primitives (bnc#590033)&lt;/li&gt;
&lt;li&gt;Medium: Shell: on no id match the first of property-like elements (lf#2420)&lt;/li&gt;
&lt;li&gt;Medium: Shell: skip resource checks for property-like elements (lf#2420)&lt;/li&gt;
&lt;li&gt;Medium: Shell: verify meta attributes and properties (bnc#589867)&lt;/li&gt;
&lt;li&gt;Medium: Shell: verify only changed elements on commit (bnc#590033)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm_mon: refresh screen on terminal resize (bnc#589811)&lt;/li&gt;
&lt;/ul&gt;</description><link>http://theclusterguy.clusterlabs.org/post/729058739</link><guid>http://theclusterguy.clusterlabs.org/post/729058739</guid><pubDate>Wed, 23 Jun 2010 15:46:14 +0200</pubDate><category>announce</category></item><item><title>Feature Spotlight: Utilization</title><description>&lt;p&gt;New in 1.1 is the ability for Pacemaker to factor the system resources (RAM, CPU, etc) into its placement algorithms.&lt;/p&gt;

&lt;p&gt;First, simply define the system resources provided by your nodes.
We’ll use &lt;em&gt;cores&lt;/em&gt; in this example, but you can literally use any name you care to dream up.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;crm configure node pcmk-1 utilization cores=2
crm configure node pcmk-2 utilization cores=4
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then we tell the cluster how many &lt;em&gt;cores&lt;/em&gt; are needed by each VM.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;crm configure primitive kvm-small ocf:heartbeat:VirtualDomain utilization cores=1
crm configure primitive kvm-medium ocf:heartbeat:VirtualDomain utilization cores=2
crm configure primitive kvm-big ocf:heartbeat:VirtualDomain utilization cores=3
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Finally, we tell Pacemaker how to use the information:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;crm configure property placement-strategy=utilization
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now Pacemaker will ensure the load from your virtual machines will be distributed “evenly” throughout the cluster - without the need for convoluted sets of colocation constraints.&lt;/p&gt;

&lt;p&gt;Download 1.1 today from:
   &lt;a href="http://www.clusterlabs.org/rpm-next/"&gt;http://www.clusterlabs.org/rpm-next/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Limitations&lt;/h3&gt;

&lt;p&gt;The type of problem Pacemaker is dealing with here is known as the &lt;a href="http://en.wikipedia.org/wiki/Knapsack_problem"&gt;knapsack problem&lt;/a&gt; and falls into the &lt;a href="http://en.wikipedia.org/wiki/NP-complete"&gt;NP-complete&lt;/a&gt; category of computer science problems - which is a fancy way of saying “takes a really long time to solve”.&lt;/p&gt;

&lt;p&gt;Clearly in an HA cluster, it’s not acceptable to spend minutes, let alone hours or days, finding an optimal solution while services remain unavailable.&lt;/p&gt;

&lt;p&gt;So instead of trying to solve the problem completely, Pacemaker uses a &lt;em&gt;best effort&lt;/em&gt; algorithm for determining which node should host a particular service.
This means it arrives at a “solution” much faster than traditional linear programming algorithms, but may do so at the price of leaving some services stopped.&lt;/p&gt;

&lt;p&gt;In the contrived example above:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;kvm-small would be allocated to pcmk-1&lt;/li&gt;
&lt;li&gt;kvm-medium would be allocated to pcmk-2&lt;/li&gt;
&lt;li&gt;kvm-big would remain inactive&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Which is not ideal.&lt;/p&gt;
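
&lt;p&gt;To see how a best-effort approach can end up there, here is a toy first-fit placement in Python. It is only a sketch for illustration and is not Pacemaker’s actual algorithm: the function name &lt;em&gt;place_first_fit&lt;/em&gt; and the rule of walking nodes and resources in the order they were configured are assumptions made for this example.&lt;/p&gt;

```python
def place_first_fit(nodes, resources):
    """Toy first-fit placement; NOT Pacemaker's real algorithm.

    nodes:     list of (name, cores) in configuration order
    resources: list of (name, cores_needed) in configuration order
    Returns a dict mapping each resource to a node name,
    or to None if the resource has to stay stopped.
    """
    free = dict(nodes)  # remaining cores per node
    placement = {}
    for rsc, needed in resources:
        placement[rsc] = None
        for node, _cores in nodes:
            # Take the first node that still has enough free cores.
            if free[node] >= needed:
                free[node] -= needed
                placement[rsc] = node
                break
    return placement


nodes = [("pcmk-1", 2), ("pcmk-2", 4)]
resources = [("kvm-small", 1), ("kvm-medium", 2), ("kvm-big", 3)]
print(place_first_fit(nodes, resources))
```

&lt;p&gt;With the inputs above, the sketch puts kvm-small on pcmk-1 and kvm-medium on pcmk-2, which leaves no node with three free cores for kvm-big. Feeding the same resources in the reverse order places all three, which is why the order in which services are considered matters so much for this kind of algorithm.&lt;/p&gt;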

&lt;h4&gt;Strategies for Dealing with the Limitations&lt;/h4&gt;

&lt;ul&gt;&lt;li&gt;Ensure you have sufficient physical capacity
It might sound obvious, but if the physical capacity of your nodes is (close to) maxed out by the cluster under normal conditions, then failover isn’t going to go well.
Even without the Utilization feature, you’ll start hitting timeouts and getting secondary “failures”.&lt;/li&gt;
&lt;li&gt;Build some buffer into the capabilities advertised by the nodes
Advertise slightly more resources than we physically have on the (usually valid) assumption that a VM will not use 100% of the configured number of cores/RAM/etc &lt;em&gt;all&lt;/em&gt; the time.
This practice is also known as “over commit”.&lt;/li&gt;
&lt;li&gt;Specify service priorities
If the cluster is going to sacrifice services, it should be the ones you care (comparatively) about the least.&lt;/li&gt;
&lt;/ul&gt;</description><link>http://theclusterguy.clusterlabs.org/post/570381880</link><guid>http://theclusterguy.clusterlabs.org/post/570381880</guid><pubDate>Tue, 04 May 2010 10:00:00 +0200</pubDate><category>tips</category></item><item><title>Pacemaker ships as part of Ubuntu 10.4 - Lucid Lynx</title><description>&lt;p&gt;Ubuntu LTS 10.04 now comes with full support for Pacemaker on Corosync and Heartbeat:
   &lt;a href="http://fghaas.wordpress.com/2010/05/03/ubuntu-10-04-with-full-cluster-stack-support/"&gt;http://fghaas.wordpress.com/2010/05/03/ubuntu-10-04-with-full-cluster-stack-support/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kudos to everyone involved!&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/567805394</link><guid>http://theclusterguy.clusterlabs.org/post/567805394</guid><pubDate>Mon, 03 May 2010 11:10:00 +0200</pubDate></item><item><title>New Pacemaker Packages</title><description>&lt;p&gt;I’ve begun uploading 1.0.8-3 to the clusterlabs.org servers.&lt;/p&gt;

&lt;p&gt;Upon closer inspection, it became apparent that the 1.0.8-2 packages were built with the wrong tarball and this led to some substantial problems with the shell.&lt;/p&gt;

&lt;p&gt;To rectify this, I’ve built 1.0.8-3.
This new version uses the original 1.0.8 tarball and an updated spec file (to fix the snmp dependencies).&lt;/p&gt;

&lt;p&gt;Apologies for the inconvenience.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/503306887</link><guid>http://theclusterguy.clusterlabs.org/post/503306887</guid><pubDate>Wed, 07 Apr 2010 16:40:55 +0200</pubDate></item><item><title>Pacemaker in Debian</title><description>&lt;p&gt;Good news for Debian fans, Pacemaker has officially made it into &lt;a href="http://www.debian.org/releases/sid/"&gt;Sid&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;According to their &lt;a href="http://fghaas.wordpress.com/2010/04/07/complete-pacemaker-cluster-stack-now-in-debian/"&gt;blog post&lt;/a&gt;, it should also be available as an official &lt;a href="http://backports.org"&gt;backport&lt;/a&gt; for existing Debian stable releases “soon”.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/503218030</link><guid>http://theclusterguy.clusterlabs.org/post/503218030</guid><pubDate>Wed, 07 Apr 2010 15:50:21 +0200</pubDate></item><item><title>Website Updates</title><description>&lt;p&gt;The &lt;a href="http://www.clusterlabs.org"&gt;http://www.clusterlabs.org&lt;/a&gt; server has been migrated and now features a new splash-page and a custom skin for the wiki.&lt;/p&gt;

&lt;p&gt;Hopefully the splash page will be a more helpful entry point for people exploring the project for the first time.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://clusterlabs.org"&gt;http://clusterlabs.org&lt;/a&gt; will still show the old site until the weekend (when I switch over the mail server too).&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/469848995</link><guid>http://theclusterguy.clusterlabs.org/post/469848995</guid><pubDate>Wed, 24 Mar 2010 09:37:39 +0100</pubDate></item><item><title>Pacemaker 1.0.8 Released</title><description>&lt;p&gt;The latest installment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.0 stable series is now ready for general consumption.&lt;/p&gt;

&lt;p&gt;In this release, apart from various bug-fixes, Dejan has split the shell up into modules.
It is anticipated that this will make it easier to maintain moving forward.&lt;/p&gt;

&lt;p&gt;We are now following the published &lt;a href="http://www.clusterlabs.org/wiki/ReleaseCalendar"&gt;release schedule&lt;/a&gt; on the clusterlabs wiki.
The next release is planned for mid-June and our main focus is now on features for 1.1/1.2 (see the &lt;a href="http://theclusterguy.clusterlabs.org/post/441442543/new-pacemaker-release-series"&gt;previous post&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Pre-built packages for Pacemaker and its immediate dependencies are currently building and will be available for openSUSE, SLES, Fedora, RHEL and CentOS from the &lt;a href="http://www.clusterlabs.org/rpm/"&gt;ClusterLabs Build Area&lt;/a&gt; shortly.&lt;/p&gt;

&lt;p&gt;Debian users should check for updates in &lt;a href="http://clusterlabs.org/wiki/Install#Debian"&gt;Martin’s repo&lt;/a&gt; over the coming days and Ubuntu fans can visit &lt;a href="https://edge.launchpad.net/~ubuntu-ha-maintainers/+archive/ppa"&gt;LaunchPad&lt;/a&gt; for 8.04 and 9.10 packages.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://hg.clusterlabs.org/pacemaker/stable-1.0/archive/Pacemaker-1.0.8.tar.bz2"&gt;source tarball&lt;/a&gt; is also available directly from Mercurial.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Release Statistics&lt;/h3&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 181 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt;
329 files changed, 22172 insertions(+), 12297 deletions(-)&lt;br/&gt;&lt;small&gt;The size of the diff is significantly impacted by the rearrangement of the shell&lt;/small&gt;
        &lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;h3&gt;Changes of note since Pacemaker-1.0.7&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;High: Agents: ping - Prevent shell expansion of ‘*’ when there are files in /var/lib/heartbeat/cores/root (Patch from Sébastien PRUDHOMME)&lt;/li&gt;
&lt;li&gt;High: ais: Bug lf#2340 - Force rogue child processes to terminate after waiting 2.5 minutes&lt;/li&gt;
&lt;li&gt;High: ais: Bug lf#2359 - Default expected votes to 2 inside Corosync/OpenAIS plugin&lt;/li&gt;
&lt;li&gt;High: ais: Bug lf#2359 - expected-quorum-votes not correctly updated after membership change&lt;/li&gt;
&lt;li&gt;High: ais: Bug rhbz#525552 - Move non-threadsafe calls to setenv() to after the fork()&lt;/li&gt;
&lt;li&gt;High: crmd: Bug bnc#578644 - Improve handling of cancelled operations caused by resource cleanup&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2317 - Avoid needless restart of primitive depending on a clone&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2358 - Fix master-master anti-colocation&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2361 - Ensure clones observe mandatory ordering constraints if the LHS is unrunnable&lt;/li&gt;
&lt;li&gt;High: PE: Correctly implement optional colocation between primitives and clone resources&lt;/li&gt;
&lt;li&gt;High: Shell: add support for xml in cli&lt;/li&gt;
&lt;li&gt;High: Shell: check timeouts also against the default-action-timeout property&lt;/li&gt;
&lt;li&gt;High: Shell: edit multiple meta_attributes sets in resource management (lf#2315)&lt;/li&gt;
&lt;li&gt;High: Shell: improve configure commit (lf#2336)&lt;/li&gt;
&lt;li&gt;High: Shell: new cibstatus import command (bnc#585471)&lt;/li&gt;
&lt;li&gt;High: Shell: restore error reporting in options&lt;/li&gt;
&lt;li&gt;High: Shell: update previous node lookup procedure to include the id where necessary&lt;/li&gt;
&lt;li&gt;High: Shell: move scores from resource sets to the constraint element (lf#2331)&lt;/li&gt;
&lt;li&gt;High: Shell: recovery from bad/outdated help index file&lt;/li&gt;
&lt;li&gt;Medium: ais: getpwnam() is also not thread safe, move after the call to fork()&lt;/li&gt;
&lt;li&gt;Medium: ais: Set permissions to allow ‘to_file’ logging to function correctly&lt;/li&gt;
&lt;li&gt;Medium: Core: Give signal handlers higher priority - patch based on Lars Ellenbergs work&lt;/li&gt;
&lt;li&gt;Medium: crmd: Bug bnc#578644 - Do not send operation updates for deleted resources&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug bnc#586710 - Make sure migration ops use the correct meta options (eg. timeouts)&lt;/li&gt;
&lt;li&gt;Medium: PE: Deprecate the lifetime tag in constraints&lt;/li&gt;
&lt;li&gt;Medium: Tools: attrd - Only ignore the update if the attributes value is completely stable (ie. supplied, current, and stored all match)&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2302 - Use the same resource printing logic for html and non-html output&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug LF#2312 - crm_mon - Prevent zombie child processes when using custom traps (Patch from Bernd Schubert)&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2330 - Add a blank line after the subject to indicate the beginning of the mail body&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2330 - Move the blank line before the body text instead&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2330 - Use \r in addition to \n for line endings&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm_mon - Add support for older versions of SNMP - Patch derived from the work of sato yuki&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm_mon - Display the true fail-count, not the effective value&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm_mon - Use node uname in snmp/smtp/etc events&lt;/li&gt;
&lt;li&gt;Medium: Tools: hb2openais: add support for corosync (and more)&lt;/li&gt;
&lt;li&gt;Medium: Shell: add option to control sorting of cib elements (lf#2290)&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not cache node and resource ids (lf#2368)&lt;/li&gt;
&lt;li&gt;Medium: Shell: fix commit for new clones of new groups (bnc#585471)&lt;/li&gt;
&lt;li&gt;Medium: Shell: help: unsort help items&lt;/li&gt;
&lt;li&gt;Medium: Shell: implement lifetime for rsc migrate and node standby (lf#2353)&lt;/li&gt;
&lt;li&gt;Medium: Shell: load update should update existing elements&lt;/li&gt;
&lt;li&gt;Medium: Shell: node attributes update in configure (bnc#582767)&lt;/li&gt;
&lt;li&gt;Medium: Shell: parse lists not tuples&lt;/li&gt;
&lt;li&gt;Medium: Shell: Repair “cib cibstatus op” functionality (bnc#585641)&lt;/li&gt;
&lt;li&gt;Medium: Shell: repair node show (thanks to T. Schraitle) (bnc#587883)&lt;/li&gt;
&lt;li&gt;Medium: Shell: repair clone/ms cleanup (bnc#583288)&lt;/li&gt;
&lt;li&gt;Medium: Shell: catch IOErrors when opening files&lt;/li&gt;
&lt;li&gt;Medium: Shell: check for duplicate children when creating groups (lf#2326)&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not allow score-attribute in orders&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not fiddle with cib when there is no cib (bnc#575701)&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not produce empty resource sets when adding roles/actions&lt;/li&gt;
&lt;li&gt;Medium: Shell: do not verify empty configurations (lf#2316)&lt;/li&gt;
&lt;li&gt;Medium: Shell: fix CIB upgrade command (bnc#578637)&lt;/li&gt;
&lt;li&gt;Medium: Shell: fix exit code for template apply&lt;/li&gt;
&lt;li&gt;Medium: Shell: fix reference replacement in resource sets&lt;/li&gt;
&lt;li&gt;Medium: Shell: install crm_cli.txt also in the datadir&lt;/li&gt;
&lt;li&gt;Medium: Shell: use the CRM_HELP_FILE variable if set&lt;/li&gt;
&lt;/ul&gt;</description><link>http://theclusterguy.clusterlabs.org/post/452813842</link><guid>http://theclusterguy.clusterlabs.org/post/452813842</guid><pubDate>Tue, 16 Mar 2010 21:25:35 +0100</pubDate><category>announce</category></item><item><title>New Pacemaker Release Series</title><description>&lt;p&gt;A number of new branches have been created in the last few days which are integral to how we plan to add new features in a controlled manner.&lt;/p&gt;

&lt;p&gt;Current set of branches:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;1.0 - The existing stable series&lt;/li&gt;
&lt;li&gt;1.1 - The current feature series&lt;/li&gt;
&lt;li&gt;1.2 - The next stable series (expected Q4 2010)&lt;/li&gt;
&lt;li&gt;devel - Where new features are added&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The idea is that &lt;em&gt;1.0&lt;/em&gt; will continue to receive only bugfixes (the number hopefully continuing to fall over time) and all new features go first into &lt;em&gt;devel&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1.1&lt;/em&gt; will be the merge point, primarily receiving bug fixes for the new features we pulled in from &lt;em&gt;devel&lt;/em&gt; (at least two weeks before a &lt;em&gt;1.1&lt;/em&gt; release).&lt;/p&gt;

&lt;p&gt;The intention is to pursue bi-monthly releases of &lt;em&gt;1.1&lt;/em&gt; over the next 6-12 months until we can offer a set of compelling new features, a sane way to use them, and rock-solid stability.&lt;/p&gt;

&lt;p&gt;At that time, we will freeze development and release it as 1.2.0.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_kz4mgizPlt1qzagr8.jpg" alt="Here is the repository layout in graphical form "/&gt;&lt;/p&gt;

&lt;h3&gt;But Can I use it in Production?&lt;/h3&gt;

&lt;p&gt;Cluster admins are naturally a cautious bunch, so in order to make &lt;em&gt;1.1&lt;/em&gt; more palatable, we have also created two new schemas.&lt;/p&gt;

&lt;p&gt;Schemas can be thought of as a contract between the developers and admins.
They are a declaration of what we support.&lt;/p&gt;

&lt;p&gt;This is the new set of validation types:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;pacemaker-1.0&lt;/li&gt;
&lt;li&gt;pacemaker-1.1 &lt;/li&gt;
&lt;li&gt;pacemaker-1.2&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;em&gt;pacemaker-1.0&lt;/em&gt; is (and will remain) &lt;strong&gt;identical&lt;/strong&gt; to what people are using now.&lt;/p&gt;

&lt;p&gt;When new features are added, their configuration syntax will first appear in &lt;em&gt;pacemaker-1.1&lt;/em&gt;.
If, based on your feedback, a feature’s syntax needs to be modified we will change it here and solicit further feedback.&lt;/p&gt;

&lt;p&gt;Once a feature has been locked down and become stable, its configuration pieces will be added to &lt;em&gt;pacemaker-1.2&lt;/em&gt;.
Once a feature appears in &lt;em&gt;pacemaker-1.2&lt;/em&gt;, we will not change its syntax in any &lt;em&gt;1.1&lt;/em&gt; or &lt;em&gt;1.2&lt;/em&gt; release.&lt;/p&gt;

&lt;h4&gt;What does this mean?&lt;/h4&gt;

&lt;p&gt;It means that when used with the &lt;em&gt;pacemaker-1.0&lt;/em&gt; or &lt;em&gt;pacemaker-1.2&lt;/em&gt; schemas, &lt;em&gt;1.1&lt;/em&gt; can be safely used in production in the same manner as &lt;em&gt;1.0&lt;/em&gt; is today.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;If the existing syntax is all you need, consider &lt;em&gt;1.1&lt;/em&gt; with the &lt;em&gt;pacemaker-1.0&lt;/em&gt; schema.&lt;/li&gt;
&lt;li&gt;If you want to try a new stable feature, use &lt;em&gt;1.1&lt;/em&gt; with the &lt;em&gt;pacemaker-1.2&lt;/em&gt; schema.&lt;/li&gt;
&lt;li&gt;If you want to try a new experimental feature, use &lt;em&gt;1.1&lt;/em&gt; with the &lt;em&gt;pacemaker-1.1&lt;/em&gt; schema.&lt;/li&gt;
&lt;/ul&gt;&lt;h3&gt;Is 1.0 Still Supported?&lt;/h3&gt;

&lt;p&gt;Yes. 1.0 is still supported and will receive all relevant bug fixes until at least 2012.
See our &lt;a href="http://www.clusterlabs.org/wiki/Releases"&gt;releases&lt;/a&gt; page.&lt;/p&gt;

&lt;h3&gt;How can I Install it?&lt;/h3&gt;

&lt;p&gt;Users of RPM-based distributions will be able to install in the usual manner. However, to avoid forcing everyone to upgrade, packages will be located under:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;a href="http://www.clusterlabs.org/rpm-next"&gt;http://www.clusterlabs.org/rpm-next&lt;/a&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;At this time the intention is to only produce packages for the most recent openSUSE (11.2), Fedora (12), and EPEL (5) based distributions. 
If you would like to try 1.1 but your distribution/version is not listed, please contact the project and we’ll see what we can do.&lt;/p&gt;

&lt;p&gt;Packages should be available in the next week “or so”.&lt;/p&gt;

&lt;h3&gt;Identified Areas of Development&lt;/h3&gt;

&lt;h4&gt;Existing&lt;/h4&gt;

&lt;ol&gt;&lt;li&gt;New stonith daemon (configuration is unchanged), including:

&lt;ul&gt;&lt;li&gt;support for RHCS Stonith Agents&lt;/li&gt;
&lt;li&gt;cluster-wide notifications (that allow the cluster to make more intelligent decisions)&lt;/li&gt;
&lt;li&gt;ability to perform multiple fencing operations in parallel (faster recovery)&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Service placement influenced by the physical resources (RAM, CPU, etc) required by the service (and offered by the host)&lt;/li&gt;
&lt;li&gt;A new tool for simulating failures and the cluster’s reaction to them&lt;/li&gt;
&lt;li&gt;Tell the cluster to serialize an otherwise unrelated set of resource actions (e.g. Xen migrations)&lt;/li&gt;
&lt;/ol&gt;&lt;h4&gt;Planned&lt;/h4&gt;

&lt;ol&gt;&lt;li&gt;Failover domains, an easy way to specify ordered preferences for a set of hosts&lt;/li&gt;
&lt;li&gt;ACLs, the ability to restrict read/write access to the configuration based on users and groups&lt;/li&gt;
&lt;li&gt;Improvements to the System Health feature&lt;/li&gt;
&lt;li&gt;Freeze/thaw, an easy way to ensure a consistent backup of data in use by distributed systems.&lt;/li&gt;
&lt;li&gt;The ability to monitor services on non-cluster nodes (such as VMs)&lt;/li&gt;
&lt;/ol&gt;</description><link>http://theclusterguy.clusterlabs.org/post/441442543</link><guid>http://theclusterguy.clusterlabs.org/post/441442543</guid><pubDate>Thu, 11 Mar 2010 17:55:12 +0100</pubDate><category>announce</category></item><item><title>Pacemaker removed from OBS</title><description>&lt;p&gt;Today I removed Pacemaker from server:ha-clustering on the openSUSE build service.&lt;/p&gt;

&lt;p&gt;I lost patience with the service some time ago and the project has been providing pre-built packages from &lt;a href="http://www.clusterlabs.org/rpm"&gt;cluster labs&lt;/a&gt; ever since (see our &lt;a href="http://www.clusterlabs.org/wiki/Install"&gt;install page&lt;/a&gt; for more details).&lt;/p&gt;

&lt;p&gt;It seems no-one else has had the time or patience to keep the build service updated since my departure so, after noticing their age and the fact that they no longer even build on the majority of targets, I made the decision to remove them.&lt;/p&gt;

&lt;p&gt;Hopefully this will help avoid confusing those wanting the latest Pacemaker software.&lt;/p&gt;</description><link>http://theclusterguy.clusterlabs.org/post/370384760</link><guid>http://theclusterguy.clusterlabs.org/post/370384760</guid><pubDate>Thu, 04 Feb 2010 11:42:00 +0100</pubDate><category>announce</category></item><item><title>Pacemaker 1.0.7 Released</title><description>&lt;p&gt;The latest installment of the &lt;a href="http://www.clusterlabs.org/wiki/Pacemaker"&gt;Pacemaker&lt;/a&gt; 1.0 stable series is now ready for general consumption.&lt;/p&gt;

&lt;p&gt;In this release, we’ve made a number of improvements to clone handling - particularly the way ordering constraints are processed - as well as some really nice improvements to the shell.&lt;/p&gt;

&lt;p&gt;The next 1.0 release is anticipated to be in mid-March.
We will be switching to a bi-monthly release schedule to begin focusing on development for the next stable series (more details soon).
If you have feature requests, now is the time to voice them and/or provide patches :-)&lt;/p&gt;

&lt;p&gt;Pre-built packages for Pacemaker and its immediate dependencies are currently building and will be available for openSUSE, SLES, Fedora, RHEL and CentOS from the &lt;a href="http://www.clusterlabs.org/rpm/"&gt;ClusterLabs Build Area&lt;/a&gt; shortly.&lt;/p&gt;

&lt;p&gt;Debian users should check for updates in &lt;a href="http://clusterlabs.org/wiki/Install#Debian"&gt;Martin’s repo&lt;/a&gt; over the coming days and Ubuntu fans can visit &lt;a href="https://edge.launchpad.net/~ubuntu-ha-maintainers/+archive/ppa"&gt;LaunchPad&lt;/a&gt; for 8.04 and 9.10 packages.&lt;/p&gt;

&lt;p&gt;The &lt;a href="http://hg.clusterlabs.org/pacemaker/stable-1.0/archive/Pacemaker-1.0.7.tar.bz2"&gt;source tarball&lt;/a&gt; is also available directly from Mercurial.&lt;/p&gt;

&lt;p&gt;General installation instructions are available from the &lt;a href="http://clusterlabs.org/wiki/Install"&gt;ClusterLabs wiki&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Release Statistics&lt;/h3&gt;

&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Changesets&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 193 &lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;b&gt;Diff&lt;/b&gt;&lt;/td&gt;
        &lt;td&gt; 220 files changed, 15933 insertions(+), 8782 deletions(-)&lt;/td&gt;
    &lt;/tr&gt;&lt;/table&gt;&lt;h3&gt;Changes of note since Pacemaker-1.0.6&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;High: PE: Bug 2213 - Ensure groups process location constraints so that clone-node-max works for cloned groups&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2153 - non-clones should not restart when clones stop/start on other nodes&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2209 - Clone ordering should be able to prevent startup of dependent clones&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2216 - Correctly identify the state of anonymous clones when deciding when to probe&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2225 - Operations that require fencing should wait for ‘stonith_complete’ not ‘all_stopped’.&lt;/li&gt;
&lt;li&gt;High: PE: Bug lf#2225 - Prevent clone peers from stopping while another instance is (potentially) being fenced&lt;/li&gt;
&lt;li&gt;High: PE: Correctly anti-colocate with a group&lt;/li&gt;
&lt;li&gt;High: PE: Correctly unpack ordering constraints for resource sets to avoid graph loops&lt;/li&gt;
&lt;li&gt;High: Tools: crm: load help from crm_cli.txt&lt;/li&gt;
&lt;li&gt;High: Tools: crm: resource sets (bnc#550923)&lt;/li&gt;
&lt;li&gt;High: Tools: crm: support for comments (LF 2221)&lt;/li&gt;
&lt;li&gt;High: Tools: crm: support for description attribute in resources/operations (bnc#548690)&lt;/li&gt;
&lt;li&gt;High: Tools: hb2openais: add EVMS2 CSM processing (and other changes) (bnc#548093)&lt;/li&gt;
&lt;li&gt;High: Tools: hb2openais: do not allow empty rules, clones, or groups (LF 2215)&lt;/li&gt;
&lt;li&gt;High: Tools: hb2openais: refuse to convert pure EVMS volumes&lt;/li&gt;
&lt;li&gt;High: cib: Ensure the loop for login message terminates&lt;/li&gt;
&lt;li&gt;High: cib: Finally fix reliability of receiving large messages over remote plaintext connections&lt;/li&gt;
&lt;li&gt;High: cib: Fix remote notifications&lt;/li&gt;
&lt;li&gt;High: cib: For remote connections, default to CRM_DAEMON_USER since that’s the only one that the cib can validate the password for using PAM&lt;/li&gt;
&lt;li&gt;High: cib: Remote plaintext - Retry sending parts of the message that did not fit the first time&lt;/li&gt;
&lt;li&gt;High: crmd: Ensure batch-limit is correctly enforced&lt;/li&gt;
&lt;li&gt;High: crmd: Ensure we have the latest status after a transition abort&lt;/li&gt;
&lt;li&gt;High (bnc#547579,547582): Tools: crm: status section editing support&lt;/li&gt;
&lt;li&gt;High: shell: Add allow-migrate as allowed meta-attribute (bnc#539968)&lt;/li&gt;
&lt;li&gt;Medium: Build: Do not automatically add -L/lib, it could cause 64-bit arches to break&lt;/li&gt;
&lt;li&gt;Medium: PE: Bug lf#2206 - rsc_order constraints always use score at the top level&lt;/li&gt;
&lt;li&gt;Medium: PE: Only complain about target-role=master for non m/s resources&lt;/li&gt;
&lt;li&gt;Medium: PE: Prevent non-multistate resources from being promoted through target-role&lt;/li&gt;
&lt;li&gt;Medium: PE: Provide a default action for resource-set ordering&lt;/li&gt;
&lt;li&gt;Medium: PE: Silently fix requires=fencing for stonith resources so that it can be set in op_defaults&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2286 - Allow the shell to accept template parameters on the command line&lt;/li&gt;
&lt;li&gt;Medium: Tools: Bug lf#2307 - Provide a way to determine the nodeid of past cluster members&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: add update method to template apply (LF 2289)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: direct RA interface for ocf class resource agents (LF 2270)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: direct RA interface for stonith class resource agents (LF 2270)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: do not add score which does not exist&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: do not consider warnings as errors (LF 2274)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: do not remove sets which contain id-ref attribute (LF 2304)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: drop empty attributes elements&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: exclude locations when testing for pathological constraints (LF 2300)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: fix exit code on single shot commands&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: fix node delete (LF 2305)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: implement -F (--force) option&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: rename status to cibstatus (LF 2236)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: revisit configure commit&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: stay in crm if user specified level only (LF 2286)&lt;/li&gt;
&lt;li&gt;Medium: Tools: crm: verify changes on exit from the configure level&lt;/li&gt;
&lt;li&gt;Medium: ais: Some clients such as gfs_controld want a cluster name, allow one to be specified in corosync.conf&lt;/li&gt;
&lt;li&gt;Medium: cib: Clean up logic for receiving remote messages&lt;/li&gt;
&lt;li&gt;Medium: cib: Create valid notification control messages&lt;/li&gt;
&lt;li&gt;Medium: cib: Indicate where the remote connection came from&lt;/li&gt;
&lt;li&gt;Medium: cib: Send password prompt to stderr so that stdout can be redirected&lt;/li&gt;
&lt;li&gt;Medium: cts: Fix rsh handling when stdout is not required&lt;/li&gt;
&lt;li&gt;Medium: doc: Fill in the section on removing a node from an AIS-based cluster&lt;/li&gt;
&lt;li&gt;Medium: doc: Update the docs to reflect the 0.6/1.0 rolling upgrade problem&lt;/li&gt;
&lt;li&gt;Medium: doc: Use Publican for docbook based documentation&lt;/li&gt;
&lt;li&gt;Medium: fencing: stonithd: add metadata for stonithd instance attributes (and support in the shell)&lt;/li&gt;
&lt;li&gt;Medium: fencing: stonithd: ignore case when comparing host names (LF 2292)&lt;/li&gt;
&lt;li&gt;Medium: tools: Make crm_mon functional with remote connections&lt;/li&gt;
&lt;li&gt;Medium: xml: Add stopped as a supported role for operations&lt;/li&gt;
&lt;li&gt;Medium: xml: Bug bnc#552713 - Treat node unames as text fields not IDs&lt;/li&gt;
&lt;li&gt;Medium: xml: Bug lf#2215 - Create an always-true expression for empty rules when upgrading from 0.6&lt;/li&gt;
&lt;/ul&gt;</description><link>http://theclusterguy.clusterlabs.org/post/340780359</link><guid>http://theclusterguy.clusterlabs.org/post/340780359</guid><pubDate>Mon, 18 Jan 2010 12:38:47 +0100</pubDate><category>announce</category></item></channel></rss>

