November 2011
1 post
Pacemaker 1.0.12 Released
Thanks once again to the efforts of Keisuke MORI from NTT, the latest bug fixes have been back-ported from 1.1 and another installment of the Pacemaker 1.0 release series is now ready for general consumption.
Changesets
96
Diff
121 files changed, 8617 insertions(+), 988 deletions(-)
Important changes since Pacemaker-1.0.11 include:
cib: Call gnutls_bye() and...
October 2011
2 posts
New Version Control System
Since September, Pacemaker has started using Git for the 1.1 and devel trees.
There were some minor technical advantages over Mercurial (which I still personally prefer), but mostly the decision was driven by the pain associated with switching between SCMs multiple times a day.
The majority of development now happens on GitHub, which has some great features for reviewing patches and general...
New Issue Tracker
Since it’s clearly not acceptable for our issue tracker to be offline for months at a time, it is time to replace the Bugzilla instance hosted by the Linux Foundation with something else.
One candidate that came close was the GitHub issue tracker, but alas it doesn’t support attachments.
The end result is that we now have an instance of Bugzilla v4 at:
http://bugs.clusterlabs.org
...
May 2011
1 post
Pacemaker 1.0.11 Released
The latest installment of the Pacemaker 1.0 release series is now ready for general consumption.
Changesets
85
Diff
500 files changed, 69642 insertions(+), 58270 deletions(-)
Thanks once again to the efforts of Keisuke MORI and NTT, the latest bug fixes have been back-ported from 1.1.
Important changes since Pacemaker-1.0.10 include:
cib: Repair the processing...
February 2011
1 post
Pacemaker 1.1.5 Released
The latest installment of the Pacemaker 1.1 release series is now ready for general consumption.
Changesets
184
Diff
605 files changed, 46103 insertions(+), 26417 deletions(-)
As well as the usual round of bug fixes (see the full changelog), S.U.S.E. has implemented support for ACLs.
This means that you can now delegate permission to control parts of the cluster (as...
November 2010
2 posts
New Logo?
One unexpected outcome from the recent Linux Plumbers conference was the contribution of a new logo to the project by NTT.
Quite possibly you’re now wondering how this logo relates at all to clustering and the Pacemaker project.
Don’t worry, they came up with a backstory too!
In various forms of racing there is quite often someone/something setting a benchmark time or...
Pacemaker Release Roundup
It may have seemed quiet since July, but things were actually so busy that I couldn’t find the time to publicize our new releases.
First up, the long awaited 1.0.10 is finally here.
Thanks once again to the hard work of Keisuke MORI from NTT, 1.0.10 contains all the bug fixes from the recent 1.1.3 and 1.1.4 releases.
You can preview the list of updates with the new online change log.
...
October 2010
2 posts
Pacemaker, Heartbeat, Corosync, WTF?
One question I still get a lot is what all these projects are/do and how they all relate.
Here is the list of possible components that might make up a Pacemaker install:
Pacemaker - Resource manager
Corosync - Messaging layer
Heartbeat - Also a messaging layer
Resource Agents - Scripts that know how to control various services
Pacemaker is the thing that starts and stops services...
Large Cluster Performance
Over the last few days, I’ve spent a bunch of time improving Pacemaker’s performance in large clusters.
This involved profiling the CIB and Policy Engine, identifying and optimizing hotspots and improving algorithm designs.
Since most of my work is done in virtual machines, it wasn’t possible to use oprofile.
Strictly speaking oprofile worked, but without hardware performance...
August 2010
1 post
Introducing the Pacemaker Master Control Process...
The latest addition to the Pacemaker 1.1 series is a master control process (MCP) and associated init script.
This means that Pacemaker is now started/stopped independently of the messaging layer.
We anticipate that this should result in a simpler and more reliable startup/shutdown procedure when used in combination with Corosync.
Forking inside a multi-threaded process like Corosync causes...
June 2010
1 post
Pacemaker 1.0.9 Released
The latest installment of the Pacemaker 1.0 stable series is now ready for general consumption.
Coinciding with 1.0.9 is a new version of Corosync (1.2.5).
Included in both are some important fixes that should resolve most of the startup issues people have been seeing.
Also included in this release are the fixes for issues reported by Valgrind and Coverity.
As per our release calendar, the...
May 2010
2 posts
Feature Spotlight: Utilization
New in 1.1 is the ability for Pacemaker to factor the system resources (RAM, CPU, etc) into its placement algorithms.
First, simply define the system resources provided by your nodes.
We’ll use cores in this example, but you can literally use any name you care to dream up.
crm configure node pcmk-1 utilization cores=2
crm configure node pcmk-2 utilization cores=4
Then we tell the...
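To make the idea of capacity-aware placement concrete, here is a minimal sketch in plain shell. It is an illustration only, not Pacemaker's placement algorithm: the helper name `nodes_with_room` and its `NAME=CAPACITY` argument format are assumptions invented for this example. It simply lists the nodes whose declared capacity is at least what a resource would need.

```shell
# nodes_with_room REQUIRED NAME=CAPACITY...
# Hypothetical helper (not part of Pacemaker or the crm shell).
# Prints the name of every node whose capacity is >= REQUIRED.
nodes_with_room() {
    local required=$1 pair
    shift
    for pair in "$@"; do
        # ${pair%%=*} is the node name, ${pair#*=} is its capacity
        if [ "${pair#*=}" -ge "$required" ]; then
            printf '%s\n' "${pair%%=*}"
        fi
    done
}
```

With the two nodes defined above, `nodes_with_room 3 pcmk-1=2 pcmk-2=4` prints only `pcmk-2`, since a resource needing three cores cannot fit on a node that provides two.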
Pacemaker ships as part of Ubuntu 10.04 - Lucid...
Ubuntu 10.04 LTS now comes with full support for Pacemaker on Corosync and Heartbeat:
http://fghaas.wordpress.com/2010/05/03/ubuntu-10-04-with-full-cluster-stack-support/
Kudos to everyone involved!
April 2010
2 posts
New Pacemaker Packages
I’ve begun uploading 1.0.8-3 to the clusterlabs.org servers.
Upon closer inspection, it became apparent that the 1.0.8-2 packages were built with the wrong tarball and this led to some substantial problems with the shell.
To rectify this, I’ve built 1.0.8-3.
This new version uses the original 1.0.8 tarball and an updated spec file (to fix the snmp dependencies).
Apologies for the...
Pacemaker in Debian
Good news for Debian fans: Pacemaker has officially made it into Sid.
According to their blog post, it should also be available as an official backport for existing Debian stable releases “soon”.
March 2010
3 posts
Website Updates
The http://www.clusterlabs.org server has been migrated and now features a new splash-page and a custom skin for the wiki.
Hopefully the splash page will be a more helpful entry point for people exploring the project for the first time.
http://clusterlabs.org will still show the old site until the weekend (when I switch over the mail server too).
Pacemaker 1.0.8 Released
The latest installment of the Pacemaker 1.0 stable series is now ready for general consumption.
In this release, apart from various bug-fixes, Dejan has split the shell up into modules.
It is anticipated that this will make it easier to maintain moving forward.
We are now following the published release schedule on the clusterlabs wiki.
The next release is planned for mid-June and our main...
New Pacemaker Release Series
A number of new branches have been created in the last few days which are integral to how we plan to add new features in a controlled manner.
Current set of branches:
1.0 - The existing stable series
1.1 - The current feature series
1.2 - The next stable series (expected Q4 2010)
devel - Where new features are added
The idea is that 1.0 will continue to receive only bugfixes (the amount...
February 2010
1 post
Pacemaker removed from OBS
Today I removed Pacemaker from server:ha-clustering on the openSUSE build service.
I lost patience with the service some time ago and the project has been providing pre-built packages from cluster labs ever since (see our install page for more details).
It seems no-one else has had the time or patience to keep the build service updated since my departure so, after noticing their age and the...
January 2010
3 posts
Pacemaker 1.0.7 Released
The latest installment of the Pacemaker 1.0 stable series is now ready for general consumption.
In this release, we’ve made a number of improvements to clone handling - particularly the way ordering constraints are processed - as well as some really nice improvements to the shell.
The next 1.0 release is anticipated to be in mid-March.
We will be switching to a bi-monthly release schedule...
Ubuntu looking for Pacemaker testers
Ubuntu is looking to switch its supported cluster stack to Corosync+Pacemaker and has put out a “Call for testers”.
Check out the link if this is something you’re interested in.
Pre-Announce: End of Pacemaker 0.6 support is near
Unless there are violent objections, I plan to officially stop supporting 0.6 at the end of February.
Since I’ve not seen any bugs reported for some time, it seems that anyone still using 0.6 is happy with it for their workload.
Also, 1.0 has been out for over a year now and contains significant improvements over 0.6 including:
A unified shell that hides the XML scaffolding
Migration...
November 2009
2 posts
New Documentation Formats
I’m pleased to report that the core Pacemaker documentation is now available in PDF, HTML (chunked and single page) and even TXT formats.
The old Pages.app sources have been replaced with DocBook which allows them to be:
published in a variety of formats
kept under version control
included in the packages
updated by anybody
Additionally, we’re using Publican to produce the...
Pacemaker 1.0.6 Released
The next installment of the Pacemaker 1.0 stable series is now ready for general consumption.
In addition to further polishing of the crm shell and CLI tools, this is the first release to support Corosync (version 1.1.2 or greater is required).
The “Pacemaker Explained” reference has also been converted to DocBook and is included as part of the tarball (and pre-built packages if...
October 2009
1 post
Advisory: Don't use Pacemaker on Corosync (yet)
I spent some time looking into the state of the Pacemaker/Corosync integration today and I can only recommend Pacemaker users stay on the previous version of OpenAIS (aka Whitetank).
In a nutshell, shutdown is utterly broken.
r2140 of Corosync removed the shutdown worker thread which allowed plugins such as Pacemaker to continue sending and receiving cluster messages.
Without it, Corosync...
September 2009
3 posts
Clusters From Scratch
The first of a new series of step-by-step guides for Pacemaker.
This installment covers installation, the creation of an active/passive cluster and its conversion to active/active.
Technologies used include:
Fedora 11 as the host operating system
OpenAIS to provide messaging and membership services,
Pacemaker to perform resource management,
DRBD as a cost-effective alternative to shared...
Version Control Prompt
I find it convenient to include current SCM data before my regular Bash prompt (reduces the chance of “accidents”). Perhaps someone else will find it useful too.
function prompt-pre-exec() {
scm=""
repo_root=$(hg root 2>/dev/null)
if [ -e CVS ]; then
scm=":: cvs ::"
elif [ -e .svn ]; then
scm=":: svn : ${prompt_hl}r$(svn info | grep Revision | sed...
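As a separate, self-contained sketch of the same idea (not the author's function): the hypothetical helper `scm_label` below only inspects metadata directories, so it needs none of the SCM tools installed. It assumes the usual layouts: a `CVS` or `.svn` directory inside the working directory, and `.hg` or `.git` at the repository root.

```shell
# scm_label [DIR]
# Hypothetical helper: prints a short tag naming the version control
# system that manages DIR (default: current directory), or nothing.
scm_label() {
    local dir
    dir=$(cd "${1:-.}" && pwd) || return 1
    # CVS and older Subversion keep metadata in the checkout directory itself.
    if [ -e "$dir/CVS" ]; then echo ":: cvs ::"; return 0; fi
    if [ -e "$dir/.svn" ]; then echo ":: svn ::"; return 0; fi
    # Mercurial and Git keep it at the repository root, so walk upwards.
    while :; do
        if [ -e "$dir/.hg" ]; then echo ":: hg ::"; return 0; fi
        if [ -e "$dir/.git" ]; then echo ":: git ::"; return 0; fi
        [ "$dir" = "/" ] && return 0
        dir=$(dirname "$dir")
    done
}
```

One way to use it is through command substitution in the Bash prompt, for example `PS1="\$(scm_label) $PS1"`, so the tag is recomputed each time the prompt is drawn.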
Configuring Heartbeat v1 Was So Simple
…because it couldn’t do anything.
People who loved how simple Heartbeat v1 was to configure often complain how complex Pacemaker is.
But the key differences between the two configurations are driven by the very features that haresources-based clusters couldn’t provide.
Granted we made a mess of things with the original XML syntax.
When the job of writing the CRM/Pacemaker...
August 2009
7 posts
Another Documentation Update
Quick FYI… I’ve made some more improvements to the Configuration Explained PDF
http://clusterlabs.org/mediawiki/images/f/fb/Configuration_Explained.pdf
Changes include:
Fixed a number of date based rule examples
Updated details on the stonith-enabled option
Fixed the URL for obtaining the XSLT conversion script
Explanations of the possible values for the target-role and...
Dev Repository Recreated
For a variety of reasons, the Pacemaker dev repository has been recreated and its history pruned of non-pacemaker related changes.
Any existing clones need to be removed and re-cloned as all the changeset hashes have changed and it is fundamentally a different repository.
The good news, though, is that with the help of some Mercurial hacking, any links using the “old” hashes...
Pacemaker 1.0.5 Released
I’m back from vacation so it’s time for another Pacemaker bug-fix release.
Testing went flawlessly and so without further ado, here it is…
Pre-built packages for Pacemaker and its immediate dependencies will be available for openSUSE, SLES, Fedora, RHEL, CentOS from the openSUSE Build Service once it catches up.
Debian users should check for updates in Martin’s repo...
Poll: Which Distro do you use for Pacemaker?
Please let us know by filling out the following poll:
http://www.clusterlabs.org/wiki/UsagePoll
Any version of Pacemaker counts, as do versions of Heartbeat running the CRM.
Choose the Right Hardware
Recently I was asked to help diagnose a cluster that was behaving incredibly badly.
In this case, it was a 2-node cluster based on OpenAIS and they were simulating failures.
They’d initiate a failure and sure enough, the other node would recognize and respond appropriately.
So far so good.
The problems started when the failed node rebooted.
When the failed node returned, it came up...
Updated Documentation
The Configuration Explained PDF has been updated for 1.0.4/5
http://clusterlabs.org/mediawiki/images/f/fb/Configuration_Explained.pdf
Changes include:
Instructions for enabling resource migration
Clarified the difference between moving and migrating a resource
Clarified how the semantics of resource sets differ from those of groups
Fixed instructions for enabling Pacemaker for...
Pacemaker 1.0.5: Testing In Progress
A quick note to say that 1.0.5 testing officially started today.
Release testing usually takes 1-2 weeks.
Currently queued changes for this release:
High (bnc#507255): Tools: crm: implement date expressions
High: Build: Fix compilation when snmp and esmtp are not available
High: Core: Show help text and exit with rc 1 if option processing failed
High: PE: Bug 2160 - Dont shuffle clones due...
July 2009
2 posts
Pacemaker in Fedora 12
Good news for Fedora fans: we’ve successfully navigated the required red tape and Pacemaker will ship in Fedora 12.
Hopefully Debian and Ubuntu will not be far behind.
Resource Migration and Regression Testing
Yesterday I was working on a migration bug.
It didn’t take long to identify or fix, and afterwards I was terribly pleased with myself.
The fix was simple, elegant and allowed the cluster to use migration (instead of stop then start) more often.
Why had I not seen how easy it was sooner?
Unfortunately it was because I’d ignored half the problem.
One decision I’m...
June 2009
1 post
Pacemaker 1.0.4 Released
It took a little longer than expected, but the latest 1.0 maintenance release (1.0.4) is finally available.
Apart from a number of important bug fixes, the latest release is the first to include comprehensive man-pages for all CLI tools. These are generated from the source code using help2man and so are guaranteed to be accurate.
Unfortunately for RHEL and CentOS users, those distros...
May 2009
6 posts
Highly Available Data Corruption
Whenever there is doubt, there is no doubt
- Robert De Niro, Ronin
There is little point ensuring service continuity if the underlying data is toast.
Pacemaker makes use of a concept called STONITH to prevent this from happening but many people don’t understand what it is or why it is so important.
What is STONITH?
STONITH is an acronym for “Shoot The Other Node In The...
STONITH Death-match →
Nice description of STONITH’s limitations in a 2-node environment
A Brief, Incomplete, and Mostly Wrong History of... →
Why Won't the Cluster Start my Services?
It’s a common question and a worthy topic for an extended article.
Here are the steps I usually follow when diagnosing such issues.
Is the cluster allowed to start services?
Check quorum status with crm_mon --one-shot
Quorum is a property of the cluster which is attained when more than half of the known nodes are online.
Unlike Heartbeat, OpenAIS based clusters don’t...
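The "more than half" rule can be written out as a small shell check. This is a sketch of the arithmetic only, not how Pacemaker or the messaging layer computes quorum; the function name `has_quorum` and its two arguments are assumptions made for illustration.

```shell
# has_quorum KNOWN ONLINE
# Hypothetical helper: succeeds (exit status 0) when strictly more than
# half of the KNOWN nodes are ONLINE.
has_quorum() {
    local known=$1 online=$2
    # Avoid fractions: online > known/2 is the same as 2*online > known
    [ $(( online * 2 )) -gt "$known" ]
}
```

For example, `has_quorum 3 2` succeeds, while `has_quorum 2 1` fails: a single surviving node in a 2-node cluster is exactly half, not more than half.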
raison d'etre
This tumbl/blog/thingy exists because I’ve finally accepted that “If we build it, they will come” is a fallacy. The internet is a big place and if you don’t speak up, you’ll get lost in the noise of those that do.
So, I’m going to try and use this place to raise awareness of a project that’s very important to me - Pacemaker - an incredibly advanced open...
Is this thing on?
Nothing to see here yet. Just taking the software for a spin.