Chris's Blog

A Perl ustack Helper?

I’ve done some work with DTrace and Perl, using my Devel::DTrace::Provider and Andy Armstrong’s Devel::DTrace, but there’s never been a ustack helper: because of the structure of the Perl interpreter, it’s impossible to write one – at least, in the intended manner…

Perl and ustack helpers - the big problem

The primary problem with a Perl ustack helper is the lack of correspondence between the C stack and the Perl stack: unlike Python, V8 JavaScript and the like, Perl’s stack is handled entirely separately from the C stack, and this is why the “standard” style of ustack helper will never be possible: there just aren’t the C stack frames there to annotate.

What we can try though, is to annotate a single C frame with as much of the Perl stack as we can find by chasing pointers from that frame.

This post explains my experiments with that idea. To give the game away early, it doesn’t entirely work, but there are some techniques here that might be useful elsewhere.

A lot of what I’ve done here is based directly on the existing ustack helpers for other languages, specifically V8 and Python.

Dynamically loading the ustack helper

If at all possible, I want the helper to work on an unmodified Perl binary - if the thing works at all, I want to be able to use it like a CPAN-style module rather than having to patch Perl. The first problem is to get the helper loaded.

Given the way libusdt works, it seems likely we can load a helper just like a provider, by ioctl()ing the DOF down to the kernel from a Perl XS extension. Of course, that’s all DTrace’s DOF-loading _init routine does anyway; we’ll just be doing it slightly later in the process’s life.

Unfortunately this facility isn’t part of libusdt’s public API yet, but it’s really not that much code, especially if we’re only supporting Illumos-based systems.

Actually building the helper DOF is trivial: compile the script with dtrace_program_fcompile(), and wrap it in DOF with dtrace_dof_create().

Loading the DOF containing the helper program works, and means we can initialise the helper from an extension module, rather than needing to patch it into Perl’s build process.

Finding - or rather creating - a frame to annotate

Ideally we need a stack frame which is always found in the interpreter’s C stack, for which we can easily find the address of the function, and where the stack is passed as one of the arguments. There’s no such frame in the standard Perl interpreter, but we can manufacture one. Perl lets us replace the “runops loop” with one of our own, and we can use this to meet all of our requirements.

The runops loop function is the core of the interpreter, responsible for invoking the current “op”, which returns the new current op.

The usual, non-debug, runops loop looks like this (in Perl 5.16.2):

Standard Perl runops loop (run.c)
int
Perl_runops_standard(pTHX)
{
    dVAR;
    register OP *op = PL_op;
    while ((PL_op = op = op->op_ppaddr(aTHX))) {
    }

    TAINT_NOT;
    return 0;
}

The top-level loop is always visible during execution, and we can replace the usual function with one of our own, fulfilling our first two requirements.

If we make this loop execute ops through another function, and pass that function a pointer to the Perl stack, we fulfill the final requirement. These functions are dtrace_runops and dtrace_call_op:

Modified runops loop (Helper.c)
STATIC OP *
dtrace_call_op(pTHX_ PERL_CONTEXT *stack)
{
        return CALL_FPTR(PL_op->op_ppaddr)(aTHX);
}

STATIC int
dtrace_runops(pTHX)
{
        while ( PL_op ) {
                if ( PL_op = dtrace_call_op(aTHX_ cxstack), PL_op ) {
                        PERL_ASYNC_CHECK(  );
                }
        }

        TAINT_NOT;
        return 0;
}

We’ll target the annotation at dtrace_call_op(), and attempt to walk the stack starting from the PERL_CONTEXT pointer we’re given.

Actually installing the alternative runops loop is a standard Perl extension technique, and we just need to make sure it happens early enough that the top-level loop is ours rather than the standard one.

Frame annotations

The primary purpose of the ustack helper is to provide a descriptive string for a frame where there’s no corresponding C symbol - for JITted code, say. If there is such a symbol, the ustack helper’s string will be ignored - and in this case there is one: dtrace_call_op.

Fortunately there’s a mechanism for adding annotations to these frames, and that’s what we’ll use here: a string beginning with an @ will be used as an annotation. In the Python helper, it looks like this:

libpython2.4.so.1.0`PyEval_EvalFrame+0xbdf
           [ build/proto/lib/python/mercurial/localrepo.py:1849 (addchangegroup) ]

Targeting a specific frame

In a helper action, arg0 is the program counter in that stack frame. If we make the address of our inserted dtrace_call_op function available to the helper, and of the preceding function, we can compare the pc to these two addresses to determine when we’re annotating that function.

Here, this->start and this->end have been initialised to the addresses of dtrace_call_op and the preceding function:

dtrace:helper:ustack:
/this->pc >= this->start/
{
        this->go++;
}

dtrace:helper:ustack:
/this->pc < this->end/
{
        this->go++;
}

For a reason I’m not entirely sure of, combining these predicates into one doesn’t work.

Passing values into the helper

With the extra control over the helper initialisation we get from loading it “by hand”, it turns out that macros work fine! We can use this to pass values into the helper: symbol addresses and structure member offsets.

It doesn’t seem to be possible to simply #include <perl.h> - the D compiler barfs badly on Perl’s headers, which are… involved. Fortunately we can do the necessary sizeof and offsetof work in C and pass the results into D with macros. This should buy at least some ability to cope with changes to Perl’s data structures, though more sweeping changes will still break things entirely.

Macros are strings, so all the values passed need to be formatted with sprintf; at least this is just setup code.

Copying a C string

Unless I’ve missed something, this is awkward. Our final stacktrace string that the helper will return as the frame’s annotation is allocated out of D scratch space, so we need to copy C strings from userspace into it. If we have the string’s length available this is easily done with copyinto(), but if we’ve just got a char *, it’s not.

Ideally we could take the string’s length with strlen() and do a copy – but strlen isn’t available to helpers. It doesn’t seem to be possible to use strchr() either, since it returns string rather than char *, so we can’t find the length that way.

I’m not sure if the lack of strlen is an oversight, or if there’s some reason that it’s unsafe in arbitrary context: it seems that if something like strchr is safe, strlen also ought to be.

We can’t just copy a fixed length of data, so a character-by-character “strncpy” is needed:

/* Copy a string into this->buf, at the location indicated by this->off */

#define APPEND_CHR_IF(offset, str) \
dtrace:helper:ustack:                                                      \
/this->go == 2 && !this->strdone/                                          \
{                                                                          \
    copyinto((uintptr_t)((char *)str + offset), 1, this->buf + this->off); \
    this->off++;                                                           \
}                                                                          \
dtrace:helper:ustack:                                                      \
/this->go == 2 && !this->strdone && this->buf[this->off - 1] == '\0'/      \
{                                                                          \
    this->strdone = 1;                                                     \
    this->off--;                                                           \
}

#define APPEND_CSTR(str) \
dtrace:helper:ustack:    \
/this->go == 2/          \
{                        \
    this->strdone = 0;   \
}                        \
APPEND_CHR_IF(0, str) \
APPEND_CHR_IF(1, str) \
APPEND_CHR_IF(2, str) \
...
[ up to the length of string required]

Walking the stack

After all that, actually walking the stack from the pointer we’ve been passed is relatively simple. Using the information in Perlguts Illustrated, we walk the context stack, appending frame annotations to our string buffer.

Obviously it’s only possible to walk a limited number of frames - and given the default limit on helper size, and the ops required for string copies, quite a limited number.

The output!

Here’s an incredibly simple example of the output:

0     20                      write:entry 
              libc.so.1`__write+0x15
              libperl.so`PerlIOUnix_write+0x46
              libperl.so`Perl_PerlIO_write+0x47
              libperl.so`PerlIOBuf_flush+0x50
              libperl.so`Perl_PerlIO_flush+0x45
              libperl.so`PerlIOBuf_write+0x11d
              libperl.so`Perl_PerlIO_write+0x47
              libperl.so`Perl_do_print+0xa7
              libperl.so`Perl_pp_print+0x195
              Helper.so`dtrace_call_op+0x3f
                [ 
                  t/helper/01-helper.t:
                  t/helper/01-helper.t:24
                  t/helper/01-helper.t:25
                  t/helper/01-helper.t:21
                  t/helper/01-helper.t:17
                  t/helper/01-helper.t:13
                ]
              Helper.so`dtrace_runops+0x56
              libperl.so`perl_run+0x380
              perl`main+0x15b
              perl`_start+0x83

This shows file:lineno pairs for each stack frame representing a subroutine call that was found walking the context stack.

Here’s a (slightly) less trivial example, taken during a run of the CPAN shell program:

0     20                      write:entry 
              libc.so.1`__write+0x15
              libperl.so`PerlIOUnix_write+0x46
              libperl.so`Perl_PerlIO_write+0x47
              libperl.so`PerlIOBuf_flush+0x50
              libperl.so`Perl_PerlIO_flush+0x45
              libperl.so`PerlIOBuf_write+0x11d
              libperl.so`Perl_PerlIO_write+0x47
              libperl.so`Perl_do_print+0xa7
              libperl.so`Perl_pp_print+0x195
              Helper.so`dtrace_call_op+0x3f
                [ 
                  -e:
                  -e:1
                  /opt/local/lib/perl5/5.14.0/CPAN.pm:325
                  /opt/local/lib/perl5/5.14.0/CPAN.pm:325
                  /opt/local/lib/perl5/5.14.0/CPAN.pm:345
                  /opt/local/lib/perl5/5.14.0/CPAN.pm:421
                  /opt/local/lib/perl5/5.14.0/CPAN/Shell.pm:1494
                  /opt/local/lib/perl5/5.14.0/CPAN/Shell.pm:1461
                ]
              Helper.so`dtrace_runops+0x56
              libperl.so`perl_run+0x246
              perl`main+0x15b
              perl`_start+0x83

The code, and its limitations

The code is available on Github. I don’t plan to release this module to CPAN any time soon!

For anything but the most trivial examples this code probably won’t provide useful Perl stacktraces, and it’s only been tried on Perl 5.14.2 built with threads, on an Illumos-derived system.

It certainly won’t work on the Mac, since ustack helpers are disabled there, and it won’t work without threads enabled in Perl, because the OP implementation detail we’re exploiting differs in unthreaded builds.

Hopefully though, this post sheds a bit of light on ustack helpers, and maybe there are some interesting techniques here for other situations.

Deploying a PKI With Chef

Traditionally, rolling out an X.509-style PKI has been a lot of work. You’ve got to create the CA, generate all the necessary private keys and distribute them securely, create signed certificates and distribute those, and then do it all again when renewal time comes around. Often this is enough work that a simpler security model is used instead, like a shared secret.

With a modern configuration management system we ought to be able to do better than this. Our CM system can know what certificates are required where, and which CA should issue them. We’ve got an authenticated transport, and the ability for an operator to control the certificate issuing process from a central location.

This is the solution the ssl cookbook tries to provide for Chef-managed infrastructures: it allows recipes to specify the certificates required alongside the services which will use them, generates private keys right on the host, and manages the certificate issuing process via the Chef server and a separate command-line tool.

Using the ssl_certificate resource

To use the ssl cookbook, you’ll start by declaring as resources the certificates your infrastructure needs. Probably the most common will be a certificate for an https server - the standard “SSL certificate” that gives the resource its name:

A typical ssl_certificate resource (ssl_certificate1.rb)
ssl_certificate "www.example.com" do
  ca "MyCA"
  key "/etc/ssl/www.example.com.key"
  certificate "/etc/ssl/www.example.com.cert"
  type "server"
  bits 1024
  days 365
end

Here, we’re specifying a certificate whose Common Name is “www.example.com”. The other parts of the certificate’s Distinguished Name are set as node attributes, and needn’t be specified here.

The CA whose signature is requested here is MyCA. This is a string identifying your CA: it might name an internal CA, or be an abbreviation for an external CA you use, but you need to choose an appropriate “short name” for your CA here. Later, we’ll use this identifier to find the CSRs to sign.

The key and certificate locations are given here; you’d also need to specify these in your webserver configuration. The key is given restrictive 0600 permissions, and ownership defaults to root:root. The key itself is declared to be of type server, and of length 1024 bits.

A validity period of 365 days is requested for the certificate. This is a request, and need not be honoured when the CSR is signed, but it will be respected if the automatic signing tool is used.

The first time this resource is converged, the key will be generated and installed, and a temporary certificate signed by a temporary CA is also installed – this allows a webserver configured to use the key and certificate to start up successfully. Browsers connecting to the server will see a warning, indicating the correct certificate is not yet in place.

The Chef run also generates a CSR and PGP-encrypts the private key, then saves both in node attributes, in the “CSR Outbox”, along with the requested CA and validity period, ready for the signing tool.
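
The cookbook’s exact attribute layout isn’t reproduced here, but conceptually each outbox entry carries everything the signing tool needs. A hypothetical sketch, with invented field names:

A hypothetical outbox entry (sketch)
# field names are illustrative, not the cookbook's actual schema
node.set["csr_outbox"]["www.example.com"] = {
  "csr"  => "-----BEGIN CERTIFICATE REQUEST-----\n...",  # the request itself
  "key"  => "-----BEGIN PGP MESSAGE-----\n...",          # PGP-encrypted private key
  "ca"   => "MyCA",                                      # requested signing CA
  "days" => 365                                          # requested validity period
}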

Signing certificates

The chef-ssl tool is provided as a Ruby gem, called chef-ssl-client, and is maintained in the cookbook’s git repository. This is a standalone command line tool, which handles the certificate issuing parts of the process, but which also provides for the creation of new CAs and the issuing of ad hoc certificates.

Often you’ll be signing certificates with your own CA for internal services, then deploying that CA to your clients. chef-ssl makes it easy to sign CSRs for these internal CAs:

$ chef-ssl autosign --ca-name MyCA --ca-path ./MyCA

You’ll be prompted for the CA key’s passphrase, and then a Chef search will be performed, looking for data in individual nodes’ outboxes. The search is constrained by the CA identifier you specified.

Each matching CSR is then presented in turn, and if you’re happy to issue the signed certificate, answering “yes” will do a number of things:

  • Create a certificate for the CSR
  • Sign the certificate with your CA
  • Upload the certificate as a Chef data bag item

The data bag item is named for the Common Name of the certificate, and contains all the information relating to the individual key and certificate: the PGP-encrypted key (stored for archival purposes), the CSR, the certificate itself, the date of issue and the signing CA’s certificate. The data bag item may be archived to back up both the key and certificate.
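
To make that concrete, here’s roughly the shape of such an item as a recipe might see it from data_bag_item - the field names are my guesses, not necessarily the cookbook’s actual schema:

A hypothetical issued-certificate data bag item (sketch)
# field names are illustrative
item = {
  "id"             => "www_example_com",                  # derived from the Common Name
  "ca"             => "MyCA",                             # signing CA's short name
  "date"           => "2012-06-01",                       # date of issue
  "csr"            => "-----BEGIN CERTIFICATE REQUEST-----\n...",
  "key"            => "-----BEGIN PGP MESSAGE-----\n...", # archived, PGP-encrypted
  "certificate"    => "-----BEGIN CERTIFICATE-----\n...",
  "ca_certificate" => "-----BEGIN CERTIFICATE-----\n..."  # signing CA's certificate
}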

Installing Certificates

On the next run after the certificate is issued and uploaded, Chef notes the existence of the relevant data bag item, downloads it and installs the certificate and the signing CA’s certificate.

Having installed the real certificate, Chef then removes the CSR from the outbox – it’s important here that only the Chef node ever updates the outbox, and only the chef-ssl script ever updates data bags. In this way, clobbering of data is avoided, and it’s clear which part of the system is responsible for each piece of data.

Notifications are sent when the new certificate is installed, which allows (for example) the relevant web server to be restarted and begin using the certificate.
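
In Chef terms, that means the webserver’s service resource can subscribe to the certificate resource in the usual way. A minimal sketch (the nginx service name is assumed for illustration):

Restarting a webserver on certificate installation (sketch)
service "nginx" do
  supports :restart => true
  action [:enable, :start]
  # pick up the key and certificate when the real certificate arrives
  subscribes :restart, "ssl_certificate[www.example.com]"
end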

Summary

This cookbook should allow you to concentrate on choosing the right security architecture for your systems, rather than avoiding a genuine PKI, where it’s the right solution, merely because of the work involved.

The cookbook is available here:

https://github.com/VendaTech/chef-cookbook-ssl

where you’ll find further documentation and examples. GitHub issues and pull requests are most welcome.

Custom chef_gem Resources

Since Chef 0.10.10, there’s been a new core resource for installing Ruby gems to support Chef cookbooks: chef_gem. This installs gems into the current Ruby at compile time, and handles making the newly-installed gems available to the running Chef client – wrapping up an ugly piece of code found in many cookbooks:

Before chef_gem (before.rb)
gem_package "ruby-shadow" do
  action :nothing
end.run_action :install

Gem.clear_paths

into a single resource declaration:

With chef_gem (after.rb)
chef_gem "ruby-shadow"

We hadn’t considered switching to this resource though, despite a compatibility cookbook being available for older Chef versions, because our local Ruby environment is based on RPMs. We build a single Ruby RPM for all our management tools, with individual gems also built into RPMs, so our gem dependency resources specify yum_package rather than gem_package.

This hasn’t given us any problems beyond a need to patch cookbooks to install our RPMs, but now that we’re open sourcing some of our cookbooks, it means our dependency declarations are specific to our local environment.

Using chef_gem as an abstraction though, we can specify dependencies in a way that’s appropriate for an open source cookbook but have a local override which actually installs our RPMs. We also avoid having to patch cookbooks just to make sure our RPMs are installed for dependencies.

This is the resource definition, which is a shim between the chef_gem resource and the yum_package provider, adjusting the requested package name as we need for our RPMs.

Local chef_gem (libraries/chefgem.rb)
class Chef
  class Resource
    # remove any existing ChefGem resource class (core Chef's, or the
    # compatibility cookbook's) so that our subclass below replaces it
    begin
      send(:remove_const, 'ChefGem')
    rescue
      nil
    end

    class ChefGem < Chef::Resource::YumPackage

      def initialize(name, run_context=nil)
        # our gem name -> rpm name mapping
        name = "rubygem19-#{name}"
        super
        @resource_name = :chef_gem
        @provider = Chef::Provider::Package::Yum
        after_created
      end

      def after_created
        Array(@action).flatten.compact.each do |action|
          self.run_action(action)
        end
        Gem.clear_paths
      end
    end
  end
end

This code goes in a local-environment cookbook as a library. Once we upgrade to Chef 0.10.10 or later, it will override the core chef_gem resource, and continue to install our RPMs.
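
With the shim in place, open source cookbooks can keep the generic declaration while our local override quietly maps it onto our packaging. For example:

Using the shimmed resource (usage)
# installs the "rubygem19-ruby-shadow" RPM via yum at compile time,
# thanks to the name mapping in the shim above
chef_gem "ruby-shadow"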

Libusdt - Creating DTrace Providers at Runtime

With this post I’d like to introduce libusdt, a library for creating DTrace USDT providers at runtime, intended as the basis for implementing custom DTrace providers in dynamic languages, but applicable wherever the provider definition isn’t known at compile time.

A bit of the history of this library: in 2007 I’d been tinkering with a Ruby extension to allow access to libdtrace as a consumer of DTrace data, with the hope that this might make it easier to write complex analysis scripts - like this port to ruby of Chris Gerhard’s scsi.d script.

At this point, the Ruby interpreter had been patched by Joyent to include its own USDT provider, which was accessible using the consumer library. What wasn’t possible though was to define providers in Ruby, to create probes meaningful to the application rather than just to the interpreter. My first attempt at this, autogenerating and building the C source and D script required to specify a provider, was never going to be anything other than an experiment.

This post by Keith McGuigan, though, showed that the JVM was capable of creating providers at runtime, without invoking the usual compile-time machinery. It seemed that, to do this using the existing USDT provider in Solaris, the JVM would have to be doing a number of things:

  • Dynamically creating tracepoints in the running process
  • Dynamically creating DOF to describe the provider to the kernel
  • Loading the DOF to enable the provider.

Was the JVM doing all this? Keith’s blog post didn’t discuss the low-level implementation, and there didn’t seem to be a way to do it with libdtrace…

This was just before dtrace.conf(08). I was fortunate enough to attend and was able to speak to Keith about the JVM implementation. It turns out that this is pretty much how it works - the JVM with its JIT-compiler already has the capability of creating new native code at runtime, and creating the appropriate DOF document is simply a matter of following the extensive documentation in <sys/dtrace.h>.

I got this working in Ruby, and so now ruby-dtrace could both produce and consume DTrace data, and eventually do it on all the platforms supporting USDT providers - Solaris and Mac OS X, SPARC, Intel and PowerPC, 32 and 64 bit (it’s entirely responsible for the obsolete computers I still keep around…). This supported providers like Rack::Probe.
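
To give a flavour of what this enabled: with ruby-dtrace, a provider could be declared and its probes fired entirely at runtime, along these lines. I’m reconstructing the API shape from memory here, so treat the exact names as approximate:

Declaring a provider at runtime (sketch; API names approximate)
require 'dtrace'

# declare the provider and its probes; the tracepoints and DOF are
# generated and loaded behind the scenes
Dtrace::Provider.create :myapp do |p|
  p.probe :request_start, :string
  p.probe :request_done,  :string, :integer
end

# fire a probe; the block only runs when the probe is enabled,
# so costly argument-gathering work is avoided otherwise
Dtrace::Probe::Myapp.request_start do |p|
  p.fire('/index.html')
end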

Devel::DTrace::Provider followed for Perl, written mostly in Perl just as ruby-dtrace is mostly Ruby - it was clear that the core of these extensions needed to be a standalone library. One more similar port, node-dtrace-provider, prompted the start of libusdt.

Incidentally, the Perl port was written to add probes to my employer’s large Perl web-app for an evaluation on Solaris - we immediately discovered unexpected behaviours, with the additional context provided by the application’s probes going some way towards making up for the lack of a Perl ustack helper.

libusdt is currently very limited - right now it only supports the Mac, and then only for 64 bit processes - but my plan is to bring over all the platform support from ruby-dtrace and then to port all the language bindings to use it as the core of their implementation.

Once that’s done, I’d like to find ways to reduce the disabled-probe overhead - while is-enabled probes are supported and used to allow costly argument-gathering work to be avoided where possible, the use of generated stub functions means there’s still some overhead.

Some of this work will undoubtedly be language-specific, tying the generated probes in more closely with the runtimes, but some may also come from improvements in the library itself.

Ruby 1.9.2 and YAML

I found another gotcha when embedding MCollective in a larger application: Ruby’s YAML library situation. Historically Ruby has used the Syck library, but since 1.9.2 libyaml is used if available, through a set of bindings called Psych.

While Syck is still available and may be selected at runtime, the default is chosen at compile time, based on the presence or not of libyaml - if it’s there when you build Ruby, you’ll get Psych by default.

To find out which binding your Ruby uses, inspect YAML::ENGINE.yamler:

Finding the current YAML library
ruby-1.9.2-p290 :004 > YAML::ENGINE.yamler
 => "syck"
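
Selecting the engine at runtime is a one-liner; done early enough, before any YAML has been parsed, it forces Syck regardless of the compile-time default:

Selecting Syck at runtime
require 'yaml'
YAML::ENGINE.yamler = 'syck'   # per-process override of the compile-time default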

This wouldn’t be a problem if the two libraries behaved identically, and although it’s fair to say that Psych is more compliant with the YAML spec, it’s less tolerant of slight deviations - some of which Syck is guilty of.

Coming back to MCollective, this means that an MCollective running on Ruby 1.9.2 with Psych won’t necessarily be able to communicate with other MCollective installations running on Ruby 1.8 with Syck. Mostly this seems to be down to handling of whitespace in multiline YAML values, which MCollective makes heavy use of since it nests one YAML document in another.

The initial fix in my environment was to rebuild Ruby 1.9.2 without libyaml, so Syck is used everywhere. Beyond that, upgrading everything to the same Ruby 1.9.2 build appears to be the best long term answer.

Embedding MCollective

MCollective is “a framework to build server orchestration or parallel job execution systems”. It’s very easy to create and deploy agents to your hosts, and then interrogate or instruct them with simple command-line tools. For ad hoc tasks, this is ideal.

As part of our platform engineering work though, we’re integrating MCollective with existing tools, and with our new orchestration systems – using MCollective as the glue between a centralised controller and its agents on individual hosts.

In this situation, our “client” isn’t a command-line tool but a larger Rails application with a number of other dependencies. Bringing the mcollective tools into this environment isn’t straightforward, since a number of the default options need to be changed.

Here’s my first attempt to embed an MCollective client in my application:

Embedding with default options (mco-embed.rb)
require 'mcollective'

class EmbeddedClient
  include MCollective::RPC

  def self.client(agent)
    client = rpcclient(agent)
  end
end

This has a number of problems.

I need to make the “mcollective” library code available. There’s currently no gem available, so it’s not possible to just gem "mcollective", "1.2.0" in the Gemfile. There are also the MCollective plugins, which need to be kept in sync with the core library version.
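
The approach the configuration below assumes is to vendor MCollective into the application tree: copy the library and plugins in, and put the lib directory on the load path before requiring it (the paths here are illustrative):

Vendoring MCollective (sketch)
# e.g. in an initializer, before anything requires 'mcollective';
# assumes the library has been copied to vendor/mcollective
$LOAD_PATH.unshift File.expand_path('../../vendor/mcollective/lib', __FILE__)

require 'mcollective'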

The configuration also needs to be available, and the default location is /etc/mcollective/client.cfg - which is naturally outside of any deployed application’s root, and so we’d need to deploy that separately.

Having made the code available, the next problem to deal with is command-line option parsing. By default, MCollective will parse and validate the process’s ARGV, and raise an exception if it finds an option it doesn’t recognise. This isn’t appropriate in an embedded client, so we need to bypass option parsing.

Here’s how to create an embedded MCollective client, taking these points into account:

Embedding, take 2 (mco-embed2.rb)
require 'mcollective'

class EmbeddedClient
  include MCollective::RPC

  def self.client(agent)
    options =  MCollective::Util.default_options
    options[:config] = 'config/mcollective.cfg'
    client = rpcclient(agent, {:options => options})
    client.discovery_timeout = 10
    client.timeout = 120
    client
  end
end

and here’s the configuration file it points to:

Configuration (mcollective.cfg)
topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = vendor/mcollective/plugins
logfile = stdout
loglevel = debug

# Plugins
securityprovider = psk
plugin.psk = yeah

connector = stomp
plugin.stomp.host = 10.101.1.16
plugin.stomp.port = 61613
plugin.stomp.user = guest
plugin.stomp.password = guest

# Facts
factsource = yaml
plugin.yaml = config/mcollective_facts.yaml

This works, but there’s a problem - in rpcclient(), there’s a call to exit! on any exception during the client setup process, including exceptions thrown by the stomp library. We need to avoid that and handle exceptions ourselves, so:

Embedding, take 3 (mco-embed3.rb)
require 'mcollective'

class EmbeddedClient
  include MCollective::RPC

  def self.client(agent)
    options =  MCollective::Util.default_options
    options[:config] = 'config/mcollective.cfg'
    client = MCollective::RPC::Client.new(agent, :options => options)
    client.discovery_timeout = 10
    client.timeout = 120
    client
  end
end
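
Since nothing calls exit! on our behalf any more, setup failures surface as ordinary exceptions for the application to deal with. A minimal sketch, assuming a Rails-style logger; rpcutil is just an example agent:

Handling client setup errors (sketch)
begin
  client = EmbeddedClient.client('rpcutil')
rescue => e
  # a stomp connection failure, for example, now lands here
  # instead of killing the process with exit!
  Rails.logger.error("MCollective client setup failed: #{e.message}")
  raise
end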

Now we can handle exceptions ourselves, but there’s still an issue - the stomp library writes directly to $stderr and we should be collecting that output and logging it in whatever way is appropriate for the application:

Logging $stderr (mco-embed4.rb)
def log_stderr &block
  begin
    real_stderr, $stderr = $stderr, StringIO.new
    yield
  ensure
    $stderr, stderr = real_stderr, $stderr
    stderr.string.each_line do |line|
      log(:error, "stderr: #{line}")
    end
  end
end

# and then:
log_stderr do
  run_mcollective_processes
end

At this point, we’ve got the following problems solved:

  • MCollective’s library code and plugins are within our application tree
  • The configuration file is also within the application
  • MCollective won’t try to reinterpret our application’s ARGV
  • We can handle client-setup exceptions ourselves
  • Stomp errors are logged, rather than being output to stderr.

which should let the application interact with MCollective in a reliable and maintainable way.