Ganglia GMond Python Modules

Ng Zhi An edited this page Jul 28, 2014 · 2 revisions

Gmond Python metric modules

One of the new features of Ganglia 3.1.x is the ability to create C/Python metric gathering modules. These modules can be plugged directly into gmond to monitor user-specified metrics.

In previous versions (2.5.x, 3.0.x), the only way to add user-specified metrics was the gmetric command-line tool, and the only way to inject metrics into gmond was to run gmetric from a cron job or some other process. While this works for most people, it makes user-specified metrics difficult to manage.

This document will dive into the specifics for writing a Python metric monitoring module.

The following are prerequisites for building/using Python module support:

  • Ganglia 3.1.x
  • Python 2.3.4+ (the oldest tested version, which ships with Red Hat Enterprise Linux 4; older 2.3 versions should work as well)
  • Python development headers (usually in the form of python-devel binary packages)

Installation

RPM

If you are installing Python metric module support on an RPM-based system, install the ganglia-gmond-modules-python RPM. This includes everything needed for Python metric module support to work.

APT

apt-get install ganglia-monitor

Also see additional notes below.

Source

If you are building from source, please make sure that you include the --with-python option during configure. If the Python interpreter is detected, this option will be added automatically.

Checklist

To confirm that your Ganglia installation has Python support correctly set up, check the following:

  • gmond.conf has a line which reads something along the lines of include ("/etc/ganglia/conf.d/*.conf"). This is the directory where you should place configuration files for your Python modules as .pyconf files
  • modpython.conf exists in /etc/ganglia/conf.d - it contains a directive which will include the pyconf files
  • modpython.so exists in /usr/lib(64)/ganglia (that is, /usr/lib/ganglia or /usr/lib64/ganglia, depending on your system)
  • The directory /usr/lib(64)/ganglia/python_modules exists. This is the directory where Python modules should be placed as .py files.

These things should be automatically done for you if you installed Python modules support via binary packages. If that is not the case please file a bug at the distribution's corresponding bug tracker.
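
The checklist above can also be checked with a short script. The following is only a sketch: the function name check_python_support and its default paths are assumptions based on the locations named in this document, and it does not inspect the include line in gmond.conf.

```python
import os


def check_python_support(conf_dir="/etc/ganglia/conf.d",
                         lib_dirs=("/usr/lib/ganglia", "/usr/lib64/ganglia")):
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # modpython.conf is what pulls in the .pyconf files.
    if not os.path.isfile(os.path.join(conf_dir, "modpython.conf")):
        problems.append("missing modpython.conf in %s" % conf_dir)

    # modpython.so and python_modules/ may live under lib or lib64.
    if not any(os.path.isfile(os.path.join(d, "modpython.so")) for d in lib_dirs):
        problems.append("modpython.so not found in: %s" % ", ".join(lib_dirs))
    if not any(os.path.isdir(os.path.join(d, "python_modules")) for d in lib_dirs):
        problems.append("python_modules not found in: %s" % ", ".join(lib_dirs))

    return problems


if __name__ == "__main__":
    for problem in check_python_support():
        print(problem)
```

Running the script prints one line per failed check and nothing at all if the installation looks complete.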

Ubuntu 10.10 notes

Ubuntu 10.10 does not come with Python support for gmond fully set up. You will need to:

  • Create /etc/ganglia/conf.d/modpython.conf and make it look like https://sourceforge.net/apps/trac/ganglia/browser/trunk/monitor-core/gmond/modules/conf.d/modpython.conf.in

    • for instance:
    modules {
      module {
         name = "python_module"
         # "(64)" is a placeholder: use /usr/lib or /usr/lib64 to match your system
         path = "/usr/lib(64)/ganglia/modpython.so"
         params = "/usr/lib(64)/ganglia/python_modules"
      }
    }
    
    include('/etc/ganglia/conf.d/*.pyconf')
    
  • Create the directory /usr/lib(64)/ganglia/python_modules

  • Ensure that /usr/lib(64)/ganglia/modpython.so already exists (Ubuntu 10.10 gets this one right when you install ganglia via apt)

Writing custom Python modules

Writing a Python module is very simple. You just need to write it following a template and put the resulting Python module (.py) in /usr/lib(64)/ganglia/python_modules. A corresponding Python Configuration (.pyconf) file needs to reside in /etc/ganglia/conf.d/.

If your Python module needs to access certain files on the server, keep in mind that the module will be executed as the user which runs gmond. In other words, if gmond runs as user nobody then your module will also run as nobody. So make sure that the user which runs gmond has the correct permissions to access the files in question.
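
One way to verify such permissions is to run a small check as the gmond user. This is a minimal sketch; the helper name readable_by_current_user is hypothetical and not part of Ganglia.

```python
import os


def readable_by_current_user(path):
    """Return True if the user running this process can read the file at path."""
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # Example path taken from the temperature module shown later in this page.
    path = "/proc/acpi/thermal_zone/THRM/temperature"
    print("%s readable: %s" % (path, readable_by_current_user(path)))
```

The result is only meaningful when the script runs as the same user as gmond, for example by switching to that user before starting it.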

The Ganglia distribution comes with an example Python module in /usr/lib(64)/ganglia/python_modules/example.py. This file can also be viewed in our SVN repository: http://ganglia.svn.sourceforge.net/viewvc/ganglia/branches/monitor-core-3.1/gmond/python_modules/example/example.py?view=markup. There are many more modules you can look at for inspiration in the GitHub repo: https://github.com/ganglia/gmond_python_modules.

Example module

Let's look at a real-life example of a Python module that monitors the temperature of the host by reading a file in the /proc file system. We'll call it temp.py:

acpi_file = "/proc/acpi/thermal_zone/THRM/temperature"

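def temp_handler(name):
    '''Return the temperature read from acpi_file, or 0 if unavailable.'''
    try:
        f = open(acpi_file, 'r')
    except IOError:
        return 0

    # Keep the fields of the last line; the value is expected in field 2
    line = []
    try:
        for l in f:
            line = l.split()
    finally:
        f.close()

    try:
        return int(line[1])
    except (IndexError, ValueError):
        return 0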

def metric_init(params):
    global descriptors, acpi_file

    if 'acpi_file' in params:
        acpi_file = params['acpi_file']

    d1 = {'name': 'temp',
        'call_back': temp_handler,
        'time_max': 90,
        'value_type': 'uint',
        'units': 'C',
        'slope': 'both',
        'format': '%u',
        'description': 'Temperature of host',
        'groups': 'health'}

    descriptors = [d1]

    return descriptors

def metric_cleanup():
    '''Clean up the metric module.'''
    pass

# This code is for debugging and unit testing
if __name__ == '__main__':
    metric_init({})
    for d in descriptors:
        v = d['call_back'](d['name'])
        # Parenthesized form works as a statement or a function call
        print('value for %s is %u' % (d['name'], v))

Module requirements

There are three functions that must exist in every Python metric module. These functions are:

  • def metric_init(params):
  • def metric_cleanup():
  • def metric_handler(name):

While the first two functions above must exist explicitly (i.e. they must be named as specified above), the metric_handler() function can actually be named anything. The functions are explored in detail below.
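
As a quick orientation, the three functions can be sketched as a bare skeleton. The metric name, units, group and returned value below are placeholders chosen for illustration only.

```python
def metric_handler(name):
    """Return the current value for the metric called name."""
    return 42  # placeholder value for illustration


def metric_init(params):
    """Called once when gmond starts; returns the list of metric descriptors."""
    descriptor = {
        'name': 'skeleton_metric',      # hypothetical metric name
        'call_back': metric_handler,    # may be any function in this module
        'time_max': 90,
        'value_type': 'uint',
        'units': 'count',
        'slope': 'both',
        'format': '%u',
        'description': 'Placeholder metric',
        'groups': 'example',
    }
    return [descriptor]


def metric_cleanup():
    """Called once when gmond shuts down."""
    pass
```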

def metric_init(params):

This function must exist and be explicitly named 'metric_init' in your module. It will be called once at initialization time, that is, once when gmond starts up. It can be used to do any kind of initialization that the module requires in order to properly gather the intended metric.

metric_init() takes a single dictionary parameter which contains the configuration directives designated for this module in the gmond.conf file. In addition to any other initialization, the function must create, populate and return either a single metric description dictionary or a list of such dictionaries. Each description dictionary must contain the following elements:

  • name: name of the metric
  • call_back: The function in your module to call when collecting metric data
    • If your metric module supports multiple metrics, each being defined through their own metric descriptor, your module may actually implement more than one metric_handler function.
  • time_max: maximum time in seconds between metric collection calls
    • The exact nature of this element is unclear, as is its relationship to the 'collect_every' configuration directive in your pyconf for the module. For all intents and purposes, this element seems... useless.
  • value_type: string | uint | float | double
  • units: unit of your metric
  • slope: zero | positive | negative | both
    • This value maps to the data source types defined for RRDTool
    • If 'positive', the generated RRD file will be of COUNTER type (calculating the rate of change)
    • If 'negative', ????
    • 'both' will be of GAUGE type (no calculations are performed, graphing only the value reported)
    • If 'zero', the metric will appear in the "Time and String Metrics" or the "Constant Metrics" depending on the value_type of the metric
  • format: format string of your metric
  • description: description of your metric
    • Visible in web frontend if you hover over host metric graph
  • groups (optional): groups your metric belongs to
    • The group(s) in the web frontend with which this metric will be associated

These elements are basically the same type of data that must be supplied to the gmetric command-line utility, with the exception of the call_back function. See the gmetric help document for more information.
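
When debugging a module outside gmond, it can help to check descriptors against the element list above. The helper below is a sketch, not part of Ganglia: descriptor_problems is a hypothetical name, and the accepted value_type and slope values are simply the ones listed in this document (gmond itself may accept others).

```python
REQUIRED_KEYS = ('name', 'call_back', 'time_max', 'value_type',
                 'units', 'slope', 'format', 'description')
VALUE_TYPES = ('string', 'uint', 'float', 'double')
SLOPES = ('zero', 'positive', 'negative', 'both')


def descriptor_problems(descriptor):
    """Return a list of problems found in one metric descriptor."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in descriptor:
            problems.append('missing element: %s' % key)

    if 'call_back' in descriptor and not callable(descriptor['call_back']):
        problems.append('call_back is not callable')
    if 'value_type' in descriptor and descriptor['value_type'] not in VALUE_TYPES:
        problems.append('unknown value_type: %s' % descriptor['value_type'])
    if 'slope' in descriptor and descriptor['slope'] not in SLOPES:
        problems.append('unknown slope: %s' % descriptor['slope'])
    return problems
```

Calling descriptor_problems() on each descriptor returned by metric_init() in the module's __main__ block catches typos before the module is deployed.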

The metric descriptor can also include additional attributes and values, which will be attached to the metric metadata as extra data. The extra data will be ignored by Ganglia itself but can be used by the web front end as additional display or metric handling data. (The SPOOF_HOST and SPOOF_NAME extra attributes are examples that will be described in a later version.)

def metric_cleanup():

This function must exist and be explicitly named 'metric_cleanup' in your module. It will be called only once, when gmond is shutting down. Any module clean-up code can be executed here, and the function must not return a value.
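
As an illustration of where clean-up code fits, the sketch below keeps a file open between collections and closes it in metric_cleanup(). The 'source_file' param, the 'first_field' metric name and the handler name are assumptions made up for this example.

```python
_source = None  # file object opened in metric_init, closed in metric_cleanup


def first_field_handler(name):
    """Return the first whitespace-separated field of the file as a float."""
    if _source is None:
        return 0.0
    _source.seek(0)  # re-read from the start on every collection
    try:
        return float(_source.read().split()[0])
    except (IndexError, ValueError):
        return 0.0


def metric_init(params):
    global _source
    # 'source_file' is a hypothetical param; /proc/loadavg is only a default.
    path = params.get('source_file', '/proc/loadavg')
    try:
        _source = open(path, 'r')
    except IOError:
        _source = None
    return [{'name': 'first_field',
             'call_back': first_field_handler,
             'time_max': 90,
             'value_type': 'float',
             'units': '',
             'slope': 'both',
             'format': '%f',
             'description': 'First field of the source file'}]


def metric_cleanup():
    """Release the file handle; nothing is returned."""
    global _source
    if _source is not None:
        _source.close()
        _source = None
```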

def metric_handler(name):

The 'metric_handler' function can actually be called anything you want, as long as it matches the name of the function you defined in the corresponding 'call_back' element in your metric descriptor. It takes one parameter, 'name', which is the value defined in the 'name' element in your metric descriptor.
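
Because the handler receives the metric name, a single function can serve several descriptors. The sketch below assumes three hypothetical metric names (py_load_1, py_load_5, py_load_15) and uses os.getloadavg() from the Python standard library as the data source.

```python
import os

# Map each metric name to its position in the os.getloadavg() tuple.
_LOAD_INDEX = {'py_load_1': 0, 'py_load_5': 1, 'py_load_15': 2}


def load_handler(name):
    """Shared callback: the metric name selects which value to return."""
    try:
        return os.getloadavg()[_LOAD_INDEX[name]]
    except (OSError, KeyError):
        return 0.0


def metric_init(params):
    descriptors = []
    for metric_name in sorted(_LOAD_INDEX):
        descriptors.append({'name': metric_name,
                            'call_back': load_handler,
                            'time_max': 90,
                            'value_type': 'float',
                            'units': '',
                            'slope': 'both',
                            'format': '%.2f',
                            'description': 'Load average (%s)' % metric_name,
                            'groups': 'load'})
    return descriptors


def metric_cleanup():
    pass
```

Each descriptor still needs its own metric entry in the pyconf, using the same name as in the descriptor.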

pyconf

The corresponding config file for the module, temp.pyconf, lives in /etc/ganglia/conf.d/temp.pyconf and looks like this:

modules {
  module {
    name = "temp"
    language = "python"
    # The following params are examples only
    #  They are not actually used by the temp module
    param RandomMax {
      value = 600
    }
    param ConstantValue {
      value = 112
    }
  }
}

collection_group {
  collect_every = 10
  time_threshold = 50
  metric {
    name = "temp"
    title = "Temperature"
    value_threshold = 70
  }
}

The above configuration file contains two major sections with various sub-sections: modules and collection_group.

Modules

The modules section contains configuration data that is specific to each module being loaded. It may contain either a single module sub-section or multiple sub-sections. Each module sub-section contains the name of the metric module, the language in which the module was written and zero or more module-specific param sub-sections.

name

The name of the module corresponds to the filename of the module you created (without the ".py").

language

Unless you've written your module in C/C++, you MUST explicitly declare the language of your module in the pyconf. Declaring 'python' as your language instructs gmond to look in the python_modules directory for your module.

param

Each param sub-section has a name and a value. These name/value pairs are passed to your module's metric_init() function as a dictionary, as described above, where the name of the parameter is the key and its value is the corresponding dictionary value. You can therefore access your custom params with something like this:

RandomMax = 500

def metric_init(params):
    global RandomMax

    # Param values may arrive as strings, so convert explicitly
    if 'RandomMax' in params:
        RandomMax = int(params['RandomMax'])

    ...

Collection_group

The rest of the configuration file follows the same format as for any other collection_group or metric. Looking at the man page for gmond.conf is particularly instructive, but we'll go over the example collection_group directives here:

collect_every or collect_once

collect_every tells gmond the frequency (in seconds) with which to collect data from the metrics defined in this collection_group. In the example, the 'temp' metric will be collected every 10 seconds.

You can also instruct gmond to collect 'static metrics', which should be collected only once (at gmond startup), with collect_once = yes. This is useful for things that shouldn't change on the server between reboots (e.g. the number of CPUs).

time_threshold

The maximum interval (in seconds) between reports of metric data to Ganglia. In the example, the temp metric will be reported to Ganglia at least every 50 seconds.

This directive is superseded when the value of a collected metric changes by more than the metric's defined 'value_threshold' since the previous collection (see below).

metric

This is where you define metric-specific settings:

  • name: The name of a specific metric, as defined in the descriptor dictionary in your module
  • title: An optional human-readable title for your metric that will be displayed in the Ganglia front end
  • value_threshold: If the difference between the newly collected value and the previous value is greater than the value defined here (in the units defined in your metric descriptor), the metric will be reported to Ganglia regardless of the 'time_threshold' defined for the collection_group
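
The interplay between time_threshold and value_threshold can be pictured with a small model. This is a simplified sketch and not gmond's actual code: the function name and arguments are invented for illustration, and it assumes, following the gmond.conf man page, that value_threshold is compared with the change in value rather than with the value itself.

```python
def should_send(now, last_sent_at, last_value, new_value,
                time_threshold, value_threshold):
    """Simplified model of the decision to report a collected value."""
    # Report once the time threshold has elapsed since the last report.
    if now - last_sent_at >= time_threshold:
        return True
    # Report early if the value changed by more than value_threshold.
    return abs(new_value - last_value) > value_threshold
```

With the example configuration (time_threshold = 50, value_threshold = 70), a temperature that moves from 40 to 41 is held back until 50 seconds have passed since the last report.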

Further reading

Additional information about Python modules can be found in the README file: http://ganglia.svn.sourceforge.net/viewvc/ganglia/branches/monitor-core-3.1/gmond/modules/python/README.in?view=markup

Some helpful user-contributed resources: