This project is a Zenoss extension (ZenPack) that allows for monitoring of OpenStack. This means that you can monitor flavors, images and servers from a user or consumer perspective. OpenStack Compute v1.1 (Cactus) is known to be supported. Specifically, this means that Rackspace's CloudServers can be monitored. In the future it is likely that support for monitoring OpenStack Storage (Swift) will be added.

OpenStack is a global collaboration of developers and cloud computing technologists producing the ubiquitous open source cloud computing platform for public and private clouds. The project aims to deliver solutions for all types of clouds by being simple to implement, massively scalable, and feature rich. The technology consists of a series of interrelated projects delivering various components for a cloud infrastructure solution.

Once the OpenStack ZenPack is installed you can begin monitoring by going to the infrastructure screen and clicking the normal button for adding devices. You'll find a new option labeled "Add OpenStack." Choose that option and you'll be presented with a dialog asking for the following inputs.

1. Username - Same username used to log in to the OpenStack web interface
2. API Key - Can be found by going to "Your Account/API Access"
3. Project ID - This can be left blank if you don't know what it is
4. Auth URL - For Rackspace this would be https://auth.api.rackspacecloud.com/v1.0
5. Region Name - This can be left blank if you don't know what it is

Once you click Add, Zenoss will contact the OpenStack API and discover servers, images and flavors. Once it is complete you'll find a new device in the OpenStack device class with the same name as the hostname or IP you entered into the dialog. Click into this new device to see everything that was discovered.

The following types of elements are discovered.

* Servers
* Images
* Flavors

The following metrics are collected.
* Total Servers and Servers by State
  o States: Active, Build, Rebuild, Suspended, Queue Resize, Prep Resize, Resize, Verify Resize, Password, Rescue, Reboot, Hard Reboot, Delete IP, Unknown, Other
* Total Images and Images by State
  o States: Active, Saving, Preparing, Queued, Failed, Unknown, Other
* Total Flavors

Status monitoring is performed on servers and images with the following mapping of state to Zenoss event severity.

Servers State to Severity Mapping:

* Reboot, Hard Reboot, Build, Rebuild, Rescue, Unknown == Critical
* Resize == Error
* Prep Resize, Delete IP == Warning
* Suspended, Queue Resize, Verify Resize, Password == Info
* Active == Clear

Images State to Severity Mapping:

* Failed, Unknown == Critical
* Queued, Saving, Preparing == Info
* Active == Clear

If you are also using Zenoss to monitor the guest operating system running within the server, Zenoss will present the graphs for that operating system when the graphs option is chosen for the OpenStack server.
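The state-to-severity mapping for OpenStack servers and images can be written down as a pair of lookup tables. The following is only a sketch of that mapping, not the ZenPack's own code: the names SERVER_SEVERITY, IMAGE_SEVERITY and severity_for are hypothetical, and the integers are the usual Zenoss severity levels (Critical 5 down to Clear 0). The "Other" state has no mapping in the text above, so the sketch returns None for it.

```python
# Zenoss event severities as integers (Critical=5, Error=4, Warning=3, Info=2, Clear=0).
CRITICAL, ERROR, WARNING, INFO, CLEAR = 5, 4, 3, 2, 0

# Hypothetical lookup tables transcribed from the mapping lists above.
SERVER_SEVERITY = {
    "Reboot": CRITICAL, "Hard Reboot": CRITICAL, "Build": CRITICAL,
    "Rebuild": CRITICAL, "Rescue": CRITICAL, "Unknown": CRITICAL,
    "Resize": ERROR,
    "Prep Resize": WARNING, "Delete IP": WARNING,
    "Suspended": INFO, "Queue Resize": INFO,
    "Verify Resize": INFO, "Password": INFO,
    "Active": CLEAR,
}

IMAGE_SEVERITY = {
    "Failed": CRITICAL, "Unknown": CRITICAL,
    "Queued": INFO, "Saving": INFO, "Preparing": INFO,
    "Active": CLEAR,
}


def severity_for(kind, state):
    """Return the severity for a state, or None when the text defines no mapping."""
    table = SERVER_SEVERITY if kind == "server" else IMAGE_SEVERITY
    return table.get(state)
```

For example, severity_for("server", "Resize") gives the Error level, while severity_for("image", "Other") gives None because the lists above do not assign it a severity.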
All monitoring is performed through the optional swift-recon API endpoint that can be enabled on all of your Swift object servers. Before using this ZenPack you must install and configure swift-recon on your Swift object servers.

Usage

Installing the ZenPack will add the following objects to your Zenoss system.

* Configuration Properties
  o zSwiftObjectServerPort: Listening port of swift-object-server. Defaults to 6000.
* Monitoring Templates
  o SwiftObjectServer in /Devices
* Process Classes
  o OpenStack/Swift
  o swift-account-auditor
  o swift-account-reaper
  o swift-account-replicator
  o swift-account-server
  o swift-container-auditor
  o swift-container-replicator
  o swift-container-server
  o swift-container-sync
  o swift-container-updater
  o swift-object-auditor
  o swift-object-replicator
  o swift-object-server
  o swift-object-updater
  o swift-proxy-server
* Event Classes
  o /Status/Swift
  o /Perf/Swift

The zSwiftObjectServerPort property is used by the SwiftObjectServer monitoring template to control which port it uses to reach the recon API. Normally the default of 6000/tcp will work unless you've chosen a different port for your swift-object-server process.

By default the SwiftObjectServer monitoring template will not be bound to any devices. To make use of it you will need to either bind it directly to your Swift object server devices, or put your object servers into their own device class and bind the template to that device class. Typically this will be under either /Server/Linux or /Server/SSH/Linux so you get normal operating system monitoring in addition to the Swift-specific monitoring.

Swift Metrics

Assuming you have swift-recon and Zenoss set up properly, you can expect to see the following extra graphs on your Swift object servers.

Swift Object Server - Async Pending

Trend of asynchronous pending tasks. When a Swift proxy server updates an object it attempts to synchronously update the object's container with the new object information.
There is a three-second timeout on this task and if it can't be completed in that time, it will be put into an asynchronous pending bucket to be executed later. By trending and thresholding on how many tasks are pending you can get an early read on cluster performance problems. By default a maximum threshold of 10 is set on this metric and will raise a warning severity event in the /Perf/Swift event class when it is breached.

Swift Object Server - Disks

Trend of total and unmounted disks on the storage node. Swift's mechanism for detecting failing or failed drives and taking them offline is to unmount them. By proactively monitoring for unmounted disks and replacing them you can keep your cluster healthy. By default a maximum threshold of 0 is set on unmounted disks and will raise a warning severity event in the /Status/Swift event class.

Swift Object Server - Quarantine

Trend of accounts, containers and objects that have been quarantined. Swift has an auditor process that will find corrupt items and move them into a quarantine area so good objects will be replicated back into their place. Sudden increases in quarantined items can indicate hardware problems on storage nodes. Additionally, quarantine is not automatically pruned and can result in some storage nodes filling up their disk at a faster rate than others and running out of space. By default a maximum threshold of 100 is set individually on quarantined accounts, containers and objects. A warning event will be raised in the /Status/Swift event class if it is breached.

Swift Object Server - Replication Time

Trend of replication time. Swift has a replicator process that cycles continually. If a single replication cycle takes more than 30 minutes it can reduce the resiliency of the cluster. By default a maximum threshold of 30 minutes is set on replication time and will raise a warning severity event in the /Perf/Swift event class when breached.
Swift Object Server - Load Averages

Trend of 1, 5 and 15 minute operating system load average. Additionally, the 15 minute load average divided by total disks is calculated. A perfectly efficient storage node will run at a load average of 1.0 per disk. By default a maximum threshold of 2.0 is set on 15 minute load average divided by total disks and will raise a warning severity event in the /Perf/Swift event class when breached.

Swift Object Server - Process Churn

Trend of processes created per second. High process churn can indicate a broken process being unnecessarily restarted. By default a maximum threshold of 100 processes per second is set and will raise a warning severity event in the /Perf/Swift event class when breached.

Swift Object Server - Disk Usages

Trend of maximum, average and minimum disk usage for all disks in the storage node. These are the primary storage capacity metrics within a cluster. Depending on the size of each individual disk, weights and the skew of stored object sizes, an entire cluster can exceed capacity if a single disk runs out of capacity. By default a maximum threshold is set on the maximum usage metric. It will raise a warning severity event in the /Status/Swift event class when breached.

Swift Object Server - Disk Sizes

Trend of maximum, average and minimum disk sizes for all disks in the storage node. Ideally all disks in a storage node will be the same size unless weights are closely managed. No default thresholds are set on these metrics.

Swift Object Server - Processes

Trend of total and running processes. No default thresholds are set on these metrics.

Process Monitoring

All Swift processes will be discovered and monitored based on the process classes listed above. If one of the processes is found to not be running on a node where it should be, an error severity event will be raised in the /Status/OSProcess event class. Each of the individual Swift processes will also be monitored for its CPU and memory utilization.
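The default Swift thresholds described above can be collected into one table and checked against a set of sampled values. This is a minimal sketch under stated assumptions, not the monitoring template itself: the metric names, DEFAULT_MAX and breached are hypothetical, a maximum threshold is assumed to be breached only when the value is strictly greater than the limit, and the disk usage threshold is left out because its default value is not given in the text.

```python
# Hypothetical metric names; the limits are the defaults stated in the text above.
DEFAULT_MAX = {
    "async_pending": 10,
    "unmounted_disks": 0,
    "quarantined_accounts": 100,
    "quarantined_containers": 100,
    "quarantined_objects": 100,
    "replication_minutes": 30,
    "load15_per_disk": 2.0,
    "process_churn_per_sec": 100,
}


def breached(metrics, limits=DEFAULT_MAX):
    """Return the sorted names of metrics whose value exceeds its maximum."""
    values = dict(metrics)
    disks = values.get("total_disks")
    # Derived metric: 15 minute load average divided by total disks.
    if disks and "load15" in values:
        values["load15_per_disk"] = values["load15"] / float(disks)
    return sorted(
        name for name, limit in limits.items()
        if name in values and values[name] > limit
    )
```

A node with 12 disks and a 15 minute load average of 30.0 works out to 2.5 per disk, which is above the 2.0 default, so load15_per_disk would be reported as breached.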
Once the PostgreSQL ZenPack is installed you will have the following new zProperties which should be set either for device classes or individual devices.

* zPostgreSQLPort - Port where PostgreSQL is listening. Default: 5432
* zPostgreSQLUseSSL - Whether to use SSL or not. Default: False
* zPostgreSQLUsername - Must be a superuser. Default: postgres
* zPostgreSQLPassword - Password for user. No default.

In addition to setting these properties you must add the zenoss.PostgreSQL modeler plugin to a device class or individual device. This modeler plugin will discover all databases and tables using the connectivity information provided through the above settings. Each database and table will automatically be monitored.

The following elements are discovered:

* Databases
* Tables

The following performance metrics are collected.

* Server
  o Summaries of all databases and tables.
* Databases
  o Latencies: Connection, SELECT 1
  o Connections: Total, Active, Idle
  o Durations: Active Transactions (min/avg/max), Idle Transactions (min/avg/max), Queries (min/avg/max)
  o Size
  o Backends
  o Efficiencies: Transaction Rollback Percentage, Tuple Fetch Percentage
  o Transaction Rates: Commits/sec, Rollbacks/sec
  o Tuple Rates: Returned/sec, Fetched/sec, Inserted/sec, Updated/sec, Deleted/sec
  o Locks: Total, Granted, Waiting, Exclusive, AccessExclusive
  o Summaries of all tables.
* Tables
  o Scan Rates: Sequential/sec, Indexed/sec
  o Tuple Rates: Sequentially Read/sec, Index Fetched/sec, Inserted/sec, Updated/sec, Hot Updated/sec, Deleted/sec
  o Tuples: Live, Dead
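As an illustration of the Transaction Rollback Percentage metric listed above, the sketch below derives a percentage from commit and rollback counts observed over one collection interval. The formula is an assumption about how such an efficiency figure is commonly defined, not something stated by the ZenPack, and the function name is hypothetical. The inputs would correspond to changes in the xact_commit and xact_rollback counters of PostgreSQL's pg_stat_database view.

```python
def rollback_percentage(commits, rollbacks):
    """Percentage of finished transactions that were rolled back.

    Assumed definition: rollbacks / (commits + rollbacks) * 100.
    Returns 0.0 when no transactions finished in the interval.
    """
    total = commits + rollbacks
    if total == 0:
        return 0.0
    return 100.0 * rollbacks / total
```

With 95 commits and 5 rollbacks in an interval the result is 5.0 percent; an idle database with no finished transactions reports 0.0 rather than dividing by zero.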
This ZenPack provides a new Python data source type. It also provides a new zenpython collector daemon that is responsible for collecting these data sources.

The goal of the Python data source type is to replicate some of the standard COMMAND data source type's functionality without requiring a new shell and shell subprocess to be spawned each time the data source is collected. The COMMAND data source type is infinitely flexible, but because of the shell and subprocess spawning, its performance and ability to pass data into the collection script are limited. The Python data source type circumvents the need to spawn subprocesses by forcing the collection code to be asynchronous using the Twisted library. It circumvents the problem with passing data into the collection logic by being able to pass any basic Python data type without the need to worry about shell escaping issues.

The Python data source type is intended to be used in one of two ways. The first way is directly through the creation of Python data sources through the web interface or in a ZenPack. When used in this way, it is the responsibility of the data source creator to implement the required Python class specified in the data source's Python Class Name property field. The second way the Python data source can be used is as a base class for another data source type. Used in this way, the ZenPack author will create a subclass of PythonDataSource to provide higher-level functionality to the user. The user is then not responsible for writing a Python class to collect and process data.
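The data-passing difference between the two data source types can be shown with the standard library alone. This sketch does not use the ZenPack's API or Twisted; build_command and collect_in_process are hypothetical names. It contrasts a COMMAND-style call, where structured parameters must be serialized and shell-quoted, with an in-process call that receives the Python dictionary directly.

```python
import json
import shlex


def build_command(script, params):
    """COMMAND style: parameters must be serialized and quoted for the shell."""
    return "%s %s" % (shlex.quote(script), shlex.quote(json.dumps(params)))


def collect_in_process(params):
    """In-process style: the dictionary arrives as-is, with no quoting step."""
    return {"host": params["host"], "port_count": len(params["ports"])}
```

A value such as "it's $HOME; rm -rf" survives the command-line round trip only because of the explicit quoting and JSON encoding; the in-process function never sees a shell at all.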
To start monitoring your RabbitMQ server you will need to set up SSH access so that your Zenoss collector server will be able to SSH into your RabbitMQ server(s) as a user who has permission to run the rabbitmqctl command. This almost always means the root user. See the Using a Non-Root User section below for instructions on allowing non-root users to run rabbitmqctl. To do this you need to set the following zProperties for the RabbitMQ devices or their device class in Zenoss.

* zCommandUsername
* zCommandPassword
* zKeyPath

The zCommandUsername property must be set. To use public key authentication you must verify that the public portion of the key referenced in zKeyPath is installed in the ~/.ssh/authorized_keys file for the appropriate user on the RabbitMQ server. If this key has a passphrase you should set it in the zCommandPassword property. If you'd rather use password authentication than set up keys, simply put the user's password in the zCommandPassword property.

You should then add the zenoss.ssh.RabbitMQ modeler plugin to the device, or device class containing your RabbitMQ servers, and remodel the device(s). This will automatically find the node, vhosts, exchanges and queues and begin monitoring them immediately for the following metrics.

* Node Values
  o Status - Running or not? Generates event on failure.
  o Open Connections & Channels
  o Sent & Received Bytes Rate
  o Sent & Received Messages Rate
  o Depth of Send Queue
  o Consumers
  o Unacknowledged & Uncommitted Messages
* Queue Values
  o Ready, Unacknowledged & Total Messages
  o Memory Usage
  o Consumers

There is a default threshold of 1,000,000 messages per queue. This is almost certainly an absurdly high threshold that shouldn't trip in normal systems. However, by clicking into the details of any individual queue you can set the per-queue threshold to a value that makes sense for that queue.
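The per-queue message threshold can be sketched as a check over parsed rabbitmqctl output. The text does not say which rabbitmqctl command the ZenPack runs, so this example assumes tab-separated output in the shape produced by `rabbitmqctl list_queues name messages_ready messages_unacknowledged messages`; the function names and the sample data are hypothetical.

```python
DEFAULT_MAX_MESSAGES = 1000000  # default per-queue threshold from the text


def parse_list_queues(output):
    """Parse tab-separated name/ready/unacknowledged/total rows into a dict."""
    queues = {}
    for line in output.splitlines():
        fields = line.split("\t")
        # Skip banner, header and footer lines that are not four-column data rows.
        if len(fields) != 4 or not all(f.isdigit() for f in fields[1:]):
            continue
        name, ready, unacked, total = fields
        queues[name] = {
            "ready": int(ready),
            "unacknowledged": int(unacked),
            "total": int(total),
        }
    return queues


def over_threshold(queues, limits=None, default=DEFAULT_MAX_MESSAGES):
    """Return sorted queue names whose total messages exceed their limit."""
    limits = limits or {}
    return sorted(
        name for name, q in queues.items()
        if q["total"] > limits.get(name, default)
    )
```

Passing a limits dictionary such as {"orders": 5} plays the role of the per-queue override described above; queues without an entry fall back to the 1,000,000 default.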
This package provides the capability to monitor Windows systems. The following daemons perform Windows monitoring tasks:

* zenwin - monitors Windows service processes
* zeneventlog - imports events from the Windows Event Log into Zenoss
* zenwinperf - collects performance information from Windows machines
Xen Monitor
---------------------------

XenMonitor is a ZenPack that allows you to monitor virtually hosted operating systems running on a Xen hypervisor. It depends on the prior installation of the ZenossVirtualHostMonitor ZenPack.

This ZenPack:

1) Extends ZenModeler to find Guest OSs and add them to Virtual Hosts
2) Provides templates for collecting resources allocated to Guest OSs.

To Use XenMonitor:

1) Ensure that you have SSH keys to your Xen servers (as root).
2) Create your Xen servers using the /Servers/Virtual Hosts/Xen device class. If you have these servers modeled already, remove them and recreate them under the Xen device class. DO NOT MOVE THEM.
3) Select the Guest menu and ensure that the guest hosts were found when the devices were added.
ZenJMX is a full-featured JMX client that works "out of the box" with JMX agents that have their remote APIs enabled. It supports authenticated and unauthenticated connections, and it can retrieve single-value attributes, complex-value attributes, and the results of invoking an operation. Operations with parameters are also supported so long as the parameters are primitive types (Strings, booleans, numbers), as well as the object version of primitives (such as java.lang.Integer and java.lang.Float). Multi-value responses from operations (Maps and Lists) are supported, as are primitive responses from operations.

The JMX data source installed by ZenJMX allows you to define the connection, authentication, and retrieval information you want to use to retrieve performance information. The IP address is extracted from the parent device, but the port number of the JMX Agent is configurable in each data source. This allows you to operate multiple JMX Agents on a single device and retrieve performance information for each agent separately. This is commonly used on production servers that run multiple applications.

Authentication information is also associated with each JMX data source. This offers the most flexibility for site administrators because they can run some JMX agents in an open, unauthenticated fashion and others in a hardened and authenticated fashion. SSL-wrapped connections are supported by the underlying JMX Remote subsystem built into the JDK, but were not tested in the Zenoss labs. As a result, your success with SSL encrypted access to JMX Agents may vary.

The data source allows you to define the type of performance information you want to retrieve: single-value attribute, complex-value attribute, or operation invocation. To specify the type of retrieval, you must specify an attribute name (and one or more data points) or provide operation information.
Any numerical value returned by a JMX agent can be retrieved by Zenoss and graphed and checked against thresholds. Non-numerical values (Strings and complex types) cannot be retrieved and stored by Zenoss. When setting up data points, make sure you understand the semantics of the attribute name and choose the correct Zenoss data point type. Many JMX Agent implementations use inconsistent nomenclature when describing attributes. In some cases the term "Count" refers to an ever-increasing number (a "Counter" data point type). In other cases the term "Count" refers to a snapshot number (a "Gauge" data point type).
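The difference between the two data point types matters because they are stored differently. The sketch below is an illustration of the general idea, not ZenJMX code, and the function names are hypothetical: a Counter is turned into a per-second rate from two successive samples, while a Gauge is stored as the sampled value. As one example of the naming inconsistency, on the JVM's Threading MBean TotalStartedThreadCount only ever increases, whereas ThreadCount is a snapshot.

```python
def counter_rate(prev_value, prev_time, value, time):
    """Per-second rate for an ever-increasing counter.

    Returns None when no time has elapsed or the counter went backwards
    (for example after the monitored process restarted).
    """
    elapsed = time - prev_time
    if elapsed <= 0 or value < prev_value:
        return None
    return (value - prev_value) / float(elapsed)


def gauge_value(value):
    """A gauge is a snapshot, so the sampled value is stored unchanged."""
    return float(value)
```

Treating a counter as a gauge produces a graph that climbs forever, and treating a gauge as a counter produces a graph of meaningless deltas, which is why the attribute's semantics should be checked before choosing the type.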
Zenoss Virtual Host Monitor
---------------------------

ZenossVirtualHostMonitor is a ZenPack that allows you to monitor virtually hosted operating systems. This ZenPack uses the term Host for the virtual machine host running on the bare metal, and Guest for the operating systems running within the virtual hardware.

This ZenPack:

1) Extends Devices to support a relationship from Host to Guest.
2) Provides screens displaying resources allocated to Guest OSs.
3) Collects nothing on its own. It provides base functionality for other ZenPacks (XenMonitor, VMwareESXMonitor).