Operations
This page discusses operations, which are defined actions you can invoke on any managed resource that supports them.
Overview
Operations are all about executing some feature or function against resources in inventory. Some of these operations could change the operational state of a resource (such as start, stop, or restart commands), while others can be purely informational (obtain some report data, or get a list of deployments).
Concepts
Supported Operations
Operations are defined by the plugin. More specifically, an operation is defined primarily by the type of resource you're talking about, but it may be limited to specific versions of that type.
So, if you have two JBoss AS instances in inventory that are the exact same version (say, 4.0.5.GA), they will support the exact same operations. However, a JBoss AS 4.0.2 instance may have a slightly different list of supported operations than a JBoss AS 4.2.1 instance.
Arguments
Operational arguments actually piggyback on top of the same underlying constructs used to support arbitrarily complex resource configuration data. Consequently, arguments are strongly typed, can be required or optional, and inherit support for validation. Thus, regardless of how complicated the arguments to an operation might be, they can still be supported structurally as well as displayed in the UI.
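To make the idea of typed, required-or-optional arguments concrete, here is a minimal sketch in Java. The class names ArgumentDefinition and ArgumentValidator are hypothetical and are not part of the RHQ plugin API; the sketch only illustrates how a definition can drive validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical names for illustration only; this is not the RHQ plugin API.
final class ArgumentDefinition {
    final String name;
    final Class<?> type;
    final boolean required;

    ArgumentDefinition(String name, Class<?> type, boolean required) {
        this.name = name;
        this.type = type;
        this.required = required;
    }
}

final class ArgumentValidator {
    /** Returns one message per problem; an empty list means the arguments are valid. */
    static List<String> validate(List<ArgumentDefinition> definitions, Map<String, Object> arguments) {
        List<String> errors = new ArrayList<>();
        for (ArgumentDefinition def : definitions) {
            Object value = arguments.get(def.name);
            if (value == null) {
                // Optional arguments may simply be absent.
                if (def.required) {
                    errors.add("missing required argument: " + def.name);
                }
            } else if (!def.type.isInstance(value)) {
                errors.add("argument " + def.name + " must be of type " + def.type.getSimpleName());
            }
        }
        return errors;
    }
}
```

Because the same definitions describe both the expected structure and the validation rules, a UI could render an input form from them without knowing anything about the specific operation.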
Results
Operational results also piggyback on the same underlying constructs used to support resource configuration data. Thus, regardless of how complicated the results from an operation are, they can still be supported structurally as well as displayed in the UI.
As with arguments, results are strongly typed and inherit support for validation. Result validation gives a plugin a more sophisticated way to determine whether an operation was successful.
Scheduling
An operation can be executed immediately, or it can be deferred for future execution. A deferred execution can be invoked once, or it can be invoked on a repeating interval. An interval can be unbounded (i.e., the operation will be repeatedly invoked on the specified interval forever), or it can have a termination date.
Regardless of which options are chosen, all operations necessarily go through the scheduler - even ones marked to execute immediately. The scheduler has the responsibility for creating new operation schedules, managing the repeat intervals of each, knowing when to fire each scheduled operation, and unscheduling operations as appropriate.
However, the scheduler doesn't have any responsibility for actually executing the operations themselves. For that job, it communicates with the operation manager.
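The scheduling options described in this section (immediate, deferred once, or repeating with an optional termination date) can be modeled as a single schedule type. The following Java sketch is an illustration under assumed names (OperationSchedule, nextFireTime); it does not reflect how the RHQ scheduler is implemented.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical schedule model for illustration; not RHQ's scheduler API.
final class OperationSchedule {
    private final Instant start;      // first fire time ("now" for immediate execution)
    private final Duration interval;  // null means the operation fires only once
    private final Instant end;        // null means the schedule never terminates

    private OperationSchedule(Instant start, Duration interval, Instant end) {
        this.start = start;
        this.interval = interval;
        this.end = end;
    }

    static OperationSchedule immediately(Instant now) {
        return new OperationSchedule(now, null, null);
    }

    static OperationSchedule once(Instant at) {
        return new OperationSchedule(at, null, null);
    }

    static OperationSchedule repeating(Instant start, Duration interval, Instant endOrNull) {
        if (interval.toMillis() <= 0) {
            throw new IllegalArgumentException("interval must be at least one millisecond");
        }
        return new OperationSchedule(start, interval, endOrNull);
    }

    /** Earliest fire time at or after 'now', or empty if nothing is left to fire. */
    Optional<Instant> nextFireTime(Instant now) {
        if (!now.isAfter(start)) {
            return bounded(start);
        }
        if (interval == null) {
            return Optional.empty(); // a one-time schedule whose fire time has passed
        }
        long step = interval.toMillis();
        long elapsed = Duration.between(start, now).toMillis();
        long periods = (elapsed + step - 1) / step; // round up to the next interval boundary
        return bounded(start.plusMillis(periods * step));
    }

    // Applies the optional termination date to a candidate fire time.
    private Optional<Instant> bounded(Instant candidate) {
        return (end != null && candidate.isAfter(end)) ? Optional.empty() : Optional.of(candidate);
    }
}
```

In this model an immediate execution is just a one-time schedule whose start is the current time, which mirrors the point that even immediate operations pass through the scheduler.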
Operation State Management
State management of an operation is separate from its scheduling. When the operation manager gets the hand-off from the scheduler, it will create a new operation history record and set its state to INPROGRESS. An asynchronous message is sent down to the agent telling it to invoke a specific operation on a particular resource with the arguments that were specified when the schedule was created. On a per resource basis, the agent queues up these operations; this prevents more than one operation from being executed on any single resource at the same time.
The user interface will show this history item as a pending operation, at which point the user has the opportunity to cancel it. If it is cancelled, another message is sent to the agent. If the message gets there before the agent begins the operation, the agent will remove it from the queue, and tell the server that its state can be set to CANCELLED.
On the other hand, once the agent has started the operation on the managed resource, it can no longer be cancelled. In other words, even if the user attempts to cancel this operation in the UI, the cancel request will effectively be ignored once the operation has begun. At first glance, this may seem like a rather restrictive semantic; however, it was deemed a necessary evil because it would be next to impossible for a plugin to know precisely what intermediate state the managed resource was in when the operation was forcibly cancelled (assuming it was even possible to forcibly cancel it). So, instead of coming up with a potentially very complex mechanism for determining what semi-completed state a forcibly cancelled resource was in, no operation can be cancelled once it has been started.
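The queueing and cancellation rules above can be sketched as a small piece of agent-side bookkeeping. This is a simplified illustration with hypothetical names (ResourceOperationQueues, startNext, finished), not the agent's actual code: each resource has its own queue, at most one operation per resource runs at a time, and a cancel request succeeds only while the operation is still queued.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical agent-side bookkeeping; names are illustrative, not RHQ classes.
final class ResourceOperationQueues {
    private final Map<Integer, Deque<String>> pending = new HashMap<>(); // resource id -> queued operation ids
    private final Set<Integer> busy = new HashSet<>();                   // resources with a running operation

    synchronized void enqueue(int resourceId, String operationId) {
        pending.computeIfAbsent(resourceId, id -> new ArrayDeque<>()).addLast(operationId);
    }

    /** Returns the next operation to run, or null if the resource is busy or has nothing queued. */
    synchronized String startNext(int resourceId) {
        Deque<String> queue = pending.get(resourceId);
        if (busy.contains(resourceId) || queue == null || queue.isEmpty()) {
            return null;
        }
        busy.add(resourceId);
        return queue.pollFirst();
    }

    /** Marks the running operation on this resource as complete so the next one may start. */
    synchronized void finished(int resourceId) {
        busy.remove(resourceId);
    }

    /** True only if the operation was still queued; a started operation cannot be cancelled. */
    synchronized boolean cancel(int resourceId, String operationId) {
        Deque<String> queue = pending.get(resourceId);
        return queue != null && queue.remove(operationId);
    }
}
```

A false return from cancel corresponds to the case where the cancel request is effectively ignored because the operation has already begun.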
Eventually, the operation will complete on the managed resource and pass its raw value back to the plugin communicating with it. This plugin must analyze the raw output from the managed resource and modify, transform, or craft an entirely different response message depending on the raw result. The raw result is never sent directly to the server; it must be wrapped in an object that matches the expected result type as defined by the plugin for that operation.
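As a rough illustration of this wrapping step, the following Java sketch turns a raw reply into a typed result object. The comma-separated reply format and the names DeploymentListResult and RawResultMapper are assumptions made up for this example; a real plugin would map whatever its managed resource returns onto the result type it declared for the operation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical typed result for a "list deployments" style operation.
final class DeploymentListResult {
    final List<String> deployments;

    DeploymentListResult(List<String> deployments) {
        this.deployments = Collections.unmodifiableList(deployments);
    }
}

final class RawResultMapper {
    /**
     * Converts an assumed comma-separated raw reply into the typed result.
     * Throws if the reply is missing or blank, so the caller can report a failure.
     */
    static DeploymentListResult fromRaw(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("empty reply from managed resource");
        }
        List<String> names = Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toList());
        return new DeploymentListResult(names);
    }
}
```

Only the typed object leaves the plugin; the raw string stays local to it.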
Regardless of the details of the response, a non-cancelled operation can only ever wind up in two states: SUCCESS or FAILURE. Operations that fail may be accompanied by a detailed error message explaining the failure.
On occasion, the server may have trouble communicating with the agent that is managing the target resource. In this case, the operation will immediately be put into the FAILURE state, and the accompanying error message will explain the details of the connectivity issues.
Timeouts
Occasionally, an operation will "hang" on a managed resource. This would normally spell trouble for the plugin because it cannot start the next operation on the resource until the current operation completes.
Luckily, each plugin supports a timeout for invocations that don't complete in a timely fashion. Once the timeout period expires, the operation is forcibly halted, the operation is marked as FAILURE, and the error message will contain details about the timeout.
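One common way to enforce such a timeout in Java is to run the invocation on a worker thread and wait on its Future with a time limit. The sketch below assumes this approach; the names TimedInvoker and Outcome are hypothetical, and the source does not say how the timeout is actually implemented.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not RHQ's implementation.
final class TimedInvoker {
    /** Outcome of one invocation: status is "SUCCESS" or "FAILURE". */
    static final class Outcome {
        final String status;
        final String message;

        Outcome(String status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    static Outcome invoke(Callable<String> operation, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(operation);
        try {
            return new Outcome("SUCCESS", future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Interrupts the worker; this only stops code that responds to interruption.
            future.cancel(true);
            return new Outcome("FAILURE", "operation timed out after " + timeoutMillis + " ms");
        } catch (ExecutionException e) {
            return new Outcome("FAILURE", String.valueOf(e.getCause()));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return new Outcome("FAILURE", "interrupted while waiting for the operation");
        } finally {
            worker.shutdownNow();
        }
    }
}
```

Note that interrupting a thread is cooperative in Java: an invocation that ignores interruption keeps running in the background even though it has been reported as a FAILURE.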
However, this is exactly what we were trying to avoid by not supporting forced cancellations of operations. Halting an operation abruptly could leave the resource in an inconsistent state. When an invocation is hung, though, there isn't much that can be done other than forcibly terminating the execution. This scenario is expected to be rare.
Unfortunately, this scenario carries a further risk when other operations are waiting in the queue. As soon as the halted operation is removed from the queue, the next operation proceeds as if nothing bad has happened. If the forced halt was "safe", there should be no problem; but if the resource is in an inconsistent state, it might cause a cascading failure of all future operations executed against this resource (perhaps until the resource is restarted). However, an administrator who is using RHQ assiduously should be alerted immediately that there have been consecutive operation execution failures, and that manual intervention is probably needed to get to the bottom of things.
History
The outcomes of all operations executed on a resource are recorded. The operation history is a way to audit the results of these executions. When the scheduler deems that an operation needs to be executed against some managed resource, it creates an operation history element and sets its state to INPROGRESS. Items in this state are put in the pending operations table.
All other states of an operation - CANCELLED, SUCCESS, and FAILURE - are termination states. Operations in any one of these states are shown in the completed operations table.
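The split between pending and completed history items follows directly from the four states. A minimal Java sketch, using the state names from this page but otherwise hypothetical class names (HistoryRecord, HistoryTables), might look like this:

```java
import java.util.List;
import java.util.stream.Collectors;

// State names come from this page; everything else is illustrative.
enum OperationStatus {
    INPROGRESS, CANCELLED, SUCCESS, FAILURE;

    /** Every state other than INPROGRESS is a termination state. */
    boolean isTerminal() {
        return this != INPROGRESS;
    }
}

final class HistoryRecord {
    final String operationName;
    final OperationStatus status;

    HistoryRecord(String operationName, OperationStatus status) {
        this.operationName = operationName;
        this.status = status;
    }
}

final class HistoryTables {
    /** Records that belong in the pending operations table. */
    static List<HistoryRecord> pending(List<HistoryRecord> all) {
        return all.stream().filter(r -> !r.status.isTerminal()).collect(Collectors.toList());
    }

    /** Records that belong in the completed operations table. */
    static List<HistoryRecord> completed(List<HistoryRecord> all) {
        return all.stream().filter(r -> r.status.isTerminal()).collect(Collectors.toList());
    }
}
```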
Completed operation data is never automatically purged, but an RHQ user can delete specific elements that may be out-of-date or no longer relevant.
User Interface
The operations tab has three subtabs, which are described below.

New
This tab presents you with a list of possible operations for the selected resource.

When you choose one by clicking on it, it will be marked with a little asterisk and an option panel will appear below:

You need to fill all required fields and decide if the operation should be scheduled for immediate (one time) execution.
If you want to schedule the execution to happen at a different time or to recur, you need to select the textbox. RHQ will present you with additional fields about the recurrence of the operation.

The "Other options" section lets you set a timeout and some notes.
Clicking "Schedule" schedules the operation.
Depending on when you scheduled the operation, you will be transferred to the "Scheduled" or "History" subtab.
Scheduled
This tab lists all operations scheduled for the future, along with the notes you entered when creating the schedule and the owner of each scheduled operation.

Clicking on the operation name shows the schedule information, such as the recurrence and the operation parameters.
You can also cancel a scheduled operation by selecting the operation in the list and clicking on "unschedule" at the bottom.
History
The history view consists of two parts:
- operations in progress: an operation that is running is shown here
- completed operations: an operation that is completed is shown here

Both panels show the name of the operation, when it was submitted and completed, who submitted it, and its status. In case of failure, you can click on the failure link to see why the operation failed (e.g. missing permissions).