The Simple Times (tm) is an openly-available publication devoted to the promotion of the Simple Network Management Protocol. In each issue, The Simple Times presents technical articles and featured columns, along with a standards summary and a list of Internet resources. In addition, some issues contain summaries of recent publications and upcoming events.
The Simple Times is openly-available. You are free to copy, distribute, or cite its contents; however, any use must credit both the contributor and The Simple Times. (Note that any trademarks appearing herein are the property of their respective owners.) Further, this publication is distributed on an ``as is'' basis, without warranty. Neither the publisher nor any contributor shall have any liability to any person or entity with respect to any liability, loss, or damage caused or alleged to be caused, directly or indirectly, by the information contained in The Simple Times.
The Simple Times is available via both electronic mail and hard copy. For information on subscriptions, see the end of this issue.
The purpose of this article is to consider new concepts and capabilities in network management. These range from incremental enhancements of the current state of affairs to wild-eyed dreaming.
The approaches are these, in order of increasing departure from current practices:
Along with the MIBs themselves, we have learned the value of concise, machine-parseable MIB definitions. These form the fundamental vehicle by which a general purpose management station can learn about the devices under its control.
We must first recognize that there is a distinction between the ``network management'' of monitoring and capacity planning and the ``network management'' of troubleshooting. For convenience, we'll refer to the former simply as ``network management'' and the latter as ``troubleshooting''.
Nearly 100% of network management occurs when networks are not failing.
When today's networks break, it is usually due to either a hard connectivity failure or a routing failure. In either case, neither TCP nor UDP gets through.
Error bursts and congestion failures do occur, but these tend to be transient, and whether retransmission is performed by the TCP engine or by a network management station's SNMP retry logic, the packets do tend to get through eventually. It is interesting that with TCP's congestion avoidance algorithms, TCP based streams behave in a way more likely to alleviate the congestion than unregulated UDP streams.
Quality of service controls (such as RSVP) are coming to the Internet. We expect management traffic will get the ability to request priority. This will help ensure that as long as a pathway exists, there will always be a way to monitor and control the net no matter how congested it gets.
Troubleshooting is a distinct branch of network management and requires tools and techniques quite different from those used for continuous monitoring and control. In troubleshooting, SNMP is, at best, a tertiary level tool with value rather below that of ``ping'', ``traceroute'', ``nslookup'', and ``mtrace''.
Today's network devices often contain processors and memory exceeding those of our management platforms of a few years ago. These devices already perform numerous autonomous operations and have considerable protocol stacks in place.
Today's network devices are capable of managing themselves, if given the opportunity. (We must admit, however and unhappily, that there is a very large class of price sensitive devices in which every corner that could be cut was cut, including the time to read the relevant specifications or perform any interoperability testing.)
A meta-variable is simply a MIB variable which exists only in the MIB definition document. Each meta-variable is defined as a function of real MIB variables.
Meta-variables would be used by MIB designers to express useful derivations that can be made from the raw data. This could capture a significant body of empirical knowledge which today is rarely, if ever, recorded.
The function may be simple, such as the quotient produced when an error counting Gauge variable is divided by sysUpTime. In this case, the result would be an average error rate.
Or the function may be more complex, like something that takes the second derivative with respect to time of that error counting Gauge. This function would highlight significant changes in the error rates on an interface, which is a far more useful indicator of trouble than an average error rate.
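As a rough illustration of the two functions just described, the following Python sketch computes an average error rate and a finite-difference estimate of the second derivative. The names Sample, average_error_rate, and error_rate_change are hypothetical and appear in no MIB; the only outside fact relied on is that sysUpTime is expressed in hundredths of a second.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One poll of a device (hypothetical structure for this sketch)."""
    time_s: float   # time of the poll, in seconds
    errors: int     # raw value of the error-counting variable


def average_error_rate(errors: int, sys_uptime_ticks: int) -> float:
    """Errors per second since the agent started.

    sysUpTime is a TimeTicks value, i.e. hundredths of a second.
    """
    if sys_uptime_ticks <= 0:
        raise ValueError("sysUpTime must be positive")
    return errors / (sys_uptime_ticks / 100.0)


def error_rate_change(s0: Sample, s1: Sample, s2: Sample) -> float:
    """Estimate the second time derivative of the error count.

    Two successive error rates are computed from three polls, and their
    difference is divided by the time between the interval midpoints.
    """
    rate_a = (s1.errors - s0.errors) / (s1.time_s - s0.time_s)
    rate_b = (s2.errors - s1.errors) / (s2.time_s - s1.time_s)
    midpoint_gap = (s2.time_s - s0.time_s) / 2.0
    return (rate_b - rate_a) / midpoint_gap
```

A steady error rate yields a value near zero from error_rate_change, while a rate that is climbing yields a positive value, which is the kind of signal the text argues is more useful than a long-run average.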
To reify these meta-variables, a management station would have to perform the function. This implies that the function must be expressed by some procedural statement that can be mapped down to basic SNMP get and set primitives and polling. One might say that the functions would be best expressed as simple scripts.
The definition of these meta-variables and the functions used to generate them would be expressed in standard MIB definition documents with appropriate formalities so that they could be machine parsed and utilized by a management station.
An extension of the meta-variable concept is to place intermediary devices in the network whose role is to compute these meta-variables and export them as real SNMP variables in a MIB specific to those intermediary devices. Another extension is for the SNMP agents themselves to compute the meta-variables, in which case they become real variables.
Although SNMP itself is relatively ``simple'', it takes some work to build the MIB support in an agent, and considerably more work to build the management support to utilize the MIB data, and a great deal of work to deploy the manager onto the various network management ``platforms''.
An HTTP/HTML management server embedded in a managed device, with underlying TCP, is not significantly more complex or memory intensive than an SNMP agent with mechanisms supporting generalized lexi-ordering and arbitrary collections of objects in a set. (It is easy to vastly underestimate the amount of work required for an agent to handle an arbitrary collection of proposed values which may arrive in a set request.)
If one looks at many of today's workstation-based management platforms, one quickly realizes that they are really not much more than a collection of device-specific add-ons.
Those add-ons could be just as easily created by having a device export highly device specific web pages with controls and user interface paradigms. For example, management platforms take pride in the fact that they can project a rendering of a managed device, so that the operator can point at a port to invoke a control panel for that specific port. This is pretty routine stuff for a typical web server.
The device vendor ships one, self contained product. That product includes its own management functions and does not depend on anything except that WWW browsers are reasonably uniform and ubiquitous. With respect to its management functions, the vendor controls the horizontal and it controls the vertical; the vendor controls everything about the device and its management, from operation to GUI. It's an extremely attractive proposition.
The great drawback of this approach is that it requires human intelligence to comprehend the WWW forms presented by a device. If one accepts the proposition, as we do, that in the long-term, networks should perform significant self-management, then this approach represents a substantial danger that we will end up further from our goal rather than closer.
The myths of ``The Collapsing Network'' and ``The Dumb Agent'' have forestalled many efforts to consider a connection-oriented alternative to SNMP.
Today's Internet is successfully carrying an enormous transaction load using the World-wide Web's HTTP, which is a TCP-based protocol. HTTP transactions follow a very simple life-cycle:
One could readily conceive of a number of ways to encode MIB information in that chunk of binary data. It could be truly binary, with its own MIME type. Or it could be embedded in HTML as readily identifiable, machine parseable, structured comments.
One might think that the real issue with this approach is how to map get, get-next, get-bulk, and set onto this scheme.
However, the real issue is whether we really need the get* trinity at all. The get operation is the only silver-bullet of the three SNMP retrieval operations; the latter two are merely means to get past SNMP's limited data unit imposed by the myth of ``The Collapsing Network''. As such, all three retrieval operations could be collapsed into a single get-subtree operator that takes a single parameter, an object identifier, and returns all objects which are prefixed by that OID. For convenience, we ought to define the subtree traversal to return the objects in lexicographic order; and, for efficiency, we should allow a list of prefixes and allow the return of multiple sub-trees.
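The get-subtree operator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the agent's MIB is modelled as an in-memory dictionary, and the names Oid, parse_oid, and get_subtree are hypothetical. OIDs are held as tuples of integers so that Python's tuple ordering gives the lexicographic order the text asks for.

```python
from typing import Dict, List, Tuple

Oid = Tuple[int, ...]


def parse_oid(text: str) -> Oid:
    """Turn dotted notation such as '1.3.6.1.2.1.1.3.0' into a tuple."""
    return tuple(int(part) for part in text.strip(".").split("."))


def get_subtree(mib: Dict[Oid, object],
                prefixes: List[Oid]) -> List[Tuple[Oid, object]]:
    """Return every object whose OID begins with one of the prefixes.

    Results are sorted by OID; comparing integer tuples orders
    sub-identifier 10 after 2, unlike a plain string comparison.
    """
    matches = [
        (oid, value)
        for oid, value in mib.items()
        if any(oid[:len(prefix)] == prefix for prefix in prefixes)
    ]
    return sorted(matches, key=lambda item: item[0])
```

Passing a list of several prefixes returns several sub-trees in one call, merged into a single lexicographically ordered result.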
So, how would this actually be mechanized over HTTP?
Consider SNMP queries as the equivalent of a WWW form in which the user or management station simply lists the MIB objects it wants to obtain or set values into. The SNMP response would be the web page returned as a result of processing the input form.
For processing efficiency, this result need not be encoded in a way that could be directly presented to a human user. The data could be handled either by a special application which speaks HTTP or by a management plug-in to a WWW browser.
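One possible way to treat a query as a form submission is sketched below in Python. The field names get and set, and the choice to carry each set as oid=value inside one field, are assumptions made purely for illustration; the article does not define an encoding.

```python
from urllib.parse import parse_qsl, urlencode


def encode_query(get_oids, set_values):
    """Build a form-encoded body listing objects to read and to set.

    The field names 'get' and 'set' are hypothetical.
    """
    fields = [("get", oid) for oid in get_oids]
    fields += [("set", f"{oid}={value}") for oid, value in set_values.items()]
    return urlencode(fields)


def decode_query(body):
    """Recover the requested OIDs and proposed values on the agent side."""
    gets, sets = [], {}
    for name, value in parse_qsl(body):
        if name == "get":
            gets.append(value)
        elif name == "set":
            oid, _, new_value = value.partition("=")
            sets[oid] = new_value
    return gets, sets
```

A management application, or a browser plug-in, would send the encoded body in an HTTP request and parse whatever machine-readable page the agent returns.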
One very attractive feature about this approach is that it may be able to piggyback on those WWW security features which are falling into place.
The main drawback of this scheme is that it can be highly intensive in its use of TCP connections, but as has been mentioned, the WWW community is already facing and, hopefully, resolving this problem.
It has been argued by some that this approach would degenerate into a prodigious number of short TCP connections, each retrieving only a small number of MIB variables. This is a valid concern. It has also been argued that the gain offered by TCP is not so great when compared with the get-bulk operator. This is true; however, get-bulk is not widely deployed (yet). And it merely changes the point at which the curve of TCP efficiency crosses that of SNMP efficiency; it does not change the fact that, as MIB retrieval size increases, TCP becomes more efficient than UDP-based SNMP.
In SNMP, this relationship is somewhat vague and tends to be indirectly visible as polling by managers (to determine ongoing device status), trap destination configuration in agents, and table management in RMON devices. In the various SNMPv2 proposals, this relationship was made manifest through the various administrative frameworks.
Why not go the next step to acknowledging the relationship and creating an explicit manager-agent ``association''? This association would be composed of security and other state information and there would exist, whenever possible, an open transport connection between the manager and agent. (When that underlying transport connection fails, the two ends would attempt to reconstruct it and re-synchronize their association state.)
This approach vastly simplifies the issues of security: authentication and privacy exchanges would occur at association startup and would be cross-checked at important points in the association (typically in the form of a handshake when re-building transport connections and as cryptographic checksums embedded as integrity checks in the various transactions crossing the association).
This approach also obtains a significant performance improvement over today's SNMP when moving any significant amount of data. (With today's TCP protocol engines for small queries, however, there may be three or four packets crossing the net rather than the two for UDP based SNMP, although the comparative analysis can be rather complex and highly subject to packet loss rates and the TCP windowing and ACK behavior of a given TCP implementation.)
One can imagine, for example, a script that monitors the variables in a device watching for tell-tale signs, such as a rapid increase in an error rate. The script could then either report the problem, trigger additional diagnostic tests, or take corrective action. (The latter two would require sophisticated scripts.) Scripts are ideal mechanisms to evaluate and act upon meta-variables, as described earlier.
Scripts are often expressed in a simple interpretive language. Each line of the script is simply a row in a table of octet string variables.
This is not a new idea: some years ago David Levy of SNMP Research published a ``Script MIB'' and the University of Delft allows a management station to inject Scheme language programs as RMON filters.
The real difficulties of all script approaches are not the scripts themselves or the language used (although, as one can expect, there are competing camps advocating TCL, Scheme, Java, Python, APL, RPG, Cobol, or French).
The difficulties are these:
This is not a new idea. Professor Yechiam Yemini of SMARTS has been building tools using these techniques for many years. Java's popularity is extending this idea to areas other than network management.
The basic idea is ``management by delegation'': the superior level manager creates a script which it downloads into the area manager for execution. The area manager is, of course, a multi-threaded device and can execute many scripts simultaneously, perhaps on behalf of multiple superior managers.
An area manager would usually be given authority over devices with which it has inexpensive, low latency communications. One might conceive of an area manager's span of control as a single LAN or a group of LANs connected with a single high performance router.
The area manager might interact with the end devices using scripts, but it is far more likely that it would be done using traditional SNMP. One of the benefits of the proximity of the area manager and the ultimate devices is that the high bandwidth and presumably low packet loss rate would allow SNMP exchanges to be done rapidly and with minimal data distortion due to non-atomic snapshots of device tables.
An interesting possibility of area managers is that if they are equipped with out-of-band communications paths they can play a very useful role in network troubleshooting. Anyone who has ever repaired networks knows that you always need to be in at least two places at once. An area manager running a pre-loaded script can act as a troubleshooter's remote eyes and ears. For example, an area manager might be running a script which says:
Watch the network traffic and routing protocols and periodically ping sites off the local net to confirm outside connectivity. Should outside connectivity fail, perform traceroutes and report the results using the out-of-band channel.
In terms of network management, worms are really scripts that can replicate and migrate. They are really just the next step in the continuum that begins with script MIBs. Some researchers in the network management community are already working with migratory programs. They can perform network device discovery using a worm that migrates through the network and sends a report back to a central logging address whenever the worm moves to a new machine.
Effective use of worms requires that they be ``safe'', that they have finite lifetimes and limited appetites for network and computing resources, and that they can themselves be located, managed, and terminated.
None of the ideas presented here are impossible. Any one could be developed and deployed within 12 months.
Web technology is being employed in both device management and network management applications. We will discuss each in turn.
But, before we begin, a bit of terminology is in order:
Other companies have implemented this level of functionality as well. In the 1980's, nearly all network devices had console ports supporting VT100 style terminals. This was the ubiquitous terminal of the day. Most devices implemented a TCP stack, many times just to provide remote access to the console. In the 1990's, the web browser has replaced the VT100 as the ubiquitous terminal of choice. Telnet servers are not even installed on many devices. This observation has fueled the drive for HTTP consoles in network devices. However, just as console interfaces were not a good candidate for standardization when VT100's were the chosen interface, these web interfaces are quite different from one another and are not good candidates today.
These are examples of how web technology has improved network management. But what's the next step? How far can we go? What impact does Java have?
Cisco began experimenting with embedding Web interfaces in routers in mid-1995. We wanted to give the maximum capability to a browser-based user interface with minimum impact to software image size. Initially, we didn't think it was possible to translate the entire Cisco command language to the new paradigm. However, after some experimentation, a scheme to use the same command parsing functions as the TTY interface to generate web pages was devised.
In internal data structures we had the names of the commands and the help strings to go with them. The HTTP code extracts those strings from the parse tables and formats them in HTML. Each command is turned into a link. For example, show interface ethernet0 is turned into the URL:
http://routername/exec/show/interface/ethernet0/
This approach means that anyone with a web browser can issue a router command and the result will come back into the browser window as scrollable text.
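The mapping from a command line to a URL can be sketched as follows. The article gives only the single show interface ethernet0 example, so the general rule used here (one path segment per command word under /exec/) is an assumption, and command_to_url is a hypothetical helper rather than anything shipped in a router.

```python
from urllib.parse import quote


def command_to_url(router: str, command: str) -> str:
    """Turn a command line into an /exec/ URL, one path segment per word.

    Each word is percent-encoded so that characters such as '/' inside
    an argument cannot be mistaken for a segment separator.
    """
    words = command.split()
    path = "/".join(quote(word, safe="") for word in words)
    return f"http://{router}/exec/{path}/"
```

For the example in the text, command_to_url("routername", "show interface ethernet0") produces the URL shown above.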
We thought this was interesting and went off to solve other problems such as organizing these links into useful subsets, creating forms for specific functions, and so on. But it was our customers that surprised us! We found that customers are building Web-based pages for help desk guides. As help desk personnel are trying to track down problems, they follow step-by-step procedures on web pages. Some of these steps on the web pages are actually links. One may say ``click here to show router interface statistics.'' This is much more straightforward than directing the user to telnet to the router and execute a command. Through the use of frames, the procedure can stay on the screen while the result of the procedure shows in another frame.
Having full device commands as web operations can greatly simplify automation of simple network operations tasks. If links to different devices are listed in the series of web pages, then multiple devices can be woven together in the investigation. Keep in mind that the output produced by these operations is designed to be human friendly, and not necessarily machine readable.
``If you build it they will come.''
So a particularly persistent individual took the latest CMU code and started porting. When the first version came out, people came from all around to see SNMP running in Java. When this poor soul admitted that the varbind encoding routines were left in the C programming language, he was ridiculed. So he went back to work and finished the job. Next, fancy Java gauges were tied to the SNMP stack. When several gauges monitoring separate MIB objects were put up in a window for an Interop demo, the criticism continued:
``You're going to clog the network with this polling if you have separate PDU's for each gauge.''
So a scheme to make the SNMP class an applet in its own right and use inter-applet communications was devised. This exercise shows that there is no end to the things that have to be redeveloped when a new paradigm comes along.
But we weren't out of the woods yet; there were two more problems. The first was that this was getting to be an awful lot of code to download to a browser every time you want to use it. The second was Java's network security paradigm, which dictates that an applet can only open a socket to the server that the applet was served from. This has a couple of unfortunate implications. If we want to do SNMP to a device from a Java applet, that device must be sophisticated enough to be able to download an SNMP stack over HTTP. We changed the router to download applets from its flash file system, but that's only useful for demonstrations. Additionally, if we want to put up a browser with gauges monitoring two different devices, we'd need two copies of the SNMP applet. One copy would have to come from each device.
Sun is interested in relieving these and some related implications, and it seems a more sophisticated applet security model is on the way. Regardless, some applets will have to be trusted more than others. I guess some will ship with ``root privileges'' (sarcasm!). These limitations are not present for Java applications that are installed in the local environment and run from the Java interpreter and not from a browser. So the benefit of platform portability is preserved while the benefit of dynamic loading of client software is not.
In contrast, Microsoft addresses the effort required for their new paradigm with programmer tools that generate much of the code to use their object abstraction automatically. They recognize that when an engineer has a choice he won't want to write user interface code to the new model and then write proxy code to map the new model to SNMP. Most engineers will use SNMP directly so they can get their work done.
Network and system management nirvana seems to be stated something like this: if we can make all devices look alike, we can make management simple. The trouble is that even if vendors would like to convert all of their MIB's to some new object paradigm, the last thing they want is for their devices to look just like their competitors'. Each vendor will insist on having a way for their competitive advantage to be manipulated and displayed. If you add a bell to the box, marketing will want to see the bell that they defined in the management system. Commonly useful object models will quickly decay from the baggage that is placed upon them.
The success of new paradigms is predicated on them providing significantly more value than the old paradigm. It remains to be seen if the power of the web can push these new paradigms over the top.
Better ideas than SNMP come and go. Most fail because they don't realize that SNMP represents a pretty nice balance between the possible and the practical.
The typical transaction is one in which the client establishes a connection to the server, issues a request, and waits for a response from the server. The server, upon receiving the request, processes it, sends a response, and then closes the connection.
An example of a GET method might be:
GET /ModGenConf.html HTTP/1.0
Accept: image/gif
Accept: image/jpg
Accept: image/x-xbitmap
User-Agent: Lean Mean BrowserBoy/ Strictly beta
If-Modified-Since: Monday, 1-Jan-96 19:04:09 GMT
while an example POST method might be:
POST /ModGenConf.html HTTP/1.0
Accept: image/gif
Accept: image/jpg
Accept: image/x-xbitmap
User-Agent: Lean Mean BrowserBoy /Strictly beta
DATE: 06%2F06%2F96&IP=188.8.131.52&SAVE=YES
A blank line is used to delimit each method.
Responses begin with a status line consisting of the protocol version of the server followed by a response status code, and an optional response status message, e.g., ``HTTP/1.0 200 OK''. Typical response codes and messages include:
Following the status line are one or more header/value pairs and then the actual data.
Similar to the request, the response then includes an optional list of header/value pairs. Two of these header fields, Content-Type and Content-Length, indicate the media type and the length, respectively, of the resource being returned. The media type is given as a MIME type. MIME is defined in RFC 1521, and can be used to identify any media encoding type including application-specific types.
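The response layout described here (status line, header/value pairs, then data) can be taken apart with a short Python sketch. The function name parse_response is hypothetical, and the sketch assumes the whole response is already in memory and uses CRLF line endings; a real client would read from a socket and be more forgiving.

```python
def parse_response(raw: bytes):
    """Split an HTTP response into status line, headers, and data.

    Returns (version, code, message, headers, body). Header names are
    lower-cased so that lookups are case-insensitive.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol version, status code, optional message.
    version, _, rest = lines[0].partition(" ")
    code, _, message = rest.partition(" ")

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length, when present, gives the size of the data section.
    length = headers.get("content-length")
    if length is not None:
        body = body[:int(length)]
    return version, int(code), message, headers, body
```

The Content-Type value in the returned headers tells the caller how to interpret the body, for example text/html for a page meant for a browser.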
The server issues an authorization challenge by sending the client a 401 (unauthorized request) response. In the response, it identifies one or more supported authentication schemes along with whatever parameters are necessary for achieving authentication via that scheme. One of these parameters indicates a realm, or protection space, on the server. It allows the server's resources to be partitioned into multiple protected regions, each with its own authentication scheme and/or authorization database.
The client, upon receiving a 401 response, can re-issue the request with an Authorization header that identifies the authentication scheme in use along with the necessary credentials to prove its identity for the realm in question.
RFC 1945 defines an authentication mechanism called ``basic'', which all clients are encouraged to support. This model employs user-ID/password credentials for a realm.
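The challenge and response headers of the basic scheme can be sketched in Python as follows. The helper names are hypothetical; the header names (WWW-Authenticate on the 401 response, Authorization on the re-issued request) and the base64 encoding of user-ID, colon, and password are those of RFC 1945.

```python
import base64


def challenge(realm: str) -> str:
    """Header a server sends with a 401 response to name the realm."""
    return f'WWW-Authenticate: Basic realm="{realm}"'


def basic_credentials(user: str, password: str) -> str:
    """Header a client sends when re-issuing the request."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return f"Authorization: Basic {token.decode('ascii')}"


def decode_basic(header_value: str):
    """Recover (user, password) from the value of an Authorization header."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password
```

Note that base64 is an encoding, not encryption: anyone who can observe the request can recover the password, which is why the scheme is only as strong as the channel it travels over.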
Proper page design philosophy has been to minimize bandwidth used by minimizing the amount of information a document contains (image content in particular). This philosophy also benefits management agents, which generally don't have a large amount of persistent storage, by only requiring them to store a minimum of information at the agent.
HTML uses ``markup'' tags to denote regions of text as having specific characteristics. The tags serve as instructions to the browser on how to render a region of text. These tags are portions of text surrounded by the less-than (<) and greater-than (>) characters. For example, the <B> tag indicates that the browser should bold the text following the tag and a </B> indicates to the browser that the bolding should end. HTML provides tags for formatting of text, inclusion of graphic images, navigation to other documents (hyperlinks), and standard form controls (text boxes, radio buttons, and so on).
The web-based agent implementation described within this article exports HTML formatted management documents to a standard web-browser. No specialized management station software is needed to use the agent. There are three issues to consider: document design, agent implementation, and authentication.
The static document is a document whose contents never change. It can be built into the agent in the form of a static data structure, put in the agent's persistent storage, or referenced by the agent by using a URL naming a supporting device (a supporting server or another agent). Examples of static documents are graphic images, help/informational documents, and HTML documents used entirely for the user's navigation to other documents (via hyperlinks).
The next type of document is a dynamic document. This document's contents have the potential to change over time. The contents of a dynamic document are assembled at run-time upon a request for the document from the client. This is the most common type of document supported by our agent. It can be used for many purposes such as dynamically displaying the current values of statistics kept at the agent, and current values of user configurable operational parameters at the agent.
The final type of document is a form document. This type of document is used to modify the current operational mode of the device or agent. Form documents can be either static or dynamic themselves. The controls (e.g., textbox submission fields, two-state buttons, or multi-choice selection boxes) on forms often have current state values associated with them. These state values must be inserted into the control upon a request for the form document.
The layout of documents for our agent is done with the aid of a commercially-available HTML layout editor. This tool is augmented by an in-house developed tool which gives the document designer the ability to associate certain fields within a document with dynamic (or ``live'') data within the device and, in the case of form documents, supply an action method for each control element to be executed upon form submission.
Upon a request from a client, the agent searches the directory using the Request-URI as a key to find a particular document.
For a GET request, the document's serialization method is called when the agent receives a request for the document. The results of the serialization method are the current contents of the document. These results are returned as the data section of the HTTP response.
In the case of a POST request, the document's serialization routine is called after the document has processed the data section of the request. The document's control element parsing method separates the data section of the request into control/value pairs. Each pair represents the submitted value of a specific control on the form. Then, for each control, the parsing method calls a specific user-supplied method (defined in the document design step), passing the value associated with the control. The serialization results for a form after processing a POST method are completely up to the form designer. The results could simply be the form document with the new values used in the form submission, or be a document indicating the success or failure of the submission.
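The request handling just described (directory lookup by Request-URI, control parsing for POST, then serialization) might be sketched in Python as shown below. This is not the agent's actual code: the class FormDocument, the function handle_request, the {name} placeholder syntax, and the sample address are all assumptions made for illustration.

```python
from urllib.parse import parse_qsl


class FormDocument:
    """A form document with live values and per-control action methods."""

    def __init__(self, template, state, handlers):
        self.template = template    # text containing {name} placeholders
        self.state = state          # current values to insert on serialization
        self.handlers = handlers    # control name -> callable(value)

    def serialize(self):
        """Assemble the document's current contents at request time."""
        return self.template.format(**self.state)

    def parse_controls(self, data):
        """Split the data section into control/value pairs and dispatch."""
        for control, value in parse_qsl(data):
            handler = self.handlers.get(control)
            if handler is not None:
                handler(value)


def handle_request(directory, method, request_uri, data=""):
    """Look the document up by Request-URI and produce (status, body)."""
    document = directory.get(request_uri)
    if document is None:
        return 404, ""
    if method == "POST":
        # Process the submitted controls before serializing the result.
        document.parse_controls(data)
    return 200, document.serialize()
```

Here the form designer chose to return the form itself with its new values; returning a separate success or failure page would only change what serialize produces after a POST.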
The agent itself can be configured to be single- or multi-threaded. The advantage of using a multi-threaded server is primarily increased throughput. For many management implementations, a single-threaded server is fine. (Browsing a Web-based management agent isn't particularly exciting!)
Another advantage of a multi-threaded agent is that when used with persistent connections, the agent will be able to service more than one client at a time. Persistent connections are aimed at solving the latency problem with TCP slow-start. Persistent connections were added ad-hoc in HTTP 1.0 and will become standard in HTTP 1.1. The disadvantages are possible manager conflict and resource consumption issues.
Our agent uses two authentication realms (protection spaces) within the device, read-only and read-write. The agent maintains two passwords, one for read-only access and one for read-write access, which are stored in persistent storage on the device. (This is quite similar to SNMP, and on some of our devices, the SNMP community strings will be used as the passwords.) A read-only password only allows the client (browser) to access pages in the read-only realm. The read-write password allows access to both realms.
Each document in our agent is associated with one realm. If a client fails to authenticate itself with the agent for a particular document, the agent challenges the client with that document's realm (read-only or read-write). The agent can determine a document's realm using the document's authentication realm method, which simply returns a string indicating the realm. In our implementation, informational documents typically only require a read-only level of access, but may occasionally require read-write if they contain privileged information. Form documents are generally not accessible at the read-only level of access.
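The two-realm rule can be expressed compactly. The Python sketch below is an assumption about how such a check could look, not the agent's implementation; the function names and the sample passwords in the usage note are hypothetical.

```python
import hmac

READ_ONLY = "read-only"
READ_WRITE = "read-write"


def realms_for_password(supplied, read_only_password, read_write_password):
    """Return the set of realms a supplied password unlocks.

    The read-write password opens both realms; the read-only password
    opens only the read-only realm. compare_digest avoids leaking how
    many leading characters matched.
    """
    supplied_bytes = supplied.encode("utf-8")
    if hmac.compare_digest(supplied_bytes, read_write_password.encode("utf-8")):
        return {READ_ONLY, READ_WRITE}
    if hmac.compare_digest(supplied_bytes, read_only_password.encode("utf-8")):
        return {READ_ONLY}
    return set()


def authorize(document_realm, supplied, read_only_password, read_write_password):
    """Return (200, None) if access is allowed, else (401, realm to challenge)."""
    if supplied is not None:
        allowed = realms_for_password(supplied, read_only_password,
                                      read_write_password)
        if document_realm in allowed:
            return 200, None
    return 401, document_realm
```

With passwords "public" and "private", a client presenting "public" can fetch a read-only page but is challenged with the read-write realm when it requests a form document.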
Now let's look at how this approach to Web-based management compares to the traditional SNMP approach.
First, there is no specialized management software necessary to configure or monitor a device.
Second, the versioning problems that typically occur when an older agent or manager doesn't support the new and possibly required features of the other are eliminated. These problems occur when either a new MIB is supported at the agent and not supported (displayed) at the manager, or when a new version or feature of the manager expects a minimum of MIB support from an older agent. By using a Web-based agent to export screens directly, the agent and manager need not be updated simultaneously. The same problem is exacerbated in a multiple vendor environment. Many vendors supply devices requiring some level of vendor specific MIB support. This complicates the job of the management station vendor by requiring them to support additional screens for these vendor specific MIBs. (MIB browsers are simply inadequate for this task!) Often, this leaves the network manager with no choice but to buy specific management applications for each vendor's device and then learn to use all of them. By delivering the management information via a Web-based agent, the application is delivered with the network device and the common familiar interface of a Web browser exists for all applications.
The next major advantage is that of platform-independence and location-independence for the application. A network manager can access his Web-controlled network from anywhere, on any platform. All that is needed is a general-purpose Web browser, which is standard equipment for the vast majority of platforms.
A final major advantage is seamless integration with on-line documentation. Context-sensitive help and documentation may be accessed via hyperlinks embedded directly into the agent's management pages. Additionally, configuration and management can be driven entirely from an on-line instruction manual, in tutorial fashion.
Also, security options for HTTP are becoming available. Security for commercial applications using HTTP over the Internet will force a solution in this area. A management application that uses HTTP can then leverage these security features. In addition, using one security mechanism between multiple applications would greatly ease the burden for network administrators who often have to configure security for all applications.
The first issue is the latency of HTTP when used for small transactions. This latency arises because the protocol is layered over TCP, so each transaction pays for connection setup. It has been raised as an issue against using HTTP as the transport for generic MIB information (instrumentation at the agent). However, latency is an issue for ALL HTTP transactions, not just in this specific case. The HTTP community is aware of the problem and is working toward solutions. One solution is to use persistent connections, and thus amortize connection overhead over multiple requests. The theory is that if you want one resource from a server, you are probably going to want another.
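To make the persistent-connection idea concrete, here is a small sketch using Python's standard library. It assumes an HTTP/1.1 keep-alive server running on the loopback interface; the TinyAgent handler and the resource paths are invented for illustration. The server records the client's source port for each request, so an unchanged port shows that both requests travelled over a single TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_client_ports = []  # source port of each request, recorded by the server


class TinyAgent(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = ("resource " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length lets the client find the end of the body
        # without the server closing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def fetch_over_one_connection(paths):
    """Fetch every path in `paths` over a single TCP connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), TinyAgent)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)  # reuses the already open connection
            bodies.append(conn.getresponse().read().decode("ascii"))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies
```

With this arrangement the connection setup cost is paid once, however many resources are fetched.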
The next issue of concern applies mainly to the approach of exporting formatted management information from agents. The concern with this approach is that the burden of display is placed on the agent. This would be a valid concern for only the ``lowest'' of low-end devices. The overhead is fairly minimal and varies based on the desired complexity of the documents exported from the agent. It primarily manifests itself as additional memory overhead (for document formatting) on most agents.
The last issue of concern is that HTTP requires a TCP implementation. Many devices already support other applications that require TCP, e.g., telnet, so this will not be an issue for them. In fact, for those implementations with existing telnet support, an HTTP/HTML implementation could replace the telnet functionality, and thus require little or no extra memory overhead.
Both provide the client (manager) the ability to modify or retrieve specific resources within a server (agent). SNMP provides a standard mechanism for identifying resources and a standard representation of those resources (a media type). HTTP provides the ability to identify and transport resources of any media type between communicating entities.
Devices that support both SNMP and HTTP will duplicate functionality. This is why it may be desirable to have an HTTP/SNMP interoperability standard. Interoperability with SNMP could be a very simple matter in my opinion. It could be as simple as defining a new MIME type for SNMP operations. This type could be the current encoding standard for SNMP, BER, a newly defined encoding standard, or both. Current SNMP operations over HTTP could then take advantage of the protocol's ability to transfer large amounts of information, and a new get-subtree operation could also be defined to explicitly take advantage of this feature, as is currently under discussion on the SNMP mailing list.
CyberAgent is FTP Software's intelligent agent technology based upon Sun's Java programming language. An intelligent agent is a mobile, independent piece of software that travels the network and accomplishes tasks remotely. CyberAgent classes supplement the Java Development Kit and provide templates for agent development.
CyberAgent technology provides a remote execution capability that, when integrated into management applications, may go a step beyond SNMP-based management applications. Using intelligent agent technology, you can write agents for specialized network management purposes.
This article discusses the CyberAgent framework, agent development in network management and how you can write enhanced SNMP management applications using the CyberAgent technology.
Another factor governing CyberAgent movement is the travel plan. A travel plan describes the general manner in which a CyberAgent visits the computers on its destination list and, like the destination list, is defined when the CyberAgent is launched. You can choose between two travel plans at launch time: radial or sequential.
In a radial travel plan, a CyberAgent appears to travel from the home node to each of its destination nodes simultaneously and then return home. Actually, multiple copies of the CyberAgent are dispatched in rapid succession, one to each destination node. If so programmed, each of the CyberAgent copies may then return to the home computer.
+------------+                         +------------+
|   Node A   |                         |   Node B   |
|            |                         |            |
| Windows 95 |                         |    UNIX    |
+------------+                         +------------+
             \                        /
              \                      /
               \                    /
                \                  /
                 \                /
                  \              /
                   +------------+  CyberAgents are
                   |   User's   |  simultaneously sent
                   | Management |  to each node, they
                   |  Station   |  collect data and
                   +------------+  return home
                  /              \
                 /                \
                /                  \
               /                    \
              /                      \
             /                        \
+------------+                         +------------+
|   Node C   |                         |   Node D   |
|            |                         |            |
| Windows NT |                         | Windows 95 |
+------------+                         +------------+
In a sequential travel plan, a CyberAgent is launched from the home node and travels to node A where it performs a task and maybe collects some data. After completing its task on node A, it continues its journey to node B and performs its task on node B. After completing its task on node B, it repeats the process of traveling to the next node and performing its task until it reaches node D, the CyberAgent's last stop. Then, if so programmed, the CyberAgent returns its data (and even itself) to home. The sequential travel plan is sometimes described as a store-and-forward plan.
+------------+                    +------------+
|   User's   |     CyberAgent     |   Node A   |
| Management |  -------------->   |            |
|  Station   |      launched      | Windows 95 |
+------------+                    +------------+
      ^                                 |
      |                                 | CyberAgent
      | CyberAgent returns              | collects data
      | home with results               | and goes to
      |                                 | next node
      |                                 V
+------------+                    +------------+
|   Node D   |                    |   Node B   |
|            |  <--- .... <---    |            |
| Windows 95 |                    |    UNIX    |
+------------+                    +------------+
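The difference between the two travel plans can be sketched in a few lines. This is only a model of the behavior described above, not the CyberAgent API: the Agent class and the radial and sequential functions are hypothetical, and "travel" is simulated by ordinary function calls.

```python
import copy


class Agent:
    """Hypothetical stand-in for a mobile agent that carries collected data."""

    def __init__(self):
        self.data = []

    def run_at(self, node):
        # The "task" here is simply recording which node was visited.
        self.data.append("visited " + node)


def radial(agent, destinations):
    """Dispatch one copy per destination; each copy reports home on its own."""
    reports = {}
    for node in destinations:
        clone = copy.deepcopy(agent)  # a separate copy goes to each node
        clone.run_at(node)
        reports[node] = clone.data    # the copy returns its data home
    return reports


def sequential(agent, destinations):
    """Send a single agent from node to node, accumulating data as it goes."""
    for node in destinations:
        agent.run_at(node)
    return agent.data                 # returned home after the last stop
```

In the radial model each copy carries only its own node's data, while in the sequential model the single agent arrives home with the data from every stop.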
The exact sequence of computers visited can be altered by the CyberAgent under program control. While restricted to the destination list, the CyberAgent can move at will among those computers, collecting information, depositing data in local files, executing Java and non-Java programs, sending data home, and performing any function that a program confined to a single computer can perform.
While this feature is powerful, it raises the issue of coordination between CyberAgents and the information they may collect, especially when multiple CyberAgents (or multiple copies of the same CyberAgent) arrive at the same computer. CyberAgents communicate (and coordinate) through CyberAgent messages which may contain any native Java object. Information collected asynchronously may be aggregated into a single file at the home computer through synchronized file access methods that are transparent to the CyberAgent programmer.
Mobility control flow parallels control flow for fixed programs; both have initialization, iteration, and termination phases. Mobility initialization occurs prior to CyberAgent launch, iteration is the code executed at each computer on the destination list and termination occurs after the CyberAgent returns home. Each of these phases is implemented as a CyberAgent Application Program Interface (API) method.
A CyberAgent controls its own movements by determining where it is, setting its target destination (where it is to go next) and then sending itself to the target destination. The normal case is to go to the next destination on the list. Alternately, the CyberAgent may decide to go to any other destination on the list or even to break out of the destination list and send itself home. Sending itself home is analogous to a break statement in a fixed control flow structure.
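The three phases and the "send itself home" break can be modeled as below. This is a sketch under stated assumptions: the class MobileAgent, its method names, and the stop_at parameter are invented for illustration and do not correspond to the actual CyberAgent API methods.

```python
class MobileAgent:
    """Hypothetical agent with the three phases named in the text."""

    def __init__(self, destinations, stop_at=None):
        self.destinations = list(destinations)
        self.stop_at = stop_at  # assumed condition that ends the trip early
        self.log = []
        self.target = None

    def initialize(self):
        # Runs at home, before launch.
        self.log.append("init")
        self.target = self.destinations[0] if self.destinations else "home"

    def iterate(self, here):
        # Runs at each computer the agent visits.
        self.log.append("at " + here)
        if here == self.stop_at:
            self.target = "home"  # analogous to a break statement
            return
        i = self.destinations.index(here) + 1
        # Normal case: continue to the next destination on the list.
        self.target = self.destinations[i] if i < len(self.destinations) else "home"

    def terminate(self):
        # Runs at home, after the agent returns.
        self.log.append("done")


def launch(agent):
    agent.initialize()
    while agent.target != "home":
        agent.iterate(agent.target)  # "send" the agent to its target and run it
    agent.terminate()
    return agent.log
```

For example, launching an agent over A, B, and C with stop_at set to B visits A and B and then returns home without visiting C.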
In clear mode, CyberAgents are not password protected or encrypted and all CyberAgents that arrive are accepted without any attempt at authentication or integrity checking (appropriate if everyone on your network is trusted).
In password security mode, CyberAgents are not encrypted but only CyberAgents signed with a password shared by you and the sender are accepted. Password security mode provides a degree of authentication and integrity verification. Prior to launch, the password is hashed with the CyberAgent's contents to generate the password signature or Message Authentication Code (MAC). When the CyberAgent arrives at its destination, the recipient uses the common password to recover the MAC from the CyberAgent signature. The recovered MAC is then compared to a new MAC computed on the CyberAgent's contents. Successful signature decoding and MAC comparison indicate that the CyberAgent did indeed arrive from a trusted source and was not altered in transit.
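The general idea of a password-keyed MAC can be illustrated with Python's standard library. Note the substitution: the article does not say which hash or signature format CyberAgent uses, so this sketch uses HMAC with SHA-256 instead, and the function names sign and accept are hypothetical.

```python
import hashlib
import hmac


def sign(contents: bytes, password: str) -> bytes:
    """Sender side: compute a MAC over the contents, keyed by the password."""
    return hmac.new(password.encode("utf-8"), contents, hashlib.sha256).digest()


def accept(contents: bytes, signature: bytes, password: str) -> bool:
    """Recipient side: recompute the MAC and compare it with the one received."""
    expected = sign(contents, password)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

An agent whose contents were altered in transit, or which was signed with a different password, fails the comparison and is rejected.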
In DES security mode, CyberAgents are both encrypted and signed with a secret key (known to the recipient and the sender). The DES digital signature provides strong authentication and integrity assurance. Prior to launch, the sender encrypts the CyberAgent with the shared secret key. Then the key is hashed with the CyberAgent's contents to generate the digital signature or MAC. When the CyberAgent arrives at its destination, the recipient uses the same secret key to decrypt the CyberAgent and to recover the MAC from the CyberAgent signature. The recovered MAC is then compared to a new MAC computed on the CyberAgent's contents. Encryption ensures that privacy is protected, and successful signature decoding and MAC comparison indicate that the CyberAgent did indeed come from a trusted source and was not altered in transit. RC2 encryption is similar to DES and is available for customers outside the United States and Canada.
Each of the security modes can be applied to CyberAgents from individuals or from a member of a community. A community shares the same password or key among all its members and is more convenient than having a separate password or key for every individual with whom you want to exchange CyberAgents.
Even SNMP-instrumented data may be impossible to access in certain cases. For example, because no industry-standard extensible agent protocol exists, software from multiple vendors sometimes cannot coexist on a device. In these cases, each vendor provides an SNMP agent for managing their product. Due to SNMP port contention, the user is forced to choose one agent and must forgo remote access to data normally available via the others. Using a CyberAgent, this data may become remotely accessible if an alternate local method for retrieval of the information is provided by the vendor.
CyberAgents can be used to enhance SNMP-based fault detection and management applications. These applications can detect error conditions either actively (such as by device polling), or passively (such as by receiving traps and informs from the devices). The information is passed to the network administrator visually, by sounding a bell, or via e-mail. In general, however, notification that a condition exists is the most that the application can do. Sometimes, these conditions are time-critical and cannot be diagnosed or rectified after the fact. Historic logging of data indicates that problems occurred but there is no way to reconstruct the conditions that caused them. Enterprise-specific trap handling is an application for CyberAgents. In the case where the device is sending notification of an internal problem, an agent can be dispatched to gather more detailed information about the state of the system and return it to the manager.
CyberAgents can be used as components in a system to monitor (and possibly resolve) IP address conflicts. CyberAgents can periodically update a centralized network database of IP addresses in use, detect when a conflict occurs, and determine which user is authorized to use the address. Another CyberAgent can then notify the unauthorized party to change their address. In some cases, depending on network stack vendor and operating system, a CyberAgent may even replace the duplicate with an unassigned address.
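The conflict check against a central database might look like the following sketch. The data layout is an assumption: sightings are (address, host) pairs reported by agents, and authorized maps each address to the host entitled to use it. The function name find_conflicts is hypothetical.

```python
from collections import defaultdict


def find_conflicts(sightings, authorized):
    """Return {ip: hosts to notify} for every address seen on more than one host.

    sightings:  iterable of (ip, host) pairs reported by agents
    authorized: dict mapping ip -> the host entitled to use that address
    """
    hosts_by_ip = defaultdict(set)
    for ip, host in sightings:
        hosts_by_ip[ip].add(host)

    conflicts = {}
    for ip, hosts in hosts_by_ip.items():
        if len(hosts) > 1:
            owner = authorized.get(ip)
            # Everyone except the authorized owner should change address.
            conflicts[ip] = sorted(h for h in hosts if h != owner)
    return conflicts
```

If the database has no authorized owner for a contested address, every host using it is listed, since none of them can be shown to be entitled to it.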
CyberAgents are also good candidates to perform software distribution and updates. A CyberAgent can carry a software installation package to target computers, copy the package to the target computer, notify the user that new software files are available and prompt the user to initiate installation.
A MIB Browser is an application that allows one to view the available variables in an SNMP agent's MIB by issuing a get request (or possibly a series of get-next requests). Most are terse and low level, meaning that they convey the raw information available from the device and make no decisions on the data, as a network management application would.
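The way a browser walks a subtree with repeated get-next requests can be shown with a toy in-memory agent. No SNMP library or network traffic is involved: the ToyAgent class and the walk function are hypothetical, and the variable values are made up. The OIDs themselves are the standard ones for sysDescr.0, sysUpTime.0, and ifNumber.0.

```python
def parse_oid(text):
    """Turn '1.3.6.1' into (1, 3, 6, 1) so OIDs compare lexicographically."""
    return tuple(int(part) for part in text.split("."))


class ToyAgent:
    """In-memory stand-in for an SNMP agent's MIB view."""

    def __init__(self, variables):
        self.mib = {parse_oid(k): v for k, v in variables.items()}
        self.order = sorted(self.mib)

    def get(self, oid):
        return self.mib.get(parse_oid(oid))

    def get_next(self, oid):
        """Return (oid, value) of the first variable after `oid`, or None."""
        start = parse_oid(oid)
        for candidate in self.order:
            if candidate > start:
                return ".".join(map(str, candidate)), self.mib[candidate]
        return None  # end of the MIB view


def walk(agent, root):
    """Collect every variable under `root` using only get-next."""
    prefix = parse_oid(root)
    results = []
    oid = root
    while True:
        step = agent.get_next(oid)
        # Stop at the end of the MIB or when we leave the subtree.
        if step is None or parse_oid(step[0])[:len(prefix)] != prefix:
            return results
        results.append(step)
        oid = step[0]
```

The walk needs no prior knowledge of which variables exist, which is what lets a browser display whatever an unfamiliar agent supports.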
A MIB Compiler is a program or script that converts a MIB file in ASN.1 notation (format of a MIB definition) to a file that can be read and interpreted by a specific network management application. A MIB compiler may also be used in the generation of SNMP agent code fragments and/or structures. Some network management applications do not use a MIB compiler but parse the ASN.1 file directly.
``In particular, you claim that the Opaque solution can be implemented by method routines without changing the agent's protocol engine, and by applications without changing the platform's protocol engine. But even if that is true, it is only because you are having the method routine and/or application take over part of the functionality of the protocol engine, i.e., you are advocating that both method routines and management applications implement their own ASN.1 encoding/decoding functions. In fact, it is precisely because Opaque requires this kludge that the SMI prohibits any further use of Opaque. If it wasn't for backward compatibility, Opaque would not be defined in RFC 1902.''
This issue of The Simple Times may be controversial in that it provides a forum for SNMP friend and foe to suggest and analyze emerging management technologies and their relationship to SNMP. Even so, it is important to explore these topics because SNMP and these emerging technologies will have to become good, life-time neighbors. As such, we should understand the synergy between them, both to allow SNMP to prosper, and to ensure that the new technologies aren't used beyond their limits.
In this editorial, we'll focus primarily on the marketplace to understand one of SNMP's (few!) failures, and the opportunities which have arisen.
The reason lies in how value is perceived and how it is actually added. At present, device vendors receive little benefit for their efforts at making their products manageable -- they can add only instrumentation. In order for the customer to perceive the value, the device vendor must rely on NMS developers and application vendors. This makes it difficult for a device vendor to differentiate themselves in their segment.
Further, standardized instrumentation is only defined for those aspects which are common to most devices. With every device having some proprietary functionality and most protocol standards being interface specifications, rather than functional specifications, full instrumentation, especially for configuration, always requires some aspects which are implementation-specific. This presents an insolvable problem for application vendors who must focus on standardized instrumentation in order to appeal to all device vendors in a segment. Thus, the operators are provided at best with incomplete solutions.
In brief, there is no longer any incentive towards innovation in device management, there is only an incentive to standardize ``more MIB variables''. We've done a good job at this. (Although one might observe that the number of new MIB modules has declined in recent years, suggesting that this too is running out of steam.)
Regardless of the breadth, depth, and quality of standards defining instrumentation, a failure to achieve solutions in the marketplace is a failure of standardization. Does this mean that SNMP is a failure? No, of course not. But clearly, the current arrangement has achieved steady-state for device management, and, by itself, SNMP has proven inadequate.
So, we need to change the incentives! Note that there are only two first-tier players: device vendors and operators. Our solution must enable their roles to have a greater influence on the market. To do this, we'll take advantage of an unrelated development in Internet technology.
At present the client/server model for device management divides its tasks as:
http://device/application
where device is the domain name (or IP address) of the device and application, if present, would refer to a script available on the device. (If application weren't specified, a page would be returned that listed the scripts available.) In most cases, a form would be returned asking for the parameters with which to run the application.
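The URL scheme above can be sketched as a small dispatch function. Everything concrete here is an assumption made for illustration: the script names ping and reset-port, their parameters, and the handle function are hypothetical, and a real device would sit behind an HTTP server and run device-side scripts instead of Python lambdas.

```python
import html
from urllib.parse import parse_qs, urlsplit

# Hypothetical device-side scripts: name -> (parameter names, function).
SCRIPTS = {
    "ping": (["host"], lambda host: "pinging " + host),
    "reset-port": (["port"], lambda port: "port " + port + " reset"),
}


def handle(url):
    """Return the HTML page for a request of the form http://device/application."""
    parts = urlsplit(url)
    name = parts.path.strip("/")
    if not name:
        # No application given: list the scripts available on the device.
        links = "".join(
            '<li><a href="/%s">%s</a></li>' % (n, n) for n in sorted(SCRIPTS)
        )
        return "<ul>" + links + "</ul>"
    if name not in SCRIPTS:
        return "<p>No such application.</p>"
    params, func = SCRIPTS[name]
    query = parse_qs(parts.query)
    if not all(p in query for p in params):
        # Parameters missing: return a form asking for them.
        fields = "".join('%s: <input name="%s">' % (p, p) for p in params)
        return '<form action="/%s">%s<input type="submit"></form>' % (name, fields)
    result = func(*(query[p][0] for p in params))
    return "<p>" + html.escape(result) + "</p>"
```

Submitting the form simply re-requests the same URL with the parameters in the query string, so a plain browser is all the operator needs.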
This clearly provides a non-SNMP framework, and we'll use this fact to our advantage. For example, here's how we'll dispose of the usual administrative and security problems:
This control now gives them an incentive to differentiate their products from others in the same segment. Use of Web technology provides an open interface to vendor-proprietary functionality. As such, device vendors are rewarded when they spend more time on adding management to their products.
For example, a device vendor might write one or two dozen different management applications which are carefully tailored to exploit the special features of the device. Ease-of-use, functionality, performance, and other issues can be high-lighted. When an application completes, the page returned can contain links for further information, such as invoking related applications with preset parameters, or fetching help information, and so on.
No longer is support for network management simply a checklist item for device vendors -- it becomes an important consideration for decision-makers. Indeed, operators will now wield their purchasing power based on smoking guns, not finger-pointing. If a vendor's device lacks management solutions, the operator needs blame only the device vendor, not the NMS developers or application vendors. However, operators will now take on a new responsibility -- they will have to clearly express the kinds of functionality they expect to see in the devices now that the middlemen aren't clouding the issue.
How does DM/W3 impact the other players besides those in the first-tier? Agent developers will still provide SNMP engines to device vendors, along with a low-impact HTTP server and a device-side scripting language. But NMS developers will need to integrate their products with a Web browser. Although many vendors working on applications for device management will have to become more involved with device vendors, things will probably be easier for client-side application vendors as they switch from windowing protocols to HTML programming.
I predict that two areas in the IETF will see considerable change. The Operational Requirements area will find itself the center of activity in specifying functional requirements (the ``what'', not the ``how''). Further, the Network Management area will likely be disbanded: each of its few remaining ongoing instrumentation efforts can be defined in the appropriate area of the IETF, and the remaining standardization efforts, those dealing with the ``internal organization of the SNMP layer'' (IOSL) such as v2 security, agent extensibility, rmon/entity mib, etc., are probably best put in the Applications area.
While it might seem strange to suggest that the NM area will eventually go away, this is perhaps inevitable. There simply isn't much SNMP innovation going on these days and we should consider it a stable, mature protocol in the Internet suite, which really doesn't merit an area of its own. Management issues are best left to each of the individual areas. As a final argument, consider that all that's left in the NM area is the IOSL-related working groups. If there weren't an NM area today, it would be hard to argue that those working groups merit their own area. As such, it makes sense to simply move the IOSL-related work to the Applications area.
MIB module checking:
The Simple Times also solicits terse announcements of products and services, publications, and events. These contributions are reviewed only to the extent required to ensure commonly-accepted publication norms.
Submissions are accepted only via electronic mail, and must be formatted in HTML version 1.0. Each submission must include the author's full name, title, affiliation, postal and electronic mail addresses, telephone, and fax numbers. Note that by initiating this process, the submitting party agrees to place the contribution into the public domain.
Back issues are available, either via the Web or anonymous FTP.
In addition, The Simple Times has several hard copy distribution outlets. Contact your favorite SNMP vendor and see if they carry it.