TechNote

Number: 08103-04
Date: October 1994

The management capabilities of Token Ring

Token-ring networks include a complex set of low-level management features, which are used to control and monitor enterprise-wide network systems.

Both the basic network management functions (maintaining the token-passing system and handling nodes as they enter or leave the network) and the more sophisticated ones (identifying and correcting fault and load problems) are distributed among the nodes on the network rather than handled by a central host.

MAC frames

The management capabilities are designed into the Medium Access Control (MAC) frames that are built into every 802.5 token-ring adapter. As a result, a node does not require special hardware or software to benefit from them, regardless of the mix of network protocols or manufacturers' equipment used on the network.

The MAC frames address four basic functions:

Maintaining the ring

The maintenance of token-ring systems consists of initializing the ring (starting the token) and dealing with lost, corrupt or permanently circulating tokens or frames. These are all events that would otherwise degrade the operation of the ring. To deal with them, one node is elected as the active monitor.

The active monitor can be any node on the ring. It is selected dynamically by all the nodes on the ring through a process called Monitor Contention. This process is transparent to users and occurs whenever the current active monitor leaves the ring or develops an error. When this happens, each node that wants to become the new active monitor transmits a Claim Token MAC frame. The node with the highest address wins. The new active monitor transmits a Ring Purge MAC frame, which is used to reset the ring. The other nodes on the ring assume the role of standby monitors, each ready to become the new active monitor should the current active monitor have a problem or leave the ring.
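The outcome of Monitor Contention can be sketched in a few lines of Python. This is only an illustration: the function name and the example addresses are made up, and a real ring settles contention through Claim Token frames exchanged by the adapters, not through a central comparison like this one.

```python
def elect_active_monitor(claiming_addresses):
    """Return the address that wins contention: the numerically highest one."""
    if not claiming_addresses:
        raise ValueError("no node is claiming the token")
    # Compare addresses as 48-bit integers, not as strings.
    return max(claiming_addresses, key=lambda a: int(a.replace(":", ""), 16))


# Hypothetical addresses of three nodes that each sent a Claim Token frame.
claimants = ["10:00:00:12:34:56", "40:00:00:00:00:01", "10:00:00:AB:CD:EF"]
winner = elect_active_monitor(claimants)
print(winner)
```

The two losing nodes would then act as standby monitors.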

To determine if a token or frame is lost, the active monitor uses a valid-frame timer. The timer is set for a time that is longer than the time it should take for a token or frame to go around the ring. If the time elapses without a frame or token being detected, the active monitor resets the ring and sends out a new token.
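The timer logic might look roughly like the following sketch. The class name, the method names and the 10 ms timeout are assumptions chosen for the example; times are passed in explicitly so the behaviour is easy to follow.

```python
class ValidFrameTimer:
    """Tracks how long it has been since a frame or token was last seen."""

    def __init__(self, timeout):
        self.timeout = timeout      # must exceed one trip round the ring
        self.last_seen = 0.0

    def frame_or_token_seen(self, now):
        # Restart the timer every time valid traffic passes the monitor.
        self.last_seen = now

    def expired(self, now):
        # True means the monitor should reset the ring and issue a new token.
        return now - self.last_seen > self.timeout


timer = ValidFrameTimer(timeout=0.010)   # assumed value, in seconds
timer.frame_or_token_seen(now=1.000)
still_ok = not timer.expired(now=1.005)
token_lost = timer.expired(now=1.020)
```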

To detect a permanently circulating frame, the active monitor sets a bit in the Access Control field of each frame the first time it passes by. If it sees a frame with the bit already set, it concludes that the node that originally sent the frame failed to remove it from the ring and release a new token. The active monitor sends out a Ring Purge MAC frame to reset the ring and issues a new token.
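A minimal sketch of this check is shown below. It assumes the 802.5 layout of the Access Control byte (three priority bits, the token bit, the monitor bit, three reservation bits); the function name is hypothetical, and the sketch handles frames only and leaves tokens untouched.

```python
TOKEN_BIT = 0x10     # set when the byte belongs to a frame, clear for a token
MONITOR_BIT = 0x08   # set by the active monitor on a frame's first pass


def monitor_pass(access_control):
    """Return (new_access_control, purge_needed) for one byte passing the monitor."""
    if not access_control & TOKEN_BIT:
        # A token, not a frame: outside the scope of this sketch.
        return access_control, False
    if access_control & MONITOR_BIT:
        # The frame has already passed once, so its sender failed to strip it.
        return access_control, True
    # First pass: mark the frame and let it continue.
    return access_control | MONITOR_BIT, False


first_pass = monitor_pass(0x10)
second_pass = monitor_pass(first_pass[0])
```

Here the second pass signals that a Ring Purge is needed.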

The active monitor transmits an Active Monitor Present (AMP) MAC frame every seven seconds. This is used by the other nodes to check that the active monitor is working and to determine who their upstream neighbours are by a scheme called neighbour notification (also known as Ring Polling). This scheme works as follows. The first node to receive the AMP frame stores the active monitor's address, since this is its upstream neighbour. The AMP frame continues through all the other nodes on the ring, but they only note that the active monitor is still working. The first node then transmits a Standby Monitor Present (SMP) MAC frame when it next receives the token. In the same way as before, the next node downstream receives the SMP frame, stores the address of its upstream neighbour, and then transmits an SMP frame of its own. This process continues round the ring until the active monitor eventually receives an SMP frame from its upstream neighbour, at which point every node has registered who its Nearest Active Upstream Neighbour (NAUN) is.

If the active monitor does not receive an SMP from its NAUN, it transmits a Report Neighbour Notification Incomplete MAC frame to the network manager. This contains the address of the node which last transmitted an AMP or SMP, therefore indicating that the node immediately downstream of this node is faulty.
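The Ring Poll described above can be modelled as a simple walk round the ring. In this sketch the ring is a list in downstream order with the active monitor first, and a faulty node is modelled (as an assumption) as one that never sends its SMP frame. The function and parameter names are illustrative only.

```python
def ring_poll(ring, silent=frozenset()):
    """Simulate one Ring Poll.

    ring   -- node addresses in downstream order, active monitor first
    silent -- nodes that never transmit their SMP frame (assumed fault model)
    Returns (naun, reported), where naun maps each node to its upstream
    neighbour and reported is the last address that transmitted, or None
    if the poll completed.
    """
    naun = {}
    last_sender = ring[0]            # the active monitor sends the AMP frame
    for node in ring[1:]:
        naun[node] = last_sender     # store the upstream neighbour's address
        if node in silent:
            # The poll stalls here; the monitor reports the last sender.
            return naun, last_sender
        last_sender = node           # this node now sends its own SMP frame
    naun[ring[0]] = last_sender      # the monitor learns its own NAUN
    return naun, None


ring = ["AM", "B", "C", "D"]
complete, report = ring_poll(ring)
stalled, stalled_report = ring_poll(ring, silent={"C"})
```

In the stalled case the reported address is B, which points at C, the node immediately downstream of B.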

If the active monitor receives an AMP from another node, it transmits a Report Active Monitor Error MAC frame, relinquishes its role, and becomes a standby monitor. The new active monitor then transmits a Ring Purge MAC frame to reset the ring.

Problems with LANs sometimes occur when nodes join the network. While most network adapters perform some form of low-level hardware test before they become active, 802.5 token-ring adapters conduct an extensive series of tests before they join a ring. The test process consists of three steps:

Each node on a token-ring network is identified by a unique number - a 48-bit, IEEE-assigned node address. When a node joins the ring, it must check that the node address it intends to use is not already being used by any other node. To do that, it sends out a Duplicate Address Test MAC frame addressed to the node address it wants to use. If another node receives the frame, that address is already in use. In that case, the node trying to join the ring removes itself and reports the error to the user. It can then make another attempt to join the ring using a different, locally administered address.
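The retry behaviour can be illustrated with the sketch below. The function name and the addresses are invented for the example, and the set of addresses in use stands in for the answer that the Duplicate Address Test frame would give on a real ring.

```python
def pick_free_address(candidates, addresses_in_use):
    """Return the first candidate address no other node is using, or None."""
    for address in candidates:
        # On a real ring this check is the Duplicate Address Test frame.
        if address not in addresses_in_use:
            return address
    return None


in_use = {"40:00:00:00:00:01", "40:00:00:00:00:02"}
chosen = pick_free_address(["40:00:00:00:00:02", "40:00:00:00:00:03"], in_use)
```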

Whenever nodes join or leave the ring, additional MAC frames appear on the ring. This is because inserting or removing a node involves switching relays, which temporarily disrupts the token-ring signal. This does not affect the data integrity, because all data on the ring is protected by a complex four-byte Cyclic Redundancy Check (CRC), and any data frames that are affected are retransmitted.
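The effect of a four-byte CRC can be demonstrated with Python's standard zlib.crc32, which uses the same CRC-32 generator polynomial as the IEEE 802 frame check sequence. The sketch is an illustration only: the helper names are made up, and the bit ordering and the exact fields covered on a real ring are not modelled.

```python
import zlib


def append_fcs(payload: bytes) -> bytes:
    """Append a four-byte CRC to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the trailer."""
    payload, fcs = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == fcs


frame = append_fcs(b"hello ring")
# Flip one bit, as a relay switching in or out might do to a frame in transit.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
intact_passes = fcs_ok(frame)
corrupted_passes = fcs_ok(corrupted)
```

The corrupted frame fails the check, so it would be discarded and sent again.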

The next time the Ring Poll process is carried out, any node that detects a different NAUN immediately upstream of it reports the new NAUN in a Report New NAUN MAC frame. This enables the network manager to detect nodes joining or leaving the ring.
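A management station could spot these changes by comparing the NAUN table from one Ring Poll with the next, roughly as sketched here. The function name and the single-letter node names are hypothetical, and the sketch also lists a newly inserted node, because that node has no earlier NAUN on record.

```python
def naun_changes(previous, current):
    """Return the nodes whose upstream neighbour differs between two polls."""
    return {
        node: upstream
        for node, upstream in current.items()
        if previous.get(node) != upstream
    }


# Ring A -> B -> C -> A, then node D inserts between B and C.
before = {"B": "A", "C": "B", "A": "C"}
after = {"B": "A", "D": "B", "C": "D", "A": "C"}
changes = naun_changes(before, after)
```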

Locating and isolating cable and hardware faults

Because the nodes on a token-ring network transmit directly from one node to the next, a node that does not receive a valid electrical signal from its NAUN can deduce that either it or its NAUN is faulty, or that there is a cabling fault between them. The node detecting the error transmits a beacon MAC frame to all the nodes on the ring, including the suspect one, warning them that the ring service is suspended and giving the address of the suspect node. When a node sees a beacon frame concerning itself, it assumes that it may be faulty and isolates itself from the ring. The node then tests itself and its lobe cable by transmitting Lobe Test MAC frames down the lobe cable, which exercises both the adapter and the cable. If the node fails the test, it does not rejoin the ring and the ring recovers. If it passes the test, it rejoins the ring.

If the suspect node rejoins the ring, then after a certain period of time the node that initially detected the error isolates itself from the ring and tests itself in the same way. If it fails the test, it does not rejoin the ring and the ring recovers. If it passes, the fault lies either in the trunk cabling between the two nodes or in a Multistation Access Unit (MAU).
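The isolation sequence above reduces to a small decision procedure, sketched below. The function name and the wording of the results are assumptions; on a real ring the two self-tests happen one after the other on separate adapters.

```python
def locate_fault(naun_passes_self_test, reporter_passes_self_test):
    """Narrow down a beaconing fault from the two self-test results."""
    if not naun_passes_self_test:
        # The upstream node stays off the ring and the ring recovers.
        return "upstream node (NAUN) or its lobe cable"
    if not reporter_passes_self_test:
        # The node that sent the beacon stays off the ring instead.
        return "beaconing node or its lobe cable"
    # Both nodes are healthy, so the fault lies between them.
    return "trunk cabling between the two nodes, or a MAU"


verdict = locate_fault(naun_passes_self_test=True, reporter_passes_self_test=True)
```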

In many cases, cabling or hardware problems are either corrected by the MAC protocols without intervention by the network administrator, or are narrowed down to the likely location of the cabling or MAU that is failing.

Identifying hard and soft errors

Once inserted into the ring, each node is responsible for monitoring its own performance and logging any errors it detects. It periodically reports any errors to the Ring Error Monitor (REM). This is a function on the network that collects errors reported by each active adapter and uses this information to detect, diagnose, and correct conditions that degrade network performance. The REM is usually incorporated in a token-ring management program. The errors reported are either localized or generalized. Local ("isolating") errors define a limited fault domain, while general ("non-isolating") errors indicate an error whose source cannot be located with certainty or accuracy. In most cases, all the errors are reported in a single MAC frame called a Report Soft Error MAC frame.
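One way a Ring Error Monitor might use isolating error reports is to total them per fault domain, as in the sketch below. The report format (reporting node, its NAUN, an error count) is a simplification invented for the example and is not the layout of the Report Soft Error MAC frame.

```python
from collections import Counter


def fault_domain_counts(reports):
    """Total isolating errors per fault domain, worst domain first.

    reports -- iterable of (reporting_node, naun, error_count) tuples
    """
    counts = Counter()
    for node, naun, errors in reports:
        # A fault domain is the reporting node, its NAUN and the path between.
        counts[(naun, node)] += errors
    return counts.most_common()


reports = [("B", "A", 3), ("C", "B", 1), ("B", "A", 2)]
ranking = fault_domain_counts(reports)
```

In this example the domain between A and B heads the ranking, which suggests where to look first.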

All isolating errors indicate a problem in the reporting node, its NAUN, or the cabling and access units between them. They include:

Non-isolating errors indicate a problem at an undefined place on the ring and include:

Some non-isolating errors are relevant to the reporting node in particular. For example, congestion errors suggest that a node's performance is insufficient to cope with the traffic.

Because each node reports all detected errors, potential problems with servers, bridges, or high-performance nodes can be identified and dealt with before users notice that something is wrong. In addition, cabling faults sometimes appear intermittently at first and show up as soft error reports before there are problems on the ring, allowing the fault to be fixed before it becomes critical.

Network management programs

The wealth of management functions built into 802.5 token rings creates opportunities to continuously monitor the ring, pinpoint faults before they can cause a problem, and manage the ring. Consequently, network management programs have been developed that take advantage of these MAC-level capabilities and provide the network administrator with a much greater ability to keep the network running smoothly by maximizing performance and minimizing downtime.

A network management program uses MAC frames to query or control the nodes on a ring. It can interpret the responses and display the results to the network administrator in a useful and readily understandable way. An example of an easy-to-use network management program is Madge's Ring Manager II.

There are three pairs of MAC frames used by a network management program to obtain information from individual nodes:

A network management program can compile an up-to-date list of inserted nodes, and once the list exists, the network administrator can watch nodes joining or leaving the ring and check the order of the nodes on the ring.
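Checking the order of the nodes amounts to following the upstream-neighbour information round the ring. The sketch below assumes the program already holds a table mapping each node to its NAUN; the function name and node names are illustrative.

```python
def ring_order(naun, start):
    """List the nodes in downstream order, beginning at start."""
    # Invert "node -> upstream neighbour" into "node -> downstream neighbour".
    downstream = {upstream: node for node, upstream in naun.items()}
    order = [start]
    while True:
        nxt = downstream.get(order[-1])
        # Stop when the walk returns to a known node or the table has a gap.
        if nxt is None or nxt in order:
            break
        order.append(nxt)
    return order


naun_table = {"B": "A", "C": "B", "D": "C", "A": "D"}
order = ring_order(naun_table, start="A")
```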

A network management program can track the current active monitor by detecting the Report New Active Monitor frame that is issued by a node if it becomes the new active monitor. The program can track the order of inserted nodes on the ring using the Report New NAUN MAC frame issued by any node that detects a different NAUN immediately upstream of it.

A network management program can handle fault isolation as follows:

If a node is causing problems on the network or violating network security, the network management program can remove it from the ring by issuing a Remove Ring Station MAC frame.

The comprehensive management capabilities inherent in IEEE 802.5 token-ring networks, and the sophisticated network management programs that are being developed to use these capabilities, show that the 802.5 token ring is a robust LAN for enterprise-critical networking.