TechNote
Number: 08103-04
Date: October 1994
The management capabilities of Token Ring
Token-ring networks include a complex set of low-level management features, which are
used to control and monitor enterprise-wide network systems.
Basic network management functions, such as maintaining the token-passing system or
handling nodes as they enter or leave the network, as well as the more sophisticated
functions, such as identifying and correcting fault and load problems, are distributed among
the nodes on the network, rather than being handled by a central host.
MAC frames
The management capabilities are designed into the Medium Access Control (MAC) frames
that are built into every 802.5 token-ring adapter. As a result, the node does not require
special hardware or software to benefit from them, regardless of the mix of network protocols
or manufacturers' equipment used on the network.
The MAC frames address four basic functions:
- Maintaining the ring
- Locating and isolating cable and hardware faults
- Identifying hard and soft errors
- Providing node status information to management applications
Maintaining the ring
Maintaining a token ring consists of initializing the ring (starting the token) and dealing
with lost, corrupt, or permanently circulating tokens or frames, all of which would otherwise
degrade the operation of the ring. To handle these events, one node is elected as the
active monitor.
The active monitor can be any node on the ring. It is selected dynamically by all the nodes
on the ring through a process called Monitor Contention. This process is transparent to users
and occurs whenever the current active monitor leaves the ring or develops an error. When
this happens, each node that wants to become the new active monitor transmits a Claim
Token MAC frame. The node with the highest address wins. The new active monitor
transmits a Ring Purge MAC frame, which is used to reset the ring. The other nodes on the
ring assume the role of standby monitors, each ready to become the new active monitor
should the current active monitor have a problem or leave the ring.
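The outcome of Monitor Contention can be sketched in a few lines of Python. This is only an
illustration of the "highest address wins" rule, not the frame exchange itself: the function
names and the addresses are invented for the example, and each address is assumed to be a
6-byte value compared most significant byte first.

```python
def parse_address(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def monitor_contention(claimants):
    """Return the winning address: the numerically highest one.

    Equal-length byte strings compare most significant byte first,
    so max() yields the numeric maximum.
    """
    if not claimants:
        raise ValueError("no node is contending")
    return max(claimants)


# Hypothetical addresses, invented for this example.
contenders = [parse_address(a) for a in (
    "40:00:00:00:00:17",
    "40:00:00:00:01:02",
    "40:00:00:00:00:f0",
)]
winner = monitor_contention(contenders)
print(winner.hex(":"))  # 40:00:00:00:01:02
```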
To determine if a token or frame is lost, the active monitor uses a valid-frame timer. The
timer is set for a time that is longer than the time it should take for a token or frame to go
around the ring. If the time elapses without a frame or token being detected, the active
monitor resets the ring and sends out a new token.
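As a rough sketch, the valid-frame timer behaves like a watchdog that is restarted whenever a
token or frame passes the active monitor. The class name and the timeout value below are
assumptions made for illustration; time is passed in explicitly so the example is
deterministic.

```python
class ValidFrameTimer:
    """Watchdog restarted each time a token or frame is seen."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # must exceed one trip round the ring
        self.last_seen = 0.0

    def frame_or_token_seen(self, now):
        self.last_seen = now

    def expired(self, now):
        """True when nothing has passed by for longer than the timeout."""
        return now - self.last_seen > self.timeout_s


timer = ValidFrameTimer(timeout_s=0.010)  # illustrative value only
timer.frame_or_token_seen(now=0.000)
print(timer.expired(now=0.005))  # False: the ring is still healthy
print(timer.expired(now=0.020))  # True: reset the ring and issue a new token
```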
To detect a permanently circulating frame, the active monitor sets a bit in the Access Control
field of each frame the first time it passes by. If it sees a frame with the bit already set, it
concludes that the node that originally sent the frame failed to remove it from the ring and
release a new token. The active monitor sends out a Ring Purge MAC frame to reset the ring
and issues a new token.
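The monitor-bit check can be sketched as follows. The sketch assumes the usual layout of the
Access Control byte (three priority bits, the token bit, the monitor bit, three reservation
bits), which places the monitor bit at mask 0x08; the function name is invented for the
example.

```python
MONITOR_BIT = 0x08  # 'M' in the PPPTMRRR layout of the Access Control byte


def monitor_pass(access_control):
    """Model a frame passing the active monitor.

    Returns (new_access_control, purge_needed).
    """
    if access_control & MONITOR_BIT:
        # Second sighting: the sender never stripped the frame.
        return access_control, True
    return access_control | MONITOR_BIT, False


ac = 0x10                      # a frame with the monitor bit clear
ac, purge = monitor_pass(ac)   # first pass: the bit is set, no purge
print(hex(ac), purge)          # 0x18 False
ac, purge = monitor_pass(ac)   # second pass: the frame is circulating
print(hex(ac), purge)          # 0x18 True
```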
The active monitor transmits an Active Monitor Present (AMP) MAC frame every seven
seconds. This is used by the other nodes to check that the active monitor is working and to
determine who their upstream neighbours are by a scheme called neighbour notification (also
known as Ring Polling). This scheme works as follows. The first node to receive the AMP
frame stores the active monitor's address, since this is its upstream neighbour. The AMP
frame continues through all the other nodes on the ring, but they only note that the active
monitor is still working. The first node then transmits a Standby Monitor Present (SMP) MAC
frame when it next receives the token. In the same way as before, the next node downstream
receives the SMP frame, stores the address of its upstream neighbour, and then transmits an
SMP frame of its own. This process continues round the ring until the active monitor
eventually receives an SMP frame from its upstream neighbour, at which point every node
has registered who its Nearest Active Upstream Neighbour (NAUN) is.
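The result of one complete Ring Poll can be modelled with a short simulation. The sketch
assumes the ring is given as a list of addresses in downstream order, starting with the
active monitor; the function name and node labels are invented for the example.

```python
def ring_poll(ring):
    """Simulate neighbour notification and return {node: NAUN}.

    ring lists the nodes in downstream order, active monitor first.
    """
    naun = {}
    sender = ring[0]  # the active monitor starts with an AMP frame
    for receiver in ring[1:] + ring[:1]:
        naun[receiver] = sender  # next node downstream stores the sender
        sender = receiver        # and then sends an SMP frame of its own
    return naun


print(ring_poll(["AM", "B", "C", "D"]))
# {'B': 'AM', 'C': 'B', 'D': 'C', 'AM': 'D'}
```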
If the active monitor does not receive an SMP from its NAUN, it transmits a Report
Neighbour Notification Incomplete MAC frame to the network manager. This contains the
address of the node that last transmitted an AMP or SMP, which indicates that the
node immediately downstream of that node is faulty.
If the active monitor receives an AMP from another node, it transmits a Report Active
Monitor Error MAC frame, relinquishes its role, and becomes a standby monitor. The new
active monitor then transmits a Ring Purge MAC frame to reset the ring.
Problems with LANs sometimes occur when nodes join the network. While most network
adapters perform some form of low-level hardware test before they become active, 802.5
token-ring adapters conduct an extensive series of tests before they join a ring. The test
process consists of three steps:
- Testing the adapter's circuitry by verifying that the token-ring chipset is functioning
properly and that all its protocol-handling hardware and firmware can perform not only basic
token-ring operations, but also the array of sophisticated management functions that are built
into the token-ring chips on the adapter.
- Testing the adapter's on-board transmit and receive circuitry through an internal
loopback.
- Testing the lobe cable, which at this stage is isolated from the network and looped
back within the hub at the wiring closet.
Each node on a token-ring network is identified by a unique number - a 48-bit, IEEE-assigned
node address. When a node joins the ring, it must check that the node address it intends to
use is not already being used by any other node. To do so, it sends a Duplicate Address
Test MAC frame addressed to the node address it wants to use. If another node recognizes
the frame as addressed to itself, that node address is already in use. When this happens,
the node trying to join the ring removes itself and reports the error to the user. It can then
make another attempt to join the ring using a different locally-administered address.
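The join logic above can be sketched as a simple membership check. This models only the
decision, not the MAC frame itself; the function names and addresses are invented for the
example, and the set of addresses on the ring stands in for the nodes that would recognize
the test frame.

```python
def duplicate_address_test(wanted, addresses_on_ring):
    """True if some node already on the ring answers to this address."""
    return wanted in addresses_on_ring


def join_ring(candidates, addresses_on_ring):
    """Try each candidate address in turn; return the first free one.

    Returns None if every candidate is already in use.
    """
    for address in candidates:
        if not duplicate_address_test(address, addresses_on_ring):
            return address
    return None


on_ring = {"400000000001", "400000000002"}
print(join_ring(["400000000002", "400000000003"], on_ring))  # 400000000003
```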
Whenever nodes join or leave the ring, additional MAC frames appear on the ring. This is
because inserting or removing a node involves switching relays, which temporarily disrupts
the token-ring signal. Data integrity is not affected, because every frame on the ring is
protected by a four-byte Cyclic Redundancy Check (CRC), and any data frames
that are affected are retransmitted.
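The way a four-byte CRC exposes a damaged frame can be illustrated with Python's standard
zlib.crc32. This is a stand-in only: the bit ordering on the wire and the exact fields
covered by the check in a real token-ring frame are not modelled, and the helper names are
invented for the example.

```python
import zlib


def append_fcs(payload):
    """Append a 4-byte CRC to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def fcs_ok(frame):
    """Recompute the CRC over the payload and compare with the trailer."""
    payload, fcs = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == fcs


frame = append_fcs(b"data on the ring")
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # one bit flipped by a relay

print(fcs_ok(frame))    # True
print(fcs_ok(damaged))  # False: the frame is discarded and sent again
```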
The next time the Ring Poll process is carried out, any node that detects a different NAUN
immediately upstream of it reports the new NAUN in a Report New NAUN MAC frame. This
enables the network manager to detect nodes joining or leaving the ring.
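One way to picture this is to compare the NAUN tables from two successive Ring Polls. The
sketch below assumes each table is a mapping from node to NAUN; the function name and node
labels are invented for the example.

```python
def naun_changes(previous, current):
    """Map each node whose NAUN differs between two polls to (old, new)."""
    return {
        node: (previous.get(node), upstream)
        for node, upstream in current.items()
        if previous.get(node) != upstream
    }


before = {"A": "C", "B": "A", "C": "B"}
after = {"A": "C", "X": "A", "B": "X", "C": "B"}  # X joined between A and B
print(naun_changes(before, after))
# {'X': (None, 'A'), 'B': ('A', 'X')}
```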
Locating and isolating cable and hardware faults
Because the nodes on a token-ring network transmit directly from one node to the next, a
node that does not receive a valid electrical signal from its NAUN can deduce that the fault
lies in itself, in its NAUN, or in the cabling between them. The node detecting the
error transmits a beacon MAC frame to all the nodes on the ring, including its NAUN,
warning them that the ring service is suspended and giving the address of the suspect NAUN.
When a node sees a beacon frame concerning itself, it assumes that it may be faulty and
isolates itself from the ring. The node then tests itself and its lobe cable. The test consists of
transmitting Lobe Test MAC frames down the lobe cable, which tests both the adapter and
the cable. If the node fails the test, it does not rejoin the ring and the ring recovers. If it
passes the test, it rejoins the ring. If the fault persists after a certain period of time, the node
that initially detected the error isolates itself from the ring and tests itself in the same way.
If it fails the test, it does not rejoin the ring and the ring recovers. If it also passes, the
fault lies either in the trunk cabling between the two nodes or in a Multistation Access Unit (MAU).
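The elimination process described above reduces to a short decision function. This is a
sketch of the reasoning only; the function name, parameter names, and result strings are
invented for the example.

```python
def locate_fault(naun_passes_self_test, beaconing_node_passes_self_test):
    """Narrow down the fault from the two self-test results.

    The NAUN tests itself first; the beaconing node tests itself
    only if the NAUN passed and the ring is still beaconing.
    """
    if not naun_passes_self_test:
        return "NAUN adapter or lobe cable"
    if not beaconing_node_passes_self_test:
        return "beaconing node adapter or lobe cable"
    return "trunk cabling or MAU between the two nodes"


print(locate_fault(False, True))  # NAUN adapter or lobe cable
print(locate_fault(True, True))   # trunk cabling or MAU between the two nodes
```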
In many cases, cabling or hardware problems are either corrected by the MAC protocols
without intervention by the network administrator, or are narrowed down to the likely location
of the cabling or MAU that is failing.
Identifying hard and soft errors
Once inserted into the ring, each node is responsible for monitoring its own performance and
logging any errors it detects. It periodically reports any errors to the Ring Error Monitor
(REM). This is a function on the network that collects errors reported by each active adapter
and uses this information to detect, diagnose, and correct conditions that degrade network
performance. The REM is usually incorporated in a token-ring management program. The
errors reported are either isolating or non-isolating. Isolating errors define a limited
fault domain, while non-isolating errors indicate a fault whose source cannot be
located with certainty. In most cases, all the errors are reported in a single MAC
frame called a Report Soft Error MAC frame.
All isolating errors indicate a problem in the reporting node, its NAUN, or the cabling and
access units between them. They include:
- Line and burst errors.
- Internal errors. These are recoverable errors that a node detects within itself.
- Access control errors. These suggest that the node's NAUN is failing since it is
unable to correctly set certain bits in the access control field of a frame.
- Abort delimiter errors. These indicate the corruption of frames between the node's
NAUN and the node itself.
Non-isolating errors indicate a problem at an undefined place on the ring and include:
- Lost frame errors. These are caused either when nodes join or leave the ring, or if
there is a cabling fault.
- Congestion errors. These indicate that a node's data buffers are full. This can be
caused either by using an under-powered node in a server or gateway, or by a failure of the
node's on-board or host-resident software.
- Duplicate address errors. These indicate that two or more nodes on the ring are
using the same address. Each node must have a unique address.
- Frequency errors. These point to a problem with the frequency of the signal on the
ring, which suggests a fault in the crystal oscillator or phase-locked loop circuitry of either
the reporting node or the active monitor.
- Token errors. These are usually caused by nodes joining or leaving the ring, but they
can also indicate cable faults.
Some non-isolating errors are relevant to the reporting node in particular. For example,
congestion errors suggest that a node's performance is insufficient to cope with the traffic.
Because each node reports all detected errors, potential problems with servers, bridges, or
high-performance nodes can be identified and dealt with before users notice that
something is wrong. In addition, cabling faults sometimes appear intermittently at first and
show up as soft error reports before there are problems on the ring, allowing the fault to be
fixed before it becomes critical.
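The two error classes in this section can be captured in a small lookup that turns a reported
error into a fault domain. The error names follow the lists in this section; the function name
and the tuple used to represent a fault domain are assumptions made for illustration.

```python
ISOLATING = {"line", "burst", "internal", "access control", "abort delimiter"}
NON_ISOLATING = {"lost frame", "congestion", "duplicate address",
                 "frequency", "token"}


def fault_domain(error, reporter, naun):
    """Return (naun, reporter) for an isolating error, else None.

    None means the fault could be anywhere on the ring.
    """
    if error in ISOLATING:
        return (naun, reporter)
    if error in NON_ISOLATING:
        return None
    raise ValueError("unknown soft error type: " + error)


print(fault_domain("burst", reporter="B", naun="A"))  # ('A', 'B')
print(fault_domain("token", reporter="B", naun="A"))  # None
```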
Network management programs
The wealth of management functions built into 802.5 token rings creates opportunities to
continuously monitor the ring, pinpoint faults before they can cause a problem, and manage
the ring. Consequently, network management programs have been developed that take
advantage of these MAC-level capabilities and provide the network administrator with a
much greater ability to keep the network running smoothly by maximizing performance and
minimizing downtime.
A network management program uses MAC frames to query or control the nodes on a ring. It
interprets the responses and displays the results to the network administrator in a useful and
readily understandable way. An example of an easy-to-use network management program is
Madge's Ring Manager II.
There are three pairs of MAC frames used by a network management program to obtain
information from individual nodes:
- A Request/Report Ring Station Address MAC frame obtains a node's group and
functional addresses, its physical location, and the node address of its NAUN.
- A Request/Report Ring Station State MAC frame obtains a node's internal status and
microcode revision level.
- A Request/Report Ring Station Attachments MAC frame obtains a node's access
priority, its functional address and its enabled function classes (such as Ring Error Monitor or
Bridge), and its product ID.
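A management program might keep the three pairs in a table so that it can work out which
request to send for a given item of information. The table below simply restates the list
above; the dictionary layout and the function name are invented for the example.

```python
REPORT_CONTENTS = {
    "Ring Station Address": {
        "group address", "functional addresses",
        "physical location", "NAUN address",
    },
    "Ring Station State": {
        "internal status", "microcode revision level",
    },
    "Ring Station Attachments": {
        "access priority", "functional addresses",
        "enabled function classes", "product ID",
    },
}


def requests_for(item):
    """Return the Request frames whose matching Report carries the item."""
    return sorted(
        "Request " + name
        for name, items in REPORT_CONTENTS.items()
        if item in items
    )


print(requests_for("product ID"))
# ['Request Ring Station Attachments']
```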
A network management program can compile an up-to-date list of inserted nodes, and once
the list exists, the network administrator can watch nodes joining or leaving the ring and
check the order of the nodes on the ring.
A network management program can track the current active monitor by detecting the Report
New Active Monitor frame that is issued by a node if it becomes the new active monitor. The
program can track the order of inserted nodes on the ring using the Report New NAUN MAC
frame issued by any node that detects a different NAUN immediately upstream of it.
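Given each node's NAUN, the physical order of the ring can be rebuilt by walking the
relationships downstream. The sketch assumes the program holds a mapping from node to NAUN;
the function name and node labels are invented for the example.

```python
def ring_order(naun, start):
    """Return the nodes in downstream order, beginning at start."""
    # Invert the table: for each upstream node, who is directly downstream?
    downstream = {upstream: node for node, upstream in naun.items()}
    order = [start]
    while True:
        nxt = downstream.get(order[-1])
        # Stop when the walk returns to the start or the table is incomplete.
        if nxt is None or nxt == start or nxt in order:
            return order
        order.append(nxt)


table = {"B": "A", "C": "B", "A": "C"}
print(ring_order(table, "A"))  # ['A', 'B', 'C']
```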
A network management program can handle fault isolation as follows:
- Transmit Forward MAC frames are sent by the program to test whether one node on
a ring can communicate with another. This enables the network administrator to test not only
the two nodes involved in the process, but also the path between them, including any
bridges. The ability to perform this kind of testing without running special software in the
transmitting or receiving nodes greatly simplifies fault isolation.
- The network management program uses the Report Soft Error MAC frames sent to it
to catch errors as they happen and log them against each reporting node and its NAUN.
- The network management program uses any beacon MAC frames transmitted to
determine the fault domain and reports this to the network administrator, together with
suggested actions to correct the fault if the ring cannot recover automatically.
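Logging soft errors against each reporting node and its NAUN can be sketched with a counter
keyed on that pair. The class and method names are invented for the example, and the
reports are made up; a real program would take them from Report Soft Error MAC frames.

```python
from collections import Counter


class SoftErrorLog:
    """Count soft errors per (NAUN, reporting node) link."""

    def __init__(self):
        self.counts = Counter()

    def record(self, reporter, naun, error):
        self.counts[(naun, reporter, error)] += 1

    def worst_links(self, n=3):
        """Return the n links with the most errors, worst first."""
        per_link = Counter()
        for (naun, reporter, _error), count in self.counts.items():
            per_link[(naun, reporter)] += count
        return per_link.most_common(n)


log = SoftErrorLog()
log.record(reporter="B", naun="A", error="line")
log.record(reporter="B", naun="A", error="burst")
log.record(reporter="D", naun="C", error="line")
print(log.worst_links(1))  # [(('A', 'B'), 2)]
```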
If a node is causing problems on the network or violating network security, the network
management program can remove it from the ring by issuing a Remove Ring Station MAC
frame.
The comprehensive management capabilities inherent in IEEE 802.5 token-ring networks
and the sophisticated network management programs that are being developed to use these
capabilities show that the 802.5 token ring is a robust LAN for enterprise-critical networking.