What else was affected by the outage?
In previous examples we have shown how Route Explorer can pinpoint the exact nature of an outage so that the appropriate corrective action can be taken with minimum loss of time. In this example we show how post-mortem analysis of the outage can reveal information that can help you engineer your network better.
Many routing outages can lead to cascade failures as traffic load is shifted from the failing router or link to other routers and links. Figure 1 illustrates such an outage. The four screen shots from Route Explorer are spaced roughly 10 minutes apart and show the progression of an outage among 3 core routers. Upper left: 2:34 pm. The 3 core routers in a POP, prior to the cascade outage. Upper right: 2:40 pm. First core router down. Lower left: 2:50 pm. Second core router down, first router coming up. Lower right: 3:00 pm. Both core routers are back up.
![]()
Figure 1
During event diagnosis and post-mortem analysis, Route Explorer can reveal much finer-grained detail on this outage. Figure 2 illustrates the events list for the entire time period of the outage.
![]()
Figure 2
The routing engineer may “replay” the event list, a single event at a time, to show the structure and dependencies of the failures. The entry in red marks the next event to be played. The “Execute” button at the bottom replays a single event with each click. The topology map is updated with each event playback. A down link event is marked by a red “X” in the topology map; an up event is marked by a green dot.
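The single-step replay described above can be pictured as a cursor moving over a time-ordered event list, with each step updating a set of links that are currently down. The following is a minimal sketch of that idea only. It is not Route Explorer's implementation or API, and the `LinkEvent` and `Replay` names, the field layout, and the sample events are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class LinkEvent:
    """One adjacency change (hypothetical record layout)."""
    timestamp: str
    router_a: str
    router_b: str
    up: bool  # True = adjacency came up, False = adjacency went down


@dataclass
class Replay:
    """Steps through events one at a time, like repeated clicks on Execute."""
    events: List[LinkEvent]
    cursor: int = 0  # index of the "current" (next to be played) event
    down_links: Set[FrozenSet[str]] = field(default_factory=set)

    def execute(self) -> Optional[LinkEvent]:
        """Apply the current event to the topology state and advance."""
        if self.cursor >= len(self.events):
            return None  # nothing left to replay
        ev = self.events[self.cursor]
        link = frozenset((ev.router_a, ev.router_b))
        if ev.up:
            self.down_links.discard(link)  # would be drawn as a green dot
        else:
            self.down_links.add(link)      # would be drawn as a red "X"
        self.cursor += 1
        return ev


# Made-up sample events, for illustration only.
events = [
    LinkEvent("14:36:10", "core1", "core2", up=False),
    LinkEvent("14:36:40", "core1", "core2", up=True),
    LinkEvent("14:40:00", "core1", "core3", up=False),
]

replay = Replay(events)
while (ev := replay.execute()) is not None:
    state = "up" if ev.up else "down"
    print(ev.timestamp, ev.router_a, ev.router_b, state,
          "links down:", len(replay.down_links))
```

Keeping the topology state separate from the event list means the same list can be replayed from any starting cursor position.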
Some of the fine grain details revealed by Route Explorer:
- Two of the core routers flapped their adjacencies with 2 other core routers several times just prior to going down.
- All three core routers experienced total or partial outages and severe flapping of their interfaces throughout the outage.
- The outage lasted for 23 minutes.
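Details such as flap counts and outage duration can in principle be derived from any exported list of adjacency events. The sketch below assumes a hypothetical export of `(timestamp, link, state)` tuples. It counts a flap as a down-to-up transition on the same link and measures the outage from the first down event to the last up event. Both definitions and the sample data are assumptions for illustration, not Route Explorer's own.

```python
from collections import Counter
from datetime import datetime


def count_flaps(events):
    """Count down-to-up transitions per link in time-ordered events."""
    last_state = {}
    flaps = Counter()
    for _, link, state in events:
        if last_state.get(link) == "down" and state == "up":
            flaps[link] += 1
        last_state[link] = state
    return flaps


def outage_minutes(events):
    """Minutes from the first 'down' event to the last 'up' event."""
    downs = [t for t, _, s in events if s == "down"]
    ups = [t for t, _, s in events if s == "up"]
    if not downs or not ups:
        return None  # no complete outage in this event list
    return (max(ups) - min(downs)).total_seconds() / 60


# Made-up sample data, for illustration only.
sample = [
    (datetime(2004, 1, 1, 10, 0), "core1-core2", "down"),
    (datetime(2004, 1, 1, 10, 2), "core1-core2", "up"),
    (datetime(2004, 1, 1, 10, 3), "core1-core2", "down"),
    (datetime(2004, 1, 1, 10, 10), "core1-core2", "up"),
]

print(count_flaps(sample))
print(outage_minutes(sample))
```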
Fine-grained routing analysis from Route Explorer’s event history, along with ancillary data (such as MRTG traffic graphs), can reveal the failure dependencies in your network. This information can help you diagnose a repeating cascade failure or prevent it in the future.
HOW TO:
- Open an X Windows or VNC session to the Route Explorer.
- Open topology and open History Navigator (see above)
- Show the event list for the time period of interest (see above)
- To replay history, a single event at a time:
  - Position the red cursor in the History Navigator window timeline where you wish to start replaying individual events. The “current” event (the next event to be played and the closest to the selected time) will be highlighted in red text.
  - If you don’t see the red event but the Show Current button is active, click that button to reposition the event list window to the place where the current event is displayed.
  - To make any event in the list the current event, click the right mouse button on that event and click “Take Time Here”.
  - If you don’t see the current event and the Show Current and Execute buttons are inactive (shown in grey), the current event is not in the first 1000 events of the period. To add the next 1000 events to the list, click the More button. Each click adds another 1000 events.
  - To execute the current event and see its effect on the topology, click Execute. Ensure that the topology window is visible.
  - Continue clicking Execute to single-step through the event history.
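The relationship between the selected time, the current event, and the More button can be sketched as follows. This is an illustration under stated assumptions, not the product's logic: it assumes the current event is the first event at or after the selected time, that events are loaded in pages of 1000 as described above, and the function names are hypothetical.

```python
import bisect

PAGE_SIZE = 1000  # events added to the list per click of More


def current_event_index(event_times, selected_time):
    """Index of the first event at or after selected_time, or None.

    event_times must be sorted in ascending order.
    """
    i = bisect.bisect_left(event_times, selected_time)
    return i if i < len(event_times) else None


def more_clicks_needed(index):
    """Clicks of More needed before the event at `index` is in the list.

    The first PAGE_SIZE events are assumed to be loaded already.
    """
    return index // PAGE_SIZE


# Made-up timestamps (seconds from the start of the period).
times = list(range(0, 5000))
idx = current_event_index(times, 2500)
print(idx, more_clicks_needed(idx))
```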
Copyright © 2003-2004. Packet Design, Inc.
http://www.packetdesign.com