One of the most critical responsibilities for network engineers is ensuring that the network routes packets as intended. IP routing protocols, including BGP, the routing protocol of the Internet, and Interior Gateway Protocols such as OSPF, IS-IS and EIGRP, are complex, and network engineers must understand them to manage them. Every type of organization, from small and large enterprises to network operators and service providers, depends on routing protocols for data delivery between branches, head offices, data centers and co-location sites. A routing issue that lasts only a few seconds can interrupt stock trading, banking transactions, e-commerce, video streaming or a VoIP call. Operators must therefore treat the design, operation and management of their IP networks as a top priority.
Why is managing routing behavior so difficult? There are several factors that complicate routing and add to the management challenge. We have discussed several causes of routing issues in our blog, such as change management, hardware failures, and configuration errors. Human error often plays a part, but there are other factors too. Let us look at some of the ones that make managing IP routing so complex and challenging.
The advent of any-to-any networks, where application traffic can flow in any direction, led to an increase in the number of meshed networks and the use of dynamic IP routing. When traffic can take any path from source to destination, with routing decisions made on a hop-by-hop basis, network engineers lose visibility into the actual path that traffic takes. This makes it difficult to achieve objectives such as load balancing and end-to-end Quality of Service. Add transient routing errors that cause frequent path changes, and service assurance becomes harder still.
In IP networks, routers make the decisions on how to forward packets. This means that when there is a service delivery problem, network engineers must first find all the hops along the path for the service and then query each router to find the end-to-end path and troubleshoot the issue. The network does not provide a single repository from which engineers can see the entire network’s routing plan.
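To make the hop-by-hop lookup concrete, here is a minimal Python sketch of the manual process described above. It is an illustration only: the `forwarding_tables` dictionary, the router names and the `trace_path` function are hypothetical, and each table is reduced to a simple destination-to-next-hop mapping rather than real longest-prefix matching.

```python
def trace_path(forwarding_tables, source, destination):
    """Follow next-hop entries router by router, as an engineer would by hand."""
    path = [source]
    current = source
    while current != destination:
        # "Query" the current router for its next hop toward the destination.
        next_hop = forwarding_tables.get(current, {}).get(destination)
        if next_hop is None:
            raise LookupError(f"{current} has no route to {destination}")
        if next_hop in path:
            raise RuntimeError(f"forwarding loop detected at {next_hop}")
        path.append(next_hop)
        current = next_hop
    return path


# Hypothetical per-router tables: router -> {destination: next hop}.
forwarding_tables = {
    "R1": {"R4": "R2"},
    "R2": {"R4": "R3"},
    "R3": {"R4": "R4"},
}
example_path = trace_path(forwarding_tables, "R1", "R4")
print(example_path)
```

Note that no single table contains the whole path. It only emerges after every router along the way has been consulted, which is exactly the visibility gap described above.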
Even small changes in network topology or performance can have a major impact on service delivery. For example, if a link along a path fails and a backup link takes over, the affected routers must recompute their routes and update their forwarding tables with the new path. In a network with hundreds of routers, the time the protocol takes to converge can cause packet loss or delay and contribute to degraded network performance.
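The link-failure example can be sketched as a shortest-path recomputation, which is roughly what a link-state protocol such as OSPF or IS-IS does on each router after a topology change. The sketch below is a simplified illustration under stated assumptions: the four-router topology, the link costs and the `shortest_path` helper are invented for this example, and real protocols add flooding, timers and area hierarchy that are not modeled here.

```python
import heapq


def shortest_path(links, source, destination):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    neighbors = {}
    for (a, b), cost in links.items():
        neighbors.setdefault(a, []).append((b, cost))
        neighbors.setdefault(b, []).append((a, cost))

    best = {source: 0}
    previous = {}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for peer, cost in neighbors.get(node, []):
            candidate = dist + cost
            if candidate < best.get(peer, float("inf")):
                best[peer] = candidate
                previous[peer] = node
                heapq.heappush(queue, (candidate, peer))

    if destination not in best:
        return None, None  # destination unreachable
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return list(reversed(path)), best[destination]


# Hypothetical topology: two paths from R1 to R4 with different total costs.
links = {("R1", "R2"): 10, ("R2", "R4"): 10, ("R1", "R3"): 10, ("R3", "R4"): 20}
primary = shortest_path(links, "R1", "R4")

# Simulate the failure of the R2-R4 link and recompute.
links_after_failure = {k: v for k, v in links.items() if k != ("R2", "R4")}
backup = shortest_path(links_after_failure, "R1", "R4")
print(primary, backup)
```

In a real network every router runs its own version of this calculation, and traffic can be dropped or delayed until all of them agree on the new path.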
SNMP-based tools show the health and performance of network hardware, and NetFlow and other traffic flow analyzers show the volume and composition of traffic. But these traditional network management and monitoring solutions, along with ping, traceroute and other CLI commands, do not provide real-time, global visibility into dynamically changing routing behavior and the paths that critical services take across the network. While there are tools that capture network topology snapshots, in dynamic networks this data can become stale quickly. And for some network applications, real-time information is critical. After all, milliseconds can equate to millions of dollars.
Route analytics technology provides visibility into the control plane. One approach is to query the routers periodically, capture their configuration data and construct the network topology from it. However, as mentioned above, periodic discovery may not be adequate when real-time visibility is needed, as is the case with some SDN automation applications. Real-time route analytics technology instead records the live IGP and BGP protocol messages exchanged between routers to build and maintain an always-accurate network topology model of all active routing paths. This real-time telemetry can be stored and used to create a live network topology map showing all routing paths, and it also supports troubleshooting, historical analysis and planning.
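The idea of building a topology model from recorded routing messages can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the `TopologyModel` class and the event dictionaries (with `time`, `type`, `from`, `to` and `metric` keys) are hypothetical stand-ins for real IGP and BGP messages, which carry far more detail, and events are assumed to arrive in time order.

```python
from dataclasses import dataclass, field


@dataclass
class TopologyModel:
    links: dict = field(default_factory=dict)    # (router_a, router_b) -> metric
    history: list = field(default_factory=list)  # every event seen, in arrival order

    def apply(self, event):
        """Apply one simplified routing event to the live model and record it."""
        key = tuple(sorted((event["from"], event["to"])))
        if event["type"] == "link_up":
            self.links[key] = event["metric"]
        elif event["type"] == "link_down":
            self.links.pop(key, None)
        else:
            raise ValueError(f"unknown event type: {event['type']}")
        self.history.append(event)

    def snapshot_at(self, timestamp):
        """Rebuild the topology as it was at a past time by replaying history."""
        past = TopologyModel()
        for event in self.history:
            if event["time"] <= timestamp:
                past.apply(event)
        return past.links


# Hypothetical event stream, as a passive listener might record it.
events = [
    {"time": 1, "type": "link_up", "from": "R1", "to": "R2", "metric": 10},
    {"time": 2, "type": "link_up", "from": "R2", "to": "R3", "metric": 10},
    {"time": 5, "type": "link_down", "from": "R3", "to": "R2"},
]
model = TopologyModel()
for event in events:
    model.apply(event)
print(model.links, model.snapshot_at(3))
```

Because every event is retained, the same record serves both the live map and the historical analysis mentioned above: replaying the stream up to a given time reconstructs the topology as it stood at that moment.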
Real-time route analytics technology demystifies complex routing behavior and can help network teams tackle a number of management challenges, including those listed above. For example, teams can determine whether traffic is taking the least desirable path in search of the lowest-cost path, whether all expected customer VPN prefixes are being advertised, why routes for specific services deviated unexpectedly, how long the network took to converge after an event, and what is causing intermittent routing issues that would otherwise go undetected and unresolved.
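Two of these checks are simple enough to sketch in Python. This is an illustration under assumptions, not a description of any product: the function names and sample values are hypothetical, and convergence time is approximated here as the span between the first and last routing update observed after an event, which is only one of several possible definitions.

```python
import ipaddress


def missing_prefixes(expected, advertised):
    """Return expected prefixes that do not appear among the advertised ones."""
    expected_nets = {ipaddress.ip_network(p) for p in expected}
    advertised_nets = {ipaddress.ip_network(p) for p in advertised}
    return sorted(str(net) for net in expected_nets - advertised_nets)


def convergence_seconds(update_times):
    """Approximate convergence as the span of a burst of update timestamps."""
    if not update_times:
        return 0.0
    return max(update_times) - min(update_times)


# Hypothetical customer VPN prefixes and the ones actually seen in routing updates.
expected = ["10.1.0.0/24", "10.2.0.0/24", "10.3.0.0/24"]
advertised = ["10.1.0.0/24", "10.3.0.0/24"]
missing = missing_prefixes(expected, advertised)

# Hypothetical timestamps (in seconds) of updates following a link failure.
burst = [100.0, 100.5, 101.5]
converged_in = convergence_seconds(burst)
print(missing, converged_in)
```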
Real-time routing telemetry and analytics are widely used today by network engineering, planning and operations teams to assure service delivery, optimize networks for performance and resiliency, and mitigate the risk from changes. As the industry embraces SDN and NFV automation to create adaptive, self-healing and self-optimizing networks, the same real-time telemetry and analytics will provide the intelligence to power resource and service orchestrators.