“Evolution is inevitable.” It is the basis of the present and will be the basis of the future, so let us talk a bit about how network telemetry is evolving alongside the evolving world of SDN!
With increasing demands, enterprises and data centers are looking to new approaches such as SDN and NFV to make existing networks more reliable, redundant, effective, and profitable. Adopting them demands more sophisticated and flexible telemetry mechanisms for monitoring, maintaining, and troubleshooting networks, particularly in data centers and large enterprise networks. Legacy monitoring approaches still gather plenty of insightful data, but they cannot meet a few essential monitoring needs of growing networks.
Requirements of growing networks:
- Gathering real-time data
In traditional methods, data is usually polled from a networking device once every few seconds, whereas traffic passes through that device at rates of MBs or GBs per second. Just imagine how little real-time insight that leaves!
- Having 360° visibility of the network
In traditional methods, every network device reports separately to the monitoring manager. Identifying the path a packet took through the network is therefore difficult, which limits the insight available into actual network behavior.
- Looking up towards Machine Learning
ML is also stepping up in the networking world. However, its real value is realized only if instantaneous exceptions can be identified and acted upon quickly to maintain reliability and stability. Legacy methods offer little scope in this direction. This should become clearer by the end of this blog!
- Delivering high flexibility
Networking resources themselves were rather rigid in the past and left little scope for enhancing traditional methods. Current trends, by contrast, focus on programmability and redundancy in both hardware and software, which naturally opens the way to better monitoring strategies.
Packet Level Network Telemetry:
As the name indicates, it is all about carrying and gathering telemetry information within the data packets that traverse the network. This tactic is used in in-situ OAM (IOAM) and In-band Network Telemetry (INT), as well as in alternate marking performance measurement (AM-PM). In-band Network Telemetry (INT) has become especially popular because the telemetry data is piggybacked at line rate on the usual traffic of the network.
As illustrated in the diagram above, a source endpoint embeds instructions (the INT Header) in packets, listing the types of network state to be collected from the network elements. Each network element then inserts the requested network state (the INT Payload) into the packet as it traverses the data path. When the packet reaches the last node, all the accumulated network state is decoded and analyzed to gain insight into the entire path the packet took. The end user can configure the source and sink endpoints, the flows to monitor, and more, so that better and more insightful data can be gathered from the network. Let us look at the packet monitoring possibilities this approach brings.
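The source, transit, and sink roles described above can be sketched in a few lines of Python. This is only an illustrative model, not the INT wire format: the `Packet` class, the flag names `NODE_ID`, `INGRESS_TS`, and `EGRESS_TS`, and the three functions are all hypothetical names chosen for this example, and stripping the telemetry at the sink is a choice made for this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical instruction flags; a real INT header defines its own bitmap.
NODE_ID = 0x1
INGRESS_TS = 0x2
EGRESS_TS = 0x4


@dataclass
class Packet:
    payload: bytes
    instructions: int = 0                          # stands in for the INT Header
    metadata: list = field(default_factory=list)   # stands in for the INT Payload


def source(payload, instructions):
    """Source endpoint: embed the list of requested state in the packet."""
    return Packet(payload=payload, instructions=instructions)


def transit(packet, node_id, ingress_ts, egress_ts):
    """Network element: append only the state that the instructions request."""
    record = {}
    if packet.instructions & NODE_ID:
        record["node_id"] = node_id
    if packet.instructions & INGRESS_TS:
        record["ingress_ts"] = ingress_ts
    if packet.instructions & EGRESS_TS:
        record["egress_ts"] = egress_ts
    packet.metadata.append(record)
    return packet


def sink(packet):
    """Sink endpoint: collect the accumulated state and strip it from the packet."""
    report = list(packet.metadata)
    packet.instructions = 0
    packet.metadata = []
    return packet.payload, report


# One packet crossing two nodes, asked to report node IDs and ingress times only.
pkt = source(b"user data", NODE_ID | INGRESS_TS)
pkt = transit(pkt, node_id="s1", ingress_ts=100, egress_ts=104)
pkt = transit(pkt, node_id="s2", ingress_ts=110, egress_ts=113)
payload, report = sink(pkt)
print(report)
```

Because the egress timestamp was not requested, neither node adds it, which shows how the instruction set at the source controls what every hop reports.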
Key benefits of Packet-Level Telemetry:
- Inflated Latencies and Congestion Analysis:
As a packet travels from source to sink, each node in the network appends the timestamps at which the packet ingressed and egressed. By decoding this data, the latency within a node and between two nodes can be easily identified.
Once the latencies along a flow's path have been calibrated under low traffic, congestion is easy to identify, because latencies become relatively high when traffic is high.
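As a rough illustration of this decoding step, the sketch below derives per-node and per-link latencies from a decoded report and compares them with a low-traffic baseline. The record keys (`node_id`, `ingress_ts`, `egress_ts`), the sample values, and the `factor` threshold are assumptions made for the example. Link latencies are only meaningful if the nodes' clocks are synchronized.

```python
def hop_latencies(report):
    """Return (per-node residence times, per-link transit times).

    `report` is a list of per-hop records in path order, all timestamps
    in the same unit.
    """
    node_latency = {r["node_id"]: r["egress_ts"] - r["ingress_ts"] for r in report}
    link_latency = {
        (a["node_id"], b["node_id"]): b["ingress_ts"] - a["egress_ts"]
        for a, b in zip(report, report[1:])
    }
    return node_latency, link_latency


def congested_nodes(node_latency, baseline, factor=3.0):
    """Flag nodes whose latency exceeds `factor` times the low-traffic baseline."""
    return sorted(n for n, lat in node_latency.items() if lat > factor * baseline[n])


# Hypothetical decoded report for one packet crossing three nodes.
report = [
    {"node_id": "s1", "ingress_ts": 100, "egress_ts": 104},
    {"node_id": "s2", "ingress_ts": 110, "egress_ts": 150},
    {"node_id": "s3", "ingress_ts": 155, "egress_ts": 158},
]
baseline = {"s1": 4, "s2": 5, "s3": 3}   # calibrated under low traffic

node_latency, link_latency = hop_latencies(report)
print(node_latency)                             # time spent inside each node
print(link_latency)                             # time spent on each link
print(congested_nodes(node_latency, baseline))  # ['s2']
```

Here node `s2` holds the packet for 40 time units against a baseline of 5, so it is the only node flagged.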
- Network Topology and Packet Traversals:
If every node is instructed to append its identity, including the port on which the packet ingressed and the port on which it egressed, then the topology can be easily derived, illustrating the path taken by the packet. Configuring multiple source and sink nodes and observing multiple packet traversals can capture 360° visibility of the network topology.
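A minimal sketch of that derivation follows, assuming each hop record carries a node identifier plus ingress and egress port numbers. The key names and the two sample reports are hypothetical.

```python
def merge_topology(reports):
    """Combine many per-packet reports into a set of directed links.

    Each link is ((node_a, egress_port), (node_b, ingress_port)), taken
    from every pair of consecutive hops in a report.
    """
    links = set()
    for report in reports:
        for a, b in zip(report, report[1:]):
            links.add(
                ((a["node_id"], a["egress_port"]), (b["node_id"], b["ingress_port"]))
            )
    return links


# Two hypothetical flows between the same endpoints taking different paths.
flow_1 = [
    {"node_id": "s1", "ingress_port": 1, "egress_port": 2},
    {"node_id": "s2", "ingress_port": 1, "egress_port": 3},
    {"node_id": "s4", "ingress_port": 2, "egress_port": 1},
]
flow_2 = [
    {"node_id": "s1", "ingress_port": 1, "egress_port": 3},
    {"node_id": "s3", "ingress_port": 1, "egress_port": 2},
    {"node_id": "s4", "ingress_port": 3, "egress_port": 1},
]

links = merge_topology([flow_1, flow_2])
for link in sorted(links):
    print(link)
```

A single report yields only the path of one packet. Merging reports from several source and sink pairs is what fills in the rest of the topology.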
- Timeliness and Flexibility for Exceptions:
The data is captured from the network elements along with the traffic at line rate, which is the fastest way to identify crucial exceptions.
Newer network processor ASICs are also adding support for generating and mirroring a packet that carries insightful data when an exception occurs. This is a different flavor of packet-based telemetry altogether, but it offers excellent flexibility and timeliness.
- Doorway to Machine Learning:
The first need of ML is to learn about possible behavior as early as possible, and this approach meets that need by providing near-real-time notification of events.
As mentioned under the first benefit, latencies can help identify network congestion points. Similarly, ML algorithms can be built to predict congestion and help resolve it. This is just one use case, in which ML can help protect important or elephant flows.
Given the way the networking industry is evolving, the time is not far off when ML will open new doors for security, redundancy, and reliability within the network!
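To make the congestion use case a little more concrete, here is a deliberately simple stand-in: an exponentially weighted moving average (EWMA) baseline with a threshold, applied to a series of per-hop latencies. It is plain statistics rather than a trained model, and it detects spikes rather than predicting them. The function name, the `alpha` and `threshold` values, and the sample series are all assumptions for illustration. A real ML model would consume the same kind of latency series.

```python
def ewma_alerts(samples, alpha=0.2, threshold=2.0):
    """Return indices of samples exceeding `threshold` times the running EWMA.

    Samples that raise an alert are left out of the baseline so that a
    burst of congestion does not get learned as normal behavior.
    """
    alerts = []
    baseline = samples[0]
    for i, x in enumerate(samples[1:], start=1):
        if x > threshold * baseline:
            alerts.append(i)
        else:
            baseline = alpha * x + (1 - alpha) * baseline
    return alerts


# Hypothetical per-packet latencies observed at one node.
latencies = [5, 5, 6, 5, 20, 22, 6, 5]
print(ewma_alerts(latencies))   # [4, 5]
```

The two spikes at positions 4 and 5 stand out against a baseline of roughly 5, while ordinary variation between 5 and 6 does not raise an alert.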
Thus, ‘Packet Level Network Telemetry’ makes it possible to gather data along with the actual traffic, giving the ability to observe real-time, end-to-end behavior across the network infrastructure. This helps network administrators diagnose transient issues that arise from performance bottlenecks, network failures, or configuration errors.
About Author: Aalok Shah
Aalok Shah is associated with VOLANSYS Technologies as a Sr. Engineer. He has served in multiple industry verticals and has worked with many tools and technologies over the course of his career. A passionate engineer, he always looks for opportunities to bring better solutions to the table.