Networking Heart

Wednesday 5 July 2023

LISP vs EVPN: Mobility in Campus Networks

TL&DR: The discussion on whether “LISP scales better than EVPN” became irrelevant when the bus between the switch CPU and the adjacent ASIC became the bottleneck. Modern switches can process more prefixes than they can install in the ASIC forwarding tables (or we wouldn’t be using prefix-independent convergence).

Now, let’s focus on the dynamics of campus mobility. There’s almost no endpoint mobility if a campus network uses wired infrastructure. If a campus is primarily wireless, we have two options:

The wireless access points use tunnels to a wireless controller (or an aggregation switch), and all the end-user traffic enters the network through that point. The rest of the campus network does not observe any endpoint mobility.
The wireless access points send user traffic straight into the campus network, and the endpoints (end user IP/MAC addresses) move as the users roam across access points.

Therefore, the argument seems to be that LISP is better than EVPN at handling a high churn rate. Let’s see how much churn BGP (the protocol used by EVPN) can handle using data from a large-scale experiment called The Internet. According to Geoff Huston’s statistics (relevant graph), we’ve experienced up to 400,000 daily updates in 2021, with the smoothed long-term average being above 250,000. That’s roughly three to five updates per second on average. I have no corresponding graph from an extensive campus network (but I would love to see one), but as we usually don’t see users running around the campus, the roaming rate might not be much higher.
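To make the arithmetic behind that estimate explicit, here is a minimal sketch that converts the two daily update counts quoted above into average per-second rates. The function name is made up for illustration; the only inputs are the figures from the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def updates_per_second(daily_updates: int) -> float:
    """Average update rate implied by a daily update count."""
    return daily_updates / SECONDS_PER_DAY


# Figures quoted in the text: long-term average and the 2021 peak.
for label, daily in (("long-term average", 250_000), ("2021 peak", 400_000)):
    print(f"{label}: {updates_per_second(daily):.1f} updates/s")
```

Both results land in the low single digits per second, which is the point of the comparison.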

However, there seems to be another problem: latency spikes following a roaming event.

I have no idea how someone could attribute latency spikes equivalent to ping times between Boston and Chicago to a MAC move event. Unless there’s some magic going on behind the scenes:

The end-user NIC disappears from point A, and the switch is unaware of that (not likely with WiFi).
The rest of the network remains clueless; traffic to the NIC MAC address is still sent to the original switch and dropped.
The EVPN MAC move procedure starts when the end-user NIC reappears at point B.
Once the network figures out the MAC address has moved, the traffic gets forwarded to the new attachment point.

Where’s latency in that? The only way to introduce latency in that process is to have traffic buffered at some point, but that’s not a problem you can solve with EVPN or LISP. All you can get with EVPN or LISP is the notification that the MAC address is now reachable via another egress switch.

OK, maybe the engineer writing about latency misspoke and meant the traffic is disrupted for 20 msec. In other words, the MAC move event takes 20 msec. Could LISP be better than EVPN in handling that? Of course, but it all comes down to the quality of implementation. In both cases:

A switch control plane has to notice its hardware discovered a new MAC address (forty years after the STP was invented, we’re still doing dynamic MAC learning at the fabric edge).
The new MAC address is announced to some central entity (route reflector), which propagates the update to all other edge devices.
The edge devices install the new MAC-to-next-hop mapping into the forwarding tables.

Barring implementation differences, there’s no fundamental reason why one control-plane protocol would do the above process better than another one.
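The three steps above can be sketched as a toy model. This is not how any particular EVPN or LISP implementation works; the class names are hypothetical, and the sequence number is only loosely modelled on the idea of tagging each MAC move so that stale announcements lose.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeSwitch:
    name: str
    # MAC address -> (egress switch name, sequence number)
    fib: dict = field(default_factory=dict)

    def install(self, mac: str, next_hop: str, seq: int) -> bool:
        """Install the mapping only if it is newer than the one we hold."""
        current = self.fib.get(mac)
        if current is not None and current[1] >= seq:
            return False  # stale or duplicate announcement, ignore it
        self.fib[mac] = (next_hop, seq)
        return True


class Reflector:
    """Central entity that relays every announcement to all edge switches."""

    def __init__(self, edges):
        self.edges = edges

    def announce(self, mac: str, origin: EdgeSwitch, seq: int) -> None:
        for edge in self.edges:
            edge.install(mac, origin.name, seq)


edges = [EdgeSwitch("A"), EdgeSwitch("B"), EdgeSwitch("C")]
rr = Reflector(edges)
mac = "00:11:22:33:44:55"
rr.announce(mac, edges[0], seq=0)  # MAC first learned behind switch A
rr.announce(mac, edges[1], seq=1)  # endpoint roamed, MAC learned behind B
print(edges[2].fib[mac])           # switch C now points at B
```

Nothing in this flow depends on which protocol carries the announcement; the time it takes is dominated by how quickly each box notices, relays, and installs the change.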

But wait, there’s another gotcha: at least in some implementations, the control plane takes “forever” to notice a new MAC address. However, that’s a hardware-related quirk, and no control-plane protocol will fix that one. No wonder some people talk about dynamic MAC learning with EVPN.

Aside: If you care about fast MAC mobility, you might be better off doing dynamic MAC learning across the fabric. You don’t need EVPN or LISP to do that; VXLAN fabric with ingress replication or SPB will work just fine.

Before doing a summary, let me throw in a few more numbers:

We don’t know how fast modern switches can update their ASIC tables (thank you, ASIC vendors), but the rumors talk about 1000+ entries per second.
The behavior of open-source routing daemons and even commercial BGP stacks is well-documented. Unfortunately, the raw data behind those measurements wasn’t published, but looking at the graphs, it seems that good open-source daemons have no problems processing 10K prefixes in a second or two.

It seems like we’re at a point where (assuming optimal implementations) the BGP update processing rate on a decent CPU exceeds the FIB installation rate.
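A back-of-the-envelope check of that claim, using only the rough numbers quoted above (10K prefixes in one to two seconds for BGP processing, and the rumored 1000 entries per second for ASIC updates). The helper name is hypothetical, and the real rates will differ per platform.

```python
def sustained_rate(control_plane_rate: float, fib_install_rate: float) -> float:
    """The end-to-end update rate is limited by the slower of the two stages."""
    return min(control_plane_rate, fib_install_rate)


bgp_rate_slow = 10_000 / 2  # 10K prefixes in two seconds
bgp_rate_fast = 10_000 / 1  # 10K prefixes in one second
fib_rate = 1_000            # rumored lower bound for ASIC table updates

for bgp_rate in (bgp_rate_slow, bgp_rate_fast):
    limit = sustained_rate(bgp_rate, fib_rate)
    print(f"BGP {bgp_rate:.0f}/s, FIB {fib_rate}/s -> limited to {limit:.0f}/s")
```

With these figures the FIB installation rate is the limit in both cases.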

Back to LISP versus EVPN. It should be evident by now that:

A campus network is probably not more dynamic than the global Internet;
BGP handles the churn in the global Internet just fine, and there’s no technological reason why it couldn’t do the same in an EVPN-based campus.
BGP implementations can handle at least as many updates as can be installed in the hardware FIB.
Regardless of the actual numbers, decent control-plane implementations and modern ASICs are fast enough to deal with highly dynamic environments.
Implementing control-plane-based MAC mobility with a minimum traffic loss interval is a complex undertaking that depends on more than just a control-plane protocol.

There might be a reason only a single business unit of a single vendor uses LISP in their fabric solution (hint: regardless of what the whitepapers say, it has little to do with technology).

Thursday 1 June 2023

Features of SDM-Based Submarine Cable Systems

The evolution of space-division multiplexing (SDM) in submarine networks started with Suboptic’16 (powered by Alcatel-Lucent Submarine Networks (ASN) Ltd., founded in 1983), which was the first submarine cable system with 16 fiber pairs (FPs) [8]. Until recently, in all networks, the initial goal was twofold: to increase the total cable capacity (up to 70% with regard to traditional cable) and to decrease the required cost and power per transmitted bit.

The innovative features that characterize the first generation of SDM submarine networks are [8] as follows.

A relatively high count of FPs (in the same cable) in order to increase the transported capacity.

The deployment of lower effective area fibers in order to optimize cost through the use of a smaller number of regenerators.

The implementation of the novel “pump farming” repeater technology. Pump farming means that a set of pump lasers is used to amplify a set of FPs. Reliability, redundancy, and better power management are the main advantages. In particular, reliability can be a cost-reduction factor, as submarine cable failures and repairs (which bring downtime to the provided services) are very costly.

SDM aims to achieve higher capacity from the same total power through more efficient power management. The key concept is to reduce the optical power delivered to each FP, which reduces nonlinear effects, while deploying a high count of FPs in the same cable.
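The reasoning behind this trade-off can be illustrated with the Shannon capacity formula. The sketch below is a simplification and not taken from the source: it assumes a purely linear channel (no fiber nonlinearity), a fixed total optical power shared equally across fiber pairs, and made-up power, noise, and bandwidth values in arbitrary units.

```python
import math


def aggregate_capacity(fiber_pairs: int, total_power: float,
                       noise_power: float, bandwidth_thz: float = 4.0) -> float:
    """Aggregate Shannon capacity in Tbit/s for a cable whose total power
    budget is split equally across all fiber pairs (linear channel assumed)."""
    snr = (total_power / fiber_pairs) / noise_power
    return fiber_pairs * bandwidth_thz * math.log2(1 + snr)


TOTAL_POWER = 64.0  # hypothetical power budget, arbitrary units
NOISE_POWER = 1.0   # hypothetical noise power, same units

for pairs in (8, 16, 32):
    total = aggregate_capacity(pairs, TOTAL_POWER, NOISE_POWER)
    print(f"{pairs:2d} FPs: {total:6.1f} Tbit/s total, "
          f"{total / pairs:5.1f} Tbit/s per FP")
```

Because capacity grows only logarithmically with signal-to-noise ratio, each fiber pair carries less when its power is reduced, but the cable as a whole carries more.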

However, if we want to cover short-distance undersea links (e.g., unrepeated festoon networks) or increase the capacity of an existing traditional submarine cable system (consisting of a limited number of FPs), multiband transmission is a more effective solution than SDM, which is mostly preferable for long-haul links. More specifically, multiband transmission does not require changing the existing wet plant infrastructure during the upgrade, which can result both in a doubling of the capacity per FP and in cost savings.

Table 2 presents the pros and cons of using SDM over a single band as opposed to multiple bands. The options presented are doubling the number of fibers at C band only and using the same number of FPs over the C + L bands. Note, however, that C + L transmission is less efficient because the C and L bands have to be separated and recombined (mux/demux) in the repeater for each span. This extra multiplexing/demultiplexing leads to an extra loss of a few dB per span, which is contrary to the basic SDM concept of “optimizing efficiency”.

Figure 6 shows the different types of submarine cables and the various types of fibers. The selection of the optimal cable type depends on the depth at which each cable is laid. For example, double-armored (DA) submarine cable is used at the shore end, terminated at the beach manhole of the cable landing site, and interconnected with a much lighter land cable (LWA) running toward the cable landing station.

Saturday 21 January 2023

Basics to GDB Debugging

● 1. We see that there is a core dump happening.

● 2. We see what the core dump tells us.

● 3. Disassemble. The arrow tells us the instruction where the core dump was generated.

● 4. Print address.

● 5. Print address.

● 6. Print the character at address (if it was a character).

● 7. To see the contents of registers, use ‘info registers’.

● 8. To see which function called which function, and so on down to the current function, use ‘backtrace’ (‘bt’), ‘info stack’, or ‘where’.

● 10. To see the frame info of the latest stack frame from above, use ‘info frame’.

Now, ESP is the current stack pointer, and EBP is the base pointer for the current stack frame.

When you call a function, typically space is reserved on the stack for local variables. This space is usually referenced via EBP (all local variables and function parameters are a known constant offset from this register for the duration of the function call.) ESP, on the other hand, will change during the function call as other functions are called, or as temporary stack space is used for partial operation results.

● We need to know who called us and what parameters were given to us (the current function).

○ ebp + 0 gives us the previous ebp.

○ We can get the ebp address from ‘info registers’.

○ Usually the value of ebp in ‘info registers’ (which is the current value of ebp) will be the same as the one in ‘info frame’, since it points to the frame holding the local variables of the current function.

● ebp + 4 contains the return address, that is, the instruction in the caller where we came from.

○ The memory at that location can be printed as 10 addresses, in hexadecimal format and in word length (i.e., 4 bytes, 8 bytes, …).

● In the example, the return address tells us that the current function was called from the connect() function.

● Now, we can do ‘disassemble connect+176’.

○ So, we see that the instruction just before offset 176 is a function call to mystrcpy().
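The frame layout described above (saved EBP at ebp + 0, return address at ebp + 4) can be modelled outside the debugger. The sketch below walks a simulated 32-bit stack stored in a Python dictionary; all addresses are invented for illustration, and the assumption that the outermost frame has a saved EBP of 0 is a common convention, not something stated in the source.

```python
def walk_frames(memory: dict, ebp: int, max_frames: int = 16) -> list:
    """Follow the saved-EBP chain and collect (frame base, return address).

    `memory` maps an address to the 4-byte word stored there.
    """
    frames = []
    while ebp != 0 and len(frames) < max_frames:
        saved_ebp = memory[ebp]        # [ebp + 0]: the caller's EBP
        return_addr = memory[ebp + 4]  # [ebp + 4]: where we return to
        frames.append((ebp, return_addr))
        ebp = saved_ebp                # move up to the caller's frame
    return frames


# Hypothetical stack contents: three nested frames, innermost first.
memory = {
    0xBFFFF000: 0xBFFFF020, 0xBFFFF004: 0x080484B0,
    0xBFFFF020: 0xBFFFF040, 0xBFFFF024: 0x08048510,
    0xBFFFF040: 0x00000000, 0xBFFFF044: 0x08048300,
}

for base, ret in walk_frames(memory, 0xBFFFF000):
    print(f"frame at ebp={base:#010x} returns to {ret:#010x}")
```

This is essentially what a backtrace does: each return address identifies the calling function, and each saved EBP leads to that caller's own frame.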

Thursday 15 December 2022

[BGP] Significant Changes to BGP in Recent Years

Border Gateway Protocol (BGP) is a fundamental routing protocol that is responsible for routing Internet traffic between Autonomous Systems (ASes). BGP is used to exchange routing information between routers on the Internet, and it determines the best path for network traffic to follow. In recent years, BGP has undergone several changes, with new features added to improve its performance and security. In this article, we will discuss some of the newer features added to BGP in recent years.

One of the most significant changes to BGP is the addition of BGPsec. BGPsec is a security extension to BGP (RFC 8205) that adds digital signatures to the AS path carried in BGP updates. This lets a router verify that the path information is authentic and has not been tampered with, which makes it harder for attackers to hijack traffic or redirect it to a malicious destination. BGPsec has seen very little production deployment so far; most operators currently rely on RPKI-based route origin validation, which checks only the origin AS of a route.

Another important addition to BGP is the support for Multiprotocol BGP (MP-BGP). MP-BGP allows BGP to carry routing information for multiple address families, such as IPv4 and IPv6 unicast, as well as for MPLS-based services such as L3VPN and for EVPN. This provides greater flexibility and scalability in routing, allowing BGP to handle the increasing demands of modern networks.

BGP Flowspec is another feature that has been added to BGP in recent years. BGP Flowspec is a traffic filtering mechanism that allows network operators to specify how traffic should be treated based on specific characteristics, such as the source or destination IP address, the type of traffic, or the application used. This allows network operators to block or rate-limit traffic that is considered undesirable, such as traffic from known sources of DDoS attacks.

BGP Large Communities is another recent addition to BGP. It is an extension (RFC 8092) that allows network operators to attach additional metadata to BGP routing updates in the form of 12-byte values, each made up of three 4-byte fields. This metadata can be used for a wide range of purposes, such as filtering, traffic engineering, or monitoring. BGP Large Communities is particularly useful for networks with 4-byte AS numbers, which do not fit into the 2-byte AS field of the original 4-byte communities.
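As an illustration of the format, here is a minimal sketch that converts between the usual textual form of a large community (three colon-separated decimal numbers) and its 12-byte encoding of three 4-byte unsigned integers in network byte order, as defined in RFC 8092. The function names and the sample value are made up for this example.

```python
import struct


def encode_large_community(text: str) -> bytes:
    """Encode 'global_admin:local1:local2' as three 32-bit unsigned integers."""
    parts = [int(part) for part in text.split(":")]
    if len(parts) != 3 or any(not 0 <= p <= 0xFFFFFFFF for p in parts):
        raise ValueError(f"not a valid large community: {text!r}")
    return struct.pack("!III", *parts)  # network byte order, 12 bytes


def decode_large_community(raw: bytes) -> str:
    """Decode a 12-byte large community back into its textual form."""
    return ":".join(str(value) for value in struct.unpack("!III", raw))


raw = encode_large_community("4200000000:100:20")  # 4-byte private ASN
print(len(raw), decode_large_community(raw))
```

The first field comfortably holds a 4-byte AS number, which is the main thing the older community format could not do.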

BGP Link State (BGP-LS) is another recent addition to BGP. Rather than changing how BGP selects routes, BGP-LS (RFC 7752) uses BGP as a transport for the link-state information that OSPF and IS-IS routers maintain in their databases. Routers export that topology information over BGP to external consumers such as controllers and path computation elements, which can then make traffic-engineering decisions across multiple areas or ASes.

BGP Add-Path is a feature that allows a BGP speaker to advertise multiple paths for the same destination prefix to a neighbor, instead of only its single best path. This provides greater redundancy and load balancing, allowing traffic to be distributed more evenly across multiple paths and backup paths to be available before a failure occurs. BGP Add-Path is particularly useful in networks with route reflectors, which would otherwise hide alternative paths from their clients.
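The difference Add-Path makes on the receiving side can be sketched with a toy table of routes learned from one neighbor. Without Add-Path, a new advertisement for a prefix replaces the previous one from that neighbor; with Add-Path, each advertisement carries a path identifier, so several paths for the same prefix can coexist. The class and its methods are hypothetical and ignore everything else a real BGP implementation does.

```python
class AdjRibIn:
    """Routes received from a single neighbor (toy model)."""

    def __init__(self, add_path: bool):
        self.add_path = add_path
        self.paths = {}  # (prefix, path_id) -> next hop

    def receive(self, prefix: str, next_hop: str, path_id: int = 0) -> None:
        # Without Add-Path the path identifier is not part of the key,
        # so a newer advertisement implicitly replaces the older one.
        key = (prefix, path_id) if self.add_path else (prefix, 0)
        self.paths[key] = next_hop

    def next_hops(self, prefix: str) -> list:
        return sorted(nh for (p, _), nh in self.paths.items() if p == prefix)


for enabled in (False, True):
    rib = AdjRibIn(add_path=enabled)
    rib.receive("10.0.0.0/24", "192.0.2.1", path_id=1)
    rib.receive("10.0.0.0/24", "192.0.2.2", path_id=2)
    print(f"add_path={enabled}: {rib.next_hops('10.0.0.0/24')}")
```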

Finally, BGP Route Refresh is a feature that allows a BGP router to ask a neighbor to resend its routes without tearing down the BGP session. This is most useful after a change to the inbound routing policy: the new policy is applied to the re-advertised routes, and the session does not need to be reset to pick up the change. BGP Route Refresh is particularly useful in large networks with many BGP sessions, where resetting BGP sessions can be a time-consuming and disruptive process.

In conclusion, BGP has evolved over the years to become a more robust, secure, and flexible protocol, thanks to the addition of new features and improvements. Network operators can now benefit from advanced features like BGP Flowspec, BGP-LS, and BGPsec, to enhance their network's security, scalability, and resiliency. The combination of BGP with SDN technologies can further enhance network automation and programmability, making it easier to manage large-scale networks.

Thursday 3 November 2022

OSPF vs ISIS- Which Routing Protocol for your needs?

OSPF and ISIS are two popular link-state routing protocols used in large-scale networks. Both protocols have similarities and differences in terms of their design, features, and operation.

OSPF (Open Shortest Path First) is a well-established protocol that has been in use for over 30 years. It operates at the Internet layer (Layer 3) and uses a hierarchical design that partitions the network into areas. Each area has its own link-state (topology) database, which limits the amount of topology information each router must hold and enhances scalability. OSPF also supports multiple paths to a destination, allowing for load balancing and redundancy.

ISIS (Intermediate System to Intermediate System) is a protocol that runs directly over the data link layer (Layer 2) rather than over IP, and is used in large-scale Service Provider networks. It also uses a hierarchical design similar to OSPF, but the hierarchy is built from two levels: Level 1 routing within an area and Level 2 routing between areas. Each level has its own link-state database, which limits the amount of topology information each router must hold and improves scalability. ISIS is also known for its fast convergence and support for large networks with high-speed links.

One significant difference between OSPF and ISIS is their underlying transport. OSPF exchanges routing information in IP packets (IP protocol number 89), while ISIS was designed for the OSI CLNS (Connectionless Network Service) environment and sends its messages directly in Layer 2 frames, independent of IP. This difference can impact the protocol's behavior and performance, depending on the network's architecture and requirements.

Another difference is the way they handle metrics. OSPF uses a metric called cost, which most implementations derive from the link's bandwidth by default. In contrast, ISIS metrics are not derived from link characteristics by default: common implementations assign the same default value (typically 10) to every interface unless the operator configures something else. This difference can affect how the protocol selects the best path to a destination and can impact network performance and behavior.
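A minimal sketch of the bandwidth-based OSPF cost convention follows. It assumes the widely used rule of dividing a reference bandwidth by the interface bandwidth, rounding down, and never going below 1; the 100 Mbps default reference is a common vendor default rather than something mandated by the protocol, and the function name is made up.

```python
def ospf_cost(interface_bps: int, reference_bps: int = 100_000_000) -> int:
    """Interface cost = reference bandwidth / interface bandwidth, minimum 1."""
    return max(1, reference_bps // interface_bps)


links = {"10 Mbps": 10_000_000, "1 Gbps": 1_000_000_000, "10 Gbps": 10_000_000_000}

for name, bps in links.items():
    default = ospf_cost(bps)
    tuned = ospf_cost(bps, reference_bps=100_000_000_000)  # 100 Gbps reference
    print(f"{name}: cost {default} (default reference), {tuned} (100G reference)")
```

With the 100 Mbps reference every link of 100 Mbps or faster ends up with cost 1, which is why operators of faster networks usually raise the reference bandwidth.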

Additionally, OSPF and ISIS both support equal-cost multipath (ECMP), which allows multiple paths with the same cost to a destination to be used at once. What differs is not the protocol but the implementation: the default and maximum number of equal-cost paths installed in the forwarding table vary between vendors and platforms, and may need to be configured. Network operators should check these limits when designing their network for load balancing and redundancy.

Furthermore, OSPF and ISIS use the same mechanism for path calculation. Both run Dijkstra's algorithm, also known as the SPF (Shortest Path First) calculation, over their link-state databases to find the shortest path to each destination. Convergence differences between the two therefore come from implementation details such as timers, flooding behavior, and SPF optimizations, and can vary based on network topology and traffic patterns.
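Dijkstra's shortest-path-first computation, which both link-state protocols run over their topology databases, can be sketched in a few lines. The version below also records every first hop that lies on an equal-cost shortest path, which is the information needed for ECMP. The topology and router names are invented, link costs are assumed to be positive, and real implementations add many optimizations not shown here.

```python
import heapq


def spf(graph: dict, root: str) -> dict:
    """Return node -> (distance from root, set of first hops on shortest paths).

    `graph` maps each node to a dict of {neighbor: positive link cost}.
    """
    dist = {root: 0}
    first_hops = {root: set()}
    heap = [(0, root)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for nbr, cost in graph[node].items():
            nd = d + cost
            # Directly connected neighbors are their own first hop;
            # everything else inherits the first hops of its parent.
            hops = {nbr} if node == root else first_hops[node]
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                first_hops[nbr] = set(hops)
                heapq.heappush(heap, (nd, nbr))
            elif nd == dist[nbr]:
                first_hops[nbr] |= hops  # equal-cost alternative
    return {n: (dist[n], first_hops[n]) for n in dist}


# Hypothetical square topology with two equal-cost paths from R1 to R4.
graph = {
    "R1": {"R2": 10, "R3": 10},
    "R2": {"R1": 10, "R4": 10},
    "R3": {"R1": 10, "R4": 10},
    "R4": {"R2": 10, "R3": 10},
}
distance, hops = spf(graph, "R1")["R4"]
print(distance, sorted(hops))
```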

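In terms of security, both OSPF and ISIS support authentication to prevent unauthorized access and attacks. OSPFv2 supports simple (plaintext) password and cryptographic (MD5 or HMAC-SHA) authentication, and OSPFv3 can be protected with IPsec or an authentication trailer. ISIS supports plaintext as well as HMAC-MD5 and HMAC-SHA authentication. The exact set of options depends on the implementation, so network operators should check what their platforms support when deciding how to secure their network.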

Finally, OSPF and ISIS have different deployment and support options. OSPF is widely supported by many vendors and is commonly used in enterprise networks, while ISIS is primarily used in Service Provider networks and is supported by fewer vendors. This difference can impact how network operators choose their routing protocol based on their network's architecture and requirements.

In conclusion, OSPF and ISIS are both link-state routing protocols that offer similar features and benefits but have significant differences in their design, behavior, and operation. Network operators must carefully evaluate their network's requirements and architecture to choose the best protocol for their needs.