Networking Heart

Tuesday, 8 August 2023

Elastic Stack for Network and Security Engineers

In recent years network engineers turned from CLI jockeys into a hybrid between an application developer and a networking expert… what acronym-loving people call NetDevOps. In practice, a NetDevOps engineer should be able to:

manage and troubleshoot networks;
helps to troubleshoot any information system issue (including “slow” applications);
automate networking tasks;
monitor network and application performance;
continue auditing the infrastructure, eventually using (partial) automation to make it less time-consuming;

and do everything else not covered by other IT teams.

To get there, NetDevOps started to learn Linux, Python, and automation frameworks. In this article, we’ll add log management to the mix.

Log management is the ability to collect any event from information systems and get them automatically analyzed to help NetDevOps react faster to information system issues.

There are many commercial or open-source log management platforms; I would mention:

Each one of these has a different focus: Graylog was born as a log management solution, Elastic Stack is a Big Data solution, and InfluxDB is a time-series database.

We won’t go into discussing the pros and cons of these products. There are already plenty of blog posts doing that.

I chose to work with Elastic Stack because of its:flexibility: I’m able to inject logs from almost any device I encountered;
scalability: I can manage hundreds or thousands of log messages per seconds without any issue;
integration: I’m able to build a very robust solution that includes other open-source components.
vision: I honestly like where Elastic company is going, and I agree with its vision.

Wednesday, 5 July 2023

LISP vs EVPN: Mobility in Campus Networks

TL&DR: The discussion on whether “LISP scales better than EVPN” became irrelevant when the bus between the switch CPU and the adjacent ASIC became the bottleneck. Modern switches can process more prefixes than they can install in the ASIC forwarding tables (or we wouldn’t be using prefix-independent convergence).

Now, let’s focus on the dynamics of campus mobility. There’s almost no endpoint mobility if a campus network uses wired infrastructure. If a campus is primarily wireless, we have two options:

The wireless access points use tunnels to a wireless controller (or an aggregation switch), and all the end-user traffic enters the network through that point. The rest of the campus network does not observe any endpoint mobility.
The wireless access points send user traffic straight into the campus network, and the endpoints (end user IP/MAC addresses) move as the users roam across access points.

Therefore, the argument seems to be that LISP is better than EVPN at handling a high churn rate. Let’s see how much churn BGP (the protocol used by EVPN) can handle using data from a large-scale experiment called The Internet. According to Geoff Huston’s statistics (relevant graph), we’ve experienced up to 400.000 daily updates in 2021, with the smoothed long-term average being above 250.000. That’s around four updates per second on average. I have no corresponding graph from an extensive campus network (but I would love to see one), but as we usually don’t see users running around the campus, the roaming rate might not be much higher.

However, there seems to be another problem: latency spikes following a roaming event.

I have no idea how someone could attribute latency spikes equivalent to ping times between Boston and Chicago to a MAC move event. Unless there’s some magic going on behind the scenes:

The end-user NIC disappears from point A, and the switch is unaware of that (not likely with WiFi).
The rest of the network remains clueless; traffic to the NIC MAC address is still sent to the original switch and dropped.
The EVPN MAC move procedure starts when the end-user NIC reappears at point B.
Once the network figures out the MAC address has moved, the traffic gets forwarded to the new attachment point.

Where’s latency in that? The only way to introduce latency in that process is to have traffic buffered at some point, but that’s not a problem you can solve with EVPN or LISP. All you can get with EVPN or LISP is the notification that the MAC address is now reachable via another egress switch.

OK, maybe the engineer writing about latency misspoke and meant the traffic is disrupted for 20 msec. In other words, the MAC move event takes 20 msec. Could LISP be better than EVPN in handling that? Of course, but it all comes down to the quality of implementation. In both cases:

A switch control plane has to notice its hardware discovered a new MAC address (forty years after the STP was invented, we’re still doing dynamic MAC learning at the fabric edge).
The new MAC address is announced to some central entity (route reflector), which propagates the update to all other edge devices.
The edge devices install the new MAC-to-next-hop mapping into the forwarding tables.

Barring implementation differences, there’s no fundamental reason why one control-plane protocol would do the above process better than another one.

But wait, there’s another gotcha: at least in some implementations, the control plane takes “forever” to notice a new MAC address. However, that’s a hardware-related quirk, and no control-plane protocol will fix that one. No wonder some people talk about dynamic MAC learning with EVPN.

Aside: If you care about fast MAC mobility, you might be better off doing dynamic MAC learning across the fabric. You don’t need EVPN or LISP to do that; VXLAN fabric with ingress replication or SPB will work just fine.

Before doing a summary, let me throw in a few more numbers:

We don’t know how fast modern switches can update their ASIC tables (thank you, ASIC vendors), but the rumors talk about 1000+ entries per second.
The behavior of open-source routing daemons and even commercial BGP stacks is well-documented . Unfortunately, he didn’t publish the raw data, but looking at his graphs, it seems that good open-source daemons have no problems processing 10K prefixes in a second or two.

It seems like we’re at a point where (assuming optimal implementations) the BGP update processing rate on a decent CPU exceeds the FIB installation rate.

Back to LISP versus EVPN. It should be evident by now that:

A campus network is probably not more dynamic than the global Internet;
BGP handles the churn in the global Internet just fine, and there’s no technological reason why it couldn’t do the same in an EVPN-based campus.
BGP implementations can handle at least as many updates as can be installed in the hardware FIB.
Regardless of the actual numbers, decent control-plane implementations and modern ASICs are fast enough to deal with highly dynamic environments.
Implementing control-plane-based MAC mobility with a minimum traffic loss interval is a complex undertaking that depends on more than just a control-plane protocol.

There might be a reason only a single business unit of a single vendor uses LISP in their fabric solution (hint: regardless of what the whitepapers say, it has little to do with technology).

Thursday, 1 June 2023

Features of SDM-Based Submarine Cable Systems

The evolution of SDM in submarine networks started with Suboptic’16 (powered by Alcatel-Lucent Submarine Networks (ASN) Ltd., founded in 1983) which was the first 16-FPs submarine cable system [8]. Until recently, in all networks, the initial goal was twofold: to increase the total cable capacity (up to 70% with regard to traditional cable) and to decrease the required cost and power per transmitted bit.

The innovative features that characterize the first generation of SDM submarine networks are [8] as follows.

A relatively high count of FPs (in the same cable) in order to increase the transported capacity.

The deployment of lower effective area fibers in order to optimize cost through the use of a smaller number of regenerators.

The implementation of the novel “pump farming” repeaters’ technology. Pump farming means that a set of pump lasers isused to amplify a set of FPs. Reliability, redundancy, and better power management are the main advantages. In particular, reliability can be a cost-reduction factor as submarine cables’ failures and repairs (bringing downtime in provided services) are very costly.

SDM aims to achieve higher capacities by using the same amount of used power through a more efficient power management. The key concept is to reduce the optical power provided to each FP as a way to decrease the nonlinearities as implementing high count of FPs in the same cable.

However, in the event we want to cover small-distance undersea links (e.g., unrepeated- festoon networks) or if we want to increase the capacity of an existing traditional submarine cable system (consisting of a limited number of FPs), multiband transmission is a more effective solution compared with SDM, which is mostly preferable for long-haul distance links. More specifically, in multiband transmission there is no need to change the existing wet plant infrastructure during the upgrade process and this can result in both an increase (by double) of the capacity/FPs and in cost savings.

Table 2 presents the pros and cons of using SDM over a single band as opposed to multiple bands. The options presented are doubling the number of fibers at C band only and using the same number of FPs over the C + L bands. Note however that the C + L transmission is less efficient because C + L has to be separated and recombined (mux/de- mux) in the repeater for each span. This extra multiplexing/de-multiplexing leads to an extra loss per span of about a few dBs, which is contrary to the “optimizing efficiency” basic SDM concept.

Figure 6 shows the different types of submarine cables and the various types of fibers, respectively. The selection of the optimal cable type depends on the depth at which each cable is sinked. For example, double-armored (DA) submarine cable is used at the shore end, terminated at the beach manhole of the cable landing site, and interconnected with a much lighter land cable (LWA) moving toward the cable landing station.

Saturday, 21 January 2023

Basics to GDB Debugging

● 1. We see that there is a core dump happening:

●

● 2. We see what the core dump tells:

●

● 3. Disassemble

● The arrow tells us the instruction where the core dump was generated

●

● 4. Print address

●

● 5. Print address

●

● 6. Print the character at address (if it was a character)

●

● 7. To see the contents of registers- info registers

●

● 8. To see which function called which function and so on till the current function- use ‘backtrace (bt)’ or ‘info stack’ or ‘where’

●

● :

●

● 10. To see the frame info of the latest stack frame from above- info frame

●

Now,

ESP is the current stack pointer. EBP is the base pointer for the current stack frame.

When you call a function, typically space is reserved on the stack for local variables. This space is usually referenced via EBP (all local variables and function parameters are a known constant offset from this register for the duration of the function call.) ESP, on the other hand, will change during the function call as other functions are called, or as temporary stack space is used for partial operation results.

● We need to know who had called us and what were the parameters given to me (the current function)

●

○ So, ebp+ 0 will give us the previous ebp

○ We can know the ebp address from the info registers

○ Usually the value of ebp in info registers (which is the current value of the ebp) will be same as the info frame since it will be pointing to the local variables of current function.

● Now, ebp + 4 contains the return instruction where we have come from

○ Example:

○

○ To see the above output to print 10 addresses, in hexadecimal format and in word length(.ie. 4 bytes, 8 bytes, …):

○ Example:

○

○ In that example, we see who has called us is this:

○

● In example, we can tell that the current function has been called from connect() function using:

○

● Now, we can do disassemble connect+176:

○

○ So, we see that the instruction before 176 has a function call to mystrcpy()

Thursday, 15 December 2022

[BGP] Significant Changes to BGP in Recent Years

Border Gateway Protocol (BGP) is a fundamental routing protocol that is responsible for routing Internet traffic between Autonomous Systems (ASes). BGP is used to exchange routing information between routers on the Internet, and it determines the best path for network traffic to follow. In recent years, BGP has undergone several changes, with new features added to improve its performance and security. In this article, we will discuss some of the newer features added to BGP in recent years.

One of the most significant changes to BGP is the addition of BGPsec. BGPsec is a security extension to BGP that provides secure routing by adding digital signatures to BGP updates. This ensures that BGP routing information is authentic and has not been tampered with, preventing attackers from hijacking traffic or redirecting it to a malicious destination. BGPsec is now widely deployed, and it is essential in ensuring the security and integrity of BGP routing information.

Another important addition to BGP is the support for Multiprotocol BGP (MP-BGP). MP-BGP allows BGP to support routing information for multiple protocols, such as IPv4 and IPv6, as well as other network layer protocols like MPLS. This provides greater flexibility and scalability in routing, allowing BGP to handle the increasing demands of modern networks.

BGP Flowspec is another feature that has been added to BGP in recent years. BGP Flowspec is a traffic filtering mechanism that allows network operators to specify how traffic should be treated based on specific characteristics, such as the source or destination IP address, the type of traffic, or the application used. This allows network operators to block or rate-limit traffic that is considered undesirable, such as traffic from known sources of DDoS attacks.

BGP Large Communities is another recent addition to BGP. BGP Large Communities is an extension to BGP that allows network operators to attach additional metadata to BGP routing updates. This metadata can be used for a wide range of purposes, such as filtering, traffic engineering, or monitoring. BGP Large Communities is particularly useful in large networks where the routing table can be very large, and it provides a more efficient way to manage routing updates.

BGP Link State is another recent addition to BGP that provides a more scalable way to handle routing information. BGP Link State is based on the same principles as the OSPF and IS-IS routing protocols, where routers maintain a database of link-state information, and routing decisions are made based on this information. BGP Link State can handle larger networks with more complex routing requirements, providing better scalability and efficiency in routing.

BGP Add-Path is a feature that allows BGP to advertise multiple paths for the same destination prefix. This provides greater redundancy and load balancing, allowing traffic to be distributed more evenly across multiple paths. BGP Add-Path is particularly useful in networks with high traffic volumes or where link failures are common.

Finally, BGP Route Refresh is a feature that allows BGP routers to refresh their routing tables without tearing down BGP sessions. This provides a more efficient way to handle routing updates, as BGP sessions do not need to be reset each time the routing table is updated. BGP Route Refresh is particularly useful in large networks with many BGP sessions, where resetting BGP sessions can be a time-consuming and disruptive process.

In conclusion, BGP has evolved over the years to become a more robust, secure, and flexible protocol, thanks to the addition of new features and improvements. Network operators can now benefit from advanced features like BGP Flowspec, BGP-LS, and BGPsec, to enhance their network's security, scalability, and resiliency. The combination of BGP with SDN technologies can further enhance network automation and programmability, making it easier to manage large-scale networks.