Hung Process in IOS-XR

So, quick post to help anyone else who runs into this problem in IOS-XR land.

While attempting to issue commands related to the BGP process, I was met with no response from the box. I was able to break out of the command with the usual Ctrl + C and issue commands to other processes on the box, but the BGP process just refused to respond.

Reviewing the logs, I was able to find some errors related to no response being received from the BGP process :

RP/0/RSP0/CPU0:Apr  9 22:55:42.109 : sysdb_shared_nc[382]: %SYSDB-SYSDB-6-TIMEOUT_EDM : EDM request for 'oper/ip-bgp/gl/act/shared/vrf/default/afi/' from 'bgp_show' (jid 65855, node 0/RSP0/CPU0). No response from 'bgp' (jid 1047, node 0/RSP0/CPU0) within the timeout period (100 seconds)

You can see the “No response from 'bgp'” string in this log message, along with the job ID (jid 1047) of the hung process. The quick and easy way to take care of a hung process like this is to restart it, by job ID, with the following command :

RP/0/RSP0/CPU0: router#process restart 1047 location 0/RSP0/CPU0

WARNING : Issuing this command will rock the BGP process, so plan accordingly. You may experience a brief outage so schedule it during a typical maintenance window.
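If you have a pile of logs to dig through, pulling the job ID and node out of the timeout message can be scripted. The following is only a rough sketch in Python: the regular expression and the restart_command helper are my own illustrative assumptions built around the single log line above, not anything IOS-XR ships with.

```python
import re

# Matches the "No response from '<process>' (jid N, node X)" part of the
# SYSDB timeout message. The pattern is an assumption based on one sample line.
HUNG = re.compile(
    r"No response from '(?P<proc>[^']+)' "
    r"\(jid (?P<jid>\d+), node (?P<node>[^)]+)\)"
)

def restart_command(log_line):
    """Return the 'process restart' command for the hung process, or None."""
    match = HUNG.search(log_line)
    if match is None:
        return None
    return "process restart {jid} location {node}".format(**match.groupdict())

log = ("RP/0/RSP0/CPU0:Apr  9 22:55:42.109 : sysdb_shared_nc[382]: "
       "%SYSDB-SYSDB-6-TIMEOUT_EDM : EDM request for "
       "'oper/ip-bgp/gl/act/shared/vrf/default/afi/' from 'bgp_show' "
       "(jid 65855, node 0/RSP0/CPU0). No response from 'bgp' "
       "(jid 1047, node 0/RSP0/CPU0) within the timeout period (100 seconds)")

print(restart_command(log))
```

The script only builds the command string; whether and when to actually run it on the router is still your call.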

BGP 4-byte ASNs

BGP ASN Overview

ASes are the unit of routing policy in the modern world of exterior routing, according to RFC1930. However, the classical definition of an AS is a set of routers under a single technical administration. When it comes to BGP, the ASN is the number that uniquely identifies an organization’s presence on the Internet.

The diagram below gives a good logical idea of how ASNs are utilized to provide unique presence within the Internet.

BGP ASNs

Traditional 2-byte ASN

This ASN is represented as a 16-bit unsigned integer. At 16 bits, the number of possible AS numbers tops out at 65536 (0 through 65535). As with RFC1918 and private IPv4 address space, there is a reserved range for ASNs. ASNs 64512 through 65534 are reserved for private use (65535 itself is also reserved) and are not globally routable, meaning they cannot be advertised across the Internet to the peers that your organization connects to.

New 4-byte ASN

As I’d addressed previously, public IPv4 addressing has left us at a choke point, and in the same way it has been decided that 64512 public AS numbers will not be enough and will eventually run out. So, the powers that be have decided that it is time to address this problem now and nip it in the bud instead of waiting for the same problems we’re now having with the IPv4/v6 conversion.

Just as the 2-byte AS number is written in decimal, so is the new 4-byte ASN, but in a form known as ASDOT notation. Specifically, the 32 bits are split into two ‘words’ of 16 bits apiece, each written in decimal, with a ‘.’ or dot in the middle. An example being 65000.65000. However, should the high-order 16 bits be 0, traditional 2-byte notation can be used.
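To make the notation a little more concrete, here is a small sketch in Python that converts between the plain 32-bit value and the dotted form described above. The function names to_asdot and from_asdot are made up for this example.

```python
def to_asdot(asn):
    """Render a 32-bit ASN in ASDOT notation (high.low, or plain if high is 0)."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("ASN must fit in 32 bits")
    high, low = divmod(asn, 65536)  # split into two 16-bit words
    return str(low) if high == 0 else "{}.{}".format(high, low)

def from_asdot(text):
    """Parse ASDOT (or plain decimal) text back into a 32-bit integer."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    if not (0 <= high <= 65535 and 0 <= low <= 65535):
        raise ValueError("each word must fit in 16 bits")
    return high * 65536 + low

print(from_asdot("65000.65000"))
print(to_asdot(64512))
```

The example address 65000.65000 is simply 65000 * 65536 + 65000 as a single 32-bit number, and anything that fits in the low 16 bits prints exactly as a 2-byte ASN would.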

With the advent of this idea, RFC4893 was created. This RFC explains the two major attributes that have been added to BGP to support the incremental migration from a 2-byte ASN to a 4-byte ASN structure. The first is AS4_PATH, which supports the new 32-bit length, and the second is AS4_AGGREGATOR. With this post we’ll be focusing on the AS4_PATH attribute.

It also introduced a reserved AS number, AS_TRANS, that is substituted for a 4-byte ASN when peering with a BGP speaker that has only 2-byte ASN support. More on that below.

4-byte BGP Peering

When BGP starts the process of forming an adjacency between two BGP speakers, an OPEN message is sent between the two devices. Within this OPEN message the “My Autonomous System” field houses the administratively defined ASN. The OPEN message format can be seen below :

        0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+
       |    Version    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     My Autonomous System      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Hold Time           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         BGP Identifier                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Opt Parm Len  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                       Optional Parameters                     |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As shown above, the traditional BGP “My AS” field is 16 bits, or 2-bytes.
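To see why a 4-byte ASN simply cannot fit, here is a rough Python sketch that packs the fixed fields from the diagram using the struct module. The helper name open_fixed_fields and the sample values are assumptions for illustration, and the common BGP message header (marker, length, type) that precedes these fields on the wire is left out.

```python
import socket
import struct

AS_TRANS = 23456  # reserved 2-byte placeholder ASN from RFC4893

def open_fixed_fields(asn, hold_time, bgp_id, opt_params=b""):
    """Pack Version, My AS, Hold Time, BGP Identifier and Opt Parm Len."""
    # The My Autonomous System field is only 16 bits wide, so a larger ASN
    # has to be represented by AS_TRANS in this field.
    my_as = asn if asn <= 0xFFFF else AS_TRANS
    fixed = struct.pack(
        "!BHH4sB",                 # network byte order, no padding
        4,                         # BGP version
        my_as,
        hold_time,
        socket.inet_aton(bgp_id),  # BGP Identifier as 4 raw bytes
        len(opt_params),
    )
    return fixed + opt_params

body = open_fixed_fields(4259905000, 180, "1.1.1.1")
print(len(body), struct.unpack("!BHH4sB", body)[1])
```

The real 4-byte value has to travel somewhere else in the message, which is where the Optional Parameters come in.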

Before we get into the specifics behind how a BGP speaker advertises its 4-byte ASN to another BGP speaker, we have to address a limitation that was built into BGP from its inception. BGP was designed to terminate the peering with a neighbor should it receive an OPEN message with an Optional Parameter that isn’t supported. This seriously inhibits the introduction of future capabilities into BGP, limiting the extensibility of the protocol as a whole.

With this, RFC2842 was created, defining a new “Capabilities” Optional Parameter that is advertised in the OPEN message and allows for some extensibility to be built into BGP. BGP peers list the capabilities they support and peer accordingly. The 4-byte ASN Capability code has been assigned as number 65 by the IANA. The rest of the Capability codes can be found in the IANA Capability Codes registry. With the advent of this new Capability code, we’re able to now use 4-byte ASNs within a BGP topology.

We are then able to peer as traditional 2-byte BGP peers would, simply configuring the neighbor address and the AS it is a part of.
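As a rough illustration of what that capability looks like on the wire, the Python sketch below builds a Capabilities Optional Parameter carrying the 4-byte ASN capability. The function name is my own, and the layout (parameter type 2, then a capability code/length/value triplet) is my reading of the capabilities RFCs, so treat it as a sketch rather than a reference encoder.

```python
import struct

PARAM_CAPABILITIES = 2   # Optional Parameter type for Capabilities
CAP_FOUR_OCTET_AS = 65   # capability code assigned by IANA

def four_octet_as_capability(asn):
    """Build the Optional Parameter advertising 4-byte ASN support."""
    # Capability triplet: code (1 byte), length (1 byte), value (4-byte ASN).
    capability = struct.pack("!BBI", CAP_FOUR_OCTET_AS, 4, asn)
    # Wrap it in the Optional Parameter: type (1 byte), length (1 byte), value.
    return struct.pack("!BB", PARAM_CAPABILITIES, len(capability)) + capability

param = four_octet_as_capability(4259905000)
print(param.hex())
```

A speaker that does not list this capability in its own OPEN message is treated as supporting 2-byte ASNs only.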

2 and 4-byte Support and Interconnection

Things get interesting when having to deploy both 2-byte and 4-byte ASNs. RFC4893 provides the explanation for how this process takes place.

As is defined within RFC4893, for the remainder of this post we’re going to refer to BGP speakers that only support 2-byte ASNs as OLD BGP speakers, and devices that are configured for 4-byte support as NEW BGP speakers.

RFC4893 states that when a NEW BGP speaker is peered with an OLD BGP speaker, it is to advertise the AS_PATH attribute in the old 2-byte ASN form. This is where the AS_TRANS substitution occurs. The NEW BGP speaker knows the limitation of the OLD BGP speaker and will swap each 4-byte ASN in the AS_PATH with AS_TRANS, which RFC4893 defines as the reserved ASN 23456. The NEW BGP speaker is also required to send the AS4_PATH attribute at the same time. The AS4_PATH attribute carries the same path with every ASN encoded in its full 4-byte form, so the real 4-byte ASNs advertised by upstream NEW BGP speakers are preserved.

NEW BGP speakers are to be prepared to receive both an AS_PATH and an AS4_PATH attribute from an OLD BGP speaker, which simply passes the AS4_PATH along untouched. Upon receiving both, the NEW BGP speaker merges the AS_PATH and AS4_PATH attributes to reconstruct the full path.

This process allows OLD BGP speakers to coexist with NEW BGP speakers and still preserves the 4-byte ASNs between NEW BGP speakers.
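The substitution and the merge can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: paths are flat lists of ASNs (AS_SEQUENCE only, no AS_SETs or confederations), the function names are hypothetical, and the ASNs come from the documentation ranges rather than any real network.

```python
AS_TRANS = 23456

def to_old_speaker(path):
    """Return (AS_PATH, AS4_PATH) as a NEW speaker would send to an OLD one."""
    as_path = [asn if asn <= 0xFFFF else AS_TRANS for asn in path]
    # AS4_PATH is only needed when something was actually substituted.
    as4_path = list(path) if as_path != list(path) else None
    return as_path, as4_path

def merge(as_path, as4_path):
    """Rebuild the real path on a NEW speaker from the two attributes."""
    # If AS4_PATH is absent or longer than AS_PATH it cannot be trusted.
    if as4_path is None or len(as_path) < len(as4_path):
        return list(as_path)
    # Keep the leading ASNs prepended by OLD speakers, then use AS4_PATH.
    keep = len(as_path) - len(as4_path)
    return list(as_path[:keep]) + list(as4_path)

as_path, as4_path = to_old_speaker([65550, 64500, 65536])
# An OLD speaker in AS 64510 prepends itself to AS_PATH only.
as_path = [64510] + as_path
print(merge(as_path, as4_path))
```

The OLD speaker in the middle only ever sees 23456 in place of the 4-byte ASNs, while the NEW speaker on the far side ends up with the real path again.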

An example of the process can be seen below :

BGP 4-Byte ASN

LDP Extended Discovery

Label Distribution Protocol (LDP) is a protocol used within MPLS for exchanging label bindings with other MPLS routers within the AS. These labels are then used to build LSPs throughout the network.

LDP discovers adjacencies by sending LDP discovery (Hello) packets over UDP port 646 to the all-routers multicast address of 224.0.0.2. Once an adjacency is discovered, one of the LSRs takes the active role and the other takes the passive role, with the passive LSR waiting for the active one to initiate the connection. The roles are decided by comparing the LSRs’ transport addresses as unsigned integers, but I’m not going to get into the granular details behind that; you’re welcome to read RFC3036 if you’re interested.
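For the curious, the role decision boils down to a simple integer comparison. Here is a minimal Python sketch of it, assuming the rule that the LSR with the numerically higher transport address plays the active role; the function name and the addresses are made up for the example.

```python
import ipaddress

def ldp_role(local_transport, peer_transport):
    """Return 'active' or 'passive' for the local LSR."""
    # Treat both transport addresses as unsigned integers and compare them.
    local = int(ipaddress.ip_address(local_transport))
    peer = int(ipaddress.ip_address(peer_transport))
    return "active" if local > peer else "passive"

print(ldp_role("10.0.0.2", "10.0.0.1"))
print(ldp_role("10.0.0.1", "10.0.0.2"))
```

The active LSR is the one that opens the TCP connection for the LDP session; the passive one just listens for it.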

If the routers are not directly connected, as with TE tunnels sometimes spanning discontiguous LSRs within the MPLS domain, there is the ability to utilize the LDP Extended Discovery mechanism that is defined within RFC3036. This will allow the LSR to produce a Targeted Hello message toward a defined target LSR. Instead of using the all-routers (224.0.0.2) address, the router will send a unicast discovery packet to the target LSR.

As the RFC states, the basic LDP discovery process is symmetrical, with both LSRs discovering each other and negotiating their role in the establishment of the LDP session. With LDP Extended Discovery, the process is asymmetrical, with one LSR initiating the connection and the targeted LSR deciding whether it wants to respond to or ignore the request to establish an LDP session. Should the targeted LSR decide it wants to respond, it will do so by sending periodic Targeted Hellos in return.

We’ll start with the following topology :

topology

Routers PE1 and PE2 are not directly connected, therefore they will not establish an LDP adjacency, as seen here in the show mpls ldp neighbor output from PE1.

fig2

PE1 has successfully established LDP adjacencies with P1 and P2, but not with PE2. If we wanted to establish an LDP adjacency between these two LSRs, the following process would have to be followed :

First, we would need to define a targeted neighbor within the LDP process on one of the PE routers. For this example, we’ll go ahead and define PE1 as the active LSR by issuing the mpls ldp neighbor neighbor-ID targeted ldp command :

fig3

By enabling debug mpls ldp targeted-neighbors, we can see the LDP process kick up and start attempting to discover the defined target LSR, on PE1 :

fig4

Once PE1 has been set up to target PE2, we can then configure PE2 to accept targeted LDP messages by issuing the mpls ldp discovery targeted-hello accept command :

fig5

Once PE2 has been configured to respond to the targeted hellos it is receiving, we can see the adjacency establish through the same debugs :

fig6

We can also verify the adjacency by issuing the show mpls ldp neighbor command again – and here we see that 3.3.3.3 is now listed as a valid adjacency :

fig7

Up, Up and Away

So,

It’s been quite some time since I’ve posted to my blog and I feel it necessary to check in with everyone. I’ve recently passed the ROUTE portion of my CCNP and I plan on sitting for the SWITCH in a month or two. Because I work with switching hardware more than anything else, I feel more confident in that area of the curriculum. After that will obviously be the TSHOOT, and that will conclude the track for my NP. If you’d asked me a year ago if I’d thought I’d be this far along in my studies, I would have probably laughed at you. I thought of having passed even the NA as a feat that I might not be able to achieve. But God and my family continue to be my driving force behind working as hard as I do. So, I will keep you all updated on when I sit for the SWITCH.

As for my WGU studies, I’ve not been concentrating on them as much as I should be, and plan on kicking that into high gear starting right after Christmas. I’m taking no technical courses this semester and I think that has derailed me a bit on wanting to study any of the course materials. But, I will do it to get it out of the way so I can finally finish the degree.

I’ve recently been thinking about what it would take to obtain a CCIE and I think I am going to continue on my Cisco studies after the NP and complete the IE. I want to have those numbers and have that feeling of accomplishment that so many others before me have. I am slowly building my library for the studies and I plan on having all of the literature at arm’s length within the next few months. I am currently going by the reading list on Cisco’s learning network, but if anyone has any suggestions, I would gladly take them. I know this is going to be a long and arduous process, but totally worth it.

I’ve also been in my new position with CSC now for almost 90 days. I am working with a group of really smart guys and I am learning a lot from them. It’s definitely contributing to the wealth of knowledge that I know will be required for my attempt at an IE R&S. But, through my studies, I’ve also been able to contribute to the team, which is definitely a fulfilling experience: being able to apply concepts and theory that I’ve studied so hard and watching them play out just as they have in many, many lab scenarios.

I’ve got a few ideas still jotted down for some technical posts and I promise I will post them soon. Just trying to wrap up some loose ends before the holidays are here and gone. Off the top of my head, I can think of a RADIUS implementation post using Server 2008 R2 authenticating against Active Directory, while using Wireshark to verify the PSK exchange between the RADIUS server and the end point device. Stay tuned!

-Ed

IPv6 Address Formatting

So, I’ve come to the in-depth IPv6 studies for my CCNP, and I figured I’d take some notes on my blog to help others out who take this path.

I know, I should’ve gotten some exposure to this in my CCNA studies, and I did, but not enough to totally absorb it.

OK, take a deep breath. I know some people, including myself, were a little intimidated by this, but it CAN be done!

So, let’s get started :

With IPv6, the standard IPv4 32-bit addressing scheme goes out the window. Now a 128-bit addressing scheme is adopted.

IPv4 provided a total of 4,294,967,296 addresses. Ready for this one? IPv6 now provides this many addresses :

340,282,366,920,938,463,463,374,607,431,768,211,456  –  340 Undecillion addresses.

I’ve read somewhere that this is actually enough addresses to give every atom on the earth’s surface an address, and then 100 Earths thereafter. You can check me on that, but I am writing this from memory. Seems like we will NEVER run out, eh? Well, this brings up one of my favorite XKCD comics – http://xkcd.com/865/

To prevent confusion and over/under/random-use and to help with the public allocations, the RFC states that we will leave 85% of the IPv6 spectrum unused until the standard is revised. When that will be? I don’t think anyone knows, but there was a time when we thought IPv4 would provide enough addresses for everything….

As you can see, a number like that would be less than ideal to try and manage. That being said, the powers that be decided to divide the address into 8 groups of 4 hexadecimal characters each. The IP address in v6 land no longer consists of 4 octets of 8 bits. Now it consists of 8 groups of 16 bits each, and since each group is written in hexadecimal, every character can take a value from 0 through F (4 bits). If you do the math, this comes out to 2^128 possible addresses, which adds up to that crazy number listed above.
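If you want to check the math yourself, Python handles integers this large without complaint. A quick sketch (the variable names are just for this example):

```python
GROUPS = 8            # groups in an IPv6 address
CHARS_PER_GROUP = 4   # hexadecimal characters per group
BITS_PER_CHAR = 4     # one hex character (0 through F) encodes 4 bits

bits = GROUPS * CHARS_PER_GROUP * BITS_PER_CHAR
ipv4_total = 2 ** 32
ipv6_total = 2 ** bits

print(bits)
print("{:,}".format(ipv4_total))
print("{:,}".format(ipv6_total))
```

Eight groups of four characters at four bits apiece gives the 128 bits, and 2 to that power gives the address count.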

An example being :

2001:0050:0000:0000:0000:0AB4:1E2B:98AA  –  A far cry from 10.1.1.1, I would say.

Again, to make this all a bit more comprehensible and manageable, the powers that be allowed us some leeway in how we can write and handle the addresses. You are able to replace a run of consecutive all-zero groups with a double colon (::). But, here’s the kicker, this can only be done ONCE per address.

If we were to apply this to the address listed above, it would become :

2001:0050::0AB4:1E2B:98AA

Now, that’s still a bit unruly to have to try and type into a cmd prompt to ping something, so there is another rule we can apply. We are allowed to drop leading zeros within each group of the address. Again, if we were to apply this to the address above, it would become :

2001:50::AB4:1E2B:98AA

This brings it down to something that, though not exactly EASY to remember or type, is a hell of a lot more manageable than the first number we started with.

That sums up the formatting of the addresses, and is a very high level overview, but I just wanted to point out the 2 rules that can be applied to “short-handing” the address to make it a little less stressful on your brain, as I know I needed.
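You don’t have to do the shortening by hand, either. As a quick sketch, Python’s standard ipaddress module applies both rules for you and will also refuse an address that uses the double colon more than once (the second address below is a deliberately broken example of my own). Note that the module prints hex digits in lowercase.

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:0050:0000:0000:0000:0AB4:1E2B:98AA")

# Both shortening rules applied: zero run collapsed, leading zeros dropped.
print(addr.compressed)
# The full 8 x 4 character form.
print(addr.exploded)

# Using '::' twice is ambiguous, so the parser rejects it.
try:
    ipaddress.IPv6Address("2001::50::1")
    double_colon_ok = True
except ipaddress.AddressValueError:
    double_colon_ok = False
print(double_colon_ok)
```

The short and long forms are the same 128-bit address underneath, which is why comparing the parsed objects rather than the strings is the safer habit.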

As always, thoughts and insights are greatly appreciated.

Until next time.

Work Stuff – More Studying – Collapsing the Data Center

So,

I created a task for myself at work, not that my work load hadn’t already been enough.

I wanted to implement Active Directory authentication on our Alcatel-Lucent equipment using a Windows Server 2008 R2 server acting as an NPS server. I wasn’t able to find much documentation on this at all on the web or at Alcatel Unleashed or even in Alcatel’s configuration manuals or user guides. They were, however, nice enough to provide me with the Vendor Specific Attributes (VSAs) that needed to be added to the RADIUS server to provide the appropriate information to the device. I will post the complete write-up on that at a later time, as I am still writing the documentation at work and I want to finalize it before I post it.

On a better note! I’ve been doing a lot of CCNP preparation as of late, and I am starting to feel more and more comfortable with the concepts and application of the material. I’ve worked my way through EIGRP and OSPF and I have moved on to route redistribution. The more I read and the more I work in my home lab and GNS3, the more thirsty I become for learning and digesting anything network oriented, like using route-maps and ACLs to efficiently redistribute routes between different domains or to assign specific metrics to routes to make things even more efficient! I find myself analyzing everything at work to see how I can make it more efficient, which can’t be a bad thing, and they certainly benefit from it.

OK, I’m sure you’ve heard enough in my posts about my studies. Let’s talk about some industry buzz for a while. Collapsing the data center, in a good way. In the past, traditional networks required an abundant use of distribution layer switches to communicate with the core layer of the network. The article states that there isn’t as much of a need for the distribution layer anymore, and that the access layer switches could communicate directly with the core for core services. As we move further into the future and the more I work with these types of technology, the more I realize that there may actually be some logic to this theory. In the past, equipment couldn’t be considered as reliable as it is today. Decreases in power demand, footprint and cost, and increases in reliability and performance are starting to allow devices to last longer and produce better results. This leaves us with the option to start eliminating some distribution layer devices and start uplinking access devices directly to the core.

Most arguments I’ve heard against this idea address redundancy and availability. But, if you have an access layer switch uplinked to a distribution layer device which is then uplinked to the core, and that distribution layer switch were to fail, you’re still left with the same result as if the access layer switch failed. No connectivity. So, the idea of less complexity and more performance is always something we’re all keen on as network junkies.

Another topic I’d like to discuss in a later post would be “cloud” technologies. While good in theory, I think the world is in for a bit of an eye-opener when we really start moving heavily toward companies who provide a cloud service. I see it as one giant security threat. Even with the ideas of public, private, or hybrid clouds. But, I digress. I will write that up in another post.

Though I’m still new to the in-depth studies and theories of networks, I still strive to make the best judgements with what I know. Please feel free to add input or correct me in any of my statements.

Thanks for reading!

NetworkN3rd

 

EIGRP variance

Cisco’s proprietary routing protocol, EIGRP, offers an interesting tidbit of functionality to the network that decides to run entirely on IOS based routers. This little tidbit is known as the EIGRP variance command. What this function does is allow for unequal cost load balancing on a router.

Please see Diagram below:

You’ll notice that the HQ router is connected to both remote offices via some sort of serial based medium, one interface bandwidth being 128Kbps and one being 256Kbps (I know, not a whole hell of a lot of bandwidth, but this is just for ease of example). Along with that, the remote offices are connected to each other using a FastEthernet standard at 100Mbps.

You’ll notice that Remote Office – 1 has network 10.10.20.0/24 connected to it. Now, if all routers in the diagram are running EIGRP and are all fully converged, the HQ router will have a route installed in its route table for the 10.10.20.0/24 network. Due to the low bandwidth on the 128Kbps link directly to Remote Office – 1, router HQ is going to install the HQ <-> RO2 <-> Switch <-> RO1 route into its routing table, strictly because traversing that 128Kbps link, as opposed to the 256Kbps link and then the 100Mbps link between the two remote sites, would be far more costly in terms of the time it would take for the traffic to reach its destination.

That being said, we all know that in the IT industry we’re looking for newer and faster ways to get data from point A to point B. And we all know a little bit about what load-balancing is: utilizing more than one medium to transport traffic from A to B at the same time, balancing the traffic 1-to-1 across multiple links.

Well, Cisco decided that the whole “only equal cost load balancing” model was a little too restrictive. So, they took it upon themselves to not only create their own routing protocol, but add a few little tidbits of functionality to it that truly make it their own. And this is where the variance command was born.

The variance command allows you to load balance the traffic across unequal cost paths, as opposed to the traditional load balancing across only equal cost paths.

If we refer back to the diagram above, we can now issue the variance command under that particular instance of EIGRP on the HQ router. We will call the multiplier n for the sake of the following example. To keep it simple math-wise, if we entered variance 2 on the HQ router, the router would then also install routes with a metric of less than 2 times the minimum metric for that destination, as long as those routes are feasible successors (in other words, known to be loop-free). What that means is, once this command is issued, the router will look for additional routes to the 10.10.20.0/24 network whose metric falls within the multiplier of 2 defined in the variance command. (i.e. 128Kbps is exactly half of 256Kbps, so the metric of the direct 128Kbps path works out to roughly double that of the best path.)
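To get your math right before enabling the command, it can help to rough out the metrics first. The Python sketch below uses the classic EIGRP composite metric with default K values (bandwidth and delay only). Everything about the topology numbers is an assumption on my part: a delay of 20,000 microseconds per serial link, 100 microseconds per FastEthernet link, and 10.10.20.0/24 sitting on a FastEthernet interface of RO1. Check the real values on your own interfaces before trusting it.

```python
def eigrp_metric(min_bw_kbps, total_delay_usec):
    """Classic EIGRP metric with default K values: 256 * (bandwidth + delay)."""
    return 256 * (10_000_000 // min_bw_kbps + total_delay_usec // 10)

SERIAL_DELAY = 20_000  # microseconds, assumed
FE_DELAY = 100         # microseconds, assumed

routes = [
    # HQ -> RO2 -> RO1 -> LAN: slowest link is 256Kbps
    {"via": "RO2",
     "metric": eigrp_metric(256, SERIAL_DELAY + FE_DELAY + FE_DELAY),
     "reported": eigrp_metric(100_000, FE_DELAY + FE_DELAY)},
    # HQ -> RO1 -> LAN: slowest link is 128Kbps
    {"via": "RO1",
     "metric": eigrp_metric(128, SERIAL_DELAY + FE_DELAY),
     "reported": eigrp_metric(100_000, FE_DELAY)},
]

def installed(routes, variance):
    """Return the next hops HQ would install for a given variance."""
    best = min(r["metric"] for r in routes)
    chosen = []
    for r in routes:
        is_successor = r["metric"] == best
        feasible = r["reported"] < best  # feasibility condition
        if is_successor or (feasible and r["metric"] < variance * best):
            chosen.append(r["via"])
    return chosen

for r in routes:
    print(r["via"], r["metric"])
print(installed(routes, 1))
print(installed(routes, 2))
```

Under these assumed delays the direct path’s metric comes in just under twice the best one, so variance 2 is enough to bring it into the routing table, while the default variance of 1 leaves only the path through RO2.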

A little tricky at first, but once you actually sit and think about it, just make sure that you have your math right before you enable the command, and watch the previously useless routes come to life and allow even more optimization to your network. 🙂