I ran into an interesting problem a few months back, but haven’t had a chance to blog about it.
We have a Cisco GET VPN environment deployed, and our ISP notified us that they would be performing some “circuit grooming”. That phrase always makes me cringe. The usual “this shouldn’t affect your connectivity at all” emails came and went.
Needless to say, the maintenance was performed and all of a sudden all of the GM (group member) routers were throwing anti-replay errors. And as we all know with anti-replay, that’s a bad thing: packets that fail the check get dropped, so the tunnels are toast. GET VPN protects against replay attacks with a pseudo-time mechanism called Time-Based Anti-Replay (TBAR). The key server maintains this pseudo-time and distributes it to the group members through GDOI.
Long story short, no traffic was passing. Entering the “clear crypto gdoi” command took care of this. We did some shallow research, found a known bug in the version of code we were running, and patched accordingly. A few weeks went by and the problem presented itself again. The same “clear crypto gdoi” command, again, took care of it. Checking the negotiated GDOI pseudo-time, I noticed a sizable delta between the key server’s time and the time the branch router was reporting.
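For anyone wanting to make the same comparison, a rough sketch of the check follows. It assumes a release that includes the GDOI replay show commands; the exact output fields vary by version, so treat this as a starting point rather than a recipe.

```
! On the key server: show the pseudo-time it is maintaining for the group
show crypto gdoi ks replay
! On the group member (the branch router): show the pseudo-time it is using
show crypto gdoi gm replay
```

If the two values differ by more than the configured anti-replay window, the group member will fail the TBAR check on otherwise valid packets.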
This gets me to the point of my story: Cisco’s Embedded Event Manager (EEM).
EEM is a fantastic subsystem built into IOS. It allows you to build an “applet” that can respond to certain criteria that you can define as triggers.
Being new to EEM, I wrote a very simple applet that watches the syslog on the remote router for a particular replay error and issues the “clear crypto gdoi” command when it sees one.
Something along these lines:
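event manager applet GDOI_TBAR_RESET
 ! Trigger whenever the anti-replay error appears in syslog
 event syslog pattern "%CRYPTO-4-PKT_REPLAY_ERR"
 ! CLI actions start in user EXEC mode, so enter privileged EXEC first
 action 1.0 cli command "enable"
 ! The clear command asks for confirmation; adjust the pattern if your release words the prompt differently
 action 2.0 cli command "clear crypto gdoi" pattern "confirm"
 action 3.0 cli command "yes"
 action 4.0 syslog msg "Clear Crypto GDOI command issued due to Anti-Replay Error!"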
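Once the applet is in the config, it is worth confirming that it registered and actually fires. A minimal sketch of how that might be checked from privileged EXEC mode, assuming a reasonably recent IOS release (the output format differs between versions):

```
! Confirm the applet is registered and see which event it is tied to
show event manager policy registered
! List the policies that have run recently
show event manager history events
! Optionally watch each CLI action as it executes (disable with "undebug all")
debug event manager action cli
```

The debug is handy the first time the applet triggers, since it shows whether the confirmation prompt was matched or the command simply timed out.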
This doesn’t begin to scratch the surface of EEM, but it still taught me a little bit about its structure and how it can function. I plan to update it in the near future, once I have the time to track down the OID that issues that particular error within IOS.
If anyone has any interesting stories / uses for EEM, feel free to leave a comment below. I’m always up for tips and tricks!