I recently purchased a new microserver to reduce my power footprint at home, which meant moving the FMC (Firepower Management Center) off the OpenStack deployment I previously ran. I had run into several OpenStack bugs that were fixed in later versions, but another bug prevented me from upgrading: a classic catch-22. So I deployed a fresh FMC on the new microserver and used a restore to move the data and policies over. In that process I hit a bug that I'd like to share some information about.
Backup & Restore FMC
First of all, the backup and restore of FMC is quite straightforward, as long as you keep ALL versions identical on both appliances: the software release, the SRU, the VDB and the geolocation updates. As soon as one of them differs, the restore will refuse to run, which is exactly what happened on my first attempt. So before you restore, make sure every version matches.
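The versions are listed under Help -> About in the web UI. If you prefer the shell, a quick sketch like the one below can pull the build information from the FMC's configuration file — this assumes the standard Firepower layout with /etc/sf/ims.conf, and the exact key names can differ per release, so verify against the web UI:

```shell
# Sketch: print version/build keys from the FMC config file.
# Path and key names assume a standard Firepower install and may
# vary per release -- cross-check with Help -> About in the web UI.
grep -iE 'version|build' /etc/sf/ims.conf
```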
The process itself is easy: go to System -> Tools -> Backup/Restore, create a backup and download the file. Then log in to the new FMC, go to System -> Tools -> Backup/Restore again, upload the backup file, hit restore, have a coffee and wait until the FMC is restored and has rebooted.
Hitting the bug
At least, that is what was supposed to happen. After the restore I hit bug CSCvd33448, where fireamp.pl pushes the CPU to 100% (or actually more), which is of course not good. What happened? This Perl script is called regularly, and after the restore — even after a reboot — each invocation of fireamp.pl crashes into a zombie (defunct) state. It then gets spawned again, crashes again, and so on, until the FMC is overloaded with semi-crashed fireamp.pl processes that cannot be killed.
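You can see this piling up for yourself from the shell: zombie processes show a 'Z' in the process state column. A minimal check, using plain Linux ps and awk rather than anything FMC specific:

```shell
# List zombie (defunct) processes: the STAT column contains 'Z'.
# On an FMC hitting this bug you would expect a long list of
# fireamp.pl entries here.
ps -eo pid,stat,comm | awk 'NR > 1 && $2 ~ /Z/ { print }'
```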
Although the Cisco bug report doesn't mention a workaround, there is a way to fix this problem. As with anything FMC related, you do have to be patient.
Working around the bug
Just log in to the FMC via SSH, using the admin credentials that you also use for the web management interface.
Then enter the following commands; when sudo asks for credentials, use that same admin password.
sudo mv /usr/local/sf/bin/fireamp.pl /usr/local/sf/bin/fireamp.pl.not
sudo shutdown -r now
You have to be patient with these commands: with the CPU pegged at 100%, typing a command and waiting for it to complete takes quite some time, since it has to be scheduled in an already overloaded environment. Compare it to trying to squeeze a single ICMP packet into a saturated 1 Gbps stream on a gigabit Ethernet port.
Now wait until the FMC has restarted and all processes are back up and running.
Log in to the web interface, go to AMP -> Integration and delete your AMP cloud connection.
Recreate the AMP connection by going through the registration process for a new AMP cloud. Allow the access and wait until the cloud connection is enabled and working.
Log back in to the FMC via SSH with the admin credentials and execute the following command:
sudo mv /usr/local/sf/bin/fireamp.pl.not /usr/local/sf/bin/fireamp.pl
Wait a few minutes and check with ps -ax | grep fireamp.pl that there are no more defunct processes. The output should only show live (non-defunct) fireamp.pl entries, like this:
admin@na-fpmc-001:~$ ps -ax | grep fireamp.pl
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
9902 pts/0 R+ 0:00 grep fireamp.pl
13910 ? S 0:02 /usr/bin/perl /usr/local/sf/bin/fireamp.pl
13924 ? Sl 0:09 /usr/bin/perl /usr/local/sf/bin/fireamp.pl
Connection to <IPAddress> closed.
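Rather than eyeballing the ps output, you can also count leftover zombies directly. The bracketed '[f]' is a standard trick that keeps grep from matching its own command line in the process list; nothing here is FMC specific:

```shell
# Count fireamp.pl zombies; a count of 0 means the workaround held.
# '[f]ireamp' prevents grep from matching itself in the ps output.
ps ax | grep -c '[f]ireamp\.pl.*<defunct>'
```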
If you see the same behaviour occurring, just wait for things to settle down. If that still doesn't help, you might want to repeat the steps.
Although the bug report lists no workaround and this method means diving into the FMC's internals, it did save me from reconfiguring and redeploying all policies.