Fixing A Juniper Switch That Was Shut Down Improperly
Juniper switches need to be shut down cleanly, not just powered off. Junos is built on FreeBSD, and like any Unix system it can be left with a damaged filesystem when power is cut without a clean shutdown.
OS Primary Partition Corruption
You will know when you have a switch that has been shut down improperly. There will be an amber light on the chassis, and this alarm on the console:
user@switch> show chassis alarms
1 alarms currently active
Alarm time               Class  Description
2014-01-26 10:48:49 EST  Minor  Host 0 Boot from backup root
As well as this banner:
***********************************************************************
**                                                                   **
**  WARNING: THIS DEVICE HAS BOOTED FROM THE BACKUP JUNOS IMAGE      **
**                                                                   **
**  It is possible that the primary copy of JUNOS failed to boot up  **
**  properly, and so this device has booted from the backup copy.    **
**                                                                   **
**  Please re-install JUNOS to recover the primary copy in case      **
**  it has been corrupted.                                           **
**                                                                   **
***********************************************************************
When Junos is installed, the switch keeps two copies of the OS on separate partitions. The second copy is a backup, used when the primary was not unmounted cleanly at shutdown (for example, because the switch was simply powered off).
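If you collect console output from many switches, the alarm above is easy to check for in a script. The following is only a minimal sketch: the function name `has_backup_boot_alarm` is hypothetical, and it assumes you have already captured the text of `show chassis alarms` by some other means.

```python
def has_backup_boot_alarm(alarm_output: str) -> bool:
    """Return True if any line of the alarm output mentions a backup-root boot."""
    # The alarm text is taken from the console output shown in this article.
    return any(
        "Boot from backup root" in line
        for line in alarm_output.splitlines()
    )


# Sample text copied from the alarm shown earlier.
SAMPLE_ALARMS = """\
1 alarms currently active
Alarm time               Class  Description
2014-01-26 10:48:49 EST  Minor  Host 0 Boot from backup root
"""

if __name__ == "__main__":
    print(has_backup_boot_alarm(SAMPLE_ALARMS))
```

The check is a plain substring match, so it does not depend on the column spacing of the alarm table.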
To copy the backup image over the primary image (you must type this command in full; it will not tab-complete):
request system snapshot media internal slice alternate
Note that using this command will only repair the OS; it won't clear the alarm.
Verify with the command:
show system storage partitions
You will get output like this:
Boot Media: internal (da0)
Active Partition: da0s1a
Backup Partition: da0s2a
Currently booted from: backup (da0s2a)
Note the "Currently booted from: backup" line.
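The same check can be done in a script when you have the command output as text. This is a rough sketch rather than a complete parser: the function names are hypothetical, and it assumes the output uses the `Key: value` layout shown above.

```python
def parse_partitions(output: str) -> dict:
    """Collect 'Key: value' lines from `show system storage partitions` output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines with no colon, or with nothing after the colon.
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def booted_from_backup(output: str) -> bool:
    """Return True if the 'Currently booted from' field starts with 'backup'."""
    booted = parse_partitions(output).get("Currently booted from", "")
    return booted.split()[0] == "backup" if booted else False


# Sample text copied from the output shown above.
SAMPLE_PARTITIONS = """\
Boot Media: internal (da0)
Active Partition: da0s1a
Backup Partition: da0s2a
Currently booted from: backup (da0s2a)
"""

if __name__ == "__main__":
    print(parse_partitions(SAMPLE_PARTITIONS))
    print(booted_from_backup(SAMPLE_PARTITIONS))
```

Running the check again after the reboot described below should report that the switch is no longer booted from the backup.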
Once the snapshot is done, the switch must be rebooted to clear the alarm. Normally, a Juniper switch boots the last-known-good copy of the OS, so it must be forced to boot from the primary:
request system reboot slice alternate media internal in 0
SSH Issue
Sometimes, SSH will also fail after an improper shutdown. When trying to SSH to the switch, you will see this:
user@COREBOX-re0> ssh 192.168.1.2
ssh_exchange_identification: Connection closed by remote host
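The error above means the remote end closed the TCP connection before sending its SSH identification string. If you want to test for that condition without an SSH client, a small probe like the following can help. It is a sketch under assumptions: the function names and return labels are made up for illustration, and it only looks at the first bytes the server sends.

```python
import socket


def classify_banner(data: bytes) -> str:
    """Classify the first bytes received from a server on the SSH port."""
    if not data:
        # Empty read: the peer closed the connection without sending anything.
        return "closed-before-banner"
    if data.startswith(b"SSH-"):
        return "ok"
    # A server is allowed to send other lines before its version string,
    # so this label is a hint to look closer, not proof of a fault.
    return "unexpected-data"


def probe_ssh(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Connect to host:port and classify whatever the server sends first."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            return classify_banner(sock.recv(256))
    except ConnectionResetError:
        return "closed-before-banner"
    except OSError as exc:  # refused, unreachable, timed out, and so on
        return f"unreachable: {exc}"


if __name__ == "__main__":
    # 192.168.1.2 is the address used in the example above.
    print(probe_ssh("192.168.1.2"))
```

A result of `closed-before-banner` matches the symptom described here; `ok` means the SSH service is answering again.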
To fix this, console into the switch and do the following:
start shell user root
cd /var
mkdir empty
exit
Then you have two options: reboot the switch or restart SSH.
To restart SSH:
configure private
deactivate system services ssh
commit
rollback 1
commit
Full OS Reinstall
If the switch is powered off improperly often enough, the primary and backup images will both be marked bad, and you will see this on the console at boot:
U-Boot 1.1.6 (Apr 4 2013 - 10:30:53)
Board: EX2200-C-12T-2G 4.15
EPLD: Version 14 (0x00)
DRAM: Initializing (512MB)
Flash: 8 MB
Firmware Version:01.00.00
USB: scanning bus for devices... 3 USB Device(s) found
scanning bus for storage devices... 1 Storage Device(s) found
ELF file is 32 bit
Consoles: U-Boot console
FreeBSD/arm U-Boot loader, Revision 1.1
(builder@svl-junos-pool91.juniper.net, Tue Apr 5 00:15:22 UTC 2011)
Memory: 512MB
bootsequencing is disabled
new boot device = \
can't load '/kernel'
can't load '/kernel.old'
Press Enter to stop auto bootsequencing and to enter loader prompt.
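If you log serial console sessions, the two "can't load" lines are the part worth searching for. The sketch below is one way to do that; the function name `both_images_failed` is hypothetical, and it assumes the log is available as plain text with one loader message per line.

```python
def both_images_failed(console_log: str) -> bool:
    """Return True if the loader reported failing to load both kernels."""
    lines = [line.strip() for line in console_log.splitlines()]
    # Messages copied from the loader output shown above.
    return "can't load '/kernel'" in lines and "can't load '/kernel.old'" in lines


# Excerpt of the loader output shown above.
SAMPLE_BOOT_LOG = """\
bootsequencing is disabled
new boot device = \\
can't load '/kernel'
can't load '/kernel.old'
Press Enter to stop auto bootsequencing and to enter loader prompt.
"""

if __name__ == "__main__":
    print(both_images_failed(SAMPLE_BOOT_LOG))
```

Matching whole lines keeps the two messages distinct, so a log that contains only one of them is not reported as a double failure.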
To reinstall the OS:
- Copy the .tgz file for the OS to a FAT32-formatted USB memory key
- Power off the switch
- Insert the USB key into the switch
- Power on the switch
- Press Enter when you see the "Press Enter" prompt
- Run this command:
install --format file:///<the .tgz file>