Fixing A Juniper Switch That Was Shut Down Improperly

From Baranoski.ca
Revision as of 16:46, 26 February 2020 by Casey (talk | contribs)

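Juniper switches need to be shut down properly (for example with request system halt or request system power-off), not just powered off. Junos runs on FreeBSD, and its filesystems can be corrupted if power is cut while they are still mounted.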


OS Primary Partition Corruption

You will know when you have a switch that has been shut down improperly. There will be an amber light on the chassis, and this alarm on the console:

user@switch> show chassis alarms
1 alarms currently active
Alarm time              Class  Description
2014-01-26 10:48:49 EST Minor  Host 0 Boot from backup root

As well as this banner:

***********************************************************************
**                                                                   **
**  WARNING: THIS DEVICE HAS BOOTED FROM THE BACKUP JUNOS IMAGE      **
**                                                                   **
**  It is possible that the primary copy of JUNOS failed to boot up  **
**  properly, and so this device has booted from the backup copy.    **
**                                                                   **
**  Please re-install JUNOS to recover the primary copy in case      **
**  it has been corrupted.                                           **
**                                                                   **
***********************************************************************
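If you collect console output from several switches, the alarm can be picked out with a few lines of Python. This is a minimal sketch that only parses text in the format shown above. The names parse_alarms and booted_from_backup are hypothetical helpers, not part of any Juniper tool, and the row pattern assumes the alarm class is either Minor or Major.

```python
import re

# Matches one alarm row of `show chassis alarms`, for example:
# "2014-01-26 10:48:49 EST Minor  Host 0 Boot from backup root"
# Assumption: the class column is either "Minor" or "Major".
ALARM_ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+)\s+"
    r"(?P<cls>Minor|Major)\s+(?P<desc>.+)$"
)


def parse_alarms(text):
    """Return a list of (time, class, description) tuples found in text."""
    alarms = []
    for line in text.splitlines():
        match = ALARM_ROW.match(line.strip())
        if match:
            alarms.append(
                (match.group("time"), match.group("cls"), match.group("desc").strip())
            )
    return alarms


def booted_from_backup(text):
    """True if any parsed alarm reports a boot from the backup root."""
    return any("Boot from backup root" in desc for _, _, desc in parse_alarms(text))


# Sample taken from the alarm output shown above.
SAMPLE = """\
1 alarms currently active
Alarm time              Class  Description
2014-01-26 10:48:49 EST Minor  Host 0 Boot from backup root
"""

if __name__ == "__main__":
    print(booted_from_backup(SAMPLE))
```

The header row and the "alarms currently active" line do not match the pattern, so only real alarm rows are returned.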

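When Junos is installed, the switch keeps two copies of the OS, each on its own slice of the internal flash. The second copy is a backup that the switch boots from when the primary fails to boot, for example because its filesystem was not unmounted cleanly before the power was cut.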

To copy the running backup image over the primary image, run the following command. You must type it in full; it will not tab-complete:

request system snapshot media internal slice alternate

Note that using this command will only repair the OS; it won't clear the alarm.

Verify with the command:

show system storage partitions

You will get output like this:

Boot Media: internal (da0)
Active Partition: da0s1a
Backup Partition: da0s2a
Currently booted from: backup (da0s2a)

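Note the "Currently booted from: backup" line. The snapshot repairs the primary slice, but the switch keeps running from the backup slice until it is rebooted.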
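The same check can be scripted against captured output. The sketch below splits each "key: value" line of the output shown above into a dictionary and tests the "Currently booted from" entry. The function names are hypothetical, and it assumes the output keeps the four-line layout shown here.

```python
def parse_partitions(text):
    """Turn 'Key: value' lines into a dict, ignoring lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def booted_from_backup_slice(text):
    """True if the 'Currently booted from' line starts with 'backup'."""
    booted = parse_partitions(text).get("Currently booted from", "")
    return booted.startswith("backup")


# Sample taken from the output shown above.
SAMPLE = """\
Boot Media: internal (da0)
Active Partition: da0s1a
Backup Partition: da0s2a
Currently booted from: backup (da0s2a)
"""

if __name__ == "__main__":
    print(parse_partitions(SAMPLE))
    print(booted_from_backup_slice(SAMPLE))
```

A True result means the switch is still running from the backup slice and has not yet been rebooted onto the primary.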

Once the snapshot is done, the switch must be rebooted to clear the alarm. Left to itself, a Juniper boots the last-known-good copy of the OS, which at this point is the backup, so it must be told explicitly to boot from the primary:

request system reboot slice alternate media internal in 0

If the alarm is still active after the reboot, run the reboot command again. A second reboot sometimes clears it.

SSH Issue

Sometimes, SSH will also fail after an improper shutdown. When trying to SSH to the switch, you will see this:

user@COREBOX-re0> ssh 192.168.1.2
ssh_exchange_identification: Connection closed by remote host

To fix this, console into the switch and recreate the /var/empty directory, which the SSH daemon needs and which can go missing after an improper shutdown:

start shell user root
cd /var
mkdir empty
exit

Then you have two options: reboot the switch or restart SSH.

To restart SSH, deactivate the service and commit, then roll back to the previous configuration and commit again:

configure private
deactivate system services ssh
commit
rollback 1
commit


Full OS Reinstall

If the switch is powered off improperly enough times, the primary and backup images will both be marked bad, and you will see this on the console at boot:

U-Boot 1.1.6 (Apr  4 2013 - 10:30:53)

Board: EX2200-C-12T-2G 4.15
EPLD:  Version 14 (0x00)
DRAM:  Initializing (512MB)
Flash: 8 MB

Firmware Version:01.00.00
USB:   scanning bus for devices... 3 USB Device(s) found
       scanning bus for storage devices... 1 Storage Device(s) found

ELF file is 32 bit
Consoles: U-Boot console

FreeBSD/arm U-Boot loader, Revision 1.1
(builder@svl-junos-pool91.juniper.net, Tue Apr  5 00:15:22 UTC 2011)
Memory: 512MB
bootsequencing is disabled
new boot device =
\
can't load '/kernel'
can't load '/kernel.old'
Press Enter to stop auto bootsequencing and to enter loader prompt.


To reinstall the OS:

  • Copy the .tgz file for the OS to a FAT32-formatted USB memory key
  • Power off the switch
  • Insert the USB key into the switch
  • Power on the switch
  • Press Enter when you see the "Press Enter to stop auto bootsequencing" prompt, which drops you to the loader prompt
  • Run this command at the loader prompt:
install --format file:///<the .tgz file>
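The loader has no tab completion, so it can help to generate the exact command line ahead of time and paste it into the console. The sketch below does only that. loader_install_command is a hypothetical helper, the image name in the example is a placeholder, and it assumes the .tgz file sits in the top-level directory of the USB key.

```python
def loader_install_command(filename):
    """Build the loader 'install' line for a .tgz package on the USB key.

    Assumption: the file is in the top-level directory of the key, so the
    name must not contain a path separator or spaces.
    """
    name = filename.strip()
    if not name or "/" in name or " " in name:
        raise ValueError("expected a bare file name from the USB key's top level")
    if not name.endswith(".tgz"):
        raise ValueError("expected a .tgz Junos package")
    return "install --format file:///" + name


if __name__ == "__main__":
    # Placeholder name; substitute the actual image copied to the key.
    print(loader_install_command("example-junos-image.tgz"))
```

Rejecting names with slashes or spaces is a deliberate choice here: it catches pasted paths early instead of at the loader prompt.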