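Case Analysis of a Base Station Flapping Fault
Abstract: An IPRAN site experienced frequent device flapping during operation, causing repeated service interruptions at the four 3G base stations and three 4G base stations connected beneath it.
Background
On the morning of August 24, 2016, network optimization staff phoned the customer's maintenance personnel to report that several sites were repeatedly going offline and then recovering after a few minutes. Once the customer notified us, our engineers immediately contacted the maintenance personnel. To avoid affecting the customer's services and assessment KPIs, we decided to first shut down the ports of the base stations connected under this site.
Problem Description and Troubleshooting
1. Log in to the flapping device. Nodes that are part of a ring seldom suffer transient interruptions, but this cannot be ruled out. First check the transmit and receive optical power on the flapping node; if those are normal, check the optical power of the other nodes on the ring one by one. The usual cause is that the actual receive power at an optical port is outside its threshold range.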
zxr10#show perf-value interface gei_5/1
laser-basis-cur(mA) : laser-basis-min(mA) : laser-basis-max(mA) :
laser-temp-cur(deg) : laser-temp-min(deg) : laser-temp-max(deg) :
optical-send-cur(dBm) : optical-send-min(dBm) : optical-send-max(dBm) :
optical-rcv-cur(dBm) : optical-rcv-min(dBm) : optical-rcv-max(dBm) :
send-bandwidth-usage : rcv-bandwidth-usage :
zxr10#show perf-value interface gei_6/1
laser-basis-cur(mA) : laser-basis-min(mA) : laser-basis-max(mA) :
laser-temp-cur(deg) : laser-temp-min(deg) : laser-temp-max(deg) :
optical-send-cur(dBm) : optical-send-min(dBm) : optical-send-max(dBm) :
optical-rcv-cur(dBm) : optical-rcv-min(dBm) : optical-rcv-max(dBm) :
send-bandwidth-usage : rcv-bandwidth-usage :
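The optical power check in step 1 can be sketched in Python. This is a minimal sketch, assuming the `show perf-value` output has been captured as text containing `name(unit) : value` pairs. The function names, the sample values, and the -24 dBm to -3 dBm limits are all hypothetical; the real limits depend on the optical module installed.

```python
import re

# Matches "name(unit) : value" pairs such as "optical-rcv-cur(dBm) : -12.5".
PAIR_RE = re.compile(r"([\w-]+)\((\w+)\)\s*:\s*(-?\d+(?:\.\d+)?)")


def parse_perf_values(text):
    """Return {field_name: float} for every numeric pair found in the text."""
    return {name: float(value) for name, _unit, value in PAIR_RE.findall(text)}


def rx_power_ok(values, low_dbm, high_dbm):
    """True if the current receive power lies within [low_dbm, high_dbm]."""
    current = values["optical-rcv-cur"]
    return low_dbm <= current <= high_dbm


# Hypothetical sample values, not taken from the faulty site.
sample = (
    "optical-send-cur(dBm) : -5.2 optical-send-min(dBm) : -5.4 optical-send-max(dBm) : -5.1\n"
    "optical-rcv-cur(dBm) : -26.8 optical-rcv-min(dBm) : -27.0 optical-rcv-max(dBm) : -9.3\n"
)
values = parse_perf_values(sample)
# Hypothetical limits; take the real ones from the optical module datasheet.
print(rx_power_ok(values, low_dbm=-24.0, high_dbm=-3.0))
```

Running the same check against every node on the ring narrows the fault down to the span whose receive power falls outside the limits.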
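2. Check the device temperature and the fan status:
Excessive device temperature can cause a board to hang. Once the temperature falls back to a level the board can tolerate, the board comes back up, which shows up as flapping.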
zxr10#show version
ZXR10 Router Operating System Software, ZTE Corporation
ZXR10 ROS Version 6220 Software, Version ZXCTN 6220 RELEASE SOFTWARE
Copyright (c) 2001-2014 by ZTE Corporation Compiled Apr 24 2015, 18:46:39
System image files are flash:
System uptime is 244 day(s), 16 hour(s), 53 minute(s)
[SCCU, panel 1, master]
Main Processor: PowerPC MPC8544 Processor
System Memory : 1024M bytes, System FLASH : 128M bytes
System NVRAM: 8K bytes, System Temperature: 41℃
System Serial : 16383 , System BaudRate : 9600
System Clock: PTP stratum 3
Board Type: RSCCP2
Serial Number: 1152, Part Number:
Main CPLD1 Version: , Main CPLD2 Version:
CPU CPLD Version : N/A, PCI CPLD Version : N/A
FPGA Version: , Main PCB Number: 101001
[SCCU, panel 2, slave]
Main Processor: PowerPC MPC8544 Processor
System Memory : 1024M bytes , System FLASH: 128M bytes
System NVRAM: 8K bytes, System Temperature: 43℃
System Serial: 16383 , System BaudRate: 9600
System Clock : PTP stratum 3
Board Type : RSCCP2
Serial Number: 0138, Part Number:
Main CPLD1 Version: , Main CPLD2 Version:
CPU CPLD Version : N/A , PCI CPLD Version: N/A
FPGA Version: , Main PCB Number: 101001
[NPCI, panel 1]
Main Processor : W90N740 Processor
System Memory: 16M bytes , System Temperature: N/A
Board Type: R4EGC , Logic Board Type : R4EGC_ETH
Port Number : 4
Serial Number : 2008, Part Number :
Main CPLD Version :
FPGA Version : , Main PCB Number : 081001
[NPCI, panel 2]
Main Processor : MPC8321 Processor
System Memory: 128M bytes , System Temperature: 43℃
Board Type : R16E1F, Logic Board Type : (TDM+IMA)
Port Number : 16
Serial Number : 1105 , Part Number:
Main CPLD Version :
FPGA Version: , Main PCB Number : 080902
[NPCI, panel 5]
Main Processor : W90N740 Processor
System Memory: 16M bytes, System Temperature: 43℃
Board Type: R4EGC , Logic Board Type : R4EGC_ETH
Port Number: 4
Serial Number: 1680, Part Number :
Main CPLD Version :
FPGA Version : , Main PCB Number : 081001
[NPCI, panel 6]
Main Processor : W90N740 Processor
System Memory : 16M bytes , System Temperature: 43℃
Board Type : R4EGC, Logic Board Type : R4EGC_ETH
Port Number : 4
Serial Number: 0696 , Part Number :
Main CPLD Version :
FPGA Version : , Main PCB Number : 081001
zxr10#show fan-information
PCB Version : 100801 Logical Version :
Fan1 : Normal Work Fan2 : Normal Work Fan3 : Normal Work Fan4 : Normal Work
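The temperature and fan check in step 2 can be scripted once the command output has been captured as text. The following is a minimal sketch: the function names are hypothetical, and the 70 ℃ alarm threshold is an assumption for illustration, not a ZTE specification.

```python
import re

TEMP_RE = re.compile(r"System Temperature\s*:\s*(\d+)\s*℃")
FAN_RE = re.compile(r"(Fan\d+)\s*:\s*(.*?)(?=\s*Fan\d+\s*:|$)")


def board_temperatures(show_version_text):
    """Return every numeric board temperature in degrees Celsius (N/A is skipped)."""
    return [int(t) for t in TEMP_RE.findall(show_version_text)]


def abnormal_fans(fan_line):
    """Return the names of fans whose status is not 'Normal Work'."""
    return [name for name, status in FAN_RE.findall(fan_line)
            if status.strip() != "Normal Work"]


# Fragments taken from the show version and show fan-information output.
version_excerpt = (
    "System NVRAM: 8K bytes, System Temperature: 41℃ System Serial : 16383\n"
    "System Memory: 16M bytes , System Temperature: N/A Board Type: R4EGC\n"
    "System Memory: 128M bytes , System Temperature: 43℃ Board Type : R16E1F\n"
)
fan_line = "Fan1 : Normal Work Fan2 : Normal Work Fan3 : Normal Work Fan4 : Normal Work"

MAX_TEMP_C = 70  # hypothetical alarm threshold, not a ZTE specification
temps = board_temperatures(version_excerpt)
print(temps, all(t <= MAX_TEMP_C for t in temps), abnormal_fans(fan_line))
```

With the readings captured here (41 ℃ to 43 ℃, all four fans reporting Normal Work), neither check flags a problem, which suggests overheating was not the cause at the time of capture.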
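3. Check whether the port shows error packets. Once the errors on a port accumulate beyond a certain level, the device is very likely to flap.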
zxr10#show interface gei_5/1
gei_5/1 is up, line protocol is up, detect status is OK
Byname is none
Description is none
The port is optical
Duplex full
Smb ipg 12
Smb mcc bandwidth 10 Mbits
Mcc vlan id is 3024
Smb mcc priority 7
No loopback
BW 1000000 Kbits
Last clearing of \
120 seconds input rate: 2428214 Bps, 8820 pps
120 seconds output rate: 1571473 Bps, 7147 pps
Interface peak rate: inputBpsoutput 8806412 Bps
Interface utilization: input 1%, output 1%
Input:
Packets : Bytes: 220163
Unicasts : Multicasts :
Broadcasts : 2 LittUndersize: 0
Oversize : 0 CRC-ERROR : 259107
Dropped : 2869615 Fragments
Jabber : 0 MacRxErr
Errframe : 0 Disframe
UnicErrors : 0 MultiErrors
BroErrors : 0 GoodJumb : 07
64B : 0 65-127B : 826
128-255B : 83 256-511B : 887
512-1023B: 8 1024-Max : 54
AlignErrors : 0
Output:
Packets : Bytes : 3
Unicasts : Multicasts :
Broadcasts : 4 Collisions : 0
LateCollision: 0 ColliTimes : 0
: 0 : 0 : 2869615 : 0
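A single counter reading only shows that errors occurred at some point. Comparing two captures taken a few minutes apart shows whether they are still accumulating. The following is a minimal sketch of that comparison: the function names are hypothetical, and the second capture is an invented value used only to illustrate the arithmetic.

```python
import re

COUNTER_RE = re.compile(r"CRC-ERROR\s*:\s*(\d+)")


def crc_errors(show_interface_text):
    """Return the input CRC-ERROR counter, or None if it is absent."""
    match = COUNTER_RE.search(show_interface_text)
    return int(match.group(1)) if match else None


def crc_growth(earlier_text, later_text):
    """Return how much the CRC-ERROR counter grew between two captures."""
    before, after = crc_errors(earlier_text), crc_errors(later_text)
    if before is None or after is None:
        raise ValueError("CRC-ERROR counter not found in one of the captures")
    return after - before


first = "Oversize : 0 CRC-ERROR : 259107 Dropped : 2869615"   # from the gei_5/1 output
second = "Oversize : 0 CRC-ERROR : 259350 Dropped : 2869615"  # hypothetical later capture
print(crc_growth(first, second))
```

A positive result means CRC errors are still being received, which points at the fiber, the connectors, or the optical module on that link. A negative result would mean the counters were cleared between the two captures.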