Version: V3.12.10
ZTE CORPORATION
No. 55, Hi-tech Road South, ShenZhen, P.R.China
Postcode: 518057
Tel: +86-755-26771900
Fax: +86-755-26770801
URL: http://ensupport.zte.com.cn
E-mail: support@zte.com.cn
LEGAL INFORMATION
Copyright 2013 ZTE CORPORATION.
The contents of this document are protected by copyright laws and international treaties. Any reproduction or
distribution of this document or any portion of this document, in any form by any means, without the prior written
consent of ZTE CORPORATION is prohibited. Additionally, the contents of this document are protected by
contractual confidentiality obligations.
All company, brand and product names are trade or service marks, or registered trade or service marks, of ZTE
CORPORATION or of their respective owners.
This document is provided as is, and all express, implied, or statutory warranties, representations or conditions
are disclaimed, including without limitation any implied warranty of merchantability, fitness for a particular purpose,
title or non-infringement. ZTE CORPORATION and its licensors shall not be liable for damages resulting from the
use of or reliance on the information contained herein.
ZTE CORPORATION or its licensors may have current or pending intellectual property rights or applications
covering the subject matter of this document. Except as expressly provided in any written license between ZTE
CORPORATION and its licensee, the user of this document shall not acquire any license to the subject matter
herein.
ZTE CORPORATION reserves the right to upgrade or make technical change to this product without further notice.
Users may visit ZTE technical support website http://ensupport.zte.com.cn to inquire related information.
The ultimate right to interpret this product resides in ZTE CORPORATION.
Revision History
Figures
Tables
Glossary
Intended Audience
This manual is intended for:
• System engineers
• Maintenance engineers
Chapter Summary
2, Characteristics of System Normal Running: Describes the characteristics when the system runs normally.
Appendix A, Board Resetting and Switchover: Describes board resetting and switchover.
Related Documentation
The following documentation is related to this manual:
Conventions
This manual uses the following typographical conventions:
Run the command below to ping the RNC OMP address from the SBCX. For example, if the RNC OMP
address is 129.1.1.1:
#ping 129.1.1.1
• No board alarm appears in the alarm management interface of the EMS system, in particular no alarm
indicating that a control-plane communication link is broken.
• In the alarm management interface of the EMS system, all boards are in normal active/standby status.
• In the dynamic data management interface of the EMS system, all units, subunits, neighboring office
signaling points, No. 7 links, IMA groups, IMA links, PVCs, signaling links, AAL2 channels, SCTP
connections, AS & ASP, Node B ports, cells, and channels are in unblocked / activated / available status.
• In the dynamic data management interface of the EMS system, all base stations are unblocked.
2. When checking the working status of the board indicators on the RNC rack:
• No red indicator is ON on any board.
• For the two ROMB/RCB boards of each module, the Active indicator is ON on one MP and the Standby
indicator is ON on the other. The Run indicator flashes slowly (ON/OFF once every second; the same
applies hereinafter). The active/standby slots match what the EMS indicates.
• The Run indicators of the ROMB/RCB flash slowly.
• For the two ICMGs, the Active indicator is ON on one ICMG and the Standby indicator is ON on the
other. The Run indicator flashes slowly. The reference selection indicator and the trace indicator are
always ON.
• The Run indicator on the DTB flashes slowly. The E1 indicator flashes slowly at 1 Hz.
If the indicator of a port configured with E1 flashes slowly, the E1 is correctly connected; if the
indicator is always ON, data is configured but the E1 is not connected; if the indicator is OFF, no data
is configured.
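The three E1 indicator states described above can be captured in a small lookup, for example in a checklist script used during inspection. The following is a minimal sketch; the state names (SLOW_FLASH, SOLID_ON, OFF) and the function name are illustrative assumptions, not part of any ZTE tool.

```python
from enum import Enum


class E1Indicator(Enum):
    """Observed state of a DTB E1 port indicator (names are hypothetical)."""
    SLOW_FLASH = "slow_flash"  # flashing at about 1 Hz
    SOLID_ON = "solid_on"
    OFF = "off"


# Meaning of each state, taken from the paragraph above.
E1_STATUS = {
    E1Indicator.SLOW_FLASH: "E1 configured and correctly connected",
    E1Indicator.SOLID_ON: "data configured but E1 not connected",
    E1Indicator.OFF: "no data configured",
}


def describe_e1_port(state: E1Indicator) -> str:
    """Return the meaning of an E1 port indicator state."""
    return E1_STATUS[state]


print(describe_e1_port(E1Indicator.SOLID_ON))
```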
Steps
1. Go to the cabinet immediately and check the power supply. If the power failure affects a large area,
ask the power supply maintenance personnel to restore the power supply. Switch off the power supply of
the cabinets one by one, and power them on again after the power supply is stable.
2. If the external power supply is normal, review the user complaints and then observe the call status
of all offices from the performance statistics console. Determine whether the fault affects all offices
or only some of them. If only some offices are affected, contact the personnel at those offices, check
the interface state and link state to narrow down the fault range, and determine whether the fault is in
the local office. If not, handle it with the peer office. If so, go to Step 3.
3. Check whether the indicator status on the hardware boards is normal. Check whether the physical
connections and links with other network elements are normal. If so, contact the maintenance personnel
of the other network element for troubleshooting, or find the possible source by referring to the
emergency maintenance manual of that network element.
4. If there is no obvious hardware fault on the boards, check whether the software or the data has a
problem. Check the OMC client alarm information for board or link abnormality alarms. If everything is
normal, check whether the radio resource cell status is normal and whether the physical connections and
links with other network elements are normal. Try to recover quickly: check the operation logs to
determine whether the system is down because data was modified or deleted by mistake (compare the MML
operation logs with the alarm time to judge whether the operation and the fault are related). If so,
recover the data.
5. If everything is normal, contact the personnel of the other network elements (such as Node B or CN)
for troubleshooting, or find the possible source by referring to the emergency maintenance manual of
those network elements.
End of Steps
Follow-Up Action
Caution!
When a fault occurs, it is important to locate it quickly, and in particular to determine whether it lies
in the local office or in another office. This is essential for fast troubleshooting.
Caution!
The abnormality record is very useful for emergency aid and for subsequent problem analysis and
summary. Therefore, be sure to fill in a complete abnormality record.
Locate and analyze the fault based on the following three aspects:
1. Service faults often begin with user complaints, so register the number of the complaining user.
Use the appropriate tools at the radio and CN sides to identify the base station that serves the
complaining user, and then locate and analyze the fault.
• Use signaling trace and probes to find the CN, RNC, or Node B that serves the complaining user, and
so determine the fault-related equipment.
• If you cannot determine the location of the complaining user at the RNC side, seek help from the CN
side.
2. Determine the fault scope by analyzing KPIs.
• Query the relevant KPIs to determine which base stations are affected by the fault.
• Determine whether it is a global fault based on the faulty base stations.
• Determine whether the fault is associated with a module or a specific board based on the faulty base
stations.
3. Test arrangement.
If possible, arrange a test in the specific area to provide more accurate information for emergency
maintenance.
In addition, provide the on-site fault record table wherever possible so that ZTE maintenance
personnel can understand and locate the fault more easily.
After the maintenance experts arrive at the site, they take emergency maintenance actions to recover
communication as soon as possible.
Caution!
Board handover, reset, and replacement may greatly affect system operation. Therefore, refer to
Appendix E and Appendix A of ZXWR RNC (V3.07.310) Radio Network Controller Trouble Shooting before
performing these operations.
Record the current status before any board handover or physical location change.
Record each step and symptom that occurs during service recovery on site.
1. Brief notice
The operator writes a brief notice that includes the fault occurrence time, fault properties, fault
symptoms, and detailed troubleshooting steps. If the fault cannot be removed, describe the handling
steps in detail to speed up future troubleshooting.
Save the log files on ZXWR RNC SBCX with the file manager.
3. Alarm information
Collect the history alarms from thirty minutes before to thirty minutes after the fault. Maintenance
personnel can query and save them in the alarm browsing window.
4. Command log information
Collect the command log information from thirty minutes before to thirty minutes after the fault.
Maintenance personnel can query the operation log, security log, and system log in the log management
subsystem of the EMS.
5. RNC abnormality log
When a fault occurs, export all log files under the directory /IDE0/ExcInfo, and Exc_omp.txt and
Exc_pp.txt under the directory /DOC0, from the active and standby OMP boards. You can then analyze
the fault according to these log files.
Caution!
The log files from the active and standby OMP boards must be saved separately to
avoid overwriting.
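The alarm and command log items above both use the same collection window: thirty minutes before to thirty minutes after the fault. The following minimal sketch computes that window and filters record timestamps against it. The function names and the sample fault time are illustrative assumptions, and treating the window boundaries as inclusive is also an assumption.

```python
from datetime import datetime, timedelta

# Half-width of the collection window stated in the manual.
WINDOW = timedelta(minutes=30)


def collection_window(fault_time: datetime):
    """Return (start, end) of the window around the fault time."""
    return fault_time - WINDOW, fault_time + WINDOW


def in_window(record_time: datetime, fault_time: datetime) -> bool:
    """True if a log or alarm record falls inside the collection window."""
    start, end = collection_window(fault_time)
    return start <= record_time <= end


# Hypothetical fault time, for illustration only.
fault = datetime(2013, 1, 1, 12, 0)
start, end = collection_window(fault)
print(start.isoformat(), end.isoformat())
```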
If the steps above do not resolve the fault, refer to Emergency Aid.
Steps
1. Before the power supply in the equipment room recovers, to prevent accidents, switch off all
switches on the cabinet power distribution subrack that connects to the external power supply system.
Keep the dual-path power supply on the rack in the OFF state.
a. Power on the power distribution subrack. Check whether the power supply voltage is within the
normal range: -57 V to -40 V.
b. Power on the dual-path power supply on the racks. Check whether the power supply voltage is within
the normal range: -57 V to -40 V.
c. Restore the power supply of the network cabinet and server cabinet. Start the EMS server, charging
dual-machine server, and disk machine.
End of Steps
End of Steps
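The normal voltage range quoted in the power-on steps above (-57 V to -40 V) can be checked with a one-line comparison when measurements are recorded in a script or spreadsheet export. This is a minimal sketch; the function name is hypothetical, and treating both limits as inclusive is an assumption, since the manual does not say how to treat a reading exactly on the boundary.

```python
# Normal range of the DC supply voltage, as stated in the manual.
NORMAL_MIN_V = -57.0
NORMAL_MAX_V = -40.0


def voltage_in_normal_range(measured_v: float) -> bool:
    """True if the measured supply voltage lies within -57 V to -40 V.

    Both limits are treated as inclusive (an assumption).
    """
    return NORMAL_MIN_V <= measured_v <= NORMAL_MAX_V


for reading in (-48.0, -39.5, -58.2):  # hypothetical readings
    state = "normal" if voltage_in_normal_range(reading) else "out of range"
    print(f"{reading} V: {state}")
```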
Steps
1. Power on OMC server and OMC client.
Power on the OMC server and OMC client. Start the related service programs. If the connection is
normal, the login main interface pops up at the client. After a successful login, the client can
operate normally.
2. Power on ZXWR RNC rack, shelf, and board.
ZXWR RNC power-on order: power on the master control shelf first and then the other shelves. Power on
and start ZXWR RNC. Observe the RUN status to check whether the system starts normally (no alarm, and
RUN flashes slowly). After making sure that the OMC server has started normally, check the following
items:
• Use fault management to check for communication faults, for example, whether the communication
between modules is normal. If there is a fault, handle it.
• Use fault management to check whether any signaling link in the office reports errors. If so, handle
it.
• Perform a basic service test. Use signaling tracing and failure observation to make sure that the
service has recovered.
• Check that alarms and performance statistics are reported to the upper-level EMS normally.
End of Steps
3. Processing unit: RCB and RUB, which process the upper layer protocols of ZXWR
RNC control plane and user plane.
Generally, the alarm function of the OMC client and the indicator flashing status of the ZXWR RNC
boards help identify the failed board and the cause.
1. Log on to the EMS client and click Tool > Alarm Management. Check the OMC alarm function of ZXWR
RNC EMS to see whether there is any board alarm and, if so, the type of the alarming board.
2. Observe the other indicators of the board.
The following are examples of common indicator states.
a. Check ENUM on the board. In normal cases, it is solid OFF. If the indicator is solid ON or flashes,
the board is not seated properly. Unplug the board, plug it in again, and observe the status again.
b. If the RUN indicator flashes slowly (1 time/s) and ALM is solid OFF, the board is running normally.
If other indicators flash, the board is not running normally. If RUN is solid OFF, the board has
failed its self-test. If both RUN and ALM flash slowly (1 time/s), the board is undergoing an
active/standby changeover. Wait for a while to see whether the board returns to its normal status.
c. Check ACT on the board. If it is solid ON, the board is the active board; if it is solid OFF, the
board is the standby one. This indicator helps locate active/standby changeover failures.
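The indicator rules in items a to c above can be written as a small decision function, which may help when recording board states during an inspection. This is a sketch under stated assumptions: the state strings ("off", "on", "slow_flash") and function names are hypothetical, and the order in which the rules are evaluated (ENUM first, then RUN/ALM) is an assumption, since the manual lists the rules without a priority.

```python
def diagnose_board(enum: str, run: str, alm: str) -> str:
    """Interpret ENUM, RUN and ALM indicator states of a board.

    Each argument is one of "off", "on", "slow_flash" (hypothetical labels).
    """
    if enum != "off":
        # ENUM solid ON or flashing: board is not in position.
        return "board not seated: unplug and re-insert it"
    if run == "off":
        return "self-test failed"
    if run == "slow_flash" and alm == "slow_flash":
        return "active/standby changeover in progress"
    if run == "slow_flash" and alm == "off":
        return "running normally"
    return "not running normally"


def board_role(act: str) -> str:
    """Interpret the ACT indicator: solid ON is active, solid OFF is standby."""
    if act == "on":
        return "active"
    if act == "off":
        return "standby"
    return "unknown"


print(diagnose_board("off", "slow_flash", "off"), "/", board_role("on"))
```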
Proposals for handling such faults:
1. The alarm management information of ZXWR RNC EMS generally indicates the alarm cause and the
recommended operation to clear the alarm. Perform the related operations according to this
information.
2. Wait for the ZXWR RNC board to return to its normal status, and observe whether the user service
returns to normal.
If the indicators flash abnormally for a long time while the board is running and the alarm persists,
try the following operations:
1. Save the alarm information.
2. Reset or replace the board that raised the alarm.
Caution!
Resetting ZXWR RNC boards may have a huge impact on services. For example, if you reset a RUB, all
cells and user services on this board must be re-created; if you reset an interface board, all bearers
allocated on this board must be re-created. Therefore, proceed with caution.
In the SDH transport network shown in Figure 4-2, REG is the regenerative repeater, ADM
is the add/drop multiplexer, DXC is the digital cross-connection equipment, and TM is the
terminal multiplexer.
Figure 4-2 Network Location and Alarm Structure of ESDTI, ESDTG, and ESDTT
The optical interface board terminates the regenerator section overhead between the REG and the
optical interface board. Alarms for maintaining the regenerator section include: LOS (Loss Of Signal),
LOF (Loss Of Frame), and RS-TIM (Regenerator Section - Trace Identifier Mismatch).
The optical interface board terminates the multiplex section overhead between the nearest ADM and the
optical interface board. Alarms for maintaining the multiplex section include: MS-AIS (Multiplex
Section - Alarm Indication Signal), MS-FERF (Multiplex Section - Far End Receive Failure), SF (Signal
Failure), and SD (Signal Degrade).
The optical interface board terminates the higher order path overhead between the DXC and the optical
interface board. Alarms for maintaining the higher order path include: AU-AIS (Administrative Unit -
Alarm Indication Signal), AU-LOP (Administrative Unit - Loss Of Pointer), HP-TIM (Higher-order Path -
Trace Identifier Mismatch), HP-UNEQ (Higher-order Path - UN-Equipped), HP-PLM (Higher-order Path -
Payload Label Mismatch), HP-FERF (Higher-order Path - Far End Receive Failure), and LOM (Loss Of
Multiframe).
The optical interface board terminates the lower order path overhead between the TM and the optical
interface board. Alarms for maintaining the lower order path include: TU-AIS (Tributary Unit - Alarm
Indication Signal), TU-LOP (Tributary Unit - Loss Of Pointer), LP-RDI (Lower-order Path - Remote
Defect Indication), LP-RFI (Lower-order Path - Remote Failure Indication), LP-TIM (Lower-order Path -
Trace Identifier Mismatch), LP-UNEQ (Lower-order Path - UN-Equipped), and LP-PLM (Lower-order Path -
Payload Label Mismatch).
The optical interface board terminates the E1 circuit overhead between the opposite switch and the
optical interface board. Alarms for maintaining the E1 circuit include: E1-AIS (E1 Alarm Indication
Signal), E1-LOF (E1 Loss Of Frame), E1-LOM (E1 Loss Of Multiframe), E1-RAI (E1 Remote Alarm
Indicator), E1-FEBE (E1 Far End Block Error), and E1-SLIP.
Detect the media plane of the RNC, between boards and between shelves, to check whether there is any
packet loss resulting from a hardware fault in the RNC.
• Handle transmission alarms
Remove the alarms for packet loss at the bottom bearer layer caused by abnormal or unstable
transmission.
• Capture packets
The cause is a connection fault between the opposite exchange and the SDH transport device, such as
an E1 cable connection fault.
• RS-TIM, HP-TIM, LP-TIM
The cause is that the local values of J0, J1, and J2 are inconsistent with the configuration of the
SDH transport device. Alarms of these three types do not affect services.
To clear the alarms, obtain the J0, J1, and J2 values of the transport device by querying the
configuration of the opposite end, and then modify the values in the database.
• E1-SLIP
If E1-SLIP occurs while the board is running normally, the cause is a clock fault.
probable symptom is: Services borne on E1 1, 4, 7, 10, 13, 16, 19, 23, 26, 29, 32, 35, 38, 41, 45, 48,
51, 54, 57, 60, 63 are normal, and services on the other E1s are disconnected.
To confirm whether the array modes are the same, insert an E1-AIS alarm on E1 2 at the opposite end.
If E1 22 detects the alarm, the array modes at the two interconnected ends are different.
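The E1 numbers quoted above follow a recognizable pattern. This excerpt does not define the two array modes, so the sketch below rests on an assumption: that they correspond to the two common ways of numbering the 63 TU-12s of a VC-4 from the position (K, L, M) = (TUG-3, TUG-2, TU-12), namely sequential, n = 21(K-1) + 3(L-1) + M, and interleaved, n = K + 3(L-1) + 21(M-1). Under that assumption, the E1s whose number is the same in both modes are exactly those listed as normal, and E1 2 in one mode maps to E1 22 in the other. The function name is hypothetical.

```python
def sequential_to_interleaved(n: int) -> int:
    """Map a sequentially numbered E1 (1..63) to its interleaved number.

    Assumes the K/L/M numbering formulas given in the lead-in above.
    """
    if not 1 <= n <= 63:
        raise ValueError("E1 number must be in 1..63")
    k, rest = divmod(n - 1, 21)  # zero-based TUG-3 index (0..2)
    l, m = divmod(rest, 3)       # zero-based TUG-2 (0..6) and TU-12 (0..2) indices
    return (k + 1) + 3 * l + 21 * m


# E1s that carry the same number in both modes keep working
# even when the two ends use different array modes.
unaffected = [n for n in range(1, 64) if sequential_to_interleaved(n) == n]
print(unaffected)
print(sequential_to_interleaved(2))
```

If both ends used the same array mode, every E1 would map to itself and the AIS inserted on E1 2 would be detected on E1 2.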
Symptom: All CS and PS services in the whole network are blocked.
Possible causes: Power failure; CN-side failure.
Handling: Check the power supply. Check the CN side.

Symptom: All CS and PS services in a single RNC are blocked.
Possible causes: APBE fault; incorrect configurations corresponding to the office at the CN side.
Handling: Check the board and replace it if necessary. Modify office direction configurations.

Symptom: All CS services in a single RNC are blocked.
Possible causes: APBE fault; SS7 link fault.
Handling: Check the board and replace it if necessary. Check SS7 configurations.

Symptom: All PS services in a single RNC are blocked.
Possible causes: APBE fault; SS7 link fault.
Handling: Check the board and replace it if necessary. Check SS7 configurations.

Symptom: All services of a resource shelf are blocked.
Possible causes: UIM fault; GLI fiber fault; CHUB connection fault.
Handling: Check the UIM and replace it if necessary. Check the GLI fiber and the GLI port. Check the
CHUB connection and the CHUB port.

Symptom: All services of a CMP module are blocked.
Possible causes: RCB fault.
Handling: Switch over the RCB. Replace the failed RCB.

Symptom: All services of an IMA are blocked.
Possible causes: IMA fault; media plane fault.
Handling: Check the IMA and replace it if necessary. Take further measures as required according to
the media plane test.

Symptom: All services of a DTA/DTI are blocked.
Possible causes: DTA/DTI fault; RDTA fault.
Handling: Check the DTA/DTI and replace it if necessary. Check the RDTA and replace it if necessary.

Symptom: All services of a Node B are blocked.
Possible causes: IMA group fault; Node B fault.
Handling: Check the IMA group and analyze the symptoms. Check the Node B.

Symptom: All services of a cell are blocked.
Possible causes: Incorrect cell configurations; manual blocking.
Handling: Check cell configurations. Unblock the cell.
1. Many calls cannot be connected, or the Internet cannot be accessed and the terminal cannot be
activated.
2. Check the alarms on the EMS alarm management interface to see whether there is any office direction
unreachable alarm and whether the alarm occurs in all RNCs. If so, the fault lies in the CN. If the
fault occurs in only one or several RNCs, it is possibly caused by RNC-side problems.
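The rule above for telling a CN-side fault from an RNC-side fault reduces to a set comparison between the RNCs that report the office direction unreachable alarm and all RNCs in the network. This is a minimal sketch; the function name and RNC identifiers are hypothetical, and the result is only a first indication, as the manual itself says "possibly".

```python
def locate_fault_side(all_rncs: set, rncs_with_alarm: set) -> str:
    """Suggest which side to check first, based on alarm spread."""
    if not rncs_with_alarm:
        return "no office direction unreachable alarm"
    if rncs_with_alarm >= all_rncs:
        # Every RNC reports the alarm: the manual points to the CN.
        return "CN"
    # Only one or several RNCs report it: possibly an RNC-side problem.
    return "RNC"


rncs = {"RNC1", "RNC2", "RNC3"}  # hypothetical network
print(locate_fault_side(rncs, {"RNC2"}))
print(locate_fault_side(rncs, rncs))
```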
Recommended Solutions
1. Check whether all tables were synchronized after data modifications to the whole network or a
single RNC. If so, recover the data.
2. Check whether there is any alarm about inaccessible calls or unreachable signaling in all RNCs. If
so, check the CN side.
3. Check whether SSCOP links are frequently established and disconnected (the messages are BGN and
END). Make sure that the PVC bandwidth and the PVC type are identical on both sides of the Iu
interface.
4. Check the optical interface indicator of the RNC interface board. If the SD indicator is off, check
whether the fiber connection is correct. If it is, reset or replace the APBE and the interface board.
If the SD indicator is still off, check the CN side.
5. If the SD indicator is on, replace the interface board. If the problem still exists, check the CN
side.
Recommended Solutions
1. If the clock reference lost alarm occurs on the clock board, check whether the clock output
connection on the RGIM is correct and whether the connection is loose.
2. Perform an active/standby changeover of the interface board or the optical interface.
3. If the alarm still exists after step 2, perform an active/standby changeover of the CLK clock
board.
4. If the alarm remains after the above three steps, replace the rear board of the CLK clock board and
replace the RGIM.
5. If the resource shelf reports the 16M clock driving alarm, take the following measures:
a. Check the clock cables on the rear board of the UIM to see whether they are connected correctly and
whether any connection is loose.
b. Perform an active/standby changeover of the UIM, so that the driving clock is provided by the
standby UIM.
c. Replace the UIM, or replace the board whose driving clock fails.
1. If no call can be connected in many RNCs or throughout the network, the problem lies in the CN
side. If the failure occurs only in some areas, the problem lies in the RNC.
2. Check the SS7 link and the AAL2 channel (Iu office direction) through the background dynamic
management interface to see whether they are in normal condition.
3. Check whether the APBE operates normally. Check the background alarm management interface to see
whether there is any APBE fault alarm.
4. Check the background alarm management interface to see whether there are many alarms about failed
common channels or out-of-service cells.
5. Check whether the cells in which no call can be connected belong to the same interface board or
RCP.
6. Check whether call failures occur regularly. If one call fails in every several calls, it is
possible that one of the AAL2 channels at the Iu interface has failed.
Recommended Solutions
1. Check whether the RNC data configuration was modified before the failure occurred. If so, recover
the configuration by importing the backup data.
2. Check the SS7 link. If it is abnormal, handle it by following the criteria to analyze RNC fault
coverage.
3. Reset or replace the interface board.
4. If step 3 does not work, perform an active/standby changeover between No.3 and No.4 module, setting
the active module to the standby board.
5. Reset the interface board to which the failed cell belongs.
Fault Analysis
1. When one or both parties cannot be heard in a speech call, replace the UE first, and then make a
test call in the same environment. If the fault no longer occurs, the problem probably lies in the UE.
2. If one-way audio still occurs after repeated tests with different brands of UEs, the problem
possibly lies in the system.
3. Use two UEs to make a test call, and do an uplink loopback test and a downlink loopback test on the
calling party or the called party in the signaling tracing system. If you can hear your voice from the
calling UE during the uplink loopback test, there is no problem from the UE to the RNC, and the
problem possibly lies in the interface board or the CN side. If not, the problem possibly lies in the
user plane or the Iub interface.
Recommended Solutions
1. Check whether a global data modification was made before the failure occurred. If so, restore the
pre-modification data.
2. Replace the UE. If the failure no longer occurs, the problem lies in the UE. Report it to the UE
maker for a solution.
3. Reset the APBE (Iu interface board).
4. If the fault still exists after step 3, reset the RUB where the services are borne. (To find the
RUB, enter the command UcpmcGetInstNo IMSI in the RDS to get the instance number, and then enter the
command UcpmcShow InstNo, 3 (InstNo is the instance number) to find the slot of the RUB corresponding
to the instance number.)
5. Reset the IMA/APBI/DTA to which the failed cell belongs.
6. If the fault still exists, reset
7. If the problem remains after all these steps, contact personnel at the CN side for
troubleshooting.
2. Run a packet transmission test to the UE by using the tool in the signaling tracing system. If the
UE downloads data at a normal rate during the test, there is no problem from the UE to the RNC user
plane.
3. Run a ping packet test. If no problem is found during the test, the problem possibly lies in the Iu
interface, or in the IP packet limitation set at the CE/CN side.
4. Replace the UE. If the download and webpage access failures no longer occur, the problem lies in
the UE. Contact the UE maker for a solution.
Figure 4-7 describes how to analyze download and webpage access failures after PS services are
activated.
Figure 4-7 Analyzing Download and Webpage Access Failures after Activating PS
Services
Recommended Solutions
1. Check whether the data configuration was modified before the failure occurred. If so, recover the
configuration by importing the backup data.
2. Reset the GIPI, which segments and reassembles packets. If the failure still exists, replace the
interface board.
3. If the failure remains, perform an active/standby changeover of the UIM.
4. If the changeover does not work, reset the RUB where the PS service is established.
5. If the failure remains after all these resets, ask the personnel at the CE and CN sides to check
whether the problem is caused by the MTU packet limitation.
Fault Analysis
1. Check the EMS to see whether large-scale cell outages occur in all RNCs and whether all
transmission-related boards generate alarms. If so, the problem probably lies in transmission.
2. Check the alarms on the EMS alarm management interface. If the interface board generates many
E1/IMA/SCTP link alarms, the cell outage is possibly caused by transmission-related problems. For IP
transmission, check whether there is any MAC address or IP address conflict.
3. If there are cell outage alarms but no interface board transmission failure alarms in the EMS
system, the problem may be caused by an RCP failure.
4. If cell outages occur only on several interface boards, the problem possibly lies in the Iub
interface board.
Figure 4-8 describes how to analyze large-scale cell outages.
Recommended Solutions
1. Check whether a global parameter modification was made before the failure occurred. If so, recover
the configuration by importing the backup data.
2. If all out-of-service cells belong to the same module and the transmission interface board
generates no alarms, perform an active/standby changeover of the home RCB module.
3. If all out-of-service cells belong to the same resource shelf and the transmission interface board
generates no alarms, perform an active/standby changeover of the UIMU/GUIM/GUIM2.
4. If all cells that belong to an interface board are out of service, reset or replace the
APBE/SDTA.
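Solutions 2 to 4 above depend on how the out-of-service cells are grouped by module, resource shelf, and interface board. The following minimal sketch applies those three rules to a cell inventory. The Cell record, its field names, and the sample data are hypothetical; the function returns every suggestion whose condition holds, in the order the manual lists them, rather than choosing one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    module: str  # home RCB module
    shelf: str   # resource shelf
    board: str   # interface board


def suggest_actions(cells, out_of_service, transmission_alarms):
    """Return the recovery suggestions whose conditions are met."""
    down = [c for c in cells if c.name in out_of_service]
    actions = []
    if not down:
        return actions
    if not transmission_alarms:
        if len({c.module for c in down}) == 1:
            actions.append("changeover: home RCB module")
        if len({c.shelf for c in down}) == 1:
            actions.append("changeover: UIMU/GUIM/GUIM2")
    for board in sorted({c.board for c in down}):
        # Rule 4 applies only if every cell on the board is out of service.
        if all(c.name in out_of_service for c in cells if c.board == board):
            actions.append(f"reset or replace: interface board {board}")
    return actions


inventory = [  # hypothetical cells
    Cell("c1", "m1", "s1", "b1"),
    Cell("c2", "m1", "s1", "b1"),
    Cell("c3", "m2", "s1", "b2"),
]
print(suggest_actions(inventory, {"c1", "c2"}, transmission_alarms=False))
```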
Figure 4-9 Analyzing Absence of Cell Signals and Low Success Rate of RRC
Establishments
Fault Analysis
1. Check the EMS interface to see if there are QoS alarms about the success rate of
RRC establishments. If so, it means that the current common transmission channels
are established successfully and the UE has initiated RRC establishments.
2. Check the EMS alarm management interface to see if there are notifications about
system message update failure. If so, it means that broadcast messages cannot be
delivered and the UE cannot access the network correctly due to the update failure.
3. Connect an LMT to the site to see if the BCH packet transmission increases normally.
If not, it means that the Node B fails to deliver broadcast messages.
4. Conduct ALCAP and FP signalling tracing through RNC or LMT signalling tracing
to see if the transmission allocation and the FP synchronization fail during RRC
establishments.
Recommended Solutions
1. Check to see if a global parameter modification is made before the failure occurs. If
so, recover the configuration by importing the backup data.
2. If there are notifications about system message update failure, modify the SIB1 value
of the cell and trigger the system message once to refresh the updating process.
3. If the Node B fails to deliver broadcasts, or if the transmission allocation and FP
synchronization fail, block and unblock the cell.
4. If all these steps do not work, reset the Node B.
1. On OMC unified UMS client, check whether the cell establishment is normal.
2. Through Node B LMT, check whether the cell establishment is normal.
3. The abnormal activities take place in one or more cells, and all the activities in this
cell are abnormal or have a quite low success rate, while radio processes originated
in other cells run normally.
1. A ZXWR RNC or CN failure is unlikely if the radio processes originating in other cells run
normally.
2. Check whether the cell is in block status.
3. Reset the cell.
4. Wait for the system to recover, and then check whether the fault still exists.
5. Check whether the Node B transceiver antenna is properly connected and whether the power amplifier
is normal.
Caution!
The radio resource data is based on factors such as the on-site call model and the local terrain,
combined with network planning and optimization, so do not modify it. If the parameters must be
adjusted, make a proper data backup beforehand.
Figure 4-10 Analyzing OMM and NetNumen U31 Abnormality and Interruption
Handling Steps
1. Check to see if the communication between the Client and the Server is normal.
a. Ping the IP addresses of the Client and the Server to see whether the communication is normal.
If the IP address can be pinged, but the packet loss rate is high and the network is intermittent,
check whether another computer has the same IP address, whether the DHCP function is enabled without
authorization on any computer in the internal network, and whether the physical connection of all NEs
is correct.
b. If the IP address cannot be pinged, check the physical connection between the Client and the
Server for abnormality.
If the Server and the Client are not in the same subnetwork, use the command netstat -r to check
whether the Server and the Client can communicate through the router. If not, add a route by running
this command: route add xx.xx.xx.xx (network IP address) -netmask xx.xx.xx.xx (subnet mask)
xx.xx.xx.xx (gateway IP address); for example:
#route add 192.168.0.0 -netmask 255.255.255.0 10.11.201.254
This command adds a route to the 192.168.0.0 network segment, with 10.11.201.254 as the gateway IP
address. Routes added in this way are lost when the operating system is restarted. Therefore, write
the route configuration command in the startup script; for example, at the end of the /etc/rc3 file.
c. Check to see if the router is configured correctly.
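Step b above involves two checks that are easy to get wrong by hand: deciding whether the Server and the Client are in the same subnetwork, and typing a well-formed route add command. The sketch below uses the Python standard library ipaddress module for both. The function names are hypothetical; the command string follows the syntax shown in the manual's example and is only built, not executed.

```python
import ipaddress


def same_subnet(server_ip: str, client_ip: str, netmask: str) -> bool:
    """True if the client address lies in the server's subnetwork."""
    server_net = ipaddress.ip_network(f"{server_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(client_ip) in server_net


def build_route_add(network: str, netmask: str, gateway: str) -> str:
    """Build the route add command in the form used in this manual.

    Raises ValueError if the network address has host bits set or if
    any address is malformed.
    """
    ipaddress.ip_network(f"{network}/{netmask}")  # strict validation
    ipaddress.ip_address(gateway)
    return f"route add {network} -netmask {netmask} {gateway}"


print(same_subnet("192.168.0.10", "10.11.201.5", "255.255.255.0"))
print(build_route_add("192.168.0.0", "255.255.255.0", "10.11.201.254"))
```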
2. Use another Client to log in to the Server. If the login succeeds, the problem lies in the Client;
if the login fails, the problem lies in the Server.
3. Make sure that there is enough space on the system disk and on the disk where the Client software
is installed. If there is not enough space, delete unnecessary files to free up space.
4. Check whether the Client is infected by a virus and whether the operating system runs normally.
a. Use the latest virus definitions to remove the virus.
b. If the operating system does not run normally, reinstall it or install the Client software on
another computer.
5. Check whether the Server hard disk is full.
a. Run the command df -k to check whether any hard disk partition on the Server is fully occupied.
b. If so, use the command rm to delete unnecessary files, such as redundant log files, or use the
command rm -r to delete unneeded folders to free up space. For example, to delete log files under the
directory $OMCHOME/log, run the following commands:
$rm 123.txt
$rm -r 1234
6. Check to see if the Server process is normal.
Run the command ps -u womcr to check the real-time OMM process. For
example, the following shows a normal OMM process:
bash-3.00$ ps -u gomcr
PID TTY TIME CMD
5844 ? 00:00:00 run-linux.sh
5851 ? 00:00:00 ftpserver-linux
5855 ? 00:18:44 java
5872 ? 00:00:00 java
c. If there is no OMM log output, use the command ps -ef | grep java to check
whether any OMM java process is running. Run the command kill -9 to
quit the java process. For example, if the java process number is 1209, run the
following command to quit it:
kill -9 1209
Note:
There may be more than one java process in the system. Kill them all.
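Because several java processes may be running, it can help to list all of their process numbers before killing any of them. The following is a minimal sketch, not part of the product: the function name java_pids is hypothetical, and it assumes that the output of ps -ef has the process number in the second column and that a POSIX awk is available.

```shell
# Print the PID of every line in `ps -ef` output that mentions java.
# Reads the ps output on stdin so that it can be checked before use.
java_pids() {
    awk '
        NR == 1 { next }              # skip the header line
        /java/ && !/grep/ {           # ignore the grep command itself
            print $2                  # column 2 of ps -ef is the PID
        }'
}

# Usage on the Server (review the list before running kill -9 on it):
#   ps -ef | java_pids
```

The sketch only prints process numbers; deciding which of them to quit with kill -9 remains a manual step.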
Connect to the database, and then shut down ORACLE by running the command
shutdown immediate. Connect to the database again and run the command startup
to start the database.
Note:
Log in to sqlplus as an ORACLE user.
Figure 4-11 Analyzing OMM and NetNumen U31 Performance Data Delay and Reporting
Failure
Fault Analysis
1. Process to collect the OMM performance data
Generally, performance data flows from the OMP to the SBCX, from the SBCX to the
OMM through FTP, and from the OMM to the EMS through FTP. The following steps
describe how performance data is collected.
a. The RNC collects performance data according to the measurement tasks created
in OMM.
b. The RNC uploads the collected data to the log server.
c. The OMM server takes the performance data (.dat) from the log server through
FTP.
d. The OMM resolves the data files, and then saves them into the database.
e. The OMM creates EMS data files under its FTP directory according to the
measurement tasks created by the EMS and notifies the EMS to take the files.
f. The EMS server takes the performance data (.xml) from the OMM server through
FTP.
g. The EMS resolves the data files, and then saves them into the database.
2. Process to check the OMM performance data
Anything wrong occurring in any of the steps above can lead to data query failure.
Check the data by selectively following the steps below as required.
a. The RNC collects performance data according to the measurement tasks created
in OMM.
Note:
Measurement tasks are classified into basic measurement and general
measurement. The basic measurement is to collect all data by default and the
user does not have to create measurement tasks. For the basic measurement,
the name of the measurement task has a suffix "(Base)".
ORA-06512: at line 1
Caused by: java.sql.SQLException: ORACLE:ORA-06502:
PL/SQL: numeric or value error: number precision too
large
ORA-06512: at "RNS_PM.PROC_RNC_RNC_RAB", line 1046
ORA-06512: at line 1
Note:
When this happens, send the logs to the troubleshooting team promptly.
\ums-svr\tmp\RNCPMDATABAK\[NE No.]\DATA\[Measurement
Object Type]\err (This path is used for 3.17.300k, 3.17.310d, or later
versions.)
Note:
When this happens, send the file that cannot be resolved to the
troubleshooting team.
vi. If you cannot query the data about such radio objects as some cells, check to
see if they are set to the debugging status.
If you cannot query the data about a cell, a Node B, or a cell pair, check the
corresponding Node B to see if it is set to the debugging status. If so, the data
of the cells, Node Bs, and cell pairs that correspond to the Node B will be
saved in the debugging table.
To check the debugging data, enter the engineering mode (by pressing
Ctrl+Shift+P) on the Performance Management interface in the OMM NM
system. If the site is set to the debugging status, check to see if it is necessary
to modify the status.
e. The OMM creates EMS data files under its FTP directory according to the
measurement tasks created by the EMS and notifies the EMS to take the files.
Check to see if the OMM creates EMS data files on time.
Open the logs created during the corresponding period under the directory
\ums-svr\log\, and then search for the keyword .xml in the log output. If
output such as the following appears, the OMM has created the files and notified
the EMS.
INFO [SocketTransport] send a notify:10100035:Code:10100035
,sequenceId:E5048513-C97F-9928-CAFC-597478C908CD
destUrl:socket://172.22.96.3:21125/ Object0:\tmp\ftp\nmi\pm
\WRNC\30117\PM200903101447+080024A20090310.1430+0800-200903
10.1445+0800_30117_103_-_1.xml
INFO [com.zte.ums.csp.nmi.pm.adapter.EmsPmAdapterNotifMea
s]fileName= \tmp\ftp\nmi\pm\WRNC\30117\PM200903101447+08002
4A20090310.1430+0800-20090310.1445+0800_30117_103_-_1.xml
f. The EMS server takes the performance data (.xml) from the OMM server through
FTP.
g. The EMS resolves the data files, and then saves them into the database.
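The log search in step e can be scripted when the logs are available on a machine with a POSIX shell. The following is a minimal sketch under that assumption: the function name check_xml_created is hypothetical, and the log files are passed as arguments because their exact names depend on the site.

```shell
# Report whether the given OMM log files mention any .xml data file.
# Returns non-zero when nothing is found, which suggests that the OMM
# did not create EMS data files during the period covered by the logs.
check_xml_created() {
    # grep -F treats ".xml" as a fixed string; -c counts matching lines.
    count=$(cat "$@" | grep -c -F '.xml' || true)
    if [ "$count" -gt 0 ]; then
        echo "found $count log line(s) mentioning .xml"
    else
        echo "no .xml entries found"
        return 1
    fi
}

# Usage (the file name is a placeholder):
#   check_xml_created /path/to/ums-svr/log/<log file>
```

A non-zero result only narrows the search to step e; the log output still has to be read to confirm that the notification was sent to the EMS.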
Typical Solutions
1. Typical case 1: The parameters of the FTP service were modified, so the
performance data could not be saved into the database.
Symptoms: No result is displayed when the latest performance statistics are queried
at the NM Client.
a. Check the network connection between the OMC server and the log server. Result:
the connection from the OMC server to the log server, as well as the connection from
the log server to the NE ROMP, is normal.
b. Log in to the database on the OMC server with the username rns_pm. Run the
corresponding command to check when performance data was last saved.
c. Check the available space of the hard disk where Oracle is installed. Result:
there is enough space.
d. Check to see if the OMC has taken the data files.
e. Check to see if the log server has created files.
f. Check to see if the FTP service of the log server is configured correctly.
After these steps, it can be concluded that the OMC server fails to take performance
statistics from the log server. The OMC server takes performance data through the
FTP service of the log server, so the troubleshooting should focus on the settings of
the FTP service.
The home directory for the anonymous user of the FTP service on the log server had
been modified, which caused the problem.
2. Typical case 2: The FTP connection of the OMC fails frequently.
Symptoms: There are many data delay alarms on the EMS and these alarms mostly
last for less than 30 minutes. The EMS logs show that the OMM fails to notify the EMS
to take files on time.
First, check the network connection between the OMC server and the log server. The FTP
connection from the OMC server to the log server and from the log server to the NE
ROMP is abnormal. Restart the FTP program and the problem is solved.
Recommended Solutions
Recover the database manually by running proper commands.
Use sqlplus to log in as sysdba. Run the command alter database open and
check whether it succeeds. If it does, the database is back to normal. If not,
run the command recover database. If the database is accessible
after running this command, the database has recovered.
This is because data deleted through the DELETE command goes to the rollback
segment first and is removed from the rollback segment after the transaction is
committed, but the space occupied in the UNDO table space is not released.
Recommended Solutions
Log in to the database as the SYS user with the SYSDBA privilege.
Run the following commands:
SQLPLUS /NOLOG
CONN SYS/ORACLE@WOMC AS SYSDBA
Here, ORACLE is the password and WOMC is the SID of the database.
1. Create new UNDO table space by running the following commands:
CREATE UNDO
TABLESPACE "UNDOTBS3"
This is because the hard disk is unreasonably partitioned and planned, so that the
database is filled with performance data or alarms, especially after the database extends
itself automatically.
Recommended Solutions
Follow the steps below to move a data file off a disk that does not have enough free
space.
1. Take offline the table space to which the data file belongs by running the command
alter tablespace (table space name) offline.
2. Cut the data file and paste it into the new directory (the file can be renamed).
3. Run the following command to reconnect the table space and the data file:
alter tablespace (table space name) rename datafile 'previous absolute path of the
data file + previous file name' to 'new absolute path of the data file + new file name'.
4. Bring the table space online by running this command: alter tablespace (table space
name) online.
The data file is moved.
The following command can be used to check the location of a table space data file:
select file_name from dba_data_files where tablespace_name='table space name (capital letter)'.
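The four steps above can be turned into a short SQL script to paste into sqlplus. The following is a minimal sketch: the function name datafile_move_sql and the example paths are hypothetical, and the data file itself still has to be moved at the operating-system level between the offline and rename statements.

```shell
# Print the SQL statements for moving one data file of a table space.
# Arguments: table space name, previous absolute file path, new absolute file path.
datafile_move_sql() {
    ts="$1"
    old_path="$2"
    new_path="$3"
    # dba_data_files stores table space names in capital letters.
    ts_upper=$(printf '%s' "$ts" | tr '[:lower:]' '[:upper:]')
    cat <<EOF
ALTER TABLESPACE $ts OFFLINE;
-- Move the data file at the operating-system level before continuing.
ALTER TABLESPACE $ts RENAME DATAFILE '$old_path' TO '$new_path';
ALTER TABLESPACE $ts ONLINE;
SELECT file_name FROM dba_data_files WHERE tablespace_name = '$ts_upper';
EOF
}

# Usage (table space name and paths are placeholders):
#   datafile_move_sql users /u01/oradata/users01.dbf /u02/oradata/users01.dbf
```

Printing the statements instead of running them leaves room to review each one before it is executed against the database.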
The performance statistics shows that the average MP load is above 60%.
The fault is mainly caused by insufficient traffic planning, traffic burst, and UE registration
burst.
Recommended Solutions
1. How to handle MP overload caused by increased traffic.
During the MP overload period, keep a close eye on the MP load. If the load is above
80%, block some cells manually to lower the load.
Modify the corresponding parameters when the MP load is relatively low.
Modify the access parameters to reduce the retransmissions of RRC connection
requests.
Modify the location update parameters to reduce the periodic location updates. Make
the modifications according to the MSC. The modified parameters must be lower than
the values set in the MSC.
If the RCP modules are not evenly loaded, adjust the number of sites that belong to
these RCP modules.
In the rack diagram of status management, check the active/standby status of the MP
to see if the MP board is in an abnormal status. If it is abnormal, click the MP board to
perform an active/standby changeover. Check to see if signalling tracing and RTV
measurement are enabled. If so, disable them.
Go to the MP-related logs and send them to the UMTS troubleshooting team.
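The two load thresholds mentioned in this section (60% as the overload symptom and 80% as the point at which cells are blocked manually) can be expressed as a small check. The following is a minimal sketch: the function name mp_load_action is hypothetical, and it assumes that the load is supplied as a whole-number percentage read from the performance statistics.

```shell
# Map an MP load percentage (integer) to the action described in this section.
mp_load_action() {
    load="$1"
    if [ "$load" -gt 80 ]; then
        echo "block some cells manually"      # load above 80%
    elif [ "$load" -gt 60 ]; then
        echo "overloaded: monitor closely"    # average load above 60%
    else
        echo "normal"
    fi
}

# Usage:
#   mp_load_action 85
```

Parameter changes such as access or location update settings are still made when the load is relatively low, as described above.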
2. From the Server drop-down list, select the backup configuration data file, and click
Restore.
Before handling ZXWR RNC emergency faults, back up the configuration data first. On
the one hand, fault recovery may involve modifying configuration data, and the backup
can restore the on-site state if the recovery makes matters worse. On the
other hand, the backup preserves first-hand information for ZTE's maintenance and
technical support personnel and for the technicians at the home front, which helps
them analyze and locate problems and improve system performance.
1+1 backup: APBI, GIPI3, GIPI4, SDTA2, DTA, IMAB, DTI, EIPI, SDTI, RCB, ROMB,
UIMC, GUIM, THUB, CLKG, ICM, ICMG, SBCX, RCB
1:1 backup: APBE, APBE2, APBI, SDTA, SDTA2, SDTI, SDTB, POSI
Program version of active ROMP
Version of connected Node B/CN
OMC version
Operation log information of OMC
Project information
Abnormality query information: Including the names of resources (such as service
cell ID, ATM No.) found to be abnormal through query, and the abnormal contents.
Record any that exist.
Information of signal tracing
Fault type:
Fault source:
Fault phenomena:
Solution:
Summary:
Complaint company or organization:
In the warranty period or not: ( ) Y ( ) N
Handling result:
Handled by:    Stamp of the department:
Unresolved problems:
Downloading version
Flash at 5 Hz | Solid OFF | Booting | The version is being downloaded.
Flash at 1 Hz | Flash at 5 Hz | Booting | The version download failed. The board does not match the configuration and cannot download its version.
Solid ON | Solid OFF | Booting | DEBUG version: VxWorks has been downloaded successfully and the board is waiting to download and run the version. RELEASE version: the version download is successful and the version is starting.
Self-test failure
Solid OFF | Flash at 5 Hz | 7 | The board self-test failed.
Solid OFF | Flash at 2 Hz | 8 | The startup of the operation support system fails.
Flash at 5 Hz | Flash at 2 Hz | 10 | The power-on of basic processes fails or times out.
Running fault alarm
Flash at 2 Hz | Flash at 5 Hz | 6 | Alarm of mismatch among version, hardware and configurations.
Flash at 2 Hz | Flash at 2 Hz | 2 | The media plane communication disconnects.
Flash at 1 Hz | Flash at 1 Hz | 5 | Active/standby changeover is in process.
Besides common indicators mentioned above, different boards have their own indicators.
For the detailed description of the indicators, refer to the related hardware description
manual.
GLI
- Gigabit Line Interface
GUIM
- Gigabit Universal Interface Module
ICM
- Integrated Clock Module
LMT
- Local Maintenance Terminal
MP
- Main Processor
MTP3B
- B-ISDN Message Transfer Part level 3
OMC
- Operation & Maintenance Center
POSI
- POS Interface Board
PSN
- Packet Switched Network
RCB
- RFS Circuit Breaker
ROMB
- RNC Operation & Maintenance Board
RUB
- RNC User Plane Board
SBCX
- X86 Single Board Computer
THUB
- Trunk HUB
UIMC
- Universal Interface Module for Control plane (BCTC or BPSN)
UIMU
- Universal Interface Module for User Plane