
ATAE Cluster

V200R002C50 Scheme

www.huawei.com

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved.


ATAE Acronyms and Abbreviations

OSTA Open Standard Telecom Architecture

ATCA Advanced Telecom Computing Architecture

PICMG PCI Industrial Computer Manufacturers Group

IPMI Intelligent Platform Management Interface

IPMB Intelligent Platform Management Bus

FRU Field Replaceable Unit

SDR Sensor Data Record

SEL System Event Log

BMC Baseboard Management Controller

SNMP Simple Network Management Protocol

OOS Out of Service

GE Gigabit Ethernet

FC Fibre Channel

References

 ATAE Cluster System Hardware Description

 ATAE Cluster System Hardware Installation Guide

Objectives

 After completing this course, you will:

 Understand the background of the ATAE hardware platform.

 Learn the basic concepts, networking, and technical implementation of the ATAE cluster V200R002 scheme.

Contents

1. ATAE Cluster Overview

2. ATAE Cluster Hardware Platform

3. ATAE Cluster Scheme

Low OPEX — Greatly Reducing Power Consumption, Equipment Room, and Labor Cost
For a network with 50,000 GSM TRXs:

Equipment room: in the traditional scheme, the M2000 (on Sun M5000 (8P)/Sun M4000 (2P)) and the PRS, Nastar, and TS (on HP DL980 G7/HP DL580 G7 servers) occupy four cabinets. The ATAE server occupies one cabinet, a decrease of 75%.

Power consumption each year:

Item               ATAE scheme   Original scheme
Devices            4418 W        16020 W
Air conditioners   13254 W       48060 W

The ATAE scheme reduces energy consumption by 72%, saving 406,534 kWh of electricity and reducing carbon emissions by 319 tons per year.

*Based on empirical data, removing the heat generated by 1 W of device power consumption requires 3 W of cooling power.
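The slide's figures can be checked with simple arithmetic. A minimal sketch (Python), assuming the slide's rule that 1 W of device heat needs 3 W of cooling power, and 24 x 365 operating hours per year:

```python
# Back-of-the-envelope check of the slide's energy figures.
# Assumption from the slide: every 1 W of device heat needs ~3 W of cooling.
COOLING_RATIO = 3

def total_load(device_watts):
    """Device power plus the air-conditioning power needed to remove its heat."""
    return device_watts + COOLING_RATIO * device_watts

original = total_load(16020)   # traditional SUN/HP-based scheme: 16020 W of devices
atae = total_load(4418)        # ATAE scheme: 4418 W of devices

saved_watts = original - atae
saved_kwh_per_year = saved_watts * 24 * 365 / 1000

print(saved_watts)                    # 46408 W
print(round(saved_kwh_per_year))      # 406534 kWh, matching the slide
print(round(1 - atae / original, 2))  # 0.72, i.e. 72% less energy
```

The 46,408 W gap, run for a full year, reproduces the 406,534 kWh savings quoted above.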
Highly Efficient Redundancy System for Fast Service Recovery and Smooth Capacity Expansion

SAN helps fast recovery:
 Boards are started by SAN boot.
 Plug & Play (PnP) of boards enables smooth capacity expansion.
 Services recover quickly when faults occur.

Redundancy protection for key hardware:
• DC cabinet: The cabinet uses dual power inputs. The power supply units of the ATAE subrack use 2+2 redundancy and support –48 V power supply.
• AC cabinet: The cabinet uses dual power inputs. The power supply units of the ATAE subrack use 1+1 redundancy and support two –48 V power supply inputs.
• The ATAE SMM supports 1+1 active/standby redundancy.
• The ATAE switching unit supports 1+1 redundancy.
• The ATAE subrack's fans support 1+1 redundancy.
• The service bus of the ATAE subrack adopts dual-star redundancy.
• The service plane and management plane are isolated from each other.
• The IPMB, Base plane, and Fabric plane use 1+1 redundancy.
• The storage system uses RAID 1+0 and RAID 5 for data protection.
• The storage system uses 1+1 redundant controllers.
• The storage system uses 1+1 power protection.

Data reliability:
 The SAN boot technique is used.
 Service disk array data is mirrored using RAID 1+0 on 24 x 600 GB/900 GB high-density disk arrays.
 Backup disk arrays use RAID 5 on 12 x 2 TB high-density disk arrays.

A system-level backup and restore system is embedded, supporting backup and restore of each product's real-time data, applications (OSS software and DB software), and operating system.
High Reliability, N:1 Node Redundancy
[Diagram: each cluster has one standby board — PRS N:1, Nastar N:1, M2000 N:1, and DB N:1 — protecting the PRS, Nastar, M2000, PRS-DB, and M2000-DB boards.]

This ATAE scheme is similar to a solution in which an HA system is embedded into each OSS product.

ATAE features carrier-class reliability. It provides redundancy protection for key function modules. When a function module malfunctions, the system automatically switches services to the standby module.

The reliability of the ATAE functioning as a carrier-class server is 99.999%. The reliability of SUN and HP servers is IT-class, at 99.99%.
Contents

1. ATAE Cluster Overview

2. ATAE Cluster Hardware Platform

3. ATAE Cluster Scheme

ATCA
 Advanced Telecom Computing Architecture (ATCA) is a standard derived from the CPCI
standard. It meets the new requirements in the telecom area.
 Vendors such as Intel, Force, Elma, Radisys, Schroff, Rittal, Bustronic, and Pigeonpoint are
engaged in the development and research of basic ATCA parts and have their definite
roadmaps.
 Huawei uses ATCA to develop products and package the products into the ATAE platform.
 Basic features of ATCA:
 Dual –48 V DC redundant power supplies
 High-speed differential signal connector
 8 U x 280 mm board
 1.2-inch slot spacing, which holds high heat sinks and helps design the air vents for heat
dissipation
 High-speed subboard for hot swapping
 Standard IPMI management bus, which manages all parts in the system
 Open software architecture and CGL OS
 Compliance with the NEBS and ETSI standards

ATAE Cluster Cabinet Deployment Scheme
DC-based scheme (recommended) vs. AC-based scheme

 The ATAE cluster cabinet can be supplied with DC or AC power. If AC power is used, you need to configure the CP2000AC54 AC/DC primary power supply to convert AC power to DC power in order to supply power to the ATAE cluster subracks.

 A maximum of three cascaded S3900 disk arrays are deployed to provide storage space (MSS and ESSs) for all products in the cabinet.

 Another S3900 is deployed as a backup disk array. (The BSS replaces the tape drive, tape library, and virtual tape library used in the traditional scheme.)
ATAE Cluster Cabinet Deployment Scheme (Continued)
[Diagram: a cabinet houses the MSS, ESS, and BSS disk arrays plus subracks holding the M2000, TS, PRS, SAU, and Nastar boards.]

Compared with the traditional deployment scheme, application systems such as the M2000 and PRS are deployed separately from the database system; that is, data processing and data storage are separated.

Highly integrated solution:
 The misplug prevention design, based on customer experience, ensures that each module is installed in the correct place.
 Each board and key module is hot-swappable without service interruption.
 Each board seamlessly connects to the other basic devices in the subrack, so you do not need to connect them with cables during installation.
ATAE Cluster Cabinet Deployment Scheme
[Slot layout diagrams, slots 01–14 per subrack:
Subrack 1 — PRS, SAU, and Nastar boards; PRS-DB and Nastar DB boards; the PRS, SAU, Nastar, and OSMU standby boards; reserved slots; and the two LSW switching boards in slots 07 and 08.
Subrack 2 — OSMU (slot 01); M2000 boards; M2000-DB (Sybase) and TS boards; the M2000 and DB standby boards; reserved slots; and the two LSW switching boards in slots 07 and 08.]

M2000 4:1, PRS 1:1, SAU 1:1, Nastar 1:1, DB N:1 — all highly reliable.

 The ATAE cluster supports five clusters: the M2000 cluster, PRS cluster, SAU cluster, Nastar cluster, and Oracle cluster. Each cluster uses the N:1 scheme.
 An ATAE subrack has 14 slots. Slots 07 and 08 house the GE&FC switching unit, and the basic processing subrack houses the OSMU in slot 01.
 The OS is installed as the bottom layer on each board, with the application or database software installed above it.
Product Introduction – Architecture
[Exploded view of the subrack, labeled: air exhaust vent; rear interface boards; service boards; backplane; hot-swap fan tray; air intake vent; redundant power supply; switching boards; and the service management modules (SMMs) in active/standby redundancy.]
Product Introduction – Subrack
A carrier-class modular technology is used to achieve desirable functionality, performance, density, and high reliability.

Front view:
 The chassis is 14 U high and 19 inches wide. It can be installed in any standard 19-inch cabinet.
 The chassis provides 14 slots for boards in the front and another 14 slots for interface boards at the back.
 The subrack is configured with a dual-star high-speed interconnection backplane. The backplane provides redundant buses, such as the dual-star Intelligent Platform Management Bus (IPMB), service data bus, power supply bus, and clock bus. Boards and modules are interconnected through these buses, which reduces the number of cables between boards and modules.
 The subrack is configured with four 1+1 redundant power distribution modules. These modules supply power to the chassis and its various parts (including boards and interface boards) through the backplane.

Back view:
 Two fan boxes are configured to achieve intelligent heat dissipation.
 A back cable trough is installed according to customers' maintenance habits, which improves maintenance efficiency.
 The subrack has passed electromagnetic compatibility (EMC) certification and Underwriters Laboratories (UL) certification.
 The subrack complies with the Network Equipment Building System (NEBS) specification.
Product Introduction - Heat Dissipation System
 The subrack adopts forced air cooling with a bottom-to-top ventilation design; that is, air enters at the lower front and exits at the top rear.
 The heat dissipation system adjusts the fan speed automatically according to the internal temperature, and provides 200 W of cooling capability to each front slot and 30 W of cooling capability to each rear slot.
 As the core component of the heat dissipation system, the fan tray supports N:1 redundancy protection; even if a fan is faulty, the running of the entire system is not affected. The maximum air volume of the subrack is 560 CFM, which meets the heat dissipation requirements of low-consumption blades.

Power Supply Module of the ATAE Cluster
Subrack (DC Cabinet)
In a DC cabinet, the power module provides four –48 V power inputs for each subrack, with the left
and right sides of the subrack each supplied with 2+2 redundant backup power. The DC power
provides each slot with redundant –48 V DC power through the backplane to ensure uninterruptible
power supply for the chassis. The following table lists the power performance specifications:

Item Specification

Rated input voltage –48/–60 V DC

Input voltage range –40.5 V DC to –72 V DC

Maximum current 40 A

Power Supply Module of the ATAE Cluster
Subrack (AC Cabinet)
In an AC cabinet, the power module provides two –48 V power inputs for each subrack, with the left and right sides of the subrack each supplied with 1+1 redundant backup power. The DC power provides each slot with redundant –48 V DC power through the backplane to ensure uninterruptible power supply for the chassis. The following table lists the power performance specifications:

Item Value

Rated input voltage –48/–60 V DC

Input voltage range –40.5 V to –72 V DC

Maximum current 60 A
ATAE 2 Circuits, 1+1 Redundancy

Product Introduction – Server Board (1)
Server board model: AUPSA
 Dual-core or quad-core Intel® Xeon® processor with low power consumption.
 Each processor supports a 12 MB L3 cache.
 The memory capacity is 48 GB (6 x 8 GB).
 The memory type is DDR3 1333 MHz.
 The server board supports two Ethernet (10/100/1000M BASE-T) Base interfaces.
 The server board supports two Ethernet (1000M Base-B) Fabric interfaces.
 The server board supports one Ethernet (1000M Base-B) Update channel interface.
 The server board provides two USB 2.0 interfaces (compatible with USB 1.1) and one Intelligent Platform Management Controller (IPMC) serial port. (The IPMC port also works as the system serial port. Its communication standard is RS-232, and its interface type is RJ-45.)
 The server board supports two SAS hard disks. Each SAS hard disk provides 300 GB of capacity.
 The server board provides an IPMC module that is supplied with power independently. The IPMC module connects to the chassis management board through a redundant backup IPMB bus.
 The IPMC module provides the following functions:
 FRU, SDR, and SEL information management
 Temperature detection, voltage detection, and alarming
 Hot-swap control, power-on/off control, and reset control
 Console redirection
 Remote KVM over IP

Note: In an ATAE cluster system, only the OSMU board is configured with two 300 GB hard disks; the other boards are not configured with hard disks.

Product Introduction – Server Board (2)
Server interface board model: AGFRBM

 The AGFRBM board provides four daughter board connectors to support egress Gigabit Ethernet (GE) daughter boards, egress Fibre Channel (FC) daughter boards, Fabric FC daughter boards, egress Serial Attached SCSI (SAS) daughter boards, and Small Computer System Interface (SCSI) daughter boards. These connectors provide external interfaces and Fabric interfaces for the server board through the daughter boards.

For the ATAE cluster, the server interface board is configured with:

 3 x egress GE daughter boards (J1/J2/J3): each provides two egress 1000M network adapters.

 1 x Fabric FC daughter board (J4): provides internal fiber data channels for FC data communications.

Product Introduction – Switching Board (1)
Switching board model: AXCBF1
 Basic configuration (including only the Base plane):
 Twelve GE ports connect the switching board to the twelve board slots.
 Two GE ports connect the switching board to the SMMs of the active and standby chassis.
 One GE port connects the switching board to the Base switching plane in the other switching board slot so that the active and standby Base switching planes work in redundancy mode.
 Eight ports connect the switching board to the interface board of the switched network through the backplane so that the switching board provides external network interfaces.
 Two ports connect the switching board to the GE and FC daughter boards respectively.
 Basic configuration + GE module: provides an additional Fabric switching plane, which is independent of the Base switching plane and provides 21 ports:
 Twelve 10GE ports connect the switching board to the 12 slots.
 One 10GE port connects the switching board to the Fabric switching plane in the other switching board slot so that the two Fabric switching planes work in redundancy mode.
 Eight 10GE ports connect the switching board to the interface board of the switched network through the backplane so that the switching board provides external network interfaces.
 Basic configuration + GE module + FC module: provides an additional FC optical switching function and four external 8 Gbit/s FC ports through the interface board, which can be used to set up a SAN storage network.

The switching board in an ATAE cluster system uses the following configuration: basic configuration + GE module + FC module.
Product Introduction – Switching Board (2)
Switching interface board model: AXCRM

Interface description:
 Base interfaces: four Gigabit Ethernet interfaces, 1000M BASE-T auto-sensing, with two indicators each.
 Fabric FC interfaces: four 8 Gbit/s auto-negotiation ports.
 Fabric LAN interfaces: use SFP+ optical modules to provide eight 10GE optical ports, or photoelectric conversion modules to provide eight GE electrical ports.
Product Introduction – S2600 Disk Array
OceanStor S2600 specifications:
Processor: 64-bit
Data cache: 4 GB
Host interfaces: eight 4 Gbit/s FC host interfaces
Hard disks supported per subrack: 12
Supported RAID levels: 0, 1, 3, 5, 6, 10, 50
Disk capacity: 450 GB/1 TB

Note: Data is stored from each ATAE board to the disk array through an FC connection; however, routine maintenance of the disk array is carried out through a network cable connection. The S2600 is no longer delivered in ATAE Cluster V100R001C01.

Product Introduction – S3900 Disk Array
OceanStor S3900 specifications:
Processor: 64-bit
Data cache: 16 GB per controller
Host interfaces: sixteen 8 Gbit/s FC host interfaces
Hard disks supported per subrack: 24
Supported RAID levels: 0, 1, 3, 5, 6, 10, 50
Disk capacity: 600 GB/900 GB

Note: Data is stored from each ATAE board to the disk array through an FC connection; however, routine maintenance of the disk array is carried out through a network cable connection.

Product Introduction – S3900 Disk Array (for Backup)
OceanStor S3900 (backup) specifications:
Processor: 64-bit
Data cache: 8 GB per controller
Host interfaces: eight 8 Gbit/s FC host interfaces
Hard disks supported per subrack: 12 (3.5" hard disks)
Supported RAID levels: 0, 1, 3, 5, 6, 10, 50
Disk capacity: 2 TB

Note: Data is stored from each ATAE board to the disk array through an FC connection; however, routine maintenance of the disk array is carried out through a network cable connection.

Contents

1. ATAE Cluster Overview

2. ATAE Cluster Hardware Platform

3. ATAE Cluster Scheme

Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution

Constraints of ATAE Cluster Design Specifications
Constraint 1: All product boards are started using SAN boot. The emergency system can also be started using disks configured for the ATAE boards.

Constraint 2: When the resources required for co-deployment of multiple products exceed the capability of one cabinet with two subracks, two or more cabinets must be used. However, note that one product cannot be deployed across two or more cabinets.

Constraint 3: In capacity expansion scenarios, the resources required for the products in a cabinet after capacity expansion must not exceed the cabinet's capability (including the number of boards and the storage space). If the capability is exceeded, customization is required to support the expansion. Avoid this sales scenario at the pre-sale stage. If a capacity expansion sale has broken constraint 3, the sales department must confirm with R&D whether it can be supported.

Reference configuration of a fully configured cabinet:
 A fully configured two-subrack cabinet provides 24 ATAE boards. A maximum of 23 boards, including service boards, DB boards, and standby boards, can be installed for product applications.
 OSMU (mandatory).
 Fully configured service disk arrays (one MSS plus two ESSs).
 One backup storage subrack (BSS) with twelve 2 TB hard disks configured (compatible with the live-network configurations of twenty-four 1 TB hard disks or twenty-four 600 GB hard disks).
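The three constraints lend themselves to a simple pre-sales sanity check. The sketch below is hypothetical (the function and data layout are illustrative, not a Huawei planning tool); the limits come from the reference configuration above:

```python
# Hypothetical sketch of the deployment constraints above. Limits are taken
# from the slide: a fully configured two-subrack cabinet provides 24 boards,
# of which at most 23 are usable for products (the OSMU is mandatory).
MAX_PRODUCT_BOARDS = 23  # one slot is reserved for the mandatory OSMU

def check_deployment(cabinets):
    """cabinets: list of dicts mapping product name -> number of boards."""
    problems = []
    seen = {}
    for i, cab in enumerate(cabinets):
        # Capability check per cabinet (board count only; storage not modeled).
        if sum(cab.values()) > MAX_PRODUCT_BOARDS:
            problems.append(f"cabinet {i}: more than {MAX_PRODUCT_BOARDS} product boards")
        for product in cab:
            seen.setdefault(product, []).append(i)
    # Constraint 2: one product must not span two or more cabinets.
    for product, cabs in seen.items():
        if len(cabs) > 1:
            problems.append(f"{product} spans cabinets {cabs}")
    return problems

# Example: PRS deployed across two cabinets violates constraint 2.
print(check_deployment([{"M2000": 10, "PRS": 4}, {"PRS": 6}]))
# ['PRS spans cabinets [0, 1]']
```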

Capacity Expansion Principle: Flexible Board Layout and Mixed Configuration
The following uses an example of expanding M2000/PRS (400 equivalent NEs) to M2000/PRS (800 equivalent NEs).

[Slot layout diagrams, slots 1–14, before and after expansion: the OSMU, M2000, M2000 DB, M2000 TS, PRS, and PRS DB boards with their standby boards, the shared DB standby board, and switching boards 1 and 2. After expansion, the added M2000 and M2000 MED boards are mixed in and separated by the PRS boards in the middle.]

Deploy the M2000 and PRS in sequence in the subracks of the cabinet. The standby DB board is always deployed in slot 14 because it is shared by multiple products. Deploy the new boards for capacity expansion in the remaining slots of the subrack in sequence.

The planning tool is used during actual capacity expansion planning. It generates the board locations for capacity expansion by following the principle of "flexible board layout and mixed configuration".

Capacity expansion procedure:
1. Export the live-network data from the OSMU.
2. Use the planning tool to import the live-network data and select the target network for capacity expansion.
3. The planning tool generates the capacity expansion planned data (including the board layout after capacity expansion, without the need for manual adjustment).
4. Install the hardware based on the capacity expansion plan.
5. Use the OSMU to import the capacity expansion planned data and activate the boards.
6. Use the capacity expansion scripts of the service products to complete the capacity expansion.
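The "flexible board layout and mixed configuration" principle can be sketched as filling the free slots in sequence. The function and board names below are illustrative only; the real planning tool also accounts for products, subracks, and reserved slots:

```python
# Illustrative sketch: new boards for capacity expansion fill the remaining
# free slots in sequence, while the shared DB standby board stays pinned in
# slot 14. The function itself is hypothetical, not the Huawei planning tool.
def plan_expansion(current_layout, new_boards, slots=14):
    """current_layout: dict slot -> board name; returns an updated copy."""
    layout = dict(current_layout)
    free = [s for s in range(1, slots + 1) if s not in layout]
    # zip() stops at the shorter sequence, so extra boards are simply not placed.
    for board, slot in zip(new_boards, free):
        layout[slot] = board
    return layout

before = {1: "OSMU", 2: "M2000", 3: "M2000 DB", 7: "LSW", 8: "LSW", 14: "DB Standby"}
after = plan_expansion(before, ["M2000 MED", "PRS", "PRS DB"])
print(sorted(after.items()))
```

Here the three new boards land in the first free slots (4, 5, and 6) while the existing boards, including the DB standby board in slot 14, are untouched.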
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
ATAE Cluster Networking

ATAE Cluster Networking (Compatible with the Networking on the Live Network)
[Networking diagram: the two SMMs connect to the switching boards' Base interfaces over the backplane and the IPMB. The maintenance network runs through LAN switches to the PMC interface of each rear board. The service network connects the OSMU, M2000, M2000 TS, and DB server boards through switches and a router to the customer's network. The switching boards' Fabric interfaces carry the VCS heartbeat network, and the backplane FC interfaces connect through the FC switch boards to the FC storage network.]
ATAE Cluster Networking (For V2R1 and Later)
[Networking diagram: the two SMMs connect to the switching boards' Base interfaces over the backplane and the IPMB. The maintenance network and the VCS heartbeat network run on independent network plane 2, through LAN switches, to the PMC interface of each rear board. The service network for the OSMU, M2000, M2000 TS, and DB server boards runs on independent network plane 1, through the switching boards' Fabric interfaces, LAN switches, and a router, to the customer's network. The backplane FC interfaces connect through the FC switch boards to the FC storage network.]
ATAE Cluster Networking (Compatible with the Networking on the Live Network)
The ATAE cluster supports the following types of network: the VCS heartbeat network, maintenance network, service/DB network, and FC storage network.

Network type: VCS heartbeat network
Purpose: communication between the nodes in the VCS cluster
Network adapter type: Fabric plane; number of interfaces: 2
Description: The VCS heartbeat network uses the link layer for communication, so you do not need to configure an IP address. The two links of the heartbeat network work in active/standby mode.

Network type: Maintenance network
Purpose: production installation and routine maintenance of the ATAE boards and disk arrays
Network adapter type: Base plane; number of interfaces: 2
Description: The maintenance network uses a fixed IP address, which must not be changed. The maintenance network implements redundancy through the dual planes.

Network type: Service network
Purpose: communication on the service/DB network, for example, the M2000 connecting to NEs and the DB board providing database services for the M2000
Network adapter type: service interface board; number of interfaces: 2
Description: The service network connects to the public network through each ATAE rear interface board, and implements redundancy through the dual planes.

Network type: FC storage network
Purpose: mounting service boards into the service storage system and storing the backup data from the OSMU board
Network adapter type: FC module
Description: The storage network connects to the service board or OSMU board through the switching board by FC cables. The FC interface is provided by the fiber loopback module of the rear board and communicates directly with the switching board.

Note:
 The network cable of the OSMU's rear board connects to the service network so that users can remotely log in to the system.
 The OSMU communicates with and maintains boards and disk arrays through the maintenance network.
ATAE Cluster Networking Design (For V2R1 and Later)
The ATAE cluster supports the following types of network: the VCS heartbeat network, maintenance network, service/DB network, and FC storage network.

Network type: VCS heartbeat network
Purpose: communication between the nodes in the VCS cluster
Network adapter type: Base plane; number of interfaces: 2
Description: The VCS heartbeat network uses the link layer for communication, so you do not need to configure an IP address. The two links of the heartbeat network work in active/standby mode.

Network type: Maintenance network
Purpose: production installation and routine maintenance of the ATAE boards and disk arrays
Network adapter type: Base plane; number of interfaces: 2
Description: The maintenance network uses a fixed IP address, which must not be changed. The maintenance network implements redundancy through the dual planes.

Network type: Service network
Purpose: service/DB communication, for example, the M2000 connecting to NEs or the DB board providing the database service for the M2000
Network adapter type: Fabric plane; number of interfaces: 2
Description: The service network connects to the public network from the back of the switching board, and implements redundancy through the dual planes.

Network type: FC storage network
Purpose: service boards mount the service storage space; the OSMU board mounts the backup space
Network adapter type: FC module
Description: The storage network connects to the service board or OSMU board through the switching board by FC cables. The FC interface is provided by the fiber loopback module of the rear board and communicates directly with the switching board.

Note:
 The network cable of the OSMU's rear board connects to the service network so that users can remotely log in to the system.
 The OSMU communicates with and maintains boards and disk arrays through the maintenance network.

Requirements for the IP Address of the ATAE Cluster Network (Two Subracks)

Device: SMM in the ATAE subrack — maintenance (private): 6; VCS heartbeat: 0; service (public): 0
Each SMM is configured with one IP address. (By default, the SMM IP address is 192.168.128.23 or 192.168.128.24 in the MPS and 192.168.128.26 or 192.168.128.27 in the EPS.) The two SMMs work in active/standby mode, and a logical IP address is configured for the pair to provide external services. By default, the logical IP address is 192.168.128.25 in the MPS and 192.168.128.28 in the EPS.

Device: Disk array — maintenance (private): 4; VCS heartbeat: 0; service (public): 0
Each disk array with disk array controllers is configured with two maintenance IP addresses. The default maintenance IP addresses of the service disk array are 192.168.128.203 and 192.168.128.204; those of the backup disk array are 192.168.128.201 and 192.168.128.202.

Device: M–N service/DB boards (M is the total number of boards; N is the number of standby boards) — maintenance (private): M–N; VCS heartbeat: 0; service (public): M–N
Each board is configured with one public IP address, which also serves as the logical IP address for providing external services. (Note: You need to apply for IP addresses only for the external service network.) The default private IP address is 192.168.128.[100 + (subrack number – 1) x 14 + slot number].
Note: A public IP address is not required for the active board of a locally deployed ES; the IP address of the active board of the emergency M2000 system can be used.

Device: N standby boards — maintenance (private): N; VCS heartbeat: 0; service (public): 0
You do not need to plan a public IP address for a standby board. When the services on a service board are switched to the standby board, the IP address of the service board is switched along with them. The default private IP address is 192.168.128.[100 + subrack number x 2 + slot number].
Introduction to the ATAE Cluster Storage
ATAE cluster storage groups are classified into the following disk arrays by functionality:

Service disk array (SAS, 600 GB):
1. Determine the disk array configuration based on the sales scenario. The following describes an example where the total size of the required space is S:
a. If S is smaller than 5,800 GB, one MSS needs to be configured (the disk array with disk array controllers).
b. If S is larger than or equal to 5,800 GB but smaller than 11,900 GB, one MSS and one ESS need to be configured (a disk array extended frame is installed).
c. If S is larger than or equal to 11,900 GB but smaller than 18,000 GB, one MSS and two ESSs need to be configured.
2. The MSS connects to the ATAE switching board through an optical fiber and connects to each ESS through a SAS cable.
3. The service disk array stores the data of the OSs, services, and databases on all boards.
4. The RAID 1+0 technology is used to achieve redundancy protection for the service disk array. The 24 hard disks of each disk array form a RAID group. Each RAID group contains two hot spare disks, and only 11 hard disks provide available space.
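The sizing rule in step 1 can be sketched as a lookup (thresholds from the slide; the function name is illustrative). The per-enclosure threshold is plausible given step 4: of the 24 x 600 GB disks, two are hot spares and RAID 1+0 mirroring leaves 11 disks of usable space, about 6,600 GB raw, which is in line with the ~5,800 GB usable figure once overhead is subtracted (an inference, not stated on the slide):

```python
# Sketch of the service disk array sizing rule above.
# Thresholds (GB) are taken from the slide; the function name is illustrative.
def service_array_config(required_gb):
    if required_gb < 5800:
        return "1 MSS"
    if required_gb < 11900:
        return "1 MSS + 1 ESS"
    if required_gb < 18000:
        return "1 MSS + 2 ESSs"
    raise ValueError("exceeds a single cabinet's storage capability")

print(service_array_config(9000))   # 1 MSS + 1 ESS
```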

Backup disk array (SATA, 2 TB, 3.5"):
1. No extended storage subrack (ESS) is used with the backup storage subrack (BSS) when a single ATAE subrack is used.
2. The BSS communicates only with the OSMU, connecting to the switching board through an optical fiber.
3. The backup disk array stores the backup data of each ATAE board, including the OS backup data, dynamic backup data, and static backup data.
4. The RAID 5 + 1 hot spare technology is used to achieve redundancy protection for the backup disk array.

Connecting the Subracks (MPS/EPS) and the
Switches

Connecting the Controller Enclosure and the
Subrack

Connecting the Controller Enclosure and the
Subrack

Connecting the MPS and the EPS

Networking Between the MSS and ESS of
the Service Disk Array

[Cabling diagrams: one MSS + one ESS; one MSS + two ESSs]

Note:
 The cascading structure of the disk arrays can be either of the following, depending on the deployed products and the network scale: one MSS + one ESS, or one MSS + two ESSs.
 An ESS connects to the MSS or another ESS through a SAS cable.
 The MSS connects to each ESS over dual lines for redundancy.
 No ESS connects to the BSS in the current version.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 41
Contents

3. ATAE Cluster Scheme


3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 42
Why Is the SAN Boot Technology Introduced
into the ATAE Cluster?
1. What is SAN?

SAN is short for storage area network, a type of high-speed storage network similar to a common local area
network (LAN). A SAN directly connects the servers and the disk arrays through dedicated hubs and FC switches.
In the ATAE cluster, it establishes a data connection between the disk array and each ATAE board using optical
fibers and the built-in ATAE fiber switching board.

2. What is SAN Boot?


SAN boot starts the OS, which is installed on the disk array, over the SAN. SAN boot is also called remote boot
because the OS is not started from a local disk.

3. Why is the SAN boot technology introduced into the ATAE cluster?

 Higher reliability because OS data is integrated into the disk array: None of the ATAE boards is configured with
local disks or mechanical parts. This improves reliability.
 Quick fault recovery: If a board becomes faulty, you can replace the faulty board without reinstalling the system.
Instead, you only need to change the mapping between the board and the OS on the disk array. This simplifies
user operations.
 Centralized management: The boot disk of the server is stored on the storage device for centralized
management. This helps fully use the various advanced management functions of the storage device.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 43
Board Hot Swap due to SAN Boot
The following describes the SAN boot technology using an example where the M2000, PRS, and
TranSight (400 equivalent NEs) are deployed:

[Figure: a 14-slot ATAE subrack hosting the M2000, PRS, DB, OSMU, LSW, and ES boards, connected to the disk array through the storage switching boards]

 None of the ATAE boards is configured with local disks. The disk array provides 110 GB of storage space for the OS of each board.
 If local disks are configured on ATAE boards such as the emergency system (ES) board and the OSMU board, those local disks are used to install the operating system; that is, SAN boot is not used to boot the operating system on these boards.
 OS data, service data, and database data are completely stored in the disk array.
 Board replacement does not require system reinstallation, which simplifies onsite operations and enables quick system recovery with secure and reliable data.

Notes from the figure:
 The OSMU is responsible for the initialization and maintenance of the disk array; therefore, it cannot use SAN boot.
 The ES uses local disks instead of SAN boot so that it can handle emergencies that occur when the disk arrays are damaged.
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 44
Contents

3. ATAE Cluster Scheme


3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 45
Mapping Between the LUN of the Service Disk
Array and the ATAE Board
The following describes the LUN mapping using an example where the
M2000 and PRS (400 equivalent NEs) are deployed:

[Figure: the OSMU, M2000 cluster (Master, Slave, M2K-DB (Sybase), Standby), PRS cluster (PRS, Standby), and DB cluster (PRS-DB, DB-Standby) boards mapped to their OS LUNs, application LUNs (/export/home), and database LUNs (RAW) on the S2600 MSS and ESS disk arrays (RAID 1+0)]

Note:
 The two disk arrays work in RAID 1+0 mode. Each disk array is configured with two hot spare disks.
 Each board can detect only its own OS LUN.
 Each cluster can detect all the service LUNs in its own cluster.
 The VxVM technology is used to manage all volumes in the OS.
 The mapping between the disk array and the OS is implemented through the WWN of the fiber card.
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 46
Mapping Between the LUN of the Backup
Disk Array and the OSMU
[Figure: the service and DB boards back up their data to the OSMU's /export/home directory, which is mapped to LUNs on the backup disk array (RAID 5 with one hot spare)]

Note:
 All LUNs on the backup disk array are allocated to the /export/home directory on the OSMU.
 The /export/home directory is used to store all backup data.
 The mapping between the disk array and the OS is implemented through the WWN of the fiber card.
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 47
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 48
VCS Cluster System and Principles
The VCS software takes the resource as its minimum
management unit and the resource group as a set of
resources. In the SLS system, each board maps to a
service group, and each group contains resources defined
by users. These resources depend on one another.
Before using the VCS software to manage the
resources, you must register them first.
The VCS SLS scheme supports N:1 backup for boards
in the network. The N boards in the SLS cluster share one
standby board. The cluster consisting of the service
boards and the standby board is centrally monitored and
managed by the VCS software. If a software or
hardware resource fault within the monitoring range
occurs on any board in the SLS cluster, the VCS
software first tries to restart the resource on that board.
If it fails to start the service, it automatically switches
all resources of the board, including the applications, to
the standby board.
To enable N boards to share one standby board, N
service groups are formed, each consisting of one of the N
boards plus the standby board. These service groups are
centrally managed and scheduled by the VCS software.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 49
Cluster Systems in the ATAE Cluster
The following describes the application clusters where the M2000
and PRS (400 equivalent NEs) are deployed:

[Figure: the M2000 cluster (Master, Slave, M2K-DB (Sybase), Standby), the PRS cluster (PRS, Standby), and the DB cluster (PRS-DB, DB-Standby), each accessed through a logical IP address]

 The ATAE SLS system includes service clusters, such as the M2000 cluster and the PRS cluster, and the database cluster, which comprises all database boards. Each cluster is configured with a standby board.
 The clusters are independent of each other. For example, the M2000 cluster and the PRS cluster are independent of each other.
 When the Sybase database is used, the Sybase database board and the M2000 service boards are deployed in the same cluster.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 50
OSS Application Cluster
Take the M2000 as an example. The cluster system achieves high availability of
the application system through the standby board.

[Figure: the M2000 service boards (M2000 service-1 to service-3), the M2000 DB (Sybase) board, and the M2000 standby board]

 When a service board's resources malfunction, the VCS of the M2000 cluster system tries to recover the malfunctioning resources (up to three times) based on the configured policy. If the resources cannot be recovered, the VCS switches the service group to the standby board.
 All boards in the M2000 cluster form an N:1 cluster through VCS, where N represents the active boards and 1 represents the standby board.
 M2000 services are deployed on each service board in a distributed manner.
 A service board and the M2000 standby board form a service group (switchover is carried out on a service group basis).

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 51
VCS Resource of the OSS Application
Cluster

[Figure: resource dependencies in a service group — the APP resource depends on the MountPoint and Logical IP resources; MountPoint depends on DiskGroup; Logical IP depends on NIC]

 Network adapter (NIC): on the service node and backup node; monitors the running status of the NIC. Naming rule: srXsY_oss_sg_nic_rs.
 Logical IP address: on the service node and backup node; monitors the running status of the logical IP address. Naming rule: srXsY_oss_sg_ip_rs.
 Disk group resource: on the service node and backup node; monitors the resources in the disk group. Naming rule: srXsY_oss_sg_dg_rs.
 Mount point resource: on the service node and backup node; monitors the running status of the /export/home mount point. Naming rule: srXsY_oss_sg_mount_rs.
 OSS application resource: on the service node and backup node; monitors the running status of the M2000. Naming rule: srXsY_oss_sg_ossapp_rs.

In each naming rule, X represents the ATAE subrack and Y represents the active service board.

Note:
 Upper-layer resources depend on lower-layer resources.
 Services are started from bottom to top. Resources at the same layer are started at the same time.
 Services are stopped from top to bottom.
 Service groups are named in the format of active board subrack number_oss_sg, for example, sr2s2_oss_sg.
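The resource layering above can be expressed in VCS configuration syntax. The following main.cf fragment is an illustrative sketch only — the system names, network device, IP address, disk group, volume path, and start/stop scripts are assumptions, not the product's actual configuration:

```
group sr2s2_oss_sg (
    SystemList = { sr2s2 = 0, sr2s1 = 1 }   // active board and standby board (assumed names)
    AutoStartList = { sr2s2 }
    )

    NIC sr2s2_oss_sg_nic_rs (
        Device = bond0
        )
    IP sr2s2_oss_sg_ip_rs (
        Device = bond0
        Address = "10.10.10.10"
        NetMask = "255.255.255.0"
        )
    DiskGroup sr2s2_oss_sg_dg_rs (
        DiskGroup = ossdg
        )
    Mount sr2s2_oss_sg_mount_rs (
        MountPoint = "/export/home"
        BlockDevice = "/dev/vx/dsk/ossdg/ossvol"
        FSType = vxfs
        )
    Application sr2s2_oss_sg_ossapp_rs (
        StartProgram = "/opt/OMC/bin/start_oss.sh"   // hypothetical scripts
        StopProgram = "/opt/OMC/bin/stop_oss.sh"
        )

    // upper-layer resources require lower-layer resources
    sr2s2_oss_sg_ip_rs requires sr2s2_oss_sg_nic_rs
    sr2s2_oss_sg_mount_rs requires sr2s2_oss_sg_dg_rs
    sr2s2_oss_sg_ossapp_rs requires sr2s2_oss_sg_ip_rs
    sr2s2_oss_sg_ossapp_rs requires sr2s2_oss_sg_mount_rs
```

The `requires` links encode the dependency layers, which is why VCS starts the group bottom-up and stops it top-down.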

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 52
Sybase VCS Resource of the OSS Cluster
[Figure: resource dependencies in a Sybase DB service group — the Sybase backup resource sits at the top; Mount and Logical IP sit above DiskGroup and NIC]

 Network adapter (NIC): on the DB node and backup node; monitors the running status of the NIC. Naming rule: srXsY_db_sg_nic_rs.
 Logical IP address: on the DB node and backup node; monitors the running status of the logical IP address. Naming rule: srXsY_db_sg_ip_rs.
 Sybase resource: on the DB node and backup node; monitors the Sybase instances. Naming rule: srXsY_db_sg_syb_rs.
 Disk group resource: on the DB node and backup node; monitors the resources in the disk group. Naming rule: srXsY_db_sg_dg_rs.
 Mount point resource: on the DB node and backup node; monitors the running status of the /export/home mount point. Naming rule: srXsY_db_sg_mount_rs.
 Sybase backup resource: on the DB node and backup node; monitors the running status of the Sybase backup. Naming rule: srXsY_db_sg_sybbak_rs.

In each naming rule, X represents the ATAE subrack and Y represents the active DB board.

Note:
 Upper-layer resources depend on lower-layer resources.
 Services are started from bottom to top. Resources at the same layer are started at the same time.
 Services are stopped from top to bottom.
 Service groups are named in the format of active board subrack number_syb_sg, for example, sr2s2_syb_sg.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 53
DB Cluster

[Figure: DB-1 through DB-4 and DB-Standby form the DB cluster]

 All DB boards form an N:1 cluster through VCS, where N represents the active DB boards and 1 represents the standby DB board.
 Each DB board runs an independent instance. The instance name differs from one DB board to another.
 All the DB boards and the standby board form one cluster. Each DB board and the standby board form a service group.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 54
VCS Resource of the DB Cluster
[Figure: resource dependencies in an Oracle DB service group — the Oracle listener resource depends on the Oracle resource; Mount and Logical IP sit above DiskGroup and NIC]

 Network adapter (NIC): on the DB node and backup node; monitors the running status of the NIC. Naming rule: srXsY_db_sg_nic_rs.
 Logical IP address: on the DB node and backup node; monitors the running status of the logical IP address. Naming rule: srXsY_db_sg_ip_rs.
 Oracle resource: on the DB node and backup node; monitors the Oracle instances. Naming rule: srXsY_db_sg_oracle_rs.
 Disk group resource: on the DB node and backup node; monitors the resources in the disk group. Naming rule: srXsY_db_sg_dg_rs.
 Mount point resource: on the DB node and backup node; monitors the running status of the /export/home mount point. Naming rule: srXsY_db_sg_mount_rs.
 Oracle listener resource: on the DB node and backup node; monitors the running status of the listener. Naming rule: srXsY_db_sg_listener_rs.

In each naming rule, X represents the ATAE subrack and Y represents the active DB board.

Note:
 Upper-layer resources depend on lower-layer resources.
 Services are started from bottom to top. Resources at the same layer are started at the same time.
 Services are stopped from top to bottom.
 Service groups are named in the format of active board subrack number_db_sg, for example, sr2s2_db_sg.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 55
Example of OSS Application Cluster
Switchover Only the services on one service
board can be switched to the
Service Group-0 Service Group-1 standby board at a time.
The switchover process stops all
resources on the malfunctioning
Not FailOver! board in sequence first and then
starts all resources on the standby
FailOver board in sequence.
The switchover of the service
APP-1 APP-2
group on one board will not trigger
the switchover of service groups on
other boards of the same cluster.
You can view the status on the
OSMU device panel after the
switchover succeeds.
APP-Standby Resource switchover is
automatically triggered by the VCS
when resources malfunction. In
routine maintenance, you can
manually perform the switchover.
The processes are the same.
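The stop-then-start ordering described above can be sketched as a toy script. The linear resource order below is our simplification of the dependency graph (in reality, Mount/Logical IP are parallel branches), and the short resource names follow the slide's naming rule:

```shell
# Toy model of a service-group switchover: stop resources from the top of the
# dependency stack down on the faulty board, then start them bottom-up on the
# standby board. The linear ordering is a simplification for illustration.
RESOURCES=(nic_rs ip_rs dg_rs mount_rs ossapp_rs)   # bottom -> top
PLAN=""
for (( i=${#RESOURCES[@]}-1; i>=0; i-- )); do       # stop phase: top -> bottom
  PLAN+="stop:${RESOURCES[$i]} "
done
for r in "${RESOURCES[@]}"; do                      # start phase: bottom -> top
  PLAN+="start:$r "
done
echo "$PLAN"
```

The application resource is the first to stop and the last to start, mirroring the "stopped from top to bottom, started from bottom to top" rule.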

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 56
Example of DB Cluster Switchover
[Figure: service group 0 fails over from board DB-1 to DB-Standby; service group 1 on DB-2 does not fail over]

 Only the services on one DB board can be switched to the standby DB board at a time.
 The switchover process first stops all resources on the malfunctioning board in sequence and then starts all resources on the standby board in sequence.
 The switchover of the service group on one board does not trigger the switchover of service groups on other boards in the same cluster.
 The switchover of the M2000/PRS/Nastar service groups is not triggered when a database switchover occurs, but the applications are stopped. After the switchover succeeds, the applications are started automatically.
 You can view the status on the OSMU device panel after the switchover succeeds.
 Resource switchover is automatically triggered by the VCS when resources malfunction. In routine maintenance, you can also perform the switchover manually. The processes are the same.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 57
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 58
Data Levels and Specifications for ATAE
Backup and Restoration

The ATAE cluster solution provides three backup levels based on the data to be backed up: dynamic data, static data, and OS data.
 Dynamic Data

Dynamic data includes the data in the dynamic configuration files and database. Such data is generated during
the product running and is backed up once a day. A maximum of N weeks of data can be saved. This ensures
that the system can be restored based on the data of any day within the N weeks.

 Static Data

Static data includes the binary code of the product and all third-party software. N backups are performed for all
the static data after the initial installation is complete or after the product or third-party software is upgraded. This
ensures that the system can be restored based on any of the last N static backups when the system is
properly started.

 OS Data

OS data includes the data of all board OSs. N backups are performed for the data after the initial installation is
complete or the OS is upgraded. This ensures that the system can be restored based on any of the last N
backups when the OS is properly started.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 59
Data to Be Backed Up and Restored in
ATAE
Dynamic Data
It includes the dynamic service files of OSS systems such as the M2000, PRS, Nastar, eSAU,
TranSight, and SONMaster (for example, the configuration files saved in /export/home/omc) and
database data (the data in each table space of the database).

Static Data

It includes the applications of the OSS system or Oracle. For example, the binary files in
directories like /export/home/oracle, /opt/OMC, and /opt/PRS.

OS Data

It includes the OS data of all boards. That is, all data in the / partition.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 60
Technical Implementation for ATAE Backup
and Restoration
 Dynamic Data
The OSS (such as the M2000, PRS, Nastar, eSAU, and eCoordinator) backs up dynamic data on the client.
After the backup is complete, a backup script is executed to automatically upload the backup data to the
backup disk array using the SCP protocol. You do not need to stop the services when backing up the
dynamic data. You can manually restore the dynamic data using the OSMU GUI. The backup data is saved
as a tar.gz file.
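The upload step described above amounts to an archive-then-copy flow. A runnable sketch of the archiving half follows; the directory and file names are illustrative only, and the real backup script's SCP target is the backup disk array host:

```shell
# Archive a stand-in dynamic-data directory into a tar.gz file, as the backup
# script would before uploading it with scp. Paths are illustrative only.
mkdir -p /tmp/omc_dyn
echo "cfg=1" > /tmp/omc_dyn/system.cfg
tar -czf /tmp/dyn_backup.tar.gz -C /tmp omc_dyn
# The real flow then uploads the archive, e.g.:
#   scp /tmp/dyn_backup.tar.gz <backup-host>:<backup-dir>
tar -tzf /tmp/dyn_backup.tar.gz   # list the archive to confirm its contents
```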

 Static Data
You can back up and restore static data using the OSMU. You do not need to stop the services when
backing up the static data. After the backup is complete, the data is transferred to the backup disk array
using the SCP protocol. The backup data is saved as a tar.gz file.

 OS Data
You can back up and restore OS data using the OSMU. You need to use dd+bzip2 to back up the OS data to
the backup disk array. Based on test results, 50 GB partition-based OS data can be compressed to a 700 MB
file using the maximum compression ratio. It takes 30 minutes to back up the OS data and 40 to 50 minutes to
restore the OS data. From ATAE V200R001C01 onwards, you do not need to stop services and restart the OS
during OS backup. However, you must restart the OS during restoration. In versions earlier than ATAE
V200R001C01, you need to restart the OS during OS backup.
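The dd+bzip2 flow described above can be sketched end to end. The following runnable illustration uses a small scratch file in place of a real OS partition; all paths are illustrative, not the product's actual layout:

```shell
# Back up a "partition" with dd piped into bzip2 at maximum compression (-9),
# then restore it and verify the copy. A 4 MB scratch file stands in for the
# board's / partition; real backups read the block device instead.
SRC=/tmp/os_partition.img
BAK=/tmp/os_backup.bz2
RESTORED=/tmp/os_restored.img
dd if=/dev/urandom of="$SRC" bs=1M count=4 2>/dev/null
dd if="$SRC" bs=1M 2>/dev/null | bzip2 -9 > "$BAK"        # backup
bzip2 -dc "$BAK" | dd of="$RESTORED" bs=1M 2>/dev/null    # restore
cmp -s "$SRC" "$RESTORED" && echo "restore verified"
```

Random data compresses poorly; a real / partition with large zeroed or repetitive regions is what allows ratios like 50 GB down to roughly 700 MB.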

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 61
Typical Scenarios of ATAE Data Backup
Scenario Data Backup Scheme

Scenario: after the initial installation of OSS systems such as the M2000/PRS/Nastar/eSAU is complete.
Backup scheme — back up the OSS system data in the following sequence:
1. Use the OSS client to manually back up dynamic data and save the data to the backup disk array.
2. Back up the OSS system and DB static data to the backup disk array.
3. Back up the OSS system and DB OS data to the backup disk array.

Scenario: after the OSS system or OSMU server software is upgraded; after a patch is installed for the OSS system or OSMU server software; after the database software is upgraded; after capacity expansion is carried out for the database.
Backup scheme — back up the OSS system data in the following sequence:
1. Use the OSS client to manually back up dynamic data and save the data to the backup disk array.
2. Back up the OSS system and DB static data to the backup disk array.
Note: The operating system changes a lot after a cross-R-version upgrade of the OSS server software. In this case, you are advised to back up the boards' operating system data as well.

Scenario: after the operating system is upgraded; after a patch is installed for the operating system.
Backup scheme: back up the operating system data to the backup disk array.

Scenario: routine applications.
Backup scheme: use the OSS client to periodically back up dynamic data to the backup disk array.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 62
Typical Scenarios of ATAE Data Restoration

Scenario: the OSS system (M2000, PRS, Nastar, or eSAU) is functioning properly, but users want to roll the OSS system back to a previous state (for example, last week).
Restoration scheme: restore only the dynamic data.

Scenario: the OSS system board's operating system runs properly, but the database device or the OSS system configuration file is damaged.
Restoration scheme — restore the OSS system data in the following sequence:
1. Static data
2. Dynamic data

Scenario: users are not sure which files are lost or damaged, or the OSS system or DB board's system breaks down and the operating system cannot be started.
Restoration scheme — restore the OSS system data in the following sequence:
1. OS data
2. Static data
3. Dynamic data

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 63
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 64
Remote HA Solutions
In the ATAE cluster online remote HA solution, two
ATAE cluster systems are deployed in different
geographic locations, and data is synchronized
between the two systems in real time through a
dedicated data channel. If one system becomes
faulty, the M2000 services can be switched from the
faulty system to the other system at any time. This
ensures the continuity of the M2000 services. The
ATAE cluster online remote HA solution effectively
prevents losses caused by disastrous events such
as earthquake, fire, and power failure, provides
remote protection for the M2000 devices, and
thereby improves the M2000's capability of resisting
possible security risks. During active/standby
switchover of the two sites, the site that does not
synchronize NE performance data will not
automatically re-collect historical performance data
from NEs even if the services at the standby site
start. For details about how to query the historical
performance data of NEs before the switchover, see
section "Synchronizing NE Measurement Results
Manually" in U2000 Product Documentation.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 65
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution
Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 66
Solution Description – Basic Principle (Local
Deployment Networking Scheme)
The emergency system and the primary system are
deployed on the same LAN and managed by the
same OSMU (locally deployed). The emergency
system must be properly connected to the primary
system server, managed NEs, NMS, and U2000
client so that it can take over OSS services in a
timely manner if the primary system becomes faulty.
1. If the emergency system is locally deployed,
when services are manually switched from the
primary system to the emergency system, the
same IP address is used.
2. The internal network is used for data
synchronization between the local emergency
system and the primary system.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 67
Solution Description – Basic Principle (Remote
Deployment Networking Scheme)
The emergency system and the primary system are
deployed on different LANs and managed by different
OSMUs (remotely deployed). The emergency system
must be properly connected to the primary system server,
managed NEs, NMS, and U2000 client so that it can take
over OSS services in a timely manner if the primary
system becomes faulty.
1. When the emergency system is remotely deployed
(multiple emergency systems can be deployed in one
cabinet) and services are manually switched from the
primary system to the emergency system, the IP
address changes. One public IP address needs to be
applied for each emergency system board.
2. The public network is used for data synchronization
between the remote emergency system and the
primary system.
3. When NEs in the primary system are taken over by the
emergency system, the NEs are distributed evenly
across the boards of the emergency system based on
the average allocation principle.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 68
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 69
ATAE Cluster OSMU Maintenance Networking

The following describes the OSMU maintenance


networking in the ATAE cluster by using an example
where the M2000 (Sybase) and PRS (400 equivalent
NEs) are deployed:  The OSMU manages all ATAE boards
through the maintenance network.

 The IP address of each cluster board and


that of the OSMU are on the
192.168.128.xxx network segment.

 A trust relationship is configured between the


OSMU and each ATAE board. Therefore, the
OSMU can directly access and maintain
each ATAE board.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 70
OSMU Maintenance Solution

Note:
 The OSMU server performs all
control logic of the software. The
OSMU agent is automatically
installed on each board when the
OS is installed on the board, and
functions as the OSMU client.
 The OSMU manages and maintains
each board through the OSMU
agent.
 A script for routine maintenance is
stored locally on the OSMU and
ATAE boards.
 The OSMU server manages and
maintains each board by invoking
the script stored in the OSMU
server or using the OSMU agent to
invoke the script saved in the board.
After the script finishes, the OSMU
agent returns a result to the OSMU
server stating whether the task
succeeded or failed.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 71
OSMU Active/Standby Board Protection
 A board having two 600 GB disks is introduced as the standby OSMU board.
 The BSS is shared between the active and standby OSMU boards. The BSS is mounted to the standby OSMU board when an active/standby OSMU switchover is triggered.
 Data is synchronized between the active and standby OSMU boards once per hour.
 The data to be synchronized covers the entire OSMU software (/opt/osmu), the alarms recorded by the OSMU, and the backup data of the OSMU itself.
 A full synchronization is performed the first time. From the second time onwards, the OSMU uses incremental synchronization. The synchronization is based on Linux rsync and SSH encryption.
 Switchover between the active and standby OSMU boards is performed manually. Data synchronization is triggered once when users switch over the active and standby OSMU boards on the OSMU GUI.
 Data synchronization is bidirectional. After services are switched from the active OSMU board to the standby OSMU board, the data on the standby OSMU board is synchronized to the active OSMU board.
 The standby OSMU board has all the functions of the active OSMU board.
 When only one subrack is used, the standby OSMU board is installed in slot 13 (in the Oracle scenario, where the Oracle standby board is installed in slot 14) or slot 14 (in the Sybase scenario). When two subracks are used, the standby OSMU board is installed in slot 14 of subrack 6.

[Figure: the active OSMU board and the standby OSMU board, in a 14-slot subrack alongside the emergency system, database, M2000, LSW, and PRS boards, synchronize data once per hour]

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 72
Contents
3. ATAE Cluster Scheme
3.1 Constraints of ATAE Cluster Design Specifications
3.2 ATAE Cluster Networking
3.3 ATAE Cluster SAN Boot Scheme
3.4 ATAE Cluster Storage Scheme
3.5 Clusters and Services in the ATAE Cluster
3.6 ATAE Cluster Backup and Restore Scheme
3.7 ATAE Cluster Remote HA Solution
3.8 ATAE Cluster Emergency System
3.9 ATAE Cluster OSMU Maintenance Solution
3.10 ATAE Cluster Antivirus Solution

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 73
OSMU Board Antivirus Solution

Note:

 Antivirus software is not configured for the OSMU by


default, and therefore it is not delivered. The
antivirus software is used only for the antivirus
solution.
 For details about the antivirus solution, see the
SUSE Linux OS Antivirus User Guide.
 The antivirus solution is used with ServerProtect for
Linux (SPLX) 3.0. In the case of a version change,
check whether the SPLX version supports the used
SUSE Linux kernel.
 The OSMU server installs the antivirus software
client on each board.
 The OSMU server generates and maintains the
configuration data of antivirus software. It also
synchronizes the configuration data to each board.
 The TMCM server connects to the Internet through a
firewall and regularly updates the antivirus database.
 The TMCM server is a separately delivered PC
Server.

Copyright © 2014 Huawei Technologies Co., Ltd. All rights reserved. Page 74
Thanks
www.huawei.com
