
HELSINKI UNIVERSITY OF TECHNOLOGY Department of Computer Science and Engineering Telecommunications Software and Multimedia Laboratory

Edgar J. Ramos

Analyzing the Media Control interfaces and Mobile Media Gateway for the IP Multimedia Subsystem (IMS)

Master's Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology. Espoo, February 4, 2008

Supervisor: Professor Antti Ylä-Jääski
Instructor: Gonzalo Camarillo, M.Sc.

HELSINKI UNIVERSITY OF TECHNOLOGY


Author: Edgar J. Ramos

ABSTRACT OF THE MASTER'S THESIS

Name of the thesis: Analyzing the Media Control interfaces and Mobile Media Gateway for the IP Multimedia Subsystem (IMS)
Date: Feb 4, 2008
Number of pages: 100
Department: Department of Computer Science and Engineering
Professorship: T-110
Supervisor: Professor Antti Ylä-Jääski
Instructor: Gonzalo Camarillo, M.Sc.

The IMS, as part of the evolution of 3G networks, involves costs and investments for the development of new systems and technologies. The Multimedia Resource Function Processor (MRFP) is an important IMS component that performs operations on the media in the packet-switched domain. Strong synergies have also been identified with the Media Gateway (MGW) of the circuit-switched domain. This thesis mainly analyzes the deployment of the Mp interface, which is used to send control messages to the MRFP. Additionally, several protocol alternatives from the Media Control protocols proposed in the IETF are reviewed, together with the scenarios where an MRFP is needed. At the time of writing, the Mp interface has not been fully standardized. Therefore, the reviewed models and protocols and their impact are analyzed for the Mobile Media Gateway's User Plane Control Function (UPCF) application. The final analysis leads to the selection of a media server control model, including a protocol recommendation for each interface. Finally, a prototype was developed to illustrate the possible adaptations of the Mobile Media Gateway. It was meant to handle the Mp interface, behaving as an MRFP, and to set up IMS IP calls. This final output contributes one possible transition path for the evolution from the circuit-switched domain to the packet-switched domain using the Mobile Media Gateway and the Media Server Control model.

Keywords: Media, Control, IMS, Mp, Mr, MGW, MRFP, MRFC, MRF

Acknowledgements
I want to thank the people who supported me in finishing this work: my family in Panama and here in Finland, my friends and my colleagues at Ericsson. Their best contribution was to provide me with their precious time. I would also like to thank Johan Torsner for his patience and advice. Finally, my gratitude goes to Gonzalo Camarillo, who did excellent work as a tutor despite the difficulties found during the development of this work.

Otaniemi, February 4th, 2008

Edgar Ramos


To my beloved ones, with whom I share my challenges and rewards.

Contents
Abbreviations
List of Figures
List of Tables

1 Introduction
1.1 Background
1.2 Problem Description
1.3 Objectives
1.4 Scope
1.5 Structure of the Thesis

2 IP Multimedia Subsystem
2.1 Overview of the IMS Architecture
2.2 The IP Multimedia Core Network Subsystem
2.3 Signaling plane interfaces
2.4 IMS/CS relationship (3GPP view)
2.5 Media Plane Considerations
2.6 Application Plane Considerations

3 Multimedia Resource Function Processor
3.1 Function
3.1.1 Media Encoding
3.1.2 Media Transport

4 Media Gateway
4.1 Function
4.2 Transport and Signaling
4.3 The Mobile Media Gateway (M-MGW)
4.3.1 Connectivity Packet Platform
4.3.2 Hardware Structure Overview
4.3.3 Application Overview

5 Control Protocol for Media Servers
5.1 Multimedia Resource Function, Media Servers and Physical nodes Implementation

6 Media Server Control Interface Criteria
6.1 Methodology
6.2 Evaluation
6.3 Requirements
6.3.1 3GPP Requirements
6.3.2 IETF requirements

7 Media Server Control Interface Analysis
7.1 3GPP Proposed Protocols for Control of Media Servers
7.2 IETF Proposed Protocols for Control of Media Servers
7.3 Media Server Control Protocols
7.3.1 ITU Recommendation H.248 (Megaco Protocol)
7.3.2 Session Initiation Protocol (SIP)
7.3.3 Media Sessions Mark-up Language (MSML)
7.3.4 Media Server Control Mark-up Language (MSCML)
7.4 Traffic Scenario Implications
7.5 Comparison
7.6 Preferred model

8 Prototype
8.1 Mobile-MGW VoIP prototype
8.2 M-MGW prototype considerations
8.3 Affected User-Plane Control application subsystems
8.3.1 Signalling Transport Converter (STC)
8.3.2 GCP Termination (GCPT)
8.3.3 Connection Coordinator
8.4 M-MGW Prototype testing
8.4.1 H.248 flows between M-MGW and SBC
8.4.2 Call setup procedure
8.4.3 Call Release Procedure
8.4.4 MGW Cold startup flow

9 Conclusions and Future Work
9.1 Future Work

Abbreviations
(A-F)
3G          3rd Generation
3GPP        3rd Generation Partnership Project
AAL1        ATM Adaptation Layer - 1
AAL2        ATM Adaptation Layer - 2
AAL5        ATM Adaptation Layer - 5
AMR         Adaptive Multi-Rate
AS          Application Server
ATM         Asynchronous Transfer Mode
ATIS        Alliance for Telecommunications Industry Solutions
BGCF        Breakout Gateway Controller Function
BICC        Bearer Independent Call Control
B-ISUP      Broadband ISDN User Part
BSC         Base Station Controller
CAMEL       Customized Applications for Mobile network Enhanced Logic
CAP         CAMEL Application Part
CN          Core Network
COPS        Common Open Policy Service protocol
CPP         Connectivity Packet Platform
CS          Circuit Switched
CSCF        Call Session Control Function
DiffServ    Differentiated Services
DTMF        Dual Tone Multi-Frequency
EDGE        Enhanced Data Rates for GSM Evolution
FTP         File Transfer Protocol


(G-M)
GCP         Gateway Control Protocol
GGSN        Gateway GPRS Support Node
GPRS        General Packet Radio Service
GSM         Global System for Mobile Communication
GSM-FR      GSM Full Rate
HSS         Home Subscriber Server
HTTP        Hypertext Transfer Protocol
IANA        Internet Assigned Numbers Authority
I-CSCF      Interrogating - CSCF
ICP         Internal Communication Path
IETF        Internet Engineering Task Force
IIOP        Internet Inter-Object Request Broker Protocol
IMS         IP Multimedia Subsystem
IM-SSF      IP Multimedia Service Switching Function
IP          Internet Protocol
IPsec       IP security
ISDN        Integrated Services Digital Network
ISUP        ISDN User Part
ITU         International Telecommunication Union
IVR         Interactive Voice Response
MEGACO      Media Gateway Control protocol
MGC         Media Gateway Controller
MGCF        Media Gateway Control Function
MGW         Media Gateway
MIME        Multipurpose Internet Mail Extensions
M-MGW       (Ericsson) Mobile Media Gateway
MRFC        Multimedia Resource Function Controller
MRFP        Multimedia Resource Function Processor
MTP3        Message Transfer Part level-3
MTP3b       Message Transfer Part level-3 broadband
MSC         Mobile-services Switching Centre
MTP         Message Transfer Part
M3UA        MTP3 User Adaptation Layer


(O-Z)
OSA-SCS     Open Service Access - Service Capability Server
O&M         Operation and Maintenance
PCM         Pulse Code Modulation
P-CSCF      Proxy-CSCF
PSTN        Public Switched Telephone Network
QoS         Quality of Service
RFC         Request For Comments
RNC         Radio Network Controller
RSVP        Resource Reservation Protocol
RTP         Real-Time Transport Protocol
SAAL        Signaling ATM Adaptation Layer
SSF         Service Switching Function
S-CSCF      Serving-CSCF
SCTP        Stream Control Transmission Protocol
SGW         Signalling Gateway
SIP         Session Initiation Protocol
SLF         Subscription Locator Function
SS7         Signalling System no. 7
SSCF        Service-Specific Coordination Function
SSCOP       Service-Specific Connection-Oriented Protocol
SRTP        Secure RTP
TCP         Transmission Control Protocol
TDM         Time Division Multiplexing
THIG        Topology Hiding Inter-network Gateway
TLS         Transport Layer Security
UDP         User Datagram Protocol
UE          User Equipment
UMTS        Universal Mobile Telecommunications System
VoIP        Voice Over IP
VPN         Virtual Private Network
XML         Extensible Mark-up Language
XCAP        XML Configuration Access Protocol


List of Figures
1.1 Example of MGWs controlled by the MSC
2.1 Simplified IMS architecture
2.2 IMS reference points
4.1 MGW reference points and interfaces
4.2 Mc Interface/Reference point protocol stacks
4.3 Nb Interface/Reference point Protocol stacks [3]
4.4 IPBCP Tunneling [3]
4.5 Control system structure of a CPP configuration with eight Board Processors (BP) and some Special Processors (SP) and a Main Processor Cluster (MPC) with three Main Processors (MP). [82]
4.6 M-MGW hardware structure. [33]
4.7 M-MGW hardware structure. [33][41]
4.8 Connection model in the Connection Coordinator. [41]
5.1 IMS Logical Nodes grouping used for the industry relative to the Media Server
7.1 Example of a H.248 connection model. The asterisk box in each of the Contexts represents the logical association of Terminations implied by the Context [62].
7.2 H.248 Message structure
7.3 H.248 Message size for a Representative Call Flow, based on a benchmark test performed by the Erlang project in [81].
7.4 Encoding and Decoding measurements for H.248 binary encoded messages based on a benchmark test performed by OSS Nokalva in [76]. (The message sizes are the same as for the corresponding messages shown in figure 7.3 for binary encoding)
7.5 SIP Message processing delay of 4 Proxy implementations. Source: [28]
7.6 MSML Object classes relationship
7.7 MSML Core Package Hierarchy
7.8 MSCML Advanced Conference Model
7.9 MSCML Media Server Connection Model
7.10 Possible deployment of the Analyzed Media Server Control Protocols on the 3GPP interfaces of IMS (The ISC interface is not present for simplicity).
7.11 Selected preferred Model for Media Server Control (The ISC interface is not present for simplicity)
8.1 H.248 Context view targeted to the prototype connection model
8.2 H.248 Call setup flow used for the Prototype
8.3 H.248 call release and cold start-up flows used for the prototype

List of Tables
4.1 Example of MGWs features [43]
7.1 H.248 Commands [62]
7.2 H.248 Descriptors [62]
7.3 SIP Request Methods from the core specification [23]
7.4 SIP Request Methods from extensions to the Core specification [23]
7.5 SIP headers defined in the Core Protocol
7.6 MSML object classes
7.7 MSML mandatory and conditional packages
7.8 MSCML Request Methods for advanced conference
7.9 MSCML Request Methods for IVR
7.10 Comparison of the characteristics of the Analyzed Media Control Protocols
7.11 Functionality comparison of the analyzed Media Server Control Protocols

Chapter 1

Introduction
1.1 Background

The IP Multimedia Subsystem (IMS) [14] has been seen as the clear evolution of the packet-switched domain in the cellular world, specifically for 3G networks. Operators and manufacturers have mainly driven this evolution through the standardization bodies. They have agreed on standards and technologies that should allow the shift from the old systems to the new ones. This involves costs and investments from both parties for the development and deployment of such technologies, and sometimes the process specified by the recommendations and agreements does not provide a smooth enough transition for real environments. From now on, we will refer to 3G networks as the ones defined for UMTS (Universal Mobile Telecommunications System) and driven by 3GPP (Third Generation Partnership Project) recommendations, and to the IMS as associated with them.

The Mobile Softswitch has been defined as the architectural model for the mobile core network in which the call control and the switching functions are found in different nodes. It is based on 3GPP Release 4. The Media Gateway (MGW) and the MSC (Mobile Switching Center) Server are the main players in this concept, where the MSC works as a controller for the MGW (see figure 1.1). The MGW handles media stream processing, switching, transcoding and the resources for transport (the links of the connectivity layer), while the MSC is responsible for mobility management and call control.

Figure 1.1: Example of MGWs controlled by the MSC (diagram not reproduced; it shows several MGWs interconnected through an IP/ATM backbone in the connectivity layer, with the MSC in the control layer)

The IMS was introduced in 3GPP Release 5, providing 3G networks with an infrastructure to offer QoS (Quality of Service), charging, coordination and integration of services, and security for the packet-switched domain. Part of the media plane is handled through the Multimedia Resource Function (MRF), which is divided into two functions or logical nodes. The MRFC (Multimedia Resource Function Controller) is in charge of interpreting the SIP (Session Initiation Protocol) messages from the AS (Application Server) and the S-CSCF (Serving Call/Session Control Function), and of controlling the other MRF function, the MRFP (Multimedia Resource Function Processor). The latter is in charge of media stream processing and mixing, playing announcements and controlling bearers on the Mb reference point [13].

1.2 Problem Description

There are strong synergies between the functions performed by the MGW and the functions expected from the MRFP. Both are capable of media stream processing, transcoding and control of the media transport. The possibility of merging these logical nodes into one physical node, the Mobile Media Gateway (M-MGW), has already been proposed [43]. The Mp interface, which the MRFC uses to control the MRFP, has not been standardized at the time of writing. The MGW already uses the H.248 protocol (known in the IETF as MEGACO) on the Mc interface with the MSC. Therefore, this work analyzes the different existing proposals for the Mp interface and for media server control. It also considers how they could impact the M-MGW's User Plane Control Function (UPCF) application, and proposes, through a prototype, an implementation of the Mp interface based on the H.248 protocol.

1.3 Objectives

The first objective is to provide a comparison between the proposed protocols and models for media server control. The second is to explore the implementation synergies between the M-MGW and the MRFP by constructing a prototype. The intention is to provide a better understanding of how the IMS can be integrated as an evolution of 3G networks through the gradual introduction of IMS functions into the core network.

1.4 Scope

The thesis is limited to the Media Gateway (MGW) and the Mp interface of the MRF (Multimedia Resource Function) as contained in 3GPP's Release 6 specifications. The H.248 version discussed is version 2, released by ITU and IETF. In the MGW, the scope is limited to the application that handles the User Plane Control Functions (UPCF), although some hardware-specific details are mentioned. The outcomes and the test environment are presented, although the prototype implementation itself is not provided. The test environment uses an adaptation of a 3G network suitable for our analysis purposes.

1.5 Structure of the Thesis

Following this introductory chapter, the IMS is presented in chapter 2. There, the elements relevant to the interactions between the 3G Circuit Switched network and the IMS are reviewed; the media plane and the services using such interactions are also summarized. The access network and the terminals are irrelevant for our purposes; therefore, we concentrate on the Core Network Subsystem of the IMS. Chapter 3 introduces the tasks of the Multimedia Resource Function Processor (MRFP) and the signaling that goes through the node. Chapter 4 describes the Media Gateway (MGW) implementation: its functions in the media plane, media transport and signaling, together with a hardware and software overview. Chapter 5 provides an overview of the control protocols for media servers and the different models associated with media server control. The evaluation criteria for the analysis carried out in this work are given in chapter 6. The analysis of the different protocol alternatives, the final comparison and the selection of the preferred model are in chapter 7. The practical implementation of the prototype is presented in chapter 8. Finally, chapter 9 provides the conclusions and future work.

Chapter 2

IP Multimedia Subsystem
Mobile networks have traditionally been built in the Circuit Switched (CS) domain, copying the model of the Public Switched Telephone Network (PSTN). After the launch of GPRS (General Packet Radio Service) for GSM, the interest in transporting data over the mobile core network became evident. Packet switching then became an important part of the Core Network specifications, which is fully realized in the 3GPP vision for the All-IP architecture [11]. Within this framework, the IP Multimedia Subsystem (IMS) was defined around the provision of multimedia services that are bound to packet data transmission. It comprises all the elements of the Core Network (CN) that are needed to deliver the different types of multimedia over the Packet Switched (PS) domain [13] [71].

2.1 Overview of the IMS Architecture

The IMS follows a layered architecture consisting of three different layers or planes [78]: application, control, and connectivity & user plane. An example of the IMS layered architecture is provided in figure 2.1. The application layer basically consists of the SIP Application Servers, which provide the content and value-added services to the users. The independence of this layer from the others allows services to be provided regardless of the access used. The control layer handles the registration, setup and release of calls and sessions. It also supplies functional control of the user plane servers (such as MGWs and MRFPs). A large part of the network signaling is produced and received by this layer, including O&M (Operation and Maintenance), charging, and network interworking support.


Finally, the connectivity & user plane manages and processes the media sent by the end-points. This layer is in charge of media traffic transport, routing, switching and network translation (protocol conversions). 3GPP has standardized functions, or what we could call logical nodes. These differ from the physical nodes offered by the vendors, which can merge several functions into one physical node or split them across several entities [24]. In essence, the architecture of the IMS is defined by the logical nodes and the standardized interfaces linking them.

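Figure 2.1: Simplified IMS architecture (diagram not reproduced; it shows the UE, the Access Network, the AS, CSCF, BGCF, HSS, MRFC, MGCF, SGW, MRFP and MGW nodes and the CS Network, with media traffic and signalling traffic drawn separately)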
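The difference between logical and physical nodes can be illustrated with a small Python sketch. The node names (`cscf-node`, `media-node`, `mrfc-node`) and the particular grouping are hypothetical and only serve as an example; they do not describe any vendor's product. The helper `is_internal` rests on the assumption that two functions hosted in the same physical node do not need an external link between them.

```python
# Hypothetical deployment: physical node name -> set of 3GPP logical
# functions hosted by that node. The grouping is illustrative only.
DEPLOYMENT = {
    "cscf-node": {"P-CSCF", "I-CSCF", "S-CSCF"},
    "media-node": {"MGW", "MRFP"},
    "mrfc-node": {"MRFC"},
}


def hosts_of(function, deployment):
    """Return the sorted names of the physical nodes hosting a logical function."""
    return sorted(node for node, funcs in deployment.items() if function in funcs)


def is_internal(func_a, func_b, deployment):
    """Return True if at least one physical node hosts both logical functions."""
    return any(func_a in funcs and func_b in funcs for funcs in deployment.values())


if __name__ == "__main__":
    print(hosts_of("MRFP", DEPLOYMENT))
    print(is_internal("MGW", "MRFP", DEPLOYMENT))
    print(is_internal("MRFC", "MRFP", DEPLOYMENT))
```

In this hypothetical layout the MGW and the MRFP share one physical node, while the MRFC stays separate, so the interface between MRFC and MRFP would still cross the network.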

2.2 The IP Multimedia Core Network Subsystem

According to 3GPP, the IP Multimedia Core Network Subsystem comprises all CN elements for the provision of IP multimedia applications over IP multimedia sessions [12]. For this purpose the CN includes:

1. User database(s): one or several HSSs (Home Subscriber Servers), which can be considered the evolution of the GSM node HLR (Home Location Register). The HSS stores and handles subscriber- and service-related information. This information includes security items, location data, profiles and the S-CSCF (Serving Call Session Control Function) allocated to each subscriber. The SLF (Subscription Locator Function) is used when the information needs to be split across several HSSs. The function of the SLF is to map the address of a subscriber to the HSS that contains the information for that subscriber [24].

2. The CSCFs (Call/Session Control Functions). They are SIP servers in charge of routing and session management. The entities grouped in this category are:

   - P-CSCF (Proxy Call Session Control Function). It is the first signaling contact point within the Core Network (CN) for the terminal. It accepts requests (SIP messages) and either processes them itself or forwards them into the CN, depending on what is required. It may include a PDF (Policy Decision Function), which could also be implemented in a separate node [14].

   - I-CSCF (Interrogating CSCF). It is a SIP proxy that serves as the contact point within an operator's network for all connections destined to a subscriber of that network, or to a roaming subscriber currently located in that operator's service area. It has an interface using the Diameter protocol to the SLF and the HSS. It can hide some domain information by encrypting parts of the SIP message. This last feature is known as THIG (Topology Hiding Inter-Network Gateway) and it is optional [14].

   - S-CSCF (Serving CSCF). It is the main control entity in the signaling plane. Basically, it executes the session control services for the User Equipment and maintains the session state needed to support the operator's services. The S-CSCF analyzes the terminal's SIP messages and decides how to route them to the application servers able to provide the service to the subscriber. Another function of the S-CSCF is to provide routing services, for example the translation of telephone numbers given as addresses instead of SIP URIs (Uniform Resource Identifiers). The enforcement of network policies on users is also done by the S-CSCF, depending on the user's service profile and the operator's policy configuration [14] [24].

3. The ASs (Application Servers), which provide and host the multimedia services and make up the application plane. Three types of AS have been defined:

   - SIP AS: It is a native Application Server that executes multimedia services based on SIP. It will become the most used type for the development of new IMS-specific services [24].

   - OSA-SCS (Open Service Access - Service Capability Server): This type of application server supplies an interface to the OSA (Open Service Access) framework Application Server and, at the same time, interfaces the S-CSCF with SIP. It is an important mechanism for providing secured and standardized access to the IMS for a third-party AS.

   - IM-SSF (IP Multimedia Service Switching Function): It is a specialized application server designed to support CAMEL (Customized Applications for Mobile network Enhanced Logic) services for legacy reasons. On one side it acts as an SSF (Service Switching Function), interfacing the gsmSCF by using CAP (CAMEL Application Part) and allowing it to handle an IMS session. On the other side, it behaves as a SIP server interfacing the S-CSCF.

4. BGCF (Breakout Gateway Control Function). The network can have one or more of them. It is the point of breakout from the IMS to the Circuit Switched network, and it is used when an IMS subscriber is trying to reach the CS network. Despite being a SIP server, it routes on the phone number: it sends the request to the MGCF (Media Gateway Control Function) if the breakout takes place in the same network, or to a BGCF in the destination network otherwise [24][78].

5. PSTN (Public Switched Telephone Network) - CS (Circuit Switched) gateways. They are decomposed into:

   - MGCF (Media Gateway Control Function). It is a control and signaling node. It maps the SIP protocol to ISUP (ISDN User Part) or BICC (Bearer Independent Call Control), both over IP. The MGCF also controls the MGW resources using the H.248 protocol.

   - SGW (Signaling Gateway). Sometimes requests need to be transferred from the IMS to the CS networks and vice versa. The SGW is in charge of the low-level protocol conversion that makes the transport compatible and the signaling networks mutually understandable. Basically, it converts MTP (Message Transfer Part) into SCTP (Stream Control Transmission Protocol) over IP [24].

   - MGW (Media Gateway). It does the media plane conversion between the IMS and the CS networks. The conversion can include a change of transport protocol (for example, from RTP to AAL2), transcoding (e.g., from AMR to G.711) or stream processing (echo cancellation, playing of announcements, DTMF, etc.). It receives control instructions from the MGCF.

6. MRF (Multimedia Resource Function). It is divided into two logical nodes:

   - The MRFP (Multimedia Resource Function Processor) provides the necessary resources to support the user-plane services requiring media processing in the IMS domain. It is capable, for example, of performing conference mixing, transcoding, media analysis and stream treatment, and of providing the media to play announcements.

   - The MRFC (Multimedia Resource Function Controller) controls the MRFP resources using H.248 and acts as a SIP User Agent [24]. It was designed to guarantee the separation between the control plane and the user plane for the IMS media resources.
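Two of the decisions described in the list above, the SLF lookup and the BGCF breakout choice, can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the user identities, the HSS names, the network names and the function names `locate_hss` and `bgcf_next_hop` are all hypothetical, and a real SLF is queried over Diameter rather than through an in-memory table.

```python
# Hypothetical SLF table: public user identity -> name of the HSS that
# stores the data of that subscriber.
SLF_TABLE = {
    "sip:alice@example.com": "hss-1",
    "sip:bob@example.com": "hss-2",
}


def locate_hss(user_identity, slf_table, single_hss=None):
    """Return the HSS holding the data of a subscriber.

    With a single HSS in the network no SLF lookup is needed, so the
    single HSS is returned directly.
    """
    if single_hss is not None:
        return single_hss
    try:
        return slf_table[user_identity]
    except KeyError:
        raise LookupError(f"no HSS known for {user_identity}") from None


def bgcf_next_hop(breakout_network, local_network):
    """Return the next hop chosen by a BGCF as a (node type, network) pair.

    The request goes to the local MGCF when the breakout happens in the
    same network, otherwise to a BGCF in the destination network.
    """
    if breakout_network == local_network:
        return ("MGCF", local_network)
    return ("BGCF", breakout_network)


if __name__ == "__main__":
    print(locate_hss("sip:bob@example.com", SLF_TABLE))
    print(bgcf_next_hop("operator-b", "operator-a"))
```

The sketch leaves out how the BGCF selects the breakout network from the phone number, since the text does not describe that step.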

2.3 Signaling plane interfaces

The signaling in IMS takes place over the reference points (see figure 2.2). They are links between the different IMS logical nodes, using protocols selected by 3GPP. The protocols themselves have been standardized by the IETF (Internet Engineering Task Force) and the ITU (International Telecommunication Union). The reference points (we will also call them interfaces) in IMS, grouped by protocol, are [13]:

1. Reference points using Diameter (specified in RFC 3588 [22]):

   - Cx Interface. It is implemented between the I-CSCF or S-CSCF and the HSS. It is mainly used to allocate the S-CSCF, to retrieve user information and for authentication purposes.

   - Dx Interface. It is implemented between the I-CSCF or S-CSCF and the SLF. It is used to find the HSS where the user's data is stored.

   - Dh Interface (Release 6). It is implemented between the AS and the SLF. It is used by the AS to find the right HSS to contact in a multi-HSS environment.

   - Sh Interface. It is implemented between the AS and the HSS. It is used by the AS to access the HSS records.

   - Si Interface. It is implemented between the HSS and the IM-SSF. It is used by the IM-SSF to query CAMEL subscription information from the HSS.

   - Gq Interface (Release 6). It is implemented between the P-CSCF and the PDF. It is used to exchange policy decisions and information between both entities.

2. Reference points using SIP (specified in RFC 3261 [86]):

   - Mw Interface. It is implemented between two CSCFs. It is used as a communication channel (SIP proxy) between the CSCFs.

   - ISC Interface. It is implemented between the S-CSCF and an AS. ISC stands for IMS Service Control, and it is used to carry charging information.

   - Mi Interface. It is implemented between the S-CSCF and the BGCF. It is used by the CSCF to forward a session to the BGCF when the session needs to be routed to the CS domain.

   - Mj Interface. It is implemented between the MGCF and the BGCF. When the BGCF selects the CS network where the breakout will happen and that network is the home network, the Mj interface is used to forward the session to the MGCF.

   - Mk Interface. It is implemented between two BGCFs. It is used when the CS breakout happens in a network different from the home network. The session is then forwarded to the BGCF in that network.

   - Mr Interface. It is implemented between the S-CSCF and an MRFC. It is used by the S-CSCF to activate bearer-related services. The functionality is not fully standardized.

   - Mg Interface. It is implemented between the MGCF and an I-CSCF. The MGCF uses this interface to forward CS sessions to the IMS domain after converting the ISUP signaling to SIP.

   - Mm Interface. It is implemented between the I-CSCF and another SIP terminal or server. In general, it is used by the IMS (specifically by the I-CSCF) to communicate with other IP multimedia networks by using SIP.

   - Gm Interface. It is implemented between the UE (in reality the access network) and the P-CSCF. It is used as the connection between the UE and the IMS. The interface allows three types of procedures: registration, session control and transactions [78]. The UE registers with the P-CSCF, sending information about the supported security mechanisms and exchanging authentication data with the network. This procedure implies the registration of user identities and the setup of secure means of communication between the UE and the IMS. Re-authentication requests and network-initiated de-registration are also handled by this interface. After this, a session (UE terminated or originated) can be established and handled using the Gm interface to forward the requests. Finally, stand-alone requests and responses are sent over the interface by the transaction procedures.

3. Reference points using MEGACO (ITU-T Recommendation H.248)[62]:


Mp Interface. It is implemented between the MRFC and the MRFP. It is used by the MRFC to control the resources and media streams in the MRFP.
Mn Interface. It is implemented between the MGCF and an IMS-MGW. It is used by the MGCF to control the resources and MGW functions.

4. Reference point using COPS (Common Open Policy Service protocol specified in RFC 2748 [31]):
Go Interface. It is implemented between the PDF (P-CSCF) and the GPRS network. It is used for QoS authorization and charging correlation between the IMS and the GPRS.

5. Reference point using HTTP (HyperText Transfer Protocol) and XCAP (XML Configuration Access Protocol):
Ut Interface (Release 6). It is implemented between the UE (in reality the access network) and one (or several) AS. It is used to allow data manipulation for configuration purposes from the UE.
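
As an illustration, the reference points listed in this section can be kept in a small lookup table. The sketch below is only a reading aid: the class and function names are hypothetical, and the assignment of Mw, ISC, Mi, Mj, Mk, Mr, Mg, Mm and Gm to SIP is an assumption inferred from the descriptions above rather than something stated for every interface.

```python
from collections import defaultdict
from typing import NamedTuple


class ReferencePoint(NamedTuple):
    name: str
    endpoints: tuple  # the two entities the reference point connects
    protocol: str


# Entries follow the list in the text; the SIP grouping is an assumption.
REFERENCE_POINTS = [
    ReferencePoint("Mw", ("CSCF", "CSCF"), "SIP"),
    ReferencePoint("ISC", ("S-CSCF", "AS"), "SIP"),
    ReferencePoint("Mi", ("S-CSCF", "BGCF"), "SIP"),
    ReferencePoint("Mj", ("MGCF", "BGCF"), "SIP"),
    ReferencePoint("Mk", ("BGCF", "BGCF"), "SIP"),
    ReferencePoint("Mr", ("S-CSCF", "MRFC"), "SIP"),
    ReferencePoint("Mg", ("MGCF", "I-CSCF"), "SIP"),
    ReferencePoint("Mm", ("I-CSCF", "external SIP network"), "SIP"),
    ReferencePoint("Gm", ("UE", "P-CSCF"), "SIP"),
    ReferencePoint("Mp", ("MRFC", "MRFP"), "H.248"),
    ReferencePoint("Mn", ("MGCF", "IMS-MGW"), "H.248"),
    ReferencePoint("Go", ("PDF", "GPRS"), "COPS"),
    ReferencePoint("Ut", ("UE", "AS"), "HTTP/XCAP"),
]


def by_protocol(points):
    """Group reference point names by the protocol they use."""
    grouped = defaultdict(list)
    for point in points:
        grouped[point.protocol].append(point.name)
    return dict(grouped)


def between(points, a, b):
    """Return the names of the reference points connecting entities a and b."""
    return [p.name for p in points if set(p.endpoints) == {a, b}]
```

For example, between(REFERENCE_POINTS, "MRFC", "MRFP") returns the Mp reference point, which is the one this thesis concentrates on.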

2.4

IMS/CS relationship (3GPP view)

Even though the IMS operates in the packet-switched domain, it needs to keep a relationship with the circuit-switched domain for several reasons: backwards compatibility and legacy, interworking with other networks and UEs, and inability to provide support for certain services (e.g. emergency calls in the earlier phases). Although such compatibility is not mandatory from the point of view of the specifications, it is clearly important for the operators to be able to interact between both domains [74].

Figure 2.2: IMS reference points
(Diagram: the visited and home IMS networks with the AS, P-CSCF, I-CSCF, S-CSCF, BGCF, MGCF, SGW, MRFC, MRFP, MGW, HSS, SLF and PDF, together with the access network (GGSN), the CS network, the IP multimedia network and the UE. Reference points shown: Mk, Mj, Mi, Mg, Mr, Mw, Mn, Mp, Mb, Nb, Gm, Go, Gi, Ut, ISC, Si, Cx and Dx.)

As has been stated, the PSTN/CS Gateway provides the interface for the IMS toward a CS network. It handles the calls terminated and originated in the CS domain in which the IMS takes part. This Gateway has its counterpart in the CS domain, which in practical terms can be the same entity. Concerning this, the 3GPP Technical Specification TS 23.002 points out that the packet-switched domain and the circuit-switched domain "...are overlapping, i.e. can contain some common entities" [13]. TS 23.228 also states that "The IP multimedia subsystem is independent of the CS domain although some network elements may be common with the CS domain" [13].

The specifications distinguish between the IMS-MGW and the CS-MGW (Circuit Switch MGW): the former is the MGW operating in the packet-switched domain and the latter the one in the circuit-switched domain. The reason for this separation is mainly the media processing done in the CS domain and the possibility of performing it in the PS domain (to adapt media for the CS domain, e.g. echo cancelling). Pure media processing for the IMS has been specified as a task of the MRFP, and a media connection can be established between both entities (MGW and MRFP), allowing media processing in one and protocol transcoding in the other. The CS MSC (Mobile-services Switching Center) and the IMS MGCF also have common functions when it comes to MGW control. Apart from some particularities specific to each network, the MGW handling and control can be considered identical.

2.5

Media Plane Considerations

The IMS follows the signaling and media separation model, which allows it to offer guaranteed QoS (Quality of Service), media services, and flexible charging. Nevertheless, the IMS is still an IP network, and the transport and routing of the media follow IP principles. Because the IMS is an interconnected network interworking with other networks (e.g. the Internet), the mechanisms that provide the expected features require special handling of the packets (using RSVP or DiffServ for QoS) and even specialized protocols (RTP, SRTP, SCTP, IPsec, etc.). The media traverses an IP network by following a channel set up by session control. The media may or may not traverse network nodes (for example the MRFP or the MGW), depending on whether it needs any treatment to make it suitable for the communication. For our study, we will concentrate on the cases in which the MRFP and the MGW must be used in the IMS.
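
To make the DiffServ packet handling mentioned above more concrete, the following minimal sketch shows how a DiffServ code point (DSCP) is placed in the upper six bits of the former IP TOS byte. The code point values for EF and AF41 are the standard ones; the function names and the idea of marking voice media with EF are assumptions made only for illustration, not something prescribed by the IMS specifications discussed here.

```python
# Standard DiffServ code points (decimal values).
DSCP_BEST_EFFORT = 0
DSCP_AF41 = 34  # Assured Forwarding class 4, low drop precedence
DSCP_EF = 46    # Expedited Forwarding


def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Build the 8-bit DS/TOS field: DSCP in the upper 6 bits, ECN in the lower 2."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must fit in 2 bits")
    return (dscp << 2) | ecn


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP from an 8-bit DS/TOS field."""
    return (tos >> 2) & 0x3F
```

With these definitions, dscp_to_tos(DSCP_EF) gives 0xB8, the value a sender would write into the IP header of a packet that should receive expedited forwarding.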

2.6

Application Plane Considerations

The Application Servers (ASs) perform the storage and service provisioning functions. Sometimes this implies the processing and analysis of media. For that reason, it makes sense in practice to merge the MRFC functionality with the specific AS that needs the media processing. In this way, the media processing is done by an MRFP controlled by an AS. From this point of view, the MRFC could seem to belong to the application plane, because the media processing is done as part of a service provided by the IMS or as the service itself. Strictly speaking, however, the MRFC acts as a controller for the MRFP, and for that reason we will consider it part of the control plane.

Chapter 3

Multimedia Resource Function Processor


The MRFP is a part of the general Multimedia Resource Function. It is controlled by the MRFC using the Mp reference point. By this interface it receives orders from the Application Servers (AS) and indirectly through the S-CSCF. It is also connected using the Mb interface with other MRFPs, MGWs, other IMS networks and the GGSN (Gateway GPRS Support Node) for the media processing.

3.1

Function

The 3GPP specifications [13, 14] define the tasks of the MRFP as follows:
Bearer control on the Mb reference point.
Mixing of media streams (video and audio, multiparty, etc.).
Source of media (e.g. announcements, tones, etc.).
Processing of media streams (e.g. transcoding, analysis of media).
Resource provisioning for the MRFC control.

In practice, the MRFP will be used as a support node for the provisioning of multimedia services. The implementation of the functions is largely left open to the manufacturers, as are the capacity and service parameters.

3.1.1

Media Encoding

Media encoding concerns the digital handling of media. Video and audio are digitized with several technologies, and their interworking has to be secured by transcoding mechanisms and translation techniques. The media may also need manipulation to provide a service to the subscribers, or several media types or media streams may need to be mixed to achieve simultaneous transmission and reception of media from several sources. One example is an AS controlling the MRFP to adapt the media sent to one subscriber from a media server by transcoding and mixing the streams in a conference call. In this case, the MRFP can produce the media or transcode the media from a media server somewhere in the network (or in another interconnected IP network). The mixing can then be done for several subscribers sharing the session.

3GPP has specified which codecs must be supported in IMS terminals [5]. These are the AMR (Adaptive Multi-Rate) speech codec and the H.263 video codec. Terminals that support real-time text conversational services must use T.140 (ITU-T Recommendation T.140 [53]), and those providing wideband services support AMR-WB (AMR Wideband). On the other hand, some non-3G terminals may support different codecs. The most common ones for speech are G.711 [54], also known as PCM (Pulse Code Modulation), and GSM-FR (GSM Full Rate) [35], implemented in all GSM terminals. For video, they are the MPEG (Moving Picture Experts Group) video standards (MPEG-1 [45], MPEG-2 [46] and MPEG-4 [44]) and H.261 [47], which is also used by the H.320 [55] video teleconferencing framework [24].
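
The mixing step of the conference example can be sketched as follows. This is a simplified illustration and not the way any particular MRFP implements it: it assumes that every participant's audio has already been transcoded to linear 16-bit PCM at a common sample rate, and the function names are hypothetical. Each listener receives the sum of all other participants, limited to the 16-bit range.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip16(value):
    """Limit a sample to the signed 16-bit range to avoid wrap-around."""
    return max(INT16_MIN, min(INT16_MAX, value))


def mix_for_listener(streams, listener):
    """Mix the samples of every participant except the listener.

    streams maps a participant name to a list of linear PCM samples
    (an assumption: decoding from AMR, G.711, etc. happened earlier).
    """
    others = [samples for name, samples in streams.items() if name != listener]
    if not others:
        return []
    length = min(len(samples) for samples in others)
    return [clip16(sum(samples[i] for samples in others)) for i in range(length)]
```

Excluding the listener's own stream avoids sending participants an echo of their own voice, which is why a conference with N parties needs N different mixes.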

3.1.2

Media Transport

The MRFP takes care of congestion and QoS for the bearer control. It can also handle the security associations for the media transport and the establishment and release of connections over IP. This is useful considering the Mb reference point connections to other IP networks, e.g. GPRS, to set up VPN (Virtual Private Network) relationships and other IP connection services. Because of the properties of IP, other protocols are used on top of it to provide the services not available with pure IP. The most important protocols used by the MRFP are RTP, TCP, UDP and SCTP.

RTP (Real-Time Transport Protocol) [88] is used to transmit digitized media that is sensitive to delay and time continuity [26]. For this purpose, it provides sequence numbers and timestamps in the packets. The sequence numbers let the receiver detect packet loss and restore the correct order of the packets in the receiving buffer. The timestamps allow playback control and delay handling. RTP is carried over UDP (User Datagram Protocol) datagrams mainly for concurrency reasons, allowing one piece of equipment to handle several RTP streams at the same time. RTCP (RTP Control Protocol) is used together with RTP and complements its real-time functions. It provides out-of-band communication between the sender and the receiver, with information about the performance of the network and additional information about the media sent in the RTP packets. It is transported over UDP and uses the port number immediately following the one used by the RTP stream.

For specific messages, SCTP (Stream Control Transmission Protocol), specified in RFC 2960 [90], is used. This protocol is connection-oriented, which means that an association is established at startup between two endpoints, each providing a set of transport addresses to send and receive the messages. The basic function of SCTP is to offer reliable transport for messages between the peers while avoiding the disadvantages of TCP and UDP. Its main advantages are multihoming, better DoS protection than TCP, and the handling of multiple streams. In our study, SCTP will mainly be seen in the signaling plane, where it is used to transport the H.248 messages over the Mp reference point.

TCP (Transmission Control Protocol) [80] has traditionally been used for reliable transport. It was created to give IP the ability to deliver data reliably to different processes in the same host. It is a connection-oriented protocol: it creates a virtual circuit by reserving a port and authorizing transmission from both parties with full-duplex capabilities. It is a stream-oriented protocol based on buffered transfer [26]. UDP (User Datagram Protocol) [79], in contrast, is a connectionless and unreliable protocol, but it adds to IP the capacity to distinguish among multiple processes in a destination host [26]. The application handling the UDP transmission has to take care of delay, loss and duplication of packets, ordering, and reachability of the destination. Nevertheless, UDP is used because of its low overhead and simple packet transmission.

UDP can cause network congestion when it is used to transmit large quantities of data, because it lacks congestion control. Other protocols, such as DCCP (Datagram Congestion Control Protocol) [69], have been developed to handle the network congestion that unreliable transport services can cause.
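
The role of the RTP sequence numbers and of the RTCP port convention described above can be illustrated with a short sketch. It packs and parses only the fixed 12-byte RTP header (version 2, no padding, no extension, no CSRC entries), ignores sequence number wrap-around, and uses hypothetical function names; it is meant as a reading aid, not as a complete RTP implementation.

```python
import struct

# Fixed RTP header: V/P/X/CC byte, M/PT byte, sequence number,
# timestamp and SSRC, all in network byte order (12 bytes).
RTP_HEADER = struct.Struct("!BBHII")


def build_rtp(seq, timestamp, ssrc, payload_type=0, payload=b""):
    """Build a minimal RTP packet (payload type 0 is G.711 mu-law)."""
    first_byte = 2 << 6  # version 2, no padding, no extension, CC = 0
    header = RTP_HEADER.pack(first_byte, payload_type & 0x7F,
                             seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def parse_rtp(packet):
    """Extract the fields of the fixed RTP header and the payload."""
    first, second, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    return {
        "version": first >> 6,
        "payload_type": second & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[RTP_HEADER.size:],
    }


def reorder_and_find_gaps(packets):
    """Sort packets by sequence number and report missing numbers.

    Simplification: the 16-bit sequence number is assumed not to wrap.
    """
    parsed = sorted((parse_rtp(p) for p in packets), key=lambda h: h["seq"])
    seen = {h["seq"] for h in parsed}
    first, last = parsed[0]["seq"], parsed[-1]["seq"]
    missing = [s for s in range(first, last + 1) if s not in seen]
    return parsed, missing


def rtcp_port(rtp_port):
    """RTCP conventionally uses the port following the RTP port."""
    return rtp_port + 1
```

A receiver that gets packets 12, 10 and 13 would reorder them to 10, 12, 13 and learn that packet 11 was lost, while the timestamps tell it when each payload should be played.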

Chapter 4

Media Gateway
Traditionally, gateways have acted as translators between networks, allowing communication to be carried on even when the protocols, topologies and transmission media differ. Media Gateways also adapt the media transmitted in the payload so that it is compatible with the different technologies used for digital transmission and with the communicating terminals. 3GPP has included the Media Gateways as the main edge nodes for connectivity in the Circuit Switched Core Network. They provide access to the network backbone for the media (sometimes acting as a transit switch), payload processing, media conversion and bearer control, to support the Circuit Switched service options [13]. This is especially important in the transition from GSM networks to the all-IP vision, when several networks exist that need to communicate and interact with each other. The Media Gateways accomplish this in the media plane, and the evolution of the 3G networks will define the need for this node depending on how far the all-IP vision already mentioned develops.

4.1

Function

The Media Gateway function specified by 3GPP is split in two: the Circuit Switched Media Gateway function and the Packet Switched (or IMS) Media Gateway function. The two functions may or may not be implemented in the same node, and each may be implemented only partially (not the complete set for each type). In practice, the IMS-MGW and the CS-MGW may differ only in the transport used for the payload, although that fact alone causes important implementation differences. 3GPP just outlines the MGW's functions without giving an exact definition of the features needed to "...support the Iu options for CS services" [13] for the CS-MGW and "...will be provisioned with the necessary resources for supporting UMTS/GSM transport media" [13], as specified in the 3GPP Network Architecture specification TS 23.002. The specific features can then be inferred from those requirements. Teemu Hares, in his master's thesis "Adding multimedia resource function processor functionality to Mobile Media Gateway", already identifies this and presents an example of the Media Gateway features needed, which can be seen in table 4.1.

Table 4.1: Example of MGW features [43]
Functions: Switching/routing; Signalling; Protocol conversion; Transcoding; Media processing; Quality; Security; Services
Features: ATM switch; Real-time IP router (support of IPv4 and IPv6); Signaling with H.248, SS7, BICC; Convert transport protocols; AMR speech; Speech coded with other codecs; Echo cancellation; QoS handling; Differentiated Services (DiffServ); IP Security (IPSec); Multiparty connections; Tone sending/detection; DTMF sending/detection; Announcement handling; Modem services and digital data access; CS data; Charging information collection

As specified, the control layer controls the Media Gateway through the Media Gateway Control Function (MGCF), the Mobile-services Switching Centre (MSC) server and the Gateway MSC (GMSC). The Media Gateway interacts with those entities by implementing the termination and context view provided by the H.248 abstraction. This will be further explained in section 7.3.1.

Because the Media Gateway is at the border of the backbone and interacts with other networks, it needs to implement a dual routing/switching function. If the backbone is an ATM network, the MGW switches the cells originated by the Radio Network to or from the backbone, and it even serves as a transit node where the cells are routed to other MGWs or ATM switches. In the case of an IP backbone, the routing mechanism works differently from ATM. The MGW processes only the packets addressed to itself; it does not act as a transit node unless the topology of the network requires it and a routing table includes the MGW as a link to reach another MGW. Also, the MGWs must know where to send the packets to reach the end destination, and maintain a QoS level by keeping their own routing table if necessary.

When connected to different networks (PSTN, UTRAN, UMTS, ISDN, etc.), the MGW may need to implement translation mechanisms to be able to send the media with the right protocol format, data encoding and media frame to each of the networks interacting with it. On the data transmission level, the most common conversions are between ATM and IP, and between the ATM adaptation layers AAL1 and AAL2 (mainly for TDM data, used mostly for PSTN and in some cases for GSM).
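
The termination and context view mentioned above can be pictured with a very small model. The sketch below is an assumption-laden simplification, not the M-MGW design: it only captures that H.248 groups terminations into contexts, that an Add command without a given context creates a new one, and that a context disappears when its last termination is subtracted. Class and method names are hypothetical.

```python
class Context:
    """A context groups the terminations that take part in one connection."""

    def __init__(self, context_id):
        self.context_id = context_id
        self.terminations = set()


class MediaGatewayModel:
    """Toy model of the context and termination view of a gateway."""

    def __init__(self):
        self._next_id = 1
        self.contexts = {}

    def add(self, termination, context_id=None):
        """Add a termination; context_id=None asks the gateway to pick a new context."""
        if context_id is None:
            context_id = self._next_id
            self._next_id += 1
            self.contexts[context_id] = Context(context_id)
        self.contexts[context_id].terminations.add(termination)
        return context_id

    def subtract(self, termination, context_id):
        """Remove a termination; an empty context is deleted implicitly."""
        context = self.contexts[context_id]
        context.terminations.discard(termination)
        if not context.terminations:
            del self.contexts[context_id]
```

A two-party call would then be modelled by adding one termination facing the radio network and one facing the backbone to the same context, and subtracting both when the call ends.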

4.2

Transport and Signaling

The Media Gateway has several interfaces and reference points. In the circuit-switched domain, the Mc interface connects it to the GMSC and the MSC server, and the Nb interface connects it to other MGWs in the network. The Mc interface (reference point) also connects the MGCF from the IMS to the MGW, and the Mb interface connects the MGW with the MRFP and with the GGSN from the GPRS network. The radio network is connected through the IuCS interface, and the BSC can also use the A interface directly to the MGW. This can be seen in figure 4.1.

Figure 4.1: MGW reference points and interfaces
(Diagram: the MGW and its links to the radio network nodes (BSC and RNC, over A and IuCS), the CS domain nodes (MSC, MSC + VLR and another MGW, over Mc and Nb), the IMS nodes (MGCF and MRFP, over Mc and Mb), the GPRS nodes (GGSN, over Mb) and the PSTN.)

The Mc interface transports the H.248 signaling messages that control the MGW. The rest of the protocol stack is described in the 3GPP specification TS 29.232, "Media Gateway Controller (MGC) - Media Gateway (MGW) interface" [4]. The transport can be pure ATM, pure IP, or a mix of both. In pure ATM transport, H.248 is carried on top of MTP3b (Message Transfer Part level-3 broadband) [51], which provides the point-to-point link communication and the capacity for load sharing and changeover between a set of links. The rest of the stack is SSCF (Service-Specific Coordination Function) [50] and SSCOP (Service-Specific Connection-Oriented Protocol) [48], which make up the Signaling ATM Adaptation Layer (SAAL) [49], on top of AAL5 (ATM Adaptation Layer 5). In pure IP transport, the stack is simpler: H.248 runs on top of SCTP (Stream Control Transmission Protocol) [90] over IP, which provides the connection-oriented relation needed to comply with the redundancy and load handling requirements of the signaling. The specification also warns against using IPsec for the Mc interface because SCTP provides the necessary security. When both kinds of signaling are used, M3UA (MTP3 User Adaptation Layer) [6] shall be added to the SCTP stack, providing the necessary interworking with all the systems. It may also be added in a pure IP network, in search of a more flexible implementation of the nodes. Figure 4.2 illustrates the possible stacks for the Mc interface.

Figure 4.2: Mc Interface/Reference point protocol stacks
IP transport: H.248 / SCTP / IP
IP - ATM transport: H.248 / M3UA / SCTP / IP
ATM transport: H.248 / MTP3b / SSCF / SSCOP / AAL5 / ATM

The Iu-CS interface connects the Media Gateway with the Radio Network Controller (RNC) and the Base Station Controller (BSC). This interface also connects the Radio Access Network (UTRAN) to the Core Network. ATM and IP can be used as transport for this interface [7].

When the Iu-CS interface is not implemented for the BSC, the A interface can be used instead [1]. It was designed for the connection of the EDGE (Enhanced Data Rates for GSM Evolution) Radio Access Network to the MSC, and only the user data is routed to the MGW. Backwards compatibility with GSM and the MSC split (MSC-Server/MGW) are the reasons this interface appears in the 3G network layout.

The Nb interface provides the transport for user plane data between MGWs and bearer control [3]. It can use ATM or IP transport, and the protocol stack differs between the transport network user plane and the transport network control plane. For the user plane, the stack in the ATM case is AAL-2 SAR SSCS (AAL type 2 Segmentation And Reassembly Service Specific Convergence Sublayer) [52]/AAL2 [57]/ATM, and in the IP case it is RTP/UDP/IPv4 or IPv6. The transport network control plane for ATM uses AAL2 connection signaling (ITU-T Q.2630.2 [56])/AAL2 Signaling Transport Converter for MTP3b (ITU-T Q.2150.1 [61])/MTP3b/SSCF-NNI/SSCOP/AAL5/ATM. IPBCP (IP Bearer Control Protocol) [60] shall be tunneled as specified in 3GPP TS 23.205, "Bearer-independent circuit-switched core network" [2]. Figures 4.3 and 4.4 show the Nb stacks and the IPBCP tunneling route.

The Mb interface/reference point provides access to the IPv6 network services that transport user data [13]. According to the specification, the Mb interface can be the same as the Gi interface from the GPRS network. Therefore, the MGW keeps a transport link with the GPRS network (specifically the GGSN) and with the MRFP from the IMS.

The PSTN interface connects the MGW to the PSTN. The Media Gateway uses this interface only to transport the user plane media; the control signaling goes to the MSC server or the GMSC. From the transport point of view, the MGW is considered to be an external ISDN exchange point for the PSTN.

Figure 4.3: Nb Interface/Reference point Protocol stacks [3]
IP transport, user plane: RTP, RTCP / UDP / IPv4 or IPv6
ATM transport, user plane: AAL-2 SAR SSCS / AAL2 / ATM
ATM transport, control plane: AAL2 connection signalling (Q.2630.2) / AAL2 STC for MTP3b / MTP3b / SSCF-NNI / SSCOP / AAL5 / ATM
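
The protocol stacks described in this section for the Mc and Nb interfaces can be summarised as plain data, which makes it easy to compare them. The layer names come from the text; the dictionary keys and helper functions are hypothetical names chosen for this sketch.

```python
# Layers are listed from top (signaling or media) to bottom (transport).
MC_STACKS = {
    "ip": ["H.248", "SCTP", "IP"],
    "ip-atm": ["H.248", "M3UA", "SCTP", "IP"],
    "atm": ["H.248", "MTP3b", "SSCF", "SSCOP", "AAL5", "ATM"],
}

NB_STACKS = {
    ("ip", "user"): ["RTP", "UDP", "IP"],
    ("atm", "user"): ["AAL-2 SAR SSCS", "AAL2", "ATM"],
    ("atm", "control"): ["AAL2 connection signalling", "AAL2 STC for MTP3b",
                         "MTP3b", "SSCF-NNI", "SSCOP", "AAL5", "ATM"],
}


def describe(stack):
    """Render a stack in the slash notation used in the text."""
    return " / ".join(stack)


def common_layers(stacks):
    """Return the layers that appear in every one of the given stacks."""
    stacks = list(stacks)
    first, rest = stacks[0], stacks[1:]
    return [layer for layer in first if all(layer in s for s in rest)]
```

Calling common_layers(MC_STACKS.values()) shows that H.248 is the only layer shared by all three Mc variants, which is why the choice of transport does not affect the control protocol seen by the MGW application.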

4.3

The Mobile Media Gateway (M-MGW)

The specifications do not standardize the implementation of the different functions in the network (which we know as logical nodes). Instead, they give a set of requirements and tasks that must be accomplished by the logical entities. The manufacturers and operators can choose how these functions are implemented; e.g. several logical nodes can be grouped into one physical node. The circuit-switched MGW functions are implemented in the Mobile Media Gateway (M-MGW) together with some useful extra functionality. The M-MGW can be used as a Signaling Gateway (SGW), an ATM packet handler (in addition to the switching point), a real-time IP router and a media stream handler.

The evolution of the 3G networks, especially in scenarios with incumbent GSM operators adding WCDMA and in greenfield [34] scenarios, has given the M-MGW the opportunity to develop into an important part of the operators' network planning. The flexibility and versatility of the node, which allows the capacity or the physically implemented functions to be changed or upgraded even during operation (or with short breaks in operation), is an attractive feature in a competitive market. These facts make it feasible to study the possibility of including IMS functionalities among the features of the node. Many of the M-MGW characteristics are possible thanks to the Connectivity Packet Platform (CPP), formerly the Cello Packet Platform [68].

Figure 4.4: IPBCP Tunneling [3]
(Diagram: two MSC-Servers connected over Nc, each controlling an MGW over Mc, with the two MGWs connected over Nb. Labels: IPBCP: Q.1970; Tunnel: Q.1990; BICC: Q.765.5; TS 29.232.)

4.3.1

Connectivity Packet Platform

The CPP is a proprietary platform for execution and transport, providing interfaces to the applications that use its services and execute on top of it. CPP has been designed to support redundancy, scalability, a distributed execution environment and transport services. It includes a real-time operating system, a distributed real-time database and O&M (Operation and Maintenance) support. CPP is built up from software and hardware modules [82] and consists of two main parts: the CPP platform and the CPP development environment. The first contains support for ATM switching, IP routing, SS7 (Signaling System no. 7), B-ISUP (Broadband ISDN User Part) and Q.2630 [56] for AAL2 connections. The second provides application interfaces, libraries and software execution control for the multiprocessor environment.


4.3.2

Hardware Structure Overview

In general, the CPP nodes are composed of a backplane acting as a bus to which different processor boards are attached and over which they communicate. The processors follow a hierarchical model (figure 4.5) and are placed on specialized boards with specific functions and purposes.

Figure 4.5: Control system structure of a CPP configuration with eight Board Processors (BP), some Special Processors (SP) and a Main Processor Cluster (MPC) with three Main Processors (MP). [82]

The Main Processors (MPs) form a cluster (referred to as the Main Processor Cluster) to which several Board Processors (BPs) are attached in a star topology. The BPs use an Internal Communication Path (ICP) to communicate with the cluster and can control several Special Purpose Processors (SPs). The MPs inside a cluster communicate with each other, but the SPs communicate only with the BP they are connected to. Although this is a hierarchical system, there is no master-slave relationship, since in case of an MP restart or crash, another MP will take over the critical processes immediately. The processors can find where the processes are executed by querying a name server provided by CPP that publishes the tasks and services running in the MPC [82].

The backplane of the M-MGW is able to switch up to 622 Mbps duplex non-blocking between the boards attached to it (figure 4.6). The boards have a specific function and purpose, and they contain different processor types depending on their function. The different types of boards that can be found in the M-MGW structure are [33]:
General Purpose Boards (GPB). They have a hard disk and can be configured as a Main Processor (MP) or to provide Interactive Messages (IM).
Special Purpose Boards (SPB). The SPB is equipped with several powerful microprocessors and can be used to execute any application. One of its typical tasks is packet handling.
Media Stream Boards (MSB). The MSBs are controlled by the applications and are composed of Digital Signal Processors (DSPs), memory, and control processors.
Exchange Terminals (ET-x). There are different types of Exchange Terminals, named according to the transmission technology used: ET-MC1, ET-M4, ET-C4, ET-FE1, ET-FE4 and ET-FET. Each of them adapts to a different type of physical medium (e.g. Ethernet, ATM, TDM, SONET, SDH, etc.).
Switch Core Board (SCB). The SCB has three functions: it works as an ATM switch core, it distributes the system clock, and it connects the subracks together using the Inter-Subrack Link (ISL).
Switch Extension Boards (SXB). They are used to expand the M-MGW capacity. In large nodes, the SXB interconnects subracks where one SCB is not enough.
Timing Unit Board (TUB). It provides the clock reference for the M-MGW. It is also synchronized with the public network and has long-term stability.
Pooled Forwarding Engine (PFE). It is used to handle IP packet traffic over ATM.

Figure 4.6: M-MGW hardware structure. [33]
(Diagram: the boards attached to the backplane: ET-MC1, ET-MC41, ET-M4, ET-C4, ET-FE1, ET-FE4, ET-FET, SCB, SXB, TUB, SPB, GPB, MSB and PFE.)

4.3.3

Application Overview

The M-MGW application is a software layer built on top of the CPP platform, using its services to interact and communicate. The application instructs the M-MGW hardware which tasks must be done, and implements an abstract environment to handle call data, protocols, instructions and services. Part of this abstraction is the Virtual Media Gateway (VMGW). With the VMGW, the M-MGW node can serve several MGCs (Media Gateway Controllers), making it possible to share the physical resources and services [41]. The application is structured in three main parts:
User Plane Control Functions (UPCF). They consist of the Signaling Transport Converter (STC), GCP Termination (GCPT) and Connection Coordinators.
User Plane Functions. They consist of the media stream and media framing resources software.
Operation and Maintenance application (O&M). It provides the software interface for the operation and maintenance of the node.

Figure 4.7: M-MGW hardware structure. [33][41]
(Diagram: the MGW application and CPP. Blocks shown: H.248; User Plane Control Functions with the Signaling Transport Converter, GCP Terminations, Connection Coordinators and Virtual MGWs; Operation and Maintenance; User Plane Functions with the Media Stream Function and the Media Framing Function; APIs between the application and CPP; and, in CPP, Bearer Termination, External Bearer Control, Real-time Routing, Switching Function, Signalling Gateway and Physical Interfaces.)

User Plane Control Functions

The User Plane Control Functions (UPCF) can be seen as the brain of the M-MGW node. Their functions range from terminating the H.248 protocol, by interpreting and replying to the commands, to administering the resources of the node, ordering connections, selecting transport systems and controlling stream functions. The UPCF is divided into three subsystems:

1. Signaling Transport Converter. This subsystem handles the transport of H.248 between the M-MGW and the MGC (any Media Gateway Controller, e.g. MSC server, MGCF, etc.), fulfilling ITU-T recommendation Q.2150 [61]. It provides independence from the transmission media, service availability reports, and transparency of the information transferred [91].

2. GCP Termination. GCP stands for Gateway Control Protocol, which is yet another name for H.248 (or MEGACO). The GCPT subsystem decodes and encodes the H.248 messages in binary mode, using ASN.1 (Abstract Syntax Notation One) [64] and applying the Basic Encoding Rules (BER) [63]. It also handles the H.248 messages by breaking them into M-MGW internal data and vice versa, checks the syntax of the messages and controls the timers implemented for H.248. Load balancing between several Connection Coordinators (CCs), following the H.248 rules and load balancing algorithms, is part of its functionality as well [91].

3. Connection Coordinator. The Connection Coordinator is responsible for mapping the H.248 view to the user plane view. In other words, the CC interprets the H.248 instructions and maps them into device reservations, connections, stream processing commands for the platform and event monitoring. The CC also updates the resource database to keep control of the hardware and software resources used in the different calls. Figure 4.8 illustrates an example call in which the CC has set up a chain of reserved functions for stream processing and transport adaptation.
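
To give an idea of what the BER encoding mentioned for the GCP Termination subsystem looks like on the wire, the sketch below encodes two primitive ASN.1 values as tag-length-value triplets. It illustrates the encoding rules in general; it does not implement the H.248 ASN.1 module or the M-MGW encoder, and the function names are hypothetical.

```python
def ber_length(n):
    """Encode a length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_integer(value):
    """Encode an INTEGER (tag 0x02) in minimal two's complement form."""
    size = 1
    while True:
        try:
            content = value.to_bytes(size, "big", signed=True)
            break
        except OverflowError:
            size += 1  # value does not fit yet, try one more byte
    return bytes([0x02]) + ber_length(len(content)) + content


def ber_octet_string(data):
    """Encode an OCTET STRING (tag 0x04) in primitive form."""
    return bytes([0x04]) + ber_length(len(data)) + data
```

For instance, the integer 128 needs a leading zero byte so that it is not read as a negative number, which gives the four bytes 02 02 00 80.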

Figure 4.8: Connection model in the Connection Coordinator. [41]
(Diagram: logical view with the Connection Coordinator and the Resource Component Database, and a connection chain of media stream resource components: ATM, AAL2 Media Framing, IuFH, Voice Coder, Echo Canceller, RTP, UDP Media Framing, IP.)
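
The idea of a connection chain reserved from a resource database can be sketched as follows. This is a hypothetical model written only to illustrate the concept: the component names are taken from figure 4.8, while the class, its methods and the all-or-nothing reservation policy are assumptions and do not describe the actual Connection Coordinator.

```python
from collections import Counter

# Component order as it appears in the example chain of figure 4.8.
EXAMPLE_CHAIN = ["AAL2 Media Framing", "IuFH", "Voice Coder",
                 "Echo Canceller", "RTP", "UDP Media Framing"]


class ResourceDatabase:
    """Tracks how many instances of each resource component are free."""

    def __init__(self, capacity):
        self.free = dict(capacity)
        self.reserved = {}  # call id -> list of reserved component names

    def reserve_chain(self, call_id, components):
        """Reserve every component of a chain, or nothing if one is missing."""
        needed = Counter(components)
        short = [c for c, n in needed.items() if self.free.get(c, 0) < n]
        if short:
            raise RuntimeError("no free resources for: " + ", ".join(short))
        for component, count in needed.items():
            self.free[component] -= count
        self.reserved[call_id] = list(components)
        return self.reserved[call_id]

    def release(self, call_id):
        """Return the components of a finished call to the free pool."""
        for component in self.reserved.pop(call_id):
            self.free[component] += 1
```

Checking all components before reserving any of them keeps the database consistent when a call cannot be served, which matters when several controllers share the same physical resources.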

User Plane Functions

The User Plane Functions system can be divided into two parts: the media stream functions and the media frame functions. The media stream functions perform the media stream processing needed in the user plane. This can be stream addition (tones, interactive messages, DTMF sender), stream modification (echo cancelling, speech coders), media analysis (DTMF detector, tone detector) or data modulation (asynchronous non-transparent circuit-switched data, etc.). The media frame functions adapt the media frames to the different transport systems, by framing the media and fitting it into the appropriate protocol stack for the transmission. Basically, they pack and unpack the media arriving as payload at the node, facilitate the media stream processing if needed, and convert from one transport system to another.

Operation and Maintenance application

The O&M application allows the administration of the M-MGW node and the configuration of many of its parameters. It provides support for upgrades and resource administration, keeps logs and reports the status of the node. It can be operated remotely or locally using a Graphical User Interface (GUI). A web browser using a Java Virtual Machine (JVM) can access the GUI, and the client can communicate with the node over HTTP (Hypertext Transfer Protocol), IIOP (Internet Inter-ORB Protocol) and FTP (File Transfer Protocol).

Chapter 5

Control Protocol for Media Servers


5.1 Multimedia Resource Function, Media Servers and Physical nodes Implementation

Media Server is the term used by the industry to refer to the Multimedia Resource Function as a physical node. A Media Server could include in one box several of the functionalities specified by 3GPP for the IMS logical nodes, and not only the MRF. Sometimes the control part is totally split from the processing part, and sometimes it is considered an integral part of it. The most commonly desired groupings of functions are (see figure 5.1):
MRFC-MRFP or MRF. Comprises the Media Resource Function in one physical node. The standardized communication interface used by this node would be the Mr interface, which according to 3GPP TS 23.228 [14] uses the SIP protocol specified in IETF RFC 3261 [86]. In this case, the Mp interface could be omitted, and a proprietary interface would be used to control the media resources available in the node.

MRFC-MGC. Associating the MRFC with the MGC could be considered a good alternative in terms of stack reuse, since both act as masters and use H.248 to control their slave nodes. Also, some of the state machinery and the control logic have strong synergies, and there are similar points in their functionalities (see 2.1). One of the strongest arguments against this association is the difference in cardinality. Some experts believe that the cardinality of MRFC-MRFP is many-to-one, whereas that of MGC-MGW is one-to-many. This is a valid scenario if we compare with a CS-MGW, to which every UE has to get a connection. Many MGWs are then needed for reasons of capacity, load balancing and even geographic location of the user plane processing, while a single MGC could be deployed to take care of the signaling. In the IMS case where the MRFP is expected to be used, there is no need for each UE to access an MRFP. A more centralized architecture for the media processing server is therefore expected to be deployed, at least in the initial IMS phases. Also, several ASs will expect to have access to control the MRFP resources to serve their clients.

AS-MRFC. This kind of association adds value to the previous point, since the AS is integrated and has direct access to the MRFP. Signaling reduction in the network, easier integration and clearer routing management are some of the advantages brought by this association. On the other hand, this kind of closed architecture could pose a risk for inter-vendor compatibility and interworking, e.g. for other ASs accessing the MRFP or interworking with the AS embedded in this entity.

MRFP-MGW. The MGW and the MRFP have strong synergies in the media processing area. Integration studies have already been published for this association [43], providing many arguments for merging the functionalities in one node. Both nodes are controlled by means of H.248, and their associations with their masters are very similar.

MRFC-CSCF. Merging the MRFC with the CSCF basically means adding the MRFC functionality to the CSCF. The main concerns with this type of association relate to the H.248 signaling that has to be generated to control the MRFP. The CSCF and the MRFC deploy a SIP stack in their input interfaces, and handling an H.248 stack for output might seem superfluous and therefore impractical.

The idea of grouping the functions has led to discussion in the industry forums (e.g. 3GPP and IETF) about alternative interfaces to control the Media Servers. The 3GPP specifications [14] locate the media resources per se in the MRFP, and define the Mp interface as the control interface from the MRFC. Despite this, the utility and feasibility of the Mp interface have been questioned by some manufacturers and developers in several forums (IETF, ATIS, 3GPP, etc.) and white papers (e.g. Convedia [27] and Brooktrout [20]). The main arguments for those claims are:
Increased network complexity due to the master-slave relationship between the MRFC and the MRFP. The H.248 master-slave model forces the MRF split and limits the possibility of direct control of the MRFPs by the Application Servers.

Lack of flexibility in the implementation of the media services. Low-level control and direct processing commands are specified in the H.248 packages, which restricts the realization of services and application-level resources.

Development redundancies for the commercial node releases. The limited use of H.248 in 3GPP networks, for the specific cases of MRFPs and MGWs, means extra effort for support and maintenance and a missed opportunity to converge on protocols commonly used in different standardized media networks, specifically SIP. The production of commercial nodes that merge several capabilities and support several network architectures is limited because of this aspect.

Figure 5.1: IMS Logical Nodes grouping used by the industry relative to the Media Server. [Each panel shows the AS, CS Network, CSCF, BGCF, HSS, MRFC, MGCF, SGW, MRFP, MGW, Access Network and UE nodes, connected by media traffic and signalling traffic links.]

Chapter 6

Media Server Control Interface Criteria


6.1 Methodology

The analysis of the protocols is based on the different scenarios involving the MRFC and the MRFP in IMS networks. As the aim of this thesis is to study the feasibility of implementing the Mp interface for the M-MGW, the impact that the protocols have on the node is considered as well. The scenarios are used to provide some of the basic requirements. At the same time, they provide the evaluation parameters for the comparison of the protocols. The comparisons are done on a theoretical level, since some of the protocols are still in draft stage or not fully standardized. The criteria used to evaluate the protocols are based on:

Network specifications and standards (3GPP, IETF, ITU)
Articles and technical reports
Test results when available

These items provide the information to analyze the different aspects of the protocols:

1. Functionality. Analyzes the different functions and services provided by the protocols.
2. Architecture specifics. Looks at the configuration details and the environment setup necessary for the protocols to work, and also at the associated protocols and transport.
3. Performance. Evaluates different aspects inherent to this category: bandwidth consumption, delays, processing power needed, load variation and error handling.
4. Extensibility. Reviews the possible extensions to be used with the protocols to improve their capabilities and add functionality.
5. Scalability. Analyzes the handling of overload situations, connection associations and deployment of several instances.
6. Security. Lists the security features of the protocol, restricted to the security needs of the Mp interface.
7. Interoperability. Analyzes the possibility of deployment in several areas (other than the Mp interface), and legacy and configuration compatibility.
8. Development. Presents the future trends and the state of the protocols at this moment.

6.2 Evaluation

The final result of the analysis is provided by the evaluation criteria. Even though it is difficult to quantify how far each protocol meets every requirement for the Mp interface and the other analyzed aspects, several evaluation aspects can be defined based on the inherent characteristics of the protocols. The aspects used for the study are:

1. Flexibility. It denotes the degree of adaptability to changing environments, different configurations or extent of deployment.
2. Capacity. It maps the measured capabilities of the protocols onto equivalent categories.
3. Reliability. It measures the reliability of the protocols for the different analyzed aspects.
4. Complexity. It evaluates the level of complexity of implementing each protocol and of the setup arrangements necessary to make it work.
5. Abstraction level. This measurement ranks how abstract the concepts used by the protocol commands are. It gives an idea of the degrees of freedom that the handling applications have when implementing the protocol commands.

6.3 Requirements

6.3.1 3GPP Requirements

According to the latest version, at the time of writing, of 3GPP TS 23.333 "Multimedia Resource Function Controller (MRFC) - Multimedia Resource Function Processor (MRFP) Mp interface: Procedures Descriptions" [9], the MRF function has been split into the MRFC and the MRFP, and their tasks are defined as follows. The tasks of the MRFC may consist of the following:

Control the media stream resources in the MRFP.
Interpret information coming from an AS and S-CSCF (e.g. session identifier) and control the MRFP accordingly.
Generation of CDRs (Call Data Records).

The tasks of the MRFP may consist of the following:

Control of the bearer on the Mb reference point.
Provide resources to be controlled by the MRFC.
Mixing of incoming media streams (e.g. for multiple parties).
Media stream source (e.g. for multimedia announcements).
Media stream processing (e.g. audio transcoding, media analysis).
Floor Control (i.e. manage access rights to shared resources in a conferencing environment).

In addition, the following functional requirements have been defined:

Play Tones. The MRFP should be able to send specific tones upon request of the MRFC. It shall be able to play the tone either continuously until a stop request is sent or for a requested length of time. Also, it shall be able to send notifications and detect DTMF, with the possibility of triggering tone actions.

Play Announcement. The announcements shall be played with the same requirements as the tones, with the addition that requests for predefined fixed announcements can also be received. Furthermore, several predefined variables (such as date, time, currency, etc.) could be provided by the MRFP in the announcement upon request.

Text to Speech. Producing automatically generated speech from text input is another functional requirement. Although the language type may be indicated, a translation function is not required.

Audio Record. It basically consists of recording the audio media stream in the user plane and storing it to a file. The recording could include the mixing of streams from a multi-party call.

DTMF Collection. The MRFP shall be able to detect and report DTMF digits to the MRFC.

Automatic Speech Recognition. This function is responsible for matching a user's voice input against target data and producing a recognition result that represents the detected input.

Play Multimedia. This function plays the synchronized audio and video media stream to the user. It includes requests for playing media to several parties connected to a call or session, and transcoding if the codec of the media is different from the session codec.

Multimedia Record. Storing the recorded synchronized audio and video media stream(s) into a multimedia file is part of the MRFP functions. The recording can include one or several parties from a single or conference call.

Audio Conference. The audio conference allows several subscribers participating in the conference to communicate with each other. The way of communication may be influenced by a floor control policy summarized in 3GPP TS 24.147 [8]. Transcoding, DTMF detection, playing tones or announcements, and recording of audio during the conference may be possible.

Multimedia Conference. The multimedia conference adds the video stream to the previous function, together with the respective video manipulation actions such as recording and transcoding.

Audio Transcoding. The MRFP shall support audio transcoding between streams of two Terminations within the same context where the streams are encoded differently. As a minimum requirement the MRFP shall support the default 3GPP audio codec AMR (narrowband), and optionally any other codecs.

Video Transcoding. The MRFP shall support video transcoding between streams of two Terminations within the same context where the streams are encoded differently. As a minimum requirement the MRFP shall support the default 3GPP video codec H.263, and optionally any other codecs.

6.3.2 IETF requirements

The IETF has been one of the forums used by those opposing the deployment of H.248 to control media servers. They have raised proposals there for media control protocols, and especially discussed the need for such protocols to use SIP in some way (e.g. as transport, by extending it, etc.). The internet draft draft-even-media-server-req-01 [36] tried to compile the general feeling of the discussion list about the requirements for such a protocol. The document also takes as a reference RFC 4353, "A Framework for Conferencing with the Session Initiation Protocol (SIP)" [83] (an internet draft at the time the document was written), and derives its requirements from the functionality specified there. At the time of writing this work the document has expired and has not been updated, but it gives a good view of the general opinion in the IETF discussions about the requirements for a media server control protocol. The document by R. Even lists the following points verbatim in its requirements chapter (from page 6):
General protocol

1. The Media server control messages shall be sent over a reliable connection.
2. The protocol shall enable one AS to work with multiple MS.
3. The protocol should enable many AS to work with the same MS.
4. The AS should be able to find the MS and connect to it.
5. The MS shall be able to inform the AS about its status.
6. The protocol should be extendable.
7. The MS shall be able to tell the AS its capacities.
8. The MS shall be able to tell the AS its functionality (Mixing, IVR, Announcements).
9. The AS shall be able to request the MS to create, delete, and manipulate a mixing, IVR or announcement session.
10. The MS shall supply the media addresses (RTP transport address) to be used to the AS.
11. The MS should send a summary report when the session is terminated by the AS.
12. The AS should be able to request call/session and conference state from the MS.
13. The MS should support DTMF detection (in band tones and RFC 2833 [89]).
14. The protocol shall include redundancy procedures.
15. The protocol shall include security mechanisms.
16. The AS should be able to reserve resources on the MS. The resources models should be simple. (This requirement needs more discussion)
17. The MS may support resource reservation and shall report the support in the initial connection to the AS.
18. The MS shall inform the AS about any changes in its capacities. The changes may be due to reservation, internal usage or due to some malfunction.
19. The AS shall be able to tell the MS which stream parameters to use on incoming and outgoing streams. Stream parameters may be for example codec parameters (video codec features) or bit rates. This requirement will help the MS to allocate the right resources.
20. The AS shall be able to define operations that the MS will perform on streams like mute and gain control.
21. The MS shall supply the AS with sufficient information for the event package.

Announcements. Announcements may include voice, audio, slides or video clips.

1. The AS shall be able to instruct the MS to play a specific announcement.
2. The MS shall be able to retrieve announcements from an external connection.
3. The AS shall be able to tell the MS if the message can be delayed if the MS cannot play it immediately.
4. The AS shall be able to instruct the MS to play announcements to a single user or to a conference mix.

Media mixing

1. The AS shall be able to define a conference mix.
2. The AS may be able to define a separate mix for each participant.
3. The AS shall be able to define the relationship between two mixes, for example a pair of audio and video for lip-sync or for voice activated video switch.
4. The AS may be able to define a custom video layout built of rectangular sub windows.
5. For video the AS shall be able to map a stream to a specific sub-window or to define to the MS how to decide which stream will go to each sub window. The number of sub-windows will start from one.
6. The MS shall be able to inform the AS who is the active speaker.
7. The AS may be able to cascade mixers (side bar with whisper mode).
8. The MS shall be able to inform the AS which layouts it supports.

IVR

1. The AS shall be able to load an IVR script to the MS and receive the result.
2. The AS shall be able to manage the IVR session by sending announcements and receiving the response (DTMF).
3. The AS should be able to instruct the MS to record a short participant stream and play it back to the conference. This is not a recording requirement.

Chapter 7

Media Server Control Interface Analysis


7.1 3GPP Proposed Protocols for Control of Media Servers

The Media Server as such can be mapped to the MRF function in the 3GPP design of the IMS. The interfaces specified for the MRF have already been mentioned in 2.3 and illustrated in 2.2. These interfaces are:

Mb Reference Point. It is considered the user media transport interface. It is also known as the Reference Point to IPv6 network services. It connects the MRFP with other MRFPs, MGWs and IPv6 networks, such as GPRS. In the last case, it can be the same reference point as the GPRS Gi reference point [13].

Mr Reference Point. It is a SIP-based interface. It has not been fully specified by 3GPP at this moment. It connects the MRFC to the CSCF, and it is meant to transport the information necessary to provide the media services requested from the Media Servers. This information is expected to be provided by the Application Servers.

Mp Reference Point. It shall be based on the H.248 protocol and the respective package extensions for media control. It transports the control commands from the MRFC to the MRFP. The specification of this interface [9] is still in progress at the time of writing this work.

Consequently, the control of the MRF depends on the Mr and Mp interfaces and their protocols: SIP (specified in RFC 3261 [86]) and MEGACO (ITU-T Recommendation H.248) [62].

7.2 IETF Proposed Protocols for Control of Media Servers

As already mentioned in 6.3.2, there have been different attempts in the IETF to standardize a control protocol for Media Servers. Some of them had only been submitted as internet drafts at the time of writing this work, and their completeness and maturity are therefore at different levels. We will focus later on the set of protocols that provide control features at least comparable with H.248. The protocols proposed for media control in the IETF are:

Media Sessions Mark-up Language (MSML) [87]. This ongoing work is meant to provide a complete protocol for media control. The main aim is to replace H.248 and reuse the SIP stack present in many IP network nodes. Further details are given in section 7.3.3.

Media Server Control Mark-up Language (MSCML) and Protocol, RFC 4722 [32]. Another alternative to H.248 for media server control. It provides a high degree of implementation freedom by embedding directives in SIP message bodies. Further details are given in section 7.3.4.

Basic Network Media Services with SIP (Netann), RFC 4240 [19]. Netann is considered to be an extension of SIP that provides announcements, scripted IVR and conference mixing setup. We will not analyze the possibility of Netann deployment for media control any further, since the media control possible with Netann is quite limited, especially after the establishment of the session, and it can be used to complement the services provided by MSCML and MSML.

The Call Control XML (CCXML). Included in RFC 4267 [40]. CCXML provides an XML syntax for telephony call support in dialog systems such as VoiceXML. This protocol can be considered complementary to the VoiceXML-compatible protocols (e.g. SIP, MSCML, MSML, etc.).

SIP Interface to VoiceXML Media Services [21]. This draft intends to provide a SIP interface to the VoiceXML media services provided by Media Servers and commonly employed by Application Servers. This protocol could be considered as complementary to SIP for providing VoiceXML interaction. Therefore we will not analyze it further in this work.

A Control Framework for the Session Initiation Protocol (SIP) [17]. This internet draft intends to provide a framework and protocol based on SIP for application deployment where the application logic and processing are distributed (e.g. Media Servers and Media Server Controllers). It is meant to be extended by defined Control Packages that provide specific control features. The definition of this framework was not mature at the time of writing this work, and therefore it was left out of the scope of the analysis. However, the Future Work section 9.1 provides a short update on the current work on this protocol.

VoiceXML Interactive Voice Response (IVR) Control Package for the Session Initiation Protocol (SIP) [18]. This internet draft defines a package for IVR functions using VoiceXML dialogs for the Control Framework for SIP [17].

A Basic Interactive Voice Response (IVR) Control Package for the Session Initiation Protocol (SIP) [16]. This internet draft defines a control package for basic IVR functions for the Control Framework for SIP [17]. The scope of the package is the control of media server functions for basic interactive media, as well as the notifications related to these functions, defined as XML messages.
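To make the Netann entry in the list above more concrete, the following minimal sketch builds the three kinds of SIP Request-URIs that RFC 4240 uses to select a service: an announcement (`annc` with a `play` parameter), a prompt-and-collect dialog (`dialog` with a `voicexml` parameter) and a conference (`conf=` followed by an identifier). The helper function names, host names and URLs are hypothetical, and the escaping of special characters inside the parameter values is deliberately not handled here.

```python
def annc_uri(host: str, prompt_url: str) -> str:
    """Announcement service: the media to play is named in the 'play' parameter."""
    return f"sip:annc@{host};play={prompt_url}"


def dialog_uri(host: str, script_url: str) -> str:
    """Scripted IVR service: the VoiceXML script is named in the 'voicexml' parameter."""
    return f"sip:dialog@{host};voicexml={script_url}"


def conf_uri(host: str, conference_id: str) -> str:
    """Conference service: all callers using the same identifier share one mix."""
    return f"sip:conf={conference_id}@{host}"


# Hypothetical media server and resources, for illustration only.
MS = "ms.example.net"
print(annc_uri(MS, "http://audio.example.net/busy.g711"))
print(dialog_uri(MS, "http://vxml.example.net/menu.vxml"))
print(conf_uri(MS, "room42"))
```

Since the service is chosen entirely by the Request-URI of the initial INVITE, this sketch is consistent with the remark above that Netann offers little control once the session has been established.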

7.3 Media Server Control Protocols

7.3.1 ITU Recommendation H.248 (Megaco Protocol)

H.248 [62] is a protocol developed as a joint effort between the International Telecommunication Union (ITU) and the Internet Engineering Task Force (IETF) [42]. The name Megaco is derived from the IETF working group where it was produced. It was envisioned as the substitute for all the control protocols deployed by different vendors and networks to control Media Gateways, such as the Media Gateway Control Protocol (MGCP).

1. Architecture specifics. H.248 follows a transactional master-slave model, where the Media Gateway is the slave and the Media Gateway Controller is the master. The protocol works in the application layer of the OSI model, which means that it needs a suitable transport to establish the connections and to provide redundancy, security and routing. Other transport protocols are used to provide those services. Transports have been standardized for ATM [58] (mainly deploying AAL5 and MTP3), SCTP [59], and IP (annex D of [62]) by means of UDP and TCP.

2. Functionality. H.248 was meant to be used for MGW control, although it has been extended to be deployed also for advanced conferencing. A model for creating advanced conferences is presented in [73]. It provides a transaction-based way of communication where every request should receive a reply. H.248 defines a connection model with two abstract entities: Terminations and Contexts. A Termination can be the source and/or the sink of a media stream. It also encapsulates the media stream and bearer parameters. A Context defines the association between a collection of Terminations; there is a special type of Context named the null context, which represents the repository of all the Terminations that do not have any association. A special type of Termination called root is also defined, in order to address the whole gateway instead of one particular Termination. An example of the connection model is shown in figure 7.1.
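The connection model can be illustrated with a minimal sketch. It is only an assumption-laden toy, not an H.248 implementation: the class `MediaGateway`, the termination names and the generated context identifiers are hypothetical, and every Termination is assumed to exist beforehand in the null context. The `add`, `subtract` and `move` methods mirror the H.248 commands of the same names listed in Table 7.1.

```python
import itertools
from typing import Dict, Iterable, Optional, Set

NULL_CONTEXT = "null"  # placeholder name for the null context in this sketch


class MediaGateway:
    def __init__(self, termination_ids: Iterable[str]) -> None:
        # Every termination starts without association, i.e. in the null context.
        self.context_of: Dict[str, str] = {t: NULL_CONTEXT for t in termination_ids}
        self._counter = itertools.count(1)

    def contexts(self) -> Set[str]:
        # A context exists only while at least one termination belongs to it.
        return {c for c in self.context_of.values() if c != NULL_CONTEXT}

    def terminations(self, context_id: str) -> Set[str]:
        return {t for t, c in self.context_of.items() if c == context_id}

    def add(self, termination: str, context_id: Optional[str] = None) -> str:
        """Add a termination to a context; without a context id, create a new one."""
        if self.context_of[termination] != NULL_CONTEXT:
            raise ValueError(f"{termination} is already in a context")
        if context_id is None:
            context_id = f"ctx{next(self._counter)}"
        elif context_id not in self.contexts():
            raise KeyError(f"unknown context {context_id}")
        self.context_of[termination] = context_id
        return context_id

    def subtract(self, termination: str) -> None:
        """Return the termination to the null context; an emptied context vanishes."""
        self.context_of[termination] = NULL_CONTEXT

    def move(self, termination: str, context_id: str) -> None:
        """Move a termination from its current context to another existing one."""
        if context_id not in self.contexts():
            raise KeyError(f"unknown context {context_id}")
        self.context_of[termination] = context_id


mgw = MediaGateway(["scn/1", "rtp/1", "scn/2"])
ctx = mgw.add("scn/1")   # first Add creates the context
mgw.add("rtp/1", ctx)    # second Add joins the same context
```

Because contexts are derived from the membership map, subtracting the last Termination removes the Context implicitly, which matches the behaviour described for the Subtract command.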
Media Gateway Context Termination SCN Bearer Channel Termination RTP Stream Termination SCN Bearer Channel

Context Termination RTP Stream

Null Context Termination SCN Bearer Channel

Context Termination RTP Stream Termination SCN Bearer Channel


H.248.1V2 F01

Figure 7.1: Example of a H.248 connection model. The asterisk box in each of the Contexts represents the logical association of Terminations implied by the Context[62]. The message structure of H.248 is shown in gure 7.2. A H.248 message is essentially a transport mechanism for transactions. It contains a header and a set of transactions which are processed independently of each other without any specic order. The H.248 messages are not acknowledged. The H.248 commands are the basic orders from the MGC to the MGW. Their task is to manipulate the logical entities of the protocol connection model. Table 7.1 briey describes the dierent commands available. The commands contains several parameters called Descriptors. The Descrip-

CHAPTER 7. MEDIA SERVER CONTROL INTERFACE ANALYSIS

45

Commands Add

Description The Add commands adds a Termination to a Context. The Add command on the rst Termination in a Context is used to create a Context The Modify command modies the properties, events and signals of a Termination. The Subtract command disconnects a Termination from its Context and returns statistics on the Terminations participation in the Context. The Subtract command on the last Termination in a Context deletes the Context. The Move command atomically moves a Termination to another Context. The AuditValue command returns the current state of properties, events, signals and statistics of Terminations. The Notify command allows the Media Gateway to inform the Media Gateway Controller of the occurrence of events in the Media Gateway. The ServiceChange command allows the Media Gateway to notify the Media Gateway Controller that a Termination or group of Terminations is about to be taken out of service or has just been returned to service. ServiceChange is also used by the MGW to announce its availability to a MGC (registration), and to notify the MGC of impending or completed restart of the MG. The MGC may announce a handover to the MGW by sending it a ServiceChange command. The MGC may also use ServiceChange to instruct the MGW to take a Termination or group of Terminations in or out of service. Table 7.1: H.248 Commands [62]

Direction MGC -> MGW

Modify Subtract

MGC -> MGW MGC -> MGW

Move Audit Value Notify

MGC -> MGW MGC -> MGW MGW -> MGC

ServiceChange

MGW -> MGC, MGC -> MGW

CHAPTER 7. MEDIA SERVER CONTROL INTERFACE ANALYSIS

46

H.248 message
Transaction1 Action1
Command1 Command2

Action2
TopologyDescriptor

Action3

Command1 Command3

Command2 Command4

Transaction2 Action1
Command1
MediaDescriptor

Figure 7.2: H.248 Message structure tors consist of a name and a list of items which may have values. Table 7.2 describes all the possible descriptors and their purposes. Some of them can not be present together in a command. The Actions are in general a group of commands which are related to the same Context. Exceptions to this are commands which refer to terminations that are not in any context or when a new context is created. Finally the Transactions are grouping of Actions and therefore Commands which can be grouped into two types: Transaction Requests and Transaction Replies. The Commands within a transaction are executed sequentially in contrast to the already mentioned non-sequence nature of the Transactions in a H.248 message. The Transaction requests are basically acknowledged by the Transaction replies. The applications handling the protocol implements timers for either resending a transaction request or issuing Transaction Pending, which is a type of message used by the receiver of a Transaction Request to inform the sender of delays in the processing of a specic Transaction Re-

CHAPTER 7. MEDIA SERVER CONTROL INTERFACE ANALYSIS

47

Descriptor Name Modem

Description Identies modem type and properties when applicable.(ModemDescriptor has been deprecated in ITU-T Rec. H.248.1 version 1 (03/2002)). Describes multiplex type for multimedia Terminations (e.g. H.223, H.225.0) and Terminations forming the input mux. A list of media stream specications Properties of a Termination (which can be dened in Packages) that are not stream specic. A list of remote/local/localControl descriptors for a single stream. Contains properties that specify the media ows that the MG receives from the remote entity. Contains properties that specify the media ows that the MG sends to the remote entity. Contains properties (which can be dened in packages) that are of interest between the MG and the MGC. Describes events to be detected by the MG and what to do when an event is detected. Describes events to be detected by the MG when Event Buering is active. Describes signals applied to Terminations. In Audit commands, identies which information is desired. In AuditValue, returns a list of Packages realized by Termination. Denes patterns against which sequences of a specied set of events are to be matched so they can be reported as a group rather than singly. In ServiceChange, what, why service change occurred, etc. In Notify or AuditValue, report of events observed. In Subtract and Audit, report of Statistics kept on a Termination. Species ow directions between Terminations in a Context. Contains an error code and optionally error text; it may occur in command replies and in Notify requests. Table 7.2: H.248 Descriptors [62] H.221,

Mux Media TerminationState Stream Local Remote LocalControl Events EventBuer Signals Audit Packages DigitMap ServiceChange ObservedEvents Statistics Topology Error

CHAPTER 7. MEDIA SERVER CONTROL INTERFACE ANALYSIS

48

quest. A Transaction Reply should include the results for all the Commands found in the corresponding Transaction Request. This means that it should include the return values for the Commands executed successfully, and the error descriptor for any Command that has failed.

The protocol can be extended by means of Package Definitions. They define optional Properties, Events, Signals and Statistics that may occur on Terminations. Packages defined by IETF appear in separate RFCs, the ones defined by ITU-T may appear in relevant Recommendations, and those defined by other organizations should be registered with IANA (Internet Assigned Numbers Authority). Which Packages are supported is specified by the use of Profiles. A profile is identified by a name (IANA registered) and a Version. The profile is a document containing a description of the options available for a particular application. The only mandatory elements are the Name, Version and a summary of the profile. The profiles are negotiated by the ServiceChange command.

3. Performance.

(a) Bandwidth consumption. The H.248 specification supports binary and text encoding. The binary encoding uses ASN.1 (Abstract Syntax Notation One) [63] with BER (Basic Encoding Rules). The text encoding follows an Augmented Backus-Naur Form (ABNF) [29] grammar, in a verbose and a compact form. The message sizes vary depending on the number of transactions, the number of commands contained per transaction, the type of command and the descriptors included. To make an objective measurement, we can base the measurements on a representative call flow, such as the one provided by RFC 3525 [42]. The Erlang project Megaco Benchmarking [81] is based on those messages with some additional data. The results for the message size are summarized in Figure 7.3. There, we can see the difference in size between binary and text encoding, and the dependency on verbose or compact mode.

The size of the compact text-encoded messages is similar to that of the binary messages. The H.248 messages grow proportionally to the number of Transactions and Commands in a single message, as well as with the number of descriptors used in the Commands. As expected, the verbose text encoding is the largest, while the compact mode can in many cases be even smaller than its binary counterpart.
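To make the difference between the verbose and the compact text form concrete, the following minimal sketch rewrites a verbose message into its compact form. It is only an illustration: the sample message is hypothetical (modeled on the style of the RFC 3525 examples), the token table is a small subset of the long/short token pairs of the H.248 text grammar and should be checked against the specification, and the function name `compact` is an assumption of this sketch, not part of any H.248 library.

```python
import re

# Subset of long/short token pairs from the H.248 text grammar.
# Only the tokens used in the sample message are listed here.
SHORT_TOKENS = {
    "MEGACO": "!",
    "Transaction": "T",
    "Context": "C",
    "Add": "A",
    "Media": "M",
    "LocalControl": "O",
    "Mode": "MO",
    "ReceiveOnly": "RC",
}

# Hypothetical verbose message: one transaction adding one termination.
VERBOSE = """MEGACO/1 [124.124.124.222]:55555
Transaction = 10003 {
    Context = $ {
        Add = A4444 {
            Media {
                LocalControl { Mode = ReceiveOnly }
            }
        }
    }
}"""


def compact(message):
    """Replace long tokens by short ones and drop optional whitespace."""
    # Match whole words only, so identifiers such as A4444 stay untouched.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, SHORT_TOKENS)) + r")\b")
    shortened = pattern.sub(lambda m: SHORT_TOKENS[m.group(1)], message)
    # Remove whitespace around structural characters.
    shortened = re.sub(r"\s*([{}=,])\s*", r"\1", shortened)
    # Collapse any remaining whitespace runs into a single space.
    return re.sub(r"\s+", " ", shortened).strip()


if __name__ == "__main__":
    small = compact(VERBOSE)
    print(small)
    print(len(VERBOSE), "bytes verbose,", len(small), "bytes compact")
```

A real encoder works on the parsed message rather than on the raw text, so this sketch only shows where the size reduction comes from: shorter keywords and no layout whitespace.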

[Figure 7.3 plot. Title: H.248 Messages Size. Vertical axis: Bytes (0 to 600). Horizontal axis: H.248 message identifier. Series: Full Text, Compact Text, ASN.1 BER.]

Figure 7.3: H.248 message size for a representative call flow, based on a benchmark test performed by the Erlang project in [81].

(b) Delays. As mentioned, H.248 implements several timers to handle the processing and transmission delays. The transmission delays are influenced by the type of transport and the protocols chosen to carry the H.248 messages. These delays are out of the scope of this work. The processing delay is the sum of all the delays from the moment a node receives a message with a transaction request until it sends back a message replying to the requested transactions. Since the resource handling and switching in the nodes are also out of the scope of this work, we concentrate on the encoding and decoding delays. The encoding time of the messages depends strongly on the algorithm, libraries and hardware used to run the encoders and decoders. This makes it difficult to establish a comparison point between several protocols. Nevertheless, measurements done by a commercial H.248 encoder manufacturer [76] allow some estimates, based on the message flow sequence already presented for the message size. Figure 7.4 illustrates the encoding and decoding delays for different types of binary H.248 messages, from tests performed using the OSS Nokalva libraries. The average encoding time is 11.8 microseconds, while the average decoding time is 13.4 microseconds [76].
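As a rough reading of these two averages, the sketch below adds them up for a node that decodes one request and encodes one reply of average size, and derives the corresponding upper bound on messages per second. This is only back-of-the-envelope arithmetic under the assumption that nothing but the codec consumes time on a single processor; resource handling, transport and queuing are ignored, and the function names are illustrative.

```python
ENCODE_US = 11.8  # average encoding time reported in [76], microseconds
DECODE_US = 13.4  # average decoding time reported in [76], microseconds


def codec_time_per_exchange_us(encode_us, decode_us):
    """Codec time for decoding one request and encoding one reply."""
    return encode_us + decode_us


def codec_bound_per_second(per_exchange_us):
    """Upper bound on exchanges per second if only the codec took time."""
    return 1_000_000 / per_exchange_us


if __name__ == "__main__":
    per_exchange = codec_time_per_exchange_us(ENCODE_US, DECODE_US)
    print(round(per_exchange, 1), "microseconds per exchange")
    print(round(codec_bound_per_second(per_exchange)), "exchanges per second at most")
```

The bound is several orders of magnitude above what the resource handling of a gateway allows, which is consistent with treating the encoding and decoding delays as a small part of the total processing delay.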

[Figure 7.4 plot. Title: H.248 Processing Delay for Binary Encoding. Vertical axis: microseconds (0.0 to 80.0). Horizontal axis: H.248 message identifier. Series: Encode, Decode, Encoding-Decoding.]

Figure 7.4: Encoding and decoding measurements for H.248 binary encoded messages, based on a benchmark test performed by OSS Nokalva in [76]. (The message sizes are the same as for the corresponding messages shown in figure 7.3 for binary encoding.)

(c) Load Variation. The time a gateway takes to process an H.248 message depends on what the message requires from the physical resources of the gateway. Normally, the processing of the messages is counted in hundreds of milliseconds, and it varies according to the number of connections and disconnections needed, the type of stream functions requested and the initialization of their hardware, the number of terminations involved in the messages and the number of functions already reserved. Those figures have a direct relation to the number of PendingTransactions messages sent (since the replies can be delayed) and to the way the replies may be grouped together, creating bigger messages. Finally, it is up to the application to decide whether the replies are grouped into one message or sent in separate ones, which operations are allowed and which are rejected as not supported, and whether to continue with the call flow or report a failure.

The entities handling the protocol must be ready to handle different types of load: a message may be processed immediately after it is received, or the node may have to drain queues that overflow quickly and threaten to collapse the system. Some of the already mentioned avalanche prevention mechanisms are part of the protocol itself, but they mainly target the restart of the applications and not in-service overload situations.

One item worth mentioning is that the current traffic models used for H.248 may no longer be valid when the protocol is used for Media Server Control. The reason is the different characteristics of the traffic that uses the services of the node. For the MGW traffic, the bursts of H.248 messages happen at the beginning and at the end of the call session. The call setup provides the UE with all the functions needed from the beginning, ready to be activated or deactivated. This creates a first burst of messages. Later, in the call release phase, another burst of messages is created to release the resources and initialize the application entities. For media control this might still be true, but one can expect much more interaction with the UE during the established phase than before. The reason is that a UE can request services during this phase, for example: playback of different videos, recording of audio, switching between video streams, switching between audio streams, etc.

(d) Error Handling. H.248 messages are not acknowledged. A node learns that a message has arrived when it receives messages referring to the transactions contained in the sent message (replies or pending). The application handling the protocol has to keep track of the time elapsed since an outgoing transaction was sent and react accordingly if the transaction is not acknowledged in any way.
Basically, transactions that are not replied to within a certain period of time are resent in another H.248 message. For this reason it is important for the protocol to keep the two nodes synchronized, since they are highly coupled and the interworking between the master and the slave requires precise and detailed knowledge of the capacity and capabilities of the controlled node. When the synchronization between the nodes is lost, for example because of wrong handling of unexpected events, a bug in the applications or overloads, the protocol provides Audits for a re-synchronization at the termination level or for the whole node. In case of major failures, the nodes can be restarted at several levels (warm restart, cold restart, etc.) or the failing services can be disabled. In other words, the protocol itself provides the tools to handle this type of error, but not an automatic way to detect them and act accordingly, since those details are left to the handling application.

The protocol also provides an error descriptor which indicates with a code (and optionally a text) the error raised during the processing of a command. More errors can be defined for new packages, and they are part of the package definition.

The H.248 specification also provides considerations to avoid restart avalanches when a master node or several slaves restart. The re-registration of the nodes could cause an avalanche of initialization ServiceChanges, which could cause message losses and network congestion during a critical time such as the restoration of the network. A random value is used to start a timer before sending the power-on indication when following an already established restart procedure.

4. Extensibility. The protocol was meant to be extended by means of package definitions. 3GPP has registered several packages to assure compatibility between vendors in the 3GPP networks. The packages have to be registered with IANA, and they define the extra properties, signals and error codes handled by the new package. Some issues related to encoding could limit the extensibility, such as the case of IPBCP tunneling, which cannot be encoded in text mode because of character incompatibility with the H.248 text encoding.

5. Scalability. The ITU-T H.248 recommendation mentions the possibility of deploying Virtual Media Gateways, each logically independent but possibly sharing the same node resources. Each Virtual Media Gateway can receive its own transactions and process them independently of the transactions sent to the other Virtual Media Gateways.
The handling of resource overloads is left to the applications and signaled by the appropriate error codes. Network congestion is not handled by the protocol itself, since it is expected to be handled by the transport layers and by application congestion detection. The connection associations are assumed to exist only between the slave and the master node, which implies that the routing between the two nodes is also handled by the transport layers, and that there is no load balancing, resource discovery or service reallocation other than what is provided by audits or ServiceChange. The media connections should not be related to the signaling connections, since the signaling establishes the media connections between entities other than the master and the slave. The avalanche restart effects are already mentioned in the Error Handling bullet of this section.

6. Security. The protocol assumes an enclosed connection between the master and the slave entity. Security is assumed to be provided by the lower layers and by encryption. The recommendation only addresses security for IP networks and specifies the use of IPsec to protect the connection. The recommendation also specifies the use of encryption and source IP header analysis of the media packets to avoid uncontrolled barge-in attacks and eavesdropping. Those mechanisms could delay the call set-up, since more data, such as encryption keys or the media destination address, needs to be available before the security mechanism can be used. By itself, H.248 does not implement any type of protection against eavesdropping, Denial of Service attacks or spoofing.

7. Interoperability. H.248 has been openly criticized because its interoperability scope is limited to 3GPP networks, and even there it has shown some incompatibility between vendors [70]. H.248 is not used in standardized networks other than 3GPP, since its scope has mainly been Media Gateway control. 3GPP expects to use the experience obtained from the deployment of the protocol and to apply the same principles used for the Media Gateways in the early stages of the 3GPP IMS networks. In addition, H.248 provides mechanisms during the initialization phase to agree on the protocol version in use, the encoding type and the profile supported (which basically defines the packages supported as well).

8. Development. H.248 has been extended to provide more services needed by the media resource control. A third version of the protocol was released by ITU in September 2005 with several improvements to the core protocol.
H.248.19, "Decomposed multipoint control unit, audio, video and data conferencing packages" [65], was also released. 3GPP started the specification of the Mp interface based on H.248 [9], but that specification was withdrawn in March 2007.
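The restart avalanche consideration mentioned in the Error Handling item can be sketched as follows. The sketch assumes that each restarting gateway draws a delay uniformly between zero and a configured maximum before it sends its power-on ServiceChange. The function names, the gateway identifiers and the 30 second maximum are illustrative assumptions and are not taken from the recommendation.

```python
import random


def restart_delay_seconds(max_wait_delay, rng):
    """Draw a delay uniformly from 0 to max_wait_delay seconds."""
    return rng.uniform(0.0, max_wait_delay)


def schedule_registrations(gateway_ids, max_wait_delay, seed=None):
    """Return (delay, gateway) pairs ordered by the time they will register.

    Every gateway draws its own random delay, so the ServiceChange
    messages are spread over the interval instead of arriving together.
    """
    rng = random.Random(seed)
    schedule = [(restart_delay_seconds(max_wait_delay, rng), gw) for gw in gateway_ids]
    return sorted(schedule)


if __name__ == "__main__":
    # Hypothetical gateways restarting after a controller failure.
    for delay, gateway in schedule_registrations(["mgw-1", "mgw-2", "mgw-3"], 30.0):
        print(f"{gateway} sends its ServiceChange after {delay:.1f} s")
```

In a real deployment each gateway would run its own timer locally. The central schedule here only serves to show how the random values spread the registrations in time.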

7.3.2 Session Initiation Protocol (SIP)

The Session Initiation Protocol was developed by the IETF as the result of merging two IETF protocol proposals targeting session invitation. It is defined in RFC 3261 [86] and has become a widely used standard in multimedia products that handle conferences, multimedia resource sharing, presence, etc. 3GPP has also adopted the protocol for several of the IMS interfaces (see section 2.3). An introduction to SIP is required, since some SIP extensions have been proposed for Media Server control, and the other analyzed protocols use SIP for transport and session initiation as well.

1. Architecture specifics. SIP operates in the application layer of the OSI model. SIP can be transported over UDP, TCP and SCTP, and it is modeled on the Hypertext Transfer Protocol (HTTP), using the same client-server transaction model. The client produces requests that are received, processed and replied to by a server. In contrast with H.248, which uses a master-slave model, the client-server model allows a freer connection model, where the client can contact different servers depending on the request to be executed, instead of addressing only one entity as in the master-slave model. SIP provides interoperability by incorporating methods for negotiating the extensions that will be used in a session. SIP defines User Agents (UA) as the entities that interact with the users. A User Agent can act as a User Agent Client (UAC), which sends requests, or as a User Agent Server (UAS), which receives the requests and replies to them. Other SIP entities are the proxy servers. They act as intermediaries, behaving as both server and client to execute a request on behalf of other clients. The types of SIP proxies are:
- Call Stateful Proxies. This type of proxy keeps information about the session from its beginning until its end. They are always in the path used for the SIP message exchange between two users. They need to know all the SIP transactions happening during the session. Normally this type of proxy is found close to the edge of the core network.
- Stateful Proxies. Also called transaction stateful proxies, since the transaction is their only concern. Basically, they store state related to a given transaction until it ends.
- Stateless Proxies. They do not keep any state and only forward the requests and replies to the next hop. They are found mostly close to the core of the network.

2. Functionality. SIP was made to handle multimedia sessions and normally the parties participating in a SIP dialog keep a kind of peer-to-peer relationship.


In its basic operation SIP provides session establishment, session modification and session termination. The sessions can be totally new or already existing sessions. The protocol allows the users to join or leave these sessions as well as to negotiate the operative parameters of the session. A session is established based on the negotiation result and on the willingness of the invited party to participate in it. To negotiate the multimedia parameters used in the session, a session description may need to be distributed as well. The Session Description Protocol (SDP) usually takes this function, although a different protocol can use SIP to reach the interested parties. In that sense, SIP is used to distribute session descriptions among potential participants [23]. During the session establishment SIP keeps the parties informed of the progress of the setup, and it allows them to request modifications to an already established session. Finally, the protocol provides a graceful release sequence where the parties are explicitly made aware of the session termination. SIP provides the connectivity between the parties and in principle does not intervene once the session has been established.

Request method | Description
INVITE | Invites the parties to participate in a session. It also contains the description of the session.
ACK | Acknowledges the reception of the final response to an INVITE. It facilitates the three-way handshake, implemented only for the INVITE method, in part to avoid unsynchronized parties on session establishment, and it enables the implementation of forking proxies. It might contain session descriptors as well.
OPTIONS | Queries a server party about its capabilities, such as methods supported, session description protocol, message encoding, etc.
CANCEL | Requests the cancellation of a pending transaction.
REGISTER | Informs a server (named Registrar when it supports this method) about the location of the client.
BYE | Indicates that a user is abandoning a session. In a two-party session it also terminates the session, in contrast with a multi-party session, where the session continues after the user has left.

Table 7.3: SIP Request Methods from the core specification [23]

A request together with its replies is known as a SIP transaction. In fact, one request can generate several replies and still be considered one transaction.


The reason is the existence of provisional responses and final responses. The provisional responses use status codes in the range from 100 to 199, while the status codes from 200 to 699 are final. Table 7.3 shows the SIP request methods and their functions from the core specification, and table 7.4 shows other methods defined in the extensions.

Request method | Description
INFO | Transports mid-session information that does not affect the state of the session (for example, billing information).
SUBSCRIBE | Declares the interest of the user in a particular event and in the status of a service session. The use of SUBSCRIBE triggers the use of NOTIFY.
UNSUBSCRIBE | Terminates the monitored session started with SUBSCRIBE (in the SIP event notification framework this is carried as a SUBSCRIBE request with an Expires value of 0 rather than as a separate method).
NOTIFY | Provides information from a subscribed session.
UPDATE | Allows a user to update the parameters of a session without impacting the state. The request can be sent even when an answer to an INVITE has not yet been provided.
MESSAGE | Provides Instant Messaging (IM) between the parties of an initiated SIP dialog. The body of the MESSAGE method has the form of MIME body parts.
REFER | Provides session transfer functionality by allowing one SIP entity to instruct another to perform an action.
PRACK | Plays the same role as ACK but for provisional responses; in contrast to ACK, it has its own response.
PUBLISH | Indicates the publication of any event state for which there exists an appropriate event package.

Table 7.4: SIP Request Methods from extensions to the core specification [23]

The general format of the SIP messages consists of:

(a) A start line. In the case of a request, the start line is a request line, which contains three elements:
- Request method, which can be any of the ones already mentioned in tables 7.3 and 7.4.
- Request-URI (Uniform Resource Identifier), indicating the next hop to which the request has to be routed.
- Protocol version, indicating the version of the protocol in use (for the current version: SIP/2.0).


In the case of replies, the start line is known as the status line. The status line also contains three elements:
- Protocol version, same as in the requests.
- Status code, an integer from 100 to 699 describing the status of the transaction.
- Reason phrase, which describes the status code as a human-readable string.

(b) One or more header fields. They provide information about the request or response, and about the body if there is one. The headers defined in the core SIP specification are listed in table 7.5.

Accept, Accept-encoding, Accept-language, Alert-info, Allow, Also, Authorization, Call-ID, Call-info, Contact, Content-disposition,
Content-encoding, Content-language, Content-length, Content-type, Cseq, Date, Encryption, Error-info, Expires, From, In-reply-to,
Max-forwards, MIME-version, Organization, Priority, Proxy-authenticate, Proxy-authorization, Proxy-require, Record-route, Require, Response-key, Retry-after,
Route, Server, Subject, Supported, Timestamp, To, Unsupported, User-agent, Via, Warning, WWW-authenticate

Table 7.5: SIP headers defined in the Core Protocol

(c) An empty line. It separates the headers from the message body.

(d) A message body (optional). Usually the SIP bodies are session descriptions, but a body can be any other type of object as well. For SIP, the body content is transparent and it is not examined. A SIP message can contain several bodies as well.

Any SIP application will assume that another SIP application can at least communicate using the SIP core protocol. However, SIP is a flexible protocol that can be extended following the mechanism described in the protocol specification. Such extensions are then negotiated during the session establishment by means of the core protocol.

3. Performance


(a) Bandwidth consumption. When SIP was designed, the bandwidth consumed by the SIP signaling was considered negligible, since the media was supposed to use the same link as the SIP signaling. In the media server case, however, bandwidth is a resource that has to be taken care of. This may be a dimensioning issue that affects the number of simultaneous users a node can serve. Message compression is a possible option, but it brings a trade-off with processing capacity, since the compressed messages need to be decompressed before they can be processed. A study with results on SIP compression can be found in [66]. Since SIP uses fully verbose text messages, the messages can range from a hundred bytes to several kilobytes.

(b) Delays. SIP performance has been evaluated in several studies. Eyers and Schulzrinne evaluated it from the point of view of Internet telephony while comparing SIP with H.323 [37]. In the scope of this work, some results on delays and required processing power are available from Cortes et al. [28]. Figure 7.5 is based on their measurements and shows the processing delays for different proxy implementations. The proxies differ in the libraries used for string processing and memory allocation, in the threading model and in the data storage. They also provide measurements of the number of calls that each proxy can handle according to its model. Those results are not shown in this work because they are strongly dependent on the hardware and on the traffic model.

(c) Load Variation. The load characteristics expected for SIP in a media server are very similar to the ones explained for the H.248 protocol. Basically, the peaks in the signaling are present during the setup of the call, since that is when the negotiation of the session parameters takes place. The SIP dialogs are very simple to terminate and they do not produce as much signaling as H.248.

Since SIP has been designed to work with unreliable protocols such as UDP, some retransmissions can be expected if the responses are not delivered on time or if they are lost. Also, the SIP extension PRACK can be used for acknowledging non-final responses, which can increase the signaling load per user as well. By implementing a broker function node that works as an intermediary between several media servers and several application servers, it is possible to create a kind of centralized control that balances the server loads and redirects the requests to the appropriate servers according to their known capabilities [72]. The broker would then receive all the requests directed to any media server from any application server, and since it is a non-standardized node, proprietary implementations and optimizations may be deployed for load handling.

(d) Error Handling. SIP provides different responses that point to several errors, depending on their nature. The application must process them and act according to the error reported. The errors due to data transmission are handled by the transport layer protocol in the case of reliable transport, or by timer-based retransmissions from the User Agent in the case of unreliable transport. In the case of lost synchronization because of node restarts or link failures, SIP does not provide any recovery mechanism. The applications have to do the work of recovering the call state or releasing the ongoing sessions. Also, there is no protection against a possible avalanche registration effect after a node restart.
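The timer-based retransmissions over unreliable transport can be illustrated with the schedule that RFC 3261 describes for an INVITE request: the first retransmission happens after T1 (500 ms by default), the interval doubles after each retransmission, and the transaction gives up when 64*T1 has elapsed. The sketch below only computes that schedule; the function name is an assumption of this example, and the rules for non-INVITE requests, which cap the interval, are not covered.

```python
T1_SECONDS = 0.5  # default round-trip time estimate in RFC 3261


def invite_retransmission_times(t1=T1_SECONDS):
    """Times, in seconds after the first send, at which an unanswered
    INVITE is retransmitted over an unreliable transport."""
    timeout = 64 * t1      # the transaction is abandoned at this point
    times = []
    interval = t1          # wait before the first retransmission
    elapsed = t1
    while elapsed < timeout:
        times.append(elapsed)
        interval *= 2      # the wait doubles after every retransmission
        elapsed += interval
    return times


if __name__ == "__main__":
    schedule = invite_retransmission_times()
    print("retransmissions at", schedule, "seconds")
    print("total transmissions:", 1 + len(schedule))
```

With the default T1 this gives six retransmissions, that is, seven transmissions of the same request within 32 seconds, which shows how a lost response can multiply the signaling load per user.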

[Figure 7.5 plot. Title: SIP message processing delay for 4 Proxy implementations. Vertical axis: microseconds (0 to 2000). Horizontal axis: SIP message size (327, 389, 423, 467, 499). Series: Proxy A, Proxy B, Proxy C, Proxy D.]

Figure 7.5: SIP message processing delay of 4 proxy implementations. Source: [28]

4. Extensibility. SIP is a modular and extensible protocol. Special needs are solved by defining extensions. The extensions are negotiated between User Agents using the Require and Supported headers. Only the extensions supported by both parties can be used, and the core protocol functionality always has to be supported by all the User Agents. This gives the possibility of fall-back and ensures interoperability. RFC 4485, "Guidelines for Authors of Extensions to the Session Initiation Protocol (SIP)" [85], provides some design principles for the definition of SIP extensions. The extensions have to be registered with IANA and they might require approval from the IETF, especially when broadly used. Several extensions have already been defined for SIP and are widely adopted in almost all SIP implementations.

5. Scalability. The connection model of SIP is client-server, as already mentioned. The characteristics of the connection are very similar to a peer-to-peer connection, but still with the client-server approach of transactions with request and response. The connection between two or more User Agents is known as the SIP dialog, since in general the User Agents exchange the roles of server and client quite often. The nature of this connection depends on the transport layer used for SIP, for example on whether it is possible to use encryption, message retransmissions or congestion detection. Following the IETF Internet paradigm, SIP is meant to use other protocols to address needs different from its main function of describing a session. Although it is not the main responsibility of SIP to handle issues like security or congestion, it is affected by the performance of the lower layer protocols that take care of them. One way used by SIP server implementations to handle a high signaling load in a node is to store sessions on disk, so that hardware problems (restarts, loss of connection, etc.) do not cause loss of synchronization in the on-going sessions. This is a clear disadvantage for the call setup time, since disk-based storage is slower than memory-based storage.
The redirection of requests to SIP servers by broker functions and load balancers is also a solution that has received a lot of attention in networks operating with SIP servers. These entities know the load in a pool of servers as well as the different capabilities of each of them. In this way, they forward the session to the entity that they consider capable of handling it in terms of capabilities and/or load level.

6. Security. SIP is a protocol designed to be used over the Internet. It has inherited the authentication mechanisms used by HTTP to authenticate a user, and it uses S/MIME (Secure/Multipurpose Internet Mail Extensions) [39] to provide confidentiality and message integrity. The end-to-end security is trusted to be provided by lower layers with protocols like IPsec [67] and TLS (Transport Layer Security) [30]. For IMS, an extra feature named Topology Hiding has been defined. Since some SIP messages contain sensitive information about an operator, such as the addresses or the number of entities on the network, an operator may choose to hide this information to minimize the risk of attacks. This is done by encrypting the headers of the SIP messages as they leave the network and decrypting them when they enter the internal network. This practice is not supported by IETF but is judged necessary by 3GPP. Without the support of any other protocol, SIP is vulnerable to man-in-the-middle, denial-of-service and eavesdropping attacks.

7. Interoperability. All the SIP User Agents have to be compliant with the core SIP protocol. In this way, a basic level of interoperability is ensured. For more advanced features, a mechanism is provided to negotiate the different extensions that are supported by the UA. In that way the client and the server can decide the best way to negotiate their session. SIP is also text based and reuses many characteristics of well known Internet protocols: HTTP, SMTP and MIME. This makes it easy to share implementation code with the code used to implement those protocols for other services (web and e-mail).

8. Development. This item presents the future trends and the current state of the protocol. A lot has been said about the future of SIP and its development direction. The flexibility of the protocol and the ease of incorporating extensions have helped SIP to gain momentum and a reputation inside the packet-based telecommunications area. The direction of development in the beginning was to ensure the harmonized coexistence of SIP and the rest of the IETF protocols.
Part of the SIP development are also the different SIP extensions and the proposals of formal methods to use SIP in specific situations, and even as transport for other protocols that use part of the infrastructure provided by SIP. The current development has started to get closer to the users: the use of SIP in P2P networks, quality-of-service assurance, lawful interception and emergency services (911-type) are subject to analysis. Other issues, such as privacy, protection against DoS attacks and network access handover, are under continuous consideration.
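The general SIP message format described in the Functionality item (start line, header fields, empty line and optional body) can be sketched with a small parser. This is a simplified illustration and not a complete SIP parser: it assumes CRLF line endings, ignores header line folding and compact header names, and keeps only the last value when a header is repeated. The function name and the sample addresses are hypothetical.

```python
def parse_sip_message(raw):
    """Split a SIP message into start line fields, headers and body."""
    # The first empty line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Header names are case-insensitive, so they are stored in lower case.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Both kinds of start line have three space-separated elements.
    first, second, third = lines[0].split(" ", 2)
    if first.startswith("SIP/"):
        code = int(second)
        start = {
            "kind": "response",
            "version": first,
            "status_code": code,
            "reason": third,
            "final": code >= 200,  # 100 to 199 are provisional
        }
    else:
        start = {
            "kind": "request",
            "method": first,
            "request_uri": second,
            "version": third,
        }
    return start, headers, body


if __name__ == "__main__":
    request = (
        "INVITE sip:bob@example.com SIP/2.0\r\n"
        "To: <sip:bob@example.com>\r\n"
        "From: <sip:alice@example.com>;tag=1928301774\r\n"
        "CSeq: 314159 INVITE\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
    print(parse_sip_message(request))
```

Treating the body as an opaque string mirrors the statement that the body content is transparent for SIP: a session description or a media control document would simply be handed to the application that understands it.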


7.3.3 Media Sessions Mark-up Language (MSML)

At the moment of writing this work, MSML is defined in an Internet draft [87] and therefore it is not finalized yet. The first version of the draft was released in June 2003, and it originally comprised two separate drafts: one for MSML and another one for the Media Object Markup Language (MOML). These two drafts were merged in February 2006 into the current draft for MSML. MSML was developed to provide a way to control media servers using SIP as transport. Still, the latest versions of MSML claim that it is a transport-independent protocol.

1. Architecture specifics. The MSML transactions are triggered by events in the application domain that require services provided by the media servers. MSML provides an abstraction of the media server called the Media Server Object Model. This model assumes that there exists one single control context within a media server. This control context is aware of the state of all media objects and media streams within the media server. The objects are endpoints of one or more media streams. There are four types: network connections, conferences, dialogs and operators (see table 7.6). The single control context receives and processes all MSML requests and all events generated internally by media objects, and sends them to the appropriate SIP dialog. As already mentioned, MSML claims to be transport independent, although the only transport defined at the moment of writing this work is SIP. SIP is used first to establish the session and also to provide third party call control [84] to create sessions on behalf of end users. The IETF draft presents two alternative ways to use SIP to transport MSML: one uses SIP INFO messages and the other uses the SIP Control Framework [17]. VoiceXML can be used in MSML for the dialog descriptions as well.

2. Functionality. An MSML request may carry several actions (elements) to be processed, or a single command.
The multiple elements request must be processed in the sequential order in which the elements are sent in the request. Every request is treated as a simple transaction, and a media server should make sure that it has enough resources to carry out the transaction before executing the request. The execution is expected to happen immediately, meaning that often the execution of elements that can take long or unpredictable quantities of time are forked. When the fork is successful, the next element is processed. The transaction must be stopped when an error occurs

during the execution of an element. The elements already processed are not rolled back, and the client is responsible for taking the necessary actions according to the situation. The value of the mark attribute of the last processed element must be returned with the error response. If the error happens during a forked process, an asynchronous event carrying the error should be issued. The transaction results are returned as part of the response to the SIP request, indicating the success or failure of the transaction.

| Object | Description | Example |
| Network connection | Abstraction representing the media stream processing resources involved in an RTP termination of a call. Network connections are instantiated through SIP. | Full duplex audio stream, multimedia connections |
| Conferences | Represent the resources and state information required for a single logical type of media mix in a conference. Conferences are instantiated by the <createconference> element of MSML. | Audio, video |
| Dialogs | Represent automated participants. They are very similar to network connections from a media flow perspective, although they are instantiated by MSML, through the <dialogstart> element, instead of by SIP. | Interactive messages, tones, recorder |
| Operators | Functions used to process a media stream (filtering, transforming, adapting, etc.). They are divided into unidirectional and bidirectional operators, which differ in the number of input and output streams. They are instantiated implicitly when streams are created or modified by the <join> and <modifystream> elements. | Unidirectional: gain control, tone filtering. Bidirectional: muxers, moderated muters |

Table 7.6: MSML object classes

The objects are referenced by using identifiers. An identifier is composed of one or more terms, each of which specifies an object class and the name of a specific instance within that class. Identifiers are assigned to objects when they are created. Some classes of objects, such as conferences and network connections, can exist independently on a media server to provide services that are going to be used in future connections. The dialogs provide services to independent


objects, either by acting as participants of a conference or by interacting with a connection. Dialogs therefore depend upon the existence of independent objects, which is reflected in the composition of their identifiers. Operators also depend on other objects, but they are not represented by any identifier, since they are used to modify the media flow between other objects. The relationship between MSML objects is presented in the example of figure 7.6.
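The transaction rules described in this item (sequential processing, stop at the first error, no rollback, forked long-running elements) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the media server logic of the draft: the Element and Result types, their field names and the use of RuntimeError to signal a failed element are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Element:
    """One action inside a request (hypothetical type for illustration)."""
    mark: str                 # value reported back if processing stops later
    run: Callable[[], None]   # assumed to raise RuntimeError on failure
    forked: bool = False      # long-running work is started, not awaited


@dataclass
class Result:
    ok: bool
    last_mark: Optional[str] = None            # last processed element
    forked: List[str] = field(default_factory=list)


def process_request(elements: List[Element]) -> Result:
    """Process elements strictly in order and stop at the first error.

    Elements that already ran are not rolled back; the client uses
    last_mark to see how far the request got and decides how to recover.
    """
    result = Result(ok=True)
    for element in elements:
        try:
            element.run()     # for a forked element this only starts it
        except RuntimeError:
            result.ok = False
            return result
        result.last_mark = element.mark
        if element.forked:
            result.forked.append(element.mark)
    return result
```

An error raised later by a forked element is not covered by this sketch; according to the text it would be reported separately as an asynchronous event.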

[Figure 7.6: MSML Object classes relationship. The figure shows a Media Server containing a Network Connection (RTP), an Operator, a Conference and two Dialogs.]

The identifiers allow every object in a media server to be uniquely addressed, but they can also be used to address multiple objects by using wildcards. The MSML elements can restrict the class of objects that are valid in a given context, even though the identifiers share a common syntax. The language structure of MSML is based on a package scheme and a profile scheme. A package is an integrated set of one or more XML schemas that defines additional features and functions by new or extended use of elements and attributes [87]. MSML consists of a core package that basically provides a structural skeleton and does not support any specific feature set. The functionality is provided by additional packages that rely on the core package. Figure 7.7 shows the hierarchy of the MSML core packages, and table 7.7 describes the mandatory and conditionally mandatory packages.
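The identifier scheme described above (terms made of an object class and an instance name, dialogs that only exist attached to an independent object, and wildcard addressing) can be illustrated with a small parser. The class prefixes conn, conf and dialog, the "/" separator between terms and the "*" wildcard are assumptions made for this sketch; the normative syntax is the one defined in the draft [87].

```python
from typing import List, Tuple

# Assumed class names: connections and conferences can exist on their own,
# dialogs only exist attached to one of them.
INDEPENDENT = {"conn", "conf"}
DEPENDENT = {"dialog"}


def parse_identifier(identifier: str) -> List[Tuple[str, str]]:
    """Split 'class:name[/class:name]' into a list of (class, name) terms."""
    terms = []
    for term in identifier.split("/"):
        cls, sep, name = term.partition(":")
        if not (cls and sep and name):
            raise ValueError(f"malformed term: {term!r}")
        terms.append((cls, name))
    if terms[0][0] not in INDEPENDENT:
        raise ValueError("the first term must name an independent object")
    if any(cls not in DEPENDENT for cls, _ in terms[1:]):
        raise ValueError("only dependent objects may follow the first term")
    return terms


def select(pattern: str, identifiers: List[str]) -> List[str]:
    """Return the identifiers matched by a pattern; '*' matches any name."""
    wanted = parse_identifier(pattern)
    matches = []
    for candidate in identifiers:
        terms = parse_identifier(candidate)
        if len(terms) == len(wanted) and all(
            w_cls == cls and w_name in ("*", name)
            for (w_cls, w_name), (cls, name) in zip(wanted, terms)
        ):
            matches.append(candidate)
    return matches
```

Operators do not appear in this sketch because, as stated above, they are not represented by any identifier.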

[Figure 7.7: MSML Core Package Hierarchy. Packages shown: MSML Core, Dialog Core, Dialog Base, Dialog Transform, Dialog Group, Dialog Speech, Dialog Fax Detect, Dialog Fax Send/Receive, Conf Core, Audit Core, Audit Conf, Audit Conn, Audit Dialog and Audit Stream.]

Since applications and devices might not support all the available functionality of MSML, a description of the supported set is needed so that the supported features can be identified easily. This is provided by the profile scheme. With profiles it is possible to point to the subset of a package's functionality that is required or supported by a party, for example an audio-only announcements profile that does not support video announcements. Conferences using MSML have a mixer for every type of media supported in the conference. The mixer description elements describe how the multiple inputs are combined into a single logical output. Media streams are created when objects are connected together. Every object has at least one input and one output for each type of media that it supports. Therefore the connections between the objects represent the media stream

connections.

| Package name | Description | Mandatory |
| MSML Core package | Minimum framework, which must be implemented to support additional core packages. | Full mandatory |
| MSML Conference Core package | Basic and advanced multimedia and audio conference package. Used if conferencing is used. | Conditional |
| MSML Dialog Core package | Dialog core package, implemented for any dialog services. Systems supporting only conferencing may omit support for MSML dialogs. | Conditional |
| MSML Dialog Base package | Used if the Dialog Core package is in use. | Conditional |
| MSML Audit Core package | Specifies the framework within which additional audit packages are supported. It must be implemented to support auditing services. | Conditional |
| MSML Audit Conference package | For auditing Conference, Conference Dialog and Conference Stream. | Conditional |
| MSML Audit Connection package | For auditing Connection, Connection Dialog and Connection Stream. | Conditional |
| MSML Audit Dialog package | For auditing Dialog. Must be used with either the MSML Audit Conference package or the MSML Audit Connection package. | Conditional |
| MSML Audit Stream package | For auditing Stream. Must be used with either the MSML Audit Conference package or the MSML Audit Connection package. | Conditional |

Table 7.7: MSML mandatory and conditional packages

3. Performance. Since MSML is in practice based on SIP, its performance is not expected to exceed the SIP performance already analyzed in terms of bandwidth usage and processing delays. An increase in processing time and memory consumption can even be expected, because of the monolithic media server context and the extra parsing of MSML and of VoiceXML (if it is used). In this case the control plane of the server may become a limitation for the session handling capacity of the media server. The MSML Internet draft does not mention any special load control features integrated into the protocol, but relies on the lower layers to provide them. Some control could


be implemented by an application by inferring the load level from the error codes returned. The error handling is basically copied from SIP, with the added possibility of reporting errors produced by forked executions by means of asynchronous events.

4. Extensibility. The creation of new packages is the primary way of extending MSML. Such package extensions should follow the presented hierarchy and the core framework. MSML packages are assembled together to form a specific MSML profile that is shared between different implementations. An MSML script must include references to all the schemas that define the packages used by the script. According to the MSML Internet draft, MSML package profiles must be published in standard documents dedicated to MSML package profiles, which are separate from the MSML specification. The profiles are not registered with IANA. On the other hand, registration of the Control Package names and of the associated XML schema is planned, and the packages are intended to be registered with IANA in the future. Auditing of supported MSML packages and profiles is not specified at the moment.

5. Scalability. MSML does not specify any special handling for overload situations. Connection congestion is left either to the application control or to the transport protocol to resolve and recover from. Since MSML follows a client-server connection model, many controllers could use the resources of one Media Server. This might complicate the hardware allocation logic of the Media Server, and other protocols may need to be deployed to provide, among other things, authentication, policing, and capacity and capabilities handling. The protocol assumes a monolithic context that handles the whole pool of available resources and the state of all the connections.
This concept of a monolithic context might in practice limit the capacity and not be very scalable unless a modular and distributed implementation is deployed. Such an implementation of a monolithic context is usually complex and might involve performance issues at the inter-process communication level.

6. Security. The MSML Internet draft includes a security consideration stating that security is a function of the protocol or language that invokes MSML. The security considerations defined for XML in RFC 3023 [75] are also considered applicable. This means that, when SIP is the transport protocol, the


already mentioned security issues are valid for MSML as well. In addition, the inherent XML security risk of fetching package dependencies from sources of unknown security could increase the risk of malicious changes. For example, changes in the master templates can force the execution of unintended commands or the exposure of information.

7. Interoperability. Using SIP as transport protocol gives MSML the flexibility of reusing the SIP stack to provide control of the connections to a certain extent. Also, the client-server architecture is very flexible and is present in many different types of networks. These two features might give MSML some leverage to be deployed in different network contexts and not only in the IMS, where SIP has already been deployed. This is one of the main claims of the protocol's inventors, who see the wide deployment of SIP as an important trend to be followed and complemented with the media control services provided by MSML.

8. Development. The protocol is still evolving and being specified in the IETF. Some products incorporating early versions of the protocol have been released, but in general the community has not given it much attention, even though it is open and free of patent considerations. Its main claim is its similarity to H.248, since it basically ports the main H.248 functionality into an XML-compatible format that can easily be transported over SIP.
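The conditional package rules of table 7.7 can be read as a small dependency check on the set of packages that a profile claims to support. The sketch below is an illustration only: the short package labels, the RULES table and the check_profile function are hypothetical names, and the dependency of the conference, dialog and audit core packages on the MSML Core package is this example's reading of the table.

```python
from typing import Dict, List, Set

# Each package maps to a list of "any-of" groups; at least one member of
# every group must also be present in the profile.
RULES: Dict[str, List[Set[str]]] = {
    "core": [],
    "conf-core": [{"core"}],
    "dialog-core": [{"core"}, {"dialog-base"}],
    "dialog-base": [{"dialog-core"}],
    "audit-core": [{"core"}],
    "audit-conf": [{"audit-core"}],
    "audit-conn": [{"audit-core"}],
    "audit-dialog": [{"audit-conf", "audit-conn"}],
    "audit-stream": [{"audit-conf", "audit-conn"}],
}


def check_profile(packages: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the set is consistent."""
    problems = []
    if "core" not in packages:
        problems.append("core is mandatory")
    for pkg in sorted(packages):
        if pkg not in RULES:
            problems.append(f"{pkg}: unknown package")
            continue
        for group in RULES[pkg]:
            if not group & packages:
                problems.append(f"{pkg}: needs one of {sorted(group)}")
    return problems
```

For instance, a conferencing-only profile containing core and conf-core passes the check, while a profile that lists audit-dialog without audit-conf or audit-conn is reported as inconsistent.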

7.3.4 Media Server Control Mark-up Language (MSCML)

The Media Server Control Mark-up Language has been specified in the informational RFC 4722 [32] of the IETF. It was driven together with the Network Announcement (NETANN) protocol, specified by RFC 4240 [19] and called Basic Network Media Services with SIP. Such services include network announcements, user interaction and conferencing services. NETANN was found insufficient for advanced conferencing and IVR handling. Therefore, SnowShore (today Dialogic) authored and drove MSCML in opposition to MSML, driven by Convedia (nowadays RadiSys). MSCML is guaranteed to be royalty free, although it is covered by patent protection under normal conditions.

1. Architecture specifics. SIP has been defined as the transport protocol for MSCML. The Media Servers receive the MSCML commands by means of SIP INVITE and SIP INFO messages. The abstraction level of MSCML is high enough to provide the application


layer with constructs and primitives that handle the required conference services with single commands, whose semantics leave the implementation details to the Media Server. This contrasts with H.248 and MSML, which keep tight control over the Media Server actions. The connection model is a peer-to-peer relationship in a client-server association. The use of SIP also enables the native SIP service location and load balancing. The Media Server must support SIP message bodies with the MIME type multipart/mixed. The MSCML response should be placed in the final response to the SIP INVITE used to transport the MSCML request. When SIP INFO is used, the only allowed final response is a 200 OK; consequently, the Media Server sends the MSCML response in a separate SIP INFO request. Only one MSCML body may be present in a SIP message, and only one MSCML request or response may be contained in an MSCML body. Also, MSCML does not support provisional responses, only final ones. However, a request may result in multiple notifications. The request may include a uniquely defined client ID attribute to provide a mechanism for matching requests and responses.

For basic conferences, the procedure described by RFC 4240 [19] is followed: the first INVITE to the media server with a unique client ID creates a conference, and it is based on SDP. The advanced conference, on the other hand, requires that the first INVITE contains an MSCML <configure_conference> payload, which extends the session parameters beyond what can be expressed with SDP. The resulting SIP dialog created by the Media Server Controller is called the Conference Control Leg, and the conference exists during the lifetime of this leg. The Media Server does not expect any RTP stream associated with this leg. Figure 7.8 shows an example of an advanced conference model, and figure 7.9 presents a graphical view of the MSCML connection model.

2. Functionality.
This item analyzes the functions and services provided by the protocol. There are two classes of MSCML functionality. The first includes primitives for advanced conferencing (see table 7.8) and the second provides primitives for IVR (presented in table 7.9). Advanced conferencing is available once the Conference Control Leg has been created. Once the conference has been created, the client can manipulate it as a whole, or a particular leg or team, by issuing commands on the associated SIP dialog. The support for video conferencing is implicitly


supported in the form of video switching. The video of the loudest talker is sent to all conference participants except the talker, who instead receives the video of the immediately prior loudest talker. The Media Servers should provide the transcoding of the media and also ensure that the media sent is supported by the participants. The conference events are sent mainly to the Conference Control Leg, and the client can subscribe to active talker event reports received on that leg. Finally, a conference is terminated when the client issues a SIP BYE request in the dialog representing the Conference Control Leg. The media server replies with 200 OK and issues a SIP BYE request to all the other legs in the conference.

[Figure 7.8: MSCML Advanced Conference Model. The figure shows three UEs with their RTP streams, the AS and the MRF, with the labels UE Public URI, SIP Session, Private URI and MSCML Session.]

The IVR services provided by MSCML are:
- Basic interactive voice response functions
- Playing of announcements
- Collecting of DTMF digits
- Recording
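The video switching rule described earlier in this item (every participant receives the video of the loudest talker, and the loudest talker receives the video of the previous loudest talker) can be expressed as a simple routing function. The function name and the dictionary it returns are assumptions made for this illustration; how a media server actually tracks talkers is implementation dependent.

```python
from typing import Dict, List, Optional


def video_sources(
    participants: List[str],
    loudest: str,
    previous_loudest: Optional[str],
) -> Dict[str, Optional[str]]:
    """Map each participant to the participant whose video it receives."""
    routing: Dict[str, Optional[str]] = {}
    for participant in participants:
        if participant != loudest:
            routing[participant] = loudest
        else:
            # The talker never receives its own video; None means that no
            # previous talker is known yet.
            routing[participant] = previous_loudest
    return routing
```

With participants a, b and c, loudest talker b and previous loudest talker a, both a and c receive the video of b, while b receives the video of a.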


[Figure 7.9: MSCML Media Server Connection Model. The figure shows a Media Server with one Conference Control Leg and three SIP Conference Legs. Each leg carries SIP with MSCML, each SIP Conference Leg has an RTP stream, and the MSCML relationships between the legs are indicated.]

The MSCML IVR requests should be sent in SIP INFO requests; they are not allowed in INVITEs. The participant leg that will receive the IVR commands should be configured with mixmode = parked (using <configure_leg>, for example), which isolates the input and output of the participant from the rest of the conference. Digit collection and recording are exceptions: for these functions it is not necessary to configure the mixmode as parked. The service indicator for IVR must be set to ivr in the URI used, in contrast to the dialog indicator used by VoiceXML. The media servers do not queue IVR requests. If a second IVR request is received while the Media Server is processing the first, the first request has to be stopped and the second request executed. The Media Server replies with a response to a request once it has been completed or stopped.

3. Performance. The MSCML performance will be very close to the analyzed SIP performance, because the protocol depends entirely on SIP. The goal of the protocol is to provide an application interface, and therefore it does not

interact at the stream level for the media control. That could indeed bring some improvements in call setup times, resource reservation and modification, and processing time. The high-level interface proposed by the protocol encourages the deployment of proprietary solutions driven by high-level instructions, giving the Media Server implementation great freedom to process the request in its preferred way, as long as the result is the expected one. There are no publicly available performance measurements of commercial MSCML servers and, as has been stated, the results might depend strongly on the implementation. The error handling is also rather simple, because the high level of the requests and the architecture give enough flexibility to deploy the system in a distributed environment, where the processing load can be shared and controlled.

| Request method | Description | Attributes |
| <configure_conference> | Used to create an advanced conference when the conference control leg is created, and to provide new input for an existing conference. | reservedtalkers (optional unless creating the conference control leg): the maximum number of participant legs to be allocated for the conference. reserveconfmedia (optional): controls the allocation of resources for playing to or recording from the conference. |
| <configure_leg> | Provides the properties of the conference legs that the client can modify. | type (optional): the values are listener and talker; indicates to the server whether the leg is included in the output mix. dtmfclamp (optional): if yes, removes detected DTMF digits from the input audio. toneclamp (optional): if yes, removes tones from the input audio. mixmode (optional): the valid values are full, parked, mute, preferred and private. |
| <configure_team> | Configures the participants as members of a team within a specific conference. It is a child of <configure_leg>. | id: the unique ID of the leg being modified. action: can take the values add, delete, query and set. |

Table 7.8: MSCML Request Methods for advanced conference
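As an illustration of how one of the request methods of table 7.8 could be serialized, the following sketch builds an MSCML body with the Python standard library and checks the rule that a body carries a single request or response. The MediaServerControl envelope, its version attribute and the request wrapper are assumptions of this example that should be verified against RFC 4722 [32]; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def mscml_request(method: str, **attributes: str) -> str:
    """Build an MSCML body that carries exactly one request."""
    root = ET.Element("MediaServerControl", {"version": "1.0"})
    request = ET.SubElement(root, "request")
    ET.SubElement(request, method, attributes)
    return ET.tostring(root, encoding="unicode")


def carries_single_message(body: str) -> bool:
    """Check that a body holds one request or one response, and nothing else."""
    root = ET.fromstring(body)
    messages = [child for child in root if child.tag in ("request", "response")]
    return len(messages) == 1 and len(root) == 1


body = mscml_request(
    "configure_conference", reservedtalkers="120", reserveconfmedia="yes"
)
```

For an advanced conference, a body like this one would travel in the first SIP INVITE, which creates the Conference Control Leg; later commands would be sent on the associated SIP dialog.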

| Request method | Description | Attributes |
| <play> | Plays an announcement without interruption and with no digit collection. | id (optional): a client-defined id. promptencoding (optional): specifies the content encoding when the content is not self-describing. prompturl (optional): provides the URL of the content to be played. |
| <playcollect> | Plays an announcement and collects digits. Shares the basic attributes defined for <play>. | cleardigits (optional): specifies whether previous user input should be considered. barge (optional): specifies whether user input barges the prompt and forces the transition to the collect phase. maskdigits: controls whether the DTMF inputs are logged in the media server. maxdigits: specifies the maximum number of digits to be collected. |
| <playrecord> | Indicates the need to convert and/or transcode RTP streams and store them at a URL using the specified codec(s). Shares the basic attributes defined for <play>, plus barge and cleardigits from <playcollect>. | escapekey (optional): specifies a DTMF key that terminates the operation without saving any input recorded up to that point. recurl: specifies the URL for the recording. recencoding (optional): specifies the encoding of the recording. mode (optional): specifies whether the recording should overwrite or be appended to the target; the possible values are overwrite and append. duration (optional): maximum allowable time for the recording, expressed as a time value from 1 onwards or as the strings immediate and infinite. beep (optional): specifies whether a beep should be played to the caller; the valid values are yes and no. initsilence (optional): specifies how long to wait for initial input before canceling the recording, expressed as a time value from 1ms onwards or as the strings immediate and infinite. endsilence (optional): indicates how long the media server waits after speech has ended before stopping the recording, expressed in the same way as initsilence. recstopmask (optional): points to a list of individual DTMF characters that, if detected, cause the recording to be terminated. |
| <stop> | Stops a request in progress without initiating another operation. | id (optional): a client-defined id. |

Table 7.9: MSCML Request Methods for IVR
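The handling of IVR requests described in the Functionality item (no queueing, a new request stops the one in progress, and every request is answered once it has been completed or stopped) can be modelled with a very small state holder. The IvrLeg class, its method names and the outcome strings are hypothetical and only serve this illustration.

```python
from typing import List, Optional, Tuple


class IvrLeg:
    """Tracks the single IVR request that a leg may run at a time."""

    def __init__(self) -> None:
        self.active: Optional[str] = None
        self.responses: List[Tuple[str, str]] = []  # (request id, outcome)

    def submit(self, request_id: str) -> None:
        """Start a request; a request already in progress is stopped first."""
        if self.active is not None:
            # No queueing: the running request is stopped and answered.
            self.responses.append((self.active, "stopped"))
        self.active = request_id

    def stop(self) -> None:
        """Model of <stop>: end the running request without starting another."""
        if self.active is not None:
            self.responses.append((self.active, "stopped"))
            self.active = None

    def complete(self) -> None:
        """Called when the running request finishes normally."""
        if self.active is not None:
            self.responses.append((self.active, "completed"))
            self.active = None
```

Submitting a <playcollect> while a <play> is still running therefore produces a "stopped" response for the <play> before the <playcollect> eventually produces its own response.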


4. Extensibility. The MSCML RFC [32] does not provide any explicit extension mechanism. The protocol is under a royalty-free license regime: it is free to be deployed, but there are barriers to further development by parties other than the current patent holders.

5. Scalability. The protocol provides no overload handling other than error codes in transaction responses. It relies on the implementation of the handling application, the distribution of resources and the flexibility of the protocol to address those problems, which are considered out of the scope of the protocol.

6. Security. The responsibility for session security is transferred to the Media Server, and there is a strong recommendation to implement security measures that provide integrity and confidentiality. The referenced security model is the one used by SIP to provide authentication, authorization and access control. The media server must also support TLS and SIPS (SIP Secure) for the signaling and should support SRTP (Secure RTP) for the media plane. The relationships of the Media Servers with external entities such as HTTP servers must also be secured and authenticated. The general considerations for XML markup also apply to this protocol.

7. Interoperability. MSCML targets networks where SIP is the dominant signaling protocol and where nodes are preferred to enclose a large amount of functionality and provide simple and straightforward services. It can easily be adapted to systems that already have a SIP stack by extending the processing functionality to handle the XML-based payload. However, the high-level paradigm makes it very difficult to adapt to systems that have been operating with a tight control interface (like H.248). The protocol allows fall-back to the simple conferencing services provided by SIP and NETANN, but it mainly transfers the support responsibility, as well as other issues, to the Media Server implementation.

8. Development. MSCML has been specified in the IETF RFC 4722 [32]. At the time of writing, the framework and paradigm supported by MSCML are being extended in the newly created IETF workgroup MEDIACTRL, and are specifically addressed by the workgroup draft A Control Framework for the Session Initiation Protocol (SIP) [17].


7.4 Traffic Scenario Implications

The type of traffic that IMS media servers will face is still largely unknown. The vendors expect the introduction of old and new services in the IMS, driven especially by the Internet. This study focuses on advanced conference services with video and audio, in addition to multimedia playback and user interaction. These might not be the main applications used by the users, but applications will certainly make use of these capabilities to provide more functionality to the users. A good example is given today by Facebook [38]. This web application gives users the power to select the type of applications that they want to have attached to their profiles and share with their friends. The Facebook platform provides them with the tools to attach such applications to their profiles and make use of them, while application developers provide the applications and make them available to the users. The applications range from text posting and picture sharing to conferencing and instant messaging systems. The Facebook example gives us an analogy of what we can expect from Media Servers in the IMS: a platform that would allow users to have their application requests processed in an easy and transparent way. The dilemma lies in the degree of control that the network has over what the users can request and how it is processed. Giving users full access to the media processing capabilities of the network is simply not going to happen for obvious reasons (security, resource administration, charging policies, etc.); therefore intermediate control is needed and desired by the network operators. The other parties (users, vendors and service providers) might have a different view:
- A tighter control gives the network operators more power to decide exactly how the resources are administered. It provides them with more granular detail of the operations performed in their network and with flexibility to handle new services, but at the same time it might increase the complexity, with some impact on their OPEX. The complexity arises from the security mechanisms needed to verify the operations requested from the Media Servers and from the handling of the network traffic targeting the Media Servers (i.e. load control, QoS, etc.). For the end user, tight network control might translate into longer delays due to signaling, resource allocation and security verifications. For the vendors, provisioning control nodes with complex logic that implements the operators' needs is good business: it opens opportunities for proprietary optimizations and creates dependency on the vendor's equipment (if it provides good performance). Finally, the service providers do not have a transparent interface to deal with, but a network-provided contact point and total dependency on the network policies, agreements and handling of the operators. This type of control implies a greater degree of use of the Mp interface, while the Mr interface would handle comparatively less signaling.
- An application-driven control (or loose control) gives the Media Server application the responsibility of serving the requests of the users' application handlers. This means that the Media Servers optimize their operations by having a high level of knowledge of what the application needs from the Media Server. The flexibility for operator control is diminished, and the Media Server becomes dependent on the request capabilities of the control protocol and on the availability of its own resources. New services might need to be introduced through changes in the control protocol. The operators could find this type of Media Server control difficult to steer from the point of view of security, QoS and load balancing, since the granularity of the request commands is quite low and their direct sources are the application servers (own and external). The users might experience shorter signaling delays and higher service availability, but maybe more incompatibilities when accessing from different operators. The service providers would obtain more control over their applications and a direct interface to utilize the network resources. Finally, the vendors would have to implement more complicated Media Servers instead of Media Controllers, which in practice would be simple proxies. The optimizations and service support are then concentrated in the Media Servers, and several proprietary solutions might compete in this area to provide the best performance and support for a wider range of services.

7.5 Comparison

The analysis of the studied Media Server control protocols yields several characteristics, capabilities and supported functionalities that can be grouped and compared. Table 7.10 briefly collects the relevant characteristics of the analyzed Media Server Control protocols. Among the general characteristics, we can first compare the fitting interface from the 3GPP point of view, according to the already defined specifications. Since H.248 has been defined as the protocol for the Mp interface [9], the SIP-based protocols are only suitable for the Mr interface or for a non-specified interface (possibly proprietary and highly application dependent) to control the media servers. This is shown in figure 7.10.


[Figure 7.10: Possible deployment of the Analyzed Media Server Control Protocols on the 3GPP interfaces of IMS (the ISC interface is not present for simplicity). The figure shows four cases:
No 3GPP MRF split, AS to MRF over Mr (MSCML/SIP): the functions (MRFC and MRFP) can be implemented in the same node. The combined node has to be more intelligent and complex to handle the high-level commands (i.e. more dependent on the application implementation).
No 3GPP MRF split, AS to MRFC/MRFP over Mr (MSML/SIP): the functions are implemented in a combined MRF, but the implementation divides the control part and the processing part similarly to the MRFC and MRFP, using a proprietary protocol to communicate between them.
3GPP MRF split, AS to MRFC over Mr (MSML/SIP) and MRFC to MRFP over Mp (H.248): less participation of the MRFC, which only translates the SIP-transported media control protocols to H.248.
3GPP MRF split, AS to MRFC over Mr (MSCML/SIP, Netann-SIP) and MRFC to MRFP over Mp (H.248): high participation of the MRFC in control, which turns the high-level commands of the SIP-transported media control protocols into lower-level H.248 commands.]

The MRF as specified by 3GPP provides the functions expected from the Media Server. It can be realized either as a logical split within one physical node that contains both the MRFP and the MRFC, or without any split at all. The first option implies that the Mp interface, and therefore H.248, is deployed. When a high-level application control protocol such as MSCML or even Netann-SIP is deployed, the MRFC would translate the high-level commands into H.248 and command the MRFP through more device-specific orders. The case of MSML would leave the MRFC with a mere translation function, where almost all the Mr commands can be mapped one-to-one to an H.248 order. On the other hand, when there is no split, the node can leave the application to take care of the high-level MSCML commands without the need for 3GPP's functional split (where the controller part is clearly divided from the processing part). MSML, meanwhile, can still make use of such a split because of its connection model and protocol framework, but there are no dependencies on the communication protocol between the parts, since they are realized in one node. In other words, the application split


is independent of the protocol specied for the 3GPP interface between them. The likelihood of deploying one or another model is truly an operators choice, whether they prefer to keep a rm and strict control of their network nodes, or provide trust handling of their equipment to third-parties that provide the services. The rst alternative favors the models where the 3GPP split is enforced, especially the case where high level instructions are provided from the third-parts, meanwhile the second alternative requires more trust and condent in the third party control and provisioning of knowledge about the physical node characteristics. This last case also apply to the direct deployment of H.248 from the AP due to the tightly-coupled relationship of the master-slave model deployed. Another argument that can be used by the operators for a direct control of their media servers is the lack of load control support from the SIP transported protocols, and they might argue in favor of deploying broker functions that balance and control the load and the access to the media servers pool. From the vendors point of view, additionally to the operators wishes are also the cost of implementing the application that handles the functions of the node. The complexity required by some applications to handle a node might encourage implementations in which a well dened split can modularize the development of the products or let them deploy as many available proprietary solutions as possible. This complexity might be on dierent levels, in the controlling part or in the processing part. Big companies with access to hardware development facilities might prefer to implement the complexity in the processing part, while small companies handling mainly software might prefer to handle the complexity in the controlling part. The user that actually can be considered in two separate dimensions, depending on their usage of the operators network, are the application provider and the nal user. 
The application providers' needs can be very different, depending on the service they are going to provide. Mainly, though, they expect the network to satisfy their media processing requests efficiently, quickly, and reliably. The final users require similar treatment and have similar expectations of the network. Both therefore want the largest number of available functionalities, the highest efficiency in providing the services with as few transactions as possible, and the lowest latency. From the users' perspective, the high-level protocols give them a simple interface without the need for complex applications to handle the protocol requests. Even so, the networks are still driven by the operators' requirements. Operators consider the users' requirements in order to provide better service and to increase the deployment of the network. It is expected that an increase in network utilization means an increase in the revenue of the operators. In reality, this might depend on the operators' charging strategy.

Protocol Characteristics                    H.248                       SIP/Netann      MSML                 MSCML
General
  Applying Interface                        Mp                          Mr              Mr/Mp                Mr
  Connection model                          Master-Slave                Client-Server   Client-Server        Client-Server
  Standardized Transport                    ATM, TCP-UDP-SCTP over IP   UDP-TCP/IP      SIP                  SIP
  Load Balancing                            Some support                No              No                   No
  Extendability                             Defined                     Possible        Defined              Not defined
Abstraction level
  Application Framework Distribution        Distributed control         Peer-to-peer    Monolithic control   Distributed control
  Requests level                            Stream                      Session         Stream               Dialog
Complexity
  Media Server Controller logic complexity  High                        High            High                 Low
  Media Server logic complexity             Medium                      Low             Medium               High
  Commands length                           Long                        Medium          Long                 Short

Table 7.10: Comparison of the characteristics of the Analyzed Media Control Protocols

7.6 Preferred model

Functionality             H.248      SIP/Netann           MSML                 MSCML
Basic conferencing        Yes        Yes                  Yes                  Yes
Unscripted IVR            Possible   No                   Yes                  Yes
Scripted IVR              Possible   Yes                  If adding VoiceXML   If adding VoiceXML
Fax handling              Yes        No                   Yes                  Yes
Audio recording & Play    Yes        If adding VoiceXML   Yes                  Yes
Video recording & Play    Possible   No                   Yes                  Yes
Advanced conferencing     Yes        No                   Yes                  Yes
Statistics                Yes        No                   Yes                  No
Auditing                  Yes        No                   Yes                  No, only event report
Announcements             Yes        Yes                  Yes                  Yes
Video Announcements       Yes        No                   Yes                  Yes

Table 7.11: Functionality comparison of the analyzed Media Server Control Protocols

The analysis provides us with different protocol options for Media Server Control. The alternatives were examined in light of their characteristics, functionality, and final deployment. Balancing all the requirements from the different parties involved in the deployment of media server control protocols (including those involved indirectly, by using the media processing services), the model that presents the best set of characteristics is the MRF split model with a high-level control protocol for the Mr interface, such as MSCML, keeping the Mp interface based on H.248. The main reasons to select this alternative (see figure 7.11) are:
- H.248 is a well-known and tested protocol, proven even in real networks through its deployment in Media Gateways. In addition, the strong synergies between the MGW and Media Servers would facilitate the reuse of the application layers in the construction of Media Server products. Operators also benefit from this, since the personnel supporting their current MGW control can easily provide the same type of service for the additional media server nodes. The other protocols offer no better argument for replacing H.248 than the advantages of a SIP transport, which is not enough from the 3GPP perspective.
- The physical split allows the operators to control and administer their pool of resources directly from their trusted nodes, which interpret and serve the requests from their own and external application servers. This facilitates load handling and balancing.
- The deployment of H.248 facilitates state alignment between the nodes thanks to the Audit commands. It also supports the control of a single physical slave entity by multiple masters, through virtualization of the slave entity.
- A high-level protocol for Media Control instructions simplifies the application server interface and the requesting of services from the Media Servers, although it could restrict access to the full potential of the media servers. However, full access to the media server by external parties might be undesirable for the operators. In special cases, some kind of tunneling of pseudo-H.248 commands (or MSML) could be allowed, to be translated into the H.248 that the Media Server understands.
- This model is flexible enough to be extended and refined, and it is fully 3GPP compliant.

[Figure: the AS connects to the MRFC over Mr (MSCML/SIP, Netann-SIP), and the MRFC connects to the MRFP over Mp (H.248). Annotation: high participation in control from the MRFC, which maps high-level commands from SIP-transported Media Control Protocols to lower-level H.248 commands.]

Figure 7.11: Selected preferred Model for Media Server Control (The ISC interface is not present for simplicity)
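The translation role of the MRFC in this model can be illustrated with a minimal sketch. It assumes a hypothetical high-level "join these parties" request and expands it into one simplified Add command per participant. The names `H248Command` and `translate_join_request`, the dictionary layout, and the example addresses are illustrative assumptions; they are not taken from H.248, MSCML, or the prototype.

```python
from dataclasses import dataclass, field


@dataclass
class H248Command:
    """Simplified stand-in for one H.248 command (not the wire format)."""
    name: str          # "Add", "Modify" or "Subtract"
    context: str       # context id, or "$" to let the MRFP choose one
    termination: str   # termination id, or "$" to let the MRFP choose one
    descriptors: dict = field(default_factory=dict)


def translate_join_request(context, participants):
    """Expand one high-level 'join these parties' request into Add commands.

    `context` is an existing context id, or "$" to request a new one.
    `participants` is a list of (ip, port) tuples for the remote RTP endpoints.
    """
    return [
        H248Command(
            name="Add",
            context=context,
            termination="$",
            # "SR" (sender-receiver) mirrors the stream mode used in the flows.
            descriptors={"mode": "SR", "remote": {"ip": ip, "port": port}},
        )
        for ip, port in participants
    ]


commands = translate_join_request("1", [("192.0.2.1", 5004), ("192.0.2.2", 5006)])
for command in commands:
    print(command.name, command.context, command.termination, command.descriptors["remote"])
```

The point of the sketch is the asymmetry: the application server issues a single request, while the controller emits as many low-level commands as the media server needs.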

Chapter 8

Prototype

8.1 Mobile-MGW VoIP prototype

The objective of the prototype was to allow the M-MGW to be controlled by an MGCF (Media Gateway Control Function) using text-mode H.248 transported over an SCTP/IP stack. The MGCF needed to be able to set up a single or multiparty (conference) VoIP call using the MGW as a conference controller. The MGW connects directly to the SIP clients to send and receive audio media on top of RTP/UDP/IP (see figure 8.1).

8.2 M-MGW prototype considerations

The M-MGW prototype implementation work was scoped as follows:

- Only IP terminations were deployed. Other types of transport (ATM, TDM) were supported but not tested.
- The H.248 text parsing handled only the commands from specific, detailed call flows. Other commands and parameters were recognized by the parser but not processed.
- The parser implementation allowed the VMGW to be configured for binary or text mode, although the new functionality was implemented only for text-mode H.248. Binary-mode H.248 operates over the M3UA transport.


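[Figure: context Ctx1 containing terminations T1 and T2. The internal side of each termination is labeled PCM/AAL2, and each termination attaches through an IPB to an IP client (IP Client 1 and IP Client 2) over a PCM/RTP/UDP/IP stack.]

Figure 8.1: H.248 Context view targeted to the prototype connection model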

8.3 Affected User-Plane Control application subsystems

The prototype development affected only the application software of the User Plane Control Function. The following subsystems needed to be modified.

8.3.1 Signalling Transport Converter (STC)

The STC handles the transport of H.248 between the M-MGW and the MGC. Changes to the STC design were avoided by placing a Linux box between the SBC and the MGW to terminate the M3UA layer and forward the payload over SCTP. The bridge between the protocol stacks was built with an open source package (M3UA from SourceForge, or OpenSS7).

8.3.2 GCP Termination (GCPT)

This subsystem decodes and encodes the H.248 messages in binary mode, using ASN.1 with the Basic Encoding Rules (BER). It also handles the H.248 messages by breaking them into M-MGW internal data and vice versa, checks the syntax of the messages, and controls the timer implementations for H.248. The modification includes the implementation of the H.248 text-mode parser and the handling of the SDP package (the package itself has to be defined). There are some differences in how the parameters are transmitted in binary mode and in text mode. GCPT has to decode the SDP descriptors instead of tunneling them, as is done with the IPBC protocol. The generation and decoding of the text protocol was affected as well. The prototype mapped the text H.248 to the binary structures already handled by the MGW. This subsystem was the most affected.
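To make the SDP decoding step concrete, the following minimal sketch extracts the connection address and media port from a small SDP body, which is the kind of information GCPT has to recover from the Local and Remote descriptors. The function name `parse_sdp` and the returned dictionary layout are illustrative assumptions, and the sketch handles only the `v=`, `c=` and `m=` lines that appear in the prototype flows. It is not the parser used in the prototype.

```python
def parse_sdp(sdp_text):
    """Extract version, connection address and media port from an SDP body.

    A "$" in the address or port position is the H.248 CHOOSE wildcard and is
    returned as None, meaning "to be selected by the gateway".
    """
    result = {}
    for line in sdp_text.strip().splitlines():
        key, _, value = line.strip().partition("=")
        if key == "v":
            result["version"] = int(value)
        elif key == "c":
            # c=<nettype> <addrtype> <connection-address>
            _nettype, _addrtype, address = value.split()
            result["address"] = None if address == "$" else address
        elif key == "m":
            # m=<media> <port> <proto> <fmt> ...
            fields = value.split()
            result["media"] = fields[0]
            result["port"] = None if fields[1] == "$" else int(fields[1])
            result["proto"] = fields[2]
    return result


remote = parse_sdp("v=0\nc=IN IP4 192.0.2.10\nm=audio 49170 RTP/AVP 0")
local_request = parse_sdp("v=0\nc=IN IP4 $\nm=audio $ RTP/AVP 0")
print(remote)
print(local_request)
```

Mapping the wildcard to `None` keeps the decoded structure independent of the text encoding, which is in the spirit of mapping text H.248 onto structures that the binary path already handles.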

8.3.3 Connection Coordinator

The Connection Coordinator (CC) is responsible for mapping the H.248 view to the user plane view. In other words, the CC interprets the H.248 instructions and maps them into device reservations, connections, stream processing commands for the platform, and event monitoring. The prototype impacted several parts of the subsystem. The context handling had to be updated to accept the differences from the previous implementation. The stream management was also updated to use the SDP descriptors instead of decoding the IPBCP. The interface used to transport the SDP descriptor had to be updated accordingly. Also, to eliminate the Nb framing from the RTP/IP stack, the CC should no longer reserve by default the MFD (Multi-Function Device) that provides the Nb framing when the IP device is requested.
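The reservation change can be pictured with a short sketch. It assumes a hypothetical planning function, `plan_device_reservations`, that returns the devices to reserve for a termination; the function name, the device labels, and the `nb_framing` flag are illustrative assumptions and do not describe the real CC interfaces.

```python
def plan_device_reservations(termination_type, nb_framing):
    """Return the list of devices to reserve for one termination.

    Only IP terminations are modelled, matching the prototype scope. The MFD
    is added only when Nb framing is wanted on top of RTP/IP.
    """
    if termination_type != "IP":
        raise ValueError(f"unsupported termination type: {termination_type}")
    devices = ["IP device"]
    if nb_framing:
        devices.append("MFD")  # provides the Nb framing
    return devices


print(plan_device_reservations("IP", nb_framing=True))   # previous default behaviour
print(plan_device_reservations("IP", nb_framing=False))  # prototype VoIP call
```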

8.4 M-MGW Prototype testing

The testing environment for the applications was divided into three levels:

- Subsystem Test. It included an independent test of every modified subsystem:
  - STC Subsystem test. It verified the correctness of the H.248 over SCTP transmission, including associations and connection issues.
  - GCPT Subsystem test. It verified the correct interpretation of the H.248 and the code execution.
  - MeSC Subsystem test. It checked the execution of the code and the resource reservation by the subsystem.
- Load Module Test. It tested all the affected load modules connected together, and covered only the signaling between the subsystems.
- Node Test. It tested all the prototype functionality in a real M-MGW node. Media processing was also possible, although this was not verified. The H.248 was sent to the node using a traffic generator tool.

8.4.1 H.248 flows between M-MGW and SBC

The Session Border Controller (SBC) acted as a Media Gateway Controller (MGC) from the point of view of H.248. The flows currently supported by the two nodes had some differences beyond the binary/text format of the protocol, and these needed to be unified and adapted to make interworking possible. For this prototype, only the successful cases needed to set up and release an RTP/IP conference call, and the start-up of the MGW node, were analyzed. The Session Border Controller H.248 call flows differed from the call flows deployed in the MGW as follows:

- Two commands are sent per transaction.
- The Session Description Protocol is used directly in the Local and Remote descriptors, and tunneling of IPBC is not used.
- Different packages are used, and some of the SDP MGW parameters are not supported.
- The local IP address is reported in the ADD response command, instead of being received in the NOTIFY request command.
- The SBC does not currently support a call setup with several terminations in a context.

Therefore, the prototype used the following settings, aiming for minimal impact on each node:

- One H.248 command was used per transaction.
- The SDP was kept as in the SBC local and remote descriptors.
- Unsupported packages were either not sent or ignored.
- The M-MGW now reported the local address in the ADD response command.
- NOTIFY commands were suppressed by the MGW and not sent to the SBC.
- The adding of further terminations (after two parties) to one context was also included in the flows (with n indicating a number up to the maximum number of terminations allowed in the context).
- Changes in the stream mode were not processed. The stream mode for IPB was kept as SenderReceiver.
- Link restart needed a VMGW lock-unlock to get the signaling back.
- The SBC must not send audit for terminations after the ServiceChange sequence is done.
- Termination id mapping had to be implemented, and a hard-coded termination group was used in the SBC.
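Two of the settings above, one command per transaction and the suppression of NOTIFY, can be sketched as simple list transformations. The command representation (dictionaries with a `name` key) and the function names `split_per_command` and `suppress_notify` are illustrative assumptions, not the prototype's data structures.

```python
def split_per_command(commands, first_transaction_id=1):
    """Wrap each command in its own transaction, numbering them consecutively."""
    return [
        {"transaction_id": first_transaction_id + offset, "commands": [command]}
        for offset, command in enumerate(commands)
    ]


def suppress_notify(commands):
    """Drop Notify commands so that they are never sent to the controller."""
    return [command for command in commands if command["name"] != "Notify"]


sbc_side = split_per_command(
    [{"name": "Add", "termination": "$"}, {"name": "Add", "termination": "$"}],
    first_transaction_id=100,
)
mgw_side = suppress_notify(
    [{"name": "Notify", "termination": "T1"}, {"name": "ServiceChange", "termination": "ROOT"}]
)
print(sbc_side)
print(mgw_side)
```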

8.4.2 Call setup procedure

The MGW currently uses Nb framing and IPBC tunneling for IP calls. The IPBC (IP Bearer Control Protocol) is used to notify a remote MGW of the local IP address of the termination that it will be connected to. This IP address and UDP port are tunneled (together with the Session Description Protocol parameters) in H.248 using IPBC, and are forwarded by the MGC in the signal descriptor of an ADD or MODIFY command. The MGW reports the local IP address in a NOTIFY command. The commands used and the sequence of the connection depend on the type-of-tunneling parameter (fast or delayed connection). In the fast connection, the IP address of the remote site is known and provided in the ADD command descriptors, and the MGW returns the local IP address of the termination in a NOTIFY command in the same transaction. In the delayed connection, the remote IP address is sent later in a MODIFY command, and the local IP address is also sent in a NOTIFY command, but in a different H.248 transaction. The call flow is shown in figure 8.2.

8.4.3 Call Release Procedure

The release of terminations is simple from the M-MGW point of view, and it supports different combinations of wildcarding as well. The SBC requests some statistics that are not supported by the M-MGW, so this was disabled for the prototype flows. The one-command-per-transaction encoding was used here as well.

8.4.4 MGW Cold startup flow

The only difference is in the profile used by each node (EP1 and EP2 in the diagrams). Currently, neither node supports the other node's profile (in this case, the MGW profile is not supported by the SBC). For the purpose of this prototype, the profile was overridden in the H.248 text encoder of the MGW to match the profile currently accepted by the SBC (see figure 8.3).

[Figure: message sequence between the M-MGW and the SBC]

4.1.1: addReq({Ctx=Choose, Term=Choose (IP Wildcard), MediaDescriptor: LocalCntrlDes {strmMode=SR} LocalDes{SDP(v=0, c=IN IPv4 $, m=-$ RTP/AVP-)} RemoteDes{SDP(v=0, c=IN IPv4 "CLIENT1 IP", m=- "CLIENT1 PORT" RTP/AVP)}}})
4.1.2: addRsp(Ctx=1, Term=T1, MediaDescriptor: LocalDes{SDP(v=0, c=IN IPv4 "T1 IP ADDRESS", m=-"T1 UDP PORT" RTP/AVP-)})

Note: At this point the remote IP of the second termination is unknown to the MGC. The alternatives are to wait until the IP is known and follow the flow for the Tn reservation, or to start the second termination reservation and modify it later with the remote IP address when it is available, as is done in the following sequence.

4.2.1: addReq({Ctx=1, Term=Choose (IP Wildcard), MediaDescriptor: LocalCntrlDes {strmMode=SR} LocalDes{SDP(v=0, c=IN IPv4 $, m=-$ RTP/AVP-)}})
4.2.2: addRsp(Ctx=1, Term=T2, MediaDescriptor: LocalDes{SDP(v=0, c=IN IPv4 "T2 IP ADDRESS", m=-"T2 UDP PORT" RTP/AVP-)})
Note: Creating T2 in Ctx 1.
Note: IP address from Client2 is available.
4.3.1: modifyReq({Ctx=1, Term=T2, MediaDescriptor: LocalCntrlDes {strmMode=SR} RemoteDes{SDP(v=0, c=IN IPv4 "CLIENT2 IP", m=- "CLIENT2 PORT" RTP/AVP)}}})
Note: Modifying T2 in Ctx 1.
4.3.2: modifyRsp(Ctx=1, Term=T2)

4.4.1: addReq({Ctx=1, Term=Choose (IP Wildcard), MediaDescriptor: LocalCntrlDes {strmMode=SR} LocalDes{SDP(v=0, c=IN IPv4 $, m=-$ RTP/AVP-)} RemoteDes{SDP(v=0, c=IN IPv4 "CLIENTn IP", m=- "CLIENTn PORT" RTP/AVP)}}})
Note: Creating Tn in Ctx1.
4.4.2: addRsp(Ctx=1, Term=Tn, MediaDescriptor: LocalDes{SDP(v=0, c=IN IPv4 "Tn IP ADDRESS", m=-"Tn UDP PORT" RTP/AVP-)})

Abbreviations: Ctx: Context, Term: Termination, LocalCntrlDes: Local Control Descriptor, strmMode: Stream mode, SR: Sender Receiver, LocalDes: Local Descriptor, RemoteDes: Remote Descriptor, SDP: Session Description Protocol, v: Version, c: Connection, m: Media.

Figure 8.2: H.248 Call setup flow used for the Prototype
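The call setup flow of figure 8.2 can be traced with a toy gateway model that resolves the CHOOSE wildcards and returns the local address in the Add response. This is a sketch under stated assumptions: the class `ToyGateway`, its method names, the port allocation in steps of two, and the example addresses are all hypothetical and do not describe the M-MGW implementation.

```python
import itertools


class ToyGateway:
    """Toy model of a gateway that resolves "$" wildcards in Add requests."""

    def __init__(self, local_ip, first_port=5000):
        self.local_ip = local_ip
        self._context_ids = itertools.count(1)
        self._termination_ids = itertools.count(1)
        self._ports = itertools.count(first_port, 2)  # assumed even port numbering
        self.contexts = {}  # context id -> {termination id: state}

    def add(self, context="$", remote=None):
        """Handle an Add request and return the data carried in the response."""
        if context == "$":
            context = next(self._context_ids)
            self.contexts[context] = {}
        termination = f"T{next(self._termination_ids)}"
        local = (self.local_ip, next(self._ports))
        self.contexts[context][termination] = {"local": local, "remote": remote}
        return {"context": context, "termination": termination, "local": local}

    def modify(self, context, termination, remote):
        """Handle a Modify request that supplies the remote address later."""
        self.contexts[context][termination]["remote"] = remote
        return {"context": context, "termination": termination}


gateway = ToyGateway("192.0.2.10")
first = gateway.add(remote=("198.51.100.1", 4000))      # steps 4.1.1 and 4.1.2
second = gateway.add(context=first["context"])          # steps 4.2.1 and 4.2.2
gateway.modify(first["context"], second["termination"],
               ("198.51.100.2", 4002))                   # steps 4.3.1 and 4.3.2
print(first, second)
```

The second termination is created before its remote address is known and completed later, which is the alternative chosen in the figure.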

[Figure: call release flow between the M-MGW and the MGC]

7.1.1: SubReq(Ctx=1, Term=ALL)
7.1.2: subRsp(Ctx=1, Term=ALL)
Note: Releasing Ctx 1, T1 and T2.

[Figure: cold start-up flow between the M-MGW (acting as MGW) and the SBC (acting as MGC)]

Note: MGW start-up, MGW without TDM termination groups configured.
1.1: serviceChangeReq(Ctx=NULL, Term=ROOT, scm=restart, scr=coldBoot, scp=EP2, scv=2)
1.2: ServiceChangeRsp(Ctx=NULL, Term=ROOT, scv=2)

Abbreviations: Ctx: Context, Term: Termination, scm: Service Change Method, scr: Service Change Reason, scp: Service Change Profile, EP2: Profile version.

Figure 8.3: H.248 call release and cold start-up flows used for the prototype
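The profile override described for the cold start-up can be sketched as follows. The function `cold_start_service_change` and its dictionary output are illustrative assumptions. In particular, treating EP1 as the MGW's native profile and EP2 as the profile accepted by the SBC is an assumption made only for this example, based on the `scp=EP2` value shown in the figure.

```python
def cold_start_service_change(native_profile, accepted_profile=None, version=2):
    """Build the parameters of a cold-boot ServiceChange request.

    If `accepted_profile` is given, it replaces the node's native profile,
    mimicking the override done in the text encoder of the prototype.
    """
    return {
        "Ctx": "NULL",
        "Term": "ROOT",
        "scm": "restart",    # service change method
        "scr": "coldBoot",   # service change reason
        "scp": accepted_profile or native_profile,  # service change profile
        "scv": version,      # assumed to be the service change version
    }


request = cold_start_service_change("EP1", accepted_profile="EP2")
print(request)
```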

Chapter 9

Conclusions and Future Work


Standardization in the Media Server Control area has received a great deal of attention in the telecommunication industry. The 3G mobile telecommunication industry needs a model to evolve from the current network architecture to the all-IP vision. The road map of this evolution includes the definition of Media Server Control protocols and Media Control interfaces. This work is constantly evolving. The requirements for this standardization are driven by industry interests, service offerings, and end-user demands.

Strong synergies exist between the already deployed Mobile Media Gateway (M-MGW) and the future Media Resource Function Processor (MRFP). These synergies provide a good starting point to analyze the issues that will be found in the deployment of the MRFP. They point to the most important aspects to be considered in relation to the control protocol and the required interfaces.

Based on the analysis of these factors, this work concludes that the preferred model for media server control in 3G networks includes a high-level control protocol for the Mr interface together with the already specified H.248-based Mp interface. The high-level control protocol could be the already specified MSCML or Netann-SIP. H.248 is deployed to provide low-level, detailed control of the MRFP by the MRFC.

The study of the alternative models and protocols was mainly based on the level of complexity required to implement them on the different network elements. The study also took into account requirements coming from final users, operators, vendors, and service providers. The model chosen is believed to provide the lowest overall level of complexity for all network elements while satisfying the requirements of all the affected parties. Other models and protocols, in addition to the ones chosen in this study, also satisfied many of the requirements. However, their implementation seemed to involve an equal or higher level of complexity than the suggested model. At equal complexity, priority was given to already deployed protocols, since the experience gained during their deployment is considered a valuable additional advantage in their favor.

Finally, as the practical part of this thesis, a prototype of an MRFP was implemented by modifying the User Plane Control Function of an M-MGW. The experience gained from this work provided valuable information on the potential for reusing currently implemented functions in the design of an MRFP. The prototype successfully demonstrated the synergies between the M-MGW and an MRFP. It also provided a glimpse of the services that can be deployed using an already functional platform.

9.1 Future Work

The development of multimedia technologies has proceeded at a high pace during the last decade. The requirements for media processing are continuously changing with the development of new technologies. New media contents and formats are being created and made available all the time. The architecture of the media provisioning and processing functions in mobile networks is also likely to evolve according to the needs that new developments bring. Some of these issues, such as the implementation of MBMS (Multimedia Broadcast Multicast Service) [92], are already being investigated. Also, increases in the demand for certain services could create a need for research in the area of load distribution and efficient resource allocation. In [25], some steps in this area are already taken by proposing specific optimizations.

3GPP has also felt the need to present solutions for Media Server Control. In response to this need, the technical report TR 24.880 for Release 8 [10] is, at the time of writing, in its final stage before approval. The report recommends supporting two models of Media Server Control, both keeping the Mp interface with the currently specified H.248 protocol:

- A delegation model, where the MRFC fetches from the AS an indicated execution script (based on XML) and, by following the script, controls the MRFP. The SCXML [15] or CCXML profiles are recommended here, together with VoiceXML 2.1 [77] or 3.0, depending on the availability of the latter.
- A protocol model with a dedicated control channel. It uses a transport channel to send media control messages of a high-level control protocol to the MRFC, which translates them into the corresponding H.248 transactions. The high-level protocol recommended by the report is the result of the IETF MEDIACTRL working group (currently draft-ietf-mediactrl-sip-control-framework-00 [17]). In case this IETF work is not available within the 3GPP Release 8 timeframe, the proposal is to use MSCML [32].

Both proposals require the creation of a new interface that will be named Cr or Sr. This interface would establish the direct connection between the AS and the MRFC, which this thesis assumed to be made through the ISC and Mr interfaces. Additionally, the report recommends the support of Netann-SIP [19] with the clarification defined in draft-burke-vxml [21]. This provides a third way of supplying media control commands from the AS. Furthermore, the report explores the possibility of introducing broker functions to provide an improved connection between the MRFC and the AS. This possibility was briefly mentioned in this thesis (see section 7.3.2). The introduction of the same idea by the technical report highlights its importance. Work in this area will continue once the definitive model to be used is defined.

Bibliography
[1] 3GPP: Base Station System - Mobile-services Switching Centre (BSS - MSC) interface; General aspects. TS 48.001, 3rd Generation Partnership Project (3GPP), February 2005. http://www.3gpp.org/ftp/Specs/html-info/48001.htm.
[2] 3GPP: Bearer-independent circuit-switched core network; Stage 2. TS 23.205, 3rd Generation Partnership Project (3GPP), June 2005. http://www.3gpp.org/ftp/Specs/html-info/23205.htm.
[3] 3GPP: Core network Nb data transport and transport signalling. TS 29.414, 3rd Generation Partnership Project (3GPP), January 2005. http://www.3gpp.org/ftp/Specs/html-info/29414.htm.
[4] 3GPP: Media Gateway Controller (MGC) - Media Gateway (MGW) interface; Stage 3. TS 29.232, 3rd Generation Partnership Project (3GPP), June 2005. http://www.3gpp.org/ftp/Specs/html-info/29232.htm.
[5] 3GPP: Packet switched conversational multimedia applications; Default codecs. TS 26.235, 3rd Generation Partnership Project (3GPP), April 2005. http://www.3gpp.org/ftp/Specs/html-info/26235.htm.
[6] 3GPP: Signalling System No. 7 (SS7) signalling transport in core network; Stage 3. TS 29.202, 3rd Generation Partnership Project (3GPP), January 2005. http://www.3gpp.org/ftp/Specs/html-info/29202.htm.
[7] 3GPP: UTRAN Iu interface data transport & transport signalling. TS 25.414, 3rd Generation Partnership Project (3GPP), June 2005. http://www.3gpp.org/ftp/Specs/html-info/25414.htm.
[8] 3GPP: Conferencing using the IP Multimedia (IM) Core Network (CN) subsystem; Stage 3. TS 24.147, 3rd Generation Partnership Project (3GPP), December 2006. http://www.3gpp.org/ftp/Specs/html-info/24147.htm.


[9] 3GPP: Multimedia Resource Function Controller (MRFC) - Multimedia Resource Function Processor (MRFP) Mp interface: Procedures Descriptions. TS 23.333, 3rd Generation Partnership Project (3GPP), December 2006. http://www.3gpp.org/ftp/Specs/html-info/23333.htm.
[10] 3GPP: Media server control using the IP Multimedia (IM) Core Network (CN) subsystem; Stage 3. TR 24.880, 3rd Generation Partnership Project (3GPP), December 2007. http://www.3gpp.org/ftp/Specs/html-info/24880.htm.
[11] 3rd Generation Partnership Project (3GPP): Study on Release 2000 services and capabilities. TR 22.976 V2.0.0, Technical Report, Technical Specification Group Services and System Aspects, June 2000.
[12] 3rd Generation Partnership Project (3GPP): Service requirements for the IP Multimedia Core Network Subsystem (Stage 1). TR 22.228 V5.6.0, Technical Specification, Technical Specification Group Services and System Aspects, June 2002.
[13] 3rd Generation Partnership Project (3GPP): Network architecture. TS 23.002 V5.12.0, Technical Specification, Technical Specification Group Services and Systems Aspects, September 2003.
[14] 3rd Generation Partnership Project (3GPP): IP Multimedia Subsystem (IMS); Stage 2. TR 23.228 V6.5.0, Technical Specification, Technical Specification Group Services and System Aspects, March 2004.
[15] Barnett, Jim, Michael Bodell, Dan Burnett, Jerry Carter, and Rafah Hosn: State Chart XML (SCXML): State machine notation for control abstraction. Working Draft, February 2007. http://www.w3.org/TR/2007/WD-scxml-20070221, Work in progress.
[16] Boulton, C., T. Melanchuk, S. McGlashan, and A. Shiratzky: A basic interactive voice response (IVR) control package for the Session Initiation Protocol (SIP). Internet draft, May 2008. http://tools.ietf.org/html/draft-boulton-ivr-control-package-05, Work in progress.
[17] Boulton, C., T. Melanchuk, S. McGlashan, and A. Shiratzky: A control framework for the Session Initiation Protocol (SIP). Internet draft, February 2008. http://tools.ietf.org/html/draft-ietf-mediactrl-sip-control-framework-00, Work in progress.


[18] Boulton, C., T. Melanchuk, S. McGlashan, and A. Shiratzky: A VoiceXML interactive voice response (IVR) control package for the Session Initiation Protocol (SIP). Internet draft, May 2008. http://tools.ietf.org/html/draft-boulton-ivr-vxml-control-package-03, Work in progress.
[19] Burger, E., J. Van Dyke, and A. Spitzer: Basic Network Media Services with SIP. RFC 4240 (Informational), December 2005. http://www.ietf.org/rfc/rfc4240.txt.
[20] Burger, Eric and Greg Pisano: MSCML protocol: The key to unlocking a new generation of multimedia SIP services. White Paper, July 2005. http://www.cantata.com/whitepapers/pdf/mscml_protocol.pdf.
[21] Burke, D., M. Scott, J. Haynie, R. Auburn, and S. McGlashan: SIP interface to VoiceXML media services. Internet draft, January 2008. http://tools.ietf.org/html/draft-burke-vxml-03, Work in progress.
[22] Calhoun, P., J. Loughney, E. Guttman, G. Zorn, and J. Arkko: Diameter Base Protocol. RFC 3588 (Proposed Standard), September 2003. http://www.ietf.org/rfc/rfc3588.txt.
[23] Camarillo, Gonzalo: SIP Demystified. McGraw-Hill Telecom, 1st edition, 2002, ISBN 0-07-137340-3.
[24] Camarillo, Gonzalo and Miguel Angel Garcia-Martin: The 3G IP Multimedia Subsystem (IMS): Merging the Internet and the Cellular Worlds. John Wiley & Sons, 1st edition, 2004, ISBN 0470871563.
[25] Cao, Feng, J. Smith, and K. Takahashi: An architecture of distributed media servers for supporting guaranteed QoS and media indexing. Multimedia Computing and Systems, 1999. IEEE International Conference on, 2:1-5 vol.2, Jul 1999.
[26] Comer, Douglas E.: Internetworking with TCP/IP, volume 1. Prentice Hall, 4th edition, 2000.
[27] Media services in the IMS: Evolution for innovation. White Paper, June 2004. http://www.convedia.com/PDFs/WP_Media_Servers_and_SIP.pdf.
[28] Cortes, Mauricio, J. Robert Ensor, and Jairo O. Esteban: On SIP performance. Bell Labs Technical Journal, 9(3), 2004. http://dx.doi.org/10.1002/bltj.20048.


[29] Crocker, D. and P. Overell: Augmented BNF for Syntax Specifications: ABNF. RFC 2234 (Proposed Standard), November 1997. http://www.ietf.org/rfc/rfc2234.txt, Obsoleted by RFC 4234.
[30] Dierks, T. and C. Allen: The TLS Protocol Version 1.0. RFC 2246 (Proposed Standard), January 1999. http://www.ietf.org/rfc/rfc2246.txt, Obsoleted by RFC 4346, updated by RFC 3546.
[31] Durham, D., J. Boyle, R. Cohen, S. Herzog, R. Rajan, and A. Sastry: The COPS (Common Open Policy Service) Protocol. RFC 2748 (Proposed Standard), January 2000. http://www.ietf.org/rfc/rfc2748.txt.
[32] Dyke, J. Van, E. Burger, and A. Spitzer: Media Server Control Markup Language (MSCML) and Protocol. RFC 4722 (Informational), November 2006. http://www.ietf.org/rfc/rfc4722.txt.
[33] Media gateway. Technical description, May 2000.
[34] Softswitch in mobile networks. White Paper, April 2005. http://www.ericsson.com/products/white_papers_pdf/3025_softswitch_mobile_A.pdf.
[35] ETSI: European digital cellular telecommunications system (Phase 1); GSM Full Rate Speech Transcoding (GSM 06.10). Recommendation GTS 06.10, European Telecommunications Standards Institute (ETSI), Jan 1995.
[36] Even, R.: Requirements for a media server control protocol. Internet draft, June 2006. http://www.ietf.org/internet-drafts/draft-even-media-server-req-01.txt, Expired.
[37] Eyers, Tony and Henning Schulzrinne: Predicting internet telephony call setup delay. Proceedings of the 1st IP-Telephony Workshop, Berlin, Germany, 2000. http://www.cs.columbia.edu/~hgs/papers/Eyer0004_Predicting.pdf.
[38] Facebook: Facebook. http://www.facebook.com.
[39] Freed, N. and K. Moore: MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations. RFC 2231 (Proposed Standard), November 1997. http://www.ietf.org/rfc/rfc2231.txt.
[40] Froumentin, M.: The W3C Speech Interface Framework Media Types: application/voicexml+xml, application/ssml+xml, application/srgs, application/srgs+xml, application/ccxml+xml, and application/pls+xml. RFC 4267 (Informational), November 2005. http://www.ietf.org/rfc/rfc4267.txt.
[41] Fyrö, Magnus, Kai Heikkinen, Lars Göran Petersen, and Patrik Wiss: Media gateway for mobile networks. Ericsson Review, 2000. http://www.ericsson.com/about/publications/review/2000_04/files/2000042.pdf.
[42] Groves, C., M. Pantaleo, T. Anderson, and T. Taylor: Gateway Control Protocol Version 1. RFC 3525 (Proposed Standard), June 2003. http://www.ietf.org/rfc/rfc3525.txt.
[43] Hares, Teemu: Adding multimedia resource function processor functionality to Mobile Media Gateway. Master's thesis, Helsinki University of Technology, Espoo, Finland, February 2003.
[44] International Organization for Standardization: ISO/IEC 14496-1:1999: Information technology - Coding of audio-visual objects - Part 1: Systems. International Organization for Standardization, Geneva, Switzerland, 1999. http://www.iso.ch/cate/d24462.html.
[45] ISO/IEC: MPEG-1 coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s. ISO/IEC 11172, 1993.
[46] ISO/IEC: MPEG-2 generic coding of moving pictures and associated audio information. ISO/IEC 13818, 1996.
[47] ITU-T: Video codec for audiovisual services at p x 64 kbit/s. Recommendation H.261, International Telecommunication Union, Mar 1993.
[48] ITU-T: B-ISDN ATM adaptation layer - Service specific connection oriented protocol (SSCOP). Recommendation Q.2110, International Telecommunication Union, Jul 1994.
[49] ITU-T: B-ISDN signalling ATM adaptation layer (SAAL) - Overview description. Recommendation Q.2100, International Telecommunication Union, Jul 1994.
[50] ITU-T: B-ISDN ATM Adaptation Layer; Service Specific Coordination Function for Signaling at the Network Node Interface (SSCF at NNI). Recommendation Q.2140, International Telecommunication Union, Feb 1995.


[51] ITU-T: Message transfer part level 3 functions and messages using the services of ITU-T Recommendation Q.2140. Recommendation Q.2210, International Telecommunication Union, Jul 1998.
[52] ITU-T: Overall network aspects and functions. Protocol layer requirements. Recommendation I.366.1, International Telecommunication Union, Jun 1998.
[53] ITU-T: Protocol for multimedia application text conversation. Recommendation T.140, International Telecommunication Union, Feb 1998.
[54] ITU-T: Pulse Code Modulation (PCM) of voice frequencies. Recommendation G.711, International Telecommunication Union, Nov 1998.
[55] ITU-T: Narrow-band visual telephone systems and terminal equipment. Recommendation H.320, International Telecommunication Union, May 1999.
[56] ITU-T: AAL type 2 signalling protocol - Capability Set 2. Recommendation Q.2630.2, International Telecommunication Union, Dec 2000.
[57] ITU-T: B-ISDN ATM Adaptation Layer specification: Type 2 AAL. Recommendation I.363.2, International Telecommunication Union, Nov 2000.
[58] ITU-T: Gateway control protocol: Transport over ATM. Recommendation H.248.5, International Telecommunication Union, Nov 2000.
[59] ITU-T: Gateway control protocol: Transport over Stream Control Transmission Protocol (SCTP). Recommendation H.248.4, International Telecommunication Union, Nov 2000.
[60] ITU-T: BICC IP Bearer control protocol. Recommendation Q.1970, International Telecommunication Union, Jul 2001.
[61] ITU-T: Signalling Transport Converter on MTP3 and MTP3b. Recommendation Q.2150.1, International Telecommunication Union, May 2001.
[62] ITU-T: Gateway control protocol: Version 2. Recommendation H.248.1, International Telecommunication Union, May 2002.
[63] ITU-T: Information technology - ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). Recommendation X.690, International Telecommunication Union, Jul 2002.

[64] ITU-T: OSI Networking and System Aspects. Abstract Syntax Notation One (ASN.1); Information Technology. Abstract Syntax Notation One (ASN.1): Specification of Basic Notation. Recommendation X.680, International Telecommunication Union, Jul 2002.
[65] ITU-T: Gateway control protocol: Decomposed multipoint control unit, audio, video and data conferencing packages. Recommendation H.248.19, International Telecommunication Union, Mar 2004.
[66] Jin, Haipeng and AC Mahendran: Using sigcomp to compress sip/sdp messages. ICC 2005. 2005 IEEE International Conference on Communications, 2005, May 2005.
[67] Kent, S.: IP Authentication Header. RFC 4302 (Proposed Standard), December 2005. http://www.ietf.org/rfc/rfc4302.txt.
[68] Kling, Lars Örjan, Åke Lindholm, Lars Marklund, and Gunnar B. Nilsson: Cpp - cello packet platform. Ericsson Review, 2002. http://www.ericsson.com/about/publications/review/2002_02/files/2002023.pdf.
[69] Kohler, E., M. Handley, and S. Floyd: Datagram congestion control protocol (dccp). Internet draft, March 2005. http://www.ietf.org/internet-drafts/draft-ietf-dccp-spec-11.txt, Work in Progress.
[70] Kopajtic, Ozren and Riko LuSa: H.248 - implementation and interoperability issues. conTel 2003, 2003.
[71] Koukal, M. and R. Bestak: Architecture of ip multimedia subsystem. Multimedia Signal Processing and Communications, 48th International Symposium ELMAR-2006 focused on, June 2006.
[72] Cho, Yeong Hun, Moon Sang Jeong, Jong Tae Park, and Wee Hyuk Lee: Distributed management architecture for multimedia conferencing using sip. Distributed Frameworks for Multimedia Applications, 2005. DFMA '05. First International Conference on, pages 98-105, 6-9 Feb. 2005.
[73] Liao, Wanjiun, Jen Chun Chang, and V.O.K. Li: Application-layer conference trees for multimedia multipoint conferences using megaco/h.248. Multimedia, IEEE Transactions on, 7(5):951-959, Oct. 2005, ISSN 1520-9210.
[74] Marsic, B., T. Borosa, and S. Pocuca: Ims to pstn/cs interworking. Telecommunications, 2003. ConTEL 2003. Vol. 2. Proceedings of the 7th International Conference on, June 2003.

[75] Murata, M., S. St. Laurent, and D. Kohn: XML Media Types. RFC 3023 (Proposed Standard), January 2001. http://www.ietf.org/rfc/rfc3023.txt.
[76] Nokalva, OSS: Benchmark review: Comparison between binary encoder vs. textual encoder. Technical report, ASN.1 Consortium, 2002. http://www.asn1.org/benchmark/benchmark1.htm.
[77] Oshry, Matt, R. J. Auburn, Paolo Baggia, Michael Bodell, David Burke, Daniel C. Burnett, Emily Candell, Hakan Kilic, Jeff Kusnitz, Scott McGlashan, Alex Lee, Brad Porter, and Kenneth G. Rehor: Voice extensible markup language (voicexml) 2.1. World Wide Web Consortium, Recommendation, June 2007. http://www.w3.org/TR/2007/REC-voicexml21-20070619.
[78] Poikselkä, Miikka, Georg Mayer, Hisham Khartabil, and Aki Niemi: The IMS: IP Multimedia Concepts and Services in the Mobile Domain. John Wiley & Sons, 1st edition, 2004, ISBN 0-470-87113-X.
[79] Postel, J.: User Datagram Protocol. RFC 768 (Standard), August 1980. http://www.ietf.org/rfc/rfc768.txt.
[80] Postel, J.: Transmission Control Protocol. RFC 793 (Standard), September 1981. http://www.ietf.org/rfc/rfc793.txt, Updated by RFC 3168.
[81] Project, Open Source Erlang Megaco/H.248: Performance comparison of megaco/h.248 encodings in erlang/otp. Technical report, Open Source Erlang, December 2003. http://www.erlang.org/project/megaco/encoding_comparison-v4/index.html.
[82] Reinius, Jonas: Cello - an atm transport and control platform. Ericsson Review, 1999. http://www.ericsson.com/about/publications/review/1999_02/files/1999021.pdf.
[83] Rosenberg, J.: A Framework for Conferencing with the Session Initiation Protocol (SIP). RFC 4353 (Informational), February 2006. http://www.ietf.org/rfc/rfc4353.txt.
[84] Rosenberg, J., J. Peterson, H. Schulzrinne, and G. Camarillo: Best Current Practices for Third Party Call Control (3pcc) in the Session Initiation Protocol (SIP). RFC 3725 (Best Current Practice), April 2004. http://www.ietf.org/rfc/rfc3725.txt.

[85] Rosenberg, J. and H. Schulzrinne: Guidelines for Authors of Extensions to the Session Initiation Protocol (SIP). RFC 4485 (Informational), May 2006. http://www.ietf.org/rfc/rfc4485.txt.
[86] Rosenberg, J., H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler: SIP: Session Initiation Protocol. RFC 3261 (Proposed Standard), June 2002. http://www.ietf.org/rfc/rfc3261.txt, Updated by RFCs 3265, 3853.
[87] Saleem, A., Y. Xin, and G. Sharratt: Media server markup language (msml). Internet draft, December 2005. http://www.ietf.org/internet-drafts/draft-saleem-msml-02.txt, Work in progress.
[88] Schulzrinne, H., S. Casner, R. Frederick, and V. Jacobson: RTP: A Transport Protocol for Real-Time Applications. RFC 3550 (Standard), July 2003. http://www.ietf.org/rfc/rfc3550.txt.
[89] Schulzrinne, H. and S. Petrack: RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals. RFC 2833 (Standard), May 2000. http://www1.ietf.org/rfc/rfc2833.
[90] Stewart, R., Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V. Paxson: Stream Control Transmission Protocol. RFC 2960 (Proposed Standard), October 2000. http://www.ietf.org/rfc/rfc2960.txt, Updated by RFC 3309.
[91] Suominen, Mikko: Enhancing System Capacity and Robustness by Optimizing Software Architecture in a Real-time Multiprocessor Environment. Master's thesis, Helsinki University of Technology, Espoo, Finland, May 2004.
[92] Zafar, Madiha, Nigel Baker, Meir Fuchs, Justino Santos, Ahsan Ikram, and Susana Sargento: Ims mbms integration: Functional analysis & architectural design. Mobile and Wireless Communications Summit, 2007. 16th IST, pages 1-5, 1-5 July 2007.
