
PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN MOBILE NETWORKS

Niko GAMULIN1, dr. Mitja ŠTULAR1, dr. Sašo TOMAŽIČ2
1 Telekom Slovenije, d.d., Cigaletova 15, 1000 Ljubljana
2 Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1000 Ljubljana
niko.gamulin@telekom.si

Abstract: The telecommunications market has reached the mature stage: the majority of the population in developed areas has already adopted mobile services and there are many competitors on the market. Churn prediction has therefore become critical for companies that want to retain their market shares. By analyzing the data that telecommunications service providers store, originally for billing purposes, it is possible to observe their users in the context of a social network and to gain additional insights about the spread of influence relevant to churn. In this paper, we examine the communication patterns of mobile phone users and the subscription plan logs. Our primary goal is to discover whether it is possible to determine which users are more likely to churn by observing their outgoing calls and the churn among their neighbors (friends).

Keywords: churn, social network analysis, machine learning

1. INTRODUCTION

In order to attract new service consumers and retain existing ones, telecommunications service providers have been constantly forming new subscription plans, adapting the terminal equipment offer to current trends and improving the quality of services by upgrading network equipment. Along with the development of machine learning methods and their efficiency, service providers from all industries have become aware of the importance of data, which can be used to gain additional insights about their service consumers and consequently to target the important customers who are more prone to churn.

Many methods have already been proposed for churn prediction from past data. In some of these, the user is treated as an individual, independent of his acquaintances' data, while in others network effects are considered as well. The main question that motivated our research is whether it is possible to spot a spread of behavior, i.e. churn, among connected users solely by observing the strength of the call connections among them.

In order to construct a social network, we observed the Call Detail Record (CDR) data along with the subscription plan log to determine each user's subscription state in the observed time period. Each user represented a node and the aggregated outgoing calls towards each neighbor represented a directed edge, weighted with the number of calls and the sum of the durations of all calls. The observed user's subscription state, along with his neighbors' subscription states, served as an indicator of the spread of behavior.

As this is a dynamic process, the time variable is also important. If the time period that has elapsed between two acts of the same kind performed by two connected users is long, it might not be appropriate to state that the second actor followed the first one and that the same state of two connected users is not a mere coincidence. On the other hand, if the observed time period is too short, we might not notice that the two connected users performed the same act after some additional time elapsed. In order to check whether the observed time period is relevant, we observed the users' states over different time period lengths.

As we anticipated that an individual subscriber's decision to churn is partially motivated by prior churners among his acquaintances, we tried to determine the acquaintances' impact on the churn decision by observing the number of phone calls established, the duration of phone calls and the number of prior churners among the observed subscriber's acquaintances. In order to prove the relevance between the churn decision and the number of prior churners among acquaintances, along with the connection strength measured by the relative number and duration of phone calls, we observed the users in 3D space, defined by axes that represented the relative number of prior churners among acquaintances and the number and duration of calls established with these, relative to all acquaintances. In order to prove the relevance between an observed subscriber's behavior and the prior acts of his neighbors, we calculated and plotted the lift curve to show the significant influence of prior acts made by acquaintances for different time period lengths.

2. PROBLEM STATEMENT

Although the awareness of the importance of social networks has increased significantly along with the spread of online social networks such as Facebook, Twitter, Google+ and LinkedIn, the majority of service providers from non-internet industries haven't exploited the potential of real social networks of interconnected people who influence each other in the real world. While several online services and retailers have already developed marketing campaigns based on social networks, the majority of businesses from other industries still try to attract new customers and keep the existing ones by treating each of them as an individual; they mostly invest in broadcast marketing campaigns and form offers on a global level.

Some service providers and retailers are certainly not able to treat their customers as interconnected peers influenced by their friends, because they lack the data needed to represent users as nodes and form edges between them. Telecommunications service providers, on the other hand, have to keep the data about their customers' phone conversations for charging purposes. These same data could be used to form a social network of interconnected users and to anticipate how they influence each other.

Our motivation for this research was based on the assumption that users influence each other over phone conversations and that the power of influence depends on the duration and number of conversations. According to this assumption, a user many of whose friends have churned is also more likely to churn. Furthermore, we guessed that if the observed user is influenced by his peers, he follows them within a shorter amount of time. The following section gives an overview of some previously proposed churn prediction methods in which users are observed in a social network context.

3. EXISTING SOLUTIONS

Numerous studies have dealt with the churn problem in the telecommunications services sector. When dealing with the churn problem, one first has to be aware of the big difference between prepaid and postpaid users. The former, as opposed to the latter, are not bound by a contract. In the case of prepaid users, it is easier to observe users in a social network context and extract the rules for the diffusion of churn, as these users are free to decide to change their service plan at any time.
The model where prepaid users are observed in the context of a social network is presented in [1]. On the other hand, while it is easier to observe prepaid users in the context of a social network, it is not trivial to determine their churn status, as such users don't explicitly cancel the subscription plan. In [2], a model for prepaid user labeling is proposed along with a churn prediction technique where users are observed as individuals, without the influence of interconnected users. Dierkes et al. [3] use Markov Logic Networks (MLNs) to examine whether churn decisions made in previous time periods by individuals with whom the target customer interacted via voice call, short message service (SMS) or multimedia message service (MMS) have an impact on the target customer.

4. THE PROPOSED SOLUTION

The main motivation for this research was to prove that users connected to each other by phone calls influence each other, and to determine the importance of the strength of connections for the spread of churn. Although there are many factors that influence a user's decision about a subscription plan change, such as service price, marketing campaigns and special offers from competing providers, we wanted to prove that the social factor itself plays an important role and for [...]

5. DATASET

For our analysis, we used anonymised historical data for about 790,000 users and about 42,000,000 aggregated daily call connection records from CDR for September and October 2010. Each call connection record contained the number of calls between the caller and the called person and the sum of call durations for the observed day. Along with the call connection data, we had available the churn log for the period from 2005 to 2011, from which it was possible to label each observed user either as a churner or as a non-churner.

In order to perform the experiment, the original data had to be reshaped in the following way. First, the list of all active postpaid users was made from the aggregated daily call records, i.e. all postpaid callers were selected. Having the list, we defined three different observation period lengths: 60 days, 30 days and 15 days. For each period length we looped through the list of active users and for each one determined the total number of called neighbors, the number of all outgoing calls and the sum of the durations of all outgoing calls. If the user churned within the period covered by the observed CDR data, he was labeled as a churner, otherwise as a non-churner. Then, according to the observed period length, all of his neighbors that churned before him, inside the defined period length, were counted, and the duration and number of outgoing calls to them were summed. Each neighbor that churned outside the defined period length or after the observed user was labeled as a non-churner and the corresponding call data were treated as calls to non-churners, i.e. the relative number of calls and duration for non-churners was increased.
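The reshaping steps described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record layout `(caller, callee, n_calls, duration)`, the function names `aggregate_edges` and `churn_features`, and the choice to reference non-churners to the end of the observed period are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_edges(daily_records):
    """Sum daily (caller, callee, n_calls, duration) records into directed edges."""
    edges = defaultdict(lambda: [0, 0])
    for caller, callee, n_calls, duration in daily_records:
        edges[(caller, callee)][0] += n_calls
        edges[(caller, callee)][1] += duration
    return edges


def churn_features(user, edges, churn_dates, period_end, period_days):
    """Return (x, y, z): relative number of prior-churner neighbors,
    relative call duration to them, and relative number of calls to them."""
    # Assumption: a non-churner is referenced to the end of the observed period.
    reference = churn_dates.get(user, period_end)
    window_start = reference - timedelta(days=period_days)
    neighbors = calls = duration = 0
    prior_neighbors = prior_calls = prior_duration = 0
    for (caller, callee), (n_calls, dur) in edges.items():
        if caller != user:
            continue
        neighbors += 1
        calls += n_calls
        duration += dur
        churned = churn_dates.get(callee)
        # Only neighbors that churned before the user, inside the window, count.
        if churned is not None and window_start <= churned < reference:
            prior_neighbors += 1
            prior_calls += n_calls
            prior_duration += dur
    if neighbors == 0:
        return 0.0, 0.0, 0.0
    x = prior_neighbors / neighbors
    y = prior_duration / duration if duration else 0.0
    z = prior_calls / calls if calls else 0.0
    return x, y, z


# Toy data: user "a" calls three neighbors; "b" churned before "a", "d" after.
records = [("a", "b", 2, 120), ("a", "b", 1, 60), ("a", "c", 1, 20), ("a", "d", 4, 200)]
churn_dates = {"a": date(2010, 10, 20), "b": date(2010, 10, 1), "d": date(2010, 10, 25)}
edges = aggregate_edges(records)
x, y, z = churn_features("a", edges, churn_dates, date(2010, 10, 31), 30)
print(x, y, z)
```

Scanning all edges for every user is quadratic and would not scale to tens of millions of records; an index of outgoing edges per caller would be the natural refinement.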
Having these data, the relative values were calculated for the number of neighbors that churned before, the number of outgoing calls and the sum of the durations of outgoing calls to users that churned before. After reshaping the data, we performed the experiment described in the following section.

6. EXPERIMENT

Each record from the reshaped data can be represented as a pair consisting of an input vector of independent variables and a dependent output target class variable. In our case, the independent variables were the number of neighbors that churned before, relative to the number of all neighbors, and the number of calls and the sum of the durations of all calls to neighbors that churned before, relative to all neighbors. The target variable was the class that represented whether the observed user churned or not. As the maximum length of the input vector is 3 and the output class variable has 2 possible states, it is possible, for simpler visual interpretation, to represent the observed users as colored points scattered in 3D space (Picture 1). The main aim of representing the observed users in 3D space was to gain intuition for further analysis; on the one hand, reducing the number of observed variables can potentially reduce the quality of results, while, on the other hand, with a large number of input variables the result might be difficult to interpret.

Picture 1 - Sample set of users from the observed dataset, represented as points in space. Blue points represent non-churners while red points represent churners. The sphere with centre point (1,1,1) represents a classification shape; all users inside are classified as churners.

From the visual representation it was possible to draw the following intuitive conclusion. If many of the observed user's neighbors churn and if the user spends a relatively long time talking with churners, the probability that the observed user will churn is higher than when the observed user does not have many neighbors who churned. Certainly, visual observation might lead to a false conclusion, and we therefore decided to draw a sphere and observe how many users of each class (churners and non-churners) were captured inside or outside the sphere, circle or region (depending on the number of observed independent variables and the base point) with varying range from different base points.

With 3 dimensions, represented by the independent input variables, it is possible to observe the users' churn state depending on a single input variable, on combinations of 2 variables or on all variable values. To select a segment of observed users, we first set a base point and then increased the range of the observed area from the minimum value, 0, to the maximum value. For all possible combinations of input variables, we first set the base point to the origin of the coordinate system and then to the point furthest away from the origin. In the first case, the observed area consisted of the points whose distance from the origin was greater than or equal to the current range. In the second case, with the base point set to the point furthest away from the origin, the observed area consisted of the points whose distance from the base point was smaller than or equal to the current range. Depending on the number of observed variables, the observed area was defined by a line, a circle (Picture 2) or a sphere.

Picture 2 - Examples of the observation area, marked with grey color, for 2 input variables (relative number of neighbors that churned before and relative duration of calls to neighbors that churned before) for base points set to the origin of the coordinate system (a) and to the point furthest away, (1,1) (b). In case (a), the observation area consists of the points whose distance from the base point (0,0) is larger than or equal to R, whereas in case (b), the observation area consists of the points whose distance from the base point (1,1) is smaller than or equal to R.

The following description of the experiment is limited to the case in which the point farthest from the origin of the coordinate system is selected as the base point, as this model achieved better results, although the difference was not significant.

In the case of observing a single variable, while the range was gradually increased from the minimum value, 0, to the maximum value, 1, the users' churn states inside and outside the range were observed and the percentage of churners and non-churners inside and outside the range was calculated. Similarly, for combinations of 2 independent variables, the circle center was set at point (1,1) and the circle radius, which represented the observed range, was gradually increased from 0 to √2, the distance from the base point to the origin. In the case of the sphere, the center was set at point (1,1,1) and the radius was increased from 0 to √3.
$$R = \begin{cases} \sqrt{x^2 + y^2}, & \text{base point } (0,0) \\ \sqrt{(1-x)^2 + (1-y)^2}, & \text{base point } (1,1) \end{cases} \qquad (1)$$
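The range sweep around the base point farthest from the origin can be illustrated with a short sketch. The function names `captured` and `sweep`, the number of steps and the toy points are assumptions made for illustration only; the sketch works for 1, 2 or 3 input variables, because the maximum radius is the square root of the number of dimensions.

```python
import math


def captured(points, base, radius):
    """Indices of points whose Euclidean distance from `base` is <= radius."""
    return [i for i, p in enumerate(points) if math.dist(p, base) <= radius]


def sweep(points, labels, steps=20):
    """Grow the range from 0 to sqrt(dims) around the point (1, ..., 1).

    Returns a list of (radius, users_inside, churners_inside) tuples.
    `labels` holds 1 for churners and 0 for non-churners.
    """
    dims = len(points[0])
    base = (1.0,) * dims
    max_radius = math.sqrt(dims)
    results = []
    for k in range(steps + 1):
        radius = max_radius * k / steps
        inside = captured(points, base, radius)
        churners_inside = sum(labels[i] for i in inside)
        results.append((radius, len(inside), churners_inside))
    return results


# Toy example with three relative input variables per user.
points = [(0.9, 0.8, 1.0), (0.1, 0.0, 0.2), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]
labels = [1, 0, 0, 1]
for radius, n_inside, n_churners in sweep(points, labels, steps=4):
    print(f"R={radius:.3f}  users inside={n_inside}  churners inside={n_churners}")
```

Each tuple gives the counts needed to compute the percentage of churners and non-churners inside and outside the range at that step.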

Having calculated the percentage of churners and non-churners inside and outside the observed range for each combination of input variables at each step, it is possible to draw conclusions about how the observed user's interaction with his neighbors and the neighbors' churn affect the observed user's decision about churn. To measure the relevance of different segmentations, based on the combination of independent variables, we used some standard data mining measures, which are described in the following section.

7. RESULTS

In this section, we present the results of our experiment for different observation time period lengths, comparing different combinations of observed variables for the segmentation of churners. For each defined time period length, we present and discuss the results of all combinations and present the Receiver Operating Characteristic (ROC) along with the precision and recall values for a selected fraction of segmented users. These measures are defined as follows. Let TP be the true positives, TN the true negatives, FP the false positives and FN the false negatives. In this experiment TP represents the number of churners captured inside the observed range, FP the non-churners inside the observed range, TN the non-churners outside the observed range and FN the churners outside the observed range. Precision is defined as the fraction of retrieved instances that are relevant (Equation 2), while recall is the fraction of relevant instances that are retrieved (Equation 3).

$$\text{precision} = \frac{TP}{TP + FP} \qquad (2)$$

$$\text{recall} = \frac{TP}{TP + FN} \qquad (3)$$

Having calculated the precision and recall values for different rates of population captured inside the observation area, it is possible to represent these values graphically (Picture 3) and interpret the significance of the segmentation against random selection. In the ideal case, in which exactly the actual churners are labeled as churners, both the precision and the recall would be equal to 1, as all churners among all users, and no non-churners, would be selected. In the case of random selection, the precision value is always close to the actual churn rate, while the recall value increases proportionally with the selected population size, assuming that churners and non-churners are uniformly distributed among the population. When criteria are defined to select a specific segment of the population with the aim of increasing the precision and recall values, the segmentation efficiency can be measured by comparing the values for the segmented users with the values for random selection. Certainly, it is very difficult to design a perfect segmentation model valid for general usage, and therefore in real models there is a certain number of samples, in this case non-churners, that are classified as churners.
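As a small worked illustration of Equations 2 and 3, the sketch below counts TP, FP and FN from two parallel lists. The function name `precision_recall` and the toy data are hypothetical; a captured user is one that lies inside the observed range.

```python
def precision_recall(is_churner, is_captured):
    """Precision and recall for users captured inside the observed range."""
    pairs = list(zip(is_churner, is_captured))
    tp = sum(1 for churner, inside in pairs if churner and inside)
    fp = sum(1 for churner, inside in pairs if not churner and inside)
    fn = sum(1 for churner, inside in pairs if churner and not inside)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: 8 users, 3 churners, 3 users captured inside the range.
is_churner = [1, 1, 0, 0, 0, 1, 0, 0]
is_captured = [1, 1, 1, 0, 0, 0, 0, 0]
precision, recall = precision_recall(is_churner, is_captured)
print(precision, recall)
```

Here two of the three captured users are churners and two of the three churners are captured, so both values are 2/3, well above the 3/8 churn rate that random selection would give as precision in this toy set.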

Picture 3 - Precision and recall values for random selection, ideal segmentation and segmentation with observation of the prior neighbor churners rate, the relative number of calls to neighbors that churned before and the relative duration of calls to neighbors that churned before, with a time period length of 60 days


Picture 4 - ROC curve and AuC, colored with grey, for observation of the prior neighbor churners rate, the relative number of calls to neighbors that churned before and the relative duration of calls to neighbors that churned before, with a time period length of 60 days

Besides the precision and recall values, a useful measure for model evaluation is the Area under Curve value (AuC), which is derived from the Receiver Operating Characteristic (ROC) [4]. The ROC curve is a graphical plot which illustrates the performance of a binary classifier for different rates of captured users and enables the observer to visually estimate the cost/benefit ratio of the segmentation model for a selected size of population. The best possible prediction method would yield a point in the upper left corner, at coordinate (0, 1) of the ROC space, in which case all actual churners (TP) and no non-churners (FP) would be selected. The actual ROC curve values depict the relative trade-offs between the benefit of selecting actual churners and the cost of classifying actual non-churners as churners (FP). The AuC value is the proportion of the area of the unit square that lies under the ROC curve and is equivalent to the probability that the classifier will rank a randomly chosen positive instance (a churner in this case) higher than a randomly chosen negative instance (a non-churner). As random guessing produces the diagonal line between (0,0) and (1,1), which splits the whole observation space in half, the AuC value for random guessing is equal to 0.5. As in this discussion we observed the difference between random guessing and using the classification model, we calculated the area between the random guessing curve and the actual classification model ROC curve (AuC), as shown in [4].

Similarly, besides the actual AuC value derived from the ROC curve, we used the recall values of random guessing and of the actual model to calculate the area between the actual model recall curve and the random guessing recall curve. By adjusting the capture range, which represents a side of a rectangle when observing 1 dimension, a circle radius in the case of 2 dimensions and a sphere radius in the case of 3 dimensions, it is not possible to capture an exact percentage of users, and therefore we used the AuC for model estimation. The precision, recall, AuC and AuC (Recall) values for different combinations of input variables for the observation period length of 60 days are listed in Table 1.
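The ROC construction and the area between the model's ROC curve and the random-guessing diagonal can be sketched as below. This is an illustrative implementation using the trapezoidal rule, not the authors' code; the names `roc_points` and `auc` are hypothetical, and the score is assumed to be any value that ranks users closer to the base point higher (for example the negative distance from (1,1,1)).

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR) obtained by lowering the threshold over the scores.

    Assumes at least one churner (label 1) and one non-churner (label 0).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Users with equal scores enter the captured set together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / negatives, tp / positives))
        i = j
    return points


def auc(points):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: higher score means closer to the base point.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
area_above_diagonal = area - 0.5  # difference from random guessing
print(area, area_above_diagonal)
```

In the toy example, 8 of the 9 churner/non-churner pairs are ranked correctly, so the area under the curve is 8/9, which matches the pairwise-ranking interpretation of the AuC given above.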

Dimensions  AuC      Precision (% of users)             Recall (% of users)  AuC (Recall)
                     ~5% (actual)     ~10% (actual)     ~5%       ~10%
x           0.13075  0.0377 (5)       0.0237 (9.6)      0.2814    0.3397     0.12987
y           0.1353   0.0419 (4.93)    0.0299 (7.3)      0.3081    0.3264     0.1344
z           0.1368   0.0443 (4.72)    0.0263 (8.55)     0.3126    0.3362     0.13588
x, y        0.13395  0.0425 (4.86)    0.0238 (9.53)     0.3084    0.3393     0.13305
x, z        0.13466  0.0422 (4.99)    0.0238 (9.55)     0.3147    0.3393     0.13376
y, z        0.13643  0.0446 (4.64)    0.0271 (8.25)     0.3095    0.334      0.13551
x, y, z     0.02355  0.0425 (4.95)    0.0236 (9.64)     0.3143    0.3397     0.02359

Table 1 - Model performance for the time period length of 60 days, using different combinations of input variables, where x represents the relative number of neighbors that churned before, y the relative duration of calls to neighbors that churned before and z the relative number of calls to neighbors that churned before. The value in parentheses is the actual percentage of users captured.
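To relate a row of Table 1 to random selection, the lift can be derived from the reported values. The short calculation below is an illustrative reading of the (x, z) row and is not a figure reported by the authors. It relies only on the definitions of precision and recall: since recall divided by the captured fraction equals precision divided by the overall churn rate, both ratios express the same lift.

```python
# Values copied from the (x, z) row of Table 1 at about 5% of users captured.
captured_fraction = 0.0499
precision = 0.0422
recall = 0.3147

# Random selection of the same fraction would reach recall == captured_fraction.
lift = recall / captured_fraction

# Overall churn rate implied by the three reported values.
implied_churn_rate = precision * captured_fraction / recall

print(f"lift over random selection: {lift:.2f}")
print(f"implied overall churn rate: {implied_churn_rate:.4f}")
```

Under these numbers the segment captures roughly 6.3 times more churners than a random selection of the same size, and the implied overall churn rate is below 1%.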


Comparing the results for different period lengths, we can see that the number of calls to neighbors that churned before, relative to the total number of calls, is the best single-variable predictor for period lengths of 60 and 30 days, while it is close to the best one for the 15-day period length as well, where the best predictor is the relative duration of calls to neighbors that churned before. By comparing model performance for different period lengths and the same combinations of input variables, we can see that the model achieves the best values for the 60-day period length.

To describe the usefulness of the model in practice, we can consider the case of observing the relative number of neighbors that churned before and the relative number of calls to neighbors that churned before for the time period of 60 days. For this case, if we set the range threshold to the value for which around 5% (4.99%) of users are captured and treated as churners, the AuC value is equal to 0.13466, precision is equal to 0.0422, recall is equal to 0.3147 and AuC (Recall) is equal to 0.13376.

8. CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS

Where the majority of the population has already adopted mobile services, it is critical to implement churn prediction methods in order to retain market share. Besides observing user behavior at the individual level, it is crucial to discover patterns and rules that hold for the network of interconnected users. In this research, we proved that an observed user's behavior in terms of churn depends on his neighbors' prior behavior. When users are observed as individuals, the importance of many users might be overlooked: from billing records the service provider can measure a user's importance by the amount of monthly charges, whereas users who are not active do not stand out in this respect but are nevertheless important if they receive many incoming calls, and therefore indirectly generate significant profit as well. By targeting such influential neighbors and preventing their churn, the spread of churn to active users, who directly generate profit, could be prevented. As the existence of influence among connected users has been proved, our plan for future research is to observe the connections over a longer time period and to distinguish the contribution of each neighbor's influence on the observed user separately.

9. ACKNOWLEDGEMENTS

The authors would like to thank Telekom Slovenije for cooperation. The work was supported in part by the Ministry of Education, Science, Culture and Sport of Slovenia and the Slovenian Research Agency. Special thanks go to the European Union for partly financing a young researcher training program from the European Social Fund, under the Operational Programme Human Resources Development for the period 2007-2013.

10. REFERENCES

[1] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A. Nanavati, and A. Joshi, "Social ties and their relevance to churn in mobile telecom networks," presented at the Proceedings of the 11th international conference on Extending database technology: Advances in database technology, Nantes, France, 2008.
[2] L. Alberts, I. R. L. M. Peeters, R. Braekers, and C. Meijer, "Churn Prediction in the Mobile Telecommunications Industry," Citeseer.
[3] T. Dierkes, M. Bichler, and R. Krishnan, "Estimating the effect of word of mouth on churn and cross-buying in the mobile phone market with Markov logic networks," Decision Support Systems, vol. 51, pp. 361-371, 2011.
[4] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, pp. 861-874, 2006.
