TABLE OF CONTENTS
Topic                                                      Page No.

DEFINE PHASE ..................................................... 3
  Project Charter ................................................ 4
  Deployment Process Flow Diagram ................................ 5
  SIPOC Diagram .................................................. 5
  KPIs and KPOs .................................................. 8
MEASURE PHASE .................................................... 9
  Measurement System Analysis (MSA) ............................. 10
  Line Chart (Month Wise Downtime) .............................. 12
  Pie Chart (Category Wise Downtime) ............................ 13
  Pareto Analysis (Category Wise Downtime) ...................... 13
  Cause & Effect Diagram (Downtime) ............................. 14
  Pie Chart: Country Wise Downtime .............................. 14
             City Wise Downtime ................................. 14
  Bar Chart: Shift Wise Average Downtime ........................ 15
             Gender Wise Average Downtime ....................... 16
             Qualification Wise Downtime ........................ 16
             Experience Wise Downtime ........................... 17
             Age Wise Downtime .................................. 17
  Multiple Bar Chart: Country & City Wise Downtime .............. 17
             Qualification & Experience Wise Downtime ........... 18
             Age & Experience Wise Downtime ..................... 18
  Probability Plot .............................................. 18
  Current Sigma Level Calculation ............................... 19
ANALYSIS PHASE .................................................. 20
  Testing of Hypothesis ......................................... 21
IMPROVE PHASE ................................................... 33
  Design of Experiment .......................................... 34
CONTROL PHASE ................................................... 42
  Process Failure Mode & Effect Analysis ........................ 43
  Control Charts ................................................ 59
Page 2
DEFINE PHASE
DEFINE PHASE: Project Charter Process Flow Diagram SIPOC Diagram KPIs & KPOs
Page 3
PROJECT CHARTER:
PROJECT TITLE: Managing & Enhancing Quality of IT Operations by Reducing Unplanned Downtime

BUSINESS CASE: Last year (February to August 2011), due to different issues in IT operations and development, the company faced unplanned downtime that caused losses of revenue, reputation, and customers. The overall losses were approximately Rs. 23,981,100 per month on average (Average Downtime per Month x Average Cost (Human Resource) per Minute, over 7 months). By reducing these losses by 30%, we would save approximately Rs. 7,194,330 per month on average.

PROBLEM STATEMENT: Manage quality by controlling and improving the processes of the identified major downtime categories, which cause 45% of total downtime.

OBJECTIVE: Reduce downtime by 30% of the total downtime.

METRICS: Primary Metric = Downtime (%) = (Total Downtime / Total Production Time) x 100; Secondary Metric = % Yield.
Month:          Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
Total Downtime: 8079  6961  7047  6263  6559  5405  6645  5430  4355  2122  2276
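The business-case arithmetic can be reproduced directly from the seven-month figures above. This is a sketch; it assumes the Rs. 3575 average human-resource cost per minute given in the report's cost-of-downtime section.

```python
# Reproduce the charter's business-case arithmetic from the Feb-Aug data.
monthly_downtime = [8079, 6961, 7047, 6263, 6559, 5405, 6645]  # Feb-Aug 2011, minutes

avg_downtime = round(sum(monthly_downtime) / len(monthly_downtime))  # 6708 minutes/month
cost_per_minute = 3575                                               # Rs. per minute

avg_monthly_loss = avg_downtime * cost_per_minute  # Rs. 23,981,100
target_savings = avg_monthly_loss * 0.30           # Rs. 7,194,330 (30% reduction goal)
```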
Schedule:
ID  Task Name  Start       Finish      Duration
1   Define     10/17/2011  10/28/2011  10d
2   Measure    10/31/2011  11/18/2011  15d
3   Analyze    11/21/2011  12/9/2011   15d
4   Improve    12/12/2011  1/27/2012   35d
5   Control    1/30/2012   2/17/2012   15d
Page 4
PROCESS FLOW DIAGRAM:

[Deployment process flow diagram with swimlanes for Client Services, Development Team, DBA Team, Help Desk, NT Admin Team, and GNOC, covering the steps: Analyze Performance, Inform Client, Update Script, Deploy Script, Apply Filter, Update Filter, Reboot Dialers, Login Agents, Getting Leads, Start Dialing, Record Downtime, and Assign/Unassign Machine.]
SIPOC DIAGRAM:

PROCESS: APPLY FILTER
Supplier: Development Team, Operations Team, Strategy Analyst
Input:    Dialing Filter, Dialing Criteria, Email
Process:  Operations team requests a filter update -> Dialer team implements the filter and checks the leads count -> Dialer team provides the updated leads count and filter to the development team
Output:   Correct Leads, Enough Leads Available
Customer: Operations Team, Agents
Page 5
PROCESS: REBOOT DIALER
Supplier: NT Admin
Input:    Credentials with the rights to reboot
Process:  Dialer team analyst stops the controller
Output:   Controller up and running, CTI Engine up and running, Dialogic up and running
Customer: Dialer Team, Agents
PROCESS: UPDATE SCRIPT
Supplier: Client, Client Services Team
Input:    Client Requirements, Client's Approval, Client Services Approval
Process:  Client services requests a change in script
Output:   Correct script with updated information
Customer: Agents, Client Services, Operations Manager, Developer
Page 6
PROCESS: CHANGE DIALING PARAMETERS
Supplier: Client Services, Strategy Analyst
Input:    Previous Dialing Data, New Strategy
Process:  Client Services/Strategy Analyst requests a change in parameters -> Dialer Team supervisor verifies that the change has been implemented correctly -> Dialer Team informs the requestor that the change has been implemented
Output:   Target Strategy Met, Dialing with New Parameters
Customer: Strategy Analyst, Client Services Team
PROCESS: LEADS LOADING
Supplier: Client Services, Client
Input:    Lead files
Process:  Operations team requests filter update -> Dialer team implements the filter and checks the leads count -> Dialer team provides the updated leads count and filter to the development team
Output:   New Leads
Customer: Operations Team, Client Services, Dialer Team
PROCESS: LOGIN AGENTS
Supplier: NT Admins, Dialer Team
Input:    X-ten NT login credentials, Agent application login credentials
Process:  Agent starts soft phone -> Agent starts the agent application and logs in -> Once logged into the agent application, the agent logs into the campaign -> Required campaign is loaded and the agent starts dialing
Output:   Agent Application Campaign Loaded
Customer: Agents
Page 7
PROCESS: MAINTENANCE
Supplier: Machine Owner
Input:    Required approvals, Required machine information
Process:  Dialer Team requests field support to perform maintenance -> Field support team performs the maintenance and restarts the machine
Output:   Machine up and running
Customer: Dialer Team, NT Admin
Page 8
MEASURE PHASE
MEASURE PHASE:
Measurement System Analysis Line Chart Pie Chart Pareto Chart Cause & Effect Diagram Bar Chart Multiple Bar Chart Probability Plot Current Sigma Level
Page 9
Process performance is measured in terms of downtime per month. Downtime is measured from the clock and recorded on an Excel sheet. Different sample intervals were provided to operators, who evaluated each interval as Good or Bad; for this study, a downtime of up to 60 minutes was rated Good and anything longer was rated Bad. The operators' ratings were then entered into Minitab 16 and a Measurement System Analysis for attribute data was performed. Results are shown below.
The appraisals of the test items correctly matched the standard 93.3% of the time.
Comments Consider the following when assessing how the measurement system can be improved: -- Low accuracy rates: Low rates for some appraisers may indicate a need for additional training for those appraisers. Low rates for all appraisers may indicate more systematic problems, such as poor operating definitions, poor training, or incorrect standards. -- High misclassification rates: May indicate that either too many Good items are being rejected, or too many Bad items are being passed on to the consumer (or both). -- High percentage of mixed ratings: May indicate items in the study were borderline cases between Good and Bad, thus very difficult to assess.
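The core agreement statistic behind this analysis is simple: the percentage of appraisals that match the known standard. A minimal sketch follows; the ratings below are illustrative, not the study's actual data.

```python
# Attribute-MSA agreement: share of appraisals matching the known standard.
standard   = ["Good", "Bad", "Good", "Good", "Bad", "Good", "Bad", "Good", "Good", "Bad"]
appraisals = ["Good", "Bad", "Good", "Bad",  "Bad", "Good", "Bad", "Good", "Good", "Bad"]

matches = sum(a == s for a, s in zip(appraisals, standard))
percent_match = 100 * matches / len(standard)  # 90.0 here; the study reported 93.3%
```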
[Attribute agreement analysis charts: % matched against the standard per appraiser (Hassan Rauf, Fahad Javaid, Najeeb Rehma) with a 95.0% reference line and the overall 93.3% result; % Good rated Bad and % Bad rated Good by item (Items 1-10); and % by standard vs. % by trial per appraiser.]
Page 11
LINE CHART:
Month:          Feb   Mar   Apr   May   Jun   Jul   Aug
Total Downtime: 8079  6961  7047  6263  6559  5405  6645
The line chart and descriptive statistics above show the last seven months' trend of total production downtime on a monthly basis.
Page 12
PIE CHART:
[Pie chart of category wise production downtime; visible slices include Unknown 25%, Dialer Software 12%, Procedure Error 12%, Facilities 10%, and Maintenance 1%.]
The pie chart of category wise production downtime clearly shows that Unknown was the largest downtime category, with a 25% share of total downtime. Dialer Software and Procedure Error were the joint second largest categories, with a 12% share each. Overall, the pie chart gives a clear picture of the percentage of downtime associated with each root cause category.
PARETO ANALYSIS:
Category vs Downtime

Category           Downtime  Percent  Cum %
Unknown               11661     24.8   24.8
Procedure Error        5465     11.6   36.5
Dialer Software        5379     11.5   47.9
Change Error           5237     11.2   59.1
Facilities             4652      9.9   69.0
Telco                  2842      6.1   75.0
Client Related         2066      4.4   79.4
Network                2043      4.4   83.8
Network Hardware       1933      4.1   87.9
Other                  5681     12.1  100.0
Page 13
After constructing the Pareto chart of downtime causes, it was observed that, of the remaining 14 categories (excluding Unknown), only four were responsible for 44.2% of the downtime. Procedure Error and Dialer Software proved to be the first and second largest categories, responsible for 11.6% and 11.5% of total downtime respectively; Change Error caused 11.2%, while Facilities caused around 9.9%. From this, four major categories were identified: Change Error, Procedure Error, Facilities Failure, and Dialer Software. These four categories alone were responsible for 44.2% of total downtime; by managing them we can considerably increase quality by reducing unplanned downtime.
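The Pareto computation behind the chart can be sketched from the category totals above; "Other" is the catch-all bucket and is conventionally kept last rather than sorted by size.

```python
# Pareto analysis: per-category percent and cumulative percent of downtime.
categories = [
    ("Unknown", 11661), ("Procedure Error", 5465), ("Dialer Software", 5379),
    ("Change Error", 5237), ("Facilities", 4652), ("Telco", 2842),
    ("Client Related", 2066), ("Network", 2043), ("Network Hardware", 1933),
    ("Other", 5681),  # catch-all bucket, kept last by convention
]
total = sum(minutes for _, minutes in categories)  # 46959 minutes

rows, cum = [], 0.0
for name, minutes in categories:
    pct = 100 * minutes / total
    cum += pct
    rows.append((name, round(pct, 1), round(cum, 1)))
```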
CAUSE & EFFECT DIAGRAM (DOWNTIME):

Branches of the fishbone diagram (effect: Downtime):

Facilities Failure: Storm.

Procedure Error: Documentation not available; documentation not considered important; documentation not upgraded; trainings not arranged; lack of job knowledge; not enough trained resources.

Dialer Software: Too much load; load unbalancing; software crash; not thoroughly tested; incompatibility with other components; unhandled exception; QA not performed before release; new version needs to be released immediately; parameters were not double checked; wrong parameters applied; lack of education/training; lack of experience/knowledge.

Change Error: Change implemented wrongly; lack of communication between stakeholders; change was misunderstood; lack of automation; resistance to change; tools are considered unimportant; resources are busy in other tasks; resources not available to develop customized tools; lack of ownership; change manager/implementer not properly defined; technology failed during change; current system does not support upgrade; change was not properly tested; campaign needs to go live on short notice; not good enough testing environment; lack of time.
PIE CHARTS:
[Pie chart of country wise downtime: Pakistan 95%; remaining countries 4% and 1%.]
Page 14
The pie chart of country wise downtime clearly shows that Pakistan was the largest downtime country, with a 95% share of total downtime.
[Pie chart of city wise downtime: Lahore 85%, Karachi 10%.]
The pie chart of city wise downtime clearly shows that Lahore was the largest downtime city, with an 85% share of total downtime.

BAR CHARTS:
Shift:             Evening  Morning  Night
Average Downtime:    58.94    56.94  39.95
The bar chart above indicates that average downtime in the evening and morning shifts is much higher than in the night shift.
Page 15
The bar chart above indicates that average downtime attributed to male employees is higher than that attributed to female employees.
[Bar chart of qualification wise downtime; visible values: BCE 156 (2), BEE 1704 (59), MCS 86 (2), MIS 420 (1), MT 5389 (104), plus bars for BIS, BIT, and BT.]
Legend:
A Levels - A-Levels
BBIT - Bachelors in Business and Information Technology
BCE  - Bachelors in Computer Engineering
BCS  - Bachelors in Computer Science
BEE  - Bachelors in Electrical Engineering
BIS  - Bachelors in Information Systems
BIT  - Bachelors in Information Technology
BT   - Bachelors in Telecommunication
MCS  - Masters in Computer Science
MIS  - Masters in Information Systems
MISS - Masters in Information Systems Security
MT   - Masters in Telecommunication

Because a few categories had very low downtime counts, the bar chart is used to highlight the qualification levels chiefly responsible. Most of the downtime is due to graduates (i.e. BIT, BT, and BCS).

SIX SIGMA (BLACK BELT) PROJECT    Page 16
[Bar chart of experience wise downtime; visible values: 8-9 years 3033 (74), 10-11 years 809 (15), 12-14 years 30 (1).]
The bar chart above indicates that less experienced staff are responsible for more of the downtime.
The bar chart above indicates that people younger than 34 years are responsible for more of the downtime.

MULTIPLE BAR CHARTS:
Page 17
The multiple bar chart above indicates that most of the downtime occurs in Lahore, Pakistan.
[Multiple bar chart of qualification & experience wise downtime, broken down by experience bands (2-3 through 12-14 years) within each qualification level (A Levels, BBIT, BCE, BCS, BT, MCS, MIS, MT).]
The multiple bar chart above indicates that most of the downtime is caused by less experienced staff with graduate qualifications (i.e. BIT, BT, BCS).
[Multiple bar chart of age & experience wise downtime, broken down by experience bands within each age group (28-29 through 54-55 years).]
[Probability plot of overall downtime with Anderson-Darling test results.]
The probability plot of overall downtime for all root cause categories indicates a non-normal distribution, because the p-values for the Anderson-Darling test are less than 0.05.

Page 18
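A hedged sketch of the Anderson-Darling normality check behind such a probability plot follows. The data here is synthetic (skewed, like downtime durations), not the report's records; SciPy reports the statistic against critical values rather than a p-value.

```python
# Anderson-Darling normality check on synthetic, skewed "downtime" data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
downtime = rng.exponential(scale=60, size=200)

result = stats.anderson(downtime, dist="norm")
# result.significance_level == [15, 10, 5, 2.5, 1]; index 2 is the 5% level.
rejects_normality = result.statistic > result.critical_values[2]
# A statistic above the 5% critical value corresponds to the report's
# p < 0.05 conclusion of non-normality.
```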
MEASURING THE CURRENT PROCESS SIGMA LEVEL:

Root Cause Category          Downtime (Minutes per Year)
Dialer Software                     5379
Dialer Hardware                     1431
Network                             2043
Windows Server                      1685
Facilities                          4652
Client Related                      2066
Maintenance                          491
Change Error                        5237
Procedure Error                     5465
Network Hardware                    1933
Database                             677
Telecommunication (Telco)           2842
Software                            1257
Service Provider                     140
Unknown/Other                      11661
TOTAL                              46959
COST OF DOWNTIME:
Total Production Time (Minutes) per Month = 15 hours x 60 minutes x 26 days = 23400
Average Downtime (Minutes) per Month = 6708
Average Cost per Minute (Human Resources) = Rs. 3575
Average Downtime Cost per Month = Rs. 23,981,100

SIGMA LEVEL CALCULATIONS (LAST SEVEN MONTHS):
Total Downtime (Minutes) = 46959
Total Production Time (Minutes) = 163800
Opportunities per Unit = 1
DPU (Defects / Unit) = 46959 / 163800 = 0.2867
DPO (Defects / Opportunity) = 46959 / (163800 x 1) = 0.2867
Yield = 1 - DPO = 1 - 0.2867 = 0.7133
DPMO = 286,685 (defects per million opportunities)
Sigma Level = 2.06 (using MS Excel)

CONCLUSION: From the above investigations it is concluded that most of the downtime occurs in Lahore, Pakistan. On average, more downtime occurs in the evening and morning shifts than in the night shift, and males account for more of the downtime than females. People with less experience and graduate-level qualifications (i.e. BIT, BT, BCS) account for more downtime than others.
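The sigma-level arithmetic above can be sketched with only the standard library: the long-term yield is converted to a short-term sigma level with the conventional 1.5-sigma shift, which is what Excel-based sigma calculators typically do.

```python
# DPO -> yield -> DPMO -> sigma level (with the conventional 1.5 shift).
from statistics import NormalDist

total_downtime = 46959    # minutes, Feb-Aug 2011
production_time = 163800  # minutes (7 months x 23400)

dpo = total_downtime / production_time                   # ~0.2867
process_yield = 1 - dpo                                  # ~0.7133
dpmo = dpo * 1_000_000                                   # ~286,685
sigma_level = NormalDist().inv_cdf(process_yield) + 1.5  # ~2.06
```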
Page 19
ANALYZE PHASE
ANALYZE PHASE:
Testing of Hypothesis
Page 20
ONE-WAY ANOVA: DOWNTIME BY COUNTRY
[Individual value plots of downtime for Pakistan, Philippines, Senegal, and United States.]
Page 21
P = 0.076. Differences among the means are not significant (p > 0.05). You cannot conclude that there are differences among the means at the 0.05 level of significance.
[Means comparison chart for Pakistan, Philippines, Senegal, and United States.]
The result of the above test shows that Country is an insignificant factor, as the p-value is more than 0.05.
Page 22
The result of the above test shows that City is an insignificant factor, as the p-value is more than 0.05.
Page 23
ONE-WAY ANOVA: DOWNTIME BY SHIFT
[Individual value plots of downtime for Morning, Evening, and Night shifts.]
P = 0.000. Differences among the means are significant (p < 0.05).
You can conclude that there are differences among the means at the 0.05 level of significance. Use the comparison chart to identify means that differ: red intervals that do not overlap indicate means that differ from each other. Consider the size of the differences to determine whether they have practical implications.
[Means comparison chart for Evening, Morning, and Night.]
Page 24
The result of the above test shows that Shift is a significant factor, as the p-value is less than 0.05.
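A one-way ANOVA like the shift test above can be run with SciPy's `f_oneway`. The samples below are made up for illustration; they are not the report's downtime records, so the exact statistics differ.

```python
# One-way ANOVA: does mean downtime differ across shifts?
from scipy.stats import f_oneway

morning = [55, 62, 48, 70, 51, 58]
evening = [60, 66, 52, 75, 57, 61]
night   = [35, 41, 30, 44, 38, 39]

f_stat, p_value = f_oneway(morning, evening, night)
significant = p_value < 0.05  # the report found p = 0.000 for shift
```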
2-Sample t Test for the Mean of Downtime (Male and Female) Diagnostic Report
Data in worksheet order: investigate outliers (marked in red).
[Individual value plots for Downtime (Male) and Downtime (Female).]

Power: what is the chance of detecting a difference? For alpha = 0.05 and sample sizes of 892 and 53:

Difference  Power
14.433      60.0
16.495      70.0
18.907      80.0
22.253      90.0

If the true mean of Downtime (Male) were 14.433 greater than Downtime (Female), there would be a 60% chance of detecting the difference; at 22.253 greater, a 90% chance. Power is a function of the sample sizes and the standard deviations. To detect a difference smaller than 18.907, consider increasing the sample sizes.
Page 25
2-Sample t Test for the Mean of Downtime (Male and Female) Summary Report
Mean Test: Is Downtime (Male) greater than Downtime (Female)?

Statistics            Downtime (Male)   Downtime (Female)
Sample size           892               53
Mean                  50.280            39.792
90% CI                (46.83, 53.73)    (27.699, 51.886)
Standard deviation    62.571            52.573

P = 0.084. The mean of Downtime (Male) is not significantly greater than the mean of Downtime (Female) (p > 0.05).

The difference is defined as Downtime (Male) - Downtime (Female).
-- Test: There is not enough evidence to conclude that the mean of Downtime (Male) is greater than that of Downtime (Female) at the 0.05 level of significance.
-- CI: Quantifies the uncertainty associated with estimating the difference from sample data. You can be 90% confident that the true difference is between -2.0710 and 23.047.
-- Distribution of Data: Compare the location and means of the samples. Look for unusual data before interpreting the results of the test.
Page 26
Based on your samples and alpha level (0.05), you have at least a 90% chance of detecting a difference of 159.82, and at most a 60% chance of detecting a difference of 23.539.
Power is a function of the sample sizes and the standard deviations. To detect differences smaller than 150.83, consider increasing the sample sizes.
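A one-sided two-sample t test mirroring the male/female comparison can be run with SciPy; `alternative="greater"` asks whether the first mean exceeds the second. The data below is made up for illustration; the report's actual samples had 892 and 53 records and returned p = 0.084, so these numbers will not match.

```python
# Welch two-sample t test, one-sided: is the first group's mean greater?
from scipy.stats import ttest_ind

male   = [52, 61, 45, 70, 48, 55, 63, 50, 58, 47]
female = [40, 44, 38, 51, 36, 42, 39, 45]

t_stat, p_value = ttest_ind(male, female, equal_var=False, alternative="greater")
# Compare p_value against 0.05 to decide significance.
```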
Statistics:

Qualification   Sample Size   Mean     StDev    Individual 95% CI for Mean
A Levels            142       35.106   43.816   (27.836, 42.375)
Bachelors _2        132       38.977   41.490   (31.833, 46.121)
Bachelors _3          2       78       56.569   (-430.25, 586.25)
Bachelors _4        144       47.410   51.121   (38.989, 55.831)
Bachelors _5         58       29.034   38.518   (18.907, 39.162)
Bachelors _6         52       63.25    66.703   (44.680, 81.820)
Bachelors _7        190       60.368   75.072   (49.625, 71.112)
Bachelors _8        117       64       82.220   (48.945, 79.055)
Masters in_9          2       43       18.385   (-122.18, 208.18)
Masters in_1        104       51.817   57.211   (40.691, 62.943)
P = 0.016. Differences among the means are significant (p < 0.05).
[Means comparison chart: red intervals that do not overlap differ.]
Comments You can conclude that there are differences among the means at the 0.05 level of significance. Use the Comparison Chart to identify means that differ. Red intervals that do not overlap indicate means that differ from each other. Consider the size of the differences to determine if they have practical implications.
Page 27
The result of the above test shows that Qualification is a significant factor, as the p-value is less than 0.05.
Page 28
ONE-WAY ANOVA: DOWNTIME BY AGE
[Individual value plots of downtime by age group.]
P = 0.000. Differences among the means are significant (p < 0.05).
[Means comparison chart: red intervals that do not overlap differ.]
Comments You can conclude that there are differences among the means at the 0.05 level of significance. Use the Comparison Chart to identify means that differ. Red intervals that do not overlap indicate means that differ from each other. Consider the size of the differences to determine if they have practical implications.
Page 29
The result of the above test shows that Age is a significant factor, as the p-value is less than 0.05.
Page 30
ONE-WAY ANOVA: DOWNTIME BY EXPERIENCE
[Individual value plots of downtime by years of experience.]
P = 0.000. Differences among the means are significant (p < 0.05).
[Means comparison chart: red intervals that do not overlap differ.]
Comments You can conclude that there are differences among the means at the 0.05 level of significance. Use the Comparison Chart to identify means that differ. Red intervals that do not overlap indicate means that differ from each other. Consider the size of the differences to determine if they have practical implications.
Page 31
The result of the above test shows that Experience is a significant factor, as the p-value is less than 0.05.
CONCLUSIONS: Testing of hypothesis indicates that a few key process input variables are significant: Shift, Qualification, Experience, and Age. In the Improve phase, further experimentation will be done considering the significant factors Shift, Qualification, and Experience. Although Age is a significant factor, it is a noise factor and is therefore not considered further.
Page 32
IMPROVE PHASE
IMPROVE PHASE:
Design of Experiment
Page 33
GRAPHICAL SUMMARY OF THE RESPONSE:

[Histogram with summary statistics: Anderson-Darling A-squared = 0.33, p-value = 0.483; N = 16; Mean = 132.81, StDev = 79.71, Variance = 6353.23, Skewness = 0.040952, Kurtosis = -0.641979; Minimum = 10.00, 1st Quartile = 55.00, Median = 135.00, 3rd Quartile = 180.00, Maximum = 280.00. 95% confidence intervals are shown for the mean (upper limit 175.29), the median (upper limit 180.00), and the standard deviation (58.88, 123.36).]
The graphical summary of the response shows that the data generated against the design of experiments (i.e. the response) is normally distributed, so the normality assumption is fulfilled.
Page 34
[Normal probability plot of the standardized effects; terms B and BC fall away from the reference line, indicating significant effects.]
Page 35
[Residual plots for downtime: normal probability plot of residuals, residuals versus fits, histogram of residuals (Mean approx. 0, StDev = 24.31, N = 16), and residuals versus observation order (1-16).]
Page 36
Mean
The main effects plots of Shift, Qualification, and Experience show that the most dominant factor is Qualification, as it produces the most visible variation in unplanned downtime.
[Cube plot for downtime across Shift (Evening/Night), Qualification (A-Levels/Masters), and Experience (2/14 years); recoverable corner values include 105.0, 115.0, 185.0, 210.0, and 230.0 minutes.]
The cube plot for downtime shows the optimum conditions for the desired results: if minimum unplanned downtime is required, the organization must work with the combination of Night shift, Masters qualification, and 14 years of experience (i.e. maximum experience).

Page 37
RESPONSE OPTIMIZATION:
Optimizer settings: Shift = [Night] (levels: Evening, Night); Qualification = [Masters] (levels: A-Levels, Masters); Experience = [14.0] (range: 2.0 to 14.0).
When the optimizer is run with the goal of minimizing downtime, it gives a minimum downtime of 15 minutes.
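What the optimizer does over the 2^3 cube can be sketched as evaluating every factor combination and keeping the one with minimum downtime. All response values below are placeholders except the recommended corner, which is set to the roughly 15 minutes the report's optimizer found.

```python
# Enumerate the 2^3 factorial corners and pick the minimum-downtime one.
from itertools import product

shifts = ["Evening", "Night"]
qualifications = ["A-Levels", "Masters"]
experience_levels = [2, 14]  # years

response = {  # hypothetical mean downtime (minutes) per combination
    ("Evening", "A-Levels", 2): 230, ("Evening", "A-Levels", 14): 185,
    ("Evening", "Masters", 2): 115,  ("Evening", "Masters", 14): 105,
    ("Night", "A-Levels", 2): 210,   ("Night", "A-Levels", 14): 150,
    ("Night", "Masters", 2): 60,     ("Night", "Masters", 14): 15,
}
best = min(product(shifts, qualifications, experience_levels), key=response.get)
# best == ("Night", "Masters", 14), matching the report's recommendation
```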
Page 38
[Probability plot of downtime after improvement.]
Page 39
PARETO ANALYSIS (AFTER IMPROVEMENT):

[Pareto chart of category wise downtime after improvement, across the 14 categories (including Unknown, Procedure Error, Change Error, Facilities, Dialer Software, Windows Server, Telco, Service Provider, Client Related, and Other). Category totals: 3258, 1759, 1503, 1198, 1180, 1175, 1106, 1089, 1025, 1005, 967, 965, 960, 1703. Percent: 17, 9, 8, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 9. Cum %: 17, 27, 35, 41, 47, 53, 59, 65, 70, 76, 81, 86, 91, 100.]
LINE CHART (AFTER IMPROVEMENT):

Month:          Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
Total Downtime: 8079  6961  7047  6263  6559  5405  6645  5430  4355  2122  2276
Page 40
SIGMA LEVEL CALCULATIONS (AFTER IMPROVEMENT), FOUR MONTH STUDY:
Total Downtime (Minutes) = 14183
Total Production Time (Minutes) = 93600
Opportunities per Unit = 1
DPU (Defects / Unit) = 14183 / 93600 = 0.1515
DPO (Defects / Opportunity) = 14183 / (93600 x 1) = 0.1515
Yield = 1 - DPO = 1 - 0.1515 = 0.8485
DPMO = 151,528 (defects per million opportunities)
Sigma Level = 2.53 (using MS Excel)

Sigma Level before the improvement: 2.06. After the improvement: 2.53.
CONCLUSIONS: With the new combination suggested by the design of experiments, overall system performance improved, as shown in the sigma level calculation above.
Page 41
CONTROL PHASE
CONTROL PHASE:
Page 42
120
144
Import date of leads not updated Part of filter missing while applying
196
168
Applied for only one day instead of until further notice 4 6 Current tasks finished
Usually asked of the OPS team whether the change is only for one day or until further notice. No process controls available. Normally developers mention in which database to
144
54
Leads Loading
32
Tasks audit should be performed every month Always get a confirmation from dialer team and developer
Afriaz Cheema (Manager Systems and Production) Hammad Haasan (Database Administrator)
Page 43
Misunderstanding by the DBA Loaded with wrong scrubbing information Customers complaining about DNC 6 Wrong scrubbing information provided by client services/developer
Update Script
No process controls available Client services normally mentions the scrubbing criteria Developer just relies on the information provided
32 Double check with client services before actually scrubbing the leads. Communication regarding the script must be documented, and understanding must be developed by asking questions. The script must be verified after development and before UAT. There should be more focus on improving the testing environment, and employees should be encouraged to do more and more testing in test Arsalan Ahmed (Database Administrator)
72
100
120
Functionality not working in production environment Term codes were not added
168
216
Page 44
Applications crashing
Unhandled Exception 4
Some level of exception handling is done by the developer Once issues are reported the parameters are double checked to find out if anything was missed Information only flows through formal emails
128
Production affected
63
120
environment before going live More importance needs to be given to the exception handling in programming Updated parameters should be verified by the supervisor or checked by other team member in order make sure changes are correct Both dialer development and CTI Engine development team should have a closer interaction and communication before making any changes in any component. There should be more informal communication
Page 45
Normally version upgrade is done over the phone as some times clients need not to be aware of the change
96
96
Dialing stopped
No process controls available Normally when the issue arises, parameters are checked to make sure if these were updated correctly or not
Everything regarding the new and old version must be documented, including the current version, the new version, and the back-out steps in case a change fails. Automatic upgrade must be turned off as it affects custom-built applications. Parameters should be double checked. A change management system must be properly implemented, and every change must be properly documented and approved before the actual change is implemented. Changes must be verified by a supervisor or another person.
TAR updated with high value TAR updated with low value Wrong time zones opened Incorrect dialing mode applied
4 4 5
6
6
72
72
150
Zeeshan Jamil (Supervisor Systems and Production) Asim Zafar (Director IT Operations) Shoib Sakoor (Head of Production)
30
Page 46
S 8
96
96
Capacity issues
64
Page 47
Compatibility issues with other components Parameter Configuration Application not starting or crashing again and again Agents logged out and are in downtime Wrong parameters applied 8
96
144
Syntax mistake 2
Implementer himself checks the parameters
30
90
Stuck channels are not released, so the rest of the leads are marked as ND. Resource utilization is too high and no further resources are available to perform further actions.
All relevant stakeholders should be involved during the development process. Parameters should be double checked and, where possible, there should be tools available instead of updating anything from the backend. Training should be given to the technical teams on the current tools being used. Parameters should be double checked with another team member immediately after any update. This needs to be taken care of at the programming level.
The OPS team should always inform the technical teams if there is any change in the current resource allocation.
Page 48
120
Page 49
S 8
64
Head of production normally checks with the development team about incompatibility issues
120
Afriaz Cheema (Manager Systems and Production) Afaq Ali (Deputy Head of Production)
Unexpected error
Facility Evacuation
All resources are in downtime All relevant resources are logged out
96
144
2 2
3 3
48 48
Ehtesham Opel (Manager System Administrator) John Skubis (Field Support Officer)
Page 50
Earthquake
Maintenance
96
Nothing can be done. Time should be properly mentioned, whether it is Pakistan time, US time, or UK time. Hardware must be continuously updated. An extra resource should be hired who can be contacted during the primary person's off-contact hours. Antivirus must be regularly updated and security policies must be tightened, with no exceptions for anyone. A WSUS server must be installed so that all users are automatically updated on a regular basis. Munam Khalid (Helpdesk Analyst)
120
Hardware failure during maintenance No one available to restart system after maintenance as the person is moved to another site for some other maintenance Antivirus was not updated
96
120
Afriaz Cheema (Manager Systems and Production) Shoib Sakoor (Head of Production)
Virus Attack
System shutdown
Some malicious software was installed Windows domain server was attacked due to lack of security Some malicious software/patch was installed that affected the software
System not recognizing the users Some applications not functioning properly
Antivirus is manually updated All users are not allowed to install the software Antivirus and security policies are in place
48
64
96
60
Page 51
S 8
64
Controller is down and was not restarted CTI Engine is not properly working Dialogic is not working properly X-ten /dial pad is not configured properly
288
216
216 Agents should be properly trained on how to configure the phone before they log in. Agents should always sit at their designated stations. New agents must always be assigned to the campaign by requesting the required Jason Oliver (Operations Manager)
192
168
No campaigns loaded
168
Page 52
No dialing
Campaign not setup to dial during the specific time agents try to login
Reboot Dialer
Loss of production 8
Was shutdown instead of reboot Hardware failure Dialogic was not started with optimum settings CTI Engine threads were not stable when next application was started
Production time is normally communicated to Technical teams before production starts No process control available No process control available No process control available
168
technical teams before they try to log in. The OPS team should be informed if the dialer is changed; this should be documented, and OPS should acknowledge receipt of the email as well. The OPS team must ensure they always inform the technical teams of any change in production time. The shutdown option should be disabled for the user. Nothing can be done. Technical support teams should be given proper training and their knowledge should be continuously updated regarding the
168
112
64
256
Afriaz Cheema (Manager Systems and Production) Shoib Sakoor (Head of Production)
Page 53
Applications were started in the wrong sequence. One of the applications was not started.
112
168
Script Deployment
Script Missing
Agents can't see the script and can't properly communicate with customers. Performance is impacted, as agents can't properly communicate with customers.
Script was not deployed on one of the terminal servers Some files were not copied Version number was not updated
60
current processes. Checklists should be maintained and followed if a process always follows the same steps. A test agent should also be logged in before production to ensure applications are working properly. An automated application should be developed that deploys the script itself.
120 An automated application should be developed that updates the script version automatically Afaq Ali (Deputy Head of Production)
96
Page 54
Critical identified processes that should be monitored on a regular basis, with corresponding recommended actions:
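The RPN column in the tables below is conventionally the product Severity x Occurrence x Detection, each scored 1 to 10. The S/O/D splits in this sketch are illustrative assumptions; the report lists only the final RPNs (e.g. 288, 256).

```python
# Risk Priority Number: rank failure modes by Severity x Occurrence x Detection.
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number used to rank failure modes."""
    return severity * occurrence * detection

failure_modes = {  # hypothetical S/O/D decompositions
    "Can't login (agents)": (8, 6, 6),         # RPN 288
    "CTI Engine threads unstable": (8, 4, 8),  # RPN 256
    "Script missing on terminal server": (6, 2, 5),
}
ranked = sorted(failure_modes, key=lambda name: -rpn(*failure_modes[name]))
```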
Columns: Process Function / Requirements; Potential Effects of Failure; Potential Causes / Mechanisms of Failure; Current Process Controls; Root Cause Category; RPN; Recommended Action(s); Responsibility.
Before production, log in a test agent to make sure that dialer applications are up and running. Technical support teams should be given proper training, and their knowledge should be continuously updated regarding the current processes. Checklists should be maintained and followed if a process always follows the same steps. A test agent should also be logged in before production to ensure applications are properly
Login agents
Can't login
288
Procedure Error
Reboot dialer
CTI Engine threads were not stable when next application was started
256
Afriaz Cheema (Manager Systems and Production) Shoib Sakoor (Head of Production)
Procedure Error
Page 55
Update Script
216
Login agents
Can't login
216
Apply filter
196
Software Crash
192
working. There should be more focus on improving the testing environment, and employees should be encouraged to do more testing in the test environment before going live. Before production, log in a test agent to make sure that dialer applications are up and running. Double check the filter with the developer and with the Ops manager to discuss the result of implementation and to avoid requesting the filter change himself. Before deployment of an application, extensive QA/testing should be performed and
Change Error
Procedure Error
Change Error
Dialer Software
Page 56
especially stress testing should also be performed. Agents kicked out of the agent application immediately after they log in. Agents should be properly trained on how to configure the phone before they log in. A change management system must be properly implemented, and every change must be properly documented and approved before the actual change is implemented. Changes must be verified by a supervisor or another person. Parameters should be double checked and, where possible, there should be tools available instead of updating anything from the backend.
Login agents
192
Procedure Error
Normally when the issue arises, parameters are checked to make sure if these were updated correctly or not
150
Change Error
Parameter configuration
144
Dialer Software
Page 57
System Crash
Hardware failure Stuck channels are not released so rest of the leads are marked as ND
144
Hardware must be continuously updated. This needs to be taken care of at the programming level. Compatibility issues must be taken into consideration before updating any software, and all stakeholders should be informed about the update. Time should be properly mentioned, whether it is Pakistan time, US time, or UK time.
Ehtesham Opel (Manager System Administrator) Abul Asim (Senior Software Engineer/Program Architect)
Facilities Failure
125
Dialer Software
System Crash
System shutdown
Head of production normally checks with the development team about incompatibility issues
120
Afriaz Cheema (Manager Systems and Production) Afaq Ali (Deputy Head of Production)
Facilities Failure
Maintenance
120
Facilities Failure
Page 58
X-BAR AND R CHARTS:

[Xbar chart: grand mean X-bar = 30.16, LCL = 2.89; subgroups 1-19.]
Is the process variation stable? Investigate out-of-control subgroups; look for patterns and trends.
[R chart: R-bar = 88.5, UCL = 157.2, LCL = 19.7; subgroups 1-19.]
Page 59
[Xbar chart summary: 0.0% of subgroups out of control; X-bar = 30.16, LCL = 2.89.] The process mean is stable: no subgroups are out of control on the Xbar chart.
CONCLUSIONS: Process Failure Mode & Effect Analysis (PFMEA) and Xbar & R charts are suitable for monitoring and controlling the system.
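The Xbar/R control-limit arithmetic can be sketched as follows. The Shewhart constants used are the standard values for subgroups of size 10, which is consistent with the limits shown on the charts; the subgroup size itself is an assumption, since the report does not state it.

```python
# Xbar/R control limits from the grand mean and average range.
A2, D3, D4 = 0.308, 0.223, 1.777  # Shewhart control chart constants, n = 10 (assumed)

xbar, rbar = 30.16, 88.5  # grand mean and average range from the charts

ucl_x = xbar + A2 * rbar  # ~57.4
lcl_x = xbar - A2 * rbar  # ~2.9   (chart shows LCL = 2.89)
ucl_r = D4 * rbar         # ~157.3 (chart shows UCL = 157.2)
lcl_r = D3 * rbar         # ~19.7  (chart shows LCL = 19.7)
```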
Page 60
Page 61