Vol. III, Issue X – October 2014 Precision as a Stochastic Cost Estimating Method
www.pmworldjournal.net Featured Paper by Yosep Asro Wain
Updating the Lang Factor and Testing its Accuracy, Reliability and
Precision as a Stochastic Cost Estimating Method
ABSTRACT
The Lang factor is one of the factored estimating techniques recommended by the American
Association of Cost Engineers (AACE) International for Class 4 and Class 5 estimates. The
method, proposed by Hans J. Lang in the 1940s, uses a simple formula: a factor is
multiplied by the Total Equipment Cost (TEC) to obtain the Total Plant Cost (TPC).
The factors are 3.10 for solid plants, 3.63 for solid-fluid plants and 4.74 for fluid plants.
Over the ensuing decades, several authors have recalculated the Lang factor using their
own current data.
In this paper, the fluid plant Lang factor is updated and its accuracy, precision and
reliability are tested using historical project data from a major Indonesian national oil company.
The results of the updating and testing show that while the Lang factor is appropriate for
Class 4 and Class 5 estimates, it exhibits such a high degree of variability that it is not
recommended for creating highly accurate, reliable or precise cost estimates.
Keywords: Lang factor, Factorized Estimating, Cost Estimating, Stochastic Cost Estimating,
Parametric Cost Estimating, Equipment Factor Cost Estimating, Statistical Analysis, Monte
Carlo Simulation.
1. Introduction.
Cost estimating is the predictive process used to quantify, cost, and price the resources
required by the scope of an investment option, activity, or project [1]. In that regard, cost
estimating comprises two things, namely resource quantification and resource pricing or
costing. In resource quantification, the project scope definition, in the form of the work
breakdown structure (WBS) and the work statement, may be used to identify the activities that
make up the work; each activity is then decomposed into detailed items so that labor
hours, material, equipment and subcontracts are itemized and quantified [2]. In resource
pricing, any methodology, such as stochastic, factored, or deterministic, may be used to cost
the resources.
The cost estimating process is carried out throughout the entire project life cycle. At the
beginning of the project, when the scope definition is still rough, the accuracy of the cost
estimate is low. As the project definition becomes more detailed, the accuracy of the cost
estimate increases.
Figure 1 provides an example of the variability in uncertainty ranges for a process industry
estimate versus the level of project/scope definition.
Figure 1, Example of the Variability in Accuracy/Uncertainty Ranges for a Process Industry Estimate [4]
A Class 5 estimate is associated with the lowest level of project definition or maturity, while
a Class 1 estimate is associated with the highest. The estimating methodology tends to progress from
stochastic or factored to deterministic methods as the level of project definition increases,
which results in increased accuracy. Meanwhile, the preparation effort ranges from the lowest
for a Class 5 estimate (0.005% of project cost) to the highest for a Class 1 estimate (0.5% of
project cost).
Factored estimating techniques are the methods recommended by AACE International for Class 4
and Class 5 estimates. The factors are derived from historical data using statistical
inference or modeling. The types of factored estimating used, especially in the process
industries, are capacity factored estimates, equipment factored estimates, and parametric cost
estimates.
Capacity factored estimates use the cost of a similar plant or equipment item of known
capacity to obtain the cost of a new plant or equipment item, using the equation
$Cost_B / Cost_A = (Cap_B / Cap_A)^r$, where $Cost_A$ and $Cost_B$ are the costs of two similar
plants or equipment items, $Cap_A$ and $Cap_B$ are their capacities, and $r$ is the exponent or
proration factor.
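The capacity factored relationship above can be sketched as a short function. This is only an illustration: the function name, the plant costs, the capacities and the exponent r = 0.6 are hypothetical values, not figures from the paper.

```python
def capacity_factored_cost(cost_a: float, cap_a: float, cap_b: float, r: float) -> float:
    """Return Cost_B from Cost_B / Cost_A = (Cap_B / Cap_A) ** r."""
    if cost_a <= 0 or cap_a <= 0 or cap_b <= 0:
        raise ValueError("cost and capacities must be positive")
    return cost_a * (cap_b / cap_a) ** r


# Hypothetical inputs: a known plant with capacity 50,000 units cost USD 10 million;
# estimate a similar plant with twice the capacity, assuming an exponent r = 0.6.
cost_b = capacity_factored_cost(cost_a=10_000_000, cap_a=50_000, cap_b=100_000, r=0.6)
print(f"Estimated cost of plant B: USD {cost_b:,.0f}")
```

With an exponent below 1, doubling the capacity less than doubles the estimated cost, which is the economy-of-scale effect the exponent is meant to capture.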
Equipment factored estimates are used to obtain the total installation cost from the equipment
cost. Methods in this category include the Lang factor, Happel, Hand, Hackney, and Guthrie
methods.
Parametric model estimates use a parametric model to obtain the equipment cost and, from it, the
total plant cost. The parametric model is derived from statistical analysis of equipment cost data
from a specific time period.
As mentioned above, the Lang factor is one of the factored estimating techniques
recommended for Class 4 or Class 5 estimates. It was proposed by Hans J. Lang in the
1940s and uses a simple formula: a factor is multiplied by the main equipment cost
to obtain the total cost. The factors are 3.10 for solid plants, 3.63 for solid-fluid plants and 4.74
for fluid plants.
Over the ensuing decades, several authors have recalculated the Lang factor using their
own current data. Several of these studies and results are contained in the Fixed Capital Cost
Estimation chapter of Perry's handbook; others appear in books by Gerrard, Page, and
Dysert, and in a study by Wolf, T.E. [6].
An updated Lang factor is also used in the AACE International Recommended Practice:
3.89 for solid plants, 5.04 for solid-fluid plants and 6.21 for fluid plants [7].
Since the Lang factor was introduced, many things have changed. There
are now governmental rules and regulations in place that did not exist in the 1940s
and 1950s. Materials and construction methods are different. Digital
process controls have replaced pneumatic controls. The computer is used in lieu of the slide rule,
and design is done with three-dimensional computer models. There has also been material and labor cost
inflation (escalation) over the many decades [8]. Updating the factor is therefore sometimes
necessary to adapt it to current conditions.
In addition, testing the accuracy, precision and reliability of the Lang factor is necessary to
know whether the method can be used for high-accuracy cost estimates.
In this paper, the Lang factor is updated and its accuracy, precision and reliability are tested
using historical data from the Refinery Directorate of a major Indonesian national oil company
(“Company”).
2. Lang Factor.
Hans Lang introduced the concept of using the total cost of equipment to estimate the total
cost of a plant [9], by using the following formula:
$$TPC = f \times TEC \tag{1}$$
where TPC is the total plant cost and TEC is the total (main) equipment cost.
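As a minimal sketch of equation (1), the snippet below applies Lang's original factors, as quoted earlier in this paper, to an equipment cost. The dictionary keys, the function name and the USD 2 million equipment cost are hypothetical choices made for illustration.

```python
# Lang's original factors as quoted in this paper.
ORIGINAL_LANG_FACTORS = {"solid": 3.10, "solid-fluid": 3.63, "fluid": 4.74}


def total_plant_cost(tec: float, plant_type: str) -> float:
    """Equation (1): TPC = f * TEC, with f chosen by plant type."""
    return ORIGINAL_LANG_FACTORS[plant_type] * tec


# Hypothetical total equipment cost of USD 2 million for a fluid plant.
tpc = total_plant_cost(2_000_000, "fluid")
print(f"TPC = USD {tpc:,.0f}")
```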
As mentioned above, several authors have recalculated the Lang factor using their own current data; some of
the results are shown in Table 2.
Lang’s approach was simple, utilizing a factor that varies only by the type of process. Today,
many different methods of equipment factoring have been proposed. The Lang factor,
however, is often used generically to refer to all the different types of equipment factors [15].
3. Data Gathering.
Data were collected from several refineries owned by the Company, located in several
areas of Indonesia. A total of 29 project data sets was obtained, spanning 2003 to 2013 (10
years of project data). Figure 2 shows the data distribution by refinery unit. All 29
projects are for fluid plants; therefore the Lang factor obtained and tested in this paper
is for fluid processes only.
[Figure 2: bar chart of the data distribution by refinery unit, Unit A to Unit E]
Each of the 29 project data sets consists of the total plant cost (TPC) and the total
equipment cost (TEC).
The TEC values range from several tens of thousands up to hundreds of millions of US dollars.
A factor f for each project is derived using the following equation:
$$f = \frac{TPC}{TEC} \tag{2}$$
Figure 3, Scatter Plot of f and TEC (USD)
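Equation (2) is applied once per project. The sketch below shows the calculation on three made-up (TEC, TPC) pairs; the values are hypothetical and are not the Company's project data.

```python
def project_factors(projects):
    """Equation (2): f = TPC / TEC for each (TEC, TPC) pair."""
    return [tpc / tec for tec, tpc in projects]


# Hypothetical (TEC, TPC) pairs in USD, spanning a wide range of project sizes.
hypothetical_projects = [
    (80_000, 400_000),
    (5_000_000, 15_000_000),
    (150_000_000, 300_000_000),
]
factors = project_factors(hypothetical_projects)
print(factors)
```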
4. Data Analysis.
The data must be analyzed before they are used to derive the Lang factor. Data analysis
includes understanding the data and the relationships between the variables. Table 3 shows
descriptive statistics for the factor f.
Table 3. Descriptive Statistics for the Factor f.
Statistic Value
Mean 3.298
Standard Error 0.293
Median 2.865
Standard Deviation 1.580
Sample Variance 2.495
Kurtosis 3.035
Skewness 1.673
Range 7.121
Minimum 1.221
Maximum 8.342
Confidence Level (95%) 0.601
The factor f ranges from 1.221 to 8.342, with a mean of 3.298 and a standard deviation of
1.580; the 95% confidence interval for the mean has a half-width of 0.601. This shows the wide
variability of the data. The kurtosis and skewness values of 3.035 and 1.673, respectively, indicate that the
distribution is not symmetric and has a long tail to the right (right skew).
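The statistics discussed above can be computed as in the following sketch. It assumes the adjusted sample formulas for skewness and excess kurtosis that common spreadsheet tools use; the paper does not state which formulas produced Table 3, so that choice and the input values are assumptions.

```python
from math import sqrt
from statistics import mean, median, stdev


def describe(values):
    """Descriptive statistics with adjusted sample skewness and excess kurtosis."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 values for the sample kurtosis")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    z3 = sum(((x - m) / s) ** 3 for x in values)
    z4 = sum(((x - m) / s) ** 4 for x in values)
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": m,
        "median": median(values),
        "std_dev": s,
        "std_error": s / sqrt(n),
        "skewness": skewness,
        "kurtosis": kurtosis,
        "range": max(values) - min(values),
    }


# Hypothetical factors with one large value, giving a right-skewed sample.
summary = describe([1.3, 2.1, 2.6, 2.9, 3.4, 4.0, 8.2])
print(summary)
```

A positive skewness together with a mean above the median is the pattern Table 3 reports for the project data.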
Other aspects to analyze are the accuracy, precision and reliability of the data.
The standard deviation of 1.580 also indicates that the precision of the data is very low
(wide spread). The accuracy of the data cannot be analyzed unless the
true value is known. To assess the reliability of the data, an outlier check using the Q-test was
conducted for the rightmost value, namely 8.342. The result was Q_calculated = 0.214 < Q_critical = 0.298 at the
95% confidence level, which indicates that the value is not an outlier, so the data are considered reliable.
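The outlier check can be sketched as below, assuming the usual form of the Q-test in which the gap between the suspect value and its nearest neighbor is divided by the range of the data. The intermediate values in the list are hypothetical; only the minimum, the maximum and the critical value 0.298 are taken from the text above.

```python
def dixon_q_largest(values):
    """Q statistic for the largest value: gap to its neighbor divided by the data range."""
    ordered = sorted(values)
    if len(ordered) < 3:
        raise ValueError("need at least 3 values")
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical")
    gap = ordered[-1] - ordered[-2]
    return gap / data_range


def is_outlier(q_calculated, q_critical):
    """The suspect value is flagged only when Q exceeds the critical value."""
    return q_calculated > q_critical


# Hypothetical sample whose extremes match Table 3 (1.221 and 8.342).
hypothetical_factors = [1.221, 2.4, 2.865, 3.9, 6.818, 8.342]
q = dixon_q_largest(hypothetical_factors)
print(round(q, 3), is_outlier(q, q_critical=0.298))
```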
Because the data distribution is not symmetric, a Monte Carlo simulation is conducted
first to obtain normally distributed data, from which the Lang factor is then derived.
A further analysis identifies whether there is any relationship between the variables, in this case
between TEC as the independent variable and the factor f as the dependent
variable. For this purpose, a Pearson correlation test was conducted; the resulting Pearson
correlation coefficient, r = -0.244, shows a very low correlation between the two variables.
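For reference, the Pearson correlation coefficient can be computed directly from its definition, as in this sketch. The TEC and f values are hypothetical and only illustrate a weak negative relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical TEC values (USD) and factors f.
tec = [80_000, 2_000_000, 15_000_000, 60_000_000, 150_000_000]
f = [5.1, 2.4, 3.9, 2.7, 2.2]
print(round(pearson_r(tec, f), 3))
```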
A logarithmic relationship was also examined, using the transformed variables
$$LNTEC = \ln(TEC) \tag{3}$$
$$LNf = \ln(f) \tag{4}$$
The Pearson correlation test on the transformed data gave a
correlation coefficient of r = -0.40, which shows a medium correlation.
Another possible relationship between TEC and f is a polynomial form. The Pearson
correlation test was conducted for several polynomial forms, as shown in Table 4.
Table 4: Pearson Correlation Test for Polynomial Relationship
As shown in the table, the correlations between TEC and f in polynomial forms are also very
low.
From the descriptive statistics of the factor f in Table 3, at first glance we may conclude that the
Lang factor f is:
$$f = 3.298 \tag{6}$$
The value 3.298 is the mean of the data. However, because the data deviate somewhat from a normal
distribution, it is better to first run a Monte Carlo simulation on the data and then use
the simulated data to obtain the Lang factor.
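The paper does not describe how the simulation was set up. One plausible reading, given that the goal is normally distributed data, is that 1000 values were drawn from a normal distribution with the mean and standard deviation of Table 3. The sketch below follows that assumption; the function name and the seed are hypothetical.

```python
import random
from statistics import mean, stdev


def simulate_factor(mean_f, sd_f, n=1000, seed=42):
    """Draw n values of f from a normal distribution (an assumed simulation setup)."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    return [rng.gauss(mean_f, sd_f) for _ in range(n)]


# Mean and standard deviation of f taken from Table 3.
samples = simulate_factor(3.298, 1.580)
print(len(samples), round(mean(samples), 3), round(stdev(samples), 3))
```

Because a normal distribution is unbounded, a simulation of this kind can produce values of f below 1 or even below zero, which have no physical meaning for a cost factor.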
Table 5 and Figure 4 show the results of the Monte Carlo simulation.
Table 5. Descriptive Statistics for the Factor f after Monte Carlo Simulation.
Statistic Value
Mean 3.264
Standard Error 0.049
Median 3.299
Standard Deviation 1.560
Sample Variance 2.435
Kurtosis 0.096
Skewness 0.003
Range 10.418
Minimum -1.726
Maximum 8.691
Count 1000
Confidence Level (95%) 0.097
[Figure 4: histogram of the factor f after Monte Carlo simulation, showing frequency and cumulative percentage by bin]
As shown in Table 5 and Figure 4, the simulated data range from -1.726 to 8.691. For
the factor f, a value less than 1 has no meaning, because it would imply a total plant cost below
the equipment cost, so such values should be excluded from the data. They are excluded by cutting
both the left and right tails of the data. Because the cumulative probability at a factor value
of 1 is 7.3%, the data are cut by 7.3% on both the left and the right.
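The tail cutting can be sketched in two steps: find the cumulative probability at f = 1, then drop that fraction of the sorted data from each end. The sketch assumes the probability comes from a normal distribution with the mean and standard deviation of Table 5; the function names and the stand-in data are hypothetical.

```python
from statistics import NormalDist


def tail_probability(mean_f, sd_f, threshold=1.0):
    """Cumulative probability of f <= threshold under an assumed normal distribution."""
    return NormalDist(mean_f, sd_f).cdf(threshold)


def trim_tails(samples, fraction):
    """Remove the same fraction of values from both ends of the sorted data."""
    ordered = sorted(samples)
    k = round(len(ordered) * fraction)
    return ordered[k:len(ordered) - k]


# Mean and standard deviation after simulation, from Table 5.
p = tail_probability(3.264, 1.560)
print(f"P(f <= 1) = {p:.1%}")

# Hypothetical stand-in data: 100 evenly spaced values, trimmed by 10% per tail.
trimmed = trim_tails(list(range(100)), 0.10)
print(len(trimmed), trimmed[0], trimmed[-1])
```

Cutting both tails symmetrically, rather than only the left one, keeps the trimmed distribution centered near the original mean.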
Table 6 and Figure 5 show the descriptive statistics and histogram of the data after the
7.3% tail cutting.
Table 6. Descriptive Statistics for the Factor f after 7.3% left and right tail cutting.
Statistic Value
Mean 3.282
Standard Error 0.038
Median 3.323
Standard Deviation 1.123
Sample Variance 1.261
Kurtosis -0.849
Skewness -0.036
Range 4.521
Minimum 1.005
Maximum 5.526
Count 855
Confidence Level (95%) 0.075
Figure 5, Histogram for the Factor, f after 7.3% left and right tail cutting.
As shown in Table 6, the mean of the data is 3.282 and the standard deviation is 1.123, with
a standard error of 0.038, a significant improvement over the standard error of 0.293 in Table 3. The
data distribution also improved significantly, as indicated by the
kurtosis and skewness values. Hence the Lang factor f is:
$$f = 3.282 \tag{7}$$
We call this factor the first model of the Lang factor. Next, we tried to obtain a second model
of the Lang factor from the logarithmic relationship mentioned above. For this purpose, regression
analysis was conducted with LNTEC from equation (3) as the independent variable and LNf from
equation (4) as the dependent variable, with the results shown in Tables 7a, 7b, and 7c.
df SS MS F Significance F
Regression 1 0.792 0.792 5.13 0.032
Residual 27 4.173 0.155
Total 28 4.965
From the regression statistics table, the explanatory power of the model is low, as indicated by the R Square value of
0.160, which means that only 16% of the variance in the dependent variable is explained by the
regression model. Based on the ANOVA table, the model is significant at the 0.05 level. From
the coefficients table, the following linear model is obtained:
$$LNf = 2.469 - 0.09 \times LNTEC \tag{8}$$
The coefficients table also indicates that the LNTEC variable is significant at the 0.05 level.
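The R Square and F values quoted above follow from the sums of squares in the ANOVA table. The sketch below recomputes them; the function name is hypothetical, and small differences from the reported F arise because the table entries are rounded.

```python
def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Return (R squared, F statistic) from regression ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    r_squared = ss_regression / ss_total
    f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)
    return r_squared, f_statistic


# Sums of squares and degrees of freedom from the ANOVA table.
r2, f_stat = anova_summary(0.792, 4.173, 1, 27)
print(f"R squared = {r2:.3f}, F = {f_stat:.2f}")
```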
Equation (8) can be rewritten as:
$$\ln f = \ln(11.807) + \ln\left(TEC^{-0.09}\right) \tag{9a}$$
or
$$\ln f = \ln\left(11.807 \times TEC^{-0.09}\right) \tag{9b}$$
so that
$$f = 11.807 \times TEC^{-0.09} \tag{9c}$$
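The logarithmic model can be sketched in two parts: an ordinary least squares fit of ln f against ln TEC, and the resulting power-law form f = 11.807 × TEC^(-0.09). The function names and the TEC values are hypothetical; the fit is run on synthetic points generated from the model itself, purely to show that the procedure recovers the coefficients.

```python
from math import exp, log


def fit_log_log(tecs, factors):
    """Least squares fit of ln f = a + b * ln TEC; returns (a, b)."""
    xs = [log(t) for t in tecs]
    ys = [log(f) for f in factors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def lang_factor_log(tec, coefficient=11.807, exponent=-0.09):
    """Power-law form of the fitted model: f = coefficient * TEC ** exponent."""
    return coefficient * tec ** exponent


# Synthetic check: points generated from the model should return its coefficients.
tecs = [1e5, 1e6, 1e7, 1e8]
a, b = fit_log_log(tecs, [lang_factor_log(t) for t in tecs])
print(round(exp(a), 3), round(b, 3))
print(round(lang_factor_log(1_000_000), 2))  # factor for a hypothetical TEC of USD 1 million
```

Under this model the factor falls slowly as the equipment cost grows, which matches the negative correlation found for the transformed data.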
A third model for the Lang factor is obtained from the polynomial
relationship. Table 8 contains several polynomial models obtained from regression
analyses.
Table 8, Polynomial Model for Lang Factor
Order   Equation   R Squared
1st   $f = 3.487 - 9.9 \times 10^{-9}\,TEC$   0.060
2nd   $f = 3.598 - 2.4 \times 10^{-8}\,TEC + 9.0 \times 10^{-17}\,TEC^{2}$   0.076
3rd   $f = 3.865 - 9.4 \times 10^{-8}\,TEC + 1.5 \times 10^{-15}\,TEC^{2} - 5.7 \times 10^{-24}\,TEC^{3}$   0.143
4th   $f = 4.163 - 2.2 \times 10^{-7}\,TEC + 6.8 \times 10^{-15}\,TEC^{2} - 6.9 \times 10^{-23}\,TEC^{3} + 2.1 \times 10^{-31}\,TEC^{4}$   0.198
5th   $f = 4.455 - 3.9 \times 10^{-7}\,TEC + 2.0 \times 10^{-14}\,TEC^{2} - 4.1 \times 10^{-22}\,TEC^{3} + 3.3 \times 10^{-30}\,TEC^{4} - 8.8 \times 10^{-39}\,TEC^{5}$   0.243
6th   $f = 4.602 - 5.0 \times 10^{-7}\,TEC + 3.5 \times 10^{-14}\,TEC^{2} - 1.0 \times 10^{-21}\,TEC^{3} + 1.5 \times 10^{-29}\,TEC^{4} - 9.3 \times 10^{-38}\,TEC^{5} + 2.2 \times 10^{-46}\,TEC^{6}$   0.225
7th   $f = 5.074 - 1.0 \times 10^{-6}\,TEC + 1.5 \times 10^{-13}\,TEC^{2} - 8.9 \times 10^{-21}\,TEC^{3} + 2.6 \times 10^{-28}\,TEC^{4} - 3.7 \times 10^{-36}\,TEC^{5} + 2.5 \times 10^{-46}\,TEC^{6} - 5.9 \times 10^{-53}\,TEC^{7}$   0.338
8th   $f = 4.960 - 8.5 \times 10^{-7}\,TEC + 9.2 \times 10^{-14}\,TEC^{2} - 3.2 \times 10^{-20}\,TEC^{3} - 1.4 \times 10^{-29}\,TEC^{4} + 3.0 \times 10^{-36}\,TEC^{5} - 6.0 \times 10^{-44}\,TEC^{6} + 4.6 \times 10^{-52}\,TEC^{7} - 1.2 \times 10^{-60}\,TEC^{8}$   0.343
9th   $f = 4.987 - 9.1 \times 10^{-7}\,TEC + 1.1 \times 10^{-13}\,TEC^{2} - 6.2 \times 10^{-21}\,TEC^{3} + 1.9 \times 10^{-28}\,TEC^{4} - 4.5 \times 10^{-36}\,TEC^{5} + 9.1 \times 10^{-44}\,TEC^{6} - 1.2 \times 10^{-51}\,TEC^{7} + 8.4 \times 10^{-60}\,TEC^{8} - 2.0 \times 10^{-67}\,TEC^{9}$   0.343
10th   $f = 4.996 - 9.3 \times 10^{-7}\,TEC + 1.3 \times 10^{-13}\,TEC^{2} - 8.7 \times 10^{-21}\,TEC^{3} + 4.3 \times 10^{-28}\,TEC^{4} - 1.8 \times 10^{-35}\,TEC^{5} + 5.0 \times 10^{-43}\,TEC^{6} - 8.8 \times 10^{-51}\,TEC^{7} + 8.7 \times 10^{-59}\,TEC^{8} - 4.4 \times 10^{-67}\,TEC^{9} + 8.7 \times 10^{-76}\,TEC^{10}$   0.343
The table above also shows that the explanatory power of the models is low, as indicated by
R Square values ranging from 0.060 to 0.343.
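As a small illustration of how a model from Table 8 would be applied, the sketch below evaluates the 2nd order polynomial with Horner's rule. The function name and the TEC of USD 100 million are hypothetical.

```python
def evaluate_polynomial(coefficients, tec):
    """Horner evaluation; coefficients are ordered from the constant term upward."""
    result = 0.0
    for c in reversed(coefficients):
        result = result * tec + c
    return result


# 2nd order model from Table 8: f = 3.598 - 2.4e-8 * TEC + 9.0e-17 * TEC^2
SECOND_ORDER = [3.598, -2.4e-8, 9.0e-17]

f_100m = evaluate_polynomial(SECOND_ORDER, 100_000_000)
print(round(f_100m, 3))
```

Horner's rule avoids forming large powers of TEC explicitly, which matters more for the higher order models, whose coefficients are extremely small.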
6. Model Comparison.
So far, three models have been obtained for the Lang factor: the Lang factor as a constant, as
shown in equation (7); the Lang factor as a logarithmic function of TEC, as shown in equation (9c);
and the Lang factor as a polynomial function of TEC, as shown in Table 8. Even though the
models explain little of the variance, it is still necessary to compare their calculation errors
using the 29 pairs of fitted data and the following equations:
$$\text{Individual Calculation Error (\%)} = \frac{\text{Exact Value} - \text{Estimated Value}}{\text{Exact Value}} \times 100\% \tag{10}$$
and
$$\text{Average Calculation Error (\%)} = \frac{1}{n} \sum \frac{|\text{Exact Value} - \text{Estimated Value}|}{\text{Exact Value}} \times 100\% \tag{11}$$
The “|” symbol denotes the absolute value, and n is the number of data points, that is, 29.
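Equations (10) and (11) can be sketched as follows, assuming the average is the mean of the absolute relative errors over the n data pairs. The function names and the sample values are hypothetical.

```python
def individual_error_pct(exact, estimated):
    """Equation (10): signed error as a percentage of the exact value."""
    return (exact - estimated) / exact * 100.0


def average_error_pct(exact_values, estimated_values):
    """Equation (11): mean of the absolute relative errors, in percent."""
    if not exact_values or len(exact_values) != len(estimated_values):
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(exact - est) / exact
                for exact, est in zip(exact_values, estimated_values))
    return total / len(exact_values) * 100.0


# Hypothetical exact and estimated total plant costs.
exact = [100.0, 200.0]
estimated = [90.0, 220.0]
print([individual_error_pct(e, s) for e, s in zip(exact, estimated)])
print(average_error_pct(exact, estimated))
```

The individual errors keep their sign, so underestimates and overestimates can be told apart, while the average uses absolute values so that they do not cancel.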
The average calculation errors are 38%, 31% and 38% for the constant Lang factor, the
logarithmic function and the 2nd order polynomial Lang factor, respectively.
In addition, it is necessary to derive the Lang factor for each refinery unit. Because of the limited
data, the factor is derived only for the first model. Using the same procedure used to
obtain the Lang factor for all refinery units together, the first model of the Lang factor for each refinery
unit was derived, as shown in Table 10.
Table 10, Lang Factor for Each Refinery Unit
The data in Table 10 show that the Lang factor differs between the refinery units. The
smallest Lang factor, 2.497, is for Unit A, while the largest, 5.260, is for Unit E.
8. Conclusion.
Using the Company's own data, a fluid plant Lang factor of
3.282 has been obtained (see equation (7)). This value is derived from 29 pairs of data with a wide range of
factors, namely 1.221 to 8.342.
The calculated fluid plant Lang factor also varies between refinery units, as shown in Table 10.
An attempt was also made to derive the Lang factor as a function of TEC, in logarithmic and
polynomial forms; however, the results show that the correlation and explained variance of these models are very
low, as indicated by the low values of both the correlation coefficient and R-square.
The individual calculation errors of the obtained Lang factor are widely spread, with an average of
about 30% to 40%. As shown in Figure 6, the individual calculation errors (for the constant Lang
factor) are spread across all classes of estimate, and some even fall outside the Class 5 range;
however, most of them lie in the Class 4 and Class 5 ranges, as does their average.
This VALIDATES using Lang factors for Class 4 and 5 estimates. However, based
on this research, the original Lang factor is no longer valid and needs to be revised, and the
use of a single Lang factor is not recommended.
Proposed Revisions.
A comparison of the Company's Lang factor of 3.282 with previously
obtained Lang factors shows that all of the previously obtained Lang factors are higher than the
Company's Lang factor, as shown in Table 11.
Table 11, Benchmarking of Company Owned Lang Factor and Others
All of this indicates that the Lang factor exhibits a high degree of variability; therefore it is
not recommended for creating highly accurate, reliable or precise cost estimates other than
Class 5 or Class 4 estimates.
Footnote 1: It is assumed that the Lang factor is used during the project initiation phase, where the degree of project definition is below 10%.
Finally, because of the wide variation in the Lang factor noted above, it is better to express it as a
range instead of as a single value.
Taking one standard deviation either side of the mean in Table 6 (3.282 ± 1.123), the Lang factor f
lies in the range of 2.159 to 4.405.
Alternatively, taking the range from P30 to P90, the Lang factor f lies in the range of
2.693 to 4.721.
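Both ranges can be reproduced under the assumption that f follows a normal distribution with the mean and standard deviation of Table 6. The paper does not state how the percentiles were obtained, so this is a sketch under that assumption, not a record of the original calculation.

```python
from statistics import NormalDist

# Mean and standard deviation of f after tail cutting, from Table 6.
dist = NormalDist(mu=3.282, sigma=1.123)

# Range of one standard deviation either side of the mean.
one_sigma_range = (dist.mean - dist.stdev, dist.mean + dist.stdev)

# Range from the 30th to the 90th percentile of the assumed normal distribution.
p30_p90_range = (dist.inv_cdf(0.30), dist.inv_cdf(0.90))

print([round(x, 3) for x in one_sigma_range])
print([round(x, 3) for x in p30_p90_range])
```

A range that runs from P30 to P90 is deliberately asymmetric about the mean, leaving more room on the high side than on the low side.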
References
1. AACE International, Recommended Practice No. 46R-11. (2013). Required Skills And
Knowledge of Project Cost Estimating, Page 1 of 21, AACE International, Morgantown, WV.
2. US Government, Department of Energy (DOE). (2011). DOE G 413.3-21 Cost Estimation
Guide, Page 18.
3. US Government, Department of Energy (DOE). (2011). DOE G 413.3-21 Cost Estimation
Guide, Page 15.
4. US Government, Department of Energy (DOE). (2011). DOE G 413.3-21 Cost Estimation
Guide, Page 16.
5. AACE International. (2011). Recommended Practice No. 59R-10, Development of Factored
Cost Estimates – As applied In Engineering, Procurement, And Construction For The Process
Industries. AACE International. Morgantown, WV.
6. Wolf, T.E. (2013). Lang Factor Cost Estimates. Retrieved
from http://prjmgrcap.com/langfactorestimating.html.
7. AACE International. (2011). Recommended Practice No. 59R-10, Development of Factored
Cost Estimates – As applied In Engineering, Procurement, And Construction For The Process
Industries. AACE International. Morgantown, WV.
8. Wolf, T.E. (2013). Lang Factor Cost Estimates. Retrieved
from http://prjmgrcap.com/langfactorestimating.html.
9. Dysert, L.R., (2003). Sharpen Your Cost Estimating Skills. Cost Engineering Vol. 45/No.6 June
2003, page 25.
10. Dysert, L.R., (2003). Sharpen Your Cost Estimating Skills. Cost Engineering Vol. 45/No.6 June
2003, page 25.
11. AACE International. (2011). Recommended Practice No. 59R-10, Development of Factored
Cost Estimates – As applied In Engineering, Procurement, And Construction For The Process
Industries. AACE International. Morgantown, WV.
12. Perry, Robert H, Don W. Green, and James O. Maloney. (1997). Perry's Chemical Engineers'
Handbook, New York: McGraw-Hill, Seventh Edition April 1997, page 9-68.
13. Dysert, L.R., (2003). Sharpen Your Cost Estimating Skills. Cost Engineering Vol. 45/No.6 June
2003, page 25.
14. Wolf, T.E. (2013). Lang Factor Cost Estimates. Retrieved
from http://prjmgrcap.com/langfactorestimating.html.
15. Dysert, L.R., (2003). Sharpen Your Cost Estimating Skills. Cost Engineering Vol. 45/No.6 June
2003, page 25.
Bibliography
1. AACE International, Recommended Practice No. 46R-11. (2013). Required Skills And
Knowledge of Project Cost Estimating, AACE International, Morgantown, WV.
2. AACE International, Recommended Practice No. 17R-97. (2011). Cost Estimate Classification
System, AACE International, Morgantown, WV.
3. AACE International. (2011). Recommended Practice No. 59R-10, Development of Factored
Cost Estimates – As applied In Engineering, Procurement, And Construction For The Process
Industries, AACE International. Morgantown, WV.
4. US Government, Department of Energy (DOE). (2011). DOE G 413.3-21 Cost Estimation
Guide.
5. Wolf, T.E. (2013). Lang Factor Cost Estimates. Retrieved
from http://prjmgrcap.com/langfactorestimating.html.
6. Wolf, T.E. (2013). Escalation Treatment Validation. Retrieved
from http://prjmgrcap.com/estimatesescalationrelationships.html
7. W. Andreas, W. Wahyu. (2007). Capacity Factor Based Cost Models for Buildings of Various
Functions. Civil Engineering Dimension, Vol. 9, No. 2, 70-76, September 2007.
8. Harrison R. (2010). Capital Project Cost Estimation in the Phosphate Industry, Presented June
12, 2010 at the Annual Clearwater Conference of the Central Florida Chapter of the American
Institute of Chemical Engineers.
9. Dysert, L.R. (2003). Sharpen Your Cost Estimating Skills. Cost Engineering Vol. 45/No.6 June
2003.
10. Perry, Robert H, Don W. Green, and James O. Maloney. (1997). Perry's Chemical Engineers'
Handbook, New York: McGraw-Hill, Seventh Edition April 1997.
11. Office.microsoft.com. Introduction to Monte Carlo simulation. Retrieved from
http://office.microsoft.com/en-us/excel-help/introduction-to-monte-carlo-simulation-HA001111893.aspx.
12. Support.microsoft.com. Excel statistical functions: PEARSON. Retrieved from
http://support.microsoft.com/kb/828129.
13. Laerd Statistics. Pearson Product-Moment Correlation. Retrieved
from https://statistics.laerd.com/statistical-guides/pearson-correlation-coefficient-statistical-guide.php.
14. Archive.bio.ed.ac.uk. Correlation, and regression analysis for curve fitting. Retrieved
from http://archive.bio.ed.ac.uk/jdeacon/statistics/tress11.html.
15. home.ubalt.edu. Statistical Data Analysis. Retrieved from
http://home.ubalt.edu/ntsbarsh/stat-data/Topics.htm.
16. mathsisfun.com. (2013). Percentage Error. Retrieved from
http://www.mathsisfun.com/numbers/percentage-error.html.