
DEPARTMENT OF CHEMICAL ENGINEERING

CH 544 : MULTIVARIATE DATA ANALYSIS


Assignment 2

1. To study the implications of the various assumptions in OLS and TLS, we can
simulate data (using MATLAB) for the univariate linear regression model Yt = 4Xt.
Generate a sample of 100 measurements corresponding to different pairs (Yt, Xt). The
errors in Y and X are both assumed to be normally distributed with mean 0 and variance
0.25. Assume that the true values Xt are drawn from a uniform distribution between 1
and 4. Estimate the slope of the regression line assuming (i) X is measured without error,
(ii) Y is measured without error, and (iii) both are measured with error and have the same
error variance. Plot the three estimated regression lines along with the true line (for
visual inspection). For case (iii), note the smallest singular value of the data matrix, and
study the effect of the sample size N on the estimated slope as well as on the smallest
singular value. Report your conclusions.
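
The three estimates in Problem 1 can be sketched as follows. The assignment asks for MATLAB; this is a minimal Python sketch of the same computation, not a model solution. It assumes the fitted line passes through the origin (as the true line Yt = 4Xt does), so no intercept is estimated, and it uses the closed-form eigenvalue of the 2x2 matrix Z'Z (with Z = [x y]) in place of a full SVD. The function names `simulate` and `estimate_slopes` are illustrative, and plotting is left out.

```python
import math
import random

def simulate(n, slope=4.0, sigma=0.5, seed=0):
    """Draw n noisy (x, y) pairs around the true line y = slope * x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x_true = rng.uniform(1.0, 4.0)                     # true X ~ U(1, 4)
        xs.append(x_true + rng.gauss(0.0, sigma))          # error variance sigma^2 = 0.25
        ys.append(slope * x_true + rng.gauss(0.0, sigma))  # same variance in Y
    return xs, ys

def estimate_slopes(xs, ys):
    """Return (b_i, b_ii, b_iii, sigma_min) for a line through the origin."""
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b_i = sxy / sxx     # (i) X error-free: OLS of y on x
    b_ii = syy / sxy    # (ii) Y error-free: OLS of x on y, then inverted
    # (iii) TLS: smallest eigenvalue of Z'Z, whose eigenvector is normal to the line
    lam_min = 0.5 * (sxx + syy - math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))
    b_iii = (syy - lam_min) / sxy
    sigma_min = math.sqrt(max(lam_min, 0.0))   # smallest singular value of Z
    return b_i, b_ii, b_iii, sigma_min

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        b_i, b_ii, b_iii, s_min = estimate_slopes(*simulate(n))
        print(n, round(b_i, 3), round(b_ii, 3), round(b_iii, 3),
              round(s_min / math.sqrt(n), 3))
```

Printing the smallest singular value divided by sqrt(N) makes the effect of sample size easier to compare across runs, since the raw value grows with N.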

2. It is desired to analyze the cost of fabricating a plant in terms of labour and
engineering units in the plant for the following data.

Cost in 1000 Rs. (z)    Labour (x)    Engineering units (y)
310                     120           55
300                     130           50
275                     108           52
250                     110           42
220                     84            40
200                     90            30
190                     80            23
150                     55            12
140                     64            19
100                     50            10

The cost (z) is modeled as a linear function of x and y, that is, z = a0 + a1x + a2y.
Estimate the coefficients of this linear model using OLS or TLS, as the case may be, for
the following assumptions: (i) only z is measured with error; (ii) all three variables are
measured with error, and the error variances of all three measurements are equal. Solve
the problem by manual computation and verify the results using MATLAB.
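
A hedged sketch for checking the manual computation in Problem 2, written in Python rather than the MATLAB the assignment requests. For case (i) it solves the OLS normal equations. For case (ii) it assumes the usual treatment of an error-free intercept: centre the data, then take the eigenvector of the 3x3 scatter matrix with the smallest eigenvalue as the plane normal. That eigenvector is found here by inverse iteration, a stand-in for an SVD routine. The helper names `solve`, `ols` and `tls` are illustrative.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

z = [310, 300, 275, 250, 220, 200, 190, 150, 140, 100]   # cost, 1000 Rs.
x = [120, 130, 108, 110, 84, 90, 80, 55, 64, 50]         # labour
y = [55, 50, 52, 42, 40, 30, 23, 12, 19, 10]             # engineering units

def ols(z, x, y):
    """Case (i), only z in error: normal equations for [a0, a1, a2]."""
    rows = [[1.0, xi, yi] for xi, yi in zip(x, y)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(3)]
    return solve(ata, atz)

def tls(z, x, y, iters=200):
    """Case (ii), equal error variances: plane normal from the centred scatter matrix."""
    n = len(z)
    means = [sum(v) / n for v in (z, x, y)]
    d = [[zi - means[0], xi - means[1], yi - means[2]] for zi, xi, yi in zip(z, x, y)]
    s = [[sum(r[i] * r[j] for r in d) for j in range(3)] for i in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(iters):   # inverse iteration converges to the smallest eigenvector
        v = solve(s, v)
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    # normal (v0, v1, v2) gives v0*(z - zbar) + v1*(x - xbar) + v2*(y - ybar) = 0
    a1, a2 = -v[1] / v[0], -v[2] / v[0]
    a0 = means[0] - a1 * means[1] - a2 * means[2]
    return [a0, a1, a2]

if __name__ == "__main__":
    print("OLS [a0, a1, a2]:", ols(z, x, y))
    print("TLS [a0, a1, a2]:", tls(z, x, y))
```

The TLS plane passes through the centroid of the data by construction, which is a convenient check on the manual computation.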

3. Fitting VLE data using nonlinear regression: The following table gives experimental
VLE data for the water-dioxane system at 20 deg C at different pressures.

Mole fraction of water in liquid phase (x1)    Total pressure p (mm Hg)
0                                              28.1
0.1                                            34.4
0.2                                            36.7
0.3                                            36.9
0.4                                            36.8
0.5                                            36.7
0.6                                            36.5
0.7                                            35.4
0.8                                            32.9
0.9                                            27.7
1                                              17.5

The vapour phase is assumed to be an ideal gas mixture, while the non-ideal liquid
phase behaviour is modeled using the Van Laar equation, which is given by

\[
\ln\gamma_1 = A_{12}\left(\frac{A_{21}x_2}{A_{12}x_1 + A_{21}x_2}\right)^2
\]
\[
\ln\gamma_2 = A_{21}\left(\frac{A_{12}x_1}{A_{12}x_1 + A_{21}x_2}\right)^2
\]

The saturation pressures of water and dioxane at 20 deg C can be estimated using the
Antoine equation log10(psat) = a - b/(T + c), where psat is in mm Hg and T is in deg C.
The constants a, b and c are as follows:

Species a b c
Water 8.07131 1730.63 233.426
Dioxane 7.43155 1554.679 240.337
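
As a quick check on the constants above, the Antoine equation can be evaluated at 20 deg C and compared with the two pure-component rows of the data table (p = 17.5 mm Hg at x1 = 1 and p = 28.1 mm Hg at x1 = 0). This Python sketch assumes the logarithm is base 10, which is the convention these constants follow for mm Hg and deg C. The name `p_sat` is illustrative.

```python
# Antoine constants (a, b, c) from the table above
ANTOINE = {
    "water": (8.07131, 1730.63, 233.426),
    "dioxane": (7.43155, 1554.679, 240.337),
}

def p_sat(species, t_celsius):
    """Saturation pressure in mm Hg, assuming log10(psat) = a - b / (T + c)."""
    a, b, c = ANTOINE[species]
    return 10.0 ** (a - b / (t_celsius + c))

p1_sat = p_sat("water", 20.0)     # compare with the measured 17.5 mm Hg at x1 = 1
p2_sat = p_sat("dioxane", 20.0)   # compare with the measured 28.1 mm Hg at x1 = 0
print(round(p1_sat, 2), round(p2_sat, 2))
```

The computed values come out at roughly 17.5 and 28.8 mm Hg. The water value matches the table closely, while the dioxane value sits somewhat above the measured 28.1 mm Hg, which is relevant to case (iii) below.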

Obtain the best-fit values of the constants of the Van Laar model for the following
cases: (i) Assume that the saturation pressures estimated from the Antoine equation
and the liquid-phase mole fraction measurements are without error (use nonlinear
OLS). (ii) Assume that the experimentally measured pressures and liquid mole
fractions both contain error. The standard deviation of the error in the mole fraction
measurements is 0.002, while the standard deviation of the error in the pressure
measurements is 0.05 mm Hg. (iii) Assume that, in addition to the errors in the measured
pressure and liquid-phase mole fraction, the saturation pressures estimated from the
Antoine equation have an error whose standard deviation is 0.1 mm Hg. For all
three cases use MATLAB to obtain the estimates.
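
Case (i) can be sketched as an ordinary nonlinear least-squares fit of the total pressure. The sketch below is in Python rather than MATLAB and rests on two assumptions not spelled out in the problem statement: the total pressure follows modified Raoult's law, p = x1 γ1 p1sat + x2 γ2 p2sat (implied by the ideal vapour phase), and the Antoine logarithm is base 10. The minimizer is a simple derivative-free compass search chosen to keep the example self-contained; it is a stand-in, not the method the assignment prescribes. The starting point (1, 1) and the function names are illustrative.

```python
import math

X1 = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
P = [28.1, 34.4, 36.7, 36.9, 36.8, 36.7, 36.5, 35.4, 32.9, 27.7, 17.5]
P1_SAT = 10.0 ** (8.07131 - 1730.63 / (20.0 + 233.426))    # water, mm Hg
P2_SAT = 10.0 ** (7.43155 - 1554.679 / (20.0 + 240.337))   # dioxane, mm Hg

def pressure(x1, a12, a21):
    """Total pressure (assumed modified Raoult's law) with Van Laar coefficients."""
    x2 = 1.0 - x1
    den = a12 * x1 + a21 * x2
    g1 = math.exp(a12 * (a21 * x2 / den) ** 2)
    g2 = math.exp(a21 * (a12 * x1 / den) ** 2)
    return x1 * g1 * P1_SAT + x2 * g2 * P2_SAT

def sse(params, xs=X1, ps=P):
    """Sum of squared pressure residuals; infinite where the model is undefined."""
    try:
        return sum((p - pressure(x, *params)) ** 2 for x, p in zip(xs, ps))
    except (ZeroDivisionError, OverflowError):
        return float("inf")

def compass_search(f, start, step=0.5, tol=1e-8):
    """Try +/- step along each coordinate; halve the step when nothing improves."""
    best, f_best = list(start), f(start)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            step *= 0.5
    return best, f_best

if __name__ == "__main__":
    (a12, a21), resid = compass_search(sse, [1.0, 1.0])
    print("A12 =", round(a12, 4), "A21 =", round(a21, 4), "SSE =", round(resid, 4))
```

Because the pure-component rows (x1 = 0 and x1 = 1) do not depend on A12 or A21 in this model, only the nine interior points actually constrain the fit.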
Note: For cases (ii) and (iii), set up a weighted TLS nonlinear regression problem
similar to the MLE approach. The resulting problem is a nonlinear constrained
optimization problem, which may be solved as such or converted to a nonlinear
unconstrained optimization problem. The unconstrained optimization problem
can also be solved using a built-in MATLAB function. You can also attempt to
solve the unconstrained optimization problem using successive linearization,
where the linearized problem can be solved using weighted PCA.
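
One possible way to write the weighted problem for case (ii) is given below. This is a sketch under stated assumptions, not the formulation the course necessarily intends: the hatted quantities are the unknown error-free values, the measurement errors are taken as independent, and the total pressure is assumed to follow modified Raoult's law with the Van Laar activity coefficients.

```latex
\[
\min_{A_{12},\,A_{21},\,\hat{x}_{1,i},\,\hat{p}_i}\;
\sum_{i=1}^{N}\left[
\frac{\left(x_{1,i}-\hat{x}_{1,i}\right)^2}{\sigma_x^2}
+\frac{\left(p_i-\hat{p}_i\right)^2}{\sigma_p^2}
\right]
\]
\[
\text{subject to}\quad
\hat{p}_i=\hat{x}_{1,i}\,\gamma_1(\hat{x}_{1,i})\,p_1^{sat}
+\left(1-\hat{x}_{1,i}\right)\gamma_2(\hat{x}_{1,i})\,p_2^{sat},
\qquad i=1,\dots,N
\]
```

Here σx = 0.002, σp = 0.05 mm Hg and N = 11. Substituting the constraint into the objective eliminates the estimated pressures and leaves an unconstrained problem in A12, A21 and the estimated mole fractions. For case (iii), a natural extension is to treat the two saturation pressures as additional unknowns and add their squared deviations from the Antoine values, weighted by 1/(0.1)^2; whether that error is shared across all data points is an assumption the problem statement leaves open.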
