Variable reduction
Some fields may have the same value for every observation. In
that case the standard deviation of the variable is zero. We
can remove such variables, as they cannot contribute anything
to the model. There may also be variables for which almost all
records (say, more than 98%) share the same value. We should
not use these variables either, as they cannot contribute much
to the model. Calculating the percentiles of each variable,
together with its minimum and maximum values, helps identify
such variables.
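As a rough illustration of the check described above, the following
Python sketch flags variables whose most frequent value covers more
than a chosen share of the records. It measures the share of the most
common value directly instead of comparing percentiles. The function
name `near_constant_columns`, the 0.98 default and the toy data are
assumptions made for this example only.

```python
from collections import Counter


def near_constant_columns(table, threshold=0.98):
    """Return the names of columns whose most frequent value accounts
    for more than `threshold` of the records.

    `table` maps a column name to a list of values. A fully constant
    column has a share of 1.0 and is therefore flagged as well.
    """
    flagged = []
    for name, values in table.items():
        if not values:
            continue  # nothing to judge for an empty column
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) > threshold:
            flagged.append(name)
    return flagged


# Hypothetical toy data: 100 records per column.
data = {
    "constant": [1] * 100,               # same value everywhere
    "almost_constant": [0] * 99 + [5],   # 99% of records share one value
    "varied": list(range(100)),          # every value is different
}
flagged = near_constant_columns(data)
print(flagged)
```

Columns returned by the function are candidates for removal before
modelling; the threshold can be tightened or relaxed to suit the data.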
● ● ●
Data mining methods simplify the extraction of key insights
from a huge database. They offer the possibility of starting
the analysis from any given point in it. However, without
proper methods and techniques we may never be able to do so.
Variable reduction techniques greatly help both in handling
huge data and in reducing model development time. And in the
bargain, all of this is accomplished without sacrificing the
quality of the model. Identifying the right technique becomes
all the easier with a better understanding of the data.

Method of Correlation
This technique is very useful in prediction problems where we
have one response variable and a set of predictors. We can
first use either of the two methods above to reduce the number
of predictors in stage one, and then use this method for
further reduction. Let us consider a response variable Y and
predictors X1, X2, … , Xn. Calculate the correlation matrix for
all predictors together with Y. We then impose a threshold,
say r, on the correlation between two predictors: when it is
exceeded, we keep only one of the two. Specifically, if r_ij,
the correlation between Xi and Xj, is greater than r, we keep
Xi if r_yi > r_yj, where r_yi is the correlation between Y and
Xi; otherwise we keep Xj. In practice r is generally chosen in
the range 0.75 to 0.9.
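One way the correlation rule could be applied is sketched below in
plain Python. It is a greedy reading of the rule rather than a
definitive implementation: predictors are visited in decreasing order
of their correlation with Y, and a predictor is dropped when it is too
highly correlated with one that has already been kept. The use of
absolute correlations, the helper names `pearson` and
`correlation_filter`, the threshold r = 0.8 and the toy data are all
assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences.
    Assumes neither sequence is constant (non-zero variance)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_filter(predictors, y, r=0.8):
    """From every pair of predictors whose absolute correlation exceeds
    r, keep only the one more strongly correlated with the response y."""
    strength = {name: abs(pearson(col, y)) for name, col in predictors.items()}
    kept = []
    # Strongest link with y first, so that in any highly correlated
    # pair the stronger predictor is examined (and kept) first.
    for name in sorted(predictors, key=strength.get, reverse=True):
        if all(abs(pearson(predictors[name], predictors[k])) <= r for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: x2 is nearly a multiple of x1.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 13],
    "x3": [1, -1, 1, -1, 1, -1],
}
y = [2, 1, 4, 3, 6, 5]
kept = correlation_filter(predictors, y, r=0.8)
print(kept)
```

In this toy data x1 and x2 are almost perfectly correlated, and x1 has
the slightly stronger correlation with y, so x2 is the one dropped.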
Note: If you feel that you still have too many variables and
need to reduce them before building the actual model, you can
do so on the basis of the VIF (variance inflation factor) value
of each predictor, obtained from the regression of Y on the
predictors. Remove variables with a VIF higher than 2.5 one at
a time, recalculating the VIF values after each removal.

With techniques like these we, at Cequity, are able to combine
data & technology, and build actionable analytical marketing
services to accelerate ROI-driven, real-time customer-engaged
marketing. Touch base with us to learn more…
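The VIF-based pruning mentioned in the note on the method of
correlation can be sketched as follows. The sketch relies on the
standard result that the VIF of each predictor equals the
corresponding diagonal element of the inverse of the predictors'
correlation matrix, so no regression routine is needed. The helper
names, the toy data and the small Gauss-Jordan inverter are
assumptions for illustration; the limit of 2.5 comes from the note.
The code also assumes that no two predictors are exactly collinear.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial
    pivoting. Assumes the matrix is non-singular."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for row in range(n):
            if row != col:
                f = aug[row][col]
                aug[row] = [v - f * w for v, w in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def vif(predictors):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    names = list(predictors)
    corr = [[pearson(predictors[a], predictors[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def prune_by_vif(predictors, limit=2.5):
    """Remove the predictor with the highest VIF, one at a time, until
    every remaining VIF is at or below `limit`."""
    remaining = dict(predictors)
    while len(remaining) > 1:
        scores = vif(remaining)
        worst = max(scores, key=scores.get)
        if scores[worst] <= limit:
            break
        del remaining[worst]  # VIFs are recomputed on the next pass
    return list(remaining)


# Hypothetical toy data: x2 is nearly a multiple of x1.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 13],
    "x3": [1, -1, 1, -1, 1, -1],
}
kept = prune_by_vif(predictors)
print(kept)
```

Because the VIF values depend only on the predictors, the response Y
does not appear in the sketch; it enters only when the final model is
fitted on the predictors that remain.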
Reach us at 105-106, 1st Floor, Anand Estate, 189-A, Sane Guruji Marg, Mahalaxmi, Mumbai-400 011, India
Phone: +91 22-43453800 Fax: +91 22-43453840
For more case studies, white papers and presentations log on to www.cequitysolutions.com
Or Write to info@cequitysolutions.com
For the latest thinking in Analytical Marketing, check out our blog at blog.cequitysolutions.com