
6.2 Using the Multiple Regression Equation

 朗朗xl 2017-02-25



Using the Multiple Regression Equation

The multiple regression equation is used to obtain a ŷ value when particular values of the two or more X variables in the equation are given. You may interpret ŷ as an estimate of the mean of the subpopulation of Y values assumed to exist for a particular combination of Xi values.

Under this interpretation, ŷ is called an estimate, its equation is called the estimation equation, and the corresponding interval is called a confidence interval. Under the second interpretation, ŷ is the value that Y is most likely to assume for the given values of Xi. In this case ŷ is called the predicted value of Y, the equation is called the prediction equation, and the corresponding interval is called a prediction interval.

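The 100(1-α)% confidence interval for the mean of a subpopulation of Y values, given particular values of the Xi, is as follows:

$$\hat{y} \pm t_{(1-\alpha/2),\, n-k-1}\; s_{\hat{y}}$$

where $s_{\hat{y}}$ is the standard error of ŷ as an estimate of the subpopulation mean, n is the number of observations, and k is the number of X variables. The 100(1-α)% prediction interval for a particular value of Y, given particular values of the Xi, is as follows:

$$\hat{y} \pm t_{(1-\alpha/2),\, n-k-1}\; s'_{\hat{y}}$$

where $s'_{\hat{y}}$ is the standard error of the prediction.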
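The text does not say how the two standard errors are computed. The following is a minimal sketch, assuming an ordinary least-squares fit with an intercept, where the residual variance is s² = SSE/(n-k-1), the standard error for the mean is s·√(x0ᵀ(XᵀX)⁻¹x0), and the standard error for a single prediction is s·√(1 + x0ᵀ(XᵀX)⁻¹x0). The function names, the data, and the tabulated t value are illustrative assumptions, not part of the original material.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def interval_standard_errors(X, y, x0):
    """Return (y_hat, se_mean, se_pred) at the point x0.

    X is a list of rows of predictor values (no intercept column),
    y the observed responses, x0 the predictor values of interest.
    """
    n, k = len(X), len(X[0])
    D = [[1.0] + list(row) for row in X]      # design matrix with intercept
    d0 = [1.0] + list(x0)
    p = k + 1
    XtX = [[sum(D[i][a] * D[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(D[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)                    # least-squares coefficients
    y_hat = sum(b * v for b, v in zip(beta, d0))
    sse = sum((y[i] - sum(b * v for b, v in zip(beta, D[i]))) ** 2
              for i in range(n))
    s2 = sse / (n - k - 1)                    # residual variance
    w = solve(XtX, d0)                        # (X'X)^-1 x0
    h0 = sum(a * b for a, b in zip(d0, w))    # x0' (X'X)^-1 x0
    return y_hat, math.sqrt(s2 * h0), math.sqrt(s2 * (1.0 + h0))

# Hypothetical data: six observations, two X variables.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [9.5, 7.5, 19.3, 17.7, 29.2, 27.8]
y_hat, se_mean, se_pred = interval_standard_errors(X, y, (2, 2))

t_crit = 3.182  # t table value for 0.975 with n-k-1 = 3 degrees of freedom
conf_int = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pred_int = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
```

Under these assumptions the prediction interval is always wider than the confidence interval at the same point, because a single Y value carries its own scatter in addition to the uncertainty in the estimated mean.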

c. Simple Linear Correlation

Correlation analysis deals with the association between two or more variables. It is an attempt to determine the degree of relationship between variables when the relationship is of a quantitative nature.

For example:

(1) to check the effect of an increase in rainfall, up to a point, on the production of rice.

(2) to check the effect of trained operators on the output of a process 

(3) to check whether there exists some relationship between age of husband and age of wife. 

Correlation analysis is very important in Six Sigma. It helps the analyst investigate possible cause-and-effect relationships behind a problem, and it can be used at every stage of the problem-solving and planning process.

Significance of the study of correlation

1. Most variables show some kind of relationship. For example, there is a relationship between price and supply, income and expenditure, etc. With the help of correlation analysis you can measure, in a single figure, the degree of relationship existing between the variables.

2. Once you know that the variables are closely related, you can estimate the value of one variable given the value of another with the help of regression analysis. 

3. In Six Sigma operations, correlation analysis enables the analyst to estimate costs, sales, prices and other variables on the basis of some other series with which these costs, sales, or prices may be functionally related.

Correlation Assumption

The following assumptions must hold for inferences about the population to be valid when sampling is from bivariate distributions.

  • For each value of X there is a normally distributed subpopulation of Y values.
  • For each value of Y there is a normally distributed subpopulation of X values.
  • The joint distribution of X and Y is a normal distribution called the bivariate normal distribution.
  • The subpopulations of Y values all have the same variance, and the subpopulations of X values all have the same variance.


Difference between Correlation and Causation

Correlation analysis helps in determining the degree of relationship between two or more variables; it does not tell anything about a cause-and-effect relationship. Correlation does not necessarily imply causation, even though a causal relationship will usually show up as some association between the variables. Even a high degree of relationship does not necessarily imply that a relationship of cause and effect exists between the variables. In general, if factors A and B are correlated, it may be that

1. A causes B 
2. B causes A 
3. A and B influence each other continuously 
4. A and B are both influenced by a third factor C, or
5. The correlation is due to chance

This can be explained as follows: 

a. The correlation may be due to pure chance, especially in a small sample. You may get a high degree of correlation between two variables in a sample even though in the universe there is no relationship between the two variables at all. For example:
Income ($) : 350 360 370 380 390 
Weight (lbs): 120 140 160 180 200 

The above data show a perfect positive relationship between income and weight, i.e., as income increases the weight increases, and the rate of change between the two variables is the same. With only five observations, such a relationship may well be due to chance.

b. Both of the correlated variables may be influenced by one or more other variables. A high degree of correlation between two variables may arise because some common cause affects each of them in the same way. For example, suppose the correlation between teachers’ salaries and the consumption of liquor over a period of years comes out to be 0.9. This does not prove that teachers drink, nor does it prove that liquor sales increase teachers’ salaries; both series may simply be rising with a third factor, such as the general level of income.
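The effect of a common cause can be illustrated with a small simulation. This is only a sketch under stated assumptions: the series are synthetic random numbers, the variable names (`income_level`, `salaries`, `liquor`) are hypothetical, and neither series is built to influence the other. Both are constructed as the common factor plus independent noise.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

random.seed(0)  # fixed seed so the run is repeatable
n = 500

# Common factor C (hypothetical "general level of income").
income_level = [random.gauss(0, 1) for _ in range(n)]

# A and B each depend on C plus their own independent noise.
salaries = [c + random.gauss(0, 0.5) for c in income_level]
liquor = [c + random.gauss(0, 0.5) for c in income_level]

# Correlation between A and B, which share only the common factor.
r_ab = pearson_r(salaries, liquor)

# Remove the common factor from each series and correlate what is left.
salaries_rest = [a - c for a, c in zip(salaries, income_level)]
liquor_rest = [b - c for b, c in zip(liquor, income_level)]
r_rest = pearson_r(salaries_rest, liquor_rest)

print(f"r(A, B) = {r_ab:.2f}, r after removing C = {r_rest:.2f}")
```

In this construction the correlation between the two series is high, yet it largely disappears once the common factor is subtracted out, which is the pattern described in case b.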

c. The two variables may be mutually influencing each other, so that neither can be designated as the cause and the other as the effect. There may be a high degree of correlation between the variables, but it is difficult to pinpoint which is the cause and which is the effect. For example, as the price of a commodity increases its demand goes down, so price is the cause and demand is the effect. But it is also possible that demand for the commodity increases, say because of population growth, and this pushes the price up. In that case the increased demand is the cause and the price is the effect.

Coefficient of Correlation

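The coefficient of correlation, r, is a standardized measure of the covariance between two series. The covariance between two series is written as: 

Covariance = Σxy / N

where x and y are the deviations of the X and Y series from their respective means and N is the number of pairs of observations. Dividing the covariance by the product of the standard deviations of the two series gives the coefficient of correlation:

$$r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$$

The value of r lies between -1 and +1.

When r = +1, it means that there is a perfect positive correlation.

When r = -1, it means that there is a perfect negative correlation.

When r = 0, it means that there is no linear correlation.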




Steps to Calculate Correlation Coefficient

1. Take the deviations of the X series from the mean of X and denote these deviations by x. 

2. Square these deviations and obtain the total, i.e., Σx². 

3. Take the deviations of the Y series from the mean of Y and denote these deviations by y. 

4. Square these deviations and obtain the total, i.e., Σy². 

5. Multiply the deviations of the X and Y series and obtain the total, i.e., Σxy. 

6. Substitute the values of Σxy, Σx², and Σy² in the formula. 
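The six steps above can be sketched in a few lines of Python, assuming the formula in step 6 is r = Σxy / √(Σx² · Σy²). The function name is hypothetical; the data are the income and weight figures from the chance-correlation example earlier in this section.

```python
import math

def correlation_by_deviations(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]                 # step 1: deviations x
    sum_x2 = sum(d * d for d in dx)               # step 2: sum of x squared
    dy = [v - mean_y for v in ys]                 # step 3: deviations y
    sum_y2 = sum(d * d for d in dy)               # step 4: sum of y squared
    sum_xy = sum(a * b for a, b in zip(dx, dy))   # step 5: sum of x*y
    return sum_xy / math.sqrt(sum_x2 * sum_y2)    # step 6: substitute

income = [350, 360, 370, 380, 390]
weight = [120, 140, 160, 180, 200]
r = correlation_by_deviations(income, weight)
print(r)  # 1.0
```

Working by hand gives the same result: the deviations are -20, -10, 0, 10, 20 and -40, -20, 0, 20, 40, so Σx² = 1000, Σy² = 4000, Σxy = 2000, and r = 2000 / √(1000 × 4000) = 1.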

Direct Method of Finding Correlation Coefficient

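The correlation coefficient can also be calculated directly from the actual X and Y values, without taking deviations from either the actual mean or an assumed mean. The formula in such a case is:

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}}$$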


Since r is a pure number, shifting the origin and changing the scale of either series (by a positive factor) does not affect the value of the correlation coefficient. 
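A minimal sketch of the direct method, assuming the usual raw-score formula r = (NΣXY - ΣXΣY) / √([NΣX² - (ΣX)²][NΣY² - (ΣY)²]). The function name and the data are hypothetical. The second half checks the claim about origin and scale by recomputing r on shifted and rescaled copies of the same series.

```python
import math

def correlation_direct(xs, ys):
    """Correlation coefficient from raw X and Y values (no deviations)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *
                            (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired observations.
X = [2, 4, 5, 7, 9]
Y = [3, 6, 4, 9, 8]
r_original = correlation_direct(X, Y)

# Shift the origin and change the scale of both series.
X_coded = [(v - 5) / 2 for v in X]
Y_coded = [10 * v + 3 for v in Y]
r_coded = correlation_direct(X_coded, Y_coded)

# A negative scale factor on one series reverses the sign of r.
X_flipped = [-v for v in X]
r_flipped = correlation_direct(X_flipped, Y)

print(r_original, r_coded, r_flipped)
```

For these figures ΣX = 27, ΣY = 30, ΣXY = 185, ΣX² = 175 and ΣY² = 206, so r = 115 / √(146 × 130), or about 0.83. The coded series give the same value, while the sign-flipped series give the same magnitude with the opposite sign, which is why the invariance holds only for positive scale factors.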

Apply and interpret a hypothesis test for correlation coefficient










