Using the Multiple Regression Equation

The multiple regression equation is used to obtain a ŷ value when particular values of the two or more X variables in the equation are given. The value ŷ can be interpreted in two ways.

First, ŷ may be interpreted as an estimate of the mean of the subpopulation of Y values assumed to exist for a particular combination of Xi values. Under this interpretation ŷ is called an estimate, its equation is called the estimation equation, and the corresponding interval is called a confidence interval.

Second, ŷ may be interpreted as the value that Y is most likely to assume for the given values of Xi. In this case ŷ is called the predicted value of Y, the equation is called the prediction equation, and the corresponding interval is called a prediction interval.

The 100(1-α)% confidence interval for the mean of a subpopulation of Y values given particular values of the Xi is as follows:

c. Simple Linear Correlation

Correlation analysis deals with the association between two or more variables. It is an attempt to determine the degree of relationship between variables when that relationship is quantitative in nature. For example:

(1) to check the effect of an increase in rainfall, up to a point, on the production of rice;
(2) to check the effect of trained operators on the output of a process;
(3) to check whether there is a relationship between the age of a husband and the age of his wife.

Correlation analysis is very important in Six Sigma. It helps the analyst investigate the possible causes and effects of a problem, and it can be used at every stage of the problem-solving and planning process.

Significance of the study of correlation

1. Most variables show some kind of relationship. For example, there is a relationship between price and supply, income and expenditure, etc. With the help of correlation analysis you can measure in a single figure the degree of relationship existing between the variables.
2. Once you know that the variables are closely related, you can estimate the value of one variable given the value of another with the help of regression analysis.
3. In Six Sigma operations, correlation analysis enables the analyst to estimate costs, sales, prices and other variables on the basis of some other series with which these costs, sales, or prices may be functionally related.

Correlation Assumption

The following assumptions must hold for inferences about the population to be valid when sampling is from a bivariate distribution.
Difference between Correlation and Causation

Correlation analysis helps in determining the degree of relationship between two or more variables; it does not tell anything about a cause-and-effect relationship. Correlation does not necessarily imply causation, though the existence of causation always implies correlation. Even a high degree of relationship does not necessarily imply that a relationship of cause and effect exists between the variables. In general, if factors A and B are correlated, it may be that:

1. A causes B,
2. B causes A,
3. A and B influence each other continuously,
4. A and B are both influenced by C, or
5. the correlation is due to chance.

This can be explained as follows:

a. The correlation may be due to pure chance, especially in a small sample. You may get a high degree of correlation between two variables in a sample even though in the universe there is no relationship between the two variables at all. For example:

Income ($):   350  360  370  380  390
Weight (lbs): 120  140  160  180  200

The above data show a perfect positive relationship between income and weight: as income increases, weight increases, and the rate of change between the two variables is the same (every $10 rise in income is matched by a 20 lb rise in weight).

b. Both the correlated variables may be influenced by one or more other variables. A high degree of correlation between two variables may simply be due to some common cause affecting each of them in the same way. For example, suppose the correlation between teachers' salaries and the consumption of liquor over a period of years comes out to be 0.9. This does not prove that teachers drink, nor does it prove that liquor sales increase teachers' salaries. Both may instead be rising with a third factor, such as the general level of income.

c. Both the variables may be mutually influencing each other, so that neither can be designated as the cause and the other as the effect. There may be a high degree of correlation between the variables, but it is difficult to pinpoint which is the cause and which is the effect. For example, as the price of a commodity increases its demand goes down, so price is the cause and demand is the effect. But it is also possible that the demand for a commodity increases because of growth in the population, and this pushes the price up. Now the cause is the increased demand and the effect is the price.

Coefficient of Correlation

The coefficient of correlation is said to be a measure of covariance between two series. The covariance between two series is written as:

Covariance = Σxy / N

where x and y are the deviations of the X and Y series from their respective means and N is the number of pairs of observations. The coefficient of correlation r is obtained from the formula r = Σxy / √(Σx² × Σy²), using the following steps:

1. Take the deviations of the X series from the mean of X and denote these deviations by x.
2. Square these deviations and obtain the total, i.e. Σx².
3. Take the deviations of the Y series from the mean of Y and denote these deviations by y.
4. Square these deviations and obtain the total, i.e. Σy².
5. Multiply the deviations of the X and Y series and obtain the total, i.e. Σxy.
6. Substitute the values of Σxy, Σx² and Σy² in the formula.

Direct Method of Finding Correlation Coefficient

The correlation coefficient can also be calculated without taking deviations of items from either the actual mean or an assumed mean, i.e., directly from the actual X and Y values. The formula in such a case is:

r = (N ΣXY - ΣX ΣY) / √{[N ΣX² - (ΣX)²] × [N ΣY² - (ΣY)²]}

Apply and interpret a hypothesis test for correlation coefficient
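The deviation-method steps and the direct method described under Coefficient of Correlation can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `correlation_by_deviations` and `correlation_direct` are hypothetical, the formulas used are the standard product-moment forms r = Σxy / √(Σx² × Σy²) and r = (N ΣXY - ΣX ΣY) / √{[N ΣX² - (ΣX)²][N ΣY² - (ΣY)²]}, and the data are the income and weight figures from the chance-correlation example.

```python
import math

# Data from the income/weight example in the text.
income = [350, 360, 370, 380, 390]   # X series ($)
weight = [120, 140, 160, 180, 200]   # Y series (lbs)


def correlation_by_deviations(xs, ys):
    """Deviation method: returns (covariance, r).

    Hypothetical helper; assumes both series have the same length
    and that neither series is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [v - mean_x for v in xs]                    # step 1: x = X - mean of X
    sum_x2 = sum(d * d for d in dev_x)                  # step 2: Σx²
    dev_y = [v - mean_y for v in ys]                    # step 3: y = Y - mean of Y
    sum_y2 = sum(d * d for d in dev_y)                  # step 4: Σy²
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))   # step 5: Σxy
    covariance = sum_xy / n                             # Covariance = Σxy / N
    r = sum_xy / math.sqrt(sum_x2 * sum_y2)             # step 6: substitute
    return covariance, r


def correlation_direct(xs, ys):
    """Direct method: r from the actual X and Y values, no deviations."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


covariance, r_dev = correlation_by_deviations(income, weight)
r_direct = correlation_direct(income, weight)
print("covariance:", covariance)
print("r (deviation method):", r_dev)
print("r (direct method):", r_direct)
```

For these five pairs Σx² = 1000, Σy² = 4000 and Σxy = 2000, so the covariance is 400 and both methods give r = 1, the perfect positive relationship noted in the example. A sample this small cannot show that income and weight are related in the population.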