Linear Regression of Sampling Distributions of the Mean
Author(s): David J Torres, Ana Vasilic, Jose Pacheco.
We show that the simple and multiple linear regression coefficients and the coefficient of determination R² computed from sampling distributions of the mean (with or without replacement) are equal to the regression coefficients and coefficient of determination computed from the individual data. Moreover, for sampling distributions of the mean, the standard error of estimate is reduced by a factor equal to the square root of the group size. The result has applications in formulating a distance measure between two genes in a hierarchical clustering algorithm. We show that the Pearson correlation coefficient R can measure how differential expression in one gene correlates with differential expression in a second gene.
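The equalities stated above can be checked numerically on a small example. The following is a minimal sketch, not the authors' code: the six data points, the group size m = 3, and the helper names fit and mean_pairs are illustrative assumptions. It enumerates every group of size m (all ordered tuples for sampling with replacement, all combinations for sampling without replacement), regresses the group means of y on the group means of x, and compares the result with the fit to the individual data. The standard error is taken here as the root-mean-square residual (dividing by the number of points), an assumed convention under which the square-root factor holds exactly for sampling with replacement.

```python
import itertools
import numpy as np


def fit(x, y):
    """Least-squares line y = a + b*x. Returns (a, b, R^2, RMS residual)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, a = np.polyfit(x, y, 1)          # polyfit returns [slope, intercept]
    resid = y - (a + b * x)
    r2 = 1.0 - np.mean(resid ** 2) / np.var(y)
    rms = np.sqrt(np.mean(resid ** 2))  # assumed convention: divide by n
    return a, b, r2, rms


def mean_pairs(x, y, m, replace):
    """Means of x and y over every group of size m (full enumeration)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if replace:
        groups = itertools.product(range(n), repeat=m)   # n**m ordered tuples
    else:
        groups = itertools.combinations(range(n), m)     # C(n, m) subsets
    idx = np.array(list(groups))
    return x[idx].mean(axis=1), y[idx].mean(axis=1)


# Hypothetical individual data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.9]
m = 3  # group size

individual = fit(x, y)
with_repl = fit(*mean_pairs(x, y, m, replace=True))
without_repl = fit(*mean_pairs(x, y, m, replace=False))

for label, res in [("individual", individual),
                   ("means, with replacement", with_repl),
                   ("means, without replacement", without_repl)]:
    a, b, r2, rms = res
    print(f"{label:28s} a={a:.6f} b={b:.6f} R^2={r2:.6f} rms={rms:.6f}")

# Ratio of standard errors for sampling with replacement vs sqrt(m).
print("rms ratio:", individual[3] / with_repl[3], "sqrt(m):", np.sqrt(m))
```

In this sketch the intercept, slope, and R² agree across all three fits, and the ratio of the RMS residuals for the with-replacement case matches the square root of m. The standard-error ratio is only checked with replacement; without replacement the enumeration is used solely to compare the coefficients and R².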