Covariance
The mathematical formula for the covariance of two variables X and Y is
E(XY) - E(X)E(Y)
This essay intends to explain why this is not an unreasonable definition.
The "covariance" of two variables is a measure of how much they vary together.
For example, consider a room full of people. The heavy ones will tend to be
taller, and the taller ones will tend to be heavier. So if we let one variable
be the weights of the individuals in the room, and the other variable be their
heights, a "covariance" is, appropriately, a measure of this tendency for the
two variables to vary together.
To calculate the mathematical covariance using the formula above, first
find the expectation of the product of height and weight. Start by
multiplying each person's height by his or her weight. (While this may
seem like mixing apples and oranges, the way that the entire result tracks
covariance will be clear momentarily.) To get the expectation, E(XY), add
up these products and divide the sum by the number of people. This yields
an average (or expectation) of the height*weight product. Next, subtract
the average of the heights, E(X), times the average of the weights, E(Y).
The result is the covariance.
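The recipe above can be sketched in a few lines of Python. The function name `covariance` and the sample heights and weights are illustrative assumptions, not data from this essay; the function simply follows the formula E(XY) - E(X)E(Y), dividing by the number of people.

```python
def covariance(xs, ys):
    """Covariance E(XY) - E(X)E(Y) of two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    mean_product = sum(x * y for x, y in zip(xs, ys)) / n  # E(XY)
    mean_x = sum(xs) / n                                   # E(X)
    mean_y = sum(ys) / n                                   # E(Y)
    return mean_product - mean_x * mean_y

# Hypothetical heights (cm) and weights (kg) for five people.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 80, 93]
print(covariance(heights, weights))  # positive: taller goes with heavier
```

Note that this divides by n, as the formula in this essay does; some statistics libraries divide by n - 1 by default, so their results can differ slightly.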
To make this result seem reasonable, first suppose that there were no
relationship between the people's heights and weights (say they were just
entirely random to begin with). Then E(XY) - E(X)E(Y) would tend to be
zero: there would be absolutely no reason to expect that the average
product would be any different from the product of the averages. But
if you suppose that there is indeed a relationship (imagine now a nearly
linear one), then the excess of E(XY) over E(X)E(Y) arises from the
products of the larger people's heights and weights, which more than
compensate for the smaller contributions of the products of the smaller
people's heights and weights. As a silly example, suppose that four
people's heights and weights were
           Height  Weight
Person 1   1       1
Person 2   2       2
Person 3   3       3
Person 4   4       4
Then E(XY) is (1*1 + 2*2 + 3*3 + 4*4)/4 = 30/4 = 7.5, whereas
E(X)E(Y) is only 2.5 * 2.5 = 6.25. In this example, the covariance
is 7.5 - 6.25 = 1.25.
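As a quick check, the arithmetic of the four-person example can be reproduced in Python. The second half is an added assumption for contrast: a hypothetical reshuffling of the same four weights, chosen so that heights and weights no longer rise together, in which case the average product equals the product of the averages.

```python
heights = [1, 2, 3, 4]
weights = [1, 2, 3, 4]
n = len(heights)

e_xy = sum(h * w for h, w in zip(heights, weights)) / n  # E(XY)
e_x = sum(heights) / n                                   # E(X)
e_y = sum(weights) / n                                   # E(Y)
cov = e_xy - e_x * e_y
print(e_xy, e_x * e_y, cov)  # 7.5 6.25 1.25

# Hypothetical reshuffle of the same weights (not from the essay):
# tall people are no longer systematically the heavy ones.
shuffled = [2, 4, 1, 3]
e_xy_shuffled = sum(h * w for h, w in zip(heights, shuffled)) / n
cov_shuffled = e_xy_shuffled - e_x * e_y
print(cov_shuffled)  # 0.0
```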
Addendum: Norm Hardy (www.cap-lore.com) also points out that adding a
constant to either X or Y does not change the covariance as defined.
Taking the constant 3 added to X as an example:
E((X+3)Y) - E(X+3)E(Y)
= Sum((X+3)Y)/n - Sum(X+3)Sum(Y)/n^2
= Sum(XY)/n + Sum(3Y)/n - (Sum(X)+3n)Sum(Y)/n^2
= E(XY) + 3Sum(Y)/n - Sum(X)Sum(Y)/n^2 - 3nSum(Y)/n^2
= E(XY) + 3E(Y) - E(X)E(Y) - 3Sum(Y)/n
= E(XY) + 3E(Y) - E(X)E(Y) - 3E(Y)
= E(XY) - E(X)E(Y)
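The same invariance can be checked numerically. The sketch below is only an illustration under assumed inputs: the helper name `covariance`, the data (the numbers 1 to 4 for both variables), and the particular constants tried are all choices made here, not part of the derivation.

```python
import math

def covariance(xs, ys):
    """Covariance E(XY) - E(X)E(Y), dividing by the number of items."""
    n = len(xs)
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return e_xy - (sum(xs) / n) * (sum(ys) / n)

xs = [1, 2, 3, 4]
ys = [1, 2, 3, 4]
base = covariance(xs, ys)

# Try several arbitrary constants, added first to X and then to Y.
for c in (3, -10, 0.5, 1000):
    shifted_x = covariance([x + c for x in xs], ys)
    shifted_y = covariance(xs, [y + c for y in ys])
    # Compare with a tolerance, since floating-point sums can round.
    assert math.isclose(shifted_x, base)
    assert math.isclose(shifted_y, base)
    print(c, shifted_x, shifted_y)
```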