Saturday, June 29, 2019

Co-variance

Covariance 

Introduction:

  • In mathematics and statistics, covariance is a measure of the relationship between two random variables.
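  • It describes how the two variables change together: when one variable is above its mean, does the other tend to be above or below its own mean?
  • Covariance indicates only the direction of a linear relationship. Its magnitude depends on the units of the variables, so it does not measure how strong the dependency is; the correlation coefficient is used for that.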
  • Positive covariance: Indicates that two variables tend to move in the same direction.
  • Negative covariance: Reveals that two variables tend to move in inverse directions. 
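
A small sketch can make the sign and the unit-dependence of covariance concrete. The arrays hours, score and errors are made-up illustrative data (an assumption, not taken from any real dataset), and np.corrcoef appears only for contrast with np.cov.

```python
import numpy as np

# Hypothetical data: score rises with hours, errors fall with hours
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = np.array([52.0, 58.0, 61.0, 70.0, 74.0])
errors = np.array([9.0, 7.0, 6.0, 3.0, 2.0])

# The off-diagonal element [0, 1] of the 2x2 matrix is the covariance
cov_up = np.cov(hours, score)[0, 1]
cov_down = np.cov(hours, errors)[0, 1]
print("same direction ->", cov_up)
print("inverse direction ->", cov_down)

# Changing units (hours -> minutes) rescales the covariance,
# while the correlation coefficient stays the same
cov_scaled = np.cov(hours * 60, score)[0, 1]
corr = np.corrcoef(hours, score)[0, 1]
corr_scaled = np.corrcoef(hours * 60, score)[0, 1]
print(cov_scaled, corr, corr_scaled)
```

The sign tells you the direction of the relationship; the size of the number only becomes comparable across datasets after normalising, which is what the correlation coefficient does.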


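  • In NumPy, the function np.cov(x, y) calculates the covariance between two variables. The result is depicted in the figure below. The 2x2 result matrix holds the variances of x and y on the diagonal and the covariance of x and y in the two off-diagonal entries. Note that covariance is symmetric, cov(x, y) = cov(y, x), so np.cov(x, y)[0][1] equals np.cov(y, x)[0][1]; swapping the arguments only swaps the two variances on the diagonal.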

  • As a refresher: variance is a measure of how much the data varies from the mean. The following figure illustrates how to calculate variance.
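
As a rough sketch of the variance calculation, the snippet below computes the sample variance step by step and compares it with NumPy. It assumes the sample form (dividing by n - 1), which is what np.cov uses by default; np.var divides by n unless ddof=1 is passed. The names mean_x, deviations and sample_var are illustrative.

```python
import numpy as np

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])

# Step 1: mean of the data
mean_x = x.mean()

# Step 2: deviation of each value from the mean
deviations = x - mean_x

# Step 3: sum of squared deviations divided by (n - 1)
sample_var = (deviations ** 2).sum() / (len(x) - 1)

print(sample_var)
print(np.var(x, ddof=1))  # same quantity via NumPy
print(np.cov(x))          # for a single 1-D array, np.cov returns its variance
```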



The following example illustrates the computation of covariance using NumPy:


import numpy as np

x = np.array([1.2,2.4,3.6,4.8,7.2])
y = np.array([4,6,8,10,12])
print(np.cov(x,y))


Output:

[[ 5.328  7.2  ]
 [ 7.2   10.   ]]
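
To see where the off-diagonal value comes from, here is a minimal sketch that computes the covariance directly from its definition and checks it against the matrix. It assumes the default sample form (dividing by n - 1); the names cov_xy and matrix are illustrative.

```python
import numpy as np

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])
y = np.array([4, 6, 8, 10, 12])

n = len(x)

# Sum of products of paired deviations, divided by (n - 1)
cov_xy = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)

matrix = np.cov(x, y)
# matrix[0, 0] -> variance of x, matrix[1, 1] -> variance of y
# matrix[0, 1] and matrix[1, 0] -> covariance of x and y
print(cov_xy)
print(matrix[0, 1], matrix[1, 0])
```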



Covariance for three variables


  • With three variables, the covariance is calculated in the same way as above and yields a 3x3 matrix containing the covariances of (x, y), (x, z) and (y, z) off the diagonal and the variances of x, y and z on the diagonal.
  • The following example demonstrates this. np.cov() accepts at most two arrays of observations, so three variables cannot be passed as three separate arguments. Instead, pass a single two-dimensional array in which each row holds one variable; in the code below the rows correspond to x, y and z in that order.
  • Note the negative covariance for (x, z) and (y, z): the third variable decreases as the other two increase.

import numpy as np

x = np.array([[1.2,2.4,3.6,4.8,7.2],[4,6,8,10,12],[10, 7, 5, 4, 2]])
print(np.cov(x))

OUTPUT:

[[ 5.328  7.2   -6.78 ]
 [ 7.2   10.    -9.5  ]
 [-6.78  -9.5    9.3  ]]
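
If the three variables already live in separate arrays, one possible approach is to stack them before calling np.cov. The sketch below uses np.vstack for that and also shows the rowvar=False option for data stored with one variable per column. The names data, cov_rows, cov_cols, cov_xz and cov_yz are illustrative.

```python
import numpy as np

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])
y = np.array([4, 6, 8, 10, 12])
z = np.array([10, 7, 5, 4, 2])

# Stack into shape (3, 5): one row per variable, one column per observation
data = np.vstack([x, y, z])
cov_rows = np.cov(data)

# If variables are stored as columns instead, tell np.cov with rowvar=False
cov_cols = np.cov(data.T, rowvar=False)

# Pick individual covariances by (row, column) index
cov_xz = cov_rows[0, 2]
cov_yz = cov_rows[1, 2]
print(cov_xz, cov_yz)
```

Both calls describe the same data, so the two matrices agree; only the orientation of the input differs.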

Getting only the covariance and not the full matrix

  • The following example prints only the covariance of x and y, that is, the off-diagonal element [0][1] of the result, rather than the whole matrix.

import numpy as np

x = np.array([1.2,2.4,3.6,4.8,7.2])
y = np.array([4,6,8,10,12])
print(np.cov(x,y)[0][1])

OUTPUT:
7.199999999999999
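
If the scalar is needed in several places, the indexing can be wrapped in a small helper. The function name covariance below is hypothetical (it is not part of NumPy); the sketch also checks that swapping the arguments leaves the covariance unchanged while the full matrices differ.

```python
import numpy as np

def covariance(a, b):
    """Hypothetical helper: sample covariance of two 1-D arrays as a float."""
    return float(np.cov(a, b)[0, 1])

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])
y = np.array([4, 6, 8, 10, 12])

# The covariance itself is symmetric in its arguments
same_value = np.isclose(covariance(x, y), covariance(y, x))

# The full matrices are not identical: the variances on the diagonal swap places
same_matrix = np.array_equal(np.cov(x, y), np.cov(y, x))

print(covariance(x, y), same_value, same_matrix)
```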
