Saturday, June 29, 2019

Co-variance

Covariance 

Introduction:

  • In mathematics and statistics, covariance is a measure of the relationship between two random variables.
  • It measures how the two variables change together, i.e. whether a change in one variable tends to be accompanied by a similar change in the other variable.
  • Covariance shows the direction of the relationship but, because it is not normalized, it does not measure how strongly the variables depend on each other.
  • Positive covariance: Indicates that two variables tend to move in the same direction.
  • Negative covariance: Reveals that two variables tend to move in inverse directions. 


  • In numpy we use the function np.cov(x,y) to calculate the covariance between two variables. The result is depicted in the figure below. You can see that the 2x2 result matrix consists of the covariance of x and y (the off-diagonal entries) as well as the variance of x and the variance of y (the diagonal entries). Note that the covariance of (x,y) equals the covariance of (y,x); np.cov(y,x) only swaps the two variances on the diagonal.

  • To refresh your understanding of variance, let us review it. Variance is a measure of how much the data varies from the mean. The following figure illustrates how to calculate variance.
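As a rough sketch of the calculation, the sample variance and covariance can be worked out by hand and compared with np.cov(). The helper names sample_variance and sample_covariance are made up for this illustration; dividing by n - 1 is assumed here because it matches the default behaviour of np.cov().

```python
import numpy as np

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])
y = np.array([4, 6, 8, 10, 12])

def sample_variance(a):
    # Sum of squared deviations from the mean, divided by n - 1
    return np.sum((a - a.mean()) ** 2) / (len(a) - 1)

def sample_covariance(a, b):
    # Sum of the products of paired deviations, divided by n - 1
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

print(sample_variance(x))        # variance of x
print(sample_variance(y))        # variance of y
print(sample_covariance(x, y))   # covariance of x and y
```

These three numbers should agree with the entries of the matrix returned by np.cov(x, y).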



The following example illustrates the computation of covariance using numpy.


import numpy as np

x = np.array([1.2,2.4,3.6,4.8,7.2])
y = np.array([4,6,8,10,12])
print(np.cov(x,y))


Output:

[[ 5.328  7.2  ]
 [ 7.2   10.   ]]



Covariance for three variables


  • With three variables the covariance is calculated in the same way as discussed above and yields a 3x3 matrix containing the covariances of (x,y), (x,z) and (y,z) and the variances of x, y and z.
  • The following example demonstrates this. Since the cov() function does not take three separate variables, we pass it a single 2D array in which each row is one variable.
  • It is interesting to note that there is negative covariance between (x,z) and (y,z).

import numpy as np

x = np.array([[1.2,2.4,3.6,4.8,7.2],[4,6,8,10,12],[10, 7, 5, 4, 2]])
print(np.cov(x))

OUTPUT:

[[ 5.328  7.2   -6.78 ]
 [ 7.2   10.    -9.5  ]
 [-6.78  -9.5    9.3  ]]
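If the three variables are held in separate 1D arrays, one possible approach (a sketch, not the only way) is to stack them into a single 2D array first. The names z, data and cov_matrix are chosen for this illustration. np.column_stack() puts each variable in its own column, so rowvar=False is passed to tell np.cov() that the columns, not the rows, are the variables.

```python
import numpy as np

x = np.array([1.2, 2.4, 3.6, 4.8, 7.2])
y = np.array([4, 6, 8, 10, 12])
z = np.array([10, 7, 5, 4, 2])

# One column per variable, one row per observation
data = np.column_stack((x, y, z))
cov_matrix = np.cov(data, rowvar=False)
print(cov_matrix)

# The (x, z) covariance sits in row 0, column 2
print(cov_matrix[0, 2])
```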

Getting only the covariance and not the full matrix

  • The following example illustrates how to just print the covariance of x and y and not the whole matrix

import numpy as np

x = np.array([1.2,2.4,3.6,4.8,7.2])
y = np.array([4,6,8,10,12])
print(np.cov(x,y)[0][1])

OUTPUT:
7.199999999999999

Friday, June 28, 2019

NUMPY

INTRODUCTION TO ARRAYS IN PROGRAMMING LANGUAGES:
An array is a container that can hold a fixed number of items, and these items should be of the same type. Most data structures make use of arrays to implement their algorithms. The following are the important terms for understanding the concept of an array.
  • Element − Each item stored in an array is called an element.
  • Index − Each location of an element in an array has a numerical index, which is used to identify the element.
  • Thus, an array is a special variable that can hold more than one value at a time, i.e. an array can hold many values under a single name, and you can access the values by referring to an index number.
  • Note: Python does not have a built-in array type in the core language (the standard library only provides a basic array module), but Python lists can be used instead.

Introduction to NumPy

  • NumPy, which stands for Numerical Python, is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. Using NumPy, mathematical and logical operations on arrays can be performed.
  • The most important object defined in NumPy is an N-dimensional array type called ndarray. It describes a collection of items of the same type. Items in the collection can be accessed using a zero-based index.


Creating 1D ndarray

The following example creates a 1D ndarray using a list.
import numpy as np
a = np.array([1,2,3,4])
print(a)
print('Dimension:',a.shape) # shape of the array: (4,) means 4 elements along a single axis
print('Datatype:',a.dtype)
print('Size of each element:', a.itemsize, 'bytes')

OUTPUT:
[1 2 3 4]
Dimension: (4,)
Datatype: int64
Size of each element: 8 bytes

We can create an array that uses less memory by setting the dtype explicitly to int16 or int8.

import numpy as np
a = np.array([1,2,3,4], dtype='int8')
print(a.dtype)
print('Size of each element:', a.itemsize, 'byte')

OUTPUT:
int8
Size of each element: 1 byte
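The saving can be made visible with the nbytes attribute, which reports the total memory used by the elements. This is only a small sketch; the variable names values, big and small are chosen for illustration, and int64 is requested explicitly so the result does not depend on the platform default.

```python
import numpy as np

values = [1, 2, 3, 4]
big = np.array(values, dtype='int64')
small = np.array(values, dtype='int8')

# nbytes = number of elements * itemsize
print(big.nbytes)    # 4 elements * 8 bytes = 32
print(small.nbytes)  # 4 elements * 1 byte  = 4

# The price of the smaller type is a smaller range of values
info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127
```

A smaller dtype is only safe when every value fits in its range.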

Creating ndarrays using fromiter()

  • The fromiter() function can create a numpy array from any iterable object (one that can be traversed), such as a string, a list or a dictionary.
  • The following examples illustrate the use of the fromiter() function.

#This example creates an array of unicode characters from a string
#The dtype needs to be given explicitly (U2: unicode strings of up to 2 characters)
import numpy as np
st ="NUMPY"
arr = np.fromiter(st, dtype='U2')
print(arr)

#This example creates an array of integers from a loop
itr = (x*x for x in range(1,8))
arr = np.fromiter(itr, dtype='int8')
print(arr)

#This example creates an array from the keys of a dictionary
dic = {'A':10, 'B':20, 'C':20}
arr = np.fromiter(dic, dtype='U2')
print(arr)

#The last example creates an array from only the first 5 characters of the string
#For this we need to set the count parameter to the desired value
st = "PYTHONLANG"
arr = np.fromiter(st, dtype='U2', count=5)
print(arr)


OUTPUT :
['N' 'U' 'M' 'P' 'Y']
[ 1  4  9 16 25 36 49]
['A' 'B' 'C']
['P' 'Y' 'T' 'H' 'O']

Accessing elements of 1D array

We use 0-based indexing to access the elements of an ndarray.
import numpy as np
a = np.array([1,2,3,4], dtype='int8')
print('The first element:',a[0])
print('The last element:',a[3])

OUTPUT:
The first element: 1
The last element: 4

Creating 2D ndarray

We can pass a 2D list in the array() function of numpy to create a 2D ndarray as shown below.
import numpy as np
b = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(b)
print('The shape is:',b.shape)

OUTPUT:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
The shape is: (3, 3)

Accessing the elements of 2D array

The following example illustrates the use of 0-based indexing on a 2D array.
import numpy as np
b = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(b)
print('first row:',b[0])
print('First row 1st element:',b[0][0])
print('Third row 3rd element:',b[2][2]) 
print('Second row 1st element:',b[1,0])

OUTPUT:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
first row: [1 2 3]
First row 1st element: 1
Third row 3rd element: 9
Second row 1st element: 4

Creating matrices with numpy functions

Creating matrix of 0s using the zeros() function.

Observe that the dtype of the resulting matrix of zeros is float64

import numpy as np
b = np.zeros((3,3)) # Creates a 3x3 matrix with zeros
print(b)
print(b.dtype)

OUTPUT:

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
float64

Creating the matrix of 1s using the ones() function

import numpy as np
b = np.ones((3,3)) # Creates a 3x3 matrix with ones
print(b)
print(b.dtype)

OUTPUT:
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
float64

Creating a matrix using full() function

This function takes two required parameters: the shape as a tuple and the fill value as a number.

import numpy as np
b = np.full((2,2), 8) # Creates a 2x2 constant matrix with 8
print(b)

OUTPUT:
[[8 8]
 [8 8]]

Creating an identity matrix using eye() function

In its simplest form, this function takes one parameter: the size of the identity matrix.

import numpy as np
b = np.eye(3) # Creates an identity matrix
print(b)

OUTPUT:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Creating matrix with random() function

  • Random numbers can be generated using the random module of numpy.
  • The following statement creates a 2x2 matrix using the random.random() function.

import numpy as np
b = np.random.random((2,2)) # Creates a 2x2 matrix of random numbers in the range [0, 1)
print(b)


OUTPUT:
[[0.53352925 0.48243526]
 [0.51794348 0.46761645]]
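The numbers printed will differ on every run. If repeatable results are needed, one common option is to fix the seed of the random number generator first. The sketch below uses np.random.seed(); the seed value 42 and the names first and second are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(42)                 # fix the seed (42 is an arbitrary choice)
first = np.random.random((2, 2))

np.random.seed(42)                 # same seed again
second = np.random.random((2, 2))

print(first)
print(np.array_equal(first, second))  # True: same seed, same numbers
```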

Using arange() function to create numpy array


#Using arange function to create a numpy array
import numpy as np

# creates an array of integers from 0 to 9
b = np.arange(10) 
print(b)

# creates an array of floats from 1 to 10
a = np.arange(1,11, dtype=float) 
print(a)

# Creates an array with interval of 0.1
c = np.arange(2,3, 0.1)
print(c) 

OUTPUT:
[0 1 2 3 4 5 6 7 8 9]
[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
[2.  2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9]
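With a fractional step, the length of an arange() result can be affected by floating-point rounding. When the number of points is known in advance, np.linspace() is a commonly used alternative. A minimal sketch that produces the same values as the last arange() call above:

```python
import numpy as np

# 10 evenly spaced values from 2.0 to 2.9, both ends included
c = np.linspace(2, 2.9, 10)
print(c)

# Compare with the arange() version, allowing for rounding error
print(np.allclose(c, np.arange(2, 3, 0.1)))
```

Note that linspace() includes the end value by default, whereas arange() excludes it.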

Slicing of numpy arrays

  • Slicing a numpy array creates a sub-array from an existing array.
  • The sub-array is a view of the original data, so any modification made to the sub-array is reflected in the main array.

#Slicing of numpy arrays
#Resultant array will always be a sub-array of the original array
import numpy as np
a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

b = a[0:2] # Slice consists of only the first two rows (excludes row three)
print('b:',b)

c = a[1:2] # Slice consists of only second row
print('c:',c)

d = a[0:1,0:4] # Slice consists of entire first row (first row all columns)
print('d:',d)

e = a[0:1,0:2] # Slice consists of first row and first two columns
print('e:',e)

f = a[0:1,2:4] # Slice consists of first row and columns three and four
print('f:',f)

OUTPUT:
b: [[1 2 3 4]
 [5 6 7 8]]
c: [[5 6 7 8]]
d: [[1 2 3 4]]
e: [[1 2]]
f: [[3 4]]

Some more examples of slicing in numpy arrays


#Slicing of numpy arrays
import numpy as np
a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

b = a[1:2,0:4] # Slice consists of second row, all columns
print('b:',b)

c = a[1:2,1:2] # Slice consists of second row, second column
print('c:',c)

d = a[:2,0:2] # Slice consists of two rows: first [1,2] and second [5,6]
print('d:',d)

e = a[:2,1:3] # Slice consists of two rows: first [2,3] and second [6,7]
print('e:',e)

f = a[1:3,2:4] # Slice consists of two rows: first [7,8] and second [11,12]
print('f:',f)


OUTPUT:

b: [[5 6 7 8]]
c: [[6]]
d: [[1 2]
 [5 6]]
e: [[2 3]
 [6 7]]
f: [[ 7  8]
 [11 12]]

Creating a new arbitrary array from an existing one using indexing


#Indexing with lists of row and column positions will create a new arbitrary array from the original array
import numpy as np
a = np.array([[1,2], [3, 4], [5, 6]])
print(a.shape)
b = a[[0,1,2],[0,1,0]] # The resultant array will have shape (3,)
print(b)
print(b.shape)

OUTPUT:

(3, 2)
[1 4 5]
(3,)
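Unlike a slice, an array built with index lists like this is a copy of the selected elements, so changing it leaves the original array untouched. A short sketch of that behaviour, using the same arrays as above:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = a[[0, 1, 2], [0, 1, 0]]   # picks a[0,0], a[1,1] and a[2,0]

b[0] = 99                     # modify the new array
print(b)                      # [99  4  5]
print(a[0, 0])                # 1 (the original is unchanged)
```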

Changing the value in a slice will also create a change in original array


#Changing the values in an array resulting from slice
import numpy as np
a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
b = a[0:2,1:3] #Sliced array
print(b)
b[0,0]=20 # Changing the value in the slice
print(b)
print(a) # The change is getting reflected in the original array

OUTPUT:
[[2 3]
 [6 7]]
[[20  3]
 [ 6  7]]
[[ 1 20  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
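When the original array must stay unchanged, one option is to take an explicit copy of the slice with the copy() method. The sketch below also uses np.shares_memory() to show which arrays share data; the names view and independent are chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[0:2, 1:3]                 # a slice shares memory with a
independent = a[0:2, 1:3].copy()   # an independent copy of the same slice

independent[0, 0] = 20             # only the copy changes
print(a[0, 1])                     # 2
print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```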

Joining two numpy arrays using concatenate() function


#Joining two numpy arrays
import numpy as np
a = np.array([[1,2],[2,4]])
b = np.array([[7,8]])

#Joining as a row
c = np.concatenate((a,b), axis=0)
print(c)

#Joining as a column (b.T is the transpose of b, with shape (2, 1))
d = np.concatenate((a,b.T), axis=1)
print(d)

OUTPUT:

[[1 2]
 [2 4]
 [7 8]]
[[1 2 7]
 [2 4 8]]
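For 2D arrays like these, numpy also offers the shorthand functions np.vstack() and np.hstack(). The sketch below is only a cross-check that they give the same result as the concatenate() calls above.

```python
import numpy as np

a = np.array([[1, 2], [2, 4]])
b = np.array([[7, 8]])

c = np.vstack((a, b))     # stack vertically, like axis=0
d = np.hstack((a, b.T))   # stack horizontally, like axis=1

print(np.array_equal(c, np.concatenate((a, b), axis=0)))    # True
print(np.array_equal(d, np.concatenate((a, b.T), axis=1)))  # True
```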

Arithmetic operations on numpy arrays

  • Unlike lists, arithmetic operations on numpy arrays are vectorized, i.e. they are applied element by element
  • Supported operations include addition, subtraction, element-wise multiplication, matrix multiplication, division and square root
  • Addition of two arrays: np.add(a,b) or a+b
  • Subtraction of two arrays: np.subtract(a,b) or a-b
  • Element-wise multiplication: np.multiply(a,b) or a*b
  • Division of two arrays: np.divide(a,b) or a/b
  • Matrix multiplication: np.dot(a,b) or a.dot(b)

#Numpy arithmetic operations
import numpy as np
a = np.array([[1,2],[3,4]], dtype=np.int32)
b = np.array([[5,6],[7,8]], dtype=np.int32)

print(a+b)
print(np.add(a,b))

print(a-b)
print(np.subtract(a,b))

print(a*b) # Multiplies corresponding elements
print(np.multiply(a,b))
print(a/b)

print(np.divide(a,b))

print(np.sqrt(a))

# Actual Matrix multiplication
print(a.dot(b))
print(np.dot(a,b))

OUTPUT:
[[ 6  8]
 [10 12]]
[[ 6  8]
 [10 12]]
[[-4 -4]
 [-4 -4]]
[[-4 -4]
 [-4 -4]]
[[ 5 12]
 [21 32]]
[[ 5 12]
 [21 32]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
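The contrast with Python lists can be seen in a small side-by-side sketch. The variable names list_a, list_b, arr_a and arr_b are chosen for illustration.

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [4, 5, 6]
arr_a = np.array(list_a)
arr_b = np.array(list_b)

print(list_a + list_b)   # [1, 2, 3, 4, 5, 6] (the lists are joined)
print(arr_a + arr_b)     # [5 7 9] (element-wise sum)
print(list_a * 2)        # [1, 2, 3, 1, 2, 3] (the list is repeated)
print(arr_a * 2)         # [2 4 6] (each element is doubled)
```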

Row and column sum of array elements using sum() function


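  • for axis=0 the sum is taken down the rows, giving one sum per column (column sums)
  • for axis=1 the sum is taken across the columns, giving one sum per row (row sums)

#numpy sum() function
import numpy as np
a = np.array([[1,2],[3,4]], dtype=np.int32)
print(np.sum(a, axis=0))
print(np.sum(a, axis=1))

OUTPUT:
[4 6]
[3 7]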
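The effect of the axis argument is easier to see on a non-square array, where the two results have different lengths. This is a small sketch; the names col_sums, row_sums and total are chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # 2 rows, 3 columns

col_sums = np.sum(a, axis=0)   # collapses the rows: one sum per column
row_sums = np.sum(a, axis=1)   # collapses the columns: one sum per row
total = np.sum(a)              # no axis: sum of every element

print(col_sums)   # [5 7 9]
print(row_sums)   # [ 6 15]
print(total)      # 21
```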




Friday, June 21, 2019

XII-IP : Plotting with Pyplot

Plotting with Pyplot


Matplotlib is a Python package/library used to create 2D graphs and plots with Python scripts. pyplot is a module in matplotlib that supports a very wide variety of graphs and plots, namely histograms, bar charts, power spectra, error charts etc. It is used along with NumPy to provide an environment that is an open-source alternative to MATLAB.

Pyplot provides the state-machine interface to the plotting library in matplotlib. It means that figures and axes are implicitly and automatically created to achieve the desired plot. For example, calling plot from pyplot will automatically create the necessary figure and axes to achieve the desired plot. Setting a title will then automatically set that title on the current axes object. The pyplot interface is generally preferred for non-interactive plotting (i.e., scripting).
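The implicit figure and axes can be inspected with plt.gcf() ("get current figure") and plt.gca() ("get current axes"). The following is a minimal sketch; selecting the off-screen "Agg" backend is an assumption made here so the script runs without opening a window, and the data values are made up.

```python
import matplotlib
matplotlib.use("Agg")            # assumption: off-screen backend, no window needed
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [2, 4, 6])   # implicitly creates a figure and axes
plt.title("Implicit axes")       # the title is set on the current axes

ax = plt.gca()                   # the axes that plot() created
fig = plt.gcf()                  # the figure that holds those axes
print(ax.get_title())            # Implicit axes
print(ax in fig.axes)            # True
```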







XII-IP Histogram

Introduction:

A histogram is a powerful technique in data visualization. It is an accurate graphical representation of the distribution of numerical data. It was first introduced by Karl Pearson. It is an estimate of the distribution of a continuous variable (quantitative variable). It is similar to a bar graph. To construct a histogram, the first step is to “bin” the range of values, that is, divide the entire range of values into a series of intervals, and then count how many values fall into each interval. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent, and are often (but are not required to be) of equal size.
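The binning step can be sketched without drawing anything by using np.histogram(), which only does the counting. The sample values and bin edges below are made up for illustration.

```python
import numpy as np

data = [1, 4, 12, 15, 18, 22, 29, 30]   # made-up sample values
bin_edges = [0, 10, 20, 30]             # three adjacent bins of equal size

counts, edges = np.histogram(data, bins=bin_edges)

# Each bin includes its left edge; only the last bin also includes its right edge
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(left, "to", right, ":", n)
```

Here the bins 0 to 10, 10 to 20 and 20 to 30 receive 2, 3 and 3 values respectively.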



Difference between a histogram and a bar chart / graph

A bar chart mainly represents categorical data (data that has some labels associated with it). It is usually drawn using rectangular bars with lengths proportional to the values that they represent.
A histogram, on the other hand, is used to describe distributions: given a set of numerical data, it shows how the values are distributed.





Histogram in Python
Drawing a histogram in Python is very easy. All we have to do is write 3-4 lines of code. The complexity arises when we try to deal with live data for visualization.

To draw a histogram in Python, the following concepts must be clear.
  • Title: To display the heading of the histogram.
  • Color: To show the color of the bar.
  • Axis: y-axis and x-axis.
  • Data: The data can be represented as an array.
  • Height and width of bars: These are determined based on the analysis. The width of a bar corresponds to its bin (interval).
  • Border color: To display the border color of the bar.

There are various ways to create a histogram in Python. One of them is using the matplotlib library, with which we can create a histogram in just a few statements.
So install the matplotlib library using the following statement at the command prompt:
pip install matplotlib

After installation we can create a histogram. If pip does not work, either copy the pip.exe file to the folder where we want to run the above command, or move to the folder containing pip.exe and then run the above command.

PROGRAM :
import numpy as np
import matplotlib.pyplot as plt
plt.hist([5,15,25,35,45,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()


#The first argument of the hist() method is the list of data values; each value decides which bin (bar) its weight is added to. The number of values must match the number of weights, otherwise an error is raised
#The second argument (bins) gives the edges of the intervals
#The third argument (weights) gives the weight that each value contributes to the height of its bar

output:


For better understanding, we develop the same program with a minor change.
import numpy as np
import matplotlib.pyplot as plt
plt.hist([5,15,25,35,15,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()
#In the interval (bin) 40 to 50 there is no bar because no value between 40 and 50 appears in the first argument (list) of the hist() method.
#In the interval 10 to 20 the bar height is displayed as 16 (10+6, both weights are added) because 15 appears twice in the first argument.
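The bar heights described above can be checked without drawing the plot. The sketch below assumes np.histogram() as a stand-in for the counting that plt.hist() performs, since it accepts the same bins and weights arguments; the names values, weights, bin_edges and heights are chosen for illustration.

```python
import numpy as np

values = [5, 15, 25, 35, 15, 55]
weights = [20, 10, 45, 33, 6, 8]
bin_edges = [0, 10, 20, 30, 40, 50]

# Sum the weights of the values that fall into each bin
heights, _ = np.histogram(values, bins=bin_edges, weights=weights)
print(heights)
```

The bin 10 to 20 collects 10 + 6 = 16, the bin 40 to 50 stays at 0, and the value 55 (weight 8) lies outside every bin, so it is not counted.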

By default the bars of a histogram are displayed in blue, but we can change this to another color with the following code.
plt.hist([1,11,21,31,41,51], bins=[0,10,20,30,40,50,60], weights=[10,1,0,33,6,8], facecolor='y', edgecolor="red")
In the above code we pass 'y' as facecolor, which means the bars are displayed in yellow.
To give a title to the histogram, write the code below before calling show():
plt.title("Histogram Heading")
The histogram can be saved by clicking on the Save button in the GUI. Also, the following code will save the histogram as a PNG image.
plt.savefig("temp.png")
For the x and y labels, the code below can be written:
plt.xlabel('Value')
plt.ylabel('Frequency')
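Putting the pieces together, a complete script might look like the sketch below. Two things are assumptions made for this illustration: the off-screen "Agg" backend (so the script runs without a display, which is also why show() is left out) and saving the image into the system temporary folder rather than the current folder.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # assumption: off-screen backend so no window is needed
import matplotlib.pyplot as plt

plt.hist([1, 11, 21, 31, 41, 51],
         bins=[0, 10, 20, 30, 40, 50, 60],
         weights=[10, 1, 0, 33, 6, 8],
         facecolor='y', edgecolor="red")
plt.title("Histogram Heading")
plt.xlabel('Value')
plt.ylabel('Frequency')

# Hypothetical output location: temp.png inside the system temp folder
out_path = os.path.join(tempfile.gettempdir(), "temp.png")
plt.savefig(out_path)
print(os.path.exists(out_path))
```

On a desktop machine the matplotlib.use("Agg") line can be dropped and plt.show() added at the end to display the window.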