Friday, June 21, 2019

XII-IP Histogram

Introduction:

A histogram is a powerful technique in data visualization. It is a graphical representation of the distribution of numerical data and was first introduced by Karl Pearson. It gives an estimate of the distribution of a continuous (quantitative) variable and looks similar to a bar graph. To construct a histogram, the first step is to "bin" the range of values, that is, divide the entire range of values into a series of intervals, and then count how many values fall into each interval. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent and are often (but are not required to be) of equal size.
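The binning step can be tried by hand before using any plotting library. The sketch below uses made-up values and bin edges (they are not from any real data set) and counts how many values fall into each interval. It assumes the convention that numpy also follows: every bin includes its left edge, and only the last bin includes its right edge.

```python
# Count how many values fall into each consecutive, non-overlapping bin.
values = [2, 7, 12, 15, 18, 23, 31, 38, 44]   # sample data, made up for illustration
edges = [0, 10, 20, 30, 40, 50]                # bin edges: 0-10, 10-20, ..., 40-50

counts = [0] * (len(edges) - 1)
for v in values:
    for i in range(len(edges) - 1):
        last_bin = (i == len(edges) - 2)
        # Each bin includes its left edge; only the last bin also includes its right edge.
        if edges[i] <= v < edges[i + 1] or (last_bin and v == edges[i + 1]):
            counts[i] += 1
            break

print(counts)  # [2, 3, 1, 2, 1]
```

Each number in the result is the height of one bar of the histogram.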



Difference between a histogram and a bar chart / graph

A bar chart mainly represents categorical data (data that has labels associated with it). It is drawn using rectangular bars whose lengths are proportional to the values they represent.
A histogram, on the other hand, is used to describe a distribution: given a set of numerical data, it shows how the values are spread across intervals.
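The difference is easier to see side by side. The following is only a sketch with hypothetical data (the fruit sales and the marks are invented for this illustration). It selects the Agg backend so that it runs without a display and writes the picture to a file named bar_vs_hist.png, which is also an assumed name.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical data, made up for this illustration.
fruit_sales = {"Apple": 30, "Banana": 45, "Cherry": 12}   # categorical: one bar per label
marks = [35, 42, 47, 51, 55, 58, 62, 64, 71, 88]          # numerical: grouped into ranges

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: the length of each bar is the value attached to a category.
ax_bar.bar(list(fruit_sales.keys()), list(fruit_sales.values()))
ax_bar.set_title("Bar chart (categories)")

# Histogram: the height of each bar is how many marks fall inside a range.
counts, edges, _ = ax_hist.hist(marks, bins=[30, 50, 70, 90], edgecolor="black")
ax_hist.set_title("Histogram (distribution)")

fig.savefig("bar_vs_hist.png")
```

In the bar chart the x-axis holds labels, while in the histogram it holds a continuous range divided into bins.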





Histogram in Python
Drawing a histogram in Python is very easy; it takes only 3-4 lines of code. The complexity arises when we have to deal with live data for visualization.

To draw a histogram in Python, the following concepts must be clear.
Title: To display the heading of the histogram.
Color: To show the color of the bars.
Axis: y-axis and x-axis.
Data: The data can be represented as an array.
Height and width of bars: These are determined based on the analysis.
Bin (interval): The width of a bar corresponds to its bin or interval.
Border color: To display the border color of the bars.

There are various ways to create a histogram in Python. One of them is to use the matplotlib library. With this library we can create a histogram easily, in just a few statements.
First install the matplotlib library by running the following statement at the command prompt (not inside the Python interpreter):
pip install matplotlib

After installation we can create a histogram. If the pip command is not found, move to the folder that contains pip.exe and run the above command from there, or run pip through Python instead:
python -m pip install matplotlib
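A quick way to confirm that the installation worked is to import the library and print its version. This is only a small check; the version number printed depends on what pip installed on your machine.

```python
import matplotlib

# If this import succeeds, matplotlib is installed for this Python interpreter.
print(matplotlib.__version__)
```

If the import fails with ModuleNotFoundError, the library was probably installed for a different Python installation than the one you are running.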

PROGRAM :
import matplotlib.pyplot as plt

# positions, bin edges and one weight per position
plt.hist([5,15,25,35,45,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()


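# The first argument of hist() is the list of positions (x values), one for each weight. The number of positions must match the number of weights, otherwise an error is raised.
# The second argument (bins) gives the edges of the intervals.
# The third argument (weights) gives the weight that each position adds to the bar of the bin it falls in.
# A position that lies outside all the bins (55 here) is ignored, so its weight 8 is not drawn.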

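Output: a histogram with red-edged bars of heights 20, 10, 45, 33 and 6 over the intervals 0-10, 10-20, 20-30, 30-40 and 40-50.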
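The bar heights of the program above can also be checked without drawing anything. The sketch below uses numpy.histogram, which accepts the same positions, bins and weights (the variable names are chosen here only for illustration). It shows that the position 55 falls outside the last edge, 50, so its weight is dropped.

```python
import numpy as np

positions = [5, 15, 25, 35, 45, 55]
bin_edges = [0, 10, 20, 30, 40, 50]
weights = [20, 10, 45, 33, 6, 8]

# Each weight is added to the bin that contains the matching position.
heights, edges = np.histogram(positions, bins=bin_edges, weights=weights)

# One height per bin: 20, 10, 45, 33 and 6. The weight 8 (position 55) is not counted.
print(heights)
```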


For better understanding, we develop the same program with a minor change.
import matplotlib.pyplot as plt

# 45 is replaced by a second 15 in the list of positions
plt.hist([5,15,25,35,15,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()
# In the interval (bin) 40 to 50 there is no bar, because no position between 40 and 50 is given in the first argument (list) of hist().
# In the interval 10 to 20 the height is displayed as 16 (both weights, 10 + 6, are added), because 15 appears twice in the first argument.
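The same result can be read back from the values that hist() returns: the bar heights, the bin edges and the drawn bars. The sketch below assumes the Agg backend so that it runs without opening a window; the names heights, edges and bars are chosen only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to use an interactive window
import matplotlib.pyplot as plt

# hist() returns the bar heights, the bin edges and the bar objects.
heights, edges, bars = plt.hist(
    [5, 15, 25, 35, 15, 55],
    bins=[0, 10, 20, 30, 40, 50],
    weights=[20, 10, 45, 33, 6, 8],
    edgecolor="red",
)

# Print each interval with its height: the 10-20 bin holds 10 + 6 = 16,
# and the 40-50 bin holds 0 because no position falls inside it.
for left, right, h in zip(edges[:-1], edges[1:], heights):
    print(left, "to", right, ":", h)
```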

Histogram in Python
By default the bars of a histogram are displayed in blue, but we can change this to another color with the following code.
plt.hist([1,11,21,31,41,51], bins=[0,10,20,30,40,50,60], weights=[10,1,0,33,6,8], facecolor='y', edgecolor="red")
In the above code we pass 'y' as facecolor, which means the bars are filled with yellow.
To give a title to the histogram, write the code below before calling show():
plt.title("Histogram Heading")
The histogram can be saved by clicking the Save button in the GUI window. The following code also saves the histogram as a PNG image. Call it before show(), because the figure may be empty once the window has been closed.
plt.savefig("temp.png")
For the x and y labels, the code below can be written:
plt.xlabel('Value')
plt.ylabel('Frequency')
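Putting these pieces together gives one complete program. This is a sketch rather than the only possible arrangement: it selects the Agg backend so that it runs without a display, and for that reason it saves the image instead of calling show(). When running it on a desktop, remove the two backend lines and add plt.show() as the last statement, after savefig().

```python
import os

import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to use an interactive window
import matplotlib.pyplot as plt

plt.figure()
plt.hist(
    [1, 11, 21, 31, 41, 51],
    bins=[0, 10, 20, 30, 40, 50, 60],
    weights=[10, 1, 0, 33, 6, 8],
    facecolor="y",      # yellow bars
    edgecolor="red",    # red border around each bar
)
plt.title("Histogram Heading")
plt.xlabel("Value")
plt.ylabel("Frequency")

# Save first; plt.show() would come after this line in an interactive session.
plt.savefig("temp.png")
print("saved:", os.path.exists("temp.png"))
```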
