Introduction:
A histogram is a powerful technique in data visualization. It is an accurate graphical representation of the distribution of numerical data and was first introduced by Karl Pearson. It is an estimate of the distribution of a continuous (quantitative) variable. It is similar to a bar graph. To construct a histogram, the first step is to "bin" the range of values, which means dividing the entire range of values into a series of intervals and then counting how many values fall into each interval. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent, and are often (but are not required to be) of equal size.
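The binning step described above can be sketched with NumPy. The sample values and bin edges here are made up for illustration; np.histogram only counts the values per interval and draws nothing.

```python
import numpy as np

# Made-up sample values and bin edges, for illustration only.
values = [2, 7, 12, 18, 25, 31, 38, 44]
edges = [0, 10, 20, 30, 40, 50]

# Count how many values fall into each consecutive, non-overlapping interval.
counts, _ = np.histogram(values, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left} to {right}: {count} value(s)")
```

Each count becomes the height of one bar when the same data is drawn as a histogram.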
Difference between a histogram and a bar chart / graph –
A bar chart mainly represents categorical data (data that has some labels associated with it). It is usually drawn using rectangular bars with lengths proportional to the values they represent. A histogram, on the other hand, is used to describe distributions: given a set of data, how are its values distributed?
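As a rough illustration of the difference, the sketch below draws a bar chart from labelled categories and a histogram from raw numbers. All data is made up, and the "Agg" backend is an assumption so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up categorical data: one value per label.
fruits = ["apple", "banana", "cherry"]
sold = [12, 30, 7]

# Made-up numerical data: many raw measurements to be grouped into bins.
heights_cm = [150, 152, 158, 160, 161, 165, 167, 170, 171, 178]

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: one bar per category, bar length proportional to its value.
bars = ax_bar.bar(fruits, sold)
ax_bar.set_title("Bar chart: categories")

# Histogram: values are counted per interval to show their distribution.
counts, edges, patches = ax_hist.hist(heights_cm, bins=[150, 160, 170, 180],
                                      edgecolor="black")
ax_hist.set_title("Histogram: distribution")
```

In an interactive session, calling plt.show() afterwards would display both plots side by side.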
Histogram in Python –
Drawing a histogram in Python is very easy. All we have to do is write 3-4 lines of code. Complexity arises only when we try to visualize live data.
To draw a histogram in Python, the following concepts must be clear:
Title – To display the heading of the histogram.
Color – To show the color of the bars.
Axis – The y-axis and x-axis.
Data – The data can be represented as an array.
Height and width of bars – These are determined based on the analysis. The width of a bar is called the bin or interval.
Border color – To display the border color of the bars.
There are various ways to create a histogram in Python. One of them is the matplotlib library, with which a histogram can be created in just a few statements.
First install the matplotlib library by running the following statement at the command prompt:
pip install matplotlib
After installation we can create a histogram. If pip does not work, move to the folder that contains pip.exe and run the above command there, or run python -m pip install matplotlib instead.
PROGRAM :
import matplotlib.pyplot as plt

plt.hist([5, 15, 25, 35, 45, 55],
         bins=[0, 10, 20, 30, 40, 50],
         weights=[20, 10, 45, 33, 6, 8],
         edgecolor="red")
plt.show()
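# The first argument of hist() is the list of data values (positions on the x-axis); each value is placed in the bin that contains it.
# The number of values must match the number of weights, otherwise an error is raised.
# The second argument (bins) gives the edges of the intervals.
# The third argument (weights) gives the weight each value adds to the height of its bar.
# The value 55 lies outside the bins (0 to 50), so its weight 8 is not displayed.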
output:
For better understanding, we develop the same program with a minor change.
import matplotlib.pyplot as plt

plt.hist([5, 15, 25, 35, 15, 55],
         bins=[0, 10, 20, 30, 40, 50],
         weights=[20, 10, 45, 33, 6, 8],
         edgecolor="red")
plt.show()
# The interval (bin) 40 to 50 has no bar because the first argument (list) of hist() contains no value between 40 and 50.
# The bar for the interval 10 to 20 has height 16 (10 + 6, both weights are added) because 15 appears twice in the first argument.
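The bar heights of both programs can be checked without drawing anything. This is a minimal sketch that assumes NumPy is installed; np.histogram accepts the same bins and weights arguments as plt.hist and returns the summed weight per bin.

```python
import numpy as np

edges = [0, 10, 20, 30, 40, 50]
weights = [20, 10, 45, 33, 6, 8]

# First program: each value falls in its own bin; 55 lies outside 0 to 50.
first, _ = np.histogram([5, 15, 25, 35, 45, 55], bins=edges, weights=weights)

# Second program: 15 appears twice, so its two weights (10 and 6) are added.
second, _ = np.histogram([5, 15, 25, 35, 15, 55], bins=edges, weights=weights)

print(first)   # summed weight per bin for the first program
print(second)  # summed weight per bin for the second program
```

In both cases the weight 8 belongs to the value 55, which is outside the bins, so it does not appear in any bar.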
By default the bars of a histogram are displayed in blue, but we can change this to another color with the following code.
plt.hist([1, 11, 21, 31, 41, 51], bins=[0, 10, 20, 30, 40, 50, 60],
         weights=[10, 1, 0, 33, 6, 8], facecolor='y', edgecolor="red")
In the above code we pass 'y' as facecolor, which means the bars are displayed in yellow.
To give a name to the histogram, write the code below before calling show():
plt.title("Histogram Heading")
The histogram can be saved by clicking on the Save button in the GUI.
Also, the following code will save the histogram as a PNG image (call it before show(), otherwise the saved image may be blank).
plt.savefig("temp.png")
For the x and y labels, the code below can be written:
plt.xlabel('Value')
plt.ylabel('Frequency')
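Putting the pieces together, a complete script might look like the sketch below. It reuses the values from the color example; the "Agg" backend is an assumption so the script can run without a display, and temp.png is written to the current folder.

```python
import os

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

values = [1, 11, 21, 31, 41, 51]
edges = [0, 10, 20, 30, 40, 50, 60]
weights = [10, 1, 0, 33, 6, 8]

# Yellow bars with a red border; hist() also returns the bar heights.
bar_heights, _, bars = plt.hist(values, bins=edges, weights=weights,
                                facecolor="y", edgecolor="red")

plt.title("Histogram Heading")
plt.xlabel("Value")
plt.ylabel("Frequency")

# Save before any call to show() so the image is not blank.
plt.savefig("temp.png")
```

With an interactive backend, adding plt.show() as the last line would also open the plot window.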



 