Introduction:
A histogram is a widely used technique in data visualization. It is a graphical representation of the distribution of numerical data and was first introduced by Karl Pearson. It gives an estimate of the distribution of a continuous (quantitative) variable. It looks similar to a bar graph. To construct a histogram, the first step is to "bin" the range of values, which means dividing the entire range of values into a series of intervals and then counting how many values fall into each interval. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent, and are often (but are not required to be) of equal size.
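The binning step described above can be sketched in plain Python. This is only an illustration: the function name count_per_bin and the sample values are made up for this example, and the rule that the last bin also includes its right edge is an assumption chosen to match how NumPy and matplotlib treat the final interval.

```python
from bisect import bisect_right

def count_per_bin(values, edges):
    """Count how many values fall into each interval defined by sorted edges.

    Bin i covers edges[i] <= v < edges[i + 1]; the last bin also includes
    its right edge. Values outside the overall range are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside every bin
        # bisect_right returns the index of the first edge greater than v;
        # the result is clamped so that v == edges[-1] lands in the last bin
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

values = [2, 7, 12, 18, 19, 25, 33, 38, 41, 50]  # made-up sample data
edges = [0, 10, 20, 30, 40, 50]                  # five adjacent, equal-width bins
counts = count_per_bin(values, edges)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left}-{right}: {n}")
```

Each printed line is one bin with the number of values that fell into it; a histogram draws those counts as bar heights.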
Difference between a histogram and a bar chart / graph –
A bar chart mainly represents categorical data (data that has labels associated with it). It is usually drawn with rectangular bars whose lengths are proportional to the values they represent. A histogram, on the other hand, is used to describe distributions: given a set of numerical data, how are the values distributed?
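To make the difference concrete, here is a small sketch that draws both kinds of chart side by side with matplotlib. The fruit sales and ages are invented sample data, and the Agg backend is selected only so that the example also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up categorical data: one bar per label, height is the value itself
fruits = ["apple", "banana", "cherry"]
units_sold = [30, 45, 12]

# Made-up numerical data: the histogram counts how many values fall in each bin
ages = [21, 22, 25, 31, 34, 35, 38, 42, 47, 58]

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

bars = ax_bar.bar(fruits, units_sold)
ax_bar.set_title("Bar chart: categories")

counts, edges, patches = ax_hist.hist(ages, bins=[20, 30, 40, 50, 60])
ax_hist.set_title("Histogram: distribution")

print(counts)  # number of ages in each 10-year bin
```

In the bar chart the heights are supplied directly, one per label. In the histogram only the raw values and the bin edges are supplied, and the heights are computed by counting.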
Histogram in Python –
Drawing a histogram in Python is easy: it takes only 3-4 lines of code. The complexity arises when we try to visualize live data.
To draw a histogram in Python, the following concepts must be clear.
Title – To display the heading of the histogram.
Color – To show the color of the bars.
Axis – The y-axis and x-axis.
Data – The data can be represented as an array.
Height and width of bars – These are determined based on the analysis. The width of a bar is called the bin or interval.
Border color – To display the border color of the bars.
There are various ways to create a histogram in Python. One of them is to use the matplotlib library, with which a histogram can be created in just a few statements.
First install the matplotlib library by running the following command at the command prompt:
pip install matplotlib
After installation we can create a histogram. If pip does not work, move to the folder that contains pip.exe and run the above command from there, or copy pip.exe to the folder where you want to run the command.
PROGRAM :
import matplotlib.pyplot as plt

plt.hist([5,15,25,35,45,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()
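# The first argument of hist() is the list of positions (x values) at which the weights are placed. The number of positions must match the number of weights, otherwise an error is raised.
# The second argument (bins) gives the edges of the intervals.
# The third argument (weights) gives the weight added to the bar for each position.
# The position 55 lies outside the bins (0 to 50), so its weight 8 is not drawn.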
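Output: a histogram with red-edged bars of heights 20, 10, 45, 33 and 6 over the bins 0-10, 10-20, 20-30, 30-40 and 40-50.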
For better understanding, we develop the same program with a minor change.
import matplotlib.pyplot as plt

plt.hist([5,15,25,35,15,55],
         bins=[0,10,20,30,40,50],
         weights=[20,10,45,33,6,8],
         edgecolor="red")
plt.show()
# The interval (bin) 40 to 50 has no bar because the first argument (list) of hist() contains no position between 40 and 50.
# In the interval 10 to 20 the bar height is 16 (the weights 10 and 6 are added) because 15 appears twice in the first argument.
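The bar heights can also be checked without drawing anything. matplotlib's hist() uses NumPy's histogram function for the binning, so the short sketch below computes the same per-bin sums as plain numbers. The variable names are chosen here for illustration only.

```python
import numpy as np

positions = [5, 15, 25, 35, 15, 55]   # same first argument as the program above
bin_edges = [0, 10, 20, 30, 40, 50]
weights = [20, 10, 45, 33, 6, 8]

# Each weight is added to the bin that contains its position;
# positions outside the bin range (here 55) are ignored.
heights, edges = np.histogram(positions, bins=bin_edges, weights=weights)

for left, right, h in zip(edges[:-1], edges[1:], heights):
    print(f"{left}-{right}: {h}")
```

The 10-20 bin ends up with 16 (10 + 6), the 40-50 bin with 0, and the weight 8 attached to position 55 does not appear anywhere because 55 lies beyond the last bin edge.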
By default the bars of a histogram are displayed in blue, but we can change this to another color with the following code.
plt.hist([1,11,21,31,41,51], bins=[0,10,20,30,40,50,60], weights=[10,1,0,33,6,8],
         facecolor='y', edgecolor="red")
In the above code we pass 'y' as facecolor, which means the bars are displayed in yellow.
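As a quick check of what the color arguments do, the sketch below draws the same histogram and then reads the colors back from the first bar. The Agg backend and the variable names are assumptions made so that the example runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(
    [1, 11, 21, 31, 41, 51],
    bins=[0, 10, 20, 30, 40, 50, 60],
    weights=[10, 1, 0, 33, 6, 8],
    facecolor="y",     # fill color of the bars
    edgecolor="red",   # border color of the bars
)

first_bar = patches[0]  # hist() returns one rectangle per bin
print("face:", first_bar.get_facecolor(), "expected:", to_rgba("y"))
print("edge:", first_bar.get_edgecolor(), "expected:", to_rgba("red"))
```

The colors are reported as RGBA tuples. Other single-letter codes such as 'r', 'g', 'b' and 'k' can be passed in the same way.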
To give a title to the histogram, write the code below before calling show():
plt.title("Histogram Heading")
The histogram can be saved by clicking on the Save button on the GUI.
Also, the following code will save the histogram as a PNG image.
plt.savefig("temp.png")
For the x and y labels, the code below can be written:
plt.xlabel('Value')
plt.ylabel('Frequency')
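Putting the pieces together, a complete script might look like the sketch below. It is one possible arrangement, not the only one: the bins are extended to 60 here so that every position falls inside a bin, the file is written to the system's temporary folder (an assumption, so that the example does not clutter the working directory), and the Agg backend is used so that no window has to be opened.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

plt.figure()  # start a fresh figure
plt.hist(
    [5, 15, 25, 35, 45, 55],
    bins=[0, 10, 20, 30, 40, 50, 60],
    weights=[20, 10, 45, 33, 6, 8],
    facecolor="y",
    edgecolor="red",
)
plt.title("Histogram Heading")
plt.xlabel("Value")
plt.ylabel("Frequency")

# Save before any call to plt.show(); the output location is an assumption
out_path = os.path.join(tempfile.gettempdir(), "temp.png")
plt.savefig(out_path)
print("saved to", out_path)
```

When working interactively, call plt.savefig() before plt.show(), because the figure may already be closed once the window has been dismissed.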