Bubble chart using Python

A bubble chart is a powerful and useful chart for representing data with three or four dimensions.

The position of the bubble is determined by the x & y axis values. These are the first two properties.

The size of the bubble can be controlled by the third property.

The colour of the bubble can be controlled by the fourth property.

A sample program to create a bubble chart using the Python library matplotlib is given below.

import matplotlib.pyplot as plt
import numpy as np

# Create some dummy data using NumPy's random functions.
# The x axis represents one property and the y axis another;
# the bubble size represents a third property and the bubble colour a fourth.

x = np.random.rand(50)
y = np.random.rand(50)
z = np.random.rand(50)
colors = np.random.rand(50)
# Use the scatter function; s is the marker area, so z is scaled up to make the bubbles visible
plt.scatter(x, y, s=z * 1000, c=colors)
# Display the chart
plt.show()

Here we are generating some random data using numpy and plotting the bubble chart using matplotlib.
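
As a variation on the program above, the following is a minimal sketch that makes the fourth property easier to read by adding axis labels and a colour bar, and uses a seeded random generator so the same chart is drawn on every run. The function name make_bubble_chart, the seed value, the label texts and the output file name are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def make_bubble_chart(n_points=50, seed=0):
    """Draw a bubble chart of random data and return the figure and axes."""
    rng = np.random.default_rng(seed)  # seeded so the chart is reproducible
    x = rng.random(n_points)           # first property: horizontal position
    y = rng.random(n_points)           # second property: vertical position
    size = rng.random(n_points)        # third property: bubble size
    color = rng.random(n_points)       # fourth property: bubble colour

    fig, ax = plt.subplots()
    # s is the marker area; alpha keeps overlapping bubbles visible
    bubbles = ax.scatter(x, y, s=size * 1000, c=color, alpha=0.6)
    ax.set_xlabel("First property (x)")
    ax.set_ylabel("Second property (y)")
    # The colour bar shows how colours map to values of the fourth property
    fig.colorbar(bubbles, ax=ax, label="Fourth property (colour)")
    return fig, ax


fig, ax = make_bubble_chart()
fig.savefig("bubble_chart.png")  # hypothetical output file name
```

To view the chart in a window instead of saving it, call plt.show() after make_bubble_chart().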

A sample output is given below.


[Figure: Bubble chart using Python]


Basic statistics using Python

Python comes with a built-in statistics module, which makes common statistical calculations easy.

The following are some of the commonly used statistical functions.

Arithmetic Mean

The arithmetic mean is the average of a group of values. The formula is

Mean = Sum of group of values / Total number of values in the group

Mean vs Average: What’s the Difference?

Answer: In everyday usage both are the same. "Average" normally refers to the arithmetic mean.

Suppose we have a list of values as shown below.

values = [1,2,3,4,5,6,7,8]

To calculate the mean without using any statistics function, we can use the following code snippet.

values = [1,2,3,4,5,6,7,8]
# Use "total" rather than "sum" so the built-in sum() function is not shadowed
total = 0
for value in values:
    total += value

mean = total/len(values)
print("Sum -->:", total)
print("Total Count-->:", len(values))
print("Arithmetic Mean-->:", mean)

The above program involves multiple steps. Instead of writing the entire logic ourselves, we can calculate the mean with the statistics.mean() function, as in the following code snippet.

import statistics
values = [1,2,3,4,5,6,7,8]
print("Arithmetic Mean--> ", statistics.mean(values))

Mode

The mode is the most frequently occurring value in a data set. It can be calculated very easily using the statistics.mode() function.

import statistics
values = [1,2,2,2,2,2,2,1,2,3,4,5,2,3,4,5,6,66,6,6,6,6]
print("Mode--> ", statistics.mode(values))
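
A data set can have more than one most frequent value. The following is a minimal sketch, assuming Python 3.8 or later, of how such ties behave: statistics.mode() returns the first of the tied values it encounters, while statistics.multimode() returns all of them. The list tied_values is made up for illustration.

```python
import statistics

# Made-up sample in which 2 and 6 both appear three times
tied_values = [2, 2, 2, 6, 6, 6, 1, 3]

# mode() returns the first of the tied values it encounters (Python 3.8+)
print("Mode--> ", statistics.mode(tied_values))
# multimode() returns every tied value, in order of first appearance
print("All modes--> ", statistics.multimode(tied_values))
```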

Median

The median is the middle value of a numerical data set. It is found by ordering the data set from lowest to highest and taking the value in the exact middle. If the data set contains an odd number of values, the median is the single middle value of the ordered list. If it contains an even number of values, the median is the mean of the two middle values of the ordered list.

The median can be calculated with the statistics.median() function.

import statistics
values = [21,1,2,3,4,5,6,7,8,24,29,50]
print("Arithmetic Median--> ", statistics.median(values))
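
For comparison with the manual mean calculation earlier, the odd and even cases of the median can be sketched by hand as follows. This is an illustrative sketch and not how the statistics module is implemented; the function name manual_median and the list odd_values are assumptions made for the example.

```python
import statistics


def manual_median(values):
    """Return the median by sorting and picking the middle value(s)."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)  # order from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single value in the exact middle
        return ordered[mid]
    # Even count: the mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


even_values = [21, 1, 2, 3, 4, 5, 6, 7, 8, 24, 29, 50]  # 12 values
odd_values = [21, 1, 2, 3, 4]                           # 5 values

print("Median (even count)--> ", manual_median(even_values))  # 6.5
print("Median (odd count)--> ", manual_median(odd_values))    # 3
```

Both results match what statistics.median() returns for the same lists.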