Basic statistics using Python

Python comes with a built-in statistics module (available since Python 3.4), which makes common statistical calculations easy.

The following are the commonly used statistical functions.

Arithmetic Mean

The arithmetic mean is the average of a group of values. It is calculated as

Mean = Sum of group of values / Total number of values in the group

Mean vs Average: What’s the Difference?

Answer: In everyday usage both are the same. "Average" normally refers to the arithmetic mean.

Suppose we have a list of values as shown below.

values = [1,2,3,4,5,6,7,8]

To calculate the mean without the statistics module, we can use the following snippet.

values = [1,2,3,4,5,6,7,8]
total = 0  # named "total" so the built-in sum() is not shadowed
for value in values:
    total += value

mean = total/len(values)
print("Sum -->:", total)
print("Total Count-->:", len(values))
print("Arithmetic Mean-->:", mean)

The above program involves multiple steps. Instead of writing the entire logic ourselves, we can calculate the mean with a single call, as in the following snippet.

import statistics
values = [1,2,3,4,5,6,7,8]
print("Arithmetic Mean--> ", statistics.mean(values))
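One practical difference from the manual loop is the empty-list case: statistics.mean() raises statistics.StatisticsError when there is no data, while the loop version fails with ZeroDivisionError. A minimal sketch of handling that case is below; safe_mean is a hypothetical helper name used only for illustration.

```python
import statistics


def safe_mean(values):
    """Return the arithmetic mean, or None when there is no data."""
    try:
        return statistics.mean(values)
    except statistics.StatisticsError:
        # statistics.mean() refuses to average an empty data set
        return None


print(safe_mean([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5
print(safe_mean([]))                        # None
```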

Mode

The mode is the value that occurs most frequently in a data set. It can be calculated with the statistics.mode() function.

import statistics
values = [1,2,2,2,2,2,2,1,2,3,4,5,2,3,4,5,6,66,6,6,6,6]
print(statistics.mode(values))
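When several values are tied for the highest frequency, a single mode does not tell the whole story. The sketch below assumes Python 3.8 or later, where statistics.multimode() is available, and uses a small made-up list to show all the tied values together with their counts.

```python
import statistics
from collections import Counter

# Made-up data in which 1 and 2 both occur twice
values = [1, 1, 2, 2, 3]

# multimode() returns every value that shares the highest frequency
print(statistics.multimode(values))    # [1, 2]

# Counter shows the actual counts behind that result
print(Counter(values).most_common(2))  # [(1, 2), (2, 2)]
```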

Median

The median is the middle value of a numerical data set. It is found by ordering the data set from lowest to highest and picking the number in the exact middle. If the count of numbers in the group is odd, the median is the number in the exact middle of the ordered list. If the count is even, the median is the mean of the two numbers in the middle of the ordered list.

This can be calculated with the statistics.median() function.

import statistics
values = [21,1,2,3,4,5,6,7,8,24,29,50]
print("Arithmetic Median--> ", statistics.median(values))
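To make the even-count rule concrete, here is a small sketch that computes the median by hand for the same list and compares it with statistics.median(). The variable names are illustrative only.

```python
import statistics

values = [21, 1, 2, 3, 4, 5, 6, 7, 8, 24, 29, 50]

ordered = sorted(values)  # [1, 2, 3, 4, 5, 6, 7, 8, 21, 24, 29, 50]
count = len(ordered)
middle = count // 2

if count % 2 == 1:
    # Odd count: the single middle element is the median
    manual_median = ordered[middle]
else:
    # Even count: average the two middle elements (6 and 7 here)
    manual_median = (ordered[middle - 1] + ordered[middle]) / 2

print(manual_median)              # 6.5
print(statistics.median(values))  # 6.5
```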

 

How to maintain packages in a Python project?

In most cases, we need external packages to develop a Python program. These packages are either available in the PyPI repository or available locally as archive files. Usually people just install the packages directly into the Python environment using the pip command.

By default, pip installs packages from the PyPI repository. If we do not specify a version, it selects the latest available version of the package that supports the Python interpreter present in the environment. Packages may change drastically between releases, so an application developed with version X of a package may not work with version Y of the same package. Noting down the package names alone is therefore not enough to manage the project. We need the list of all packages along with their versions. Installing the packages manually one by one is also tedious, because a single project can depend on tens of packages.

The best practices for managing packages in a project are:

  1. Use a Python virtual environment.
  2. Create a requirements.txt file to maintain the package details.

The details of how to create and use a virtual environment are explained in my previous post.
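As a quick reminder, a virtual environment can also be created from Python itself with the standard venv module. This is only a sketch: create_environment and the demo-env directory name are hypothetical, and the demo writes into a temporary directory that is deleted afterwards.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path of its config file."""
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"


with tempfile.TemporaryDirectory() as workdir:
    # with_pip=False keeps this demo quick; a real project environment would want pip
    config = create_environment(Path(workdir) / "demo-env", with_pip=False)
    created = config.exists()

print("environment created:", created)
```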

requirements.txt is a simple text file that lists all the package dependencies with their versions. A sample is given below.

Click==7.0
Flask==1.0.2
Flask-Cors==3.0.7
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
six==1.12.0
Werkzeug==0.14.1
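Since the value of this file lies in the exact versions, it can be useful to check that every entry really is pinned with ==. The following is a minimal sketch with a deliberately simple parser. find_unpinned is a hypothetical helper, and the sample text contains two unpinned lines added purely for illustration.

```python
def find_unpinned(requirements_text):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):      # skip blanks and pip options such as -r
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = """\
Click==7.0
Flask==1.0.2
Flask-Cors
six>=1.12.0
"""

print(find_unpinned(sample))  # ['Flask-Cors', 'six>=1.12.0']
```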

All the listed packages can be installed using a single command.

pip install -r requirements.txt

All the packages installed in an environment can be captured in a requirements.txt file in one shot using the following command.

pip freeze > requirements.txt
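To get a feel for what ends up in that file, the installed packages can also be listed from Python in the same name==version format. This is a rough approximation and not how pip itself is implemented. It assumes Python 3.8 or later for importlib.metadata, and frozen_requirements is a hypothetical name.

```python
from importlib import metadata


def frozen_requirements():
    """Return 'name==version' lines for the distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in frozen_requirements():
    print(line)
```

The real pip freeze applies extra rules (for example around editable installs), so its output can differ from this list.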

This practice helps developers manage the dependency list and makes code migration easier.

Python code to list all the running EC2 instances across all regions in an AWS account

This code snippet lists all running EC2 instances across all regions in an AWS account. I used the Python boto3 package to develop it. The code picks up the list of EC2 regions dynamically, so it keeps working without modification even when AWS adds a new region.

Note: Only the basic API calls needed to list the instance details are included in this program. Proper coding conventions are not followed. 🙂
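A minimal sketch of the approach described above might look like the following. It is not the original program. It assumes boto3 is installed and AWS credentials are configured. The names list_running_instances, make_client and the us-east-1 seed region are my own choices for illustration. The regions come from describe_regions(), which by default returns the regions enabled for the account.

```python
def list_running_instances(make_client, seed_region="us-east-1"):
    """Return {region: [instance ids]} for running instances in every region.

    make_client(region_name) must return an EC2 client for that region.
    """
    # Ask one region for the current region list, so new regions are picked up
    regions = make_client(seed_region).describe_regions()["Regions"]
    running = {}
    for region in regions:
        name = region["RegionName"]
        paginator = make_client(name).get_paginator("describe_instances")
        pages = paginator.paginate(
            Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
        )
        ids = [
            instance["InstanceId"]
            for page in pages
            for reservation in page["Reservations"]
            for instance in reservation["Instances"]
        ]
        if ids:
            running[name] = ids
    return running


def main():
    import boto3  # third-party package: pip install boto3

    result = list_running_instances(
        lambda region: boto3.client("ec2", region_name=region)
    )
    for region, ids in result.items():
        print(region, ids)
```

Calling main() prints one line per region that has running instances. Passing the client factory as an argument keeps the listing logic separate from the AWS connection details.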

How to hide or obfuscate Python source code?

Sometimes we need to deliver an application without its source code. In Java this is easy and widely practised. What can we do if we want to hide our source code in Python?

I looked at several solutions for obfuscating source code. One is pyminifier, which is a good tool. It renames methods and variables so that the obfuscated code looks more complicated. Still, with some time and effort the code can be read.

Another way to hide the source text is to use the compiler built into Python itself. It generates bytecode, which we can use for execution. Keep in mind that bytecode can still be disassembled or decompiled, so this is a deterrent rather than real protection.

python -OO -m py_compile <your_code.py>

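On Python 2 this generates a .pyo file. (Python 3.5 and later no longer produce .pyo files; the optimized bytecode is written to the __pycache__ directory with an .opt-2.pyc suffix.) Rename the .pyo file to a .py extension and use it for execution. It will work just like the actual code.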

NB: If your program imports modules obfuscated like this, rename them with a .pyc suffix instead.
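On Python 3 the same compile step can be driven from a script with the standard py_compile module. The sketch below assumes Python 3 and uses a throwaway hello.py created in a temporary directory (a hypothetical file, just for the demo). It compiles with optimize=2, the equivalent of -OO, writes the result to an explicit .pyc path, deletes the source, and runs the bytecode file directly.

```python
import py_compile
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "hello.py"
    source.write_text('"""Docstring removed by optimize=2."""\nprint("hello from bytecode")\n')

    # optimize=2 corresponds to -OO; cfile picks the output name explicitly
    compiled = Path(workdir) / "hello.pyc"
    py_compile.compile(str(source), cfile=str(compiled), optimize=2, doraise=True)

    # Remove the source to show that only the bytecode is needed to run
    source.unlink()
    result = subprocess.run(
        [sys.executable, str(compiled)],
        capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)  # hello from bytecode
```

A .pyc file is tied to the Python version that produced it, so it should be built with the same interpreter version that will run it.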