Switch Case Statements in Python

Switch-case statements are a popular conditional control structure in almost all programming languages, but surprisingly, they are not available in Python.

Question: Is there a switch-case statement in Python?

Answer: The direct answer is NO.

Alternative options for switch-case statements in Python

Option 1: Using if-elif-else statements. An example is given below.

if case == "case1":
    execute_func_case1()
elif case == "case2":
    execute_func_case2()
elif case == "case3":
    execute_func_case3()
else:
    execute_default_func()

Wow, excellent! The above code looks good, right? It works exactly like a switch-case statement, so why do we even need switch-case statements in Python?

Have you noticed a problem? The if-elif-else chain is fine as long as we have only a few cases. Imagine the situation with 10 or more elif conditions. Now you see the problem, right?

Let's try the second option.

Option 2: Using a list in Python as an alternative to switch-case statements

An example is given below.

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

# The list index acts as the "case"
case_funcs = [add, sub]

print(case_funcs[0](1, 2))  # calls add(1, 2) -> 3
print(case_funcs[1](1, 2))  # calls sub(1, 2) -> -1


In the above program, we don't have to use if-elif-else blocks; instead, we call the required function using its position (index) in the list. This looks better than the previous option, right? But what about the default case? Also, what if someone passes an index greater than the size of the list? It will throw an IndexError, and there is no built-in way to handle a default case.
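One way to guard against this (a minimal sketch continuing the example above; the default() function and the bounds check are illustrative assumptions, not part of the original example) is to fall back to a default function whenever the index is out of range.

def default(a, b):
    return "Default Return"

def dispatch(index, a, b):
    # Call the function at the given index, or fall back to default
    # when the index is outside the list.
    if 0 <= index < len(case_funcs):
        return case_funcs[index](a, b)
    return default(a, b)

print(dispatch(1, 1, 2))  # calls sub(1, 2) -> -1
print(dispatch(5, 1, 2))  # out of range -> "Default Return"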

Option 3: Using a dictionary as an alternative to switch-case statements in Python

An example is given below.

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

case_funcs = {'sum': add, 'subtract': sub}

print(case_funcs['sum'](1, 2))  # calls add(1, 2) -> 3


Here the implementation is much closer to a switch-case statement. We use a key to identify, or route to, the required case or function. The keys can be anything and are not limited to indices or positions.

Now let's talk about the drawbacks of this implementation. The above method will throw a KeyError if we pass an unknown key, and there is still no default case. How do we handle these problems?

Check the program below.

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def default(a, b):
    return "Default Return"

case_funcs = {'sum': add, 'subtract': sub}

# 'sum' is the key for add().
# default is the fallback function used for non-existent keys.
# (1, 2) are the arguments for the selected function.
print(case_funcs.get('sum', default)(1, 2))


Python dictionaries have a get() method that returns the value for a given key. It has one more feature: we can configure a default value to return for non-existent keys. Now we have the solution.

So, by using this feature, we can implement switch-case-like behavior in Python.
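Putting it all together, a small switch-like helper might look like the sketch below (the switch() wrapper and its names are illustrative assumptions, not a built-in Python feature):

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def default(a, b):
    return "Default Return"

def switch(case, a, b):
    # Look up the function for the given case key; fall back to
    # default when the key does not exist.
    case_funcs = {'sum': add, 'subtract': sub}
    return case_funcs.get(case, default)(a, b)

print(switch('sum', 1, 2))       # 3
print(switch('subtract', 1, 2))  # -1
print(switch('unknown', 1, 2))   # Default Return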

How to split a list into chunks using Python

To split a large list into smaller lists, you can use the following code snippet.

This can be done easily with numpy.

import numpy

num_splits = 3
large_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
              14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]

# array_split divides the list into num_splits nearly equal parts
splitted_list = numpy.array_split(large_list, num_splits)
for split in splitted_list:
    print(list(split))
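For reference, if numpy is not available, a similar split into nearly equal parts can be written with plain list slicing (a minimal sketch; the split_list() helper is an illustrative assumption, not from the original snippet):

def split_list(lst, num_splits):
    # Mimic numpy.array_split: the first len(lst) % num_splits chunks
    # get one extra element so the parts are nearly equal in size.
    quotient, remainder = divmod(len(lst), num_splits)
    chunks = []
    start = 0
    for i in range(num_splits):
        end = start + quotient + (1 if i < remainder else 0)
        chunks.append(lst[start:end])
        start = end
    return chunks

for chunk in split_list(list(range(1, 27)), 3):
    print(chunk)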


Sample program with Hadoop Counters and Distributed Cache

Counters are a very useful feature in Hadoop. They help us track global events in our job, i.e., across the map and reduce phases.
When we execute a MapReduce job, we can see a lot of counters listed in the logs. In addition to the default built-in counters, we can create our own custom counters, which will be listed along with the built-in ones.
This helps us in several ways. Here I am explaining a scenario where I am using custom counters to count the number of good words and stop words in the given text files. The stop words in this program are provided at run time using the distributed cache.
This is a mapper-only job; setting job.setNumReduceTasks(0) makes it a mapper-only job.

Here I am introducing another feature in Hadoop called Distributed Cache.
The distributed cache distributes application-specific, read-only files efficiently throughout the application.
My requirement is to filter the stop words from the input text files. The stop word list may vary, so if I hard-code the list in my program, I have to update the code every time the list changes, which is not a good practice. Instead, I used the distributed cache: the file containing the stop words is loaded into the distributed cache, which makes it available to the mapper as well as the reducer. In this program, we don't require a reducer.

The code is attached below. You can also get the code from GitHub.

Create a Java project with the above Java classes and add the dependent Java libraries (these libraries will be present in your Hadoop installation). Export the project as a runnable JAR and execute it. The file containing the stop words should be present in HDFS, with one stop word per line. A sample format is given below.

is
the
am
are
with
was
were

A sample command to execute the program is given below.

hadoop jar <jar-name> -skip <stop-word-file-in-hdfs> <input-data-location> <output-location>

Eg: hadoop jar Skipper.jar -skip /user/hadoop/skip/skip.txt /user/hadoop/input /user/hadoop/output

In the job logs, you can also see the custom counters. I am attaching a sample log below.

Counters