4.2 Multithreading, Multiprocessing & Logging

1. Multithreading:

Every program uses at least one thread by default: the main thread. It starts when the program starts and is killed when the program ends. If this is the only thread throughout the program, execution is sequential and we can call it a synchronous program.

If we want to use multiple threads, we enter the asynchronous world, where threading lets us run different tasks concurrently. Think of a person multitasking: it is not true parallelism, but it is similar. The person starts task1, but at some point has to wait before task1 can proceed. During that wait they start task2, and when task2 hits a wait or finishes, they switch back to task1 or on to a different task3.

Threading is commonly used for I/O-bound tasks, where the program spends time waiting to read inputs and write outputs.

Daemon thread: a daemon thread is killed instantly when the main program finishes its execution.

There is an old way of using threads, with the threading module,

and a new way, with the concurrent.futures module.

Old way (threading module):

The pattern:

  1. Thread + start + join combo method

Code:


import threading
import time

def time_perf(f):
    def wrapper(*args, **kwargs):
        a = time.time()
        result = f(*args, **kwargs)
        print(f": {round(time.time() - a, 0)} seconds to complete the task")
        return result
    return wrapper

@time_perf
def task(task_name: str, time_to_complete: int) -> None:
    time.sleep(time_to_complete)
    print(f"{task_name} completed in {time_to_complete} seconds", end="")

print("Tasks running in sequence")
print("-" * 50)
a = time.time()
# Tasks running in sequence take the sum of the tasks' time_to_complete values.
task("task1", 5)
task("task2", 2)
task("task3", 1)
print(round(time.time() - a, 0), "seconds to complete the program")

print("\nTasks running with threads concurrently using the old method")
print("-" * 50)
a = time.time()
t1 = threading.Thread(target=task, args=("task1", 5))
t2 = threading.Thread(target=task, args=("task2", 2))
t3 = threading.Thread(target=task, args=("task3", 1))
t1.start()
t2.start()
t3.start()
print(f"Before join: all tasks are done within {round(time.time() - a, 0)} seconds")
# The print above runs before the three threads (t1, t2, t3) finish,
# because they are not yet joined with the main program thread.
t1.join()
t2.join()
t3.join()
# A thread can also be marked as a daemon before start(), e.g. t3.daemon = True
# (t3.setDaemon(True) is the deprecated spelling).
print(f"After join: all tasks are done within {round(time.time() - a, 0)} seconds")
# Run sequentially, the 3 tasks would take 8 seconds, but they all complete
# within 5 seconds, which shows the run is concurrent.

# What happens if we set daemon to True?
print("\nTask running on a daemon thread")
print("-" * 50)
a = time.time()
t4 = threading.Thread(target=task, args=("task4", 3), daemon=True)
t4.start()
print(f"Before join: all tasks are done within {round(time.time() - a, 0)} seconds")
# The program ends without completing the t4 thread: because it is a daemon
# thread, it is killed instantly once the last line of the main program
# executes. Adding t4.join() on the next line would let it finish.
# t4.join()


New way (concurrent.futures module):

Using ThreadPoolExecutor(), there are 2 ways to do it:

  1. Thread + submit method
  2. Thread + map method

Code:


import concurrent.futures
import time

def time_perf(f):
    def wrapper(*args, **kwargs):
        a = time.time()
        result = f(*args, **kwargs)
        print(f": {round(time.time() - a, 0)} seconds to complete the task")
        return result
    return wrapper

@time_perf
def task(task_name: str, time_to_complete: int) -> None:
    time.sleep(time_to_complete)
    print(f"{task_name} completed in {time_to_complete} seconds", end="")

print("Tasks running concurrently using the submit method")
print("-" * 50)
a = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    t1 = executor.submit(task, "task1", 2)
    t2 = executor.submit(task, "task2", 5)
    t3 = executor.submit(task, "task3", 1)
    # print(t1.result(), t2.result(), t3.result(), sep="\n")
print(f"all tasks completed in {time.time() - a} sec")

print("\nTasks running concurrently in a loop using the submit method")
print("-" * 50)
a = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [executor.submit(task, t_name, t_t_cmpt)
               for t_name, t_t_cmpt in [["task1", 2], ["task2", 5], ["task3", 1]]]
    # print("futures:", futures)  # a list of Future objects
    # for f in concurrent.futures.as_completed(futures):
    #     print(f.result())
print(f"all tasks completed in {time.time() - a} sec")

print("\nTasks running concurrently in a loop using the map method")
print("-" * 50)
a = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    task_names = ["task1", "task2", "task3"]
    times_to_complete = [2, 5, 1]
    executor.map(task, task_names, times_to_complete)
print(f"all tasks completed in {time.time() - a} sec")



2. Multiprocessing:

Instead of threading, where one core (one person) is utilized to run the program, here we use multiple cores (multiple persons) to complete different tasks. It is about utilizing the CPU's cores and their computational power, and it is commonly used when the code involves more computation than I/O. In CPython, threads are constrained by the Global Interpreter Lock (GIL), which allows only one thread to execute Python bytecode at a time; multiprocessing bypasses the GIL by running entirely separate interpreter processes.

Syntax-wise, threading and multiprocessing are similar to use.

There is an old way of using multiprocessing, with the multiprocessing module,

and a new way, with the concurrent.futures module.

Old way (multiprocessing module):

The pattern:

  1. Process + start + join

New way (concurrent.futures module):

Using ProcessPoolExecutor(), there are 2 ways to do it:

  1. Process + submit method
  2. Process + map method

Code-wise, everything is the same as with threads. The only differences are:

instead of import threading, use import multiprocessing

and wherever Thread appears, use Process (or ProcessPoolExecutor instead of ThreadPoolExecutor).
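As a minimal sketch of that translation (the helper function cpu_task and its workload are illustrative, not from the text above), the start/join combo looks the same with processes:

```python
import multiprocessing
import time

def cpu_task(n: int) -> int:
    # CPU-bound work: pure computation, no I/O waiting involved
    total = sum(i * i for i in range(n))
    print(f"sum of squares below {n}: {total}")
    return total

if __name__ == "__main__":  # guard required on platforms that spawn processes
    a = time.time()
    p1 = multiprocessing.Process(target=cpu_task, args=(10**6,))
    p2 = multiprocessing.Process(target=cpu_task, args=(10**6,))
    p1.start()
    p2.start()
    p1.join()  # wait for both processes, just like Thread.join()
    p2.join()
    print(f"both processes done in {round(time.time() - a, 2)} sec")
```

Note the `if __name__ == "__main__":` guard: unlike threads, child processes may re-import the main module, so the process-spawning code must be protected.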


3. Logging

Logging is used for debugging when an error occurs. Good logging easily points you to where and why the error occurred, instead of the task simply failing without any details.

We can debug using print, but it is not the best way. In production use cases, when something fails we want to see the root cause instead of running it again and hitting the same error, so logging is required and all logs should show up in log files.

There are levels of logging: DEBUG, INFO, WARNING, ERROR, and CRITICAL (exception() is an error-level call that also records the traceback). We can set the level using

import logging

logging.getLogger().setLevel(logging.DEBUG)

or

logging.basicConfig(filename=..., level=..., format=...)

When logging, we need to look at these 3 things:

where we are logging (the log file location), what the log level is, and in which format we log the details (e.g. '%(asctime)s:%(levelname)s:%(message)s').

We can use the methods below to set or get the above:

getLogger(), setLevel(), setFormatter(), addHandler(), Formatter(), basicConfig(), config.dictConfig()

FileHandler(), StreamHandler(), QueueHandler()
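A sketch combining several of these methods into one named logger (the logger name "pipeline" and the filename are illustrative): console gets everything, while only errors reach the file.

```python
import logging

logger = logging.getLogger("pipeline")   # getLogger(): a named logger
logger.setLevel(logging.DEBUG)           # setLevel(): minimum level for the logger

formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")

file_handler = logging.FileHandler("pipeline.log")  # FileHandler(): log to a file
file_handler.setLevel(logging.ERROR)                # only ERROR and above hit the file
file_handler.setFormatter(formatter)                # setFormatter(): attach the format

stream_handler = logging.StreamHandler()            # StreamHandler(): log to console
stream_handler.setFormatter(formatter)

logger.addHandler(file_handler)                     # addHandler(): wire handlers up
logger.addHandler(stream_handler)

logger.info("run started")   # console only: INFO is below the file handler's level
logger.error("run failed")   # console and pipeline.log
```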



