4.2 MultiThreading, MultiProcessing & Logging
1. MultiThreading:
Every program uses a thread by default: the main thread. It starts when the program starts and is killed when the program ends. If this is the only thread throughout the program, execution is sequential and we can call it a synchronous program.
If we want to use multiple threads, we enter the asynchronous world, where threading lets us run different tasks concurrently. For example, a person multitasking: it is not true parallelism, but somewhat similar. The person starts task 1 but has to wait before they can proceed further with it; during that wait they start task 2, and when task 2 finishes or hits its own wait, they switch back to task 1 or to a totally different task 3.
Threading is commonly used for I/O-bound tasks, where the program has to wait to read inputs and write outputs.
Daemon Thread: a daemon thread is instantly killed when the program ends its execution; the program does not wait for it to finish.
There is an old way of using threads via the threading module,
and a new way via the concurrent.futures module.
Old way(Threading Module):
There are two ways to do it:
Code:
import threading
import time

def time_perf(f):
    def wrapper(*args, **kwargs):
        a = time.time()
        result = f(*args, **kwargs)
        print(f": {round(time.time()-a, 0)} seconds to complete the task")
        return result
    return wrapper

@time_perf
def task(task_name: str, time_to_complete: int) -> None:
    time.sleep(time_to_complete)
    print(f"{task_name} completed in {time_to_complete} seconds", end="")
print("Tasks Running in sequence")
print("-"*50)
a=time.time()
#tasks running in sequence, will take sum of time_to_complete of the tasks
task("task1",5)
task("task2",2)
task("task3",1)
print(round(time.time()-a,0),"seconds to complete the program")
print("\nTasks Running with threads concurrently using old method")
print("-"*50)
a=time.time()
t1=threading.Thread(target=task, args=("task1",5))
t2=threading.Thread(target=task, args=("task2",2))
t3=threading.Thread(target=task, args=("task3",1))
t1.start()
t2.start()
t3.start()
print(f"Before Join: All tasks are done within {round(time.time()-a,0)} seconds")
# The above print runs before the three threads (t1, t2, t3) finish, because they have not yet been joined with the main thread
t1.join()
t2.join()
t3.join()
# t3.daemon = True  (must be set before start(); the older t3.setDaemon(True) is deprecated)
print(f"After Join: All tasks are done within {round(time.time()-a,0)} seconds")  # Run sequentially, the tasks would take 8 seconds in total, but all 3 completed within 5 seconds (the longest single task), which shows the run was concurrent.
# What happens if I set daemon to True?
print("\nTasks Running with Daemon thread")
print("-"*50)
a=time.time()
t4=threading.Thread(target=task, args=("task4",3), daemon=True)
t4.start()
print(f"Before Join: All tasks are done within {round(time.time()-a,0)} seconds")  # The program exits without completing thread t4: a daemon thread is killed as soon as the last line of the main program executes. Uncommenting t4.join() below would make the main thread wait for it.
#t4.join()
New way (concurrent.futures module):
using the ThreadPoolExecutor() class
Code:
import concurrent.futures
import time

def time_perf(f):
    def wrapper(*args, **kwargs):
        a = time.time()
        result = f(*args, **kwargs)
        print(f": {round(time.time()-a, 0)} seconds to complete the task")
        return result
    return wrapper

@time_perf
def task(task_name: str, time_to_complete: int) -> None:
    time.sleep(time_to_complete)
    print(f"{task_name} completed in {time_to_complete} seconds", end="")
print("Tasks Running concurrently by using submit method")
print("-"*50)
a=time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    t1 = executor.submit(task, "task1", 2)
    t2 = executor.submit(task, "task2", 5)
    t3 = executor.submit(task, "task3", 1)
    # print(t1.result(), t2.result(), t3.result(), sep="\n")
print(f"all tasks completed in {time.time()-a} sec")
print("\nTasks Running concurrently in loop by using submit method")
print("-"*50)
a=time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    result = [executor.submit(task, t_name, t_t_cmpt) for t_name, t_t_cmpt in [["task1", 2], ["task2", 5], ["task3", 1]]]
    # print("result:", result)  # a list of Future objects
    # for f in concurrent.futures.as_completed(result):
    #     print(f.result())
print(f"all tasks completed in {time.time()-a} sec")
print("\nTasks Running concurrently in loop by using map method")
print("-"*50)
a=time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
    task_name = ["task1", "task2", "task3"]
    time_to_complete = [2, 5, 1]
    executor.map(task, task_name, time_to_complete)
print(f"all tasks completed in {time.time()-a} sec")
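The map example above discards the return values; when the task returns something, executor.map yields the results in the same order as the inputs, even if later tasks finish first. A minimal sketch, assuming a variant of task that returns a string instead of printing:

```python
import concurrent.futures
import time

def task(task_name: str, time_to_complete: float) -> str:
    # illustrative variant of the task above that returns its result
    time.sleep(time_to_complete)
    return f"{task_name} done"

with concurrent.futures.ThreadPoolExecutor() as executor:
    # results come back in input order, even though task1 finishes last
    results = list(executor.map(task, ["task1", "task2", "task3"], [0.3, 0.1, 0.2]))

print(results)  # ['task1 done', 'task2 done', 'task3 done']
```

If you need results as soon as each task finishes, regardless of submission order, use executor.submit with concurrent.futures.as_completed instead.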
2. MultiProcessing:
Instead of threading, where a single core (one person) is utilized to run the program, here we can use multiple cores (persons) to complete different tasks. It is about utilizing the CPU's cores and computational power, and it is commonly used when the code involves heavy operations or calculations rather than I/O. In CPython, threads are constrained by the Global Interpreter Lock (GIL), which allows only one thread to execute Python bytecode at a time; multiprocessing bypasses the GIL by running separate processes, each with its own interpreter.
Syntax-wise, threading and multiprocessing are used similarly.
There is an old way of using multiprocessing via the multiprocessing module,
and a new way via the concurrent.futures module.
Old way(multiprocessing Module):
There are two ways to do it:
New way (concurrent.futures module):
using the ProcessPoolExecutor() class
Code-wise everything is the same as with threads; the only differences are:
instead of import threading, use import multiprocessing
wherever Thread appears, use Process (and ProcessPoolExecutor instead of ThreadPoolExecutor)
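Following those substitutions, here is a minimal sketch of the new way with ProcessPoolExecutor (the cube function and the worker count are just illustrative). Note one extra requirement compared to threads: on platforms that spawn a fresh interpreter for each worker, child processes re-import the module, so process-creating code must sit under an if __name__ == "__main__" guard.

```python
import concurrent.futures

def cube(n: int) -> int:
    # a stand-in for a CPU-bound calculation
    return n ** 3

def run_pool() -> list:
    # each input is cubed in a separate worker process;
    # map returns results in input order
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(cube, [1, 2, 3, 4]))

if __name__ == "__main__":  # required so child processes can safely re-import this module
    print(run_pool())  # [1, 8, 27, 64]
```

For trivial functions like this, the cost of starting processes and pickling arguments outweighs the gain; multiprocessing pays off only when each task does substantial computation.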
3. Logging
Logging is used for debugging when an error occurs. Good logging will easily point you to where and why the error occurred, instead of the task simply failing without any details.
We can do debugging with print, but it is not the best way. In production use cases, when something fails we want to see the root cause instead of running it again and hitting the same error, so logging is required and all logs should show up in log files.
There are levels of logging: DEBUG, INFO, WARNING, ERROR, and CRITICAL (logger.exception() is a method that logs at ERROR level and adds the traceback, not a separate level). We can set the level using
import logging
logging.getLogger().setLevel(logging.DEBUG)
or
logging.basicConfig(filename=..., level=..., format=...)
When logging, we need to look at these 3 things:
where we are logging (the log file location), what the log level is, and in which format we log the details (e.g. '%(asctime)s:%(levelname)s:%(message)s').
We can use the methods below to set or get the above:
getLogger(), setLevel(), setFormatter(), addHandler(), Formatter(), basicConfig(), config.dictConfig()
FileHandler(), StreamHandler(), QueueHandler()
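Tying those pieces together, a minimal sketch using getLogger(), setLevel(), FileHandler(), Formatter(), setFormatter(), and addHandler(). The file name app.log and the messages are just illustrative:

```python
import logging

logger = logging.getLogger(__name__)      # a named logger for this module
logger.setLevel(logging.DEBUG)            # capture everything from DEBUG up

handler = logging.FileHandler("app.log")  # where we are logging
handler.setFormatter(
    logging.Formatter("%(asctime)s:%(levelname)s:%(message)s")  # the log format
)
logger.addHandler(handler)

logger.info("task started")
try:
    1 / 0
except ZeroDivisionError:
    logger.exception("task failed")       # logs at ERROR level and appends the traceback
```

Using a named logger with its own handler (instead of the module-level logging.basicConfig) keeps configuration per module and avoids interfering with other libraries' logging.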