Updated April 15, 2023
Difference Between Python Multiprocessing vs Threading
The following article provides an outline for Python Multiprocessing vs Threading. With the passage of time, the structured and unstructured data for computation has increased exponentially. We have come a long way from printing “Hello World” in the console and be content with it. Nowadays, the digital footprint of every individual accessing a desktop, laptop, or mobile is in itself data, and programmers are sweating it out to work with magnanimous data. With huge data, the computational ability of a program is affected. So, when with such huge data, our main aim should be to look for techniques that would lessen a program’s computational time.
As always, Python has the solution to this problem. Python provides a parallelization technique whereby parallel computing is made possible, and the execution of the program is distributed across the different gpu cores of the system.
Python has two built-in libraries for achieving parallelization:
- Multiprocessing
- Threading
In this article, our agenda would be to have a brief look into multiprocessing and threading, and then we would have a comparative study of the two techniques.
Multiprocessing:
Before understanding Multiprocessing, we should have a look at the building block of it, which is a Process.
- Process: A Process can be defined as an instance of a computer program which is being executed. Every process has its own memory space, which it uses to store data and information pertaining to the program which it is executing.
- Multiprocessing: Multiprocessing is a library in Python that helps achieve parallel programming with the help of many processes that have their separate memory allocation and CPU cores for execution.
Code:
import multiprocessing
from datetime import datetime
def count_number_of_words(sentence):
number_of_words = len(sentence.split())
print("Number of words in the sentence :\n",sentence,": {}".format(number_of_words))
def count_number_of_characters(sentence):
number_of_characters = len(sentence)
print("Number of characters in the sentence :\n",sentence,": {}".format(number_of_characters))
if __name__ == "__main__":
sentence = "Python Multiprocessing is an important library for achieving parallel programming."
start = datetime.now().microsecond
p1 = multiprocessing.Process(target=count_number_of_words,args=(sentence,))
p2 = multiprocessing.Process(target=count_number_of_characters,args=(sentence,))
p1.start()
p2.start()
p1.join()
p2.join()
end = datetime.now().microsecond
print("Time taken to execute process : {}".format((end-start)/1000)," ms")
Output:
Head to Head Comparison Between Python Multiprocessing vs Threading (Infographics)
Below are the top 8 differences between Python Multiprocessing vs Threading:
Use Case of Multiprocessing
Multiprocessing is typically preferred where the program is CPU intensive and does not require any user/IO interaction. Any program relating to loading and processing a huge amount of data can be thought of as a good use case of multiprocessing.
Threading:
A Thread is a basic component of Multithreading, so before knowing about Multithreading, let us have a basic knowledge of what a Thread is.
- Thread: A Thread is a component of a Process which can run parallely. There can be multiple threads inside a parent process. All the threads share the program to be executed along with the data required for it within the parent process.
- Threading: Multithreading is a library in Python which helps to achieve parallel programming with the help of the various threads residing inside the parent process.
Code:
import threading
from datetime import datetime
def count_number_of_words(sentence):
number_of_words = len(sentence.split())
print("Number of words in the sentence :\n",sentence,": {}".format(number_of_words))
def count_number_of_characters(sentence):
number_of_characters = len(sentence)
print("Number of characters in the sentence :\n",sentence,": {}".format(number_of_characters))
if __name__ == "__main__":
sentence = "Python Multiprocessing is an important library for achieving parallel programming."
start = datetime.now().microsecond
t1 = threading.Thread(target=count_number_of_words,args=(sentence,))
t2 = threading.Thread(target=count_number_of_characters,args=(sentence,))
t1.start()
t2.start()
t1.join()
t2.join()
end = datetime.now().microsecond
print("Time taken to execute process : {}".format((end-start)/1000)," ms")
Output:
The above example shows that python threading seems to outperform python multiprocessing as it is approximately 9 times faster than the other.
Use Case of Threading
Threading is typically preferred when the program involves interaction with user/IO and is not CPU intensive. Threading is used in applications like Microsoft Word, where one thread is inputting the text, another thread is displaying the text on the screen while the other is saving the text.
Global Interpreter Lock (GIL)
Since threads share the same memory location within a parent process, special precautions must be taken so that two threads don’t write to the same memory location. CPython interpreters use the Global Interpreter Lock (GIL) mechanism to prevent multiple threads from executing Python bytecodes at the same time. This lock is very important as CPython’s memory management is not thread-safe.
Key Difference Between Python Multiprocessing vs Threading
Let us discuss some of the major key differences between Python Multiprocessing vs Threading:
- While processes reside in a separate memory location, threads of a parent process reside in the same memory location.
- Sharing objects among different processes is difficult and is achieved by the use of an inter-process communication model, while object sharing among threads is easier.
- Multiprocessing has a higher overhead, while threading has a lower overhead.
- Multiprocessing is less bug-prone, while threading is more prone to bugs.
- True Parallelism is achieved in Multiprocessing while it is not achieved in Threading.
- In Multiprocessing, scheduling is handled by OS, while in Threading, it is handled by the Python interpreter.
Multiprocessing vs Threading Comparison Table
Let’s discuss the top comparison between Multiprocessing vs Threading:
Basis of Comparison Between Multiprocessing vs Threading | Multiprocessing | Threading |
Memory Management | Reside in a separate memory location. | Reside in the same memory location. |
Object Sharing | Easier object sharing. | Relatively difficult. |
Overhead | Higher overhead. | Lower overhead. |
Bugs | Less Bug is prone. | More Bug prone. |
Parallelism | True Parallelism can be achieved. | True Parallelism cannot be achieved. |
Scheduling | Handled by the OS. | Handled by Python interpreter. |
Memory Deallocation | Child processes are interruptible and killable. | Child threads are not interruptible nor killable. |
Applications | For CPU-bound intensive computation. | Programs involving GUI/ user interaction. |
Conclusion
Finally, it is time to draw closure to this article on Python Multiprocessing and Threading. In the following article, we saw about the concepts of multiprocessing and threading, looked at examples of each, discussed use cases of both. Finally, we did a comparative study of both the methods. It is time now to put these concepts into practice. So next time you do, some coding does have a practice of using multiprocessing or threading as and when required.
Recommended Articles
This is a guide to Python Multiprocessing vs Threading. Here we discuss the key differences with infographics and comparison tables. You may also have a look at the following articles to learn more –