Process vs. Thread

I read a lot of tech-related blogs and other tech news, and I’ve caught a number of very talented programmers and intelligent technologists using the terms thread and process interchangeably. Forgive me for being pedantic, but they’re not the same thing! It’s true that threads and processes are similar: they’re both ways of running parts of an application concurrently. But the similarities pretty much stop there.

A process is usually defined as an instance of a program that is being executed, including all variables and other information describing the program’s state. Processes have a life cycle: they are spawned, optionally generate one or more child processes, and eventually die. Each process is an independent entity to which system resources (CPU time, memory, etc.) are allocated. And each process is executed in a separate address space. Thus, one process cannot directly access variables or other data structures that are defined in another process. If two processes want to communicate, they have to use inter-process communication mechanisms like files, pipes, or sockets.
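To make the isolation concrete, here is a minimal Python sketch. It is my own illustration, not a canonical recipe: the names `CHILD_CODE`, `counter`, and `run_child` are hypothetical, and the child is simply a second Python interpreter started with the standard `subprocess` module. The child has its own `counter`, and the only way the parent learns its value is by reading it from a pipe.

```python
import subprocess
import sys

# Code the child process runs. Its `counter` lives in the child's own
# address space; the child reports the value by printing it to stdout.
CHILD_CODE = "counter = 100; print(counter)"

counter = 0  # lives only in the parent's address space


def run_child() -> int:
    """Start a child interpreter and read its result back over a pipe."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,  # connect the child's stdout to a pipe
        text=True,            # decode the pipe's bytes as text
        check=True,           # raise if the child exits with an error
    )
    return int(completed.stdout.strip())


if __name__ == "__main__":
    reported = run_child()
    print("parent counter:", counter)   # still 0
    print("child reported:", reported)  # 100, received over the pipe
```

Nothing the child does to its own `counter` touches the parent’s copy. The value only crosses the process boundary because it was explicitly written to, and read from, a pipe.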

The term thread is short for a thread of execution, and refers to a particular execution path through a process. Threads and processes work differently on different operating systems but, in general, multiple threads can share the state information of a single process. Since threads share memory and other system resources, they can communicate directly via variables and other memory structures. And because threads share a single address space, context switching between threads is generally faster than switching between processes.
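The same idea for threads, again as a rough sketch with hypothetical names (`shared`, `worker`, `run_threads`). Every thread updates one dictionary that belongs to the process, so no pipes or sockets are needed. The lock is there because shared memory cuts both ways: without it, concurrent increments could be lost.

```python
import threading

# State owned by the process; every thread in it sees the same object.
shared = {"counter": 0}
lock = threading.Lock()  # guards updates so increments are not lost


def worker(increments: int) -> None:
    """Bump the shared counter `increments` times."""
    for _ in range(increments):
        with lock:
            shared["counter"] += 1


def run_threads(num_threads: int = 4, increments: int = 10_000) -> int:
    """Run several workers in one process and return the final count."""
    shared["counter"] = 0
    threads = [
        threading.Thread(target=worker, args=(increments,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return shared["counter"]


if __name__ == "__main__":
    print(run_threads())  # 40000
```

The main thread reads the total straight out of `shared` once the workers are done. Compare that with the process case, where the value had to travel through a pipe.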

Many modern applications take advantage of multithreading. In particular, applications that perform a lot of I/O, like web servers and databases, can drastically improve performance by implementing a multithreaded execution model, because one thread can do useful work while another waits on I/O. On a multiprocessor system, multiple threads of execution can even run simultaneously within a single process. Unfortunately, the threading abstractions in modern operating systems can be hard to understand, and are unavailable in certain programming languages.
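Here is a small sketch of why threads help with I/O-heavy work. It makes some assumptions: `fake_request` and `serve_all` are hypothetical names, and `time.sleep` stands in for a real network or disk wait. Keep in mind that in the standard CPython build, threads overlap waits like this one but do not run Python bytecode in parallel, so this illustrates the I/O case only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(request_id: int, delay: float = 0.2) -> int:
    """Pretend to handle a request; sleeping stands in for blocking I/O."""
    time.sleep(delay)
    return request_id


def serve_all(num_requests: int = 5, delay: float = 0.2):
    """Handle all requests on a thread pool; return results and elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_requests) as pool:
        # map() preserves input order, so results line up with request ids.
        results = list(
            pool.map(lambda i: fake_request(i, delay), range(num_requests))
        )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = serve_all()
    print(results)
    print(f"elapsed: {elapsed:.2f}s (one at a time would take about 1.00s)")
```

Because the five waits overlap, the whole batch should finish in roughly the time of a single request rather than the sum of all five.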

So, one more time. A process can be thought of as a thread, plus an address space, file descriptors, and a bunch of other data. A single process can consist of multiple threads, and when one thread modifies a process resource, the change is immediately visible to sibling threads. On the other hand, separate processes each have their own address space and cannot communicate directly; they have to go through an inter-process communication mechanism. Processes and threads are not the same thing.