2018-11-21
Book recommendation: Michael Sipser's Introduction to the theory of computation
The key concept at the heart of theoretical computer science is Turing machines. So let's jump right to it.
A Turing machine is an extremely simple yet powerful model of computation. I will first describe what it is, and then describe why it is so important.
What are Turing machines?
You could look at this video that explains it. Or Google any number of resources that explain it. Or else read through my (pretty average) explanation given below.
If you want a book reference, check out Michael's Sipser's Introduction to the theory of computation (pdf available online).Its a highly recommended book and the one I used to learn more about all this.
And here we go ...
Imagine you have an infinite tape. A one dimensional array that runs infinitely in both directions, left and right. The tape is divided into cells and each cell can store a zero or a one. For a Turing machine that receives no input, the entire tape initially starts full of only zeroes. This tape is basically the memory of the machine.
Now we have a tape head. The tape head looks at only one cell at a time. It can both read and write on the tape. It can also move from one cell to the next, but only one step at a time. So if the tape head is pointing a cell, it can perform read/write operations on the cell, and it can move to the immediate left cell or the immediate right cell.
So far so good?
We decide on a finite number of states, let's say 5. Let's name them A B C D E. We now need a starting state, let A be the starting state. We also need one or more halting states, so let D be the halting state. What are states, you ask? I don't know, they're just ... states. And the machine is said to be in any one state at one point in time. And the states are not directly related to the tape or the tape head in any way.
We need to now provide a set of instructions for the machine to use when it comes across different situations. Since we have 5 states and a 2-letter alphabet (zeroes and ones), we need to provide a total of 5 * 2=10 instructions which are named A0 A1 B0 B1 C0 C1 ... And so on. For instance A0 tells the machine, if your current state is A and if your tape head reads 0, then follow the instructions given for A0.
Now how do we give instructions. An instruction comprises of 3 parts - a write command, a move command and a state change command. The write command tells the tape head to either overwrite the 1 or 0 it just read (with 0 or 1 respectively), or else leave the 0 or 1 as it is. The move command asks the machine to move the tape head either one step left or one step right. The state change command tells the machine to change from one state to another.
For instance, the instruction set attached to B1 could be 0LE. Which means, if the machine is currently in state B and the tape head reads 1, overwrite the 1 with 0, move the tape head one cell left (L), and then change the state of the machine to E.
If the machine is now in E state, we need to see what the tape head now reads (remember the tape head has moved hence it might read something different this time). Suppose it reads 1 again. We will now follow the instructions corresponding the E1.
Here's an example of a complete description of a 5 state (5 + 1 halting) 2 letter Turing machine.
A0 A1 B0 B1 C0 C1 D0 D1 E0 E1
1RB 1LC 1RC 1RB 1RD 0LE 1LA 1LD 1RH 0LA
A is assumed to be the starting state. The halting state is H (not mentioned) and E0 is the situation that leads to a halting state.
These 10 instructions form the 'program' that has been programmed into the machine.
A couple more points
So far we've only dealt with TMs that start with a blank tape (all zeroes), follow their instructions and then reach a halting state.
A TM in general is not guaranteed to halt, in fact it's very easy to write a TM that runs into an infinite loop.
Also we've had no mention of the concept of giving input to the TM the way we give input to a modern-day program.
The initial state of the tape is the input. So suppose we have a TM that finds the sum of two numbers (and yes, such a TM does exist), what we'll do is first write down both the numbers in binary representation on the tape, then place the head at the start of the input, and start the TM. Once done, the TM will halt and what's left on the tape is the sum of the two numbers in binary notation.
The Church Turing thesis
Now ... What's so great about Turing machines?
Anything that can be done by a modern day computer can be done by a Turing machine.
And it's as simple (and profound) as that.
Meaning if you can write a program in C or python or whatever that achieves some task, let's say... finding the factorial of a number, then there we will also be able to create a corresponding Turing machine with a finite number of states that can also compute the factorial of any number given in the input in a finite amount of time.
The procedure to convert a program written in a modern day language to a TM is quite complicated, but you can get an idea of how it is done here.
What we do now?
Now that we have a model of computation that is simple enough it can be described on paper, we can prove all sorts of theorems and results on:
For instance, the halting problem states that we can never create a TM that will be able to always identify if another TM is halting or is in an infinite loop.
It is often useful while programming to be able to determine whether your computer is in an infinite loop or if it is just taking a long time to reach its answer. The proof for the halting problem says that in general, always being able to determine this is impossible, no matter how much time you have.
If you've done some competitive program you'd be familiar with the big-O notation that gives a rough idea how 'fast' an algorithm is. For example, if the time taken by a program to do a task will increase nine times if we triple the size of the input, then we can say the program operates in quadratic time O(n^2).
One of the greatest unsolved problems in all of mathematics is the P versus NP problem that asks whether any task that can be achieved in non-polynomial time can also be achieved in polynomial time.
It might initially feel obvious, that if an algorithm runs in O(2^n) there simply cannot exist another algorithm that does the same thing in O(n^2) or O(n^3) time, but unfortunately there is yet to be a proper mathematical proof of the same.
The section on busy beavers grew too large so I shifted it here. Do check it out.
Enter email or phone number to subscribe. You will receive atmost one update per month
Enter comment