Brief History Of Artificial Intelligence, Part II – “Theoretical Fondation”

by Stefan Iliescu / 6 August


As we anticipated with the first episode of this small foray into the history of AI, in this second part we will try to present some essential theoretical achievements in the field. The way we consider to be the most appropriate is to exemplify two representative algorithms of their time in the field of computational and cognitive science. And who is more appropriate to start with than John McCarthy, the man who introduced the term “Artificial Intelligence”? (at the famous Dartmouth conference in Summer 1956, which also marked the beginning of AI as a field)



One of the most critical American researchers of the time, McCarthy contributed massively in related fields such as mathematics, logic, information technology, cognitive sciences, artificial intelligence. As it is very difficult for us to mention now all his major contributions, we limit ourselves to listing some of them, such as:


•  creation in the 50s of the LISP language, which, based on lambda calculus becomes the preferred language of the AI ​​community; inventing the concept of “garbage collector” in 1959 for LISP

•  participation in the committee that gave birth to the ALGOL60 language, for which he proposed in 1959 the use of recursion and conditional expressions

•  significant contribution in defining three of the very earliest time-sharing systems (Compatible Time-Sharing System, BBN Time-Sharing System, and Dartmouth Time-Sharing System)

•  the first promoter of the idea of ​​a computer utility, a prevalent concept in the 60s to the 90s, and returned to a new youth today in various forms: cloud computing, application service provider, etc.


But his genius (for which McCarthy is referred to as “Father of AI”) came to light even better through papers such as “Artificial Intelligence, Logic and Formalizing Common Sense”, “Making Conscious Robots of their Mental States”, “The Little Thoughts of Thinking Machines”, “Epistemological Problems of Artificial Intelligence”, “On the Model Theory of Knowledge”, “Creative Solutions to Problems”, or “Appearance and Reality: A challenge to machine learning” – papers that we hope will arouse the curiosity of many of our readers.



From the article “Free Will – Even for the Robots,” we present below a sample of his deterministic approach to free will. The aim was to propose a theory of Simple Deterministic Free Will (SDFW) in a deterministic world. The theory splits the mechanism that determines action into possible actions and their consequences first and then decides which action is preferred the most. AI requires the formal expression of phenomena through the mathematical logic of situation calculus. The equation:


s’ = Result(e, s)


asserts that s’ is the situation that results when event e occurs in the situation s. Since there may be many different events that can occur in s, and the theory of the function Result does not say which occurs, the theory is nondeterministic. Having some preconditions for the event to occur, we will get to the formula:


Precond(e, s) → s’ = Result(e, s).


McCarthy added a formula Occurs(e, s) to the language that can be used to assert that the event e occurs in situation s. We have:


Occurs(e, s) → (Next(s) = Result(e, s)).


Adding occurrence axioms makes a theory more deterministic by specifying that certain events occur in situations satisfying designated conditions. The theory still remains partly non-deterministic, but if there are occurrence axioms specifying what events occur in all the possible situations, then the theory becomes deterministic (i.e. has linear time).



We can now give a situation calculus theory for SDFW illustrating the role of a non-deterministic theory in determining what will deterministically happen, i.e. by saying what choice a person or machine will make.


In the following formulas, lower case term represents a variable and capitalized term represents a constant. Let us assume that an actor has a choice of just two actions a1 and a2 that may be performed in situation s. It means that event Does(actor, a1) or Does(actor, a2) occurs in situation s according to which of Result(Does(actor, a1), s) or Result(Does(actor, a2), s) that actor prefers.


The formulas that declare that an actor will do the preferred action are



                                            Occurs(Does(actor, Choose(actor, a1, a2, s), s), s),                                         




Choose(actor, a1, a2, s) =


if P refers(actor, Result(a1, s), Result(a2, s)).


then a1 else a2.


Prefers(actor, s1, s2) means that actor prefers s1 to s2 (and therefore made his choice) and this way it makes the theory determinist.

Now let us take a non-deterministic theory of “greedy John”:


Result(A1, S0) = S1,

Result(A2, S0) = S2,

Wealth(John, S1) = $2.0 × 106 ,

Wealth(John, S2) = $1.0 × 106 ,


(∀s s’)(Wealth(John, s) > Wealth(John, s’) → Prefers(John, s, s’).


It is obvious that greedy John prefers a situation in which he has greater wealth by making the right action from situation S0 to situation S1. From equations ¹-³ it can be inferred



Occurs(Does(John, A1, S0))


Prefers(actor, s1, s2) means that actor prefers s1 to s2 and there were used two actions to keep the formula for Choose as short as possible. This illustrates briefly the role of the non-deterministic theory of Result within a deterministic theory of what occurs. Equation (1) represents the non-deterministic of Result used to asses which action leads to the better situation. Equation (2) represents the deterministic part that indicates which action occurs.


McCarthy makes four conclusions:


•  “1. Effective AI systems, e.g. robots, will require identifying and reasoning about their choices once they get beyond what can be achieved with situation-action rules (i.e. chess programs always have).

•  The above theory captures the most basic feature of human free will.

•  Result(a1, s) and Result(a2, s), as they are computed by the agent, are not full states of the world but elements of some theoretical space of approximate situations the agent uses in making its decisions. Part of the problem of building human level AI lies in inventing what kind of entity Result(a, s) shall be taken to be.

•  Whether a human or an animal uses simple free will in a type of situation is subject to experimental investigation.”


We can consider that formulas (¹) and (²) illustrate a person making a choice. Nothing about person knowing it has some choices or preferring situations in which there are available more choices. So, for the situations when we need to take into considerations these phenomena we have to extend SDFW – as a partial theory. The importance of this theory is enormous both in terms of the interest given to the understanding of cognitive processes by man and as an aggregating result of some essential minds of the time who supported McCarthy in its realization.


The second algorithm proposal comes from the well-known Alan Turing.  One of the pioneers and most prominent promoters of theoretical computer science, Alan Turing was a mathematician, logician, cryptanalyst, philosopher, and British botanist. Perhaps his best-known contribution to the field is the Turing Machine – a mathematical model that has received over time numerous theoretical variants and alternatives as well as practical implementations. To fully understand the context of its creation, it must first be mentioned that in the 1930s, there were no computers, but this did not prevent the scientists of the time from proposing extremely bold theoretical objectives regarding the opening of the application area (i.e., “Halting Problem”).


The Turing Machine has the following parts:


1. an infinite roll of tape over which the symbols can be written, deleted and rewritten

2. the head that moves left and right on the tape as the symbols are written/rewritten or deleted (i.e. the head of a Hard Disk Drive)

3. the state register that represents a memory area which stores the state of the machine


The machine can read a symbol on the tape at some point, then write a symbol and then reposition the writing head to the left or right.  Although it has only implemented these simple routines, we will prove in the following that this model – and therefore, the Turing Machine – provides the theoretical basis for implementing any algorithm in any known language. The table of instructions for use of the machine is presented in the table below.


Current state Current symbol Action Move Next state
S0 “0” Write “1” Right S1
S0 “1” Write “0” Right S1
S1 “0” Write “0” Right S0
S1 “1” Write “1” Right S0


The first two columns determine the input combinations that the machine can receive, and it consists of the state of the car and the symbol read. The next three columns determine the action performed by the machine, consisting of the symbol to be written, the direction of movement of the writing head and the future state of the machine. For example, the second line in the table above tells us that, being in the state S0, with the write end positioned on the symbol “0”, the machine will write the symbol “1” in that position after which it will move to Right transitioning to state S1.


The analysis of the table with instructions shows the following:


•  from any state S0, the symbols 0 and 1 are interchanged

•  from any state S1, the symbols 0 and 1 remain the same

•  based on the two points above we deduce that a string of type “111111” will be processed resulting in the string “010101”.

Let’s take a more complex example, which allows us to perform additions like “000+00=00000”, the equivalent of “3+2= 5”. For this we will consider the following instruction table – a variant of the instruction table above.


Current state Current symbol Action Move Next state
S0 “0” Write “Blank” Right S1
S0 “+” Write “Blank” Right S5
S1 “0” Write “0” Right S1
S1 “+” Write “+” Right S2
S2 “0” Write “0” Right S2
S3 “0” Write “0” Left S3
S3 “+” Write “+” Left S4
S4 “0” Write “0” Left S4
S4 “Blank” Write “Blank” Right S0


Applying the calculation method from the previous example, we can see that the machine does the following major steps:


•  STEP 1: replaces the first “0” (in the group of three blank spaces) with a blank space

•  STEP 2: moves to the end of the string (after the group of two blank spaces)

•  STEP 3: add a “0” at the end of the string (after the last “0”)

•  STEP 4: return to the beginning of the string and resume STEP 1

•  STEP 5: if the first symbol is “+” it is removed and the algorithm ends successfully.


We start with a step-by-step addition. The initial input (written on tape) is “000+00”, and once the machine is started, the writing head is positioned on the first “0” – in the S0. S0 has two transitions, one for “0” and another for “+”; read the first “0”, replace it with blank space and then move the head one position to the right. From S1, the machine can have again two transitions. The first of these is a loop, and involves replacing all “0” with “0” with the repeated movement of the head to the right – keeping the machine in the S1. The transition to S2 is made once the “+” is passed, after which moves are made to the left of the writing head until S0 is reached; then the specific transitions towards S1 and S2 are resumed (similar to above). Once the blank space is found, the write head will replace it with a “0” and move to the left, towards S3. Similarly, from S3 it will jump over all “0”s again until the head reaches a “+”, but this time the machine will move to the left. Once “+” is reached, the machine will move a space to the left and transition to S4 state. From S4 the machine will jump over all “0”s and move back to S0 and move to the right if it reaches an empty space (i.e. a space past the beginning of the row) – that is, the entire loop repeated. In fact, the machine replaces a “0” in front of a “+” with a blank space, moves its head to the end of the string and adds a “0”. Then go back to the first “0” on the left and repeat. It keeps doing this until all the “0” characters from the left of “+” will be replaced with blank spaces.


•  Loop 1: “000+00”

•  Loop 2: “00+000”

•  Loop 3: “0+0000”

•  Loop 4: “+00000”

•  Loop 5: “00000”.


After four loops, the machine is in S1, but this time the head reads a “+”. In fifth loop the machine replaces “+” with an empty space and moves to S5, the final state. The conclusion is that, without a real computer, through a simple set of tools and rules, we can build a machine that can calculate! And it will work for any length of the “0”s (string).

The algorithms presented above were in a simplified form, and of course, perhaps more examples would have been needed for a thorough understanding of them. We hope, however, that the chosen examples will arouse your curiosity to read more about them and their authors. We will return with a third and last part of this short history of AI with some examples of the most representative achievements in the field, real turning points in the human-machine relationship.