Brief History Of Artificial Intelligence, Part III

by Metaminds / 19 December

It is time for the third installment of our short history of AI, dedicated to the top achievements in the field. Making a meaningful selection is not easy, because what makes an achievement great varies from case to case. In some cases it is the innovative character of the algorithms; in others, the direct benefits brought to people are what is most appreciated; in still others, the psychological impact predominates. The following examples are grouped according to these three criteria, which, simplified, are: innovation, benefits, and impact.


The area of innovative contributions is closely linked to the creation of the concept of a neural network. Although some efforts to create mathematical models date back to the 1930s, it was not until 1943 that Warren McCulloch and Walter Pitts created the first recognized computational model for neural networks, namely threshold logic, based on algorithms that mimic the functionality of a biological neuron.
Also in the 1940s, the famous Canadian psychologist Donald Hebb proposed a hypothesis of learning built on the mechanism of neural plasticity, called Hebbian learning, which is widely considered the ancestor of unsupervised learning models.
Another remarkable contribution appeared in 1958, when Frank Rosenblatt of Cornell University created the first algorithm for supervised learning, called the perceptron. Using a linear classifier that combines a set of weights with the feature vector, the algorithm has immediate applicability in areas such as document classification and, more generally, in problems with a large set of variables.
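
To make the idea of "a set of weights combined with the feature vector" concrete, here is a minimal sketch of a perceptron-style learning rule in Python. It is an illustration only, not Rosenblatt's original formulation: the function names, the learning rate, the number of epochs, and the toy AND dataset are all assumptions chosen for brevity.

```python
def predict(weights, bias, x):
    """Linear classifier: weighted sum of the features, then a hard threshold."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron-style rule: nudge the weights only when a sample is misclassified."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in samples:
            error = label - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Toy, linearly separable problem (assumed for illustration): logical AND.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
```

Because AND is linearly separable, the rule settles on weights that classify all four inputs correctly; the same loop would never settle on a problem that is not linearly separable.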
The earliest functional networks with multiple layers derive from a family of inductive algorithms called the Group Method of Data Handling, created by Alexey G. Ivakhnenko of the Glushkov Institute of Cybernetics in 1965. These algorithms proved extremely useful in areas such as data mining, knowledge discovery, prediction, complex systems modeling, optimization, and pattern recognition.
A widely used algorithm for training feedforward neural networks and other artificial neural networks is backpropagation. Its basis came from control theory, in work by Henry J. Kelley in 1960; it was also derived by other researchers in the early 1960s, and was implemented to run on computers by Seppo Linnainmaa as the subject of his master’s thesis (a general method for automatic differentiation of discrete connected networks of nested differentiable functions) at the University of Helsinki in 1970.
Paul Werbos’s contribution to backpropagation in 1975 helped to effectively solve the problem reported by Marvin Minsky and Seymour Papert in 1969: that single-layer networks are incapable of computing the exclusive-or (XOR) function. Minsky and Papert also highlighted the fact that computers lacked the power needed to process large neural networks, a problem mitigated in the 1980s by the development of metal-oxide-semiconductor (MOS) very-large-scale integration (VLSI), in the form of complementary MOS.
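
The exclusive-or limitation can be illustrated with a small sketch. The code below searches a coarse grid of weights for a single threshold unit that computes XOR (it finds none, which is consistent with the result above but is not a proof), and then shows a two-layer network that does compute it. The weights of the two-layer network are set by hand for illustration; they are not learned by backpropagation, and all names and grid values are assumptions.

```python
from itertools import product


def step(z):
    """Hard threshold used by early neuron models."""
    return 1 if z > 0 else 0


XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def single_unit(w1, w2, b, x):
    """One threshold unit: a single-layer network with two inputs."""
    return step(w1 * x[0] + w2 * x[1] + b)


# Try every weight/bias combination on a coarse grid from -5.0 to 5.0.
grid = [i / 2 for i in range(-10, 11)]
single_layer_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(single_unit(w1, w2, b, x) == y for x, y in XOR.items())
]


def two_layer_xor(x):
    """Hidden layer computes OR and AND; the output fires for 'OR but not AND'."""
    h_or = step(x[0] + x[1] - 0.5)
    h_and = step(x[0] + x[1] - 1.5)
    return step(h_or - h_and - 0.5)
```

The list `single_layer_solutions` stays empty, while `two_layer_xor` reproduces the XOR table. Training such hidden layers automatically is the problem that backpropagation addresses.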
The simulation of massive neural networks received a boost in 1986 with the introduction of parallel distributed processing by American psychologists David Everett Rumelhart and James McClelland. The basis of convolutional neural networks (CNN) had been laid earlier, in 1979, with the publication by Kunihiko Fukushima of his work on the neocognitron, a type of artificial neural network (ANN).
Mathematical models for deep learning have been developed through joint or independent contributions by scientists such as Geoffrey Everest Hinton, Yoshua Bengio, and Yann LeCun, in work that began in 1986 and continues to this day. Of their most important achievements we mention here only a few:
• Geoffrey Everest Hinton co-invented the Boltzmann machine with David Ackley and Terry Sejnowski, and contributed to distributed representations, the time-delay neural network, mixtures of experts, the Helmholtz machine, Product of Experts, and capsule neural networks;
• Yoshua Bengio combined neural networks with probabilistic models of sequences, an idea that was incorporated into a system used by AT&T/NCR for reading handwritten checks in the 1990s; he also introduced high-dimensional word embeddings as a representation of word meaning, with a huge impact on natural language processing tasks including language translation, question answering, and visual question answering;
• Yann LeCun developed convolutional neural networks in the 1980s, an underlying principle that made deep learning more efficient; in the late 1980s he trained the first CNN system on images of handwritten digits; he also proposed one of the first implementable versions of the backpropagation algorithm, and is credited with developing a more expansive vision of neural networks as a computational model for a broad spectrum of tasks, introducing in his early works many concepts that are standard in AI today.



Among the achievements that benefit mankind, a leading place is occupied by the application of AI in medicine.
Zebra Medical has announced a new deep learning algorithm on its medical image analysis platform. With it, the company now also has an algorithm for identifying vertebral fractures, in addition to existing algorithms that can detect bone density, fatty liver, and coronary artery calcification.
The platform supports diagnosis by analyzing a variety of medical images, helping to reduce response times in radiology services and increase their accuracy. The goal is to detect diseases at a stage when their signs are not yet obvious in imaging. A common example is vertebral compression fractures, of which less than a third are “effectively diagnosed,” the company said in a statement. The Zebra VCF algorithm uses deep learning to distinguish vertebral compression fractures from other conditions, such as degeneration of the vertebral plate or bone spurs.
A Swiss company, Sophia Genetics, has created an AI technology for reading and aggregating the genetic code of DNA to help diagnose and predict genetic diseases, such as cancer. Named Sophia, the system uses AI to combine genome data with analysis, medical knowledge databases, and expert suggestions in order to arrive at the best diagnosis and help healthcare professionals customize patient treatment. At the moment, the system collects data from 170 major hospitals globally, continuously improving its capacity for early detection and diagnosis of genomic diseases.

Autonomous driving

Another area that could benefit enormously from AI is transportation, through the introduction of autonomous driving capabilities.
The Google (now Waymo) self-driving car system has been officially recognized as a “driver” in the US since 2016, becoming the pioneer of autonomous driving systems. This recognition can be the lever needed to change the legislation on cars that do not require a human driver, so that they can meet the safety standards for driving on public roads without conventional driving mechanisms, namely a steering wheel and pedals. Thus, “the driver in the context of the design of the vehicle described by Google” is the automatic driving system itself and not any of the occupants of the vehicle, the government agency explained in a public letter. The agency acknowledged not only this change but also the fact that no human occupant of the Google autonomous vehicle could meet the common definition of “human driver” because of the design of the car, even if he or she wanted to. The recognition of the Google autonomous computer as a driver could be the legal basis for establishing liability in the event of car accidents, in a context in which the US Department of Transportation has unveiled a plan to reduce accidents on public roads by increasing the number of autonomous vehicles.
In 2020, Waymo said that driverless rides would begin with members of Waymo One in Arizona, after which the service would gradually expand to people who register on a smartphone.
Tesla boss Elon Musk, as Waymo’s direct rival, responded by saying that while Waymo’s autonomous driving technology is “impressive”, Tesla’s technology has a wider range of applications.
Moreover, Musk also ventured a comparison of the two technologies. Whereas Waymo’s technology uses a suite of sensors, including LiDAR, mounted on top of the cars, Tesla’s technology uses a system of sensors consisting of 8 video cameras, radar, and sonar. Musk stated that “anyone relying on the laser-based sensors is doomed to failure because of their expense and drain on power”. According to the estimates of many specialists in the field, the next few years will be decisive for the gradual introduction of autonomous driving services, so we will not have to wait long to find out which technology is superior.
We consider agriculture another good candidate for benefits, especially in less developed areas threatened by famine. A team of researchers from Pennsylvania State University and the École Polytechnique Fédérale de Lausanne, Switzerland, used deep learning algorithms to detect crop diseases before they spread. In poor regions, up to 80% of agricultural production comes from small farmers, and they are the most exposed to the devastating effects of crop diseases, which can lead to famine.
The team has developed a program capable of running efficiently on a smartphone. They trained the algorithm on a huge data set of over 50,000 images collected using PlantVillage, an online open-access archive dedicated to images of plant diseases. As a result, the algorithm identifies 26 diseases in 14 plant species with an accuracy of 99.35%, and all that is needed to benefit from this service is a smartphone.

Cybersecurity

Cybersecurity could not be missing from the list of benefits, given its impact on modern life in terms of online fraud. Recent research by the Association of Certified Fraud Examiners (ACFE), KPMG, PwC, and others highlights how organized crime is modernizing its attack vectors and increasing their magnitude and speed. Sadly, in most cases this modernization involves the use of machine learning to commit fraud that is undetectable by legacy cyber protection systems (systems based on inefficient rules and predictive models). Thus, detecting the new generations of online fraud requires the same machine learning mechanisms, in order to fight with equal weapons and keep pace with the complexity and extent of fraud today.
The modern strategy to combat online fraud focuses on the following three basic aspects:
• actively use supervised machine learning to train models so they can detect fraud attempts quicker than legacy systems;
• combine supervised and unsupervised machine learning into a single risk score (for fraud prevention) because anomalies are easier to detect in emerging data;
• take advantage of wide-ranging data networks of transactions to tweak and scale supervised machine learning algorithms, thus improving risk scores for fraud prevention.
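
As a rough sketch of the second point, the snippet below blends a supervised probability with an unsupervised anomaly signal into a single risk score. Everything in it is a labeled assumption made for illustration: the feature names (`new_device`, `foreign_ip`), the "pre-trained" weights, the blending factor `alpha`, and the way the anomaly score is squashed into the range 0 to 1. It is not a description of any vendor's fraud system.

```python
import math
from statistics import mean, pstdev


def supervised_score(tx, weights, bias):
    """Fraud probability from a logistic model (weights assumed pre-trained)."""
    z = bias + sum(weights[name] * tx[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


def anomaly_score(amount, history):
    """Unsupervised signal: distance of the amount from past behavior, in [0, 1]."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return 0.0 if amount == mu else 1.0
    z = abs(amount - mu) / sigma
    return 1.0 - math.exp(-z / 3.0)


def risk_score(tx, history, weights, bias, alpha=0.6):
    """Single risk score: weighted blend of the two signals."""
    supervised = supervised_score(tx, weights, bias)
    unsupervised = anomaly_score(tx["amount"], history)
    return alpha * supervised + (1 - alpha) * unsupervised


# Hypothetical model parameters and transactions.
WEIGHTS = {"new_device": 2.0, "foreign_ip": 1.5}
BIAS = -3.0
history = [20, 25, 30, 22, 28]
typical = {"amount": 24, "new_device": 0, "foreign_ip": 0}
suspicious = {"amount": 900, "new_device": 1, "foreign_ip": 1}

typical_risk = risk_score(typical, history, WEIGHTS, BIAS)
suspicious_risk = risk_score(suspicious, history, WEIGHTS, BIAS)
```

With these made-up numbers the suspicious transaction scores well above the typical one; in practice the blend and both component models would be fitted and validated on real transaction data.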


In terms of public impact, every selection includes the Deep Blue phenomenon. It was the first official victory of a “machine” against a reigning world champion under regular tournament conditions. After a first match won comfortably by chess grandmaster Garry Kasparov in 1996 (with a score of 4 – 2), the IBM team returned one year later with an upgraded version of Deep Blue and won the rematch (with a score of 3½ – 2½). Although many specialists of the time claimed that artificial intelligence had finally caught up with man, Deep Blue barely met the requirements to be considered an intelligent machine. It used custom VLSI chips to run a kind of brute-force search over the game tree (made tractable by alpha-beta pruning), so at its core there was not a neural network architecture but a tree search guided by an evaluation function. The way Deep Blue decided on a move depended on finding the optimal values for a wide set of parameters, and for this thousands of games played by professionals (grandmasters) were analyzed, which meant studying thousands of openings and endgames and tens of thousands of positions. Given that computer chess programs were still at an early stage, the match was more of a race to exploit the other’s weaknesses: the machine knew Kasparov’s style in depth (the IBM team was also allowed to adjust the parameters between games), and Kasparov relied on the fact that Deep Blue was greedy for material advantage, so he found it appropriate to set traps of that kind. Long story short, there were enough arguments for those inclined toward conspiracy theories (Kasparov included), who were left with suspicions about the fairness of the match, especially since IBM refused another rematch and dismantled the machine.
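
For readers unfamiliar with alpha-beta pruning, here is a minimal sketch of the technique on a tiny hand-made game tree. It illustrates the general algorithm only and says nothing about Deep Blue's actual implementation; the tree, the leaf values, and the `visited` bookkeeping are assumptions added to show which positions get skipped.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list game tree, skipping branches that cannot matter."""
    if isinstance(node, (int, float)):  # leaf: a static evaluation score
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent would never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:  # we already have a better option elsewhere
            break
    return best


# Depth-2 toy tree: we move first (maximizing), then the opponent replies.
TREE = [[3, 5, 6], [2, 9, 1], [7, 4, 8]]
visited = []
value = alphabeta(TREE, True, visited=visited)
```

On this tree the search returns the same value as plain minimax while evaluating 7 of the 9 leaves: once the second branch is seen to allow a reply worth 2, its remaining leaves (9 and 1) are never examined.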
The story was completely different in 2016, when no skepticism followed the crushing victory of AlphaGo over the world Go champion, Lee Sedol. A quite astonishing piece of distributed software supported by a team of more than 100 Google DeepMind scientists, AlphaGo relies entirely on a full-stack AI (neural network at the bottom, machine learning next, and deep learning on top). Running on 48 TPUs distributed over several machines, AlphaGo’s approach to choosing moves differs greatly from previous efforts, in the sense that the evaluation heuristic is not hand-modeled from a library of thousands of games by professional players but results largely from the experience of playing against an identical instance of AlphaGo. Successive versions optimized both the hardware and the time allowed for play, with AlphaGo continuously raising its level (the current version, AlphaZero, which runs on 4 TPUs on a single machine, is an optimized version of AlphaGo Zero, which in turn is an optimized version of AlphaGo Lee, the version that defeated Lee Sedol). This approach, freed from the need for human hard-coding, was the only one that could have paid off, given that Go’s level of complexity is far greater than that of chess. The number of possible positions in Go is greater than the estimated number of atoms in the universe: after the first two moves of a chess game there are 400 possible positions, while in Go there are close to 130,000. For that reason, AI researchers cannot use traditional brute-force AI, in which a program maps out the breadth of possible game states in a tree, because there are simply too many possible moves.
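
The two figures quoted above can be reproduced with simple counting, under the assumption that they refer to the number of move sequences after each player has made one move (an interpretation, since the text does not spell it out). The constants below follow from the standard rules: 20 legal first moves for each side in chess, and a 19 × 19 board in Go.

```python
# Chess: each side has 16 pawn moves and 4 knight moves available at the start.
CHESS_FIRST_MOVES = 16 + 4
chess_after_two_moves = CHESS_FIRST_MOVES * CHESS_FIRST_MOVES

# Go: the first stone can go on any of the 361 points, the second on any remaining one.
GO_POINTS = 19 * 19
go_after_two_moves = GO_POINTS * (GO_POINTS - 1)

ratio = go_after_two_moves / chess_after_two_moves
```

This gives 400 for chess and 129,960 for Go, so after just two moves the Go tree is already more than 300 times wider, and the gap keeps growing with every further move.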
Although initially many experts (including Elon Musk) estimated that, due to the complexity of Go, it would take at least 10 years for a machine to defeat a world champion, this happened at the first opportunity (unlike the case of Deep Blue and Garry Kasparov). The fact that AlphaGo is based on pure learning mechanisms and less on human examples indicates a huge potential for applying such algorithms in other areas important to humanity. Three years after the memorable match, Lee announced his retirement from professional play, arguing that he no longer felt like a top competitor in a Go world so authoritatively dominated by AI (Lee referred to the AlphaGo machine that defeated him as “an entity that cannot be defeated”).

Sentiment analysis

Another area of great impact is, in our view, “sentiment analysis” (or Emotion AI).
The first achievement that comes to mind is Microsoft Emotions, a platform for detecting sentiment in photos, launched as a service in 2016. Part of a larger project of the company, namely Project Oxford, it uses “world-class machine learning” to interpret people’s feelings as a cognitive service. The recognition engine is trained to detect eight emotions and, for each of them, calculates a score associated with the analyzed image (the emotions are: Anger, Contempt, Disgust, Fear, Happiness, Neutral, Sadness, and Surprise).
Although the platform isn’t available for free, it is declared by Microsoft to be in an experimental state, and is expected to gradually add other capabilities such as “spell check”, “speaker recognition”, and emotion recognition in videos.
VocalisHealth pioneers a complementary approach, aiming to detect vocal biomarkers. A biomarker is an indicator that signals the presence of a disease and, most often, its severity. VocalisHealth has also built a significant database of biomarkers for many known diseases (COVID-19 included), in the form of voice samples collected through more than 250,000 recordings from more than 50,000 people. Advanced machine learning and deep learning mechanisms are used to analyze new voice samples, integrated into a customized platform for healthcare screening, triage, and continuous remote monitoring of health.


Another area of impact is astronomy, where a team of astronomers and computer scientists from the University of Warwick recently identified 50 new planets using AI techniques, marking a technological breakthrough in astronomy. To do this, they built a machine learning algorithm for analyzing old NASA data containing thousands of potential candidates for planet status. The classic method of searching for exoplanets (planets outside our solar system) is to detect dips in the amount of light from a star under observation, a sign that a planet has passed between the telescope and that star. But these dips can also be caused by background interference or even camera errors.
The merit of the algorithm is that, once trained, it manages to accurately separate the planets from the false positives, and does so on old, unconfirmed data, resulting in this set of newly “registered” planets. The approach is a first, in the sense that such techniques had previously been used in astronomy only for the classification of planets, never for their validation in probabilistic terms.
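
The classic dip-based search described above can be sketched in a few lines. This is a deliberately crude illustration on synthetic data, not the Warwick team's algorithm: the function name, the depth threshold, the minimum dip length, and the light curve itself are all assumptions.

```python
from statistics import median


def find_dips(flux, depth=0.005, min_points=3):
    """Return (start, end) index pairs where brightness stays below baseline - depth."""
    baseline = median(flux)
    dips, start = [], None
    for i, value in enumerate(flux):
        below = value < baseline - depth
        if below and start is None:
            start = i  # a dip begins
        elif not below and start is not None:
            if i - start >= min_points:  # keep only sustained dips
                dips.append((start, i - 1))
            start = None
    if start is not None and len(flux) - start >= min_points:
        dips.append((start, len(flux) - 1))
    return dips


# Synthetic light curve: constant brightness, one transit-like dip, one glitch.
flux = [1.0] * 40
for i in range(10, 14):
    flux[i] = 0.99  # sustained 1% dip, as a transiting planet might cause
flux[25] = 0.98  # single-sample drop, as a camera error might cause

dips = find_dips(flux)
```

Here the minimum-length rule is enough to discard the one-sample glitch, but real false positives are far less tidy, which is why a trained model that weighs many properties of each candidate is attractive.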


We believe that many conclusions can be drawn at the end of this short history of AI, and we encourage you to find them for yourself and share them with us. Perhaps the most important thing to recognize is that, whether it is welcomed or feared, and whether or not it is considered truly competent and useful, AI is making its presence felt in more and more areas, in smaller or larger steps. Despite many current shortcomings, including difficult generalization and the need for huge computing power, there are clear advantages, such as the increasing attention given to research on mathematical models, the existence of large datasets for analysis, and the sympathy AI enjoys among the public, so we are entitled to believe that AI will be a partner in our lives for a long time to come. It depends only on our choices whether this partner will be a guarantor of a better quality of life and not a threat.