“All men are mortal, Socrates is a man, therefore Socrates is mortal.” This three-line Syllogism is an example of one of the most ancient forms of reasoning. It uses deductive reasoning to form a conclusion from two propositions: “All men are mortal” and “Socrates is a man”. The conclusion, “Socrates is mortal”, is new information.
Syllogisms are written in Natural Language and are therefore not as exact as modern symbolic reasoning systems. In addition, Syllogisms are open to invalid inferences such as: “Dogs are animals, cats are not dogs, therefore cats are not animals”. Because of these weaknesses, Syllogistic Reasoning fell out of favour as a rigorous logical reasoning system.
Syllogistic reasoning has been resuscitated and rebranded as Natural Logic, and the process as Natural Language Inference. This reawakening is mainly due to the increase in popularity of Natural Language Processing and to the work of academics such as Larry Moss from Indiana University and Christopher Manning from Stanford.
This episode, Bleeding Edge Episode 4, will cover the current state of the art in Natural Logic and Natural Language Inference and their practical applications for the Data Scientist.
The recent surge of interest in Natural Language Processing (NLP) from both academia and industry practitioners, together with the rise to prominence of large Neural Networks, often referred to as Deep Learning, has generated very large corpora such as iWeb: The Intelligent Web-based Corpus and the Common Crawl.
Within these very large datasets are numerous propositions that can be used in a computational Syllogistic Reasoning (Natural Reasoning) system. This raises the very real possibility of an automated, large-scale knowledge discovery system.
The propositions used in an automated Natural Reasoning system could be so numerous that deducing a conclusion from them is beyond human capacity. The brute power of computational systems and the freely available corpora could lead to non-obvious discoveries from existing information sources. Natural reasoning also raises the possibility of more human-like chat-bots, where systems infer new information not only from textual databases but also from the speech input provided by the people they are interacting with.
The formal definition of Natural Logic is provided by Karttunen who states:
“Natural Logic attempts to do formal reasoning in natural language making use of syntactic structure and the semantic properties of lexical items and constructions”.
In his paper, From Natural Logic to Natural Reasoning, presented at the CICLing 2015 conference, Karttunen provided an overview of the history of Natural Logic and concluded that there is no advantage in converting the logic of natural language into a formal representation.
What this means in practical terms for computer scientists is that reasoning can be done directly from logic embedded in the text rather than converting it to logical rules for logic programming languages such as Prolog.
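As a rough illustration of reasoning directly over text, the sketch below derives the Socrates conclusion from the two premise sentences without first translating them into a separate rule language. It is a toy under stated assumptions and not Karttunen's method: the two sentence patterns, the `deduce` function and the small `SINGULAR` lexicon are all hypothetical and exist only for this example.

```python
import re

# Hypothetical toy lexicon mapping plural nouns to their singular form.
# An assumption for illustration only; a real system would use a lemmatiser.
SINGULAR = {"men": "man", "dogs": "dog", "cats": "cat"}


def singular(noun):
    """Return the singular form if the toy lexicon knows it, else the noun itself."""
    return SINGULAR.get(noun, noun)


def deduce(premises):
    """Derive 'X is P' from 'All Ns are P' and 'X is a N', working on the sentences themselves."""
    universals = {}  # singular noun -> property stated for all members
    members = []     # (individual, singular noun) pairs
    for sentence in premises:
        text = sentence.strip().rstrip(".")
        match = re.fullmatch(r"All (\w+) are (\w+)", text)
        if match:
            universals[singular(match.group(1))] = match.group(2)
            continue
        match = re.fullmatch(r"(\w+) is an? (\w+)", text)
        if match:
            members.append((match.group(1), match.group(2)))
    return [f"{who} is {universals[noun]}" for who, noun in members if noun in universals]


print(deduce(["All men are mortal", "Socrates is a man"]))
```

The premises stay as plain sentences throughout; only surface patterns are matched. Real Natural Logic systems rely on syntactic structure and lexical semantics rather than two fixed patterns, so this shows the idea and nothing more.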
Fuzzy Natural Logic
Natural language is an imprecise medium when compared with precise mathematical language. Any logical deduction system that uses natural language as a basis for its propositions will need to deal with vagueness in the underlying medium. Fuzzy natural logic is a technique that combines fuzzy and natural logic to handle vagueness.
Fuzzy natural logic is defined by Novak as:
“…a mathematical theory that provides models of terms and rules that come with natural language and allow us to reason and argue in it. At the same time, the theory copes with the vagueness of natural language semantics”.
An over-simplistic explanation of Fuzzy Logic is that it is a set of rules with overlapping borders.
An overused example of Fuzzy Logic is a controller for air conditioning. Because there is no authoritative way to decide the exact temperatures that define cold, comfortable and hot, and because there is a lag between a temperature being reached and the air conditioning affecting the room temperature, fuzzy control rules overlap in a series of If/Then statements, as shown in the following diagram.
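The overlapping rules of such a controller can be sketched in a few lines of Python. This is a minimal sketch and not a production controller: the temperature breakpoints, the rule outputs and all function names are assumptions chosen for illustration, and the rule outputs are combined with a simple weighted average (a zero-order Sugeno-style scheme) rather than any particular vendor's method.

```python
def triangular(x, left, peak, right):
    """Membership rises linearly from left to peak, then falls linearly to right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def shoulder_down(x, full, zero):
    """Full membership up to `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def shoulder_up(x, zero, full):
    """No membership up to `zero`, rising linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def memberships(temp_c):
    """Degrees to which a temperature is cold, comfortable and hot (assumed breakpoints)."""
    return {
        "cold": shoulder_down(temp_c, 16.0, 21.0),
        "comfortable": triangular(temp_c, 18.0, 22.0, 26.0),
        "hot": shoulder_up(temp_c, 23.0, 28.0),
    }


def cooling_power(temp_c):
    """IF cold THEN 0%, IF comfortable THEN 30%, IF hot THEN 100%, blended by membership."""
    outputs = {"cold": 0.0, "comfortable": 30.0, "hot": 100.0}  # assumed rule outputs
    degrees = memberships(temp_c)
    total = sum(degrees.values())  # never zero: the three sets cover every temperature
    return sum(degrees[name] * outputs[name] for name in outputs) / total


for temperature in (15, 20, 24, 30):
    print(temperature, memberships(temperature), cooling_power(temperature))
```

At 20 degrees the temperature is partly cold and partly comfortable at the same time, which is the overlap the If/Then rules rely on.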
Fuzzy Natural Logic can be used to approximate the meaning of inexact expressions such as ‘not very big’ or ‘very small’. The overlapping semantic meanings of ‘extremely small’, ‘very small’, ‘small’, ‘medium’ and ‘big’, as defined by Novak, are shown in the following figure. As expected, extremely small and very small overlap, whereas big and small do not.
Fuzzy natural logic is a first step towards disambiguating the inexactness of human expression in natural language.
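The overlap between these expressions can be imitated with Zadeh's classic linguistic hedges, where ‘very’ squares a membership degree. This is a simpler, swapped-in technique and not Novak's model. The 0 to 1 size scale, the shapes of `small` and `big`, and the cube used for ‘extremely’ are all assumptions made for this sketch.

```python
def small(x):
    """Assumed membership of 'small' on a 0..1 size scale: 1 at 0, falling to 0 at 0.5."""
    return max(0.0, 1.0 - x / 0.5)


def big(x):
    """Assumed membership of 'big': 0 up to 0.5, rising to 1 at 1.0."""
    return max(0.0, min(1.0, (x - 0.5) / 0.5))


def very(membership):
    """Zadeh-style concentration hedge: squaring narrows the fuzzy set."""
    return lambda x: membership(x) ** 2


def extremely(membership):
    """A stronger concentration; the cube is an assumed convention for this sketch."""
    return lambda x: membership(x) ** 3


very_small = very(small)
extremely_small = extremely(small)

for size in (0.1, 0.2, 0.4, 0.8):
    print(size, small(size), very_small(size), extremely_small(size), big(size))
```

For any size below 0.5 all three ‘small’ expressions hold to some degree, so they overlap, while no size is both small and big under these assumed shapes.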
To date there seem to be limited applications of Fuzzy Natural Logic; however, Novak suggests that it can be applied to decision making, classification and evaluation, time series forecasting, and mining information from time series in the form of natural language expressions.
The development of Fuzzy Natural Logic will without doubt be seen as an important milestone on the roadmap towards the practical application of natural logic.
Applications of Natural Logic
Although Natural Logic has long been of academic interest, as hinted at in the introduction, it now has some practical applications. An obvious and popular application is question-answering systems, where users enter a question and the system returns a natural-sounding answer.
Natural logic has been used by researchers such as Christopher Manning of Stanford University to improve question-answering systems. Together with his graduate student Neha Nayak, he combined shallow reasoning and natural logic for question-answering.
They combined these techniques because, they argue, question-answering is a hard task: structured knowledge bases are lacking, and therefore many systems rely upon unstructured textual databases.
They claim that shallow reasoning gives a broad coverage of information in the textual database whereas natural logic provides answers with high precision. Their approach views question-answering as a textual entailment task, where one sentence is inferred from another.
They used an open-source tool called NaturalLI which was designed to infer if common-sense facts are true by analysing large collections of known facts. Then, they extended it to work on dependency trees rather than a sequence of tokens.
They evaluated it on the Aristo dataset, which is a collection of 8th-grade science questions, and found that it improved upon the basic NaturalLI system as well as outscoring Knowbot, a system developed by Ben Hixon.
A related task to question-answering is common sense reasoning where artificial agents are able to make presumptions or inferences about common everyday situations.
The original example shown in the introduction about Socrates demonstrates that natural logic can make deductions, and with sufficient information would be able to make deductions about the majority of everyday situations.
The amount of publicity for self-driving cars has raised the question of whether they need common-sense reasoning.
The argument is that current machine learning techniques generalise about various situations encountered when driving, but it is not clear whether this type of technique is sufficient for all situations. For example, it is ‘common-sense’ that when driving behind a truck with an uncovered and unsecured load, at some stage this load will detach. Common-sense reasoning techniques may be able to assist the self-driving car to take evasive action before the inevitable result.
Gabor Angeli and Christopher Manning have demonstrated that Natural Logic can be used to extend and generalise facts. For example, in their paper, they were able to deduce from the statement: ‘a cat ate a mouse’, that ‘no carnivores eat animals’ was false.
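The cat and mouse example can be approximated with a toy sketch. In ‘a cat ate a mouse’ both noun positions can be replaced by more general terms without losing truth, so the fact entails ‘a carnivore ate an animal’, which contradicts ‘no carnivores eat animals’. The code below is not the authors' system: the `HYPERNYMS` lexicon, the triple representation of facts and every function name are assumptions made for illustration.

```python
# Hypothetical toy hypernym lexicon (child -> parent); an assumption for illustration.
HYPERNYMS = {
    "cat": "carnivore",
    "carnivore": "animal",
    "mouse": "rodent",
    "rodent": "animal",
}


def ancestors(term):
    """Return the term followed by all of its hypernyms, most specific first."""
    chain = [term]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain


def generalisations(fact):
    """All facts entailed by generalising the subject and object of a (subject, verb, object) triple."""
    subject, verb, obj = fact
    return {(s, verb, o) for s in ancestors(subject) for o in ancestors(obj)}


def refutes(fact, universal_negative):
    """True if the fact contradicts a claim of the form 'no <subject> <verb> <object>'."""
    return universal_negative in generalisations(fact)


known_fact = ("cat", "eat", "mouse")
print(refutes(known_fact, ("carnivore", "eat", "animal")))
```

The hand-written lexicon stands in for the much larger lexical resources a real system would search over, and verbs are assumed to be already lemmatised.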
Although this technique is only an early step towards strong artificial intelligence, this database completion task shows the potential of natural logic in common-sense reasoning and, by extension, in self-driving cars and other impactful artificial intelligence tasks.
Knowledge representation is a common task in the Artificial Intelligence community. Frequently computer scientists use ontologies, which are directed graphs that connect entities (nodes) with edges. The edges can have properties which describe the connection, and the nodes have properties as well as a class membership.
The whole system is typically underpinned with description logic, through which a reasoner can make inferences about the information stored. The ontology can be queried through languages such as SPARQL. Natural logic can be adapted to make approximations of ontologies from text where the connections are derived directly from logical statements in text.
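A very small stand-in for an ontology with a reasoner can be sketched as a set of labelled edges plus a transitive inference over class membership. This is a toy under stated assumptions: the `TinyOntology` class and the edge labels `instance_of` and `subclass_of` are hypothetical, and the inference shown is far weaker than a description logic reasoner or a SPARQL endpoint.

```python
from collections import defaultdict


class TinyOntology:
    """A toy graph of labelled edges: (subject, predicate) -> set of objects."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Store one labelled edge, e.g. ('man', 'subclass_of', 'human')."""
        self.edges[(subject, predicate)].add(obj)

    def superclasses(self, cls):
        """All classes reachable through 'subclass_of' edges (transitive closure)."""
        seen, stack = set(), [cls]
        while stack:
            current = stack.pop()
            for parent in self.edges.get((current, "subclass_of"), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def classes_of(self, individual):
        """Stated classes of an individual plus every class inferred from the hierarchy."""
        direct = self.edges.get((individual, "instance_of"), set())
        inferred = set(direct)
        for cls in direct:
            inferred |= self.superclasses(cls)
        return inferred


ontology = TinyOntology()
ontology.add("Socrates", "instance_of", "man")
ontology.add("man", "subclass_of", "human")
ontology.add("human", "subclass_of", "mortal")
print(sorted(ontology.classes_of("Socrates")))
```

Only the first edge is stated about Socrates directly; membership of `human` and `mortal` is inferred. In a natural logic setting the edges would be derived from logical statements in text rather than entered by hand.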
Andreasen created a natural logic knowledge base by using the logical assertions in text for the life science domain. In addition, he suggested that natural logic could be used to ‘fill in the gaps’ in existing ontologies.
Knowledge discovery is a common task in text mining where ‘new knowledge’ can be inferred or discovered through correlations of facts from multiple data sources. The techniques are simple, and often do not require any reasoning capacity. Natural logic provides an opportunity to deduce new knowledge through logical deduction from large text corpora.
Although no research was found that uses Natural Logic to deduce new knowledge from large-scale corpora, it is reasonable to assume that this is possible because logical inference is one of the human race’s oldest methods of inferring new knowledge. Large corpora contain a large number of propositional statements, and domain-specific corpora contain propositional statements about a specific subject.
Because the value of any logical deduction system or machine learning strategy is dependent upon data rather than the strategy or system, it would be reasonable to assume that Natural Logic systems will be able to deduce a significant amount of new information.
The general renewed interest in Artificial Intelligence has stimulated research efforts in logical reasoning systems such as Natural Logic.
Logical inference can be seen as one of the first steps on the path to Strong AI where artificial systems can match or exceed human intelligence.
Current efforts in machine learning can be dismissed as curve fitting that lacks general intelligence.
Although Natural Logic systems alone will not be considered Strong AI, it is likely that Natural Logic will be an important part of future Strong AI techniques.