Article snapshot taken from [REDACTED] under the Creative Commons Attribution-ShareAlike license.
Golem takes as input a definite program B as background knowledge together with sets of positive and negative examples, denoted E⁺ and E⁻ respectively. The overall idea is to construct the least general generalisation of E⁺ with respect to the background knowledge. However, if B is not merely a finite set of ground atoms, then this relative least general generalisation may not exist.
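The building block of a least general generalisation is anti-unification: two atoms with the same predicate are generalised by replacing differing subterms with variables, reusing the same variable for the same pair of subterms. The following is a minimal Python sketch of that idea, not Golem's actual implementation. The tuple encoding of terms, the function names `lgg_term` and `lgg_atom`, the variable names `V0`, `V1`, ... and the sample atoms are all assumptions made for illustration.

```python
def lgg_term(s, t, table):
    """Anti-unify two terms.

    A term is either a string (a constant) or a tuple
    (functor, arg1, arg2, ...). `table` remembers which variable
    stands for which pair of differing subterms.
    """
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same functor and arity: generalise argument by argument.
        return (s[0],) + tuple(lgg_term(a, b, table)
                               for a, b in zip(s[1:], t[1:]))
    # Differing subterms become a variable; the same pair of
    # subterms is always mapped to the same variable.
    if (s, t) not in table:
        table[(s, t)] = f"V{len(table)}"
    return table[(s, t)]


def lgg_atom(a, b, table):
    """lgg of two atoms, or None if predicate symbol or arity differ."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    return lgg_term(a, b, table)


table = {}
print(lgg_atom(("dau", "m", "h"), ("dau", "e", "t"), table))
# ('dau', 'V0', 'V1')
print(lgg_atom(("par", "h", "m"), ("par", "t", "e"), table))
# ('par', 'V1', 'V0')
```

Sharing one `table` across several atoms is what makes the variables line up between literals of the same clause: here `V0` stands for the pair (m, e) and `V1` for the pair (h, t) in both results.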
Therefore, rather than using B directly, Golem uses the set of all ground atoms that can be resolved from B in at most h resolution steps. An additional difficulty is that if the set of negative examples E⁻ is non-empty, the least general generalisation of the positive examples E⁺ may entail a negative example. In this case, Golem generalises different subsets of E⁺ separately to obtain a program of several clauses.
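To illustrate how a definite program can be replaced by a finite set of ground atoms, the sketch below derives atoms by naive forward chaining and stops after h rounds. Counting rounds of forward chaining is only a stand-in for counting resolution steps, so this is an approximation of the idea rather than Golem's exact construction. The rule encoding, the function names `match` and `bounded_model`, and the `par`/`anc` program over constants a to d are hypothetical.

```python
def match(pattern, fact, subst):
    """Extend `subst` so that `pattern` equals the ground `fact`, else None.

    Arguments starting with an upper-case letter are variables
    (Prolog convention); everything else is a constant.
    """
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    subst = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if p[0].isupper():
            if subst.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return subst


def bounded_model(facts, rules, h):
    """Ground atoms derivable in at most h rounds of forward chaining.

    Rules are (head, [body literals]) pairs and are assumed to be
    range-restricted (every head variable occurs in the body).
    """
    known = set(facts)
    for _ in range(h):
        new = set()
        for head, body in rules:
            substs = [{}]
            for lit in body:
                substs = [s2 for s in substs for f in known
                          if (s2 := match(lit, f, s)) is not None]
            for s in substs:
                new.add((head[0],) + tuple(s.get(a, a) for a in head[1:]))
        if new <= known:
            break  # fixpoint reached before the bound
        known |= new
    return known


facts = [("par", "a", "b"), ("par", "b", "c"), ("par", "c", "d")]
rules = [
    (("anc", "X", "Y"), [("par", "X", "Y")]),
    (("anc", "X", "Z"), [("par", "X", "Y"), ("anc", "Y", "Z")]),
]
for h in (1, 2, 3):
    print(h, len(bounded_model(facts, rules, h)))
```

With these inputs the set grows from 6 atoms at h = 1 to 8 at h = 2 and 9 at h = 3, which shows how the bound h trades completeness of the background knowledge against its size.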
Golem also employs some restrictions on the hypothesis space, ensuring that relative least general generalisations have size polynomial in the number of training examples. Golem demands that all variables in the head of a clause also appear in a literal of the clause body; that the number of substitutions needed to instantiate existentially quantified variables introduced in a literal is bounded; and that the depth of the chain of substitutions needed to instantiate such a variable is also bounded.
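The first of these restrictions is easy to state as a check on a single clause. The sketch below covers only that restriction (head variables must reappear in the body) and leaves out the two bounds on substitutions. The clause encoding and the function name `head_vars_in_body` are assumptions for illustration.

```python
def head_vars_in_body(head, body):
    """True if every variable in the head also occurs in some body literal.

    Literals are tuples (predicate, arg1, ...); arguments starting with
    an upper-case letter are treated as variables.
    """
    def is_var(term):
        return term[:1].isupper()

    head_vars = {a for a in head[1:] if is_var(a)}
    body_vars = {a for lit in body for a in lit[1:] if is_var(a)}
    return head_vars <= body_vars


# dau(X, Y) <- par(Y, X), fem(X): both head variables occur in the body.
print(head_vars_in_body(("dau", "X", "Y"), [("par", "Y", "X"), ("fem", "X")]))
# True

# dau(X, Y) <- fem(X): Y occurs only in the head, so the clause is rejected.
print(head_vars_in_body(("dau", "X", "Y"), [("fem", "X")]))
# False
```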
Example
The following example about learning definitions of family relations uses the abbreviations
par: parent, fem: female, dau: daughter, g: George, h: Helen, m: Mary, t: Tom, n: Nancy, and e: Eve.
It starts from the background knowledge (cf. picture), given as ground par and fem literals about these people; the positive examples, given as ground dau literals; and the trivial proposition true, which denotes the absence of negative examples.
The relative least general generalisation is now computed as follows to obtain a definition of the daughter relation.
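Relativise each positive example literal with the complete background knowledge, then anti-unify each compatible pair of literals from the resulting clauses:
a ground literal results from anti-unifying a negated background-knowledge literal with itself, and similarly for all other background-knowledge literals
a literal containing variables results from anti-unifying two different negated literals with the same predicate symbol, and there are many more such negated literals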
Delete all negated literals containing variables that don't occur in a positive literal:
after deleting all negated literals containing variables other than x and y, only dau(x, y) ∨ ¬par(y, x) ∨ ¬fem(x) remains, together with all ground literals from the background knowledge
Convert clauses back to Horn form: dau(x, y) ← par(y, x) ∧ fem(x), together with the background-knowledge facts.
The resulting Horn clause is the hypothesis h obtained by Golem. Informally, the clause reads "x is called a daughter of y if y is the parent of x and x is female", which is a commonly accepted definition.
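The steps above can be sketched end to end in a few lines of Python. This is a simplified illustration for two positive examples and function-free ground facts, not Golem's implementation. The concrete background facts and positive examples below are assumptions chosen to fit the abbreviations of the example, and the names `anti_unify`, `is_var` and `rlgg` are hypothetical.

```python
from itertools import product

# Assumed background knowledge and positive examples (illustrative only).
background = [
    ("par", "h", "m"), ("par", "h", "t"), ("par", "g", "m"),
    ("par", "t", "e"), ("par", "n", "e"),
    ("fem", "h"), ("fem", "m"), ("fem", "n"), ("fem", "e"),
]
positives = [("dau", "m", "h"), ("dau", "e", "t")]


def anti_unify(a, b, table):
    """lgg of two ground atoms; None if predicate symbol or arity differ."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    args = []
    for s, t in zip(a[1:], b[1:]):
        # Equal constants are kept; a differing pair gets one shared variable.
        args.append(s if s == t else table.setdefault((s, t), f"X_{s}{t}"))
    return (a[0],) + tuple(args)


def is_var(term):
    return term.startswith("X_")


def rlgg(ex1, ex2, bg):
    table = {}
    # Relativise: each example is read as the clause "example <- bg",
    # so the heads are the examples and both bodies are the facts in bg.
    head = anti_unify(ex1, ex2, table)
    # Anti-unify every compatible pair of (negated) body literals.
    body = {lit for a, b in product(bg, bg)
            if (lit := anti_unify(a, b, table)) is not None}
    # Delete body literals containing variables that are not in the head.
    head_vars = {t for t in head[1:] if is_var(t)}
    body = {lit for lit in body
            if all(t in head_vars for t in lit[1:] if is_var(t))}
    return head, body


head, body = rlgg(positives[0], positives[1], background)
# Ignore the ground background facts when printing the hypothesis.
non_ground = sorted(lit for lit in body if any(is_var(t) for t in lit[1:]))
print(head, "<-", non_ground)
# ('dau', 'X_me', 'X_ht') <- [('fem', 'X_me'), ('par', 'X_ht', 'X_me')]
```

Here `X_me` generalises the constants m and e, and `X_ht` generalises h and t, so the printed clause is the daughter definition described above. The pruned body still contains the nine ground background facts, which the final print leaves out.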
References
Muggleton, Stephen H.; Feng, Cao (1990). Arikawa, Setsuo; Goto, Shigeki; Ohsuga, Setsuo; Yokomori, Takashi (eds.). "Efficient Induction of Logic Programs". Algorithmic Learning Theory, First International Workshop, ALT '90, Tokyo, Japan, October 8-10, 1990, Proceedings. Springer/Ohmsha: 368–381.
Nienhuys-Cheng, Shan-hwei; Wolf, Ronald de (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science; Lecture Notes in Artificial Intelligence. Berlin Heidelberg: Springer. pp. 354–358. ISBN 978-3-540-62927-6.
Aha, David W. (1992). "Relating relational learning algorithms". In Muggleton, Stephen (ed.). Inductive logic programming. London: Academic Press. p. 247.
Nienhuys-Cheng, Shan-hwei; Wolf, Ronald de (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science; Lecture Notes in Artificial Intelligence. Berlin Heidelberg: Springer. p. 286. ISBN 978-3-540-62927-6.
Two literals are compatible if they share the same predicate symbol and negated/unnegated status.
In general, an n-tuple of literals is anti-unified when n positive example literals are given.