Computational Logic and Machine Learning

C o m p u t a t i o n a l L o g i c

Computational Logic and Machine Learning

Computational Logic and Machine Learning

Representation Issues

Prague, September 20, 1997

Michele Sebag

The area meeting of CompulogNet devoted to Representation Issues in Reasoning and Learning was organized in Prague by P. Flach and N. Lavrac, immediately after ILP'97.

Luc de Raedt recalled the different formalisms used for learning in first order logic: learning from interpretations (examples are interpretations, i.e. sets of atoms); learning from entailment (examples are clauses); and learning from satisfiability (examples are clausal theories). These settings are nested (learning from interpretations I learning from entailment I learning from satisfiability) which allows for propagating the PAC results (and situating learning systems) along this scale: negative PAC results for learning from interpretations still hold for learning from satisfiability, and conversely, positive PAC results for learning from satisfiability would still hold for learning from interpretations. This finally leads to a unifying framework for the different logical settings: learning from satisfiability, which subsumes all previous settings.

Simon Anthony and Alan Frisch showed how meta-languages can help reduce the cost of induction, and improve the processing of numerical information. The cost of induction is greatly increased by the redundancy of the search, i.e. the fact that the same clause may be discovered several times (e.g. clauses p(X) :- a(X), b(X) and p(X) :- b(X),a(X)). This could be limited by setting a partial order over literals, (ie over the predicates and the symbols in their arguments), thereby decreasing the number of possible refinement paths. Meta-languages can also support the discovery of numerical literals - and handling numbers is indeed a major concern in ILP. The question is to determine the numerical constants involved in such literals, in order to improve the accuracy of the whole clause. One possibility is to handle these constants as if they were variables (Skolem constants or meta-variables); the choice of these constants can then be delegated to specific or external (e.g. regression) routines. To sum up, learners need to avoid making over-commitments during learning; and these can be avoided using meta-languages at different stages in the learning task.

Ashwin Srinivasan made us benefit from his experience regarding ILP applications to molecular biology and bio-chemistry, with five (plus one) lessons.

* First lesson is: clearly identify problem areas where propositional learning has been shown to be insufficient; then focus on the tradeoff online - off-line efficiency. More precisely: ILP can deal with a general-purpose description of the examples; but a propositional learner dealing with a specific, hand-tuned description of the examples, might outperform the ILP learner. What you gain with ILP is the pre-processing phase; furthermore, ILP can suggest relevant features and thereby contribute to an efficient re-representation of the problem.

* In real-world domains, one digs the examples and background knowledge in the specialized literature (true for pre-Web times only ?) and these have to be debugged (still true for some time, I'm afraid). Debugging is easier in a declarative than in a procedural context.

* Once debugged, an efficient encoding of the background knowledge is essential.

* The optimality function (which kind of knowledge is to be preferred) is problem-specific: the Minimum Description Length may not necessarily correspond to a relevant preference order.

* Comprehensibility is problem-specific, too: e.g. a chemist might prefer a representation of knowledge, other than Prolog clauses.

* Last, have a log-book ! One easily forgets the failures and dead ends as time goes by and drawbacks are alleviated...). A. Srinivasan last underlined that a tight collaboration with an expert of the problem domain is a key factor of success, and he acknowledged the major contribution of Ross King to the breakthrough of ILP in bio-chemistry; the role of another domain expert, Mike Sternberg, was highlighted too. Bravo.

Fabrizio Riguzzi described an extended formalism for ILP, based on Abductive Logic Programming. Here, the stress is put on dealing with incomplete background knowledge; this is strongly required for learning from e.g. questionnaires, which always include unanswered questions. Abductive Logic Programming resembles learning from entailment, as far as rules are learned from (incomplete) clauses. But abductive entailment involves making additional assumptions, e.g. the grand-father is male. And this completion of the available information must be as regular as possible: the final interpretations (examples + assumptions) give rise to integrity constraints, e.g. a human being is either male or female. Abductive Logic Programming therefore also resembles learning from interpretations. One price to pay for this powerful setting comes from its non-monotonicity. In theory, one should check the consistency of the set of assumptions, with regards to one another, and/or with regards to further examples. In the worst case, the previous assumptions would prevent from finding any explanation for the current examples and backtracking would be required. But this does not occur in practice, as examples are ''sufficiently'' independent in most applications.

John Lloyd presented a new higher order logic language, termed Escher, that addresses many limitations of Prolog regarding lists, functions, negation, and so forth. Within Escher, a predicate is defined via a set of equations: these are much more natural than clauses, and easier to write as disjunction and explicit existential quantification are allowed. These equations correspond to a partition of the instances of the predicate; and the left side of the equation should give all means to check whether this equation is relevant to a given context. From the point of view of induction, Escher offers a clean way to express learning bias, and specify: the role of predicate arguments; the allowed number of occurrences of a predicate, including the connectives; a functional description of the search space (e.g. an upper bound on some complexity measure of the candidate solutions); and so forth.

As expected, the more powerful the language, the more one has difficulty in designing a learner, and the more expensive the learning search is (same remark as for the non-monotonic abductive learning setting); these are still open questions.

And what about inventing another, still more powerful, language, suggested P. Stepanek ? An object oriented language, subsuming Escher, and named Bach...

Jean-Francois Puget advocated for a fruitful collaboration between the Constrained Logic Programming (CLP) and the ILP communities. He sketched several possibilities for learning numerical literals in the same context as S. Anthony and A. Frisch, such as reformulating this problem as a constraint satisfaction problem (CSP) or an optimization problem. Another approach would consist of keeping the implicit description of such a literal given by the CSP, and directly check further examples against the CSP.

Conversely, a constraint solver could indeed benefit from a learning module able to generalize the previous search failures; the solver could then take advantage of such generalizations to prune further combinatorial exploration. Another direction for fruitful interaction is for ILP to refine the representation of a CSP (e.g. taking advantage of the symmetries, invariance and so forth, of the problem) in order to optimize the search.

Last, and not least, Celine Rouveirol presented three case studies illustrating how representational changes can support learning.

* The flattening operator proposed by C. Rouveirol and J.F. Puget allows one to get rid of function symbols; let us recall that most ILP learners do not presently deal with function symbols other than constants. Flattening can significantly decrease the complexity of learning a theory with an infinite Herbrand model, as one judiciously chooses the finite model of the flattened theory among the subsets of the infinite one.

* Some ILP problems (handling noise, numbers) can be addressed by mapping first-order onto attribute-value examples. LINUS (Lavrac and Dzeroski) uses syntactic restrictions to ensure a one-to-one mapping. The mapping is one-to-exponentially-many in the disjunctive version space (Sebag), which involves no syntactic restrictions. The latter mapping is combined with a stochastic bias in STILL (Sebag and Rouveirol), to enjoy a bounded (polynomial and tunable) complexity.

* A third change of representation is concerned with tuning the granularity of search when learning from databases. Different mappings can be defined from a DB to a LP representation: a table with $n$ attributes can be represented as a single n-ary predicate (large grain representation), or n-1 binary predicates (fine grain representation). The strategy of exploring the large grain lattice until inconsistencies are found, and only then switching to the fine grain lattice, dramatically decreases the number of queries to the DB. Such representation changes are implemented in Haiku (Nedellec and Rouveirol) and RDT/DB (Morik and Buchausen).

To sum up, most work is concerned with extending the learning language. These extensions aim at: dealing with more complex or incomplete examples; expressing more powerful learning biases; handling numbers more efficiently. An alternative to extending the language consists in reformulating the examples or the learning task.

Last, as ILP is getting mature, the first real-world applications deliver interesting feedback. This feedback is partly expected (the expert should be able to control the whole learning process) and occasionally counter-intuitive (Prolog programs are not considered understandable by all experts).

Michele Sebag

LMS, Ecole Polytechnique & LRI,

Universite d'Orsay

Michele.Sebag@polytechnique.fr

[ Coordinator's Report ] [ Computational Logic and Machine Learning ] [ BOOK ANNOUNCEMENT ] [ The 7th International Workshop on Inductive Logic Programming (ILP-97) ] [ Biomedical Applications of Computational Logic and Machine Learning ] [ Data Mining and Knowledge Discovery ] [ International Summer School on Inductive Logic Programming and Knowledge Discovery in Databases ] [ Frontiers of Inductive Logic Programming ] [ Abduction and Induction in AI ]

[ Home ] [ Automated Deduction Systems ] [ Computational Logic & Machine Learning ] [ Concurrent & Constraint Logic Programming ] [ Language Design, Semantics & Verification Methods ] [ Logic Based Databases ] [ Program Development ] [ Knowledge Representation & Reasoning ]