José Luis Balcázar (Universitat Politècnica de Catalunya and Universidad de Cantabria)
Towards a Logic of Association Rules: Deduction,
Optimum Axiomatizations, and Objective Novelty
An association rule is a form of partial
implication between two terms (sets of propositional variables, understood
conjunctively). In the case of standard implications, we are just back in Horn
logic; but, in association rules, the notion of implication is redefined to
allow exceptions or different populations. Association rules are among the most
widely employed data analysis methods in the field of Data Mining.
Naive uses of association miners often
produce far too many mined associations to be actually useful in
practice. Many proposals exist for selecting appropriate association rules,
trying to measure their interest in various ways; most of these approaches are
statistical in nature, or share their main traits with statistical notions. In
the most common approach, association rules are parameterized by a lower bound
on their confidence, which is the empirical conditional probability of their
consequent given the antecedent, and/or by some other parameter bounds such as
"support" or deviation from independence.
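
As an illustration of the two parameters just mentioned, the following minimal sketch computes support and confidence over a small, made-up list of transactions. The function names and the toy data are assumptions made for this example only; they are not taken from the author's system.

```python
from typing import FrozenSet, Iterable, List, Optional

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str],
               consequent: Iterable[str],
               transactions: List[Transaction]) -> Optional[float]:
    """Empirical conditional probability of the consequent given the antecedent.

    Returns None when no transaction contains the antecedent, since the
    conditional probability is then undefined.
    """
    ante = frozenset(antecedent)
    both = ante | frozenset(consequent)
    covered = [t for t in transactions if ante <= t]
    if not covered:
        return None
    return sum(1 for t in covered if both <= t) / len(covered)


# Hypothetical toy dataset: each transaction is a set of propositional variables.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
]

print(support({"a", "b"}, transactions))       # 3 of 5 transactions
print(confidence({"a"}, {"b"}, transactions))  # 3 of the 4 transactions with "a"
```

A miner would then keep only the rules whose confidence (and possibly support) reaches the chosen lower bound.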
Alternatively, some existing notions of
redundancy among association rules allow for a logical-style characterization
and lead to irredundant bases (axiomatizations) of absolutely minimum size. We
will discuss notions of redundancy, that is, of logical entailment, among association rules, and how to
complement the association rule mining process by also filtering the obtained
rules according to their novelty, measured in a relative way with respect to
the confidences of related rules.
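
To make the idea of redundancy concrete, here is a minimal sketch of one simple syntactic case, which may be narrower than the notions discussed in the talk. If a rule X -> Y reaches some confidence threshold, then so does X' -> Y' whenever X is contained in X' and X' ∪ Y' is contained in X ∪ Y, because the support of X' ∪ Y' is at least that of X ∪ Y while the support of X' is at most that of X. The function name and the example rules are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

Rule = Tuple[Iterable[str], Iterable[str]]  # (antecedent, consequent)


def follows_from(rule: Rule, other: Rule) -> bool:
    """True if `rule` = (X', Y') is redundant with respect to `other` = (X, Y).

    Checks the syntactic condition X <= X' and X' | Y' <= X | Y, under which
    the confidence of `rule` is at least the confidence of `other` in every
    dataset.
    """
    x2: FrozenSet[str] = frozenset(rule[0])
    y2: FrozenSet[str] = frozenset(rule[1])
    x1: FrozenSet[str] = frozenset(other[0])
    y1: FrozenSet[str] = frozenset(other[1])
    return x1 <= x2 and (x2 | y2) <= (x1 | y1)


# Hypothetical rules over variables a, b, c.
general = ({"a"}, {"b", "c"})      # a -> bc
weaker = ({"a", "b"}, {"c"})       # ab -> c, entailed by a -> bc
unrelated = ({"b"}, {"c"})         # b -> c, not covered by this condition

print(follows_from(weaker, general))
print(follows_from(unrelated, general))
```

Rules that pass such a check against an already reported rule add no information at the same threshold, so they can be dropped from the output.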
Recent papers describing these advances are
available from the author's webpage. Additionally, we offer a
preliminary version of a rule-mining proof-of-concept system implementing our
contributions.