Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining

Petra Kralj Novak, Nada Lavrač, Geoffrey I. Webb

Research output: Contribution to journal › Article › peer-review

Abstract

This paper gives a survey of contrast set mining (CSM), emerging pattern mining (EPM), and subgroup discovery (SD) in a unifying framework named supervised descriptive rule discovery. While all these research areas aim at discovering patterns in the form of rules induced from labeled data, they use different terminology and task definitions, claim to have different goals, claim to use different rule learning heuristics, and use different means for selecting subsets of induced patterns. This paper contributes a novel understanding of these subareas of data mining by presenting a unified terminology, by explaining the apparent differences between the learning tasks as variants of a unique supervised descriptive rule discovery task, and by exploring the apparent differences between the approaches. It also shows that various rule learning heuristics used in CSM, EPM and SD algorithms all aim at optimizing a trade-off between rule coverage and precision. The commonalities (and differences) between the approaches are showcased on a selection of best-known variants of CSM, EPM and SD algorithms. The paper also provides a critical survey of existing supervised descriptive rule discovery visualization methods.

Original language: English
Pages (from-to): 377-403
Number of pages: 27
Journal: Journal of Machine Learning Research
Volume: 10
State: Published - Jan 2009
Externally published: Yes

Keywords

  • Contrast set mining
  • Descriptive rules
  • Emerging patterns
  • Rule learning
  • Subgroup discovery
