Closed sets for labeled data

Gemma C. Garriga, Petra Kralj, Nada Lavrač

Research output: Contribution to journal › Article › peer-review

Abstract

Closed sets have proven successful in the context of compacted data representation for association rule learning. However, their use is mainly descriptive, dealing only with unlabeled data. This paper shows that when considering labeled data, closed sets can be adapted for classification and discrimination purposes by conveniently contrasting covering properties on positive and negative examples. We formally prove that these sets characterize the space of relevant combinations of features for discriminating the target class. In practice, identifying relevant/irrelevant combinations of features through closed sets is useful in many applications: to compact emerging patterns of typical descriptive mining applications, to reduce the number of essential rules in classification, and to efficiently learn subgroup descriptions, as demonstrated in real-life subgroup discovery experiments on a high-dimensional microarray data set.

Original language: English
Pages (from-to): 559-580
Number of pages: 22
Journal: Journal of Machine Learning Research
Volume: 9
State: Published - Apr 2008
Externally published: Yes

Keywords

  • Closed sets
  • Emerging patterns
  • Essential rules
  • ROC space
  • Rule relevancy
  • Subgroup discovery
