On the Splitting Properties of Common Attribute Evaluation Functions

Tapio Elomaa and Juho Rousu: On the Splitting Properties of Common Attribute Evaluation Functions. Report C-2000-1, Department of Computer Science, University of Helsinki, January 2000. 19 pages. <http://www.cs.helsinki.fi/TR/C-2000/1>

Full paper: Postscript file

Abstract

The complexity of numerical domain partitioning depends on the number of potential cut points. For a large family of attribute evaluation functions only boundary points need to be considered as candidates. We prove that an even more general property holds for many commonly-used functions. They do not obtain their optimal value within a segment of examples in which the relative class frequency distribution of examples is static. The borders of such segments are a subset of boundary points. Thus, even less cut points need to be examined for these functions.

The results shed a new light on the splitting properties of common attribute evaluation functions and they have practical value as well. The functions that are examined also include non-convex ones. Hence, the property introduced is not just another consequence of the convexity of a function.

Index Terms

Categories and Subject Descriptors:
I.2.6 Learning [Artificial Intelligence]: Concept learning, Induction, Knowledge acquisition

General Terms: Learning, Analysis

Additional Key Words and Phrases: numerical attributes, boundary points, decision trees.

Online Publications of Department of Computer Science, Anna Pienimaki

Last updated Mar 06, 2001