Inheritance diagram for kddml.Operators.Preprocessing.DiscretizationAlgorithms.NATURAL_BINNING_DISCRETIZATION_RESOLVER:
Public Member Functions | |
void | readParameters (Hashtable< String, KDDMLScalarManager > parameters) throws ResolverException, KDDMLCoreException |
void | readDiscretizationAttributeStatistics (NumericalStatisticManager stat) throws ResolverException, KDDMLCoreException |
Object[] | discretize (Double[] values) throws ResolverException |
boolean | isNumericLabeling () |
String | getHistoryDescription () |
w = (vmax - vmin) / k
and the cut points are at vmin+w, vmin+2w, ..., vmin+(k-1)w. The number of output intervals k and the interval width w are mutually exclusive parameters: specify one or the other, not both.
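As a rough illustration of the formula above, the following sketch computes the interior cut points for a given number of intervals k. The class and method names are hypothetical and are not part of KDDML or WEKA; it also assumes that vmin and vmax are already known (for example, from the attribute statistics).

```java
import java.util.Arrays;

/** Hypothetical helper, not part of KDDML: equal-width (natural binning) cut points. */
final class EqualWidthCutPoints {

    /**
     * Returns the k-1 interior cut points vmin + w, ..., vmin + (k-1)w,
     * where w = (vmax - vmin) / k.
     */
    static double[] fromIntervalCount(double vmin, double vmax, int k) {
        if (k < 1 || !(vmax > vmin)) {
            throw new IllegalArgumentException("need k >= 1 and vmax > vmin");
        }
        double w = (vmax - vmin) / k;
        double[] cuts = new double[k - 1];
        for (int i = 1; i < k; i++) {
            cuts[i - 1] = vmin + i * w;
        }
        return cuts;
    }

    public static void main(String[] args) {
        // Range [0, 100] split into 4 intervals of width 25.
        System.out.println(Arrays.toString(fromIntervalCount(0.0, 100.0, 4)));
        // prints [25.0, 50.0, 75.0]
    }
}
```

With k intervals there are k-1 interior cut points; vmin and vmax themselves act as the outer bounds.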
The algorithm takes as input a preprocessing table containing at least a numeric field, representing the discretization attribute.
Once the intervals have been computed, the algorithm replaces each training instance value of the discretization attribute A with an interval label. Both numeric and nominal labeling are allowed.
A numeric interval label is the mean, the median, the minimum, or the maximum calculated on the values belonging to the interval.
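One way to picture numeric labeling is the sketch below, which groups values by interval and summarizes each group with one of the four statistics. All names here are hypothetical rather than KDDML API, and returning null for an empty interval is an assumption; the source does not say how empty intervals are handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of KDDML: one numeric label per interval. */
final class NumericBinLabels {

    enum Statistic { MEAN, MEDIAN, MIN, MAX }

    /** cuts holds the interior cut points in ascending order; intervals are [low, high). */
    static Double[] labelsPerBin(double[] values, double[] cuts, Statistic stat) {
        List<List<Double>> bins = new ArrayList<>();
        for (int i = 0; i <= cuts.length; i++) {
            bins.add(new ArrayList<>());
        }
        for (double v : values) {
            int bin = 0;
            while (bin < cuts.length && v >= cuts[bin]) {
                bin++;
            }
            bins.get(bin).add(v);
        }
        Double[] labels = new Double[bins.size()];
        for (int i = 0; i < labels.length; i++) {
            labels[i] = summarize(bins.get(i), stat);
        }
        return labels;
    }

    static Double summarize(List<Double> bin, Statistic stat) {
        if (bin.isEmpty()) {
            return null; // assumption: an empty interval has no label
        }
        List<Double> sorted = new ArrayList<>(bin);
        Collections.sort(sorted);
        int n = sorted.size();
        if (stat == Statistic.MIN) {
            return sorted.get(0);
        }
        if (stat == Statistic.MAX) {
            return sorted.get(n - 1);
        }
        if (stat == Statistic.MEDIAN) {
            double mid = sorted.get(n / 2);
            return (n % 2 == 1) ? mid : (sorted.get(n / 2 - 1) + mid) / 2.0;
        }
        double sum = 0.0;
        for (double v : sorted) {
            sum += v;
        }
        return sum / n; // MEAN
    }
}
```

For instance, with cut points 35 and 65 and the values 10, 20, 40, 50, 60, 70, 90, the mean labels would be 15.0, 50.0 and 80.0.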
A nominal interval label is taken from a list of strings containing the labels used to replace each training instance value belonging to an interval. The system guarantees that the number of nominal labels is equal to the number of output intervals k. Labels are mapped to the intervals computed by the algorithm in order, starting from the interval containing the lowest values. For example, suppose that the algorithm computes the intervals I1 = [6, 35), I2 = [35, 65) and I3 = [65, 95), and that the nominal labels provided are "young", "adult" and "elder", in that order. For each training input instance, a value v of the discretization attribute is replaced with "young", "adult" or "elder" if v belongs to I1, I2 or I3 respectively. With nominal interval labeling, the type of the discretization attribute becomes enumerated.
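The age example above can be sketched as follows. The class and method names are hypothetical, and the handling of values outside [6, 95) (they fall into the first or last interval) is an assumption, not documented KDDML behavior.

```java
/** Hypothetical helper, not part of KDDML: maps a value to its nominal label. */
final class NominalLabeling {

    /**
     * cuts holds the interior cut points in ascending order; labels are ordered
     * from the lowest interval to the highest. Intervals are half-open: [low, high).
     */
    static String labelFor(double v, double[] cuts, String[] labels) {
        if (labels.length != cuts.length + 1) {
            throw new IllegalArgumentException("need exactly one label per interval");
        }
        int bin = 0;
        while (bin < cuts.length && v >= cuts[bin]) {
            bin++;
        }
        return labels[bin];
    }

    public static void main(String[] args) {
        double[] cuts = {35.0, 65.0};                 // I1 = [6, 35), I2 = [35, 65), I3 = [65, 95)
        String[] labels = {"young", "adult", "elder"};
        System.out.println(labelFor(20.0, cuts, labels)); // prints young
        System.out.println(labelFor(35.0, cuts, labels)); // prints adult
        System.out.println(labelFor(70.0, cuts, labels)); // prints elder
    }
}
```

Because the intervals are closed on the left, the boundary value 35 is labeled "adult" rather than "young".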
At present, the algorithm is implemented using (in part) the WEKA system library.
Title: KDDML
Description: Knowledge Discovery in Database Environment
Copyright: Copyright (c) 2003-2005
Company: Universita' di Pisa - Dipartimento di Informatica
Sandra Zimei
void readParameters (Hashtable< String, KDDMLScalarManager > parameters) throws ResolverException, KDDMLCoreException
Reads the XML parameters related to a generic algorithm stored in the ALGORITHM entity. An algorithm settings object captures the parameters associated with a particular algorithm and allows a knowledgeable user to fine-tune them. Generally, not all parameters must be specified; those that are specified are taken into account by KDDML.
Implements kddml.Operators.AlgorithmResolverTask.
void readDiscretizationAttributeStatistics (NumericalStatisticManager stat) throws ResolverException, KDDMLCoreException
Reads the data statistics related to the input discretization attribute. Data statistics can provide additional information to the preprocessing algorithm, such as the minimum and maximum value of the attribute.
Implements kddml.Operators.Preprocessing.DiscretizationAlgorithms.DiscretizationAlgorithmResolverTask.
Object[] discretize (Double[] values) throws ResolverException
Main method that discretizes the input values related to an attribute. Input values are given as an array of Double objects, where missing values are represented as null. The operator returns the discretized values as an array, where missing values are again represented as null. The order in which values appear in the arrays corresponds to the order in which they appear in the preprocessing table, so the size of both the input and the output array is equal to the total number of instances. Depending on the labeling technique, the result of a discretization process can be either numeric (e.g. the mean of the bin) or nominal (e.g. a label used to replace each instance value belonging to the bin). In the first case, the method returns an array of Double objects. Otherwise, it returns an array of String objects.
Implements kddml.Operators.Preprocessing.DiscretizationAlgorithms.DiscretizationAlgorithmResolverTask.
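The array contract described for discretize can be sketched as below. This is not the KDDML implementation: the real method takes only the values array, whereas this hypothetical version receives the cut points and the per-interval labels as extra parameters so that the example stays self-contained.

```java
/** Hypothetical sketch of the discretize contract, not the KDDML implementation. */
final class DiscretizeContractSketch {

    /**
     * Returns an array of the same length and order as values. Null (missing)
     * inputs stay null. The runtime type of the result is Double[] for numeric
     * labeling and String[] for nominal labeling; binLabels must hold matching
     * element types, otherwise an ArrayStoreException is thrown.
     */
    static Object[] discretize(Double[] values, double[] cuts, Object[] binLabels, boolean numeric) {
        Object[] out;
        if (numeric) {
            out = new Double[values.length];
        } else {
            out = new String[values.length];
        }
        for (int i = 0; i < values.length; i++) {
            if (values[i] == null) {
                continue; // missing value: leave the output slot as null
            }
            int bin = 0;
            while (bin < cuts.length && values[i] >= cuts[bin]) {
                bin++;
            }
            out[i] = binLabels[bin];
        }
        return out;
    }

    public static void main(String[] args) {
        Double[] ages = {20.0, null, 70.0};
        double[] cuts = {35.0, 65.0};
        Object[] nominal = discretize(ages, cuts, new Object[] {"young", "adult", "elder"}, false);
        System.out.println(java.util.Arrays.toString(nominal)); // prints [young, null, elder]
    }
}
```

A caller can use isNumericLabeling() to decide whether the returned Object[] should be treated as Double[] or String[].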
boolean isNumericLabeling ()
Specifies the type of labeling to be used. Returns true if the result of the discretization process is numeric (e.g. the mean of the bin). Returns false if the result of the discretization process is nominal (e.g. a label used to replace each instance value belonging to the bin).
Implements kddml.Operators.Preprocessing.DiscretizationAlgorithms.DiscretizationAlgorithmResolverTask.
String getHistoryDescription ()
Returns a description of the actions performed by this preprocessing algorithm. This description will be reported in the history related to the preprocessing data source.
Implements kddml.Operators.Preprocessing.PPAlgorithmResolverTask.