
kddml.Core.DataMining.Clustering.ClusteringModel Class Reference

Inheritance diagram for kddml.Core.DataMining.Clustering.ClusteringModel (image not reproduced; classes shown):

kddml.Core.DataMining.MiningModel, kddml.Core.DataMining.Clustering.ClusteringModelManager, kddml.Core.KDDMLObject, kddml.Core.DataMining.MiningModelManager, kddml.Core.HTMLTranslator

Public Member Functions

KDDMLObjectType getType ()
boolean isEmpty ()
void saveToRepository () throws KDDMLCoreException
String toString ()
void saveHTML () throws KDDMLCoreException
void addCluster (ClusterManager cluster) throws ClusteringModelException
ComparisonMeasure getComparisonMeasure ()
ClusterDescriptionManager getClusterDescription ()
ClusterManager getCluster (int identifier) throws ClusteringModelException
java.util.Iterator getClusters ()
int getNumberOfClusters ()
boolean isCentroidBased ()
boolean isDistributionBased ()
int getMaxNumberOfClusters ()
ClusterManager getCluster (Object instance) throws ClusteringModelException
Object toInstances () throws ClusteringModelException

Detailed Description

Clustering is the process of grouping a set of physical objects into classes of similar objects. A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters.
Clustering methods may be classified into three groups: distance-based, distribution-based (or model-based), and density-based methods.
Distance-based clustering relies on a distance or dissimilarity measure to group the most similar objects into one cluster. K-Means is a distance-based partitioning method.
Model-based (or distribution-based) clustering methods assume that the data of each cluster conforms to a specific statistical distribution (e.g. the Gaussian distribution) and that the whole dataset is a mixture of several distribution models. EM is an example of a distribution-based partitioning method that does not require a distance measure to be specified.
Density-based approaches regard a cluster as a dense region of data objects.
This class manages both center-based and distribution-based clustering. A ClusteringModel consists of a set of clusters, and a center vector can be given for each cluster. In center-based models, a cluster is defined by a vector of center coordinates, and a distance measure determines the nearest center, that is, the nearest cluster, for a given input record. In distribution-based models, the clusters are defined by their statistics, and a similarity measure determines the best-matching cluster for a given record; the center vectors then only approximate the clusters.
The model must contain information on the distance or similarity measure used for clustering. It may also contain information on the overall data distribution, such as a covariance matrix or other statistics.
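
The center-based assignment described above can be sketched as follows. This is a minimal illustration and not KDDML code: the class NearestCenterSketch and its methods are hypothetical, and squared Euclidean distance is assumed as the distance measure, whereas a real ClusteringModel reports its own measure through getComparisonMeasure().

```java
class NearestCenterSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredEuclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the center closest to the record (lowest index wins ties). */
    static int nearestCenter(double[][] centers, double[] record) {
        if (centers.length == 0) {
            throw new IllegalArgumentException("no centers");
        }
        int best = 0;
        double bestDist = squaredEuclidean(centers[0], record);
        for (int k = 1; k < centers.length; k++) {
            double dist = squaredEuclidean(centers[k], record);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        System.out.println(nearestCenter(centers, new double[] {1.0, 2.0})); // prints 0
        System.out.println(nearestCenter(centers, new double[] {8.0, 9.0})); // prints 1
    }
}
```

Comparing squared distances avoids a square root and does not change which center is nearest.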

Title: KDDML

Description: Knowledge Discovery in Database Environment

Copyright: Copyright (c) 2003-2005

Company: Universita' di Pisa - Dipartimento di Informatica

Author:
Andrea Romei (romei@di.unipi.it)
Version:
2.0.16


Member Function Documentation

KDDMLObjectType kddml.Core.DataMining.Clustering.ClusteringModel.getType () [virtual]
 

It returns the type of the object.

Returns:
KDDMLObjectType

Implements kddml.Core.KDDMLObject.

boolean kddml.Core.DataMining.Clustering.ClusteringModel.isEmpty () [virtual]


Returns true if the model does not contain any clusters.

Returns:
boolean

Implements kddml.Core.KDDMLObject.

void kddml.Core.DataMining.Clustering.ClusteringModel.saveToRepository () throws KDDMLCoreException [virtual]


Saves the object to the system repository. The destination path is given by the object_path variable.

Exceptions:
KDDMLCoreException 

Implements kddml.Core.KDDMLObject.

String kddml.Core.DataMining.Clustering.ClusteringModel.toString ()


Returns a string representation of this object.

Returns:
String

Reimplemented from kddml.Core.DataMining.MiningModel.

void kddml.Core.DataMining.Clustering.ClusteringModel.saveHTML () throws KDDMLCoreException [virtual]


Saves the object to the system repository as an HTML document.

Exceptions:
KDDMLCoreException 

Implements kddml.Core.HTMLTranslator.

void kddml.Core.DataMining.Clustering.ClusteringModel.addCluster (ClusterManager cluster) throws ClusteringModelException


Adds a new cluster to the model, for both centroid-based and distribution-based clustering. Throws an exception if the number of clusters exceeds the maximum number of clusters allowed.

Parameters:
cluster ClusterManager
Exceptions:
ClusteringModelException 

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.
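
The size limit on addCluster, together with lookup by identifier, can be illustrated with a small stand-in container. This is only a sketch under stated assumptions: CappedClusterStoreSketch and ModelException are hypothetical names (the latter standing in for ClusteringModelException), clusters are reduced to plain strings, and the internal storage of the real ClusteringModel is not documented here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CappedClusterStoreSketch {

    /** Hypothetical stand-in for ClusteringModelException. */
    static class ModelException extends Exception {
        ModelException(String message) {
            super(message);
        }
    }

    private final int maxClusters;
    // Clusters are reduced to string labels keyed by identifier (an assumption).
    private final Map<Integer, String> clusters = new LinkedHashMap<>();

    CappedClusterStoreSketch(int maxClusters) {
        this.maxClusters = maxClusters;
    }

    /** Adds a cluster, refusing once the configured maximum has been reached. */
    void addCluster(int identifier, String cluster) throws ModelException {
        if (clusters.size() >= maxClusters) {
            throw new ModelException("maximum of " + maxClusters + " clusters reached");
        }
        clusters.put(identifier, cluster);
    }

    /** Returns the cluster with the given identifier, or throws if it is unknown. */
    String getCluster(int identifier) throws ModelException {
        String cluster = clusters.get(identifier);
        if (cluster == null) {
            throw new ModelException("no cluster with identifier " + identifier);
        }
        return cluster;
    }

    int getNumberOfClusters() {
        return clusters.size();
    }

    boolean isEmpty() {
        return clusters.isEmpty();
    }

    public static void main(String[] args) throws ModelException {
        CappedClusterStoreSketch store = new CappedClusterStoreSketch(2);
        store.addCluster(1, "first");
        store.addCluster(2, "second");
        try {
            store.addCluster(3, "third");
        } catch (ModelException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```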

ComparisonMeasure kddml.Core.DataMining.Clustering.ClusteringModel.getComparisonMeasure ()


Returns the aggregate function used to compare two objects. This depends on the type of clustering. Never returns null.

Returns:
ComparisonMeasure

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

ClusterDescriptionManager kddml.Core.DataMining.Clustering.ClusteringModel.getClusterDescription ()


Returns the cluster description for this model. Never returns null.

Returns:
ClusterDescriptionManager

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

ClusterManager kddml.Core.DataMining.Clustering.ClusteringModel.getCluster (int identifier) throws ClusteringModelException


Returns the Cluster object in the model with the specified identifier. Throws an exception if the specified identifier does not exist in the clustering model.

Parameters:
identifier int, a positive value representing the cluster index.
Exceptions:
ClusteringModelException 
Returns:
ClusterManager

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

java.util.Iterator kddml.Core.DataMining.Clustering.ClusteringModel.getClusters ()


Returns an iterator over the cluster objects.

Returns:
Iterator over the set of clusters, each a ClusterManager.

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.
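
Because getClusters() is declared with the raw java.util.Iterator type, a caller has to cast each element. The sketch below shows that pattern in isolation. It is not KDDML code: the nested Cluster class is a hypothetical stand-in for ClusterManager that carries only an identifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ClusterIterationSketch {

    /** Hypothetical stand-in for ClusterManager; it only carries an identifier. */
    static class Cluster {
        final int identifier;

        Cluster(int identifier) {
            this.identifier = identifier;
        }
    }

    /** Walks a raw Iterator, casting each element, and collects the identifiers. */
    @SuppressWarnings("rawtypes")
    static List<Integer> identifiers(Iterator clusters) {
        List<Integer> ids = new ArrayList<>();
        while (clusters.hasNext()) {
            // The cast fails with ClassCastException if an element has another type.
            Cluster cluster = (Cluster) clusters.next();
            ids.add(cluster.identifier);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Cluster> model = Arrays.asList(new Cluster(3), new Cluster(7));
        System.out.println(identifiers(model.iterator())); // prints [3, 7]
    }
}
```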

int kddml.Core.DataMining.Clustering.ClusteringModel.getNumberOfClusters ()
 

Returns the number of clusters in the model.

Returns:
int

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

boolean kddml.Core.DataMining.Clustering.ClusteringModel.isCentroidBased ()
 

Returns true if the clustering is center-based.

Returns:
boolean

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

boolean kddml.Core.DataMining.Clustering.ClusteringModel.isDistributionBased ()
 

Returns true if the clustering is distribution-based.

Returns:
boolean

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

int kddml.Core.DataMining.Clustering.ClusteringModel.getMaxNumberOfClusters ()
 

Returns the maximum number of clusters that the model can contain.

Returns:
int

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.

ClusterManager kddml.Core.DataMining.Clustering.ClusteringModel.getCluster (Object instance) throws ClusteringModelException
 

Returns the cluster containing the input instance. This depends on the type of clustering and the comparison measure used.

Parameters:
instance Object
Exceptions:
ClusteringModelException 
Returns:
ClusterManager

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.
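
For the distribution-based case, one possible way to pick the best-matching cluster for an instance is sketched below. It rests on assumptions the documentation does not confirm: each cluster is modelled as independent Gaussians per attribute (mean and variance), and the similarity measure is the log-density. BestMatchingClusterSketch and its methods are hypothetical and do not use any KDDML class.

```java
class BestMatchingClusterSketch {

    /** Log-density of x under independent Gaussians with the given means and variances. */
    static double logDensity(double[] x, double[] mean, double[] variance) {
        double logP = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - mean[i];
            logP += -0.5 * (Math.log(2.0 * Math.PI * variance[i]) + d * d / variance[i]);
        }
        return logP;
    }

    /** Returns the index of the cluster with the highest log-density for x. */
    static int bestCluster(double[] x, double[][] means, double[][] variances) {
        if (means.length == 0 || means.length != variances.length) {
            throw new IllegalArgumentException("need matching, non-empty statistics");
        }
        int best = 0;
        double bestLogP = logDensity(x, means[0], variances[0]);
        for (int k = 1; k < means.length; k++) {
            double logP = logDensity(x, means[k], variances[k]);
            if (logP > bestLogP) {
                bestLogP = logP;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Cluster 0 is tight around 0; cluster 1 is broad around 3.
        double[][] means = { {0.0}, {3.0} };
        double[][] variances = { {0.01}, {100.0} };
        // The point 1.0 is nearer to mean 0, yet the broad cluster explains it better.
        System.out.println(bestCluster(new double[] {1.0}, means, variances)); // prints 1
    }
}
```

The usage example shows why the center vectors only approximate distribution-based clusters: the nearest mean is not necessarily the best match once the spread of each cluster is taken into account.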

Object kddml.Core.DataMining.Clustering.ClusteringModel.toInstances () throws ClusteringModelException


Returns a representation of each cluster as an instance. The representation depends on the type of clustering. For center-based clustering, each cluster is represented by its centroid, and the centroid point is returned as the instance. For distribution-based clustering, each cluster is represented by its statistics, and the instance values depend on the attribute type: for a numeric attribute, the mean contained in the statistics is reported; for a discrete attribute, the most probable category value is reported.

Exceptions:
ClusteringModelException if an error occurs.
Returns:
Object the set of instances, as a weka.core.Instances. Each record represents a single cluster.

Implements kddml.Core.DataMining.Clustering.ClusteringModelManager.
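
The rule for distribution-based clusters (mean for numeric attributes, most probable category for discrete ones) can be sketched in plain Java. This is an illustration only: ClusterRepresentativeSketch is hypothetical, the per-attribute statistics are passed in as simple arrays and maps, and the result is an Object[] rather than a weka.core.Instances, so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ClusterRepresentativeSketch {

    /** Returns the category with the highest probability (first one seen wins ties). */
    static String mostProbableCategory(Map<String, Double> probabilities) {
        String best = null;
        double bestP = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
            if (entry.getValue() > bestP) {
                bestP = entry.getValue();
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no categories");
        }
        return best;
    }

    /** Builds one record: numeric means first, then one category per discrete attribute. */
    static Object[] representative(double[] numericMeans,
                                   List<Map<String, Double>> discreteDistributions) {
        Object[] row = new Object[numericMeans.length + discreteDistributions.size()];
        for (int i = 0; i < numericMeans.length; i++) {
            row[i] = numericMeans[i];
        }
        for (int j = 0; j < discreteDistributions.size(); j++) {
            row[numericMeans.length + j] = mostProbableCategory(discreteDistributions.get(j));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Double> colour = new LinkedHashMap<>();
        colour.put("red", 0.2);
        colour.put("green", 0.7);
        colour.put("blue", 0.1);
        List<Map<String, Double>> discrete = new ArrayList<>();
        discrete.add(colour);
        Object[] row = representative(new double[] {1.5, 42.0}, discrete);
        System.out.println(Arrays.toString(row)); // prints [1.5, 42.0, green]
    }
}
```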


Generated on Thu Feb 23 13:04:40 2006 for kddml by  doxygen 1.4.3