yadt::dtree Class Reference

A decision tree.

#include <YaDT.h>

Public Types

enum  PruningStrategy { PRUNING_NO, PRUNING_C45, PRUNING_DT }
 Pruning strategy options.
enum  SplitType { ST_GAIN, ST_GAIN_RATIO }
 Split type of a splitting decision node.

Public Member Functions

void build (table *maintable, table::subset *subtable, bool evaluate=true) throw (runtime_error)
 Build a pruned tree.
dtree * clone () const
 Return a clone of the called object.
unsigned depth () const
 Return tree depth.
 dtree (const dtree &)
 Copy constructor not defined.
 dtree (const string &name="my_decision_tree")
 Constructor.
double evaluate (const datasource &ds, ostream &output, char sep= '\t') const throw (runtime_error)
 Predict classes of unseen cases.
double get_elapsed () const
 Return elapsed time (in secs) taken to build tree.
conf_matrix * get_prediction ()
 Return confusion matrix over the training set.
const dtree & operator= (const dtree &)
 Assignment operator not defined.
conf_matrix * predict (table *cases, table::subset *subtable) const throw (runtime_error)
 Test classes of unseen cases.
pair< string, float > predict (vector< string > &attributes, float weight=1) const
 Predict class and confidence of an unseen case.
conf_matrix * predict (const datasource &ds) const throw (runtime_error)
 Test classes of unseen cases.
bool set_conf_level (float conf_level)
 Set confidence level in simplifying a decision tree.
bool set_min_obj (float min_objects)
 Set the minimum weight of cases that child nodes must have for a node to be split further during tree building.
bool set_pruning_strategy (PruningStrategy strategy)
 Set simplification strategy.
bool set_split_type (SplitType st)
 Set split strategy.
unsigned size () const
 Return number of tree nodes.
void toBinary (const string &filename)
 Binary output.
void toDOT (ostream &os=cout) const
 Dot output.
void toTEXT (ostream &os=cout) const
 Textual output.
void toXML (ostream &os=cout, const conf_matrix *cmTest=NULL) const
 XML output.
unsigned training_n_rows () const
 Return number of cases used in building the tree.
 ~dtree ()
 Destructor.

Static Public Member Functions

static dtree * fromBinary (const string &filename)
 Binary input.


Detailed Description

A decision tree.

The class provides methods for building, simplifying, and evaluating a decision tree.


Member Enumeration Documentation

enum yadt::dtree::PruningStrategy

Pruning strategy options.

Enumerator:
PRUNING_NO  No pruning.

PRUNING_C45  C4.5 pruning strategy.

PRUNING_DT  YaDT pruning strategy.

enum yadt::dtree::SplitType

Split type of a splitting decision node.

Enumerator:
ST_GAIN  Information gain split.

ST_GAIN_RATIO  Information gain ratio split.
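
The two split types can be illustrated with the textbook definitions of information gain and gain ratio. The following is a minimal sketch, not YaDT's implementation: the function names `entropy`, `info_gain`, and `gain_ratio` are hypothetical, and a split is assumed to be given as the per-class case weights that fall into each child node.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Entropy (in bits) of a distribution given as non-negative weights.
double entropy(const std::vector<double>& weights) {
    double total = 0.0;
    for (double w : weights) total += w;
    if (total <= 0.0) return 0.0;
    double h = 0.0;
    for (double w : weights) {
        if (w > 0.0) {
            const double p = w / total;
            h -= p * std::log2(p);
        }
    }
    return h;
}

// Information gain: entropy of the parent node minus the weighted
// average entropy of the children. children[i][k] is the weight of
// cases of class k that fall into child i (hypothetical layout).
double info_gain(const std::vector<std::vector<double>>& children) {
    assert(!children.empty());
    std::vector<double> parent(children.front().size(), 0.0);
    double total = 0.0;
    for (const auto& child : children) {
        assert(child.size() == parent.size());
        for (std::size_t k = 0; k < child.size(); ++k) {
            parent[k] += child[k];
            total += child[k];
        }
    }
    if (total <= 0.0) return 0.0;
    double remainder = 0.0;
    for (const auto& child : children) {
        double size = 0.0;
        for (double w : child) size += w;
        remainder += (size / total) * entropy(child);
    }
    return entropy(parent) - remainder;
}

// Gain ratio: information gain divided by the split information,
// i.e. the entropy of the child sizes. Returns 0 for a trivial split.
double gain_ratio(const std::vector<std::vector<double>>& children) {
    std::vector<double> sizes;
    for (const auto& child : children) {
        double size = 0.0;
        for (double w : child) size += w;
        sizes.push_back(size);
    }
    const double split_info = entropy(sizes);
    return split_info > 0.0 ? info_gain(children) / split_info : 0.0;
}
```

Gain ratio penalizes splits that produce many small children, which plain information gain tends to favor. C4.5-style learners typically apply further heuristics on top of these formulas; whether YaDT does so is not stated here.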


Constructor & Destructor Documentation

yadt::dtree::dtree ( const string &  name = "my_decision_tree"  ) 

Constructor.

Parameters:
name decision tree name.

yadt::dtree::~dtree (  ) 

Destructor.

yadt::dtree::dtree ( const dtree &  ) 

Copy constructor not defined.


Member Function Documentation

void yadt::dtree::build ( table *  maintable,
table::subset *  subtable,
bool  evaluate = true 
) throw (runtime_error)

Build a pruned tree.

The method builds a tree and simplifies it.

Parameters:
maintable a table containing the training set.
subtable the subset of maintable used as training set. A NULL value means the whole table is used as the training set.
evaluate true if a confusion matrix must also be computed. The resulting confusion matrix can be obtained by calling the get_prediction() method.
See also:
set_pruning_strategy(), set_split_type(), set_min_obj(), set_conf_level()

dtree* yadt::dtree::clone (  )  const

Return a clone of the called object.
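
Because the copy constructor and assignment operator are not defined, clone() is the only way to duplicate a tree. The pattern can be sketched as follows. The class `model` is a hypothetical stand-in, not the YaDT class, and the sketch assumes that the caller owns the pointer returned by clone(), which this reference does not state explicitly.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a class that forbids copying and offers
// an explicit clone() instead. Not the real yadt::dtree.
class model {
public:
    explicit model(const std::string& name = "my_decision_tree")
        : name_(name) {}

    // Modern spelling of "declared but not defined" copy operations.
    model(const model&) = delete;
    model& operator=(const model&) = delete;

    // Return a newly allocated copy of this object.
    model* clone() const { return new model(name_); }

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Wrap the raw pointer immediately so the copy is released
// automatically (assumes the caller owns the clone).
std::unique_ptr<model> owned_clone(const model& m) {
    return std::unique_ptr<model>(m.clone());
}
```

Wrapping the result in a smart pointer right away avoids a leak if later code throws; the same idea would apply to any function documented as returning a newly allocated object.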

unsigned yadt::dtree::depth (  )  const

Return tree depth.

double yadt::dtree::evaluate ( const datasource &  ds,
ostream &  output,
char  sep = '\t' 
) const throw (runtime_error)

Predict classes of unseen cases.

The source rst::inpout::stream is required to provide, for each case, all attributes in the same order as the columns of the training set; no class or weights must be provided. Optionally, a further attribute may be provided (typically a key of the case), which is written to the output together with the predicted class and confidence.

Parameters:
ds provider of unseen cases.
output output stream of predictions.
sep column separator in output stream.
Returns:
elapsed time.

static dtree* yadt::dtree::fromBinary ( const string &  filename  )  [static]

Binary input.

Parameters:
filename the input filename.
Returns:
a newly allocated decision tree.

double yadt::dtree::get_elapsed (  )  const

Return elapsed time (in secs) taken to build tree.

conf_matrix* yadt::dtree::get_prediction (  ) 

Return confusion matrix over the training set.

The method returns NULL if no tree was built, or if the tree was built without requesting the computation of a confusion matrix.

See also:
build_unpruned(), build_pruned()

const dtree& yadt::dtree::operator= ( const dtree &  ) 

Assignment operator not defined.

conf_matrix* yadt::dtree::predict ( table *  cases,
table::subset *  subtable 
) const throw (runtime_error)

Test classes of unseen cases.

The cases table is required to provide for each case all attributes in the same order as the columns of training set and the actual class of cases. If a weights column is prese: no weights must be provided.

Parameters:
cases a table containing the unseen cases.
subtable the subset of cases to test. NULL value denotes all the table.
Returns:
a confusion matrix comparing prediction against actual classes.
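
A confusion matrix is usually summarized by its accuracy, the fraction of cases on the diagonal. The sketch below shows the arithmetic only. It does not use the conf_matrix class, whose interface is not described in this reference; the `matrix` alias, the `accuracy` function, and the convention that rows are actual classes and columns are predicted classes are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: m[actual][predicted] holds the (possibly
// weighted) number of cases of class `actual` predicted as `predicted`.
using matrix = std::vector<std::vector<double>>;

// Accuracy = correctly classified weight / total weight.
// Returns 0 for an empty matrix.
double accuracy(const matrix& m) {
    double correct = 0.0;
    double total = 0.0;
    for (std::size_t i = 0; i < m.size(); ++i) {
        assert(m[i].size() == m.size());  // matrix must be square
        for (std::size_t j = 0; j < m[i].size(); ++j) {
            total += m[i][j];
            if (i == j) correct += m[i][j];
        }
    }
    return total > 0.0 ? correct / total : 0.0;
}
```

For a two-class matrix with rows {40, 10} and {5, 45}, 85 of 100 cases lie on the diagonal, so the accuracy is 0.85 and the misclassification rate is 0.15.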

pair<string, float> yadt::dtree::predict ( vector< string > &  attributes,
float  weight = 1 
) const

Predict class and confidence of an unseen case.

The attributes of the case are provided as a vector of strings in the same order as the columns of the training set; no class must be provided. Optionally, a case weight may be provided (which affects the confidence of the prediction).

Parameters:
attributes vector of strings representing the case attributes.
weight case weight.
Returns:
a pair with predicted class and confidence.

conf_matrix* yadt::dtree::predict ( const datasource &  ds  )  const throw (runtime_error)

Test classes of unseen cases.

The source rst::inpout::stream is required to provide for each case all attributes in the same order as the columns of training set and then the actual class: no weights must be provided.

Parameters:
ds provider of unseen cases.
Returns:
a confusion matrix comparing prediction against actual classes.

bool yadt::dtree::set_conf_level ( float  conf_level  ) 

Set confidence level in simplifying a decision tree.

The new confidence level must be in the range [0,1].

bool yadt::dtree::set_min_obj ( float  min_objects  ) 

Set the minimum weight of cases that child nodes must have for a node to be split further during tree building.

Default value is 2.0. Any value must be > 0.

Parameters:
min_objects new minimum weight.
Returns:
true if setting the new weight succeeded.
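
The documented ranges for set_conf_level() and set_min_obj() suggest setters that validate their argument and report the outcome through the bool return value. The following is a minimal sketch of that pattern with a hypothetical `tree_params` class. Two points are assumptions rather than documented behavior: that a rejected value leaves the previous setting unchanged, and that the initial confidence level is 0.25 (the conventional C4.5 default).

```cpp
#include <cassert>

// Hypothetical parameter holder mirroring the documented setters.
class tree_params {
public:
    // Accept only values in [0,1]; the negated test also rejects NaN.
    bool set_conf_level(float conf_level) {
        if (!(conf_level >= 0.0f && conf_level <= 1.0f)) return false;
        conf_level_ = conf_level;
        return true;
    }

    // Accept only strictly positive weights.
    bool set_min_obj(float min_objects) {
        if (!(min_objects > 0.0f)) return false;
        min_objects_ = min_objects;
        return true;
    }

    float conf_level() const { return conf_level_; }
    float min_obj() const { return min_objects_; }

private:
    float conf_level_ = 0.25f;  // assumption: C4.5's usual default
    float min_objects_ = 2.0f;  // default stated in the documentation
};
```

Callers would check the return value before building, for example by refusing to continue when a user-supplied parameter is rejected.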

bool yadt::dtree::set_pruning_strategy ( PruningStrategy  strategy  ) 

Set simplification strategy.

Parameters:
strategy new pruning strategy.
Returns:
true if setting the new strategy succeeded.

bool yadt::dtree::set_split_type ( SplitType  st  ) 

Set split strategy.

Parameters:
st new split strategy.
Returns:
true if setting the new strategy succeeded.

unsigned yadt::dtree::size (  )  const

Return number of tree nodes.

void yadt::dtree::toBinary ( const string &  filename  ) 

Binary output.

Parameters:
filename the output filename.

void yadt::dtree::toDOT ( ostream &  os = cout  )  const

Dot output.

void yadt::dtree::toTEXT ( ostream &  os = cout  )  const

Textual output.

void yadt::dtree::toXML ( ostream &  os = cout,
const conf_matrix *  cmTest = NULL 
) const

XML output.

PMML 2.0 compliant.

unsigned yadt::dtree::training_n_rows (  )  const

Return number of cases used in building the tree.


Generated on Wed Feb 21 12:23:50 2007 for YaDT by  doxygen 1.5.1-p1