> dTcmd <input options> <tree options> <output options>
Command line options.
Input data types. Input data providers. Output data types.
Input data options
Input data to dTcmd consists of:
Tree construction options
The following parameters affect the tree construction algorithm:
Output options
The following options affect the outputs of dTcmd:
Text files
Text files encode tables as comma-separated columns. To change the separator to another
character c, the -sep <c> option is provided. For instance,
-sep " " switches to space-separated columns. Also, the special string
"?" can be present in text files to represent unknown values.
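The conventions above can be illustrated with a short Python sketch. It is not part of YaDT: the function name read_table and the sample rows are made up for this illustration, and Python's csv module gives double quotes a special meaning that dTcmd may not share.

```python
import csv
import io

UNKNOWN = "?"  # marker for unknown values, as described above


def read_table(text, sep=","):
    """Parse separator-delimited text into rows, mapping "?" to None."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    return [
        [None if field == UNKNOWN else field for field in row]
        for row in reader
        if row  # skip blank lines
    ]


# Same table, once comma separated and once space separated (as with -sep " ").
comma_rows = read_table("sunny,85,?,false\nrain,70,96,true\n")
space_rows = read_table("sunny 85 ? false\nrain 70 96 true\n", sep=" ")
```

Both calls yield the same rows, with None standing in for the unknown humidity value.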
Gzipped text files
Gzipped text files are files with suffix .gz obtained by compressing text files with
gzip.
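As a rough illustration of how a reader might treat plain and gzipped text files uniformly, the following Python sketch selects an opener from the file suffix. The helper read_lines and the file name sample.data.gz are hypothetical; how dTcmd itself detects and reads compressed files is not documented here.

```python
import gzip
import os
import tempfile


def read_lines(path):
    """Return the text lines of a file, decompressing it if the name ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


# Round trip on a temporary file: compress a small table, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "sample.data.gz")
    with gzip.open(gz_path, "wt", encoding="utf-8") as out:
        out.write("sunny,85,85,false\nrain,70,96,true\n")
    lines = read_lines(gz_path)
```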
Microsoft SQL Server tables
Microsoft SQL Server tables are accessed via ADO. Note that the YaDT classes may access any
ADO data provider, but dTcmd presently only considers SQL Server with trusted connections.
In particular, no user name and password are to be provided. Also, unknown values
are coded by NULL values.
Metadata table
Metadata tables have three columns, which represent, in order:
Training data table
Training data tables have a number of columns according
to the metadata table. The order of columns must be consistent with the
order of metadata table rows. Unknown values are not allowed when the column type
is weights or class.
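The consistency rules above can be sketched as a small Python check. This is an illustration only, not YaDT code: it assumes each metadata row is a (name, data type, column type) triple, and the function name, the metadata, and the sample rows are all hypothetical.

```python
def check_training_rows(metadata, rows):
    """Return a list of error messages; an empty list means the rows look consistent.

    metadata: list of (name, data_type, column_type) tuples, one per column.
    rows: list of lists, with None standing for an unknown value.
    """
    errors = []
    for number, row in enumerate(rows, start=1):
        # Each row must have exactly one value per metadata row, in the same order.
        if len(row) != len(metadata):
            errors.append(
                f"row {number}: expected {len(metadata)} columns, got {len(row)}"
            )
            continue
        # Unknown values are rejected only in weights and class columns.
        for (name, _data_type, column_type), value in zip(metadata, row):
            if value is None and column_type in ("weights", "class"):
                errors.append(
                    f"row {number}: unknown value in {column_type} column '{name}'"
                )
    return errors


# Hypothetical metadata and rows, made up for this illustration.
metadata = [
    ("outlook", "string", "discrete"),
    ("weight", "integer", "weights"),
    ("toPlay", "string", "class"),
]
good = [["sunny", "1", "Play"], [None, "2", "Play"]]
bad = [["sunny", None, "Play"], ["rain", "1"]]
```

Here check_training_rows(metadata, good) returns an empty list, since an unknown value in a discrete column is acceptable, while the rows in bad produce one message each.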
Here is the golf.data training data file:
Binary data table
dTcmd may save and load a binary file containing a binary representation of
a metadata table and a training table (see options -bd <file> and
-db <file>). Binary files are not guaranteed to be readable
by future or past versions of YaDT.
Binary tree
dTcmd may save and load a binary file containing a binary representation of
a decision tree (see options -bt <file> and
-tb <file>). Binary files are not guaranteed to be readable
by future or past versions of YaDT.
XML tree
dTcmd may save to a file or to standard output a PMML-compliant
XML representation of the built tree (see options -x <file> and
-xstd).
Confusion matrix and text trees
dTcmd may save to a file or to standard output a text representation of the built tree
and of the confusion matrix over training and test data (see options -t <file> and
-tstd).
Verbose log
dTcmd may save to a file or to standard output a verbose log of the computation in progress
(see options -l <file> and
-lstd).
Test data table
A test data table has exactly the same format as a training data table.
outlook,string,discrete
temperature,integer,continuous
humidity,integer,continuous
windy,string,discrete
toPlay,string,class
describes training data consisting of the following columns:
sunny,85,85,false,1,Don't Play
sunny,80,90,true,1,Don't Play
overcast,83,78,false,1.5,Play
rain,70,96,false,0.8,Play
rain,68,80,false,2,Play
rain,65,70,true,1,Don't Play
overcast,64,65,true,2.5,Play
sunny,72,95,false,1,Don't Play
sunny,69,70,false,1,Play
rain,75,80,false,1.5,Play
sunny,75,70,true,3,Play
overcast,72,90,true,1.5,Play
overcast,81,75,false,1,Play
rain,71,80,true,1,Don't Play