Readme

  This is an updated version of the CTMBR software submitted to JASA.
This version uses only the h(t) defined in (4) of the paper.
Please visit http://peace.med.yale.edu for future updates.

@Copyright Heping Zhang 1/24/03

Introduction to the files in this directory:

ctmbr.*: These are the executables compiled on various systems, as
         indicated by the suffix. For example, ctmbr.solaris8 was compiled
         on a SUN SPARC Ultra 5 with Solaris 8.
         Save the appropriate executable as, say, ctmbr, then type ctmbr
         at your command line and follow the prompts.
sample.dat: This is a sample data file. The 3 numbers in the first row are:
         the number of study subjects (i.e., sample size), the number of
         covariates, and the number of binary responses. The second
         row indicates the type of each variable: 0 means the covariate
         is excluded from the analysis; 1 means a continuous or an ordinal
         covariate; 3 means a nominal covariate; -1 means an outcome.
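
         The two header rows described above can be checked with a short
         script before running the program. This is only a sketch: it
         assumes the values are separated by whitespace and that the type
         codes all sit on the second non-empty line. The function name
         read_header and the example input are hypothetical and are not
         part of CTMBR.

```python
# Sketch: read the two header rows of a sample.dat-style file.
# Assumes whitespace-separated values; not part of the CTMBR distribution.

TYPE_NAMES = {
    0: "excluded",
    1: "continuous/ordinal",
    3: "nominal",
    -1: "outcome",
}


def read_header(text):
    """Return the counts from row 1 and the type codes from row 2."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_subjects, n_covariates, n_responses = (int(tok) for tok in lines[0].split())
    types = [int(tok) for tok in lines[1].split()]
    unknown = [t for t in types if t not in TYPE_NAMES]
    if unknown:
        raise ValueError(f"unrecognized type codes: {unknown}")
    return {
        "n_subjects": n_subjects,
        "n_covariates": n_covariates,
        "n_responses": n_responses,
        "types": types,
        "type_names": [TYPE_NAMES[t] for t in types],
    }


if __name__ == "__main__":
    # Made-up header for illustration only.
    example = "5 3 2\n1 0 3 -1 -1\n"
    print(read_header(example))
```

         To check a real file, pass its contents, e.g.
         read_header(open("sample.dat").read()).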

infor.*: The output file. The file consists of three parts:
         Part I can be read as follows:
         Column 1: node number. Node 0 is the root node.
         Column 2: number of subjects in the node.
         Column 3: left daughter node, e.g. node 1 is the left daughter
                   node of node 0.
         Column 4: right daughter node, e.g. node 2 is the right daughter
                   node of node 0.
         Column 5: splitting variable, numbered starting from 1.
         Column 6: splitting value for the splitting variable.
                   For example, the split for node 0 is whether variable
                   14 > 3.0.
                   See the remark below for categorical variables.
         Part II records the series of complexity parameters and the nodes
         where pruning has occurred. 
         Part III saves the cross-validation results for selecting the 
         final trees.
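
         The six columns of Part I can be read into a small record, as in
         the sketch below. It assumes whitespace-separated columns and a
         single numeric splitting value, so it covers continuous or
         ordinal splits only. How terminal nodes and categorical splits
         are written is not described here, so they are not handled. The
         names Node, parse_part1_row and describe_split are hypothetical,
         and the subject count in the example row is made up.

```python
# Sketch: parse one Part I row of an infor file into a record.
# Assumes six whitespace-separated columns with a numeric split value.
from collections import namedtuple

Node = namedtuple("Node", "node size left right split_var split_value")


def parse_part1_row(line):
    """Convert one Part I row (columns 1 to 6) into a Node."""
    f = line.split()
    if len(f) != 6:
        raise ValueError(f"expected 6 columns, got {len(f)}")
    return Node(int(f[0]), int(f[1]), int(f[2]), int(f[3]), int(f[4]), float(f[5]))


def describe_split(node):
    """Write the split in the same form as the example in the text."""
    return f"node {node.node}: variable {node.split_var} > {node.split_value}"


if __name__ == "__main__":
    # Hypothetical row: node 0 with daughters 1 and 2, split on variable 14 at 3.0.
    root = parse_part1_row("0 100 1 2 14 3.0")
    print(describe_split(root))
```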

After you start ctmbr, you will be asked to enter the data file name
(e.g., sample.dat), whether there are any missing data (0 for no and 1 for
yes), and, if so, the value that codes missing data (I used -9). Finally,
you will be asked to enter the number of folds for cross validation. If you
do not want to wait long, enter 2. Be patient: the computation may
take quite a while because the tree has to be grown many times during
cross validation.
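
For repeated runs, the answers to these prompts could be prepared in
advance and passed to the program. The sketch below rests on assumptions that
are not confirmed here: that ctmbr reads its answers from standard input, one
per line, in the order listed above, and that it asks for the missing value
only when the answer to the missing-data question is 1. The function names
build_answers and run_ctmbr are hypothetical.

```python
# Sketch: prepare the prompt answers and (optionally) pipe them to ctmbr.
# Assumes ctmbr reads one answer per line from standard input.
import subprocess


def build_answers(data_file, missing_code=None, folds=2):
    """Return the answers as text, one per line, in the order of the prompts."""
    answers = [data_file, "0" if missing_code is None else "1"]
    if missing_code is not None:
        answers.append(str(missing_code))  # e.g. -9
    answers.append(str(folds))
    return "\n".join(answers) + "\n"


def run_ctmbr(executable, data_file, missing_code=None, folds=2):
    """Run the executable with the prepared answers (untested assumption)."""
    return subprocess.run(
        [executable],
        input=build_answers(data_file, missing_code, folds),
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Show the answers for sample.dat with -9 as the missing code and 2 folds.
    print(build_answers("sample.dat", missing_code=-9, folds=2), end="")
```

If the program does not accept piped input on your system, enter the same
answers by hand at the prompts.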

The program gives slightly different numerical results on different
platforms, so a separate infor file is provided for each platform.

Remark: For a categorical variable, if your levels start with 1, an artificial
level 0 is added for convenience. A missing value is assigned the level 1 plus
the maximum level. The levels printed in the output are those that send
observations to the right daughter node.
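
The convention in this remark can be illustrated with a short sketch. It is
not the program's own code: the function names recode_missing and goes_right
are hypothetical, -9 is used as the missing code as in the example above, and
the printed levels in the example are made up.

```python
# Sketch of the categorical-variable convention described in the remark.
# Illustration only; not taken from the CTMBR source.


def recode_missing(values, missing_code=-9):
    """Replace the missing code by 1 plus the maximum observed level."""
    observed = [v for v in values if v != missing_code]
    if not observed:
        raise ValueError("no observed levels")
    missing_level = max(observed) + 1
    return [missing_level if v == missing_code else v for v in values]


def goes_right(level, printed_levels):
    """An observation goes to the right daughter if its level is printed."""
    return level in set(printed_levels)


if __name__ == "__main__":
    levels = recode_missing([1, 3, -9, 2])
    print(levels)                      # the missing value becomes level 4
    print(goes_right(4, [2, 4]))       # hypothetical printed levels 2 and 4
```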