Introduction to Tree- and Forest-based Genomic Analysis
Multiple genes, gene-by-gene interactions, and gene-by-environment
interactions are believed to underlie most complex diseases. Despite recent
successes in identifying genetic variants for complex diseases, gene-gene
and gene-environment interactions remain difficult to identify.
To overcome this difficulty, we propose a forest-based approach and a
concept of variable importance. The validity of the proposed approach is
demonstrated by a simulation study, and its use is illustrated by a real
data analysis. Analyses of both the real data and data simulated from
published genetic models show the effectiveness of our approach.
In this brief presentation, I describe the recursive partitioning technique using a microarray data set as an example. Forest construction using bagging and using a deterministic scheme is also described.
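
As a rough illustration of the two ideas named above, the following Python sketch grows a classification tree by recursive partitioning and then builds a forest by bagging (one tree per bootstrap sample, combined by majority vote). It is a minimal sketch under stated assumptions, not the method used in the presentation: the Gini impurity criterion, the depth limit, the toy genotype data (three hypothetical SNPs coded 0/1/2, with disease status driven by an interaction between the first two), and all function names are illustrative choices. The deterministic forest scheme is not shown.

```python
import random
from collections import Counter
from itertools import product

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        # Excluding the largest value keeps both child nodes non-empty.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:  # require a real improvement
                best, best_score = (j, t), score
    return best

def grow_tree(X, y, depth=0, max_depth=3):
    """Recursive partitioning: split, then recurse on each child node."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            grow_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict_tree(tree, row):
    """Follow splits until a leaf (a bare class label) is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def bagged_forest(X, y, n_trees=25, seed=0):
    """Bagging: grow one tree per bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees in the forest."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Toy data (hypothetical): all genotype combinations of three SNPs.
# Status is 1 only when SNP 0 and SNP 1 both carry a variant allele.
X = [list(g) for g in product([0, 1, 2], repeat=3)]
y = [1 if g[0] >= 1 and g[1] >= 1 else 0 for g in X]

tree = grow_tree(X, y)
forest = bagged_forest(X, y)
```

On this toy data the single tree splits first on SNP 0 and then on SNP 1, recovering the interaction; SNP 2, which carries no signal, is never used. A purely epistatic effect with no marginal signal would defeat this greedy criterion, which is one motivation for forests and variable importance measures.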
