Genetic Diagnosis of Cancer Using Six Types of Microarray Data, a Representative Form of Landscape Data
Shuichi Niimura, Professor Emeritus, School of Economics, Seikei University
According to Professor Golub of Harvard Medical School and colleagues, genetic diagnosis of cancer from microarray data, a representative form of landscape data, has been studied for more than 30 years without reaching a clear conclusion (Problem 5 of discriminant analysis). Many statisticians point to problems that cannot be handled statistically:
1. The small n, large p problem, that is, typical p >> n, horizontally long data. For example, when discriminating or regressing about 100 cancer and normal subjects on about 10,000 genes, tasks such as variable selection become NP-hard.
2. Oncogenes that are effective for discrimination are concealed by the remaining high-dimensional noise and cannot be separated from it cleanly.
And so on.
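As an illustration of why p >> n data is awkward for ordinary discrimination, the following minimal sketch builds a purely synthetic table of 20 subjects and 200 noise features with labels assigned at random, then trains a plain perceptron. The sizes, the random seed, the function names, and the choice of a perceptron are all assumptions made for this example; none of them come from the work described here. When p is much larger than n, even noise that carries no information about the labels can typically be separated without error, which is one reason genuinely useful genes are hard to tell apart from noise.

```python
import random

def train_perceptron(X, y, max_epochs=1000):
    """Plain perceptron with a bias term; stops after a pass with no updates."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_epochs):
        updates = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # wrong side of (or on) the hyperplane
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                updates += 1
        if updates == 0:
            break
    return w, b

def count_misclassified(X, y, w, b):
    """Number of cases that the linear function w.x + b fails to classify."""
    return sum(
        1
        for xi, yi in zip(X, y)
        if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0
    )

# Synthetic "wide" data (assumed sizes): n = 20 subjects, p = 200 noise features.
rng = random.Random(0)
n, p = 20, 200
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [rng.choice([-1, 1]) for _ in range(n)]  # labels unrelated to the features

w, b = train_perceptron(X, y)
n_errors = count_misclassified(X, y, w, b)
print("misclassified training cases:", n_errors)
```

A zero error count on this data says nothing about the features being meaningful; it only reflects that p far exceeds n.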
After graduating from university in 1971, I developed the logic of an automatic electrocardiogram diagnosis system using Fisher's LDF at the Osaka Food Center for Adult Disease, but it could not match the branching logic used by the physicians. This experience convinced me that statistical discriminant theory is unsuitable for medical diagnosis. I therefore developed an optimal linear discriminant function based on the Minimum Number of Misclassifications (MNM) criterion, solved by mathematical programming, together with the "100-fold cross-validation method for small samples" (Method 1) and the "Matryoshka Feature Selection Method" (Method 2), in order to solve the five problems of discriminant analysis.
Problem 5, which had gone unresolved for 30 years, was resolved by Method 2 in only 54 days. In the microarray data, normal and cancer cases can be completely discriminated by a set of genes far smaller in number than the n (100) cases. The data could be completely separated into a disjoint union of dozens of such gene sets (Small Matryoshkas, SMs) and the remaining high-dimensional noise space. These results were published by Springer.
We then report the world's first successful construction of a cancer malignancy index, obtained by analyzing all of the SMs in JMP with one-way analysis of variance, cluster analysis, PCA, and statistical discriminant functions. These results were compiled and published as an Amazon Kindle edition (From Cancer Gene Analysis to Cancer Gene Diagnosis, 1200 yen).
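For readers unfamiliar with the MNM criterion named above, one common way to write a minimum-misclassification linear discriminant as a mixed-integer program is sketched below. This abstract does not give the exact formulation used, so the symbols are assumptions: labels y_i in {-1, +1}, coefficient vector b with intercept b_0, binary indicators e_i, and a large constant M (a "big-M" device).

```latex
\begin{align}
\mathrm{MNM} \;=\; \min_{\mathbf{b},\, b_0,\, e}\quad & \sum_{i=1}^{n} e_i \\
\text{subject to}\quad & y_i\left(\mathbf{x}_i^{\top}\mathbf{b} + b_0\right) \;\ge\; 1 - M e_i, \qquad i = 1,\dots,n, \\
& e_i \in \{0, 1\}, \qquad i = 1,\dots,n.
\end{align}
```

Setting e_i = 1 releases case i from its constraint, so the objective counts the cases the discriminant is allowed to miss. An optimal value of zero means the two classes are linearly separable on the chosen variables.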