Objective: Non-small cell lung carcinoma (NSCLC) is one of the leading causes of death worldwide. Lymph node metastasis is an important factor not only in estimating the extent and metastatic potential of an NSCLC but also in predicting patient outcome. Preoperative prediction of lymph node metastasis could greatly facilitate the choice of appropriate surgical and medical options for patients with NSCLC.
Methods and results: Using a cDNA array, we analyzed the expression profiles of 1,289 genes in 92 NSCLC tissue samples (37 squamous cell carcinomas and 55 adenocarcinomas). For each of several pathological factors, such as lymph node metastasis and pT-stage, we divided the patients into two groups (classes). For each pair of classes, we searched for an optimal combination of genes to classify the cases using a sequential forward selection algorithm, starting from a set of genes that showed a significant difference in expression between the classes. At each step, the gene to add was chosen by the leave-one-out cross-validation error of a k-nearest neighbor classifier. Using the optimized gene sets, it was possible to stratify the patients by lymph node metastasis (pN-stage) and pT-stage with classification accuracies of 100% (23 genes) and 100% (55 genes), respectively, for squamous cell carcinomas, and 94% (43 genes) and 92% (35 genes), respectively, for adenocarcinomas.
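The selection procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`loo_knn_error`, `forward_select`), the choice of k = 3, the squared Euclidean distance, the stopping rule (stop when no remaining gene lowers the error), and the toy data are all assumptions made for illustration. The `candidates` argument stands in for the pre-filtered set of genes that differ significantly between the two classes.

```python
from collections import Counter


def loo_knn_error(X, y, features, k=3):
    """Leave-one-out error rate of a k-NN classifier using only `features`.

    X is a list of samples (each a list of expression values), y the class
    labels. k=3 and squared Euclidean distance are illustrative assumptions.
    """
    n = len(X)
    errors = 0
    for i in range(n):
        # Distance from the held-out sample i to every other sample.
        dists = []
        for j in range(n):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            dists.append((d, y[j]))
        dists.sort(key=lambda pair: pair[0])
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in dists[:k])
        predicted = votes.most_common(1)[0][0]
        errors += predicted != y[i]
    return errors / n


def forward_select(X, y, candidates, k=3):
    """Greedy sequential forward selection over candidate gene indices.

    At each step, add the gene that gives the lowest leave-one-out error;
    stop when no gene improves on the current error (assumed stopping rule).
    """
    selected = []
    remaining = list(candidates)
    best_error = float("inf")
    while remaining:
        scored = [(loo_knn_error(X, y, selected + [g], k), g) for g in remaining]
        err, gene = min(scored)
        if err >= best_error:
            break
        selected.append(gene)
        remaining.remove(gene)
        best_error = err
    return selected, best_error


# Toy data (hypothetical): 8 samples, 4 genes; only gene 2 separates the classes.
X = [
    [0.5, 0.10, 1.0, 0.3],
    [0.4, 0.90, 1.2, 0.7],
    [0.6, 0.20, 0.9, 0.1],
    [0.5, 0.80, 1.1, 0.9],
    [0.5, 0.15, 3.0, 0.2],
    [0.4, 0.85, 3.2, 0.8],
    [0.6, 0.25, 2.9, 0.4],
    [0.5, 0.75, 3.1, 0.6],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, error = forward_select(X, y, candidates=[0, 1, 2, 3])
print(selected, error)
```

On this toy data the search selects gene index 2 alone, since that gene already gives zero leave-one-out error and no further gene can improve on it. On real expression data, the leave-one-out error obtained after selection is optimistic unless the selection itself is repeated inside each cross-validation fold.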
Conclusion: We conclude that expression profiling combined with feature selection provides a powerful means of stratifying NSCLC patients (personalization) and of guiding the choice of treatment options, particularly for factors such as lymph node metastasis, whose radiological diagnosis is presently incomplete.