While lymph node (LN) metastasis is a major factor associated with poor prognosis in cancer, little is known about its molecular mechanisms. The aim of this study was to identify genes differentially expressed between non-cancerous and cancerous lung tissues, and to investigate the gene expression profiles of 41 primary lung adenocarcinomas in order to select sets of gene predictors for lymph node metastasis of lung cancer. Gene expression profiles were obtained using oligonucleotide microarrays, and predictor sets were constructed by evaluating the statistical significance of the expression levels of selected genes. The analysis revealed 15 predictor genes for lymph node metastasis of lung adenocarcinoma. Using the most suitable set of genes, it was possible to predict lymph node metastasis in patients with lung cancer. The prediction scoring system yielded 71.4% accuracy in predicting lymph node metastasis in 14 independent test cases. Survival was also significantly better in the 18 cases that were pathologically LN negative and predicted to be LN negative by molecular classification than in the 23 cases that were pathologically LN positive or predicted to be LN positive by molecular classification. Gene expression analysis combined with statistical analysis thus distinguished cases with lymph node metastasis from those without. These findings show that pathological diagnosis combined with molecular classification clearly distinguished patients with good prognoses from patients with poor prognoses.
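The grouping rule and the accuracy figure described above can be illustrated with a minimal sketch. It is not the study's scoring system: the names `Case`, `prognosis_group`, and `accuracy_percent` are hypothetical, and the abstract does not report the number of correctly predicted test cases. The value 10 of 14 is used only because it is the one whole-number count consistent with 71.4%.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical record: pathological LN status and the molecular prediction.
    pathological_ln_positive: bool
    predicted_ln_positive: bool


def prognosis_group(case: Case) -> str:
    """Place a case in the 'good' group only if it is LN negative
    both pathologically and by molecular classification; any positive
    finding places it in the 'poor' group."""
    if not case.pathological_ln_positive and not case.predicted_ln_positive:
        return "good"
    return "poor"


def accuracy_percent(correct: int, total: int) -> float:
    """Percentage of correctly predicted cases, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Assumed count: 10 correct predictions out of the 14 independent test cases.
test_set_accuracy = accuracy_percent(10, 14)
```

Under this rule a case that is pathologically LN negative but predicted LN positive falls in the poor-prognosis group, which matches the 18 versus 23 split of the 41 cases described in the abstract.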