Gene expression profiling technologies suffer from poor reproducibility across replicate experiments. However, when analyzing large datasets, probe-level expression profile correlation can help identify flawed probes and lead to the construction of truer probe sets with improved reproducibility. We describe methods to eliminate uninformative and flawed probes, account for dependence between probes, and address variability due to transcript-isoform mixtures. We test and validate our approach on Affymetrix microarrays and outline their future adaptation to other technologies.