When Paramecium primaurelia expresses the D serotype, a major high-molecular-weight mRNA species is detected in the cytoplasm. Using the cDNA derived from this mRNA as a probe, three very similar genes, D alpha, D beta and D gamma, were cloned. We show that, of these three genes, only D alpha gives rise to an mRNA present in the cytoplasm of cells expressing the D serotype, and that this mRNA corresponds to the major mRNA species. The nucleotide sequence of the entire coding region of the D alpha gene, as well as the upstream and downstream sequences, has been determined. The 7632-nucleotide open reading frame encodes a putative protein that displays the characteristic cysteine residue periodicity of Paramecium surface antigens but does not contain central tandemly repeated sequences. Partial sequences of the two nonexpressed genes, D beta and D gamma, indicate a high percentage of identity (90%-95%) with the D alpha gene, suggesting that D beta and D gamma are either very similar surface protein genes whose transcription is repressed through mutual exclusion or, perhaps, pseudogenes. A region of variable DNA rearrangement was identified 1 kb upstream of the D gamma gene. The variable macronuclear forms of this region arise from the same micronuclear locus by alternative excision of internal eliminated sequences during macronuclear development.