The gene coding for starch phosphorylase (EC 2.4.1.1) was isolated from a potato genomic library constructed in lambda EMBL3. It is an unusually long plant gene (16.4 kb) that encodes a preprotein of 966 amino acids. The phosphorylase coding sequence is interrupted by 14 introns whose positions do not match those of the introns in the human glycogen phosphorylase gene. A 78-amino-acid central peptide unique to plant plastidial phosphorylases is hypothesized to have arisen through mis-splicing at an intron-exon junction in an ancestral gene. The fifth intron of the phosphorylase gene is very large (approximately 7 kb) and contains a copia-like transposable element inserted in the opposite orientation to that of the phosphorylase gene. This element has been named Tst1; it is bordered on the 5' and 3' sides by long terminal repeats of 285 and 283 bp, respectively, which define an internal domain of 4492 bp. Tst1 contains four open reading frames (ORFs) that encode protein domains for a reverse transcriptase, an integrase, an RNA-binding site and a protease. Transcription of the phosphorylase gene appears to proceed unimpaired through the copia-like element.
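
The size figures quoted for Tst1 can be combined in a short arithmetic sketch. It assumes that the 4492 bp internal domain excludes both long terminal repeats, so that the full element is the sum of the three parts (5060 bp under that assumption), and it takes the fifth intron as exactly 7000 bp although the text gives only "approximately 7 kb". The function and constant names are illustrative and do not come from the original work.

```python
# Lengths in base pairs, as quoted in the text.
LTR_5_PRIME_BP = 285       # 5' long terminal repeat
LTR_3_PRIME_BP = 283       # 3' long terminal repeat
INTERNAL_DOMAIN_BP = 4492  # domain between the two LTRs

# Assumption: the text says only "approximately 7 kb" for the fifth intron.
INTRON_5_APPROX_BP = 7000


def tst1_total_length(ltr5=LTR_5_PRIME_BP,
                      internal=INTERNAL_DOMAIN_BP,
                      ltr3=LTR_3_PRIME_BP):
    """Total element length, assuming the internal domain excludes both LTRs."""
    return ltr5 + internal + ltr3


def fraction_of_intron(element_bp, intron_bp=INTRON_5_APPROX_BP):
    """Rough share of the intron occupied by the element."""
    return element_bp / intron_bp


if __name__ == "__main__":
    total = tst1_total_length()
    print(f"Tst1 length (LTRs + internal domain): {total} bp")
    print(f"Approximate share of intron 5: {fraction_of_intron(total):.0%}")
```

Because the intron length is only approximate, the share of the intron occupied by the element should be read as a rough estimate rather than a measured value.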