Many mammary tumors induced by mouse mammary tumor virus (MMTV) contain a provirus in the same region of the host-cell genome, leading to expression of a putative cellular oncogene called int-1. Here we present the structure and nucleotide sequence of int-1. We have mapped several proviral insertion sites precisely, by nuclease S1 analysis or by molecular cloning and DNA sequencing. The protein-encoding domain of int-1 is distributed over four exons. At the 5' end of the gene, two overlapping exons were detected, one of which is preceded by a TATA box. The deduced int-1-encoded protein has 370 amino acids, with a preponderance of hydrophobic residues at the NH2 terminus. Proviruses are found on both sides of the gene, usually oriented away from it. Downstream integrations occur frequently in the long 3' untranslated region of the last exon. One upstream provirus is inserted in the 5' untranslated region and, unlike the other upstream insertions, lies in the same orientation as the int-1 gene. Proviral integrations always leave the protein-encoding domain intact, providing further evidence that the int-1 protein contributes an essential step in mammary tumorigenesis.