Base pair

Depiction of the adenine-thymine Watson-Crick base pair.

A base pair (bp) is a unit consisting of two nucleobases bound to each other by hydrogen bonds. Base pairs form the building blocks of the DNA double helix and contribute to the folded structure of both DNA and RNA. Dictated by specific hydrogen bonding patterns, Watson-Crick base pairs (guanine-cytosine and adenine-thymine) allow the DNA helix to maintain a regular helical structure that is subtly dependent on its nucleotide sequence.[1] The complementary nature of this base-paired structure provides a redundant copy of the genetic information encoded within each strand of DNA. The regular structure and data redundancy provided by the DNA double helix make DNA well suited to the storage of genetic information, while base-pairing between DNA and incoming nucleotides provides the mechanism through which DNA polymerase replicates DNA and RNA polymerase transcribes DNA into RNA. Many DNA-binding proteins can recognize specific base pairing patterns that identify particular regulatory regions of genes.

Intramolecular base pairs can occur within single-stranded nucleic acids. This is particularly important in RNA molecules (e.g., transfer RNA), where Watson-Crick base pairs (guanine-cytosine and adenine-uracil) permit the formation of short double-stranded helices, and a wide variety of non-Watson-Crick interactions (e.g., G-U or A-A) allow RNAs to fold into a vast range of specific three-dimensional structures. In addition, base-pairing between transfer RNA (tRNA) and messenger RNA (mRNA) forms the basis for the molecular recognition events that result in the nucleotide sequence of mRNA becoming translated into the amino acid sequence of proteins via the genetic code.

The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the total number of base pairs is equal to the number of nucleotides in one of the strands (with the exception of non-coding single-stranded regions of telomeres). The haploid human genome (23 chromosomes) is estimated to be about 3.2 billion bases long and to contain 20,000–25,000 distinct protein-coding genes.[2][3][4] A kilobase (kb) is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA.[5] The total amount of related DNA base pairs on Earth is estimated at 5.0 × 10^37, weighing about 50 billion tonnes.[6] In comparison, the total mass of the biosphere has been estimated to be as much as 4 TtC (trillion tons of carbon).[7]
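The figures above can be cross-checked with a short calculation. The following is a minimal sketch, not part of the cited estimates: it assumes an average molar mass of roughly 650 g/mol per base pair (a common rule-of-thumb value that does not appear in the text above), and the function names are hypothetical.

```python
# Rough consistency check of the base-pair count and DNA mass quoted above.
# ASSUMPTION: an average base pair has a molar mass of about 650 g/mol.
# This is a rule-of-thumb value, not a figure taken from the article.

AVOGADRO = 6.02214076e23        # particles per mole (exact SI definition)
AVG_BP_MASS_G_PER_MOL = 650.0   # assumed average molar mass of one base pair


def dna_mass_tonnes(base_pairs: float) -> float:
    """Estimate the mass in tonnes of the given number of base pairs."""
    grams = base_pairs * AVG_BP_MASS_G_PER_MOL / AVOGADRO
    return grams / 1e6  # 1 tonne = 1e6 grams


def to_kilobases(base_pairs: float) -> float:
    """Convert a length in base pairs to kilobases (1 kb = 1000 bp)."""
    return base_pairs / 1000


if __name__ == "__main__":
    print(f"Mass of 5.0e37 bp: {dna_mass_tonnes(5.0e37):.1e} tonnes")
    print(f"Haploid human genome: {to_kilobases(3.2e9):.1e} kb")
```

Under this assumption, 5.0 × 10^37 base pairs come out at a few tens of billions of tonnes, the same order of magnitude as the 50 billion tonnes quoted above.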

Hydrogen bonding and stability

Top, a G.C base pair with three hydrogen bonds. Bottom, an A.T base pair with two hydrogen bonds. Non-covalent hydrogen bonds between the bases are shown as dashed lines. The wiggly lines stand for the connection to the pentose sugar and point in the direction of the minor groove.

Hydrogen bonding is the chemical interaction that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content. Contrary to popular belief, however, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions.[8]

The larger nucleobases, adenine and guanine, are members of a class of double-ringed chemical structures called purines; the smaller nucleobases, cytosine and thymine (and uracil), are members of a class of single-ringed chemical structures called pyrimidines. Purines are complementary only with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established; purine-purine pairings are energetically unfavorable because the molecules are too close, leading to overlap repulsion. Purine-pyrimidine base pairing of AT or GC or UA (in RNA) results in proper duplex structure. The only other purine-pyrimidine pairings would be AC, GT, and UG (in RNA); these pairings are mismatches because the patterns of hydrogen bond donors and acceptors do not correspond. The GU pairing, with two hydrogen bonds, does occur fairly often in RNA (see wobble base pair).
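The pairing rules in the preceding paragraph can be summarized as a small lookup. This is an illustrative sketch only: the function name `classify_pair` and the category labels it returns are hypothetical, and the code encodes nothing beyond the rules stated above.

```python
# Classify a pair of nucleobases using the rules described above.
# The names and labels here are illustrative, not a standard API.

PURINES = {"A", "G"}            # double-ringed bases
PYRIMIDINES = {"C", "T", "U"}   # single-ringed bases
BASES = PURINES | PYRIMIDINES

# Unordered pairs, so that ("A", "T") and ("T", "A") are treated alike.
WATSON_CRICK = {frozenset("AT"), frozenset("GC"), frozenset("AU")}
WOBBLE = {frozenset("GU")}      # occurs fairly often in RNA


def classify_pair(base1: str, base2: str) -> str:
    """Return a short label describing how two bases pair."""
    b1, b2 = base1.upper(), base2.upper()
    for b in (b1, b2):
        if b not in BASES:
            raise ValueError(f"unknown base: {b!r}")

    pair = frozenset((b1, b2))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if b1 in PURINES and b2 in PURINES:
        return "purine-purine mismatch"        # bases too close together
    if b1 in PYRIMIDINES and b2 in PYRIMIDINES:
        return "pyrimidine-pyrimidine mismatch"  # bases too far apart
    return "purine-pyrimidine mismatch"        # donors/acceptors do not line up


if __name__ == "__main__":
    for a, b in [("A", "T"), ("G", "C"), ("G", "U"), ("A", "C"), ("A", "G")]:
        print(a, b, classify_pair(a, b))
```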

Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing (if any), and the GC content. Higher GC content results in higher melting temperatures; it is, therefore, unsurprising that the genomes of extremophile organisms such as Thermus thermophilus are particularly GC-rich. Conversely, regions of a genome that need to separate frequently, such as the promoter regions of often-transcribed genes, are comparatively GC-poor (for example, see TATA box). GC content and melting temperature must also be taken into account when designing primers for PCR.
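The dependence of melting temperature on GC content can be illustrated with a short calculation. The sketch below computes GC content and applies the Wallace rule, Tm ≈ 2 °C × (A + T) + 4 °C × (G + C), a rough rule of thumb for short oligonucleotides such as PCR primers. The rule is not taken from the text above and ignores salt concentration, mispairing, and the behavior of longer molecules; the function names are hypothetical.

```python
# GC content and a rough melting-temperature estimate for a short DNA oligo.
# ASSUMPTION: the Wallace rule (2 degC per A/T, 4 degC per G/C) is used as
# a crude approximation; it is only meant for short primers.


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq: str) -> int:
    """Estimate the melting temperature in degrees Celsius (Wallace rule)."""
    seq = seq.upper()
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for primer in ("ATCGATTGAGCTCTAGCG", "TATAAATATAAATATAAA", "GCGCCGGCGCCGGCGCCG"):
        print(primer, f"GC={gc_content(primer):.2f}", f"Tm~{wallace_tm(primer)} degC")
```

For sequences of equal length, the GC-rich one yields the highest estimate and the AT-rich one the lowest, in line with the trend described above.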

Examples

The following sequences illustrate double-stranded pairing patterns. By convention, the top strand is written from the 5' end to the 3' end; thus, the bottom strand is written 3' to 5'.

A base-paired DNA sequence:
ATCGATTGAGCTCTAGCG
TAGCTAACTCGAGATCGC
The corresponding RNA sequence, in which uracil is substituted for thymine in the RNA strand:
AUCGAUUGAGCUCUAGCG
UAGCUAACUCGAGAUCGC
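
The bottom strands above can be derived mechanically from the top strands. The following minimal sketch uses hypothetical helper names; because the bottom strand is written 3' to 5', it takes the plain complement rather than the reverse complement.

```python
# Derive the paired strands shown above from the top strand.
# Helper names are illustrative only.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complement_dna(top: str) -> str:
    """Return the DNA strand that pairs with top, written 3' to 5'."""
    return top.upper().translate(DNA_COMPLEMENT)


def dna_to_rna(seq: str) -> str:
    """Substitute uracil for thymine to get the corresponding RNA sequence."""
    return seq.upper().replace("T", "U")


def complement_rna(top: str) -> str:
    """Return the RNA strand that pairs with top, written 3' to 5'."""
    return top.upper().translate(RNA_COMPLEMENT)


if __name__ == "__main__":
    top_dna = "ATCGATTGAGCTCTAGCG"
    print(top_dna)
    print(complement_dna(top_dna))

    top_rna = dna_to_rna(top_dna)
    print(top_rna)
    print(complement_rna(top_rna))
```

To read the paired strand in the 5' to 3' direction instead, reverse the result (for example, `complement_dna(top_dna)[::-1]`).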