DNA nanostructures must be rationally designed so that individual nucleic acid strands will assemble into the desired structures. This process usually begins with specification of a desired target structure or function. Then, the overall secondary structure of the target complex is determined, specifying the arrangement of nucleic acid strands within the structure, and which portions of those strands should be bound to each other. The last step is the primary structure design, which is the specification of the actual base sequences of each nucleic acid strand.
The first step in designing a nucleic acid nanostructure is to decide how a given structure should be represented by a specific arrangement of nucleic acid strands. This design step determines the secondary structure, or the positions of the base pairs that hold the individual strands together in the desired shape. Several approaches have been demonstrated:
- Tile-based structures. This approach breaks the target structure into smaller units with strong binding between the strands contained in each unit, and weaker interactions between the units. It is often used to make periodic lattices, but can also be used to implement algorithmic self-assembly, making it a platform for DNA computing. This was the dominant design strategy from the mid-1990s until the mid-2000s, when the DNA origami methodology was developed.
- Folding structures. An alternative to the tile-based approach, folding approaches make the nanostructure from one long strand, which either has a designed sequence that folds through interactions with itself, or is folded into the desired shape by shorter "staple" strands. The latter method is called DNA origami, and it allows the formation of nanoscale two- and three-dimensional shapes (see Discrete structures above).
- Dynamic assembly. This approach directly controls the kinetics of DNA self-assembly, specifying all of the intermediate steps in the reaction mechanism in addition to the final product. This is done using starting materials which adopt a hairpin structure; these then assemble into the final conformation in a cascade reaction, in a specific order (see Strand displacement cascades below). This approach has the advantage of proceeding isothermally, that is, at a constant temperature. This is in contrast to the thermodynamic approaches, which require a thermal annealing step in which a temperature change triggers the assembly and favors proper formation of the desired structure.
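As a rough illustration of what a secondary structure specification records (which portions of which strands should be bound to each other), the sketch below describes each strand as a list of named domains. The domain-level notation, in which `a*` denotes the complement of `a`, and the strand names `S1` to `S3` are assumptions made for this example and do not come from any particular design tool.

```python
from collections import defaultdict

def complement_name(domain):
    """Return the name of the complementary domain: 'a' <-> 'a*'."""
    return domain[:-1] if domain.endswith("*") else domain + "*"

def intended_pairs(strands):
    """List ((strand, index), (strand, index)) for every domain whose
    complement also appears somewhere in the design."""
    locations = defaultdict(list)
    for name, domains in strands.items():
        for i, domain in enumerate(domains):
            locations[domain].append((name, i))

    pairs = []
    for domain, sites in locations.items():
        if domain.endswith("*"):
            continue  # visit each complementary pair once, from the unstarred side
        for site in sites:
            for partner in locations.get(complement_name(domain), []):
                pairs.append((site, partner))
    return pairs

# Hypothetical three-strand design: each strand binds the other two.
strands = {
    "S1": ["a", "b"],
    "S2": ["b*", "c"],
    "S3": ["c*", "a*"],
}
pairs = intended_pairs(strands)
for site, partner in pairs:
    print(site, "binds", partner)
```

A description of this kind fixes only which domains pair; the base sequence of each domain is still left open at this stage.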
After any of the above approaches is used to design the secondary structure of a target complex, an actual sequence of nucleotides that will form the desired structure must be devised. Nucleic acid design is the process of assigning a specific nucleic acid base sequence to each of a structure's constituent strands so that they will associate into a desired conformation. Most methods have the goal of designing sequences so that the target structure has the lowest energy, and is thus the most thermodynamically favorable, while incorrectly assembled structures have higher energies and are thus disfavored. This is done either through simple, fast heuristic methods such as sequence symmetry minimization, or by using a full nearest-neighbor thermodynamic model, which is more accurate but slower and more computationally intensive. Geometric models are used to examine the tertiary structure of the nanostructures and to ensure that the complexes are not overly strained.
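A heuristic check in the spirit of sequence symmetry minimization can be sketched as follows. This is a simplified illustration, not a full implementation of the method: it flags only short subsequences that repeat anywhere in the design and subsequences that equal their own reverse complement, and it does not check complements against the intended pairing. The window length `k = 4`, the function names, and the example sequences are assumptions.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """Yield every length-k window of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def repeated_kmers(strands, k=4):
    """k-mers that occur more than once across all strands."""
    counts = Counter(kmer for seq in strands for kmer in kmers(seq, k))
    return {kmer for kmer, n in counts.items() if n > 1}

def self_complementary_kmers(strands, k=4):
    """k-mers equal to their own reverse complement, which could pair
    with a second copy of the same strand."""
    return {
        kmer
        for seq in strands
        for kmer in kmers(seq, k)
        if kmer == reverse_complement(kmer)
    }

# Hypothetical candidate sequences.
print(repeated_kmers(["ACCTGA"]))            # no repeated 4-mer
print(repeated_kmers(["ACCTG", "GACCT"]))    # 'ACCT' occurs in both strands
print(self_complementary_kmers(["AGCGCT"]))  # 'GCGC' is its own reverse complement
```

A design loop built on such checks would reject or mutate candidate sequences until both sets are empty. A nearest-neighbor thermodynamic model would instead assign an energy to each candidate structure.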
Nucleic acid design has similar goals to protein design. In both, the sequence of monomers is designed to favor the desired target structure and to disfavor other structures. Nucleic acid design has the advantage of being computationally much easier than protein design, because the simple base pairing rules are sufficient to predict a structure's energetic favorability, and detailed information about the overall three-dimensional folding of the structure is not required. This allows the use of simple heuristic methods that yield experimentally robust designs. Nucleic acid structures are, however, less versatile than proteins in their function, because proteins have a greater ability to fold into complex structures and because the four nucleotides offer limited chemical diversity compared to the twenty proteinogenic amino acids.
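To illustrate how far the base pairing rules alone can go, the sketch below scores a pair of strands by the largest number of Watson-Crick pairs found in any ungapped antiparallel alignment. The count is only a crude stand-in for energetic favorability and is an assumption of this example; it is not a thermodynamic model, and the function name and sequences are hypothetical.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def pairing_score(strand_a, strand_b):
    """Maximum number of Watson-Crick pairs over all ungapped
    antiparallel alignments of two strands written 5' to 3'."""
    rev_b = strand_b[::-1]  # reverse so that index i faces strand_a antiparallel
    best = 0
    for offset in range(-(len(rev_b) - 1), len(strand_a)):
        score = 0
        for i, base in enumerate(rev_b):
            j = i + offset
            if 0 <= j < len(strand_a) and (strand_a[j], base) in WATSON_CRICK:
                score += 1
        best = max(best, score)
    return best

strand = "GATTACA"
partner = "TGTAATC"  # reverse complement of strand, the intended partner

intended = pairing_score(strand, partner)
unintended = pairing_score(strand, strand)
print(intended, unintended)
```

In this example the intended partner pairs at every position, while the strand paired with a copy of itself scores lower, which is the kind of separation between desired and undesired structures that sequence design aims for.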