Ultrasound or a high-pressure air stream randomly shatters the DNA into pieces.
These libraries provide a “clone coverage” of more than 20-fold, meaning that, on average, 20
clones span each of the genome’s bases, thus offering the theoretical guarantee that each base is contained in at least one of the clones. This guarantee assumes that the clones are sampled uniformly at random from the genome. In practice, this requirement is seldom perfectly satisfied. Cloning biases lead to a nonrandom clone distribution, causing areas of the genome to remain unsequenced regardless of the amount of sequencing performed.
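The coverage argument above can be made concrete with a small calculation. The sketch below assumes an idealized Poisson model (clone start positions uniform and independent, edge effects ignored), which is an assumption added here for illustration and not something the text specifies. Under that model, the chance that a given base is missed by every clone at coverage c is e^(-c). The function names and the 3-billion-base genome length are hypothetical.

```python
import math


def uncovered_fraction(coverage: float) -> float:
    """Probability that a given base lies in no clone.

    Assumes an idealized Poisson model: the number of clones spanning a
    base is Poisson-distributed with mean equal to the coverage.
    """
    return math.exp(-coverage)


def expected_uncovered_bases(genome_length: int, coverage: float) -> float:
    """Expected number of bases that no clone spans under the same model."""
    return genome_length * uncovered_fraction(coverage)


if __name__ == "__main__":
    # Hypothetical genome length, chosen only for illustration.
    genome_length = 3_000_000_000
    for c in (1, 5, 10, 20):
        print(c, uncovered_fraction(c), expected_uncovered_bases(genome_length, c))
```

Under these assumptions, 20-fold coverage leaves only a handful of bases uncovered even in a very large genome, which is why the guarantee holds in theory. The model says nothing about cloning bias, which is what breaks the guarantee in practice.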
The gaps between contigs belonging to the same
scaffold are called sequence gaps. Although they
represent genuine gaps in the sequence, researchers
can retrieve the original clone inserts spanning the
gap and use a straightforward “walking” technique
to fill in the sequence.
The gaps between scaffolds are called physical
gaps because the physical DNA that would span
them is either not present in the clone inserts or
indeterminable due to misassemblies. Filling these
gaps involves a large amount of manual labor and
complex laboratory techniques.
These limitations spurred the development of
new algorithms. Two approaches exploit techniques
developed in the field of graph theory: one
that represents the sequence reads as graph nodes
and another that represents them as edges.
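To illustrate the difference between the two graph representations, here is a minimal sketch on toy reads. It is not the implementation of any particular assembler. In the first function each read is a node and an edge records a suffix-prefix overlap. In the second, a de Bruijn-style construction is assumed: each read is broken into k-mers and every k-mer becomes an edge between two (k-1)-mers, so a read corresponds to a path of edges. The function names, the toy reads, and the parameters `min_overlap` and `k` are all hypothetical.

```python
from collections import defaultdict


def overlap_graph(reads, min_overlap):
    """Reads are nodes; an edge a -> b means a suffix of a equals a prefix of b."""
    graph = defaultdict(list)
    for a in reads:
        for b in reads:
            if a == b:
                continue
            # Try the longest candidate overlap first and stop at the first match.
            for length in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
                if a[-length:] == b[:length]:
                    graph[a].append((b, length))
                    break
    return dict(graph)


def de_bruijn_graph(reads, k):
    """(k-1)-mers are nodes; every k-mer found in a read contributes an edge."""
    edges = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return {node: sorted(targets) for node, targets in edges.items()}


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGG", "ACGGTT"]
    print(overlap_graph(toy_reads, min_overlap=3))
    print(de_bruijn_graph(toy_reads, k=4))
```

The overlap graph needs an all-against-all comparison of reads, which this sketch does naively in quadratic time. The k-mer graph is built in a single pass over the reads, at the cost of discarding the information about which read each k-mer came from.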
Euler [9] detects repeats by finding complex areas, or
tangles, in the graph constructed during assembly.
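One simple way to see why repeats show up as complex regions is to look for branching nodes in a k-mer graph. The sketch below is a deliberately simplified stand-in for tangle detection and is not Euler's actual algorithm: it assumes a de Bruijn-style graph built from a single toy sequence and flags any node with more than one incoming or outgoing edge. The function names and example sequences are hypothetical.

```python
from collections import Counter


def kmer_edges(sequence, k):
    """Return the distinct (prefix, suffix) edges for every k-mer in sequence."""
    return {
        (sequence[i:i + k - 1], sequence[i + 1:i + k])
        for i in range(len(sequence) - k + 1)
    }


def branching_nodes(edges):
    """Nodes with more than one incoming or outgoing edge.

    In a graph built from a repeat-free sequence every node lies on a
    simple path, so branching marks places where a repeated substring
    is entered or left in more than one way.
    """
    out_degree = Counter(src for src, _ in edges)
    in_degree = Counter(dst for _, dst in edges)
    nodes = set(out_degree) | set(in_degree)
    return sorted(n for n in nodes if out_degree[n] > 1 or in_degree[n] > 1)


if __name__ == "__main__":
    # "ACG" occurs twice, in two different contexts.
    print(branching_nodes(kmer_edges("TACGGACGC", 3)))
    # No repeated 2-mer, so the graph is a simple path.
    print(branching_nodes(kmer_edges("TACGGT", 3)))
```

In the first example the repeated substring collapses into a shared path, so the node where the two copies merge and the node where they diverge are both flagged. Real read data would also produce branches from sequencing errors, which this sketch does not try to distinguish from repeats.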