      Is Open Access

      Current challenges in de novo plant genome sequencing and assembly

      review-article
      Genome Biology
      BioMed Central
      DNA sequencing, genome assembly, plant genomics

          Abstract

          Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community.

          Most cited references (40)

          FLASH: fast length adjustment of short reads to improve genome assemblies.

          Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome. We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds. The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash. Contact: t.magoc@gmail.com.
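The idea in the abstract above, merging a read pair whose fragment is shorter than twice the read length, can be illustrated with a minimal Python sketch. This is not FLASH's implementation: the function names (`merge_pair`, `n50`), the parameters `min_overlap` and `max_mismatch_ratio`, and the rule of keeping read 1's bases in the overlap are all assumptions made for illustration. The `n50` helper shows the usual definition of the N50 statistic mentioned in the abstract.

```python
# Illustrative sketch only; names, defaults, and tie-breaking are assumptions,
# not FLASH's actual algorithm or API.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.25):
    """Merge a paired-end read pair by its best suffix/prefix overlap.

    read2 is given as sequenced (opposite strand), so it is
    reverse-complemented first. Returns the merged sequence, or None
    if no overlap satisfies the thresholds.
    """
    r2 = reverse_complement(read2)
    best = None  # (mismatch ratio, overlap length)
    for olen in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-olen:], r2[:olen]))
        ratio = mismatches / olen
        # Strict '<' means ties are resolved in favour of the longer overlap.
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, olen)
    if best is None:
        return None
    # Simplification: keep read 1's bases throughout the overlap.
    return read1 + r2[best[1]:]


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


fragment = "ACGTTGCAAGGCTTACCGATGG"               # 22 bp fragment
read1 = fragment[:16]                            # 16 bp forward read
read2 = reverse_complement(fragment[-16:])       # 16 bp reverse read
merged = merge_pair(read1, read2)
```

Here the two 16 bp reads overlap by 10 bp, so the merged sequence spans the whole 22 bp fragment. A production tool would also weigh base qualities when the two reads disagree.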
            Is Open Access

            Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences

            Increased reliance on computational approaches in the life sciences has revealed grave concerns about how accessible and reproducible computation-reliant results truly are. Galaxy (http://usegalaxy.org), an open web-based platform for genomic research, addresses these problems. Galaxy automatically tracks and manages data provenance and provides support for capturing the context and intent of computational methods. Galaxy Pages are interactive, web-based documents that provide users with a medium to communicate a complete computational analysis.
              Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning.

              Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map at single-base-pair resolution of methylated cytosines for Arabidopsis, by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology. This approach, termed BS-Seq, unlike previous microarray-based methods, allows one to sensitively measure cytosine methylation on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the DNA methylation sequence composition and distribution. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of mouse.
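The per-cytosine measurement described above rests on the fact that bisulphite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. A minimal Python sketch of the resulting estimate follows. The function name `methylation_level` and the input format (a string of bases observed at one reference cytosine, from reads on the same strand) are hypothetical, and the sketch assumes complete conversion and no sequencing errors; it is not the pipeline used in the paper.

```python
# Illustrative sketch only; the function name and input format are assumptions.

def methylation_level(bases_at_site: str):
    """Estimate the methylated fraction at one reference cytosine.

    bases_at_site holds the base each covering read shows at that position.
    'C' is counted as methylated and 'T' as unmethylated (converted);
    any other base is ignored. Returns None when there is no informative read.
    """
    c_reads = bases_at_site.count("C")
    t_reads = bases_at_site.count("T")
    informative = c_reads + t_reads
    if informative == 0:
        return None
    return c_reads / informative


level = methylation_level("CCTCAN")  # 3 C and 1 T are informative
```

In practice, incomplete conversion and sequencing errors bias this ratio, so real analyses usually estimate the conversion rate from a control and require a minimum read depth per site.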

                Author and article information

                Contributors
                Journal
                Genome Biol
                Genome Biology
                BioMed Central
                1465-6906
                1465-6914
                2012
                27 April 2012
                27 April 2013
                Volume: 13
                Issue: 4
                Article: 243
                Affiliations
                [1] Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
                Article
                gb-2012-13-4-243
                10.1186/gb-2012-13-4-243
                3446297
                22546054
                9b1a2739-a4b9-42e5-85fa-82c2c0134d85
                Copyright ©2012 BioMed Central Ltd.
                History
                Categories
                Review

                Genetics
                dna sequencing, genome assembly, plant genomics
