
      A comprehensive transcriptome data of normal and Nosema ceranae-stressed midguts of Apis mellifera ligustica workers


          Abstract

          Honeybees are pivotal pollinators of crops and wild flora and are of great importance in maintaining ecosystem balance. Nosema ceranae, a unicellular fungal parasite that infects the midgut epithelial cells of honeybees, can dramatically reduce honeybee populations and productivity. Here, midguts of Apis mellifera ligustica workers at 7 d and 10 d post inoculation (dpi) with sucrose solution (Ac7CK and Ac10CK) and midguts at 7 dpi and 10 dpi with sucrose solution containing N. ceranae spores (Ac7T and Ac10T) were sequenced using strand-specific cDNA library construction and next-generation sequencing. A total of 1,956,129,858 raw reads were generated, and after quality control, 1,946,489,304 high-quality clean reads with a mean Q30 of 93.82% were obtained. The rRNA-removed clean reads were then aligned to the Apis mellifera reference genome with TopHat2. For more insight, please see “Genome-wide identification of long non-coding RNAs and their regulatory networks involved in Apis mellifera ligustica response to Nosema ceranae infection” [1]. Raw data were deposited in the NCBI Sequence Read Archive (SRA) database under BioProject number PRJNA406998. These data can be used for comparative analysis to identify differentially expressed coding and non-coding RNAs involved in A. m. ligustica responses to N. ceranae stress, and to investigate the molecular mechanisms regulating the host response to N. ceranae.
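
          As a quick sanity check on the read counts quoted in the abstract, the share of raw reads that survived quality control can be computed directly. This is only an illustrative sketch: the function name `retention` is hypothetical and not part of the authors' pipeline.

```python
# Read counts as quoted in the abstract.
RAW_READS = 1_956_129_858
CLEAN_READS = 1_946_489_304


def retention(raw: int, clean: int) -> float:
    """Return the fraction of raw reads that remain after quality control."""
    if raw <= 0 or not 0 <= clean <= raw:
        raise ValueError("expected 0 <= clean <= raw and raw > 0")
    return clean / raw


removed = RAW_READS - CLEAN_READS
print(f"reads removed by QC: {removed:,}")
print(f"fraction retained:   {retention(RAW_READS, CLEAN_READS):.2%}")
```

          Under these numbers, fewer than 1% of raw reads were discarded during quality control.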


          Most cited references (3)


          Genomic Analyses of the Microsporidian Nosema ceranae, an Emergent Pathogen of Honey Bees

          Introduction

          Honey bees, Apis mellifera, face diverse parasite and pathogen challenges against which they direct both individual and societal defenses [1]. Severe honey bee colony losses have occurred in the past several years in the United States, Asia, and Europe. Some of these losses have been attributed to Colony Collapse Disorder (CCD), a sporadic event defined by high local colony mortality, the rapid depopulation of colonies, and the lack of known disease symptoms [2]. While causes of CCD are not yet known, and are likely to be multifactorial, increased pathogen loads in declining bees suggest a role for disease. One candidate disease agent is the microsporidian Nosema ceranae, a species that has sharply increased its range in recent years [3]. Microsporidia are a highly derived lineage of fungi that parasitize a diverse assemblage of animals [4]. N. ceranae was first described from colonies of the Asian honey bee, Apis cerana, that were sympatric with A. mellifera colonies in China. Fries et al. [5] suggested that a host switch from A. cerana to A. mellifera occurred relatively recently. Currently, N. ceranae is the predominant microsporidian parasite of bees in North America [6] and Europe [3]. N. ceranae is an obligate intracellular parasite of adult honey bees. Ingested spores invade the gut epithelium immediately after germination. Intracellular meronts eventually lead to mesospores that can invade neighboring cells after host-cell lysis. Ultimately, hardier exospores are passed into the gut and excreted, at which point these exospores are infective to additional hosts. While congener N. apis appears to restrict its life cycle to the gut wall, N. ceranae was recently shown to invade other tissues [7]. Health impacts of Nosema infection on honey bees include a decreased ability to acquire nutrients from the environment and ultimately a shortened lifespan [8]. 
At the colony level, Nosema infection can lead to poor colony growth and poor winter survivorship. Nevertheless, N. ceranae is widespread in both healthy and declining honey bee colonies and its overall contribution to honey bee losses is debatable [9],[10],[11]. Genetic studies of N. ceranae and its infected host have been hindered by a lack of genetic data. Prior to this study, microsporidian sequence data were most extensive for the mammalian pathogen Encephalitozoon cuniculi, complemented by genome or EST surveys of several other human and insect pathogens [4],[12],[13],[14] and a recent draft annotation of the mammalian pathogen Enterocytozoon bieneusi [15]. Public sequences for N. ceranae were limited to ribosomal RNA loci. We therefore chose pyrosequencing to rapidly and cost-effectively characterize the N. ceranae genome, simultaneously illuminating the ecology and evolution of this parasite while enabling focused studies of virulence mechanisms and population dynamics. A genomic approach also leverages existing microsporidian and fungal genome sequence, advancing through comparative analysis our understanding of how microsporidian genome architecture and regulation has evolved. Microsporidia are remarkable in having small genomes that overlap prokaryotes in size, a propensity for overlapping genes and transcripts, few introns, and predicted gene complements less than half that found for yeast [16],[17],[18]. Microsporidian cells are also simplified at the organellar level and lack mitochondria, instead containing a genome-less organelle, the mitosome, that appears incapable of oxidative phosphorylation but may function in iron-sulfur biochemistry [19],[20]. Biochemical studies and sequence analyses have identified novel features of carbon metabolism and a dependency on host ATP, but much of their metabolism remains unclear [17],[19] and major metabolic pathways can differ substantially among species [15]. Here we analyze a draft genome assembly for N. 
ceranae, present a gene set of 2,614 putative proteins that can now be used to uncover salient aspects of Nosema pathology, and describe gene families and ontological groups that are distinct relative to other sequenced fungi. We provide formatted annotations for viewing with the Gbrowse genome viewer [21], which we hope will aid future studies of this economically important pathogen and of microsporidia in general.

Materials and Methods

Nosema ceranae spore purification

Honey bees infected with N. ceranae were collected from the USDA-ARS Bee Research Laboratory apiaries, Beltsville, MD. Alimentary tracts of these bees were removed and crushed in sterile water and filtered through a Corning (Lowell, MA) Netwell insert (24 mm diameter, 74 µm mesh size) to remove tissue debris. The filtered suspension was centrifuged at 3,000×g for 5 minutes and the supernatant discarded. The re-suspended pellet was further purified on a discontinuous Percoll (Sigma-Aldrich, St. Louis, MO) gradient consisting of 5 ml each of 25%, 50%, 75% and 100% Percoll solution. The spore suspension was overlaid onto the gradient and centrifuged at 8,000×g for 10 minutes at 4°C. The supernatant was discarded and the spore pellet was washed by centrifugation and suspension in distilled sterile water.

Genomic DNA extraction

Approximately 10^6 N. ceranae spores were suspended in 500 µl CTAB buffer (100 mM Tris-HCl, pH 8.0; 20 mM EDTA, pH 8.0; 1.4 M sodium chloride; 2% cetyltrimethylammonium bromide, w/v; 0.2% 2-mercaptoethanol) and broken by adding 500 µg of glass beads (425–600 µm, Sigma-Aldrich, St. Louis, MO) into the tube and disrupting the mixture at maximum speed for 2–3 minutes using a FastPrep Cell Disrupter (Qbiogene, Carlsbad, CA). The mixture was then incubated with proteinase K (200 µg/ml) for five hours at 55°C. Genomic DNA was extracted in an equal volume of phenol/chloroform/isoamyl alcohol (25∶24∶1) twice, followed by a single extraction in chloroform. 
The purified DNA was precipitated with isopropanol, washed in 70% ethanol, and dissolved in 50 µl sterile water. The concentration and purity of the DNA were determined by spectrophotometric absorption at 260 nm, and ratios of absorption at 260 nm and 280 nm.

Sequencing and assembly

Extracted DNA was pooled, sheared, and processed using in-house protocols at 454 Life Sciences (Branford, CT). The template was then amplified by two separate runs of 32 emulsion-PCR reactions each, with each reaction comprised of templates containing 454-linker sequence attached to 600,000 sepharose beads [22]. Successful amplifications were sequenced using GS FLX picotiter plates and reads were trimmed of low-quality sequence before assembly with the Celera Assembler package CABOG [23].

Gene predictions

Gene predictions were merged from three distinct sources. We first used the Glimmer package [24], which is designed for predicting exons of prokaryote and small eukaryote genomes, using a hidden Markov model to evaluate the protein-coding potential of ORFs. The model was initially trained on ORFs identified by Glimmer's longorf program, and then run with the following parameters: a minimum length of 90 codons, a maximum overlap of 50 bp, and a threshold score of 30. ORFs that contained a high proportion of tandem sequence repeats were ignored. Secondly, we identified all additional ORFs not predicted to be protein-coding by Glimmer that were BLASTX [25] matches to GenBank fungal proteins (at a lax expectation threshold of 1.0E-5). Finally, all remaining ORFs were searched with the HMMER program (http://hmmer.janelia.org) for Pfam-annotated protein domains [26] using an expectation threshold of 1.0E-1. In 58 cases, adjacent ORFs matching different parts of the same GenBank protein or Pfam domain could be joined by hypothesizing a single-base frameshift error in the assembly. 
Our annotations span the start and stop codons of these conjoined ORFs and indicate the approximate site of the frameshift with the ambiguity characters N and X, respectively, in the nucleotide and protein sequence. tRNA genes were predicted with the program ARAGORN [27]. Ribosomal genes were identified by BLASTN searches and alignments with existing Nosema ribosomal sequence in GenBank and the SILVA ribosomal database [28]. Nucleotide composition of protein-coding genes was investigated with the program INCA2.0 [29].

Protein homology searches and functional annotation

We identified probable one-to-one orthologs among these three genomes using reciprocal best BLASTP matches, with the additional requirements that the best match have an expectation ≤1.0E-10 and 10^3 lower than the second best match (identical protein predictions in E. cuniculi were considered equivalent). Best-fit homologs in yeast, as determined by BLASTP with a minimum expectation of 1.0E-10, were used to annotate N. ceranae genes with GO Slim ontologies [30]. Signal peptides were predicted using the SignalP 3.0 program [31] and transmembrane domains were predicted with TMHMM 2.0 [32]. Assignments to conserved positions in metabolic and regulatory pathways were based on the KEGG annotation resource [33], assisted by the Blast2Go program [34]. Repetitive elements were identified by searching against Repbase [35], by the pattern searching algorithm REPuter [36], and by intragenomic BLASTN analyses.

Results

Sequencing and assembly

Sequence information and annotations are posted in GenBank (www.ncbi.nlm.nih.gov) under Genome Project ID 32973. High-quality reads from two 454 GS FLX sequencing runs contributed 275.8 MB for assembly. The assembly was complicated by an extreme AT bias, frequent homopolymer runs (which are prone to sequencing error), and numerous repetitive elements (see below). 
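
The one-to-one ortholog rule described in the Methods (reciprocal best BLASTP matches, best expectation ≤1.0E-10 and at least 10^3 lower than the second-best match) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: hits are assumed to be already parsed into (subject, E-value) pairs, the function names and toy identifiers are hypothetical, and the special handling of identical E. cuniculi protein predictions is not modelled.

```python
from typing import Dict, List, Optional, Tuple

# query id -> list of (subject id, E-value); assumed to be pre-parsed BLASTP output
Hits = Dict[str, List[Tuple[str, float]]]


def unambiguous_best(hits: List[Tuple[str, float]],
                     max_evalue: float = 1e-10,
                     margin: float = 1e3) -> Optional[str]:
    """Return the best subject if it passes both thresholds, else None."""
    ranked = sorted(hits, key=lambda hit: hit[1])
    if not ranked:
        return None
    subject, best_e = ranked[0]
    if best_e > max_evalue:
        return None
    # The best E-value must be at least `margin` times lower than the runner-up.
    if len(ranked) > 1 and best_e * margin > ranked[1][1]:
        return None
    return subject


def reciprocal_best(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Pairs (a, b) where a's unambiguous best hit is b and vice versa."""
    pairs = []
    for query, hits in a_vs_b.items():
        subject = unambiguous_best(hits)
        if subject is not None and unambiguous_best(b_vs_a.get(subject, [])) == query:
            pairs.append((query, subject))
    return sorted(pairs)


# Toy data: only nc1/ec1 satisfies both the E-value and the margin criteria.
a_vs_b = {
    "nc1": [("ec1", 1e-50), ("ec2", 1e-20)],
    "nc2": [("ec3", 1e-12), ("ec4", 1e-11)],  # runner-up too close
    "nc3": [("ec5", 1e-5)],                   # best hit too weak
}
b_vs_a = {"ec1": [("nc1", 1e-48)], "ec3": [("nc2", 1e-12)], "ec5": [("nc3", 1e-5)]}
orthologs = reciprocal_best(a_vs_b, b_vs_a)
print(orthologs)
```

The margin test is what makes the rule conservative: a query with two nearly equally good matches is excluded rather than assigned to either.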
Sixty-one independent assemblies were evaluated by systematically increasing the error parameter from zero to 6% in 0.1% increments. The final assembly used an error rate of 3.5% because this maximized both the N50 of contig size and the length of the longest contig. To search for potential mis-assembly, we compared this version to other assemblies using MUMmer [37]. We identified two contigs that likely contained collapsed repeats and replaced these with alternative versions assembled with a stricter error parameter. Other parameters of the assembly remained at their default CABOG settings. Sequencing and assembly statistics are summarized in Table 1.

Table 1 (10.1371/journal.ppat.1000466.t001). Statistics of draft N. ceranae genome assembly analyzed in this paper.

Stage        Category                                  Value*
Sequencing   High-quality reads                        1,063,650
Sequencing   High-quality bases                        275,848,411
Sequencing   Average high-quality read length          259.3
Sequencing   Average high-quality read quality score   30.4
Assembly     Number of retained contigs                5,465
Assembly     Range of contig length (bp)               500–65,607
Assembly     Sum of contigs                            7,860,219
Assembly     Contig N50 length                         2,902
Assembly     Contig N50 number                         470
Assembly     Average contig coverage                   24.2

* Sequence lengths in base pairs.

Accidental incorporation of non-target DNA sequence into genome assemblies is a ubiquitous hazard even with stringent sample preparation. We therefore used BLAST, depth of coverage, and G+C content as criteria to help identify potential contamination, but found no evidence of sequence derived from the host genome (A. mellifera), the sympatric congener N. apis, or another common fungal pathogen of bees, Ascosphaera apis. However, we did find evidence for low-level contamination by an unknown ascomycete fungus, indicated by generally short, low-coverage, high-GC contigs with consistently stronger BLASTX matches to Ascomycota than to Microsporidia. We therefore removed all contigs with less than five-fold coverage and a G+C content of 0.5 or greater (see Fig. 
S1 ), as well as any contig that matched ascomycete ribosomal or mitochondrial sequence. After purging these suspect contigs and removing all contigs less than 500 bp in length, there remained 5465 contigs that totaled 7.86 MB of DNA. The N50 contig size of the pruned assembly was 2.9 kb (i.e., half of the total assembly, or 3.93 MB, was in contigs greater than 2.9 kb). The mean sequence coverage of contigs was 24.2×. Using the GigaBayes suite of programs [38],[39], we estimated the frequency of simple polymorphisms (indel or nucleotide, P≥0.90 per site) on the 100 longest contigs to be 1.0 per kilobase. Genomic G+C content of the final contig set was low compared with E. cuniculi, 26% vs. 47%, but typical of other surveyed microsporidia. Genomic contigs of Enterocytozoon bieneusi in GenBank have a G+C content of 24%, and Williams et al. [13] reported genomic G+C contents of Brachiola algerae and Edhazardia aedis to be 24% and 25%, respectively. Although several factors potentially associated with microbial base composition have been investigated, such as ambient temperature, mutation bias, and selection on genome replication rates, the causes of compositional bias remain unclear (see, for example, [40] and references cited therein). Because genome assemblies may not accurately represent true genome size, due to such factors as redundancy at contig ends or collapsed repeats, we applied the method of Carlton et al. [41] to estimate genome size from sequence coverage, excluding repeats. We first classified all 22-mers occurring in the read sequence not more than 40 times as the unique portion of the genome. Using this filter, the average coverage was 26.6× and 28.2× for regions of at least 1 kb and 10 kb in length, respectively. The total length of the N. ceranae reads is 261.0 MB after filtering reads with G+C content higher than 50%. With these values, the total genome size could be as high as 9.8 MB. 
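
Two quantities in this passage lend themselves to a short worked example: the contig N50 and the coverage-based genome size estimate. The sketch below is illustrative only. The function names are hypothetical, the toy contig lengths are invented, and the genome size is computed simply as total read length divided by mean coverage, which reproduces the 9.8 MB upper estimate from the 261.0 MB of filtered reads and 26.6× coverage quoted above.

```python
from typing import Iterable, Tuple


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50 length, N50 number).

    N50 length is the length of the contig at which the running sum of
    contigs, taken from longest to shortest, first reaches half of the
    total assembly size; N50 number is how many contigs that takes.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contigs supplied")


def genome_size_mb(total_read_mb: float, mean_coverage: float) -> float:
    """Estimate genome size as total read length divided by mean coverage."""
    if mean_coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_read_mb / mean_coverage


# Toy assembly (invented lengths): total 30, so half is reached at the 2nd contig.
print(n50([10, 8, 6, 4, 2]))
# Values quoted in the text: 261.0 MB of reads at 26.6x coverage.
print(round(genome_size_mb(261.0, 26.6), 1))
```

Because the estimate is a simple ratio, any upward revision of the coverage figure lowers the inferred genome size proportionally.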
However, this G+C filter may be overly permissive; increasing the filter stringency to 35% G+C reduces the genome size estimate to 8.6 MB. An additional consideration is that, at the estimated level of coverage, we expect the entire genome to be sequenced with few singletons or small contigs. Yet 30.0 MB of read sequence assembled into contigs with 10 or fewer reads, including 5.5 MB of single-read contigs. These small contigs are likely to be from reads with relatively high sequencing error. If so, this would boost the average coverage of the assembly by 3×–3.5× and reduce the genome size to as low as 7.7 MB. Our attempts to measure the genome size empirically with pulse-field gel electrophoresis did not adequately resolve N. ceranae chromosomes. However, this technique in other Nosema species has yielded genome size estimates of 7.4–15 MB [42]. Thus, while our computational estimate is in reasonable agreement with current genome size estimates for the genus, an unknown but potentially significant portion of the genome may be unrepresented in this assembly and the absence of particular sequences should not be considered definitive.

Sequence repetition

The genome sequence of E. cuniculi revealed an unusual distribution of sequence repeats, characterized by a lack of known transposable elements, a paucity of simple repeats, and an abundance of near-perfect segmental duplications of 0.5–10 kb in length. Pulse-field gel electrophoretic studies have identified gross variation in the size of homologous chromosomes among and within isolates of E. cuniculi [43] and the microsporidian Paranosema grylli [44], indicating that large segmental duplications are potentially important sources of intraspecific variation. The origins and gene content of such duplications are therefore of particular interest. While the present assembly limits our ability to describe larger segmental duplications in N. 
ceranae, we were able to investigate sequence repetition in the genome by searching for microsatellite motifs and by using REPuter [36] to detect complex repeats. All eight dinucleotide repeats found were ‘AT’ repeats, ranging from a perfect 9-unit repeat to an imperfect (3 mismatches) 21-unit repeat. There were six AAT repeats greater than 6 units in length and four ATC repeats. We confined our search for complex repeats to those contigs greater than 1,200 bp in length, so as to identify repeats likely to be dispersed in the genome rather than confined to the most poorly assembled fragments. REPuter identified a total of 4,731 sequence pairs with at most three mismatches that ranged from 70 bp (the minimum threshold for detection) up to 312 bp in length with a median of 85 bp. Repeats were over-represented on smaller contigs, even within the analyzed set of relatively long contigs, indicating that they had affected assembly success. BLASTN analyses of the REPuter-identified repeats against the N. ceranae genome revealed a novel dispersed repeat with a conserved core domain approximately 700 bp in length ( Fig. S2 ). The boundaries of the element are not completely clear because the conserved domain often occurs as tandem copies, there are two or more subtypes of the element based on multiple sequence alignments, and, as expected, copies are most abundant on short contigs and near contig ends. Using an E-value cutoff of 1.0E-5, we identified one or more matches on 250 contigs. No conserved coding potential was evident for these elements, nor did we detect any homology with sequences in GenBank or Repbase. Surprisingly, this element contains a candidate polII promoter that is well conserved and generally scores between 0.90 and 1.00 (the maximum value) when submitted to a neural network prediction tool [45]. Whether this promoter-like motif is functional and, if so, whether it produces a coding or noncoding transcript remain to be seen. 
However, it is clear from BLAST searches that this promoter sequence is not associated with any of our predicted genes (see below), nor could we identify it in E. cuniculi or yeast.

Predicted genes and associated features

We identified 2,614 putative protein-coding genes, with reference names, coordinates, and annotation features provided in Text S1. Gene models were not required to have a start methionine to allow for gene predictions truncated at ends of contigs and (rarely) the possibility of non-canonical start codons or frameshifts in the assembly. In addition to BLAST-hit annotations, Text S1 also lists Pfam protein domains as well as signal peptide and transmembrane motifs. Texts S2 and S3, respectively, contain GFF-formatted data and a configuration file for viewing our annotations with the Gbrowse viewer [21]. An example of these annotations viewed in GBrowse is shown in Fig. S3. The number of protein-coding genes we have predicted for N. ceranae lies in between the 1,996 Refseq proteins given by GenBank for the sequenced E. cuniculi genome and the 3,804 predicted for E. bieneusi from sequence representing only two-thirds of the estimated genome content. The density of genes on the 100 largest N. ceranae contigs averaged 0.60 genes/kb (64.8% coding sequence). This is a lower proportion of coding sequence than found in E. cuniculi and Antonospora locustae (0.94 and 0.97 genes/kb, respectively [4]), but comparable to some other microsporidia [13]. However, gene density declines considerably with contig size (Fig. S4), consistent with a preponderance of repetitive elements (described above and to follow) or other noncoding sequence in these regions. We found forty-six contigs containing sequences that matched N. ceranae ribosomal sequence at an expectation of E […] 1000 bp, 500–1000 bp, and <500 bp, left to right. Note wide range of mean coverage, even for large contigs. (2.13 MB TIF)
Figure S2 Partial sequence alignment of copies of a novel dispersed repeat found on 250 contigs using conservative BLAST criteria. The conserved sequence includes a candidate polII promoter but no long ORF. (7.68 MB TIF)

Figure S3 Screenshots of annotated N. ceranae assembly viewed with the Gbrowse application. (4.23 MB TIF)

Figure S4 Table illustrating the progressive decline in gene density as contig size decreases. (0.88 MB TIF)

Figure S5 The 65 N. ceranae tRNA genes predicted by ARAGORN [27], ordered by the corresponding amino-acid. (1.65 MB TIF)

Figure S6 Sense-strand matches to the yeast TATA motif, TATA[AT]A[AT], in the 200-bp region upstream of high-confidence start codons. The vertical axis shows the proportion of all matches upstream of the sampled genes (n = 280, see text) that begin at the specified distance from the start codon. There is a pronounced spike in TATA box motifs occurring in the vicinity of the −27 position relative to their frequency in random sequence of the same base composition. (1.42 MB TIF)

Figure S7 Codon usage of N. ceranae (red) and E. cuniculi (green) genes, plotted using INCA [29]. Each bar represents the proportion of all codons encoding a given amino-acid that are the specified codon. Thus, the values are one by definition for the single-codon amino-acids, tryptophan and methionine. (5.78 MB TIF)

Figure S8 Codon bias of genes of three microsporidian genomes. Only N. ceranae genes with homology to genes in E. cuniculi are plotted. Vertical axis is ENC' (Novembre JA [2002] Accounting for background nucleotide composition when measuring codon usage bias. Mol Biol Evol 19: 1390–1394), a measure of codon bias adjusted for nucleotide composition, plotted versus third-position G+C (GC3). Few N. 
ceranae genes have an ENC' less than 50. Those that do are not obviously related, by homology or ontology, to comparably biased genes in the other two species. Thus, strong codon bias may not be a useful predictor of gene-expression level in microsporidia as it is in a variety of other microbes. (1.22 MB TIF)

Figure S9 Frequency of each amino-acid, indicated by single-letter codes, in predicted proteins of N. ceranae and E. cuniculi. A. The frequency of each amino-acid in those genes that have one-to-one orthologs in the other microsporidian genomes and yeast. The conservation of these genes suggests that they have essential and ancient functions. B. The frequency of each amino-acid in all predicted proteins of the indicated species. (1.93 MB TIF)

Figure S10 Characteristics of ‘microsporidian-specific’ genes, orthologous pairs of genes found in N. ceranae and E. cuniculi that lack apparent homology with proteins of taxa outside of order Microsporidia. (1.12 MB TIF)

Figure S11 Amino-acid compositions of the putative polar tube proteins PTP1 and PTP2 in N. ceranae and two other microsporidians, E. cuniculi and A. locustae. Lengths of predicted proteins, in amino acid residues, are given in parentheses. (1.86 MB TIF)

Figure S12 Degree of synteny between the N. ceranae contigs and E. cuniculi chromosomes. For each of the three largest contigs, predicted N. ceranae genes are shown in order along the contig (not to scale). The relative orientation of each gene is indicated by the arrow. N. ceranae genes shaded gray have one-to-one orthologs with E. cuniculi genes, whereas circled genes have homologs in E. cuniculi but not a one-to-one ortholog. Unmarked genes have no detected homolog in E. cuniculi (see text). The position in kilobases and relative orientation of the E. cuniculi ortholog is shown directly below the N. 
ceranae gene in the row corresponding to its chromosomal location. Coordinates are based on the GenBank record for each chromosome. These contigs contain regions of extensive, coarse-scale synteny with E. cuniculi, within which there can be considerable change in gene order or orientation. There are also numerous breaks in synteny associated with either a switch in E. cuniculi chromosome or an intervening, non-homologous gene. (2.06 MB TIF)

Figure S13 Relative sequence conservation between N. ceranae proteins and their homologs in other reference species. N. ceranae genes with one-to-one orthologs in E. cuniculi and yeast (see text) were BLASTP searched against the combined proteomes of E. cuniculi, E. bieneusi, and yeast. The number of N. ceranae genes with high-scoring matches in all three reference species in this data set was 234. The upper panel plots the BLASTP score of each N. ceranae gene versus the best match in each species, ordered along the X-axis by descending score versus E. cuniculi. Values are represented as lines rather than points for easier visualization. The lower panel plots the BLASTP expectation (E-value) in ascending order versus E. cuniculi. E-values equal to zero were set to 1.0E-200 to allow a logarithmic scale. (2.63 MB TIF)

Figure S14 Alignment of adjacent N. ceranae genes that are supported by homology (see Results) and that overlap in sequence. Each set of three sequences represents a contig and two adjacent genes. Start and stop codons are indicated by boxes. (2.37 MB TIF)
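
The motif scan summarized in the Figure S6 caption (sense-strand matches to TATA[AT]A[AT] within the 200 bp upstream of a start codon) could be sketched roughly as below. This is a hedged illustration, not the authors' method: the function name is hypothetical, the toy sequence is invented, and the position convention (a match starting k bases before the start codon is reported as −k) is an assumption.

```python
import re
from typing import List

# Yeast TATA motif quoted in the caption; the lookahead lets overlapping matches count.
TATA = re.compile(r"(?=TATA[AT]A[AT])")


def tata_positions(upstream: str, window: int = 200) -> List[int]:
    """Return start positions of TATA matches relative to the start codon.

    `upstream` is the sense-strand sequence that ends immediately before the
    start codon. Only the last `window` bases are scanned. A match beginning
    k bases before the start codon is reported as -k (assumed convention).
    """
    region = upstream.upper()[-window:]
    return [match.start() - len(region) for match in TATA.finditer(region)]


# Toy upstream sequence (invented): one motif starting 27 bases before the start codon.
toy_upstream = "G" * 10 + "TATAAAT" + "C" * 20
print(tata_positions(toy_upstream))
```

Collecting these positions over many genes and tabulating them by distance would give the kind of profile the caption describes; comparing against shuffled sequence of the same base composition is left out of this sketch.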

            First identification of long non-coding RNAs in fungal parasite Nosema ceranae


              Genome-wide identification of long non-coding RNAs and their regulatory networks involved in Apis mellifera ligustica response to Nosema ceranae infection

              Long non-coding RNAs (lncRNAs) are a diverse class of transcripts that structurally resemble mRNAs but do not encode proteins, and lncRNAs have been shown to play pivotal roles in a wide range of biological processes in animals and plants. However, the expression patterns and potential roles of honeybee lncRNAs in the response to Nosema ceranae infection remain unknown. Here, we performed whole transcriptome strand-specific RNA sequencing of normal midguts of Apis mellifera ligustica workers (Am7CK, Am10CK) and N. ceranae-inoculated midguts (Am7T, Am10T), followed by comprehensive analyses using bioinformatic and molecular approaches. A total of 6353 A. m. ligustica lncRNAs were identified, including 4749 conserved lncRNAs and 1604 novel lncRNAs. These lncRNAs had low sequence similarity to known lncRNAs in other species; however, their structural features were similar to those of their counterparts in mammals and plants, including shorter exon and intron lengths, lower exon number, and lower expression level compared with protein-coding transcripts. Further, 111 and 146 N. ceranae-responsive lncRNAs were identified from midguts at 7 days post inoculation (dpi) and 10 dpi, respectively, compared with control midguts. Twelve differentially expressed lncRNAs (DElncRNAs) were shared by the Am7CK vs Am7T and Am10CK vs Am10T comparison groups, while the numbers of unique ones were 99 and 134, respectively. Functional annotation and pathway analysis showed that the DElncRNAs may regulate the expression of neighboring genes by acting in cis. Moreover, we discovered 27 lncRNAs harboring eight known miRNA precursors and 513 lncRNAs harboring 2257 novel miRNA precursors. Additionally, hundreds of DElncRNAs and their target miRNAs were found to form complex competitive endogenous RNA (ceRNA) networks, suggesting these DElncRNAs may act as miRNA sponges. 
Furthermore, DElncRNA-miRNA-mRNA networks were constructed and investigated; the results indicated that some of the DElncRNAs are likely to participate in regulating material and energy metabolism as well as cellular and humoral immunity during host responses to N. ceranae invasion. Finally, the expression patterns of 10 DElncRNAs were validated using RT-qPCR, confirming the reliability of our sequencing data. Our findings offer not only a rich genetic resource for further investigation of the functional roles of lncRNAs involved in the A. m. ligustica response to N. ceranae infection, but also novel insight into host-pathogen interaction during microsporidiosis of honeybees.

                Author and article information

                Journal
                Data in Brief (Data Brief)
                Elsevier
                ISSN 2352-3409
                22 August 2019
                October 2019
                Volume 26, Article 104349
                Affiliations
                [1]College of Bee Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
                Author notes
                Corresponding author: ruiguo@fafu.edu.cn
                Article
                S2352-3409(19)30703-6 104349
                10.1016/j.dib.2019.104349
                6727036
                50a3489a-a432-40fb-b09a-c02b9f56f6e2
                © 2019 The Author(s)

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                20 June 2019
                23 July 2019
                25 July 2019
                Categories
                Biochemistry, Genetics and Molecular Biology

                Keywords: Apis mellifera ligustica, Nosema ceranae, midgut, transcriptome
