To verify whether the unknown transcripts in the collembolan transcriptomes are species-specific, we performed a tBlastX analysis [43] of these unknowns against the other springtail transcriptome and against twenty collembolan transcriptomes from the 1KITE project [1, 69]. We found that 9,820 sequences for O. cincta (30.1%) and 16,565 sequences for F. candida (43.6%) did not show a Blast hit. These transcripts could be specific to a particular developmental stage or treatment, in which case they may not be expressed in the other transcriptomes. Alternatively, they could correspond to species-specific or strain-specific genes. Such genes are often called orphans, because they lack detectable homologs in any other species. They have been shown to be a universal feature of genomes and are most likely associated with developmental adaptations and interactions with the environment [26].
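The filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it assumes the search results are available in the standard 12-column BLAST tabular format, and the function name `queries_without_hits`, the E-value cutoff of 1e-5, and the toy contig names are all hypothetical choices made for illustration.

```python
from typing import Iterable, List


def queries_without_hits(
    query_ids: Iterable[str],
    tabular_lines: Iterable[str],
    max_evalue: float = 1e-5,  # assumed cutoff, not taken from the study
) -> List[str]:
    """Return the query IDs that have no hit at or below max_evalue.

    Assumes the standard 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    hit_queries = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        if float(fields[10]) <= max_evalue:  # column 11 holds the E-value
            hit_queries.add(fields[0])       # column 1 holds the query ID
    # Preserve the input order of the queries
    return [q for q in query_ids if q not in hit_queries]


# Toy data (hypothetical): four unknown contigs and two result rows
queries = ["contig1", "contig2", "contig3", "contig4"]
rows = [
    "contig1\tsubjA\t85.0\t120\t18\t0\t1\t360\t10\t369\t1e-30\t130",
    "contig3\tsubjB\t40.0\t30\t18\t0\t1\t90\t5\t94\t2.5\t25.0",
]

orphans = queries_without_hits(queries, rows)
fraction = len(orphans) / len(queries)
print(orphans, f"{fraction:.1%}")
```

In this toy example, contig3 is reported as lacking a hit because its only alignment exceeds the assumed E-value cutoff; the proportion of such sequences is therefore sensitive to the threshold chosen.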
Finally, the annotation analysis identified 739 unique enzyme codes associated with F. candida contigs and 668 enzyme codes associated with O. cincta contigs. Plotting these codes onto metabolic pathways in iPATH 2.0 [70] indicated that the majority of genes involved in essential metabolic pathways are present in both transcriptomes (S3 Fig). For both organisms, the best-represented KEGG pathways are ‘Purine metabolism’, ‘Pyrimidine metabolism’ and ‘Oxidative phosphorylation’.