Introduction
Bread wheat (Triticum aestivum L.) is an essential component of the global food security mosaic, providing nearly one-fifth of the total calories consumed by the world's population [1]. Since the mid-twentieth century, breeding has contributed to improving wheat production by increasing yield and cultivated area alongside the growth of the world's population. However, recent studies indicate that wheat yields are stagnating at a global level [2, 3], which contrasts with the demand for agricultural crops, projected to almost double by 2050 [4]. Therefore, raising the yield potential and stabilizing yields against the damaging effects of climate change are top priorities for agricultural science [2, 5]. In this context, a better understanding of the wheat genome is central to unlocking the full potential of natural genetic variation and to developing more effective breeding strategies for crop improvement.
Sequencing the bread wheat genome has always been a challenging task because of its size and complexity, which result from its allohexaploid nature (2n = 6x = 42, AABBDD) and high content of repetitive DNA [6–8]. Bread wheat originated approximately 8000 years ago through spontaneous interspecific hybridization (and subsequent chromosome duplication) between domesticated emmer T. turgidum ssp. dicoccon (2n = 4x = 28, AABB) and diploid Aegilops tauschii (2n = 2x = 14, DD) [9]. The combination of three large diploid genomes resulted in a haploid wheat genome of about 17 Gb [10], roughly 40 times larger than the rice genome [11]. This magnitude reflects not only the sum of orthologous gene copies but also a large amount of repetitive sequence, which is estimated to represent 80% of the whole wheat genome [7, 8].
Different approaches have been adopted to circumvent these limitations and, as a result, wheat genomics has moved forward steadily, although more slowly than the genomics of other crops such as rice. Most of this progress has relied on comparative genomics among grasses and on the use of diploid progenitors to gain knowledge of bread wheat. The advent of high-throughput sequencing technologies, so-called next-generation sequencing (NGS), has had a remarkably positive effect on sequencing capabilities, in terms of both speed and depth, at an economically accessible cost [12]. This has prompted the development of genome sequencing projects for a broad range of organisms, including first draft versions for several members of the Triticeae, such as bread wheat [13, 14], the A genome donor T. urartu [15], and the D genome donor Ae. tauschii [16]. These studies provided valuable estimates of gene content, putative gene order and genome organization. However, given the preliminary status of these drafts of such large and complex genomes, they may reveal only part of the full set of genes present in wheat. At the same time, the A and D donor genomes may not reflect the current genome architecture of modern cultivated wheat, owing, for instance, to reduction and rearrangements.