Results and discussion
3.1. General results
The validation results are given in Table 3. It is appropriate to compare them with the corresponding values from the previous study of the small library of literature spectra, where the principal value was 73% TP [12]. Here the corresponding result for the literature sublibrary is 75% (Table 3), which is very close to the previous value. The two other sublibraries are new in the second version.
The experimental sublibrary includes 54 ESI and MALDI product-ion mass spectra recorded in our laboratory for 6–7 cyclic peptides. This implies 7.7 replicate spectra available per compound. This value is relatively high, which is essential for obtaining true search results. Indeed, it is known that both the TPR and the mean MF tend to rise with the number of replicate ‘unknown’ and reference spectra acquired for a unique compound [10, p.186; 11]. This is explained by the increasing probability of matching spectra of the same compound. Searches performed only within this spectral subset, i.e. when both test and reference spectra were experimental, led to 100% TP (Table 3). The average MF in these searches is within the range of 590–648, which is higher than the average value of 396 for the 1st rank spectra in all the test searches.
Both the ESI and the MALDI fragment spectra are evidently closer to each other within each of the two groups than between the groups. This resemblance seems to be due to the experimental conditions, which are rather similar within each group, although the collision energy and, in part, the laser shots/power are not the same. Nevertheless, ESI product-ion mass spectra also match MALDI ones rather well, and vice versa. As a result, in all the test searches the reference spectra of these two groups may substitute for one another, retaining a high outcome of 86–95% initial TP obtained for both groups without their elimination.
It should be added that MALDI product-ion spectra are very seldom entered in mass spectral libraries. To the best of our knowledge, the only exception is the group of MALDI ToF–ToF spectra obtained by means of high-energy collision-induced dissociation and incorporated in MassBank [38]. Our library is the first one containing MALDI LIFT ToF/ToF spectra (see also [39]).
Another part of the library consists of ‘one-dimension’ spectra. Searches performed only within this type of data resulted in a low rate of 36% TP. We explain this low value by frequent false matches between ‘one-dimension’ spectra of different compounds, which arise because many fragment ions have the same m/z and their peaks have the same 100% intensity.
Basic validation results refer to the sum of the sublibraries and to searches in the entire library. The overall TPR of 70% is not very high. The rate is diminished by the unconventional contribution of defective/‘one-dimension’ data. Without these data, correct results, i.e. 1st rank above-threshold matches, appeared in 88% of cases (Table 3). However, incorporation of the ‘one-dimension’ subset into the set of reference spectra alone increases the rate of TP from 88% to 91% (Table 3). We think that this effect is due to the rising total probability of a spectral match for the same compound as the number of replicate spectra increases (see below). It is also clear that false matches between ‘one-dimension’ spectra cannot occur here.
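The criterion used above, a correct result being a 1st rank above-threshold match, can be illustrated with a minimal Python sketch. The names `Hit`, `is_true_positive` and `true_positive_rate`, the toy search results and the threshold value of 300 are assumptions introduced only for illustration; they are not the software, data or threshold used in this work.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One library hit returned for a test spectrum (hypothetical structure)."""
    compound: str        # compound assigned to the reference spectrum
    match_factor: float  # MF reported by the search


def is_true_positive(true_compound, hits, mf_threshold):
    """True if the 1st rank hit is the correct compound and its MF reaches the threshold."""
    if not hits:
        return False
    best = max(hits, key=lambda h: h.match_factor)  # 1st rank = highest MF
    return best.compound == true_compound and best.match_factor >= mf_threshold


def true_positive_rate(searches, mf_threshold):
    """Percentage of test searches that end in a true positive."""
    if not searches:
        return 0.0
    n_tp = sum(is_true_positive(c, hits, mf_threshold) for c, hits in searches)
    return 100.0 * n_tp / len(searches)


# Toy data, not taken from Table 3; the threshold of 300 is a placeholder.
toy_searches = [
    ("cmpd_A", [Hit("cmpd_A", 620), Hit("cmpd_B", 410)]),  # correct, above threshold
    ("cmpd_B", [Hit("cmpd_B", 590)]),                      # correct, above threshold
    ("cmpd_C", [Hit("cmpd_D", 500), Hit("cmpd_C", 480)]),  # wrong compound at 1st rank
    ("cmpd_D", [Hit("cmpd_D", 250)]),                      # correct but below threshold
]
print(true_positive_rate(toy_searches, mf_threshold=300))  # 50.0
```

In this sketch a search with the correct compound at the 1st rank but with an MF below the threshold is not counted as a true positive, which mirrors the wording ‘1st rank above-threshold matches’.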
We consider the rate of 88–91% as the principal one for the second version of this library. This rate is higher than the previous one (73%) and corresponds to a good efficiency level of modern libraries [12]. The difference between the two rates of correct answers seems to be caused primarily by the increase in the average number of library spectra per compound: the spectra/compound ratio rose from 2.7 to 4.5 when the library was enlarged from the initial version of 75 spectra to the current 263 spectra.
3.2. Different compounds
Validation results and the related identification potential of the library are not the same for the different compounds under study. Taken as a whole, the trend of increasing TPR with the number of replicates (see above) is also observed for individual compounds, although this correlation is not strong (Fig. 2). Test searches for compounds having 8 or more reference spectra of the same precursor ion (the [M+H]+ ions of microcystins-LR, -LA, -YR, α- and β-amanitin, phalloidin and also the [M+2H]2+ ion of microcystin-RR) without ‘one-dimension’ data led to a high TPR of 96%. Here it is worth noting the correlation between two quantities, the number of spectra in the library and the abundance of a chemical compound as measured by the number of corresponding publications (Fig. 3). Thus identification of such abundant compounds based on this library is certainly reliable.
Target identification of those analytes with the use of HRMS1 and measurement of the accurate mass of protonated molecules may also be reliable. Every abundant cyclopeptide under consideration has a unique molecular formula and therefore a molecular mass that discriminates it from the other peptides of this group. Moreover, the number of known compounds with the same formulas is relatively small, as estimated from the Chemical Abstracts database (Table 4).
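The accurate-mass check mentioned above can be sketched as follows. This is an illustrative Python example, not the procedure used in this work: the function names, the 5 ppm tolerance and the measured m/z values are assumptions, and the molecular formula of microcystin-LR (C49H74N10O12) is taken from general chemical knowledge rather than from this section.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, rounded to 8 decimals.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}
PROTON = 1.00727647  # mass of a proton (u)


def parse_formula(formula):
    """Parse a simple formula such as 'C49H74N10O12' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def protonated_mz(formula, charge=1):
    """Theoretical m/z of the [M+zH]z+ ion."""
    return (monoisotopic_mass(formula) + charge * PROTON) / charge


def within_ppm(measured_mz, theoretical_mz, tol_ppm):
    """True if the measured m/z agrees with theory within tol_ppm."""
    return abs(measured_mz - theoretical_mz) / theoretical_mz * 1e6 <= tol_ppm


# Assumed formula of microcystin-LR; the tolerance of 5 ppm is illustrative.
mz_theory = protonated_mz("C49H74N10O12")
print(round(mz_theory, 4))                     # 995.556
print(within_ppm(995.557, mz_theory, 5.0))     # True (about 1 ppm off)
```

A measured value passing such a tolerance check supports one molecular formula only; as the text notes, the discriminating power comes from each abundant cyclopeptide having a unique formula and from the small number of known compounds sharing it.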