3.1. Identification of the date palm seed protein isolates by LC–MS/MS
Over 300 proteins were detected in the DSPC sample by LC–MS/MS, although not all identifications were considered significant (see below). Proteins were identified by comparing the MS/MS data with known sequences in the NCBI database using Mascot software, version 2.4 (Matrix Science Ltd, UK). This search returned 318 hits, each corresponding to a unique protein. The protein list was screened to remove contaminants (e.g. proteins that the database identified as occurring only in humans or animals). Because sample preparation for LC–MS/MS requires digestion with trypsin, this protein, which corresponded to hit number 1 (i.e. the most abundant protein), was excluded. A second protein, keratin (hit number 59), an animal protein found in hair, nails and skin, was also removed because it was considered a contaminant. Two criteria were used to assess the accuracy of the remaining identifications: the MOWSE (molecular weight search) score, and the requirement that each identification be based on at least two peptides matched to the predicted peptide map of the protein. MOWSE is a method, first developed by Pappin, Hojrup, and Bleasby (1993), that identifies proteins from the molecular weights of the peptides produced by proteolytic digestion of the sample and allows the probability of a correct identification to be calculated. Specifically, the method calculates the probability that the peptide has been misidentified during database searching, i.e. that the match is a random event. A low probability (P) of misidentification is required for a correct identification. Because a more reliable identification is conventionally expressed as a higher number, the probability of misidentification is converted to a MOWSE score using the formula,