Ahmet Sinan Yavuz, Uğur Sezerman
5th International Symposium on Health Informatics and Bioinformatics (HIBIT), April 20-22, 2010, Antalya, Turkey
Publication year: 2010

SUMOylation, a post-translational modification, is one of the vital processes in protein maturation and function. Determining a protein's SUMOylation status is important for determining that protein's function, nuclear localization, and intra-nuclear spatial association. Many current predictors use the consensus motif ΨKxE (where Ψ is a large aliphatic branched hydrophobic amino acid and x is any amino acid) to predict the location of SUMO modification. However, approximately 23% of validated SUMOylation sites do not conform to the consensus motif, which complicates the prediction of SUMOylation sites. Here we present a new method, SUMOtr, that uses both structure and sequence information. This study investigates the role of protein volume, structural motifs, and hydrophobicity of the amino acids in the vicinity of the central lysine in predicting SUMOylation sites with tree classification algorithms. A comparison between SUMOtr and previous methods shows that SUMOtr achieves a higher correlation coefficient and sensitivity. With Decision Stump tree classification, the method achieves an overall performance of 85% accuracy, 75% specificity, 95% sensitivity, and a correlation coefficient of 0.72.
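To illustrate the motif-based baseline that the abstract contrasts with SUMOtr, the following is a minimal sketch of a ΨKxE scanner. It is not the SUMOtr method, and it uses no structural features. The function name `find_consensus_lysines` is hypothetical, and the choice of I, L, and V for Ψ is an assumption, since the abstract does not list the residues.

```python
import re

# Assumption: Ψ (large aliphatic branched hydrophobic residue) is taken
# to be I, L, or V. The abstract does not enumerate these residues.
PSI_RESIDUES = "ILV"

# A lookahead is used so that overlapping motif occurrences are all found.
# [A-Z] stands in for "x", any amino acid in one-letter code.
CONSENSUS = re.compile(rf"(?=([{PSI_RESIDUES}]K[A-Z]E))")


def find_consensus_lysines(sequence: str) -> list[int]:
    """Return 1-based positions of lysines that sit inside a ΨKxE match."""
    seq = sequence.upper()
    # m.start() is the 0-based index of Ψ; the lysine is the next residue,
    # so its 1-based position is m.start() + 2.
    return [m.start() + 2 for m in CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    print(find_consensus_lysines("MALKDEAAVKTEGG"))
```

A scanner of this kind cannot report lysines outside the motif, which is the limitation behind the roughly 23% of validated sites mentioned above that do not conform to the consensus.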