Advances in property-encoded molecular surface methods : PEST, PROLICSS, and pH-sensitive protein descriptors

Authors
Sundling, C. Matthew L.
Other Contributors
Breneman, Curt M.
Bennett, Kristin P.
Cramer, Steven M.
Embrechts, Mark J.
Kitchen, Douglas Bruce
Issue Date
2007-12
Keywords
Chemistry
Degree
PhD
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Abstract
Presented here are three property-encoded molecular surface-based methods: PEST, PROLICSS, and pH-sensitive protein descriptors. PEST (Property-Encoded Surface Translator) generates whole-molecule, alignment-free surface descriptors that capture and encode shape-resolved molecular surface property distributions. The PEST method expands on prior property-distribution-based descriptor technologies by incorporating surface shape information into its descriptors. The PEST algorithm quantifies the spatial qualities of the surface as internal distance measurements that separate points on the molecular surface. The distance information is convolved with the electronic surface properties of the surface points into two-dimensional shape-property histograms. The PEST algorithm, descriptors, sampling convergence, and software implementation are presented and then tested using a number of literature-sourced datasets. The PEST descriptors proved to be a reliable source of surface shape information not found in previous descriptor technologies, and to be vital to several QSAR/QSPR studies. PROLICSS (PROtein-LIgand Complex Surface Scoring) is a knowledge-based scoring function that uses complementary encoded surfaces to assess binding strength. This method captures pose-dependent binding interactions as a set of alignment-free interface surface property descriptors and autocorrelation fingerprints, which can be used for pose evaluation and pattern-matching analysis of ligand and receptor properties, and which offer the potential for rapid analysis in a high-throughput docking mode. The PROLICSS method and its descriptor and fingerprint generation are presented and then tested using a dataset of 121 protein-ligand and protein-protein complexes. k-Means and SVM clustering performance indicates that PROLICSS successfully captures relevant binding information and warrants further development into a full scoring function.
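As a rough illustration of the shape-property histogram idea described for PEST, the following Python sketch bins pairs of (distance between two surface points, property value at one of them) into a two-dimensional count grid. It is not the thesis's implementation: the random pair sampling, the function name `shape_property_histogram`, the bin edges, and the toy coordinates are all assumptions made for illustration.

```python
import math
import random


def shape_property_histogram(points, values, n_pairs, dist_edges, prop_edges, seed=0):
    """Bin (pair distance, surface property) samples into a 2D count grid.

    points     : list of (x, y, z) surface points (hypothetical input)
    values     : property value at each point, e.g. electrostatic potential
    dist_edges : ascending bin edges for the distance axis
    prop_edges : ascending bin edges for the property axis
    """
    rng = random.Random(seed)
    hist = [[0] * (len(prop_edges) - 1) for _ in range(len(dist_edges) - 1)]

    def bin_index(x, edges):
        # Return the index of the half-open bin [lo, hi) containing x, else None.
        for i in range(len(edges) - 1):
            if edges[i] <= x < edges[i + 1]:
                return i
        return None

    for _ in range(n_pairs):
        # Assumption: uniform random pairs stand in for the real sampling scheme.
        i, j = rng.sample(range(len(points)), 2)
        d_bin = bin_index(math.dist(points[i], points[j]), dist_edges)
        p_bin = bin_index(values[j], prop_edges)
        if d_bin is not None and p_bin is not None:
            hist[d_bin][p_bin] += 1
    return hist


# Toy data: four corners of a unit square with alternating property values.
toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
toy_values = [-0.5, 0.5, -0.5, 0.5]
toy_hist = shape_property_histogram(
    toy_points, toy_values, n_pairs=200,
    dist_edges=[0.0, 1.2, 2.0], prop_edges=[-1.0, 0.0, 1.0],
)
```

Each row of `toy_hist` corresponds to a distance bin and each column to a property bin, so the grid records how a surface property is distributed across different internal length scales without requiring any molecular alignment.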
The pH-sensitive protein descriptors are an adapted form of the PEST descriptors that use an electrostatic potential (EP) encoding of the protein surface that reflects pH. The electrostatic potential field used to encode protein surface properties was made pH sensitive through the use of ProPKa residue-level pKa estimation and a Henderson-Hasselbalch expression to assign appropriate atomic-level charges to each residue, weighted by protonation state. Presented is a method that assigns pH-dependent restricted CHELPG partial charges to individual residues using their estimated pKa and the pH level. PEST descriptors were generated from the pH-sensitive EP-encoded protein surfaces and subsequently used in a QSPR study to predict chromatographic behavior in ion exchange systems. In addition to these three methods, a protein structure cleaning protocol is presented; it was used in both the PROLICSS and pH-sensitive descriptor studies to prepare raw protein data. Also presented are a review of the applications of wavelet technology in chemistry and cheminformatics and a theoretical study of the terahertz (THz) vibrational spectrum of 2,4-DNT.
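The protonation-state weighting can be illustrated with a minimal Python sketch based on the Henderson-Hasselbalch relation, in which the protonated fraction of a titratable site is 1 / (1 + 10^(pH - pKa)). The linear blending of two charge sets, the function names, and the numeric charges below are assumptions for illustration only; they are not the thesis's parameters or actual CHELPG values.

```python
def protonated_fraction(pH, pKa):
    """Henderson-Hasselbalch fraction of a titratable site in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def blended_charges(q_protonated, q_deprotonated, pH, pKa):
    """Weight two per-atom charge sets by protonation state.

    Assumption: each atom's charge is a linear mix of its charge in the
    protonated and deprotonated forms of the residue.
    """
    f = protonated_fraction(pH, pKa)
    return [f * qp + (1.0 - f) * qd for qp, qd in zip(q_protonated, q_deprotonated)]


# Hypothetical two-atom fragment; the charges are placeholders, not CHELPG output.
q_prot = [0.40, -0.40]
q_deprot = [-0.10, -0.90]
charges_at_pka = blended_charges(q_prot, q_deprot, pH=6.5, pKa=6.5)
```

When the pH equals the estimated pKa, the two forms are equally populated and each blended charge is the midpoint of the two input charges; at a pH well below the pKa the result approaches the protonated charge set.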
Description
December 2007
School of Science
Department
Dept. of Chemistry and Chemical Biology
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.