A machine learning approach for predicting Antibody Properties

Oche Alexander Egaji, Seamus Ballard-Smith, Ikram Asghar, Mark Griffiths

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review



This paper uses an amino acid location-based sequence encoding as a feature extraction technique to identify single-chain antibody molecules that bind to the B-lymphocyte stimulator (BLyS) antigen. The data were manually derived from the text of European patent EP2275449B1. The dataset was cleaned and made suitable for the machine learning models. The accuracy, precision and recall achieved across the individual descriptors (Membrane and Soluble) for Logistic Regression, KNN, KSVM, and Random Forest were above 80%. However, they were much lower for Naïve Bayes, except for the precision score. The promising accuracy achieved from such a minimal dataset has significant implications for the drug discovery process, including considerable savings in time and resources.
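The abstract does not spell out the encoding, so the following is only a minimal sketch of what a location-based (position-wise one-hot) encoding and the three reported metrics might look like. The function names `encode_sequence` and `binary_metrics`, the 20-letter alphabet, the fixed-length padding rule, and the toy inputs are all assumptions made for illustration; none of them are taken from the paper or from patent EP2275449B1.

```python
# Hypothetical sketch: position-wise one-hot encoding of amino acid sequences
# plus accuracy/precision/recall. Alphabet, padding rule, and names are
# assumptions for illustration, not the authors' implementation.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_sequence(seq, length):
    """Return a flat 0/1 vector with one 20-wide block per sequence position.

    Sequences longer than `length` are truncated; shorter ones are padded
    with all-zero blocks. Unknown symbols also map to an all-zero block.
    """
    width = len(AMINO_ACIDS)
    vec = [0] * (length * width)
    for pos, aa in enumerate(seq[:length]):
        if aa in INDEX:
            vec[pos * width + INDEX[aa]] = 1
    return vec


def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 labels, 1 = binder."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

Under these assumptions, each sequence becomes a fixed-length feature vector that could be passed to any of the classifiers named in the abstract, and `binary_metrics` would be applied separately to the Membrane and Soluble labels.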
Original language: English
Title of host publication: Proceedings of ICICM 2020 - 2020 10th International Conference on Information Communication and Management, Workshop
Subtitle of host publication: ICKET 2020 - 2020 9th International Conference on Knowledge and Education Technology
Place of Publication: Paris, France
Publisher: Association for Computing Machinery
Number of pages: 5
ISBN (Electronic): 978-1-4503-8770-5
ISBN (Print): 978-1-4503-8770-5
Publication status: Published - 28 Jun 2020

Publication series

Name: ACM International Conference Proceeding Series


  • Amino acid sequence
  • Antibody
  • Antigen
  • Infectious disease
  • Machine learning

