Wecomput: Accelerating Biologics Discovery with Deep Learning

Conventional biologics discovery, a decades-old approach that relies on experimental screening of large sequence libraries or on animal immunization, can sample only a small fraction of all possible amino acid sequences. This is where machine learning is powerful: it shifts discovery from bulk screening to data-driven, iterative engineering. For tasks where enough data are available, deep learning models can explore far more of that sequence space, making biologics discovery more efficient and design more rational. Wecomput Technology Co., Ltd. (Wecomput), a fast-growing AI-driven startup specializing in drug discovery, especially biologics, is committed to empowering drug discovery innovation by providing comprehensive computational solutions for biomedicine that integrate cutting-edge technologies such as artificial intelligence, biophysics, and high-performance computing.

In the early stage of therapeutic biologics discovery, a fundamental task is hit identification: generating a variety of proteins with the targeted activities or functions. As an alternative to the traditional wet-lab approach, Wecomput rationally designs proteins with specific 3D structures and physicochemical properties, using generative protein language models of both structure and sequence trained on large-scale data from known proteins. Depending on the objective, the proteins can be designed de novo or optimized from known templates. This computational approach is broadly applicable to the design of protein monomers, oligomers, antibodies, peptides, and more. Compared with experimental screens, computation-driven protein design generates many more diverse candidates, with cost and time reduced by a factor of tens to thousands. What’s more, it may open new opportunities for targets that were previously undruggable by experimental approaches alone.

Monoclonal antibody (mAb) therapeutics are often produced from non-human sources (typically murine) and can therefore provoke immunogenic responses in humans. Humanization procedures aim to produce antibody therapeutics that do not elicit an immune response and are safe for human use, without compromising efficacy. Humanization is normally carried out through a largely trial-and-error experimental process. To improve its efficiency and success rate, Wecomput has built a deep learning-based antibody humanness predictor: language models, trained on the large amount of repertoire data now available, that discriminate between human and non-human antibody variable domain sequences. Beyond this classification, the model can precisely suggest the forward or back mutations critical for engineering a sequence from non-human to human. In short, computation-driven humanization is an effective replacement for trial-and-error humanization experiments, producing better results in a fraction of the time.
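The idea of scoring humanness and proposing humanizing mutations can be illustrated with a deliberately simple sketch. It is not Wecomput's model: a per-position amino acid frequency profile stands in for the language model, and the function names, the toy sequences, and the `frozen_positions` parameter (positions such as CDR residues that should not be mutated) are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_human_seqs, pseudocount=1.0):
    """Per-position log-probabilities estimated from equal-length human sequences."""
    length = len(aligned_human_seqs[0])
    if any(len(seq) != length for seq in aligned_human_seqs):
        raise ValueError("sequences must be aligned to the same length")
    total = len(aligned_human_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_human_seqs)
        profile.append(
            {aa: math.log((counts[aa] + pseudocount) / total) for aa in AMINO_ACIDS}
        )
    return profile


def humanness_score(seq, profile):
    """Mean per-residue log-probability; a higher value means more human-like."""
    if len(seq) != len(profile):
        raise ValueError("sequence length does not match the profile")
    return sum(profile[i][aa] for i, aa in enumerate(seq)) / len(seq)


def suggest_mutations(seq, profile, frozen_positions=(), top_k=3):
    """Rank single substitutions by score gain, skipping frozen positions."""
    candidates = []
    for i, aa in enumerate(seq):
        if i in frozen_positions:
            continue
        best = max(AMINO_ACIDS, key=lambda a: profile[i][a])
        gain = profile[i][best] - profile[i][aa]
        if gain > 0:
            candidates.append((gain, i, aa, best))
    candidates.sort(reverse=True)
    return [(i, old, new) for _, i, old, new in candidates[:top_k]]


if __name__ == "__main__":
    # Toy, hypothetical data: four short "human" fragments and one query.
    human = ["EVQLVES", "EVQLVQS", "QVQLVES", "EVQLLES"]
    profile = build_profile(human)
    query = "EVKLQES"
    print(humanness_score(query, profile))
    print(suggest_mutations(query, profile, frozen_positions={4}))
```

A real predictor would replace the frequency profile with a trained sequence model and would have to check that suggested mutations preserve binding, which this sketch does not attempt.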

While therapeutic proteins have brought remarkable improvements to the treatment of numerous diseases, some of these products are associated with undesirable immunogenicity that may lead to reduced or lost efficacy, altered pharmacokinetics (PK), or even adverse clinical effects. Immunogenicity is an important factor in the success rate of clinical development of biological drugs, but it is difficult to predict from in vitro experiments. In silico prediction algorithms are therefore useful, yet most existing ones suffer from high false-positive rates. Wecomput has developed AlphaMHC, a deep learning-based immunogenicity prediction algorithm that provides a better solution for in silico immunogenicity risk assessment. It builds on a more comprehensive understanding of the biological mechanism of immunogenicity, integrates informative data from more dimensions and sources, and adopts state-of-the-art deep learning technologies. GPU acceleration makes training on tens of billions of data points highly efficient. As a result, AlphaMHC effectively reduces false positives during prediction, has been verified against clinical anti-drug antibody (ADA) data, and has proved useful in dozens of practical biologics R&D projects, earning high recognition from many of Wecomput's partners and customers.

The ability to train large, complex language models (such as the protein and antibody language models described above) in a reasonable amount of time has been, and will continue to be, critical to developing and iterating AI-driven drug design technologies in the real-time drug discovery pipeline. With GPU acceleration and high-performance computing, it is possible to build a more capable and more accurate model than a CPU-based system could deliver in an equivalent amount of time. That’s why Wecomput chose to build WeMol, a next-generation one-stop AI drug design software system that integrates Wecomput’s proprietary machine learning algorithms for biologics discovery and more, on NVIDIA accelerated computing architectures. With NVIDIA’s support, development has progressed smoothly and quickly. In many tasks, WeMol outperforms traditional experimental and computational methods in speed, accuracy, and efficiency.

WeMol is a next-generation software platform covering a variety of drug discovery scenarios, including the design of proteins, small molecules, RNA, and other modalities. It provides a browser-based GUI that is friendly to users who are not familiar with computation. What’s more, WeMol adopts a streaming architecture that lets professional users such as computational biologists, data scientists, and AI engineers develop new modules or assemble custom workflows in real time, in a low-code way. Since its release last year, WeMol has been recognized by hundreds of users from biopharmas, biotechs, universities, institutes, and hospitals.

Wecomput is now a member of the NVIDIA Inception program, which nurtures cutting-edge startups, and is working with NVIDIA to offer systems featuring NVIDIA RTX professional graphics cards (RTX A4000, A5000, and A6000).

Additionally, Wecomput offers NVIDIA DGX systems, providing a complete on-premises AI solution that lowers the barrier to accelerating biologics discovery with deep learning. Wecomput also looks forward to NVIDIA’s new, more powerful computing products, which will help it extend AI-driven drug discovery to more biologics, including multi-specific antibodies, mRNAs, cell therapeutics, and peptides, and deliver more efficient AI-powered computational services and products.

There are many relevant sessions at NVIDIA GTC, the developer conference for the era of AI and the metaverse, which takes place Sept. 19-22 and is free to attend. A few can’t-miss talks:

  1. Tuesday, Sept. 20, at 8:00 a.m. PT | NVIDIA CEO Jensen Huang’s keynote
  2. Tuesday, Sept. 20, at 11:00 a.m. PT | NVIDIA VP of Healthcare, Kimberly Powell, Special Address: The Rise of AI and Digital Twins in Healthcare
  3. Wednesday, Sept. 21, at 11:00 a.m. PT | NVIDIA Global Healthcare AI Startups Lead, Renee Yao, panel: Accelerate Healthcare and Life Science Innovation with Makers and Breakers