AI Meets

CHEMISTRY-화학-化學

session 1 [11/07 Thu]

[Checking Session Timetable]

Speaker: Seok, Chaok (석차옥, chaok@snu.ac.kr, home)

Chaok Seok is a professor at the Department of Chemistry of Seoul National University. She received her BS in chemistry from Seoul National University and her Ph.D. from the University of Chicago. Her research interests include developing methods for protein structure prediction and for predicting interactions of proteins with other molecules.

Title: Prediction of Protein Structure and Interaction by Physics and Informatics

Abstract:

The protein structure prediction problem has challenged theoretical and computational physical scientists since the first protein structure was published in 1958. There has been steady progress in protein structure prediction since then, but the major contributions came from informatics-based approaches rather than from physics-based approaches. Recently, DeepMind’s AlphaFold made a further contribution by introducing deep learning to extract structural information from large sequence databases. Physics-based approaches began to make meaningful contributions only in 2012, in the field of structure refinement. However, the structural improvements achievable by refinement with current physics-based approaches are very limited because of both energy and sampling problems. To overcome this limitation, we are taking an approach that combines physics and informatics, including deep learning. We take similar approaches to predict interactions of proteins with other proteins or with small ligands, including short peptides and oligosaccharides. Our goal is to develop protein structure modeling techniques that can provide useful predictions even in the absence of available information, although currently available experimental data will play an important role in developing such techniques. Such modeling methods would be very useful in a wide range of biomedical research and drug discovery applications.

Speaker: Lee, Juyong (이주용, drfaust23@gmail.com, home)

  • Department of Chemistry, Kangwon National University, Assistant Professor

Title: Discovering Novel Fluorescent Molecules by Combining Machine-learning and Global Optimization

Abstract:

Fluorescent molecules are widely used for bio-imaging. They are attached to specific cell organelles and/or proteins, enabling observation of detailed structure and dynamics in the cell. Efficient fluorescent molecules must have a high quantum yield for effective bio-imaging. Here, we present a systematic approach to discovering novel fluorescent molecules that combines machine-learning and global optimization algorithms. We recast the problem of discovering novel fluorescent molecules with high-intensity emission light into a global optimization problem by using the oscillator strength of a molecule as an objective function for optimization.

A statistical model that predicts the excitation energies of a molecule and the associated oscillator strengths (the probabilities of absorption or emission of light in transitions between different energy states) was trained using the random forest algorithm. The PubChemQC database, which contains TD-DFT calculation results for 3.8 million known molecules, was used as the training set. The extended connectivity fingerprints of the molecules were used as input vectors. To optimize the oscillator strength of a molecule, a highly efficient global optimization algorithm called CSA was used. For CSA global optimization, the SMILES representation of a molecule was mapped to a 200-dimensional integer vector using the Natural Language Toolkit. After the CSA global optimization converged, we assessed the validity of our approach with quantum mechanical calculations: TD-DFT calculations were carried out to verify whether the novel molecules obtained by this procedure actually have high oscillator strengths.
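The mapping from a SMILES string to a fixed-length integer vector can be sketched as follows. This is a minimal illustration only: the abstract states that the Natural Language Toolkit was used, whereas this sketch swaps in a plain regular-expression tokenizer from the standard library. The token pattern, the padding value, and the function names are assumptions made for the example, not the speaker's implementation.

```python
import re

# Simple SMILES token pattern (an assumption, not the tokenizer used in the talk):
# bracket atoms, two-letter halogens, two-digit ring closures (%nn), then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

VECTOR_LENGTH = 200  # dimensionality quoted in the abstract
PAD_ID = 0           # hypothetical padding value


def tokenize(smiles):
    """Split a SMILES string into atom, bond, ring, and branch tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocabulary(smiles_list):
    """Assign each distinct token a positive integer id (0 is reserved for padding)."""
    tokens = sorted({tok for s in smiles_list for tok in tokenize(s)})
    return {tok: i + 1 for i, tok in enumerate(tokens)}


def encode(smiles, vocab, length=VECTOR_LENGTH):
    """Map a SMILES string to a fixed-length integer vector, right-padded with PAD_ID."""
    ids = [vocab[tok] for tok in tokenize(smiles)]
    if len(ids) > length:
        raise ValueError("SMILES has more tokens than the vector length")
    return ids + [PAD_ID] * (length - len(ids))


def decode(vector, vocab):
    """Invert encode(): drop padding and join the tokens back into a SMILES string."""
    inverse = {i: tok for tok, i in vocab.items()}
    return "".join(inverse[i] for i in vector if i != PAD_ID)
```

A global optimizer can then propose new integer vectors, decode them back to SMILES, and score each candidate with the trained predictor. Decoded strings are not guaranteed to be valid molecules, so a validity check would be needed in practice.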

Speaker: Ko, Junsu (고준수, junsuko@arontier.co, home)

  • CEO, Arontier Co., Ltd.

Title: Fleming: AI-driven Integrated Drug Discovery Platform

Abstract:

Over the last decade, developing a new drug has cost more than one billion dollars, and the chance of success has been quite low: one in five thousand. To make the drug development process more efficient in terms of time and cost, artificial intelligence has been actively introduced into and rigorously adopted in various fields of drug development: compound activity prediction, compound design, patient selection, and clinical trial design, to name just a few.

As these fields advance, more AI-based drug development platforms are being developed and deployed to overcome data shortages.

Fleming is an integrated and automated platform for time- and cost-efficient drug development. It uses protein structure prediction and genome-based target selection for efficiency. In Fleming, precise target protein structures, generated by AI-based protein structure prediction techniques, are used to pick candidate compounds and to design active compounds with a high chance of success. Genome-based analysis powered by Fleming AI then selects compounds by predicting their activity and toxicity.

These features of Fleming accelerate the new drug development process by combining various technologies: disease genome analysis, candidate compound selection, compound generation, activity prediction, and toxicity prediction.

session 2 [11/08 Fri]

[Checking Session Timetable]

Speaker: Jung, YounJoon (정연준, yjjung@snu.ac.kr, home)

YounJoon Jung is an associate professor at Seoul National University. He received his BS and MS in chemistry from Seoul National University and his Ph.D. from MIT. He worked as a Miller Fellow at the University of California, Berkeley. He uses statistical mechanical theories and computer simulation methods to reveal structures and dynamics of complex chemical systems, including glass transitions, ionic liquids, polymeric systems, and energy materials. His recent research interests include non-equilibrium ensemble methods and machine learning approaches for chemical dynamics.

Title: Delfos: deep learning model for prediction of solvation free energies in generic organic solvents

Abstract:

Prediction of aqueous solubilities or hydration free energies is an extensively studied area of machine learning applications in chemistry, since water is the sole solvent in living systems. For non-aqueous solutions, however, few machine learning studies have been undertaken so far, even though the solvation mechanism plays an important role in various chemical reactions. Here, we introduce Delfos (deep learning model for solvation free energies in generic organic solvents), a novel machine-learning-based QSPR method that predicts solvation free energies for various organic solute and solvent systems. The novelty of Delfos lies in two separate solvent and solute encoder networks that quantify structural features of the given compounds via word embedding and recurrent layers, augmented with an attention mechanism that extracts important substructures from the outputs of the recurrent neural networks. The predictor network then calculates the solvation free energy of a given solvent–solute pair using the features from the encoders. With results obtained from extensive calculations on 2495 solute–solvent pairs, we demonstrate that Delfos not only shows great potential, with accuracy comparable to that of state-of-the-art computational chemistry methods, but also offers information about which substructures play a dominant role in the solvation process.
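The encoder, attention, and predictor pipeline described above can be illustrated with a toy sketch. It uses generic dot-product attention pooling over per-token feature vectors, followed by a single linear layer. It is not the Delfos architecture itself: the query vectors, output weights, bias, and function names are all hypothetical, and in a real model the per-token vectors would come from trained embedding and recurrent layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, query):
    """Return (context, weights) for dot-product attention.

    hidden_states: list of per-token feature vectors (e.g. recurrent-layer outputs)
    query: hypothetical vector of the same length used to score each token
    """
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)]
    return context, weights


def predict(solvent_states, solute_states, solvent_query, solute_query, out_weights, bias):
    """Toy predictor: pool each molecule, concatenate the two vectors, apply one linear layer."""
    solvent_vec, solvent_attn = attention_pool(solvent_states, solvent_query)
    solute_vec, solute_attn = attention_pool(solute_states, solute_query)
    features = solvent_vec + solute_vec  # list concatenation
    return dot(features, out_weights) + bias, solvent_attn, solute_attn
```

The returned attention weights sum to one per molecule. Inspecting which tokens receive the largest weights is the kind of analysis that lets an attention-based model indicate which substructures matter most.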

References:

1. [Paper] Hyuntae Lim and YounJoon Jung, Delfos: deep learning model for prediction of solvation free energies in generic organic solvents, Chemical Science (2019), DOI: 10.1039/C9SC02452B

2. [Media] http://now.snu.ac.kr/47/3/1450

Speaker: Kim, Woo Youn (김우연, wooyoun@kaist.ac.kr, home)

    • EDUCATION

2004. BS in Chemistry (Physics), POSTECH

2009. Ph.D. in Chemistry, POSTECH

    • CAREER

2009 ~ 2010 Max-Planck-Institute of Microstructure Physics, Post-doctoral fellow

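2011 ~ 2015 Assistant Professor, Department of Chemistry, KAIST

2015 ~ present Associate Professor, Department of Chemistry, KAIST

2019 ~ present Member of the consultative committee, AI Drug Discovery Support Center, Korea Pharmaceutical and Bio-Pharma Manufacturers Association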

Title: AI-based Smart Molecular Design

Abstract:

The ultimate goal of chemistry is to make new molecules with desired properties. This is challenging because chemical space is very large and discrete, with a wide variety of molecules. For example, only about 10^8 molecules have been synthesized as potential drug candidates, whereas an estimated 10^60 molecules could exist. High-throughput virtual screening has attracted great attention but still requires considerable cost and time. In this talk, we propose a molecular generative model based on deep learning as an alternative. It is specialized in controlling multiple molecular properties simultaneously by embedding them in the so-called latent space. As a proof of concept, we will show that it can be used to generate a number of drug-like molecules with specific properties. We also apply it to the design of new molecules with promising binding energies for a specific target protein, yielding potential drug candidates that are not in the database.
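A quick calculation shows why exhaustive screening of chemical space is impractical. The two counts come from the abstract. The screening throughput is a hypothetical figure chosen only for illustration.

```python
from fractions import Fraction

SYNTHESIZED = 10 ** 8   # molecules synthesized as potential drug candidates (from the abstract)
ESTIMATED = 10 ** 60    # estimated number of possible molecules (from the abstract)

# Fraction of the estimated chemical space that has been synthesized so far.
covered = Fraction(SYNTHESIZED, ESTIMATED)

# Hypothetical virtual-screening throughput, assumed only for this example.
MOLECULES_PER_SECOND = 10 ** 6
SECONDS_PER_YEAR = 365 * 24 * 3600

# Whole years needed to evaluate every molecule once at that rate.
years_to_enumerate = ESTIMATED // (MOLECULES_PER_SECOND * SECONDS_PER_YEAR)

print(f"fraction covered: 1 in 10^{len(str(covered.denominator)) - 1}")
print(f"years to enumerate: about 10^{len(str(years_to_enumerate)) - 1}")
```

Even at a million molecules per second, enumeration would take on the order of 10^46 years, which is the motivation for generating candidates directly instead of screening them one by one.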