Dr. David E. Konerding, Ph.D.
dek@konerding.com

EMPLOYMENT AND EDUCATION
Aug 2019 - Present: Staff Engineer, Platforms Machine Learning, Google Inc
Aug 2018 - Aug 2019: Senior Lead Data Engineer and Interim Head of Data Engineering, Insitro
Dec 2013 - Aug 2018: Staff Engineer, Infrastructure, Google Inc
Aug 2008 - Dec 2013: Senior Engineer, Infrastructure, Google Inc
Aug 2007 - Aug 2008: Senior System Architect for Research Computing, Genentech Inc
Aug 2003 - Aug 2007: Computer Scientist, Lawrence Berkeley National Laboratory, Computational Research Division, Distributed Systems Department
Oct 2002 - Aug 2003: Postdoctoral Scholar, University of California at Berkeley, Department of Plant and Microbial Biology, Steven Brenner Laboratory
Feb 2002 - Oct 2002: Postdoctoral Scholar, University of California at Berkeley, Department of Bioengineering, Kimmen Sjolander Laboratory
2000 - 2002: Programmer/Analyst III, University of California at San Francisco, Computer Graphics Laboratory
1995 - 2001: PhD candidate, University of California at San Francisco, Graduate Group in Biophysics. PhD (Biophysics) awarded January 2001
1991 - 1995: Undergraduate, University of California at Santa Cruz. B.A. (Biochemistry and Molecular Biology) awarded June 1995

PATENTS
Storing genetic data in a storage system US10720231/10354748
Opportunistic job processing of input data divided into partitions of different sizes US10169728/9535765

CURRENT JOB FOCUS AND INTERESTS
Design and develop a highly scalable software infrastructure product. Advise numerous internal projects on high performance scientific computing. Contribute to academic scientific research projects.

SKILLS
o Accomplished scientist, software engineer and system administrator. Highly skilled in developing highly scalable services for scientific computing.
o Undergraduate, PhD and postdoc experience in computational biology, structural biology, and bioinformatics.
o Highly experienced programmer with deep knowledge of open source software development toolchains and operating systems. Extensive experience managing Linux servers including customizing distributions for high-performance scientific computing.
o Deep knowledge and experience designing, customizing, implementing and tuning Linux clusters including high-performance networked file systems, network interconnect, and batch queuing.
o Extensive experience in data modeling and query design to maximize RDBMS performance using the open source relational databases MySQL and PostgreSQL.
o Primary programming languages: C++ and Python with skill in Java and Perl, and familiarity with all other major programming languages.

EXPERIENCE
Aug 2008 - present: Google Inc, Senior Engineer, Infrastructure. Proposed, designed and actively developing a product as Technical Lead.
Aug 2007 - Aug 2008: Genentech Inc, Senior System Architect for Research Computing. Provided architectural oversight for Genentech Research Computing and deployed solutions for scientists. Frequently did performance firefighting and feature enhancements to ensure Genentech scientists were highly productive.
Aug 2003 - Aug 2007: Computer Scientist, Distributed Systems Department, Computational Research Division, Lawrence Berkeley National Lab. Lead developer of ViCE (a visual programming/scientific workflow execution system, based on Grid and Service Oriented Architecture standards) and C-BEI (a scientific computational workflow management infrastructure). Performed research into computational grids, including grid system administration and scientific workflow management. Tuned applications and utilized high-performance computational grids (composed of multiple clusters with 500-1000 nodes each) to enable scale-out of large computational biology workflows. Solved large-scale scientific problems using high performance computing in collaboration with domain scientists.
Oct 2002 - Aug 2003: Postdoctoral Scholar, Steven Brenner Lab, Department of Plant and Microbial Biology, University of California, Berkeley. Evaluated and applied comparative protein structure modeling methods for the Berkeley Structural Genomics Center. Managed the group's 64-processor Linux cluster, including application deployment and performance tuning.
Feb 2002 - Oct 2002: Postdoctoral Scholar, Kimmen Sjolander Lab, Department of Bioengineering, University of California, Berkeley. Validated and improved the specificity of subfamily hidden Markov models (sHMMs) for functional classification and remote homolog detection. Implemented a 16-processor Linux cluster for high-throughput protein structure prediction and functional annotation.
2000 - 2002: Programmer/Analyst III, Computer Graphics Lab, University of California, San Francisco. Developed the Chimera Collaboratory (http://www.cgl.ucsf.edu/Research/collaboratory). Maintained the Linux port of Chimera (http://www.cgl.ucsf.edu/chimera). Administered and maintained various software packages including Apache, PHP, and MySQL in support of bioinformatics projects.
1995 - 2001: Graduate Student in the James lab, Graduate Group in Biophysics, University of California, San Francisco. PhD awarded December 2001: "Structure of a Cytarabine-substituted Okazaki Fragment Model". Determined the structures of gemcitabine- and cytarabine-substituted Okazaki fragment models using restrained molecular dynamics based on NMR-derived distance and torsional restraints. Designed and implemented a Linux cluster for large-scale molecular simulations of nucleic acids and proteins.
1991 - 1995: Department of Biochemistry and Molecular Biology, University of California, Santa Cruz. B.A. awarded June 1995. Undergraduate thesis submitted for completion of graduation at UCSC: "Prediction of Gene-encoding regions in E.Coli DNA using an Optimal Parse Method with Multiple Types of Evidence".

AWARDS AND ACHIEVEMENTS
o 2009, 2010 IEEE IPDPS Program Committee.
o 2006 LBNL LDRD: $90,000 grant to enhance C-BEI for parameter optimization workflows.
o 2005 LBNL LDRD: $90,000 grant to develop C-BEI, a computational workflow engine.
o 1998 ITD (Instructional Technologies Division) Personal Grant for $5000 to develop the MidasMovie 3.0 application. Grant used to purchase a dual PII 400MHz PC as a development workstation.
o 1997 ITD (Instructional Technologies Division) Personal Grant for $5000 to develop the MidasMovie 2.0 application. Grant used to purchase a dual Pentium Pro 200MHz PC as a development workstation.
o September 1997-May 1998 U.C. Regents Fellow.

SCIENTIFIC PUBLICATIONS
Bharath Ramsundar, Steven Kearnes, Patrick Riley, Dale Webster, David Konerding, Vijay Pande. Massively multitask networks for drug discovery. arXiv preprint arXiv:1502.02072
Kai J Kohlhoff, Diwakar Shukla, Morgan Lawrenz, Gregory R Bowman, David E Konerding, Dan Belov, Russ B Altman, Vijay S Pande. Cloud-based simulations on Google Exacycle reveal ligand modulation of GPCR activation pathways. Nature Chemistry 2014, 6: 15–21
Patrick Conway, Michael D Tyka, Frank DiMaio, David E Konerding, David Baker. Relaxation of backbone bond geometry improves protein energy landscape modeling. Protein Science 2014, 23(1): 47-55
Chandonia J-M, Konerding DE, Allen DG, Choi I, Yokota H, Brenner SE. Computational Structural Genomics of a Complete Minimal Organism. Genome Informatics 2002, 13: 390-391
Konerding DE, James TL, Trump E, Soto AM, Marky LA and Gmeiner WH. NMR Structure of a Gemcitabine-Substituted Model Okazaki Fragment. Biochemistry, 2002, 41: 839-46.
Gmeiner WH; Cui W; Konerding DE; Keifer PA; Sharma SK; Soto AM; Marky LA; Lown JW. Shape-Selective Recognition of a Model Okazaki Fragment by Geometrically-Constrained bis-distamycins. Journal of Biomolecular Structure and Dynamics, 1999 Dec, 17(3): 507-18.
Gmeiner WH; Konerding D; James TL. Effect of Cytarabine on the NMR Structure of a Model Okazaki Fragment from the SV40 Genome. Biochemistry, 1999 Jan 26, 38(4):1166-75.
Konerding DE; Cheatham TE 3rd; Kollman PA; James TL. Restrained Molecular Dynamics of Solvated Duplex DNA Using the Particle Mesh Ewald Method. Journal of Biomolecular NMR, 1999 Feb, 13(2):119-31.

TECHNICAL PUBLICATIONS
Konerding DE. Virtual Network Computing: Cross-Platform Remote Display and Collaboration Software. Journal of Molecular Graphics and Modeling, 2000 Apr, 17(2):151-4.
Konerding DE. The Ensemble/Legacy Chimera Extension: Standardized User and Programmer Interface to Molecular Ensemble Data and Legacy Modeling Programs. Pacific Symposium on Biocomputing 2000, 5:251-62
Gunter DK, Jackson KR, Konerding DE, Lee JR. Essential Grid Workflow Monitoring Elements. Conference on Grid Computing and Applications 2005

PAST SOFTWARE PROJECTS
o Technical lead and primary developer of ViCE, a visual programming environment for composing workflows from web and grid services: http://dsd.lbl.gov/gtg/projects/vice/
o Technical lead and primary developer of C-BEI, an execution framework for high-performance computational workflows composed of web and grid services: http://dsd.lbl.gov/gtg/projects/CBEI/
o Co-developer of pyGlobus and pyGridWare, software toolkits for grid computing using Python: http://dsd.lbl.gov/gtg/projects/pyGlobus/ http://dsd.lbl.gov/gtg/projects/pyGridWare/
o (Previously) Sole developer of PyML, a Python interface to Mathematica: http://www.konerding.com/~dek/pyml.html
o (Previously) Technical lead and primary developer of the Chimera Collaboratory, a collaborative extension to the molecular modeling application Chimera: http://www.cgl.ucsf.edu/Research/collaboratory/

REFERENCES
Dr. Daphne Koller, Insitro. daphne@insitro.com