The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

Naihui Zhou; Yuxiang Jiang; Timothy R Bergquist; Alexandra J Lee; Balint Z Kacsoh; Alex W Crocker; Kimberley A Lewis; George Georghiou; Huy N Nguyen; Md Nafiz Hamid; Larry Davis; Tunca Dogan; Volkan Atalay; Ahmet S Rifaioglu; Alperen Dalkıran; Rengul Cetin Atalay; Chengxin Zhang; Rebecca L Hurto; Peter L Freddolino; Yang Zhang; Prajwal Bhat; Fran Supek; José M Fernández; Branislava Gemovic; Vladimir R Perovic; Radoslav S Davidović; Neven Sumonja; Nevena Veljkovic; Ehsaneddin Asgari; Mohammad R K Mofrad; Giuseppe Profiti; Castrense Savojardo; Pier Luigi Martelli; Rita Casadio; Florian Boecker; Heiko Schoof; Indika Kahanda; Natalie Thurlby; Alice C McHardy; Alexandre Renaux; Rabie Saidi; Julian Gough; Alex A Freitas; Magdalena Antczak; Fabio Fabris; Mark N Wass; Jie Hou; Jianlin Cheng; Zheng Wang; Alfonso E Romero; Alberto Paccanaro; Haixuan Yang; Tatyana Goldberg; Chenguang Zhao; Liisa Holm; Petri Törönen; Alan J Medlar; Elaine Zosa; Itamar Borukhov; Ilya Novikov; Angela Wilkins; Olivier Lichtarge; Po-Han Chi; Wei-Cheng Tseng; Michal Linial; Peter W Rose; Christophe Dessimoz; Vedrana Vidulin; Saso Dzeroski; Ian Sillitoe; Sayoni Das; Jonathan Gill Lees; David T Jones; Cen Wan; Domenico Cozzetto; Rui Fa; Mateo Torres; Alex Warwick Vesztrocy; Jose Manuel Rodriguez; Michael L Tress; Marco Frasca; Marco Notaro; Giuliano Grossi; Alessandro Petrini; Matteo Re; Giorgio Valentini; Marco Mesiti; Daniel B Roche; Jonas Reeb; David W Ritchie; Sabeur Aridhi; Seyed Ziaeddin Alborzi; Marie-Dominique Devignes; Da Chen Emily Koo; Richard Bonneau; Vladimir Gligorijević; Meet Barot; Hai Fang; Stefano Toppo; Enrico Lavezzo; Marco Falda; Michele Berselli; Silvio C E Tosatto; Marco Carraro; Damiano Piovesan; Hafeez Ur Rehman; Qizhong Mao; Shanshan Zhang; Slobodan Vucetic; Gage S Black; Dane Jo; Erica Suh; Jonathan B Dayton; Dallas J Larsen; Ashton R Omdahl; Liam J McGuffin; Danielle A Brackenridge; Patricia C Babbitt; Jeffrey M Yunes; Paolo Fontana; Feng Zhang; Shanfeng Zhu; Ronghui You; 
Zihan Zhang; Suyang Dai; Shuwei Yao; Weidong Tian; Renzhi Cao; Caleb Chandler; Miguel Amezola; Devon Johnson; Jia-Ming Chang; Wen-Hung Liao; Yi-Wei Liu; Stefano Pascarelli; Yotam Frank; Robert Hoehndorf; Maxat Kulmanov; Imane Boudellioua; Gianfranco Politano; Stefano Di Carlo; Alfredo Benso; Kai Hakala; Filip Ginter; Farrokh Mehryary; Suwisa Kaewphan; Jari Björne; Hans Moen; Martti E E Tolvanen; Tapio Salakoski; Daisuke Kihara; Aashish Jain; Tomislav Šmuc; Adrian Altenhoff; Asa Ben-Hur; Burkhard Rost; Steven E Brenner; Christine A Orengo; Constance J Jeffery; Giovanni Bosco; Deborah A Hogan; Maria J Martin; Claire O'Donovan; Sean D Mooney; Casey S Greene; Predrag Radivojac; Iddo Friedberg

doi:10.1186/s13059-019-1835-8

Genome Biol. 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.

Authors

Naihui Zhou^{1,2}, Yuxiang Jiang³, Timothy R Bergquist⁴, Alexandra J Lee⁵, Balint Z Kacsoh^{6,7}, Alex W Crocker⁸, Kimberley A Lewis⁸, George Georghiou⁹, Huy N Nguyen^{1,10}, Md Nafiz Hamid^{1,2}, Larry Davis², Tunca Dogan^{11,12}, Volkan Atalay¹³, Ahmet S Rifaioglu^{13,14}, Alperen Dalkıran¹³, Rengul Cetin Atalay¹⁵, Chengxin Zhang¹⁶, Rebecca L Hurto¹⁷, Peter L Freddolino^{16,17}, Yang Zhang^{16,17}, Prajwal Bhat¹⁸, Fran Supek^{19,20}, José M Fernández^{21,22}, Branislava Gemovic²³, Vladimir R Perovic²³, Radoslav S Davidović²³, Neven Sumonja²³, Nevena Veljkovic²³, Ehsaneddin Asgari^{24,25}, Mohammad R K Mofrad²⁶, Giuseppe Profiti^{27,28}, Castrense Savojardo²⁷, Pier Luigi Martelli²⁷, Rita Casadio²⁷, Florian Boecker²⁹, Heiko Schoof³⁰, Indika Kahanda³¹, Natalie Thurlby³², Alice C McHardy^{33,34}, Alexandre Renaux^{35,36,37}, Rabie Saidi¹², Julian Gough³⁸, Alex A Freitas³⁹, Magdalena Antczak⁴⁰, Fabio Fabris³⁹, Mark N Wass⁴⁰, Jie Hou^{41,42}, Jianlin Cheng⁴², Zheng Wang⁴³, Alfonso E Romero⁴⁴, Alberto Paccanaro⁴⁴, Haixuan Yang^{45,46}, Tatyana Goldberg⁴⁷, Chenguang Zhao^{48,49,50}, Liisa Holm⁵¹, Petri Törönen⁵¹, Alan J Medlar⁵¹, Elaine Zosa⁵², Itamar Borukhov⁵³, Ilya Novikov⁵⁴, Angela Wilkins⁵⁵, Olivier Lichtarge⁵⁵, Po-Han Chi⁵⁶, Wei-Cheng Tseng⁵⁷, Michal Linial⁵⁸, Peter W Rose⁵⁹, Christophe Dessimoz^{60,61,62}, Vedrana Vidulin⁶³, Saso Dzeroski^{64,65}, Ian Sillitoe⁶⁶, Sayoni Das⁶⁷, Jonathan Gill Lees^{67,68}, David T Jones^{69,70}, Cen Wan^{71,69}, Domenico Cozzetto^{71,69}, Rui Fa^{71,69}, Mateo Torres⁴⁴, Alex Warwick Vesztrocy^{70,72}, Jose Manuel Rodriguez⁷³, Michael L Tress⁷⁴, Marco Frasca⁷⁵, Marco Notaro⁷⁵, Giuliano Grossi⁷⁵, Alessandro Petrini⁷⁵, Matteo Re⁷⁵, Giorgio Valentini⁷⁵, Marco Mesiti^{75,76}, Daniel B Roche⁷⁷, Jonas Reeb⁷⁷, David W Ritchie⁷⁸, Sabeur Aridhi⁷⁸, Seyed Ziaeddin Alborzi^{78,79}, Marie-Dominique Devignes^{78,80,79}, Da Chen Emily Koo⁸¹, Richard Bonneau^{82,83}, Vladimir Gligorijević⁸⁴, Meet Barot⁸⁵, Hai Fang⁸⁶, Stefano Toppo⁸⁷, Enrico Lavezzo⁸⁷, Marco Falda⁸⁸, Michele Berselli⁸⁷, Silvio C E Tosatto^{89,90}, Marco Carraro⁹⁰, Damiano Piovesan⁹⁰, Hafeez Ur Rehman⁹¹, Qizhong Mao^{92,93}, Shanshan Zhang⁹², Slobodan Vucetic⁹², Gage S Black^{94,95}, Dane Jo^{94,95}, Erica Suh⁹⁴, Jonathan B Dayton^{94,95}, Dallas J Larsen^{94,95}, Ashton R Omdahl^{94,95}, Liam J McGuffin⁹⁶, Danielle A Brackenridge⁹⁶, Patricia C Babbitt^{97,98}, Jeffrey M Yunes^{99,98}, Paolo Fontana¹⁰⁰, Feng Zhang^{101,102}, Shanfeng Zhu^{103,104,105}, Ronghui You^{103,104,105}, Zihan Zhang^{103,105}, Suyang Dai^{103,105}, Shuwei Yao^{103,104}, Weidong Tian^{106,107}, Renzhi Cao¹⁰⁸, Caleb Chandler¹⁰⁸, Miguel Amezola¹⁰⁸, Devon Johnson¹⁰⁸, Jia-Ming Chang¹⁰⁹, Wen-Hung Liao¹⁰⁹, Yi-Wei Liu¹⁰⁹, Stefano Pascarelli¹¹⁰, Yotam Frank¹¹¹, Robert Hoehndorf¹¹², Maxat Kulmanov¹¹², Imane Boudellioua^{113,114}, Gianfranco Politano¹¹⁵, Stefano Di Carlo¹¹⁵, Alfredo Benso¹¹⁵, Kai Hakala^{116,117}, Filip Ginter^{116,118}, Farrokh Mehryary^{116,117}, Suwisa Kaewphan^{116,117,119}, Jari Björne^{120,121}, Hans Moen¹¹⁸, Martti E E Tolvanen¹²², Tapio Salakoski^{120,121}, Daisuke Kihara^{123,124}, Aashish Jain¹²⁵, Tomislav Šmuc¹²⁶, Adrian Altenhoff^{127,128}, Asa Ben-Hur¹²⁹, Burkhard Rost^{47,130}, Steven E Brenner¹³¹, Christine A Orengo⁶⁷, Constance J Jeffery¹³², Giovanni Bosco¹³³, Deborah A Hogan^{6,8}, Maria J Martin⁹, Claire O'Donovan⁹, Sean D Mooney⁴, Casey S Greene^{134,135}, Predrag Radivojac¹³⁶, Iddo Friedberg¹³⁷

Affiliations

¹ Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.
² Program in Bioinformatics and Computational Biology, Ames, IA, USA.
³ Indiana University Bloomington, Bloomington, Indiana, USA.
⁴ Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA.
⁵ Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
⁶ Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
⁷ Department of Molecular and Systems Biology, Hanover, NH, USA.
⁸ Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
⁹ European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom.
¹⁰ Program in Computer Science, Ames, IA, USA.
¹¹ Department of Computer Engineering, Hacettepe University, Ankara, Turkey.
¹² European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.
¹³ Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey.
¹⁴ Department of Computer Engineering, Iskenderun Technical University, Hatay, Turkey.
¹⁵ CanSyL, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey.
¹⁶ Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
¹⁷ Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA.
¹⁸ Achira Labs, Bangalore, India.
¹⁹ Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain.
²⁰ Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
²¹ INB Coordination Unit, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Catalonia, Spain.
²² (former) INB GN2, Structural and Computational Biology Programme, Spanish National Cancer Research Centre, Barcelona, Catalonia, Spain.
²³ Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia.
²⁴ Molecular Cell Biomechanics Laboratory, Departments of Bioengineering, University of California Berkeley, Berkeley, CA, USA.
²⁵ Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Berkeley, CA, USA.
²⁶ Departments of Bioengineering and Mechanical Engineering, Berkeley, CA, USA.
²⁷ Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
²⁸ National Research Council, IBIOM, Bologna, Italy.
²⁹ University of Bonn: INRES Crop Bioinformatics, Bonn, North Rhine-Westphalia, Germany.
³⁰ INRES Crop Bioinformatics, University of Bonn, Bonn, Germany.
³¹ Gianforte School of Computing, Montana State University, Bozeman, Montana, USA.
³² University of Bristol, Computer Science, Bristol, Bristol, United Kingdom.
³³ Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Brunswick, Germany.
³⁴ RESIST, DFG Cluster of Excellence 2155, Brunswick, Germany.
³⁵ Interuniversity Institute of Bioinformatics in Brussels, Université libre de Bruxelles - Vrije Universiteit Brussel, Brussels, Belgium.
³⁶ Machine Learning Group, Université libre de Bruxelles, Brussels, Belgium.
³⁷ Artificial Intelligence lab, Vrije Universiteit Brussel, Brussels, Belgium.
³⁸ MRC Laboratory of Molecular Biology, Cambridge, United Kingdom.
³⁹ University of Kent, School of Computing, Canterbury, United Kingdom.
⁴⁰ School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom.
⁴¹ University of Missouri, Computer Science, Columbia, Missouri, USA.
⁴² Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
⁴³ University of Miami, Coral Gables, Florida, USA.
⁴⁴ Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom.
⁴⁵ School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Galway, Ireland.
⁴⁶ Technical University of Munich, Garching, Germany.
⁴⁷ Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universitat Munchen, Munich, Germany.
⁴⁸ Faculty for Informatics, Garching, Germany.
⁴⁹ Department for Bioinformatics and Computational Biology, Garching, Germany.
⁵⁰ School of Computing Sciences and Computer Engineering, Hattiesburg, Mississippi, USA.
⁵¹ Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland.
⁵² Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
⁵³ Compugen Ltd., Holon, Israel.
⁵⁴ Baylor College of Medicine, Department of Biochemistry and Molecular Biology, Houston, TX, USA.
⁵⁵ Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA.
⁵⁶ National TsingHua University, Hsinchu, Taiwan.
⁵⁷ Department of Electrical Engineering in National Tsing Hua University, Hsinchu City, Taiwan.
⁵⁸ The Hebrew University of Jerusalem, Jerusalem, Israel.
⁵⁹ University of California San Diego, San Diego Supercomputer Center, La Jolla, California, USA.
⁶⁰ Department of Computational Biology and Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.
⁶¹ Department of Genetics, Evolution & Environment, and Department of Computer Science, University College London, London, UK.
⁶² Swiss Institute of Bioinformatics, Lausanne, Switzerland.
⁶³ Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia.
⁶⁴ Jozef Stefan Institute, Ljubljana, Slovenia.
⁶⁵ Jozef Stefan International Postgraduate School, Ljubljana, Slovenia.
⁶⁶ Research Department of Structural and Molecular Biology, University College London, London, England.
⁶⁷ Research Department of Structural and Molecular Biology, University College London, London, United Kingdom.
⁶⁸ Department of Health and Life Sciences, Oxford Brookes University, London, UK.
⁶⁹ The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom.
⁷⁰ Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom.
⁷¹ Department of Computer Science, University College London, London, United Kingdom.
⁷² SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland.
⁷³ Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain.
⁷⁴ Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
⁷⁵ Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy.
⁷⁶ Institut de Biologie Computationnelle, LIRMM, CNRS-UMR 5506, Universite de Montpellier, Montpellier, France.
⁷⁷ Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universitat Munchen, Munich, Germany.
⁷⁸ University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.
⁷⁹ Inria, Nancy, France.
⁸⁰ University of Lorraine, Nancy, Lorraine, France.
⁸¹ Department of Biology, New York University, New York, NY, USA.
⁸² NYU Center for Data Science, New York, 10010, NY, USA.
⁸³ Flatiron Institute, CCB, New York, 10010, NY, USA.
⁸⁴ Center for Computational Biology (CCB), Flatiron Institute, Simons Foundation, New York, New York, USA.
⁸⁵ Center for Data Science, New York University, New York, 10011, NY, USA.
⁸⁶ Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
⁸⁷ Department of Molecular Medicine, University of Padova, Padova, Italy.
⁸⁸ Department of Biology, University of Padova, Padova, Italy.
⁸⁹ CNR Institute of Neuroscience, Padova, Italy.
⁹⁰ Department of Biomedical Sciences, University of Padua, Padova, Italy.
⁹¹ Department of Computer Science, National University of Computer and Emerging Sciences, Peshawar, Khyber Pakhtoonkhwa, Pakistan.
⁹² Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA.
⁹³ University of California, Riverside, Philadelphia, PA, USA.
⁹⁴ Department of Biology, Brigham Young University, Provo, UT, USA.
⁹⁵ Bioinformatics Research Group, Provo, UT, USA.
⁹⁶ School of Biological Sciences, University of Reading, Reading, England, United Kingdom.
⁹⁷ Department of Pharmaceutical Chemistry, San Francisco, CA, USA.
⁹⁸ Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA.
⁹⁹ UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, 94158, CA, USA.
¹⁰⁰ Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy.
¹⁰¹ State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, Shanghai, China.
¹⁰² Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.
¹⁰³ School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.
¹⁰⁴ Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.
¹⁰⁵ Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China.
¹⁰⁶ State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.
¹⁰⁷ Department of Pediatrics, Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
¹⁰⁸ Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA.
¹⁰⁹ Department of Computer Science, National Chengchi University, Taipei, Taiwan.
¹¹⁰ Okinawa Institute of Science and Technology, Tancha, Okinawa, Japan.
¹¹¹ Tel Aviv University, Tel Aviv, Israel.
¹¹² Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia.
¹¹³ Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
¹¹⁴ Computer, Electrical and Mathematical Sciences Engineering Division (CEMSE), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
¹¹⁵ Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy.
¹¹⁶ Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.
¹¹⁷ University of Turku Graduate School (UTUGS), Turku, Finland.
¹¹⁸ University of Turku, Turku, Finland.
¹¹⁹ Turku Centre for Computer Science (TUCS), Turku, Finland.
¹²⁰ Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.
¹²¹ Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland.
¹²² Department of Future Technologies, University of Turku, Turku, Finland.
¹²³ Department of Biological Sciences, Department of Computer Science, Purdue University, 47907, IN, USA.
¹²⁴ Department of Pediatrics, University of Cincinnati, Cincinnati, 45229, OH, USA.
¹²⁵ Department of Computer Science, Purdue University, West Lafayette, IN, USA.
¹²⁶ Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia.
¹²⁷ Department of Computer Science, ETH Zurich, Zurich, Switzerland.
¹²⁸ SIB Swiss Institute of Bioinformatics, Zurich, Switzerland.
¹²⁹ Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
¹³⁰ Institute for Food and Plant Sciences WZW, Technische Universität München, Freising, Germany.
¹³¹ University of California, Berkeley, CA, USA.
¹³² Biological Sciences, University of Illinois at Chicago, Chicago, Illinois, USA.
¹³³ Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
¹³⁴ Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
¹³⁵ Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, Pennsylvania, USA.
¹³⁶ Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA. predrag@northeastern.edu.
¹³⁷ Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA. idoerg@iastate.edu.

Abstract

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

Keywords: Biofilm; Community challenge; Critical assessment; Long-term memory; Protein function prediction.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Animals
Biofilms
Candida albicans / genetics
Drosophila melanogaster / genetics
Genome, Bacterial
Genome, Fungal
Humans
Locomotion
Memory, Long-Term
Molecular Sequence Annotation / methods
Molecular Sequence Annotation / trends*
Pseudomonas aeruginosa / genetics
