Welcome to AgBioData!

 
We are a consortium of agricultural biological databases with the mission of consolidating standards and best practices for acquiring, displaying, and reusing genomic, genetic, and breeding (GGB) data (Harper et al., 2018). Formed in 2015, our consortium involves 44 GGB databases (the complete list here) and over 200 members, including database curators, researchers, librarians, and anyone else who works with agricultural data.

 
The AgBioData consortium embraces the Findable, Accessible, Interoperable, and Reusable (FAIR) principles to facilitate and maximize the accessibility and reuse of large-scale data in agricultural research. We recently received funding from the National Science Foundation (NSF) for a Research Coordination Network project that aims to:

  1. define community-based standards for FAIR agricultural data;
  2. expand our network by recruiting anyone who generates, uses, curates, archives, or publishes data;
  3. provide educational material to train scientists on FAIR data sharing;
  4. develop a roadmap for a sustainable GGB database ecosystem.

If you are interested in our work and want to be part of our community, you can:

Participating Databases and Resources

  • YamBase is a database containing breeding data for yam (genus Dioscorea). Yam species used for breeding include Dioscorea rotundata and Dioscorea cayenensis (both native to Africa and the major cultivated species), Dioscorea alata (native to Southeast Asia), and Dioscorea praehensilis, as well as several other species. YamBase contains phenotypic and genotypic data, as well as trial metadata from breeding programs in Africa.

  • The European Variation Archive is an open-access database of all types of genetic variation data from all species.

  • The Bio-Analytic Resources for Plant Biology (The BAR) houses web-based tools for visualizing functional genomics and other data.

  • SweetPotatoBase houses genomic and breeding resources for the improvement of sweet potato.

  • MusaBase is a breeding database that implements tools to manage data and next-generation breeding protocols based on high-density marker data for banana species, which are major food sources in Africa.

  • microPublication publishes brief articles on research findings presented in a single figure to rapidly disseminate experimentally sound results to the community. These can be novel findings, negative results, successful replications (including being scooped), or unsuccessful replications; perceived impact is not relevant. Each article is peer-reviewed and assigned a DOI, and its data are curated and deposited in third-party authoritative scientific databases upon publication.

  • The iBeetle-Base provides access to sequence information for all Tribolium castaneum genes, including genomic organization, mRNA, and CDS sequences, and results from omics experiments such as ATAC-seq and RNA-seq. It also makes available annotations of RNAi phenotypes, which were mainly gathered in a large-scale RNAi screen (iBeetle screen).

  • The Hymenoptera Genome Database is a genome informatics resource that supports the research of insects of the order Hymenoptera (e.g. bees, wasps, ants). HGD provides tools for data mining (HymenopteraMine), sequence searching (BLAST), genome browsing (JBrowse), genome annotation (Apollo), and data download.

  • FAANG is the Functional Annotation of ANimal Genomes project to understand the genotype-to-phenotype link in domesticated animals.

  • Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation.

  • Citrus Greening houses resources for studying the citrus greening disease (also called huanglongbing), which is devastating the citrus industry in the USA.

  • CassavaBase is a genetics, genomics, and breeding database resource for Cassava (Manihot esculenta), a major staple crop.

  • The Bovine Genome Database supports the efforts of bovine genomics researchers by providing data mining, genome navigation, and annotation tools for the bovine reference genome, which is based on the Hereford cow L1 Dominette 01449.

  • AgroPortal is the home of ontologies and semantic artifacts in agri-food and related domains.

  • The Enterprise Breeding System (EBS) is open-source breeding informatics software.

  • SorghumBase is a web portal for comparative plant genomics focused on Sorghum crop varieties.

  • The central goal of the Plant Metabolic Network (PMN) is to bring together biochemical pathway databases and research communities focused on plant metabolism.

  • The Pulse Crop Database (PulseDB) hosts genomics, genetics, and breeding resources for pulse crop improvement.

  • The Genome Database for Rosaceae is a curated and integrated web-based relational database that hosts resources for Rosaceae research discovery and crop improvement.

  • MaizeGDB is a community-oriented, long-term, federally funded informatics service to researchers focused on the crop plant and model organism Zea mays.

  • WheatIS provides a single-access, web-based system for the wheat research community.

  • GrainGenes, a database for Triticeae and Avena, is a comprehensive resource for molecular and phenotypic information.

  • AgBase is a curated, open-source, web-accessible resource for functional analysis of agricultural plant and animal gene products.

  • The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. 

  • SoyBase integrates genetic and genomic information to aid soybean breeders and researchers. 

  • The Sol Genomics Network (SGN) is a database and website dedicated to the genomic information of the nightshade family, which includes species such as tomato, potato, pepper, petunia, and eggplant.

  • The Planteome project takes an integrated approach, adopting common annotation standards and a set of reference ontologies for plants.

  • PeanutBase integrates genetic and genomic information across many Arachis species including cultivated peanut. 

  • The mission of the National Animal Disease Center (NADC) is to conduct basic and applied research on selected diseases of economic importance to the U.S. livestock and poultry industries.

  • The TreeGenes database provides custom informatics tools to manage the flood of information resulting from high-throughput genomics projects in forest trees, from sample collection to downstream analysis.

  • The i5k Workspace@NAL is a platform for communities around ‘orphaned’ arthropod genome projects to access, visualize, curate and disseminate their data. 

  • CottonGen is a cotton community genomics, genetics, and breeding database that enables basic, translational, and applied research in cotton. 

  • The Animal Quantitative Trait Loci (QTL) Database (Animal QTLdb) strives to collect all publicly available trait mapping data, i.e. QTL (phenotype/expression, eQTL), candidate gene and association data (GWAS), and copy number variations (CNV) mapped to livestock animal genomes, to facilitate locating and comparing discoveries within and between species.

  • The Genome Database for Vaccinium (GDV) houses and integrates genomic, genetic, and breeding data for blueberry, cranberry, and other Vaccinium species.

  • The Citrus Genome Database (CGD) houses resources for citrus genetics, genomics, breeding, and disease research.

  • The USDA-ARS Germplasm Resources Information Network (GRIN) provides information about USDA national collections of animal, microbial, and plant genetic resources (germplasm) important for food and agricultural production.

  • T3 is a repository for public wheat data generated by the Wheat Coordinated Agricultural Project (Wheat CAP). 

  • The Legume Information System (LIS) integrates genetic and genomic information to aid researchers working on legume crops and related species.

  • CyVerse provides scientists with powerful platforms to handle huge datasets and complex analyses, thus enabling data-driven discovery. Its extensible platforms provide data storage, bioinformatics tools, data visualization, interactive analyses, cloud services, APIs, and more.

  • Gramene is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species.