Reimagining a sustainable data network to
accelerate agricultural research and discovery
(Award Abstract # 2126334)
Project Summary
Fueled by the generation of larger and more complex genomic, genetic, and breeding (GGB) data, advances in agricultural science are increasingly led by data-driven discovery. When data are correctly stored, described, integrated, and shared, they can be easily found and reused, increasing their value for the global research community.
Ensuring high-quality agricultural data that comply with the Findable, Accessible, Interoperable, and Reusable (FAIR) principles is the mission of AgBioData. This RCN project will bring together the agricultural biological sciences to identify how to improve data management, curation, and stewardship at scale. We will create a network of GGB databases, researchers, scholarly publishers, and funders to advance data access, interoperability, reusability, and sustainability. The products of this network will expedite and facilitate research outcomes for AgBioData member database stakeholders, with the ultimate goal of enabling food security through agricultural research.
Project aims
Our overall mission to promote FAIR principles in agricultural research builds on four major steps:
- Establish working groups (WGs) to identify the current major issues related to data sharing and management and develop community standards for FAIR data in agriculture.
- Expand the network to include key stakeholders, including researchers, funders, and publishers, and work with them to refine and implement standards.
- Develop educational materials to train researchers about FAIR data standards and best practices.
- Create a roadmap for a sustainable integrated database ecosystem that honors the needs of individual research communities, including those for new and emerging species, while incorporating database standardization to promote a unified user experience across databases.
Project activities
Working Groups
The best way to identify the issues related to data archiving and management is through an open discussion among all the stakeholders that generate, use, curate, and manage data in agriculture. We currently have nine working groups focusing on different subjects related to FAIR data in agriculture. Working group participants are members of the global research community, including database scientists, agricultural researchers, data librarians, bioinformaticians, and publishers.
If you want to know more about our working groups, you can visit their webpage (see below), contact the working groups' chairs, or email agbiodata@gmail.com.
Current Working Groups
- Data Federation Training
- Data Reuse
- Diversity, Equity, and Inclusion (DEI) Recruiting
- Education
- FAIR Scientific Literature
- Phenotypic Data Management and Standardization
- scRNASeq Biocuration
- Standards for Genetic Variation
- Sustainability
Past Working Groups
- Data Federation
- Genome Assembly and Annotation Nomenclature
- Genotype-Phenotype Data
- Ontologies
- Pan-genomes
Workshops
AgBioData 2022 Community Workshop
To ensure effective communication among member databases on FAIR data issues, we held a free three-day online workshop on March 15-17, 2022. The goal of the all-hands meeting was to provide a forum for the working groups to pose questions and gather feedback from the AgBioData community, including members of the other working groups. Each day had a two-hour session, with short presentations from selected working groups at the beginning, followed by breakout sessions, where WG and non-WG members met and discussed relevant topics, and a brief reporting period at the end.
You can find the talk schedule and recording links here!
AgBioData Workshop at PAG 30 (San Diego, CA)
On January 13th, 2023, we held a 2-hour workshop at the Plant & Animal Genome Conference (PAG) 30 in San Diego (CA). In this workshop, we introduced the AgBioData consortium; shared challenges, recommendations, and resources developed by six AgBioData working groups; and provided preliminary insights on the long-term sustainability of the AgBioData database resources. The abstracts and slides of the WG talks are available here.
AgBioData 2023 Community Workshop
We held a two-day workshop in Chicago (IL) on May 1-2, 2023, with the main objective of bringing together the WG members and member database resources to share the progress and achievements made by the WGs and discuss the future directions of the consortium. The event was hybrid, with in-person and virtual participation. We had presentations from current and past WGs about their work and accomplishments, followed by breakout sessions focusing on questions and issues suggested by the current WGs. For the in-person participants, we also organized team coaching activities and round table discussions on the future of the AgBioData consortium. The session recordings and the presentation slides are available here.
Bringing FAIR Data into the Classroom
On Friday, August 4th, 2023, at 10 AM ET, we held a 30-minute virtual workshop during the Plant Biology 2023 Conference (Savannah, GA). In this workshop, we introduced the open-access educational curriculum created by the AgBioData Education working group, which covers how to explore the current plant biology database ecosystem and communicates concrete recommendations for FAIR data handling. This workshop featured presentations about the curriculum and a question-and-answer session to explore curriculum and educational gaps that still need to be filled. More details are available here.
Plant Bioinformatics Resources for FAIR Agricultural Data Discovery and Reuse
On Sunday, August 6th, 2023, at 10:30 AM ET, we held a two-hour workshop about the digital bioinformatics resources available in plant biology. In particular, this workshop introduced plant biologists to, and updated them on, community databases, data repositories, online bioinformatics tools, and knowledge bases that provide equitable access to structured, integrated data and analytic tools. This workshop also featured overview presentations from the AgBioData RCN and member digital resources, followed by a panel discussion on community needs, challenges, and database support for FAIR data compliance. The meeting agenda and more information are available here.
AgBioData Workshop at PAG 31 (San Diego, CA)
On January 12th, 2024, we held a 2-hour workshop at the Plant & Animal Genome Conference (PAG) 31 in San Diego (CA). In this workshop, we presented the outcomes from four of our working groups (i.e., Data Federation Training; FAIR Scientific Literature; Standards for Genetic Variation; and Sustainability). We also presented two talks from AgBioData early career researchers on their work regarding data reuse and FAIRification of germplasm bank resources. The abstracts and slides of the WG talks are available here.
Surveys
Baseline surveys on the standardized data curation principles and their implementation in GGB data repositories
In Year 1 of our NSF RCN grant (Award Abstract # 2126334), the AgBioData Consortium, in partnership with Washington State University, surveyed database stakeholders and team members on standardized data curation principles and their implementation in data repositories for agricultural research and breeding programs. We will run a similar survey again at the end of the RCN funding period to assess whether changes in the perception or understanding of FAIR data in our community have occurred. Summary reports of the surveys are available at the links below:
- 2022 Survey of Genomic, Genetic, and Breeding (GGB) Database Team Members
- 2022 Survey of Genomic, Genetic, and Breeding (GGB) Database Stakeholders
Sustainability survey
One of the core aims of our NSF RCN grant is to develop a roadmap for a sustainable integrated database ecosystem. Most GGB databases rely on short-term funding for a majority of their operating costs and are vulnerable to loss of personnel and knowledge if funding lapses, even while demand for their services by researchers continues to increase. Researchers rely upon databases for data discovery, analysis, and management. To ensure that researchers continue to have access to reliable, high-quality, curated, and FAIR data in the future, we need to plan and develop infrastructure, strategies, and tools to ensure the long-term sustainability of GGB data and GGB databases.
To achieve this goal, we self-assessed the long-term financial stability of the AgBioData member databases through written surveys and interviews. We gathered data from 36 AgBioData member databases about the cost of operations, staff level, sources of funding, usage level, data types, species and strains, stakeholders served, and anticipated future needs. A summary report of this survey is available here!
Our External Advisory Committee (EAC)
The EAC comprises research community members with expertise in relevant areas such as generating and managing GGB data, implementing the FAIR principles, diversity and inclusion, and educational outreach. We meet with the EAC annually to assess our grant progress and seek advice on specific aims and needs.
Meet our EAC members!
- Robert W. Cottingham, Group Leader at the Oak Ridge National Laboratory and member of KBase: The United States Department of Energy Systems Biology Knowledgebase
- Anne Kwitek, Professor at the Medical College of Wisconsin and member of the Alliance of Genome Resources Consortium
- James Koltes, Assistant Professor at Iowa State University and member of the Agricultural Genome to Phenome Initiative
- Marie-Angelique Laporte, Associate Scientist at the Alliance of Bioversity International - CIAT and member of the Research Data Alliance
- Susanna Sansone, Professor at the University of Oxford and member of ELIXIR and FAIRsharing