Editing Mika/Temp/WikiFCD
From WikiDotMako
Latest revision | Your text | ||
Line 50: | Line 50: | ||
<!-- Please write your response below --> | <!-- Please write your response below --> | ||
[ | [https://en.wikipedia.org/wiki/Food_composition_data Food composition data (FCD)] are an essential part of nutrition research. FCD provide nutrient data for processed/cooked (e.g. veggie burger, hard-boiled eggs) and unprocessed (e.g. apples) food. Many FCDs are available online, although they come in various formats (e.g. PDF, CSV) with varying levels of detail. Nutrient content of unprocessed food can also vary for the same item because of factors such as climate and terroir. Area- and time-specific data are key to understanding nutrition and health. Importantly, even though commonly consumed foods vary widely by region, some places lack regionally suitable FCD or FCD in their own languages, or have only older FCD, leading to disparities in data availability and accessibility and, ultimately, in scientific evidence in health research. | ||
Despite several attempts by research institutes and intergovernmental agencies to create a global FCD in the past, none has succeeded in developing a universally accessible, up-to-date, easy-to-use, and comprehensive global FCD. Development and maintenance of these databases are difficult if the contributors are limited to small sets of researchers and employees in this field. The wiki system has the potential to offer a better solution to this problem. We propose WikiFCD to compile detailed food composition data that are already available online. The need for diverse participants in this project is very much in line with the missions of projects supported by the Wikimedia Foundation and, through this pilot project, we hope to show how peer production can help reduce data/knowledge disparities in global nutrition. | |||
===What is your solution to this problem?=== | ===What is your solution to this problem?=== | ||
Line 64: | Line 62: | ||
'''1. What is the solution to this problem?''' | '''1. What is the solution to this problem?''' | ||
We will test several automated and manual methods to populate the wikibase with nutrient data from | We will use an iterative process to test several automated and manual methods to populate the wikibase with nutrient data from 4 food composition databases from around the world (see [[Mika/Temp/WikiFCD#Project_plan|Project Plan]] section for details). We will write schemas to describe our data model. We will map our properties to Wikidata properties. | ||
'''2. Why is this a good idea?''' | '''2. Why is this a good idea?''' | ||
First, this wikibase system will significantly improve the usability of FCD from different sources for diverse users - from health-conscious individuals to academic researchers to public health workers. Building a structured dataset is also a key step in identifying the most appropriate data to borrow in resource-poor settings where up-to-date, detailed, and regionally appropriate FCD are not readily available. This new database will also open up new research questions that draw on more nuanced nutrition data (e.g. changes in nutrient content of the same product, depending on the climate conditions of the year), which can potentially make substantial advances in nutrition and health research. | |||
Secondly, by creating an instance of Wikibase for this project, we will be able to design our own data models to incorporate data from heterogeneous data sources. If subsets of the data are appropriate for Wikidata, we will be able to provide machine-actionable ShEx schemas that will help us prepare data for other systems. In this way the data will be readily available for incorporation into Wikidata if desired. | |||
Finally, this project will reach diverse communities from around the world as these FCD can be translated into/from many languages. The design of Wikibase will allow us to more easily support additional languages in the data itself, as well as in user interfaces. | |||
==Project goals== | ==Project goals== | ||
Line 121: | Line 113: | ||
<!-- Please write your response below --> | <!-- Please write your response below --> | ||
# 50 participants covering at least 3 languages. | |||
# 50 participants covering at least | # 3 new data sources. | ||
# | |||
==Project plan== | ==Project plan== | ||
Line 138: | Line 128: | ||
; Data Modeling & bulk data import | ; Data Modeling & bulk data import | ||
# Description- We will use [http://shex.io/shex-semantics/index.html ShEx] to express the schemas for our data models. We will align the properties in our Wikibase with relevant Wikidata properties. | # Description- We will use [http://shex.io/shex-semantics/index.html ShEx] to express the schemas for our data models. We will align the properties in our Wikibase with relevant Wikidata properties. | ||
::We will | ::We will first create a wikibase based on our analyses of 2 large food composition databases as starting examples: | ||
:: 1) [http://www.fao.org/fileadmin/templates/food_composition/documents/AnFooD2.0.xlsx FAO/INFOODS Analytical Food Composition Database Version 2.0 (AnFooD2.0)] | :: 1) [http://www.fao.org/fileadmin/templates/food_composition/documents/AnFooD2.0.xlsx FAO/INFOODS Analytical Food Composition Database Version 2.0 (AnFooD2.0)] | ||
:: 2) [https://fdc.nal.usda.gov/download-datasets.html USDA Foundation Foods database December 2019] | :: 2) [https://fdc.nal.usda.gov/download-datasets.html USDA Foundation Foods database December 2019] | ||
: | ::Then we will check how much overlap exists between these larger databases and the individual databases listed below. If any information is missing from the larger databases listed above, we will add it to our system. | ||
: | |||
:: 3) [https://drive.google.com/file/d/1eqQ578gHiPoIaHaVYjQa_3sFe_LzGhm1/view Indian Food Composition Tables 2017] | :: 3) [https://drive.google.com/file/d/1eqQ578gHiPoIaHaVYjQa_3sFe_LzGhm1/view Indian Food Composition Tables 2017] | ||
:: 4) [http://www.fao.org/3/I8897EN/i8897en.pdf Kenya Food Composition Tables 2018]. | :: 4) [http://www.fao.org/3/I8897EN/i8897en.pdf Kenya Food Composition Tables 2018]. | ||
:: 5) [https://www.mext.go.jp/en/policy/science_technology/policy/title01/detail01/sdetail01/sdetail01/1385122.htm STANDARD TABLES OF FOOD COMPOSITION IN JAPAN - 2015 - (Seventh Revised Edition)] | :: 5) [https://www.mext.go.jp/en/policy/science_technology/policy/title01/detail01/sdetail01/sdetail01/1385122.htm STANDARD TABLES OF FOOD COMPOSITION IN JAPAN - 2015 - (Seventh Revised Edition)] | ||
: 2. Outputs - Linked FCD in Wikibase. | :2. Outputs- ShEx schemas and SPARQL query code | ||
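The overlap check between the larger databases and the individual national tables could be sketched roughly as follows. This is a minimal Python illustration, assuming each database has already been reduced to a list of food names; the function names (<code>normalize_name</code>, <code>find_missing_items</code>) and the sample food names are hypothetical and are not part of the project's actual tooling.

```python
import re


def normalize_name(name: str) -> str:
    """Lowercase a food name and collapse punctuation/whitespace for matching."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return cleaned.strip()


def find_missing_items(large_db: list[str], national_db: list[str]) -> list[str]:
    """Return national-table items whose normalized names are absent from the large database."""
    known = {normalize_name(n) for n in large_db}
    return [n for n in national_db if normalize_name(n) not in known]


# Hypothetical sample names, for illustration only.
large = ["Apple, raw", "Egg, hard-boiled"]
national = ["apple (raw)", "Finger millet, flour"]
missing = find_missing_items(large, national)
```

Name matching of this kind only catches items that differ in case or punctuation; synonyms and regional names would still need manual review.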
;Editathon (Data entry, cross-reference checking, and translation) | |||
# Description - We will host five Edit-a-thon events in Seattle (2), Boston (1), Baltimore (1), and Kerala (1) to check "automated" data entries against the original databases, check global vs individual databases, and translate data in this Wikibase into Japanese, Spanish, and Malayalam. Appropriate communities (universities and Wikidata communities) have been contacted for hosting these events. | |||
# Outputs - Linked FCD in Wikibase. | |||
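As one way to support the check of "automated" data entries against the original databases, a simple comparison could flag values for volunteers to review. The sketch below is a hedged illustration only: the function name <code>find_mismatches</code>, the nutrient keys, the 1% tolerance, and the sample values are all assumptions, not the project's method.

```python
import math


def find_mismatches(imported: dict[str, float],
                    original: dict[str, float],
                    rel_tol: float = 0.01) -> list[str]:
    """List nutrients whose imported value is missing or differs from the original beyond rel_tol."""
    flagged = []
    for nutrient, expected in original.items():
        actual = imported.get(nutrient)
        if actual is None or not math.isclose(actual, expected, rel_tol=rel_tol):
            flagged.append(nutrient)
    return flagged


# Hypothetical values per 100 g, for illustration only.
original = {"protein_g": 12.6, "fat_g": 10.6, "iron_mg": 1.2}
imported = {"protein_g": 12.6, "fat_g": 1.06}
flagged = find_mismatches(imported, original)
```

Flagged entries would still be resolved by a person comparing the Wikibase item with the source table.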
;Documentation | ;Documentation | ||
Line 195: | Line 184: | ||
| Data scientist (10 hours per week for 8 months (34 weeks)) || $30x10x34 = $10,200 | | Data scientist (10 hours per week for 8 months (34 weeks)) || $30x10x34 = $10,200 | ||
|- | |- | ||
| Software engineer (10 hours per week for 8 months)|| $ | | Software engineer (10 hours per week for 8 months)|| $30x10x34 = $10,200 | ||
|- | |- | ||
| Community outreach | | Community outreach intern (8 hours per week for 8 months) || $25x8x34 = $6,800 | ||
|- | |- | ||
| Server hosting (12 months) || $22x12 = $264 | | Server hosting (12 months at Johns Hopkins School of Public Health) || $22x12 = $264 | ||
|- | |- | ||
| | | Event costs || $1,000 x 5 = $5,000 | ||
|- | |- | ||
| Travel (Wikimania 2020, 2 people) || $ 4,000 | | Travel (Wikimania 2020, 2 people) || $ 4,000 | ||
Line 227: | Line 216: | ||
* Software Engineer- [https://www.wikidata.org/wiki/User:KSN72 Kenneth Seals-Nutt] | * Software Engineer- [https://www.wikidata.org/wiki/User:KSN72 Kenneth Seals-Nutt] | ||
* Food composition advisor/nutritional epidemiologist (volunteer) - Sabri Bromage | * Food composition advisor/nutritional epidemiologist (volunteer) - Sabri Bromage | ||
* Volunteer - | |||
===Community notification=== | ===Community notification=== |