Editing Talk:Mika/Temp/WikiFCD

* [https://foodsystems.org/ Metadata on nutrition databases] by ILSI/Agriculture and Food Systems Institute.
* [http://www.langual.org/Default.asp Framework for food description].
* [https://soilandhealth.org/wp-content/uploads/06clipfile/RCE/rceALL.html Bibliography related to variation in mineral composition of vegetables]
* [https://soilandhealth.org/wp-content/uploads/06clipfile/Nutritional%20Quality%20of%20Organically-Grown%20Food.html Bibliography related to Nutritional Quality of Organically Grown Food]
* [https://soilandhealth.org/wp-content/uploads/06clipfile/Organically%20Produced%20Foods%20Nutritive%20Content.htm Bibliography related to Organically Produced Foods: Nutritive Content]


=response to the committee=
 
Additional comments from the Committee:
Thank you so much for reviewing our proposal! We want to respond to the three major lines of criticism raised about our proposal.
 
1. Relationships with existing initiatives
 
The most important issue raised by reviewers was the concern that our project overlaps with existing initiatives, [https://world.openfoodfacts.org/ Open Food Facts] (OFF) in particular. This issue was raised on the [[Talk:FIXME|talk page for our proposal]] during the discussion phase, but reviewers felt that our response there was not convincing.
 
We have made major changes to the text of our proposal to explain in more detail how this project and OFF are complementary and to explain the difference between nutrition-label data (OFF's domain) and food composition data (WikiFCD's). Although they are related, they are different data with different sources, different audiences, different challenges, and different uses. Food composition data is best understood as a "downstream" source of granular data for projects like OFF as well as for Wikibase instances like Wikidata.
 
We have spoken in depth with Stéphane Gigandet (the founder and leader of OFF), who has in turn spoken about our project with the OFF board. Gigandet is excited about our project and, with the support of the OFF board, has graciously added his name to the list of advisors for our proposal. Part of the reason Gigandet is excited about the proposal is that OFF has previously attempted to incorporate some fruit and vegetable FCD from [[:wikipedia:USDA|USDA]] and [[:wikipedia:CIQUAL|CIQUAL]]. After running into some of the issues our team discusses in this proposal, like extremely granular data and divergent and shifting formats, OFF decided ''not'' to move forward with supporting the types of FCD our proposal targets. With Gigandet's advice, we will work closely with OFF to ensure that WikiFCD not only avoids competing or duplicating effort with OFF but also provides a useful resource that they can draw from.
 
Although it was not raised in the reviews, a new Wikibase instance like WikiFCD will also reduce the burden on other communities, such as Wikidata, so that they can focus on their main project aims. We have updated our proposal to make it clear that we will work closely with Gigandet to integrate our work with OFF as a way of demonstrating how other Wikibase instances can incorporate data from WikiFCD.
 
2. Community engagement 
 
A number of reviewers raised concerns about our ability to build and engage a community. Although we agree that this is the biggest challenge and risk for this project, we believe that, with a WMF grant, we will have the resources we need to succeed.
 
If our project has less in the way of existing community support than some other grant proposals, it is because our goal is to engage ''new'' groups of experts in the WMF ecosystem rather than calling upon already overtaxed Wikimedia volunteers. The type of outreach we are proposing will involve building partnerships that could lead to the expansion and diversification of Wikimedia contributors. Our approach is to give domain experts experiences with Wikimedia systems and tooling that they find valuable. This strategy for engaging domain experts is consistent with the findings and recommendations of the [https://tools.wmflabs.org/scholia/topic/Q5531528 GeneWiki program of work].
 
[[User:Hackfish]] is an established academic expert in global health and nutrition. She is currently working at both [[:wikipedia:San Francisco State University|San Francisco State University]] and [[:wikipedia:Harvard School of Public Health|Harvard School of Public Health]] and will be starting as an Assistant Professor at the [[:wikipedia:Johns Hopkins University|Johns Hopkins University]] in September 2020. [[User:Hackfish]] is well positioned to use her deep connections in the academic nutrition community to help this project succeed, and this engagement work will be a large part of what the intern will work on. We have already garnered strong interest in this project among the nutritionists at both Harvard and Johns Hopkins and will be working with teams at both places to contribute to WikiFCD and to engage broader communities.
 
3. Budget Question
 
There was a lack of clarity in our proposal about what the intern would be doing for 8 hours each week. We have edited the proposal to clarify that the intern will work with the project manager to create online learning tools and seminars, and to develop and deliver a curriculum focused on teaching nutrition professionals to use and contribute to WikiFCD, as well as about peer production, WM projects, and ways to contribute to wiki-based projects in general. We aim to employ a student who is interested in working with online communities and who can support us one day a week for two semesters, so as not to burden their workload or interfere with their academic work. We are confident that we will be able to identify such an individual.
 
------
 
 
Integrate into the proposal:
 
Based on in-depth discussions with both WM members and nutrition researchers over the past year, we decided that it would be most useful to '''diverse communities''' with '''diverse needs''' if we build a comprehensive Wikibase instance for all existing FCDs from around the world. While some communities may be interested only in the most common nutrient information (e.g. total fat, trans fat, calories, sugar), other communities may want more specific information (e.g. Phytic acid (by HPLC/HPAE) : Zinc ratio) or other information related to the food item (e.g. scientific names, varieties of fruits, geo-locations).
 
WikiFCD can make significant contributions to various WM communities with interests in nutritional data. The Wikibase we are creating will be an expert-curated data set that is mapped to Wikidata. The Wikidata community, as well as any other Wikibase community, will be able to reuse entity schemas or entity data (or both!) from our system. Data will flow back into Wikidata if desired. We strongly believe that WikiFCD will make a positive impact on WM and other communities by accommodating their varying needs and bringing more equitable access to an easily usable database. This Wikibase instance is a new and innovative approach that addresses problems that need to be, and yet have not been, solved.
 
 
 
 
-----
 
Comments from reviewers


1. Relationships with existing initiatives
* The proposal has very little community engagement with current Wikipedia communities.


3. Other comments
* This seems iterative, but minimally so.
* The goals are measurable, but I am not sure how innovative or impactful they will be.
* This proposal has realistic measures of success and clear targets for evaluating impact and capturing learning. In addition, it is well-positioned to create long-term impact.
* The project goals can be accomplished in the timeframe and budget.
* The scope can be achieved within 12 months or less and the budget is realistic and efficient. But it isn't clear from the budget what the Community outreach/communication intern would be doing for 8 hours per week for 8 months.
* There is no significant community engagement and support.

=Grant proposal edits=

Through this pilot project, we will write schemas to describe our data model based on five large food composition datasets that are already available online and develop good documentation for both project development and use. '''The focus on equity and global nature of the project requires diverse participants''', which is very much in line with the missions of projects supported by the Wikimedia Foundation. Through this pilot project, we hope to show how peer production can help reduce data/knowledge disparities in global nutrition. We believe that Wikidata is an awesome way to build connections between a range of free-culture nutrition projects like [https://fosdem.org/2020/schedule/event/open_food_facts/ Open Food Facts] that might do the same.

We will test several automated and manual methods to populate the Wikibase with nutrient data from 5 food composition databases from around the world (see the Project Plan section for details). We will write schemas to describe our data model. We will map our properties to Wikidata properties.
 
'''2. Why is this a good idea?'''
 
* First, this Wikibase instance will '''significantly improve the usability of FCD from different sources for diverse users''' - from WikiProjects and Wikipedia editors and viewers to academic researchers to public health workers. [https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Food_and_drink WikiProject Food and Drink] on English Wikipedia and [https://www.wikidata.org/wiki/Q8485990#sitelinks-wikipedia its equivalents in other languages] are universally popular WikiProjects among editors; likewise, many articles on food and drink are within the top 10% of any Wikipedia's articles by pageviews. This new project can contribute to a topic that is of high interest to many people.
 
: Building a structured dataset is also a key step in identifying the most appropriate data to borrow in resource-poor settings where up-to-date, detailed, and regionally appropriate FCD are not readily available. This new database will also open up new research questions that draw on more nuanced nutrition data (e.g. changes in nutrient content of the same product, depending on the climate conditions of the year), which can potentially make substantial advances in nutrition and health research.
 
* Secondly, by creating an instance of Wikibase for this project, we will be able to design our own data models, with input from Wikidata, to '''incorporate data from heterogeneous data sources'''. If subsets of the data are appropriate for Wikidata, we will be able to provide machine-actionable ShEx schemas that will help us prepare data for other systems. In this way the data will be readily available for incorporation into Wikidata if desired.
 
; '''UPDATE''': In our recent communication with the Open Food Facts community, they told us that OFF is in fact interested in using data from USDA (USA) and CIQUAL (France). However, it is burdensome to deal with diverse and dynamic formats - they mentioned three separate format changes in USDA since they started looking at this. Having another project like WikiFCD can help each community focus on their main project goals instead of each having to deal with the issues we raised in this proposal. This conversation with OFF reinforces our belief that WikiFCD will be helpful to diverse peer production communities.
 
* Finally, we will '''complete this project with diverse communities from around the world''' as these FCD can be translated into/from many languages. The design of Wikibase will allow us to more easily support additional languages in the data itself, as well as in user interfaces.
 
= Brown Bag Seminar =
 
Some ideas for things to present:
 
* slides on the project idea, our vision for the overall process
* slides on the process of "automated" reading of databases - maybe USDA API example or SMILING Excel sheet example
* slides on connecting to other Wikibase instances - Wikidata example
* demo on how a contributor can easily identify which languages are missing a food item in Wikidata, which makes it easier for contributors to see where they can contribute translations. (I forgot the name of this tool...)
* demo on making a query based on research questions (e.g. nutrient content differences across databases; comparison of numbers of nutrients included in different databases; Wikipedia articles mentioning a particular food item, etc.)
* demo on how to convert data to CSV or other formats that can be used in R, Stata, etc.
* slides on our plans for the community expansion and engagement (what we will do initially ourselves; what peer contributors will do once the system is ready)
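
For the CSV conversion demo, here is a minimal sketch in Python. It assumes the statements have already been retrieved from the Wikibase as simple records. The field names (food, property, value, unit, source) and the sample values are hypothetical placeholders, not real WikiFCD data.

```python
import csv
import io

# Hypothetical records standing in for statements exported from the Wikibase.
# Field names and values are illustrative only, not real WikiFCD data.
SAMPLE_STATEMENTS = [
    {"food": "example food A", "property": "P5", "value": 85.0, "unit": "g", "source": "FCT 1"},
    {"food": "example food A", "property": "P7", "value": 0.3, "unit": "g", "source": "FCT 1"},
    {"food": "example food A", "property": "P5", "value": 84.2, "unit": "g", "source": "FCT 2"},
]

FIELDNAMES = ["food", "property", "value", "unit", "source"]


def statements_to_csv(statements):
    """Return the statements as CSV text, one row per statement (long format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for statement in statements:
        writer.writerow(statement)
    return buffer.getvalue()


csv_text = statements_to_csv(SAMPLE_STATEMENTS)
```

Keeping one row per statement, with the source in its own column, means values for the same food from different databases stay distinguishable after the file is loaded in R or Stata.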
 
==Kat flag==
* Bots: We wrote a series of bots using the [https://github.com/SuLab/WikidataIntegrator WikidataIntegrator] Python module. These bots can be used to read in data from a source and then create statements in the Wikibase according to our data models. As of November 2020, we have written bots to:
# add countries to the Wikibase (sourced from Wikidata)
# add taxon names that have a GRIN id (sourced from Wikidata)
# add human languages (sourced from Wikidata)
# add USDA FoodData Central data (sourced from FDC's API)
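
The "read in data, then create statements" step can be sketched roughly as follows. This is not the bots' actual code: it is plain Python with no WikidataIntegrator calls, and it shows only the mapping from one source record to property/value pairs. The record shape (a foodNutrients list whose entries carry a nutrient name, a unit name, and an amount) is an assumption about what the FDC API returns. The name-to-property mapping uses property IDs from the "Properties used by FCT" table on this page, but the spelling of the keys is assumed.

```python
# Property IDs are taken from the "Properties used by FCT" table on this page.
# The nutrient names used as keys are an assumed spelling of the source's names.
NUTRIENT_TO_PROPERTY = {
    "water": "P5",
    "energy": "P6",
    "protein": "P7",
    "ash": "P9",
    "glucose": "P33",
}


def record_to_statements(record):
    """Turn one source record into (property ID, amount, unit) tuples.

    Nutrients with no mapped property are returned separately so that a
    person can review them instead of the bot silently dropping them.
    """
    statements = []
    unmapped = []
    for entry in record.get("foodNutrients", []):
        nutrient = entry.get("nutrient", {})
        name = nutrient.get("name", "").strip().lower()
        amount = entry.get("amount")
        if amount is None:
            continue  # no measured value in this entry
        prop = NUTRIENT_TO_PROPERTY.get(name)
        if prop is None:
            unmapped.append(nutrient.get("name", ""))
        else:
            statements.append((prop, amount, nutrient.get("unitName", "")))
    return statements, unmapped


# Hypothetical record, shaped like what we assume the FDC API returns.
sample_record = {
    "fdcId": 0,
    "description": "example food",
    "foodNutrients": [
        {"nutrient": {"name": "Water", "unitName": "g"}, "amount": 85.0},
        {"nutrient": {"name": "Protein", "unitName": "g"}, "amount": 0.3},
        {"nutrient": {"name": "Example nutrient", "unitName": "mg"}, "amount": 1.0},
        {"nutrient": {"name": "Ash", "unitName": "g"}},
    ],
}

statements, unmapped = record_to_statements(sample_record)
```

A real bot would then hand each tuple to the Wikibase write step along with a reference back to the source.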
 
 
==Properties used by FCT==
{| class="wikitable"
|-
! SMILING Vietnam !! link (Vietnam) !! SMILING Indonesia !! FDC !! link (FDC)
|-
| water (P5) || [http://wikifcd.wiki.opencura.com/prop/P5] || water (P5)|| water (P5)|| [http://wikifcd.wiki.opencura.com/prop/P5]
|-
| energy (P6) || [http://wikifcd.wiki.opencura.com/prop/P6] || energy (P6) ||energy (P6) || [http://wikifcd.wiki.opencura.com/prop/P6]
|-
| Protein (P7)|| [http://wikifcd.wiki.opencura.com/prop/P7] || Protein (P7)||  Protein (P7)|| [http://wikifcd.wiki.opencura.com/prop/P7]
|-
| total lipid (P8) || [http://wikifcd.wiki.opencura.com/prop/P8] || total lipid (P8)|| total lipid (P8)|| [http://wikifcd.wiki.opencura.com/prop/P8]
|-
| Ash (P9) || [http://wikifcd.wiki.opencura.com/prop/P9] ||Ash (P9)||Ash (P9) || [http://wikifcd.wiki.opencura.com/prop/P9]
|-
| carbohydrate (P10)||[http://wikifcd.wiki.opencura.com/prop/P10] || carbohydrate (P10) || dietary fiber (P11) || [http://wikifcd.wiki.opencura.com/prop/P11]
|-
| dietary fiber (P11) ||[http://wikifcd.wiki.opencura.com/prop/P11] || dietary fiber (P11) || Iron (P14)|| [http://wikifcd.wiki.opencura.com/prop/P14]
|-
| Calcium (P13)|| [http://wikifcd.wiki.opencura.com/prop/P13] || Calcium (P13) || Magnesium (P15) || [http://wikifcd.wiki.opencura.com/prop/P15]
|-
| Iron (P14)|| [http://wikifcd.wiki.opencura.com/prop/P14] || Iron (P14) || Phosphorus (P16) || [http://wikifcd.wiki.opencura.com/prop/P16]
|-
| Zinc (P19) || [http://wikifcd.wiki.opencura.com/prop/P19] || Zinc (P19) || Potassium (P17) || [http://wikifcd.wiki.opencura.com/prop/P17]
|-
| Vitamin C (P24) || [http://wikifcd.wiki.opencura.com/prop/P24] || Vitamin C (P24) || Sodium (P18) || [http://wikifcd.wiki.opencura.com/prop/P18]
|-
| Thiamin (P25) ||[http://wikifcd.wiki.opencura.com/prop/P25] ||  Thiamin (P25) || Zinc (P19) || [http://wikifcd.wiki.opencura.com/prop/P19]
|-
| Riboflavin (P26)|| [http://wikifcd.wiki.opencura.com/prop/P26] || Riboflavin (P26) || Copper (P20) || [http://wikifcd.wiki.opencura.com/prop/P20]
|-
| Niacin (P27) || [http://wikifcd.wiki.opencura.com/prop/P27] || Niacin (P27) || Manganese (P21) || [http://wikifcd.wiki.opencura.com/prop/P21]
|-
| Vitamin B-6 (P29) || [http://wikifcd.wiki.opencura.com/prop/P29] || Vitamin B-6 (P29) || Selenium (P23) || [http://wikifcd.wiki.opencura.com/prop/P23]
|-
| Folate, total (P39) || [http://wikifcd.wiki.opencura.com/prop/P39] || Folate, total (P39) || Vitamin C (P24) || [http://wikifcd.wiki.opencura.com/prop/P24]
|-
| Folate, DFE (P42) || [http://wikifcd.wiki.opencura.com/prop/P42] || Folate, DFE (P42) || Thiamin (P25) || [http://wikifcd.wiki.opencura.com/prop/P25]
|-
| Vitamin A, RAE (P45)|| [http://wikifcd.wiki.opencura.com/prop/P45] || Vitamin A, RAE (P45) || Riboflavin (P26) || [http://wikifcd.wiki.opencura.com/prop/P26]
|-
| Vitamin A, IU (P49) || [http://wikifcd.wiki.opencura.com/prop/P49] || Carotene, beta (P46) || Niacin (P27) || [http://wikifcd.wiki.opencura.com/prop/P27]
|-
| Retinol (P75) ||[http://wikifcd.wiki.opencura.com/prop/P75] ||  Retinol (P75) || Pantothenic acid (P28) || [http://wikifcd.wiki.opencura.com/prop/P28]
|-
| common name (P76) || [http://wikifcd.wiki.opencura.com/prop/P76] || common name (P76) || Vitamin B-6 (P29) || [http://wikifcd.wiki.opencura.com/prop/P29]
|-
| SMILING Vietnam food code (P77)|| [http://wikifcd.wiki.opencura.com/prop/P77] || SMILING Indonesia food code (P78) || Sucrose (P32) || [http://wikifcd.wiki.opencura.com/prop/P32]
|-
||| || || Glucose (P33) || [http://wikifcd.wiki.opencura.com/prop/P33]
|-
||| || || Fructose (P34) || [http://wikifcd.wiki.opencura.com/prop/P34]
|-
||| || || Lactose (P35) || [http://wikifcd.wiki.opencura.com/prop/P35]
|-
||| || || Maltose (P36) || [http://wikifcd.wiki.opencura.com/prop/P36]
|-
||| || || Galactose (P37) || [http://wikifcd.wiki.opencura.com/prop/P37]
|-
||| || || Folate, total (P39) || [http://wikifcd.wiki.opencura.com/prop/P39]
|-
||| || || Folic acid (P40) || [http://wikifcd.wiki.opencura.com/prop/P40]
|-
||| || || Folate, food (P41) || [http://wikifcd.wiki.opencura.com/prop/P41]
|-
||| || || Folate, DFE (P42) || [http://wikifcd.wiki.opencura.com/prop/P42]
|-
||| || || Vitamin A, RAE (P45) || [http://wikifcd.wiki.opencura.com/prop/P45]
|-
||| || || Vitamin A, IU (P49) || [http://wikifcd.wiki.opencura.com/prop/P49]
|-
||| || || Vitamin K (phylloquinone) (P52) || [http://wikifcd.wiki.opencura.com/prop/P52]
|-
||| || || Vitamin K (Dihydrophylloquinone) (P53) || [http://wikifcd.wiki.opencura.com/prop/P53]
|-
||| || || Retinol (P75) || [http://wikifcd.wiki.opencura.com/prop/P75]
|-
||| || || USDA Food Data Central fcdid (P80) || [http://wikifcd.wiki.opencura.com/prop/P80]
|-
||| || || Fatty acids, total saturated (P86) || [http://wikifcd.wiki.opencura.com/prop/P86]
|-
||| || || Fatty acids, total polyunsaturated (P88) || [http://wikifcd.wiki.opencura.com/prop/P88]
|-
||| || || Carbohydrate, by difference (P89) || [http://wikifcd.wiki.opencura.com/prop/P89]
|-
||| || || Vitamin B-12 (P90) || [http://wikifcd.wiki.opencura.com/prop/P90]
|-
||| || || Cholesterol (P99) || [http://wikifcd.wiki.opencura.com/prop/P99]
|-
||| || || Tocopherol, delta (P100) || [http://wikifcd.wiki.opencura.com/prop/P100]
|-
||| || || Tocotrienol, gamma (P101) || [http://wikifcd.wiki.opencura.com/prop/P101]
|-
||| || || Tocotrienol, delta (P102) || [http://wikifcd.wiki.opencura.com/prop/P102]
|-
||| || || Sugars, total including NLEA (P104) || [http://wikifcd.wiki.opencura.com/prop/P104]
|-
||| || || Vitamin E (alpha-tocopherol) (P105) || [http://wikifcd.wiki.opencura.com/prop/P105]
|-
||| || || Tocopherol, beta (P106) || [http://wikifcd.wiki.opencura.com/prop/P106]
|-
||| || || Tocopherol, gamma (P107) || [http://wikifcd.wiki.opencura.com/prop/P107]
|-
||| || || Tocotrienol, alpha (P108) || [http://wikifcd.wiki.opencura.com/prop/P108]
|-
||| || || Tocotrienol, beta (P109) || [http://wikifcd.wiki.opencura.com/prop/P109]
|-
||| || || Fatty acids, total monounsaturated (P119) || [http://wikifcd.wiki.opencura.com/prop/P119]
|-
||| || || 8:0 (P124) || [http://wikifcd.wiki.opencura.com/prop/P124]
|-
||| || || 10:0 (P125) || [http://wikifcd.wiki.opencura.com/prop/P125]
|-
||| || || 12:0 (P126) || [http://wikifcd.wiki.opencura.com/prop/P126]
|-
||| || || 14:0 (P127) || [http://wikifcd.wiki.opencura.com/prop/P127]
|-
||| || || 16:0 (P128) || [http://wikifcd.wiki.opencura.com/prop/P128]
|-
||| || || 18:0 (P129) || [http://wikifcd.wiki.opencura.com/prop/P129]
|-
||| || || 20:0 (P130) ||[http://wikifcd.wiki.opencura.com/prop/P130]
|-
||| || || 18:1 (P131) || [http://wikifcd.wiki.opencura.com/prop/P131]
|-
||| || || 18:2 (P132) || [http://wikifcd.wiki.opencura.com/prop/P132]
|-
||| || || 18:3 (P133) || [http://wikifcd.wiki.opencura.com/prop/P133]
|-
||| || || 20:4 (P134) || [http://wikifcd.wiki.opencura.com/prop/P134]
|-
||| || || 22:0 (P136) || [http://wikifcd.wiki.opencura.com/prop/P136]
|-
||| || || 14:1 (P137) || [http://wikifcd.wiki.opencura.com/prop/P137]
|-
||| || || 16:1 (P138) || [http://wikifcd.wiki.opencura.com/prop/P138]
|-
||| || || 20:1 (P140) || [http://wikifcd.wiki.opencura.com/prop/P140]
|-
||| || || 15:0 (P154) || [http://wikifcd.wiki.opencura.com/prop/P154]
|-
||| || || 17:0 (P155) || [http://wikifcd.wiki.opencura.com/prop/P155]
|-
||| || || 20:2 n-6 c,c (P165) || [http://wikifcd.wiki.opencura.com/prop/P165]
|-
||| || || 18:3 n-6 c,c,c (P170) || [http://wikifcd.wiki.opencura.com/prop/P170]
|-
||| || || 17:1 (P171) || [http://wikifcd.wiki.opencura.com/prop/P171]
|-
||| || || 20:3 (P172) || [http://wikifcd.wiki.opencura.com/prop/P172]
|-
||| || || 15:1 (P177) || [http://wikifcd.wiki.opencura.com/prop/P177]
|-
||| || || Starch (P218) || [http://wikifcd.wiki.opencura.com/prop/P218]
|-
||| || || Fatty acids, total trans (P271) || [http://wikifcd.wiki.opencura.com/prop/P271]
|}
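
As a rough illustration of how the table above could be compared programmatically (for example, to see which properties all three sources share), here is a small Python sketch. The sets below are only a subset of the property IDs listed in the table, chosen to keep the example short, and the helper names are hypothetical.

```python
# Subset of the property IDs listed in the table above; not the full lists.
PROPERTIES_BY_TABLE = {
    "Vietnam": {"P5", "P6", "P7", "P13", "P49", "P77"},
    "Indonesia": {"P5", "P6", "P7", "P13", "P46", "P78"},
    "FDC": {"P5", "P6", "P7", "P33", "P49", "P80"},
}


def shared_properties(tables):
    """Properties that every table uses."""
    return set.intersection(*tables.values())


def unique_properties(tables, name):
    """Properties used by the named table and by no other table."""
    others = set().union(*(props for key, props in tables.items() if key != name))
    return tables[name] - others


shared = sorted(shared_properties(PROPERTIES_BY_TABLE))
only_fdc = sorted(unique_properties(PROPERTIES_BY_TABLE, "FDC"))
```

With this subset, the shared properties are water (P5), energy (P6), and Protein (P7), and the FDC-only ones are Glucose (P33) and the FDC identifier (P80).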
 
= Questions =
 
* Some databases are published in multiple languages. How should we deal with any discrepancies between the translation by the organizations and by Wikidata? (e.g. Japanese databases are available in English and Japanese)
** I think we will be able to present multiple aliases/ multiple values for names. Some of these may conflict, but each will have a reference back to the source. If our group can determine something is incorrect, we can deprecate it.
* Some databases state that they borrow data from other databases but do not specify exactly which items were borrowed. How should we deal with this? (e.g. Bangladesh FCT 2013)
** This is my current working model [https://wikifcd.wiki.opencura.com/wiki/Item:Q135079]. I am using ''stated in'' for the FCT where we found the value and ''based on'' for the source they note. How does this seem to you? It is very tricky because sometimes we don't have enough information to decide what to do here.
* In Wikidata, we can add values without adding references but is it possible to make it a requirement to have at least one reference on WikiFCD?
** Yes, this is possible.
* We discussed this before but I forgot the conclusion - do we create different items for the same fruits from different databases (e.g. Apple)?
** We create a single item for a food item and then statements from all different databases are placed on that item.
* Is it easy to have Recoin on WikiFCD?
** Not sure about this. I haven't seen it available for Wikibase instances yet. I'll keep looking.
* Perhaps we can add some of the identifier properties from [https://www.wikidata.org/wiki/Wikidata:WikiProject_Medicine/Properties WikiProject_Medicine]?
** My current plan is to get this data via federated SPARQL queries with Wikidata.
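
To make the answers above concrete, here is a small Python sketch of the model they describe: one item per food, statements from every database placed on that item, a required reference on each statement, and deprecation for values judged incorrect. This is plain Python for illustration only, not Wikibase's own API or configuration. The class names, FCT names, and values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    property_id: str
    value: float
    stated_in: str        # the FCT where the value was found
    based_on: str = ""    # the source that FCT says it borrowed from, if known
    rank: str = "normal"  # set to "deprecated" once judged incorrect


@dataclass
class FoodItem:
    label: str
    statements: list = field(default_factory=list)

    def add_statement(self, statement):
        """Add a statement, refusing any that lacks a reference."""
        if not statement.stated_in:
            raise ValueError("every statement needs at least one reference")
        self.statements.append(statement)

    def current_values(self, property_id):
        """Values for a property, skipping deprecated statements."""
        return [
            (s.value, s.stated_in)
            for s in self.statements
            if s.property_id == property_id and s.rank != "deprecated"
        ]


# Hypothetical values; the FCT names are placeholders.
apple = FoodItem("apple")
apple.add_statement(Statement("P5", 85.0, stated_in="FCT A"))
apple.add_statement(Statement("P5", 84.2, stated_in="FCT B", based_on="FCT A"))
apple.add_statement(Statement("P5", 8.5, stated_in="FCT C", rank="deprecated"))
```

Deprecated statements stay on the item with their reference, so the history of a disputed value is kept rather than deleted.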