Talk:Mika/Temp/WikiFCD
Revision as of 14:24, 14 May 2020
Related Work
This paper from ISWC 2019 describes a knowledge graph that includes nutrient data. [1] Data: [2]
Wikiprojects to notify
- What's the best way to notify them? Mika (talk) 12:33, 20 January 2020 (EST)
- I think there is a ping project template that we can use. Ping Project. Hweyl (talk) 19:51, 21 January 2020 (CET)
- Great! Is the plan to say that we'll build this in a way that would be easy to incorporate into Wikidata if they choose to do so in the future? Mika (talk) 22:17, 21 January 2020 (CET)
- Draft text: Greetings members of WikiProject Food! We'd like to let you know about a complementary set of work we are planning related to food composition data. We are planning to create a Wikibase just for food composition data. To support this work we are proposing a Grant. We invite you to review the description of the project and share any feedback you may have.
People to review
Gene Wiki
About WD
- draft response
- Thank you for the feedback! We have been thinking about this for a while as we'd initially planned to do this in Wikidata. Knowing how variable the types and depths of information are in FCDs, it is possible that we will include data that may never be appropriate for Wikidata. We decided that it'd make sense to build a Wikibase where we can hold all the details.
Example datasets may look like this FAO data with detailed information on phytate, or more standard data which includes aggregate phytate data. And this is just one nutrient; there are data on many more nutrients as well as metadata. We believe that it is important to have these discussions in WD. We also feel that the discussion on WD and this Wikibase database could develop simultaneously.
Our approach includes the creation of ShEx schemas (see Wikidata:WikiProject Schemas). We will publish these schemas in Wikidata's E namespace for entity schemas. This way the data models that we are discussing can also be shared and discussed on Wikidata. Our approach is to prepare data in a Wikibase and then help coordinate getting the data into Wikidata once the Wikidata community has built consensus on how much nutrient data is appropriate for Wikidata.
Response to other project
- Happy to clarify these points!
- First, we want to emphasize that our focus in this project is the re-organization and standardization of the existing databases and we will do our best to classify and store every bit of information from each database. We will not limit the amount of data to be included in our Wikibase instance, with the hope that different communities, such as Wikidata, can pick and choose what to include in their own databases. Our Wikibase instance can serve as a place to sort data from all the databases that are available on the Internet, including ones you mentioned, into one place, so that users can pull, combine, and analyze necessary data more easily. We believe that this project will be able to offer something these FCDs currently do not/cannot do, by harnessing the power of peer production.
- I use many of these FCDs as a nutritional epidemiologist for research, and also as a migrant individual who records dietary information related to foods from my native country as well as other countries. The current situation of having multiple incomplete databases in various formats is much less than ideal for meeting the needs of diverse communities and individuals. For instance:
- * I keep my daily dietary records in English. The software I'm using primarily uses the data from USDA.
- * I sometimes eat foods more commonly consumed in Japan like 海ぶどう.
- * I look for this item in the USDA databases, using several keywords including its scientific name, but I can't find the information. Perhaps this data exists in another database but I'd need to check each database one by one.
- * I use another algae item as a substitute in my record, even though the nutrient data for the actual item are available in the Japanese database.
- This is just one example of the kinds of problems people may run into because of the incomplete connectivity among the existing databases. A global FCD can open up ways to explore many, many more new questions and solutions not just in food and nutrition but in many other topics that Wikimedians may be interested in. We believe that having this placeholder for all FCD information is a good way to contribute to different Wikimedia projects.
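- To make the lookup problem above concrete, here is a minimal Python sketch of searching several databases at once by scientific name. It is an illustration under stated assumptions and not the project's actual design: the source names (`usda_sample`, `japan_sample`), the record fields, and the helper functions are all hypothetical, and nutrient values are left out on purpose rather than invented.

```python
from collections import defaultdict

# Hypothetical, simplified records standing in for two separate FCDs.
# Real databases carry many more fields; nutrient values are omitted here.
SOURCES = {
    "usda_sample": [
        {"name": "Seaweed, wakame, raw", "scientific_name": "Undaria pinnatifida"},
    ],
    "japan_sample": [
        {"name": "海ぶどう", "scientific_name": "Caulerpa lentillifera"},
        {"name": "わかめ", "scientific_name": "Undaria pinnatifida"},
    ],
}


def normalize(name: str) -> str:
    """Lower-case and collapse whitespace so spelling variants compare equal."""
    return " ".join(name.lower().split())


def build_index(sources):
    """Map each normalized scientific name to the (source, local name) pairs that list it."""
    index = defaultdict(list)
    for source, records in sources.items():
        for record in records:
            key = normalize(record["scientific_name"])
            index[key].append((source, record["name"]))
    return index


def find_food(index, scientific_name):
    """Return every (source, local name) pair for a scientific name, or an empty list."""
    return index.get(normalize(scientific_name), [])


index = build_index(SOURCES)
for source, local_name in find_food(index, "Caulerpa lentillifera"):
    print(source, local_name)
```

- With a shared key such as the scientific name, one query shows which databases hold an item, instead of checking each database one by one.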
- I believe OFF and our Wikibase instance take distinct approaches. According to their website:
- "Open Food Facts is a database of food products with ingredients, allergens, nutrition facts and all the tidbits of information we can find on product labels."
- OFF builds up its database through individual contributions of nutrient data from food product labels. Our project's approach is different. We will take existing databases and compile them into a standardized and structured database. As I mentioned before, OFF and our Wikibase instance are complementary. One of the strengths of OFF is its ability to hold product nutrient data that are not yet in larger public databases like the USDA databases. By combining OFF and our Wikibase instance, we can have a more comprehensive FCD than any single database.
- Great point about the plan beyond data importation. As with any Wikimedia project, peer production has the potential to keep information more up to date than a working group with a limited number of participants can. Any methods we employ to import data, check for updates, and maintain this database will be documented, so that future participants can easily learn how to do each of these activities and start contributing to the project. We will engage in outreach activities to involve diverse participants, and we hope that documentation/tutorials and community engagement will increase the chance of frequent updates of the data.
- In short, there is an enormous amount of food composition data on the Internet already. There have been several attempts to combine some of these databases but none has succeeded in creating a comprehensive and easy-to-use global database. Open and collaborative peer production has an advantage over smaller working groups in compiling these databases into one place and maintaining the data. We believe that this is a powerful solution to a decades-old problem in nutrition and will benefit a wide range of audiences, from Wikidata users to FCD developers.
- We hope this clarifies your questions. Thanks for engaging in this topic!
Response to alex
- Thank you so much for the comments and suggestions! I love the idea of digital community engagement through content drive/contest. We will work with the community engagement intern to plan and incorporate this idea. We will edit the proposal to add this point.
- We hope that our wikibase will be useful to OFF and we'll be able to demonstrate seamless data exchanges between the two. I completely agree that this is an important knowledge base for several SDGs and we look forward to working with diverse communities to improve it.
- Thanks again for the feedback!
Wikibase for prototyping wikiFCD
Testing Wikibase
Inventory of FCTs
- We may want to consider using our wikibase to inventory the FCTs of interest.
- example item for FCT [6]
- Here is a query for all food composition tables listed in the wikifcd wikibase. I also created some additional properties as we discussed in our last meeting. Properties 55-58 mentioned on this page can also be used now. Hweyl (talk) 00:45, 21 April 2020 (CEST)
- Great! Thank you. As I work through the list on FAO/INFOODS, I'm discovering more and more about the complexity of these databases. Regional databases are tricky as they combine existing databases from different countries and sometimes add new information. The easiest thing to do may be to stick with single-country databases that are not based off another database...will keep you posted. Mika (talk) 03:12, 21 April 2020 (CEST)
Useful links
- International Nutrition Databases at Harvard; not clear if it's still maintained but it has information on some databases (overlapping information with FAO/INFOODS).
- Crop Composition Database by ILSI/Agriculture and Food Systems Institute.
- Metadata on nutrition databases by ILSI/Agriculture and Food Systems Institute.
- Framework for food description.
response to the committee
Additional comments from the Committee:
Thank you so much for the feedback! We categorized the comments into two major categories (Points 1 and 2 below). We also added responses to comments that did not fit into either of these two categories.
In addition, we have clarified how we measure success in the proposal and how impactful this project is for the WM communities and also to broader audiences.
1. Relationships with existing initiatives
- I am encouraged to see this project proposal suggesting looking into another aspect of knowledge gaps. I am concerned there are already existing initiatives and wondering why the proposers have chosen to not to align with those.
- It does fit with Wikimedia's strategic priorities. However, it looks like a competitor to "Open Food Facts", another free software and open data but the concept of both projects is different.
- I do not find this project to be critical to the current state of knowledge. There are existing resources they could join to do this work.
- It does fit with Wikimedia's strategic priorities and the budget is reasonable but not enough community engagement. There is a need for a wider or major community discussion to ascertain whether this could be hosted on Wikidata or not. However, it looks like a competitor to "Open Food Facts", another free software and open data but the concept of both projects is different (from the proposal & answers to questions on the proposal talk page).
- A lot of concerns about the creation of another instance. The answers in the discussion page don't convince me.
We understand your concern about our relationships with existing initiatives. Over the past year we have had in-depth discussions with various WM members, amongst ourselves, and with individuals and groups who are interested in using what we plan to build (e.g. nutritionists, OFF members, academic researchers). We decided that it would be most useful to these various communities if we build a comprehensive Wikibase instance for all existing FCDs from around the world. For instance, in recent conversations with the OFF community, we learned that they were interested in adding some information from USDA and CIQUAL. Our Wikibase instance could serve diverse communities, whether they need information from all FCDs or not.
It is important to keep in mind that while many communities are only interested in basic nutrient information (e.g. total fat, trans fat, calories, sugar), other communities need more specific information (e.g. Phytic acid (by HPLC/HPAE) : Zinc ratio) or other information related to the food item (e.g. scientific names, variety of fruits). We strongly believe that this instance can connect various WM communities as well as other peer produced knowledge bases.
The Wikibase we are creating will be an expert-curated data set that is mapped to Wikidata. The Wikidata community as well as any other wikibase community will be able to reuse entity schemas or entity data (or both!) from our system. There is no reason that this data can't flow back into Wikidata if desired.
2. Community engagement
- A new instance of Wikibase? Without a community?
- It does fit with Wikimedia's strategic priorities and the budget is reasonable but not enough community engagement. There is a need for a wider or major community discussion to ascertain whether this could be hosted on Wikidata or not. However, it looks like a competitor to "Open Food
- Interesting concept but some concerns about the methodology.
- They really need to become more involved since they were not able to get any endorsements.
- The proposal has very little community engagement with current Wikipedia communities.
We have reached out to several communities, both within Wikimedia projects and outside. One of our main interests is to introduce and engage newcomers to WM projects. We have garnered strong interest among academic nutrition scholars at Harvard School of Public Health and Johns Hopkins University, who are keen to contribute to the project because they have recognized for decades how difficult it is to use FCDs that come in various formats and carry linguistic biases. Many of them have not worked on WM projects and so they are very much interested in learning. This is part of what the intern will work on. We added this point on newcomer engagement to the proposal.
We are engaging stakeholders from academic communities who may not be aware of Wikidata or Wikibase. This type of outreach nourishes partnerships which could lead to expansion and diversification of Wikimedia contributors. Giving domain experts hands-on experience with Wikimedia systems and tooling, through work that they find valuable, is consistent with the findings and recommendations of the GeneWiki program of work.
3. Other comments
- This seems iterative, but minimally so.
- The goals are measurable, but I am not sure how innovative or impactful they will be.
- This proposal has realistic measures of success and clear targets for evaluating impact and capturing learning. In addition, it is well-positioned to create long-term impact.
- The project goals can be accomplished in the timeframe and budget.
- The scope can be achieved within 12 months or less and the budget is realistic and efficient. But it isn't clear from the budget what the Community outreach/communication intern would be doing for 8 hours per week for 8 months
- There is no significant community engagement and support