Revision as of 00:10, 26 February 2020

Related Work

This paper from ISWC 2019 describes a knowledge graph that includes nutrient data. [1] Data: [2]

Wikiprojects to notify

Wikiproject Food

What's the best way to notify them? Mika (talk) 12:33, 20 January 2020 (EST)

I think there is a ping project template that we can use. Ping Project. Hweyl (talk) 19:51, 21 January 2020 (CET)

Great! Is the plan to say that we'll build this in a way that would make it easy to incorporate into Wikidata if they choose to do so in the future? Mika (talk) 22:17, 21 January 2020 (CET)

Draft text: Greetings, members of WikiProject Food! We'd like to let you know about a complementary set of work we are planning related to food composition data. We are planning to create a Wikibase just for food composition data. To support this work we are proposing a grant. We invite your review of the description of the project so we can learn of any feedback you may be willing to share with us.

People to review

Gene Wiki

[3]

About WD

draft response

Thank you for the feedback! We have been thinking about this for a while, as we'd initially planned to do this in Wikidata. Given how variable the types and depths of information are in FCDs, it is possible that we will include data that may never be appropriate for Wikidata. We therefore decided that it'd make sense to build a Wikibase where we can hold all the details.

Example datasets may look like this FAO data with detailed information on phytate, or like more standard data that include aggregate phytate data. And this is just one nutrient; there are data on many more nutrients, as well as metadata. We believe that it is important to have these discussions in WD. We also feel that the discussion on WD and this Wikibase database could develop simultaneously.

Our approach includes the creation of ShEx schemas (see Wikidata:WikiProject Schemas). We will publish these schemas in Wikidata's E namespace for entity schemas. This way the data models that we are discussing could also be shared and discussed on Wikidata. Our approach is to prepare data in a Wikibase and then help coordinate getting the data into Wikidata once the Wikidata community has built consensus on how much nutrient data will be appropriate for Wikidata.

Response to other project

Happy to clarify these points!

First, we want to emphasize that our focus in this project is the re-organization and standardization of the existing databases and we will do our best to classify and store every bit of information from each database. We will not limit the amount of data to be included in our Wikibase instance, with the hope that different communities, such as Wikidata, can pick and choose what to include in their own databases. Our Wikibase instance can serve as a place to sort data from all the databases that are available on the Internet, including ones you mentioned, into one place, so that users can pull, combine, and analyze necessary data more easily. We believe that this project will be able to offer something these FCDs do not/cannot do by harnessing the power of peer production.

I use many of these databases as a nutritional epidemiologist for research, and also as a migrant individual who records dietary information related to foods from my native country as well as other countries - and I can say with confidence that the current situation of having multiple incomplete databases in various formats is much less than ideal! For instance:

* I keep my daily dietary records in English. The software I'm using primarily uses the data from USDA.

* I sometimes eat foods more commonly consumed in Japan like 海葡萄.

* I look for this item in the USDA databases, using several keywords including its scientific name, but I can't find the information.

* I use another algae item as a substitute in my record, even though the nutrient data for the item I actually ate are available in the Japanese database.

This is just one example of the kinds of problems people may run into because of the lack of a way to connect existing databases. A global FCD can open up many, many more new questions and solutions not just in nutrition but in many other topics that Wikimedians may be interested in. We believe that having this placeholder for all FCD information is a good way to contribute to different Wikimedia projects.
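To make the scenario above concrete, here is a minimal Python sketch of how a shared key, such as a scientific name, could let a lookup fall through from one database to another. Everything in it is an assumption for illustration: the two dictionaries are hand-made stand-ins rather than real FCD extracts, the nutrient values are placeholders and not real data, and `lookup` is a hypothetical helper, not part of any existing tool.

```python
from typing import Optional

# Hypothetical stand-ins for two food composition databases.
# Keys are scientific names; nutrient values are placeholders, NOT real data.
USDA_LIKE = {
    "Undaria pinnatifida": {"label": "Seaweed, wakame", "energy_kcal": 1.0},
}
JAPAN_LIKE = {
    "Caulerpa lentillifera": {"label": "海葡萄", "energy_kcal": 2.0},
}

# Sources are searched in this order.
SOURCES = [("usda_like", USDA_LIKE), ("japan_like", JAPAN_LIKE)]


def lookup(scientific_name: str) -> Optional[dict]:
    """Return the first matching record across all sources, tagged with its origin."""
    for source_name, records in SOURCES:
        record = records.get(scientific_name)
        if record is not None:
            return {"source": source_name, **record}
    return None  # no source has the item


if __name__ == "__main__":
    # Missing from the first source, found in the second.
    print(lookup("Caulerpa lentillifera"))
```

With a shared key in place, the item missing from the first source is found in the second, so no substitute food is needed.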

The problem we are trying to tackle is this: there is an enormous amount of food composition data on the Internet already. There have been several attempts to combine some of these databases, but none has succeeded in creating a comprehensive and easy-to-use global database. Peer production has an advantage over smaller working groups in compiling these databases into one place and maintaining the data. We believe that this is a powerful solution to a decades-old problem in nutrition and will benefit a wide range of audiences, from Wikidata users to academic researchers to governmental workers who work on databases in their own settings.

I believe OFF and our Wikibase instance take distinct approaches. According to their website:

"Open Food Facts is a database of food products with ingredients, allergens, nutrition facts and all the tidbits of information we can find on product labels."

OFF builds up its database by having contributors individually add nutrient data from food product labels. Our project's approach is different: we will take existing databases and compile them into a standardized and structured database. As I mentioned before, OFF and our Wikibase instance are complementary. The strength of OFF is its ability to hold product nutrient data that are not in larger public databases like the USDA databases. Combining OFF and our Wikibase instance, we can have a more comprehensive food composition database than any single database.

Great point about the plan beyond data importation. As with any Wikimedia project, peer production has the potential to keep information more up to date than any working group with a limited number of participants. Any methods we employ to import data, check for updates, and maintain this database will be documented. We will engage in outreach activities to get more people involved, and we hope that documentation/tutorials and community engagement will increase the chance of frequent updates of the data.

We hope this clarifies your questions. Thanks for engaging in this topic! Mika (talk) 22:08, 25 February 2020 (CET)


Talk:Mika/Temp/WikiFCD: Difference between revisions