=Methods=

==Data Acquisition==
We worked from the FAO's list of food composition tables [http://www.fao.org/infoods/infoods/tables-and-databases/en/] to identify existing food composition databases (FCDs) that we could add to our Wikibase. We then located copies of the corresponding food composition tables (FCTs) where possible and extracted the data from them (see the import sketch below). These tables were originally published as CSV files or as tabular data encoded in PDFs.

==Database Design and Population==
We will create a database model that can represent heterogeneous food composition tables. We will use this model to map multiple food composition tables so that we can then import them into a Wikibase instance. Our alignment of food composition table data with Wikidata will allow us to leverage the sum of knowledge in the projects of the Wikimedia Foundation. Because Wikimedia Commons, the media repository of the Wikimedia projects, has also been aligned with Wikidata, we will be able to easily reuse images of food items, molecular structure models, and food dishes alongside our project. This query from our SPARQL endpoint [https://tinyurl.com/y99qtk7p] lists all of the food items in our project Wikibase that have an associated image in Wikimedia Commons (a sketch of this kind of query is given below).

We used the wbstack platform [https://www.wbstack.com/] to create an instance of Wikibase for testing. The wbstack service provides a hosted version of Wikibase that users can load with their own data; Wikibase is the software used to support Wikidata itself. WikidataIntegrator (WDI) is a Python library for interacting with data from Wikidata (Waagmeester et al., 2020). WDI was created by the Su Lab at the Scripps Research Institute and is shared under an open-source software license via GitHub [https://github.com/SuLab/WikidataIntegrator]. Using WDI as a framework, we wrote bots to transfer data from FCTs to our Wikibase (see the import sketch below).

==Ontology Engineering==
We will write schemas for the data models related to food composition data and food items. These schemas will serve as the ontology for our knowledge graph. Our Wikibase has a schema namespace that supports the Shape Expressions (ShEx) language [http://shex.io/shex-semantics/index.html]. ShEx is a data modeling and data validation language for RDF graphs. We provide an example below of a ShEx schema describing how food composition tables are modeled in our Wikibase. Defining ShEx schemas for our data models allows us to communicate the expected structure of data for a food composition table to others who may wish to contribute data to our public Wikibase. We have published the schema in the Schema namespace [https://wikifcd.wiki.opencura.com/wiki/EntitySchema:E1].

<code>
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wbt: <http://wikifcd.wiki.opencura.com/prop/direct/>
PREFIX wb:  <http://wikifcd.wiki.opencura.com/entity/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

start = @<#food_composition_table>

<#food_composition_table> EXTRA wbt:P1 {
  wbt:P1  [wb:Q12] ;
  wbt:P22 IRI ? ;
  wbt:P58 xsd:string ? ;
  wbt:P68 xsd:string * ;
  wbt:P65 @<#P65_country> * ;
  wbt:P56 xsd:string * ;
  wbt:P69 xsd:string * ;
  wbt:P70 xsd:string *
}

<#P65_country> {
  wbt:P31 [wb:Q127865]
}
</code>

These ShEx schemas will also reduce work for anyone looking to combine data from our knowledge graph with other data sets. For example, if researchers would like to explore our data, rather than writing exploratory SPARQL queries to discover what data are available and how they are modeled, they can simply review our ShEx schemas.
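The linked query remains the authoritative version. Purely as an illustration of the kind of query involved, the following minimal sketch runs a similar query from Python with the SPARQLWrapper library. The endpoint URL and the identifiers Q99, P1, and P90 are placeholders standing in for the project's actual "food item" class, "instance of" property, and Commons image property, and would need to be replaced with the values used in the live Wikibase.

<code>
# Hypothetical sketch: list food items that have an associated Commons image.
# The endpoint URL and the identifiers Q99 / P1 / P90 are placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://wikifcd.wiki.opencura.com/query/sparql"  # assumed wbstack query-service URL

QUERY = """
PREFIX wbt:  <http://wikifcd.wiki.opencura.com/prop/direct/>
PREFIX wb:   <http://wikifcd.wiki.opencura.com/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?food ?foodLabel ?image WHERE {
  ?food wbt:P1 wb:Q99 .     # placeholder: instance of "food item"
  ?food wbt:P90 ?image .    # placeholder: Wikimedia Commons image property
  ?food rdfs:label ?foodLabel .
  FILTER(LANG(?foodLabel) = "en")
}
LIMIT 50
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["foodLabel"]["value"], row["image"]["value"])
</code>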
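To make the acquisition and population steps above more concrete, the following is a minimal sketch, not our production bot, of how a table exported as CSV might be loaded with pandas and written to the Wikibase using WDI. The API and query-service URLs, the property and item identifiers, and the CSV column names are illustrative assumptions.

<code>
# Hypothetical sketch of an FCT-import bot built on WikidataIntegrator (WDI).
# The URLs, property/item identifiers, and CSV column names are placeholders.
import pandas as pd
from wikidataintegrator import wdi_core, wdi_login

MEDIAWIKI_API = "https://wikifcd.wiki.opencura.com/w/api.php"       # assumed
SPARQL_ENDPOINT = "https://wikifcd.wiki.opencura.com/query/sparql"  # assumed

login = wdi_login.WDLogin(user="BotAccount", pwd="...", mediawiki_api_url=MEDIAWIKI_API)

# One row per food item in the source food composition table.
fct = pd.read_csv("example_fct.csv")  # assumed columns: food_name, energy_kcal

for _, row in fct.iterrows():
    statements = [
        # placeholder P1 / Q99: "instance of" -> "food item"
        wdi_core.WDItemID(value="Q99", prop_nr="P1"),
        # placeholder P77: energy value taken from the table
        wdi_core.WDQuantity(value=float(row["energy_kcal"]), prop_nr="P77"),
    ]
    item = wdi_core.WDItemEngine(
        data=statements,
        new_item=True,
        mediawiki_api_url=MEDIAWIKI_API,
        sparql_endpoint_url=SPARQL_ENDPOINT,
    )
    item.set_label(row["food_name"], lang="en")
    item.write(login)
</code>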
==Validating RDF Graphs==
ShEx can be used to validate RDF graphs for conformance to a schema. This allows us to create forms for data contributors that help ensure data consistency. Data contributors will not need to familiarize themselves with our data models; the form-based contribution interface will guide curation. Our ShEx schemas will also be useful when integrating additional RDF data sets as the project matures. When we encounter new RDF data sources, we can explore them with our ShEx schemas to determine where they overlap with our existing data models. We will also be able to extend our schemas as the need for greater expressivity or complexity arises (a validation sketch is given below).

==Data Provenance==
Our emphasis on reusing data from multiple published sources requires precision in data provenance. The structure of references in the Wikibase data model allows us to assert provenance at the level of the statement. Simply put, we can connect our sources to individual statements of fact in our knowledge graph. In this way we can always be sure of where data was originally found, should we need to communicate that to others or follow up with the reference material. Using the SPARQL query language, we can also write tailored queries to extract subgraphs supported by a single source (see the query sketch below). In this way we support views of the data across multiple sources as well as views of the data drawn from individual sources. Researchers will not need to separate data manually; the provenance metadata is machine-actionable and stored at the level of individual statements in the graph.
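Returning to the validation step: the sketch below shows one way such a conformance check could be run programmatically with the PyShEx library. The abbreviated schema, the sample Turtle data, and the item identifier Q200 are illustrative assumptions rather than the project's actual validation tooling; in practice the schema text would be fetched from the EntitySchema page and the item data from the Wikibase's RDF export.

<code>
# Hypothetical sketch: check one item against a trimmed-down version of the
# food composition table ShEx schema using PyShEx. All identifiers and the
# sample Turtle data are placeholders.
from pyshex import ShExEvaluator

SCHEMA = """
PREFIX wbt: <http://wikifcd.wiki.opencura.com/prop/direct/>
PREFIX wb:  <http://wikifcd.wiki.opencura.com/entity/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

start = @<#food_composition_table>
<#food_composition_table> EXTRA wbt:P1 {
  wbt:P1  [wb:Q12] ;
  wbt:P58 xsd:string ?
}
"""

# Sample Turtle for the entity being checked; normally this would come from
# the item's RDF export rather than being written by hand.
RDF = """
@prefix wbt: <http://wikifcd.wiki.opencura.com/prop/direct/> .
@prefix wb:  <http://wikifcd.wiki.opencura.com/entity/> .

wb:Q200 wbt:P1 wb:Q12 ;
        wbt:P58 "example string value" .
"""

results = ShExEvaluator(
    rdf=RDF,
    schema=SCHEMA,
    focus="http://wikifcd.wiki.opencura.com/entity/Q200",
).evaluate()

for r in results:
    print(r.focus, "conforms" if r.result else f"fails: {r.reason}")
</code>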
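As an illustration of the tailored provenance queries mentioned above, the following sketch selects only the statements whose references cite one particular source, using the prov:wasDerivedFrom links that the Wikibase RDF model attaches to referenced statements. The endpoint URL, the "stated in" reference property P12, and the source item Q345 are placeholders, not the project's actual identifiers.

<code>
# Hypothetical sketch: extract the subgraph of statements supported by a
# single source. Endpoint URL, P12 ("stated in"), and Q345 are placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://wikifcd.wiki.opencura.com/query/sparql"  # assumed

QUERY = """
PREFIX p:    <http://wikifcd.wiki.opencura.com/prop/>
PREFIX ps:   <http://wikifcd.wiki.opencura.com/prop/statement/>
PREFIX pr:   <http://wikifcd.wiki.opencura.com/prop/reference/>
PREFIX wb:   <http://wikifcd.wiki.opencura.com/entity/>
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT ?item ?property ?value WHERE {
  ?item ?property ?statement .
  ?statement ?ps ?value ;
             prov:wasDerivedFrom ?reference .
  # placeholder P12 = "stated in"; Q345 stands for one particular source table
  ?reference pr:P12 wb:Q345 .
  FILTER(STRSTARTS(STR(?property), "http://wikifcd.wiki.opencura.com/prop/P"))
  FILTER(STRSTARTS(STR(?ps), "http://wikifcd.wiki.opencura.com/prop/statement/P"))
}
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["item"]["value"], row["property"]["value"], row["value"]["value"])
</code>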