|-
| NSF
| [https://www.nsf.gov/pubs/2020/nsf20591/nsf20591.htm Information Integration and Informatics (III)] under CISE
NEW: no deadlines for SMALL projects (submit anytime after Oct 1, 2020); September 7, 2020 - September 14, 2020 for MEDIUM projects
| "The III program supports innovative research on computational methods for the full data lifecycle, from collection through archiving and knowledge discovery, to maximize the utility of information resources to science and engineering and broadly to society. III projects range from formal theoretical research to those that advance data-intensive applications of scientific, engineering or societal importance. Research areas within III include:

* Visualizations of this data

The primary output of this work is a knowledge graph of structured data published in a publicly available instance of the Wikibase platform and populated with data from heterogeneous food composition tables. Wikibase is a set of extensions to the MediaWiki software platform, developed by Wikimedia Deutschland as free software [https://www.mediawiki.org/wiki/Wikibase]. It is a novel infrastructural platform for data management suitable for data from many domains, and ours is the first application built on Wikibase tailored to the needs of the epidemiological community.

Multiple data visualization options are available via the Query Service of our Wikibase instance. The Query Service is a SPARQL endpoint which supports querying the data in the knowledge graph via the SPARQL query language [https://www.w3.org/TR/sparql11-query/]. Graphs, charts, network diagrams, and maps are some of the visualizations we will be able to offer end-users of this knowledge base.
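
As a concrete illustration, here is the kind of minimal query the Query Service accepts. This is a sketch that assumes only the standard Wikibase RDF mapping (every item carries a language-tagged rdfs:label); in the Query Service web interface, adding a directive such as #defaultView:BubbleChart above a query renders the results as a chart rather than a table.

<syntaxhighlight lang="sparql">
# Minimal sketch: ten resources from the knowledge graph with their
# English labels. rdfs: is one of the prefixes the Wikibase query
# service declares by default.
SELECT ?item ?itemLabel WHERE {
  ?item rdfs:label ?itemLabel .
  FILTER(LANG(?itemLabel) = "en")
}
LIMIT 10
</syntaxhighlight>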


* Case Study One: Fermented foods

This project will support multilingual data, reducing barriers to data reuse for speakers of many languages beyond English. Users will be able to query using any of the supported human languages, and see results in the language of their choice. Through the reuse of data from Wikidata, a multilingual knowledge base, we will add common names as well as scientific names for foods items and plant and animal species in as many human languages as possible.
This project will support multilingual data, reducing barriers to data reuse for speakers of many languages beyond English. Users will be able to query using any of the supported human languages, and see results in the language of their choice. Through the reuse of data from Wikidata, a multilingual knowledge base, we will add common names as well as scientific names for foods items and plant and animal species in as many human languages as possible.
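
Because every label in the knowledge graph is stored with a language tag, choosing a result language is a one-line filter. A minimal sketch, again assuming only the standard Wikibase RDF mapping:

<syntaxhighlight lang="sparql">
# Sketch: retrieve labels recorded in Spanish; swap "es" for any
# other language code to query in that language instead.
SELECT ?item ?label WHERE {
  ?item rdfs:label ?label .
  FILTER(LANG(?label) = "es")
}
LIMIT 10
</syntaxhighlight>
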
In many FCTs food items are identified with a single label. Our approach supports searching across multiple aliases for a single resource. This broadens search options so that lookups are not constrained to a single search term. These aliases serve several disambiguation functions. They allow the use of common names as well as scientific names, and they allow multilingual indexing. They also allow us to store historic names, whether scientific or common, that are no longer used but may be found in the literature or in historical sources. For example, this is the record for Juglans regia in our knowledge graph [https://wikifcd.wiki.opencura.com/wiki/Item:Q82650]. In addition to the species name we also support the aliases 'common walnut', 'Old World Walnut', 'Walnut', 'Persian Walnut' and 'Juglans fallax' for this item. A more extensive example is Vaccinium vitis-idaea [https://wikifcd.wiki.opencura.com/wiki/Item:Q117098], for which we provide 13 aliases beyond the species name.
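
In the standard Wikibase RDF mapping these aliases are exposed as skos:altLabel, so they can be retrieved directly from the Query Service. A sketch for the Juglans regia record mentioned above, assuming the instance's default wd: entity prefix:

<syntaxhighlight lang="sparql">
# Sketch: every alias stored for Juglans regia (Q82650), together
# with the language each alias is recorded in.
SELECT ?alias (LANG(?alias) AS ?language) WHERE {
  wd:Q82650 skos:altLabel ?alias .
}
</syntaxhighlight>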


Our choice to use Wikibase gives us access to the data serialized as RDF. The SPARQL endpoint we have created lets us ask questions of this data that were previously not possible to ask. For example, we can now ask questions such as "show me all recipes that call for one or more ingredients containing proanthocyanidins".
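
The sketch below shows one shape such a query could take. The property IDs are hypothetical placeholders, since they depend on how the instance models recipes; the actual 'has ingredient' and 'has food component' properties would be substituted in.

<syntaxhighlight lang="sparql">
# Hedged sketch of the recipe question above. wdt:P1 and wdt:P2 are
# hypothetical placeholders, not the instance's real property IDs.
SELECT DISTINCT ?recipe ?recipeLabel WHERE {
  ?recipe wdt:P1 ?ingredient .                     # hypothetical "has ingredient"
  ?ingredient wdt:P2 ?component .                  # hypothetical "has food component"
  ?component rdfs:label "proanthocyanidins"@en .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
</syntaxhighlight>
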
The ability to federate SPARQL queries between our Wikibase and Wikidata allows us to combine our data with resources from the media repository of the Wikimedia Foundation, Wikimedia Commons. The ability to quickly locate images, videos and sound files related to the resources in our Wikibase allows us to provide interactive multimedia experiences in the applications we build on top of our Wikibase. Wikimedia Commons has images of many of the taxa of which our food items are products.
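
A sketch of such a federated query follows. It assumes local items are linked to their Wikidata counterparts by a property holding the full Wikidata entity IRI; wdt:P3 is a hypothetical placeholder for that property, while P18 is Wikidata's actual 'image' property, whose values are files hosted on Wikimedia Commons.

<syntaxhighlight lang="sparql">
# Hedged sketch: pull Commons-hosted images for local items via the
# public Wikidata endpoint. wdt:P3 is a hypothetical placeholder.
PREFIX wikidata: <http://www.wikidata.org/prop/direct/>
SELECT ?item ?wikidataItem ?image WHERE {
  ?item wdt:P3 ?wikidataItem .
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataItem wikidata:P18 ?image .   # P18 = image (on Commons)
  }
}
LIMIT 20
</syntaxhighlight>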


The Wikibase infrastructure supports both human and algorithmic curation. Thus we can programmatically ingest data from external sources and also support crowdsourced recipes from anyone with access to the internet. The World Wide Web Consortium (W3C) published the following definition of the Semantic Web in 2009: "Semantic Web is the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration, and reuse of data across various applications." (W3C Semantic Web Activity, 2009). The Wikidata knowledge base fulfills the requirements outlined by the W3C in that each resource has a unique identifier, is linked to other resources by properties, and all of the data is machine actionable as well as editable by both humans and machines.

Our decision to build this knowledge base using the infrastructure of the Wikimedia Foundation means that other researchers will be able to access this data for reuse in their own projects in a variety of formats. Results from our SPARQL endpoint are available for download as JSON, TSV, CSV and HTML. Preformatted code snippets for making requests to our SPARQL endpoint are available in PHP, jQuery, JavaScript, Java, Perl, Python, Ruby, R and MATLAB. These options allow researchers to more quickly integrate data from our knowledge base into their existing projects using the tools of their choice.


=People=