Elsevier, a world-leading provider of scientific, technical and medical information products and services, announced the winners of the 2012 Semantic Web Challenge (SWC). The winners, determined by a jury of leading experts from academia and industry, were announced at the International Semantic Web Conference held in Boston, MA, in November 2012. The challenge and its prizes were sponsored by Elsevier.

Of this year’s 24 submissions, the panel of experts selected four Open Track winners and one Billion Triples Track winner.

Open Track challenge:

  • 1st prize: “EventMedia” by Houda Khrouf, Vuk Milicic and Raphaël Troncy from EURECOM, Sophia Antipolis, France – This application demonstrates how semantic web technology can be used to more efficiently and easily integrate multiple online and social media content sources that evolve over time.
  • 2nd prize: “Semantic Processing of Urban Data” by S. Kotoulas, V. Lopez, R. Lloyd, M. Sbodio, F. Lecue, M. Stephenson, E. Daly, V. Bicer, A. Gkoulalas-Divanis, G. Di Lorenzo, A. Schumann and P. Aonghusa from IBM Research’s Smart Cities Team – The mayor of Dublin wanted to know why his ambulances were perennially late, so this team ‘knowledge mined’ hundreds of information sources emanating from the city, ranging from Twitter feeds to garbage-collection tags and many more, to help the mayor improve city services.
  • 3rd prize: jointly awarded to “Open Self Medication” by Olivier Curé of Université Paris-Est, LIGM, CNRS – This application advises on self-medication, using the Linked Open Data cloud to mine contraindications for various over-the-counter medications and adding them where they were missing. A mobile geolocation price-comparison tool lets users find nearby pharmacies that sell the cheapest drugs, enabling French health care insurance companies to reduce their costs; and “Wildfire Monitoring” by K. Kyzirakos, M. Karpathiotakis, G. Garbis, C. Nikolaou, K. Bereta, I. Papoutsis, T. Herekakis, D. Michail, M. Koubarakis and C. Kontoes from the National and Kapodistrian University of Athens, the National Observatory of Athens and Harokopio University – This application combines satellite images with ontologies and Linked Geospatial Data to improve the wildfire monitoring service used by the Greek civil protection agencies, military and firefighters.

Billion Triples Track challenge:

  • Winner: “Exploring the Linked Data Cloud” by X. Zhang, D. Song, S. Priya, Z. Daniels, K. Reynolds and J. Heflin of Lehigh University, USA – This system allows users to understand how massive data sets are populated and reveals patterns within these data sets.

The availability of inference services in the Semantic Web context is fundamental for performing several tasks, such as checking the consistency of an ontology, constructing a concept taxonomy, and retrieving concepts.

Currently, the main approach used for performing inferences is deductive reasoning. In traditional Aristotelian logic, deductive reasoning is defined as inference in which the (logically derived) conclusion is of no greater generality than the premises. Other logical theories define deductive reasoning as inference in which the conclusion is just as certain as the premises: the conclusion of a deductive inference is necessitated by the premises, so the premises cannot be true while the conclusion is false. These characteristics explain its use in the Semantic Web: computing a class hierarchy and checking ontology consistency require certain and correct results, and do not call for conclusions more general than the premises.
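To make the deductive case concrete, here is a minimal sketch using the Apache Jena framework (discussed later in this piece). The Dog/Animal classes and the Rex individual are hypothetical examples, not part of any standard vocabulary: given the general axiom that Dog is a subclass of Animal and the specific fact that Rex is a Dog, an RDFS reasoner derives that Rex is an Animal, a conclusion no more general than its premises.

```java
import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class DeductionDemo {
    public static void main(String[] args) {
        String ns = "http://example.org/"; // hypothetical namespace
        Model base = ModelFactory.createDefaultModel();
        Resource dog = base.createResource(ns + "Dog");
        Resource animal = base.createResource(ns + "Animal");
        Resource rex = base.createResource(ns + "Rex");

        base.add(dog, RDFS.subClassOf, animal); // general premise
        base.add(rex, RDF.type, dog);           // specific fact

        // Wrap the base model with Jena's built-in RDFS reasoner.
        InfModel inf = ModelFactory.createRDFSModel(base);

        // The entailed triple: Rex is an Animal. Prints "true".
        System.out.println(inf.contains(rex, RDF.type, animal));
    }
}
```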

Conversely, tasks such as ontology learning, ontology population with assertions, ontology evaluation, and ontology mapping and alignment require inferences that can return conclusions more general than the premises. To this end, inductive learning methods, based on inductive reasoning, could be used effectively. Inductive reasoning generates conclusions that are of greater generality than the premises, although, unlike deductive reasoning, such conclusions are less certain than the premises. Specifically, in contrast to deduction, the starting premises of induction are specific (typically facts or examples) rather than general axioms. The goal of the inference is to formulate plausible general assertions that explain the given facts and can predict new facts. In other words, inductive reasoning attempts to derive a complete and correct description of a given phenomenon, or part of it.

It is important to mention that, of the two aspects of inductive inference, the generation of plausible hypotheses and their validation (the establishment of their truth status), only the first is of primary interest to inductive learning research, because the generated hypotheses are assumed to be judged by human experts and tested by known methods of deductive inference and statistics.

Elsevier announced the winners of the 2010 Semantic Web Challenge. The Elsevier-sponsored Challenge took place at the International Semantic Web Conference held in Shanghai, China, from 7-11 November 2010. A jury of seven leading experts from academia and industry awarded cash prizes exceeding 3000 Euro in total to the four best applications.

Over the last eight years, the Challenge has attracted more than 140 entries. All submissions are evaluated rigorously by a jury of leading scientists and industry experts in a three-round knockout competition consisting of a poster session, oral presentations and live demonstrations.

Organized this year by Christian Bizer from the Freie Universität Berlin, Germany, and Diana Maynard from the University of Sheffield, UK, the Semantic Web Challenge consists of two categories: “Open Track” and “Billion Triples Track.”

The Open Track requires that applications be usable by ordinary people or scientists and make use of the meaning of information on the Web. The Billion Triples Track requires applications to scale to the huge amounts of information that have been gathered from the open Web.

The winners of the 2010 Open Track challenge were the team from Stanford University comprising Clement Jonquet, Paea LePendu, Sean M. Falconer, Adrien Coulet, Natalya F. Noy, Mark A. Musen, and Nigam H. Shah for “NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources”. Their entry provides very clear benefits to the biomedical community, bringing together knowledge from many different entities on the web with a large corpus of scientific literature through the clever application of semantic web technologies and principles.

The second prize in the Open Track was awarded to the team from Rensselaer Polytechnic Institute comprising Dominic DiFranzo, Li Ding, John S. Erickson, Xian Li, Tim Lebo, James Michaelis, Alvaro Graves, Gregory Todd Williams, Jin Guang Zheng, Johanna Flores, Zhenning Shangguan, Gino Gervasio, Deborah L. McGuinness and Jim Hendler, for the development of “TWC LOGD: A Portal for Linking Open Government Data” – a massive semantic effort in opening up and linking all the public US government data, and providing the ecosystem and education for re-use.

The third prize in the 2010 Open Track was won by a combined team from the Karlsruhe Institute of Technology, Oxford University and the University of Southern California comprising Denny Vrandecic, Varun Ratnakar, Markus Krötzsch, and Yolanda Gil for their entry “Shortipedia” – a Web-based knowledge repository and collaborative curating system, pulling together a growing number of sources in order to provide a comprehensive, multilingual and diversified view on entities of interest – a Wikipedia on steroids.

The Billion Triples Track was won by “Creating voiD Descriptions for Web-scale Data” by Christoph Böhm, Johannes Lorey, Dandy Fenz, Eyk Kny, Matthias Pohl, Felix Naumann from Potsdam University, Germany. This entry uses state-of-the-art parallelisation techniques, and some serious cloud computing power, to dissect the enormous Billion Triples dataset into topic-specific views.

Further Information

Further information about the Semantic Web Challenge 2010, the runners-up, all submissions and the evaluation committee can be found on the Former Challenges page, as well as in the Elsevier press release about the Semantic Web Challenge 2010.

 

Blank Node

A blank node is an unnamed node whose identifier is assigned by the underlying RDF software and is not guaranteed to be the same across sessions. Within a graph, a blank node is guaranteed to resolve to the same thing (not a resource/URI, but a separate way of representing a node); between graphs, “it would be incorrect to assume that blank nodes from different graphs having the same blank node identifiers are the same” (see the RDF Primer). If you want multiple independent graphs to refer to the same resource, you have to give it an explicit URI.

Serialization syntaxes such as RDF/XML or Turtle allow you to assign an explicit name (a “blank node identifier”) to a blank node, but this serves only to distinguish different blank nodes, or to refer to the same blank node from different triples, within a single graph. If you give the same blank node identifier to blank nodes in different graphs, those blank nodes are still different from each other; in fact, there is no relationship or interaction between them at all.
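As a minimal sketch of this behavior in the Apache Jena API (the foaf:name property URI is real; the people described are illustrative), two blank nodes created in two separate models receive software-assigned identifiers and share no identity, no matter how similar their descriptions look:

```java
import org.apache.jena.rdf.model.*;

public class BlankNodeDemo {
    public static void main(String[] args) {
        String foafName = "http://xmlns.com/foaf/0.1/name";

        Model graphA = ModelFactory.createDefaultModel();
        Model graphB = ModelFactory.createDefaultModel();

        // createResource() with no URI yields a blank node whose internal
        // identifier is assigned by Jena and is only meaningful locally.
        Resource personA = graphA.createResource();
        personA.addProperty(graphA.createProperty(foafName), "Alice");

        Resource personB = graphB.createResource();
        personB.addProperty(graphB.createProperty(foafName), "Alice");

        // Two distinct, session-scoped identifiers; the nodes are unrelated.
        System.out.println(personA.getId());
        System.out.println(personB.getId());
    }
}
```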

 

Named Graph

Named Graphs is the idea that keeping multiple RDF graphs in a single document/repository and naming them with URIs provides useful additional functionality on top of the RDF Recommendations. The most authoritative specification covering named graphs, being a W3C Recommendation, is SPARQL.

Named Graphs turn the RDF triple model into a quad model by extending each triple with an additional item of information. This extra piece of information takes the form of a URI, which provides context for the triple it is associated with and an extra degree of freedom when it comes to managing RDF data. The ability to group triples around a URI underlies features such as tracking the provenance of RDF data, access control, and versioning.
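A minimal sketch of the quad model using Apache Jena’s Dataset abstraction (the graph name and triple are hypothetical): each named graph is an ordinary RDF model, and SPARQL’s GRAPH keyword lets a query select triples by the URI they are grouped under, which is exactly what provenance tracking and access control build on.

```java
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.*;

public class NamedGraphDemo {
    public static void main(String[] args) {
        // A Dataset holds a default graph plus any number of named graphs.
        Dataset dataset = DatasetFactory.create();

        Model m = ModelFactory.createDefaultModel();
        m.add(m.createResource("http://example.org/doc1"),
              m.createProperty("http://purl.org/dc/terms/creator"),
              "Alice");

        // The graph name (a URI) is the fourth element of each quad.
        dataset.addNamedModel("http://example.org/graphs/provenance", m);

        // SPARQL's GRAPH keyword selects triples by graph name.
        String q = "SELECT ?g ?s ?p ?o WHERE { GRAPH ?g { ?s ?p ?o } }";
        try (QueryExecution qe = QueryExecutionFactory.create(q, dataset)) {
            ResultSetFormatter.out(qe.execSelect());
        }
    }
}
```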

There’s some useful background available on Named Graph Provenance and Trust, on Named Graphs in general in a paper about NG4J, and specifically on their use in OpenAnzo.

Named Graphs are an important part of the overall technical framework for managing, publishing and querying RDF and Linked Data, and it’s important to understand the trade-offs among different approaches to using them.

Controlled vocabularies provide a way to organize knowledge for subsequent retrieval. In library and information science, a controlled vocabulary is a carefully selected list of words and phrases used to tag units of information (documents or works) so that they may be more easily retrieved by a search.

The fundamental difference between an ontology and a controlled vocabulary is the level of abstraction and the relationships among concepts. A formal ontology is a controlled vocabulary expressed in an ontology representation language. Such a language has a grammar for using vocabulary terms to express something meaningful within a specified domain of interest, and the grammar contains formal constraints (e.g., it specifies what it means to be a well-formed statement, assertion, or query) on how terms in the ontology’s controlled vocabulary can be used together.

Controlled vocabularies are used in building ontologies not only to reduce the duplication of effort involved in building an ontology from scratch, but also to establish a mechanism for mapping differing vocabularies onto the ontology.

Here is a list of well-known controlled vocabularies:

  • FOAF: Friend Of A Friend—the best-known vocabulary for modeling people (and one of the best-known RDF vocabularies), FOAF can represent basic personal information, such as contact details, and basic relationships, such as who a person knows (see the sketch after this list).
  • SKOS: Simple Knowledge Organization System — it provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. Useful for describing models that have some hierarchy and structure but are not sufficiently discrete and formal to map directly into OWL.
  • AIISO: Academic Institution Internal Structure Ontology—effectively models organizational relationships, such as Institution->School->Department->Faculty with the property part_of and defines courses taught by those Departments with the teaches property. AIISO was developed within the past year by Talis, a software company dedicated to semantic technologies, for their academic resource list management system, Talis Aspire.
  • University Ontology—the University ontology is undergoing active development and is currently unstable, but does a good job of modeling the details of course scheduling. It is being developed by Patrick Murray-John at the University of Mary Washington, who is in touch with the developers of the AIISO ontology at Talis.
  • SWRC: Semantic Web for Research Communities—there is much overlap between AIISO and SWRC. While there is a text on the development of SWRC, it is hard to find clear documentation of the ontology itself, so a comparison of the two would take more time.
  • DC: Dublin Core—One of the original and most widely used vocabularies, Dublin Core can be used for cataloging publications.
  • bibTeX.owl—BibTeX is a format for describing source citations. bibTeX.owl is the BibTeX ontology chosen by Nick Matsakis for his BibTeX RDFizer, part of MIT’s SIMILE project. Depending on whether BibTeX data is prevalent and used throughout the community, this may be another option for cataloging publications.
  • Bibliography ontology—the Bibliography ontology reuses many existing ontologies, such as Dublin Core and FOAF properties. Its goal is to be a superset of legacy formats like BibTeX. It has multiple levels: level one, for example, covers simple bibliographic data, while level three can aggregate many kinds of sources, such as writings, speeches, and conferences. It is used in the University ontology.
  • SIOC: Semantically-Interlinked Online Communities—the SIOC Core Ontology Specification is an RDFS/OWL vocabulary/ontology describing the main concepts and properties for online communities.
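To give a flavor of the first of these, here is a minimal FOAF sketch in Apache Jena (the people and their example.org URIs are hypothetical; the FOAF namespace is the real one): it records names, a mailbox, and a knows relationship, then prints the graph as Turtle.

```java
import org.apache.jena.rdf.model.*;

public class FoafDemo {
    public static void main(String[] args) {
        String foaf = "http://xmlns.com/foaf/0.1/";
        Model m = ModelFactory.createDefaultModel();
        m.setNsPrefix("foaf", foaf);

        Resource alice = m.createResource("http://example.org/people/alice");
        Resource bob = m.createResource("http://example.org/people/bob");

        alice.addProperty(m.createProperty(foaf, "name"), "Alice")
             .addProperty(m.createProperty(foaf, "mbox"),
                          m.createResource("mailto:alice@example.org"))
             .addProperty(m.createProperty(foaf, "knows"), bob);
        bob.addProperty(m.createProperty(foaf, "name"), "Bob");

        m.write(System.out, "TURTLE"); // serialize the graph as Turtle
    }
}
```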

On 28 August 2010 the Jena project celebrated its 10th anniversary of providing us with a Semantic Web framework; Jena is now probably one of the most popular Java RDF APIs in the community. It started as an idea by a developer at HP Labs in Bristol, and Andy Seaborne mentioned Brian McBride’s email (http://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0128.html) as the official starting point for the project.

TDB is a persistent graph storage layer for Jena. TDB works with the Jena SPARQL query engine (ARQ) to provide complete SPARQL support together with a number of extensions (e.g. property functions, aggregates, arbitrary-length property paths). It is pure Java, employing memory-mapped I/O, a custom implementation of B+Trees, and optimized range filters for XSD value spaces (integers, decimals, dates, dateTimes).

TDB has been used to load UniProt v13.4 (1.7B triples, 1.5B unique) on a single 64-bit machine (36 hours, 12k triples/s). TDB 0.5 results for the Berlin SPARQL Benchmark (August 2008) are also available.
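As a minimal sketch of how TDB is typically used from Java (the directory path and data file are hypothetical; the COUNT aggregate is one of the ARQ extensions mentioned above):

```java
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.tdb.TDBFactory;

public class TdbDemo {
    public static void main(String[] args) {
        // TDB persists its indexes in an ordinary directory.
        Dataset dataset = TDBFactory.createDataset("/tmp/tdb-demo");

        // Load a (hypothetical) local Turtle file into the default graph.
        Model model = dataset.getDefaultModel();
        model.read("data.ttl");

        // Count all triples via ARQ's SPARQL aggregate support.
        String q = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }";
        try (QueryExecution qe = QueryExecutionFactory.create(q, dataset)) {
            ResultSetFormatter.out(qe.execSelect());
        }
        dataset.close();
    }
}
```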

The Web is the largest human information construct and information channel in history and, thus, of utmost relevance to any organization. In order to understand what the Web is, engineer its future and ensure its social benefit we need a new interdisciplinary field that we call Web Science.

The term Web Science was coined by Berners-Lee and colleagues (2006) in a short Science article. Since then, many researchers have adopted the paradigm, organized specialized Web Science conferences and developed the paradigm further (e.g. Hendler et al. 2008).

Web Science is a new, interdisciplinary scientific paradigm (or even discipline) that seeks to understand the Web as a whole, with a focus on technical and social challenges. To model, predict and thus understand the development of the Web requires a mix of expertise from a wide set of disciplines, ranging from sociology to computer science.

“People think that you need to be a computer scientist to study the Web, but that is not the case,” according to Professor Dame Wendy Hall, Director of the Web Science Doctoral Training Centre at the University of Southampton. “We need economists, sociologists, political scientists and linguists to fully understand the impact the Web is having on our lives.”

In 2010, there are several events in the field of Web Science. The first is the Web Science Conference (WebSci10), which will be held in Raleigh, North Carolina, on 26-27 April 2010; Tim Berners-Lee is one of the invited speakers at this event.

The second event is a keynote talk by Professor Steffen Staab, titled “Web Science: Your Business, too!”, to be given at the Business Information Systems conference (BIS 2010) in Berlin, Germany, on May 4, 2010.

The last is the minitrack at the 16th Americas Conference on Information Systems (AMCIS 2010), titled “Web Science – A New Paradigm in IS-Research”. Submissions are welcome in the following areas: (1) examining social aspects of the Web, (2) using Web data for forecasting or other purposes, and (3) proposing architectural principles of a Web infrastructure for social software.

You may find some more information about Web Science by visiting the following link:

http://webscience.org/webscience.html

Elsevier announced the winners of the 2009 Semantic Web Challenge, which took place at the International Semantic Web Conference held in Washington, D.C., from October 25-29, 2009. A jury of eleven leading experts from academia and industry awarded cash prizes of 2750 Euro in total, sponsored by Elsevier, to the four best applications.

The 2009 Semantic Web Challenge was organized by Peter Mika of Yahoo! Research and Chris Bizer of Freie Universität Berlin and consists of two categories: “Open Track” and “Billion Triples Track.” The Open Track requires that applications utilize the semantics (meaning) of data and that they be designed to operate in an open Web environment, whilst the Billion Triples Track focuses on dealing with the very large, low-quality data sets commonly found on the Web.

The Billion Triples Track was won by “Scalable Reduction” by Gregory Todd Williams, Jesse Weaver, Medha Atre, and James A. Hendler (Rensselaer Polytechnic Institute, USA). The entry showed how massive parallelization can be applied to quickly clean and filter large amounts of RDF data.

The winners of the 2009 Open Track were Chintan Patel, Sharib Khan, and Karthik Gomadam from Applied Informatics, Inc for “TrialX” (http://trialx.com). TrialX enables patients to find new treatments by intelligently matching them to clinical trials, using advanced medical ontologies to combine several electronic health records with user-generated information.

The second prize of the 2009 Open Track was awarded to Andreas Harth from the Institute of Applied Informatics and Formal Description Methods, Universität Karlsruhe, Germany for “VisiNav” (http://visinav.deri.org/). The third prize of the 2009 Open Track was awarded to Giovanni Tummarello, Richard Cyganiak, Michele Catasta, Szymon Danielczyk, and Stefan Decker from the Digital Enterprise Research Institute, Ireland for the development of “Sig.ma” (http://sig.ma/).

“This year’s winner of the Open Track is an application that we can hold up as an example to those outside of our community. In comparison, the Billion Triples Track has attracted fewer submissions this year, but it has been noticeable that all submissions have dealt with increasing amounts of information. Altogether we see clear progress toward implementing the vision of the Semantic Web,” said Chris Bizer and Peter Mika, co-chairs of the Semantic Web Challenge.

Open Track
1st place:
TrialX
Chintan Patel, Sharib Khan, and Karthik Gomadam from Applied Informatics, Inc
http://www.cs.vu.nl/~pmika/swc/documents/TrialX-healthx-iswc09-challenge.pdf

2nd place:
VisiNav
Andreas Harth from the Institute of Applied Informatics and Formal Description Methods, Universität Karlsruhe, Germany
http://www.cs.vu.nl/~pmika/swc/documents/VisiNav-paper.pdf

3rd place:
Sig.ma
Giovanni Tummarello, Richard Cyganiak, Michele Catasta, Szymon Danielczyk, and Stefan Decker from the Digital Enterprise Research Institute, Ireland
http://www.cs.vu.nl/~pmika/swc/documents/Sig.ma:%20Live%20views%20on%20the%20web%20of%20data-sigma.pdf

Billion Triples Track:
Winner:
Scalable Reduction
Gregory Todd Williams, Jesse Weaver, Medha Atre, and James A. Hendler from the Rensselaer Polytechnic Institute, USA
http://www.cs.vu.nl/~pmika/swc/documents/Scalable%20Reduction%20of%20Large%20Datasets%20to%20Interesting%20Subsets-btc2009.pdf

More information on the 2009 Semantic Web Challenge Awards, as well as a demo and links to all the competing applications, can be found at http://challenge.semanticweb.org

The Web is increasingly understood as a global information space consisting not just of linked documents, but also of linked data. The term Linked Data was coined by Tim Berners-Lee in his Linked Data Web architecture note. The goal of Linked Data is to enable people to share structured data on the Web as easily as they can share documents today. More specifically, Wikipedia defines Linked Data as “a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.”

More than just a vision, the Web of Data has been brought into being by the maturing of the Semantic Web technology stack, and by the publication of an increasing number of datasets according to the principles of Linked Data. Today, this emerging Web of Data includes data sets as extensive and diverse as DBpedia, Geonames, US Census, EuroStat, MusicBrainz, BBC Programmes, Flickr, DBLP, PubMed, UniProt, FOAF, SIOC, OpenCyc, UMBEL and Yago. The availability of these and many other data sets has paved the way for an increasing number of applications that build on Linked Data, support services designed to reduce the complexity of integrating heterogeneous data from distributed sources, as well as new business opportunities for start-up companies in this space.

The basic tenets of Linked Data are to:

  • use the RDF data model to publish structured data on the Web
  • use RDF links to interlink data from different data sources

Applying both principles leads to the creation of a data commons on the Web, a space where people and organizations can post and consume data about anything. This data commons is often called the Web of Data or Semantic Web.
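As an illustration of the two tenets above, here is a minimal Apache Jena sketch (the example.org URI is hypothetical; the DBpedia URI is real): the resource is described in RDF, and an owl:sameAs link connects it to the same entity in another data source.

```java
import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDFS;

public class LinkedDataDemo {
    public static void main(String[] args) {
        Model m = ModelFactory.createDefaultModel();

        // Tenet 1: publish structured data about the thing using RDF.
        Resource city = m.createResource("http://example.org/resource/Berlin");
        city.addProperty(RDFS.label, "Berlin");

        // Tenet 2: an RDF link pointing into another data source.
        city.addProperty(OWL.sameAs,
                m.createResource("http://dbpedia.org/resource/Berlin"));

        m.write(System.out, "TURTLE");
    }
}
```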

In summary, Linked Data is simply about using the Web to create typed links between data from different sources. It is important to understand that Linked Data is not the Semantic Web itself; rather, it is the foundation on which the Semantic Web is built.


The Tabulator Extension is an extension for Firefox that provides a human-readable interface for linked data. It is based on the Tabulator, a web-based interface for browsing RDF. Using Tabulator’s outline mode, query views, and back-end code, the Tabulator Extension integrates the browsing of linked data directly into the Firefox browser, making for a more natural and seamless experience when browsing linked data on the Web.

A primary goal of the Tabulator Extension is to explore how linked data could be displayed in the next generation of Web browsers. The Tabulator aims to make linked data human-readable by taking a document and picking out the actual things that the document describes. The properties of these things are then displayed in a table, and the links in that table can be followed to load more data about other things from other documents.

A link to the latest version of the extension can be found on the Tabulator Extension site. Moreover, Tabulator is now hosted on addons.mozilla.org; if you download and install it from there, it will provide automatic updates through the Firefox Add-on Manager.

 

Once the extension file is downloaded, it should install automatically. After restarting Firefox, all documents served as application/rdf+xml and text/n3 (and, for a while, legacy documents served as text/rdf+n3) will be automatically loaded in the Tabulator’s outline view. It may be necessary to disable other RDF-related extensions that could override the Tabulator’s capture of these documents.
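For publishers, the trigger here is the Content-Type header rather than anything in the extension itself. Below is a minimal sketch, using the JDK’s built-in com.sun.net.httpserver (the document, path, and port are hypothetical), of serving an RDF/XML document with the media type the Tabulator watches for:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class RdfServer {
    public static void main(String[] args) throws Exception {
        String rdfXml =
            "<?xml version=\"1.0\"?>\n" +
            "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n" +
            "         xmlns:foaf=\"http://xmlns.com/foaf/0.1/\">\n" +
            "  <foaf:Person rdf:about=\"http://example.org/alice\">\n" +
            "    <foaf:name>Alice</foaf:name>\n" +
            "  </foaf:Person>\n" +
            "</rdf:RDF>\n";

        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/data", exchange -> {
            byte[] body = rdfXml.getBytes(StandardCharsets.UTF_8);
            // This header is what makes RDF-aware clients treat the
            // response as data rather than as a plain document.
            exchange.getResponseHeaders().set("Content-Type", "application/rdf+xml");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
    }
}
```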

 

For more information, read this article: Tabulator: Exploring and Analyzing linked data on the Semantic Web