Why does Semantic Web not work

Semantic Web

The World Wide Web (WWW) has drastically changed the availability of electronic information. There are currently more than a billion websites, and the number continues to grow rapidly. However, this rapid growth has made it increasingly difficult to find, organize and manage information.

To solve these problems, Tim Berners-Lee, the inventor of the WWW, proposed the Semantic Web as an extension of the existing Web. The core of this proposal is to annotate the information available on the Web with machine-processable semantics.

Semantics

According to the Duden, semantics is a branch of linguistics that deals with the meanings of linguistic signs and sequences of signs, i.e. with the content analysis of words, sentences or texts.

Semantic Web

According to its inventor Berners-Lee, the Semantic Web is an extension of the conventional Web in which information is given a clear meaning, making it easier for humans and machines to work together: "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation" (Berners-Lee, Hendler, Lassila, 2001). With the Semantic Web, the World Wide Web is meant to become “intelligent”: thanks to this technology, not only people but also machines should be able to “understand” and process Web content.

Figure: World Wide Web from W3C[1]

Figure: Semantic Web from W3C[2]


So far, machines have treated web pages only as collections of characters and links, shown in the first figure as resources connected by links. The machines do not "know" what kind of resource they are dealing with or what kind of relationship exists between different resources. The Semantic Web, by contrast, allows terms and their ordering to be defined in a way that machines can process. This means that machines can "understand" what a data resource is and how it relates to other resources.

In 1998 Tim Berners-Lee published his idea for "Semantic Web". Since then, the Semantic Web has been a topic of the W3C (World Wide Web Consortium).

The World Wide Web Consortium (W3C) is a forum for information, e-commerce and communication. It develops interoperable technologies (e.g. specifications, guidelines, software and tools) to lead the Web to its full potential.

The W3C Semantic Web Activity, together with many other researchers and industrial partners, works on the Semantic Web. Its job is to set the standards and develop the technologies that define the data on the Web and connect it in a way that makes information retrieval, data automation, integration and reuse more effective.

In Berners-Lee's concept, words do not all have to be cataloged, nor do pages have to be created anew. The Semantic Web is not intended to replace the existing Web, but to extend it. HTML pages would not have to be rewritten, only supplemented with a semantic description. To implement the Semantic Web, semantic metadata (data that describes data) must be added to information sources so that machines can process the data effectively on the basis of this descriptive information. The relevant data would be prepared in a machine-readable language to allow communication between machines, since a common language is a prerequisite for successful communication.

Syntax: XML

XML (eXtensible Markup Language) provides the syntax of this language. XML is similar to HTML and makes it easy to represent structured documents on the Web. It allows authors to define their own tags, so that documents follow a structure chosen by the author.
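As a small illustration of author-defined tags, the following sketch parses a short XML fragment with Python's standard library. The tag names (agent, name, email) and the values are invented for this example and do not belong to any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# A document whose tags were chosen by its author (hypothetical vocabulary).
document = """
<agent>
  <name>Niki Devgood</name>
  <email>niki@example.org</email>
</agent>
"""

root = ET.fromstring(document)

# XML fixes the syntax (well-formed, nested tags) but says nothing about
# what "agent" or "email" mean; that is left to RDF and ontologies.
fields = {child.tag: child.text for child in root}
print(root.tag, fields)
```

The parser can recover the structure, but a program still has no way of knowing what the tags stand for, which is the gap the following sections address.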

Semantics: RDF

Meaning is expressed with RDF, the Resource Description Framework. RDF describes resources on the Web. It builds on the existing XML and URI technologies. URI stands for Uniform Resource Identifier; URIs are used to identify individual resources and to make statements about them. An RDF statement describes a resource, a property of that resource and the value of that property. These statements are often referred to as triples, because they consist of a subject, a predicate and an object. This corresponds to a resource (subject), a property (predicate) and a property value (object). The subject and the predicate are each identified by a URI; the object is either a URI or a literal value such as a string.

The following is an example of an RDF statement using a simple English sentence:

[Resource]          [property]     [value]
The secret agent    is             Niki Devgood
[subject]           [predicate]    [object]
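The same statement can be sketched as a small data structure. This is only an illustration of the triple model, not RDF syntax; the example.org URIs are placeholders made up for this sketch.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property of that resource
    obj: str        # the property value (a URI or a literal)

# All URIs below are hypothetical placeholders under example.org.
statement = Triple(
    subject="http://example.org/people#secret-agent",
    predicate="http://example.org/terms#is",
    obj="Niki Devgood",  # a literal value rather than a URI
)

print(f"{statement.subject} --{statement.predicate}--> {statement.obj}")
```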

RDF triples are often represented graphically as follows:

Illustration from ALTOVA[3]

Once this triple exists, further triples can be created to link the secret agent to other things, e.g. an email address and a picture of her red convertible, as in the following graph.


Illustration from ALTOVA[4]
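A graph of this kind can be sketched as a set of triples that share a subject. The "ex:" names below are hypothetical shorthand for full URIs, and the match function is a simple helper written for this example, not part of any RDF library.

```python
# A graph is simply a set of (subject, predicate, object) triples.
# The "ex:" names are hypothetical shorthand for full URIs.
graph = {
    ("ex:secret-agent", "ex:is", "Niki Devgood"),
    ("ex:secret-agent", "ex:hasEmail", "mailto:niki@example.org"),
    ("ex:secret-agent", "ex:hasImage", "ex:red-convertible.jpg"),
}

def match(graph, subject=None, predicate=None, obj=None):
    """Return all triples that agree with the given parts; None is a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )

# Everything that is known about the secret agent:
for triple in match(graph, subject="ex:secret-agent"):
    print(triple)

# Only the email addresses, regardless of whose they are:
emails = [o for (_, _, o) in match(graph, predicate="ex:hasEmail")]
print(emails)
```

Because every statement has the same three-part shape, adding new knowledge only means adding more triples; the query helper does not need to change.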

Ontology

The Web is currently decentralized and disorganized. It can therefore be assumed that different URIs are used for the same concept, for example one for "zip code" in the USA and another for "Postleitzahl" (postal code) in Germany. A program that works with this information should know that these different URIs mean the same thing. This problem is to be solved by the third basic component of the Semantic Web: the ontology.
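The idea of declaring two URIs equivalent can be sketched as a lookup table that maps each URI to one canonical representative. The URIs, resource names and postal codes below are invented for illustration; a real ontology would express the equivalence in a language such as OWL rather than in a Python dictionary.

```python
# Hypothetical URIs used by two different data sources for the same concept.
US_ZIP = "http://example.org/us#zipCode"
DE_PLZ = "http://example.org/de#postleitzahl"

# The ontology's contribution: an explicit statement that both URIs are
# equivalent. Each URI is mapped to one canonical representative.
canonical = {US_ZIP: US_ZIP, DE_PLZ: US_ZIP}

# Invented sample triples from the two sources.
data = [
    ("ex:office-sf", US_ZIP, "94105"),
    ("ex:office-berlin", DE_PLZ, "10115"),
    ("ex:office-berlin", "http://example.org/de#stadt", "Berlin"),
]

def postal_codes(triples):
    """Collect postal codes regardless of which equivalent URI a source used."""
    return {
        s: o
        for (s, p, o) in triples
        if canonical.get(p, p) == US_ZIP
    }

print(postal_codes(data))
```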

In connection with the Semantic Web, an "ontology" is defined as a schema that explicitly specifies the hierarchies and relationships between different resources. Semantic Web ontologies consist of a taxonomy and a set of inference rules that machines can use to draw logical conclusions.

Taxonomy

A taxonomy in this context is a classification system, such as the scientific system for classifying plants (kingdom / division / class / order, etc.), in which resources are grouped into classes and subclasses based on their relationships and common properties.

Inference rules

With inference rules, programs can draw conclusions from the defined preconditions. A simple example: if "animal - mammal - dog - poodle" is defined in a taxonomy, a program given only the information "poodle" can conclude that it is a mammal of the dog family and not, for example, a bird or a reptile.
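The poodle example can be sketched in a few lines. The dictionary below encodes the chain from the text; the bird entries are extra classes added only for contrast, and the function names are chosen for this sketch.

```python
# Each class points to its direct superclass (the chain from the text).
SUBCLASS_OF = {
    "poodle": "dog",
    "dog": "mammal",
    "mammal": "animal",
    "sparrow": "bird",   # extra entries, added only for contrast
    "bird": "animal",
}

def ancestors(cls):
    """Follow the subclass links upwards and return every superclass in order."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain

def is_a(cls, candidate):
    """Inference rule: X is a Y if Y is X itself or one of X's superclasses."""
    return candidate == cls or candidate in ancestors(cls)

print(ancestors("poodle"))        # ['dog', 'mammal', 'animal']
print(is_a("poodle", "mammal"))   # True
print(is_a("poodle", "bird"))     # False
```

The program was never told directly that a poodle is a mammal; it derives this by applying the rule to the stored hierarchy.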

RDF Schema (RDFS) and Web Ontology Language (OWL)

RDFS is used to create vocabularies that describe groups of related RDF resources and the relationships between those resources. An RDFS vocabulary defines which properties can be assigned to the RDF resources in a particular domain. RDFS can also be used to create resource classes with common properties. Building on the same triple model as RDF, RDFS triples consist of classes, class properties and property values.

In an RDFS vocabulary, resources are defined as instances of classes. A class is itself a resource, and each class can be a subclass of another. Thanks to this hierarchically structured semantic information, machines can determine the meaning of resources from their properties and classes.

Roughly speaking, RDFS is a simple vocabulary language for describing the relationships between resources. The Web Ontology Language (OWL) builds on RDFS and offers a much more extensive vocabulary for defining Semantic Web ontologies, that is, for describing the hierarchies and relationships between different resources. Since taxonomies express the hierarchical relationships between resources, OWL can be used to assign properties to resource classes and to pass these properties on to their subclasses. OWL also supports class axioms such as subClassOf, disjointWith, etc. and class descriptions such as unionOf, intersectionOf, etc. Many other concepts have also been integrated into OWL, making it the most extensive standard ontology description language available today.
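The class axioms and class descriptions named above have a set-theoretic reading that can be sketched with plain Python sets. This is neither OWL syntax nor a reasoner; the classes and individuals are invented, and each class is simply modeled as the set of its members.

```python
# Class extensions: the set of (hypothetical) individuals in each class.
dogs = {"rex", "bella"}
cats = {"tom"}
pets = {"rex", "tom"}

# unionOf: individuals that belong to at least one of the classes.
dog_or_cat = dogs | cats

# intersectionOf: individuals that belong to all of the classes.
pet_dogs = dogs & pets

def are_disjoint(a, b):
    """disjointWith: the two classes must not share any individual."""
    return a.isdisjoint(b)

def is_subclass(sub, sup):
    """subClassOf: every individual of the subclass is also in the superclass."""
    return sub <= sup

print(sorted(dog_or_cat), sorted(pet_dogs))
print(are_disjoint(dogs, cats), is_subclass(pet_dogs, dogs))
```

An OWL ontology states such axioms about classes in general, so that a reasoner can check data against them or derive new facts, whereas this sketch only tests them on a fixed set of members.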

Let's jump a few years into the future. You work as a software consultant, and today you have a working lunch with one of your most important customers. His company has an urgent project at its San Francisco office. He needs you as a consultant for it and asks you to fly to San Francisco as soon as possible to start work. What do you do? You pick up your handheld computer, activate your Semantic Web agent and instruct it to book a non-stop flight to San Francisco that departs before 10 a.m. tomorrow, with an aisle seat if possible. As soon as the agent finds a suitable flight with an aisle seat available, it books the flight with your American Express card. At the same time, it informs you that you will miss a dentist appointment at home and adds a note to your calendar that the appointment has to be postponed. Next, you tell the agent that you need a limousine to get to the client's office. The agent then looks for limousine services with a service rating of "very good" and books a driver who will pick you up 30 minutes after the aircraft arrives. The agent also reserves a room for you in your favorite hotel in San Francisco and secures the lowest price using your Reward Card number. Finally, the agent updates your schedule, enters the travel information and prints the travel confirmations at your office.

With just a few clicks, your Semantic Web agent has found a flight and a limousine service and updated your calendar. It even compared your travel plans with your appointment calendar and found the conflict with the dentist appointment. To do this, the agent had to find, interpret, combine and act on information from various sources.
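Two of the agent's steps, selecting a flight and detecting the calendar conflict, can be sketched as follows. All data is hand-written and hypothetical; the field names, dates and the assumed end of the trip stand in for what an agent would have to gather from airline and calendar sources, and no real service is involved.

```python
from datetime import datetime, time

# Hypothetical data standing in for machine-readable airline and calendar
# sources; none of it comes from a real service.
flights = [
    {"id": "F1", "nonstop": False, "departs": datetime(2030, 5, 7, 8, 15), "aisle_seat": True},
    {"id": "F2", "nonstop": True, "departs": datetime(2030, 5, 7, 9, 30), "aisle_seat": True},
    {"id": "F3", "nonstop": True, "departs": datetime(2030, 5, 7, 11, 0), "aisle_seat": True},
]
calendar = [
    {"title": "Dentist", "start": datetime(2030, 5, 7, 14, 0)},
    {"title": "Team call", "start": datetime(2030, 5, 6, 16, 0)},
]

def suitable_flights(flights, latest=time(10, 0)):
    """Non-stop flights departing before `latest`, aisle seats first."""
    hits = [f for f in flights if f["nonstop"] and f["departs"].time() < latest]
    return sorted(hits, key=lambda f: not f["aisle_seat"])

def conflicts(calendar, trip_start, trip_end):
    """Titles of appointments that start while the traveler is away."""
    return [a["title"] for a in calendar if trip_start <= a["start"] <= trip_end]

chosen = suitable_flights(flights)[0]
trip_end = datetime(2030, 5, 9, 18, 0)  # assumed return time
missed = conflicts(calendar, chosen["departs"], trip_end)
print(chosen["id"], missed)
```

The filtering itself is trivial; the hard part in the scenario is that flight, rating and calendar data from independent providers would have to be published with shared, machine-readable meaning before such code could combine them.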

Such an application of the Semantic Web is of course still a long way off, but the example shows the potential of Semantic Web technologies. Only the future will tell whether the vision becomes reality.

The Semantic Web Agent is not based on artificial intelligence, but on structured information and inference rules that allow it to "understand" the relationships between different data resources. Although the computer does not understand the information like a human, it can establish logical connections and make decisions based on the information available.

The World Wide Web has revolutionized the world of information. Millions of people access the Web every day, producing and updating information in all forms. The Semantic Web is now expected to bring about an evolution of the Web. For some, it offers the convenience of having their PDA, laptop, desktop, server and car communicate with each other. For others, it means that decisions people previously made with great effort could be made automatically. Some proponents of the Semantic Web even claim that it will lead to an evolution of human knowledge itself, as it will for the first time allow humans to filter and combine the vast amounts of data in this world in a relevant and productive way.

On the other hand, it should be noted that RDF, OWL and the Semantic Web will be introduced only gradually. Widespread adoption of these concepts seems feasible only in the long term. Many website owners are already overwhelmed by creating simple HTML pages, let alone more complex technologies such as XML, RDF or URIs.

  • Semantic Web Activity: [6]
  • The 3rd Annual European Semantic Web Conference: [7]
  • International Semantic Web Conference (ISWC): [8]
  • Semantic Web Community Portal: [9]
  • Semantic Web: [10]
  • Altova (Ed.): What is the Semantic Web. Available online at: [11], last accessed: January 31, 2006
  • Berners-Lee, Tim; Hendler, James; Lassila, Ora (2001): The Semantic Web. A new form of web content that is meaningful to computers will unleash a revolution of new possibilities. Available online at: [12], last accessed: January 31, 2006
  • Berners-Lee, Tim; Miller, Eric (2002): The Semantic Web lifts off. Available online at: [13], last accessed: January 31, 2006
  • Bestle, Tristian (2004): The Smarter Web. In: Internetworld, Volume 6, pp.78-79.
  • Swartz, Aaron (2002): The Semantic Web In Breadth. Available online at: [14], last accessed: January 31, 2006