What Semantic Web researchers need to know about Machine Learning?

(ISWC 2007 Tutorial)


Marko Grobelnik, Blaz Fortuna, Dunja Mladenic

(Jozef Stefan Institute, Ljubljana, Slovenia)


The tutorial will cover basic topics from the field of Machine Learning, explained in an intuitive way relevant to Semantic Web researchers and practitioners. The first part gives a brief top-level overview of the Machine Learning field, its algorithms, and the types of data being analyzed. The second part covers the relation to the Semantic Web and Web2.0. The last part is a hands-on exercise with some of the tools for analytically modeling text semantics and social networks.

Tutorial Slides

A brief description of the tutorial

The Semantic Web and Machine Learning cover conceptually different sides of the same story. The Semantic Web's typical approach is top-down: it models knowledge first and then proceeds down towards the data. Machine Learning is an almost entirely data-driven, bottom-up approach that tries to discover structure in the data and express it in more abstract ways and in rich knowledge formalisms.

While the ISWC conference mainly covers top-down approaches, there has recently been a strong move towards also using data-driven, bottom-up approaches, especially in the context of modeling data from the web and in Web2.0-related research.

The goal of this tutorial is to provide an understanding of analytical techniques from Machine Learning and related research fields that could be used to model Semantic Web related problems. The aim is to explain the topics in an intuitive way, so that participants can later approach relevant Semantic Web problems in the light of data-driven approaches as well.

Draft Outline

Justification of the tutorial, including relevance to ISWC 2007

The goal of this tutorial is to increase the Semantic Web community's knowledge of machine learning and related analytic research fields. Recent developments in the Semantic Web, and especially in Web2.0, show an increased need to integrate hard, exact logic approaches with softer, inexact analytic approaches. The reason is the availability of data of various types, coming from various sources, which needs to be semantically modeled. We have also observed an increasing number of contributions in the Semantic Web tracks of the ISWC, ESWC and WWW conferences in which statistical approaches are used and combined with more traditional logic approaches.

It is also worth mentioning the need to combine machine learning and Semantic Web approaches within current projects in Europe and the US (such as CALO, NEON, NEPOMUK and XMEDIA).

Background knowledge required

The tutorial requires basic knowledge of logic formalisms (common to most ISWC participants) and the very basics of probability theory and information retrieval (basic undergraduate level). The tutorial will consist of several independent modules, none of which assumes complete understanding of the previous ones.

Information on presenters

Marko Grobelnik / Blaz Fortuna / Dunja Mladenic

Address: J. Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
E-mail: Marko.Grobelnik@ijs.si / Blaz.Fortuna@ijs.si / Dunja.Mladenic@ijs.si
Phone: +386 1 4773 778
Fax: +386 1 4773 315

Marko Grobelnik is the primary contact.

The presenters have previously given the following tutorials:

·         5th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'01), Freiburg, Germany 2001 with the title: “Tutorial on Text Mining: What if your data is made of words?” (http://www.afia.polytechnique.fr/CAFE/ECML01/text_mining.html).

·         PASCAL Network of Excellence Workshop on Text Classification (Grenoble January 2004): Tutorial on Text Mining (http://www.pascal-network.org/Reports/Workshops/129/)

·         WWW2004 – ACM Conference on World Wide Web (NYC 2004): Tutorial on Text Mining approaches for Web Data (http://www2004.org/tutorial.htm)

·         ESWS2004 – European Conference on Semantic Web (Heraklion, Crete 2004): Tutorial on Knowledge Discovery & the Semantic Web (http://www.esws2004.org/sub/tutorials.htm)

·         Workshop on Complex Object Visualizations (Koper 2005): Tutorial on Text Visualization (http://www.ijp.si/cov/)

·         ECML/PKDD 2005 – European Conference on Machine Learning (Porto 2005): Tutorial on Ontology Learning from Text (http://ecmlpkdd05.liacc.up.pt/tutorials.html)

·         ISWC2006 – International Conference on Semantic Web (Athens, GA): Tutorial on Context Sensitivity in Knowledge Rich Systems (http://iswc2006.semanticweb.org/workshop_tutorial/tutorials.php)

·         IJCAI2007 – International Joint Conference on Artificial Intelligence (Hyderabad 2007): Tutorial on Text Mining and Link Analysis for Web and Semantic Web (http://www.ijcai2007.org/)

Software for hands-on training

The software for hands-on training will be packaged separately and made available from a URL to reduce installation overhead. In particular, the software will run on Windows XP/Vista without any additional preinstalled packages. The package will include a selection of modules available from the following web sites: