Tribune

Monday, September 8, 2003
Feature

Semantic search engines know what you are looking for
N.S. Soundara Rajan

A new network system aims to make the Web smarter, able to handle semantics, or linguistic meaning, more like the way we humans do.

Presently, when a Web search is conducted, the search engine looks for strings of letters on various pages but has no idea what those letters actually mean or how they are inter-related. So when you type, let’s say, ‘car’, the search engine doesn’t know that cars could also be called automobiles, motor cars or station wagons. Instead of looking for pages about cars, it looks for pages that contain the word ‘car’. Thus we end up with too little information or, in some cases, too much of it. For example, if you enter ‘sting’ you are likely to be swamped with Web pages about the musician, the Paul Newman movie and the bee!
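The difference between matching letters and matching meaning can be sketched in a few lines of Python. This is only an illustration, not how any real search engine works: the page names, page text and the hand-written synonym table are all invented for the example.

```python
# Hypothetical mini-corpus; page names and text are invented for illustration.
PAGES = {
    "page1": "a used car in good condition",
    "page2": "automobile dealers near you",
    "page3": "station wagon reviews",
    "page4": "carpet cleaning tips",
}

# Hand-written synonym table standing in for real knowledge about meaning.
SYNONYMS = {"car": {"car", "automobile", "motor car", "station wagon"}}


def keyword_search(query, pages):
    """Match the raw string of letters anywhere in the page text."""
    return sorted(name for name, text in pages.items() if query in text)


def contains_phrase(text, phrase):
    """True if the phrase occurs in the text as whole words."""
    return f" {phrase} " in f" {text} "


def concept_search(query, pages, synonyms):
    """Match any term known to mean the same thing as the query."""
    terms = synonyms.get(query, {query})
    return sorted(
        name
        for name, text in pages.items()
        if any(contains_phrase(text, term) for term in terms)
    )


print(keyword_search("car", PAGES))
print(concept_search("car", PAGES, SYNONYMS))
```

The letter-matching search returns the page about carpets and misses the pages about automobiles and station wagons, while the synonym-aware search does the reverse. In a real system the synonym table would have to come from shared, machine-readable descriptions rather than from a hand-written dictionary.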

Hence, it has become necessary to develop programs and tools that host machine-understandable data on the Web. This would help the Web reach its full potential by providing data that can be shared and processed by automated tools as well as by people. Under a collaborative project known as the semantic Web, computer scientists around the world are working on ways to revolutionise the Internet. Researchers in Europe, Asia and the United States are developing standards, protocols and technologies that will help advance the emergence of a more meaning-oriented Web. The semantic Web would also have a new addressing system to navigate the plethora of sites that make up the Internet. The brain behind this endeavour is Tim Berners-Lee, inventor of the World Wide Web, and his group at the World Wide Web Consortium (W3C). Berners-Lee founded the consortium at MIT, and it is now working towards the semantic Web: a Web that can understand and look for the information the searcher is after.

For the Web to scale, the programs of tomorrow should be able to share and process data even when the programs themselves have been designed independently. So the development of the semantic Web would be based on a range of technologies and techniques. These tools would encompass agent, database, language and human-computer interface technologies, in addition to a branch of research known as knowledge discovery, a field that investigates how to mine data more efficiently and effectively. One of these technologies is the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs (Uniform Resource Identifiers) for naming. With the Web Ontology Language (OWL), which builds on RDF and XML, search engines will be able to discern whether two Websites have the same content even if they are described using different terminology or metadata.
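To make the idea concrete, here is a rough Python sketch of RDF-style statements and an OWL-style equivalence between two vocabularies. It is a toy and not an RDF parser or an OWL reasoner. The two property URIs are the standard `rdf:type` and `owl:equivalentClass` identifiers. The site URIs, item URIs and helper functions are hypothetical.

```python
# Standard identifiers from the RDF and OWL vocabularies.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"

# Hypothetical vocabularies used by two independently designed sites.
SITE_A_CAR = "http://site-a.example/vocab#Car"
SITE_B_AUTOMOBILE = "http://site-b.example/vocab#Automobile"

# Each statement is a (subject, predicate, object) triple named by URIs.
TRIPLES = [
    ("http://site-a.example/items/1", RDF_TYPE, SITE_A_CAR),
    ("http://site-b.example/items/7", RDF_TYPE, SITE_B_AUTOMOBILE),
    (SITE_A_CAR, OWL_EQUIVALENT_CLASS, SITE_B_AUTOMOBILE),
]


def equivalent_classes(cls, triples):
    """Return cls plus every class linked to it by equivalence statements."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subj, pred, obj in triples:
            if pred != OWL_EQUIVALENT_CLASS:
                continue
            # Equivalence is symmetric, so follow the link in both directions.
            for a, b in ((subj, obj), (obj, subj)):
                if a == current and b not in found:
                    found.add(b)
                    frontier.append(b)
    return found


def instances_of(cls, triples):
    """List resources typed as cls or as any class equivalent to it."""
    classes = equivalent_classes(cls, triples)
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o in classes)


print(instances_of(SITE_A_CAR, TRIPLES))
```

Asking for instances of site A's "Car" also returns the item that site B describes as an "Automobile", because the third statement declares the two terms equivalent. That is the kind of inference the article attributes to OWL.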

On the W3C’s site (www.w3.org), you can see an example of what the semantic Web might look like. In this early version, searching for a person yields photographs, contact information, papers and Websites relating to that person. The program understands that you’re looking for everything relevant to that individual. Smart software programs, or ‘intelligent agents’, are what will give the semantic Web this capability.

According to Eric Miller, who leads the W3C’s Semantic Web Activity, no date has been set for the launch of the semantic Web. When it does arrive, it will not steamroll over the World Wide Web in its present form. Miller sees the semantic Web infiltrating gradually, with early adopters pushing and expanding its functionality. In fact, this process has already begun: online communities have picked up on W3C developers’ early work and are applying it, Miller adds.

The aim of the semantic Web effort is to be able to find and access sites and resources not by keywords, as Google and other search engines do today, but by descriptions of their contents and capabilities. Many tools are currently being developed to make that happen. The semantic Web is a vision: the idea of having data on the Web defined and linked in such a way that machines can use it not just for display, but for automation, integration and reuse across various applications. According to this vision, which optimists say will be realised only a few years from now, a search on the Net will be conducted according to the searcher’s needs and expressed in ordinary language.