Tribune



Monday, September 25, 2000
Lead Article

In helping us find what we want, search engines are

By Roli

EVEN if you are searching for an object as large as the Titanic in a vast ocean, won't you be totally lost without proper navigation and a lot of relevant information?

Similarly, the Internet is an ocean of information, packed into billions of Web sites and Web pages on a wide range of subjects. It can be a resource for almost anything the mind can think of. However, before you can use any of it, you have to search for the relevant information.

Search engines do the job of seeking relevant information in the virtual world. Search sites are actually those Web sites that perform the specialised function of searching the World Wide Web to dig out the information that has been solicited.

A search query produces a page that contains hyperlinks and a brief synopsis of each site. The number of results generated by a given query can vary from one to many.

 


How does a search engine work?

The functioning of search engines can be broadly divided into three parts:

1) The first part is called the crawler. A crawler, also called a spider, is a program that automatically explores the World Wide Web by retrieving a document and then recursively retrieving some or all of the documents that are referenced in it. This is in contrast with a normal Web browser operated by a human being, which does not automatically follow links other than inline images and URL redirections.

Graphic by Gaurav Sood

Thus, as a preliminary step, the crawler goes to all pages of a Web site.

2) The second part is the index. The algorithm a crawler uses to pick which references to follow depends strongly on the program's purpose; index-building spiders usually retrieve a significant proportion of the references.

The search engine creates a huge index, also called a catalogue, of the information collected from the pages that the crawler has read.

3) The search engine receives the search request sent by the surfer and, after matching it against the index or catalogue, returns a list of probable matches with summaries and hyperlinks.
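The three parts described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the "Web" here is a small in-memory table with made-up addresses (a.example and so on), the crawler uses a simple queue rather than true recursion, and a query matches a page only if the page contains every word of the query.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "Web": address -> (page text, links found on the page).
PAGES = {
    "a.example": ("search engines crawl the web", ["b.example", "c.example"]),
    "b.example": ("a crawler is also called a spider", ["c.example"]),
    "c.example": ("the index is also called a catalogue", ["a.example"]),
}

def crawl(start):
    """Part 1: retrieve a page, then follow the links it references."""
    seen, queue, fetched = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        fetched[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def build_index(fetched):
    """Part 2: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Part 3: return the pages that contain every word of the query."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []

index = build_index(crawl("a.example"))
print(search(index, "crawler spider"))  # ['b.example']
```

Note that the query never touches the pages themselves; it is answered entirely from the index, which is why the index has to be built before any search can be served.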

This is the most common way in which a search engine functions. However, there are some variants, among them Web directories, which account for a large share of the searching done on the Web. A directory is an index organised into categories and sub-categories. For example, the entertainment category is further classified into the sub-categories of music, movies and humour. Similarly, the health category can be classified into disease, drugs and alternative medicine.
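A directory of this kind can be pictured as a nested table. The sketch below uses the categories named above; the function names and the listed site (music.example) are invented for illustration and do not describe any real directory.

```python
# Hypothetical directory: category -> sub-category -> list of listed sites.
DIRECTORY = {
    "entertainment": {"music": [], "movies": [], "humour": []},
    "health": {"disease": [], "drugs": [], "alternative medicine": []},
}

def add_site(category, sub_category, url):
    """File a site under an existing category and sub-category."""
    DIRECTORY[category][sub_category].append(url)

def browse(category, sub_category=None):
    """Return the sub-category names, or the sites listed under one of them."""
    if sub_category is None:
        return sorted(DIRECTORY[category])
    return list(DIRECTORY[category][sub_category])

add_site("entertainment", "music", "music.example")
print(browse("entertainment"))           # ['humour', 'movies', 'music']
print(browse("entertainment", "music"))  # ['music.example']
```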

Yahoo and Magellan are the most widely used Web directories and have their own built-in search engines. However, the main drawbacks of searching through Web directories are that it is time consuming and that the results are often limited. In order to offer broader search capabilities, a number of Web portal sites provide both a search engine and a directory.

Various search engines

A lot of search engines are available on the Net. A few popular ones are Yahoo (www.yahoo.com), Infoseek (www.infoseek.com), Google (www.google.com), Alta Vista (www.altavista.com) and Lycos (www.lycos.com), besides a few others. Of these, Yahoo is the most popular, as it searches not only through its own resources but also gives the results of simultaneous searches of other search indexes.

Such programmes are referred to as general search engines. On being asked a question, they generate volumes of results, spread over a number of Web pages. The results are prioritised on the basis of their relevance to the query and are listed in descending order of percentage relevance. Thus, a result that matches the query most closely, at or near 100 per cent, appears among the first few places, and the remaining results follow in order of their relevance. However, for a more focussed search and for specialised content, specialised search engines like www.search.com are available.
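As a rough illustration of ranking by percentage relevance, the sketch below scores each page by the share of query words it contains and lists the best matches first. Real engines use far more elaborate formulas that they do not publish; the scoring rule, the function names and the sample pages here are all assumptions made for the example.

```python
def relevance(query, text):
    """Percentage of distinct query words that occur in the text."""
    wanted = set(query.lower().split())
    found = wanted & set(text.lower().split())
    return 100.0 * len(found) / len(wanted) if wanted else 0.0

def rank(query, documents):
    """Return (address, score) pairs, best match first; drop pages scoring zero."""
    scored = [(url, relevance(query, text)) for url, text in documents.items()]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical pages used only for this example.
DOCS = {
    "one.example": "cheap flights to delhi",
    "two.example": "cheap hotels in goa",
    "three.example": "train timetable",
}

for url, score in rank("cheap flights delhi", DOCS):
    print(f"{score:5.1f}%  {url}")
```

Here one.example contains all three query words and is listed first at 100 per cent, two.example contains only one of them, and three.example is left out altogether.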

With these search engines, one can be selective about which part of the Web is crawled and indexed. For example, the TechTarget sites for products like AS/400 (http://www.search400.com) and Windows NT (http://www.searchnt.com) index only the best sites about these products and provide a shorter but more focused list of results.

There are some user-friendly search engines as well, for example Ask Jeeves (www.askjeeves.com). A general search can be conducted with it, but for more focussed results a query can also be put in the form of a question rather than merely as a key word. Though this site also uses key word search, phrasing the query as a full sentence narrows the search down.

If an extensive search of the Internet is desired, special search tools like WebFerret (www.softferret.com) can be used. With this tool, one can combine a number of search engines and put a query only once, yet a comprehensive list of search results is displayed.
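The idea of putting one query to several engines and merging the answers can be sketched as follows. This is not a description of how WebFerret itself works; the two "engines" are stand-in functions that return fixed, made-up results, where a real tool would query live search sites.

```python
# Hypothetical stand-ins for two separate search engines.
def engine_a(query):
    return ["x.example", "y.example"]

def engine_b(query):
    return ["y.example", "z.example"]

def metasearch(query, engines):
    """Send one query to every engine and merge the results without repeats."""
    merged, seen = [], set()
    for engine in engines:
        for url in engine(query):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(metasearch("search engines", [engine_a, engine_b]))
# ['x.example', 'y.example', 'z.example']
```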

Though you can search the Net from a majority of sites, most of these sites do not have their own search engines. Rather, they have a licence to use one of the major search engines available on the Net.

An interesting thing about search engines is that you will not necessarily get the same results from all of them, since the database and indexed data vary from one engine to another. These differences can stem from the size of the database and from each engine's search capabilities, design and speed.

Using a search engine

It is easy to use search engines and we all know the basics. Typing a key word in the Find box and pressing Go produces results. However, this way one may get a long list of results, including a number of unwanted sites.

The task of searching becomes easier if exact parameters are given. One can define a search more clearly by capitalising the first letter of a noun or any other important word. Using phrases in double quotes also helps in getting accurate results. For example, if you are searching for a specific product, like a book, then the name should be written in quotes, like "A Tale of Two Cities". The search engine will then search for the whole name and not for the individual words in the string. Thus the results will be related to the book you asked for.
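The difference between a quoted phrase and loose key words can be shown with a small sketch. It assumes the simplest possible matching, splitting text on spaces and ignoring capitals; the function names and sample sentences are invented for the example.

```python
def words(text):
    """Split text into lower-case words."""
    return text.lower().split()

def matches_any_word(query, text):
    """Unquoted search: true if any single query word occurs in the text."""
    return any(word in words(text) for word in words(query))

def matches_phrase(phrase, text):
    """Quoted search: true only if the words occur together, in order."""
    p, t = words(phrase), words(text)
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))

book = "a tale of two cities by charles dickens"
other = "a tale of two brothers in two cities"

print(matches_phrase("A Tale of Two Cities", book))     # True
print(matches_phrase("A Tale of Two Cities", other))    # False
print(matches_any_word("A Tale of Two Cities", other))  # True
```

The second page contains every word of the title, so an unquoted search would list it, but the words never appear together as a phrase, so the quoted search leaves it out.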

The search results can be further refined by using symbols like +, - and *, as they help in getting accurate and shorter lists of results. If you want the results to include only the sites that contain a particular key word, the plus (+) sign is extremely helpful. It is written before the key word, for example, Books + Agatha Christie. The search will now list the sites related to the books of Agatha Christie.

Similarly, the minus (-) sign is helpful for excluding unwanted results, as in US -Washington. The search results that appear will now be about the USA, minus the Washington-related sites.
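A minimal sketch of how the plus and minus signs might be applied to a page is given below. It assumes that the sign is attached directly to the word it governs (as in +christie or -washington) and that words are separated by spaces; individual search engines differ in the exact syntax they accept, and the function names are invented for the example.

```python
def parse(query):
    """Split a query into required (+), excluded (-) and plain key words."""
    required, excluded, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            optional.add(token)
    return required, excluded, optional

def matches(query, text):
    """True if the text has every + word, no - word, and fits the plain words."""
    required, excluded, optional = parse(query)
    found = set(text.lower().split())
    if not required <= found or excluded & found:
        return False
    # With no + words, at least one plain key word must be present.
    return bool(required) or not optional or bool(optional & found)

print(matches("US -Washington", "us travel guide"))               # True
print(matches("US -Washington", "us capital washington"))         # False
print(matches("+books +christie", "books by agatha christie"))    # True
```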

The asterisk (*) symbol is useful where different spellings or slang forms are used for the same word. It also helps in searching for related words: if you type colour*, results will appear for coloured, colourful, colouring, etc.
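The asterisk behaves much like the wildcard in file names, so Python's standard fnmatch module can be used to sketch it. The word list is made up for the example, and note that fnmatch also gives special meaning to ? and square brackets, which a search engine may not.

```python
from fnmatch import fnmatchcase

def wildcard_matches(pattern, vocabulary):
    """Return the words that fit the pattern, where * stands for any letters."""
    pattern = pattern.lower()
    return sorted(w for w in vocabulary if fnmatchcase(w.lower(), pattern))

# Hypothetical list of words found in an index.
VOCAB = ["colour", "coloured", "colourful", "colouring", "color", "collar"]

print(wildcard_matches("colour*", VOCAB))
# ['colour', 'coloured', 'colourful', 'colouring']
print(wildcard_matches("colo*r", VOCAB))
# ['color', 'colour']
```

The second pattern shows the other use mentioned above: one query that covers two spellings of the same word.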

For a more accurate search, most search engines support the use of Boolean operators like AND, OR and NOT, whose function is somewhat similar to that of the symbols. AND works just like the (+) symbol, while NOT works like the (-) symbol.

AND, OR and NOT make the search accurate and to the point. Not all search engines, however, support symbols, Boolean operators and capitalisation of words. For guidance on which of these a particular search engine supports, one can consult its help menu.
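Boolean operators map naturally onto operations on sets of pages. In the sketch below, a small made-up index records which pages contain each word; AND becomes set intersection, OR becomes union, and NOT becomes difference. The index contents and the helper name are assumptions for the example.

```python
# Hypothetical index: word -> set of pages that contain it.
INDEX = {
    "jaguar": {"cars.example", "zoo.example"},
    "car": {"cars.example", "garage.example"},
    "animal": {"zoo.example"},
}

def pages(word):
    """Pages containing the word, or an empty set if it is not indexed."""
    return INDEX.get(word.lower(), set())

both = pages("jaguar") & pages("car")      # jaguar AND car
either = pages("car") | pages("animal")    # car OR animal
without = pages("jaguar") - pages("car")   # jaguar NOT car

print(sorted(both))     # ['cars.example']
print(sorted(either))   # ['cars.example', 'garage.example', 'zoo.example']
print(sorted(without))  # ['zoo.example']
```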

If the search results are still not satisfactory despite the use of these symbols, the advanced search feature can be used. Advanced search presents the options offered by symbols and Boolean operators in a more organised and detailed form.

With this feature, one can also build a complex query by filling in conditions such as "must contain the word/phrase" and "should contain the word/phrase".
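One plausible reading of "must contain" and "should contain" is that the first filters pages out and the second only affects their order. The sketch below follows that reading; the class, its field names and the sample pages are invented for illustration, and the matching is a crude check of whether the phrase appears anywhere in the text.

```python
from dataclasses import dataclass, field

@dataclass
class AdvancedQuery:
    must: list = field(default_factory=list)    # phrases a page must contain
    should: list = field(default_factory=list)  # phrases that raise its rank

    def score(self, text):
        """None if a 'must' phrase is missing, else the number of 'should' hits."""
        text = text.lower()
        if not all(phrase.lower() in text for phrase in self.must):
            return None
        return sum(phrase.lower() in text for phrase in self.should)

def run(query, documents):
    """Return the addresses of qualifying pages, best match first."""
    scored = [(url, query.score(text)) for url, text in documents.items()]
    kept = [(url, s) for url, s in scored if s is not None]
    return [url for url, s in sorted(kept, key=lambda pair: (-pair[1], pair[0]))]

# Hypothetical pages used only for this example.
DOCS = {
    "a.example": "Agatha Christie mystery novels and short stories",
    "b.example": "Agatha Christie biography",
    "c.example": "mystery novels by other authors",
}

query = AdvancedQuery(must=["agatha christie"], should=["mystery", "novels"])
print(run(query, DOCS))  # ['a.example', 'b.example']
```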
