The search engines must be the brains of the Internet, as the browsers must be its heart. Without searching, the Internet would not have had any future.
I believe Yahoo deserves the award of Internet search pioneer. They put together a list of Web pages by category. That was a start: the Net surfers had places to go. Yahoo was called a directory, or portal. The portals are man-made. The directories are still useful, but they have serious limitations. First, it is impossible to categorize all Web pages, given the extraordinarily fast pace of Web change. There are billions of Web pages, and counting! No group of human editors can keep up with the immensity of the Internet. Another serious limitation of the portals is bias. Selection and inclusion in a directory are very subjective decisions, made much worse by the pay-for-inclusion models. Money has become a substitute for quality. That would kill the Internet, as in "Kaput Internet!"
Next, a huge search engine took over the Internet: Alta Vista. It was a powerful search engine, capable of indexing and listing millions of Web pages. Alta Vista shed light on huge dark areas of the Net, invisible before the search engine came to life. Quantitatively, Alta Vista was a powerhouse. It was a poor performer, however, when it came to the quality of the search results. I remember searches where Alta Vista listed one URL over and over again! I remember one Web page listed more than twenty times; the listing included every modification of the page! The relevancy of the searches was also very poor in the old Alta Vista. (By the way, Alta Vista is now part of Yahoo search and uses the same search technology: Inktomi.)
Then, a smaller search engine defined the concept of relevancy: Hotbot. I remember Hotbot winning all the awards from a notable computing publication, PC Magazine. It was still the pioneering era of searching on the Internet. O tempora! O mores!
The Internet was taken by storm by the introduction of Google and its famous Beta. Google picked up where Hotbot left off: relevancy became the major focus of the search technology. Only the Google insiders know the algorithm exactly. It is assumed that the keystone is back-linking. That is, if a Web page (or its parent site) is linked to from many other pages, that page must have merit. If it is good, people refer other Net surfers to the same resource. Such a concept does have a logical foundation.
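The back-linking idea can be sketched in a few lines of code. This is only an illustration of the general concept, not Google's actual algorithm (which, as noted, only the insiders know); the toy graph, damping factor, and iteration count are my own assumptions.

```python
# A minimal sketch of the back-linking concept: a page's score grows with
# the scores of the pages that link to it. The damping factor and iteration
# count are illustrative assumptions, not anybody's real parameters.

def link_scores(links, iterations=20, damping=0.85):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
        score = new
    return score

# A toy Web: every page links to "treatise", so it earns the top score.
toy_web = {
    "treatise": [],
    "blog": ["treatise"],
    "forum": ["treatise", "blog"],
    "news": ["treatise"],
}
scores = link_scores(toy_web)
```

The page everyone links to ends up with the highest score, which is exactly the "merit by referral" logic described above.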
One of my Web pages ranks very high on the keyword deviation standard. Yet, it is ranked very low on the keyword standard deviation! Isn't it the very same concept? It sure is. In both situations, my page offers the most comprehensive treatise of the subject, including pertinent software. The page is also naturally integrated in a Web site of related articles.
The above anomaly is also caused by the scholastic syndrome. The search engines assume that the established educational outlets must be the best at treating a subject. The problem is, the schools tend to be overly conservative, even mummified. The most notable advancements in knowledge have occurred outside traditional institutions. As a matter of fact, mummified education makes advancement in knowledge impossible. As in that Pink Floyd song:
We don't need no education,
We don't need no thought control!
There are several problems with back-linking, especially after many Internet authors "discovered" the Google algorithm. First, back-linking can be the result of pay-for-inclusion. Again, money speaks, not quality; therefore relevancy can become a moot point. Second, more and more Web authors exchange links. One thousand good friends who are lousy Web authors can beat, at any time, one genius Web author. Back-linking has seriously damaged the relevancy of Internet searching. I have stumbled upon miserable Web pages with high rankings in all major search engines. Such pages of misery manage to rank high through keyword manipulation and back-linking.
The book model could seriously impede spamming. Laziness is the creator of short Web pages with the keywords repeated again and again, without any meaning at all. Writing several pages dedicated to the same topic and closely related topics indicates seriousness. It is a good indicator that the keywords are treated in a more thorough manner. The integration should count only within the same Web site, however. Otherwise, it would be very easy for any sucker to write down a few lines of keywords and then offer hundreds of links to external Web pages of high quality! Every Web idiot would rank higher than geniuses who don't link to any external Web sites!
It takes more quality effort to write a book than to scribble a sketchy page. The effort is even higher if the book is accompanied by a CD (e.g. a site dedicated to software downloads). Complexity should be valued, not punished.
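The book paradigm can be expressed as a scoring rule: a page earns more when its own site carries several pages on the same and closely related topics, and more still when the site bundles pertinent software (the "CD"). The weights below are purely illustrative assumptions of mine; no search engine publishes such a formula.

```python
# A hedged sketch of the "book paradigm" scoring idea. The weights (2x for
# related pages, 1.5x for bundled software) are illustrative assumptions.

def book_score(page_matches, related_site_pages, has_downloads):
    """page_matches: keyword hits on the page itself;
    related_site_pages: pages on the SAME site covering related topics;
    has_downloads: whether the site bundles pertinent software (the "CD")."""
    score = page_matches
    score += 2 * related_site_pages   # the depth of the "book" counts
    if has_downloads:
        score *= 1.5                  # the "CD" bonus: complexity is valued
    return score

# A thorough "book" site beats a keyword-stuffed one-pager.
sketchy = book_score(page_matches=30, related_site_pages=0, has_downloads=False)
thorough = book_score(page_matches=10, related_site_pages=12, has_downloads=True)
```

Note that the related-page bonus counts only pages on the same site, matching the point above that integration must not reward links dumped toward external pages.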
Look at the search facility of my Web site. I still don't call it a search engine; it only indexes my own Web pages. I agree with close to 100% of its search results in terms of relevancy. If such technology could be expanded to index the entire Internet, it could become a high-quality Internet search engine. Perhaps some programmer of Internet searching will decide to open-source his/her script or program. Hundreds, if not thousands, of Internet programmers could chip in and improve the original to the highest level of Web searching quality.
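A per-site search facility of the kind described above boils down to an inverted index over a handful of pages. Here is a minimal sketch; the page names and text are made-up examples, not my site's actual script.

```python
# A minimal per-site search facility: an inverted index mapping each word
# to the set of pages containing it, queried with a simple AND search.
from collections import defaultdict

def build_index(pages):
    """pages maps a page name to its text; returns word -> set of pages."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages containing EVERY query word (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for w in words[1:]:
        result &= index.get(w, set())
    return result

# Hypothetical site content, for illustration only.
site = {
    "deviation.html": "keyword standard deviation treatise software",
    "lottery.html": "lottery software strategy",
}
index = build_index(site)
hits = search(index, "standard deviation")
```

Because the index covers only one author's pages, relevancy is easy to keep near 100%; scaling the same idea to billions of pages is exactly where the hard engineering begins.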
At this point of Internet infancy - the year of grace 2006 - the three major Internet search engines look very much alike. It looks like Google, Yahoo, and MSN (now Bing) apply the same search algorithm to the same kind of search index. The difference is in IT capacity. Google beats them all because of a far larger computing capacity. Many more computers and far larger storage capacity (hard drives) translate into many, many more pages indexed. It also looks like the Big Three are now applying my concept of the book paradigm - to some extent. I can teach them even better tricks...
Read related articles:
Doctor in Occult Science of Searching (OssD)