
Searching the Web:

A Brief Guide and Tips

Search Engines | General Subject Directories | Portals

General Search Tips

Invisible Web

Conducting Scholarly Research on the Web


What's Behind Internet Searching?

There are two main methods web sites use to compile a list of links which can be retrieved through a "search box."

  1. Computerized Search "Bots"

    Special computer programs (sometimes called spiders or bots) record every word on each web page they can find and note them in a database of web sites. The most sophisticated programs gather information on billions of web pages. The four most commonly used of these databases are compiled by Google, Yahoo, Microsoft Live and Teoma. In addition to their own search sites, these companies often sell access to their databases, which are then used to provide results for other web search sites.

  2. Directories Compiled by Humans

    At some sites, a real person must examine and agree to add a specific web site before it is included in a directory. Although these directories cannot match the number of sites gathered by the search bots, the people who compile them can exclude links of dubious quality, so that only sites they have vetted are retrieved through the search interface. Examples include the WWW Virtual Library, the Librarian's Index to the Internet, the Open Directory Project, and the directory sections of commercial sites like Yahoo!


Search Engines

Search engines allow users to search for words found on any web page included in their coverage. Special programs (sometimes called spiders or bots) examine the content of the web pages they can find on web servers. These programs record words and links from each page in a database, which users can search through sites like Google.
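As a rough illustration of the process described above, the following Python sketch records every word from a few made-up pages in a small index and then looks words up in it. The page addresses and text are invented for the example, and real search engine databases are far more elaborate.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for what a spider might find.
PAGES = {
    "http://example.edu/a": "Library research guide for students",
    "http://example.edu/b": "Energy policy research and statistics",
    "http://example.edu/c": "Campus library hours",
}

def build_index(pages):
    """Map each lowercase word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def lookup(index, word):
    """Return the sorted addresses of pages that contain the word."""
    return sorted(index.get(word.lower(), set()))

index = build_index(PAGES)
print(lookup(index, "library"))   # pages a and c
print(lookup(index, "research"))  # pages a and b
```

Because the index is built ahead of time, answering a search only requires looking up the word, not rereading every page.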

Major Search Engines

In fall 2005, four companies provided large, unique databases of web sites and dominated search engine usage.

Google

URL: http://www.google.com/

- covers more sites than any other single search engine

- most popular search site due to size and relevancy of results

- options include searching for images and Usenet newsgroup content

- Advanced Search provides options for refining search

Yahoo! Search

URL: http://search.yahoo.com/

- options also available for images, news, video, shopping, and their subject directory

- Advanced Search provides options for refining a search

Live Search

URL: http://www.live.com/

- also known as Windows Live Search or Microsoft Live

- options also available to search news, images, maps, and more

Teoma (ExpertRank)

URL: http://search.ask.com/

- the heart of Ask.com

- Advanced Search provides options for refining a search

Examples of Other Unique Search Engines

Other companies continue to develop their own unique databases of websites, and often develop innovative techniques in searching.

Gigablast

URL: http://www.gigablast.com

Exalead

URL: http://www.exalead.com/


General Subject Directories

Compilations of sites selected by human compilers and categorized by subject (i.e., not gathered automatically by a computer program like a "spider" or "bot").

Examples

Open Directory

URL: http://dmoz.org/

- searches 2.6 million pages listed in 360,000 categories

Internet Public Library

URL: http://www.ipl.org/

- links to librarian-selected pages, organized by subject and age

About.com

URL: http://www.about.com/

- links to sites compiled by various volunteer guides

- also includes original articles written by those guides

Yahoo! Directory

URL: http://dir.yahoo.com/

- formerly the core of the Yahoo! service, it has been eclipsed by the search and news found in Yahoo!'s main "portal"


General Search Tips

Each search site is slightly different. By default, some search engines return links to documents that contain any one of the terms in the search box, while others return only links to documents that include all of the terms.
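The difference between matching any term and matching all terms can be sketched in a few lines of Python. The documents below are hypothetical, and each is reduced to a plain set of words for simplicity.

```python
# Hypothetical documents, each reduced to its set of words.
DOCS = {
    "doc1": {"asian", "americans", "history"},
    "doc2": {"arranged", "marriages"},
    "doc3": {"asian", "americans", "arranged", "marriages"},
}

def match_any(docs, terms):
    """Documents containing at least one of the search terms."""
    wanted = set(terms)
    return sorted(name for name, words in docs.items() if words & wanted)

def match_all(docs, terms):
    """Documents containing every one of the search terms."""
    wanted = set(terms)
    return sorted(name for name, words in docs.items() if wanted <= words)

terms = ["asian", "marriages"]
print(match_any(DOCS, terms))  # all three documents
print(match_all(DOCS, terms))  # only doc3
```

The same two-word search returns three results under "any" matching and one under "all" matching, which is why the number of results can differ so much between search sites.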

Most search interfaces include an "advanced" option to customize your search. Many of the following features work in other web search interfaces.

Common features on most search sites:

  • Exclude specific term: - (minus sign)

    condit -levy — results contain condit but not levy

  • Include term: +

    +asian +americans +marriages -arranged — all terms but "arranged" in results

  • Phrase search: " "

    "consumer confidence" — results contain words next to each other
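To show how these three operators might be interpreted, here is a minimal Python sketch. It is not how any particular search site works: the function names are made up for the example, an unmarked word is assumed to be required (the same as +), and a phrase is checked with a simple substring test.

```python
import re

# An optional + or - sign, then either a quoted phrase or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')

def parse_query(query):
    """Split a query into required terms, excluded terms, and phrases."""
    required, excluded, phrases = [], [], []
    for sign, phrase, word in TOKEN.findall(query.lower()):
        if phrase:
            # A sign in front of a quoted phrase is ignored in this sketch.
            phrases.append(phrase)
        elif sign == "-":
            excluded.append(word)
        else:
            required.append(word)
    return required, excluded, phrases

def matches(text, query):
    """True if the text satisfies every part of the query."""
    required, excluded, phrases = parse_query(query)
    lowered = text.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    return (all(term in words for term in required)
            and not any(term in words for term in excluded)
            and all(phrase in lowered for phrase in phrases))

print(matches("Condit spoke to reporters", "condit -levy"))
print(matches("Condit and Levy", "condit -levy"))
print(matches("Consumer confidence rose", '"consumer confidence"'))
```

The second call fails because the text contains the excluded word, while the third succeeds only because the two words appear next to each other.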

Google (http://www.google.com/)

See the Advanced Search page for many options on searching Google. Many of these features are also available by typing specific characters in your search box.

  • Match all terms: default (basic) search

    asian americans arranged marriages — all terms in every result

  • Match either term — OR

    qaeda OR qaida

  • Exclude specific term: - (minus sign)

    condit -levy — results contain condit but not levy

  • Phrase search: " "

    "consumer confidence" — results contain words next to each other

  • Limit search to title of document: intitle:

    research guide intitle:asian intitle:american

  • Limit search to a specific web server domain: site:

    library site:csustan.edu

  • Limit search by a term in a URL: inurl:

    energy policy inurl:gov
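The operators above can also be combined into one query programmatically. The Python sketch below assembles a query string and encodes it into a search address. The helper names are hypothetical, and the "/search" path and "q" parameter are assumptions about the address format that this guide does not state.

```python
from urllib.parse import urlencode

def build_query(terms=(), phrase=None, exclude=(), intitle=(), site=None, inurl=None):
    """Join plain terms and operators into a single query string."""
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')
    parts.extend(f"-{term}" for term in exclude)
    parts.extend(f"intitle:{term}" for term in intitle)
    if site:
        parts.append(f"site:{site}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)

def search_url(query, base="http://www.google.com/search"):
    """Encode the query for use in a web address (base is an assumption)."""
    return base + "?" + urlencode({"q": query})

query = build_query(terms=["library"], site="csustan.edu")
print(query)
print(search_url(query))
```

Encoding matters because spaces, colons, and quotation marks in a query cannot appear as-is in a web address.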

For more help: Search Engine Watch (http://www.searchenginewatch.com/)



Portals

Portals usually provide access to a pre-compiled subject directory, a general web search engine, and links to other useful sources like news, weather, telephone directories, maps, zip code searches, shopping sites, classified ads, etc.

Examples

Yahoo!

URL: http://www.yahoo.com/

- one of the most popular web sites

- directory acts like a "Yellow Pages" for web sites

- also includes telephone listings, maps, shopping sites, email searches

Netscape Netcenter

URL: http://www.netscape.com/

- uses the Google database to retrieve results from around the Web

MSN.com

URL: http://www.msn.com/

- includes current news, an encyclopedia, access to mail, etc. along with the search engine


Invisible Web

The "Invisible" or "Deep" web consists of information that is available through the web but is not stored in static web pages. This information is retrievable through thousands of different dynamic databases available on the web, but usually not through a search engine like Google.

Some invisible web database sites are free to anybody using the web (e.g., the U.S. Department of Education NCES statistical tables). Others require users to pay a fee to access the information (e.g., Valueline stock reports). The University Library's Research Databases page provides access to many journal articles and other research materials that are part of the Invisible Web.
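One way to picture why this material stays hidden is a small Python sketch. Here a link-following crawler visits every static page of a made-up site, but the database records exist only as answers to a query, so the crawler never reaches them. The page names and records are hypothetical.

```python
# Hypothetical static site: each page lists the pages it links to.
STATIC_PAGES = {
    "/index.html": ["/about.html", "/search.html"],
    "/about.html": ["/index.html"],
    "/search.html": [],  # holds only a search form, no links to records
}

# Hypothetical database behind the search form.
RECORDS = {
    "enrollment": "Table: enrollment by year",
    "graduation": "Table: graduation rates",
}

def crawl(start, pages):
    """Follow links from the start page and return every page reached."""
    seen, queue = set(), [start]
    while queue:
        page = queue.pop()
        if page in seen or page not in pages:
            continue
        seen.add(page)
        queue.extend(pages[page])
    return seen

def query_database(term):
    """Build a result on demand, the way a form submission would."""
    return RECORDS.get(term)

reached = crawl("/index.html", STATIC_PAGES)
print(sorted(reached))               # the three static pages only
print(query_database("enrollment"))  # available only by asking the database
```

The crawler finds the page that holds the search form, but nothing the form could return, which is the gap the "Invisible Web" describes.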

Example: Free Site

Education Statistics at a Glance

URL: http://nces.ed.gov/edstats/indisearch.asp

- search and retrieve tables from 3 main annual publications

Example: Fee-Based Sites

Valueline Investment Survey **

URL: http://www.valueline.com/

- provides analyses of most major industries and companies

- automatic access on-campus

- the link on the library's web page tells the company the library has a subscription. It is not free to the general public.

University Library Databases **

URL: http://library.csustan.edu/databases/

- 100+ specialized research databases via the University Library

- the full text of many articles is linked from the search results page of each database; other articles are available through the University Library's Find It! system

** Indicates the resource is not free to the general public, but is accessible at no charge through the University Library's subscription (and requires a current Stanislaus ID# to access off-campus).



This document is maintained by: the CSUS Library (wwwlibrary@wwwlibrary.csustan.edu)
Page updated: 08/25/2008