I've spent many thousands of hours on the Internet searching for information, jobs, contracts, people, and other items of interest. You can find out almost anything! The trick is learning how to find relevant and hidden information efficiently. This is the job of an 'Internet Sourcer.'
--- What is an Internet Sourcer? ---
The Internet Sourcer is a relatively new position for many organizations. Sourcers are most commonly used in the recruiting and talent-search fields. Usually, a Sourcer scours the Web for resumes and candidates, using several search techniques to ensure that the searches are complete and accurate. Some of the better Sourcers come from the computer industry; they work independently and have an extreme amount of focus, patience, and inquisitiveness.
--- Data Mining ---
Data mining uses various techniques to examine data and organize that data into a meaningful presentation. This is also a part of an area known as Knowledge Management---an entirely different world and best left for a later tome.
* Finding Information
As applied to Internet Sourcing, data mining consists of a set of search techniques (i.e., Flip Search, X-Ray, and Peel Back) for acquiring information. These techniques allow you to locate relevant and hidden information on the Internet that would otherwise be out of your reach. Each of the techniques, described shortly, can be applied to any of the larger search engines, such as AltaVista (http://www.altavista.com/) and HotBot (http://www.hotbot.com/).
* Organizing Information
Once you locate information, you have to organize it by relevance. You can do this with various tools, including Correlate (http://www.correlate.com/). This tool allows you to organize links, text, and documents in a tree format so that you can better view and understand the information you've acquired.
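To make the idea of a relevance-ordered tree concrete, here is a minimal Python sketch. It is not how Correlate works internally; it only illustrates grouping found links under categories and listing the most relevant first. The categories, titles, example.com URLs, and hand-assigned relevance scores are all hypothetical.

```python
from collections import defaultdict


def build_tree(items):
    """Group (category, title, url, relevance) records by category.

    Categories are sorted alphabetically; entries within a category
    are sorted from highest to lowest relevance.
    """
    tree = defaultdict(list)
    for category, title, url, relevance in items:
        tree[category].append((relevance, title, url))
    return {cat: sorted(entries, reverse=True) for cat, entries in sorted(tree.items())}


def render_tree(tree):
    """Render the tree as indented plain text, one entry per line."""
    lines = []
    for category, entries in tree.items():
        lines.append(category)
        for relevance, title, url in entries:
            lines.append(f"  - {title} ({url}) [relevance {relevance}]")
    return "\n".join(lines)


# Hypothetical search results with relevance scores assigned by hand.
found = [
    ("Resumes", "Java programmer resume", "http://example.com/resume1", 3),
    ("Resumes", "Freelance writer resume", "http://example.com/resume2", 5),
    ("Associations", "Writers' association", "http://example.com/assoc", 2),
]

print(render_tree(build_tree(found)))
```

Keeping the relevance score next to each link makes it easy to re-sort the tree as you review results and change your mind about what matters.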
--- Various Search Techniques ---
Locating information on the Web is not as straightforward as you might think. Of course, you can always do a simple keyword search and locate a few thousand links, of which only 25% to 50% are truly relevant to your specific search. To really dig into the Web, you need to understand the three search techniques explained below. To present valid examples, the following explanations use the techniques to search for potential candidates and resumes on the Web.
* Flip Search
Flip Search locates items by link association. For instance, instead of searching for potential candidate pages based on specific keywords, Flip Search returns pages that link to a target Web site. The linking pages might be personal homepages, colleges, industry organizations, companies, publications, or associations. Each of these 'linkers' is a potential source of candidates or related information.
Two of the primary search engines that support Flip Search mechanisms are described below. Once you understand the premise for this search, you can work out the specifics for the other major search engines on the Web.
- AltaVista: On the 'Advanced Search' page, in the Boolean Query text field, enter 'link:host.com AND homepage AND "java programmer"' and press Enter. With this search string, you're searching for all pages that link to host.com and contain the keywords 'homepage' and '"java programmer"'. You can refine the search using skills, job titles, and any other term that narrows your search target.
- HotBot: On the 'Advanced Search' page, enter the URL or domain name in the Search text field. In the Look For drop-down box, select 'links to this URL'. Refine your search by entering skills, job titles, and any other term that defines your search target in the 'Word Filter' text fields.
Examine the results as you work with different searches to see how this technique works. It is extremely powerful and can generate numerous relevant links for any given search condition.
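If you run many Flip Searches, it can help to assemble the query strings programmatically. The following Python sketch builds the AltaVista-style Boolean string shown above; the function names are my own, and the only syntax assumed is the 'link:' prefix and the AND operator from the example.

```python
def quote_term(term: str) -> str:
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_flip_query(target_host: str, keywords: list[str]) -> str:
    """Build a Boolean query for pages that link to target_host
    and also contain every keyword."""
    parts = [f"link:{target_host}"] + [quote_term(k) for k in keywords]
    return " AND ".join(parts)


print(build_flip_query("host.com", ["homepage", "java programmer"]))
# prints: link:host.com AND homepage AND "java programmer"
```

Paste the resulting string into the Boolean Query text field. Other search engines use different syntax, so treat the prefix and operator as things to adjust per engine.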
* X-Ray Search
Most sites have documents that aren't accessible through links on the site's pages. These documents are hidden from view, yet publicly available. The X-Ray technique searches the files on a server and lets you view most of these 'hidden' documents.
To try this out, go to AltaVista's 'Advanced Search' page and type 'host:tripod.com' in the Boolean Query text field. Like 'link:', 'host:' is a special prefix; it tells the search engine to look for keywords only in documents on the specified Web site, in this case the Web site for the 'tripod.com' domain.
When you click the 'Search' button, you could end up with several million documents from your target host. To obtain a more manageable group of results for this example, look for freelance writers by entering the following search string into the Boolean Query text field: 'host:tripod.com AND "freelance writing"'. When I did this search, I got about 100 results.
Consider that, intuitively, many people name their resume page 'resume.' With this assumption, let's fine-tune the search again to look for resumes using the following search string: 'host:tripod.com AND title:resume AND "freelance writing"'. The word 'title:' tells the engine to look for keywords in the <title> tag in the header of a Web page (the text that appears at the top of your browser's window).
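The step-by-step narrowing above can be sketched in Python as follows. This is only an illustration of how the three search strings relate to one another; the function name and parameters are hypothetical, and the 'host:' and 'title:' prefixes are taken from the examples in this section.

```python
def build_xray_query(host, phrases=(), title_word=None):
    """Build a Boolean X-Ray query restricted to one host.

    host       -- domain to search within, e.g. "tripod.com"
    phrases    -- keywords or phrases every result must contain
    title_word -- optional word that must appear in the page title
    """
    parts = [f"host:{host}"]
    if title_word:
        parts.append(f"title:{title_word}")
    # Quote multi-word phrases so they are matched exactly.
    parts += [f'"{p}"' if " " in p else p for p in phrases]
    return " AND ".join(parts)


# Start broad, then narrow with each step.
print(build_xray_query("tripod.com"))
print(build_xray_query("tripod.com", ["freelance writing"]))
print(build_xray_query("tripod.com", ["freelance writing"], title_word="resume"))
# prints:
# host:tripod.com
# host:tripod.com AND "freelance writing"
# host:tripod.com AND title:resume AND "freelance writing"
```

Building the query in stages like this makes it easy to back off one restriction if a search returns too few results.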