| In This Issue |
| WF Workshops |
| XML Extender in DB2 |
| WF for Resumes and Jobs |
| Seminars on WF |
| Searchable Public Records |
| Intarka ProspectMiner |
| WF Book Review |
| Globalization To The Max |
| WF at Siebel |
| WebData Databases |
| Focused Crawling |
| Latest Web Assessment |
| Copernic Server |
R. Hackathorn, editor |
WF Workshop Series |
A Web Farming workshop series starts on September 9 with a one-day treatment of Global
Discovery Services: Use and Misuse. Frustrated with HotBot, AltaVista and the
like? This is a MUST for you!
Sponsored by US West Advanced Technologies,
it is conducted at their Boulder facilities. |
Seminars on Web Farming |
Three-day seminars on Building
Web Farming Systems: Methods & Tools are offered on:
- September 15-17 in San Francisco
- November 10-12 in Dallas
For full details and registration, see the DCI online
brochure. Sign up NOW. Space is limited. Content is awesome! |
Intarka ProspectMiner |
Intarka is a startup focused on applying web farming technology in various settings. Their first
product, ProspectMiner, discovers and filters sales prospects using a combination of
keyword searching, relevance analysis, and learning feedback. Funded by New Enterprise
Associates, they have 38 employees at their San Jose and West Bengal offices. Read their white paper on ProspectMiner.
It is one of the better business justifications for a topic-specific web farming
application. |
WF at Siebel |
Siebel is following in SAP's
footsteps (see the May issue). As a major vendor of sales automation and similar systems, Siebel offers InterActive Briefings,
which gathers information on company profiles, business news, subsidiaries, and affiliates
from web-based sources. In addition, Siebel just signed an agreement with Dun
& Bradstreet to enhance its external data. |
Focused Crawling |
Another IBM Almaden
project is leading the way to better web farming technology. Winner of the Best Paper Prize at
the recent WWW8 Conference, the project's paper describes the use of custom
crawling and link analysis to generate lists of topic-specific websites. |
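The general idea of a focused crawl can be sketched in a few lines. This is not the IBM Almaden method: it is a simplified best-first crawl over a toy in-memory graph, with a keyword-overlap scorer standing in for a real topic classifier, and it leaves out link analysis entirely. The names `relevance`, `focused_crawl`, and `TOY_WEB` are hypothetical and exist only for illustration.

```python
import heapq


def relevance(text, topic_terms):
    """Toy scorer: fraction of topic terms that appear in the page text."""
    words = set(text.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def focused_crawl(pages, seeds, topic_terms, threshold=0.5, max_pages=10):
    """Best-first crawl over an in-memory graph.

    `pages` maps url -> (text, [outlinks]) and stands in for real HTTP
    fetches. Links are followed only from pages judged relevant, and the
    frontier is ordered by the relevance of the page that linked to each URL.
    """
    frontier = [(-1.0, url) for url in seeds]  # negated: heapq is a min-heap
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # dangling link in the toy graph
        visited += 1
        text, links = pages[url]
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # off-topic page: do not expand its links
        relevant.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return relevant


# Hypothetical five-page web; "e" is on topic but only linked from "c".
TOY_WEB = {
    "a": ("cycling bike routes", ["b", "c"]),
    "b": ("bike repair and cycling gear", ["d"]),
    "c": ("stock market news", ["e"]),
    "d": ("mountain bike cycling club", []),
    "e": ("bike cycling forum", []),
}

result = focused_crawl(TOY_WEB, ["a"], ["cycling", "bike"])
print([url for url, _ in result])
```

In this toy run the crawl keeps pages a, b, and d. Page e is on topic but is never fetched, because its only inbound link comes from the off-topic page c. That is the basic trade-off of pruning the frontier by relevance: less wasted crawling, at the cost of missing relevant pages that sit behind irrelevant ones.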
Copernic Server |
Copernic
has repackaged their search tool into a server for businesses wishing to offer their own
specialized meta-search site. See the MetNets site as an example. |
|
XML Extender in DB2 |
The June issue briefly mentioned the new XML
Extender in IBM DB2 UDB. Here is a
longer description of the functionality this extender supports as it goes through
beta testing. |
WF for Resumes/Jobs |
The management
recruiting industry is hot! With a tight labor market for high tech personnel, these firms
are under pressure to find the right people quickly. Guess what information resource they
are using - the Web!  |
Searchable Public Records |
Search Systems offers a directory of 982
searchable databases containing land records, court cases, licensing registrations, and
much more for most of the U.S. state governments.
 |
WF Book Review |
The June issue of Enterprise Systems Journal contained a review
by Prof. Elliot King at Loyola College of the book Web Farming
for the Data Warehouse. He has captured the spirit, along with the content, of the
book.  |
Globalization To The Max |
A recent book, The Lexus and the Olive Tree
by Thomas Friedman, has created quite a stir. Regardless of your political
orientation, the book contains insights into the instant global world in which we now
live. Great airplane reading. And be sure to read the Amazon.com reader
reviews. |
WebData Databases |
ExperTelligence
offers WebData, another
comprehensive guide to searchable databases. Excellent Yahoo-style
organization. They specialize in "finding, categorizing and organizing online
databases." |
Latest Web Assessment |
Ever wonder how big the Web
really is? The latest assessment comes from Steve Lawrence and Lee
Giles of the NEC Research Institute. Based on a random sampling of IP
addresses, they estimated that there are 2.8 million active websites containing 800
million indexable webpages in 15 TB of data (only 6 TB of text once you strip out the HTML junk).
More importantly, they estimated that no single major search engine covers more than about 16% of
the indexable pages, with a strong bias towards popular, U.S. and commercial sites.
Request the free report. |
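To put those headline numbers in perspective, here is a small back-of-the-envelope calculation. It uses only the figures quoted above. The function name `derive` and the choice of decimal terabytes (1 TB = 10^12 bytes) are assumptions made for this sketch, not part of the report.

```python
# Figures quoted in the item above.
SITES = 2.8e6      # active websites
PAGES = 800e6      # indexable webpages
RAW_TB = 15        # terabytes including HTML markup
TEXT_TB = 6        # terabytes after stripping the HTML
COVERAGE = 0.16    # quoted share of indexable pages covered


def derive(sites, pages, raw_tb, text_tb, coverage):
    """Return rough ratios implied by the headline estimates."""
    bytes_per_tb = 1e12  # assumption: decimal terabytes
    return {
        "pages_per_site": pages / sites,
        "raw_kb_per_page": raw_tb * bytes_per_tb / pages / 1e3,
        "text_kb_per_page": text_tb * bytes_per_tb / pages / 1e3,
        "markup_share": 1 - text_tb / raw_tb,
        "pages_covered": coverage * pages,
        "pages_missed": (1 - coverage) * pages,
    }


stats = derive(SITES, PAGES, RAW_TB, TEXT_TB, COVERAGE)
for name, value in stats.items():
    print(f"{name}: {value:,.2f}")
```

The arithmetic works out to roughly 286 pages per site and just under 19 KB per page, of which 7.5 KB is text. At 16% coverage, about 128 million pages are indexed and some 672 million are not.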
|