Farming the Web

 
Endorsements  

The Internet has shaken up the world of business intelligence, especially in the retail sector. Want to find out what your competition is doing? Just log on to the Internet, and browse their online catalog.

Never has retail competition been so fierce, and never has business-intelligence data been so accessible. Online catalogs are a mother lode of your competition's pricing and inventory data, and it's in a machine-readable form. But getting at the data in online catalogs in a consistent way isn't always easy. Online catalogs are often vast and have relatively complex layouts.

The idea of extracting business intelligence from your competition's Web pages can be described as "Web farming," a term created by Dr. Richard Hackathorn. In his book, Web Farming for the Data Warehouse (Morgan Kaufmann, 1998), Hackathorn defines Web farming as "the systematic refining of information resources on the Web for business intelligence."

Pricing and product information is core intelligence data for retail-product industries, and many retail E-businesses employ intelligence gatherers who track the online catalogs of the competition. Automating this gathering effort is a necessity, and various software applications are emerging that make it fairly straightforward.

This category of products and services is still quite young. Hackathorn's book is the definitive reference, and his Web site (www.webfarming.com) tracks many of the products and related standards. He argues eloquently that the "benefits of Web farming can be global in scope for the enterprise." And I agree.

- InternetView: Farming The Web, by Jason Levitt, InformationWeek Columnist, November 1, 1999.

 

Richard Hackathorn clearly remembers the moment the central idea for his new book Web Farming for the Data Warehouse: Exploiting Business Intelligence and Knowledge Management (Morgan Kaufmann Publishers, Inc.: San Francisco 1999) came to him. He was sitting through a tedious presentation at a conference for chief information officers in 1997 when he jotted on his notepad that "the Web is the mother of all data warehouses." Hackathorn, a former professor who worked on fundamental concepts of enterprise systems, database management, decision support and data warehousing, is a well-known industry innovator and consultant. He defines Web farming as the systematic refining of information resources on the Web for business intelligence. To achieve that goal, relevant content found on the Web must be refined into a form that is compatible with a data warehouse.

Although the Web is a dynamic and expansive information space, it is a database designer’s worst nightmare, Hackathorn notes. It is a free-form combination of text, images and virtually any kind of information object for which somebody can develop a browser. It has no structure at all, just a series of links and pointers, many of which no longer work. Finding information on the Web, Hackathorn observes, is like trying to find a needle in a haystack while people are constantly adding and subtracting from the pile.

And while the Web has been productively used to distribute information from data warehouses to users for analysis, nobody has seriously addressed the possibility of using Web data as input for a data warehouse. Hackathorn makes just that argument in this book. While significant barriers must be overcome to make Web content suitable for a data warehouse, the benefits, he suggests, outweigh the costs.

Hackathorn has divided his book into four parts. Acknowledging the negative reaction many IT professionals have to even trying to incorporate Web content into a data warehouse, his first section is primarily motivational. He provides the business argument for Web farming -- noting that the efficient processes to turn data into information and then knowledge are essential to the well-being of any enterprise.

The second section of the book lays out a strategy for initiating Web farming efforts inside an enterprise. Adopting a structure first articulated in the 1970s by Richard Nolan, a professor at the Harvard Business School, Hackathorn lays out a four-step process. First, the business case based on the objectives and business environment of the enterprise must be made for Web farming. Then, the concept has to be accepted and an infrastructure built. Next, pipelines to users have to be established. Finally, the Web content must be structured for the warehouse.

The final two sections of the book look at some of the tools and sources of content available to implement Web farming projects -- there currently is no single solution for Web farming -- and the social and cultural ramifications of these efforts. In an extremely interesting section at the end of the book, Hackathorn proposes a code of ethics for Web farming.

Clearly, Hackathorn has a vision. As he notes, Web farming is not about technology or the Web. It is about basic business practices in the contemporary environment. Moreover, he has the certainty that a visionary needs. Indeed, he writes, "The development of Web farming is certain. It will become a standard function within data warehousing systems as companies strive in desperation for their next competitive advantage." Doing business as usual, he writes, will no longer be a viable strategy.

- Drill Down: Who Wrote the Books for DW? by Elliot King, an Associate Professor of Communications at Loyola College in Maryland. Book review in Enterprise Systems Journal, June 11, 1999.

 

Thorough: enough theory and plenty of examples. Dr. Hackathorn's compendium of data farming theory, techniques, and resources is about the most useful guide you can find for understanding the mining possibilities of the sprawling Internet. The not-too-technical first half is readable, and the second half is a treasure trove of tools and resources.

- A reader from Maryland, Amazon.com Review, June 4, 1999 

 

Ever hear of Web farming? Neither did we, until tipped off to this site. Web farming takes the best of data mining, intelligent agents and push technology, and creates a whole new discipline. Consultants and those selling just-named technology ought to learn more about Web farming -- it just might turn out to be the next "knowledge management."

Web farming is best defined as business intelligence using Web-based information resources. Similar to data mining, Web farming deals not with internally stored information, but the collection and internalization of information from external sources.

In layman's terms, Web farming is Web surfing, polished and with a purpose. Instead of searching the Web for information, Web farmers look for reliable data "seeds" which when combined and watered daily reap a harvest of usable information. The agriculture metaphor becomes tiresome but we think you get the point.

Why Web farming? In today's speed-of-light market twists and turns, internal data becomes less of a factor. In fact by focusing only on internal data you may actually be more vulnerable to outside factors. This site creates the wonderful analogy of a person "serenely contemplating his navel" while an unseen lion moves in for the kill.

At Webfarming.com you'll find lots of background information on this emerging practice, including links to Web farming articles in other magazines. A thin discussion forum offers the chance to share ideas on Web farming. Information on upcoming presentations, seminars, and workshops is to come in the next year. We'd like to see more content, and the design needs work (canary-yellow links on a white background?).

The site also represents a novel idea: write a book and promote it online with resources and discussion forums. While we expect Random House isn't beating down the barn doors to get at Webfarming.com's secrets, this use of the Web to promote the print medium is noteworthy.

The future of Web farming is up for debate, though the idea has merit. Even if you have little interest in Web farming, you should check out this site for no other reason than to say you knew it before it was big news.

- KM World Website Reviews (April 6, 1999)

 

In Web Farming for the Data Warehouse, author Richard D. Hackathorn applies his 30-plus years of information expertise to the novel concept of "Web farming." He lays out the methodology of cultivating the global Web for information relevant to an enterprise's operation. Although this title is targeted at information managers in large organizations, the basic ideas contained within can easily be applied to businesses of all sizes to some degree.

The first part of the book, aptly titled "Plowing the Soil," presents the importance of gathering information to build institutional knowledge for competitive reasons. The author explains how Web farming fits in with the more established concept of data warehousing and emphasizes the way high-capacity information gathering can change business processes.

In the central portion of the book, the author explains the process of moving an organization to Web farming from both technological and managerial perspectives. Then he gets into the details, explaining all of the related Internet standards, information tools, and online databases waiting to be tapped. This section is amazingly comprehensive.

The book finishes with a discussion of privacy and the effects of new information technology on society. If you're interested in Web farming or simply want a taste of where the Internet is likely to take us, this title is sure to provide a fresh perspective.

-Stephen W. Plain, Amazon.com

 

The gap between Internet and data warehouse is an important one to be bridged. This book does a very good job in spanning it.

-Bill Inmon, Founder, Pine Cone Systems, Inc.

 

This book is about using the web as a source of data for the data warehouse.  It gives good advice on separating useful web information from the useless "flaky-free" data.  As Richard points out, the internet is the "mother of all data warehouses" but few corporations have begun to harvest the hundreds of sources of data that will allow them to look at what is going on outside the enterprise.  His vision and the simplicity of explanation make this book a road map for the next five years of Business Intelligence.

What makes this book doubly useful, aside from the easy to read writing style, is that Richard has melded together the three biggest trends in our industry into a single strategy.  Combining the Internet, data warehousing, and knowledge management into one vision, Richard gives us insight into the next wave that will crash upon the industry.

I've been preaching this message to our customers only to find someone has written an entire book on it!

- Dan Graham, Strategy & Solutions Executive, IBM Global Business Intelligence Solutions

 

Hackathorn delivers a creative, thorough, and down-to-earth guide that lets you harvest Web information to improve decision-making and to enrich data warehouses.

-Maurice Frank, Business Intelligence Services National Practice, IBM Global Services

 

I have reviewed Dick Hackathorn's book, Web Farming for the Data Warehouse, and am very impressed.  He has done an outstanding job of surveying the many aspects of this new but rapidly emerging field.  His insight that "...the Web is the mother of all data warehouses" was only the starting point for the intellectual journey he describes in easy to digest prose.  Known to many as the "father of middleware" because of his pioneering work early in this decade, Dr. Hackathorn is once again in the forefront of another  technological wave.

-Dr. Donald R. Deutsch, Vice President, Middleware Development and Interoperability Center, Sybase Corporation

 

Great book on an important subject. . . Dr. Hackathorn's new book on web Farming is an important look at the merger of two major technologies - data warehousing and the World Wide Web. Readers will see the enormous value that can be gained from a systematic approach to collecting web information. Hackathorn's writing style makes the subject understandable to both the business manager and IT professional. The extensive list of resources is helpful to those who wish to quickly implement Web Farming systems.

-Christopher Ryan, President and CEO, Saligent Software

 

• Brings the reader to a new frontier of information processing ... a tantalizing proposal for those interested in the dynamics of being market responsive...

• The synthesis of many years of study in information warehousing surprisingly leads the reader to yet another plateau of applicability...

• A thoroughly researched, and thought stimulating book ... if you want to learn what will be happening in the Information Warehouse space over the next few years, get started by reading the book!

-Ian MacFadyen

 

Now that I have had the opportunity to review the book, I can say that Dick's work and the book based on that work represent one of the most exciting enterprises in the new world of human function and work. The process of identifying web content, acquiring it as validated sources, structuring it for storage and retrieval, disseminating it effectively based on tight customer profiles, and managing these tasks as part of a new data center service agency is indeed best practice in value-added knowledge management.  This book provides a cogent primer and description of the motivation, perspectives, foundations, methodology, architecture, management, standards, tools, resources, techniques, information landscape, challenges and exciting opportunities associated with web farming.  It describes a journey (not a destination), its emerging technology as applied to the web, and the many potential ways that this journey (web farming) will transform business function and management today and beyond.

Feel free to use any part of this, including my name, as you deem appropriate.

-James F. Williams II, Dean of Libraries, University of Colorado

 

Web Farming for the Data Warehouse is a compendium of information that is best described as systematically making intelligent use of the Web. The author makes a strong case for the Web as a valuable source of information for data warehousing and business intelligence. He explains the methodology for farming the Web -- creating a plan, building the infrastructure, identifying information sources, extracting data, analyzing it, and presenting information.

The book contains a wealth of information about content-providers, protocols, standards, tools, discovery services, knowledge management, Web agents, and data mining software. The book also addresses topics such as leveraging knowledge and creating information markets. The author, Richard Hackathorn, is a recognized expert in enterprise computing and database connectivity. This is his second book on data warehousing. He co-authored Using the Data Warehouse (Wiley) with Bill Inmon.

Hackathorn's book is as close to leisure reading as I expect to find in an IT book. It belongs on the must-read list for anyone having an interest in exploring the Web's potential as an information resource.

- Ken North, consultant and database expert

 

This is a breakthrough book about gaining competitive advantage through effective use of information-based technology.

-Jerry Donahue, President, BTI, the NBIA winner of International Technology Incubator-1998, and U.S. SBA National award for Technology Development.

 

The book is an excellent survey on the state of the art in web-based information gathering.  Dick covers the gamut; it's an A-to-Z resource for effective tools, data, and techniques for anyone who bears the dubious title of "knowledge worker".  WebFarming is well researched by an industry veteran on the cutting edge of web-based information gathering and business intelligence.

-Jim Harding, Chief Technology Officer, Cartia, Inc

 

Dr. Richard Hackathorn presents a different philosophy and more effective manner of collecting data using the vast World Wide Web, in his concept of Web Farming.  His idea encourages the user to look at the big picture, using greater vision, and long-term focus.  The practical process of data collection used in Web Farming is presented in an easy manner in Web Farming for the Data Warehouse, as Dr. Hackathorn's writing style is conversational and simple to follow.  The illustrator he selected complements this manner, as the pictures are clear, humorous and entertaining.  I do not possess a hard technical background, yet I have worked in the high-tech industry for the past 6 years.  I was able to read this book and understand the message Dr. Hackathorn was conveying.  As a manager, understanding the concept of Web Farming and knowing the tools necessary to set up a web farm will play a great role in my career in high technology.  I highly recommend this book to professionals in the data warehouse and research community, managers in the high-tech industry, or any industry for that matter, and those visionaries looking towards the horizon.

-Angenette N. Rider, Manager, Access Graphics

 

In early 1998, at the Fortune IT Strategy Forum, Peter Drucker told his audience that "the single biggest challenge you face is to organize outside data, because change occurs from the outside" and went on to observe that today's management, while swamped with inside data, doesn't have any more real information than it did 40 years ago. Richard Hackathorn's book addresses Drucker's point in spades. Frankly, the book is ahead of its time. I think that not only will it help readers think outside the proverbial box, but also give them the roadmap for implementing their own Web farming. Outstanding list of annotated resources as well.

-Karen Watterson, data and knowledge warehouse design consultant



Copyright ©1998-2003 Bolder Technology, Inc. dba  WebFarming.com.
All rights reserved worldwide. Revised 2003-06-02 04:56 PM
Site Design by A Net Presence, Inc.