| The most difficult part of web farming is the
rendezvous with the data warehousing system. Many people pursue data warehousing systems
for simplistic reasons and with unrealistic expectations. These systems often become
black holes into which data is poured never to be seen again. Both the Web
and data warehousing are hot technologies receiving considerable attention within the IT
industry. In many areas, the combination has proven highly successful. However, no one has
seriously considered extracting content from the Web and using it as input to the data
warehouse. Reactions to using web content tend to be negative, and the usual objections run as follows. Web content is too
unreliable and unstable for business decisions. The interaction with web sites is too
messy. Transformation of hypertext into a structured database is often impossible. Images
and sound contain a lot of hidden content, but that content is not discernible to a machine.
Consider a simple data schema for a sales
warehouse. In this warehouse, we have sales data by customer, product, and store
aggregated on a weekly basis. Let's assume that we have mostly corporate customers, rather
than individuals, as in a large office furniture company.
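
As a rough illustration of the schema described above, the sketch below builds a small star schema in an in-memory SQLite database: one fact table of weekly sales keyed by customer, product, and store. All table names, column names, and the helper functions are assumptions made for this example; the text does not prescribe a particular layout.

```python
import sqlite3

# Hypothetical star schema; every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE store    (store_id    INTEGER PRIMARY KEY, city TEXT NOT NULL);
CREATE TABLE weekly_sales (
    week_start  TEXT    NOT NULL,  -- ISO date of the first day of the week
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    store_id    INTEGER NOT NULL REFERENCES store(store_id),
    units       INTEGER NOT NULL,
    revenue     REAL    NOT NULL,
    PRIMARY KEY (week_start, customer_id, product_id, store_id)
);
"""


def build_warehouse() -> sqlite3.Connection:
    """Create an empty in-memory warehouse with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def revenue_by_customer(conn: sqlite3.Connection) -> dict:
    """Total revenue per customer name, summed across all weeks."""
    rows = conn.execute(
        "SELECT c.name, SUM(s.revenue) FROM weekly_sales s "
        "JOIN customer c USING (customer_id) "
        "GROUP BY c.name ORDER BY c.name"
    )
    return dict(rows.fetchall())
```

The composite primary key on the fact table reflects the weekly grain: one row per customer, product, and store for each week.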
Web farming would add value by enriching the demographic data about customers (for example,
quarterly financials), such as the filings you can find on the EDGAR Web site. By adding
information on customer demographics, we can perform selective marketing based on the profitability and requirements
of customers. By knowing what types of customers buy what types of products at which
stores, we can promote specific sales and anticipate demand.
Demographic information is added to the customer dimension to enhance analyses. As
experience with the demographics matures, data mining techniques can cluster customers
into meaningful categories based on demographics.
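
One way to picture this enrichment step is the minimal sketch below, which attaches a web-farmed quarterly revenue figure to each customer record and assigns a coarse segment. The `Customer` class, the `enrich` function, matching by company name, and the revenue threshold are all assumptions for illustration. A simple threshold rule stands in here for the data mining techniques mentioned above; it is not a clustering algorithm.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Customer:
    """A row of the customer dimension (hypothetical layout)."""
    customer_id: int
    name: str
    quarterly_revenue: Optional[float] = None  # web-farmed attribute
    segment: str = "unknown"


def enrich(
    customers: List[Customer],
    financials: Dict[str, float],
    large_threshold: float = 1_000_000_000.0,  # arbitrary cutoff, an assumption
) -> List[Customer]:
    """Attach farmed financials by company name and assign a coarse segment."""
    enriched = []
    for customer in customers:
        revenue = financials.get(customer.name)
        if revenue is None:
            # No web data found: keep the row unchanged rather than guess.
            enriched.append(customer)
            continue
        segment = "large" if revenue >= large_threshold else "small"
        enriched.append(
            replace(customer, quarterly_revenue=revenue, segment=segment)
        )
    return enriched
```

Customers with no matching web data keep the segment "unknown", so analyses can separate enriched rows from those still awaiting data.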
In many ways, the data warehouse is not a
requirement for Web farming. You could successfully farm the Web, reaping tremendous value
for the business and bypass the data warehouse entirely. However, establishing the Web
farming function is much easier for an enterprise if it has a mature understanding of data
warehousing. In many ways, the current benefits from data warehousing are "low-hanging
fruit" -- easy accomplishments (relatively speaking) that purge the sins of
monolithic legacy systems.
Web farming will challenge us with deeper issues concerning information refinement and
knowledge management. Web farming will be an agent of change (even of a disruptive sort)
to the controlled and structured world of data warehousing. This is a necessary change --
a maturing of the basic objectives of data warehousing into a practical step toward
knowledge management for the enterprise. |