Data Warehousing Term paper
Contents
1. Introduction
2. What is a data warehouse
3. Past, Present and Future
4. Data Warehouses and Business Organisations
5. Conclusion
6. Bibliography
1.0 Introduction
In recent years, data warehousing has emerged as the primary method of analysing sales and marketing data for a competitive advantage. As the number of knowledge workers using the data warehouse/data mart grows and the amount of data increases daily, performance problems have become a major concern of both the Information Systems staff and the users.
Many options have been tried in an attempt to solve the performance problems - from bigger hardware to different software or database tuning and redesign using star schemas or snowflake data structures. However, all have limitations - either in functionality or in terms of cost - and their strengths are almost inevitably outstripped by users' demands.
During the past three years, data warehousing has emerged as one of the hottest trends in information technology for corporations seeking to utilise the massive amounts of data they are accumulating.
Managers from all business disciplines want enterprise wide information access, as well as the ability to manipulate and analyse information that the company has gathered for a single purpose, to make more intelligent business decisions. Whether to increase customer value, identify new markets or improve the management of the firm's assets, the data warehouse promises to deliver the information necessary to accomplish these tasks quickly and efficiently.
This report covers various aspects of Data Warehousing, ranging from a clear and concise definition of its working system through to its operational environment. It discusses its implications and effects on internal and external interaction. I have presented my findings with the backing of some actual case studies and elaborated upon the evolution, the current state and what the future holds for Data Warehousing.
The report is summarised by a final conclusion.
2.0 What is a Data Warehouse
The data warehouse is the centre of the architecture for information systems in the 1990s. It supports informational processing by providing a solid platform of integrated, historical data from which to do analysis. It provides the facility for integration in a world of unintegrated application systems. It is built in a step-at-a-time fashion, and it organises and stores the data needed for informational, analytical processing over a long historical time perspective.
There is indeed a tremendous advantage in building and maintaining a data warehouse.
So now the question arises, what is a data warehouse?
A data warehouse is a
+ subject-orientated
+ integrated
+ time-variant
+ non-volatile
collection of data in support of management's decision-making process.
The data entering the data warehouse comes from the operational environment in almost every case.
The data warehouse is always a physically separate store of data transformed from the data found in the operational environment.
To understand the data warehouse in more detail, I shall now elaborate upon its main characteristics.
2.1 Subject-Orientated
The first feature of the data warehouse is that it is oriented around the major subjects of the enterprise. The data-driven, subject orientation is in contrast to the more classical process/functional orientation of applications, which most older operational systems are organised around.
For example, the operational world of a financial institution might be designed around applications and functions such as loans, savings, bank cards and trusts, whereas the data warehouse world would be organised around major subjects such as customer, vendor, product and activity. The alignment around subject areas affects the design and implementation of the data found in the data warehouse. Most importantly, the major subject areas influence the most important part of the key structure.
The application world is concerned both with database design and process design. The data warehouse world focuses on data modelling and database design exclusively. Process design is not part of the data warehouse environment.
The differences between process/function application orientation and subject orientation show up as a difference in the content of data at the detailed level as well. Data warehouse data excludes data that will not be used for Decision Support System (DSS) processing, while operational application-oriented data contains data to satisfy immediate functional/processing requirements that may or may not be of use to the DSS analyst.
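The contrast between application orientation and subject orientation can be sketched in a few lines of Python. This is only an illustration: the record layouts, the field names (customer_id, loan_balance, savings_balance) and the function build_customer_subject are assumptions made for the example, not part of any real system described in this report.

```python
from collections import defaultdict

# Hypothetical extracts from two application-oriented systems.
loans = [
    {"customer_id": 17, "loan_balance": 12000.0},
    {"customer_id": 42, "loan_balance": 500.0},
]
savings = [
    {"customer_id": 17, "savings_balance": 3400.0},
]


def build_customer_subject(loan_rows, savings_rows):
    """Regroup application records around the 'customer' subject."""
    customers = defaultdict(lambda: {"loans": [], "savings": []})
    for row in loan_rows:
        customers[row["customer_id"]]["loans"].append(row["loan_balance"])
    for row in savings_rows:
        customers[row["customer_id"]]["savings"].append(row["savings_balance"])
    return dict(customers)


customer_subject = build_customer_subject(loans, savings)
```

Each application keeps its own view of the customer, while the subject-oriented structure gathers everything known about one customer under a single key.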
2.2 Integration
Easily the most important aspect of the data warehouse environment is that data found within the data warehouse is integrated. Always, with no exceptions.
The integration shows up in many different ways - in consistent naming conventions, in consistent measurement of variables, in consistent encoding structures, in consistent physical attributes of data, and much more.
When data is moved to the data warehouse from the application-oriented operational environment, the data is integrated before entering the warehouse.
Over the years, different application designers have made numerous individual decisions as to how an application should be built. The style and the individualised design decisions of the application designer show up in a hundred ways: in differences in encoding, in key structures, in physical characteristics, in naming conventions, and so forth.
The collective ability of many application designers to create inconsistent applications is legendary.
Two examples illustrate the point:
Encoding - application designers have chosen to encode the field GENDER in different ways. One designer represents GENDER as an "M" and an "F". Another application designer represents GENDER as a "1" and a "0", whilst another represents GENDER as an "x" and a "y". And yet another represents it as "male" and "female". It doesn't matter much how GENDER arrives in the data warehouse. "M" and "F" are probably as good as any representation. What matters is that whatever source GENDER comes from, it must arrive in the data warehouse in a consistent integrated state. Therefore when GENDER is loaded into the data warehouse from an application where it has been represented in other than an "M" and "F" format, the data must be converted to the data warehouse format.
Measurement of attributes - application designers have chosen to measure pipeline in a variety of ways over the years. One application designer stores pipeline data in centimetres. Another stores pipeline data in terms of inches, whilst another stores the data in million cubic feet per second. And another designer stores pipeline information in terms of yards. Whatever the source, when the pipeline information arrives in the data warehouse it needs to be measured the same way.
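The GENDER conversion can be sketched as a lookup table per source application. The application names (app_a to app_d) and the function to_warehouse_gender are hypothetical, and the assignment of "1" and "x" to "M" is an assumption, since the source codes themselves do not say which value means which.

```python
# Hypothetical per-application encodings, each mapped to the warehouse
# format "M" / "F". Which source code means which gender is assumed here.
GENDER_MAPS = {
    "app_a": {"M": "M", "F": "F"},
    "app_b": {"1": "M", "0": "F"},
    "app_c": {"x": "M", "y": "F"},
    "app_d": {"male": "M", "female": "F"},
}


def to_warehouse_gender(source, value):
    """Convert one source-specific GENDER code to the warehouse encoding."""
    try:
        return GENDER_MAPS[source][value]
    except KeyError:
        # Reject unknown sources or codes rather than load inconsistent data.
        raise ValueError(f"unknown GENDER code {value!r} from {source!r}") from None
```

For instance, to_warehouse_gender("app_d", "female") yields "F", so every record reaches the warehouse in the same encoding regardless of where it came from.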
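A minimal sketch of the measurement conversion, assuming the warehouse standard is centimetres (the choice of standard unit, the unit labels and the function name to_centimetres are assumptions made for illustration). A flow-rate figure measures a different quantity from a length, so this sketch rejects it rather than guessing at a conversion.

```python
# Conversion factors to the assumed warehouse unit (centimetres).
CM_PER_UNIT = {
    "cm": 1.0,
    "inch": 2.54,
    "yard": 91.44,
}


def to_centimetres(value, unit):
    """Convert a pipeline length from a source unit to centimetres."""
    try:
        factor = CM_PER_UNIT[unit]
    except KeyError:
        # Units with no length conversion are refused, not silently loaded.
        raise ValueError(f"no length conversion defined for unit {unit!r}") from None
    return value * factor
```

Keeping the factors in one table means each source system needs only a unit label, and the warehouse stores every pipeline length in one unit.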
The issues of integration affect almost every aspect of design - the physical characteristics of data, the dilemma of having more than one source of data, the issue of inconsistent naming standards, inconsistent date formats, the list is endless.
Whatever the design issue, the result is the same - the data needs to be stored in the data warehouse in a singular, globally-acceptable fashion even when the underlying operational systems store the data differently.
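Inconsistent date formats can be handled in the same way. The sketch below assumes three hypothetical source applications with known date formats and an assumed warehouse standard of ISO 8601 (YYYY-MM-DD); none of these names or formats come from a real system.

```python
from datetime import datetime

# Hypothetical date formats used by three source applications.
SOURCE_DATE_FORMATS = {
    "app_a": "%Y-%m-%d",   # 2012-05-23
    "app_b": "%d/%m/%Y",   # 23/05/2012
    "app_c": "%m-%d-%y",   # 05-23-12
}


def to_warehouse_date(source, text):
    """Parse a source-specific date string and return it as YYYY-MM-DD."""
    parsed = datetime.strptime(text, SOURCE_DATE_FORMATS[source])
    return parsed.date().isoformat()
```

A date that does not match the declared format raises a ValueError from strptime, so malformed values are caught before they enter the warehouse.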
When the DSS analyst looks at the data warehouse, the focus of the analyst should be on using the data that is in the warehouse, rather than on wondering about the credibility or consistency of the data.
2.3 Time Variancy
All data in the data warehouse is accurate as of some moment in time. This basic characteristic of data in the warehouse is very different from data found in the operational environment. In the operational environment when you access a unit of data, you expect that it will reflect accurate values as of the moment of access.
Because data in the data warehouse is accurate as of some moment in time (i.e., not "right now"), data found in the warehouse is said to be "time variant."
The time variancy of data warehouse data shows up in several ways. The simplest way is that data warehouse data represents data over a long time horizon - from five to ten years. The time horizon represented for the operational environment is much shorter - from the current values of today up to sixty to ninety days. Applications that must perform well and must be available for transaction processing must carry the minimum amount of data if they are to have any degree of flexibility at all. Therefore operational applications have a short time horizon, as a matter of sound application design.
Another way that time variancy appears is that data warehouse data, once correctly recorded, cannot be updated. In some cases it may be unethical or even illegal for data in the data warehouse to be altered. Operational data, being accurate as of the moment of access, can be updated as the need arises.
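Time variancy can be illustrated with a series of dated snapshots. In this sketch the account balances, the dates and the function balance_as_of are invented for the example; the point is only that every warehouse value carries the moment at which it was accurate.

```python
from datetime import date

# Hypothetical month-end snapshots of one account balance.
snapshots = [
    (date(2011, 1, 31), 1000.0),
    (date(2011, 2, 28), 1250.0),
    (date(2011, 3, 31), 900.0),
]


def balance_as_of(rows, when):
    """Return the value of the latest snapshot taken on or before `when`."""
    eligible = [row for row in rows if row[0] <= when]
    if not eligible:
        return None  # no snapshot exists that early
    latest = max(eligible, key=lambda row: row[0])
    return latest[1]
```

An operational system would hold only the current balance; the warehouse keeps each snapshot, so a question such as "what was the balance in mid-March?" can still be answered.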
2.4 Non-volatile
The fourth defining characteristic of the data warehouse is that it is non-volatile. Data in the operational environment is regularly inserted, changed and deleted, whereas data in the data warehouse is subject to only two operations: the initial loading of the data and the access of the data. This seemed very simple to me, but after extensive research I understood that its implications were very powerful.
For example, at the design level there is no need to guard against the problems that updates cause, since data is not updated. Therefore at the physical level of design, liberties can be taken to optimise the access of data, particularly in dealing with the issues of normalisation and physical de-normalisation.
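The two permitted operations can be sketched as a small class that exposes loading and access and nothing else. The class name WarehouseTable and its methods are hypothetical, intended only to make the idea concrete rather than to describe a real product.

```python
class WarehouseTable:
    """A store that supports only loading rows and reading them back."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Loading appends copies; rows already stored are never changed.
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate):
        # Access returns copies, so callers cannot alter the stored data.
        return [dict(row) for row in self._rows if predicate(row)]
```

Because there is deliberately no update or delete method, the physical layout can be tuned purely for read access.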
3.0 Past, Present and Future
Data warehouses represent the latest great paradigm of database management. The earliest data management systems were hierarchical, ran on massive mainframes, and were used primarily for archival purposes. The first big change came in the early 1980s, with the adoption of relational database systems, which have primarily operational applications. These systems, typically run on minicomputers, are used for online transaction processing (OLTP), for example to operate networks of automated teller machines. Now come data warehouses, commonly run on client/server networks of personal computers and more...
