Unused data in the data warehouse is of no use at all. It takes up valuable space and often has a negative effect on the performance. For that reason, we generally don’t load all the data we have available in our source systems into a datawarehouse. We select only the data that is considered relevant for decision-making, customer relationship management (CRM) and performance management. However, after analysis and reporting, you may want to have some detailed data available to be able to take certain actions.
Direct access to operational systems
A large pharmaceutical wholesaler wants, due to the economic climate, to cut inventory without affecting the delivery service levels. After thorough analysis of the revenues, logistics management decides to change the policy regarding the product assortment. Products sold less than five times a year will be removed from the product list, with the exception of those that are bought by loyal or profitable customers. The vendors of these products need to be informed.
To do this properly the purchasers should have a list of vendors, with their product codes, their full address data, the contact people, their phone numbers and other detailed product data. That type of information is not available in the data warehouse itself, so either this data needs to be transferred from the operational systems or it needs to be done manually, which means a lot of work.
Another possibility is to get the IT department to take the data that has been selected from the data warehouse and use it to select data from the original source systems to provide the additional information required. Finally, the selection that has been created should be downloaded to allow reporting, or possibly a mail merge to send a letter to the vendor. All in all not what anyone would call simple closed-loop business intelligence!
A new concept: on-demand data integration
All this complex moving data backwards and forwards is no longer necessary with a concept called on-demand data integration. First, all of an organization’s data is modelled in a meta data repository, which could be the repository of the ETL tool. Per data element, we define the transformations, straight forward for simple detail data, or more complex steps for derived data like performance indicators.
In addition, we decide which elements we need to store in the data warehouse (for most organizations the focus here is on decision making) and which elements we do not need (mostly operational detail data).
Webservices provide one interface to corporate data
After building and loading the data warehouse, we build a webservice that presents the meta data layer to the user; this will probably be either one of the Business Intelligence solutions or a portal. All the data the organization keeps about customers, products, vendors, processes, etc. is presented to the user through one interface. To the user it looks like one system, in the background different non-integrated systems are involved.
How does it work?
The webservice interacts with the ETL tool, so that when a user asks to see specific information, the webservice knows if the data is stored in the data warehouse, or should be extracted, integrated and transformed real-time from the source system(s). In the background, an ETL process is called to retrieve and transform the data.
For performance reasons it is necessary to ensure that the operational system(s) are not overburdened with queries retrieving a lot of data.
It is also important that for on-demand data integration that all the different components of the architecture are properly integrated; that means the reporting tools and portal with the webservice, the webservice with the ETL tool and meta data repository, and the ETL tool with the source systems.
If it works properly, on-demand data integration fully supports the principle of closed-loop (operational) business intelligence. Employees can easily switch from the analysis phase to the action phase, being able to base their decisions (and consequently their actions) on reliable information and up to date knowledge.
The operational data store – ODS
Some organizations have built an operational data store (ODS), for the kind of detailed information described above allowing analysis and reporting on both detailed and near real-time data. The data in an ODS is generally fairly new, not more than a few hours (often minutes) old, and has same structure as the source systems. Although such a database can be valuable, it requires a lot of work to built and maintain it, even with an ETL tool.
Additionally it makes the data warehouse architecture much more complex. However, an operational data store may be necessary when you require (near) real-time insight into business processes performance, or when highly detailed information is needed. These goals are often difficult to combine with a central data warehouse, modelled with star- or snowflake schemas, and optimized for (high-level) decision-making.
When an ODS is built, most organizations try to maximize its’ use, moving the operational reports built-in the source systems to the ODS, by doing so, a fully independent (operational) reporting environment has been created. All the information needed for operational, tactical and strategic decision-making can be found in one system.
Organizations should be aware of the concept and advantages of on-demand data integration, and decide for themselves if it can be useful. If so, it can be seen as an alternative to building a relatively expensive ODS.