Although popular, this method of transferring data is also the most prone to problems. It has four major drawbacks: unreliable transfer of data, lack of data validity, undesired downtime, and the overall latency of data gathering.

9.1.1 ETL Reliability and Data Validity

The process of merging and eliminating duplicate records is not an exact science. Checking for completeness and data integrity can be a manual process. If a file transfer fails, the process may just need to be restarted. If the file transfer succeeds but the import fails, a lengthy reconciliation may be needed to deal with the partial import.

In a global organization, not every country or region has reliable networking links. There may be slow dial-up lines, and WANs over satellite links can be susceptible to connection problems as well. According to a CIO from a global manufacturing company that I recently interviewed, the solution to this problem often involves adding more IT staff.

If an "overnight" dump-and-load batch job takes several hours to complete, compensating for failures in this process can be even more costly in terms of time. The time needed to reconcile the resulting problems may even overlap the time when the next nightly batch transfer is set to begin, leaving the organization in a situation that is difficult to recover from.

And there is a deeper, more fundamental problem, even when everything works as it's supposed to: the inherent latency of batch processing. In a global economy, there is no such thing as "overnight" anymore. Any business that has regional offices, business units, or remote storefronts can't really afford to just "power down" once a day in order to synchronize all of its data. The applications in question are ones that affect your ability to generate revenue, so any time they are not up and running could be costly to your organization. What's really needed is a move toward near real-time processing of information.
As we will see in this chapter's case study, today's solutions often incur the cost of missed business opportunities and the unnecessary lockup of excess capital.

9.1.2 Undesired Downtime and the Logistics of Data Synchronization

In a case study of another global manufacturing company that is currently adopting an ESB, the entire end-to-end integration process along the supply chain was being done using batch file transfers. Some of the applications were able to do a complete dump-and-load replace of data within a two-hour window, but the application had to go offline in order for this to occur. In other cases, a complete dump and load wasn't possible; a more complicated merge-and-purge process was required, which usually took eight hours to complete. The application involved was too important to take down every day for eight hours, so the company compromised by taking the application offline once a week for a merge-and-purge operation.

9.1.3 Overall Latency of Data Gathering

The biggest problem of ETL is that even when all systems are running smoothly and all exports and imports of data are happening successfully, there is still at least a 24-hour delay in getting access to real information. And this is the best-case scenario; as we have seen, the delay in getting data transferred and synchronized can be as long as a week. This problem is often further magnified by the number of times the data is transferred between systems (Figure 9-1).

Figure 9-2. The latency of overnight batch processing is magnified across departments

With the latency of overnight batch processing, it's difficult to determine the real on-hand inventory, which could result in false backorder situations or overpromised delivery of goods. Companies have traditionally been forced to overbloat inventory (described in the next section) in order to compensate for not having an accurate snapshot of their supply line at any given moment in time.
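The way latency compounds across successive transfers can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the transfers run one after another and that a record which just misses a batch run must wait a full interval for the next one; the function name `worst_case_staleness_hours` is hypothetical and not part of any product discussed in this chapter.

```python
def worst_case_staleness_hours(batch_intervals_hours):
    """Upper bound on how old data can be after a chain of batch transfers.

    Assumption: transfers are sequential, and at each hop a record may
    wait up to one full batch interval before it is moved on.
    """
    if any(interval < 0 for interval in batch_intervals_hours):
        raise ValueError("batch intervals must be non-negative")
    return sum(batch_intervals_hours)


if __name__ == "__main__":
    # Three nightly transfers between departments.
    print(worst_case_staleness_hours([24, 24, 24]))
    # One nightly transfer followed by a weekly merge-and-purge.
    print(worst_case_staleness_hours([24, 7 * 24]))
```

Under these assumptions, three nightly hops can leave data up to 72 hours old, and a single weekly merge-and-purge step dominates everything else in the chain.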
To see how latency can hurt your business, let's walk through an example of a large-quantity order with a return, using a 24-hour delay as the latency period. Figure 9-2 shows the three applications involved in this process: Order Management, Inventory Management, and Shipping/Fulfillment Processing. Each night, the sales order, inventory, and customer service applications synchronize their data using an ETL process. As illustrated in the figure, the order management application updates the master inventory application, and the shipping/fulfillment application also provides information about the goods returned, restocked, and reshipped (for replacement inventory). The consolidated data, with a newly calculated on-hand inventory, is then synchronized from the master inventory back to the order management and shipping/fulfillment applications.

Figure 9-3. Nightly batch transfers of on-hand inventory data between applications

The manufacturing company has Service-Level Agreements (SLAs) in place with its buyers that specify that it must reply within 2 hours of receiving a request for an order. Let's say that on day 1, after a successful nightly batch transfer, Customer A orders 400K units, as illustrated in Figure 9-3. All applications are showing a consistent representation of 500K units in stock; therefore, the order is confirmed within the timeframe mandated by the SLA.

Figure 9-4. Latency due to overnight batch transfers can lead to false indications of inventory

On day 2, after another successful nightly batch transfer, all applications reflect the deducted inventory and show an on-hand supply of 100K units. As illustrated in Figure 9-4, Customer B now returns 300K units, replenishing the stock back to 400K, and Customer C places an order for 400K units. But because the on-hand inventory amount has not been updated, the sales order application thinks there are still only 100K units in inventory.
Because the SLA requires that a firm commitment to deliver be issued within 2 hours, the order request can't be fulfilled, resulting in lost business.

Figure 9-5. False backlog condition due to latency of batch transfer and update

Another variation on this problem is the violation of the SLA based on overpromising goods. As in the previous example, this too is based on not having a real-time snapshot of on-hand inventory. Again, we start out with a 500K supply of on-hand inventory, 400K of which is purchased by Customer A. At the same time, Customer B returns 300K units because they were defective. The problem is that Customer B has an SLA specifying that the defective units must be replaced immediately. As illustrated in Figure 9-5, this condition results in an overpromise of 200K units over the available inventory.

Figure 9-6. Negative inventory balance due to latency of batch transfer of data

The SLAs with both buyers also carry stiff fines when the orders cannot be fulfilled as promised. Buyers need these kinds of SLAs because if they can't get their supplies, their own revenue-generating opportunities are put at risk. And the latency problems don't end there: in this case, the supplier doesn't even know that a problem has occurred until the next successful batch update!
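The arithmetic behind the two scenarios above can be reproduced with a small simulation. This is only an illustrative sketch: the `InventoryView` class and the helper functions are hypothetical names invented for this example, and the model assumes that order management decides solely against its last nightly snapshot.

```python
from dataclasses import dataclass


@dataclass
class InventoryView:
    """Actual stock alongside the stale figure order management sees."""
    actual: int    # units really on hand (master inventory)
    snapshot: int  # units order management believes are on hand

    def nightly_sync(self):
        # Batch transfer: the snapshot only catches up once per night.
        self.snapshot = self.actual

    def restock(self, units):
        # Returned goods go back into stock; the snapshot does not move.
        self.actual += units

    def request_order(self, units):
        # The order is judged against the snapshot, not the real stock.
        if units > self.snapshot:
            return False
        self.snapshot -= units
        self.actual -= units
        return True


def false_backlog_scenario():
    view = InventoryView(actual=500_000, snapshot=500_000)
    view.request_order(400_000)   # day 1: Customer A is confirmed
    view.nightly_sync()           # both figures now read 100K
    view.restock(300_000)         # day 2: Customer B's return, actual is 400K
    accepted = view.request_order(400_000)  # Customer C, judged against 100K
    return accepted, view.actual


def overpromised_units(on_hand, commitments):
    # Units promised beyond what is really available (0 if none).
    return max(0, sum(commitments) - on_hand)


if __name__ == "__main__":
    print(false_backlog_scenario())                         # (False, 400000)
    print(overpromised_units(500_000, [400_000, 300_000]))  # 200000
```

In the first scenario the order is rejected even though 400K units are physically available; in the second, the 400K order and the 300K replacement together exceed the 500K on hand by 200K.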