Causal Analytics
 
 

The Results of Causal Analytics
On the previous page, we described three causal analytic methods: Statistical Correlation, Significant Event Detection and Non-Linear Regression Modeling.  Each method has its own strengths, and fusing them together creates an improved final result.  In this discussion, we use 1000 one-minute data rows (only about 16.7 hours of data) from an oil stabilization column, whose product is oil of a desired composition.  This composition is measured by its Reid Vapor Pressure, or "RVP".  Raw oil enters the tower's middle section and is heated; gas vapors leave from the top of the tower and the stabilized oil comes out the bottom.  The RVP is controlled by adjusting the tower temperature, which in turn is adjusted by taking a side-stream of the bottom oil product, heating it in a heater, then sending it back into the side of the tower.

In this tower's analysis we use the three causal analytical methods together, then fuse the results into a first-order "Causal Map".  "First order" means the direct independent variables' effects upon a variable of interest.  This processing took a few minutes on a 2 GHz laptop and yielded substantial information about the tower behavior.

Statistical Correlation Sub-Results
Statistical Correlation shows the linear correlation between variables through time.  In this case, we look at tower temperature (the control handle) vs. RVP (the performance variable).  The chart below plots windowed correlation (vertical axis) vs. time (horizontal axis) as we traverse the tower's data.  The first 100 records can be ignored because, with the settings used, the causal analysis tool is still aggregating data.  We can see that Tower Temp is mostly inversely correlated with RVP: as temperature goes up, the RVP goes down, which is correct for this process.
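As a rough illustration of what a windowed correlation computes, here is a minimal Python sketch.  It is not the tool's implementation: the function names, the 100-row window (a guess based on the 100-record warm-up mentioned above) and the synthetic temperature and RVP series are all assumptions made for the example.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def windowed_correlation(xs, ys, window=100):
    """Return one value per row: None until a full window exists, then the
    correlation over the most recent `window` rows."""
    out = []
    for i in range(len(xs)):
        if i + 1 < window:
            out.append(None)  # still aggregating data
        else:
            lo = i + 1 - window
            out.append(pearson(xs[lo:i + 1], ys[lo:i + 1]))
    return out

# Synthetic stand-in data (NOT the tower data): RVP falls as temperature rises.
rng = random.Random(0)
temp = [150 + 5 * math.sin(t / 20.0) + rng.gauss(0, 0.5) for t in range(300)]
rvp = [10 - 0.2 * (x - 150) + rng.gauss(0, 0.3) for x in temp]

corr = windowed_correlation(temp, rvp, window=100)
print(corr[99], corr[-1])  # negative values indicate an inverse relation
```

A sign change in this series from one window to the next would correspond to the "flips" in correlation that an on-line monitor could alert on.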

The Statistical Correlation's R-Squared (Coefficient of Determination) is reported by the tool as 0.13, which is low.  We can also see "flips" in the correlation chart above, which point us to interesting time periods to investigate later, if we desire.  In an on-line scenario, alerts could be raised on reversals in the effect of our control handle.  The same set of analytics indicates, from a regression standpoint, that there is a time lag averaging 8 minutes between a change in Tower Temperature and the response in RVP (see below).  The chart below has time lag on the vertical axis vs. time on the horizontal axis as we traverse the tower's data.  Again, the first 100 records can be ignored because the causal analysis tool is still aggregating data.  As with correlation, a significant change in lag time tells us that "something is awry" with the process.
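One common way to estimate a lag of this kind is to shift the effect series back by each candidate lag and keep the lag with the strongest correlation.  The sketch below does that in Python.  The tool's own regression-based method is not described on this page, so treat this as a stand-in technique; `best_lag`, the 30-row search range and the synthetic data with a built-in 8-row delay are assumptions for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def best_lag(cause, effect, max_lag=30):
    """Return (lag, correlation) for the lag in 0..max_lag whose lagged
    correlation has the largest absolute value."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = cause[:len(cause) - lag]   # cause at time t
        y = effect[lag:]               # effect at time t + lag
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic stand-in data (NOT the tower data): the effect responds
# 8 rows after the cause, with the opposite sign.
rng = random.Random(1)
temp = [rng.gauss(0, 1) for _ in range(500)]
rvp = [(-0.8 * temp[t - 8] if t >= 8 else 0.0) + rng.gauss(0, 0.3)
       for t in range(500)]

lag, r = best_lag(temp, rvp)
print(lag, r)
```

Running the same search inside a moving window, rather than over the whole record, would give a lag-versus-time series like the one charted here.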

Significant Event Sub-Results
Using Significant Event analytics, we receive information on lag time and causal direction.  Causal direction is analogous to the sign of a statistical correlation: a negative direction implies a negative correlation between the variables.  In the chart below, we see a windowed lag time (vertical axis) vs. time (horizontal axis) as we traverse the tower's data.  No data appears at the beginning because a significant event shared between the two variables had not yet occurred, so we did not yet have an indication of the relation.  We can see in the chart that the lag time shortened as the process ran and averaged about 15 periods.

The chart below shows us the windowed directional relation (vertical axis) vs. time (horizontal axis) as we traverse the tower's data.  As the analysis progressed, the significant event detector became more confident that Tower Temp is inversely correlated with RVP: as temperature goes up, the RVP goes down (this is also correct).
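The page does not say how significant events are detected, so the following Python sketch is only one plausible reading.  It assumes an "event" is a one-step change larger than k standard deviations of all one-step changes, pairs each event in the cause with the next event in the effect, and reports the mean lag and mean direction.  The names `significant_events` and `event_relation`, the threshold k = 3 and the synthetic step data are all hypothetical.

```python
import random
import statistics

def significant_events(series, k=3.0):
    """Return {row index: +1 or -1} for rows whose one-step change exceeds
    k standard deviations of all one-step changes."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    sd = statistics.pstdev(diffs)
    events = {}
    for i, d in enumerate(diffs, start=1):
        if sd > 0 and abs(d) > k * sd:
            events[i] = 1 if d > 0 else -1
    return events

def event_relation(cause, effect, k=3.0, max_lag=30):
    """Pair each cause event with the first effect event within max_lag rows.
    Returns (mean lag, mean direction), or None if no pair is found."""
    c_events = significant_events(cause, k)
    e_events = significant_events(effect, k)
    lags, directions = [], []
    for t, c_sign in sorted(c_events.items()):
        for lag in range(1, max_lag + 1):
            if t + lag in e_events:
                lags.append(lag)
                directions.append(c_sign * e_events[t + lag])
                break
    if not lags:
        return None  # no shared significant event seen yet
    return sum(lags) / len(lags), sum(directions) / len(directions)

# Synthetic stand-in data (NOT the tower data): three temperature steps,
# each followed 15 rows later by an opposite-sign step in RVP.
rng = random.Random(2)
steps = {50: 5.0, 120: -5.0, 200: 5.0}
temp, rvp = [], []
t_level, r_level = 150.0, 10.0
for t in range(250):
    t_level += steps.get(t, 0.0)
    r_level -= 0.2 * steps.get(t - 15, 0.0)
    temp.append(t_level + rng.gauss(0, 0.05))
    rvp.append(r_level + rng.gauss(0, 0.02))

result = event_relation(temp, rvp)
print(result)
```

The `None` return mirrors the empty start of the charts: until both variables have shared an event, there is nothing to report.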

Model-Based Sub-Results
Model-based analytics tells us that the time lag between Tower Temp and RVP is also 8 minutes, as can be seen in the chart below.  This chart has causal model performance (accuracy, or R² of the model) on the vertical axis and time lag on the horizontal axis.  You can see that the peak explanatory power of the model-series occurs at a delay of 8 minutes.
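The idea of scanning model performance across lags can be sketched as follows.  Since the tool's non-linear regression model is not specified here, this Python example swaps in a very simple non-linear model, a binned-mean (piecewise-constant) fit, and scores it in-sample.  The function names, the 10 bins, the 20-row search range and the synthetic data with a built-in 8-row delay are assumptions.

```python
import math
import random

def binned_r2(x, y, bins=10):
    """In-sample R^2 of a piecewise-constant model that predicts y by the
    mean of y within equal-count bins of x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    mean_y = sum(y) / n
    sst = sum((v - mean_y) ** 2 for v in y)
    if sst == 0:
        return 0.0
    sse = 0.0
    for b in range(bins):
        idx = order[b * n // bins:(b + 1) * n // bins]
        if not idx:
            continue
        m = sum(y[i] for i in idx) / len(idx)
        sse += sum((y[i] - m) ** 2 for i in idx)
    return 1.0 - sse / sst

def r2_by_lag(cause, effect, max_lag=20):
    """Return a list of R^2 scores, one per candidate lag 0..max_lag."""
    scores = []
    for lag in range(max_lag + 1):
        x = cause[:len(cause) - lag]   # cause at time t
        y = effect[lag:]               # effect at time t + lag
        scores.append(binned_r2(x, y))
    return scores

# Synthetic stand-in data (NOT the tower data): a saturating, inverse
# response that arrives 8 rows after the cause.
rng = random.Random(3)
temp = [rng.gauss(0, 1) for _ in range(600)]
rvp = [(-math.tanh(temp[t - 8]) if t >= 8 else 0.0) + rng.gauss(0, 0.3)
       for t in range(600)]

scores = r2_by_lag(temp, rvp)
peak_lag = max(range(len(scores)), key=lambda k: scores[k])
print(peak_lag, scores[peak_lag])
```

Plotting `scores` against lag gives the same kind of curve as the chart described above, with the peak marking the most explanatory delay.  A production version would score on held-out data rather than in-sample.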

Fused Results
By fusing our results, we create a "Causal Map" (see below) where the effects of all provided variables are mapped upon our variable of interest: RVP.  Consider this a form of Ishikawa (fishbone) diagram.  The vertical axis of the Causal Map is "Causal Strength", our fused metric for the extent to which a variable influences the variable of interest.  The horizontal axis is the time delay between a change in each individual variable and its impact on the variable of interest.

Negative "Causal Strength" values mean there is a negative relationship between the variable and the variable of interest.  In studies with random numbers, we have found that causes with a strength of less than 0.15 may be random causal associations.
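A noise floor of this kind can be estimated by measuring how strong an association appears between series that are unrelated by construction.  The Python sketch below illustrates the approach only.  It uses the absolute Pearson correlation as a stand-in for the fused Causal Strength metric, which is not defined on this page, and the row count, trial count and 95% quantile are arbitrary assumptions, so its output should not be expected to reproduce the 0.15 figure.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def noise_floor(n_rows=100, trials=200, quantile=0.95, seed=0):
    """Return the given quantile of |correlation| between pairs of
    independent random series, i.e. the strength chance alone can produce."""
    rng = random.Random(seed)
    strengths = []
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n_rows)]
        b = [rng.gauss(0, 1) for _ in range(n_rows)]
        strengths.append(abs(pearson(a, b)))
    strengths.sort()
    return strengths[int(quantile * (trials - 1))]

floor = noise_floor()
print(floor)  # associations weaker than this are hard to tell from chance
```

The floor shrinks as more rows are used per estimate, so any such threshold depends on the window length and settings of the analysis.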

This information is very useful.  You can see that we have some answers to the three burning questions of "Causal Analysis", all in one chart: Does A cause B?  When?  and By How Much?  In the chart, we see:

  • Our Tower Temperature control handle does have a reasonable influence: its causal strength is -0.3, not the 0.13 that the statistical Coefficient of Determination alone gave us.  Using statistical correlation alone, Tower Temperature seems to have a nearly random relationship to RVP, but the fused result is more in line with hands-on experience with the process.

  • The time lags are generally similar to what we see in the real process's geometry: feed flows are the first influences, most distant from the result, then come mid-tower conditions, and then, generally, bottoms conditions.

  • We should probably control any factors of similar or higher causal strength than our control handle, in this case other tower temperatures (more likely a temperature profile), the tower pressure and the bottoms level.

  • We now have the time lags and key variables needed to build predictive models that give on-line estimates of RVP.  In fact, this system already has all the information necessary to build such models.

We can explore any other variable of interest, or pair of variables, using the same techniques.  It should be noted that we have engineered these mechanisms to support textual data analysis too, and those results can also be fused into a causal map.

Summary
In the analysis of this tower, we used three causal analytical methods together, then fused the results into a first-order "Causal Map".  Using less than one day of operating data and a few minutes of processing on a laptop, we were able to properly characterize the unit operation's cause-and-effect relations, providing substantial information.

One of the advantages of this analytical technology is that it can be used "autonomously", that is, without human intervention.  "Agents" or "bots" can be assigned to watch, observe and analyze a process and develop a reasonably accurate "mental model" of how the operation functions.  Since such a "bot" knows the key variables and their time lags, it could create accurate predictions of performance through automatic modeling.  Extending further, it is possible for such a "bot" to optimize the process given a few statements of objectives and constraints.


For more information, please contact us at tech@biocompsystems.com

Copyright © 1995-2007 BioComp Systems, Inc. All rights reserved.
IntelliDynamics is a registered trademark of BioComp Systems, Inc.
Last modified: Sunday April 24, 2005.
Serving the web since Feb 2, 1996