The Results of Causal Analytics
On the previous page, we described three causal analytic methods: Statistical Correlation, Significant Event Detection, and Non-Linear Regression Modeling. Each method has its own strengths, and fusing them together produces an improved final result.
In this discussion, we use 1000 one-minute data rows (only about 16.7 hours of data) from an oil stabilization column, where the product is oil of a desired composition. This composition is measured by its Reid Vapor Pressure, or "RVP". Raw oil enters the tower's middle section and is heated; gas vapors leave from the top of the tower and the stabilized oil comes out the bottom. The RVP is controlled by adjusting the tower temperature, which in turn is adjusted by taking a side-stream of the bottom oil product, heating it in a heater, then sending it back into the side of the tower.
In this tower's analysis we use the three causal analytic methods together, then fuse the results into a first-order "Causal Map". "First order" means the direct effects of the independent variables upon a variable of interest. This processing took a few minutes on a 2 GHz laptop and yielded substantial information about the tower's behavior.

Statistical Correlation Sub-Results
Statistical Correlation shows us the linear correlation between variables through time. In this case, we look at tower temperature [the control handle] vs. RVP [the performance variable]. The chart below is windowed correlation (vertical axis) vs. time (horizontal axis) as we traverse through the tower's data. The first 100 records can be ignored because, with the settings used, the causal analysis tool is still aggregating data. We can see that Tower Temp is mostly inversely correlated to RVP: as temperature goes up, the RVP goes down (this is correct).
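As an illustration of what a windowed correlation involves, here is a minimal Python sketch. It is not the tool's implementation: the function names, the trailing window of 100 records (chosen only to mirror the 100-record warm-up mentioned above), and the synthetic temperature and RVP series are all assumptions made for the example.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def windowed_correlation(xs, ys, window=100):
    """Correlation over a trailing window; None until the window has filled."""
    out = []
    for i in range(len(xs)):
        if i + 1 < window:
            out.append(None)  # still aggregating data
        else:
            lo = i + 1 - window
            out.append(pearson(xs[lo:i + 1], ys[lo:i + 1]))
    return out

# Synthetic stand-in data (NOT the tower data): RVP falls as temperature rises.
rng = random.Random(0)
temp = [150 + rng.gauss(0, 2) for _ in range(1000)]
rvp = [10 - 0.1 * (t - 150) + rng.gauss(0, 0.1) for t in temp]

corr = windowed_correlation(temp, rvp, window=100)

# A "flip" is a sign change between consecutive windowed correlations.
flips = [i for i in range(100, len(corr)) if corr[i - 1] * corr[i] < 0]
```

On real data, the indices collected in `flips` would mark the time periods where the sign of the relationship reverses.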

The Statistical Correlation's R-Squared (Coefficient of Determination) is given by the tool as 0.13, which is low. We can also see "flips" in correlation in the chart above, which point us to interesting time periods to investigate later, if we desire. In an on-line scenario, alerts could be raised on reversals in the effect of our control handle. This same set of analytics indicates, from a regression standpoint, that there is a time lag between a change in Tower Temperature and the response in RVP, averaging 8 minutes (see below). The chart below has time lag on the vertical axis vs. time on the horizontal axis as we traverse through the tower's data. Again, the first 100 records can be ignored because the causal analysis tool is still aggregating data. As with correlation, significant changes in lag time tell us that "something is awry" with the process.


Significant Event Sub-Results
Using Significant Event analytics, we receive information on lag time and causal direction. Causal direction is similar in meaning to statistical correlation: a negative direction implies a negative correlation between variables. In the chart below, we see a windowed lag time (vertical axis) vs. time (horizontal axis) as we traverse through the tower's data. The lack of data at the beginning is because a significant event shared between the two variables had not yet occurred, so we did not yet have an indication of the relation. We can see in the chart that the lag time shortened as the process ran and averaged about 15 periods.

The chart below shows us the windowed directional relation (vertical axis) vs. time (horizontal axis) as we traverse through the tower's data. As the analysis progressed, the significant event detector became more certain that Tower Temp is inversely correlated to RVP: as temperature goes up, the RVP goes down (this is also correct).


Model-Based Sub-Results
Model-based analytics tells us that the time lag between Tower Temp and RVP is also 8 minutes, as can be seen in the chart below. This chart has causal model performance (accuracy, or R-Squared, of the model) on the vertical axis and time lag on the horizontal axis. You can see that the peak explanatory power of the model series occurs at a delay of 8 minutes.
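The idea of scanning model performance across candidate lags can be sketched as follows. This is only an illustration under stated assumptions: a straight-line fit stands in for the non-linear regression models the tool uses, the helper `r_squared_at_lag` is hypothetical, and the data is synthetic with an 8-step delay deliberately built in.

```python
import random

def r_squared_at_lag(xs, ys, lag):
    """R-squared of a straight-line fit of ys[t] on xs[t - lag]."""
    x = xs[:len(xs) - lag]  # inputs, shifted back by `lag` steps
    y = ys[lag:]            # outputs aligned with those inputs
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    # For a one-input linear fit, R-squared equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Synthetic stand-in data (NOT the tower data) with a built-in 8-step delay.
rng = random.Random(1)
TRUE_LAG = 8
temp = [0.0]
for _ in range(999):
    temp.append(0.9 * temp[-1] + rng.gauss(0, 1))  # slowly wandering input
rvp = [
    (-0.5 * temp[t - TRUE_LAG] if t >= TRUE_LAG else 0.0) + rng.gauss(0, 0.2)
    for t in range(1000)
]

# Score every candidate lag from 0 to 30 steps and keep the best one.
scores = {lag: r_squared_at_lag(temp, rvp, lag) for lag in range(31)}
best_lag = max(scores, key=scores.get)
```

Plotting `scores` against lag gives a curve of the same kind as the chart described above, with its peak at the delay that best explains the output.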


Fused Results
By fusing our results, we create a "Causal Map" (see below) where the effects of all provided variables are mapped upon our variable of interest: RVP. Consider this a form of Ishikawa (fishbone) diagram. The vertical axis of the Causal Map is "Causal Strength", our fused metric for the extent to which a variable influences the variable of interest. The horizontal axis is the time delay of each individual variable's impact on the variable of interest. Negative "Causal Strength" values mean there is a negative relationship between the variable and the variable of interest. In our study of random numbers, we have found that causes with a strength of less than 0.15 may be random causal associations.
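To show how a noise floor of this kind could be estimated from random numbers, here is a minimal Python sketch. It makes strong assumptions: the fused "Causal Strength" metric is not specified here, so plain Pearson correlation is used as a stand-in, and the function name, trial count and quantile are hypothetical. It is not expected to reproduce the 0.15 figure.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def noise_floor(n_points, trials=300, quantile=0.95, seed=0):
    """Strength that unrelated random series stay below in `quantile` of trials."""
    rng = random.Random(seed)
    strengths = []
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n_points)]
        b = [rng.gauss(0, 1) for _ in range(n_points)]  # independent of a
        strengths.append(abs(pearson(a, b)))
    strengths.sort()
    return strengths[int(quantile * (trials - 1))]

floor_100 = noise_floor(100)    # short window: larger spurious strengths
floor_1000 = noise_floor(1000)  # full data set: smaller spurious strengths
```

The point of the sketch is that unrelated series still show non-zero apparent strength, and that the floor depends on how many records are used, so a threshold should be calibrated against the metric and window actually in use.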

This information is very useful. You can see that we have some answers to the three burning questions of "Causal Analysis", all in one chart: Does A cause B? When? And by how much? In the chart, we see:
- Our Tower Temperature control handle does have a reasonable influence: the causal strength is -0.3, not the 0.13 that the statistical Coefficient of Determination alone told us. Using only statistical correlation, Tower Temperature seems to have a nearly random relationship to RVP, but this fused result is more in line with hands-on experience with the process.
- The time lags are generally similar to what we see in the real process's geometry: feed flows are the first influences, most distant from the result, then come mid-tower conditions, then generally bottoms conditions.
- We should probably control any factors of similar or higher causal strength than our control handle, in this case other tower temperatures (really, probably a temperature profile), the tower pressure and the bottoms level.
- The chart gives us the time lags and key variables to use for building predictive models that give on-line estimates of RVP. In fact, this system already has all the information necessary to build such models.
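To make the last point concrete, the sketch below shows how a known time lag can be used to build a simple on-line estimator. It is an illustration only: it uses a single input and a straight-line least-squares fit rather than the multi-variable, non-linear models discussed here, and the helper `fit_lagged_line`, the coefficients and the data are all synthetic assumptions.

```python
import random

def fit_lagged_line(xs, ys, lag):
    """Least-squares fit of ys[t] = intercept + slope * xs[t - lag]."""
    x = xs[:len(xs) - lag]
    y = ys[lag:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic stand-in data (NOT the tower data): RVP responds to the
# temperature from LAG steps earlier, plus a little measurement noise.
rng = random.Random(2)
LAG = 8
deviation = [0.0]
for _ in range(999):
    deviation.append(0.9 * deviation[-1] + rng.gauss(0, 1))
temp = [150 + d for d in deviation]
rvp = [
    17.5 - 0.05 * (temp[t - LAG] if t >= LAG else 150.0) + rng.gauss(0, 0.02)
    for t in range(1000)
]

intercept, slope = fit_lagged_line(temp, rvp, LAG)

# On-line estimate of the current RVP from the temperature LAG steps ago.
rvp_now = intercept + slope * temp[-1 - LAG]
```

Because the estimate at time t depends only on the input from `LAG` steps earlier, the same fitted line can also project RVP up to `LAG` steps ahead from temperatures already recorded.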
We can explore any other variable of interest, or pair of variables, using the same techniques. Note that we have engineered these mechanisms to support textual data analysis too, which can also be fused into a causal map.

Summary
In the analysis of this tower, we used three causal analytic methods together, then fused the results into a first-order "Causal Map". Using less than one day's operating data and a few minutes of processing on a laptop, we were able to characterize the unit operation's cause-and-effect relations, providing substantial information.
One of the advantages of this analytical technology is that it can be used "autonomously", that is, without human intervention. "Agents" or "bots" can be assigned to watch, observe and analyze a process and develop a reasonably accurate "mental model" of how the operation functions. Since such a "bot" knows the key variables and their time lags, it could create accurate predictions of performance through automatic modeling. Extending further, it is possible for such a "bot" to optimize the process given a few statements of objectives and constraints.