OLAP DML Model Objects

This topic provides information about creating and executing OLAP DML models. It includes the following subtopics:

What is an OLAP DML Model?
Creating Models
Compiling Models
Running a Model
Debugging a Model

What is an OLAP DML Model?

An OLAP DML model is a set of interrelated equations that can assign results either to a variable or to a dimension value. For example, in a financial model, you can assign values to specific line items, such as gross.margin or net.income.

gross.margin = revenue - cogs

When an assignment statement assigns data to a dimension value or refers to a dimension value in its calculations, then it is called a dimension-based equation. A dimension-based equation does not refer to the dimension itself, but only to the values of the dimension. Therefore, when the model contains any dimension-based equations, then you must specify the name of each of these dimensions in a DIMENSION statement at the beginning of the model.

When a model contains any dimension-based equations, then you must supply the name of a solution variable when you run the model. The solution variable is both a source of data and the assignment target of model equations. It holds the input data used in dimension-based equations, and the calculated results are stored in designated values of the solution variable. For example, when you run a financial model based on the line dimension, you might specify actual as the solution variable.

Dimension-based equations provide flexibility in financial modeling. Since you do not need to specify the modeling variable until you solve a model, you can run the same model with the actual variable, the budget variable, or any other variable that is dimensioned by line.

Models can be quite complex. You can:

Include one model within another model as discussed in "Nesting Models"
Use data from different time periods as discussed in "Using Data from Past and Future Time Periods"
Solve simultaneous equations as discussed in "Solving Simultaneous Equations"
Create models for different scenarios as described in "Modeling for Multiple Scenarios"

Creating Models

To create an OLAP DML model, take the following steps:

Issue a DEFINE MODEL command to define the program object.
Issue a MODEL command which adds a specification to the model to specify the processing that you want performed.
Compile the model as described in "Compiling Models".
(Optional) If necessary, change the settings of model options listed in "Model Options".
Execute the model as described in "Running a Model".
Debug the model as described in "Debugging a Model".
When you want the model to be a permanent part of the analytic workspace, save the model using an UPDATE command followed by COMMIT.

For an example of creating a model, see Example 10-58, "Creating a Model".

Nesting Models

You can include one model within another model by using an INCLUDE statement within a MODEL command. The MODEL command that contains the INCLUDE statement is referred to as the parent model. The included model is referred to as the base model. You can nest models by placing an INCLUDE statement in a base model. For example, model myModel1 can include model myModel2, and model myModel2 can include model myModel3. The nested models form a hierarchy. In this example, myModel1 is at the top of the hierarchy, and myModel3 is at the root.

When a model contains an INCLUDE statement, then it cannot contain any DIMENSION statements. A parent model inherits its dimensions, if any, from the DIMENSION statements in the root model of the included hierarchy. In the example just given, models myModel1 and myModel2 both inherit their dimensions from the DIMENSION statements in model myModel3.

The INCLUDE statement enables you to create modular models. When certain equations are common to several models, then you can place these equations in a separate model and include that model in other models as needed.

The INCLUDE statement also facilitates what-if analyses. An experimental model can draw equations from a base model and selectively replace them with new equations. To support what-if analysis, you can use equations in a model to mask previous equations. The previous equations can come from the same model or from included models. A masked equation is not executed or shown in the MODEL.COMPRPT report for a model

Dimension Status and Model Equations

When a model contains an assignment statement to assign data to a dimension value, then the dimension is limited temporarily to that value, performs the calculation, and restores the initial status of the dimension.

For example, a model might have the following statements.

DIMENSION line
gross.margin = revenue - cogs

If you specify actual as the solution variable when you run the model, then the following code is constructed and executed.

PUSH line
LIMIT line TO gross.margin
actual = actual(line revenue) - actual(line cogs)
POP line

This behind-the-scenes construction lets you perform complex calculations with simple model equations. For example, line item data might be stored in the actual variable, which is dimensioned by line. However, detail line item data might be stored in a variable named detail.data, with a dimension named detail.line.

When your analytic workspace contains a relation between line and detail.line, which specifies the line item to which each detail item pertains, then you might write model equations such as the following ones.

revenue = total(detail.data line)
expenses = total(detail.data line)

The relation between detail.line and line is used automatically to aggregate the detail data into the appropriate line items. The code that is constructed when the model is run ensures that the appropriate total is assigned to each value of the line dimension. For example, while the equation for the revenue item is calculated, line is temporarily limited to revenue, and the TOTAL function returns the total of detail items for the revenue value of line.

Using Data from Past and Future Time Periods

Several OLAP DML functions make it easy for you to use data from past or future time periods. For example, the LAG function returns data from a specified previous time period, and the LEAD function returns data from a specified future period.

When you run a model that uses past or future data in its calculations, you must make sure that your solution variable contains the necessary past or future data. For example, a model might contain an assignment statement that bases an estimate of the revenue line item for the current month on the revenue line item for the previous month.

DIMENSION line month
...
revenue = LAG(revenue, 1, month) * 1.05

When the month dimension is limited to Apr2004 to Jun2004 when you run the model, then you must be sure that the solution variable contains revenue data for Mar96.

When your model contains a LEAD function, then your solution variable must contain the necessary future data. For example, when you want to calculate data for the months of April through June of 2004, and when the model retrieves data from one month in the future, then the solution variable must contain data for July 2004 when you run the model.

Handling NA Values in Models

Oracle OLAP observes the NASKIP2 option when it evaluates equations in a model. NASKIP2 controls how NA values are handled when + (plus) and - (minus) operations are performed. The setting of NASKIP2 is important when the solution variable contains NA values.

The results of a calculation may be NA not only when the solution variable contains an NA value that is used as input, but also when the target of a simultaneous equation is NA. Values in the solution variable are used as the initial values of the targets in the first iteration over a simultaneous block. Therefore, when the solution variable contains NA as the initial value of a target, an NA result may be produced in the first iteration, and the NA result may be perpetuated through subsequent iterations.

To avoid obtaining NA for the results, you can make sure that the solution variable does not contain NA values or you can set NASKIP2 to YES before running the model.

Solving Simultaneous Equations

An iterative method is used to solve the equations in a simultaneous block. In each iteration, a value is calculated for each equation, and compares the new value to the value from the previous iteration. When the comparison falls within a specified tolerance, then the equation is considered to have converged to a solution. When the comparison exceeds a specified limit, then the equation is considered to have diverged.

When all the equations in the block converge, then the block is considered solved. When any equation diverges or fails to converge within a specified number of iterations, then the solution of the block (and the model) fails and an error occurs.

You can exercise control over the solution of simultaneous equations, use the OLAP DML options described in "Model Options". For example, using these options, you can specify the solution method to use, the factors to use in testing for convergence and divergence, the maximum number of iterations to perform, and the action to take when the assignment statement diverges or fails to converge.

Modeling for Multiple Scenarios

Instead of calculating a single set of figures for a month and division, you might want to calculate several sets of figures, each based on different assumptions.

You can define a scenario model that calculates and stores forecast or budget figures based on different sets of input figures. For example, you might want to calculate profit based on optimistic, pessimistic, and best-guess figures.

To build a scenario model, follow these steps.

Define a scenario dimension.
Define a solution variable dimensioned by the scenario dimension.
Enter input data into the solution variable.
Write a model to calculate results based on the input data.

For an example of building a scenario model see, Example 10-59, "Building a Scenario Model".

Compiling Models

When you finish writing the statements in a model, you can use the COMPILE command to compile the model. During compilation, COMPILE checks for format errors, so you can use COMPILE to help debug your code before running a model. When you do not use COMPILE before you run the model, then the model is compiled automatically before it is solved.You can use the OBJ function with the ISCOMPILED choice to test whether a model is compiled.

SHOW OBJ(ISCOMPILED 'myModel')

When you compile a model, either by using a COMPILE statement or by running the model, the model compiler checks for problems that are unique to models. You receive an error message when any of the following occurs:

The model contains any statements other than DIMENSION, INCLUDE, and assignment (SET) statements.
The model contains both a DIMENSION statement and an INCLUDE statement.
A DIMENSION or INCLUDE statement is placed after the first equation in the model.
The dimension values in a single dimension-based equation refer to two or more different dimensions.
An equation refers to a name that the compiler cannot identify as an object in any attached analytic workspace. When this error occurs, it may be because an equation refers to the value of a dimension, but you have neglected to include the dimension in a DIMENSION statement. In addition, a DIMENSION statement may appear to be missing when you are compiling a model that includes another model and the other model fails to compile. When a root model (the innermost model in a hierarchy of included models) fails to compile, the parent model is unable to inherit any DIMENSION commands from the root model. In this case the compiler may report an error in the parent model when the source of the error is actually in the root model. See INCLUDE for additional information.

Resolving Names in Equations

The model compiler examines each name in an equation to determine the analytic workspace object to which the name refers. Since you can use a variable and a dimension value in exactly the same way in a model equation (basing calculations on it or assigning results to it), a name might be the name of a variable or it might be a value of any dimension listed in a DIMENSION statement.

To resolve each name reference, the compiler searches through the dimensions listed in explicit or inherited DIMENSION statements, in the order they are listed, to determine whether the name matches a dimension value of a listed dimension. The search concludes as soon as a match is found.

Therefore, when two or more listed dimensions have a dimension value with the same name, the compiler assumes that the value belongs to the dimension named earliest in a DIMENSION statement.

Similarly, the model compiler might misinterpret the dimension to which a literal INTEGER value belongs. For example, the model compiler will assume that the literal value '200' belongs to the first dimension that contains either a value at position 200 or the literal dimension value 200.

To avoid an incorrect identification, you can specify the desired dimension and enclose the value in parentheses and single quotes. See "Formatting Ambiguous Dimension Values".

When the compiler finds that a name is not a value of any dimension specified in a DIMENSION statement, it assumes that the name is the name of an analytic workspace variable. When a variable with that name is not defined in any attached analytic workspace, an error occurs.

Code for Looping Over Dimensions

The model compiler determines the dimensions over which the statements will loop. When an equation assigns results to a variable, the compiler constructs code that loops over the dimensions (or bases of a composite) of the variable.

When you run a model that contains dimension-based equations, the solution variable that you specify can be dimensioned by more dimensions than are listed in DIMENSION statements.

Evaluating Program Arguments

When you specify the value of a model dimension as an argument to a user-defined program, the compiler recognizes a dependence introduced by this argument.

For example, an equation might use a program named weight that tests for certain conditions and then weights and returns the Taxes line item based on those conditions. In this example, a model equation might look like the following one.

Net.Income = Opr.Income - weight(Taxes)

The compiler correctly recognizes that Net.Income depends on Opr.Income and Taxes. However, when the weight program refers to any dimension values or variables that are not specified as program arguments, the compiler does not detect any hidden dependencies introduced by these calculations.

Dependencies Between Equations

The model compiler analyzes dependencies between the equations in the model. A dependence exists when the expression on the right-hand side of the equal sign in one equation refers to the assignment target of another equation. When an equation indirectly depends on itself as the result of the dependencies among equations, a cyclic dependence exists between the equations.

The model compiler structures the model into blocks and orders the equations within blocks and the blocks themselves to reflect dependencies. When you run the model, it is solved one block at a time. The model compiler can produce three types of solution blocks:

Simple Solution Blocks—Simple blocks include equations that are independent of each other and equations that have dependencies on each other that are non-cyclic.

For example, when a block contains equations that solve for values A, B, and C, a non-cyclic dependence can be illustrated as A>B>C. The arrows indicate that A depends on B, and B depends on C.
Step Solution Blocks—Step blocks include equations that have a cyclic dependence that is a one-way dimensional dependence. A dimensional dependence occurs when the data for the current dimension value depends on data from previous or later dimension values. The dimensional dependence is one-way when the data depends on previous values only or later values only, but not both. For more information on one-way dimensional dependence, see "Ensuring One-Way Dimensional Dependence".

Dimensional dependence typically occurs over a time dimension. For example, it is common for a line item value to depend on the value of the same line item or a different line item in a previous time period. When a block contains equations that solve for values A and B, a one-way dimensional dependence can be illustrated as A>B>LAG(A). The arrows indicate that A depends on B, and B depends on the value of A from a previous time period.
Simultaneous Solution Blocks—Simultaneous blocks include equations that have a cyclic dependence that is other than one-way dimensional. The cyclic dependence may involve no dimensional qualifiers at all, or it may be a two-way dimensional dependence. For more information on two-way dimensional dependence, see "Structures for Which the Model Compiler Assumes Two-Way Dimensional Dependence".

When a model contains a block of simultaneous equations, COMPILE gives you a warning message. In this case, you may want to check the settings of the options that control simultaneous solutions before you run the model. "Model Options" lists these options.

An example of a cyclic dependence that does not depend on any dimensional qualifiers can be illustrated as A>B>C>A. The arrows indicate that A depends on B, B depends on C, and C depends on A.

An example of a cyclic dependence that is a two-way dimensional dependence can be illustrated as A>LEAD(B)>LAG(A). The arrows indicate that A depends on the value of B from a future period, while B depends on the value of A from a previous period.

Order of Simultaneous Equations The solution of a simultaneous block of equations is sensitive to the order of the equations. In general, you should rely on the model compiler to determine the optimal order for the equations. In some cases, however, you may be able to encourage convergence by placing the equations in a particular order.

To force the compiler to leave the simultaneous equations in each block in the order in which you place them, set the MODINPUTORDER option to YES before compiling the model. (MODINPUTORDER has no effect on the order of equations in simple blocks or step blocks.)

Structures for Which the Model Compiler Assumes Two-Way Dimensional Dependence

When dependence is introduced through any of the following structures, the model compiler assumes that two-way dimensional dependence occurs:

A two-way dimensional dependence can occur when you use an aggregation function, such as AVERAGE, TOTAL, ANY, or COUNT.
```
Opr.Income = Gross.Margin -
   (TOTAL(Marketing + Selling + R.D))
Marketing = LAG(Opr.Income, 1, month)
```

A two-way dimensional dependence can occur when you use a time-series function that requires a time-period argument, such as CUMSUM, LAG, or LEAD (except for the specific functions and conditions described in "Ensuring One-Way Dimensional Dependence").
A two-way dimensional dependence also can occur when you use a financial function, such as DEPRSL or NPV.

A cyclic dependence across a time dimension that you introduce through a loan or depreciation function may cause unexpected results. The loan functions include FINTSCHED, FPMTSCHED, VINTSCHED, and VPMTSCHED. The depreciation functions include DEPRDECL, DEPRDECLSW, DEPRSL, and DEPRSOYD.

Ensuring One-Way Dimensional Dependence

When dependence between equations is introduced through any of the following structures, a one-way dimensional dependence occurs:

A one-way dimensional dependence occurs when you use a LAG or LEAD function and when the argument for the number of time periods is coded as an explicit number (either as a value or a constant) or as the result of ABS. (Otherwise, there may be a two-way dependence, involving both previous and future dimension values, and the compiler assumes that a simultaneous solution is required.) The following example illustrates this use of LAG.
```
Opr.Income = Gross.Margin - (Marketing + Selling + R.D)
Marketing = LAG(Opr.Income, 1, month)
```
A one-way dimensional dependence occurs when you use a MOVINGAVERAGE, MOVINGMAX, MOVINGMIN, or MOVINGTOTAL function, when that the start and stop arguments are nonzero numbers, and when both the start and top arguments are positive or both are negative. (Otherwise, two-way dimensional dependence is assumed.)
```
Opr.Income = Gross.Margin - (Marketing + Selling + R.D)
Marketing = MOVINGAVERAGE(Opr.Income, -4, -1, 1, month)
```

Obtaining Analysis Results

After compiling a model, you can use the following tools to obtain information about the results of the analysis performed by the compiler:

The MODEL.COMPRPT program produces a report that shows how model equations are grouped into blocks. For step blocks and for simultaneous blocks with a cross-dimensional dependence, the report lists the dimensions involved in the dependence.
The MODEL.DEPRT program produces a report that lists the variables and dimension values on which each model equation depends. When a dependence is dimensional, the report gives the name of the dimension.
The INFO function lets you obtain specific items of information about the structure of the model.

Checking for Additional Problems

The compiler does not analyze the contents of any programs or formulas that are used in model equations. Therefore, you must check the programs and formulas yourself to make sure they do not do any of the following:

Refer to the value of any variable used in the model.
Refer to the solution variable.
Limit any of the dimensions used in the model.
Invoke other models.

When a model or program violates any of these restrictions, the results of the model may be incorrect.

Running a Model

When you run a model, you should keep these points in mind:

Before you run a model, the input data must be available in the solution variable.
Before running a model that contains a block of simultaneous equations, you might want to check or modify the values of some OLAP DML options that control the solution of simultaneous blocks. These options are described briefly in "Model Options".
When your model contains any dimension-based equations, then you must provide a numeric solution variable that serves both as a source of data and as the assignment target for equation results. The solution variable is usually dimensioned by all of the dimensions on which model equations are based and also by the other dimensions of the solution variable on which you are not basing equations.
When you run a model, a loop is performed automatically over the values in the current status list of each of the dimensions of the solution variable on which you have not based equations.
When a model equation bases its calculations on data from previous time periods, then the solution variable must contain data for these previous periods. When it does not, or when the first value of the dimension of type DAY, WEEK, MONTH, QUARTER, or YEAR is in status, then the results of the calculation are NA.

Syntax for Running a Model

To run or solve a model, use the following syntax.

model-name [solution-variable] [NOWARN]

where:

model-name is the name of the model.
solution-variable is the name of a numeric variable that serves as both the source and the target of data in a model that contains dimension-based equations. The solution variable is usually dimensioned by all the dimensions on which model equations are based (as specified in explicit or included DIMENSION commands). The solution-variable argument is required when the model contains any dimension-based equations. When all the model equations are based only on variables, a solution variable is not needed and an error occurs when you supply this argument. See "Dimensions of Solution Variables" for more information on dimensions of solution variables.
NOWARN is an optional argument that specifies that you do not want to be warned when the model contains a block of simultaneous equations.

Dimensions of Solution Variables

In a model with dimension-based equations, the solution variable is usually dimensioned by the dimensions on which model equations are based. Or, when a solution variable is dimensioned by a composite, the model equations can be based on base dimensions of the composite. The dimensions on which model equations are based are listed in explicit or inherited DIMENSION statements.

Special Cases of Solution Variables

The following special cases regarding the dimensions of the solution variable can occur:

The solution variable can have dimensions that are not listed in DIMENSION commands. Oracle OLAP automatically loops over the values in the status of the extra dimensions. For example, the model might contain a DIMENSION statement that lists the line and month dimensions, but you might specify a solution variable dimensioned by line, month, and division. Oracle OLAP automatically loops over the division dimension when you run the model. The solution variable can also be dimensioned by a composite that has one or more base dimensions that are not listed in DIMENSION commands. See "Solution Variables Dimensioned by a Composite"
When the solution variable has dimensions that are not listed in DIMENSION commands and when any of these other dimensions are the dimension of a step or simultaneous block, an error occurs.
Oracle OLAP loops over the values in the status of all the dimensions listed in DIMENSION commands, regardless of whether the solution variable is dimensioned by them. Therefore, Oracle OLAP will be doing extra, unnecessary work when the solution variable is not dimensioned by all the listed dimensions. Oracle OLAP warns you of this situation before it starts solving the model.
The inclusion of an unneeded dimension of type DAY, WEEK, MONTH, QUARTER, or YEAR in a DIMENSION statement causes incorrect results when you use a loan, depreciation, or aggregation function in a model equation. This happens because any component of a model equation that refers to the values of a model dimension behaves as if that component has all the dimensions of the model.

Solution Variables Dimensioned by a Composite

When a solution variable contains a composite in its dimension list, Oracle OLAP observes the sparsity of the composite whenever possible. As it solves the model, Oracle OLAP confines its loop over the composite to the values that exist in the composite. It observes the current status of the composite's base dimensions as it loops.

However, for proper solution of the model, Oracle OLAP must treat the following base dimensions of the composite as regular dimensions:

A base dimension that is listed in a DIMENSION statement.
A base dimension that is implicated in a model equation created using SET (for example, an equation that assigns data to a variable dimensioned by the base dimension).
A base dimension that is also a base dimension of a different composite that is specified in the ACROSS phrase of an equation. (See SET for more information on assignment statements and the use of ACROSS phrase.)

When a base dimension of a solution variable's composite falls in any of the preceding three categories, Oracle OLAP treats that dimension as a regular dimension and loops over all the values that are in the current status.

When the solution variable's composite has other base dimensions that do not fall in the special three categories, Oracle OLAP creates a temporary composite of these extra base dimensions. The values of the temporary composite are the combinations that existed in the original composite. Oracle OLAP loops over the temporary composite as it solves the model.

Debugging a Model

The following tools are available for debugging models:

To see the order in which the equations in a model are solved, you can set the MODTRACE option to YES before you run the model.When you set MODTRACE to YES, you can use a DBGOUTFILE statement to send debugging information to a file. The file produced by DBGOUTFILE interweaves each line of your model with its corresponding output.
You can use the MODEL.COMPRPT, MODEL.DEPRT, and MODEL.XEQRPT programs and the INFO function to obtain information about the structure of a compiled model and the solution status of a model you have run.