4
OP Schemas

Oracle9iAS Personalization (OP) uses several database schemas, as follows:

Mining Object Repository (MOR): The MOR controls the operation of OP and contains mining objects (deployable packages, reports, schedule items, etc., as well as package build results such as rule tables, etc.)
Mining Table Repository (MTR): Contains mining tables plus other important information such as bin boundaries, hot picks, taxonomy, etc.).
RE Schema. The RE schema is where recommendations are held, where model package information is held, is where deployable packages are held in the Oracle9iAS server. RE schemas are tables that support the recommendation engine -- they contain deployable packages used to generate real-time recommendations based on current and past session data.

All OP schemas reside on the system where Oracle9i is installed.

To see a small example of the MOR, MTR, and RE schemas that are correctly populated, install the REAPI Demo. (During installation, you are prompted with a question asking whether you want to install the demo data (movie data); if you answer yes, the demo data and the REAPI Demo are installed.) If you installed the REAPI Demo, you can examine the tables there. Alternatively, you can install an unpopulated MTR when you install OP. You can examine the schema of the unpopulated MTR and populate it with your own data.

Note that OP uses a fixed schema for the MTR. By "fixed," we mean that the MTR must be populated with tables matching OP table and column names.

Before you can obtain recommendations, you must create and deploy a package. You cannot create a package until there is some data available in the MTR. There are two ways to populate an MTR:

Collect data using the REProxyRT method addItem or addItems.
Populate with existing data, i.e., map existing historical data that was collected by your Web application and stored in an Oracle database.

Mining Table Repository

The OP MTR consists of the tables and views listed below.

`MTR_ATTR_ID_BIN_BOUNDARY`	`TABLE`
`MTR_ATTR_NAME_TO_ID_MAP`	`TABLE`
`MTR_BIN_BOUNDARY`	`TABLE`
`MTR_CATEGORY`	`TABLE`
`MTR_CONFIGURATION`	`TABLE`
`MTR_CUSTOMER`	`TABLE`
`MTR_CUSTOMER_NAV_DETAIL`	`TABLE`
`MTR_CUSTOMER_RATING_DETAIL`	`TABLE`
`MTR_HOTPICK`	`TABLE`
`MTR_HOTPICK_GROUP`	`TABLE`
`MTR_INTERNAL_CONFIGURATION`	`TABLE`
`MTR_ITEM`	`TABLE`
`MTR_NAVIGATION_DETAIL`	`VIEW`
`MTR_PROFILE_DATA`	`VIEW`
`MTR_PROXY`	`TABLE`
`MTR_PURCHASING_DETAIL`	`TABLE`
`MTR_RATING_DETAIL`	`VIEW`
`MTR_RECOMMENDATION_DETAIL`	`TABLE`
`MTR_SCHEMA_VERSION`	`VIEW`
`MTR_SESSION`	`TABLE`
`MTR_TAXONOMY (optional)`	`TABLE`
`MTR_TAXONOMY_CATEGORY (optional)`	`TABLE`
`MTR_TAXONOMY_CATEGORY_ITEM (optional)`	`TABLE`
`MTR_VISITOR_NAV_DETAIL`	`TABLE`
`MTR_VISITOR_RATING_DETAIL`	`TABLE`

Certain tables that make up the OP MTR must be populated with data specific to your Web site in accordance with the MTR schema. Other tables, such as the tables associated with sessions and recommendations, are automatically populated by OP.

The rest of this section describes the schemas for the MTR tables; tables that you must populate are described in detail.

Item Table

The item table contains a list of all the individual items that a Web site deals with. When OP returns a recommendation, it returns the ID and ITEM_TYPE; the item table provides more information. The item table is usually mapped to the catalog tables in the site database. The schema for MTR_ITEM has four fields; they are listed below, in order, with their data types.

`ID`	`NUMBER PK`
`ITEM_TYPE`	`VARCHAR2(30) PK`
`LABEL`	`VARCHAR2(150)`
`DESCRIPTION`	`VARCHAR2(4000)`

Bin Boundaries

The model-building algorithms in Oracle9iAS Personalization require discrete data. All numerical data must be discretized (binned) before the data is used to build a model. In OP, binning is performed in a transformation step before model build. The value ranges for binning (the bin boundaries) must be specified in order for OP to bin the values.

In the current release of OP, the bin boundaries must be specified in the bin boundaries table.

Binning is performed by joining the input data and the bin boundaries table.

Categorical data should also be binned in situations where there are many distinct values. If you bin categorical data, bin boundaries must be specified as for numerical data

In summary, OP requires all numerical data to be binned, and high cardinality categories should also be binned.

When you create bins of numeric values, specify the bounds (upper and lower values) for each bin. When you create bins of categorical data, specify the items in each bin. To map several values to the same bin, use several records with the same bin numbering.

The table MTR_BIN_BOUNDARY has seven fields; they are listed below, in order, with their data types:

`DATA_SOURCE_TYPE`	`NUMBER(3)`
`ITEM_TYPE`	`VARCHAR2(30)`
`ATTRIBUTE_NAME`	`VARCHAR2(30)`
`LOWER_VALUE`	`NUMBER`
`UPPER_VALUE`	`NUMBER`
`STRING_VALUE`	`VARCHAR2(60)`
`BIN_NUMBER`	`NUMBER(15)`

Examples of Specifying Bin Boundaries

The following examples illustrate how to specify bin boundaries.

Consider movie rating data on a scale of 1 - 5. Suppose that you want to bin ratings as follows:

1 and 2 are in bin number 1
3 is in bin number 2
4 and 5 are bin number 3

You should enter the following into the bin boundaries table:

(3, 'MOVIE', 'VALUE', 1, 2.1, NULL, 1), (3, 'MOVIE', 'VALUE', 3, 3.1, NULL, 2), (3, 'MOVIE', 'VALUE', 4, 5.1, NULL, 3).

The range of the bin includes all values that are greater than or equal to the lower value and strictly less than the upper value. The data source type for rating is 3 and string value is set to NULL for numeric data.

The following entries in a bin boundary table bin marital status, a categorical attribute:

(1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Single', 1), (1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Divorced', 2), (1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Separated', 2), (1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Married', 3), (1, 'NONE', 'MARITAL_STATUS', NULL, NULL, 'Widowed', 4)

The data source type is 1 and the item type is NONE for demographic data. Lower and upper values are NULL for categorical data.

Taxonomy

In OP, taxonomy refers to the structural organization of items in a company's catalog or site. Typically the catalog and/or the site has a hierarchical structure, with the most general category at the base (for example, "clothing"), and branching to increasingly specific categories (for example, from "clothing" to "shoes" to "sneakers" to "tennis shoes").

Items can belong to more than one category and to different levels of the structure (for example, "tennis shoes" can be a category in "clothing" and also a category in "sports equipment").

The structure of the OP taxonomy is a DAG (direct acyclic graph), which can contain multiple top-level nodes. The different portions of the taxonomy can be disconnected too. Any node can connect to any other node but there cannot be a path that connects a node's child back to the node itself.

OP also supports multiple taxonomies (different ways of organizing the items).

The taxonomy is implemented using a group of tables; they are specified by the customer at installation time:

MTR_TAXONOMY: Lists the different taxonomies used by the site. The schema for this table has three fields; they are listed below, in order, with their data types:

ID

NUMBER PK

NAME

VARCHAR2(150)

DESCRIPTION

VARCHAR2(4000)
MTR_TAXONOMY_CATEGORY: Specifies which categories belong to the different taxonomies. (A category can belong to multiple taxonomies; however, for a given taxonomy, there can be only one instance of any category.) The schema for this table has four fields; they are listed below, in order, with their data types:

TAXONOMY_LEVEL

NUMBER PK

TAXONOMY_ID

NUMBER PK

PARENT_ID

NUMBER PK

CHILD_ID

NUMBER PK
MTR_TAXONOMY_CATEGORY_ITEM: Specifies which items go with a given taxonomy-category pair. The schema for this table has four fields; they are listed below, in order, with their data types:

CATEGORY_ID

NUMBER PK

TAXONOMY_ID

NUMBER PK

ITEM_ID

NUMBER PK

ITEM_TYPE

VARCHAR2(30) PK
MTR_CATEGORY: Specifies the different categories used by the site. The schema for this table has three fields; they are listed below, in order, with their data types:

ID

NUMBER PK

NAME

VARCHAR2(150)

DESCRIPTION

VARCHAR2(4000)

Samples of the MTR Taxonomy Tables

The REAPI Demo includes a taxonomy; you can examine the demo MTR to see examples of all these tables.

Customer Table

The MTR_CUSTOMER table contains demographic information about the customer. Some customer attributes are common to all OP applications and some can be tailored to your application. The common attributes are customer ID, name, creation date, gender, age, marital status, personal income, whether or not the customer is the head of household, household income, household size, and whether the customer rents or owns.

You can specify up to 50 attributes specific to your Web application. These variable attributes are all strings.

The schema of the MTR_CUSTOMER table has the following fields. They are listed below, in order, with their data types.

`ID`	`VARCHAR2(32)`
`NAME`	`VARCHAR2(80)`
`CREATION_DATE`	`DATE`
`GENDER`	`VARCHAR2(10)`
`AGE`	`NUMBER(3)`
`MARITAL_STATUS`	`VARCHAR2(20)`
`PERSONAL_INCOME`	`NUMBER`
`IS_HEAD_OF_HOUSEHOLD`	`CHAR(1)`
`HOUSEHOLD_INCOME`	`NUMBER`
`HOUSEHOLD_SIZE`	`NUMBER(2)`
`RENT_OWN_INDICATOR`	`VARCHAR2(30)`
`ATTRIBUTE1`	`VARCHAR2(150)`
`ATTRIBUTE2`	`VARCHAR2(150)`
`ATTRIBUTE3`	`VARCHAR2(150)`
`ATTRIBUTE4`	`VARCHAR2(150)`
`...`
`ATTRIBUTE49`	`VARCHAR2(150)`
`ATTRIBUTE50`	`VARCHAR2(150)`

Hot Picks

Hot picks are used by some Web sites; for example, the daily specials might be hot picks. Information about hot picks is stored in two MTR tables, as follows:

MTR_HOTPICK_GROUP lists the distinct hot pick groups used by the site. There is one record for each group. Each record contains a group ID, the group name (label), and a brief description of the group. The schema for this table has three fields; they are listed below, in order, with their data types.

ID

NUMBER PK

LABEL

VARCHAR2(150)

DESCRIPTION

VARCHAR2(400)
MTR_HOTPICK lists the items in each hot pick group, arranged according to group ID. Each record consists of a group ID, an item ID, and an item type. The schema for this table has three fields; they are listed below, in order, with their data types.

ITEM_ID

NUMBER

ITEM_TYPE

VARCHAR2(30)

GROUP_ID

NUMBER

A hot pick group can also contain categories. In this case, the ITEM_TYPE is set to CATEGORY and item ID is set to the appropriate ID value in the MTR_CATEGORY table.

Detail Tables

Several tables in the MTR store the details of various activities.

MTR_CUSTOMER_NAV_DETAIL stores the navigation data corresponding to a customer session. This table is populated with data collected in the RE.
MTR_CUSTOMER_RATING_DETAIL stores rating data for customers. This table is populated using the data collected in the RE during data collection.
MTR_PURCHASING_DETAIL stores purchasing data on a per-session basis. Typically this data is collected by the Web application.
MTR_RECOMMENDATION_DETAIL stores the results of recommendation requests. The data stored in this table is used to generate reports on the performance of OP.
MTR_VISITOR_NAV_DETAIL stores the navigation data corresponding to a visitor session. This table is populated with data collected in the RE.
MTR_VISITOR_RATING_DETAIL stores rating data for visitors. This table is populated using the data collected in the RE during data collection.

Miscellaneous MTR Tables

The following tables are used internally by OP:

MTR_ATTR_NAME_TO_ID_MAP is used to speed up package building.
MTR_CONFIGURATION and MTR_INTERNAL_CONFIGURATION stores configuration information.
The MTR_SESSION table stores information about the session that OP creates internally on behalf of the application.
MTR_ATTR_ID_BIN_BOUNDARY is a materialized view of the join of the BIN_BOUNDARIES table and the ATTR_NAME_TO_ID_MAP table. It is used when transforming data during package builds.
MTR_PROXY is used to set up proxies for new items. When a new item is introduced, there may not be enough detail information about it to build packages; OP uses data about a similar existing product.

Recommendation Engine

The RE schema stores current session data, package information, navigational information, as well as information about the taxonomy structure. RE data is sychronized back to the MTR automatically. The RE includes the following tables (partial list):

`ATTR_ID_BIN_BOUNDARY`	`TABLE`
`HOTPICK`	`TABLE`
`HOTPICK_GROUP`	`TABLE`
`I_I_ANTECEDENT`	`TABLE`
`I_I_RULE`	`TABLE`
`P_I_CATEGORY_RULES`	`TABLE`
`P_I_ITEM_RULES`	`TABLE`
`RE_ACTIVE_USER`	`TABLE`
`RE_CONFIGURATION`	`TABLE`
`RE_CURRENT_SESSION_DATA`	`TABLE`
`RE_DEPLOYABLE_PACKAGE`	`TABLE`
`RE_DEPLOYABLE_PKG_CONTENTS`	`TABLE`
`RE_ERROR_TABLE`	`TABLE`
`RE_INTERNAL_CONFIGURATION`	`TABLE`
`RE_LOG`	`TABLE`
`RE_MESSAGE_LOG`	`TABLE`
`RE_PROFILE_DATA`	`TABLE`
`RE_RECOMMENDATION_DETAIL`	`TABLE`
`RE_SCHEMA_ACCESS`	`TABLE`
`TAXONOMY_CATEGORY`	`TABLE`
`TAXONOMY_CATEGORY_ITEM`	`TABLE`
`TAXONOMY_TRANS_CLOSURE`	`TABLE`

HOT_PICKGROUP and HOTPICK are copies of the corresponding tables in the MTR.

RE_CURRENT_SESSION_DATA holds all the data collected using the data collection methods. This data is written back to the MTR using data synchronization.

RE_PROFILE_DATA stores the historical profiles of active users. When a user is detected, the profile of that user is loaded from the MTR to this table.

RE_RECOMMENDATION_DETAIL is the source of data for the corresponding table in the MTR. The data is synchronized back to the MTR.

ATTR_ID_BIN_BOUNDARY is a copy of the corresponding table in the MTR.

RE_CONFIGURATION and RE_INTERNAL_CONFIGURATION store the configuration parameters for the RE.

RE_DEPLOYABLE_PACKAGE keeps track of the deployable package that is currently deployed in the RE.

RE_LOG records events occurring in the RE.

RE_ACTIVE_USER stores information about all users who are currently active in the system. Data from this table is used to populate the session table in the MTR.

All other tables are used internally by the RE.

Mining Object Repository

Much of the administrative work done by OP uses MOR tables and views. The MOR includes the following tables (partial list):

`MOR_VISITOR_TO_BROWSER_REPORT`	`TABLE`
`MOR_CONFIGURATION`	`TABLE`
`MOR_CROSS_SOLD_ITEMS_REPORT`	`TABLE`
`MOR_DEPLOYABLE_PACKAGE`	`TABLE`
`MOR_EFFECTIVENESS_REPORT`	`TABLE`
`MOR_EMAIL_ADDRESS`	`TABLE`
`MOR_ERROR_TABLE`	`TABLE`
`MOR_INTERNAL_CONFIGURATION`	`TABLE`
`MOR_MESSAGE_LOG`	`TABLE`
`MOR_MINING_MODEL`	`TABLE`
`MOR_MINING_RESULT`	`TABLE`
`MOR_MTR_CONNECTION`	`TABLE`
`MOR_RECOMMENDATION_ENGINE`	`TABLE`
`MOR_RECOMMENDATION_REPORT`	`TABLE`
`MOR_RECOMMENDATION_STRATEGY`	`TABLE`
`MOR_RE_FARM`	`TABLE`
`MOR_SCHEDULE_EVENT`	`TABLE`
`MOR_SCHEDULE_ITEM`	`TABLE`
`MOR_SCHEMA_ACCESS`	`TABLE`
`MOR_TAXONOMY_TRANS_CLOSURE`	`TABLE`
`MOR_TRANS_SUPERVISED_RESULT`	`TABLE`

`TAXONOMY_LEVEL`	`NUMBER PK`
`TAXONOMY_ID`	`NUMBER PK`
`PARENT_ID`	`NUMBER PK`
`CHILD_ID`	`NUMBER PK`

`CATEGORY_ID`	`NUMBER PK`
`TAXONOMY_ID`	`NUMBER PK`
`ITEM_ID`	`NUMBER PK`
`ITEM_TYPE`	`VARCHAR2(30) PK`

4OP Schemas