About fixed storage areas

Content Engine currently supports the following fixed content providers:

General principles

Definitions

Fixed content storage vs normal file storage area

Fixed storage area architecture

The core feature of Content Engine's support for fixed content is a new type of file storage area, known as a fixed storage area. The fixed storage area combines a traditional file storage area with a third-party fixed content system:

This graphic shows the components comprising the fixed storage area architecture:

GCD (Global Configuration Data)

Each fixed storage area has a set of configuration data which is stored in the Content Engine's Global Configuration Data (GCD) database. The configuration data contains essential information including the complete shared path to the root staging directory of the storage area. This allows references to content in the object store database to contain only the specifics about the content; there is no need to store the path to each content element file, which is always computed from the configuration information and information about the document.
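The idea of computing a file path from configuration data plus document information, rather than storing a path per content element, can be sketched as follows. This is a minimal illustration only: the `StorageAreaConfig` class, the `content_element_path` function, and the `content/<guid>/<sequence>` layout are hypothetical and do not reflect the directory scheme Content Engine actually uses.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class StorageAreaConfig:
    """Hypothetical stand-in for the per-storage-area configuration held in the GCD."""
    root_directory: str  # complete shared path to the root staging directory


def content_element_path(config, document_guid, sequence_number):
    """Derive a content element file path from configuration plus document data.

    The layout used here is invented for illustration; only the principle
    (nothing but document specifics needs to be stored per element) is from the text.
    """
    return (PurePosixPath(config.root_directory)
            / "content" / document_guid / str(sequence_number))


config = StorageAreaConfig(root_directory="/mnt/share/fsa1")
print(content_element_path(config, "1A2B3C4D", 0))
```

Because the root path lives in one place, a change to the configured root would not require rewriting any per-document reference in this model.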

File storage area

The dedicated file storage area associated with the fixed storage area (also known as the "staging area") can be a Windows NTFS volume or a UNIX file system. The staging area is created automatically when the fixed storage area is created. Before that, however, you must create a root folder in a shared directory to serve as the root of the staging area.

Inbound

The Inbound area is a subdirectory in the staging area that serves as a work area for uploading new content. The inbound area for a fixed storage area behaves exactly like the inbound area of a file storage area.

Content

The Content area is the file storage directory structure in the staging area where content is placed after being fully uploaded. From there, depending on the capabilities of the fixed content device, content may be migrated to the fixed device. If content is migrated, it is then removed from the staging area. If content is not migrated because the fixed device does not support new content, the content remains in the staging area. Annotations always remain in the staging area.

The content area contains the finalized version of the content, as content element files. Each file in the content area contains the content for a single content element. Content element files are immutable; new content elements are created, but existing files are never altered (although they can be deleted).

Content becomes finalized when it is moved from the Inbound Area to the Content Area (within the transaction that commits the document to the database). After content is finalized, it is optionally full-text indexed using Verity.

The Content Area operates differently for a fixed storage area than for a regular file storage area. Content elements are still moved to the Content Area when content is finalized, but after full-text indexing the content may be migrated to a fixed store and then removed from the Content Area. It may later be restored temporarily to the Content Area for re-indexing. See Content Element Model for more information on what types of documents are migrated to the fixed store.

Referrals

All content of a fixed storage area that has been migrated to a fixed content system has an associated referral. A referral is based on a Content Engine document's object GUID, and contains entries for all content elements of the document (there is one referral for each document, not one for each content element). The referral information is stored as a BLOB column in the DocVersion table in the object store database. A referral always points to content in the fixed content system. Content that has not been migrated does not have a referral (unmigrated content is always stored in the content area of the staging area).

Retrieving content from a fixed storage area involves two steps: first retrieve the content referral from the object store database, and then use the referral to retrieve the actual content from the fixed content system. When an object store needs to access content (for either retrieval or deletion), it examines the database to see whether it has referral data for the document. If the data exists, it is passed to the content component, which goes directly to the fixed content system to retrieve or delete the content (deletion assumes that the fixed content store is configured to permit it).
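The two-step lookup can be sketched as follows. This is a simplified model under stated assumptions, not the Content Engine implementation: `FixedContentSystem`, `retrieve_content`, and the dictionaries standing in for the object store database and the staging area are all hypothetical names.

```python
class FixedContentSystem:
    """In-memory stand-in (hypothetical) for a third-party fixed content system."""

    def __init__(self, allow_delete=True):
        self.items = {}                 # referral -> list of content elements
        self.allow_delete = allow_delete

    def fetch(self, referral):
        return self.items[referral]

    def delete(self, referral):
        if not self.allow_delete:
            raise PermissionError("fixed content store does not permit deletion")
        del self.items[referral]


def retrieve_content(doc_guid, referrals, fixed_system, staging_area):
    """Two-step retrieval: look up the referral, then follow it."""
    referral = referrals.get(doc_guid)       # step 1: referral from the database
    if referral is not None:
        return fixed_system.fetch(referral)  # step 2: content from the fixed system
    return staging_area[doc_guid]            # no referral: content was never migrated


fixed = FixedContentSystem()
fixed.items["ref-001"] = [b"element 0", b"element 1"]
referrals = {"doc-A": "ref-001"}             # one referral per document
staging = {"doc-B": [b"draft element"]}

migrated = retrieve_content("doc-A", referrals, fixed, staging)
unmigrated = retrieve_content("doc-B", referrals, fixed, staging)
```

In this model the presence or absence of a referral is the only thing that decides whether the fixed content system or the staging area is consulted.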

Index area

The index area contains the Verity full-text index collections that support the indexing of content elements in a fixed storage area. The documented Verity limits (the maximum number of content elements, documents per collection, collections per object store, and collections that can be used in a single Verity search) yield a maximum of between 64 million and 2 billion indexable documents per object store. To index more content, you can create multiple fixed storage areas that all use the same fixed content store, which supplies more disk space and enables more Verity collections.

However, because the Verity full-text indexes will reside on an NTFS volume, the amount of data that can reside in Verity indexes will always be much smaller than the amount of content that can be contained in a fixed storage area.

Migration queue

The Migration Queue is stored as a table in the object store database. Content in the proper state (that is, content of checked-in documents) is migrated from the staging area to the fixed content device. A content element is queued either when it is uploaded in the proper state or when it has finished indexing. When a queue item is executed, the staging file is copied to the fixed content device, a referral entry is added to the database, and a request to delete the staged content element is queued. The queued delete request is executed later, once the staged content file is no longer locked for reading, and the staged file is then deleted.
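The sequence of queue operations described above can be sketched roughly as follows. This is an illustrative model only: the function name `run_migration`, the dictionaries standing in for the staging area, fixed content device, and referral data, and the `ref:` referral format are all assumptions made for the example.

```python
from collections import deque


def run_migration(queue, staging, fixed_store, referrals, locked):
    """Process migration requests, then the queued staged-file deletions.

    Returns the delete requests that had to be deferred because the
    staged file was still locked for reading.
    """
    pending_deletes = deque()
    while queue:
        element_id = queue.popleft()
        fixed_store[element_id] = staging[element_id]  # copy staging file to the device
        referrals[element_id] = f"ref:{element_id}"    # add referral entry (format is made up)
        pending_deletes.append(element_id)             # queue the delete-staged-content request

    deferred = deque()
    while pending_deletes:
        element_id = pending_deletes.popleft()
        if element_id in locked:
            deferred.append(element_id)                # still being read: retry later
        else:
            del staging[element_id]                    # staged copy is no longer needed
    return deferred


staging = {"doc-A/0": b"alpha", "doc-A/1": b"beta"}
fixed_store, referrals = {}, {}
deferred = run_migration(deque(["doc-A/0", "doc-A/1"]),
                         staging, fixed_store, referrals, locked={"doc-A/1"})
```

Separating the copy from the delete means a reader holding a lock on a staged file never loses it mid-read; the staged copy simply lingers until a later pass.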

Fixed content system

The third-party fixed content system is the hardware/software solution for storing content. It can run on a Windows or non-Windows platform, but it must be one of the systems explicitly supported by FileNet P8 Platform.

Content element model

The content element model for Content Engine is straightforward: a document can have zero, one, or many content elements, and each content element is represented by a single file in a file storage area. Each content element, and therefore each file in the file storage area, is immutable: once persisted, a content element file cannot be altered (but can be deleted, given the permission to do so). Each content element is uniquely identified by the combination of the object identifier (GUID) of the document, and an integer sequence number. For a given document, the content element sequence numbers are always unique and never reused.
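A minimal sketch of this identification scheme is shown below. The `ContentElementId` and `Document` classes are hypothetical; they illustrate only the rules stated above (GUID plus sequence number as the key, no alteration after persistence, and sequence numbers that are never reused).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentElementId:
    """Unique key for a content element: document GUID plus sequence number."""
    document_guid: str
    sequence_number: int


@dataclass
class Document:
    guid: str
    elements: dict = field(default_factory=dict)
    _next_sequence: int = 0

    def add_element(self, data: bytes) -> ContentElementId:
        element_id = ContentElementId(self.guid, self._next_sequence)
        self._next_sequence += 1          # counter only grows, so numbers are never reused
        self.elements[element_id] = data  # bytes objects are immutable once stored
        return element_id

    def delete_element(self, element_id: ContentElementId) -> None:
        del self.elements[element_id]     # deletion is allowed; alteration is not offered


doc = Document("doc-A")
first = doc.add_element(b"page 1")
doc.delete_element(first)
second = doc.add_element(b"page 2")
```

Here `second` receives sequence number 1 even though element 0 has been deleted, which mirrors the rule that sequence numbers are never reused within a document.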

When content is stored in a fixed content device, content elements are stored on special storage devices and a reference is kept to each file (for later retrieval). A FileNet interface/adapter manages the content on the fixed device. Note that fixed content devices may be slow, may limit concurrent connections/operations, may be write-once, and may have a data retention policy. Thus, content is staged to a file storage area repository before being migrated to the fixed device.

Document Versioning Model

Two of the versioning states that a document can be in are important to this discussion:

  1. A document can be checked-in.
  2. A document can be a reservation.

The significance of these two states is that a document in state (1) has an immutable set of content elements, while the content element set for a document in state (2) can be altered. The content element set becomes immutable when either a new document or a document in the reservation state is checked in. The process of check-out and check-in proceeds as follows:

  1. An existing checked-in document is checked out, creating a new document object with a new object identifier (GUID). The new document does not contain any content elements (the content of the document that was the source of the check-out is not inherited by the new document), and is in the reservation versioning state.
  2. While the new document is still a reservation, content elements can be added to it or removed from it. The document can undergo interim saves with various content element sets; in other words, the content set is not immutable.
  3. At some point the document is checked in. This changes the versioning state from reservation to checked-in, and makes the content element set immutable.
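The check-out and check-in steps above can be modeled as a small state machine. The sketch below is illustrative only; `VersionState` and `VersionedDocument` are hypothetical names and are not part of any Content Engine API.

```python
import uuid
from enum import Enum


class VersionState(Enum):
    RESERVATION = "reservation"
    CHECKED_IN = "checked-in"


class VersionedDocument:
    def __init__(self, state=VersionState.RESERVATION):
        self.guid = str(uuid.uuid4())  # every document object gets its own GUID
        self.state = state
        self.elements = []

    def add_element(self, data):
        if self.state is VersionState.CHECKED_IN:
            raise RuntimeError("content element set is immutable after check-in")
        self.elements.append(data)

    def check_in(self):
        self.state = VersionState.CHECKED_IN  # element set becomes immutable

    def check_out(self):
        if self.state is not VersionState.CHECKED_IN:
            raise RuntimeError("only a checked-in document can be checked out")
        # New object: new GUID, reservation state, no inherited content.
        return VersionedDocument()


original = VersionedDocument()
original.add_element(b"v1")
original.check_in()
reservation = original.check_out()
reservation.add_element(b"v2 draft")   # allowed: reservations are mutable
```

Attempting `original.add_element(...)` after check-in raises an error in this model, which corresponds to the immutable content element set of a checked-in document.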

Fixed Content System Content Element Model

While there are several possible content models, currently only a "many to one" mapping is supported, where all content elements for a document are represented as a complete set by one item in the fixed system.

The behavior for content of documents in the reservation state differs from the behavior for content of documents in the checked-in state. Content of reservations is never stored in the fixed system, and is always stored within the file storage area. Content of checked-in documents is always stored in the fixed system, with all content elements of the document stored within a single item of the fixed system. When a document transitions from the reservation state to the checked-in state (an existing document is checked in), all content elements of the document that are stored in the file storage area are migrated to the fixed system.

Annotation and reservation content is never migrated

The content element set of a Content Engine annotation is like the set of elements for a document reservation, in that content elements can be added to or deleted from the set. Therefore, content of annotations is always stored in the file storage area, and never migrated to the fixed store. This is an important consideration for the backup and restore of annotation and document reservation content.
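The placement rules for annotations, reservations, and checked-in documents can be summarized in a short decision function. This is a sketch with hypothetical names (`storage_location` and its string arguments), and it ignores the case in which the fixed device does not accept new content and the content therefore stays in the staging area.

```python
def storage_location(kind, state=None):
    """Return where content ends up: 'staging' or 'fixed'.

    kind:  'annotation' or 'document'
    state: 'reservation' or 'checked-in' (documents only)
    """
    if kind == "annotation":
        return "staging"        # annotation content is never migrated
    if kind == "document" and state == "reservation":
        return "staging"        # reservation content is never migrated
    if kind == "document" and state == "checked-in":
        return "fixed"          # all elements migrate together as one item
    raise ValueError(f"unsupported combination: {kind!r}, {state!r}")


for kind, state in [("annotation", None),
                    ("document", "reservation"),
                    ("document", "checked-in")]:
    print(kind, state, "->", storage_location(kind, state))
```

Anything that maps to `staging` in this model must be covered by the backup of the file storage area, since the fixed content system never holds it.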

Administration

You must create and configure the fixed content device using tools supplied by the software vendor. Use the Create a Fixed Content Device wizard in Enterprise Manager to connect to the device.

You can create a new fixed storage area only when these tasks are completed:

Create a fixed storage area

  1. Run the Create a Fixed Content Device wizard. Select the configuration parameters for the fixed content device.
  2. Run the Create a Storage Area wizard. Select a fixed storage area and select one of the pre-configured fixed content devices.

    The file storage area (also known as the staging area) associated with the fixed storage area can be a Windows NTFS volume or a UNIX file system. The staging area is created automatically when the fixed storage area is created, but you must create the shared directory and root folder for the staging area before completing the wizard. For information about the wizard and launch locations, see Wizard Help.

View the fixed storage area object

All storage areas (file, fixed, and database) are displayed as result items under Enterprise Manager's Storage Areas folder in the object store node. You can distinguish between the two types of file storage areas by looking at the Type or Fixed Content Device columns. The Type column provides a textual description of the storage area type. If the Fixed Content Device column contains a provider name, the storage area is a fixed storage area. If the column is blank, the storage area is a regular file storage area. You can view and modify the property sheets of these two types of file storage areas in exactly the same way (right-click the object and select Properties).

Other administrative tasks

You administer a fixed storage area the same way you administer a "normal" file storage area. See file storage area "how to" topics for more information.