A Hadoop storage device connects to and supports the features in an advanced storage area. You must be running Content Platform Engine 5.2.1.3-P8CPE-FP003 or later to use this feature.
Hadoop manages Hadoop Distributed File System (HDFS) storage, which holds large volumes of structured and unstructured data. HDFS storage is intended for batch analysis of data.
Before you begin
- To maximize performance of batch processing, dedicate Hadoop HDFS storage to data that is searched with Hadoop-specific applications. Store content required by P8 interactive clients on non-Hadoop devices.
- Before you can create a Hadoop storage device in Content Platform Engine, you must first register the Hadoop storage device add-on in the global configuration database and install it to an object store.
- Verify that the Hadoop storage device security requirements are met.
- Have the Hadoop storage device configuration values ready. A Hadoop replica uses a two-level directory structure, so you must set the number of top-level directory nodes and the number of leaf directory nodes. You must also set the user name and password that are used to authenticate with the authentication provider for Knox Gateway security.
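To make the two-level directory structure concrete, the following is a minimal sketch of how a top-level node count and a leaf node count might spread content across directories. The hashing scheme, the `top_N/leaf_N` directory names, and the function names are assumptions for illustration only, as is the reading that the leaf count applies to each top-level directory. This topic does not describe how Content Platform Engine actually assigns content to directories.

```python
import hashlib


def replica_path(content_id: str, top_nodes: int, leaf_nodes: int) -> str:
    """Map a content ID to a hypothetical two-level directory path."""
    if top_nodes < 1 or leaf_nodes < 1:
        raise ValueError("directory node counts must be positive")
    # Hash the ID so that content spreads evenly across the directories.
    digest = int(hashlib.sha256(content_id.encode("utf-8")).hexdigest(), 16)
    top = digest % top_nodes                   # first-level directory index
    leaf = (digest // top_nodes) % leaf_nodes  # second-level directory index
    return f"top_{top}/leaf_{leaf}/{content_id}"


def total_leaf_directories(top_nodes: int, leaf_nodes: int) -> int:
    """Count leaf directories, assuming leaf_nodes applies per top-level node."""
    return top_nodes * leaf_nodes
```

Under these assumptions, 4 top-level nodes with 8 leaf nodes each give 32 leaf directories, and the same content ID always maps to the same path. Larger counts keep fewer files in each directory.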
Procedure
To create a Hadoop storage device:
- Start the New Hadoop Storage Device wizard in the administration console:
  - In the tree view, click to open the object store that uses the device.
  - In the object store tree view, right-click the folder and click Hadoop Storage Devices.
  - In the Hadoop Storage Device pane, click New.
- Complete the wizard.
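Because the wizard asks for the Knox Gateway user name and password, it can help to confirm outside the administration console that the gateway accepts them. The following is a minimal sketch that builds a WebHDFS `LISTSTATUS` request routed through Knox. It assumes that the Knox topology exposes WebHDFS and uses HTTP Basic authentication. The gateway URL, the topology name, the path, the credentials, and the function name are hypothetical placeholders, not values from this topic.

```python
import base64
import urllib.request


def build_knox_liststatus_request(gateway_url, topology, hdfs_path, user, password):
    """Build (but do not send) a WebHDFS LISTSTATUS request through Knox."""
    # Knox exposes WebHDFS under <gateway-url>/<topology>/webhdfs/v1/<path>.
    url = (
        f"{gateway_url.rstrip('/')}/{topology}/webhdfs/v1/"
        f"{hdfs_path.lstrip('/')}?op=LISTSTATUS"
    )
    # HTTP Basic authentication: base64-encode "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical values; replace them with those of your environment.
request = build_knox_liststatus_request(
    "https://knox.example.com:8443/gateway", "default", "/tmp", "alice", "secret"
)
```

Passing `request` to `urllib.request.urlopen` sends it. A JSON directory listing in the response suggests that the gateway accepted the credentials, and an HTTP 401 error suggests that it rejected them.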