Defining HTTP resources

When you define HTTP resources for the pipeline, use the url attribute of the webd:get and webd:post elements to specify a fully qualified URL.

You can also register a proxy server definition in the MCS configuration file.

To filter binary content, use the webd:content element. To handle any scripts in the source, use webd:script. See Handling HTTP content for more information.

Using an HTTP request

The web driver supports the HTTP GET and POST methods for accessing a remote page. These methods are represented by the webd:get and webd:post elements.

Within either element, use the contained webd:parameter, webd:header and webd:cookie elements to tune the request. Refer to Request parameters, headers and cookies for details.

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/2002/06/xhtml2"
  xmlns:pipeline="http://www.volantis.com/xmlns/marlin-pipeline"
  xmlns:webd="http://www.volantis.com/xmlns/marlin-web-driver">
  <head>
    <title>DCI elements</title>
  </head>
  <body>
    <div>
      <pipeline:transform href="test.xsl">
        <webd:get url="http://weather.yahooapis.com/forecastrss">
          <webd:parameters>
            <webd:parameter name="p" value="USWA0395"/>
            <webd:parameter name="u" value="c"/>
          </webd:parameters>
        </webd:get>
      </pipeline:transform>
    </div>
  </body>
</html>
Note:

The response must contain the Content-Type header; otherwise MCS will not process it.
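
As a minimal sketch of how a request might be tuned further, the following webd:post sends one parameter and one custom header. The URL, the parameter and header values, and the webd:headers container (assumed here to mirror webd:parameters) are illustrative assumptions rather than documented fact; check the element reference for the exact content model and child order.

<webd:post url="http://www.example.com/search">
  <!-- Assumed container element, by analogy with webd:parameters -->
  <webd:headers>
    <webd:header name="Accept-Language" value="en"/>
  </webd:headers>
  <webd:parameters>
    <!-- Hypothetical request parameter -->
    <webd:parameter name="q" value="weather"/>
  </webd:parameters>
</webd:post>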

Specifying a proxy server

You may want to connect to a remote web site through a proxy server for security reasons, or to boost performance where the proxy is able to cache content. Use the ref attribute on the optional webd:proxy element to point to an entry in the mcs-config.xml file.

The pipeline-configuration section of the configuration file contains a web-driver element in which you can specify one or more proxy servers using the id, host and port attributes.

<web-driver connection-timeout="10000">
  <proxy
    id="myproxy"
    host="myhost"
    port="8086"/>
</web-driver>

Having defined the proxy server in the pipeline configuration, you can then refer to it in the web driver webd:proxy element.

<webd:get url="http://weather.yahooapis.com/forecastrss">
  <webd:proxy ref="myproxy"/>
  <webd:parameters>
    <webd:parameter name="p" value="USWA0395"/>
    <webd:parameter name="u" value="c"/>
  </webd:parameters>
</webd:get>
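
Because the web-driver element can hold more than one proxy definition, each request can select a proxy by its id. The following is a sketch only; the ids, hosts and ports are invented for illustration.

<web-driver connection-timeout="10000">
  <!-- Hypothetical caching proxy -->
  <proxy id="cacheproxy" host="cache.example.com" port="3128"/>
  <!-- Hypothetical proxy used for security reasons -->
  <proxy id="secureproxy" host="secure.example.com" port="8086"/>
</web-driver>

A request would then name the proxy it needs in the ref attribute:

<webd:get url="http://weather.yahooapis.com/forecastrss">
  <webd:proxy ref="cacheproxy"/>
</webd:get>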

Specifying a timeout value

You can also use the connection-timeout attribute on the web-driver element to set a default timeout for connections made by webd:get and webd:post operations. The value sets the default server and back timeout, in milliseconds, for all web driver operations. A value of '-1' means that the connection will never time out.

To set a default value for a wider range of remote connections, use the timeout attribute on the connection element in the mcs-config.xml file. See the element references for details.

<pipeline-configuration>
  <caching-operation>
    <cache
      name="xmlcache"
      strategy="least-recently-used"
      max-entries="10"/>
  </caching-operation>
  <connection timeout="-1"
    enable-caching="false"
    max-cache-entries="1000"/>
</pipeline-configuration>

You can also set timeout values in markup, using the timeout attribute on the webd:get, webd:post and urid:fetch elements. Values set in markup override any defaults.

If a remote server fails to complete the request within the timeout period defined in the markup or configuration file, the request is aborted, regardless of how much data has been returned. Any data returned so far is discarded, and an error is propagated down the pipeline.
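
For illustration, a per-request override might look like the sketch below. The timeout attribute on webd:get is described above; the value 30 is arbitrary, and its unit is not stated in this topic, so confirm it in the element reference before relying on it.

<!-- The timeout value is illustrative; it overrides the configured default for this request only -->
<webd:get url="http://weather.yahooapis.com/forecastrss" timeout="30">
  <webd:parameters>
    <webd:parameter name="p" value="USWA0395"/>
    <webd:parameter name="u" value="c"/>
  </webd:parameters>
</webd:get>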
