Handling HTTP content

Sometimes the content returned from a connection request is not what you expected, for example an image in a binary format or an unknown XML vocabulary. Web pages can also contain scripts, and HTML markup may not be well-formed XML.

Filtering content

You can use the type and action attributes of the webd:content element to filter out content that you do not want in the pipeline. The type attribute specifies the MIME type of the content to handle, for example 'image/gif', and setting action to 'ignore' omits that content.

<webd:content
  type="image/gif"
  action="ignore"/>

Conditioning HTML

If the content received from the remote web server has the MIME type 'text/html', the web driver attempts to tidy it into valid XHTML.

The tidy operation adds quotes around attributes that are not quoted and closes any elements that are not explicitly closed. The tidy process is not configurable.
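
As an illustration of the two repairs described above, the following sketch shows a hypothetical HTML fragment and the XHTML that a tidy pass of this kind would be expected to produce. The fragment is invented for this example, and the exact output of the web driver may differ in whitespace or other details.

```xml
<!-- Hypothetical input received as text/html (not well-formed XML):
     <p class=intro>First line<br>Second line
-->

<!-- Expected result after tidying: the attribute value is quoted
     and the br and p elements are explicitly closed -->
<p class="intro">First line<br/>Second line</p>
```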

Client-side scripts

If your site relies on scripts to provide dynamic menus and links, you'll want to include script material in the pipeline process using the webd:script element. You refer to a script module definition in the mcs-config.xml file.

The filter element specifies a MIME type in its content-type attribute and, in its class attribute, names a custom filter class that implements the rendering of the script or scripts in XDIME. Both attributes are required.

<script>
  <module id="js_module">
    <filter content-type="application/x-javascript"
      class="com.myweb.javascriptHandler"/>
  </module>
</script>

Now you can refer to the script module in the pipeline.

<webd:get url="http://somesite.net/index.xml">
  <webd:script ref="js_module"/>
</webd:get>
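
The elements described in this topic might be combined in a single request. The following is a sketch only: it assumes that webd:content can be nested in webd:get in the same way as webd:script, and that more than one webd:content element is allowed. Neither assumption is confirmed by this topic, and the 'image/jpeg' filter is added purely for illustration.

```xml
<webd:get url="http://somesite.net/index.xml">
  <!-- Assumed placement: omit GIF and JPEG images from the response -->
  <webd:content type="image/gif" action="ignore"/>
  <webd:content type="image/jpeg" action="ignore"/>
  <!-- Handle scripts with the js_module defined in mcs-config.xml -->
  <webd:script ref="js_module"/>
</webd:get>
```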
