SAXParser.hpp

/*
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Xerces" and "Apache Software Foundation" must
 *    not be used to endorse or promote products derived from this
 *    software without prior written permission. For written
 *    permission, please contact apache\@apache.org.
 *
 * 5. Products derived from this software may not be called "Apache",
 *    nor may "Apache" appear in their name, without prior written
 *    permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation, and was
 * originally based on software copyright (c) 1999, International
 * Business Machines, Inc., http://www.ibm.com .  For more information
 * on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */

/*
 * $Log: SAXParser.hpp,v $
 * Revision 1.19  2001/07/27 20:24:21  tng
 * put getScanner() back as they were there before, not to break existing apps.
 *
 * Revision 1.18  2001/07/16 12:52:09  tng
 * APIDocs fix: default for schema processing in DOMParser, IDOMParser, and SAXParser should be false.
 *
 * Revision 1.17  2001/06/23 14:13:16  tng
 * Remove getScanner from the Parser headers as this is not needed and Scanner is not internal class.
 *
 * Revision 1.16  2001/06/03 19:26:20  jberry
 * Add support for querying error count following parse; enables simple parse without requiring error handler.
 *
 * Revision 1.15  2001/05/11 13:26:22  tng
 * Copyright update.
 *
 * Revision 1.14  2001/05/03 19:09:25  knoaman
 * Support Warning/Error/FatalError messaging.
 * Validity constraints errors are treated as errors, with the ability by user to set
 * validity constraints as fatal errors.
 *
 * Revision 1.13  2001/03/30 16:46:57  tng
 * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
 *
 * Revision 1.12  2001/03/21 21:56:09  tng
 * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
 *
 * Revision 1.11  2001/02/15 15:56:29  tng
 * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
 * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
 * New data field  fSchemaValidation in XMLScanner as the flag.
 *
 * Revision 1.10  2001/01/12 21:23:41  tng
 * Documentation Enhancement: explain values of Val_Scheme
 *
 * Revision 1.9  2000/08/02 18:05:15  jpolast
 * changes required for sax2
 * (changed private members to protected)
 *
 * Revision 1.8  2000/04/12 22:58:30  roddey
 * Added support for 'auto validate' mode.
 *
 * Revision 1.7  2000/03/03 01:29:34  roddey
 * Added a scanReset()/parseReset() method to the scanner and
 * parsers, to allow for reset after early exit from a progressive parse.
 * Added calls to new Terminate() call to all of the samples. Improved
 * documentation in SAX and DOM parsers.
 *
 * Revision 1.6  2000/02/17 03:54:27  rahulj
 * Added some new getters to query the parser state and
 * clarified the documentation.
 *
 * Revision 1.5  2000/02/16 03:42:58  rahulj
 * Finished documenting the SAX Driver implementation.
 *
 * Revision 1.4  2000/02/15 04:47:37  rahulj
 * Documenting the SAXParser framework. Not done yet.
 *
 * Revision 1.3  2000/02/06 07:47:56  rahulj
 * Year 2K copyright swat.
 *
 * Revision 1.2  1999/12/15 19:57:48  roddey
 * Got rid of redundant 'const' on boolean return value. Some compilers choke
 * on this and its useless.
 *
 * Revision 1.1.1.1  1999/11/09 01:07:51  twl
 * Initial checkin
 *
 * Revision 1.6  1999/11/08 20:44:54  rahul
 * Swat for adding in Product name and CVS comment log variable.
 *
 */

#if !defined(SAXPARSER_HPP)
#define SAXPARSER_HPP

#include <sax/Parser.hpp>
#include <internal/VecAttrListImpl.hpp>
#include <framework/XMLDocumentHandler.hpp>
#include <framework/XMLElementDecl.hpp>
#include <framework/XMLEntityHandler.hpp>
#include <framework/XMLErrorReporter.hpp>
#include <validators/DTD/DocTypeHandler.hpp>

class DocumentHandler;
class EntityResolver;
class XMLPScanToken;
class XMLScanner;
class XMLValidator;


class SAXParser :

    public Parser
    , public XMLDocumentHandler
    , public XMLErrorReporter
    , public XMLEntityHandler
    , public DocTypeHandler
{
public :
    // -----------------------------------------------------------------------
    //  Class types
    //
    //  ValSchemes
    //      Val_Never   Never report validation errors.
    //      Val_Always  Always report validation errors.
    //      Val_Auto    Report validation errors only if a grammar is
    //                  specified.
    // -----------------------------------------------------------------------
    enum ValSchemes
    {
        Val_Never
        , Val_Always
        , Val_Auto
    };


    // -----------------------------------------------------------------------
    //  Constructors and Destructor
    // -----------------------------------------------------------------------
    SAXParser(XMLValidator* const valToAdopt = 0);

    ~SAXParser();


    // -----------------------------------------------------------------------
    //  Getter methods
    // -----------------------------------------------------------------------
    DocumentHandler* getDocumentHandler();

    const DocumentHandler* getDocumentHandler() const;

    EntityResolver* getEntityResolver();

    const EntityResolver* getEntityResolver() const;

    ErrorHandler* getErrorHandler();

    const ErrorHandler* getErrorHandler() const;

    const XMLScanner& getScanner() const;

    const XMLValidator& getValidator() const;

    ValSchemes getValidationScheme() const;

    bool getDoSchema() const;

    int getErrorCount() const;

    bool getDoNamespaces() const;

    bool getExitOnFirstFatalError() const;

    bool getValidationConstraintFatal() const;


    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------
    void setDoNamespaces(const bool newState);

    void setValidationScheme(const ValSchemes newScheme);

    void setDoSchema(const bool newState);

    void setExitOnFirstFatalError(const bool newState);

    void setValidationConstraintFatal(const bool newState);


    // -----------------------------------------------------------------------
    //  Advanced document handler list maintenance methods
    // -----------------------------------------------------------------------
    void installAdvDocHandler(XMLDocumentHandler* const toInstall);

    bool removeAdvDocHandler(XMLDocumentHandler* const toRemove);


    // -----------------------------------------------------------------------
    //  Implementation of the SAX Parser interface
    // -----------------------------------------------------------------------
    virtual void parse(const InputSource& source, const bool reuseGrammar = false);

    virtual void parse(const XMLCh* const systemId, const bool reuseGrammar = false);

    virtual void parse(const char* const systemId, const bool reuseGrammar = false);

    virtual void setDocumentHandler(DocumentHandler* const handler);

    virtual void setDTDHandler(DTDHandler* const handler);

    virtual void setErrorHandler(ErrorHandler* const handler);

    virtual void setEntityResolver(EntityResolver* const resolver);


    // -----------------------------------------------------------------------
    //  Progressive scan methods
    // -----------------------------------------------------------------------
    bool parseFirst
    (
        const   XMLCh* const    systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   char* const     systemId
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseFirst
    (
        const   InputSource&    source
        ,       XMLPScanToken&  toFill
        , const bool            reuseGrammar = false
    );

    bool parseNext(XMLPScanToken& token);

    void parseReset(XMLPScanToken& token);


    // -----------------------------------------------------------------------
    //  Implementation of the DocTypeHandler Interface
    // -----------------------------------------------------------------------
    virtual void attDef
    (
        const   DTDElementDecl& elemDecl
        , const DTDAttDef&      attDef
        , const bool            ignoring
    );

    virtual void doctypeComment
    (
        const   XMLCh* const    comment
    );

    virtual void doctypeDecl
    (
        const   DTDElementDecl& elemDecl
        , const XMLCh* const    publicId
        , const XMLCh* const    systemId
        , const bool            hasIntSubset
    );

    virtual void doctypePI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void doctypeWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
    );

    virtual void elementDecl
    (
        const   DTDElementDecl& decl
        , const bool            isIgnored
    );

    virtual void endAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void endIntSubset();

    virtual void endExtSubset();

    virtual void entityDecl
    (
        const   DTDEntityDecl&  entityDecl
        , const bool            isPEDecl
        , const bool            isIgnored
    );

    virtual void resetDocType();

    virtual void notationDecl
    (
        const   XMLNotationDecl&    notDecl
        , const bool                isIgnored
    );

    virtual void startAttList
    (
        const   DTDElementDecl& elemDecl
    );

    virtual void startIntSubset();

    virtual void startExtSubset();

    virtual void TextDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLDocumentHandler interface
    // -----------------------------------------------------------------------
    virtual void docCharacters
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void docComment
    (
        const   XMLCh* const    comment
    );

    virtual void docPI
    (
        const   XMLCh* const    target
        , const XMLCh* const    data
    );

    virtual void endDocument();

    virtual void endElement
    (
        const   XMLElementDecl& elemDecl
        , const unsigned int    urlId
        , const bool            isRoot
    );

    virtual void endEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void ignorableWhitespace
    (
        const   XMLCh* const    chars
        , const unsigned int    length
        , const bool            cdataSection
    );

    virtual void resetDocument();

    virtual void startDocument();

    virtual void startElement
    (
        const   XMLElementDecl&         elemDecl
        , const unsigned int            urlId
        , const XMLCh* const            elemPrefix
        , const RefVectorOf<XMLAttr>&   attrList
        , const unsigned int            attrCount
        , const bool                    isEmpty
        , const bool                    isRoot
    );

    virtual void startEntityReference
    (
        const   XMLEntityDecl&  entDecl
    );

    virtual void XMLDecl
    (
        const   XMLCh* const    versionStr
        , const XMLCh* const    encodingStr
        , const XMLCh* const    standaloneStr
        , const XMLCh* const    actualEncodingStr
    );


    // -----------------------------------------------------------------------
    //  Implementation of the XMLErrorReporter interface
    // -----------------------------------------------------------------------
    virtual void error
    (
        const   unsigned int                errCode
        , const XMLCh* const                msgDomain
        , const XMLErrorReporter::ErrTypes  errType
        , const XMLCh* const                errorText
        , const XMLCh* const                systemId
        , const XMLCh* const                publicId
        , const unsigned int                lineNum
        , const unsigned int                colNum
    );

    virtual void resetErrors();


    // -----------------------------------------------------------------------
    //  Implementation of the XMLEntityHandler interface
    // -----------------------------------------------------------------------
    virtual void endInputSource(const InputSource& inputSource);

    virtual bool expandSystemId
    (
        const   XMLCh* const    systemId
        ,       XMLBuffer&      toFill
    );

    virtual void resetEntities();

    virtual InputSource* resolveEntity
    (
        const   XMLCh* const    publicId
        , const XMLCh* const    systemId
    );

    virtual void startInputSource(const InputSource& inputSource);


    // -----------------------------------------------------------------------
    //  Deprecated methods; prefer getValidationScheme() and
    //  setValidationScheme()
    // -----------------------------------------------------------------------
    bool getDoValidation() const;

    void setDoValidation(const bool newState);


protected :
    // -----------------------------------------------------------------------
    //  Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    SAXParser(const SAXParser&);
    void operator=(const SAXParser&);


    // -----------------------------------------------------------------------
    //  Protected data members
    //
    //  fAttrList
    //      A temporary implementation of the basic SAX attribute list
    //      interface. We use this one over and over on each startElement
    //      event to allow SAX-like access to the element attributes.
    //
    //  fDocHandler
    //      The installed SAX doc handler, if any. Null if none.
    //
    //  fDTDHandler
    //      The installed SAX DTD handler, if any. Null if none.
    //
    //  fElemDepth
    //      This is used to track the element nesting depth, so that we can
    //      know when we are inside content. This is so we can ignore char
    //      data outside of content.
    //
    //  fEntityResolver
    //      The installed SAX entity resolver, if any. Null if none.
    //
    //  fErrorHandler
    //      The installed SAX error handler, if any. Null if none.
    //
    //  fAdvDHCount
    //  fAdvDHList
    //  fAdvDHListSize
    //      This is an array of pointers to XMLDocumentHandlers, which is
    //      how we see installed advanced document handlers. There will
    //      usually not be very many at all, so a simple array is used
    //      instead of a collection, for performance. It will grow if needed,
    //      but that is unlikely.
    //
    //      The count is how many handlers are currently installed. The size
    //      is how big the array itself is (for expansion purposes). When
    //      count == size, it is time to expand.
    //
    //  fParseInProgress
    //      This flag is set once a parse starts. It is used to prevent
    //      multiple entrance or reentrance of the parser.
    //
    //  fScanner
    //      The scanner being used by this parser. It is created internally
    //      during construction.
    //
    // -----------------------------------------------------------------------
    VecAttrListImpl         fAttrList;
    DocumentHandler*        fDocHandler;
    DTDHandler*             fDTDHandler;
    unsigned int            fElemDepth;
    EntityResolver*         fEntityResolver;
    ErrorHandler*           fErrorHandler;
    unsigned int            fAdvDHCount;
    XMLDocumentHandler**    fAdvDHList;
    unsigned int            fAdvDHListSize;
    bool                    fParseInProgress;
    XMLScanner*             fScanner;
};


// ---------------------------------------------------------------------------
//  SAXParser: Getter methods
// ---------------------------------------------------------------------------
inline DocumentHandler* SAXParser::getDocumentHandler()
{
    return fDocHandler;
}

inline const DocumentHandler* SAXParser::getDocumentHandler() const
{
    return fDocHandler;
}

inline EntityResolver* SAXParser::getEntityResolver()
{
    return fEntityResolver;
}

inline const EntityResolver* SAXParser::getEntityResolver() const
{
    return fEntityResolver;
}

inline ErrorHandler* SAXParser::getErrorHandler()
{
    return fErrorHandler;
}

inline const ErrorHandler* SAXParser::getErrorHandler() const
{
    return fErrorHandler;
}

inline const XMLScanner& SAXParser::getScanner() const
{
    return *fScanner;
}

#endif
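
The data-member comments in the header describe fAdvDHCount, fAdvDHList, and fAdvDHListSize as a plain array of handler pointers that is expanded when the count reaches the size. The header does not show how installAdvDocHandler() and removeAdvDocHandler() are implemented, so the following is only a self-contained sketch of that kind of bookkeeping. `Handler`, `HandlerList`, the initial capacity of 2, and the doubling growth policy are all assumptions made for illustration and are not taken from Xerces.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for XMLDocumentHandler; only pointer identity matters.
struct Handler {};

// Sketch of a count/size/array trio in the style of
// fAdvDHCount / fAdvDHListSize / fAdvDHList.
class HandlerList
{
public:
    HandlerList() : fCount(0), fSize(2), fList(new Handler*[2]) {}
    ~HandlerList() { delete [] fList; }   // the handlers themselves are not owned

    void install(Handler* const toInstall)
    {
        if (fCount == fSize)              // array is full, so expand first
        {
            const std::size_t newSize = fSize * 2;   // assumed growth policy
            Handler** newList = new Handler*[newSize];
            for (std::size_t i = 0; i < fCount; ++i)
                newList[i] = fList[i];
            delete [] fList;
            fList = newList;
            fSize = newSize;
        }
        fList[fCount++] = toInstall;
    }

    // Returns true if the handler was found and removed; order is preserved.
    bool remove(Handler* const toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fList[i] == toRemove)
            {
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fList[j - 1] = fList[j];
                --fCount;
                return true;
            }
        }
        return false;
    }

    std::size_t count() const    { return fCount; }
    std::size_t capacity() const { return fSize; }

private:
    HandlerList(const HandlerList&);      // unimplemented, as in the header
    void operator=(const HandlerList&);

    std::size_t fCount;
    std::size_t fSize;
    Handler**   fList;
};
```

The boolean result of `remove` mirrors the `bool` return type of removeAdvDocHandler(); a linear scan is reasonable here because, as the header comment notes, very few handlers are usually installed.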

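
The header documents fParseInProgress as a flag that is set once a parse starts and that prevents multiple entrance or reentrance of the parser. What SAXParser actually does when a reentrant call is detected is not visible in this header. The sketch below assumes the call is rejected with an exception; `MiniParser`, `ParseGuard`, and the use of `std::logic_error` are hypothetical choices for the example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical parser whose flag plays the role of fParseInProgress.
class MiniParser
{
private:
    // Sets the flag on construction and clears it on destruction, so the
    // flag is reset on every exit path, including exceptions.
    class ParseGuard
    {
    public:
        explicit ParseGuard(bool& flag) : fFlag(flag) { fFlag = true; }
        ~ParseGuard() { fFlag = false; }
    private:
        ParseGuard(const ParseGuard&);
        void operator=(const ParseGuard&);
        bool& fFlag;
    };

public:
    MiniParser() : fParseInProgress(false) {}

    // 'reenter' simulates a handler callback that calls parse() again
    // while the outer parse is still running.
    void parse(const bool reenter)
    {
        if (fParseInProgress)
            throw std::logic_error("parse already in progress");

        ParseGuard guard(fParseInProgress);
        if (reenter)
            parse(false);                 // rejected by the check above
    }

    bool parseInProgress() const { return fParseInProgress; }

private:
    bool fParseInProgress;
};
```

Clearing the flag in a destructor rather than at the end of `parse` keeps the parser usable after a failed parse, which matters for a parser object that is meant to be reused across documents.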

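
According to the header comments, fElemDepth tracks element nesting depth so that character data outside of content can be ignored. The following is a minimal sketch of that idea under stated assumptions: `DepthFilter` and its member functions are hypothetical, and the real callbacks (docCharacters, startElement, endElement) take the richer argument lists declared in the header.

```cpp
#include <cassert>
#include <string>

// Hypothetical sink showing how a depth counter such as fElemDepth
// can gate character data.
class DepthFilter
{
public:
    DepthFilter() : fElemDepth(0) {}

    void startElement() { ++fElemDepth; }

    void endElement()
    {
        if (fElemDepth > 0)               // guard against unbalanced calls
            --fElemDepth;
    }

    // Character data is kept only while at least one element is open.
    void characters(const std::string& chars)
    {
        if (fElemDepth > 0)
            fText += chars;
    }

    const std::string& text() const { return fText; }
    unsigned int depth() const      { return fElemDepth; }

private:
    unsigned int fElemDepth;
    std::string  fText;
};
```

Text delivered before the first start tag or after the last end tag is dropped, while nested content is accumulated in document order.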
Copyright © 2000 The Apache Software Foundation. All Rights Reserved.