HTML Tidy  5.8.0
The HTACG Tidy HTML Project
Document Tree

Detailed Description

A parsed (and optionally repaired) document is represented by Tidy as a tree, much like a W3C DOM.

This tree may be traversed using the functions documented here. The following snippet gives a basic idea of how they can be used.

#include <assert.h>
#include <stdio.h>
#include <tidy.h>

/* Recursively print the type (or element name) of every descendant of tnod. */
void dumpNode( TidyNode tnod, int indent ) {
    TidyNode child;
    for ( child = tidyGetChild(tnod); child; child = tidyGetNext(child) ) {
        ctmbstr name;
        switch ( tidyNodeGetType(child) ) {
        case TidyNode_Root:    name = "Root"; break;
        case TidyNode_DocType: name = "DOCTYPE"; break;
        case TidyNode_Comment: name = "Comment"; break;
        case TidyNode_ProcIns: name = "Processing Instruction"; break;
        case TidyNode_Text:    name = "Text"; break;
        case TidyNode_CDATA:   name = "CDATA"; break;
        case TidyNode_Section: name = "XML Section"; break;
        case TidyNode_Asp:     name = "ASP"; break;
        case TidyNode_Jste:    name = "JSTE"; break;
        case TidyNode_Php:     name = "PHP"; break;
        case TidyNode_XmlDecl: name = "XML Declaration"; break;
        default:
            /* Start, End and Start/End tags: use the element name. */
            name = tidyNodeGetName( child );
            break;
        }
        assert( name != NULL );
        /* Width and precision both equal indent, so this prints indent spaces. */
        printf( "%*.*sNode: %s\n", indent, indent, " ", name );
        dumpNode( child, indent + 4 );
    }
}

/* Dump the whole document, starting at the root node. */
void dumpDoc( TidyDoc tdoc ) {
    dumpNode( tidyGetRoot(tdoc), 0 );
}

/* Dump only the contents of the BODY element. */
void dumpBody( TidyDoc tdoc ) {
    dumpNode( tidyGetBody(tdoc), 0 );
}

Node types (TidyNodeType, defined in tidyenum.h):
TidyNode_Root: Root.
TidyNode_DocType: DOCTYPE.
TidyNode_Comment: Comment.
TidyNode_ProcIns: Processing Instruction.
TidyNode_Text: Text.
TidyNode_Start: Start Tag.
TidyNode_End: End Tag.
TidyNode_StartEnd: Start/End (empty) Tag.
TidyNode_CDATA: Unparsed Text.
TidyNode_Section: XML Section.
TidyNode_Asp: ASP Source.
TidyNode_Jste: JSTE Source.
TidyNode_Php: PHP Source.
TidyNode_XmlDecl: XML Declaration.

ctmbstr is defined as const tmbchar * (tidyplatform.h).

Nodes for Document Sections

TidyNode TIDY_CALL tidyGetRoot (TidyDoc tdoc)
 Get the root node.

TidyNode TIDY_CALL tidyGetHtml (TidyDoc tdoc)
 Get the HTML node.

TidyNode TIDY_CALL tidyGetHead (TidyDoc tdoc)
 Get the HEAD node.

TidyNode TIDY_CALL tidyGetBody (TidyDoc tdoc)
 Get the BODY node.

Relative Nodes

TidyNode TIDY_CALL tidyGetParent (TidyNode tnod)
 Get the parent of the indicated node.

TidyNode TIDY_CALL tidyGetChild (TidyNode tnod)
 Get the child of the indicated node.

TidyNode TIDY_CALL tidyGetNext (TidyNode tnod)
 Get the next sibling node.

TidyNode TIDY_CALL tidyGetPrev (TidyNode tnod)
 Get the previous sibling node.

Miscellaneous Node Functions

TidyNode TIDY_CALL tidyDiscardElement (TidyDoc tdoc, TidyNode tnod)
 Remove the indicated node.

Node Attribute Functions

TidyAttr TIDY_CALL tidyAttrFirst (TidyNode tnod)
 Get the first attribute.

TidyAttr TIDY_CALL tidyAttrNext (TidyAttr tattr)
 Get the next attribute.

ctmbstr TIDY_CALL tidyAttrName (TidyAttr tattr)
 Get the name of a TidyAttr instance.

ctmbstr TIDY_CALL tidyAttrValue (TidyAttr tattr)
 Get the value of a TidyAttr instance.

void TIDY_CALL tidyAttrDiscard (TidyDoc itdoc, TidyNode tnod, TidyAttr tattr)
 Discard an attribute.

TidyAttrId TIDY_CALL tidyAttrGetId (TidyAttr tattr)
 Get the attribute ID given a tidy attribute.

Bool TIDY_CALL tidyAttrIsEvent (TidyAttr tattr)
 Indicates whether or not a given attribute is an event attribute.

TidyAttr TIDY_CALL tidyAttrGetById (TidyNode tnod, TidyAttrId attId)
 Get an instance of TidyAttr by specifying an attribute ID.

Additional Node Interrogation

TidyNodeType TIDY_CALL tidyNodeGetType (TidyNode tnod)
 Get the type of node.

ctmbstr TIDY_CALL tidyNodeGetName (TidyNode tnod)
 Get the name of the node.

Bool TIDY_CALL tidyNodeIsText (TidyNode tnod)
 Indicates whether or not a node is a text node.

Bool TIDY_CALL tidyNodeIsProp (TidyDoc tdoc, TidyNode tnod)
 Indicates whether or not the node is a proprietary type.

Bool TIDY_CALL tidyNodeIsHeader (TidyNode tnod)
 Indicates whether or not a node represents an HTML header element, such as h1, h2, etc.

Bool TIDY_CALL tidyNodeHasText (TidyDoc tdoc, TidyNode tnod)
 Indicates whether or not the node has text.

Bool TIDY_CALL tidyNodeGetText (TidyDoc tdoc, TidyNode tnod, TidyBuffer *buf)
 Gets the text of a node and places it into the given TidyBuffer.

Bool TIDY_CALL tidyNodeGetValue (TidyDoc tdoc, TidyNode tnod, TidyBuffer *buf)
 Get the value of the node.

TidyTagId TIDY_CALL tidyNodeGetId (TidyNode tnod)
 Get the tag ID of the node.

uint TIDY_CALL tidyNodeLine (TidyNode tnod)
 Get the line number where the node occurs.

uint TIDY_CALL tidyNodeColumn (TidyNode tnod)
 Get the column location of the node.

Function Documentation

◆ tidyAttrDiscard()

void TIDY_CALL tidyAttrDiscard ( TidyDoc  itdoc, TidyNode  tnod, TidyAttr  tattr )

Discard an attribute.

Parameters
itdoc: The tidy document from which to discard the attribute.
tnod: The node from which to discard the attribute.
tattr: The attribute to discard.

◆ tidyAttrFirst()

TidyAttr TIDY_CALL tidyAttrFirst ( TidyNode  tnod)

Get the first attribute.

Parameters
tnod: The node for which to get attributes.
Returns
Returns an instance of TidyAttr.

◆ tidyAttrGetById()

TidyAttr TIDY_CALL tidyAttrGetById ( TidyNode  tnod, TidyAttrId  attId )

Get an instance of TidyAttr by specifying an attribute ID.

Returns
Returns a TidyAttr instance.
Parameters
tnod: The node to query.
attId: The attribute ID to find.

◆ tidyAttrGetId()

TidyAttrId TIDY_CALL tidyAttrGetId ( TidyAttr  tattr)

Get the attribute ID given a tidy attribute.

Parameters
tattr: The attribute to query.
Returns
Returns the TidyAttrId of the given attribute.

◆ tidyAttrIsEvent()

Bool TIDY_CALL tidyAttrIsEvent ( TidyAttr  tattr)

Indicates whether or not a given attribute is an event attribute.

Parameters
tattr: The attribute to query.
Returns
Returns a bool indicating whether or not the attribute is an event.

◆ tidyAttrName()

ctmbstr TIDY_CALL tidyAttrName ( TidyAttr  tattr)

Get the name of a TidyAttr instance.

Parameters
tattr: The tidy attribute to query.
Returns
Returns a string indicating the name of the attribute.

◆ tidyAttrNext()

TidyAttr TIDY_CALL tidyAttrNext ( TidyAttr  tattr)

Get the next attribute.

Parameters
tattr: The current attribute, so the next one can be returned.
Returns
Returns an instance of TidyAttr.

◆ tidyAttrValue()

ctmbstr TIDY_CALL tidyAttrValue ( TidyAttr  tattr)

Get the value of a TidyAttr instance.

Parameters
tattr: The tidy attribute to query.
Returns
Returns a string indicating the value of the attribute.

◆ tidyDiscardElement()

TidyNode TIDY_CALL tidyDiscardElement ( TidyDoc  tdoc, TidyNode  tnod )

Remove the indicated node.

Returns
Returns the next tidy node.
Parameters
tdoc: The tidy document from which to remove the node.
tnod: The node to remove.

◆ tidyGetBody()

TidyNode TIDY_CALL tidyGetBody ( TidyDoc  tdoc)

Get the BODY node.

Parameters
tdoc: The document to query.
Returns
Returns a tidy node.

◆ tidyGetChild()

TidyNode TIDY_CALL tidyGetChild ( TidyNode  tnod)

Get the child of the indicated node.

Parameters
tnod: The node to query.
Returns
Returns a tidy node.

◆ tidyGetHead()

TidyNode TIDY_CALL tidyGetHead ( TidyDoc  tdoc)

Get the HEAD node.

Parameters
tdoc: The document to query.
Returns
Returns a tidy node.

◆ tidyGetHtml()

TidyNode TIDY_CALL tidyGetHtml ( TidyDoc  tdoc)

Get the HTML node.

Parameters
tdoc: The document to query.
Returns
Returns a tidy node.

◆ tidyGetNext()

TidyNode TIDY_CALL tidyGetNext ( TidyNode  tnod)

Get the next sibling node.

Parameters
tnod: The node to query.
Returns
Returns a tidy node.

◆ tidyGetParent()

TidyNode TIDY_CALL tidyGetParent ( TidyNode  tnod)

Get the parent of the indicated node.

Parameters
tnod: The node to query.
Returns
Returns a tidy node.

◆ tidyGetPrev()

TidyNode TIDY_CALL tidyGetPrev ( TidyNode  tnod)

Get the previous sibling node.

Parameters
tnod: The node to query.
Returns
Returns a tidy node.

◆ tidyGetRoot()

TidyNode TIDY_CALL tidyGetRoot ( TidyDoc  tdoc)

Get the root node.

Parameters
tdoc: The document to query.
Returns
Returns a tidy node.

◆ tidyNodeColumn()

uint TIDY_CALL tidyNodeColumn ( TidyNode  tnod)

Get the column location of the node.

Parameters
tnod: The node to query.
Returns
Returns the column location of the node.

◆ tidyNodeGetId()

TidyTagId TIDY_CALL tidyNodeGetId ( TidyNode  tnod)

Get the tag ID of the node.

Parameters
tnod: The node to query.
Returns
Returns the tag ID of the node as TidyTagId.

◆ tidyNodeGetName()

ctmbstr TIDY_CALL tidyNodeGetName ( TidyNode  tnod)

Get the name of the node.

Parameters
tnod: The node to query.
Returns
Returns a string indicating the name of the node.

◆ tidyNodeGetText()

Bool TIDY_CALL tidyNodeGetText ( TidyDoc  tdoc, TidyNode  tnod, TidyBuffer *  buf )

Gets the text of a node and places it into the given TidyBuffer.

The text will be terminated with a TidyNewline. If you want the raw UTF-8 stream, see tidyNodeGetValue().

Returns
Returns a bool indicating success or not.
Parameters
tdoc: The document to query.
tnod: The node to query.
[out] buf: A TidyBuffer used to receive the node's text.

◆ tidyNodeGetType()

TidyNodeType TIDY_CALL tidyNodeGetType ( TidyNode  tnod)

Get the type of node.

Parameters
tnod: The node to query.
Returns
Returns the type of node as TidyNodeType.

◆ tidyNodeGetValue()

Bool TIDY_CALL tidyNodeGetValue ( TidyDoc  tdoc, TidyNode  tnod, TidyBuffer *  buf )

Get the value of the node.

This copies the unescaped value of this node into the given TidyBuffer as UTF-8.

Returns
Returns a bool indicating success or not.
Parameters
tdoc: The document to query.
tnod: The node to query.
[out] buf: A TidyBuffer used to receive the node's value.

◆ tidyNodeHasText()

Bool TIDY_CALL tidyNodeHasText ( TidyDoc  tdoc, TidyNode  tnod )

Indicates whether or not the node has text.

Returns
Returns a bool indicating whether or not the node has text.
Parameters
tdoc: The document to query.
tnod: The node to query.

◆ tidyNodeIsHeader()

Bool TIDY_CALL tidyNodeIsHeader ( TidyNode  tnod)

Indicates whether or not a node represents an HTML header element, such as h1, h2, etc.

Parameters
tnod: The node to query.
Returns
Returns a bool indicating whether or not the node is an HTML header.

◆ tidyNodeIsProp()

Bool TIDY_CALL tidyNodeIsProp ( TidyDoc  tdoc, TidyNode  tnod )

Indicates whether or not the node is a proprietary type.

Returns
Returns a bool indicating whether or not the node is a proprietary type.
Parameters
tdoc: The document to query.
tnod: The node to query.

◆ tidyNodeIsText()

Bool TIDY_CALL tidyNodeIsText ( TidyNode  tnod)

Indicates whether or not a node is a text node.

Parameters
tnod: The node to query.
Returns
Returns a bool indicating whether or not the node is a text node.

◆ tidyNodeLine()

uint TIDY_CALL tidyNodeLine ( TidyNode  tnod)

Get the line number where the node occurs.

Parameters
tnod: The node to query.
Returns
Returns the line number.