These functions and structures form the internal API for document parsing.
Functions | |
Bool | TY_❪CheckNodeIntegrity❫ (Node *node) |
Recursively performs a node integrity check after parsing an HTML or XML document. | |
void | TY_❪CoerceNode❫ (TidyDocImpl *doc, Node *node, TidyTagId tid, Bool obsolete, Bool expected) |
Transforms a given node to another element, for example, from a p to a br. | |
Node * | TY_❪DiscardElement❫ (TidyDocImpl *doc, Node *element) |
Remove node from markup tree and discard it. | |
Node * | TY_❪DropEmptyElements❫ (TidyDocImpl *doc, Node *node) |
Trims a tree of empty elements recursively, returning the next node. | |
void | TY_❪InsertNodeAfterElement❫ (Node *element, Node *node) |
Insert node into markup tree after element. | |
void | TY_❪InsertNodeAtEnd❫ (Node *element, Node *node) |
Insert node into markup tree as the last element of content of element. | |
void | TY_❪InsertNodeAtStart❫ (Node *element, Node *node) |
Insert node into markup tree as the first element of content of element. | |
void | TY_❪InsertNodeBeforeElement❫ (Node *element, Node *node) |
Insert node into markup tree before element. | |
Bool | TY_❪IsBlank❫ (Lexer *lexer, Node *node) |
Indicates whether or not a text node is blank, meaning that it is either empty or consists of a single space. | |
Bool | TY_❪IsJavaScript❫ (Node *node) |
Indicates whether or not a node is declared as containing JavaScript code. | |
Bool | TY_❪IsNewNode❫ (Node *node) |
Used to check if a node uses CM_NEW, which determines how attributes without values should be printed. | |
void | TY_❪ParseDocument❫ (TidyDocImpl *doc) |
Parses a document after lexing using the HTML parser. | |
void | TY_❪ParseXMLDocument❫ (TidyDocImpl *doc) |
Parses a document after lexing using the XML parser. | |
Node * | TY_❪RemoveNode❫ (Node *node) |
Extract a node and its children from a markup tree. | |
Bool | TY_❪TextNodeEndWithSpace❫ (Lexer *lexer, Node *node) |
Indicates whether or not a text node ends with a space or newline. | |
Node * | TY_❪TrimEmptyElement❫ (TidyDocImpl *doc, Node *element) |
Trims a single, empty element, returning the next node. | |
Bool | TY_❪XMLPreserveWhiteSpace❫ (TidyDocImpl *doc, Node *element) |
Indicates whether or not whitespace is to be preserved in XHTML/XML documents. | |
Bool TY_❪CheckNodeIntegrity❫ (Node *node)
Recursively performs a node integrity check after parsing an HTML or XML document.
node | The root node for the integrity check. |
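As a rough illustration of what a recursive integrity check can verify, the sketch below walks a simplified tree and confirms that sibling, parent, and child links agree with each other. `SketchNode` and `sketch_check_integrity` are hypothetical names invented for this example; the struct is a stripped-down stand-in for Tidy's `Node`, and the invariants shown are assumptions about what such a check would cover, not a copy of the library's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Tidy's Node: link fields only. */
typedef struct SketchNode {
    struct SketchNode *parent;   /* enclosing element, NULL for the root */
    struct SketchNode *prev;     /* previous sibling, NULL if first */
    struct SketchNode *next;     /* next sibling, NULL if last */
    struct SketchNode *content;  /* first child, NULL if none */
    struct SketchNode *last;     /* last child, NULL if none */
} SketchNode;

/* Returns true when every link reachable from node is mutually consistent. */
static bool sketch_check_integrity(const SketchNode *node)
{
    const SketchNode *child;

    assert(node != NULL);

    /* Sibling links must point back at each other. */
    if (node->prev != NULL && node->prev->next != node)
        return false;
    if (node->next != NULL &&
        (node->next == node || node->next->prev != node))
        return false;

    /* A first or last sibling must be recorded as such by its parent. */
    if (node->parent != NULL) {
        if (node->prev == NULL && node->parent->content != node)
            return false;
        if (node->next == NULL && node->parent->last != node)
            return false;
    }

    /* Every child must name this node as its parent and be sound itself. */
    for (child = node->content; child != NULL; child = child->next)
        if (child->parent != node || !sketch_check_integrity(child))
            return false;

    return true;
}
```

Calling the function on the root node covers the whole tree, because each child is checked recursively.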
void TY_❪CoerceNode❫ (TidyDocImpl *doc, Node *node, TidyTagId tid, Bool obsolete, Bool expected)
Transforms a given node to another element, for example, from a p to a br.
doc | The document which the node belongs to. |
node | The node to coerce. |
tid | The tag type to coerce the node into. |
obsolete | If the old node was obsolete, a report will be generated. |
expected | If the old node was not expected to be found in this particular location, a report will be generated. |
Node* TY_❪DiscardElement❫ (TidyDocImpl *doc, Node *element)
Remove node from markup tree and discard it.
doc | The Tidy document from which to discard the node. |
element | The node to discard. |
Node* TY_❪DropEmptyElements❫ (TidyDocImpl *doc, Node *node)
Trims a tree of empty elements recursively, returning the next node.
doc | The Tidy document. |
node | The element to trim. |
void TY_❪InsertNodeAfterElement❫ (Node *element, Node *node)
Insert node into markup tree after element.
element | The node after which the node is inserted. |
node | The node to insert. |
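Inserting after an existing element comes down to updating four sibling links and, when the element was the last child, the parent's record of its last child. The sketch below shows this on a simplified node type. `SketchNode` and `sketch_insert_after` are hypothetical names used only for this example, and the field layout is an assumption rather than Tidy's actual `Node` definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Tidy's Node: link fields only. */
typedef struct SketchNode {
    struct SketchNode *parent;   /* enclosing element */
    struct SketchNode *prev;     /* previous sibling */
    struct SketchNode *next;     /* next sibling */
    struct SketchNode *content;  /* first child */
    struct SketchNode *last;     /* last child */
} SketchNode;

/* Links node into the tree as the sibling immediately after element. */
static void sketch_insert_after(SketchNode *element, SketchNode *node)
{
    assert(element != NULL && node != NULL);

    node->parent = element->parent;
    node->prev = element;
    node->next = element->next;

    if (element->next != NULL)
        element->next->prev = node;      /* splice in front of old successor */
    else if (element->parent != NULL)
        element->parent->last = node;    /* element was the last child */

    element->next = node;
}
```

The sketch assumes `node` is currently detached; a node that is still linked elsewhere would need to be extracted from its old position first.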
void TY_❪InsertNodeAtEnd❫ (Node *element, Node *node)
Insert node into markup tree as the last element of content of element.
element | The new destination node. |
node | The node to insert. |
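Appending to an element's content differs from sibling insertion in that the destination is the parent itself, so the empty-content case has to be handled. A minimal sketch follows, again using a hypothetical `SketchNode` type and a hypothetical `sketch_insert_at_end` function that are assumptions for illustration and not Tidy's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Tidy's Node: link fields only. */
typedef struct SketchNode {
    struct SketchNode *parent;   /* enclosing element */
    struct SketchNode *prev;     /* previous sibling */
    struct SketchNode *next;     /* next sibling */
    struct SketchNode *content;  /* first child */
    struct SketchNode *last;     /* last child */
} SketchNode;

/* Appends node as the final child of element. */
static void sketch_insert_at_end(SketchNode *element, SketchNode *node)
{
    assert(element != NULL && node != NULL);

    node->parent = element;
    node->prev = element->last;
    node->next = NULL;

    if (element->last != NULL)
        element->last->next = node;   /* chain after the current last child */
    else
        element->content = node;      /* element had no children yet */

    element->last = node;
}
```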
void TY_❪InsertNodeAtStart❫ (Node *element, Node *node)
Insert node into markup tree as the first element of content of element.
element | The new destination node. |
node | The node to insert. |
void TY_❪InsertNodeBeforeElement❫ (Node *element, Node *node)
Insert node into markup tree before element.
element | The node before which the node is inserted. |
node | The node to insert. |
Bool TY_❪IsBlank❫ (Lexer *lexer, Node *node)
Indicates whether or not a text node is blank, meaning that it is either empty or consists of a single space.
lexer | The lexer used to lex the document. |
node | The node to test. |
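The blank test needs the lexer because a text node's characters live in the lexer's buffer rather than in the node. The sketch below assumes a text node is a half-open `[start, end)` range of offsets into that buffer; `SketchLexer`, `SketchTextNode`, and `sketch_is_blank` are hypothetical names for this example, and the sketch also assumes the caller already knows the node is a text node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: the lexer owns the character buffer, and a
   text node is the slice [start, end) of that buffer. */
typedef struct { const char *lexbuf; } SketchLexer;
typedef struct { size_t start; size_t end; } SketchTextNode;

/* True when the text is empty or is exactly one space character. */
static bool sketch_is_blank(const SketchLexer *lexer,
                            const SketchTextNode *node)
{
    assert(lexer != NULL && node != NULL);

    if (node->end == node->start)
        return true;                            /* zero length */

    return node->end == node->start + 1         /* one character... */
        && lexer->lexbuf[node->start] == ' ';   /* ...which is a space */
}
```

Under this definition a run of two or more spaces is not blank, which matches the "nothing, or a single space" wording above.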
Bool TY_❪IsJavaScript❫ (Node *node)
Indicates whether or not a node is declared as containing JavaScript code.
node | The node to test. |
Bool TY_❪IsNewNode❫ (Node *node)
Used to check if a node uses CM_NEW, which determines how attributes without values should be printed.
This was introduced to deal with user-defined tags, e.g., ColdFusion.
node | The node to check. |
void TY_❪ParseDocument❫ (TidyDocImpl *doc)
Parses a document after lexing using the HTML parser.
It begins by properly configuring the overall HTML structure, and subsequently processes all remaining nodes. HTML is the root node.
doc | The Tidy document. |
void TY_❪ParseXMLDocument❫ (TidyDocImpl *doc)
Parses a document after lexing using the XML parser.
doc | The Tidy document. |
Node* TY_❪RemoveNode❫ (Node *node)
Extract a node and its children from a markup tree.
node | The node to remove. |
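Extraction is the inverse of insertion: the node's neighbours and parent are re-linked around it, while its own children stay attached so the subtree can be reinserted elsewhere. The sketch below illustrates this on a hypothetical `SketchNode` type; `sketch_remove_node` and the field names are assumptions for this example, not Tidy's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Tidy's Node: link fields only. */
typedef struct SketchNode {
    struct SketchNode *parent;   /* enclosing element */
    struct SketchNode *prev;     /* previous sibling */
    struct SketchNode *next;     /* next sibling */
    struct SketchNode *content;  /* first child */
    struct SketchNode *last;     /* last child */
} SketchNode;

/* Unlinks node (with its subtree) from its tree and returns it. */
static SketchNode *sketch_remove_node(SketchNode *node)
{
    assert(node != NULL);

    /* Bridge the gap between the neighbouring siblings. */
    if (node->prev != NULL)
        node->prev->next = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;

    /* Fix the parent's first/last child pointers if they named node. */
    if (node->parent != NULL) {
        if (node->parent->content == node)
            node->parent->content = node->next;
        if (node->parent->last == node)
            node->parent->last = node->prev;
    }

    /* Detach node itself; content and last are left untouched. */
    node->parent = node->prev = node->next = NULL;
    return node;
}
```

Returning the node lets a caller chain extraction with a later insertion elsewhere in the tree.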
Bool TY_❪TextNodeEndWithSpace❫ (Lexer *lexer, Node *node)
Indicates whether or not a text node ends with a space or newline.
Note: the implementation of this function is found in pprint.c for some reason.
lexer | A reference to the lexer used to lex the document. |
node | The node to check. |
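A trailing-whitespace test of this kind can be sketched by inspecting the final byte of the node's text in the lexer buffer. The example assumes a text node is a half-open `[start, end)` range of offsets into a UTF-8 buffer; `SketchLexer`, `SketchTextNode`, and `sketch_ends_with_space` are hypothetical names. Looking only at the last byte is a simplification chosen for this sketch: it is safe for UTF-8 because space and newline are single-byte characters that never occur inside a multi-byte sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: the lexer owns the character buffer, and a
   text node is the slice [start, end) of that buffer. */
typedef struct { const char *lexbuf; } SketchLexer;
typedef struct { size_t start; size_t end; } SketchTextNode;

/* True when the node's text is non-empty and ends in ' ' or '\n'. */
static bool sketch_ends_with_space(const SketchLexer *lexer,
                                   const SketchTextNode *node)
{
    char last;

    assert(lexer != NULL && node != NULL);

    if (node->end <= node->start)
        return false;                 /* empty text has no last character */

    last = lexer->lexbuf[node->end - 1];
    return last == ' ' || last == '\n';
}
```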
Node* TY_❪TrimEmptyElement❫ (TidyDocImpl *doc, Node *element)
Trims a single, empty element, returning the next node.
doc | The Tidy document. |
element | The element to trim. |
Bool TY_❪XMLPreserveWhiteSpace❫ (TidyDocImpl *doc, Node *element)
Indicates whether or not whitespace is to be preserved in XHTML/XML documents.
doc | The Tidy document. |
element | The node to test. |