HTML and XML Parsers.
Tidy's HTML parser corrects many conditions and enforces certain user preferences during the parsing process. The XML parser produces a tree of nodes useful to Tidy but also suitable for use in other XML processing applications.
See tidy.h for the complete license.
Data Structures
struct | TidyParserMemory |
This typedef represents the state of a parser on entry and on exit.
struct | TidyParserStack |
This typedef represents a stack of parser states.
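The two structures above describe a last-in, first-out store of parser states. The following is a minimal sketch of that idea, not Tidy's implementation: `SketchMemory`, `SketchStack`, and every `sketch_*` function are hypothetical names, and the two integer fields stand in for the "identity" and "mode" fields mentioned on this page (the real `TidyParserMemory` layout is not shown here).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a parser memory record.
   Assumption: plain ints replace the real Parser* and GetTokenMode types. */
typedef struct {
    int identity;   /* which parser saved this record */
    int mode;       /* token mode to restore on re-entry */
} SketchMemory;

/* Hypothetical stand-in for the parser stack: a growable array. */
typedef struct {
    SketchMemory *content;  /* storage for the records */
    size_t size;            /* allocated capacity */
    size_t count;           /* number of records in use */
} SketchStack;

/* Allocate the stack; returns 1 on success, 0 on allocation failure. */
static int sketch_init(SketchStack *s, size_t capacity) {
    s->content = malloc(capacity * sizeof *s->content);
    s->size = s->content ? capacity : 0;
    s->count = 0;
    return s->content != NULL;
}

/* Release the storage and reset the bookkeeping. */
static void sketch_free(SketchStack *s) {
    free(s->content);
    s->content = NULL;
    s->size = s->count = 0;
}

static int sketch_is_empty(const SketchStack *s) {
    return s->count == 0;
}

/* Push a record, doubling the capacity when full; returns 1 on success. */
static int sketch_push(SketchStack *s, SketchMemory data) {
    if (s->count == s->size) {
        size_t new_size = s->size ? s->size * 2 : 4;
        SketchMemory *grown = realloc(s->content, new_size * sizeof *grown);
        if (!grown)
            return 0;
        s->content = grown;
        s->size = new_size;
    }
    s->content[s->count++] = data;
    return 1;
}

/* Return the top record without removing it. */
static SketchMemory sketch_peek(const SketchStack *s) {
    assert(!sketch_is_empty(s));
    return s->content[s->count - 1];
}

/* Remove and return the top record. */
static SketchMemory sketch_pop(SketchStack *s) {
    assert(!sketch_is_empty(s));
    return s->content[--s->count];
}
```

A caller would push a record before handing control to another parser and pop it on return to resume where it left off. Peeking lets the caller inspect the saved state without consuming it.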
Functions
TY_PRIVATE Bool | TY_(CheckNodeIntegrity) (Node *node) |
Recursively performs a node integrity check after parsing an HTML or XML document.
TY_PRIVATE void | TY_(CoerceNode) (TidyDocImpl *doc, Node *node, TidyTagId tid, Bool obsolete, Bool expected) |
Transforms a given node into another element, for example, from a p to a br.
TY_PRIVATE Node * | TY_(DiscardElement) (TidyDocImpl *doc, Node *element) |
Remove node from markup tree and discard it.
TY_PRIVATE Node * | TY_(DropEmptyElements) (TidyDocImpl *doc, Node *node) |
Trims a tree of empty elements recursively, returning the next node.
void | TY_(FreeParserStack) (TidyDocImpl *doc) |
Frees the parser's stack when done.
void | TY_(InitParserStack) (TidyDocImpl *doc) |
Allocates and initializes the parser's stack.
TY_PRIVATE void | TY_(InsertNodeAfterElement) (Node *element, Node *node) |
Insert node into markup tree after element.
TY_PRIVATE void | TY_(InsertNodeAtEnd) (Node *element, Node *node) |
Insert node into markup tree as the last element of content of element.
TY_PRIVATE void | TY_(InsertNodeAtStart) (Node *element, Node *node) |
Insert node into markup tree as the first element of content of element.
TY_PRIVATE void | TY_(InsertNodeBeforeElement) (Node *element, Node *node) |
Insert node into markup tree before element.
TY_PRIVATE Bool | TY_(IsBlank) (Lexer *lexer, Node *node) |
Indicates whether or not a text node is blank, meaning that it is empty or consists of a single space.
Bool | TY_(isEmptyParserStack) (TidyDocImpl *doc) |
Indicates whether or not the stack is empty.
TY_PRIVATE Bool | TY_(IsJavaScript) (Node *node) |
Indicates whether or not a node is declared as containing JavaScript code.
TY_PRIVATE Bool | TY_(IsNewNode) (Node *node) |
Checks whether a node uses CM_NEW, which determines how attributes without values should be printed.
TY_PRIVATE void | TY_(ParseDocument) (TidyDocImpl *doc) |
Parses a document after lexing using the HTML parser.
TY_PRIVATE void | TY_(ParseXMLDocument) (TidyDocImpl *doc) |
Parses a document after lexing using the XML parser.
Parser * | TY_(peekMemoryIdentity) (TidyDocImpl *doc) |
Peek at the parser memory "identity" field.
GetTokenMode | TY_(peekMemoryMode) (TidyDocImpl *doc) |
Peek at the parser memory "mode" field.
TidyParserMemory | TY_(peekMemory) (TidyDocImpl *doc) |
Peek at the parser memory.
TidyParserMemory | TY_(popMemory) (TidyDocImpl *doc) |
Pop a parser memory off the stack.
void | TY_(pushMemory) (TidyDocImpl *doc, TidyParserMemory data) |
Push the parser memory to the stack.
TY_PRIVATE Node * | TY_(RemoveNode) (Node *node) |
Extract a node and its children from a markup tree.
TY_PRIVATE Bool | TY_(TextNodeEndWithSpace) (Lexer *lexer, Node *node) |
Indicates whether or not a text node ends with a space or newline.
TY_PRIVATE Node * | TY_(TrimEmptyElement) (TidyDocImpl *doc, Node *element) |
Trims a single, empty element, returning the next node.
TY_PRIVATE Bool | TY_(XMLPreserveWhiteSpace) (TidyDocImpl *doc, Node *element) |
Indicates whether or not whitespace is to be preserved in XHTML/XML documents.
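The insertion and removal functions listed above all relink pointers in a tree whose children form a doubly linked sibling list. The sketch below shows that pattern under stated assumptions: `SketchNode` and the `sketch_*` functions are hypothetical, and the five pointer fields are a simplification chosen for illustration, not a copy of Tidy's `Node`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified tree node. Assumption: each node knows its
   parent, both siblings, and its first and last child. */
typedef struct SketchNode SketchNode;
struct SketchNode {
    SketchNode *parent;
    SketchNode *prev;     /* previous sibling */
    SketchNode *next;     /* next sibling */
    SketchNode *content;  /* first child */
    SketchNode *last;     /* last child */
};

/* Append node as the last child of element. */
static void sketch_insert_at_end(SketchNode *element, SketchNode *node) {
    assert(element && node);
    node->parent = element;
    node->prev = element->last;
    node->next = NULL;
    if (element->last)
        element->last->next = node;
    else
        element->content = node;   /* element had no children yet */
    element->last = node;
}

/* Insert node as the sibling immediately after element. */
static void sketch_insert_after(SketchNode *element, SketchNode *node) {
    assert(element && node);
    SketchNode *parent = element->parent;
    node->parent = parent;
    node->prev = element;
    node->next = element->next;
    if (element->next)
        element->next->prev = node;
    else if (parent)
        parent->last = node;       /* node is now the last child */
    element->next = node;
}

/* Unlink node (its children stay attached to it) and return it. */
static SketchNode *sketch_remove(SketchNode *node) {
    assert(node);
    if (node->prev)
        node->prev->next = node->next;
    else if (node->parent)
        node->parent->content = node->next;
    if (node->next)
        node->next->prev = node->prev;
    else if (node->parent)
        node->parent->last = node->prev;
    node->parent = node->prev = node->next = NULL;
    return node;
}
```

Because removal leaves the node's own `content` and `last` pointers untouched, the extracted node carries its subtree with it and can be reinserted elsewhere or discarded as a unit.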