TagSoupDocumentParser

public final class TagSoupDocumentParser


Uses TagSoup to parse HTML into Documents.

Summary

Public constructors

TagSoupDocumentParser()

Public methods

static TagSoupDocumentParser
newInstance()

Document
parse(String html)

Parses the given html into a Document.

Public constructors

TagSoupDocumentParser

public TagSoupDocumentParser()

Public methods

newInstance

public static TagSoupDocumentParser newInstance()
Throws
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException

parse

public Document parse(String html)

Parses the given html into a Document.

Throws
java.io.IOException
org.xml.sax.SAXException
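
Example

A minimal sketch of inspecting the result of parse(String html), assuming the returned Document is an org.w3c.dom.Document (this page does not state the package). So that the snippet runs without TagSoup on the classpath, the JDK's DocumentBuilder stands in for the parser, which means the input here must be well-formed markup. The class DocumentInspectionSketch and the helpers parseWellFormed and firstText are hypothetical names used for illustration only; they are not part of this API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DocumentInspectionSketch {

    // Hypothetical stand-in for TagSoupDocumentParser.parse(String html).
    // Assumption: the real method returns an org.w3c.dom.Document; unlike
    // TagSoup, this JDK parser rejects markup that is not well-formed.
    static Document parseWellFormed(String html)
            throws IOException, SAXException, ParserConfigurationException {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(html)));
    }

    // Returns the text content of the first element with the given tag name,
    // or null when the document contains no such element.
    static String firstText(Document doc, String tagName) {
        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String html =
                "<html><head><title>Hello</title></head>"
                        + "<body><p>World</p></body></html>";
        Document doc = parseWellFormed(html);
        System.out.println(firstText(doc, "title")); // prints "Hello"
        System.out.println(firstText(doc, "p"));     // prints "World"
    }
}
```

With the real class, the call would presumably be TagSoupDocumentParser.newInstance().parse(html), with the caller handling the checked exceptions listed under Throws above (IOException and SAXException for parse; SAXNotRecognizedException and SAXNotSupportedException for newInstance).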