public class HtmlParserUtil
extends Object
Constructor and Description |
---|
HtmlParserUtil() |
Modifier and Type | Method and Description |
---|---|
static String | extractText(String html) Extracts the raw text from the HTML input, compressing its whitespace and removing all attributes, scripts, and styles. |
static String | findAttributeValue(Predicate<Function<String,String>> findValuePredicate, Function<Function<String,String>,String> returnValueFunction, String html, String startTagName) |
static String | render(String html) Renders the HTML content into text. |
public static String extractText(String html)
For example, raw text returned by this method can be stored in a search index.
Parameters:
html - the HTML text
Returns:
the raw text, or null if the HTML input is null
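The behavior described for extractText can be illustrated with a small stand-in. This is a minimal sketch and not this class's implementation: the class name ExtractTextSketch and its regular expressions are assumptions made for illustration, and regular expressions are not a robust way to parse arbitrary HTML. It only shows the documented contract: scripts and styles are dropped, tags (and their attributes) are removed, whitespace is compressed, and a null input yields null.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the library's code.
final class ExtractTextSketch {

    // Whole <script>...</script> and <style>...</style> elements, case-insensitive.
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");

    // Any remaining tag, including its attributes.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    // Runs of whitespace to be compressed into a single space.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String extractText(String html) {
        if (html == null) {
            return null; // mirrors the documented null contract
        }
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }
}
```

With this sketch, an input such as `<p class="x">Hello   <b>world</b></p><script>var a = 1;</script>` reduces to `Hello world`, which is the kind of string that could be stored in a search index. Callers should still check for null before indexing.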
public static String findAttributeValue(Predicate<Function<String,String>> findValuePredicate, Function<Function<String,String>,String> returnValueFunction, String html, String startTagName)
public static String render(String html)
Using the default settings, the output complies with the Text/Plain; Format=Flowed (DelSp=No) protocol described in RFC 3676.
Parameters:
html - the HTML text
Returns:
the rendered text, or null if the HTML text is null
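To make the Format=Flowed reference more concrete, the following is a minimal sketch of how one paragraph can be laid out under RFC 3676. It is not how render is implemented; the class FlowedSketch, the method flow, and the width parameter are hypothetical names chosen for illustration. It relies on two rules from the RFC: a line ending in a space marks a soft line break (with DelSp=No that space remains part of the text when lines are rejoined), and lines beginning with a space, ">", or "From " are space-stuffed with one leading space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of RFC 3676 line flowing; not the library's code.
final class FlowedSketch {

    // Splits one paragraph into flowed lines of roughly `width` characters.
    // Every line except the last ends with a space (soft break).
    static List<String> flow(String paragraph, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : paragraph.trim().split("\\s+")) {
            // +2 reserves room for the joining space and the trailing soft-break space.
            if (current.length() > 0 && current.length() + word.length() + 2 > width) {
                lines.add(stuff(current + " "));
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        lines.add(stuff(current.toString())); // last line: no trailing space (hard break)
        return lines;
    }

    // Space-stuffing: protects lines that a flowed-text reader would otherwise misread.
    // Note that stuffing can push a line one character past `width` in this sketch.
    private static String stuff(String line) {
        if (line.startsWith(" ") || line.startsWith(">") || line.startsWith("From ")) {
            return " " + line;
        }
        return line;
    }
}
```

For example, `FlowedSketch.flow("aaa bbb ccc", 8)` yields the two lines `"aaa bbb "` and `"ccc"`. RFC 3676 recommends generating lines of 78 characters or fewer, so 78 would be the usual choice for width.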