UnicodeSetSpanner

public class UnicodeSetSpanner
extends Object

java.lang.Object
   ↳ android.icu.text.UnicodeSetSpanner


A helper class used to count, replace, and trim CharSequences based on UnicodeSet matches. An instance is immutable (and thus thread-safe) iff the source UnicodeSet is frozen.

Note: Counting, deletion, and replacement all depend on alternating a SpanCondition with its inverse. That is, the code spans, then spans for the inverse, then spans, and so on. For the inverse, the following mapping is used:

  • SIMPLE → NOT_CONTAINED
  • CONTAINED → NOT_CONTAINED
  • NOT_CONTAINED → SIMPLE

These are actually not complete inverses. However, the alternating works because there are no gaps. For example, with [a{ab}{bc}], you get the following behavior when scanning forward:
SIMPLE         xxx[ab]cyyy
CONTAINED      xxx[abc]yyy
NOT_CONTAINED  [xxx]ab[cyyy]

So here is what happens when you alternate:

start          |xxxabcyyy
NOT_CONTAINED  xxx|abcyyy
CONTAINED      xxxabc|yyy
NOT_CONTAINED  xxxabcyyy|

The entire string is traversed.
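The alternation can be modeled in a few lines of plain Java. This is only a sketch, not ICU's implementation: the class AlternatingSpans and its methods are hypothetical, and the set is simplified to the single characters a, b, and c (string elements such as {ab} and {bc} are outside its scope). It records the position reached after each span, starting with the inverse condition as in the trace above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; not part of android.icu.text.
final class AlternatingSpans {
    /**
     * Returns the end of the run starting at 'start' in which every character
     * has the requested membership in 'set' (a string of single characters).
     */
    public static int span(CharSequence s, int start, String set, boolean contained) {
        int i = start;
        while (i < s.length() && (set.indexOf(s.charAt(i)) >= 0) == contained) {
            i++;
        }
        return i;
    }

    /** Alternates not-contained and contained spans, recording each stop. */
    public static List<Integer> boundaries(CharSequence s, String set) {
        List<Integer> result = new ArrayList<>();
        int pos = 0;
        boolean contained = false; // begin with the inverse span
        while (pos < s.length()) {
            pos = span(s, pos, set, contained);
            result.add(pos);
            contained = !contained; // switch to the other condition
        }
        return result;
    }

    public static void main(String[] args) {
        // Stops after "xxx", after "abc", and at the end of the string.
        System.out.println(boundaries("xxxabcyyy", "abc")); // prints [3, 6, 9]
    }
}
```

Because each span starts exactly where the previous one stopped, no character is skipped, which is the "no gaps" property the note relies on.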

Summary

Public constructors

UnicodeSetSpanner(UnicodeSet source)

Create a spanner from a UnicodeSet.

Public methods

int countIn(CharSequence sequence)

Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.

int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)

Returns the number of matching characters found in a character sequence.

int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod)

Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE.

String deleteFrom(CharSequence sequence, UnicodeSet.SpanCondition spanCondition)

Delete all matching spans in sequence, according to the spanCondition.

String deleteFrom(CharSequence sequence)

Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

boolean equals(Object other)

Indicates whether some other object is "equal to" this one.

UnicodeSet getUnicodeSet()

Returns the UnicodeSet used for processing.

int hashCode()

Returns a hash code value for the object.

String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)

Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition.

String replaceFrom(CharSequence sequence, CharSequence replacement)

Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.

String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod)

Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE.

CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption, UnicodeSet.SpanCondition spanCondition)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition.

CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE.

CharSequence trim(CharSequence sequence)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE.

Public constructors

UnicodeSetSpanner

Added in API level 24
public UnicodeSetSpanner (UnicodeSet source)

Create a spanner from a UnicodeSet. For speed and safety, the UnicodeSet should be frozen. However, this class can be used with a non-frozen version to avoid the cost of freezing.

Parameters
source UnicodeSet: the original UnicodeSet

Public methods

countIn

Added in API level 24
public int countIn (CharSequence sequence)

Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: the sequence to count characters in

Returns
int the count. Zero if there are none.

countIn

Added in API level 24
public int countIn (CharSequence sequence, 
                UnicodeSetSpanner.CountMethod countMethod, 
                UnicodeSet.SpanCondition spanCondition)

Returns the number of matching characters found in a character sequence. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: the sequence to count characters in

countMethod UnicodeSetSpanner.CountMethod: whether to treat an entire span as a match, or individual elements as matches

spanCondition UnicodeSet.SpanCondition: the spanCondition to use. SIMPLE or CONTAINED means only count the elements in the span; NOT_CONTAINED is the reverse.
WARNING: when a UnicodeSet contains strings, there may be unexpected behavior in edge cases.

Returns
int the count. Zero if there are none.
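To make the difference between the two counting modes concrete, here is a minimal plain-Java model. It is a sketch under assumptions, not the ICU code: CountSketch is a hypothetical class, its CountMethod enum merely mirrors the names WHOLE_SPAN and MIN_ELEMENTS, the set is given as a string of single characters, and only the matching (SIMPLE/CONTAINED) direction is modeled.

```java
// Hypothetical model; not part of android.icu.text.
final class CountSketch {
    /** Stand-in for UnicodeSetSpanner.CountMethod. */
    enum CountMethod { WHOLE_SPAN, MIN_ELEMENTS }

    /** Counts matches of 'set' (a string of single characters) in 's'. */
    public static int countIn(CharSequence s, String set, CountMethod method) {
        int count = 0;
        int i = 0;
        while (i < s.length()) {
            if (set.indexOf(s.charAt(i)) < 0) {
                i++; // skip a non-matching character
                continue;
            }
            int end = i; // find the end of the matching run
            while (end < s.length() && set.indexOf(s.charAt(end)) >= 0) {
                end++;
            }
            // One per run, or one per matching element inside the run.
            count += (method == CountMethod.WHOLE_SPAN) ? 1 : end - i;
            i = end;
        }
        return count;
    }

    public static void main(String[] args) {
        // Matching runs in "abacatbab" for the set {a, b}: "aba", "a", "bab".
        System.out.println(countIn("abacatbab", "ab", CountMethod.WHOLE_SPAN));   // prints 3
        System.out.println(countIn("abacatbab", "ab", CountMethod.MIN_ELEMENTS)); // prints 7
    }
}
```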

countIn

Added in API level 24
public int countIn (CharSequence sequence, 
                UnicodeSetSpanner.CountMethod countMethod)

Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: the sequence to count characters in

countMethod UnicodeSetSpanner.CountMethod: whether to treat an entire span as a match, or individual elements as matches

Returns
int the count. Zero if there are none.

deleteFrom

Added in API level 24
public String deleteFrom (CharSequence sequence, 
                UnicodeSet.SpanCondition spanCondition)

Delete all matching spans in sequence, according to the spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: the sequence to delete matching spans from.

spanCondition UnicodeSet.SpanCondition: specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching (NOT_CONTAINED)

Returns
String modified string.
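The effect of the spanCondition argument on deletion can be illustrated with a small plain-Java model. This is a sketch, not the ICU implementation: DeleteSketch is a hypothetical class, the set is a string of single characters, and a boolean stands in for the condition (true for SIMPLE or CONTAINED, false for NOT_CONTAINED).

```java
// Hypothetical model; not part of android.icu.text.
final class DeleteSketch {
    /**
     * Removes characters from 's'. When deleteMatching is true, characters
     * in 'set' are removed; when false, characters outside 'set' are removed.
     */
    public static String deleteFrom(CharSequence s, String set, boolean deleteMatching) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean inSet = set.indexOf(c) >= 0;
            if (inSet != deleteMatching) {
                out.append(c); // keep everything that is not selected for deletion
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(deleteFrom("abacatbab", "ab", true));  // prints ct
        System.out.println(deleteFrom("abacatbab", "ab", false)); // prints abaabab
    }
}
```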

deleteFrom

Added in API level 24
public String deleteFrom (CharSequence sequence)

Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: the sequence to delete matching spans from.

Returns
String modified string.

equals

Added in API level 24
public boolean equals (Object other)

Indicates whether some other object is "equal to" this one.

The equals method implements an equivalence relation on non-null object references:

  • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
  • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • For any non-null reference value x, x.equals(null) should return false.

An equivalence relation partitions the elements it operates on into equivalence classes; all the members of an equivalence class are equal to each other. Members of an equivalence class are substitutable for each other, at least for some purposes.

Parameters
other Object: the reference object with which to compare.

Returns
boolean true if this object is the same as the other argument; false otherwise.

getUnicodeSet

Added in API level 24
public UnicodeSet getUnicodeSet ()

Returns the UnicodeSet used for processing. It is frozen iff the original was.

Returns
UnicodeSet the construction set.

hashCode

Added in API level 24
public int hashCode ()

Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

Returns
int a hash code value for this object.

replaceFrom

Added in API level 24
public String replaceFrom (CharSequence sequence, 
                CharSequence replacement, 
                UnicodeSetSpanner.CountMethod countMethod, 
                UnicodeSet.SpanCondition spanCondition)

Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: charsequence to replace matching spans in.

replacement CharSequence: replacement sequence. To delete, use ""

countMethod UnicodeSetSpanner.CountMethod: whether to treat an entire span as a match, or individual elements as matches

spanCondition UnicodeSet.SpanCondition: specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching (NOT_CONTAINED)

Returns
String modified string.
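How countMethod changes the result of a replacement is easiest to see side by side. The following plain-Java model is a sketch under assumptions rather than ICU's code: ReplaceSketch is a hypothetical class, the set is a string of single characters, only the matching direction is modeled, and a boolean stands in for CountMethod (true for whole spans, false for individual elements).

```java
// Hypothetical model; not part of android.icu.text.
final class ReplaceSketch {
    /** Replaces matches of 'set' in 's', once per run or once per element. */
    public static String replaceFrom(CharSequence s, String set,
                                     CharSequence replacement, boolean wholeSpan) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (set.indexOf(c) < 0) {
                out.append(c); // copy non-matching text unchanged
                i++;
                continue;
            }
            int end = i; // find the end of the matching run
            while (end < s.length() && set.indexOf(s.charAt(end)) >= 0) {
                end++;
            }
            int copies = wholeSpan ? 1 : end - i;
            for (int k = 0; k < copies; k++) {
                out.append(replacement);
            }
            i = end;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceFrom("abacatbab", "ab", "-", true));  // prints -c-t-
        System.out.println(replaceFrom("abacatbab", "ab", "-", false)); // prints ---c-t---
        System.out.println(replaceFrom("abacatbab", "ab", "", true));   // prints ct
    }
}
```

With an empty replacement the two modes give the same result, which matches the note that "" can be used to delete.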

replaceFrom

Added in API level 24
public String replaceFrom (CharSequence sequence, 
                CharSequence replacement)

Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: charsequence to replace matching spans in.

replacement CharSequence: replacement sequence. To delete, use ""

Returns
String modified string.

replaceFrom

Added in API level 24
public String replaceFrom (CharSequence sequence, 
                CharSequence replacement, 
                UnicodeSetSpanner.CountMethod countMethod)

Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.

Parameters
sequence CharSequence: charsequence to replace matching spans in.

replacement CharSequence: replacement sequence. To delete, use ""

countMethod UnicodeSetSpanner.CountMethod: whether to treat an entire span as a match, or individual elements as matches

Returns
String modified string.

trim

Added in API level 24
public CharSequence trim (CharSequence sequence, 
                UnicodeSetSpanner.TrimOption trimOption, 
                UnicodeSet.SpanCondition spanCondition)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition. For example:

 new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING, SpanCondition.SIMPLE)
 
... returns "catbab".

Parameters
sequence CharSequence: the sequence to trim

trimOption UnicodeSetSpanner.TrimOption: LEADING, TRAILING, or BOTH

spanCondition UnicodeSet.SpanCondition: SIMPLE, CONTAINED or NOT_CONTAINED

Returns
CharSequence a subsequence
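The three trim options can be compared with a short plain-Java model. It is a sketch, not the ICU implementation: TrimSketch is a hypothetical class, its TrimOption enum only mirrors the names LEADING, BOTH, and TRAILING, the set is a string of single characters, and only the matching (SIMPLE/CONTAINED) direction is modeled.

```java
// Hypothetical model; not part of android.icu.text.
final class TrimSketch {
    /** Stand-in for UnicodeSetSpanner.TrimOption. */
    enum TrimOption { LEADING, BOTH, TRAILING }

    /** Returns a subsequence of 's' without matching characters at the chosen ends. */
    public static CharSequence trim(CharSequence s, String set, TrimOption option) {
        int start = 0;
        int end = s.length();
        if (option != TrimOption.TRAILING) {
            // Advance past matching characters at the start.
            while (start < end && set.indexOf(s.charAt(start)) >= 0) {
                start++;
            }
        }
        if (option != TrimOption.LEADING) {
            // Back up over matching characters at the end.
            while (end > start && set.indexOf(s.charAt(end - 1)) >= 0) {
                end--;
            }
        }
        return s.subSequence(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trim("abacatbab", "ab", TrimOption.LEADING));  // prints catbab
        System.out.println(trim("abacatbab", "ab", TrimOption.TRAILING)); // prints abacat
        System.out.println(trim("abacatbab", "ab", TrimOption.BOTH));     // prints cat
    }
}
```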

trim

Added in API level 24
public CharSequence trim (CharSequence sequence, 
                UnicodeSetSpanner.TrimOption trimOption)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE. For example:

 new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING)
 
... returns "catbab".

Parameters
sequence CharSequence: the sequence to trim

trimOption UnicodeSetSpanner.TrimOption: LEADING, TRAILING, or BOTH

Returns
CharSequence a subsequence

trim

Added in API level 24
public CharSequence trim (CharSequence sequence)

Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE. For example:

 new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab")
 
... returns "cat".

Parameters
sequence CharSequence: the sequence to trim

Returns
CharSequence a subsequence