stri_opts_brkiter {stringi} | R Documentation |
A convenience function to tune the ICU BreakIterator's behavior in some text boundary analysis functions; see stringi-search-boundaries.
stri_opts_brkiter(type, locale, skip_word_none, skip_word_number, skip_word_letter, skip_word_kana, skip_word_ideo, skip_line_soft, skip_line_hard, skip_sentence_term, skip_sentence_sep, ...)
type |
single string; break iterator type, one of |
locale |
single string, |
skip_word_none |
logical; perform no action for "words" that do not fit into any other categories |
skip_word_number |
logical; perform no action for words that appear to be numbers |
skip_word_letter |
logical; perform no action for words that contain letters, excluding hiragana, katakana, or ideographic characters |
skip_word_kana |
logical; perform no action for words containing kana characters |
skip_word_ideo |
logical; perform no action for words containing ideographic characters |
skip_line_soft |
logical; perform no action for soft line breaks, i.e. positions at which a line break is acceptable but not required |
skip_line_hard |
logical; perform no action for hard, or mandatory line breaks |
skip_sentence_term |
logical; perform no action for sentences
ending with a sentence terminator (" |
skip_sentence_sep |
logical; perform no action for sentences that do not contain an ending sentence terminator, but are ended by a hard separator or end of input |
... |
any other arguments to this function are purposely ignored |
The skip_* family of settings may be used to prevent performing any special actions on particular types of text boundaries, e.g. in case of the stri_locate_all_boundaries and stri_split_boundaries functions.
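The effect of the skip_* settings can be illustrated with a minimal sketch. It assumes the stringi package is installed and that the boundary functions accept the settings through their opts_brkiter argument; the sample sentence is made up for illustration.

```r
library(stringi)

# Hypothetical sample text, chosen only for illustration.
x <- "The year 2015 was fine."

# Word iterator with no skip_* settings: every segment between two
# word boundaries is returned, including spaces and punctuation.
all_tokens <- stri_split_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word")
)[[1]]

# Skip "words" that fit no other category (e.g. whitespace, punctuation).
words <- stri_split_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = TRUE)
)[[1]]

# Additionally skip segments that appear to be numbers.
no_numbers <- stri_split_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word",
                                      skip_word_none = TRUE,
                                      skip_word_number = TRUE)
)[[1]]

# The same settings apply when locating boundaries instead of splitting.
loc <- stri_locate_all_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = TRUE)
)[[1]]

print(all_tokens)
print(words)
print(no_numbers)
print(loc)
```

Comparing all_tokens with words shows which segments the iterator classifies as belonging to no other category, which can differ between locales and ICU versions.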
Returns a named list object.
Omitted skip_* values act as if they had been set to FALSE.
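As a rough check of the two statements above, the following sketch inspects the returned object and compares an omitted skip_* value with an explicit FALSE. It assumes the stringi package is installed; the locale "en_US" and the sample text are illustrative choices, not values prescribed by this page.

```r
library(stringi)

# Build a settings object; "en_US" is an assumed, illustrative locale.
opts <- stri_opts_brkiter(type = "sentence", locale = "en_US")

# The result is an ordinary named list that can be inspected directly.
print(is.list(opts))
print(names(opts))

# Hypothetical sample text.
x <- "One two. Three four."

# Omitting skip_word_none versus setting it to FALSE explicitly.
omitted  <- stri_split_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word")
)
explicit <- stri_split_boundaries(
  x, opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = FALSE)
)

# Both calls are expected to produce the same split.
print(identical(omitted, explicit))
```

Because the settings are a plain list, they can be built once and reused across several calls to the boundary analysis functions.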
ubrk.h File Reference – ICU4C API Documentation, http://icu-project.org/apiref/icu4c/ubrk_8h.html
Boundary Analysis – ICU User Guide, http://userguide.icu-project.org/boundaryanalysis
Other text_boundaries: stri_count_boundaries, stri_count_words; stri_extract_all_boundaries, stri_extract_all_words, stri_extract_first_boundaries, stri_extract_first_words, stri_extract_last_boundaries, stri_extract_last_words; stri_locate_all_boundaries, stri_locate_all_words, stri_locate_first_boundaries, stri_locate_first_words, stri_locate_last_boundaries, stri_locate_last_words; stri_split_boundaries; stri_split_lines, stri_split_lines1; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stri_wrap; stringi-search-boundaries; stringi-search