stri_unique {stringi} | R Documentation |
This function returns a character vector like str, but with duplicate elements removed.
stri_unique(str, ..., opts_collator = NULL)
str | a character vector |
... | additional settings for opts_collator |
opts_collator | a named list with ICU Collator's options as generated with stri_opts_collator |
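The two ways of supplying collator settings can be compared with a minimal sketch. It assumes that settings given through `...` are forwarded to the collator options, as the argument description suggests; the variable names are illustrative only.

```r
library(stringi)

x <- c("gro\u00df", "GROSS", "Gro\u00df", "Gross")

# Collator settings passed directly through `...`
via_dots <- stri_unique(x, strength = 1)

# The same settings passed as a list built with stri_opts_collator()
via_opts <- stri_unique(x, opts_collator = stri_opts_collator(strength = 1))

# Under the assumption above, both calls should give the same result
identical(via_dots, via_opts)
```

Passing settings through `...` is the shorter form; building the list explicitly may be more convenient when the same options are reused across several calls.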
As usual in stringi, no attributes are copied.
Unlike unique, this function tests for canonical equivalence of strings (and not whether the strings are just bytewise equal). Such an operation is locale-dependent. Hence, stri_unique is significantly slower (but much better suited for natural language processing) than its base R counterpart.
See also stri_duplicated for indicating non-unique elements.
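The difference between bytewise and canonical comparison, and the relationship to stri_duplicated, can be sketched as follows. This is an illustration under default collator settings, not a specification; the sample strings and the `kept` variable are assumptions made for the example.

```r
library(stringi)

# "caf\u00e9" uses a precomposed e-acute; "cafe\u0301" uses "e" plus a
# combining acute accent. The two are canonically equivalent but differ bytewise.
x <- c("caf\u00e9", "cafe\u0301", "cafe")

# Base R compares the strings as stored, so all three count as distinct
length(unique(x))

# stri_unique treats the two accented spellings as the same string
length(stri_unique(x))

# stri_duplicated flags the repeated elements; dropping them keeps
# as many elements as stri_unique returns
kept <- x[!stri_duplicated(x)]
length(kept)
```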
Returns a character vector.
Collation - ICU User Guide, http://userguide.icu-project.org/collation
Other locale_sensitive: %s!==%, %s!=%, %s<=%, %s<%, %s===%, %s==%, %s>=%, %s>%, %stri!==%, %stri!=%, %stri<=%, %stri<%, %stri===%, %stri==%, %stri>=%, %stri>%; stri_cmp, stri_cmp_eq, stri_cmp_equiv, stri_cmp_ge, stri_cmp_gt, stri_cmp_le, stri_cmp_lt, stri_cmp_neq, stri_cmp_nequiv, stri_compare; stri_count_boundaries, stri_count_words; stri_duplicated, stri_duplicated_any; stri_enc_detect2; stri_extract_all_boundaries, stri_extract_all_words, stri_extract_first_boundaries, stri_extract_first_words, stri_extract_last_boundaries, stri_extract_last_words; stri_locate_all_boundaries, stri_locate_all_words, stri_locate_first_boundaries, stri_locate_first_words, stri_locate_last_boundaries, stri_locate_last_words; stri_opts_collator; stri_order, stri_sort; stri_split_boundaries; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stri_wrap; stringi-locale; stringi-search-boundaries; stringi-search-coll
# normalized and non-Unicode-normalized version of the same code point:
stri_unique(c("\u0105", stri_trans_nfkd("\u0105")))
unique(c("\u0105", stri_trans_nfkd("\u0105")))
stri_unique(c("gro\u00df", "GROSS", "Gro\u00df", "Gross"), strength=1)