A Consistent Interface to Tokenize Natural Language Text



Documentation for package ‘tokenizers’ version 0.1.4

Help Pages

tokenizers-package            Tokenizers
basic-tokenizers              Basic tokenizers
ngram-tokenizers              N-gram tokenizers
stopwords                     Stopword lists
tokenizers                    Tokenizers
tokenize_characters           Basic tokenizers
tokenize_character_shingles   Character shingle tokenizers
tokenize_lines                Basic tokenizers
tokenize_ngrams               N-gram tokenizers
tokenize_paragraphs           Basic tokenizers
tokenize_regex                Basic tokenizers
tokenize_sentences            Basic tokenizers
tokenize_skip_ngrams          N-gram tokenizers
tokenize_words                Basic tokenizers
tokenize_word_stems           Word stem tokenizer
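
As a quick orientation to the help pages above, the sketch below shows one call per family of tokenizers. It is illustrative rather than part of the package index: the example text is invented, and argument names (n, k, the language code passed to stopwords()) should be verified against this version's help pages. Each function takes a character vector and returns a list with one element of tokens per input document.

    library(tokenizers)

    text <- "The quick brown fox jumps over the lazy dog. It barked twice."

    # Basic tokenizers: words are lowercased and stripped of
    # punctuation by default; see ?basic-tokenizers for options.
    tokenize_words(text)
    tokenize_sentences(text)
    tokenize_characters(text)

    # N-gram tokenizers: contiguous word n-grams, and skip n-grams
    # that allow up to k skipped words within each n-gram.
    tokenize_ngrams(text, n = 2)
    tokenize_skip_ngrams(text, n = 2, k = 1)

    # Character shingles: overlapping character n-grams from each document.
    tokenize_character_shingles(text, n = 3)

    # Word stems (Porter stemmer via SnowballC); see ?tokenize_word_stems.
    tokenize_word_stems(text)

    # Stopword lists for supported languages ("en" assumed here);
    # see ?stopwords for the codes available in this version.
    head(stopwords("en"))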