Contents
Tokenizer
Tokenizer.format()
Tokenizer.reader()
Tokenizer: Interface to Stanza tokenizers.

Args:
    lang (str): conventional language identifier.
    dir (str): directory for caching models.
    verbose (bool): print download progress.
Tokenizer.format(): Convert sentences to CoNLL format.
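To illustrate what converting sentences to CoNLL format can look like, here is a minimal standalone sketch. It is not the library's implementation: the helper name `to_conllu` is hypothetical, the input is assumed to be a list of sentences that are each a list of token strings, and the output is assumed to follow the 10-column CoNLL-U layout with only the ID and FORM columns filled in.

```python
def to_conllu(sentences):
    """Render tokenized sentences as CoNLL-U text (hypothetical helper).

    `sentences` is assumed to be a list of sentences, each a list of
    token strings. Only ID and FORM are filled; the other eight
    columns are left as the "_" placeholder.
    """
    out = []
    for sentence in sentences:
        for i, form in enumerate(sentence, start=1):
            # Columns: ID, FORM, then 8 unspecified fields.
            out.append("\t".join([str(i), form] + ["_"] * 8))
        # An empty line closes each sentence, including the last one.
        out.append("")
    return "\n".join(out) + "\n" if out else ""


print(to_conllu([["Hello", "world", "."], ["Bye", "."]]), end="")
```

Token IDs restart at 1 in every sentence, and the blank line is what a downstream reader uses to find sentence boundaries.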
Tokenizer.reader(): Reading function that returns a generator of CoNLL-U sentences.
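The generator pattern described here can be sketched without the library. The function below is an assumption-laden illustration, not the actual `reader()` code: the name `read_conllu` is hypothetical, it takes any iterable of text lines, skips `#` comment lines, and yields each sentence as a list of tab-split token rows.

```python
import io


def read_conllu(stream):
    """Yield CoNLL-U sentences one at a time (hypothetical helper).

    `stream` is any iterable of text lines, such as an open file.
    Each yielded sentence is a list of token rows, and each row is
    the list of tab-separated column values.
    """
    sentence = []
    for line in stream:
        line = line.rstrip("\r\n")
        if not line.strip():
            # A blank line marks the end of the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            sentence.append(line.split("\t"))
    # Emit a final sentence that is not followed by a blank line.
    if sentence:
        yield sentence


sample = (
    "# text = Hi there\n"
    "1\tHi\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\tthere\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "\n"
    "1\tBye\t_\t_\t_\t_\t_\t_\t_\t_\n"
)
for sent in read_conllu(io.StringIO(sample)):
    print([row[1] for row in sent])
```

Because the function is a generator, sentences are produced lazily, so a large file never has to be held in memory all at once.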