nlp - Part-of-speech tagging for unknown and known words


What is the difference between part-of-speech tagging of unknown words and part-of-speech tagging of known words? Is there a tool that can predict the part-of-speech tags of words?

One common way of handling out-of-vocabulary words is to replace every word with a low occurrence count (e.g., frequency < 3) in the training corpus with the token *rare*, so that the tagger learns how to tag rare words. In the testing phase, treat every word that is not in the tagger's vocabulary as *rare*.
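The rare-word replacement can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names replace_rare and map_unknown are hypothetical, the training data is assumed to be a list of sentences made of (word, tag) pairs, and the threshold of 3 is taken from the example frequency above.

```python
from collections import Counter

RARE = "*rare*"  # placeholder token, as named in the text above


def replace_rare(tagged_sents, min_count=3):
    """Replace words seen fewer than min_count times with RARE.

    tagged_sents is assumed to be a list of sentences, each a list of
    (word, tag) pairs. Returns the rewritten sentences and the vocabulary
    of words that were frequent enough to keep.
    """
    counts = Counter(word for sent in tagged_sents for word, _ in sent)
    vocab = {word for word, count in counts.items() if count >= min_count}
    replaced = [
        [(word if word in vocab else RARE, tag) for word, tag in sent]
        for sent in tagged_sents
    ]
    return replaced, vocab


def map_unknown(words, vocab):
    """At test time, map every word outside the vocabulary to RARE."""
    return [word if word in vocab else RARE for word in words]
```

A tagger trained on the rewritten sentences then sees *rare* as an ordinary word, and test sentences are passed through map_unknown before tagging.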

A simpler way is to tag every out-of-vocabulary word with the majority tag. The following code, using the NLTK toolkit, tags every unseen word as 'NN'.

import nltk
tagger = nltk.UnigramTagger(trainingcorpus, backoff=nltk.DefaultTagger('NN'))
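Rather than hard-coding the default tag, the majority tag could be computed from the training data. The sketch below assumes the training corpus is a list of sentences made of (word, tag) pairs; the function name majority_tag is hypothetical.

```python
from collections import Counter


def majority_tag(tagged_sents):
    """Return the most frequent tag in a list of tagged sentences.

    tagged_sents is assumed to be a non-empty list of sentences, each a
    list of (word, tag) pairs.
    """
    tag_counts = Counter(tag for sent in tagged_sents for _, tag in sent)
    return tag_counts.most_common(1)[0][0]
```

The returned tag can then be passed to the default tagger used as the backoff, in place of the fixed noun tag.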

