nltk - Dictionary not sorting correctly in Python
My code should output the 10 most frequent words in the corpus. However, it outputs 10 seemingly random words.
import nltk
from nltk.corpus import brown
import operator

brown_tagged_sentences = brown.tagged_sents(categories='news')
fd = nltk.FreqDist(brown.words(categories='news'))
sorted_fd = dict(sorted(fd.items(), key=operator.itemgetter(1), reverse=True))
print(sorted_fd)
most_freq_words = list(sorted_fd)[:10]
for word in most_freq_words:
    print(word, ':', sorted_fd[word])
The output, shown below, is wrong:
rae : 1
discharge : 1
ignition : 1
contendere : 1
done : 24
meaning : 4
ashore : 1
francesca : 1
vietnamese : 1
data : 4
Kindly help.
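The sorted order is most likely lost when the sorted list is passed back through dict(): before Python 3.7, a plain dict does not guarantee any particular order. NLTK's FreqDist class can return its contents directly in descending order of frequency via its most_common() method: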
fd = nltk.FreqDist(brown.words(categories='news'))
for w, f in fd.most_common(10):
    print(w + ":", f)
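To try the idea without downloading the Brown corpus, here is a minimal sketch using collections.Counter, which FreqDist subclasses in NLTK 3.x. The toy word list and the variable names (words, counts, top3, sorted_pairs) are made up for illustration and are not from the original post.

```python
from collections import Counter

# Hypothetical toy corpus standing in for brown.words(categories='news').
words = "the cat sat on the mat and the dog sat by the door".split()

counts = Counter(words)

# Option 1: most_common(n) returns (word, count) pairs, highest count first.
top3 = counts.most_common(3)

# Option 2: keep the sorted result as a list of pairs instead of converting
# it back to a dict, so the ordering cannot be lost on older Python versions.
sorted_pairs = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for word, freq in top3:
    print(word + ":", freq)
```

Either option should put "the" (4 occurrences) first and "sat" (2 occurrences) second; the order among words with equal counts is not something to rely on.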