java - How to count unique words in a text file?
I have implemented code to count the number of chars, words, lines, and bytes in a text file. How can I count the dictionary size, that is, the number of different words used in the file? Also, how can I implement an iterator that iterates over the letters only (ignoring whitespace)?
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class WC {
    public static void main(String[] args) throws IOException {
        // counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;
        File file = new File("sample.txt");
        try (Scanner scanner = new Scanner(new BufferedReader(new FileReader(file)))) {
            while (scanner.hasNextLine()) {
                // trim first, so leading whitespace does not yield an empty first token
                String tmpStr = scanner.nextLine().trim();
                if (!tmpStr.isEmpty()) {
                    // chars are counted with all whitespace removed
                    String replaceAll = tmpStr.replaceAll("\\s+", "");
                    charsCount += replaceAll.length();
                    wordsCount += tmpStr.split("\\s+").length;
                }
                ++linesCount; // blank lines still count as lines
            }
            System.out.println("# of chars: " + charsCount);
            System.out.println("# of words: " + wordsCount);
            System.out.println("# of lines: " + linesCount);
            System.out.println("# of bytes: " + file.length());
        }
    }
}
To get the unique words and their count:
1. Split each line read from the file into a String array.
2. Store the contents of the String array in a HashSet.
3. Repeat steps 1 and 2 until the end of the file.
4. The HashSet now holds the unique words, and its size is their count.
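The steps above could be sketched roughly as follows. This is only one possible reading of them: the class name UniqueWords and the method uniqueWords are made up for illustration, and lowercasing each word (so that "The" and "the" count once) is an assumption the question does not state. The sample text in main is also invented.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

public class UniqueWords {

    // Hypothetical helper: collects every distinct whitespace-separated word.
    static Set<String> uniqueWords(Scanner scanner) {
        Set<String> words = new HashSet<>();
        while (scanner.hasNextLine()) {                  // step 3: repeat until end of input
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue;                                // blank lines contribute no words
            }
            for (String word : line.split("\\s+")) {     // step 1: split the line
                // step 2: a HashSet silently ignores duplicates.
                // Lowercasing is an assumption; drop it for case-sensitive counting.
                words.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Invented sample text; a Scanner over a file works the same way.
        try (Scanner scanner = new Scanner("the cat and the hat\nThe end\n")) {
            Set<String> words = uniqueWords(scanner);
            System.out.println("# of unique words: " + words.size()); // step 4
        }
    }
}
```

To use it with the file from the question, the Scanner built there over sample.txt can be passed to uniqueWords instead. Punctuation is not stripped, so "end" and "end." would count as two different words.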
I prefer posting the logic and pseudocode so that the OP can learn by solving the posted problem.
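The question also asks for an iterator over the letters that ignores whitespace, which the steps above do not cover. A minimal sketch, assuming the text is already in memory as a CharSequence and that "letters" means every non-whitespace char (the same thing charsCount counts in the question); the class name LetterIterator is hypothetical:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class LetterIterator implements Iterator<Character> {
    private final CharSequence text;
    private int pos = 0;

    public LetterIterator(CharSequence text) {
        this.text = text;
        skipWhitespace(); // position on the first non-whitespace char, if any
    }

    // Advances pos past any run of whitespace.
    private void skipWhitespace() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
    }

    @Override
    public boolean hasNext() {
        return pos < text.length();
    }

    @Override
    public Character next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        char c = text.charAt(pos++);
        skipWhitespace(); // keep the invariant: pos is at a non-whitespace char or the end
        return c;
    }

    public static void main(String[] args) {
        StringBuilder seen = new StringBuilder();
        Iterator<Character> it = new LetterIterator(" a b\tc\nd ");
        while (it.hasNext()) {
            seen.append(it.next());
        }
        System.out.println(seen); // prints "abcd"
    }
}
```

This works on Java chars rather than Unicode code points, so characters outside the Basic Multilingual Plane would come out as two values. To skip digits and punctuation as well, the whitespace test could be replaced with a check based on Character.isLetter.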