-
Advanced WordCounter help
hi,
I've been working on an application that reads a text file and prints the number of characters, words and lines in it. Once that's done, the application must also output the number of occurrences of each word in the file and the number of lines each word appears on:
e.g.
the quick brown fox jumped over the
lazy dog
This would give:
2 occurrences of "the"
1 line contains "the"
I'm new to Java and I've managed to get the number of lines reported, but I'm not sure how to get the rest done. I've explored using the StringTokenizer class and the String.indexOf method to get my results, but I'm finding it very difficult to code. If anyone has a chance to take a look, here's the code I've got so far:
Code:
import java.io.*;
import java.util.*;
class WordCounter {
    // Main
    public static void main (String[] args) {
        WordCounter t = new WordCounter();
        t.fileRead();
    }
    // Read the file and output
    void fileRead() {
        String record = null;
        int numLines = 0;
        int numWords = 0;
        int numChars = 0;
        try {
            FileReader fr = new FileReader("test.txt");
            BufferedReader br = new BufferedReader(fr);
            record = new String();
            while ((record = br.readLine()) != null) {
                numLines++;
            }
            // Output values
            System.out.println("Number of lines:" + numLines);
            System.out.println("Number of words:" + numWords);
            System.out.println("Number of chars:" + numChars);
        } catch (IOException e) {
            // Catch possible io errors from readLine()
            System.out.println("IOException error!");
            e.printStackTrace();
        }
    } // End of fileRead
} // End of class
Thanks if anyone can help me with this it's been annoying me now for a long time.
Chris.
Last edited by jarvio678; 08-11-2005 at 08:43 AM.
-
Sorry, I don't see the StringTokenizer and String.indexOf() code.
Please post the code that shows how you are trying to use them, a copy of the errors you are getting and your questions.
Both StringTokenizer and String.indexOf() can be used in this project. You'll also want a way to keep track of the words and how often each one is used. Look at Hashtable or HashMap. You'll store the words as keys and the counts as values.
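As a rough sketch of that idea (the class name WordFrequency, the method countWords and the sample lines are made up for illustration, and this is only one way to do it), a HashMap keyed by word could be updated like this:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class WordFrequency {
    // Hypothetical helper: counts how often each whitespace-separated
    // token appears in the given lines. Words are the keys, counts the values.
    static Map<String, Integer> countWords(String[] lines) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                Integer old = counts.get(word);
                // First sighting starts at 1, otherwise add one to the stored count
                counts.put(word, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = { "the quick brown fox jumped over the", "lazy dog" };
        Map<String, Integer> counts = countWords(lines);
        System.out.println("occurrences of \"the\": " + counts.get("the"));
    }
}
```

In the real program the lines would come from br.readLine() instead of an array. Note that this treats "The" and "the" as different words; lower-casing each token first would merge them if that is what you want.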
-
 Originally Posted by jarvio678
...
Thanks if anyone can help me with this it's been annoying me now for a long time.
Chris.
Try this:
Code:
while ((record = br.readLine()) != null) {
    numLines++;
    // split the line into whitespace-separated tokens and count each one as a word
    StringTokenizer tokens = new StringTokenizer(record);
    while (tokens.hasMoreTokens()) {
        tokens.nextToken();
        numWords++;
    }
    // remove all spaces so that they are not counted as characters
    String lineWithoutSpaces = record.replaceAll(" ", "");
    // add the length of lineWithoutSpaces to the running total in numChars
    numChars += lineWithoutSpaces.length();
}
-
Ok here's the code and my errors:
Code:
import java.io.*;
import java.util.*;
class WordCounter {
// Main
public static void main (String[] args) {
WordCounter t = new WordCounter();
t.fileRead();
}
// Read the file and output
void fileRead() {
String record = null;
int numLines = 0;
int numWords = 0;
int numChars = 0;
try {
FileReader fr = new FileReader("test.txt");
BufferedReader br = new BufferedReader(fr);
record = new String();
while ((record = br.readLine()) != null) {
numLines++;
// create a StringTokenizer
StringTokenizer tokens = new StringTokenizer(record);
while(tokens.hasMoreTokens()) {
tokens.nextToken();
numWords++;
}
// remove all spaces and create an array of char's of each line
String lineWithoutSpaces = record.replaceAll(" ", "");
// count the lenght of the lineWithoutSpaces to the total of numChars
numChars += lineWithoutSpaces.length();
}
}
// Output values
System.out.println("Number of lines:" + numLines);
System.out.println("Number of words:" + numWords);
System.out.println("Number of chars:" + numChars);
} catch (IOException e) {
// Catch possible io errors from readLine()
System.out.println("IOException error!");
e.printStackTrace();
}
} // End of fileRead
} // End of class
Errors:
C:\Documents and Settings\Mike\My Documents\WordCounter.java:23: 'try' without 'catch' or 'finally'
try {
^
C:\Documents and Settings\Mike\My Documents\WordCounter.java:55: illegal start of type
} catch (IOException e) {
^
C:\Documents and Settings\Mike\My Documents\WordCounter.java:62: <identifier> expected
^
C:\Documents and Settings\Mike\My Documents\WordCounter.java:65: 'class' or 'interface' expected
} // End of class
^
C:\Documents and Settings\Mike\My Documents\WordCounter.java:69: 'class' or 'interface' expected
^
5 errors
Tool completed with exit code 1
-
Looks like one of those annoying mismatched {} problems. Check that all { have a matching }. Some IDEs or intelligent editors can help.
It's amazing how a missing or extra } or ) can generate so many errors. The compiler keeps trying to make sense of your program and gets really confused. Insert or delete the offending } or ) and all will be better.
-
Code:
import java.io.*;
import java.util.*;
class WordCounter {
    // Main
    public static void main (String[] args) {
        WordCounter t = new WordCounter();
        t.fileRead();
    }
    // Read the file and output
    void fileRead() {
        String record = null;
        int numLines = 0;
        int numWords = 0;
        int numChars = 0;
        try {
            FileReader fr = new FileReader("test.txt");
            BufferedReader br = new BufferedReader(fr);
            while ((record = br.readLine()) != null) {
                numLines++;
                // split the line into whitespace-separated tokens and count each one as a word
                StringTokenizer tokens = new StringTokenizer(record);
                while (tokens.hasMoreTokens()) {
                    tokens.nextToken();
                    numWords++;
                }
                // remove all spaces so that they are not counted as characters
                String lineWithoutSpaces = record.replaceAll(" ", "");
                // add the length of lineWithoutSpaces to the running total in numChars
                numChars += lineWithoutSpaces.length();
            }
            // Release the file once everything has been read
            br.close();
            // Output values
            System.out.println("Number of lines:" + numLines);
            System.out.println("Number of words:" + numWords);
            System.out.println("Number of chars:" + numChars);
        }
        catch (IOException e) {
            // Catch possible io errors from readLine()
            System.out.println("IOException error!");
            e.printStackTrace();
        }
    }
} // class WordCounter
-
nice that works lovely 
Thanks a lot for getting it to work up to that part.
How would I now go about finding the number of occurrences of each word in the text file, and how many lines each word appears on?
Thanks again,
Chris.
-
May I also ask which compilers you guys use, and whether they can be downloaded?
Thanks,
Chris.
-
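This code counts the characters, words and lines, provided the file contains only letters and no non-alphabetic characters such as ., ?, / etc. Note that a token is counted as a word only if it is at least two word characters long, so one-letter words such as "a" are skipped.
Here is the code.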
import java.util.*;
import java.io.*;
public class WordCounter
{
    public static void main(String args[]) throws IOException
    {
        int word = 0;
        int character = 0;
        int line = 0;
        StringTokenizer st;
        String s;
        String RegularExpression = "\\w{2,}";
        int t;
        String Line;
        BufferedReader inFile = new BufferedReader (new FileReader ("Word.txt"));
        while (( Line = inFile.readLine() ) != null)
        {
            st = new StringTokenizer ( Line );
            line++;
            while ( st.hasMoreTokens () )
            {
                s = st.nextToken ();
                if ( s.matches(RegularExpression) == true )
                {
                    word++;
                } // end of if body
                for ( int i = 0; i < s.length ( ); i++ )
                {
                    t = (int) s.charAt( i );
                    if ( ( t >= 65 && t <= 90 ) || ( t >= 97 && t <= 122 ) )
                    {
                        character++;
                    } // end of if body
                } // end of for loop
            } //end of inner while loop
        } //end of outer while loop
        System.out.println("Words : " + word);
        System.out.println("");
        System.out.println("Characters : " + character);
        System.out.println("");
        System.out.println("Lines : " + line);
        System.out.println("");
    } // end of main
} // end of class WordCounter
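To cope with punctuation such as . , ? or /, one option might be to split each line on anything that is not a letter, rather than relying on StringTokenizer's whitespace splitting. A minimal sketch, assuming plain ASCII letters only; the class name LetterWords and the method splitWords are made up for illustration:

```java
class LetterWords {
    // Hypothetical helper: splits a line into words, treating every
    // character that is not an ASCII letter as a separator.
    static String[] splitWords(String line) {
        // Drop leading separators first, otherwise split() would
        // return an empty string as the first element.
        String trimmed = line.replaceAll("^[^A-Za-z]+", "");
        if (trimmed.length() == 0) {
            return new String[0];
        }
        // Trailing empty strings are discarded by split() itself.
        return trimmed.split("[^A-Za-z]+");
    }

    public static void main(String[] args) {
        for (String w : splitWords("Hello, world? Yes/no.")) {
            System.out.println(w);
        }
    }
}
```

Be aware that this also breaks words at apostrophes and hyphens (so "don't" becomes "don" and "t") and drops digits entirely, which may or may not be what the assignment wants.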
-
ah thanks for that 
How would I now go about finding the number of occurrences of each word, and how many lines each word appears on?
-
Does anyone have any idea how to go about finding the number of occurrences of each word in the text file, and how many lines each word is on?
Last edited by jarvio678; 08-11-2005 at 05:29 PM.
-
 Originally Posted by jarvio678
Does anyone have any idea how to go about finding the number of occurrences of each word in the text file, and how many lines each word is on?
Sure, I did an assignment like this last year. You need to create a HashMap. A HashMap lets you store Objects, each under a unique key. I hope you know a little UML; here's an approach:
Code:
__________________________________________________
|
| class WordObject
|__________________________________________________
|
| - text : String
| - occurrence : int
|__________________________________________________
|
| // Create a new WordObject
| + WordObject(text: String) : constructor
|
| // Get the number of occurrences of 'this'
| + getOccurrence() : int
|
| // Increase the number of occurrences by one
| + increase() : void
|
| // String representation
| + toString() : String
|
|__________________________________________________
__________________________________________________
|
| class WordMap
|__________________________________________________
|
| - wordMap : HashMap<String, WordObject>
| - sortedList : ArrayList<WordObject>
|__________________________________________________
|
| // Reads a textfile from disk
| + WordMap(fileName: String) : constructor
|
| // Adds a word to the HashMap or increases
| // the count of an existing word
| - addWord(word: String) : void
|
| // Inserts all words from the HashMap into
| // an ArrayList and sorts it
| - sortWordMap() : void
|
| // Returns the sorted ArrayList
| + getSortedWord() : ArrayList<WordObject>
|
|__________________________________________________
Read up on HashMaps and ArrayLists.
Now try to implement it, and if you're stuck, post some specific questions.
Good luck.
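If it helps to have something to compare against, here is a minimal sketch along the lines of the diagram above. It is not a definitive implementation and departs from the diagram in a few places, all of them assumptions: the WordMap constructor takes lines already in memory instead of a file name, WordObject gets an extra lineCount field (with getLineCount and increaseLineCount) to answer the "how many lines" question, getText() and WordMap.get() are added as convenience accessors, and the sort order (most frequent first, ties alphabetical) is just one possible choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

class WordObject {
    private final String text;
    private int occurrence = 0;
    private int lineCount = 0; // assumption: not in the diagram

    WordObject(String text) { this.text = text; }

    String getText() { return text; }
    int getOccurrence() { return occurrence; }
    int getLineCount() { return lineCount; }
    void increase() { occurrence++; }
    void increaseLineCount() { lineCount++; }

    public String toString() {
        return occurrence + " occurrence(s) of \"" + text + "\" on " + lineCount + " line(s)";
    }
}

class WordMap {
    private final HashMap<String, WordObject> wordMap = new HashMap<String, WordObject>();

    // Assumption: takes the lines directly rather than reading a file.
    WordMap(String[] lines) {
        for (String line : lines) {
            // Words already seen on the current line
            Set<String> seenOnThisLine = new HashSet<String>();
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                addWord(word);
                // add() returns true only the first time a word shows up on this line
                if (seenOnThisLine.add(word)) {
                    wordMap.get(word).increaseLineCount();
                }
            }
        }
    }

    // Adds a new word or increases the count of an existing one
    private void addWord(String word) {
        WordObject entry = wordMap.get(word);
        if (entry == null) {
            entry = new WordObject(word);
            wordMap.put(word, entry);
        }
        entry.increase();
    }

    // Hypothetical accessor; returns null for unknown words
    WordObject get(String word) { return wordMap.get(word); }

    // Copies the map values into a list, most frequent word first
    ArrayList<WordObject> getSortedWord() {
        ArrayList<WordObject> sorted = new ArrayList<WordObject>(wordMap.values());
        Collections.sort(sorted, new Comparator<WordObject>() {
            public int compare(WordObject a, WordObject b) {
                if (a.getOccurrence() != b.getOccurrence()) {
                    return b.getOccurrence() - a.getOccurrence();
                }
                return a.getText().compareTo(b.getText());
            }
        });
        return sorted;
    }
}

class WordMapDemo {
    public static void main(String[] args) {
        String[] lines = { "the quick brown fox jumped over the", "lazy dog" };
        for (WordObject w : new WordMap(lines).getSortedWord()) {
            System.out.println(w);
        }
    }
}
```

The per-line HashSet is what keeps a word that appears twice on the same line from being counted as two lines.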