Advanced WordCounter help



  1. #1
    Join Date
    Aug 2005
    Posts
    6

    Advanced WordCounter help

    hi,

    I've been working on an application that reads a text file and prints the number of characters, words and lines in it. Once that's done, the application must also output the number of occurrences of each word in the file and the number of lines each word appears on, e.g.:
    the quick brown fox jumped over the
    lazy dog
    This would give:
    2 occurrences of "the"
    1 line contains "the"

    I'm new to Java and I've managed to get the number of lines reported, but I'm not sure how to get the rest done. I've explored using the StringTokenizer class and the String.indexOf method to achieve my results, but I'm finding it very difficult to code. If anyone has a chance to take a look, here's the code I have so far:

    Code:
    import java.io.*;
    import java.util.*;
    
    class WordCounter {
    
    // Main
    
    public static void main (String[] args) {
    WordCounter t = new WordCounter();
    t.fileRead();
    }
    
    
    // Read the file and output
    
    void fileRead() {
    
    String record = null;
    int numLines = 0;
    int numWords = 0;
    int numChars = 0;
    
    try {
    
    FileReader fr = new FileReader("test.txt");
    BufferedReader br = new BufferedReader(fr);
    
    
    record = new String();
    while ((record = br.readLine()) != null) {
    numLines++;
    
    }
    
    
    // Output values
    
    System.out.println("Number of lines:" + numLines);
    System.out.println("Number of words:" + numWords);
    System.out.println("Number of chars:" + numChars);
    
    } catch (IOException e) {
    
    // Catch possible io errors from readLine()
    
    System.out.println("IOException error!");
    e.printStackTrace();
    }
    
    } // End of fileRead
    
    } // End of class
    Thanks to anyone who can help me with this; it's been annoying me for a long time.

    Chris.
    Last edited by jarvio678; 08-11-2005 at 08:43 AM.

  2. #2
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    Sorry, I don't see the StringTokenizer and String.indexOf() code.
    Please post the code that shows how you are trying to use them, a copy of the errors you are getting and your questions.
    Both StringTokenizer and String.indexOf() can be used in this project. You'll also want a way to keep track of the words and their usage counts. Look at Hashtable or HashMap: you'll store the words as keys and the counts as values.
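    A minimal sketch of the "words as keys, counts as values" idea, assuming whitespace-separated tokens and case-sensitive matching. The class name WordCountSketch and the method countWords are made up for illustration and are not part of the original program:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the original WordCounter.
class WordCountSketch {

    // Counts how often each whitespace-separated token appears in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            String word = tokens.nextToken();
            Integer previous = counts.get(word);
            // First sighting stores 1; later sightings add one to the stored count.
            counts.put(word, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countWords("the quick brown fox jumped over the lazy dog");
        System.out.println("the: " + counts.get("the"));
    }
}
```

    In the real program the same map would be filled inside the readLine() loop, one line at a time, instead of from a single string.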

  3. #3
    Join Date
    Jul 2005
    Location
    the Netherlands
    Posts
    128
    Quote Originally Posted by jarvio678
    ...
    Thanks if anyone can help me with this it's been annoying me now for a long time.

    Chris.
    Try this:

    Code:
                while ((record = br.readLine()) != null) {
                    numLines++;
                    
                    // create a StringTokenizer
                    StringTokenizer tokens = new StringTokenizer(record);
                    while(tokens.hasMoreTokens()) {
                        tokens.nextToken();
                        numWords++;
                    }
    
                    // remove all spaces from the line
                    String lineWithoutSpaces = record.replaceAll(" ", "");
                    // add the length of lineWithoutSpaces to the running total numChars
                    numChars += lineWithoutSpaces.length();
                }

  4. #4
    Join Date
    Aug 2005
    Posts
    6
    Ok here's the code and my errors:

    Code:
    import java.io.*;
    import java.util.*;
    
    class WordCounter {
    
    // Main
    
    	public static void main (String[] args) {
    		WordCounter t = new WordCounter();
    		t.fileRead();
    }
    
    
    // Read the file and output
    
    void fileRead() {
    
    	String record = null;
    	int numLines = 0;
    	int numWords = 0;
    	int numChars = 0;
    
    	try {
    
    			FileReader fr = new FileReader("test.txt");
    			BufferedReader br = new BufferedReader(fr);
    
    
    				record = new String();
    				while ((record = br.readLine()) != null) {
    				numLines++;
    
    				// create a StringTokenizer
    				                StringTokenizer tokens = new StringTokenizer(record);
    				                while(tokens.hasMoreTokens()) {
    				                    tokens.nextToken();
    				                    numWords++;
    				                }
    
    				                // remove all spaces and create an array of char's of each line
    				                String lineWithoutSpaces = record.replaceAll(" ", "");
    				                // count the lenght of the lineWithoutSpaces to the total of numChars
    				                numChars += lineWithoutSpaces.length();
                }
    
    		}
    
    
    // Output values
    
    			System.out.println("Number of lines:" + numLines);
    			System.out.println("Number of words:" + numWords);
    			System.out.println("Number of chars:" + numChars);
    
    	} catch (IOException e) {
    
    // Catch possible io errors from readLine()
    
    				System.out.println("IOException error!");
    				e.printStackTrace();
    }
    
    } // End of fileRead
    
    } // End of class
    Errors:

    C:\Documents and Settings\Mike\My Documents\WordCounter.java:23: 'try' without 'catch' or 'finally'
    try {
    ^
    C:\Documents and Settings\Mike\My Documents\WordCounter.java:55: illegal start of type
    } catch (IOException e) {
    ^
    C:\Documents and Settings\Mike\My Documents\WordCounter.java:62: <identifier> expected
    ^
    C:\Documents and Settings\Mike\My Documents\WordCounter.java:65: 'class' or 'interface' expected
    } // End of class
    ^
    C:\Documents and Settings\Mike\My Documents\WordCounter.java:69: 'class' or 'interface' expected
    ^
    5 errors

    Tool completed with exit code 1

  5. #5
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    Looks like one of those annoying mismatched {} problems. Check that all { have a matching }. Some IDEs or intelligent editors can help.

    It's amazing how a missing or extra } or ) can generate so many errors. The compiler keeps trying to make sense of your program and gets really confused. Insert/delete the needed } or ) and all will be better.
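    If no IDE is at hand, the matching can also be checked with a few lines of code. This is only a rough sketch: the class BraceCheckSketch and the method firstProblemLine are hypothetical names, and the check deliberately ignores the fact that braces inside string literals or comments should not count.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of the original thread.
class BraceCheckSketch {

    // Returns -1 if every { and ( has a matching } or ).
    // Otherwise returns the 1-based line where the first problem shows up:
    // the line of an unexpected closer, or the line of the innermost
    // opener that is never closed.
    static int firstProblemLine(String source) {
        Deque<Character> openers = new ArrayDeque<>();
        Deque<Integer> openerLines = new ArrayDeque<>();
        int line = 1;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '\n') {
                line++;
            } else if (c == '{' || c == '(') {
                openers.push(c);
                openerLines.push(line);
            } else if (c == '}' || c == ')') {
                char expected = (c == '}') ? '{' : '(';
                if (openers.isEmpty() || openers.peek() != expected) {
                    return line;
                }
                openers.pop();
                openerLines.pop();
            }
        }
        return openers.isEmpty() ? -1 : openerLines.peek();
    }

    public static void main(String[] args) {
        String broken = "class A {\n void f() {\n }\n}\n}";
        System.out.println("First problem on line: " + firstProblemLine(broken));
    }
}
```

    Note that the line reported here is where the imbalance becomes visible, which is often well after the place where the stray brace was actually typed.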

  6. #6
    Join Date
    Jul 2005
    Location
    the Netherlands
    Posts
    128
    Code:
    import java.io.*;
    import java.util.*;
    
    class WordCounter {
    
        // Main
        public static void main (String[] args) {
            WordCounter t = new WordCounter();
            t.fileRead();
        }
    
        // Read the file and output
        void fileRead() {
            
            String record = null;
            int numLines = 0;
            int numWords = 0;
            int numChars = 0;
            
            try {
                FileReader fr = new FileReader("test.txt");
                BufferedReader br = new BufferedReader(fr);
    
                record = new String();
                while ((record = br.readLine()) != null) {
                    numLines++;
                    
                    // create a StringTokenizer
                    StringTokenizer tokens = new StringTokenizer(record);
                    while(tokens.hasMoreTokens()) {
                        tokens.nextToken();
                        numWords++;
                    }
    
                    // remove all spaces from the line
                    String lineWithoutSpaces = record.replaceAll(" ", "");
                    // add the length of lineWithoutSpaces to the running total numChars
                    numChars += lineWithoutSpaces.length();
                }
    
                // Output values
                System.out.println("Number of lines:" + numLines);
                System.out.println("Number of words:" + numWords);
                System.out.println("Number of chars:" + numChars);
            }
            catch (IOException e) {
                // Catch possible io errors from readLine()
                System.out.println("IOException error!");
                e.printStackTrace();
            }
        }
    
    } // class WordCounter

  7. #7
    Join Date
    Aug 2005
    Posts
    6
    nice that works lovely

    Thanks a lot for getting it to work up to that part.

    How would I now go about finding the number of occurrences of each word in the text file and how many lines each word appears on?

    Thanks again,

    Chris.

  8. #8
    Join Date
    Aug 2005
    Posts
    6
    May I also ask which compilers you guys use and whether or not they are downloadable?

    Thanks,

    Chris.

  9. #9
    Join Date
    Aug 2005
    Posts
    24
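    This code counts the characters, words, and lines, provided that the text contains only letters and no punctuation such as . , ? / etc. Note that the regular expression \w{2,} only counts tokens of two or more word characters, so one-letter words such as "a" are skipped.
    Here is the code: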
    Code:
    import java.util.*;
    import java.io.*;

    public class WordCounter {

        public static void main(String[] args) throws IOException {
            int word = 0;
            int character = 0;
            int line = 0;
            StringTokenizer st;
            String s;
            // a token counts as a word if it is two or more word characters
            String regularExpression = "\\w{2,}";
            int t;
            String currentLine;
            BufferedReader inFile = new BufferedReader(new FileReader("Word.txt"));

            while ((currentLine = inFile.readLine()) != null) {
                st = new StringTokenizer(currentLine);
                line++;

                while (st.hasMoreTokens()) {
                    s = st.nextToken();
                    if (s.matches(regularExpression)) {
                        word++;
                    }

                    // count only ASCII letters: A-Z (65-90) and a-z (97-122)
                    for (int i = 0; i < s.length(); i++) {
                        t = (int) s.charAt(i);
                        if ((t >= 65 && t <= 90) || (t >= 97 && t <= 122)) {
                            character++;
                        }
                    } // end of for loop
                } // end of inner while loop
            } // end of outer while loop

            // release the file handle once reading is done
            inFile.close();

            System.out.println("Words : " + word);
            System.out.println("");
            System.out.println("Characters : " + character);
            System.out.println("");
            System.out.println("Lines : " + line);
            System.out.println("");
        } // end of main

    } // end of class WordCounter
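
    One possible way around the "letters only" restriction is to clean each token before counting it. The sketch below is an assumption about what "a word" should mean here (letters only, case-insensitive); TokenCleanupSketch and cleanWords are hypothetical names, not something from the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original WordCounter.
class TokenCleanupSketch {

    // Splits a line on whitespace, strips every non-letter character from
    // each token, lower-cases the rest and drops tokens that end up empty.
    static List<String> cleanWords(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            String cleaned = token.replaceAll("[^\\p{L}]", "").toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                words.add(cleaned);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(cleanWords("The quick, brown fox. The END?"));
    }
}
```

    With this, "The", "the" and "the." all count as the same word. The trade-off is that apostrophes and hyphens are dropped too, so "don't" becomes "dont".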

  10. #10
    Join Date
    Aug 2005
    Posts
    6
    ah thanks for that

    How would I now go about finding the number of occurrences of each word and how many lines each word appears on?

  11. #11
    Join Date
    Aug 2005
    Posts
    6
    Does anyone have any idea how to go about finding the number of occurrences of each word in the text file, and how many lines each word is on?
    Last edited by jarvio678; 08-11-2005 at 05:29 PM.

  12. #12
    Join Date
    Jul 2005
    Location
    the Netherlands
    Posts
    128
    Quote Originally Posted by jarvio678
    Does anyone have any idea how to go about finding the number of occurrences of each word in the text file, and how many lines each word is on?
    Sure, I did an assignment like this last year. You need to create a HashMap. A HashMap lets you store Objects under a unique key. I hope you know a little UML; here is an approach:

    Code:
     __________________________________________________
    |                                                  
    |                class WordObject                  
    |__________________________________________________
    |                                                  
    | - text       : String                            
    | - occurrence : int                               
    |__________________________________________________
    |                                                  
    | // Create a new WordObject                       
    | + WordObject(text: String) : constructor         
    |                                                  
    | // Get the number of occurrences of 'this'       
    | + getOccurrence() : int                          
    |                                                  
    | // Increase the number of occurrences by one     
    | + increase() : void                              
    |                                                  
    | // String representation                         
    | + toString() : String                            
    |                                                  
    |__________________________________________________
                                                       
                                                       
     __________________________________________________
    |                                                  
    |                class WordMap                     
    |__________________________________________________
    |                                                  
    | - wordMap    : HashMap<String, WordObject>       
    | - sortedList : ArrayList<WordObject>             
    |__________________________________________________
    |                                                  
    | // Reads a textfile from disk                    
    | + WordMap(fileName: String) : constructor        
    |                                                  
    | // Adds a word to the HashMap or increases       
    | // an existing word                              
    | - addWord(word: String) : void                   
    |                                                  
    | // Inserts all words from the HashMap into a     
    | // sorted ArrayList                              
    | - sortWordMap() : void                           
    |                                                  
    | // Returns the sorted ArrayList                  
    | + getSortedWord() : ArrayList<WordObject>        
    |                                                  
    |__________________________________________________
    Read up on HashMaps and ArrayLists.
    Now try to implement it, and if you're stuck, post some specific questions.

    Good luck.
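
    For reference, here is one rough sketch of how the per-word bookkeeping could look, including the "how many lines" part of the question, which the diagram above does not cover. Everything in it is an assumption for illustration: WordStatsSketch, WordStats and analyse are made-up names, the lines field is an extension of the WordObject idea, and a TreeMap is swapped in for the HashMap plus sorted ArrayList simply to get alphabetical output.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical sketch, not the WordObject/WordMap design from the diagram.
class WordStatsSketch {

    // Per-word tallies: total occurrences and number of lines containing the word.
    static class WordStats {
        int occurrences;
        int lines;
    }

    // Takes the lines of a file (already read) and tallies every word.
    static Map<String, WordStats> analyse(String[] lines) {
        Map<String, WordStats> stats = new TreeMap<>();
        for (String line : lines) {
            // Remembers which words were already seen on the current line,
            // so each line is counted at most once per word.
            Set<String> seenOnThisLine = new HashSet<>();
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                WordStats entry = stats.get(word);
                if (entry == null) {
                    entry = new WordStats();
                    stats.put(word, entry);
                }
                entry.occurrences++;
                if (seenOnThisLine.add(word)) {
                    entry.lines++;
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        String[] lines = { "the quick brown fox jumped over the", "lazy dog" };
        for (Map.Entry<String, WordStats> e : analyse(lines).entrySet()) {
            System.out.println(e.getValue().occurrences + " occurrences of \""
                    + e.getKey() + "\", on " + e.getValue().lines + " line(s)");
        }
    }
}
```

    In the actual program the String[] would be replaced by the readLine() loop from the earlier posts, calling the same per-line logic for each record.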
