Character Frequency Array


DevX Home    Today's Headlines   Articles Archive   Tip Bank   Forums   

Page 1 of 2 12 LastLast
Results 1 to 15 of 19

Thread: Character Frequency Array

Hybrid View

  1. #1
    Join Date
    Sep 2005
    Posts
    74

    Character Frequency Array

    The problem I am working on asks me to have the user type text into a TextArea, and then count the frequency of all letters and display them, then display which letter was most frequent and how many times it was encountered.

    I know I need an array for this, but how to implement it? I was thinking of something like this:
    Code:
    char charArr[] = {'a','b','c','d','e','f','g','h','i','j','k','l','m',
    			'n','o','p','q','r','s','t','u','v','w','x','y','z'};
    And then using a loop to check if the current character is a letter and if so, match it with one of the characters in the array and increment a counter. But will I need a seperate counter for each letter, or should I use a temp variable to store the current letter and if it matches increment a counter, and then overwrite it?

  2. #2
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    Given that the value of characters are contiguous (when converted in int) then you can use the character itself as an index into a count array.
    For example to count the lower case letters from 'a' to 'z':
    (Off the top of my head, may not compile)
    int[] counters = new int['z'-'a'+1]; // hold counts here
    ...
    char x = ... // the char to be counted
    ...
    counters[x - 'a']++; // count occurances of char in x

  3. #3
    Join Date
    Sep 2005
    Posts
    74
    I'm not sure if I'm starting this right or not..but I have this:

    Code:
    /* countCharFreq(line) counts freq of each alphabetic char
    	 * Pre:		 line is not empty
    	 * Post:	Appended to the TextArea will be a count of the frequency
    	 * 			of each alphabetic character in the input and, after this,
    	 * 			the most frequent character and its count.
    	 */
    	public void countCharFreq(String line)
    	{
    		line.toLowerCase();
    		char charArr[] = new char[26];
    		
    		for (int k = 0; k < line.length(); k++)
    		{
    			char ch = line.charAt(k);
    			
    			for (int i = 0; i < charArr.length; i++)
    			{
    				if(Character.isLetter(ch))
    				{
    					charArr[i] = ch;
    				}// if()
    				
    			}// for()
    		} // for()
    	}// countCharFreq
    The outside for loop is supposed to store the current char, and the inside loop is supposed to check that char to see if it's a letter and if so, store it in the array. Although, once the char is stored I'm not sure how to count it, for example:

    The char is an 'a', I want to store it, check the next char, which is a 'b'. Then store that and check the next char which is an 'a'. Would I need two counters, one for 'a' and one for 'b'?

    Also, how do I 'reserve' space in the array for the char..ex: at index 0 I want 'a', (but when I am reading and storing letters I don't want index 0 to be overwritten)?

  4. #4
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    I guess you missed my post. It shows how to count the occurances of chars.

    You will need one counter for each char. You don't need to save the chars.

  5. #5
    Join Date
    Sep 2005
    Posts
    74
    Sorry, didn't understand your first post =p. Anyway, after messing around with it for a while, I got the program to work with just one counting variable:

    Code:
    import java.applet.*;
    import java.awt.*;
    import java.awt.event.*;
    import java.text.*;
    
    public class DocumentApplet2 extends Applet implements ActionListener
    {
    	//	Declare arrays
    	static final int ARRSIZE = 10000000;
    	static int charArr[] = new int[ARRSIZE];
    	static int intArr[] = {97,98,99,100,101,102,103,104,105,106,107,108,109,
    						  110,111,112,113,114,115,116,117,118,119,120,121,122};
    	//	Declare and create GUI Components
    	private TextArea display = new TextArea("",30,80,
    			   TextArea.SCROLLBARS_VERTICAL_ONLY);
    	private Button getInfo = new Button("Click to analyze");
    	
    	public void init()
    	{
    		
    		// Add ActionListener
    		getInfo.addActionListener(this);
    		
    		// Add components
    		add (display);
    		add (getInfo);
    		
    		//	Set the applet's size
    		setSize(600,510);
    	} // init()
    	
    	/* countCharFreq(line) counts freq of each alphabetic char
    	 * Pre:		line is not empty
    	 * Post:	Appended to the TextArea will be a count of the frequency
    	 * 			of each alphabetic character in the input and, after this,
    	 * 			the most frequent character and its count.
    	 */
    	public void countCharFreq(String line)
    	{
    		String str = line;
    		str = str.toLowerCase();
    		int count = 0;
    		int frequency = 0;
    		char current = 'a';
    		
    		for (int i = 0; i < intArr.length; i++)
    		{
    			int holdInt =  intArr[i];
    			char checkChar = (char)holdInt;
    			
    			for (int k = 0; k < str.length(); k++)
    			{
    				int currentChar = str.charAt(k);
    				
    				charArr[k] = currentChar;
    			
    				if (checkChar == currentChar)
    				{
    					count++;
    					if (count > frequency)
    					{	
    						current = (char)intArr[i];
    						frequency++;
    					}
    				}
    			}
    			display.append("\n" +checkChar+ " " + "\tcount: " 
    				      + "\t" + count + " ");
    			holdInt = intArr[i]++;
    			count = 0;
    		}
    		display.append("\n\n" + "The most frequent letter is: " + "\'" + 
    				current + "\'" + " (" + frequency + " time(s)" + ")");
    		display.append("\n--------------------------------------------------");
    	}// countCharFreq
    	public void actionPerformed(ActionEvent e)
    	{
    		if (e.getSource() == getInfo)
    		{
    			String str = display.getText();
    			if (str == " ")
    			{
    				display.setText("Error");
    			} else {
    			display.append("\n----------------------------------------------");
    			countCharFreq(str);
    			}
    		}
    	}// actionPerformed()
    }// DocumentApplet2
    There is only one problem. If the TextArea is left blank, the method will still be called and will try evaluating..resulting in bad output. How can I check if a TextArea is empty? Another problem I am having is, if the user does not erase the TextArea and hits the analyze button again, they will get some really off the wall values for their input. Is there some way I can clear the TextArea with each press of the analyze button?
    Last edited by Dark Rain; 10-28-2005 at 08:13 PM.

  6. #6
    Join Date
    Oct 2005
    Posts
    107
    Quote Originally Posted by Dark Rain
    Sorry, didn't understand your first post =p. Anyway, after messing around with it for a while, I got the program to work with just one counting variable:

    Code:
    import java.applet.*;
    import java.awt.*;
    import java.awt.event.*;
    import java.text.*;
    
    public class DocumentApplet2 extends Applet implements ActionListener
    {
    	//	Declare arrays
    	static final int ARRSIZE = 10000000;
    	static int charArr[] = new int[ARRSIZE];
    	static int intArr[] = {97,98,99,100,101,102,103,104,105,106,107,108,109,
    						  110,111,112,113,114,115,116,117,118,119,120,121,122};
    	//	Declare and create GUI Components
    	private TextArea display = new TextArea("",30,80,
    			   TextArea.SCROLLBARS_VERTICAL_ONLY);
    	private Button getInfo = new Button("Click to analyze");
    	
    	public void init()
    	{
    		
    		// Add ActionListener
    		getInfo.addActionListener(this);
    		
    		// Add components
    		add (display);
    		add (getInfo);
    		
    		//	Set the applet's size
    		setSize(600,510);
    	} // init()
    	
    	/* countCharFreq(line) counts freq of each alphabetic char
    	 * Pre:		line is not empty
    	 * Post:	Appended to the TextArea will be a count of the frequency
    	 * 			of each alphabetic character in the input and, after this,
    	 * 			the most frequent character and its count.
    	 */
    	public void countCharFreq(String line)
    	{
    		String str = line;
    		str = str.toLowerCase();
    		int count = 0;
    		int frequency = 0;
    		char current = 'a';
    		
    		for (int i = 0; i < intArr.length; i++)
    		{
    			int holdInt =  intArr[i];
    			char checkChar = (char)holdInt;
    			
    			for (int k = 0; k < str.length(); k++)
    			{
    				int currentChar = str.charAt(k);
    				
    				charArr[k] = currentChar;
    			
    				if (checkChar == currentChar)
    				{
    					count++;
    					if (count > frequency)
    					{	
    						current = (char)intArr[i];
    						frequency++;
    					}
    				}
    			}
    			display.append("\n" +checkChar+ " " + "\tcount: " 
    				      + "\t" + count + " ");
    			holdInt = intArr[i]++;
    			count = 0;
    		}
    		display.append("\n\n" + "The most frequent letter is: " + "\'" + 
    				current + "\'" + " (" + frequency + " time(s)" + ")");
    		display.append("\n--------------------------------------------------");
    	}// countCharFreq
    	public void actionPerformed(ActionEvent e)
    	{
    		if (e.getSource() == getInfo)
    		{
    			String str = display.getText();
    			if (str == " ")
    			{
    				display.setText("Error");
    			} else {
    			display.append("\n----------------------------------------------");
    			countCharFreq(str);
    			}
    		}
    	}// actionPerformed()
    }// DocumentApplet2
    There is only one problem. If the TextArea is left blank, the method will still be called and will try evaluating..resulting in bad output. How can I check if a TextArea is empty? Another problem I am having is, if the user does not erase the TextArea and hits the analyze button again, they will get some really off the wall values for their input. Is there some way I can clear the TextArea with each press of the analyze button?

    if(str.equals("")) not if(str == "")

    Why such a huge array?

    static final int ARRSIZE = 10000000;
    static int charArr[] = new int[ARRSIZE];

    You should use a simple int array.


    private void countCharFreq(String line){
    int [] charFrequency = new int[26]; //default values: 0
    char letter;
    for(int i = 0; i < line.length(); ++i){
    letter = line.charAt(i);
    if(letter >= 97 && letter <= 122){
    ++charFrequency[letter - 97];
    }
    }
    char c;
    for(int i = 0; i < 26; ++i){
    c = i + 97; //not sure if this is how to do this but u get idea
    display.append(c + " count: " + charFrequency[i] + " ");
    }
    }

  7. #7
    Join Date
    Sep 2005
    Posts
    74
    Yeah I got rid of the charArr array, since nothing was actually being stored..so now I only have the intArr array. Everything works nicely now, no more ten million index array =p.

  8. #8
    Join Date
    Nov 2004
    Location
    Norway
    Posts
    1,560

    This one uses no array fiddling....

    ...but if the use of arrays was a requirement then this solution is wrong.
    However, using arrays for solving this is not very sensible....

    Code:
    import java.util.*;
    
    /**
     * Single Character count class
     */
    class CharCount  implements Comparator {
      public Character ch=null;
      public Integer count=null;
    
      public CharCount(){}
    
      public CharCount(Character ch, Integer count) {
        this.ch=ch;
        this.count=count;
      }
      public String toString() {
        return ch.toString()+"\t"+count.toString();
      }
      public int compare(Object o1, Object o2) {
        CharCount c1=(CharCount)o1;
        CharCount c2=(CharCount)o2;
        return c2.count.intValue() - c1.count.intValue();
      }
      public boolean equals(Object obj) {
        CharCount cc=(CharCount)obj;
        return cc.ch.equals(ch) && cc.count.equals(count);
      }
    
    }
    /**
     * Count character frequency in a String and generate a
     * ranking list of characters and frequency count.
     */
    public class TokenCounter {
    
      ArrayList rankList=null;
    
      public TokenCounter(String s) {
        Hashtable ht=getCountHashTable(s);
        sortCharacterCount(ht);
      }
      public ArrayList getResult() {
        return rankList;
      }
      /**
       * Sort on character frequency
       */
      private ArrayList sortCharacterCount(Hashtable ht) {
        rankList=new ArrayList();
        Enumeration enum = ht.keys();
        while (enum.hasMoreElements()) {
          Character ch = (Character)enum.nextElement();
          Integer count=(Integer)ht.get(ch);
          rankList.add(new CharCount(ch, count));
        }
        Collections.sort(rankList,new CharCount());
        return rankList;
      }
      /**
       * Count frequencies
       */
      private Hashtable getCountHashTable(String s) {
    
        Hashtable cHt=new Hashtable();
    
        char [] cBuf=new char[s.length()];
        s.getChars(0,cBuf.length,cBuf,0);
    
        for (int i=0; i<cBuf.length; i++) {
          if (Character.isLetter(cBuf[i])) { // count letters only
            Character ch = new Character(cBuf[i]);
            Integer cnt = (Integer) cHt.get(ch);
            if (cnt == null) {
              cHt.put(ch, new Integer(1));
            }
            else {
              cHt.put(ch, new Integer(cnt.intValue() + 1));
            }
          }
        }
        return cHt;
      }
      public static void main(String[] args) {
        String s="This is a test strrrringgg 11 22 33";
        TokenCounter tc = new TokenCounter(s);
        ArrayList list=tc.getResult();
        System.out.println("Character frequency for:\n["+s+"]");
        for (int i=0; i<list.size(); i++) {
          System.out.println(list.get(i).toString());
        }
      }
    
    }
    eschew obfuscation

  9. #9
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    My earlier post suggested using the value of a char as an index into an array.
    A char can be cast to int. Think of a char as a number(it really is!). Change your problem description to be counting the number of occurances of the digits 5 to 9. If you map the digit 5 to an index of 0, etc then you can count the 5s by using the digit minus 5 as an index to an array that holds the counts. If you have an array that holds 5 values, you can index it by the value of the digit minus 5. Thus a 5 indexes element 0, a 6 element 1 etc thru 9 to element 4.

    >work with just one counting variable:
    Not possible for your problem! Perhaps you meant with one array of variables.

    You can clear a TextArea by setting its value to an empty String: "".
    Test if the TextArea is empty by getting its value and testing if its the empty String: ""

    Why do you initialize intArr to those values?
    Why is charArr so big? What is it used for?

    A lot of programmers use comments in there programs to describe the whats and whys of there code.

  10. #10
    Join Date
    Sep 2005
    Posts
    74
    I just finished commenting the code, fogot to comment that section earlier =p. charArr[] holds the entire string as an array of numbers (each letter is mapped to an index). intArr has the 26 lowercase letters of the alphabet in their numeric form.

    The first for loop starts with the first index of intArr and casts it to a char ('a'), then the second for loop takes the first letter of the user's string and compares it with the index of intArr, and if it matches it increments the counter. The inner loop loops to the end of the string comparing each letter with the letter 'a' looking for a match, if one is found the counter is incremented appropriatly. The counter is printed and reset (just realized that holdInt = intArr[i]++; doesn't do anything since i is incremented in the loop header..so I'm taking that out). The first loop gets incremented to the letter 'b' and the process starts over again.

    Code:
    String str = display.getText();
    if (str == " ")
    	{
    		display.setText("Error");
    	} else {
    			display.append("\n----------------------------------------------");
    	countCharFreq(str);
    	}
    I tried the empty string..and it doesn't work for me. Any idea why?

  11. #11
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    The == operator doesn't test the contents of an object, it tests if two object references are the same. Use the equals() method to compare two objects.

  12. #12
    Join Date
    Sep 2005
    Posts
    74
    Doh! Wow, those little semantic errors sure are tricky =p. Thanks.

  13. #13
    Join Date
    Nov 2004
    Location
    Norway
    Posts
    1,560
    Yeah, it works great for the English alphabet and you have saved some code lines by
    leaving out the ranking list
    eschew obfuscation

  14. #14
    Join Date
    Nov 2004
    Location
    Norway
    Posts
    1,560

    Cool Btw:

    Your code has around 97 lines..using an array I shortened it to 39 lines:
    This one does the same as your code (and it handles other afabets and is
    case sensitive),
    it's only 14 lines, but the question is: is code like this easy to maintain
    and modify and are the saved lines worth the cost of complicated and
    "clever" code ?


    I beleive the answer is no
    Code:
    public class PrintArrays {
      static int intArr[] = new int[256];
      public static void main(String[] args) {
        String str = "Hello world";
        char [] cBuf=new char[str.length()];
        str.getChars(0,cBuf.length,cBuf,0);
        for (int i=0; i<cBuf.length; i++)
          intArr[(int)cBuf[i]]+=(Character.isLetter(cBuf[i])) ? 1 : 0;
        int maxPos=0;
        for (int i=0; i<intArr.length; i++) maxPos=(intArr[i] > intArr[maxPos]) ? i : maxPos;
        System.out.println("The most frequent letter is: " + "\'" +
                           (char)maxPos + "\'" + "(" + intArr[maxPos] + " times" + ")");
      }
    }
    Last edited by sjalle; 10-31-2005 at 03:11 AM. Reason: Bug in line 9 fixed
    eschew obfuscation

  15. #15
    Join Date
    Sep 2005
    Posts
    74

    The irony of it all...

    I turned the assignment in today, and the professor rejected the way I did it. Basically she told me that it was the most inefficent way possible, and to use a frequency array.

    So can someone explain exactly what a frequency array IS (she didn't do a very good job =p), so I don't spend 2 hours on something that should take 30mins?

    Thanks.

Similar Threads

  1. Replies: 2
    Last Post: 04-15-2005, 09:06 PM
  2. Replies: 3
    Last Post: 04-08-2005, 05:49 AM
  3. Accessing jagged array values
    By Jason Salas in forum ASP.NET
    Replies: 2
    Last Post: 04-20-2003, 11:39 PM
  4. Replies: 1
    Last Post: 04-08-2002, 11:32 AM
  5. SafeArrayCopy SLOWER than iterating string array!
    By Mark Alexander Bertenshaw in forum VB Classic
    Replies: 0
    Last Post: 06-12-2000, 08:15 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
HTML5 Development Center
 
 
FAQ
Latest Articles
Java
.NET
XML
Database
Enterprise
Questions? Contact us.
C++
Web Development
Wireless
Latest Tips
Open Source


   Development Centers

   -- Android Development Center
   -- Cloud Development Project Center
   -- HTML5 Development Center
   -- Windows Mobile Development Center