StreamTokenizer not recognising "-"



  1. #1
    Join Date
    Jul 2005
    Posts
    78

    StreamTokenizer not recognising "-"

    Hello all

    I'm having some trouble using StreamTokenizer to split an input stream into tokens and put them into an ArrayList.

    My problem is that when the StreamTokenizer encounters a "-" on its own, ttype is set to 45. As a consequence, "-" characters are not added to my ArrayList.

    I know that 45 is the ASCII value of "-". The StreamTokenizer API docs state that 0-9, "." and "-" are treated as numeric characters.

    I have tried with and without the wordChars() settings, and with and without parseNumbers().


    Sample input/output:

    -69 -> -69.0 (ttype = -2)
    69 -> 69.0 (ttype = -2)
    69- -> 69.0 + ttype = 45
    69-- -> 69.0 + ttype = 45 + ttype = 45
    i- -> i- (ttype = -3)
    i-- -> i-- (ttype = -3)
    i-j -> i-j (ttype = -3)
    i - j -> i + ttype = 45 + j



    Question 1: Why is the ttype = 45 and not -2 (TT_NUMBER)?



    Question 2: How can I fix it?


    Thanks in advance


    Matthew


    This is my code:

    Code:
          // This reads in the file
          BufferedReader inFile = new BufferedReader (new
                                     FileReader ("Date.java"));
    
          // Creates the stream for the file
          StreamTokenizer streamT = new StreamTokenizer(inFile);
    
    
          streamT.wordChars(33,47); // 33-47 are punctuation, treated here as word characters
          streamT.wordChars(58,64); // 48-57 are numerals, 58-64 are more punctuation
          streamT.wordChars(91,96); // 65-90 are already word characters, 91-96 are more punctuation
          streamT.wordChars(123,126); // 97-122 are already word characters, 123-126 are more punctuation
          streamT.parseNumbers(); // this says to recognise numbers as numbers.
          streamT.eolIsSignificant(true); // the end of line is significant, and returns a different token, not just whitespace
    
    
          // creates a list to hold the separate tokens from the program
          ArrayList<String> programTokens = new ArrayList<String>();
    
    
          int printIndex = 0; // this is here for debugging
    
          // while the next token isn't the end of file..
          while(streamT.nextToken() != StreamTokenizer.TT_EOF)
          {
    
             System.out.println(streamT.ttype); // this is here for debugging
             
             // if the type of variable is a word, then add the word
             if(streamT.ttype == StreamTokenizer.TT_WORD)
             {
                programTokens.add(streamT.sval);
             }
    
             // if the type of variable is a number, then add the number
             else if(streamT.ttype == StreamTokenizer.TT_NUMBER)
             {
                // ArrayList stores objects, so I convert my number to a string
                String stringInput = Double.toString(streamT.nval);
                programTokens.add(stringInput);
             }
    
             // if the type of variable is EOL, then print EOL.
             else if(streamT.ttype == StreamTokenizer.TT_EOL)
             {
                programTokens.add("ENDOFLINE");
             }
    
    
            System.out.println(programTokens.get(printIndex)); // this is here for debugging
             printIndex++; // this is here for debugging
    
          }
    Last edited by masher; 08-09-2005 at 11:26 AM. Reason: clarification, added sample I/O

  2. #2
    Join Date
    Jul 2005
    Location
    SW MO, USA
    Posts
    299
    I don't have an answer without writing a test program. Could you do that?
    Write the shortest, simplest program that demonstrates the problem. Use a StringReader so the input is in the program. Put System.out.println() for all the cases. Run the program and copy the screen with the results to an editor. Comment the lines where the problem is and post it.
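    A minimal sketch of the kind of test program asked for here, assuming the default StreamTokenizer settings and a StringReader for input. The class name MinusRepro and the helper describe() are made up for illustration; they are not from the original code.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test class: names are illustrative, not from the original post.
class MinusRepro {

    // Labels every token the tokenizer produces for the given input,
    // using the default syntax table (no extra wordChars calls).
    static List<String> describe(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD:" + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER:" + st.nval);
                        break;
                    default:
                        // Ordinary character: ttype holds the character itself.
                        out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"-69", "69", "69-", "69--", "i-j", "i - j"}) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```

    With these inputs, "-69" should come back as a single number, while "69-" should come back as a number followed by the character '-', which matches the sample output in the first post.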

  3. #3
    Join Date
    Aug 2003
    Posts
    313
    I believe that hyphens are only treated as part of a number when they are prefixed to digits. So -69 is parsed as -69, but 69- is parsed as two tokens, since, for most purposes at least, the sign comes before the number. What do you mean by "fix it"? If you want the hyphens and other such characters in your ArrayList as strings, you should be able to do something like this:
    Code:
          // while the next token isn't the end of file..
          while(streamT.nextToken() != StreamTokenizer.TT_EOF)
          {
    
             System.out.println(streamT.ttype); // this is here for debugging
             
             // if the type of variable is a word, then add the word
             if(streamT.ttype == StreamTokenizer.TT_WORD)
             {
                programTokens.add(streamT.sval);
             }
    
             // if the type of variable is a number, then add the number
             else if(streamT.ttype == StreamTokenizer.TT_NUMBER)
             {
                // ArrayList stores objects, so I convert my number to a string
                String stringInput = Double.toString(streamT.nval);
                programTokens.add(stringInput);
             }
    
             // if the type of variable is EOL, then print EOL.
             else if(streamT.ttype == StreamTokenizer.TT_EOL)
             {
                programTokens.add("ENDOFLINE");
             }
             else { // It was special
                programTokens.add(""+streamT.ttype);
             }
    
    
            System.out.println(programTokens.get(printIndex));   // this is here for debugging
             printIndex++; // this is here for debugging
    
          }
    Hope this helps.
    ~evlich

  4. #4
    Join Date
    Jul 2005
    Posts
    78
    Even if I don't call parseNumbers() and do call wordChars(45, 45), it still doesn't recognise a "-" as a word.

    The workaround that I'm implementing at the moment is pretty much like yours, except that I'm writing the ASCII character that corresponds to the ttype to the ArrayList.

    At the moment I've got

    if(streamT.ttype == 45)
    {
       programTokens.add("-");
    }


    It will be generalised to:
    else
    {
       programTokens.add(theCharCorrespondingTo(streamT.ttype));
    }


    once I find out how to do it.
    Last edited by masher; 08-09-2005 at 09:39 PM.
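    On the observation that wordChars() alone does not turn "-" into a word: a possible explanation is that the StreamTokenizer constructor already enables number parsing, and wordChars() appears to add the word attribute without clearing the numeric one. A sketch of one way around this, assuming that calling ordinaryChar('-') first resets the character, is below. The class name MinusAsWord and the helper tokens() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: names are illustrative only.
class MinusAsWord {

    static List<String> tokens(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.ordinaryChar('-');   // clear any existing attributes of '-', including numeric
        st.wordChars('-', '-'); // then mark '-' as a word character
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(Double.toString(st.nval));
                } else {
                    // Any other ordinary character: store it as a one-character string.
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("69-"));
        System.out.println(tokens("i - j"));
        System.out.println(tokens("-69"));
    }
}
```

    The trade-off in this sketch is that a leading "-" no longer starts a number, so "-69" comes back as the word "-69" rather than the number -69.0. Whether that is acceptable depends on what the token list is used for.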

  5. #5
    Join Date
    Aug 2003
    Posts
    313
    What exactly is the goal of this? If you just want to separate the string into words then you might try StringTokenizer instead of StreamTokenizer. Something like this:
    Code:
    StringTokenizer st = new StringTokenizer("This is the string to split", " ", false);
    ArrayList<String> words = new ArrayList<String>();
    while( st.hasMoreTokens() ) {
     words.add(st.nextToken());
    }
    Hope this helps.
    ~evlich

  6. #6
    Join Date
    Jul 2005
    Posts
    78
    I'm using StreamTokenizer rather than StringTokenizer because I found it easier to get the file input to work.

    Anyhoo..

    This is my final else statement

    else
    {
       char inputChar = (char) streamT.ttype;
       String convertChar = String.valueOf(inputChar);
       programTokens.add(convertChar);
    }


    It all works now.

    If I get more time later, I may investigate the Scanner class...
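    For comparison, a rough sketch of what a Scanner-based version might look like, assuming whitespace-separated tokens are enough. The class name ScannerTokens and the helper tokens() are made up for this example. It reads from a String here; Scanner also has a constructor that takes a File.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical example class: names are illustrative only.
class ScannerTokens {

    // Splits the input on whitespace (Scanner's default delimiter)
    // and returns every token as a string.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("i - j"));
        System.out.println(tokens("69- -69"));
    }
}
```

    Note that this only splits on whitespace, so "69-" stays as one token instead of being separated into a number and a "-".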
