-
StreamTokenizer not recognising "-"
Hello all
I'm having some trouble using StreamTokenizer to split up an input stream into tokens, and then put them into an ArrayList.
My problem is that when the ST encounters a "-", ttype is set to 45. As a consequence, "-"'s are not passed into my ArrayList.
I know that the return of 45 is the ASCII value of "-". The ST API specs state that 0-9, "." and "-" are treated as numerals.
I have tried with and without the wordChars() settings, and with and without parseNumbers().
Sample input/output:
-69 -> -69.0 (ttype = -2)
69 -> 69.0 (ttype = -2)
69- -> 69.0 + ttype = 45
69-- -> 69.0 + ttype = 45 + ttype = 45
i- -> i- (ttype = -3)
i-- -> i-- (ttype = -3)
i-j -> i-j (ttype = -3)
i - j -> i + ttype = 45 + j
Question 1: Why is the ttype = 45 and not -2 (TT_NUMBER)?
Question 2: How can I fix it?
Thanks in advance
Matthew
This is my code:
Code:
// This reads in the file
BufferedReader inFile = new BufferedReader(new FileReader("Date.java"));
// Creates the stream for the file
StreamTokenizer streamT = new StreamTokenizer(inFile);
streamT.wordChars(33, 47);      // punctuation characters make up words
streamT.wordChars(58, 64);      // 48-57 are numerals, 58-64 are some punctuation
streamT.wordChars(91, 96);      // 65-90 are already word chars, 91-96 are more punctuation
streamT.wordChars(123, 126);    // 97-122 are already word chars, 123-126 are more punctuation
streamT.parseNumbers();         // this says to recognise numbers as numbers
streamT.eolIsSignificant(true); // the end of line is significant and returns its own token, not just whitespace
// creates a list to hold the separate tokens from the program
ArrayList programTokens = new ArrayList();
int printIndex = 0; // this is here for debugging
// while the next token isn't the end of file..
while (streamT.nextToken() != StreamTokenizer.TT_EOF)
{
    System.out.println(streamT.ttype); // this is here for debugging
    // if the type of token is a word, then add the word
    if (streamT.ttype == StreamTokenizer.TT_WORD)
    {
        programTokens.add(streamT.sval);
    }
    // if the type of token is a number, then add the number
    else if (streamT.ttype == StreamTokenizer.TT_NUMBER)
    {
        // ArrayList stores objects, so I convert my number to a string
        String stringInput = Double.toString(streamT.nval);
        programTokens.add(stringInput);
    }
    // if the type of token is EOL, then add an EOL marker
    else if (streamT.ttype == StreamTokenizer.TT_EOL)
    {
        programTokens.add("ENDOFLINE");
    }
    System.out.println(programTokens.get(printIndex)); // this is here for debugging
    printIndex++; // this is here for debugging
}
Last edited by masher; 08-09-2005 at 11:26 AM.
Reason: clarification, added sample I/O
-
I don't have an answer without writing a test program. Could you do that?
Write the shortest, simplest program that demonstrates the problem. Use a StringReader so the input is in the program. Put System.out.println() for all the cases. Run the program and copy the screen with the results to an editor. Comment the lines where the problem is and post it.
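For what it's worth, a test program along the lines described above might look something like the sketch below. The class name MinusRepro and the helper describe are made up for illustration, and it uses a default StreamTokenizer with none of the wordChars() calls from the original code, so treat it as a starting point rather than an exact reproduction.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test class, not from the original post.
public class MinusRepro {

    // Tokenizes the input with a default StreamTokenizer and
    // returns a readable description of every token it produced.
    static List<String> describe(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.add("WORD(" + st.sval + ")");
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                out.add("NUMBER(" + st.nval + ")");
            } else {
                // Ordinary character: ttype holds the character code itself.
                out.add("CHAR(" + st.ttype + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Inputs taken from the sample I/O in the first post.
        for (String s : new String[] {"-69", "69", "69-", "i-j", "i - j"}) {
            System.out.println(s + " -> " + describe(s));
        }
    }
}
```

Putting the input in a StringReader keeps the whole test in one file, and printing every token type (not only words and numbers) makes it obvious which tokens the original loop was silently dropping.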
-
I believe that hyphens are only treated as part of a number when they are prefixed to digits. So -69 would be parsed as -69, but 69- would be parsed as two entities since, for most things at least, the sign goes in front of the number. What do you mean by "fix it"? If you want the hyphens and such in your ArrayList as strings, then you should just be able to do something like this:
Code:
// while the next token isn't the end of file..
while (streamT.nextToken() != StreamTokenizer.TT_EOF)
{
    System.out.println(streamT.ttype); // this is here for debugging
    // if the type of token is a word, then add the word
    if (streamT.ttype == StreamTokenizer.TT_WORD)
    {
        programTokens.add(streamT.sval);
    }
    // if the type of token is a number, then add the number
    else if (streamT.ttype == StreamTokenizer.TT_NUMBER)
    {
        // ArrayList stores objects, so I convert my number to a string
        String stringInput = Double.toString(streamT.nval);
        programTokens.add(stringInput);
    }
    // if the type of token is EOL, then add an EOL marker
    else if (streamT.ttype == StreamTokenizer.TT_EOL)
    {
        programTokens.add("ENDOFLINE");
    }
    else { // It was special: this adds the character code as text, e.g. "45" for '-'
        programTokens.add("" + streamT.ttype);
    }
    System.out.println(programTokens.get(printIndex)); // this is here for debugging
    printIndex++; // this is here for debugging
}
Hope this helps.
~evlich
-
Even if I don't include parseNumbers() and do include wordChars(45, 45), it still doesn't recognise a "-" as a word.
The workaround that I'm implementing atm is pretty much like yours, except that I'm writing the ASCII character that corresponds to the ttype to the ArrayList.
At the moment I've got
if (streamT.ttype == 45)
{
    programTokens.add("-");
}
It will be generalised to:
else
{
    programTokens.add(theCharCorrespondingTo(streamT.ttype));
}
once I find out how to do it.
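On the point above that "-" still isn't treated as a word: the StreamTokenizer constructor already turns number parsing on, so leaving out parseNumbers() changes nothing, and as far as I can tell from the JDK source, wordChars() adds the word attribute without removing the numeric one. One possible approach, sketched below under that assumption, is to call ordinaryChar('-') first (documented as removing any special significance, including "number character") and only then wordChars('-', '-'). The names MinusAsWord and tokens are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the original poster's code.
public class MinusAsWord {

    static List<String> tokens(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.ordinaryChar('-');   // clear every attribute of '-', including the numeric one
        st.wordChars('-', '-'); // then mark '-' as a word character only

        List<String> out = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.add(st.sval);
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                out.add(Double.toString(st.nval));
            } else {
                // Any other ordinary character: store it as a one-character string.
                out.add(String.valueOf((char) st.ttype));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokens("i - j"));
        System.out.println(tokens("69-"));
        System.out.println(tokens("-69"));
    }
}
```

The trade-off is that a leading minus is no longer part of a number: with this setup "-69" should come back as a word rather than as TT_NUMBER, so it only suits cases where the tokens end up as strings anyway.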
Last edited by masher; 08-09-2005 at 09:39 PM.
-
What exactly is the goal of this? If you just want to separate the string into words then you might try StringTokenizer instead of StreamTokenizer. Something like this:
Code:
StringTokenizer st = new StringTokenizer("This is the string to split", " ", false);
ArrayList<String> words = new ArrayList<String>();
while (st.hasMoreTokens()) {
    words.add(st.nextToken());
}
Hope this helps.
~evlich
-
I'm using StreamTokenizer rather than StringTokenizer because I found it easier to get the file input to work.
Anyhoo..
This is my final else statement:
else
{
    // ttype holds the character code of an ordinary character, so cast it back to a char
    char inputChar = (char) streamT.ttype;
    String convertChar = String.valueOf(inputChar);
    programTokens.add(convertChar);
}
It all works now.
If I get more time later, I may investigate the Scanner class...