I have written a little LZW copmression program... the program works fine, however, at some point int some files, it will stop working properly and starts giving strange characters. this happens only with large files.
for example if I have one sentence in the file, everything is fine. When I copy it in the file 900 times, everything is still fine, when I copy it 1800 times, it will start to give wrong output.
Any help would be really appreciated.
Thanks in advance.
Here is my code:
/*
* Created on Mar 29, 2005
*
* LZW.java
*
* Mar 29, 2005
*/
// display bit representation of specified int value
private String getBits( int value )
{
// create int value with 1 in leftmost bit and 0s elsewhere
int displayMask = 1 << 11;
StringBuffer buffer = new StringBuffer( 35 ); // buffer for output
// for each bit append 0 or 1 to buffer
for ( int bit = 1; bit <= 12; bit++ ) {
// use displayMask to isolate bit
buffer.append( ( value & displayMask ) == 0 ? '0' : '1' );
value <<= 1; // shift value one position to left
}
/**
* unCompresses A file which was <p>
* encoded using the LZW Algorithm.
*/
public File unCompress(File file) {
/*
* 1) make the initial dictionary.
* 2) read the file into an array
* converting each 12 bits into an integer.
* 3) apply the algorithm.
*/
/* create and intialize the dictionary */
HashMap dictionary = new HashMap();
initializeDictionary(dictionary);
//the final string to be decoded.
Vector toBeDecoded = new Vector();
/* read the file into an array */
readFileIntoArray(toBeDecoded, file);
/* decode and store the decoded text in output */
String output = decodeUsingLZW(toBeDecoded, dictionary);
/* create the file to write the output to */
File decoded = new File("decoded.txt");
/* write the output to the file */
try {
BufferedWriter out = new BufferedWriter(new FileWriter(decoded));
out.write(output);
out.flush();
}
catch (IOException e){
System.out.println(e);
}
/* return the output file */
return decoded;
}
//------------------------------
/**
* Initializes the Dictionary with the characters from 0 -255.
* The code of each character will be its ASCII representation.
* <p>
* @param dictionary The dictionary to be initialized.
*/
private void initializeDictionary(HashMap dictionary){
for (int i=0; i<256; i++)
dictionary.put(new Integer(i), (char)i+"");
}
//--------------------------------------
/**
* Reads the contents of the file to be decoded and stores
* each 12 bits as an integer in 'toBeDecoded'. <p>
* @param toBeDecoded The contents of the input
* file stored as integers.
* @param file The file to be decoded.
*/
private void readFileIntoArray(Vector toBeDecoded, File file) {
try {
BufferedReader input = new BufferedReader(new FileReader(file));
//the code recieved of one from the input file (12 bits).
String code = "";
//the code when changed to binary.
int integerCode;
//one bit read from the file.
int bit;
//true when the end of file is reached.
boolean eof = false;
/* read the bits from the file, 12 bits each time.
* Then change the the code into an integer.
* Break when the end of the file is reached.
* End of file is -1.
*/
while (!eof) {
for (int i = 0; i < 12; i++) {
bit = input.read();
/* if the end of file is not reached yet then
* convert the most recently read 12 bits into
* an integer and store it in the vector.
*/
if (!eof) {
integerCode = Integer.parseInt(code, 2);
toBeDecoded.add(new Integer(integerCode));
code = "";
}
}//end of while.
}
catch (IOException e){
System.out.println(e);
}
}
//--------------------------------------
/**
* Performs the LZW decoding. <p>
* @param toBeDecoded The array of code to be decoded.
* @param dic The dictionary which holds the codes.
* @return The decoded text.
*/
private String decodeUsingLZW(Vector toBeDcoded, HashMap dic) {
/* contains the decoded text */
String output="";
/* a temporary string necessary for the algorithm */
String temp="";
/* the code to be decoded */
Integer intCode;
/* another string necessary for the algorithm */
String entry=null;
/* apply the algorithm */
for (int i=0; i<toBeDcoded.size(); i++) {
intCode = (Integer)toBeDcoded.get(i);
For the input, try anything, it will fail only when the file is quite big.
and thanks again in advance.
06-02-2005, 06:28 AM
maxx007at
Next time please set code between code tags. It saves space
Maxx007at
06-02-2005, 08:30 AM
sjalle
Pretty hairy stuff. Why this happends I don't know, but I fail to find any close() -ing of
the files you use, that seems a bit risky to me. And I would use a StringBuffer
instead of a String when there is a lot of += going on. The buffer could be
set up initially to match the filesize.
But what really beats me is why you want this to stand a test of 1800
consecutive and identical (and useless) operations :confused:, kind of like
testing tyres with a drill...