-
Lzw
Hello....
I have written a little LZW copmression program... the program works fine, however, at some point int some files, it will stop working properly and starts giving strange characters. this happens only with large files.
for example if I have one sentence in the file, everything is fine. When I copy it in the file 900 times, everything is still fine, when I copy it 1800 times, it will start to give wrong output.
Any help would be really appreciated.
Thanks in advance.
Here is my code:
/*
* Created on Mar 29, 2005
*
* LZW.java
*
* Mar 29, 2005
*/
import java.io.*;
import java.io.FileReader;
import java.util.*;
/**
* Performs the encoding and decoding <p>
* using the LZW algorithm.
* @version 1.0
*/
public class LZW implements CompressionAlgorithm {
private Hashtable table = new Hashtable();
/* (non-Javadoc)
* @see CompressionAlgorithm#compress(java.io.File)
*/
public File compress(File file) {
try{
BufferedReader br = new BufferedReader(new FileReader("in.txt"));
PrintStream p = new PrintStream(new FileOutputStream("out.txt"));
/*intitialize the dictionary*/
for(int i = 0; i < 256; i++)
table.put((char)i + "", new Integer(i));
String str = (char)br.read() + "";
int c, code = 256;
while( (c = br.read()) > -1){
if(table.containsKey(str + (char)c))
str += (char)c;
else{
Integer tmp = (Integer)table.get(str);
p.print(getBits(tmp.intValue()));
table.put(str + (char)c, new Integer(code));
str = (char)c + "";
code++;
}
}
Integer tmp = (Integer)table.get(str);
p.print(getBits(tmp.intValue()));
}
catch(IOException e)
{
System.err.println("File Error.");
}
return null;
}
// display bit representation of specified int value
private String getBits( int value )
{
// create int value with 1 in leftmost bit and 0s elsewhere
int displayMask = 1 << 11;
StringBuffer buffer = new StringBuffer( 35 ); // buffer for output
// for each bit append 0 or 1 to buffer
for ( int bit = 1; bit <= 12; bit++ ) {
// use displayMask to isolate bit
buffer.append( ( value & displayMask ) == 0 ? '0' : '1' );
value <<= 1; // shift value one position to left
}
return buffer.toString();
}
//------------------------
/**
* unCompresses A file which was <p>
* encoded using the LZW Algorithm.
*/
public File unCompress(File file) {
/*
* 1) make the initial dictionary.
* 2) read the file into an array
* converting each 12 bits into an integer.
* 3) apply the algorithm.
*/
/* create and intialize the dictionary */
HashMap dictionary = new HashMap();
initializeDictionary(dictionary);
//the final string to be decoded.
Vector toBeDecoded = new Vector();
/* read the file into an array */
readFileIntoArray(toBeDecoded, file);
/* decode and store the decoded text in output */
String output = decodeUsingLZW(toBeDecoded, dictionary);
/* create the file to write the output to */
File decoded = new File("decoded.txt");
/* write the output to the file */
try {
BufferedWriter out = new BufferedWriter(new FileWriter(decoded));
out.write(output);
out.flush();
}
catch (IOException e){
System.out.println(e);
}
/* return the output file */
return decoded;
}
//------------------------------
/**
* Initializes the Dictionary with the characters from 0 -255.
* The code of each character will be its ASCII representation.
* <p>
* @param dictionary The dictionary to be initialized.
*/
private void initializeDictionary(HashMap dictionary){
for (int i=0; i<256; i++)
dictionary.put(new Integer(i), (char)i+"");
}
//--------------------------------------
/**
* Reads the contents of the file to be decoded and stores
* each 12 bits as an integer in 'toBeDecoded'. <p>
* @param toBeDecoded The contents of the input
* file stored as integers.
* @param file The file to be decoded.
*/
private void readFileIntoArray(Vector toBeDecoded, File file) {
try {
BufferedReader input = new BufferedReader(new FileReader(file));
//the code recieved of one from the input file (12 bits).
String code = "";
//the code when changed to binary.
int integerCode;
//one bit read from the file.
int bit;
//true when the end of file is reached.
boolean eof = false;
/* read the bits from the file, 12 bits each time.
* Then change the the code into an integer.
* Break when the end of the file is reached.
* End of file is -1.
*/
while (!eof) {
for (int i = 0; i < 12; i++) {
bit = input.read();
if (bit == -1) {
eof = true;
break;
}
else
code += (bit - 48);
}
/* if the end of file is not reached yet then
* convert the most recently read 12 bits into
* an integer and store it in the vector.
*/
if (!eof) {
integerCode = Integer.parseInt(code, 2);
toBeDecoded.add(new Integer(integerCode));
code = "";
}
}//end of while.
}
catch (IOException e){
System.out.println(e);
}
}
//--------------------------------------
/**
* Performs the LZW decoding. <p>
* @param toBeDecoded The array of code to be decoded.
* @param dic The dictionary which holds the codes.
* @return The decoded text.
*/
private String decodeUsingLZW(Vector toBeDcoded, HashMap dic) {
/* contains the decoded text */
String output="";
/* a temporary string necessary for the algorithm */
String temp="";
/* the code to be decoded */
Integer intCode;
/* another string necessary for the algorithm */
String entry=null;
/* apply the algorithm */
for (int i=0; i<toBeDcoded.size(); i++) {
intCode = (Integer)toBeDcoded.get(i);
entry = (String)dic.get(intCode);
if (entry == null)
entry = temp + temp.charAt(0);
output += entry;
if (temp != ""){
dic.put(new Integer(dic.size()), temp + entry.charAt(0));
}
temp = entry;
}
/* return the output file
* that contains the decoded text
*/
return output;
}
//------------------------
public static void main(String [] args) {
LZW l = new LZW();
l.compress(new File("in.txt"));
l.unCompress(new File("out.txt"));
}
}
/*-**************************************************************/
For the input, try anything, it will fail only when the file is quite big.
and thanks again in advance.
Last edited by barhoooooom; 04-28-2009 at 09:51 PM.
-
Next time please set code between code tags. It saves space
Maxx007at
-
Pretty hairy stuff. Why this happends I don't know, but I fail to find any close() -ing of
the files you use, that seems a bit risky to me. And I would use a StringBuffer
instead of a String when there is a lot of += going on. The buffer could be
set up initially to match the filesize.
But what really beats me is why you want this to stand a test of 1800
consecutive and identical (and useless) operations , kind of like
testing tyres with a drill...
Last edited by sjalle; 06-02-2005 at 08:36 AM.
eschew obfuscation
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
Forum Rules
|
Top DevX Stories
Easy Web Services with SQL Server 2005 HTTP Endpoints
JavaOne 2005: Java Platform Roadmap Focuses on Ease of Development, Sun Focuses on the "Free" in F.O.S.S.
Wed Yourself to UML with the Power of Associations
Microsoft to Add AJAX Capabilities to ASP.NET
IBM's Cloudscape Versus MySQL
|
Bookmarks