I am working on a project: a web browser with a filtering function, written in Java.
I am now implementing keyword scanning: if certain keywords (such as bad words) appear too often on a web page (say, 5 times), the page should be blocked.
I tried using the .length method to catch the words that appear on a web page, but it does not seem to scan the words on the page at all.
The class where I implement the website recognition and blocking function is 'dcToolBar'.
One more thing I forgot to ask: why does my browser display a page as if it had not loaded properly, with lots of strange code shown on the page itself?
Does anyone have an idea how to implement such a function? Please teach me if so. THANKS!!!
Let's say you have a file "badwords.txt" and another file "input.txt", and you want to count how many times any word from badwords.txt occurs in input.txt. You can do the following:
Code:
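import java.io.*;
import java.util.*;

// Reads one bad word per line; the result is sorted, as Arrays.binarySearch requires.
protected String[] readArray(File file) throws IOException {
    SortedSet<String> words = new TreeSet<String>();
    BufferedReader in = new BufferedReader(new FileReader(file));
    try {
        String word;
        while ((word = in.readLine()) != null) {
            words.add(word);
        }
    } finally {
        in.close();
    }
    // toArray() without an argument returns Object[], so pass a typed array
    return words.toArray(new String[0]);
}

// Counts the tokens in the input that exactly match one of the bad words.
public int countWords(String[] badWords, BufferedReader input) throws IOException {
    String line;
    int count = 0;
    while ((line = input.readLine()) != null) {
        // you should put punctuation and space delimiters in the string
        StringTokenizer st = new StringTokenizer(line, " .,/\\!@#$%^&*()_-+=", false);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // binarySearch returns some negative value (not always -1) when the token is absent
            if (Arrays.binarySearch(badWords, token) >= 0) {
                count++;
            }
        }
    }
    return count;
}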
You would then call something like this:
Code:
String[] badWords = readArray(new File("badwords.txt"));
BufferedReader input = new BufferedReader(new FileReader(new File("input.txt")));
int count = countWords(badWords, input);
// count is the number of occurrences of any word from badwords.txt in input.txt.
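To connect the count to the blocking decision, here is a minimal sketch under a few assumptions: the class name KeywordFilter and both method names are made up for illustration, the keywords are stored in lower case, the page text is already in a String, and matching is on whole words only (so "cars" does not count as "car"). It uses a HashSet instead of a sorted array, which is a swapped-in technique and not the binary search shown above.

```java
import java.util.Locale;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical helper for illustration; not part of the original browser code.
final class KeywordFilter {
    // Whitespace and punctuation that separate words (an assumed list, extend as needed).
    private static final String DELIMITERS = " \t\r\n.,;:/\\!?@#$%^&*()[]{}<>\"'_-+=";

    // Counts the tokens in text that exactly match a keyword, ignoring case.
    // Assumes every entry in keywords is already lower case.
    static int countMatches(Set<String> keywords, String text) {
        int count = 0;
        StringTokenizer st = new StringTokenizer(text, DELIMITERS);
        while (st.hasMoreTokens()) {
            String token = st.nextToken().toLowerCase(Locale.ROOT);
            if (keywords.contains(token)) {
                count++;
            }
        }
        return count;
    }

    // True once the number of matches reaches the threshold (5 in the question).
    static boolean shouldBlock(Set<String> keywords, String text, int threshold) {
        return countMatches(keywords, text) >= threshold;
    }
}
```

If the page is read line by line with readLine(), one option is to call countMatches on each line and add up the results before comparing the total with the threshold.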
I tried it out. I can now scan the text of a web page and save it into a temporary file for comparison, but here comes the problem: IT ONLY SAVES 1 LINE OF TEXT into the text file!!!
I have no idea why it only reads and saves 1 line of text. I need it to scan and save the ENTIRE body text of the web page so that I can compare it against the keywords (the keywords are stored in another text file, Keyword.txt, which I will read with Java I/O for the comparison).
Please teach me if there is any solution for this. THANKS!!!
while ((str = inR.readLine()) != null) {
    // str is one line of text; readLine() strips the newline character(s)
    buffWrite=new BufferedWriter(new FileWriter("Website.txt"));
    buffWrite.write(str);
    buffWrite.flush();
}
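You don't want to open a new BufferedWriter for every line. Each new FileWriter("Website.txt") truncates the file, so only the last line written survives. Open the writer once, before the loop. Try: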
Code:
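buffWrite = new BufferedWriter(new FileWriter("Website.txt"));
while ((str = inR.readLine()) != null) {
    // str is one line of text; readLine() strips the newline character(s)
    buffWrite.write(str);
    buffWrite.newLine(); // put the line break back so adjacent lines do not run together
}
buffWrite.close(); // close() also flushes, so flushing after every line is not needed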
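For reference, the same loop can be written as a small self-contained method. This is only a sketch, assuming Java 7 or later for try-with-resources; the class name LineCopier and the method copyLines are hypothetical. It opens the writer once, writes a line separator after each line (readLine() strips it), and closes both streams even if an exception is thrown.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper for illustration; not part of the original browser code.
final class LineCopier {
    // Copies every line from source to target and returns the number of lines copied.
    static int copyLines(Reader source, Writer target) throws IOException {
        int lines = 0;
        // Both resources are closed (and the writer flushed) when the block exits.
        try (BufferedReader in = new BufferedReader(source);
             BufferedWriter out = new BufferedWriter(target)) {
            String str;
            while ((str = in.readLine()) != null) {
                out.write(str);
                out.newLine(); // restore the line break that readLine() removed
                lines++;
            }
        }
        return lines;
    }
}
```

In the browser it could be called with the page's reader as the source and new FileWriter("Website.txt") as the target.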
I have improved my program using your example. It runs properly now, but the keyword filtering still does not work: I tried a target web page and the page is still displayed...
I put the word "car" in my Keyword.txt and then tried to access this web page: http://db.gamefaqs.com/console/ps2/...t_auto_sa_h.txt, which contains the word "car" many times (I have checked). But in the end the page is still displayed...
It is supposed to save the web page into 'Website.txt' (I use BufferedReader and BufferedWriter) and compare it with the content of 'Keyword.txt'. If the keywords match more than 5 times, the website should be blocked.
I implemented this function in my 'dcToolBar' class. I have no idea why it is not working, because the code already seems logical to me....
Can someone please teach me a solution? This problem is really making me cry. PLEASE HELP ME!!!
*crying*