Extracting Keywords and Content
Is there a way to extract the keywords and/or content from an HTML document? I've been playing with the URL class and trying to use its getContent method. However, that method returns an Object.
The question is: how do I retrieve the content from that Object? Ultimately, the keywords need to end up as members of a Set, but simply knowing how to view the content of the returned Object would be nice. Does anyone know how to do this? If so, what format is the content returned in (I'm assuming String...)?
Cheers in advance.
Do a search for my posts here containing the words "URLConnection", "URL" or "openStream".
Also, take a look at HttpUnit.
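
To make the openStream suggestion concrete, here is a minimal sketch rather than a definitive answer. It skips getContent entirely (for HTML the returned Object is typically an InputStream, though that depends on the content handler) and reads the page through URL.openStream() instead. The class name KeywordExtractor and the helpers readAll, fetch and extractKeywords are made up for this example. It also assumes the page is UTF-8 and that the keywords live in a <meta name="keywords" content="..."> tag with name written before content. A regex is only a rough stand-in for a real HTML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class KeywordExtractor {

    // Assumption: the tag looks like <meta name="keywords" content="a, b, c">,
    // with the name attribute before the content attribute.
    private static final Pattern META_KEYWORDS = Pattern.compile(
        "<meta\\s+name\\s*=\\s*[\"']keywords[\"']\\s+content\\s*=\\s*[\"']([^\"']*)[\"']",
        Pattern.CASE_INSENSITIVE);

    // Drains a Reader into a String.
    public static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Downloads the document behind the URL as text (assumes UTF-8).
    public static String fetch(URL url) throws IOException {
        try (Reader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return readAll(in);
        }
    }

    // Returns the comma-separated meta keywords as a Set, trimmed and lower-cased.
    // An empty Set means no matching meta tag was found.
    public static Set<String> extractKeywords(String html) {
        Set<String> keywords = new LinkedHashSet<>();
        Matcher m = META_KEYWORDS.matcher(html);
        if (m.find()) {
            for (String raw : m.group(1).split(",")) {
                String keyword = raw.trim().toLowerCase(Locale.ROOT);
                if (!keyword.isEmpty()) {
                    keywords.add(keyword);
                }
            }
        }
        return keywords;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java KeywordExtractor <url>");
            return;
        }
        String html = fetch(URI.create(args[0]).toURL());
        System.out.println(extractKeywords(html));
    }
}
```

Run it with the page address as the only argument. The String returned by fetch is the whole document, so you can print it to view the content, and extractKeywords gives you the Set you were after. If the pages you care about are messy, a proper parser or HttpUnit will cope better than the regex.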