simulating synchronous HTTP transfers
Hello,
I am interested in crawling websites automatically. Ideally, I would be able to get an image capture and an HTML source every few minutes.
However, there are some sites that resist asynchronous HTTP transfers. One example is the Yahoo front page. If you access it asynchronously, perhaps from a Java program using a method like 'readRawSource' or 'loadStrings' or the like, you're making an asynchronous request, and the response you get back is never what is actually showing on the page at the time you make the request.
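To be concrete, what I am doing now boils down to a raw fetch like this sketch in plain Java (the User-Agent value is just an example; some sites send different markup to unknown clients):

[code]
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RawFetch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.yahoo.com/"); // the page mentioned above
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Identify as a browser; some sites serve different markup otherwise.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");

        // Read the raw source line by line, the way loadStrings() would.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"));
        StringBuilder source = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            source.append(line).append('\n');
        }
        in.close();
        System.out.println(source);
    }
}
[/code]

What this returns for the Yahoo front page never matches what a browser is displaying at that moment.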
Is it simply impossible to crawl a website like this? Or can some kind of browser emulation be performed in a Java applet that makes a synchronous request to a problem site, saves the source, and saves an image capture? Something like the JEditorPane sketch below is the kind of thing I have in mind.
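Here is a rough sketch of what I mean by emulation, using Swing's built-in HTML renderer: load the page, write out whatever it parsed, and paint the component into an image. (Just a sketch; I realize JEditorPane only handles old-style HTML and runs no JavaScript, so I am not sure it would cope with a page like Yahoo's. The URL, file names, and window size are placeholders.)

[code]
import java.awt.image.BufferedImage;
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.io.File;
import java.io.FileWriter;
import java.net.URL;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.imageio.ImageIO;
import javax.swing.JEditorPane;

public class PageSnapshot {
    public static void main(String[] args) throws Exception {
        JEditorPane pane = new JEditorPane();
        pane.setEditable(false);

        // setPage() may load the document in the background, so wait for the
        // "page" property change that signals loading has finished.
        final CountDownLatch loaded = new CountDownLatch(1);
        pane.addPropertyChangeListener("page", new PropertyChangeListener() {
            public void propertyChange(PropertyChangeEvent evt) {
                loaded.countDown();
            }
        });
        pane.setPage(new URL("http://www.yahoo.com/")); // placeholder URL
        loaded.await(30, TimeUnit.SECONDS); // give up after 30 seconds

        // Save the source as the renderer parsed it.
        FileWriter out = new FileWriter("source.html");
        out.write(pane.getText());
        out.close();

        // Size the component off-screen and paint it into an image capture.
        pane.setSize(1024, 768);
        BufferedImage img =
                new BufferedImage(1024, 768, BufferedImage.TYPE_INT_RGB);
        pane.paint(img.getGraphics());
        ImageIO.write(img, "png", new File("capture.png"));
    }
}
[/code]

Is something along these lines viable, or is a real browser the only way?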
thanks.