Thursday, 17. May 2012

Google's bots learn to read interactive webpages more like humans


Google feeds its search engine's index with site data from a virtual army of "bots": Web-crawling applications that scour sites for content. In the past, those bots hit a wall when they ran into interactive content loaded through JavaScript, especially on pages that use Asynchronous JavaScript and XML (AJAX) to give users access to additional content without reloading the page. Now, according to Vancouver-based developer Alex Pankratov, it appears Google's bots have been trained to act more like humans when mining interactive site content: they run the JavaScript on the pages they crawl to see what gets coughed up.
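To make the "wall" concrete, here is a minimal sketch of what a crawler that does not execute JavaScript would extract from an AJAX-driven page. Everything in it is an illustrative assumption rather than anything Google is known to use: the sample page, the `loadComments` function, the `/comments.json` URL, and the `static_crawl` helper are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: the comment text only exists after the script runs.
PAGE = """<html><body>
<h1>Article</h1>
<div id="comments"></div>
<script>
  // In a browser this would fetch and insert the comments via AJAX.
  loadComments("/comments.json");
</script>
</body></html>"""


class VisibleText(HTMLParser):
    """Collects the text a non-JavaScript crawler would find in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script source is skipped: it is code, not page content.
        text = data.strip()
        if text and not self._in_script:
            self.chunks.append(text)


def static_crawl(html):
    """Return the visible text chunks of the page without running any script."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(static_crawl(PAGE))
```

The static pass finds only the heading; the comments container stays empty because nothing ever runs the script that would fill it. A crawler that executes the page's JavaScript, as the article suggests Google's bots now do, would see that extra content as well.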

arstechnica.com
