The Web Changes Everything: Understanding the Dynamics of Web Content
Eytan Adar, Jaime Teevan, Susan Dumais, Jonathan Elsas
The Web is a dynamic, ever-changing collection of information.
This paper explores changes in Web content by analyzing a crawl
of 55,000 Web pages, selected to represent different user
visitation patterns. Although change over long intervals has been
explored on random (and potentially unvisited) samples of Web
pages, little is known about the nature of finer-grained changes to
pages that are actively consumed by users, such as those in our
sample. We describe algorithms, analyses, and models for
characterizing changes in Web content, focusing on both time (by
using hourly and sub-hourly crawls) and structure (by looking at
page-, DOM-, and term-level changes). Change rates are higher in
our behavior-based sample than those found in previous work on
randomly sampled pages, with a large portion of pages changing
more than hourly. Detailed content and structure analyses identify
stable and dynamic content within each page. The understanding
of Web change we develop in this paper has implications for tools
designed to help people interact with dynamic Web content, such
as search engines, advertising, and Web browsers.
PDF (625Kb), WSDM'09, Barcelona, Spain, Feb. 9-12, 2009 (Best Student Paper Award)