Scatter/Gather Browser - a Visual Web Interface for Text Navigation and Search

Team members:
Alex Berry
Sujit Gadkari
Weimao Ke
Follow the link below to the working Scatter/Gather Browser:


Effective and efficient browsing methods for large text collections have been widely examined in recent years. Among existing implementations of various browsing methods, Scatter/Gather browsing is well known for its ease of use and its effectiveness in situations where it is difficult to specify a query precisely (Cutting, Karger, Pedersen, and Tukey 1992; Hearst and Pedersen 1996). It combines search and interactive navigation by gathering and reclustering the clusters a user selects (each represented, for example, by levels of keywords).

The Scatter/Gather browsing method was first proposed by Cutting, Karger, Pedersen, and Tukey (1992). In each iteration, the system scatters the dataset into a small number of clusters (groups) and presents a short summary of each to the user. The user selects one or more of the groups for further study. The selected groups are then gathered together and clustered again using the same clustering algorithm. With each successive iteration the groups become smaller and more focused. Iterating in this way helps users refine their queries and find the desired information in a large data collection.
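
The iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's Perl implementation: the function names (`scatter`, `summarize`, `gather`), the toy corpus, and the simple k-means-style clustering over term-frequency vectors with cosine similarity are all assumptions standing in for whatever clustering algorithm the actual system uses.

```python
from collections import Counter
import math


def vectorize(text):
    """Term-frequency vector for one document (lowercased whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Sum of the vectors; scaling does not matter for cosine similarity."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def scatter(docs, k, rounds=5):
    """Partition docs into at most k clusters (hypothetical k-means-style stand-in)."""
    vecs = [vectorize(d) for d in docs]
    k = min(k, len(docs))
    centers = [vecs[i] for i in range(k)]  # assumption: seed with the first k docs
    assign = []
    for _ in range(rounds):
        # Assign every document to its most similar center.
        assign = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vecs]
        # Recompute each center from its current members.
        for c in range(k):
            members = [vecs[i] for i, a in enumerate(assign) if a == c]
            if members:
                centers[c] = centroid(members)
    clusters = [[d for d, a in zip(docs, assign) if a == c] for c in range(k)]
    return [c for c in clusters if c]


def summarize(cluster, n=3):
    """Short cluster summary: the n most frequent terms (ties broken alphabetically)."""
    total = centroid(vectorize(d) for d in cluster)
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def gather(clusters, selected):
    """Merge the user-selected clusters back into one document list."""
    return [d for i in selected for d in clusters[i]]


# One scatter/gather session on a toy corpus (made up for illustration).
corpus = [
    "apple banana fruit", "python code program", "banana fruit salad",
    "program code bug", "fruit apple juice", "code python script",
]
clusters = scatter(corpus, k=2)
for i, cluster in enumerate(clusters):
    print(i, summarize(cluster), len(cluster))

focused = gather(clusters, [0])        # the user picks cluster 0
subclusters = scatter(focused, k=2)    # recluster with the same algorithm
```

Each pass through `scatter`, `summarize`, and `gather` corresponds to one iteration of the browsing loop. In an interactive browser the hard-coded selection `[0]` would come from the user.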

This project aims to implement a Scatter/Gather browser, a dynamic visualization for text navigation and search. Using visualization techniques, the browser will help users refine their search queries and narrow down search results interactively and visually. As a proof of concept, we will constrain ourselves to a small text corpus. We will modularize the browser so that it can be attached to any text collection in the future.


[1] Baeza-Yates, R. and B. Ribeiro-Neto (2004). Modern Information Retrieval. Addison Wesley Longman publishing.
[2] Cutting, D. R., D. Karger, J. O. Pedersen, and J. W. Tukey (1992). Scatter/gather: A cluster-based approach to browsing large document collections. In The 15th Annual ACM-SIGIR, pp. 318-329.
[3] Hearst, M. A. and J. O. Pedersen (1996). Reexamining the cluster hypothesis: scatter/gather on retrieval results. In SIGIR '96: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, pp. 76-84. ACM Press.
[4] Korfhage, R. R. (1997). Information Storage and Retrieval. Wiley Computer Pub.
[5] Salton, G., A. Wong, and C. S. Yang (1975). A vector space model for automatic indexing. Commun. ACM 18 (11), 613-620.

Scatter/Gather Iterative Sessions

Scatter/Gather Browser Prototype

(c) Copyright 2007 Scatter/gather -- Powered by Perl