Effective and efficient browsing methods for large text collections have been widely examined
in recent years. Among existing implementations of various browsing methods,
Scatter/Gather browsing is well known for its ease of use and effectiveness in situations
where it is difficult to precisely specify a query (Cutting, Karger, Pedersen, and Tukey
1992; Hearst and Pedersen 1996). It combines search and interactive navigation by gathering
and reclustering user-selected clusters (e.g. represented by levels of keywords).
The Scatter/Gather browsing method was first proposed by Cutting, Karger, Pedersen, and
Tukey (1992). In each iteration of this browsing method, the system scatters the dataset
into a small number of clusters/groups, and presents short summaries of them to the user.
The user can select one or more of the groups for further study. The selected groups are
then gathered together and clustered again using the same clustering algorithm. With
each successive iteration the groups become smaller and more focused. Iterations in this
method can help users refine their queries and find the desired information from a large
text collection.
This project aims to implement a Scatter/Gather browser, a dynamic visualization for
text navigation and search. Using visualization techniques, this browser will help users
refine their search queries and narrow down search results interactively and visually. We
will restrict ourselves to a smaller text corpus as a proof of concept, and will modularize
the browser so that it can be attached to any text collection in the future.
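
One Scatter/Gather iteration as described above (scatter, summarize, select, gather, then
scatter again) can be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions, not the planned implementation: the function names (scatter,
summarize, gather), the tiny stop-word list, and the use of a simple k-means-style
clustering over term-frequency vectors with cosine similarity are all choices made here
for brevity; they are not the clustering algorithms of Cutting et al. (1992).

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "as", "in", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def normalize(weights):
    """Scale a sparse vector (dict of term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    return {t: w / norm for t, w in weights.items()}


def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def scatter(docs, k, rounds=10):
    """Partition docs into at most k clusters, returned as lists of document indices."""
    if not docs:
        return []
    vectors = [normalize(Counter(tokenize(d))) for d in docs]
    k = max(1, min(k, len(docs)))
    # Deterministic seeding: k documents spaced evenly through the collection.
    centers = [vectors[i * len(docs) // k] for i in range(k)]
    assignment = [0] * len(docs)
    for _ in range(rounds):
        new = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors]
        if new == assignment:
            break
        assignment = new
        for c in range(k):
            total = Counter()
            for i, a in enumerate(assignment):
                if a == c:
                    total.update(vectors[i])  # adds the weights term by term
            if total:
                centers[c] = normalize(total)
    clusters = [[i for i, a in enumerate(assignment) if a == c] for c in range(k)]
    return [c for c in clusters if c]


def summarize(docs, cluster, n=3):
    """Short cluster summary: the n most frequent terms among its documents."""
    counts = Counter(t for i in cluster for t in tokenize(docs[i]))
    return [term for term, _ in counts.most_common(n)]


def gather(docs, clusters, selected):
    """Merge the user-selected clusters into a new, smaller document list."""
    return [docs[i] for s in selected for i in clusters[s]]


# Usage: one iteration on a toy corpus, with the user "selecting" cluster 0.
corpus = [
    "cats purr and cats meow",
    "kittens and cats chase mice",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
]
groups = scatter(corpus, k=2)
for g in groups:
    print(summarize(corpus, g), g)
focused = gather(corpus, groups, selected=[0])
print(scatter(focused, k=2))  # recluster only the selected documents
```

Keeping scatter, summarize, and gather as separate functions over a plain list of strings
is one way to keep the browser independent of any particular text collection; the
clustering step could be swapped out without touching the rest of the loop.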
Baeza-Yates, R. and B. Ribeiro-Neto (2004). Modern Information Retrieval. Addison
Wesley Longman.
Cutting, D. R., D. Karger, J. O. Pedersen, and J. W. Tukey (1992). Scatter/gather: A
cluster-based approach to browsing large document collections. In The 15th Annual
ACM-SIGIR, pp. 318-329.
Hearst, M. A. and J. O. Pedersen (1996). Reexamining the cluster hypothesis:
scatter/gather on retrieval results. In SIGIR '96: Proceedings of the 19th annual
international ACM SIGIR conference on Research and development in information
retrieval, New York, NY, USA, pp. 76-84. ACM Press.
Korfhage, R. R. (1997). Information Storage and Retrieval. Wiley Computer Pub.
Salton, G., A. Wong, and C. S. Yang (1975). A vector space model for automatic
indexing. Commun. ACM 18 (11), 613-620.
[Figure: Scatter/Gather Iterative Sessions]
[Figure: Scatter/Gather Browser Prototype]