Information theory and retrieval modeling

This research studied existing information theories, as well as potential new information measures, that can be used for IR modeling and other applications. I developed a new theory, the Least Information Theory (LIT), and conducted several studies to evaluate its application in IR. These studies produced strong empirical results compared with classic methods derived from existing theories.

Complex systems and networks

My work on distributed IR has drawn on theories and inspirations not only from information retrieval but also from complex networks research (interconnectivity of distributed systems). Understanding structural properties of interconnected systems provides important insight into a broad range of applications such as communication, distributed computing, and bibliometrics (e.g., by taking citations/coauthorships as network edges).


Information Retrieval Literature

Here are lists of major information retrieval conferences, journals, and databases where you can find good literature in related areas. Among these venues, the SIGIR conference (Special Interest Group on Information Retrieval) and ACM TOIS (Transactions on Information Systems) are two of the most highly regarded. Most of these can be accessed through:

Large-scale and distributed information retrieval

The goal was to investigate the basic principles underlying efficient and effective search operations, given the magnitude of information (big data) and distributed computing resources (cloud). I developed the theory of the clustering paradox in distributed IR and studied its impact on searching in large-scale information networks (see the representative publication, to appear in ACM TOIS, 2013).

Hybrid Scatter/Gather browsing based on Bing search API

Another Scatter/Gather implementation, combining searching and browsing and built on the Bing search API:

Scatter/Gather browser on a news collection

We implemented a Scatter/Gather browser using 33k news documents provided by the HARD track of the Text REtrieval Conference (TREC). Scatter/Gather offers an alternative to the classic paradigms of searching and browsing, and is often useful when the user finds it difficult to formulate a search query. The included image is a snapshot of a working demo system, which can be accessed at:


