Research & Projects

Experiences shape interests. For years, I enjoyed working as an engineer in the IT industry, leading the design and development of several large-scale information systems. Coming from an industrial background, I have been motivated by critical real-world problems. To address these problems, I have focused on high-level theoretical investigations of the underlying abstractions and relied on real-world "big data" for experimental research.
 
My research has centered on information retrieval (IR) systems, particularly large-scale distributed systems for IR. The goal has been to study intelligent information systems that can adapt and scale with the growing magnitude and dynamics of information and data. Around this topic, I have also conducted research on ad-hoc retrieval models, interactive IR, text mining, and complex networks. I have organized my research into the following themes.
 

Projects overview

Information theory and retrieval modeling

This research aimed to study existing information theories, as well as potential new information measures, that can be used for IR modeling, among other applications. I have developed a new theory, the Least Information Theory (LIT), and conducted several studies to evaluate its application in IR. These studies produced strong empirical results compared with classic methods derived from existing theories.

Complex systems and networks

My work on distributed IR has drawn on theories and inspirations not only from information retrieval but also from complex networks research (interconnectivity of distributed systems). Understanding structural properties of interconnected systems provides important insight into a broad range of applications such as communication, distributed computing, and bibliometrics (e.g., by taking citations/coauthorships as network edges).

Large-scale and distributed information retrieval

The goal was to investigate the basic principles underlying efficient and effective search operations, given the magnitude of information (big data) and the distributed computing resources available (cloud). I developed the theory of the clustering paradox in distributed IR and studied its impact on searching in large-scale information networks (see the representative publication, to appear in ACM TOIS, 2013).