A growing proportion of daily human activity is recorded and served by hidden web sites. In addition to online social networking and media sites (OSNs), there are millions of other hidden web data sources. In this digitized world, a question of interest to both data providers and their users is: what information can be inferred statistically from a sample obtained remotely through restrictive web interfaces? The information inferred ranges from individual traits to summary statistics and structural properties of networks. The answer to this question has applications in business intelligence, social media marketing, criminal group detection, and political mobilization. While our main goal is to uncover hidden properties, the same techniques can also be used by data providers to design searchable interfaces that protect their data.
In this talk I will present several estimation techniques in the context of graphs, and show an application to the detection of millions of fake followers on Weibo.
Jianguo Lu is a professor of Computer Science at the University of Windsor in Ontario, Canada. He received his bachelor's, master's, and doctoral degrees from Nanjing University in 1985, 1988, and 1991, respectively. He has a background in software engineering and intelligent software agents. In recent years, he has focused on the discovery of interesting properties in the deep web and online social networks, and has published in journals such as TKDE (IEEE Transactions on Knowledge and Data Engineering), IR (Information Retrieval), DKE (Data and Knowledge Engineering), and IPM (Information Processing and Management).