New Directions and Challenges of Near-Duplicate Detection

Call for Papers

Near-duplicate detection has attracted considerable attention in recent years because of its practical significance for online applications and the continuing growth of the Internet. While much of the focus has been on reducing search-engine index size and preventing plagiarism, newer applications such as spam filtering are also finding the problem highly relevant. Beyond the traditional text domain, image- and audio-based near-duplicate detection is of growing importance. Despite much progress, some issues continue to pose significant problems (e.g., on the definitional level, the point at which documents stop being near-duplicates and are merely highly similar to one another).

In this special session we aim to bring together researchers and practitioners tackling various aspects of near-replica detection. The proposed areas of focus include, but are not limited to:

  • Benchmark collections and evaluation
  • Application-specific definitions of near duplicates
  • Image- and audio-based near-replica detection
  • Block-level vs. document-level near-replica detection
  • Spam and email filtering
  • Plagiarism
  • Scalability
  • Novel applications of near-duplicate detection

Submissions formatted according to the CIDM 2007 guidelines should be sent via email to alek [at] ir.iit.edu.

Important Dates:

Paper Submission: November 15, 2006
Notification of Acceptance: December 15, 2006
Camera-Ready Submission: January 15, 2007
Conference: April 1-5, 2007

Session Chairs:

Aleksander Kołcz
Abdur Chowdhury