Group Meeting Minutes

 

Date: July 14th, 2005

Place: SB220

Attendees: Wai Gen, Linh, Dongmei

 

²       Linh’s presentation on the paper “Making Peer-to-Peer Keyword Searching Feasible Using Multi-level Partitioning.” Shuming Shi et al., Tsinghua Univ. Third International Workshop on Peer-to-Peer Systems (IPTPS’04), 2004.

Ø        It’s about large-scale keyword searching on P2P networks. They proposed a new index partitioning and building scheme called multi-level partitioning (MLP) and compared it with two existing schemes, partition-by-document and partition-by-keyword.

Ø        MLP is implemented on SkipNet, which organizes nodes into a circular distributed data structure. Every node has two IDs: a lexicographic ID (LexID) and a numeric ID (NumID).

o       LexID is generated by mapping the coordinates of a node from a d-dimensional space to a 1-dimensional space using space-filling curves.

o       NumID is generated by hashing a node’s public key or its IP address, similar to how DHTs assign node IDs.

Ø        Node groups form a hierarchy, built by dividing nodes according to their LexIDs digit by digit.

Ø        Each group maintains a local inverted index for all the documents within the group. For each group on level l, the index is partitioned among the group’s nodes according to the partition-by-keyword scheme.

Ø        A query is first broadcast to all groups on level 1, and then to the groups on each subsequent level until level l is reached. On level l, the nodes in each group that hold the inverted lists of the query keywords are responsible for answering the query. Each group’s search results are generated by intersecting those inverted lists.

Ø        From their experiments, they concluded that MLP can dramatically reduce the bisection bandwidth and the end-user latency compared to the partition-by-keyword scheme, and that it only needs to broadcast a query to a moderate number of peers to generate precise results, compared to partition-by-document.
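
Note added for reference: a minimal Python sketch of the indexing and query steps summarized above, not the authors' implementation. It assumes LexIDs are plain digit strings, uses a single level l, picks the node responsible for a keyword by a SHA-1 hash within the group, and replaces the level-by-level broadcast with a simple loop over the level-l groups. All function names (group_of, keyword_owner, build_index, search) are made up for illustration.

```python
import hashlib
from collections import defaultdict


def group_of(lex_id, level):
    """Group label on a given level: the first `level` digits of the LexID (assumed)."""
    return lex_id[:level]


def keyword_owner(keyword, members):
    """Partition-by-keyword inside one group: hash the keyword to one member node."""
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return members[int.from_bytes(digest[:4], "big") % len(members)]


def build_index(docs_by_node, level):
    """docs_by_node maps LexID -> {doc_id: text}.

    Returns (groups, index): groups maps a group label to its member LexIDs,
    index maps node -> keyword -> set of doc ids held in that node's inverted lists.
    """
    groups = defaultdict(list)
    for lex_id in sorted(docs_by_node):
        groups[group_of(lex_id, level)].append(lex_id)

    index = defaultdict(lambda: defaultdict(set))
    for lex_id, docs in docs_by_node.items():
        members = groups[group_of(lex_id, level)]
        for doc_id, text in docs.items():
            # A document is indexed only inside the group of the node that stores it.
            for word in set(text.lower().split()):
                index[keyword_owner(word, members)][word].add(doc_id)
    return groups, index


def search(query, groups, index):
    """Answer a multi-keyword query by intersecting inverted lists within each group."""
    words = query.lower().split()
    results = set()
    for members in groups.values():  # stands in for the broadcast down to level l
        lists = [index[keyword_owner(w, members)].get(w, set()) for w in words]
        if lists:
            results |= set.intersection(*lists)
    return results
```

With four nodes "00", "01", "10", "11" and level 1, the nodes split into groups "0" and "1"; a two-keyword query is intersected separately inside each group and the per-group results are unioned, so no inverted list has to cross a group boundary.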

 

²       Linh:

Ø        Try to critique the ICDE paper (query masking paper).

Ø        Try to see if there are any interesting patterns in the results of the query masking paper and explain why the results behave that way.

 

²       Dongmei:

Ø        Pick a paper to present for next week.

“Content-Based Community Formation in Hybrid Peer-to-Peer.” Atip Asvanund, Ramayya Krishnan, Carnegie Mellon University. In Proceedings of the 2004 SIGIR Workshop on Peer-to-Peer Information Retrieval, pages 24-34.

               http://p2pir.is.informatik.uni-duisburg.de/2004/17.pdf

Ø        Collect and analyze the simulation results on the small dataset after modifying the global and local distribution of categories based on their size and the distribution of files within a category.

Ø        Keep working on the LimeWire prototype. Hashing binary files and comparing the hashes to check whether two files are identical is done. Next step: plug it into the LimeWire project and get it working.

Ø        Order the business cards.
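
Note added for the LimeWire prototype item above: a minimal Python sketch of the hash-and-compare step, under stated assumptions. The minutes do not say which hash function or language the prototype uses; SHA-1 and the function names file_digest and same_content are assumptions for illustration only.

```python
import hashlib


def file_digest(path, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks
    so that large binary files do not have to fit in memory."""
    hasher = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def same_content(path_a, path_b):
    """Treat two files as the same if their digests match."""
    return file_digest(path_a) == file_digest(path_b)
```

Comparing digests instead of raw bytes means each file is hashed once and the digest can be stored and reused when checking against many other files.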