Group Meeting Minutes
Date: July 21st, 2005
Place: SB232C
Attendees: Wai Gen, Linh, Dongmei
1/ Dongmei presented
experiment results for the uniform distribution of documents, using additional
ranking functions.
2/ Dongmei presented the paper
“Content-Based Community Formation in Hybrid Peer-to-Peer Networks”:
- The authors model digital
libraries as leaf nodes and regional directory services as ultrapeers in
a hybrid peer-to-peer architecture. They use a community formation
method that lets nodes in the network gradually change their set of
neighbors towards a topology that improves IR efficiency.
- To allow users whose content
interests and IR patterns are mutually beneficial to be placed in the
same or nearby communities, the authors proposed a valuation function
for estimating the value of establishing a connection between a pair of
peers, and a community formation algorithm.
- Valuation function: a query
model of a peer is built from the peer’s past queries. In addition,
a content model of each peer is built. K-L divergence is used to
measure the similarity between the past-query model of one peer and
the content model of another peer.
- Formation algorithm: given
a set of foreign peers, a peer evaluates the benefit of establishing a
connection with each new peer using the valuation function. If the benefit
of connecting to a new peer is greater than the benefit of an existing
peer in the neighbor set, the existing peer is replaced by the new peer.
The set of foreign peers may be retrieved through a host cache or through
PONG messages.
- Other extensions: a pruning
method and an intelligent host cache server are proposed as additional
techniques to improve IR efficiency.
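
A minimal sketch of the valuation function and the formation step summarized
in item 2, for discussion only. It is not the paper's implementation: the
unigram models, the additive smoothing, the direction of the K-L divergence,
the negation used to turn divergence into a "benefit", and all names
(language_model, valuation, update_neighbors, the toy peers A, B, C) are
assumptions made for illustration.

```python
import math
from collections import Counter


def language_model(texts, vocab, alpha=1.0):
    """Unigram model over vocab with additive smoothing (smoothing is an assumption)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """K-L divergence D(p || q) for two models defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def valuation(query_model, content_model):
    """Hypothetical benefit score: lower divergence means higher value."""
    return -kl_divergence(query_model, content_model)


def update_neighbors(neighbors, candidates, value_of):
    """Replace the least valuable neighbor whenever a candidate scores higher."""
    neighbors = list(neighbors)
    if not neighbors:
        return neighbors
    for cand in candidates:
        if cand in neighbors:
            continue
        worst = min(neighbors, key=value_of)
        if value_of(cand) > value_of(worst):
            neighbors[neighbors.index(worst)] = cand
    return neighbors


# Toy data (made up): one peer's past queries and three foreign peers' content.
vocab = ["jazz", "music", "protein", "genome"]
past_queries = ["jazz music", "music jazz jazz"]
contents = {
    "A": ["protein genome genome"],
    "B": ["jazz music music"],
    "C": ["jazz jazz music"],
}

q_model = language_model(past_queries, vocab)
c_models = {peer: language_model(docs, vocab) for peer, docs in contents.items()}


def value_of(peer):
    return valuation(q_model, c_models[peer])


new_neighbors = update_neighbors(["A"], ["B", "C"], value_of)
print(new_neighbors)
```

In this sketch the candidate list stands in for the foreign peers learned from
a host cache or PONG messages; the neighbor set keeps a fixed size and only the
lowest-valued neighbor is ever swapped out.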
3/ Linh reported a bug in the
simulator that, in the server masking case, makes the performance of all
ranking functions the same as the performance of ranking by arrival. The fix
is to change the value of one parameter.
4/ Linh reported cutting the
average number of responses per query in half while maintaining approximately
the same performance for server masking, by stopping masking early once some
responses have been received. The idea comes from query relaxation techniques,
which deal with “failed queries” (queries that return nothing) in databases.
An open question is how this “early stop masking” technique affects the
different ranking functions.
5/ Assignments:
Dongmei:
- Create a parameter for the
uniform distribution of documents and post the source code on the IR machine.
- Explain the experimental
results (this may lead to hypotheses and new experiment designs).
- Big goal: building Limewire
and the journal paper.
- Remember the conference
deadline.
Linh:
- Try different masking
algorithms (combination of existing ones).
- Try backup masking
algorithms.
- Explore query relaxation.
- Remember the conference
deadline.
6/ Paper for next meeting on
Aug 4: “Sloppy Hashing and Self-Organizing Clusters”
http://iptps03.cs.berkeley.edu/final-papers/coral.pdf