# CS 790g Seminar: Complex Networks

## Due on Monday Nov 30, 2009 at 2:30 pm

PageRank intuition (3 points)

Go to the pagerank demo. You'll see a small directed network, and by clicking the 'iterate' button you will calculate the PageRank of each node. It will take several iterations for the algorithm to converge. At each iteration, the probability that a random walker is found at a given node A is proportional to the sum, over every node B with a directed edge to A, of the probability that the walker was at B divided by the outdegree of B. The width of each edge in the visualization is proportional to the probability that the walker transitions from B to A.
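
As a rough aid, the update rule above can be written in symbols. The notation is introduced here and is not part of the demo: $p_t(A)$ is the probability of finding the walker at node A after $t$ iterations and $k_B^{\text{out}}$ is the outdegree of node B. Ignoring teleportation and sinks, one iteration is approximately:

$$
p_{t+1}(A) = \sum_{B \,:\, B \to A} \frac{p_t(B)}{k_B^{\text{out}}}
$$

Each term $p_t(B) / k_B^{\text{out}}$ is the share of B's probability that flows along the single edge from B to A.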

1. Can you explain the different edge widths you see? Approximately how many iterations does the algorithm take to converge?
2. Try increasing the teleportation probability. How does this influence the PageRanks assigned to the nodes?
3. Try allowing sinks. Without sinks allowed, once a random walker reaches a node with no outgoing edges, it jumps randomly to another node. With sinks allowed, it stays at that node with probability (1 - teleportation) and jumps to a random node with probability teleportation. What effect does allowing sinks have on the distribution of PageRanks?
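
If you want to check your intuition outside the demo, the following is a minimal Python sketch of the random-walker iteration with the two options described above. It is not the demo's code: the function name `pagerank`, the parameters `teleport` and `allow_sinks`, the convergence tolerance, and the toy graph are all assumptions made for illustration, and random jumps are assumed to land uniformly on any node, including the current one.

```python
def pagerank(nodes, edges, teleport=0.15, allow_sinks=False,
             tol=1e-10, max_iter=1000):
    """Iterate the random-walker probabilities until they stop changing.

    Returns (rank, iterations), where rank maps node -> probability.
    """
    n = len(nodes)
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)

    rank = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for iteration in range(1, max_iter + 1):
        new = {v: 0.0 for v in nodes}
        jump = 0.0  # probability mass that jumps to a uniformly random node
        for v in nodes:
            mass = rank[v]
            if out[v]:
                # Follow an outgoing edge with probability (1 - teleport).
                jump += teleport * mass
                share = (1 - teleport) * mass / len(out[v])
                for w in out[v]:
                    new[w] += share
            elif allow_sinks:
                # Sinks allowed: stay put with probability (1 - teleport).
                jump += teleport * mass
                new[v] += (1 - teleport) * mass
            else:
                # Sinks not allowed: always jump away from a dead end.
                jump += mass
        for v in nodes:
            new[v] += jump / n
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank, iteration


# Hypothetical toy graph: a 3-cycle plus one node pointing into it.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks, steps = pagerank(toy_nodes, toy_edges, teleport=0.15)
for name in toy_nodes:
    print(name, round(ranks[name], 3))
print("iterations:", steps)
```

Rerunning with a different `teleport` value, or with `allow_sinks=True` on a graph that has a dead-end node, mirrors the two experiments in questions 2 and 3.
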

PageRank (3 points)

Construct a small directed network (about 10 nodes) in GDF or .net format and load it into GUESS. Construct it such that you have at least one node that will have low indegree but high PageRank.
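
If you prefer to generate the file rather than type it by hand, here is a small Python sketch that turns an edge list into GDF text. The helper name `to_gdf` and the three-node placeholder graph are hypothetical, and the `nodedef>` and `edgedef>` header fields reflect a common minimal GDF layout rather than anything specified in this assignment, so check them against the GUESS documentation before relying on them.

```python
def to_gdf(nodes, edges):
    """Return GDF text for a directed graph.

    nodes: list of node names (strings without commas)
    edges: list of (source, target) pairs
    """
    lines = ["nodedef>name VARCHAR"]
    lines.extend(nodes)
    # Assumed header: a 'directed' column marks each edge as directed.
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN")
    lines.extend(f"{src},{dst},true" for src, dst in edges)
    return "\n".join(lines) + "\n"


# Placeholder graph only; design your own 10-node network for the assignment.
example_nodes = ["a", "b", "c"]
example_edges = [("a", "b"), ("b", "c")]
print(to_gdf(example_nodes, example_edges))
```

Save the printed text to a file with a `.gdf` extension and load that file into GUESS.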

• Compute the PageRank of each node by typing: g.nodes.pagerank
• Color by PageRank: colorize(pagerank,green,yellow)
• Compute the indegree: g.nodes.indegree
• Size the nodes by indegree: resizeLinear(indegree,minsize,maxsize) (you choose minsize and maxsize)

1. Turn in an image of your network.
2. Point out a node that has high PageRank but low indegree. Explain qualitatively how this came about.

You can also use the GUESS toolbar pageranktoolW.py, if you'd like to see how the algorithm converges.

LexRank (4 points)

Select a piece of text (10-20 sentences) that you would like to summarize and paste it in the appropriate box of the LexRank demo.

If you are unable to paste text into the text box (where currently there is some text about an Iraqi official), try a different browser.

1. There are two parameters you can vary:
• the cosine similarity threshold determines how similar two sentences have to be in order to share an edge.
• the salience threshold determines how high a sentence's PageRank has to be in order for that sentence to be included in the summary.
Vary the cosine similarity threshold and record the most salient sentence. Does the most salient sentence change as you vary the threshold?
2. Accordingly, report on a cosine similarity threshold that gave you the best result (if applicable).
3. Compare the 1-sentence summary to the 2- or 3-sentence summary. In your opinion, how much new information do the 2nd and 3rd sentences add?
4. Would you have chosen them, or a different sentence? Relate your answer to the structure of the lexical similarity graph.
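
To make the two parameters concrete, here is a rough Python sketch of the idea behind the demo. It is a simplification, not the LexRank implementation: it uses plain bag-of-words cosine similarity (LexRank itself weights terms by tf-idf), and the function names, the `teleport` value, and the three sample sentences are all invented for illustration.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def similarity_graph(sentences, threshold):
    """Link two sentences when their cosine similarity reaches the threshold."""
    bags = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    neighbors = {i: set() for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if cosine(bags[i], bags[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def salience(neighbors, teleport=0.15, iterations=100):
    """PageRank-style score for each sentence on the undirected graph."""
    n = len(neighbors)
    score = {i: 1.0 / n for i in neighbors}
    for _ in range(iterations):
        new = {i: teleport / n for i in neighbors}
        for i, nbrs in neighbors.items():
            if nbrs:
                share = (1 - teleport) * score[i] / len(nbrs)
                for j in nbrs:
                    new[j] += share
            else:
                # Isolated sentence: spread its score uniformly.
                for j in new:
                    new[j] += (1 - teleport) * score[i] / n
        score = new
    return score


# Invented sample sentences, only to exercise the functions.
sample = ["cats chase mice", "cats chase birds", "stocks fell today"]
graph = similarity_graph(sample, threshold=0.5)
scores = salience(graph)
salience_threshold = 0.3  # keep sentences whose score reaches this value
summary = [sample[i] for i in sorted(scores) if scores[i] >= salience_threshold]
print(graph)
print(summary)
```

Raising `threshold` removes edges from the graph, which is why the most salient sentence can change as you vary it; raising `salience_threshold` shortens the summary.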