CS 765 Complex Networks

Fall 2014

Network Lab 3

Due on Thursday Sep 25, 2014 at 9:30 am

PageRank intuition (2 points)

Go to pagerank (you might need to add a security exception for the website). You'll see a small directed network; each click of the 'iterate' button performs one step of the PageRank calculation for every node. It will take several iterations for the algorithm to converge. At each iteration, ignoring teleportation, the probability that a random walker is found at a given node A is the sum, over every node B with a directed edge to A, of the probability that the walker was at B divided by the outdegree of B. The edge width in the visualization is proportional to the probability that the walker transitions from B to A.

  1. Can you explain the different widths you see? Approximately how many iterations does the algorithm take to converge?
  2. Try increasing the teleportation probability. How does this influence the PageRanks assigned to the nodes?
  3. Try allowing sinks. Without sinks allowed, once a random walker reaches a node with no outgoing edges, it jumps randomly to another node. With sinks allowed, it stays at that node with probability (1 - teleportation) and jumps to a random node with probability = teleportation. What effect does allowing sinks have on the distribution of PageRanks?
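
To experiment with these questions outside the web demo, the random-walk update described above can be sketched in plain Python. This is a minimal sketch, not the demo's actual code: the function name `pagerank`, the default teleportation value of 0.15, the convergence tolerance, and the four-node example network are all assumptions made for illustration. It implements the "sinks not allowed" behavior, with a walker at a sink jumping to a node chosen uniformly at random (possibly the same one).

```python
def pagerank(edges, nodes, teleport=0.15, tol=1e-8, max_iter=1000):
    """Power iteration for PageRank on a directed graph.

    edges: list of (source, target) pairs; nodes: list of node names.
    teleport: probability of jumping to a uniformly random node (assumed 0.15).
    Returns (ranks, iterations_used).
    """
    n = len(nodes)
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)

    rank = {v: 1.0 / n for v in nodes}  # start from the uniform distribution
    for iteration in range(1, max_iter + 1):
        # Probability mass sitting on sinks is spread evenly over all nodes.
        sink_mass = sum(rank[v] for v in nodes if not out[v])
        new = {v: teleport / n + (1 - teleport) * sink_mass / n for v in nodes}
        # Each node B passes rank[B] / outdegree(B) along every outgoing edge.
        for b in nodes:
            for a in out[b]:
                new[a] += (1 - teleport) * rank[b] / len(out[b])
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:
            break
    return rank, iteration


# Hypothetical example network: a 3-cycle plus one node D pointing into it.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")]
ranks, iterations = pagerank(edges, nodes)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 4))
print("iterations:", iterations)
```

Rerunning with a different `teleport` value, or with an edge removed so that a node becomes a sink, gives a rough way to check the intuition built with the interactive demo.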

PageRank (4 points)

Construct a small directed network (about 10 nodes) in GDF or .net format and load it into GUESS. Construct it such that at least one node has low indegree but high PageRank.
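
If you prefer to generate the file rather than type it by hand, the following is a minimal sketch of a GDF writer in Python. The helper name `to_gdf`, the node names, and the placeholder ring of edges are assumptions for illustration, and the header lines follow the commonly used GDF layout (a `nodedef>` section followed by an `edgedef>` section), which you should check against the GUESS documentation. The ring is only a starting point; replace the edge list with your own design.

```python
def to_gdf(nodes, edges):
    """Return GDF text for a directed graph.

    nodes: list of node names; edges: list of (source, target) pairs.
    """
    lines = ["nodedef>name VARCHAR"]
    lines.extend(nodes)
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN")
    lines.extend("%s,%s,true" % (src, dst) for src, dst in edges)
    return "\n".join(lines) + "\n"


# Placeholder network: 10 nodes in a directed ring (n1 -> n2 -> ... -> n10 -> n1).
nodes = ["n%d" % i for i in range(1, 11)]
edges = [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]

gdf_text = to_gdf(nodes, edges)
print(gdf_text)
```

Saving the printed text to a file with a .gdf extension gives something you can try loading into GUESS.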

  1. Compute the PageRank of each node by typing g.nodes.pagerank
  2. Color by PageRank: colorize(pagerank,green,yellow)
  3. Compute the indegree: g.nodes.indegree
  4. Size the nodes by indegree: resizeLinear(indegree,minsize,maxsize) (you choose minsize and maxsize)
  5. Turn in an image of your network.
  6. Point out a node that has high PageRank but low indegree. Explain qualitatively how this came about.

An aside: you can also use the GUESS toolbar pageranktoolW.py if you'd like to see how the algorithm converges.

Analyzing blogosphere (4 points)

Download the file poliblog.gdf from cTools. It represents the citation patterns between 40 A-list blogs during a couple of months preceding the 2004 presidential election, along with the political leaning of those blogs. Open it in GUESS (one way of doing this is by clicking the "Load GDF/GraphML" button after GUESS starts up). Do the following (submit just the final image and the list of commands you used).

You may save the commands in a .py file so that you can repeat the process. To run the file, call execfile("yourfilename.py") or select File>Run Script in the dropdown menu. To save your network with the new colors and positions, use the exportGDF("filename.gdf") command. Alternatively, you can create a persistent database when starting GUESS.

Submitting your files

Submission of your homework is via WebCampus. You must submit all the required files in a single document containing all the answers.

Acknowledgement: This assignment is adapted from one by Lada Adamic.