# CS 765 Complex Networks

## Due on Thursday Oct 20, 2014 at 2:30 pm

PageRank intuition (2 points)

Go to pagerank (you might need to add a security exception for the website). You'll see a small directed network; each click of the 'iterate' button performs one more step of the PageRank calculation for every node. It will take several iterations for the algorithm to converge. At each iteration, the probability that a random walker is found at a given node A is proportional to the sum, over every node B with a directed edge to A, of the probability that the walker was at B divided by the outdegree of B. The edge width in the visualization is proportional to the probability that the walker transitions from B to A.

1. Can you explain the different widths you see? Approximately how many iterations does the algorithm take to converge?
2. Try increasing the teleportation probability. How does this influence the PageRanks assigned to the nodes?
3. Try allowing sinks. Without sinks allowed, once a random walker reaches a node with no outgoing edges, it jumps randomly to another node. With sinks allowed, it stays at that node with probability (1 - teleportation) and jumps to a random node with probability equal to teleportation. What effect does allowing sinks have on the distribution of PageRanks?
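To experiment with these questions away from the demo page, the iteration described above can be sketched in plain Python. This is a minimal sketch, not the code behind the demo: the function name `pagerank`, its parameters, the convergence tolerance, and the four-node example graph are all assumptions made for illustration, and a walker leaving a dead end is assumed to pick uniformly among all nodes, itself included.

```python
def pagerank(out_links, teleport=0.15, allow_sinks=False, tol=1e-10, max_iter=1000):
    """Power iteration on a dict mapping each node to the list of nodes it links to.

    Returns (ranks, iterations). Hypothetical helper written for this handout.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = dict((v, 1.0 / n) for v in nodes)  # start from the uniform distribution
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Every node receives an equal share of the teleporting walkers.
        new = dict((v, teleport / n) for v in nodes)
        for b in nodes:
            targets = out_links[b]
            if targets:
                # B passes its remaining probability equally along its out-edges.
                share = (1.0 - teleport) * rank[b] / len(targets)
                for a in targets:
                    new[a] += share
            elif allow_sinks:
                # Sinks allowed: the walker stays put unless it teleports.
                new[b] += (1.0 - teleport) * rank[b]
            else:
                # Sinks not allowed: a walker at a dead end jumps to a random node.
                for a in nodes:
                    new[a] += (1.0 - teleport) * rank[b] / n
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:
            break
    return rank, iterations


# Hypothetical example: a cycle A -> B -> C -> A, plus a dead-end node D reached from C.
graph = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}

for sinks in (False, True):
    ranks, steps = pagerank(graph, teleport=0.15, allow_sinks=sinks)
    print("allow_sinks=%s, iterations=%d" % (sinks, steps))
    for node in sorted(ranks):
        print("  %s: %.4f" % (node, ranks[node]))
```

Varying `teleport` and `allow_sinks` on a graph of your own gives a quick check on what you observe in the visualization.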

PageRank (4 points)

Construct a small directed network (about 10 nodes) in GDF or .net format and load it into GUESS. Construct it such that at least one node has low indegree but high PageRank.

1. Compute the PageRank of each node by typing `g.nodes.pagerank`
2. Color by PageRank `colorize(pagerank,green,yellow)`
3. Compute the indegree `g.nodes.indegree`
4. Size the nodes by indegree `resizeLinear(indegree,minsize,maxsize)` (you are choosing minsize and maxsize)
5. Turn in an image of your network.
6. Point out a node that has high PageRank but low indegree. Explain qualitatively how this came about.

An aside: you can also use the GUESS toolbar pageranktoolW.py if you'd like to see how the algorithm converges.
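If you would rather generate the GDF file than type it by hand, a short standalone Python script can write it from an edge list. The sketch below assumes the common GDF layout of a `nodedef>` header followed by node rows and an `edgedef>` header followed by edge rows, with a `directed` column set to `true`; the helper names `to_gdf` and `indegree` and the three-node example are hypothetical and only illustrate the format, so check the result against the GUESS documentation before relying on it.

```python
def to_gdf(nodes, edges):
    """Return GDF text for a directed graph (assumed minimal header layout)."""
    lines = ["nodedef>name VARCHAR"]
    lines.extend(nodes)
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN")
    for src, dst in edges:
        lines.append("%s,%s,true" % (src, dst))
    return "\n".join(lines) + "\n"


def indegree(nodes, edges):
    """Count incoming edges per node, as a sanity check before loading the file."""
    counts = dict((v, 0) for v in nodes)
    for _, dst in edges:
        counts[dst] += 1
    return counts


# Hypothetical toy network; replace with your own ~10-node design.
nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "a")]

print(to_gdf(nodes, edges))
print(indegree(nodes, edges))
```

Writing the returned text to a file such as `mynetwork.gdf` (an example name) gives something you can try loading with the "Load GDF/GraphML" button.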

Analyzing blogosphere (4 points)

Download the file poliblog.gdf from cTools. It represents the citation patterns between 40 A-list blogs during a couple of months preceding the 2004 presidential election, along with the political leaning of those blogs. Open it in GUESS (one way of doing this is by clicking the "Load GDF/GraphML" button after GUESS starts up). Do the following (submit just the final image and the list of commands you used).

• Lay the network out using your layout algorithm of choice. Follow up with the `center` and `rescaleLayout()` commands to adjust the position of the network and the size of the vertices.
• Play with the zoomable interface and figure out how to reposition the nodes.
• Use the information window to find out what attributes of nodes and edges are specified.
• Color the conservative blogs red and the liberal ones blue. Color the edges differently depending on the leaning of the from and to nodes.
• Compute the indegree for all nodes at once and resize the nodes according to indegree using the `resizeLinear(indegree,minsize,maxsize)` command, where you specify minsize and maxsize.
• Change the width of the edges to reflect the number of citations (given with the 'weight' attribute) using the `resizeLinear()` command.
• Make a couple of observations about the blog network and discuss whether modifying the visualization with the above steps helped you.
• Save the commands and turn in along with your exported image. Insert both into your report document.

You may save the commands in a .py file to repeat the process. To run it, call `execfile("yourfilename.py")` or select File > Run Script from the menu. To save your network with the new colors and positions, use the `exportGDF("filename.gdf")` command. You could also have created a persistent database when starting GUESS.
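When choosing minsize and maxsize for the resizing steps above, it can help to see what a linear rescaling does to a set of values. The standalone sketch below is an illustration under the assumption that the smallest value maps to minsize and the largest to maxsize; it is not the GUESS implementation of `resizeLinear`, and the function name `resize_linear` and the sample indegrees are hypothetical.

```python
def resize_linear(values, minsize, maxsize):
    """Map each value linearly onto the interval [minsize, maxsize]."""
    lo = min(values.values())
    hi = max(values.values())
    if hi == lo:
        # All values equal: this sketch falls back to minsize (an arbitrary choice).
        return dict((k, float(minsize)) for k in values)
    scale = (maxsize - minsize) / float(hi - lo)
    return dict((k, minsize + (v - lo) * scale) for k, v in values.items())


# Hypothetical indegrees for three blogs.
sample_indegree = {"blog1": 0, "blog2": 5, "blog3": 10}
sizes = resize_linear(sample_indegree, 2, 12)
for name in sorted(sizes):
    print("%s -> size %.1f" % (name, sizes[name]))
```

A wide gap between minsize and maxsize makes heavily cited blogs stand out, while a narrow one keeps low-indegree nodes visible.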