# CS 765 Complex Networks

## Due on Monday Oct 3, 2011 at 1:00 pm

Social Network (6 points)

Import your data from Facebook. (You may use http://apps.facebook.com/namegenweb/) This is the network of your acquaintances and the connections between them, but you yourself are excluded from the network.

You may use http://netwiki.amath.unc.edu/DataFormats/GraphMLToPajek to convert GraphML format into Pajek format.

1. Do an energy layout of the network using the Draw>Draw-Partition-Vector command, using the degree partition and either closeness or betweenness as the vector. Include an image.
2. Who is the most central node in the network by degree, closeness and betweenness?
3. Point out 3 vertices whose centrality scores differ (e.g. high betweenness but medium closeness) and explain from their position in the network why it happens.
4. Identify a node with high betweenness that you could afford to remove without disconnecting other vertices from that component. Create a second network that excludes that person. Use Net>Transform>Remove>Selected Vertices. Recompute betweenness for everyone remaining in the network. Include an image.
5. Point out 2 particular vertices and their position in the network. Discuss why their betweenness centrality score did or did not change.
6. Point out 1 vertex (if it exists) whose closeness centrality suffers as a result.
7. Briefly discuss the ambiguities (& missing data) in this kind of data collection.
8. Imagine you are a newcomer who wants to not only be friends with you, but occupy a central position in your network (I know, multiple personality is a bit hard to keep track of). You only have time to make 2 new acquaintances out of your network of friends. Which 2 would you choose to maximize your closeness centrality?
9. Add yourself to the network by using the command Net>Transform>Add>Vertices and adding edges in the Draw window, then compute your closeness. Which 2 vertices would you connect to in order to maximize your betweenness score, and what is your resulting betweenness?

For this assignment you may use existing data. However, it is highly recommended that you use your own graph.
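
As a cross-check on the Pajek results, the three centrality measures used in this part can be sketched in plain Python. This is a minimal sketch, not Pajek's implementation: the toy graph, the node names and the function names are all hypothetical. Closeness is taken here as (number of reachable vertices) / (sum of distances to them), and betweenness is left unnormalized, so the values may be scaled differently from what Pajek reports. Compare rankings rather than raw numbers.

```python
from collections import deque

# Hypothetical toy network: two triangles joined through a single broker "x".
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "x"),
         ("x", "d"), ("d", "e"), ("d", "f"), ("e", "f")]

# Build an undirected adjacency list.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def degree(adj):
    """Number of neighbors of each vertex."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def closeness(adj, source):
    """(reachable vertices) / (sum of shortest-path distances), via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph
    (Brandes' accumulation scheme)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                      # vertices in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest vertices.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

deg = degree(adj)
close = {v: closeness(adj, v) for v in adj}
betw = betweenness(adj)
for v in sorted(adj):
    print(v, deg[v], round(close[v], 3), betw[v])
```

In this toy graph, "x" has only two neighbors yet the highest betweenness and closeness, because every shortest path between the two triangles passes through it. That is the kind of positional explanation questions 3 and 5 ask for.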

PageRank (4 points)

Construct a small directed network (about 10 nodes) in GDF or .net format and load it into GUESS. Construct it such that you have at least one node that will have low indegree but high PageRank.

1. Compute the PageRank of each node by typing `g.nodes.pagerank`
2. Color by PageRank `colorize(pagerank,green,yellow)`
3. Compute the indegree `g.nodes.indegree`
4. Size the nodes by indegree `resizeLinear(indegree,minsize,maxsize)` (you are choosing minsize and maxsize)
5. Turn in an image of your network.
6. Point out a node that has high PageRank but low indegree. Explain qualitatively how this came about.

An aside: You can also use the GUESS toolbar pageranktoolW.py, if you'd like to see how the algorithm converges.
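
To see outside GUESS why a node with low indegree can still earn a high PageRank, the iteration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six-node graph and its node names are hypothetical, the damping factor of 0.85 is a common default rather than a value taken from GUESS, and GUESS's own `pagerank` may use different parameters or scaling.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}.

    Rank held by nodes without out-links is spread uniformly over all nodes.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        # Teleportation share plus the redistributed dangling mass.
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        # Each node splits its rank evenly among the nodes it links to.
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank

# Hypothetical example: a, b, c, d all point to hub "h";
# "h" points only to "t"; "t" points back to a, b, c, d.
out_links = {
    "a": ["h"], "b": ["h"], "c": ["h"], "d": ["h"],
    "h": ["t"],
    "t": ["a", "b", "c", "d"],
}

# Indegree = number of incoming links.
indegree = {v: 0 for v in out_links}
for v, targets in out_links.items():
    for w in targets:
        indegree[w] += 1

ranks = pagerank(out_links)
for v in sorted(ranks, key=ranks.get, reverse=True):
    print(v, indegree[v], round(ranks[v], 3))
```

Here "t" has indegree 1, the same as "a" through "d", but its single incoming link comes from the hub "h", which passes on its entire rank undivided. "t" therefore ends up with roughly three times the PageRank of "a". The quality of the incoming link matters, not just the count.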

Snowball Sampling (bonus 2 points)

Use the file Dining-table_partners.net to work with the following scenario. You are a prince who just met an enchanting young lady at a ball, but she left at the stroke of midnight and left a shoe behind. Now you'd like to find the shoe's owner. All you know about her is that she lives in this particular girls' dorm. The headmistress won't let you talk to the girls, so the only way you can find your princess is to covertly ask the one girl you know, Ella, to introduce you to her two favorite friends. Once you know her friends, you can ask them to introduce you to their two favorite friends, etc. This is a snowball sampling technique.

1. Highlight the vertices that you will reach using snowball sampling (Net > K-Neighbors > ...). Include an image.
2. Which girls will you not find using snowball sampling starting with Ella?
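
The snowball procedure described above amounts to a breadth-first search that only follows "favorite friend" choices outward from Ella. The sketch below illustrates this on a made-up network: apart from Ella, all names and choices are hypothetical and are not taken from Dining-table_partners.net, and the function name `snowball` is an assumption for illustration.

```python
from collections import deque

def snowball(choices, start):
    """Return {girl: wave} for everyone reachable from `start` by repeatedly
    following each girl's listed favorite friends (directed choices)."""
    wave = {start: 0}
    queue = deque([start])
    while queue:
        girl = queue.popleft()
        for friend in choices.get(girl, []):
            if friend not in wave:
                wave[friend] = wave[girl] + 1   # introduced one step later
                queue.append(friend)
    return wave

# Hypothetical data: each girl names her two favorite friends.
choices = {
    "Ella": ["Ada", "Bea"],
    "Ada":  ["Ella", "Bea"],
    "Bea":  ["Ada", "Cara"],
    "Cara": ["Bea", "Ada"],
    "Dot":  ["Ella", "Eve"],
    "Eve":  ["Dot", "Cara"],
}

reached = snowball(choices, "Ella")
missed = set(choices) - set(reached)
print("reached:", reached)
print("missed:", sorted(missed))
```

In this made-up example Dot and Eve are never found: they name girls inside the sample, but nobody reachable from Ella names them. Because choices are directed, a girl can be connected to the sampled group and still be invisible to the snowball.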