CS 790g Seminar: Complex Networks

Fall 2010

Network Lab 2

Due on Thursday, Sep 16, 2010 at 1:00 pm

Social Network (6 points)

Import your data from Facebook. (You may use http://apps.facebook.com/namegenweb/. Note that you need to convert the data to Pajek format.) This is the network of your acquaintances and the connections between them; you yourself are excluded from the network.
  1. Do an energy layout of the network using the Draw>Draw-Partition-Vector command, using the degree partition and either closeness or betweenness as the vector. Include an image.
  2. Who is the most central node in the network by degree, closeness and betweenness?
  3. Point out 3 vertices whose centrality scores differ (e.g. high betweenness but medium closeness) and explain from their position in the network why it happens.
  4. Identify a node with high betweenness that you could afford to remove without disconnecting other vertices from that component. Create a second network that excludes that person. Use Net>Transform>Remove>Selected Vertices. Recompute betweenness for everyone remaining in the network. Include an image.
  5. Point out 2 particular vertices and their position in the network. Discuss why their betweenness centrality score did or did not change.
  6. Point out 1 vertex (if it exists) whose closeness centrality suffers as a result.
  7. Briefly discuss the ambiguities (& missing data) in this kind of data collection.
  8. Imagine you are a newcomer who wants not only to be friends with you, but also to occupy a central position in your network (I know, multiple personalities are a bit hard to keep track of). You only have time to make 2 new acquaintances among your network of friends. Which 2 would you choose to maximize your closeness centrality?
  9. Add yourself to the network by using the command Net>Transform>Add>Vertices and adding edges in the Draw window, then compute your closeness. Which 2 vertices would you connect to in order to maximize your betweenness score, and what is your resulting betweenness?
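
To build intuition for why the three centrality measures in the questions above can disagree, here is a minimal Python sketch on a toy network. The edge list and all function names are hypothetical illustrations, not part of Pajek or of your Facebook data. Closeness is computed as (reachable vertices - 1) / (sum of distances), and betweenness is left unnormalized using Brandes' algorithm; Pajek may scale its scores differently, so compare rankings rather than raw numbers.

```python
from collections import deque

# Hypothetical toy network: two friend triangles (A,B,C) and (D,E,F)
# joined only through the broker G.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "G"),
         ("G", "D"), ("D", "E"), ("D", "F"), ("E", "F")]

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adj):
    """(reachable - 1) / sum of distances, per vertex."""
    scores = {}
    for s in adj:
        dist = bfs_distances(adj, s)
        total = sum(dist.values())
        scores[s] = (len(dist) - 1) / total if total else 0.0
    return scores

def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # vertices in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                     # accumulate from farthest back to s
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

adj = build_adjacency(EDGES)
degree = {v: len(nbrs) for v, nbrs in adj.items()}
closeness_scores = closeness(adj)
betweenness_scores = betweenness(adj)

for v in sorted(adj):
    print(v, degree[v], round(closeness_scores[v], 3), betweenness_scores[v])
```

In this toy network G has only two neighbors, yet it sits on every shortest path between the two triangles, so it has the highest betweenness (9.0, versus 8.0 for C and D) and the highest closeness, while C and D lead by degree. Removing G would split the network, which is exactly the kind of vertex question 4 asks you to avoid.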

For this assignment you may use existing data. However, it is highly recommended that you use your own graph.
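
The import step at the top of this section requires the data in Pajek format. As a rough sketch of that conversion, the Python function below writes a list of friendship pairs as a Pajek .net file body. It assumes you have already extracted an edge list of name pairs from your export; the function name `to_pajek` and the sample names are hypothetical, and the actual export format of the Facebook application is not covered here.

```python
def to_pajek(edges, directed=False):
    """Return Pajek .net text for a list of (name1, name2) pairs.

    Vertices are numbered from 1 in order of first appearance.
    """
    ids = {}
    for u, v in edges:
        for name in (u, v):
            if name not in ids:
                ids[name] = len(ids) + 1
    lines = ["*Vertices %d" % len(ids)]
    lines += ['%d "%s"' % (i, name) for name, i in ids.items()]
    # Pajek lists undirected ties under *Edges and directed ties under *Arcs.
    lines.append("*Arcs" if directed else "*Edges")
    lines += ["%d %d" % (ids[u], ids[v]) for u, v in edges]
    return "\n".join(lines) + "\n"

# Hypothetical sample data.
sample = [("Alice", "Bob"), ("Bob", "Carol")]
net_text = to_pajek(sample)
print(net_text)
```

Saving `net_text` to a file with a .net extension should give something Pajek can open with File>Network>Read; check the vertex count in the report window against the number of friends you expect.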

PageRank (4 points)

Construct a small directed network (about 10 nodes) in GDF or .net format and load it into GUESS. Construct it so that at least one node has low indegree but high PageRank.

  1. Compute the PageRank of each node by typing g.nodes.pagerank
  2. Color by PageRank: colorize(pagerank,green,yellow)
  3. Compute the indegree: g.nodes.indegree
  4. Size the nodes by indegree: resizeLinear(indegree,minsize,maxsize) (you choose minsize and maxsize)
  5. Turn in an image of your network.
  6. Point out a node that has high PageRank but low indegree. Explain qualitatively how this came about.
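
To see how a node can end up with high PageRank despite low indegree before you build your own network, here is a minimal power-iteration sketch in plain Python. It is not the GUESS implementation: the graph, the function name `pagerank`, the damping factor of 0.85, and the uniform handling of nodes without out-links are all assumptions made for illustration, so the numbers need not match what GUESS reports.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=500):
    """Power-iteration PageRank; out_links maps every node to its targets."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no out-links is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += damping * rank[v] / len(out_links[v])
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:
            break
    return rank

# Hypothetical network: four leaves point to "hub", and the hub's only
# out-link goes to "x", which in turn points back to one leaf.
out_links = {
    "l1": ["hub"], "l2": ["hub"], "l3": ["hub"], "l4": ["hub"],
    "hub": ["x"],
    "x": ["l1"],
}

indegree = {v: 0 for v in out_links}
for targets in out_links.values():
    for w in targets:
        indegree[w] += 1

pr = pagerank(out_links)
for v in sorted(pr, key=pr.get, reverse=True):
    print(v, indegree[v], round(pr[v], 4))
```

Here "x" has indegree 1 but a PageRank close to that of the hub, because its single in-link comes from a high-ranked node that has no other out-links and therefore passes along all of its rank. The leaves l2 to l4 receive only the baseline teleportation share.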

An aside: you can also use the GUESS toolbar pageranktoolW.py if you'd like to see how the algorithm converges.

Snowball Sampling (bonus 2 points)

Use the file Dining-table_partners.net to work with the following scenario. You are a prince who just met an enchanting young lady at a ball, but she left at the stroke of midnight and left a shoe behind. Now you'd like to find the shoe's owner. All you know about her is that she lives in this particular girls' dorm. The headmistress won't let you talk to the girls, so the only way you can find your princess is to covertly ask the one girl you know, Ella, to introduce you to her two favorite friends. Once you know her friends, you can ask them to introduce you to their two favorite friends, etc. This is a snowball sampling technique.

  1. Highlight the vertices that you will reach using snowball sampling (Net > K-Neighbors > ...). Include an image.
  2. Which girls will you not find using snowball sampling starting with Ella?
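
The snowball procedure described in the scenario amounts to a breadth-first search along "favorite friend" arcs starting from Ella. The Python sketch below illustrates the idea on made-up data: apart from Ella, the names and choices are hypothetical and are not taken from Dining-table_partners.net, and the function name `snowball` is an assumption for illustration.

```python
from collections import deque

def snowball(choices, seed, max_waves=None):
    """Return {girl: wave number} for everyone reached from seed.

    choices maps each girl to the friends she names (directed arcs).
    max_waves limits how many rounds of introductions are followed.
    """
    wave = {seed: 0}
    queue = deque([seed])
    while queue:
        girl = queue.popleft()
        if max_waves is not None and wave[girl] >= max_waves:
            continue
        for friend in choices.get(girl, []):
            if friend not in wave:
                wave[friend] = wave[girl] + 1
                queue.append(friend)
    return wave

# Hypothetical dorm: each girl names her two favorite friends.
choices = {
    "Ella": ["Ann", "Bea"],
    "Ann": ["Bea", "Cat"],
    "Bea": ["Ella", "Ann"],
    "Cat": ["Ann", "Dot"],
    "Dot": ["Cat", "Ann"],
    "Eve": ["Ella", "Fay"],
    "Fay": ["Eve", "Ella"],
}

reached = snowball(choices, "Ella")
missed = sorted(set(choices) - set(reached))
print(reached)
print(missed)
```

In this toy dorm Eve and Fay are never found: they both name Ella, but nobody reachable from Ella names them. Direction matters, which is why you should follow outgoing arcs only when you repeat the exercise in Pajek.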

Submitting your files

Submission of your homework is via WebCT. You must submit all the required files in a single document containing all the answers.

Acknowledgement: This assignment is adapted from one by Lada Adamic.