CS 765 Complex Networks
Network Lab 5
Due on Tuesday Dec 2, 2014 at 9:30 am
Power-law network (7 points)
You may wish to refer to the Power-laws "Scale free" networks
and the Generating and Fitting Power Law Distributions in Matlab to figure out how to complete the tasks.
Generate 100,000 random integers from a power law distribution with exponent alpha = 2.1
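One possible way to draw such a sample is sketched here in Python rather than Matlab. This is an assumption of this note, not the course's reference method: it uses inverse-transform sampling from a continuous power law with minimum value 1 and rounds each draw down, which only approximates a discrete power law. The function name and its parameters are hypothetical.

```python
import random


def sample_power_law_integers(n, alpha, xmin=1.0, seed=None):
    """Draw n integers whose tail roughly follows p(x) ~ x**(-alpha).

    Assumption: a continuous power law is sampled by inverse transform,
    x = xmin * (1 - u) ** (-1 / (alpha - 1)), and then rounded down.
    """
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and x >= xmin.
    return [int(xmin * (1.0 - rng.random()) ** exponent) for _ in range(n)]


sample = sample_power_law_integers(100_000, alpha=2.1, seed=42)
print("sample size:", len(sample))
print("largest value:", max(sample))
```

Changing the seed changes the largest value considerably, which is worth keeping in mind when interpreting it.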
- What is the largest value in your sample?
Is it possible for a node in a network to have a degree this high (assuming you don't allow multiple edges between two nodes)?
- Construct a histogram of the frequency of occurrence of each integer in your sample.
Pajek will let you calculate the degree of each individual node (Net > Partitions > Degree > All).
Then, export the partition as a '.clu' file by clicking on the save icon to the left of the partitions drop-down menu.
Now you can import it into Excel or another program and plot a histogram. Try both a linear scale plot and a log-log scale plot.
- What happens to the bins with zero count in the log-log plot?
- Try a simple linear regression on the log transformation of both variables.
In Matlab, you can plot two data sets together as follows: plot(x1,y1,'r-',x2,y2,'b:').
This will plot y1 vs. x1 as a red solid line, and y2 vs. x2 as a blue dotted line.
(If you are using the fitlineonloglog.m Matlab script, feed it the binned data; it will take the log of x and y for you before doing a linear fit.)
What is your value of the power-law exponent alpha? Include a plot of the data with the fit superimposed.
- Now exponentially bin the data and fit with a line. What is your value of alpha?
- Finally, do a cumulative frequency plot of the original data sample.
Fit, plot, and report on the fitted exponent and the corresponding value of alpha.
- Which method was the most accurate? Which one, in your opinion, gave the best view of the data and the fit?
The Watts-Strogatz small world model (3 points)
Go to http://ccl.northwestern.edu/netlogo/models/SmallWorlds.
This is a NetLogo model that will allow you to vary the rewiring probability.
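As an optional cross-check on the numbers NetLogo reports, the same kind of experiment can be sketched offline. The Python code here is an illustration under stated assumptions, not the NetLogo model's actual implementation: it builds a ring lattice in which every node is linked to its 4 nearest neighbours, moves one end of each link to a random node with the rewiring probability p, and then computes the average clustering coefficient and the average shortest path length over reachable pairs. All function names are hypothetical.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k/2 per side)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, rng):
    """With probability p, move one end of each original link to a random node."""
    n = len(adj)
    edges = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in edges:
        if rng.random() < p:
            # Avoid self-loops and duplicate links.
            candidates = [t for t in range(n) if t != i and t not in adj[i]]
            if candidates:
                t = rng.choice(candidates)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(t)
                adj[t].add(i)
    return adj


def clustering_and_path_length(adj):
    """Return (average clustering coefficient, average shortest path length)."""
    total_c = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d >= 2:  # nodes with fewer than 2 neighbours count as 0
            links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
            total_c += links / (d * (d - 1) / 2)
    dist_sum = 0
    pairs = 0
    for source in adj:  # breadth-first search from every node
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1  # only pairs that are actually reachable
    return total_c / len(adj), dist_sum / pairs


rng = random.Random(0)
for p in (0.0, 0.01, 0.1, 1.0):
    c, l = clustering_and_path_length(rewire(ring_lattice(100, 4), p, rng))
    print(f"p={p:<5} clustering={c:.3f} avg_path_length={l:.2f}")
```

A single run per value of p is noisy; averaging several runs gives smoother curves.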
- Adjust the rewiring probability from 0 to 1, each time hitting "rewire" and allowing the model to calculate the clustering coefficient and average path length.
Does your plot agree with what you saw in lecture?
- Try using a spring layout. In what ways do the random links make the world smaller?
LexRank (bonus 4 points)
Select a piece of text (10-20 sentences) that you would like to summarize and paste it in the appropriate box of the
If you are unable to paste text into the text box, try a different browser.
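For intuition about what a LexRank-style summarizer computes, here is a rough Python sketch. It is not the demo's code and simplifies several things: sentences are compared with cosine similarity on raw word counts (an assumption; real systems typically weight words), two sentences share an edge when their similarity reaches a threshold, PageRank is run on that graph, and a sentence enters the summary when its score relative to the top score reaches a salience threshold (this normalization is also an assumption). All names are hypothetical.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank_scores(sentences, cosine_threshold, damping=0.85, iterations=100):
    """PageRank scores on the thresholded sentence-similarity graph."""
    bags = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    n = len(sentences)
    neighbours = [
        [j for j in range(n)
         if j != i and cosine(bags[i], bags[j]) >= cosine_threshold]
        for i in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if neighbours[i]:
                share = damping * scores[i] / len(neighbours[i])
                for j in neighbours[i]:
                    new[j] += share
            else:
                # A sentence with no edges spreads its score uniformly.
                for j in range(n):
                    new[j] += damping * scores[i] / n
        scores = new
    return scores


def summarize(sentences, cosine_threshold, salience_threshold):
    """Keep sentences whose score, relative to the best one, is high enough."""
    scores = lexrank_scores(sentences, cosine_threshold)
    top = max(scores)
    return [s for s, sc in zip(sentences, scores) if sc / top >= salience_threshold]


text = [
    "networks have nodes and edges",
    "nodes have degree",
    "edges connect pairs",
    "bananas are yellow",
]
for threshold in (0.2, 0.6):
    print(threshold, summarize(text, threshold, salience_threshold=1.0))
```

Raising the cosine threshold removes edges from the graph, so the ranking can change or collapse into ties.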
- There are two parameters you can vary:
  - the cosine similarity threshold determines how similar two sentences have to be in order to share an edge.
  - the salience threshold determines how high a sentence's PageRank has to be in order for that sentence to be included in the summary.
- Vary the cosine similarity threshold and record the most salient sentence.
Does the most salient sentence change as you vary the threshold?
- Accordingly, report on a cosine similarity threshold that gave you the best result (if applicable).
- Compare the 1-sentence summary to the 2- or 3-sentence summary.
In your opinion, how much new information do the 2nd and 3rd sentences add?
- Would you have chosen them, or a different sentence?
Relate your answer to the structure of the lexical similarity graph.
Submitting your files
Submit your homework via WebCampus.
You must submit all the required files in a single tar or zip file.
Acknowledgement: This assignment is adapted from one by Lada Adamic.