You may wish to refer to "Power-laws 'Scale free' networks" and "Generating and Fitting Power Law Distributions in Matlab" to figure out how to complete the tasks.

Generate 100,000 random integers from a power-law distribution with exponent alpha = 2.1.
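
If you prefer Python to Matlab, the sampling step could be sketched roughly as follows. This is a minimal sketch, not the method from the referenced Matlab guide: it uses inverse-transform sampling of a continuous power law and rounds to integers, which only approximates a discrete power law. The function name `sample_power_law_integers` and the parameters `xmin` and `seed` are assumptions made for this illustration.

```python
import random

def sample_power_law_integers(n, alpha, xmin=1, seed=None):
    """Draw n integers >= xmin whose tail roughly follows p(x) ~ x^(-alpha).

    Hypothetical helper for illustration; an approximation, not an exact
    discrete power-law sampler.
    """
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    values = []
    for _ in range(n):
        u = rng.random()  # uniform in [0, 1)
        # Inverse transform of the continuous power law starting at xmin - 0.5,
        # then shifted by 0.5 and truncated so the smallest result is xmin.
        values.append(int((xmin - 0.5) * (1.0 - u) ** exponent + 0.5))
    return values

sample = sample_power_law_integers(100_000, alpha=2.1, seed=42)
print("largest value in sample:", max(sample))
```

Changing `seed` gives a different sample, so the largest value can vary considerably from run to run.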

- What is the largest value in your sample? Is it possible for a node in a network to have a degree this high (assuming you don't allow multiple edges between two nodes)?
- Construct a histogram of the frequency of occurrence of each integer in your sample.
Pajek will let you calculate the degree of each individual node (`Net > Partitions > Degree > All`). Then, export the partition as a '.clu' file by clicking on the save icon to the left of the partitions drop-down select menu. Now, you can import it into Excel or another program and histogram it. Try both a linear scale plot and a log-log scale plot.
- What happens to the bins with zero count in the log-log plot?
- Try a simple linear regression on the log transformation of both variables.
In Matlab, you can plot two data sets together as follows: `plot(x1,y1,'r-',x2,y2,'b:')`. This will plot y1 vs. x1 as a red solid line, and y2 vs. x2 as a blue dotted line. (If you are using the fitlineonloglog.m Matlab script, you will feed it the binned data, and it will take the log of the x and y for you before doing a linear fit.) What is your value of the power-law exponent alpha? Include a plot of the data with the fit superimposed.
- Now exponentially bin the data and fit with a line. What is your value of alpha?
- Finally, do a cumulative frequency plot of the original data sample. Fit, plot, and report on the fitted exponent and the corresponding value of alpha.
- Which method was the most accurate? Which one, in your opinion, gave the best view of the data and the fit?
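
The three fitting approaches in the list above (raw frequencies, exponential binning, cumulative frequencies) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the sample comes from a rounded continuous power law, the bins double in width (1, 2, 4, 8, ...), and the helpers `fit_line` and `loglog_slope` are hypothetical names, not the fitlineonloglog.m script. For the cumulative plot, the fitted slope is -(alpha - 1), so alpha is recovered as 1 minus the slope.

```python
import math
import random
from collections import Counter

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def loglog_slope(points):
    """Slope of a line fitted to (log10 x, log10 y); every y must be > 0."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    return fit_line(xs, ys)[0]

# Assumed sample: rounded inverse-transform draws with alpha = 2.1.
rng = random.Random(0)
alpha_true = 2.1
sample = [int(0.5 * (1.0 - rng.random()) ** (-1.0 / (alpha_true - 1.0)) + 0.5)
          for _ in range(100_000)]
counts = Counter(sample)
n = len(sample)

# Method 1: raw frequency of each observed integer. Integers that never
# occur have zero count and cannot be shown on a log scale, so they are absent.
raw = [(x, c / n) for x, c in counts.items()]
alpha_raw = -loglog_slope(raw)

# Method 2: exponential binning. Each bin covers the integers lo .. hi - 1,
# and its count is divided by the bin width to give a density.
binned = []
lo = 1
while lo <= max(sample):
    hi = 2 * lo
    total = sum(c for x, c in counts.items() if lo <= x < hi)
    if total > 0:
        center = math.sqrt(lo * (hi - 1))  # geometric midpoint of the bin
        binned.append((center, total / (n * (hi - lo))))
    lo = hi
alpha_binned = -loglog_slope(binned)

# Method 3: cumulative frequency, the fraction of samples >= x.
ccdf = []
tail = n
for x in sorted(counts):
    ccdf.append((x, tail / n))
    tail -= counts[x]
alpha_ccdf = 1.0 - loglog_slope(ccdf)

print("raw:", alpha_raw, "binned:", alpha_binned, "cumulative:", alpha_ccdf)
```

Plotting each list of points on log-log axes together with its fitted line gives the figures the questions ask for.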

Go to http://ccl.northwestern.edu/netlogo/models/SmallWorlds. This is a NetLogo model that will allow you to vary the rewiring probability.

- Adjust this probability from 0 to 1, each time hitting "rewire" and allowing it to calculate the clustering coefficient and average path length. Does your plot agree with what you saw in lecture?
- Try using a spring layout. In what ways do the random links make the world smaller?
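
To cross-check what the NetLogo model reports, the two quantities can also be computed directly. The following is a minimal pure-Python sketch of Watts-Strogatz style rewiring on a ring lattice. It is not the NetLogo model's code, and the function names and the choice of 200 nodes with 4 neighbours each are assumptions made for illustration.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Link each node to its k nearest neighbours on a ring (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, p, rng):
    """With probability p, move one end of each original edge to a random node."""
    n = len(adj)
    edges = [(v, w) for v in adj for w in adj[v] if v < w]
    for v, w in edges:
        if rng.random() < p:
            # Avoid self-loops and duplicate edges.
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if candidates:
                new = rng.choice(candidates)
                adj[v].discard(w)
                adj[w].discard(v)
                adj[v].add(new)
                adj[new].add(v)
    return adj

def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # such nodes contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(1)
for p in (0.0, 0.01, 0.1, 1.0):
    g = rewire(ring_lattice(200, 4), p, rng)
    print(p, round(clustering_coefficient(g), 3), round(average_path_length(g), 2))
```

Averaging over several random seeds for each probability gives smoother curves to compare with the model's plot.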

Select a piece of text (10-20 sentences) that you would like to summarize and paste it in the appropriate box of the LexRank demo.

If you are unable to paste text into the text box, try a different browser.

- There are two parameters you can vary:
  - the cosine similarity threshold determines how similar two sentences have to be in order to share an edge.
  - the salience threshold determines how high a sentence's PageRank has to be in order for that sentence to be included in the summary.

- Accordingly, report on a cosine similarity threshold that gave you the best result (if applicable).
- Compare the 1-sentence summary to the 2- or 3-sentence summary. In your opinion, how much more information do the 2nd and 3rd sentences add?
- Would you have chosen them, or a different sentence? Relate your answer to the structure of the lexical similarity graph.
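
To make the roles of the two thresholds concrete, here is a simplified Python sketch of the idea. It is not the demo's implementation. It assumes plain word-count vectors (no tf-idf weighting), an unweighted graph, a PageRank damping factor of 0.85, and a salience threshold applied to scores scaled by the highest score. All function and parameter names are hypothetical.

```python
import math
import re
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def lexrank_summary(sentences, cosine_threshold=0.2, salience_threshold=0.9,
                    damping=0.85, iterations=50):
    """Return (selected sentences, PageRank scores) for a list of sentences."""
    bags = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    n = len(sentences)
    # Two sentences share an edge when their similarity reaches the threshold.
    nbrs = [[j for j in range(n)
             if j != i and cosine(bags[i], bags[j]) >= cosine_threshold]
            for i in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):  # power iteration for PageRank
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            # A sentence with no edges spreads its score over all sentences.
            targets = nbrs[i] if nbrs[i] else range(n)
            share = damping * rank[i] / len(targets)
            for j in targets:
                new[j] += share
        rank = new
    top = max(rank)
    # Assumed rule: keep sentences whose scaled score reaches the threshold.
    chosen = [sentences[i] for i in range(n) if rank[i] / top >= salience_threshold]
    return chosen, rank

sentences = ["cats chase mice", "cats sleep often",
             "mice eat cheese", "stocks fell sharply"]
summary, scores = lexrank_summary(sentences)
print(summary)
```

In this toy input the first sentence shares words with two others, so it sits at the centre of the similarity graph. Lowering `salience_threshold` admits its neighbours into the summary, while raising `cosine_threshold` removes edges and flattens the scores.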

Submit your homework via WebCampus as a single tar or zip file containing all the required files.

Acknowledgement: This assignment is adapted from one by Lada Adamic.