Assignment 2
Programming for this assignment should be done in Python 3 using the igraph library with some data visualization coming from matplotlib.
This searchable reference for python-igraph is useful.
You are to submit this assignment in Brightspace as a single zip file, with the following structure:
<student_id>/<input_graph_1>
<student_id>/<input_graph_2>
<student_id>/<input_graph_3>
<student_id>/<input_graph_4>
<student_id>/part1.py
<student_id>/part2.py
<student_id>/analysis.txt (only for students in COMP5900)
If you find yourself struggling with technical issues related to using Python, igraph, or matplotlib, please feel free to ask questions on the Brightspace course forum. There are probably other students experiencing the same issues.
1. Centrality Measures
Choose two directed graphs, each with at least $10,000$ vertices. One place to find these is the Stanford Large Network Dataset Collection, but you can also take any other data you are able to find. The graphs must be directed, but they may have self-loops and multiple edges. To make things more interesting, it would be good if you can find data in which the vertices have meaningful names.
Write a program with the name part1.py that reads each of the graphs into an igraph.Graph object and computes some centrality measures for the vertices. Your program should compute:
- The degree centrality
- The eigenvector centrality (using igraph.Graph.eigenvector_centrality())
- The PageRank centrality (using igraph.Graph.personalized_pagerank())
- with the default damping value of $0.85$
- with a damping value of $0$
- The Kleinberg hub score (using igraph.Graph.hub_score())
- The Kleinberg authority score (using igraph.Graph.authority_score())
Your program should print out basic information about each graph, including
- the number of vertices ($n$)
- the number of edges ($m$)
- the number of degree-$0$ vertices
For each of the centrality measures computed above, your program should output a list of the top 50 most central vertices with respect to that measure. (If your data has named vertices, then you can print them here.)
2. Degree Correlations
Choose two undirected graphs, each with at least $10,000$ vertices. One place to find these is the Stanford Large Network Dataset Collection, but you can also take any other data you are able to find.
Write a program with the name part2.py that reads each of the graphs into an igraph.Graph object and computes some degree correlation measures.
- The degree assortativity (using igraph.Graph.assortativity_degree())
- The rich-club coefficient (you can import this from the Python richclub module)
Your program should print out basic information about each graph, including
- the number of vertices ($n$)
- the number of edges ($m$)
- the number of degree-$0$ vertices
For each graph, your program should output the degree assortativity coefficient and the rich-club coefficient values computed above.