1. Quick tutorial

This quick tutorial shows how to use our package to compute the distance between a pair of graphs. It is meant as a simple usage example that introduces the main functions and provides code that can be copied and pasted if needed. The inputs and function parameters are discussed in depth in the next notebook.

1.1. Installation

Our package can be installed with the pip command.

pip install gdynadist

1.2. Basic use

Temporal graphs are handled as pandas DataFrames, as thoroughly described in this section.

import pandas as pd
from GDynaDist import Graphs4Distance

# load the two temporal graphs to compare
df1 = pd.read_csv('df1.csv')
df2 = pd.read_csv('df2.csv')

# initialize the Data class
Data = Graphs4Distance()

# add the two datasets to the Data class and give each one a name;
# the same names are used below to select the graphs to compare
Data.LoadDataset(df1, 'first_graph')
Data.LoadDataset(df2, 'second_graph')

# compute the unmatched distance between the two graphs
unmatched_distance = Data.GetDistance('first_graph', 'second_graph', distance_type = 'unmatched')

To compute the matched distance, the known bijective mapping between the nodes of the two graphs must be specified. This is passed as a dictionary. If the two graphs have the same node labels, the distance can be computed directly by running

matched_distance = Data.GetDistance('first_graph', 'second_graph', distance_type = 'matched', node_mapping = 'Same')
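
When the node labels differ, the mapping dictionary could be assembled and sanity-checked along the following lines. This is only a sketch: the helper `build_node_mapping` and the example labels are hypothetical and not part of the package, and the direction of the mapping (first-graph labels as keys, second-graph labels as values) is an assumption to verify against the next notebook.

```python
def build_node_mapping(nodes_first, nodes_second):
    """Pair the i-th node of the first graph with the i-th node of the second.

    Hypothetical helper (not part of the package). Raises ValueError when the
    resulting pairing would not be a bijection.
    """
    if len(nodes_first) != len(nodes_second):
        raise ValueError("the two graphs must have the same number of nodes")
    if len(set(nodes_first)) != len(nodes_first):
        raise ValueError("node labels of the first graph must be unique")
    if len(set(nodes_second)) != len(nodes_second):
        raise ValueError("node labels of the second graph must be unique")
    return dict(zip(nodes_first, nodes_second))


# illustrative labels only; in practice they come from the two datasets
node_mapping = build_node_mapping(['a', 'b', 'c'], [10, 20, 30])
```

The resulting dictionary would then be passed through the `node_mapping` argument in place of 'Same'.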

Note: The function has several parameters that can be adjusted to improve the efficiency of the distance calculation and memory usage.

You can refer to the next tutorial for a discussion about all parameters.


Note: Remember that a distance is an absolute measure of similarity, not a relative one. When using the distance for temporal graph analysis, always compare it with a baseline (for instance, a null model) to understand whether the two graphs under analysis are more or less similar than expected.

You can refer to this tutorial for some examples of how the distance can be used for data analysis.
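
One simple way to make such a comparison is to express the observed distance as a z-score against distances obtained from a null model. The sketch below assumes those null-model distances have already been computed (for instance, by computing the distance on randomized versions of one of the graphs). The function `zscore_against_null` and the numbers are illustrative and not part of the package.

```python
from statistics import mean, stdev


def zscore_against_null(observed, null_distances):
    """Return how many standard deviations `observed` lies from the null mean."""
    if len(null_distances) < 2:
        raise ValueError("need at least two null-model distances")
    sigma = stdev(null_distances)  # sample standard deviation
    if sigma == 0:
        raise ValueError("null-model distances have zero variance")
    return (observed - mean(null_distances)) / sigma


# made-up numbers, for illustration only
null_distances = [0.8, 1.0, 1.2]
z = zscore_against_null(0.5, null_distances)
```

A negative z-score indicates that the two graphs are closer to each other than the null model would suggest, while a positive one indicates that they are farther apart.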