generate_data_instances_set.py
Script to generate random network instances.
A network instance consists of a transition matrix, a binary adjustment matrix and an objective index.
The transition matrices (i.e., networks) can be generated by two methods: (1) filling a matrix with randomly generated uniform values and normalising the matrix row-wise; (2) generating a matrix Q with randomly generated uniform values and a matrix P generated according to a directed Erdős-Rényi graph, combining these matrices by calculating alpha*P+(1-alpha)*Q, and normalising the resulting matrix row-wise.
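The two generation methods can be sketched with NumPy as follows. This is an illustrative sketch, not the script's actual code: the function names are hypothetical, and it assumes that P is the 0/1 adjacency matrix of the directed Erdős-Rényi graph without self-loops, sampled here directly with NumPy rather than through networkx.

```python
import numpy as np


def random_uniform_network(n, rng):
    """Method (1): uniform random entries, normalised row-wise."""
    m = rng.random((n, n))
    return m / m.sum(axis=1, keepdims=True)


def mixed_network(n, alpha, prob, rng):
    """Method (2): alpha*P + (1-alpha)*Q, normalised row-wise."""
    q = rng.random((n, n))
    # Assumption: P is the 0/1 adjacency matrix of a directed
    # Erdős-Rényi graph, each edge present with probability `prob`.
    p = (rng.random((n, n)) < prob).astype(float)
    np.fill_diagonal(p, 0.0)  # assumption: no self-loops in P
    combined = alpha * p + (1.0 - alpha) * q
    return combined / combined.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
m1 = random_uniform_network(5, rng)
m2 = mixed_network(5, alpha=0.5, prob=0.3, rng=rng)
```

Both results are row-stochastic. With alpha equal to 1, a node without outgoing edges in P would give an all-zero row that cannot be normalised; the sketch does not handle that case.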
The binary adjustment matrix is generated by filling a matrix with randomly generated 0 and 1 values. A 50/50 probability is used for generating 0 and 1 values. To avoid self-loops, the diagonal is filled with zeroes.
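A minimal sketch of this step, assuming NumPy is used for the sampling (the function name is hypothetical and not taken from the script):

```python
import numpy as np


def binary_adjustment_matrix(n, rng):
    """Random 0/1 matrix with equal probabilities and a zero diagonal."""
    # integers(0, 2) draws 0 or 1 with equal probability (upper bound exclusive).
    c = rng.integers(0, 2, size=(n, n))
    np.fill_diagonal(c, 0)  # no self-loops
    return c


c_matrix = binary_adjustment_matrix(6, np.random.default_rng(1))
```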
The objective index of the instance is randomly selected between 0 and the network size.
Per instance, a plot with two histograms showing the edge weight distribution is provided.
The parameters of the script are explained below.
- Parameters:
START_SEED_VALUE (int) – The start seed value of the pseudo-random number generator. The first network instance is generated with seed START_SEED_VALUE, the second with seed START_SEED_VALUE+1, etc.
INSTANCE_SIZE_RANGE (list[int]) – The allowed network size range. The left boundary specifies the minimum network size, the right boundary the maximum network size. If you want to generate a network of a specific size, simply provide the same left and right boundary. For example, if you want to generate a network of size 100, provide [100, 100].
NR_INSTANCES (int) – The number of network instances that will be generated.
NETWORK_TYPE (str) – Specifies the method you want to use to generate the network instances. Method (1) above corresponds to “random_uniform”, and method (2) above corresponds to “networkx”.
ALPHA (float) – If method (2) is used to generate the network instances, the matrices Q and P are combined by ALPHA*P+(1-ALPHA)*Q.
PROB (float) – The probability that an edge is created in the Erdős-Rényi graph.
SAVE_OUTPUT (bool) – Whether the output should be saved or not.
DATA_DIRECTORY (str) – The folder, relative to the ROOT_DIRECTORY (automatically determined), where the resulting data will be saved if SAVE_OUTPUT=True.
INSTANCE_NAME (str) – The base name of the network instance(s). This name is currently used to define DATA_FILE_NAME_M_MATRIX, DATA_FILE_NAME_C_MATRIX, INSTANCE_PARAMETERS_FILE_NAME and RUN_PARAMETERS_FILE_NAME.
DATA_FILE_NAME_M_MATRIX (str) – The base name of the transition matrices. The generated matrices will be saved according to this name with a suffix “i” where “i” stands for the instance number starting at 0.
DATA_FILE_NAME_C_MATRIX (str) – The base name of the binary adjustment matrices. The generated matrices will be saved according to this name with a suffix “i” where “i” stands for the instance number starting at 0.
INSTANCE_PARAMETERS_FILE_NAME (str) – The file name of the table that will be saved if SAVE_OUTPUT=True and that contains information on the generated network instances. The table contains the following columns: Instance_name, Random_M, Random_C, Seed, Problem_size, Objective_index. The column Instance_name contains the name of each network. The columns Random_M and Random_C specify whether the transition matrices and binary adjustment matrices were generated randomly or not, respectively. The column Seed specifies the seed that was used for generating the instance. The column Problem_size specifies the network size. The column Objective_index specifies the randomly selected network node that will be optimised.
RUN_PARAMETERS_FILE_NAME (str) – The name of the text file with the script parameters that will be saved if SAVE_OUTPUT=True.
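The seeding scheme and the instance table described above can be sketched as follows. This is a hedged illustration only: the helper name, the "_" separator in the instance name, the inclusive treatment of both size boundaries and the exclusive upper bound of the objective index are assumptions, not details confirmed by the script.

```python
import numpy as np

START_SEED_VALUE = 0
INSTANCE_SIZE_RANGE = [5, 10]
NR_INSTANCES = 3
INSTANCE_NAME = "example_instance"  # hypothetical base name


def instance_rows(start_seed, size_range, nr_instances, base_name):
    """Build one table row per instance, using seed start_seed + i."""
    rows = []
    for i in range(nr_instances):
        seed = start_seed + i
        rng = np.random.default_rng(seed)
        # Assumption: both boundaries of the size range are inclusive.
        size = int(rng.integers(size_range[0], size_range[1] + 1))
        # Assumption: the objective index is a valid node index in [0, size).
        objective = int(rng.integers(0, size))
        rows.append({
            "Instance_name": f"{base_name}_{i}",  # assumed naming pattern
            "Random_M": True,
            "Random_C": True,
            "Seed": seed,
            "Problem_size": size,
            "Objective_index": objective,
        })
    return rows


rows = instance_rows(START_SEED_VALUE, INSTANCE_SIZE_RANGE,
                     NR_INSTANCES, INSTANCE_NAME)
```

Because each instance gets its own seed, any single instance can be regenerated from the Seed column of the table without rerunning the whole set.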