How to create vector migration files#

Using the Python script below, you can convert CSV data into the JSON metadata and binary migration files that EMOD needs to run simulations. You can assign the same migration rate to every vector in a node, or you can assign different migration rates based on the gender or genetics of the vector.

  1. Run the ‘convert_csv_to_bin_vector_migration.py’ script using the format below:

    python -m emodpy_malaria.migration.convert_csv_to_bin_vector_migration [input-migration-csv] [idreference(optional)]
    

Note

The IdReference must match the value in the demographics file. If the IdReference argument is not passed in, the bin.json metadata file is created without a valid IdReference, and you are expected to set it yourself.
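If the script was run without the IdReference argument, the value can be filled in afterwards. The following is a minimal sketch of one way to do that, assuming IdReference sits inside a top-level Metadata object in the bin.json file; the helper name `set_id_reference` and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def set_id_reference(metadata_path, id_reference):
    """Write id_reference into the metadata file and return the updated content."""
    path = Path(metadata_path)
    metadata = json.loads(path.read_text())
    # Assumption: IdReference is stored under a top-level "Metadata" object.
    metadata.setdefault("Metadata", {})["IdReference"] = id_reference
    path.write_text(json.dumps(metadata, indent=4))
    return metadata
```

For example, `set_id_reference("vector_migration.bin.json", "9-nodes")` would set the value to match a demographics file whose IdReference is "9-nodes".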

CSV Input Configurations#

Below are the different csv input configurations you can use to create vector migration files.

One rate for all vectors#

Header (optional): FromNodeID, ToNodeID, Rate (Average # of Trips Per Day)

If the csv/text file has three columns with no headers, this is the format we assume.

| Parameter | Data type | Min | Max | Default | Description |
| --- | --- | --- | --- | --- | --- |
| FromNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
| ToNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
| Rate | float | 0 | 3.40282e+38 | NA | Rate at which all the vectors/humans will travel from the FromNodeID to ToNodeID. |

Example:

| FromNodeID | ToNodeID | Rate |
| --- | --- | --- |
| 5 | 1 | 0.1 |
| 5 | 2 | 0.1 |
| 5 | 3 | 0.1 |
| 5 | 4 | 0.1 |
| 5 | 6 | 0 |
| 5 | 7 | 0 |
| 5 | 8 | 0.1 |
| 5 | 9 | 0.1 |

Actual csv:

1,2,0.0
1,3,0.0
1,4,0.0
1,9,0.0
2,1,0.0
2,3,0.0
4,3,0.0
4,5,0.0
4,9,0.0
5,1,0.00125
5,2,0.00125
5,3,0.00125
5,4,0.00125
5,6,0.00125
5,7,0.00125
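A file in this three-column format can also be generated programmatically. The sketch below uses Python's standard `csv` module and checks the minimum values from the parameter table (NodeIDs of at least 1, rates of at least 0); the function name `write_single_rate_csv` is hypothetical and not part of emodpy_malaria.

```python
import csv
import io


def write_single_rate_csv(rows, stream, include_header=True):
    """Write (FromNodeID, ToNodeID, Rate) tuples to a text stream as csv."""
    writer = csv.writer(stream, lineterminator="\n")
    if include_header:
        # The header is optional for this format.
        writer.writerow(["FromNodeID", "ToNodeID", "Rate"])
    for from_node, to_node, rate in rows:
        if from_node < 1 or to_node < 1:
            raise ValueError(f"NodeIDs must be >= 1: {from_node}, {to_node}")
        if rate < 0:
            raise ValueError(f"Rate must be >= 0: {rate}")
        writer.writerow([from_node, to_node, rate])


# Build the csv text in memory; pass an open file instead to write to disk.
buffer = io.StringIO()
write_single_rate_csv([(5, 1, 0.1), (5, 2, 0.1), (5, 6, 0)], buffer)
csv_text = buffer.getvalue()
```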

Different rates for male and female vectors#

Header (optional): FromNodeID, ToNodeID, RateMales, RateFemales

If the csv/text file has four columns with no headers, this is the format we assume.

| Parameter | Data type | Min | Max | Default | Description |
| --- | --- | --- | --- | --- | --- |
| FromNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
| ToNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
| RateMales | float | 0 | 3.40282e+38 | NA | Rate at which the vector/human of male sex will travel from the FromNodeID to ToNodeID. |
| RateFemales | float | 0 | 3.40282e+38 | NA | Rate at which the vector/human of female sex will travel from the FromNodeID to ToNodeID. |

Example:

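| FromNodeID | ToNodeID | RateMales | RateFemales |
| --- | --- | --- | --- |
| 5 | 1 | 0.1 | 0.02 |
| 5 | 2 | 0.1 | 0.02 |
| 5 | 3 | 0.1 | 0.02 |
| 5 | 4 | 0.1 | 0.02 |
| 5 | 6 | 0 | 0.02 |
| 5 | 7 | 0 | 0.02 |
| 5 | 8 | 0.1 | 0 |
| 5 | 9 | 0.1 | 0 |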

Actual csv:

FromNodeID,ToNodeID,RateMales,RateFemales
5,1,0.1,0.02
5,2,0.1,0.02
5,3,0.1,0.02
5,4,0.1,0.02
5,6,0,0.02
5,7,0,0.02
5,8,0.1,0
5,9,0.1,0
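Because the header is optional in this format, code that reads such a file has to decide whether the first row is a header. The sketch below shows one possible check, treating the first row as a header when its first cell is not an integer; this is an illustration only and not necessarily how the conversion script decides. The function name `read_sex_specific_rates` is hypothetical.

```python
import csv
import io


def read_sex_specific_rates(csv_text):
    """Parse four-column csv text into (from, to, rate_males, rate_females) tuples."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    # Optional header: skip the first row if its first cell is not an integer.
    if rows and not rows[0][0].strip().isdigit():
        rows = rows[1:]
    records = []
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row}")
        records.append((int(row[0]), int(row[1]), float(row[2]), float(row[3])))
    return records


sample = "FromNodeID,ToNodeID,RateMales,RateFemales\n5,1,0.1,0.02\n5,6,0,0.02\n"
records = read_sex_specific_rates(sample)
```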

Different rates depending on genetics of the vector#

Header (required): FromNodeID, ToNodeID, [], arrays denoting Allele_Combinations

Allele_Combinations: [["a1", "a1"], ["b1", "b1"]] or [["X1", "Y2"]] or [["*", "a0"], ["X1", "Y1"]]

Because the headers contain commas, it is best to use Excel to create the csv input files.

The first (empty, []) array is used as a "default rate" if the vector's genetics do not match any of the Allele_Combinations. The other column headers denote the rate at which the vector will travel if it matches the Allele_Combinations listed. Vectors are checked against Allele_Combinations from most specific to least specific, regardless of the order in the csv file.

Allele_Combinations can, but do not have to, include sex alleles. Without specified sex alleles, any vector that matches the alleles, regardless of sex, will travel at that rate. Use '*' as a wildcard if the second allele does not matter and can be matched with anything.

| Parameter | Data type | Min | Max | Default | Description |
| --- | --- | --- | --- | --- | --- |
| FromNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
| ToNodeID | integer | 1 | 2147480000 | NA | NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
| [] | float | 0 | 3.40282e+38 | NA | Default rate at which the vector that doesn't match any other allele combinations will travel from the FromNodeID to ToNodeID. |
| User-defined Allele Combination | float | 0 | 3.40282e+38 | NA | Rate at which the vector that matches this and not a more-specific allele combination will travel from the FromNodeID to ToNodeID. |

Example:

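| FromNodeID | ToNodeID | [] | [["a1", "a1"], ["b1", "b1"]] | [["*", "a0"], ["X1", "Y1"]] | [["X1", "Y2"]] |
| --- | --- | --- | --- | --- | --- |
| 5 | 1 | 0.1 | 0 | 0 | 0 |
| 5 | 2 | 0 | 0.1 | 0 | 0 |
| 5 | 3 | 0 | 0 | 0.1 | 0 |
| 5 | 4 | 0 | 0 | 0 | 0.1 |
| 5 | 6 | 0 | 0 | 0 | 0 |
| 5 | 7 | 0.1 | 0.1 | 0 | 0 |
| 5 | 8 | 0.1 | 0 | 0.1 | 0.05 |
| 5 | 9 | 0 | 0.1 | 0 | 0 |
| 1 | 2 | 1 | 0 | 0 | 0 |
| 1 | 3 | 0 | 1 | 0 | 0 |
| 1 | 4 | 0 | 0 | 1 | 0 |
| 1 | 6 | 0 | 0 | 0 | 1 |
| 3 | 6 | 0 | 0 | 0 | 0 |
| 3 | 7 | 0 | 0.5 | 0 | 0 |
| 3 | 8 | 0.5 | 0 | 0 | 0.0 |
| 3 | 9 | 0 | 0.5 | 0 | 0 |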

Actual csv:

FromNodeID,ToNodeID,[],"[[""X1"",""Y2""]]","[[""a1"", ""a1""],[""b1"",""b0""], [""X1"", ""X1""]]","[[""*"", ""a0""], [""X1"", ""Y1""]]","[[""a1"", ""a1""], [""b1"", ""b1""]]"
5,1,0.1,0,0,0,0
5,2,0,0.1,0,0,0
5,3,0,0,0.1,0,0
5,4,0,0,0,0.1,0
5,6,0,0,0,0,0.1
5,7,0.1,0.1,0,0,0
5,8,0.1,0,0.1,0.05,0.01
5,9,0,0.1,0,0,0.1
1,2,1,0,0,0,0
1,3,0,1,0,0,0
1,4,0,0,1,0,0
1,6,0,0,0,1,0
3,6,0,0,0,0,0.5
3,7,0,0.5,0,0,0
3,8,0.5,0,0,0.0,0.5
3,9,0,0.5,0,0,0.0
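The quoting in the header above (each allele-combination column wrapped in double quotes, with inner quotes doubled) is standard csv escaping, so it does not have to be typed by hand. As a sketch, Python's `csv` and `json` modules can produce the same kind of header; the function name `write_genetics_csv` is hypothetical, and whether the conversion script accepts the exact spacing that `json.dumps` emits is an assumption.

```python
import csv
import io
import json


def write_genetics_csv(allele_combinations, rows, stream):
    """Write a genetics-based migration csv with a properly quoted header."""
    writer = csv.writer(stream, lineterminator="\n")
    # The first rate column is the empty array [], which holds the default rate.
    header = ["FromNodeID", "ToNodeID", "[]"]
    header += [json.dumps(combo) for combo in allele_combinations]
    writer.writerow(header)
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} values, got {len(row)}")
        writer.writerow(row)


combos = [[["X1", "Y2"]], [["*", "a0"], ["X1", "Y1"]]]
rates = [(5, 1, 0.1, 0, 0), (5, 2, 0, 0.1, 0.05)]
buffer = io.StringIO()
write_genetics_csv(combos, rates, buffer)
csv_text = buffer.getvalue()
```

The `csv` writer quotes only the fields that need it, so `[]` stays unquoted while the allele-combination headers are quoted and escaped.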

Migration binary file#

For information, see Binary file.

JSON metadata file#

The metadata file is a JSON-formatted file that includes a metadata section and a node offsets section. The Metadata section contains a JSON (JavaScript Object Notation) object with parameters that help EMOD interpret the migration binary file. You are encouraged to add your own parameters to this section to remind yourself of the source, reason, and purpose of the binary file and the data it contains. Non-required parameters are ignored.

Vector Migration Metadata File Parameters#

| Parameter | Data type | Description |
| --- | --- | --- |
| IdReference | string | Required. A unique ID to match demographics, climate, and migration files that work together. |
| DatavalueCount | integer | Required. The number of outbound data values per node (max 100). The number must be the same across every node in the binary file. |
| GenderDataType | enum | Required. Denotes whether data is provided for each gender separately, is the same for both, or depends on vector genetics. Accepted values are ONE_FOR_BOTH_GENDERS, ONE_FOR_EACH_GENDER, VECTOR_MIGRATION_BY_GENETICS. |
| AlleleCombinations | array | Required for GenderDataType: VECTOR_MIGRATION_BY_GENETICS. An array of Allele_Combinations, starting with an empty array to mark the default migration rate. |
| NodeCount | integer | Required. The number of 'from' nodes in the data. Used to verify the size of NodeOffsets: 16 * NodeCount = number of characters in NodeOffsets. |
| NodeOffsets | string | Required. The number of rates/'to' nodes for each 'from' node. Max of 100. |
| DateCreated | string | (Informational for user only) Date and time the file was generated by the script. |
| Tool | string | (Informational for user only) The script used to create the file. |
| User-created parameter | string | (Informational for user only) Example of a user-created parameter. |

Example#

{
    "Metadata": {
        "IdReference": "9-nodes",
        "DateCreated": "Thu Nov 21 17:41:47 2024",
        "Tool": "convert_csv_to_bin_vector_migration.py",
        "DatavalueCount": 8,
        "GenderDataType": "VECTOR_MIGRATION_BY_GENETICS",
        "AlleleCombinations": [[], [["X1","Y2"]], [["a1","a1"], ["b1","b0"], ["X1","X1"]], [["*","a0"], ["X1","Y1"]], [["a1","a1"],["b1","b1"]]],
        "NodeCount": 3,
        "Project": "Migration based on Dr. Acula research."
    },
    "NodeOffsets": "0000000500000000000000010000006000000003000000C0"
}
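The NodeCount check (16 * NodeCount characters in NodeOffsets) can be verified with a few lines of code. The sketch below also decodes each 16-character entry, assuming it consists of 8 hexadecimal characters for the 'from' NodeID followed by 8 hexadecimal characters for an offset into the binary file. That layout is an assumption suggested by the example values, not something the parameter table states, and the function name `decode_node_offsets` is hypothetical.

```python
def decode_node_offsets(node_offsets, node_count):
    """Check the NodeOffsets length and split it into (node_id, offset) pairs."""
    if len(node_offsets) != 16 * node_count:
        raise ValueError(
            f"expected {16 * node_count} characters, got {len(node_offsets)}"
        )
    entries = []
    for start in range(0, len(node_offsets), 16):
        # Assumed layout: 8 hex chars of NodeID, then 8 hex chars of offset.
        node_id = int(node_offsets[start:start + 8], 16)
        offset = int(node_offsets[start + 8:start + 16], 16)
        entries.append((node_id, offset))
    return entries


# Values taken from the example metadata file above.
entries = decode_node_offsets(
    "0000000500000000000000010000006000000003000000C0", 3
)
```

Under this assumption, the example decodes to 'from' nodes 5, 1, and 3, which matches a NodeCount of 3.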