How to create vector migration files#
You can create the JSON metadata and binary migration files needed by EMOD to run simulations from CSV data using the Python script below. You can assign the same probability of migration to each vector in a node or you can assign different migration rates based on gender or genetics of the vector.
Run the ‘convert_csv_to_bin_vector_migration.py’ script using the format below:
python -m emodpy_malaria.migration.convert_csv_to_bin_vector_migration [input-migration-csv] [idreference(optional)]
Note
The IdReference must match the value in the demographics file. The bin.json metadata file will be created without a valid IdReference with expectations that the user will set it themselves if that argument is not passed in.
CSV Input Configurations#
Below are different csv file input configurations you can use to create vector migration.
One rate for all vectors#
Header (optional): FromNodeID, ToNodeID, Rate (Average # of Trips Per Day) If the csv/text file has three columns with no headers, this is the format we assume.
Parameter |
Data type |
Min |
Max |
Default |
Description |
---|---|---|---|---|---|
FromNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
ToNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
Rate |
float |
0 |
3.40282e+38 |
NA |
Rate at which the all the vectors/humans will travel from the FromNodeID to ToNodeID. |
Example:
FromNodeID |
ToNodeID |
Rate |
---|---|---|
5 |
1 |
0.1 |
5 |
2 |
0.1 |
5 |
3 |
0.1 |
5 |
4 |
0.1 |
5 |
6 |
0 |
5 |
7 |
0 |
5 |
8 |
0.1 |
5 |
9 |
0.1 |
Actual csv:
1,2,0.0
1,3,0.0
1,4,0.0
1,9,0.0
2,1,0.0
2,3,0.0
4,3,0.0
4,5,0.0
4,9,0.0
5,1,0.00125
5,2,0.00125
5,3,0.00125
5,4,0.00125
5,6,0.00125
5,7,0.00125
Different rates for male and female vectors#
Header (optional): FromNodeID, ToNodeID, RateMales, RateFemales If the csv/text file has four columns with no headers, this is the format we assume.
Parameter |
Data type |
Min |
Max |
Default |
Description |
---|---|---|---|---|---|
FromNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
ToNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
RateMales |
float |
0 |
3.40282e+38 |
NA |
Rate at which the vector/human of male sex will travel from the FromNodeID to ToNodeID. |
RateFemales |
float |
0 |
3.40282e+38 |
NA |
Rate at which the vector/human of female sex will travel from the FromNodeID to ToNodeID. |
Example:
FromNodeID |
ToNodeID |
RateMales |
RateFemales |
---|---|---|---|
5 |
1 |
0.1 |
0.02 |
5 |
2 |
0.1 |
0.02 |
5 |
3 |
0.1 |
0.02 |
5 |
4 |
0.1 |
0.02 |
5 |
6 |
0 |
0.02 |
5 |
7 |
0 |
0.02 |
5 |
8 |
0.1 |
0 |
5 |
9 |
0.1 |
0 |
Actual csv:
FromNodeID,ToNodeID,RateMales,RateFemales
5,1,0.1,0.02
5,2,0.1,0.02
5,3,0.1,0.02
5,4,0.1,0.02
5,6,0,0.02
5,7,0,0.02
5,8,0.1,0
5,9,0.1,0
Different rates depending on genetics of the vector#
Header (required): FromNodeID, ToNodeID, [], arrays denoting Allele_Combinations Allele_Combinations: [[“a1”, “a1”], [“b1”, “b1”]] or [[“X1”,”Y2”]] or [[“*”, “a0”], [“X1”, “Y1”]] Due to use of commas in headers, it is best to use Excel to create the csv input files. The first (empty, []) array is used as a “default rate” if the vector’s genetics doesn’t match any of the Allele_Combinations. The other column headers denote the rate that the vector will travel at if it matches the Allele_Combinations listed. Vectors are checked against Allele_Combinations from most-specific, to least-specific, regardless of the order in the csv file. Allele_Combinations can, but don’t have to, include sex-alleles. Without specified sex-alleles, any vector that matches the alleles regardless of sex will travel at that rate. Use ‘*’ as a wildcard if the second allele does not matter and can be matched with anything.
Parameter |
Data type |
Min |
Max |
Default |
Description |
---|---|---|---|---|---|
FromNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, from which the vector/human will travel. |
ToNodeID |
integer |
1 |
2147480000 |
NA |
NodeID, matching NodeIDs in demographics file, to which the vector/human will travel. |
[] |
float |
0 |
3.40282e+38 |
NA |
Default rate at which the vector that doesn’t match any other allele combinations will travel from the FromNodeID to ToNodeID. |
User-defined Allele Combination |
float |
0 |
3.40282e+38 |
NA |
Rate at which the vector that matches this and not a more-specific allele combination will travel from the FromNodeID to ToNodeID. |
Example:
FromNodeID |
ToNodeID |
[] |
[[‘a1’, ‘a1’], [‘b1’, ‘b1’]] |
[[‘*’, ‘a0’], [‘X1’, ‘Y1’]] |
[[‘X1’,’Y2’]] |
---|---|---|---|---|---|
5 |
1 |
0.1 |
0 |
0 |
0 |
5 |
2 |
0 |
0.1 |
0 |
0 |
5 |
3 |
0 |
0 |
0.1 |
0 |
5 |
4 |
0 |
0 |
0 |
0.1 |
5 |
6 |
0 |
0 |
0 |
0 |
5 |
7 |
0.1 |
0.1 |
0 |
0 |
5 |
8 |
0.1 |
0 |
0.1 |
0.05 |
5 |
9 |
0 |
0.1 |
0 |
0 |
1 |
2 |
1 |
0 |
0 |
0 |
1 |
3 |
0 |
1 |
0 |
0 |
1 |
4 |
0 |
0 |
1 |
0 |
1 |
6 |
0 |
0 |
0 |
1 |
3 |
6 |
0 |
0 |
0 |
0 |
3 |
7 |
0 |
0.5 |
0 |
0 |
3 |
8 |
0.5 |
0 |
0 |
0.0 |
3 |
9 |
0 |
0.5 |
0 |
0 |
Actual csv:
FromNodeID,ToNodeID,[],"[[""X1"",""Y2""]]","[[""a1"", ""a1""],[""b1"",""b0""], [""X1"", ""X1""]]","[[""*"", ""a0""], [""X1"", ""Y1""]]","[[""a1"", ""a1""], [""b1"", ""b1""]]"
5,1,0.1,0,0,0,0
5,2,0,0.1,0,0,0
5,3,0,0,0.1,0,0
5,4,0,0,0,0.1,0
5,6,0,0,0,0,0.1
5,7,0.1,0.1,0,0,0
5,8,0.1,0,0.1,0.05,0.01
5,9,0,0.1,0,0,0.1
1,2,1,0,0,0,0
1,3,0,1,0,0,0
1,4,0,0,1,0,0
1,6,0,0,0,1,0
3,6,0,0,0,0,0.5
3,7,0,0.5,0,0,0
3,8,0.5,0,0,0.0,0.5
3,9,0,0.5,0,0,0.0
Migration binary file#
For information, see Binary file.
JSON metadata file#
The metadata file is a JSON-formatted file that includes a metadata section and a node offsets section. The Metadata section contains a JSON (JavaScript Object Notation) with parameters that help EMOD interpret the migration binary file. You are encouraged to add your own parameters to the section to remind your selves about the source, reason, and purpose of the binary file and the data it contains. Non-required parameters are ignored.
Vector Migration Metadata File Parameters#
Parameter |
Data type |
Description |
---|---|---|
IdReference |
string |
Required. A unique id to match demographics, climate, and migration files that work together. |
DatavalueCount |
integer |
Required.The number of outbound data values per node (max 100). The number must be the same across every node in the binary file. |
GenderDataType |
enum |
Required. Denotes whether data is provided for each gender separately, is the same for both, or depends on vector genetics. Accepted values are ONE_FOR_BOTH_GENDERS, ONE_FOR_EACH_GENDER, VECTOR_MIGRATION_BY_GENETICS. |
AlleleCombinations |
array |
Required for GenderDataType: VECTOR_MIGRATION_BY_GENETICS. An array of Allele_Combinations, starting with an emtpy array to mark the default migration rate. |
NodeCount |
integer |
Required. The number of ‘from’ nodes in the data. Used to verify size NodeOffsets - 16*NodeCount = # chars in NodeOffsets. |
NodeOffsets |
string |
Required. The number of rates/’to’ nodes for each ‘from’ node. Max of 100. |
DateCreated |
string |
(Informational for user only) Date and time the file was generated by the script. |
Tool |
string |
(Informational for user only) The script used to create the file. |
User-created parameter |
string |
(Informational for user only) Example of a user-created parameter |
Example#
{
"Metadata": {
"IdReference": "9-nodes",
"DateCreated": "Thu Nov 21 17:41:47 2024",
"Tool": "convert_csv_to_bin_vector_migration.py",
"DatavalueCount": 8,
"GenderDataType": "VECTOR_MIGRATION_BY_GENETICS",
"AlleleCombinations": [[], [["X1","Y2"]], [["a1","a1"], ["b1","b0"], ["X1","X1"]], [["*","a0"], ["X1","Y1"]], [["a1","a1"],["b1","b1"]]],
"NodeCount": 3,
"Project": "Migration based on Dr. Acula research."
},
"NodeOffsets": "0000000500000000000000010000006000000003000000C0"
}