This notebook tracks progress on the development of GMSE software R package, for game-theoretic management strategy evaluation, and related issues surrounding the development and application of game theory for addressing questions of biodiversity and food security.
It appears that CRAN has flagged an unusual test fail (ERROR). The CRAN checks for GMSE all give an OK status, but ATLAS tests (i.e., “tests with alternative BLAS/LAPACK implementations”) that use the Fedora shared libraries seem to give the error.
Running ‘testthat.R’
ERROR
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library("testthat")
> library("GMSE")
>
> test_check("GMSE")
[1] "Initialising simulations ... "
[1] "Initialising simulations ... "
[1] "Initialising simulations ... "
[ FAIL 1 | WARN 0 | SKIP 50 | PASS 21 ]
══ Skipped tests ═══════════════════════════════════════════════════════════════
• On CRAN (50)
══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test-make_utilities.R:33:5): Dimensions of action array are correct ──
dim(action) not equal to c(7, 13, 2).
1/3 mismatches
[1] 4 - 7 == -3
[ FAIL 1 | WARN 0 | SKIP 50 | PASS 21 ]
Error: Test failures
Execution halted
This is very strange, as the test is very simple. The test that failed was just checking whether or not the dimensions of the action array are accurate. They have to be for any simulation to run; this was a test set up very early in the development process. I cannot reproduce the error, even with the rhub package's check_on_fedora function. I am just going to skip this test on CRAN.
A new version of GMSE 1.0.0.1 has been released on CRAN (https://cran.r-project.org/package=GMSE). The most substantial update to version 1.0 is that a new simulated annealing algorithm has been introduced as an option for modelling agent decision-making. This simulated annealing algorithm can replace the genetic algorithm for both users and the manager in GMSE. Gabriela Ochoa and I have written a new vignette comparing the performance of simulated annealing to the genetic algorithm. Briefly, simulated annealing tends to perform better for users (in terms of simulation speed and fitness of decisions), especially when decisions are not complex. Under these conditions, we can therefore recommend it in place of the genetic algorithm.
Other updates are quite minor, and include a new argument mu_magnitude, which determines the highest mutation deviation allowed in the genetic algorithm (i.e., the amount by which an action is either increased or decreased). This used to be hard coded at a value of 10, and we still recommend the default mu_magnitude = 10 be used, but the new argument allows it to be changed.
A minor detail concerning the management of memory has also been fixed (briefly, memory was being allocated to the wrong level of pointer in a few places). This did not affect most systems, but did flag a note on some systems.
Finally, we have a new publication written as part of Adrian Bach's thesis coming out soon in Ecology and Society:
Bach, A., J. Minderman, N. Bunnefeld, A. Mill, A. B. Duthie. (2022). Intervene or wait? Modelling the timing of intervention in conservation conflicts adaptive management under uncertainty. Ecology and Society. In press.
The paper investigates the timing of management decisions, and introduces acceptable thresholds within which a population is perceived to be adequately within a target range, so no action is undertaken by management. It also introduces a budget bonus, giving a manager an increase in budget following time steps of inaction.
A new version of GMSE 0.7.0.0 has been released on CRAN now. The changelog provides a summary of the changes and bug fixes. One new feature was written by Adrian Bach, and GMSE now includes the ability to use trajectory-based prediction to determine whether or not the manager should intervene. The new feature will be used in a manuscript in preparation (note that on the GMSE website, I have now included a list of publications that use GMSE). This includes three new arguments:
traj_pred is an argument that takes a TRUE or FALSE value and determines whether or not the trajectory of a population should be used when a manager is deciding to intervene. This argument applies only when action_thres > 0 (i.e., when there is a range around the target population density within which a manager will not intervene by updating policy). When traj_pred = TRUE (default is FALSE), the manager will use a simple linear extrapolation from the density of the population in the previous time step and the current population density to predict what the population density will be in the next time step. If the predicted density is outside of the acceptable range specified by action_thres, then the manager will intervene.
bgt_bonus_rest is an argument that takes a TRUE or FALSE value and determines whether (TRUE) or not (FALSE) the manager's budget bonus for inaction will be reset to zero following a time step of policy update. If FALSE, then the budget is only reset when the cost of culling has been decreased in the previous time step. We will probably want to generalise this to the cost of all actions later, which will involve doing a bit of work to this block of code. Alternatively, we might consider renaming the argument to indicate culling somehow.
mem_prv_observ is an argument that takes a TRUE or FALSE value. When TRUE, it allows the manager to remember the population size from the previous time step. It should be used with the traj_pred option.
Regarding the last bullet, I’ve made an update to GitHub v0.7.0.1, which fixes a minor bug that I just noticed this morning that unfortunately made its way into the CRAN version. I don’t think that it will actually affect any simulations though, so I don’t see a need to immediately update on CRAN. See Issue #76 for more details.
There are new options introduced by Jeroen for incrementing user and manager budgets using yield from the landscape.
usr_yld_budget takes a numeric value that acts as a multiplier on the user's total yield. The value of the argument is multiplied by the user's total yield to get the user's budget increment. If, for example, we wanted to simulate a situation in which users' budgets were entirely based on their yield, we could set user_budget = 0 and usr_yld_budget = 1. Note that regardless of yield, user budget can never be less than 1 or greater than 100000.
man_yld_budget is the equivalent of usr_yld_budget, but for the manager. The manager's yield is accumulated on public land, so if public_land = 0, then there will be no increment to the manager's budget.
Improvements to the gmse_gui have also been made by Jeroen, which can be accessed online here. Note that I have not yet included the trajectory strategy options that Adrian developed into the GUI.
After some conversation, I decided to change the movement operations when resources feed more than once (times_feeding > 1) or require a threshold amount of food for survival (consume_surv > 0) or reproduction (consume_repr > 0). Previously, resources would feed, then move, times_feeding times. After this moving and feeding, resources would move again at the end of the time step. Somewhat confusingly and arbitrarily, resources would therefore get two moves in a row under these conditions before the time step ends. In GMSE v0.7, resources now only move once after feeding.
The new GMSE version also fixes some minor bugs, most notably getting rid of the rare cases in which resources were able to feed once on the landscape after dying. There are also new error catches for the gmse and gmse_apply functions to make sure that arguments are structured correctly (e.g., quickly flagging when a list or vector is used instead of a scalar value).
Several new features have been added, which will also make their way into the next version of GMSE v0.6.0.x. Two new arguments, usr_yld_budget and man_yld_budget, are now available in gmse and gmse_apply. These arguments allow an increment of user and manager budgets from yield obtained from the landscape. Total yield from one time step for a user is multiplied by usr_yld_budget, and this product is then added to whatever the user's baseline budget was. Similarly, the mean total yield of all users is multiplied by man_yld_budget, and this product is then added to whatever the manager's baseline budget was. There was a bit of an issue when integrating the code with the action threshold budget bonus, so the solution was to add two new columns to the agent array. The budget for any agent is then just the sum of the baseline and each of these columns. Hence future development affecting the budget bonus and yield-to-budget effects can more easily be done independently.
A new vignette has also been written by Adrian, which will be on the updated GMSE website soon.
A bug has been found and fixed, which was introduced into gmse_apply through the changes to the perceptions of actions for users and managers in commit 86c298eddd1c778defaac04d30ab342eef58fb50. The problem was that the internal elements of the paras vector were changed to reflect the columns where perceptions of different actions were to be located in gmse. But in gmse_apply, the same change was not made, so all of these elements were interpreted as low integers; it's remarkable that a segfault did not occur in any of the tests. Jeroen pointed out a major difference between the dynamics in runs of gmse and gmse_apply given identical parameter values, which led to the investigation into what was causing the problem. Finally, we found the aforementioned problem, which was resolved in commit 7a97eb08171147f688036851be5e3783ab631495. A subsequent fix was made to the testthat function paras vectors where needed. The dynamics in gmse and gmse_apply are once again the same. On Friday, we have tentative plans for a crash party with ConFooBio members to download the newest version of GMSE on the rev branch and try to crash it.
Over the last two days, I have made some changes to the code and done some additional testing. Changes are summarised below.
Fixed memory leak
Since I had been doing some substantial development in C, I thought it would be wise to run valgrind to check for memory leaks. I found one, due to my forgetting to free some memory in the new landscape.c file. The memory is now correctly freed in the file, and running gmse and gmse_apply in valgrind produces no memory leaks.
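For reference, the pattern involved, allocating and then freeing at both pointer levels of a 2D array, looks like this generic sketch (not the actual landscape.c code):

```c
#include <stdlib.h>

/* Allocate a rows x cols matrix: memory is needed at both pointer
 * levels, and both levels must later be freed. */
double **make_matrix(int rows, int cols){
    double **m = malloc(rows * sizeof(double *)); /* level 1: row pointers */
    for(int i = 0; i < rows; i++){
        m[i] = malloc(cols * sizeof(double));     /* level 2: each row */
    }
    return m;
}

/* Free in the reverse order: each row first, then the row-pointer block;
 * forgetting either level is the kind of leak valgrind reports. */
void free_matrix(double **m, int rows){
    for(int i = 0; i < rows; i++){
        free(m[i]);
    }
    free(m);
}
```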
==33279==
==33279== HEAP SUMMARY:
==33279== in use at exit: 67,085,047 bytes in 12,107 blocks
==33279== total heap usage: 896,509 allocs, 884,402 frees, 557,718,069 bytes allocated
==33279==
==33279== LEAK SUMMARY:
==33279== definitely lost: 0 bytes in 0 blocks
==33279== indirectly lost: 0 bytes in 0 blocks
==33279== possibly lost: 0 bytes in 0 blocks
==33279== still reachable: 67,085,047 bytes in 12,107 blocks
==33279== of which reachable via heuristic:
==33279== newarray : 4,264 bytes in 1 blocks
==33279== suppressed: 0 bytes in 0 blocks
==33279== Reachable blocks (those to which a pointer was found) are not shown.
==33279== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==33279==
==33279== For counts of detected and suppressed errors, rerun with: -v
==33279== ERROR SUMMARY: 2736 errors from 4 contexts (suppressed: 0 from 0)
I have also initialised a new file in the notebook for functions that we will want to periodically run through valgrind.
Perceptions of actions for users and managers
Following a conversation with Jeroen and Nils, we decided that another change would be beneficial for GMSE v0.6.0.x. In all previous versions of GMSE, how managers and users perceive the consequences of scaring actions is hidden. The consequences were calculated in a reasonable way by GMSE and used within the fitness functions of the genetic algorithm to determine which strategies of setting policy (for the manager) or acting (for users) were best. The relevant values for the perceived effects of scaring, culling, castration, feeding, helping offspring, tending crops, and killing crops were set partly in the paras vector, and partly calculated internally in C. Now, instead, they are calculated for each agent (manager and users) and placed directly in the AGENTS array.
The AGENTS array has therefore increased by seven columns. Rows in each of these columns hold the perceived effects of each action for each agent. I'm not sure why I did not think of this before, but there are several benefits to doing it this way. First, anyone running gmse or gmse_apply can now directly alter how users perceive the effects of their own actions using the perceive_scare, perceive_cull, perceive_cast, perceive_feed, perceive_help, perceive_tend, and perceive_kill arguments, which set how perceived utility will change given a single instance of scaring, culling, castration, feeding, helping offspring, tending crops, and killing crops, respectively. By default, all of these arguments will be calculated in a realistic way, as has been described already for the developing version, so the arguments can be ignored and GMSE will run as normal. But if, e.g., there is reason to model the users' perceived efficacy of culling as being much higher than that of scaring (e.g., based on parameterisation from empirical social data), then more appropriate values for a specific system can be used. Additionally, with gmse_apply, the agents array can be changed directly, meaning that manager perception of action effects can be changed, and that perceptions among users can potentially be varied (e.g., if some users are more convinced that scaring will work, while others are more sceptical). I also like that the perceived effects are being returned in the agents array, so the whole parameterisation of the fitness function is much more transparent now.
The code now runs with no memory leaks, and GMSE builds on the rev branch with no notes, warnings, or errors.
Minor bug fix
Jeroen and I have fixed a minor bug in the GMSE code that was causing a crash in the very restricted situation in which the package is built using the Rstudio drop-down menu. In such builds (but not, curiously, when building from the console), a fatal crash can be caused when looping the gmse_apply function with variable names that are identical to argument names (e.g., stakeholders = 8, followed by gmse_apply(stakeholders = stakeholders)). Just in case, to avoid this issue entirely, I have inserted a short function into gmse_apply that produces an error when attempting to set variable names equal to argument names. This can be completely circumvented in gmse_apply by setting the hidden argument my_way_or_the_highway equal to anything (i.e., not NULL). I have also included a garbage collect at the end of the gmse and gmse_apply functions.
Consistency between gmse and gmse_apply
In previous versions of GMSE, the functions gmse and gmse_apply had a few slightly different defaults. This sometimes led to slightly different dynamics when running simulations and comparing between gmse and looped gmse_apply under default arguments. To avoid any confusion, Jeroen and I have decided to make gmse and gmse_apply default arguments completely identical. The equivalence of the two functions given the same initial conditions has been confirmed through simulations.
Updated vignettes
I have now checked vignettes 5-7 to see if updates were needed. Only an update to vignette 7 was necessary. Note that even where the code for vignettes remains unchanged, it would still be a good idea to re-knit everything for the website as a test to make sure that everything runs as expected with new defaults for v0.6.0.x.
Uneven land allocation
Jeroen has incorporated some code and a new ownership_var argument to adjust the variation in land ownership among users. I've re-worked the way that public land is handled in the new landscape owner building function to account for this, and everything appears to work as intended.
Bug fix to public land allocation
A small bug was noticed by Jeroen this morning relating to public land allocation. Essentially, when the proportion of public land amounted to fewer cells on the landscape than allocated to a single user, public land would default to zero cells. We needed some way for there to be nonzero public land that was also quite small. In such situations, a new function small_public_land is called. Public land is placed in the centre of the landscape, thereby cutting into at least two users' owned cells equally. First a rectangle proportional to the landscape is produced, followed by more public land cells attached to each side of this rectangle as necessary to achieve approximately the correct number of cells. It is important to recognise that simulation parameter values need to be chosen with care. If the total number of landscape cells cannot be evenly distributed among users and public land, then of course deviations from gmse and gmse_apply arguments will occur. The updates cause small deviations to occur for larger sections of landscape when possible. For example, the below produces a landscape with four equal quadrants, three for users and one for public land.
sim <- gmse(time_max = 4, stakeholders = 3, public_land = 0.25,
land_dim_1 = 100, land_dim_2 = 100, land_ownership = TRUE);
The below is a situation in which the amount of public land is not small. A total of 2000 public land cells are called for, with the remaining 8000 to be divided evenly amongst nine users. This is impossible, and the shortest-splitline algorithm cannot give that level of precision and still keep the landscape divisions clean.
sim <- gmse(time_max = 4, stakeholders = 9, public_land = 0.20,
land_dim_1 = 100, land_dim_2 = 100, land_ownership = TRUE);
Instead, 1840 cells are allocated to public land, with each user being allocated rectangular blocks of 900 or 920 landscape cells. But when the public land to be allocated is very low (fewer cells than given to a single user), deviations are instead placed on users to make the public land proportion as accurate as possible in the centre of the landscape.
sim <- gmse(time_max = 4, stakeholders = 7, public_land = 0.03,
land_dim_1 = 100, land_dim_2 = 100, land_ownership = TRUE);
In the case of the above, e.g., exactly 300 public land cells are put down in the centre of the landscape, while users are given somewhere between 1303 and 1462 cells.
Update gmse GUI
The gmse_gui function has now been updated so that all options are available (and default landscape allocation is enacted) for running simulations from within the shiny GUI. I have not put this up on the shiny server yet, but on the rev branch, the new GUI can be used within R. This now concludes all of the planned updates for code changes to v0.6.0.x, and now the last two things to do will be to add vignettes and update the website. Please contact me to request any additional changes.
Yesterday and today I implemented the fourth bullet point listed on 6 MAY 2020, which changes the default pattern of land ownership in GMSE. In over a decade of coding individual-based models, I think that this might be the first time that I have needed to apply recursion to a problem. I actually only just learned that recursion is possible in R, but decided to write the function in C just to get a bit more speed.
The goal of the new algorithm is to divide a rectangular landscape of length land_dim_1 and width land_dim_2 into stakeholder smaller rectangles that are as equal as possible in area (i.e., cell number). To do this, I used a shortest-splitline algorithm in a new file landscape.c. The relevant function is the break_land function. For a given rectangular landscape, the break_land function draws a line splitting the landscape along its longest dimension (i.e., if the rectangle has a length of 20 and a width of 10, it is split lengthwise). If the number of owners to be placed within the landscape is even, then the line splits exactly halfway down the middle of the landscape. If the number of owners to be placed within the landscape is odd, then the line is placed in proportion to the ratio of ownership; in other words, if 3 owners need to be placed on the landscape, then the split line would be placed 2/3 of the way along the longest dimension (3/5 of the way for 5 owners, 4/7 of the way for 7 owners, and so forth). The rectangular landscape is then split into two smaller sub-rectangles. The same splitting procedure is then run on these sub-rectangles with the new number of owners (i.e., half of the original in the case of an even split, and appropriate portions such as 1 and 2, or 2 and 3, for an odd number of original owners, with the biggest rectangle getting the higher number of owners). The splitting of the landscape into smaller and smaller sub-rectangles continues until the number of owners that need to be placed on a rectangle equals one, in which case an integer representing an agent's ID is placed on all landscape cells in the rectangle.
In the case of public land, the correct proportion public_land is put on the landscape by running the above algorithm in the break_land function with an appropriately increased number of owners. If, for example, half of the land is meant to be public, then the total number of unique sub-rectangles to be made is adjusted to users / (1 - proportion_public_land). Where proportion_public_land = 0.5, e.g., the total number of sub-rectangles to be made will be twice the number of users. Once all of the cells are assigned an integer value by the algorithm, all cells with a number higher than any user ID are then re-assigned as public land.
These changes have now been pushed to the rev branch, so as of writing this, they are available to use. Relevant testthat functions have been adjusted where necessary to account for the code changes, and the new code passes all checks and all CRAN tests with no notes, warnings, or errors.
I like the above algorithm so much that I do not see any real need to even make the old straight-line simple sort approach an option in gmse. The old approach was a simplification that always seemed a bit unrealistic, and I always wanted to come up with a better way of allocating owned landscape cells. Please contact me or raise an issue on GitHub if you really want the old way back in gmse (note that gmse_apply of course always allows custom landscape definition).
With this new feature, I have now tackled bullet points 1, 3, and 4 from 6 MAY 2020. The next thing that I will do will be to update the GUI, which I do not expect to take long. Once this is finished, I will begin writing my vignette on resource density, and updating the website accordingly. I think I might also update the vignette on default GMSE structures to briefly discuss the above algorithm and how landscape ownership is assigned. I will put out another announcement before sending to CRAN, but please let me know if there are changes that you would like to see in GMSE v0.6.0.x.
I have made some progress on the plans from 6 MAY 2020. None of my updates are incorporated into the main R branch, so they are not permanent and I am still happy to take suggestions for improvements or other ideas in addition to, or in contrast with, what follows.
I have completed the first bullet point of the plans from 6 MAY 2020. The order of operations is therefore changed in the resource model on the work-in-progress rev branch. Hence, calls to the default resource model proceed by first having resources consume on the landscape, then birth, death, and movement. Within resource consumption rules, feeding happens first, followed by movement, for up to times_feed iterations. The landscape consumption also now calls one of two functions: one function where consume_surv > 0 or consume_repr > 0, in which consumption happens in a random order among resources for a variable number of times_feed, and a second (faster) function where consumption happens once for all resources (in order of the array, as the order of consumption does not affect the resources themselves).
Despite the information in the vignette on GMSE structures, column 13 of the agents array was actually being used within the more or less unused anectdotal function, in which each agent 'looks around' from the cell on which they are standing to see how many resources are in the general area. This function has no purpose at the moment (I was thinking about using it to model disagreement between the manager's estimate and the users' inferences about population size), but writes to the column anyway in gmse (but not gmse_apply). I'm going to refer to it as written to, but unused, in the vignette, and use column 14 of the agent array to record the number of cells owned.
Writing the number of cells owned is done in a new count_agent_cells function in R or the count_owned_cells function in C. I wrote the R function first, but then had second thoughts about relying on it in gmse and particularly gmse_apply. What if we later want land ownership to change over the course of a simulation? I'm not sure in this case, actually, how much extra speed we get from C (it relies on two nested for loops across the landscape, rather than a which or == in R, which I suspect is actually just as fast since the underlying C is probably better than mine). In any case, it seems to make sense to have the owned cells calculated at the beginning of the user function, so this is what I ultimately decided on. But the R function can also be used.
I have changed the fitness function for users such that E_off is the lambda parameter plus the amount a resource consumes on a landscape cell divided by the amount needed to be consumed for producing one offspring. These changes are now pushed to the rev branch, which is still under ongoing development. This effectively takes care of the third bullet point from 6 MAY 2020. Hence, bullets one and three are now done.
I will try to make the above changes as soon as possible, but the changes that will have the biggest effect on simulations are now finished, so people can play around with the new options. The update in progress builds with no warnings, errors, or notes, and passes all tests in testthat.
New version of GMSE
A new version of GMSE v0.5.0.0 is now available on GitHub. Here are the major updates: new action_thres and budget_bonus arguments in gmse and gmse_apply; new age_repr arguments, with random sampling from a range of user budgets around user_budget that can be set to plus or minus usr_budget_rng in gmse and gmse_apply; and new resource consumption arguments consume_surv, consume_repr, and times_feeding (see details below).
The new version can be downloaded from GitHub, but will not be submitted to CRAN. It is likely to be the last version of GMSE that is completely backwards compatible with previous versions. In other words, if you set parameter values for previous versions of GMSE as arguments in gmse or gmse_apply and ran simulations, then you are effectively running the same code (and should get the same results) in v0.5.0.0. In the next version update, this will not necessarily be true because minor changes to the default order of operations are planned (see below). In practice, this is highly unlikely to change any results from simulations in previous versions, but it is good to be aware of what has changed if you are expecting old code to run identically on new versions of GMSE.
Plans for GMSE v0.6.0.x: what will change
While backwards compatibility has been a priority up until now with GMSE development, the most recent developments in GMSE v0.5.0.0 make it clear that the benefits of making some small changes to the default simulations outweigh the costs. This mostly has to do with the order of operations in the default resource submodel (currently: movement, feeding, birth, death). Given the new option of resource consumption affecting birth and death rates, and the consequences of farmer actions (i.e., when land_ownership == TRUE), a change in the order makes sense. When feeding affects birth and death in v0.5.0.0, resources first move once, then move again, then eat, move, eat, move, etc. for as many times_feeding as defined. Hence, resources are moving twice at the start of a time step, potentially landing on good landscape cells but then not eating at all. With respect to farmer actions, if resources move far in a single time step, then farmers' efforts might have very little actual effect on their individual yield. This is because, since movement happens first, resources might be highly likely to move to another farmer's land anyway, making the actions futile. This might not be a problem for modelling some systems (where such actions are in fact futile), but the fitness function of the genetic algorithm is designed in such a way as to have farmers assume that one scare or cull will decrease the effects of a resource accordingly. In practice, this doesn't matter too much because the absolute values in the fitness functions are arbitrary for now (i.e., we could multiply them by the probability that a resource did not leave the farmer's landscape and expect the same farmer actions; there's nothing else they can do). But it could become an issue later, and we might at some point be really interested in looking at patterns among different users in a more interesting way. Therefore, in GMSE v0.6.0.x, I plan to change the order of operations in the default resource submodel (some more of my reasoning is discussed in Issue 56).
Overall, I plan for the following changes:
- Resources will feed, then move, for up to times_feeding. If the resource is not scared or culled at the end of the previous time step (user submodel), then it will feed on the landscape cell on which it last resided; scaring and culling will therefore directly affect a farmer's landscape. At the end of the resource submodel (after all of the feeding, birth, and death have happened), resources will once again move. This is because the feeding process is not forced in GMSE, so if the defaults consume_surv = 0 and consume_repr = 0 were used, movement would not happen at all in the resource submodel. These changes will be small in the code, but will change the default behaviour of gmse() and gmse_apply().
- Updating gmse_gui(). This is just an oversight from the last version, as I should have done this for the new arguments available in v0.5.0.0. The new GUI will look the same as the old (I'm not sure if anyone is actually using it, but it's worth updating just in case), and I will send it to the same place on shiny so that it can also be run without using R.
- Revising how actions affect perceived resource numbers: currently, culling decrements by 1 + lambda, feeding increments by lambda, and helping offspring increments by one. This has worked for a while, but I think some modifications will make it more realistic.
- Where land_ownership = FALSE, scaring also won't make any sense (previously, people using GMSE just had to recognise that the combination land_ownership = FALSE and scaring = TRUE was nonsensical; now the users will recognise it themselves).
- Deciding what a user perceives the effect of culling to be where consume_repr = 0, and something else where consume_repr > 0. In the absence of any better ideas, I propose that the user assumes that a resource will eat all of the contents of a single landscape cell and produce the consequent number of offspring. This will be false most of the time, but it seems reasonable from the perspective of a landowner, who would have no way of knowing how much more a single resource would be expected to eat and reproduce as a consequence of feeding. Hence, from the user's perspective, the effect of culling will be a decrease in resources of 1 + lambda + (res_consume / consume_repr).
- Updates relating to the new consume_surv and consume_repr arguments.
The end result of all of this will be, I think, an improved GMSE, but also one that breaks from previous versions in the default options and potentially stakeholder behaviour. For example, land-owning users will inevitably see culling as a slightly better option for increasing yield, all else being equal (i.e., given identical costs, as set by the manager). I'm keen to start this new version as soon as possible. The new version will be pushed to CRAN after these changes have been implemented. I will attempt this by the end of the month, or early June at the latest. Please contact me either via email or GitHub if you have any thoughts or concerns about the next version.
With commit 6d1e5e6ca41bc6bb45c62cb174f34480c3c5add1,
I believe that I now have resolved Issue #56 and created a
version of GMSE that now allows resource death and reproduction to be
tied to resource consumption on the landscape. Initial testing shows
that this works in both gmse
and gmse_apply
. I
have not had a chance to code in the suggestion from Jeroen to have it
be the probability of survival or reproduction that changes
with increasing consumption. Rather, given that a resource has consumed
a sufficient quantity of resources consume_surv
, the
resource will survive. A separate new gmse
argument
consume_repr
ties reproduction to landscape consumption by
having resources produce floor(consumed / consume_repr)
offspring given some total consumption consumed
.
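In other words, the rules map a resource’s total consumption to survival and offspring number as follows (a minimal sketch of the logic described above; consumption_outcome is a hypothetical helper, not the package internals):

```r
# Hypothetical sketch of the rules described above (not GMSE internals):
# a resource survives if it has consumed at least consume_surv units,
# and produces floor(consumed / consume_repr) offspring.
consumption_outcome <- function(consumed, consume_surv, consume_repr) {
  list(survives  = consumed >= consume_surv,
       offspring = floor(consumed / consume_repr))
}

consumption_outcome(consumed = 7, consume_surv = 2, consume_repr = 3)
# survives, with floor(7 / 3) = 2 offspring
```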
There is still considerable stochasticity introduced in terms of
whether or not a resource will survive or reproduce. First, each
resource can potentially feed more than once by using the new
times_feeding
argument. When
times_feeding > 1
(default is 1), resources will consume
the landscape, then move, then consume, and so forth until all of their
times_feeding
have run out. In practice, all resources get
the same times_feeding
value, but the code for executing it
depends on a new column in the resources array. This means that there
can be variation in the number of times a resource feeds if using
gmse_apply
(or gmse
, if we later decide to
sample feeding rates from a distribution). Feeding events also occur in
a random order, which avoids the first resource on the list filling
themselves up before moving on to the next resource (as would be a
natural way to program it with a for loop). I did it this (slightly less
efficient) way to more realistically model how resources would really
move around simultaneously on a landscape eating as they move from place
to place.
Hence, in the end, there is a lot of uncertainty surrounding whether
an individual resource will have enough to survive and reproduce in a
time step. But the probabilistic idea of Jeroen could also be imposed
with a function of some sort that would link consumption to a
probability of death (will need to think of the best way to actually do
this in the code though). Testing shows that the resource population
increases to a carrying capacity imposed by the constraints of the
landscape, with a lot more stochasticity than occurs just by setting a
value for a global
res_death_K
. To get started, for example, the below
demonstrates the new features.
sim <- gmse(consume_surv = 2, consume_repr = 3, times_feeding = 6,
res_birth_type = 0, res_death_type = 0, land_ownership = TRUE,
stakeholders = 12, scaring = TRUE, time_max = 40, user_budget = 1);
I set the user_budget
to unity above just to show how
the population rises to a natural carrying capacity when the manager and
users are unable to cause any population change. We can recover a
(usually) well-regulated population by setting the
user_budget
back to the default.
sim <- gmse(consume_surv = 2, consume_repr = 3, times_feeding = 6,
res_birth_type = 0, res_death_type = 0, land_ownership = TRUE,
stakeholders = 12, scaring = TRUE, time_max = 40,
user_budget = 1000);
Just to clarify, each landscape cell still puts out a value of 1. The
argument consume_surv = 2
means that resources need to
consume two cells worth to survive a time step (by default, resources
consume half of what’s on a landscape cell with each visit). The
argument consume_repr = 3
means that resources need to
consume three cells worth to produce one offspring. The argument
times_feeding = 6
means that resources get to feed on six
cells in each time step.
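A back-of-envelope check of these settings (assuming a resource always lands on full cells, which real simulations will not guarantee):

```r
# Assuming every visited cell is full (value 1) and a resource consumes
# half of a cell's contents per visit, the maximum intake per time step:
times_feeding    <- 6
intake_per_visit <- 0.5                   # half of a full cell's value of 1
max_intake       <- times_feeding * intake_per_visit
max_intake                                # 3 units at most
max_intake >= 2                           # enough to survive (consume_surv = 2)
floor(max_intake / 3)                     # 1 offspring at consume_repr = 3
```

So under these settings a well-fed resource can just survive and produce a single offspring; in practice, grazed cells and competition push realised intake below this ceiling.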
Setting res_birth_type = 0
and
res_death_type = 0
effectively removes all other factors in
GMSE that affect resource birth and death. Hence, what is observed from
running the above code is purely a consequence of the new rules limiting
survival and reproduction due to landscape consumption. Note that these
values need to be specified, else GMSE will assume that other defaults
apply in addition to the effects of landscape consumption. In
other words, the global external carrying capacity from
res_death_K
will still limit population growth because by
default res_death_type = 2
, and the mean birth number
lambda
will still apply because
res_birth_type = 2
. I don’t want to override this just in
case anyone wants to include multiple types of processes affecting birth
and death (I also want the new version to be backwards compatible with
existing code), but I think it will be important for me to write some
documentation that makes using these new features really clear. I’ll
start a new vignette.
The new version is GMSE v0.5.0.0 with Adrian now included in the list
of contributors. The action_thres
and
budget_bonus
can now be run in gmse
and
gmse_apply
, and usr_budget_rng
and
age_repr
are also available options. The new version passes
all CRAN checks with no notes, warnings, or errors. The new version
should be completely backwards compatible. If you have existing code to
run, it should do the same thing in the new version that it did in
previous versions. Everything is now just a new feature that can be
used, but is not by default.
The last thing that I think I want to do before merging with master,
updating the website, and sending to CRAN, is to make a new default
landscape. It would be good to make each landscape into more farm-like
blocks rather than the narrow stripes that exist now. I don’t have a
good algorithm for doing this for an arbitrary number of stakeholders
(i.e., for N stakeholders, separate an X by Y
landscape into roughly equal blocks of ca. (X*Y)/N cells each). If anyone
wants to have a go at this and create a function that does it, please
let me know! Else I’ll try to get around to it before early next month
when the plan is to send it to CRAN. If anyone has time (Nils, Jeroen,
Adrian), feel free to download from the dev
branch and
go.
install.packages("devtools");
library(devtools);
install_github("ConFooBio/GMSE", ref = "dev")
It should be up now.
I have now completed two major tasks in a GMSE update, first mentioned earlier in the month. The new code is up on the dev branch.
I have resolved Issue #53, so partial
matching is not allowed in gmse_apply
(or in the tests). I
have also resolved Issue #59, meaning that
it is now possible to use different action thresholds and budget bonuses
in both gmse
and gmse_apply
(arguments
action_thres
and budget_bonus
), and variable
user budgets can now be specified in gmse
, and in
gmse_apply
(argument usr_budget_rng
– of
course, manual edits to user budgets were always allowed in
gmse_apply
by directly editing the agents array).
In making these changes, I have also allowed gmse_apply
to keep track of the current time step when old calls to the function
are passed back through old_list
(i.e., when looping an old list as in the example linked
here, the time step will also be tracked automatically through the
loops, not just outside of gmse_apply
through the loop
index – this makes the action threshold possible, as it only applies
after the first time step).
I now need to move on to Issue #56, which will more or less complete the new version, though I also might update the default landscape ownership placement.
New tests have been written in testthat, and the package builds with no errors, warnings, or notes.
I am now beginning a moderate update to GMSE, which I hope to have complete and up on CRAN by the end of the month. There are three specific goals of this update:
Resolve Issue #53: Just to head
off any potential issues in gmse_apply
, several places in
the code use the format sim_list$element_name
rather than
sim_list[["element_name"]]
. The latter is more robust because it
prohibits partial matching. Hence, in gmse_apply.R,
this should be changed.
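The hazard is easy to demonstrate in plain R: the $ operator partially matches list names, while [[ with a full string does not.

```r
# `$` partially matches element names on lists, which can silently
# return the wrong element; `[[` with an exact string cannot.
sim_list <- list(resource_array = matrix(0, nrow = 2, ncol = 2))

is.null(sim_list$res)           # FALSE: `$` partially matches resource_array
is.null(sim_list[["res"]])      # TRUE: exact matching finds no such element
identical(sim_list$res, sim_list[["resource_array"]])   # TRUE
```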
Resolve Issue #59: The code from AdrianBach needs to be incorporated into the master branch of GMSE. Doing this will mainly consist of resolving a merge conflict and a bit of refactoring, and I plan to do this early on in developing what I hope to be a new version of GMSE by the end of APR 2020.
Resolve Issue #56: This is the
major one. I want to add options in GMSE to allow resource dynamics to
be dependent upon landscape properties. Specifically, for resources to
consume the landscape cell contents, and for options to exist that allow
resource birth, death, and (potentially) movement to be dependent on
consumption. This will mainly involve updates to the resource model,
with updates to the gmse
and gmse_apply
functions as needed.
In addressing 1-3, I will need to check to make sure that
gmse_apply
is handling all of the new options correctly,
and new tests will need to be written as necessary.
I will continue to update as need be and make sure that any updates from other team members (e.g., Issue #58) are merged appropriately before the new GMSE version.
Yesterday, I received an email noting a problem with the GMSE
documentation. Specifically, a warning arose because ...
was in the documentation but is no longer allowed as an argument in the
default R functions. Notice the issue on this
line of the manager.R file, which is contradicted later by the
function itself. This has now been resolved.
GMSE v0.4.0.11 is now available on GitHub. Some minor changes and bug fixes are included in the updated version.
One bug fix concerned the paras
vector
in development; the bug also made repeated calls to gmse_apply
crash when the exact same arguments were specified but some values
(e.g., resource abundance) changed. Documentation has been added for
gmse_apply
, noting some special considerations that could potentially cause
confusion. I will update once more when GMSE v0.4.0.11 is successfully submitted to CRAN.
GMSE v0.4.0.7 is now available on GitHub, and the official GMSE repository is now transferred to the ConFooBio organisation. I am currently in the process of fixing some website issues, but most of the important stuff has transferred to the new website location.
GMSE v0.4.0.3 is now available on CRAN. A new website for GMSE has also been launched. This website was built with the R package pkgdown, recently released on CRAN. The site contains all of the vignettes and documentation for GMSE, and also includes a link to this lab notebook. A submission of the accompanying manuscript will soon be uploaded on bioRxiv.
A new GMSE v0.4.0.3 has now been pushed to the master branch on GitHub and has been submitted to CRAN. The biggest update in this new version is a series of vignettes, plus a minor improvement to the genetic algorithm. More updates will follow soon, including some re-organisation of the GMSE project and a new manuscript submission.
I have re-worked the way that a manager re-estimates how the change in
their policy affects users’ actions. The new new_act
function in the genetic algorithm (games.c
) performs well
for getting more precise cost settings. The former way of doing it was
much more of a blunt instrument, and it had a ceiling issue – that is,
the manager would believe that higher costs caused fewer actions
even when the resulting cost was over the users’ budgets.
/* =============================================================================
* This function updates an action based on the change in costs & paras
* old_cost: The old cost of an action, as calculated in policy_to_counts
* new_cost: The new cost of an action, as calculated in policy_to_counts
* paras: Vector of global parameters
* ========================================================================== */
int new_act(double old_cost, double new_cost, double old_act, double *paras){
int total_acts;
double users, max_to_spend, acts_per_user, cost_per_user, total_cost;
double pr_on_act, budget_for_act, mgr_budget, min_cost;
users = paras[54] - 1; /* Minus one for the manager */
min_cost = paras[96]; /* Minimum cost of an action */
max_to_spend = paras[97]; /* Maximum per user budget */
mgr_budget = paras[105]; /* Manager's total budget */
total_cost = 0.0;
if(old_cost < mgr_budget){
total_cost = old_act * old_cost; /* Total cost devoted to action */
}
cost_per_user = (total_cost / users); /* Cost devoted per user */
pr_on_act = cost_per_user / max_to_spend; /* Pr. devoted to action */
/* Assume that the proportion of the budget a user spends will not change */
budget_for_act = max_to_spend * pr_on_act;
/* Calculate how many actions to expect given acts per user and users */
acts_per_user = budget_for_act / (new_cost + min_cost);
total_acts = (int) (users * acts_per_user);
return(total_acts);
}
This new way of assessing how users will act is now the function run in the background of all manager genetic algorithms. Very nicely, this also resolves an annoyance with the maximum allowed budgets. Previously, it was unclear why maximum budgets greater than 10000 were causing problems (managers were making bad predictions). I have now set the maximum budget to an order of magnitude higher, and there are no longer any apparent issues. A new version of GMSE will soon have this update.
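For readers who prefer R, the arithmetic of new_act can be transliterated as below (a sketch: the paras[] lookups are replaced by explicit arguments, and the values in the example call are illustrative):

```r
# R transliteration of the new_act() arithmetic shown above; paras[]
# lookups become explicit arguments. Returns the expected total number
# of actions across all users after a cost change.
new_act_r <- function(old_cost, new_cost, old_act, users,
                      min_cost, max_to_spend, mgr_budget) {
  total_cost     <- if (old_cost < mgr_budget) old_act * old_cost else 0
  cost_per_user  <- total_cost / users            # cost devoted per user
  pr_on_act      <- cost_per_user / max_to_spend  # proportion of budget on action
  budget_for_act <- max_to_spend * pr_on_act      # proportion assumed constant
  acts_per_user  <- budget_for_act / (new_cost + min_cost)
  floor(users * acts_per_user)
}

# Doubling the cost roughly halves the expected number of actions:
new_act_r(old_cost = 50, new_cost = 100, old_act = 40, users = 4,
          min_cost = 10, max_to_spend = 1000, mgr_budget = 1000)  # 18
```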
New Issue #40: Age distribution bump
Running simulations using gmse_apply
, jeremycusack noticed a small
but noticeable decline in the population size at a generation
equal to the maximum age of resources in the population (used a maximum
age of 20). This decline is caused by the initial seed of resources
having a uniform age distribution. In the first generation, these
resources produce offspring that all have an age of zero, leading to
an age structure in the population with many zero age individuals and a
uniform distribution of ages greater than zero. The initial seed of
individuals with random ages died gradually, but there were enough
individuals in the initial offspring cohort that made it to the maximum
age for it to have a noticeable effect in decreasing population size
(i.e., all of these resources died on the maximum_age + 1
time step).
This effect can be avoided entirely given sufficient burn-in
generations of a model, and is less of a problem when the maximum age is
low because this allows the age distribution to stabilise sooner.
Further, using gmse_apply
can avoid the issue by directly
manipulating resources ages after the initial generation. Nevertheless,
it would be useful to have a different default of age distributions
instead of a uniform distribution.
One way to do this would be to find the age (\(A\)) at which a resource is expected to be alive with a probability of \(0.5\), after accounting for mortality (\(\mu\)). This is simply calculated below:
\((1 - \mu)^A = 0.5\)
The above can be re-arranged to find A,
\(A = \frac{log(0.5)}{log(1 - \mu)}\).
Note that we could use a switch function (or something like it in R) to make \(A = 0\) when \(\mu = 1\), and revert to a uniform distribution of \(\mu = 0\) (though this should rarely happen).
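This is straightforward to compute (a sketch; the \(\mu = 1\) and \(\mu = 0\) edge cases noted above are handled separately):

```r
# Age A at which a resource is expected to be alive with probability
# 0.5, given per-step mortality mu, from (1 - mu)^A = 0.5.
half_life_age <- function(mu) {
  if (mu >= 1) return(0)    # certain death: A = 0
  if (mu <= 0) return(NA)   # no mortality: fall back to a uniform distribution
  log(0.5) / log(1 - mu)
}

half_life_age(0.5)   # exactly 1 time step
half_life_age(0.1)   # about 6.58 time steps
```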
The value of \(\mu\) would depend on
res_death_type
, and be processed in
make_resource
, which is used in both gmse
and
gmse_apply
. If res_death_type = 1
(density
independent, rarely used), then \(\mu\)
is simply equal to
remov_pr
. If res_death_type = 2
(density
dependent), then \(\mu\)
could be found perhaps using something
like the following:
mu = (RESOURCE_ini * lambda) / (RESOURCE_ini + RESOURCE_ini * lambda)
This would get a value that is at least proportional to the expected
mortality rate of a resource (if res_death_type = 3
, then
we could use the sum of types 1 and 2). Overall, the documentation
should perhaps recommend finding a stable set of age distributions for a
particular set of parameter combinations when using
gmse_apply
(i.e., through simulation), then using that
(i.e., through simulation), then using that
distribution as an initial condition. But something like the above could
probably get close to whatever the stable age distribution would be, at
least close enough to make the decline in population size trivial.
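As a sketch of how this might look (the simplification to lambda / (1 + lambda) and the geometric draw of ages are my suggestions here, not current package behaviour):

```r
# The mu suggested above simplifies to lambda / (1 + lambda), since
# RESOURCE_ini cancels. A constant per-step mortality mu implies a
# geometric distribution of ages, which rgeom() can draw directly.
mu_estimate <- function(RESOURCE_ini, lambda) {
  (RESOURCE_ini * lambda) / (RESOURCE_ini + RESOURCE_ini * lambda)
}

mu   <- mu_estimate(RESOURCE_ini = 1000, lambda = 0.3)  # 0.3 / 1.3
ages <- rgeom(n = 1000, prob = mu)                      # candidate initial ages
```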
I will start to consider some of the above as a potential default for
the next version of GMSE. The best way to do this is probably to look at
how code from the res_remove
function in the
resource.c
file can best be integrated into a function
called by the R function make_resource
(i.e., either use
the equations, or estimates of them, or somehow call
res_remove
directly).
Improved convergence criteria
I have introduced and then immediately resolved Issue #39.
The convergence criterion has now been fixed with commit f598d8e52b47ef2017cac13d09aac1fb7aa6b506. To do this, I re-configured some of the genetic algorithm code into easier-to-read functions for checking the fitness increase. Two separate ways of checking the increase in fitness from one genetic algorithm generation to the next now exist: one for managers and one for users. This is needed because user fitness values are greater than zero and increase as their utility is maximised, whereas manager fitness values are less than zero and increase toward zero as their utility is maximised. The genetic algorithm now checks for a percentage improvement in fitness.
Now the default value of converge_crit
equals 1, which
means it does actually play a role sometimes (or is expected to). The
genetic algorithm will continue until the percent increase in fitness
from the previous generation is less than one percent. In practice, this
doesn’t noticeably affect much, but it does allow better strategies to
be found more quickly, and without having to play with
ga_mingen
to find them under extreme parameter settings
(e.g., huge budgets and rapid shifts in abundance).
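The rule can be sketched as follows (a hypothetical helper, not the package’s C implementation; dividing by the absolute value of the old fitness makes the same check work for positive user fitness and negative manager fitness):

```r
# Stop the genetic algorithm once the percentage fitness gain over the
# previous generation falls below converge_crit (default 1 percent).
converged <- function(fitness_old, fitness_new, converge_crit = 1) {
  pct_gain <- 100 * (fitness_new - fitness_old) / abs(fitness_old)
  pct_gain < converge_crit
}

converged(100, 100.5)   # TRUE: under a 1 percent gain, so stop
converged(100, 103)     # FALSE: keep iterating
converged(-100, -98)    # FALSE: manager fitness moved 2 percent toward zero
```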
The new fix has now been checked and built with Winbuilder into v0.3.2.0, but I am leaving this on the development branch for now in anticipation of other potential improvements to be made soon.
CRAN ready GMSE v0.3.1.7 – more flexibility, better error messages
I have now completed some substantial coding of error messages, which
will be called in both gmse
and gmse_apply
.
Essentially, these provide some help to software users who parameterise
their models in a way that does not work with GMSE. For example, if the
parameter stakeholders
is set to equal a negative number,
an error message will be returned that informs the user that at least
one stakeholder is required in the model. These error messages become a
bit more important in gmse_apply
, where it is possible for
users to include arguments that don’t make sense (e.g., arrays of
incorrect dimensions, or arguments that contradict one another).
The function gmse_apply
has also been improved to make
looping it easier. What had been happening during testing was that we
were finding it all too easy to crash R by reading in parameters that
contradicted one another (e.g., setting the landscape
dimensions through land_dim_1
and land_dim_2
caused a crash when also trying to add in a LAND
of
different dimension – now this returns an error that
LAND and land_dim_1 disagree about landscape size
). This
has been resolved in two ways. First, I have included many error
messages meant to catch bad and contradictory arguments in
gmse_apply
(and, to a lesser extent gmse
); it
is still possible to crash R by setting things incorrectly, but you have
to work very hard to do it – i.e., it almost has to be deliberate, as
far as I can tell. Second, I have added the argument
old_list
to gmse_apply
, which is
FALSE
by default, but can instead take the output of a
previous full list return of gmse_apply
(where
get_res = "Full"
). An element of the full list includes the
basic output from which key parameters can be pulled. As a reminder, the
basic gmse_apply
output looks like the below.
$resource_results
[1] 1062
$observation_results
[1] 680.2721
$manager_results
resource_type scaring culling castration feeding help_offspring
policy_1 1 NA 110 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_1 1 NA 9 NA NA NA NA NA
user_2 1 NA 9 NA NA NA NA NA
user_3 1 NA 9 NA NA NA NA NA
user_4 1 NA 9 NA NA NA NA NA
An example gmse_apply
used in a loop is below.
to_scare <- FALSE;
sim_old <- gmse_apply(scaring = to_scare, get_res = "Full", stakeholders = 6);
sim_sum <- matrix(data = NA, nrow = 20, ncol = 7);
for(time_step in 1:20){
sim_new <- gmse_apply(scaring = to_scare, get_res = "Full",
old_list = sim_old);
sim_sum[time_step, 1] <- time_step;
sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[2];
sim_sum[time_step, 5] <- sim_new$basic_output$manager_results[3];
sim_sum[time_step, 6] <- sum(sim_new$basic_output$user_results[,2]);
sim_sum[time_step, 7] <- sum(sim_new$basic_output$user_results[,3]);
sim_old <- sim_new;
print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Scare_cost",
"Cull_cost", "Scare_count", "Cull_count");
The output sim_sum
is shown below.
Time Pop_size Pop_est Scare_cost Cull_cost Scare_count Cull_count
[1,] 1 733 839.0023 NA 110 NA 54
[2,] 2 768 702.9478 NA 110 NA 54
[3,] 3 824 725.6236 NA 110 NA 54
[4,] 4 933 907.0295 NA 110 NA 54
[5,] 5 1180 816.3265 NA 110 NA 54
[6,] 6 1345 1224.4898 NA 10 NA 426
[7,] 7 1114 1269.8413 NA 10 NA 425
[8,] 8 820 884.3537 NA 110 NA 54
[9,] 9 952 793.6508 NA 110 NA 54
[10,] 10 1101 884.3537 NA 110 NA 54
[11,] 11 1299 1111.1111 NA 12 NA 402
[12,] 12 1079 907.0295 NA 110 NA 54
[13,] 13 1227 1564.6259 NA 10 NA 431
[14,] 14 934 839.0023 NA 110 NA 54
[15,] 15 1065 1133.7868 NA 10 NA 423
[16,] 16 768 725.6236 NA 110 NA 54
[17,] 17 869 929.7052 NA 110 NA 54
[18,] 18 949 907.0295 NA 110 NA 54
[19,] 19 1049 884.3537 NA 110 NA 54
[20,] 20 1200 1020.4082 NA 64 NA 90
We can take advantage of gmse_apply
to dynamically
change parameter values mid-loop. For example, below shows the same
code, but with a policy of scaring introduced on time step 10.
to_scare <- FALSE;
sim_old <- gmse_apply(scaring = to_scare, get_res = "Full", stakeholders = 6);
sim_sum <- matrix(data = NA, nrow = 20, ncol = 7);
for(time_step in 1:20){
sim_new <- gmse_apply(scaring = to_scare, get_res = "Full",
old_list = sim_old);
sim_sum[time_step, 1] <- time_step;
sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[2];
sim_sum[time_step, 5] <- sim_new$basic_output$manager_results[3];
sim_sum[time_step, 6] <- sum(sim_new$basic_output$user_results[,2]);
sim_sum[time_step, 7] <- sum(sim_new$basic_output$user_results[,3]);
sim_old <- sim_new;
if(time_step == 10){
to_scare <- TRUE;
}
print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Scare_cost",
"Cull_cost", "Scare_count", "Cull_count");
The above simulation results in the following output for
sim_sum
.
Time Pop_size Pop_est Scare_cost Cull_cost Scare_count Cull_count
[1,] 1 745 657.5964 NA 110 NA 54
[2,] 2 805 1111.1111 NA 12 NA 400
[3,] 3 473 634.9206 NA 110 NA 54
[4,] 4 504 566.8934 NA 110 NA 54
[5,] 5 577 498.8662 NA 110 NA 54
[6,] 6 600 430.8390 NA 110 NA 54
[7,] 7 648 612.2449 NA 110 NA 54
[8,] 8 714 702.9478 NA 110 NA 54
[9,] 9 813 612.2449 NA 110 NA 54
[10,] 10 914 1020.4082 NA 64 NA 90
[11,] 11 1011 1179.1383 57 10 49 301
[12,] 12 858 725.6236 10 110 193 37
[13,] 13 1011 1043.0839 37 30 0 198
[14,] 14 989 1043.0839 57 30 0 198
[15,] 15 983 1065.7596 48 20 10 270
[16,] 16 851 839.0023 10 110 193 37
[17,] 17 962 1111.1111 38 12 58 306
[18,] 18 783 612.2449 10 110 193 37
[19,] 19 862 816.3265 10 110 193 37
[20,] 20 963 702.9478 10 110 182 38
Hence, in addition to all
of the other benefits of gmse_apply
, one new feature is
that we can use it to study change in policy availability – in this
case, what happens when scaring is introduced as a possible policy
option. Similar things can be done, for example, to see how manager or
user power changes over time. In the example below, users’ budgets
increase by 100 every time step, with the manager’s budget remaining the
same. The consequence appears to be decreased population stability and a
higher likelihood of extinction.
ub <- 500;
sim_old <- gmse_apply(get_res = "Full", stakeholders = 6, user_budget = ub);
sim_sum <- matrix(data = NA, nrow = 20, ncol = 6);
for(time_step in 1:20){
sim_new <- gmse_apply(get_res = "Full", old_list = sim_old,
user_budget = ub);
sim_sum[time_step, 1] <- time_step;
sim_sum[time_step, 2] <- sim_new$basic_output$resource_results[1];
sim_sum[time_step, 3] <- sim_new$basic_output$observation_results[1];
sim_sum[time_step, 4] <- sim_new$basic_output$manager_results[3];
sim_sum[time_step, 5] <- sum(sim_new$basic_output$user_results[,3]);
sim_sum[time_step, 6] <- ub;
sim_old <- sim_new;
ub <- ub + 100;
print(time_step);
}
colnames(sim_sum) <- c("Time", "Pop_size", "Pop_est", "Cull_cost", "Cull_count",
"User_budget");
The output of sim_sum
is below.
Time Pop_size Pop_est Cull_cost Cull_count User_budget
[1,] 1 1215 1405.8957 10 292 500
[2,] 2 1065 1224.4898 10 336 600
[3,] 3 833 680.2721 110 36 700
[4,] 4 936 907.0295 110 42 800
[5,] 5 1174 1224.4898 10 401 900
[6,] 6 887 521.5420 110 54 1000
[7,] 7 988 680.2721 110 60 1100
[8,] 8 1084 975.0567 110 60 1200
[9,] 9 1208 861.6780 110 66 1300
[10,] 10 1360 1133.7868 10 520 1400
[11,] 11 975 861.6780 110 78 1500
[12,] 12 1079 1156.4626 10 560 1600
[13,] 13 597 770.9751 110 90 1700
[14,] 14 595 476.1905 110 96 1800
[15,] 15 586 612.2449 110 102 1900
[16,] 16 584 770.9751 110 108 2000
[17,] 17 557 589.5692 110 114 2100
[18,] 18 519 521.5420 110 120 2200
[19,] 19 469 521.5420 110 120 2300
[20,] 20 430 453.5147 110 126 2400
There is an important note to make about changing arguments to
gmse_apply
when old_list
is being used: The
function gmse_apply
is trying to avoid a crash, so
the function will accommodate parameter changes by rebuilding
data structures if necessary. For example, if you change the
number of stakeholders (and by including an argument
stakeholders
to gmse_apply
, it is assumed that
stakeholders are changing even if they are not), then a new array of agents
will need to be built. If you change landscape dimensions (or just
include the argument land_dim_1
or
land_dim_2
), then a new landscape will be built. This is
mentioned in the documentation.
GMSE v0.3.1.7 passes all CRAN checks in RStudio. I will make sure
that the code works with win-builder, then prepare
the new submission. Alternatively, as always, the newest GMSE version
can be downloaded through GitHub if you have devtools
installed in R.
devtools::install_github("bradduthie/GMSE")
I will soon update the manuscript for GMSE and upload it to bioRxiv.
Bug fix concerning density-based estimation
An error with density-based resource estimation
(observe_type = 0
) at very high values of
agent_view
was identified by Jeremy. When managers had a
view of the landscape that encompassed a number of cells that was
calculated to be larger than the actual number of landscape
cells (as defined by land_dim_1 * land_dim_2
), the manager
would underestimate actual population size. This occurred only in the
manager.c
file and not in the equivalent R function shown
during plotting. The bug was fixed in commit a916b8f8a40041b5f08984cf73348108482dde59
with a simple if
statement. This has therefore been
resolved in a patched GMSE v0.3.1.3, which is now available on
GitHub.
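The logic of the fix can be illustrated with a sketch (assuming the viewed region is a square of side 2 * agent_view + 1; the exact geometry in manager.c may differ):

```r
# Density-based extrapolation: counted resources are scaled up by the
# ratio of total cells to viewed cells. Capping viewed cells at the
# landscape size (the min() below) is the essence of the fix; without
# it, very large agent_view values deflate the estimate.
density_estimate <- function(counted, agent_view, land_dim_1, land_dim_2) {
  total_cells  <- land_dim_1 * land_dim_2
  viewed_cells <- min((2 * agent_view + 1)^2, total_cells)
  counted * (total_cells / viewed_cells)
}

density_estimate(counted = 10, agent_view = 100, land_dim_1 = 100,
                 land_dim_2 = 100)   # 10: the view is capped, no underestimate
```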
Bug fix concerning resource movement
An error with the res_move_obs
parameter was identified
by Jeremy. This
parameter was supposed to only affect resource movement during
observation, but an if statement corrected in commit
5eeb88d285af57984171e7d72410659b3b441af3 was causing
res_move_obs = FALSE
to stop resources from moving entirely in the
resource model. This has now been resolved in a patched GMSE v0.3.1.1,
which is now available on GitHub.
New option for removal of resources
A new option has been included for the argument
res_death_type
. By setting res_death_type = 3
in gmse
or gmse_apply
, resources can
experience both density dependent (caused by res_death_K
)
and density independent (caused by remov_pr
) removal
simultaneously. Effects of each are independent of one another (i.e.,
both processes occur simultaneously, so the calculation of population
size affecting removal due to carrying capacity includes resources that
might experience density independent mortality).
New group_think parameter in GMSE v0.3.1.0
A new group_think
parameter has been developed by Jeremy and me, and included
in an updated v0.3.1.0. This parameter is defined as
FALSE
by default, but when set to be TRUE
will
cause all users to act as a single block instead of independently. In
the code, what happens is that a single user (user ID number 2) runs
through the genetic algorithm, but then instead of having the resulting
actions apply to only this user, they apply to all users so that the
genetic algorithm only needs to be run once in the user model. This
decreases simulation time, particularly when there are a lot of users to
model, but at a cost of removing all variation in actions among users.
The group_think
parameter can be defined in both
gmse()
and gmse_apply()
, but I have not added
it as an option in gmse_gui()
.
GMSE v0.3.0.0 now available with gmse_apply
The gmse_apply
function is now available on a new GMSE
version 0.3.0.0 (minor tweaks to other functions have also been
made, but nothing that changes the user experience of gmse – mostly
typos corrected in the documentation). The new function allows software
users to integrate their own submodels (resource, observation, manager,
and user) into GMSE, or to use their own submodels entirely within a
single function.
GMSE apply function
The gmse_apply function is a flexible function that allows for user-defined sub-functions calling resource, observation, manager, and user models. Where such models are not specified, GMSE submodels ‘resource’, ‘observation’, ‘manager’, and ‘user’ are run by default. Any type of sub-model (e.g., numerical, individual-based) is permitted as long as the input and output are appropriately specified. Only one time step is simulated per call to gmse_apply, so the function must be looped for simulation over time. Where model parameters are needed but not specified, defaults from gmse are used.
gmse_apply arguments
res_mod The function specifying the resource model. By default, the individual-based resource model from gmse is called with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘resource_array’ or ‘resource_vector’, and arrays must follow the format of GMSE in terms of column number and type (if there is only one resource type, then the model can also just return a scalar value).
obs_mod The function specifying the observation model. By default, the individual-based observation model from gmse is called with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘observation_array’ or ‘observation_vector’, and arrays must follow the format of GMSE in terms of column number and type (if there is only one resource type, then the model can also just return a scalar value).
man_mod The function specifying the manager model. By default, the individual-based manager model that calls the genetic algorithm from gmse is used with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘manager_array’ or ‘manager_vector’, and arrays must follow the (3 dimensional) format of the ‘COST’ array in GMSE in terms of column numbers and types, with appropriate rows for interactions and layers for agents (see documentation of GMSE for constructing these, if desired). User defined manager outputs will be recognised as costs by the default user model in gmse, but can be interpreted differently (e.g., total allowable catch) if specifying a custom user model.
use_mod The function specifying the user model. By default, the individual-based user model that calls the genetic algorithm from gmse is used with default parameter values. User-defined functions must either return an unnamed matrix or vector, or return a named list in which one list element is named either ‘user_array’ or ‘user_vector’, and arrays must follow the (3 dimensional) format of the ‘ACTION’ array in GMSE in terms of column numbers and types, with appropriate rows for interactions and layers for agents (see documentation of GMSE for constructing these, if desired).
get_res How the output should be organised. The default ‘basic’ attempts to distill results down to their key values from submodel outputs, including resource abundances and estimates, and manager policy and actions. An option ‘custom’ simply returns a large list that includes the output of every submodel. Any other option (e.g. ‘full’) will return a massive list with all of the input, output, and parameters used to run gmse_apply.
… Arguments passed to user-defined functions, and passed to modify default parameter values that would otherwise be called for gmse default models. Any argument that can be passed to gmse can be specified explicitly, just as if it were an argument to gmse. Similarly, any argument taken by a user-defined function should be specified, though the function will work if the user-defined function has a default that is not specified explicitly.
Example uses of gmse_apply
A simple run of gmse_apply() will return one generation of gmse using default submodels and parameter values.
sim <- gmse_apply();
For sim, the default ‘basic’ results are returned below.
$resource_results
[1] 1102
$observation_results
[1] 1179.138
$manager_results
scaring culling castration feeding help_offspring
policy NA 10 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_2 1 NA 70 NA NA NA NA NA
user_3 1 NA 75 NA NA NA NA NA
user_4 1 NA 69 NA NA NA NA NA
user_5 1 NA 74 NA NA NA NA NA
Note that in the case above we have the total abundance of resources returned, the estimate of resource abundance from the observation function, the costs the manager sets for the only available action of culling, and the number of culls attempted by each user.
The above was produced by all of the individual-based functions that are default in GMSE; custom-generated subfunctions can instead be included, provided that they fit the specifications described above. For example, we can define a very simple logistic growth function to send to res_mod instead.
alt_res <- function(X, K = 2000, rate = 1){
X_1 <- X + rate*X*(1 - X/K);
return(X_1);
}
The above function takes in a population size X and returns a value X_1 based on the population's intrinsic growth rate rate and carrying capacity K.
Iterating the logistic growth model by itself under default parameter values with a starting population of 100 will cause the population to increase to carrying capacity in roughly 7 generations. The function can be substituted into gmse_apply to use it instead of the default GMSE resource model.
sim <- gmse_apply(res_mod = alt_res, X = 100, rate = 0.3);
The gmse_apply function will find the parameters it needs to run the alt_res function in place of the default resource function, either by using the function's default values (e.g., K = 2000) or values specified directly in gmse_apply (e.g., X = 100 and rate = 0.3). If an argument to a custom function is required but not provided either as a default or specified in gmse_apply, then an error will be returned.
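The claim about carrying capacity can be checked by iterating alt_res directly (a quick sketch using only the logistic function defined above, with its default K and rate):

```r
# Iterate the logistic growth function from above, starting from 100.
alt_res <- function(X, K = 2000, rate = 1){
    X_1 <- X + rate * X * (1 - X / K);
    return(X_1);
}
X <- 100;
for(gen in 1:7){
    X <- alt_res(X);  # default K = 2000, rate = 1
}
print(X);  # approaches the carrying capacity of 2000 after roughly 7 generations
```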
To integrate across different types of submodels, gmse_apply translates between vectors and arrays between each submodel. For example, because the default GMSE observation model requires a resource array with particular requirements for column identities, when a resource model subfunction returns a vector, or a list with a named element ‘resource_vector’, this vector is translated into an array that can be used by the observation model. Specifically, each element of the vector identifies the abundance of a resource type (and hence will usually be just a single value denoting the abundance of the only focal population). If this is all the information provided, then a resource_array will be made with default GMSE parameter values, with a number of rows identical to the abundance value (floored if the value is a non-integer; non-default values can also be put into this transformation from vector to array if they are specified in gmse_apply, e.g., through an argument such as lambda = 0.8). Similarly, a resource_array is also translated into a vector after the default individual-based resource model is run, should the observation model require simple abundances instead of an array. The same is true of observation_vector and observation_array objects returned by observation models, of manager_vector and manager_array (i.e., COST) objects returned by manager models, and of user_vector and user_array (i.e., ACTION) objects returned by user models. At each step, a translation between the two is made, with necessary adjustments that can be tweaked through arguments to gmse_apply when needed.
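As a sketch of the named-list option described above (assuming only that gmse_apply matches the ‘resource_vector’ element, as documented for res_mod), the same logistic growth model could equally return a named list:

```r
# A custom resource model returning a named list; gmse_apply looks for the
# element named 'resource_vector' (same logistic growth as alt_res above).
alt_res_list <- function(X, K = 2000, rate = 1){
    X_1 <- X + rate * X * (1 - X / K);
    return(list(resource_vector = X_1));
}
alt_res_list(100)$resource_vector;  # 195 with default K and rate
```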
Alternative observation, manager, and user submodels, for example, are defined below; note that each requires a vector from the preceding model.
# Alternative observation submodel
alt_obs <- function(resource_vector){
X_obs <- resource_vector - 0.1 * resource_vector;
return(X_obs);
}
# Alternative manager submodel
alt_man <- function(observation_vector){
policy <- observation_vector - 1000;
if(policy < 0){
policy <- 0;
}
return(policy);
}
# Alternative user submodel
alt_usr <- function(manager_vector){
harvest <- manager_vector + manager_vector * 0.1;
return(harvest);
}
All of these submodels are completely deterministic, so when run with the same parameter combinations, they produce replicable outputs.
gmse_apply(res_mod = alt_res, obs_mod = alt_obs,
man_mod = alt_man, use_mod = alt_usr, X = 1000);
The above, for example, produces the following output (note that the X argument needs to be specified, but the rest of the subfunctions take vectors that gmse_apply recognises will become available after a previous submodel is run).
$resource_results
[1] 1500
$observation_results
[1] 1350
$manager_results
[1] 350
$user_results
[1] 385
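Because every submodel in this chain is deterministic, the same four numbers can be reproduced by simply chaining the functions by hand:

```r
# Chain the deterministic submodels manually (functions as defined above).
alt_res <- function(X, K = 2000, rate = 1){
    return(X + rate * X * (1 - X / K));
}
alt_obs <- function(resource_vector){
    return(resource_vector - 0.1 * resource_vector);
}
alt_man <- function(observation_vector){
    policy <- observation_vector - 1000;
    if(policy < 0){
        policy <- 0;
    }
    return(policy);
}
alt_usr <- function(manager_vector){
    return(manager_vector + manager_vector * 0.1);
}
res <- alt_res(X = 1000);  # 1500
obs <- alt_obs(res);       # 1350
man <- alt_man(obs);       # 350
usr <- alt_usr(man);       # 385
```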
Note that the manager_results and user_results are ambiguous here, and can be interpreted as desired – e.g., as total allowable catch and catches made, or as something like costs of catching set by the manager and catching effort made by the user. Hence, while manager output is set in terms of costs of performing each action, and user output is set in terms of action attempts, this need not be the case when using gmse_apply (though it should be recognised when using default GMSE manager and user functions).
GMSE default submodels can be added in at any point.
gmse_apply(res_mod = alt_res, obs_mod = observation,
man_mod = alt_man, use_mod = alt_usr, X = 1000)
The above produces the results below.
$resource_results
[1] 1500
$observation_results
[1] 1655.329
$manager_results
[1] 655.3288
$user_results
[1] 720.8617
If we wanted to, for example, specify a simple resource and observation model, but then take advantage of the genetic algorithm to predict policy decisions and user actions, we could use the default GMSE manager and user functions (written below explicitly, though this is not necessary).
gmse_apply(res_mod = alt_res, obs_mod = alt_obs,
man_mod = manager, use_mod = user, X = 1000)
The above produces the output below returning culling costs and culling actions attempted by four users (note that the default manager target abundance is 1000).
$resource_results
[1] 1500
$observation_results
[1] 1350
$manager_results
scaring culling castration feeding help_offspring
policy NA 10 NA NA NA
$user_results
resource_type scaring culling castration feeding help_offspring tend_crops kill_crops
Manager 1 NA 0 NA NA NA NA NA
user_2 1 NA 70 NA NA NA NA NA
user_3 1 NA 70 NA NA NA NA NA
user_4 1 NA 71 NA NA NA NA NA
user_5 1 NA 73 NA NA NA NA NA
Instead of using the gmse function, we might simulate multiple generations by calling gmse_apply through a loop, reassigning outputs where necessary for the next generation (where outputs are not reassigned, new defaults will be inserted in their place; so, e.g., if we were to just loop without reassigning any variables, nothing would update and we would effectively be running the same model multiple times). Below shows how this might be done.
sim1 <- gmse_apply(get_res = "full", lambda = 0.3);
RESOURCES <- sim1$resource_array;
LAND <- sim1$LAND;
PARAS <- sim1$PARAS;
COST <- sim1$COST;
ACTION <- sim1$ACTION;
results <- matrix(data = NA, nrow = 40, ncol = 4);
for(time_step in 1:40){
sim_new <- gmse_apply(RESOURCES = RESOURCES, LAND = LAND, PARAS = PARAS,
COST = COST, ACTION = ACTION, stakeholders = 10,
get_res = "full", agent_view = 20);
results[time_step, 1] <- sim_new$resource_vector;
results[time_step, 2] <- sim_new$observation_vector;
results[time_step, 3] <- sim_new$manager_vector;
results[time_step, 4] <- sim_new$user_vector;
RESOURCES <- sim_new$resource_array;
LAND <- sim_new$LAND;
PARAS <- sim_new$PARAS;
COST <- sim_new$COST;
ACTION <- sim_new$ACTION;
}
colnames(results) <- c("Abundance", "Estimate", "Cull_cost", "Cull_attempts");
The above results in the following output for results.
Abundance Estimate Cull_cost Cull_attempts
[1,] 1195 1165.9726 10 716
[2,] 1045 939.9167 110 461
[3,] 1160 1160.0238 10 715
[4,] 1056 1183.8192 10 715
[5,] 1014 850.6841 110 468
[6,] 1171 1237.3587 10 717
[7,] 1026 993.4563 110 464
[8,] 1202 957.7632 110 464
[9,] 1394 1469.3635 10 702
[10,] 1333 1457.4658 10 702
[11,] 1277 1397.9774 10 702
[12,] 1175 1415.8239 10 702
[13,] 1088 701.9631 110 468
[14,] 1275 1207.6145 10 718
[15,] 1200 1332.5402 10 718
[16,] 1116 1029.1493 45 512
[17,] 1249 1814.3962 10 699
[18,] 1141 1273.0518 10 722
[19,] 1019 963.7121 110 455
[20,] 1216 1629.9822 10 708
[21,] 1088 1130.2796 10 708
[22,] 988 1035.0982 38 537
[23,] 1056 1029.1493 45 505
[24,] 1154 749.5538 110 463
[25,] 1344 1499.1077 10 722
[26,] 1268 1386.0797 10 712
[27,] 1165 1493.1588 10 707
[28,] 1061 1070.7912 19 633
[29,] 1019 1076.7400 17 663
[30,] 961 600.8328 110 457
[31,] 1135 874.4795 110 450
[32,] 1338 1189.7680 10 701
[33,] 1275 1600.2380 10 710
[34,] 1174 1362.2844 10 709
[35,] 1104 1112.4331 12 685
[36,] 1003 1302.7960 10 715
[37,] 828 1183.8192 10 712
[38,] 649 785.2469 110 462
[39,] 739 1023.2005 56 488
[40,] 813 910.1725 110 455
Note that managers increase the cost of culling based on the time step’s estimated abundance, and user culling attempts decrease when culling costs increase.
In addition to the flexibility of allowing user-defined submodels, gmse_apply is also useful for modellers who might be interested in simulating processes not currently available in gmse by itself. For example, if we wanted to model a sudden environmental perturbation decreasing population size, or a sudden influx of new users, after 30 generations, we could do so in the loop.
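As an illustration (a hypothetical sketch; perturb is not a GMSE function), such a perturbation could be applied inside the loop by thinning the RESOURCES array, in which each row is an individual, at the chosen time step:

```r
# Hypothetical helper: at the chosen time step, remove a proportion of
# individuals (rows) from the RESOURCES array to mimic a perturbation.
perturb <- function(RESOURCES, time_step, at_step = 30, survive = 0.5){
    if(time_step == at_step){
        keep      <- sample(x = 1:nrow(RESOURCES),
                            size = floor(survive * nrow(RESOURCES)));
        RESOURCES <- RESOURCES[keep, , drop = FALSE];
    }
    return(RESOURCES);
}
```

Inside the loop above, a line such as `RESOURCES <- perturb(RESOURCES, time_step);` would then halve the population at time step 30 and leave it unchanged otherwise.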
In the near future, the gmse_apply function will be included in the GMSE vignette and submitted to CRAN with the rest of v0.3.0.0 – in the meantime, I believe that all major bugs have been ironed out, but please let me know or report an issue if you are able to crash the function (i.e., if you run it and it causes R to crash – you should always get an error message before this happens).
To download the latest GMSE v0.3.0.0, simply run the below in R (make sure that devtools is installed).
devtools::install_github("bradduthie/GMSE")
I welcome any feedback, and I expect to submit an update to CRAN around late October.
New function gmse_apply complete and tested
I have now completed the gmse_apply function, which exploits the full modularity of GMSE by allowing software users to develop their own sub-functions and string them together with any combination of GMSE default sub-functions. As a brief summary, gmse_apply includes the following features:
Any arguments for custom user functions can simply be passed along by specifying them in gmse_apply. For example, suppose we have the custom resource function alt_res below:
alt_res <- function(X = 1000, K = 2000, r = 1){
X_1 <- X + r*X*(1 - X/K);
return(X_1);
}
We can simply include the above in gmse_apply as follows to use the very simple logistic growth sub-model with the individual-based submodels that are defaults of GMSE.
sim_app <- gmse_apply(res_mod = alt_res);
The gmse_apply function simply adds in GMSE defaults for unspecified models, but we can specify them too.
sim_app <- gmse_apply(res_mod = alt_res, obs_mod = observation);
To adjust parameters in the alternative resource model, simply add in the arguments as below.
sim_app <- gmse_apply(res_mod = alt_res, X = 2000, K = 5000, r = 1.2);
The gmse_apply function will know where to place them, and update them should they be needed for other models.
I will give a more lengthy description of how to use gmse_apply tomorrow, when I push GMSE v0.3.0.0 to the master branch of GitHub and advertise the update.
Compensation suggestion
A suggestion from Jeremy to include a compensation option for users. Users could devote some of their budget to compensation, and managers could then compensate a proportion of their damaged yield. Implementing this will require consideration from the manager's perspective with respect to the genetic algorithm – the users' perspective will be easier because a user can remember their previous losses and assess compensation versus culling. Managers might have to think about how compensation could incentivise non-culling, but this might actually already work given the way the manager anticipates actions; more investigation into this will be useful following the finalisation of gmse_apply(), which is in progress.
Progress has been made on the gmse_apply() function. My goal is to make this as modular as possible – to allow any four functions to be included in the GMSE framework, including arbitrary arguments to each function. The gmse_apply() function will recognise which arguments go along with which functions, and naturally string together results from one sub-function to the input of the next sub-function (though this will demand that the output from functions is labelled in a way that matches the arguments of the next function; e.g., if you have an ‘N_total’ as input for the observation model, then ‘N_total’ will either need to be a labelled output of the resource model or specified directly in gmse_apply()). Default submodels will be the IBMs used in gmse(), and where arguments are not specified by the software user in gmse_apply() (e.g., LAND), they will be built from default gmse() parameters.
The GMSE GUI has been updated with all of the new features in version 0.2.3.0. The gmse_gui() function is likewise updated in a new patch version 0.2.3.1. I did this quickly because the GUI was actually easy to update; plans for the gmse_apply function are now also clear, and I hope to have a working function and version 0.3.0.0 by the end of the week, or by early next week.
GMSE Version 0.2.3.0 on GitHub
I have pushed a new version 0.2.3.0 of GMSE onto the master branch of GitHub, which means that the most up-to-date version can be installed using the code below (make sure the devtools library is installed).
devtools::install_github("bradduthie/GMSE")
The new version includes multiple new features:
- A plot_gmse_effort function, which shows the conflict between manager targets and user actions more directly (see plots from 23 AUG 2017 notes).
- A gmse_summary function, which takes the large output produced from gmse and returns a much easier to understand set of tables.
- A gmse_gui function that has better defaults and parameter organisation, which has been uploaded to shiny for use in a browser.

To run a simple default simulation, the gmse function remains unchanged.
sim <- gmse();
To plot the effort of managers and users, use the below.
plot_gmse_effort(agents = sim$agents, paras = sim$paras,
ACTION = sim$action, COST = sim$cost);
Below summarises the results more cleanly, extracting key information from sim.
gmse_summary(sim);
And as before, the GUI can be called directly from the R console.
gmse_gui();
The GUI does not yet allow you to get a view of the plot_gmse_effort output, or a gmse_summary, but this will be a goal for future versions of GMSE.
If able, I recommend updating to version 0.2.3.0 as soon as possible. In the coming few days, I will also add the gmse_apply function, primarily for developers who will benefit from a more modular way of using GMSE, allowing for different types of submodules to be used within the broader GMSE framework. When the new apply function has been added (and possibly the GUI improved), I will submit a new version 0.3.x.x to CRAN.
Bug Fix and tweaks to agent prediction
I have now fixed a bug in the code that was causing confusion between culling and castration. After recompiling and running simulations, manager and user actions improve. I have also made some minor changes to default gmse() options. Regarding the predicted consequences of manager and user actions (i.e., the predictions from the agents' perspective that guide their decision making), I have adjusted some things to make them more in line with what is expected in the simulation as follows (recall that managers are interested in global abundance and users are interested specifically in how abundance affects themselves):
These values are a bit more in line with what will actually happen, so we assume that managers and users are a bit more informed now. It also allows for a bit more differentiation among actions. Overall, the model appears to perform better now – meaning that managers and users appear to be better predictors of the consequences of their actions.
Before finishing the gmse_apply() function, I will push an updated version of GMSE to GitHub with these changes, plus new plotting options.
I have written a gmse_summary function (see below), which returns a simplified list that includes four elements, each of which is a table of data:
1. resources, a table showing the time step in the first column, followed by resource abundance in the second column.
2. observations, a table showing the time step in the first column, followed by the estimate of population size (produced by the manager) in the second column.
3. costs, a table showing the time step in the first column, the manager number in the second column (should always be zero), followed by the costs of each action set by the manager (policy); the far-right column indicates budget that is unused and therefore not allocated to any policy.
4. actions, a table showing the time step in the first column, the user number in the second column, followed by the actions of each user in the time step; additional columns indicate unused actions, crop yield on the user's land (if applicable), and the number of resources that a user successfully harvests (i.e., ‘culls’).
At the moment, I have not added in the actual number of resources that a user culls. This will be added shortly, after which I will post a new function. Doing so is a bit more complicated because it requires me to go into the C code and make a recording every time it happens (see how I plan to do this below the function).
gmse_summary <- function(gmse_results){
time_steps <- dim(gmse_results$paras)[1];
parameters <- gmse_results$paras[1,];
#--- First get the resource abundances
res_types <- unique(gmse_results$resource[[1]][,2]);
resources <- matrix(dat = 0, nrow = time_steps,
ncol = length(res_types) + 1);
res_colna <- rep(x = NA, times = dim(resources)[2]);
res_colna[1] <- "time_step";
for(i in 1:length(res_types)){
res_colna[i+1] <- paste("type_", res_types[i], sep = "");
}
colnames(resources) <- res_colna;
#--- Next get estimates and the costs set by the manager
observations <- matrix(dat = 0, nrow = time_steps,
ncol = length(res_types) + 1);
costs <- matrix(dat = NA, nrow = time_steps*length(res_types), ncol = 10);
agents <- gmse_results$agents[[1]];
users <- agents[agents[,2] > 0, 1];
actions <- matrix(dat = NA, ncol = 13,
nrow = time_steps * length(res_types) * length(users));
c_row <- 1;
a_row <- 1;
for(i in 1:time_steps){
the_res <- gmse_results$resource[[i]][,2];
manager_acts <- gmse_results$action[[i]][,,1];
resources[i, 1] <- i;
observations[i, 1] <- i;
land_prod <- gmse_results$land[[i]][,,2];
land_own <- gmse_results$land[[i]][,,3];
for(j in 1:length(res_types)){
#---- Resource abundance below
resources[i,j+1] <- sum(the_res == res_types[j]);
#---- Manager estimates below
target_row <- which(manager_acts[,1] == -2 &
manager_acts[,2] == res_types[j]);
estim_row <- which(manager_acts[,1] == 1 &
manager_acts[,2] == res_types[j]);
target <- manager_acts[target_row, 5];
adjust <- manager_acts[estim_row, 5];
observations[i,j+1] <- target - adjust;
#---- Cost setting below
costs[c_row, 1] <- i;
costs[c_row, 2] <- res_types[j];
estim_row <- which(manager_acts[,1] == 1 &
manager_acts[,2] == res_types[j]);
if(parameters[89] == TRUE){
costs[c_row, 3] <- manager_acts[estim_row, 8];
}
if(parameters[90] == TRUE){
costs[c_row, 4] <- manager_acts[estim_row, 9];
}
if(parameters[91] == TRUE){
costs[c_row, 5] <- manager_acts[estim_row, 10];
}
if(parameters[92] == TRUE){
costs[c_row, 6] <- manager_acts[estim_row, 11];
}
if(parameters[93] == TRUE){
costs[c_row, 7] <- manager_acts[estim_row, 12];
}
if(parameters[94] == TRUE){
costs[c_row, 8] <- parameters[97];
}
if(parameters[95] == TRUE){
costs[c_row, 9] <- parameters[97];
}
costs[c_row, 10] <- manager_acts[estim_row, 13] - parameters[97];
c_row <- c_row + 1;
#--- Action setting below
for(k in 1:length(users)){
usr_acts <- gmse_results$action[[i]][,,users[k]];
actions[a_row, 1] <- i;
actions[a_row, 2] <- users[k];
actions[a_row, 3] <- res_types[j];
res_row <- which(usr_acts[,1] == -2 &
usr_acts[,2] == res_types[j]);
if(parameters[89] == TRUE){
actions[a_row, 4] <- usr_acts[res_row, 8];
}
if(parameters[90] == TRUE){
actions[a_row, 5] <- usr_acts[res_row, 9];
}
if(parameters[91] == TRUE){
actions[a_row, 6] <- usr_acts[res_row, 10];
}
if(parameters[92] == TRUE){
actions[a_row, 7] <- usr_acts[res_row, 11];
}
if(parameters[93] == TRUE){
actions[a_row, 8] <- usr_acts[res_row, 12];
}
if(j == length(res_types)){
if(parameters[104] > 0){
land_row <- which(usr_acts[,1] == -1);
if(parameters[95] > 0){
actions[a_row, 9] <- usr_acts[land_row, 10];
}
if(parameters[94] > 0){
actions[a_row, 10] <- usr_acts[land_row, 11];
}
}
actions[a_row, 11] <- sum(usr_acts[, 13]);
}
if(parameters[104] > 0){
max_yield <- sum(land_own == users[k]);
usr_yield <- sum(land_prod[land_own == users[k]]);
actions[a_row, 12] <- 100 * (usr_yield / max_yield);
}
a_row <- a_row + 1;
}
}
}
cost_col <- c("time_step", "resource_type", "scaring", "culling",
"castration", "feeding", "helping", "tend_crop",
"kill_crop", "unused");
colnames(costs) <- cost_col;
colnames(resources) <- res_colna;
colnames(observations) <- res_colna;
action_col <- c("time_step", "user_ID", "resource_type", "scaring",
"culling", "castration", "feeding", "helping", "tend_crop",
"kill_crop", "unused", "crop_yield", "harvested");
colnames(actions) <- action_col;
the_summary <- list(resources = resources,
observations = observations,
costs = costs,
actions = actions);
return(the_summary);
}
To record kills, I think that the best way is to use the resource mortality adjustment column (at the moment, column 17 in C and 18 in R of the resource array). Mortality as of now is just adjusted to 1 in the event of a kill, and mortality occurs whenever a random probability is greater than or equal to 1. Hence, I can replace the 1 value with the user’s ID (for non-managers, this must be at least 1), and then the resource array will record the ID of the user that killed it at the particular time step. Note that this cannot be done for other adjustments such as growth rate or offspring production because the values are not interpreted as probabilities.
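A minimal R-side sketch of this idea (hypothetical; record_kill is not part of GMSE, and column 18 is the mortality adjustment column of the resource array referenced above):

```r
# Hypothetical sketch: write the culling user's ID (>= 1 for non-managers)
# into the mortality adjustment column (column 18 in R) of the resource
# array, so each killed resource records which user killed it.
record_kill <- function(resource_array, res_row, user_ID){
    resource_array[res_row, 18] <- user_ID;
    return(resource_array);
}
```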
I will do the above tomorrow, which should not take too long. I will then continue work on the gmse_apply function.
Currently, the gmse() function returns a list that includes all of the data produced by the model, some details of which are required for plotting.
sim_results <- list(resource = RESOURCE_REC,
observation = OBSERVATION_REC,
paras = PARAS_REC,
land = LANDSCAPE_REC,
time_taken = total_time,
agents = AGENT_REC,
cost = COST_REC,
action = ACTION_REC
);
I think that this list is fine, perhaps necessary, to keep, but the ConFooBio group has also concluded that there should be some easier-to-understand summary of the data. I propose that a function, gmse_summary(), that summarises the results in an easier-to-understand way would be useful. The function could just be run as below.
sim <- gmse();
sim_summary <- gmse_summary(sim);
The output of gmse_summary() should be a list of all of the relevant information that a user might want to plot or analyse. It should include the following list elements.
sim_summary$resources
sim_summary$observations
sim_summary$costs
sim_summary$actions
More might be needed, but the above should be a good starting point that will provide four clear data tables for the user. The tables will look like the below.
1. Resource abundances over time
time_step | abundance |
---|---|
1 | 100 |
2 | 104 |
… | … |
99 | 116 |
100 | 108 |
In the above, only the resource abundance is reported to the software user, though it might also be useful to have additional columns as well eventually.
2. Observation estimates of abundance over time
time_step | estimated_abundance |
---|---|
1 | 102 |
2 | 101 |
… | … |
99 | 121 |
100 | 112 |
In the above, only the estimate from the observation submodel is reported to the software user. Additional columns might also be useful for things like confidence intervals, though for now I'm not sure if this is needed.
3. Costs set in each time step
time_step | manager | scaring | castration | culling | feeding | helping | unused |
---|---|---|---|---|---|---|---|
1 | 0 | 40 | NA | 60 | NA | NA | 0 |
2 | 0 | 36 | NA | 62 | NA | NA | 2 |
… | … | … | … | … | … | … | … |
99 | 0 | 0 | NA | 100 | NA | NA | 0 |
100 | 0 | 3 | NA | 97 | NA | NA | 0 |
In the above, the manager number is always 0 because this is the number of the agent that has that role in GMSE. All impossible actions (specified by the simulation) are labelled NA, while the possible scaring and culling actions are given values that correspond to the cost of each action for users in each time step. Hence the table summarises policy for each time step in a way that software users can interpret more cleanly.
4. Actions in each time step
time_step | user | scaring | castration | culling | feeding | helping | tend_crop | kill_crop | unused | crop_yield | harvested |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 50 | NA | 50 | NA | NA | NA | NA | 0 | 90 | 12 |
1 | 2 | 59 | NA | 40 | NA | NA | NA | NA | 1 | 92 | 9 |
1 | 3 | 100 | NA | 0 | NA | NA | NA | NA | 0 | 89 | 0 |
2 | 1 | 44 | NA | 66 | NA | NA | NA | NA | 0 | 88 | 16 |
2 | 2 | 52 | NA | 48 | NA | NA | NA | NA | 0 | 94 | 12 |
2 | 3 | 98 | NA | 0 | NA | NA | NA | NA | 2 | 90 | 0 |
… | … | … | … | … | … | … | … | … | … | … | … |
99 | 1 | 36 | NA | 63 | NA | NA | NA | NA | 1 | 79 | 20 |
99 | 2 | 40 | NA | 60 | NA | NA | NA | NA | 0 | 83 | 18 |
99 | 3 | 28 | NA | 72 | NA | NA | NA | NA | 0 | 88 | 12 |
100 | 1 | 35 | NA | 62 | NA | NA | NA | NA | 3 | 82 | 18 |
100 | 2 | 37 | NA | 63 | NA | NA | NA | NA | 0 | 84 | 22 |
100 | 3 | 23 | NA | 77 | NA | NA | NA | NA | 0 | 84 | 13 |
The above action table has more rows in it than the cost table because a row is needed for each user in each time step. This gives the software user full access to each individual user's actions, and their results. Note that, as above, castration, feeding, and helping are not options. Additionally, in this hypothetical simulation, tending or killing crops are not options, so no such actions are performed. Users divide their budget between scaring and culling in each time step. The last two columns also give useful information to the software user. The first is crop yield on the user's owned land (should probably be NA if land_ownership = FALSE), which will reflect the percentage of the total possible yield (or maybe raw yield?) for each user – hence allowing the table to directly correlate actions with yield. The last column is the number of resources ‘harvested’, which I think should count successful ‘kills’ (rather than just actions devoted to culling). The realised culling might be lower than the actions devoted to culling, for example, if not enough resources are actually on the user's land to cull. Additional statistics for each user could be added in as columns, but this seems a good place to start. This gmse_summary producing a list of the above four tables will be included in the next version of GMSE, along with the new plotting function highlighting the conflict itself, and the gmse_apply function discussed on 6 SEPT.
Continued progress has been made on slides for an upcoming talk.
I will be giving a talk on 19 September 2017 for the Mathematics and Statistics Group at the University of Stirling on GMSE as a general tool for management strategy evaluation. Slides for this talk will be available on GitHub.
The alternative approach from Wednesday is being implemented smoothly. Passing user-defined functions in a modular way is possible, but inputs and outputs need to be carefully considered within gmse_apply(). The objective is to make things as easy and flexible as possible for the user, while also making sure that the function runs efficiently.
A modular function for modellers
I am beginning work on a gmse_apply() function, which will improve the modularity of GMSE for developers. The goal behind this function will be to provide a simple tool for allowing developers to use their own resource and observation models and, with the correct inputs, take advantage of the manager and user functions. Hence, simple resource and observation models will be possible, but the flexibility of GMSE should be retained as much as possible. A few starting points include the following:
- The gmse_apply() function will run a single cycle of GMSE instead of multiple time steps.
- gmse() options will be acceptable, but not obvious, appearing as ... passes that the user can add if they want to change things. Otherwise, defaults will be used.

Inputs and outputs of different functions will then include the following:
- In gmse_apply(), the only thing required of the user-defined function will be the population size vector, and other parameters will be specified in the logistic function, e.g.: gmse_apply(resource = LV(pop, K = 100, r = 1, ...), observation = ..., type = "Numerical").
- The resource() function, though some options to input landscape and starting conditions will be needed (though these should also switch to default).
- The observation() function, as with the resource model.
- The estimate_abundances function in manager.c can be bypassed entirely (i.e., I still think that we'll want to call the C function as normal, but add an option). This can be arranged simply by reading in the observation vector (might have to re-structure to an array a bit better?) as the OBSERVATION array in C, but then instead of running estimate_abundances, allow an if-else statement to read this array as abun_est given a new value in paras, after which the genetic algorithm and set_action_costs can be run as normal. The irrelevant output can just be ignored by the user model in gmse_apply.
- The manager() function in its normal form. As with the others, this will require some eventual decisions about initialisation, but I can worry about this later.
- There could eventually be an option in gmse_apply() that would allow users to specify their own manager functions, but for the moment this is just going to be hidden because the genetic algorithm requires the COST and ACTION arrays to be in the correct form for use. Hence, if someone later wants to apply their own manager or user function, they will either need to get the input-output correct (at the moment) or (eventually) use some different input-output data structure that I make up later; but at some point, things would just collapse down to MSEtools. Or, rather, gmse_apply() just becomes a trivial function that includes four lines of compatible code, calling each model.
- ACTION output from user() will just be translated in R to adjusting the vector.
- The user() function in its normal form. As with the others, this will require some eventual decisions about initialisation to be determined later.

As an alternative, at least to the implementation, I
think that the call could be made at the level of the individual
resource()
and observation()
R functions. This
was kind of always the plan, but there’s a semi-dirty way to mix
numerical resource and observation models with the full individual-based
manager and user models. This can be done by adding a model
option to be user defined through an
if( is.function(model) == TRUE )
in the resource.R
function. If the condition is satisfied, then resource()
will shift to the user generated model. This can actually be done for
all of the submodels very easily.
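Something like the following sketch captures the dispatch idea; the wrapper name and the fall-through behaviour are hypothetical, not the actual GMSE internals, with a simple logistic growth model standing in for a user-defined numerical resource model:

```r
# Hypothetical dispatch: if 'model' is a function, use the user-supplied
# numerical model instead of the default individual-based resource model.
resource_dispatch <- function(model, pop, ...) {
    if (is.function(model)) {
        return(model(pop, ...))  # user-defined numerical model
    }
    stop("the default individual-based model would be called here")
}

# Example user-defined logistic growth model
logistic_mod <- function(pop, K = 100, r = 1) {
    pop + r * pop * (1 - pop / K)
}

resource_dispatch(logistic_mod, pop = 50)  # returns 75
```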
This alternative might be a better way to go. The aforementioned
‘dirty’ part of the technique might be to check to see if the output is
in the correct form, then, if only a vector is returned – turn it
into the correct form by making a data frame that has the same
number of rows by calling make_resource
.
The type 1 values could correspond to vector elements. Admittedly, this could get slow for huge population sizes, but population sizes would have to be massive for R to slow down from simply making a matrix with a lot of rows. In any case, it would at least standardise the input and output for the user of gmse_apply in a way that plays nicely with everything else in GMSE.
Similarly, the observation
function could also call make_resource
if a vector is
returned (since individual variation wouldn’t be relevant in the
numerical model).
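A sketch of that standardisation step; the helper below is hypothetical (in GMSE this job would fall to make_resource), and the placeholder trait columns are purely illustrative:

```r
# Hypothetical: coerce a scalar/vector population size returned by a
# numerical model into an individual-based array, one row per resource.
standardise_resources <- function(res_out, trait_cols = 3) {
    if (is.matrix(res_out) || is.data.frame(res_out)) {
        return(res_out)  # already in individual-based form
    }
    n <- as.integer(round(sum(res_out)))  # population size must be an integer
    cbind(ID = seq_len(n), matrix(0, nrow = n, ncol = trait_cols))
}

nrow(standardise_resources(72.6))  # 73 rows, one per individual
```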
With this alternative approach, no changes to the C code need to be made – the inputs and outputs just need to be tweaked in a standardised way when a vector or scalar is returned from any user-defined model (small detail – population size needs to be an integer). This can be an option later for the user and manager models – though I'm not sure how this would work, exactly. A benefit here is that some parts of the model could conceivably be individual-based, with others being numerical – the trade-off being the requirement for discrete resource numbers and a very small amount of slowdown (which will almost certainly not be noticeable for any reasonable model).
The gmse_apply
function would then initialise a very
small number of agents, and a small landscape (unless otherwise
specified) in every run. The possibility of passing more options could
be applied with a simple ...
. This would also require a
sub-function build_para_vec
, which would be used for the
sole purpose of taking the list of options included (same as in
gmse()
) and passing it to the sub-function, with any
functions not passed being assumed defaults (and most would be
irrelevant). So the default function should then look like
gmse_apply(resource_function = "IBM", observation_function = "IBM", manager_function = "IBM", user_function = "IBM", res_number = 100, ...);
I think at least an initial population needs to be specified, but
everything else can be left up to the user, with the ellipses passing to
the function building the parameter vector (which can also be called by
gmse()
, replacing some clutter). Overall, the function will
run without any input if none is specified, defaulting to an IBM with a
population size of 100 for one generation. All other options, including
non-standard functions, are left to the user.
Additional thoughts
Working this through, I'm slightly apprehensive about the motivation for including the gmse_apply function.
Once you strip the mechanistic approach from the resource and
observation models, all you really have are two values: (1) the
population abundance or density and (2) the estimate of the population
size or density. Once you include the manage_target
into
gmse_apply
(necessary, I believe), then the genetic
algorithm is really just a fancy way of getting the difference between
the population estimate and the target size, and then setting a number
of culling actions acceptable for users. Users then cull as much as
possible because they’re assumed to want to use the resource as much as
possible. Of course, we can consider other parameters that affect user
actions (e.g., maximum catch, effort), but if we’re interested in
learning about how these concepts affect harvesting in theory, then they
can and should probably be studied using a simpler model. The real point
of the genetic algorithm is that it allows for complex, multi-variable
goal driven behaviour, as might occur given indirect effects (e.g.,
organisms on crop yield) or multiple options (e.g., culling versus
scaring or growing) and spatial complexities. There seems little to be
gained by calling the genetic algorithm to tell users to cull as much as
possible, which can be done with a (very) simple function.
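To make that point concrete, here is the kind of (very) simple function pair that could stand in for the genetic algorithm in this stripped-down case; the names and the cost rule below are illustrative, not GMSE's actual decision-making:

```r
# Manager: set the culling cost from the gap between estimate and target.
# The further the population is above target, the cheaper culling becomes.
simple_manager <- function(estimate, target, budget = 1000, min_cost = 10) {
    excess <- max(estimate - target, 0)
    max(budget / (excess + 1), min_cost)
}

# Users: cull as much as the budget permits, given the cost of culling.
simple_user <- function(cull_cost, budget = 1000) {
    floor(budget / cull_cost)
}

cull_cost <- simple_manager(estimate = 1500, target = 1000)
simple_user(cull_cost)  # 100 culls: permissive policy, fully exploited
```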
I have finally fixed the annoyance in the GMSE shiny app that caused the bottom of the browser to go black, hence making it difficult to set parameter values in some tabs.
Additionally, by hovering over the different options in the application, the software user can now see a brief description of what each option does in the simulation.
I am experimenting with ways of demonstrating the conflict between what a manager incentivises, and what the users actually do, in GMSE. Below are some plots that show this for a few sample simulations. The five panels in each plot correspond to the five possible actions where policy is set. Policy set by the manager is shown with the black solid line, with the thin coloured lines reflecting individual user effort expended into each action.
The right axis is fairly easy to interpret – it’s just the percentage of the user’s total budget devoted to a particular action (note, this is not necessarily the number of actions a user performs because different actions can cost different amounts – hence the term ‘effort’).
The left axis is a bit trickier – it's how permissive the manager is of an action in practice. High values correspond to an action being highly permitted by the manager (i.e., the manager invests no effort in making these actions costly), whereas low values correspond to an action being less permitted (i.e., the manager invests heavily in making these actions costly for users).
The end result is that the lines indicating manager permissiveness are typically correlated with user effort towards any particular action. In the first example below, this is true for scaring and culling (as the manager becomes more permissive of these actions, users tend to take advantage and spend more effort doing them). Note that users do not feed because they have nothing to gain by feeding the resources, even though the manager usually permits feeding (around generation 75, the population started going way over the manager's target).
In the second example (below), the option for scaring has been removed. Because users want resources off of their land, the only option is to cull, so users will cull as much as permitted, even though the manager is doing as much as possible to incentivise them not to.
Below is a final example in which all actions except helping are possible options.
While playing with the prototype GUI, I discovered a minor bug in the plotting function, which I fixed so that plotting no longer produces an error. I have also updated the list of
contributors in the description file, and the list of recommended
packages (shiny packages for the new gmse_gui
function).
I have now also added a new release version 0.2.2.8 to GitHub. This version requires three additional libraries:
The above three libraries will be imported as dependencies (or should be) in the new version of GMSE.
A prototype GUI for the GMSE package is now up on shiny. I'm going to make this look nicer with a CSS style-sheet, but for now this gets the job done.
I am currently trying to get a handle on creating a GMSE GUI in shiny
by looking at the elementR package.
The authors of this package, to get their very impressive shiny
application running, need to nest multiple sub-functions inside the long
(10000+ line) runElementR
function. GMSE won’t need to have
this much code for the user interface – I have figured out roughly how
to make the input look good and functional in a browser, but a tricky
part will be to link that input with the gmse()
function
parameters, then run things.
In writing a draft manuscript, the term ‘stakeholder’ is being
applied to mean both managers and users. This differs from the model
itself and therefore in the use of GMSE. To resolve this, I think that
it would be worthwhile to change the documentation to match the
manuscript. But I don’t want to change the input
stakeholder
for any existing users of GMSE that might be
inconvenienced. Instead, I think just defining stakeholder
to be the number of managers and users could be fine by changing
stakeholder <- stakeholder + 1
in the
gmse()
function. This might need revisiting in later
versions (if we wanted to have multiple managers and stakeholders), but
such a change would be likely part of a much bigger release in which
major (and potentially inconvenient) changes would be unavoidable.
Following the release of GMSE v0.2.2.7
on CRAN, with
extended documentation,
as introduced on the ConFooBio blog
and my
blog, I shift my attention to the vignette. The vignette in development will eventually be packaged into a future version of GMSE, then submitted as a separate methods paper.
GMSE v0.2.2.5
is now up on CRAN
(13:32 GMT), and my hope is that v0.2.2.7
will replace it
soon following some clarification of the documentation. I am avoiding a
public announcement of the package on CRAN until I receive confirmation
that the new version is accepted.
New logo
v0.2.2.0
: Bug fixes, new feature
While beginning to write up the vignette, I worked out a bug that applied to simulations in which stakeholder number was greater than 4 (tl;dr, these stakeholders were not acting according to their interests). This was fixed with commit 6ae58ec374f48464a0706fcf585dd5f1534e4511, and in fixing this I made the distinction between hunting type scenarios (where stakeholders have an interest in directly using resources) and farmer type scenarios (where stakeholders care about their land, and resources only indirectly because the resources affect the land).
I also added a new feature allowing the software user to adjust the
proportion of the landscape that is public_land
(commit f88545569a4c3e39906291759f376403b8e665f3).
This can be interpreted as land that is unmanaged and therefore
available for resources to use without fear of scaring or culling when
land_ownership = TRUE
. Also, now when
land_ownership = FALSE
, all land is considered public and
this is now reflected accurately with the plots.
I have also opted to change the default res_min_age
, the
age at which resources can be seen, to zero instead of one. This results
in plots that are above the defined carrying capacity sometimes because
carrying capacity is applied to adults, not juveniles when
res_death_K
is set. The result is a total carrying capacity of res_death_K * (1 + lambda), which accounts for the birth of juveniles in a population at carrying capacity.
The fixed_recapt
is now running as intended as of commit
ad9d9e10ead215a703f9accdfbd149d35b350567.
New issues – proposed enhancements for the future GMSE
Before I lose track of all the proposed ideas for improving upon the GMSE package, I want to get all of them up as issues on GitHub. For completeness, I have also included the unresolved Issue 9. I will add to the below to form an organised list of future ideas to work on, all laid out as enhancement issues on GitHub. Anyone should be able to add to this list, or comment on the issues (e.g., if they would be especially useful ones to resolve).
Issue 9: Observation Error
It would be useful to incorporate observation error into the
simulations more directly. This could be affected by one or more
variables attached to each agent, which would potentially cause the
mis-identification (e.g., incorrect return of seeme
) or
mis-labelling (incorrect traits read into the observation array) of
resources. This could be done in either of two ways:
Cause the errors to happen in ‘real time’ – that is, while the observations are happening in the simulation. This would probably be slightly inefficient, but have the benefit of being able to assign errors specifically to agents more directly.
Wait until the resource_array
is marked in the
observation
function, then introduce errors to the array
itself, including errors to whether or not resources are recorded and
what their trait values are. These errors would then be read into the
obs_array
, which is returned by the function.
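A sketch of option (2); the array layout, column positions, and error rates below are illustrative rather than GMSE's actual obs_array structure:

```r
# Hypothetical: inject observation error into a marked resource array.
# Some resources are missed entirely; recorded traits get labelling noise.
add_observation_error <- function(res_array, miss_pr = 0.1,
                                  trait_col = 3, label_sd = 0.5) {
    seen <- runif(nrow(res_array)) > miss_pr   # some resources go unseen
    obs  <- res_array[seen, , drop = FALSE]
    obs[, trait_col] <- obs[, trait_col] +     # mis-labelled trait values
        rnorm(nrow(obs), mean = 0, sd = label_sd)
    obs
}

set.seed(1)
res <- cbind(ID = 1:100, type = 1, trait = 5)
dim(add_observation_error(res))  # fewer than 100 rows remain
```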
Issue 30: Manager assumptions about user actions
It would be useful to allow for simulations to dynamically adjust the
caution that the manager has when changing actions. At the moment,
managers always assume that some specified number of
actions
will be performed by users, and this number does
not change over the course of the simulation. But managers might be able
to use the history of user actions to learn to be more or less cautious
when setting new policy.
Issue 31: Modify manager’s predicted effects
Currently, the predicted effects of a manager’s actions are set to
values that, heuristically, appear to work in the genetic algorithm.
This is adjusted with the manager_sense
parameter, which
has a default of 0.1, such that the manager assumes that if they set
costs to increase culling by 100 percent, it will actually only increase
by 10 percent (as not all users are going to necessarily cull if given
the opportunity). Like real-world management, this is heuristic and
results in uncertainty, but future versions of GMSE could dynamically
modify this value during the course of the simulation based on real
knowledge of how policy changes have affected user actions in previous
time steps.
Issue 32: Long-term histories affect genetic algorithm
Currently, only the history of interactions from the previous time
step directly affects the genetic algorithm for stakeholders and
managers. For managers especially, this could be made a bit more
nuanced. The entire history of total actions and resource dynamics is
recorded, and this could easily be made available (e.g., in
PARAS_REC
) for managers to make decisions. Incorporating
these data into the genetic algorithm, and therefore into agent decision
making, could be tricky, but one simple example of this could be having
managers use the per-time step mean number of stakeholder actions in the
last 2-3 time steps to predict future user actions with a bit more
inertia. Managers could also use stakeholder action history from earlier
time steps, but weighting each by how long ago they occurred.
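As a quick sketch of the weighting idea (the window length and linear weights here are arbitrary choices for illustration, not a GMSE feature):

```r
# Predict next-step user actions from recent history, weighting recent
# time steps more heavily than older ones.
predict_actions <- function(action_history, window = 3) {
    recent <- tail(action_history, window)
    w <- seq_along(recent)                 # older steps get lower weight
    sum(recent * w) / sum(w)
}

predict_actions(c(40, 55, 50, 62, 58))  # weighted mean of last 3 steps: 58
```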
Issue 33: Non fixed mark-recapture sampling number
Currently, to simulate mark-recapture observation and analysis,
values for fixed_mark
and fixed_recapt
need to
be specified in GMSE, and the manager would have exactly these numbers
of marks and recaptures in each generation, respectively. It would also be useful, instead of specifying exact numbers, to have the manager search a general area, then mark all resources in that area. Next, the
manager could search again and recapture, so the exact number is not
always set and the observation process probably mimics more closely what
happens in the field. This type of sampling is actually already
available (observe_type = 0
), so I would just need to add
some code to have managers interpret some observations as marks and
others as recaptures.
Issue 34: Resource interactions
Currently, more than one resource type is permitted, but this is not offered/visible to users of the software. A next major version of GMSE could have multiple resource types with resources actually interacting with one another (could borrow future development code from EcoEdu). Simple interactions could include competition and predator-prey functions in the resource model. The code is also already ready for managers and users to consider multiple resources in making policy decisions and actions, respectively.
Issue 35: Stakeholder lobbying
Currently, GMSE assumes that stakeholders have a negative
relationship with resources – they either want to hunt them or scare
them from their land. Future versions of GMSE should include an option
for a stakeholder type (e.g., activist) that lobbies the manager to
adjust the manager’s utilities, effectively increasing or decreasing the
target
. The data structure to do this already exists, it’s
just a matter of figuring out how best to enact it and why. For example,
would adding this have any actual effect that differs from just assuming
that the manager is being lobbied by conservationists continuously, and
that their target
is a reflection of that?
I need to double check that fixed_recapt
is doing what I
said it did on 23 JUN. My concern is that
it is not being implemented properly in the observation model – there
needs to be a difference between the first and second
times_obs
, or times_obs
might need to be
redefined for the first and second rounds of observation. It looks like
the observation model is just doing times_obs
observations
with the same number of samples in each one.
A better mark-recapture observation model estimator
Setting the parameters for the mark-recapture observation model
(observe_type = 1
) was confusing, so much so that even I struggled to remember how to do it. In v0.2.1.3
, I have fixed this so
that the sampling is clearer. Rather than having a
fixed_observe
argument in gmse()
, I’ve
included fixed_mark and fixed_recapt, arguments that only apply when observe_type = 1
. Under
these conditions, times_observe
is ignored and
fixed_mark
defines how many resources will be marked in
each time step; fixed_recapt
defines how many recaptures
will be made. If the value of fixed_mark
or
fixed_recapt
is greater than the actual size of the
resource population, then all resources in the population will be
sampled.
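For reference, the Chapman estimate that the manager model applies to mark-recapture data takes the standard form below; the numbers in the example are purely illustrative:

```r
# Chapman estimator of abundance: n1 resources marked, n2 captured in the
# recapture round, m2 of which were already marked.
chapman_estimate <- function(n1, n2, m2) {
    ((n1 + 1) * (n2 + 1)) / (m2 + 1) - 1
}

chapman_estimate(n1 = 40, n2 = 40, m2 = 15)  # roughly 104
```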
Get a better confidence interval for the density estimator
The density estimator is giving too few Type 1 errors because of
times_observe > 1
. This doesn’t affect anything but the
visualisation, since managers don’t make decisions based on confidence
intervals. Still, fixing the CIs would be a good idea. The CIs should
also be correct when times_observe = 1
. Really, the
times_observe > 1
is simulating a weird case in which
the central limit theorem would apply to the times_observe
estimates, and hence the mean estimate among the times observed should
be normally distributed around the mean.
Double-check for memory leaks with Valgrind
Running Valgrind on the R package GMSE revealed no memory leaks.
==15438==
==15438== HEAP SUMMARY:
==15438== in use at exit: 67,722,174 bytes in 20,290 blocks
==15438== total heap usage: 6,398,005 allocs, 6,377,715 frees, 1,211,596,543 bytes allocated
==15438==
==15438== LEAK SUMMARY:
==15438== definitely lost: 0 bytes in 0 blocks
==15438== indirectly lost: 0 bytes in 0 blocks
==15438== possibly lost: 0 bytes in 0 blocks
==15438== still reachable: 67,722,174 bytes in 20,290 blocks
==15438== suppressed: 0 bytes in 0 blocks
==15438== Reachable blocks (those to which a pointer was found) are not shown.
==15438== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15438==
==15438== For counts of detected and suppressed errors, rerun with: -v
==15438== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I have changed some default parameters so that when I write up the default example, it provides a description that will be useful to new users.
I am in the process of organising the vignette. As I've done in a previous manuscript, I'll start with notes and an outline, and update these from now on in the Rmarkdown file. The vignette will therefore evolve and be tracked through git, just like the code.
Also, while compensation payments are not yet included as a feature of GMSE, I think that an option to include them should be relatively easy to implement through the manager layers of the COST and ACTION arrays, where column 1 equals -1 (landscape).
The killem
and helpem
columns remove and
increase crops, respectively, but the additional available columns could
be used to track compensation owed (to stakeholders) and paid (by
managers) – both at a cost, of course.
Unit tests are written
Unit tests for all sub-functions of the model are written, with the exception of functions used for plotting, which I don’t think are necessary to unit test because errors to the plot will be very obvious in development. Everything now passes CRAN checks except the licensing, which we’ll need to agree on at some point. As of Monday, I will be able to start on the vignette (manuscript).
Unit testing for long-term code maintenance
To ensure that the gmse
package functions as intended in
the long term, I am writing an extensive battery of unit tests that will
need to be passed to ensure that any new features do not introduce bugs
or break existing functions. To do this, I will use the testthat
package in R and follow the advice in Hadley Wickham’s chapter on
unit testing R code. I’ve done this already for the gamesGA package, which
is now on
CRAN, though the gmse
package will require many more tests simply because there are many more functions to test.
The unit testing already helped by identifying a potential bug later on down the line when initialising cost arrays for simulations with more than two resources (see commit 65088054481266e67f06513dc368c515e4a9fed0). Unit tests for all initialisation functions except landscape functions are now complete. Next, I will need to do landscape functions and perhaps some (but probably not all) functions associated with plotting, then the four main GMSE model functions.
The density-based observation estimates were giving incorrect values. Looking into this, the reason for the error was that I was using confidence intervals for proportions (e.g., the proportion of cells with resources on them) rather than counts (which, we assume, have a Poisson error structure). I have replaced the previous estimate of confidence intervals around local density with a Poisson estimate
\[ \hat{\lambda} \pm 1.96 \times \sqrt \frac{\hat{\lambda}}{vision^2} \]
In the above \(\hat{\lambda}\) is the estimated local density and \(vision\) refers to the total number of cells that managers can see.
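Computed directly from the formula above; this is a sketch of the calculation, and GMSE's internal implementation may differ in detail:

```r
# 95% CI around the estimated local density lambda_hat, where vision^2
# gives the number of cells over which the density was estimated.
poisson_density_ci <- function(lambda_hat, vision) {
    half <- 1.96 * sqrt(lambda_hat / vision^2)
    c(lower = lambda_hat - half, upper = lambda_hat + half)
}

poisson_density_ci(lambda_hat = 0.4, vision = 20)
```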
With this new correction, and also fairly major bug fixes having to do with fixing an error in landscape actions causing an infinite loop (commit 310fb76b7e3b3499ab74e2f94c61c3276f3c4118) and fixing user actions to tend crops or kill crops appropriately (commit 424fc2eb4f6274763f5ead0fc48ad5dd7f68c422),
I am now pushing to master and releasing v0.2.1.1
, which
effectively patches some major issues and improves plotting (including a
new legend for costs and actions).
Bug fix to the user function
An erroneous condition in a while
loop was causing an
infinite loop when manager budgets were very high and user actions were
not restricted to landscape that they owned. This has been fixed on the
development branch but not yet pushed to the master branch.
Some initial notes: GMSE (beta) package
v0.2.1.0
A beta version of GMSE is now available, and is ready to be
experimented with and tested as an R package. To download and begin
using GMSE, it is necessary to first download the devtools
library.
install.packages("devtools")
library(devtools)
Use install_github
to install using
devtools
.
install_github("bradduthie/gmse")
From here, it is possible to run GMSE simulations using the
gmse()
function. For help using this function, all documentation can be accessed by simply
calling the help files.
help(gmse)
The documentation contains a basic
description of the gmse()
function (the only one that is
needed to run simulations – subfunctions for resource, observation,
manager, and user models are all accessible as independent R functions,
but are not very useful at the moment without the initialisation in the
main function – nevertheless, the documentation for these can be
accessed with help(resource)
,
help(observation)
, help(manager)
, and
help(user)
). It also contains arguments for most of the
variables that might be usefully changed to simulate different types of
management scenarios; additional options are not shown for the moment
either because more coding is needed to make them useful or because I
don’t expect they will be needed. The explanations of the arguments are
detailed, along with documentation explaining the (extensive) amount of
data that is returned after running a simulation. To get started though,
the default simulation can be run simply.
sim <- gmse();
Parameter values can then be adjusted by varying the options in the
gmse()
function.
R vignette, and the beginning of a methods paper
I will soon begin work on an R vignette, which is essentially a long form documentation that can also be a manuscript to submit to a journal.
Add some formal testing functions for future development
I will also need to add some formal R tests, which are
basically ways of automating the kind of testing that is done
continually while writing the code. The idea with formal unit tests is
to have a process that checks to see if the code breaks when a new
feature (and therefore new code) is added. Since the results of the
simulation are stochastic, I think the best way to test is to set a seed
and use default parameter values, then check to make sure that the results match a stored reference using the expect_equal_to_reference() function in testthat. It might be useful to do this for each of the
resource()
, observation()
,
manager()
, user()
, and gmse()
models – perhaps testing the ith time step for each of the
sub-functions, but then the gmse()
function also as a whole
(perhaps using just 10 time steps would be sufficient for this instead
of a default of 100).
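The reproducibility assumption underlying this approach can be illustrated with a toy stochastic function in place of gmse():

```r
# With the same seed, stochastic output is identical, so a simulation run
# can safely be compared against a stored reference object.
set.seed(1)
run_a <- rpois(n = 5, lambda = 10)
set.seed(1)
run_b <- rpois(n = 5, lambda = 10)
identical(run_a, run_b)  # TRUE
```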
Introduce Issue #29: No edge effect causes crash
When edge_effect = 0
, and therefore nothing happens when
resources and agents move off of the edge of the landscape, R crashes.
This is almost certainly due to some sort of memory leak. This is a low priority issue at the moment because I cannot think of a reason why anyone would explicitly want the model to just ignore resources moving off of the landscape. If someone wants something other than a torus (edge_effect = 1), such as a reflective edge or emigration upon leaving the landscape, this should be explicitly coded into the edge_effect function in utilities.c. Until someone asks for it, I'll stick with a torus.
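For reference, the torus behaviour amounts to wrapping cell indices, roughly as follows; this is a sketch in R, not the utilities.c implementation:

```r
# Wrap 1-based cell positions around a landscape of a given dimension,
# so resources leaving one edge re-enter on the opposite side.
torus_wrap <- function(pos, land_dim) {
    ((pos - 1) %% land_dim) + 1
}

torus_wrap(c(0, 5, 101), land_dim = 100)  # -> 100 5 1
```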
New (draft) documentation for the gmse()
function
DESCRIPTION: GMSE simulation
The gmse function is the primary function to call to run a simulation. It calls other functions that run resource, observation, management, and user models in each time step. Hence, while individual models can be used on their own, gmse() is really all that is needed to run a simulation.
res_move_type settings. Under default settings, during each time step, resources move from zero to res_movement cells away from their starting cell in any direction. Hence res_movement is the maximum distance away from a resource's starting cell that it can move in a time step; other types of resource movement, however, interpret res_movement differently to get the raw distance moved (see res_move_type). The default value is 4.

res_move_type: (1) Uniform movement in any direction up to res_movement
cells away during a time step. Movement
direction is random and the cell distance moved is randomly selected
from zero to res_movement
. (2) Poisson selected movement in
the x and y dimensions where distance in each direction is determined by
Poisson(res_movement) and direction (e.g., left versus right) is
randomly selected for each dimension. This type of movement tends to
look a bit odd with low res_movement
values because it
results in very little diagonal movement. It also is not especially
biologically realistic, so should probably not be used without a good
reason. (3) Uniform movement in any direction up to
res_movement
cells away during a time step
res_movement
times. In other words, the
res_movement
variable of each resource is acting to
determine the times that a resource moves in a time step and the maximum
distance it travels each time it moves. This type of movement has been
simulated in ecological models, particularly plant-pollinator systems.
The default movement type is (1).

lambda is the population growth rate, also set as an argument in gmse simulations.

res_death_type: A value of (1) causes death to occur with probability removal_pr
for each resource (which may be further affected by agent actions or
interactions with landscape cells). A value of (2) causes death to be
density-dependent (though potentially independently affected by agents
and landscape), with mortality probability calculated based on the
carrying capacity res_death_K
set as an argument in gmse simulations. The default res_death_type is (2), as a value of (1) must be used carefully because it can result in exponential growth that leads to massive population sizes that slow down simulations.

observe_type: (0) Density-based sampling, in which the manager samples resources within a subset of the landscape up to a distance of agent_view
from the cell of the manager. Managers sample
times_observe
subsets, where times_observe
is
a parameter value set in the gmse simulation. Managers then extrapolate
the density of resources in the subset to estimate the total number of
resources on a landscape. (1) Mark-recapture estimate of the population,
in which managers randomly sample times_observe
resources
in the population without any spatial bias (if there are fewer than
times_observe
resources, managers sample all resources)
times_observe
times with replacement. The first
fixed_observe
times are interpreted as marks, while the
remaining times are interpreted as recaptures (note that
fixed_observe
must be less than
times_observe
). Hence if a resource is observed at any time
in fixed_observe
independent observations, then it is
considered marked; if it is observed again at any time in
times_observe - fixed_observe
independent observations,
then it is considered recaptured. A Chapman estimate is used in the
manager model to estimate population size from these observation data.
(2) Transect-based sampling (linear), in which a manager samples an
entire row of the landscape and counts the resources on the row, then
moves onto the next row of the landscape until the entire landscape has
been covered. The number of cells in each row (i.e., the height) equals
agent_view
, so fewer transects are needed if agents can see
farther. If res_move_obs == TRUE
, then resources can move
on the landscape between each transect sampling, potentially causing
observation error if some resources are double counted or not counted at
all due to movement. If res_move_obs == FALSE
, then this
type of observation should produce no error, and resource estimation
will be exact. (3) Transect-based sampling (block), in which a manager
samples a block of the landscape and counts the resources in the block,
then moves on to the next (equally sized) block until the entire
landscape has been covered. Blocks are square, with the length of each
side equaling agent_view
, so fewer blocks are needed if
agents can see farther. If res_move_obs == TRUE
, then
resources can move on the landscape between each block sampling,
potentially causing observation error if some resources are double
counted or not counted at all due to movement. If
res_move_obs == FALSE
, then this type of observation should
produce no error, and resource estimation will be exact. The default
observation type is 0 for density-based sampling.

times_observe applies to both density-based sampling (observe_type = 0) and mark-recapture sampling (observe_type = 1). In the
). In the
former case, the value determines how many times the manager goes out to
sample resources from a subset of the landscape. In the latter case, the
value determines how many times the manager goes out to attempt to find
new resources to mark or recapture (hence its value must be greater than
fixed_observe
).

agent_move_type: (1) Uniform movement in any direction up to agent_move
cells away during a time step. Movement direction is random and the cell
distance moved is randomly selected from zero to
agent_move
. (2) Poisson selected movement in the x and y
dimensions where distance in each direction is determined by
Poisson(agent_move) and direction (e.g., left versus right) is randomly
selected for each dimension. This type of movement tends to look a bit
odd with low agent_move
values because it results in very
little diagonal movement. It also is not especially realistic, so should
probably not be used without a good reason. (3) Uniform movement in any
direction up to agent_move
cells away during a time step
agent_move
times. In other words, the
agent_move
variable of each agent is acting to determine
the times that an agent moves in a time step and the maximum distance it
travels each time it moves. This type of movement has been simulated in
ecological models, particularly plant-pollinator systems. The default
movement type is (1).

res_move_obs This is a TRUE or FALSE value determining whether resources can move during the course of times_observe times being observed. The default value is TRUE, but if the option is set to FALSE then it shuts down all resource
movement during sampling (making observe_type = 2
and
observe_type = 3
error free).

The plotted output includes six panels. (1) Upper left panel: Shows the locations of resources on the landscape. (2) Upper right panel: Shows land ownership by stakeholders; if stakeholders own land (land_ownership = TRUE in the gmse() function), then this gives
some idea of where actions are being performed and where resources are
affecting the landscape. (3) Middle left panel: Shows the actual
population abundance (black solid line) and the population abundance
estimated by the manager (blue solid line) over time. The dotted red
line shows the resource carrying capacity (death-based) and the dotted
blue line shows the target for resource abundance as set in the gmse()
function; the orange line shows the total percent yield of the landscape
(i.e., 100 percent means that resources have not decreased yield at all,
0 percent means that resources have completely destroyed all yield). (4)
Middle right panel: Shows the raw landscape yield for each stakeholder
(can be ignored if land_ownership
is FALSE) over time;
colours correspond to land ownership shown in the upper right panel. (5)
Lower left panel: The cost of stakeholders performing actions over time,
as set by the manager. (6) Lower right panel: The total number of
actions performed by all stakeholders over time.

hunt Whether or not the simulation is halted each time step after start_hunting
time steps to ask the user how many resources
they want to hunt (some management information is given to help make
this choice). This feature will be expanded upon in later versions.
Right now, the human is playing the role of agent number 2, the first
stake-holder in the simulation. By default, this value is set to
FALSE.

start_hunting The time step at which the human player begins hunting; this is only used if hunt = TRUE. The default value is 95.

ga_popsize In the genetic algorithm, the actions of a focal agent are replicated ga_popsize times,
and this population of individual agent actions undergoes a process of
natural selection to find an adaptive strategy. Selection is naturally
stronger in larger populations, but a default population size of 100 is
more than sufficient to find adaptive strategies.

ga_mingen In the genetic algorithm, the actions of a focal agent are replicated ga_popsize
times, and this
population of individual agent actions undergoes a process of natural
selection at least ga_mingen
times to find an adaptive
strategy. If convergence criteria converge_crit
is set to a
default value of 100, then the genetic algorithm will almost always
continue for exactly ga_mingen
generations. The default
value is 20, which is usually plenty for finding adaptive agent
strategies – the objective is not to find optimal strategies, but
strategies that are strongly in line with agent interests.

ga_seedrep At the start of the genetic algorithm, ga_popsize
replicate agents are produced;
ga_seedrep
of these replicates are exact
replicates, while the rest have random actions to introduce variation
into the population. Because adaptive agent strategies are not likely to
change wildly from one generation to the next, it is highly recommended
to use some value of ga_seedrep
greater than zero; the
default value is 20, which does a good job of finding adaptive
strategies.

ga_sampleK Fitness in the genetic algorithm is assessed by tournament; each tournament samples ga_sampleK
strategies at random and with replacement from
the population of ga_popsize
to be included in the
tournament. The default value is 20.

ga_chooseK As with ga_sampleK, each tournament samples ga_sampleK strategies at random and with replacement from
the population of ga_popsize
to be included in the
tournament, and from these randomly selected strategies, the top
ga_chooseK
strategies are selected. The default value is 2,
so the top 10 percent of the random sample in a tournament makes it into
the next generation (note that multiple tournaments are run until
ga_popsize
strategies are selected for the next
generation).

max_ages The maximum age of resources; the default max_ages is 5.

user_budget The total budget of each stakeholder for performing actions; what a stakeholder can do is jointly constrained by this budget, the minimum_cost of actions, and the policy set by the manager. The default user_budget is 1000.
manager_budget This is the total budget for the manager when setting
policy. Higher budgets make it easier to restrict the actions of
stakeholders; lower budgets make it more difficult for managers to limit
the actions of stakeholders by setting policy. The default
manager_budget
is 1000.

Setting this option to TRUE sets the lambda
value to zero. The default value of
this is FALSE.

tend_crops If TRUE, stakeholders can increase the yield of landscape cells through the action tend_crops. Actions on the
landscape cannot be regulated by managers, so the cost of this action is
always minimum_cost
. The default value of this is
FALSE.

kill_crops If TRUE, stakeholders can destroy the yield of landscape cells, as with tend_crops. Actions on
the landscape cannot be regulated by managers, so the cost of this
action is always minimum_cost
.

manage_caution The caution rate of the manager; the manager assumes that at least manage_caution of each possible action will always be performed by stakeholders. A manager will therefore not ignore policy
for one action because no stakeholder is engaging in it; the default
value of manage_caution
is 1.

converge_crit The convergence criterion of the genetic algorithm. After ga_mingen generations, the genetic algorithm will terminate if the convergence criterion is met. Usually making this criterion low does little to improve adaptive strategies, so the default value is 100, which in practice causes the genetic algorithm to simply terminate after ga_mingen
generations.

Returns: A large list is returned that includes
detailed simulation histories for the resource, observation, management,
and user models. This list includes eight elements, most of which are
themselves complex lists of arrays: (1) A list of length
time_max
in which each element is an array of resources as
they exist at the end of each time step. Resource arrays include all
resources and their attributes (e.g., locations, growth rates,
offspring, how they are affected by stakeholders, etc.). (2) A list of
length time_max
in which each element is an array of
resource observations from the observation model. Observation arrays are
similar to resource arrays, except that they can have a smaller number
of rows if not all resources are observed, and they have additional
columns that show the history of each resource being observed over the
course of times_observe
observations in the observation
model. (3) A 2D array showing parameter values at each time step (unique
rows); most of these values are static but some (e.g., resource number)
change over time steps. (4) A list of length time_max
in
which each element is an array of the landscape that identifies
proportion of crop production per cell. This allows for looking at where
crop production is increased or decreased over time steps as a
consequence of resource and stakeholder actions. (5) The total time the
simulation took to run (not counting plotting time). (6) A 2D array of
agents and their traits. (7) A list of length time_max
in
which each element is a 3D array of the costs of performing each action
for managers and stakeholders (each agent gets its own array layer with
an identical number of rows and columns); the change in costs of
particular actions can therefore be examined over time. (8) A list of
length time_max
in which each element is a 3D array of the
actions performed by managers and stakeholders (each agent gets its own
array layer with an identical number of rows and columns); the change in
actions of agents can therefore be examined over time. Because the above
lists cannot possibly be interpreted by eye all at once in the
simulation output, it is highly recommended that the contents of a simulation be stored and interpreted individually if need be; alternatively, simulations can more easily be interpreted through plots
when plotting = TRUE
.
GMSE is now a package
I have now made GMSE an R package, including documentation for all of the
R code except the main gmse()
function, which I will
complete soon. The package should be available to use as early as
tomorrow evening. There are still some additional tweaks that I will
continue to make, particularly to the plotting, and I want to add some
tests to the model as well. Uploading to CRAN will be done after some
beta testing – I’ll mainly follow Hadley Wickham’s book for advice
here.
Progress on new features
A new six by two plot for case 2
and
case 3
observation functions has been added. Additionally,
G-MSE now records a new array PARAS_REC
, which holds
parameters each generation, including observation estimates and
confidence intervals. The PARAS_REC
will allow me to
simplify the plotting functions because the relevant data will be
calculated in C on the fly and neatly held in PARAS_REC
.
Additionally, I will add in the total actions for all stake-holders as
seven elements in paras
(five actions on resource
type1
and two landscape actions), and also the cost of each
action. This will not only make all of the plotting code much simpler,
it will also allow the potential for the history of actions and
costs to affect manager and stake-holder actions in future software
development.
It’s always tempting to push the model a bit further with new features or more efficient algorithms and code, but I think that now is the time to turn G-MSE into a package and send it off to colleagues to experiment with, which I will do tomorrow. Nevertheless, I want to hit a few points that will be very useful for future G-MSE features:
Some of the relevant data are already stored in PARAS_REC, but using these data will be tricky.

Each of the above will take a bit of planning in addition to coding. I’m not sure if they would also require the addition of new data arrays, but I think they are worth considering.
Resolve Issue #27
The start column (and, because observation column number equals
times
observed, the end column) was specified incorrectly
in the density and mark-recapture estimates in R. This meant that three
columns were sampled with values all equaling zero, and three columns
were not sampled with values equaling one and zero, to estimate
population size. Hence, this produced an underestimate of population
size in plots. The issue has now been resolved.
Additional user options
Additional user options now include the following (defaults shown):
stakeholders = 4, # Number of stake-holders
manage_caution = 1, # Caution rate of the manager
land_ownership = FALSE, # Do stake-holders act on their land?
manage_freq = 1 # Frequency that management enacted
Within the week the following features will be added:
Plotting support for case 2 and case 3 observation functions
Population estimates calculated in manager.c instead of in R (and saved in paras)
The above points should not take more than a day to complete, at most, and upon completing them I will then make G-MSE into a package that can be downloaded using devtools from GitHub. More long-term, I want to do the following, but this might not happen until after a draft of the methods paper is written.
Introduce Issue #27: Observation estimate underestimates real population size
The case 0 observation type is consistently underestimating the true population size. This could be caused by a calculation that assumes that the size of the sampled area is larger than it actually is, or that the size of the landscape is smaller than it actually is; either way, the observation.c file needs to be double-checked and potentially debugged.
Playing around with parameter values
I have made some of the simulation inputs easier to work with on the user end and played around with different variable combinations on a relatively low-power laptop (Lenovo X201 Thinkpad). The simulation is a bit slower than desirable, but not so slow as to cause major issues (it takes about a minute or so to simulate a fairly big population of ca 200 with 12 stake-holders).
Introduce Issue #28: More stake-holders have fewer actions
For some reason, having more stakeholders appears to lead to less culling of resources even when all of them are attempting to do it. If there are more stakeholders to act, then actions should happen more often because each has the same budget.
Note that this appears to even occur when users are not restricted to their landscape; it might be something to do with double killing? There just aren’t enough resources dying in the model to match with the actions.
Resolved Issue #28
Resolved – I had simply input the stakeholder number incorrectly. See commit 6b63439b384cab90680f6a36a79f2c94eba46c45
Code is finally stable
I have now deliberately tried to crash G-MSE in multiple ways – the goal being to throw parameter combinations or options at the model in such a way as to cause the model to not work accurately. At first, I was successful at this when I forced managers to only allow for one management option (culling, scaring, etc.). After much debugging and testing, I have fixed this so that I am confident that the code runs as advertised, for the moment. Features that have now been included in G-MSE as a consequence of this process include the following.
An option to specify which actions (movem, killem, castem, feedem, and helpem) are allowed; actions that
are not allowed can never be performed. At the moment, these actions are
still plotted as zeros, but soon they will be removed from the
plots.

A minimum cost for all actions. Previously, a manager could, for example, price killem at 1 instead of 0. Yet if they wanted something between keeping culling the same and
doubling it, they were out of luck. If actions always cost at least some
value (default = 10), then some increment just above that value is
always available – hence it is better to simply give everyone a bigger
budget and set a minimum cost, giving more precision to managers to fine
tune policy.

Finally shift to Friday’s goals
I have started to change some parameter inputs to make it easier to play with parameters, but I’m going to do more of this now that I’m much more confident in the G-MSE software. Once this is done, I will make the whole thing an Rpackage that can be downloaded using github developer tools, and I’ll add documentation before sending instructions around. Additionally, I will then start to write up some sample case studies (e.g., hunters on a public landscape or farmers trying to maximise yield) to show what G-MSE can do. Writing these out into an Rmarkdown file, I’ll have the start of a methods paper introducing the software.
Minor debugging
There are still a few minor bugs to work out, some of which I was able to take care of (see commit history if need be). I’m now trying to give the option to restrict the number of possible actions, but restricting them seems to still produce some errors – namely, the genetic algorithm for managers doesn’t seem to be responding appropriately.
Landscape actions added to the user.c
file
I have added the function act_on_landscape
in
user.c
so that users can perform actions on the landscape.
The only two actions that users can perform at the moment are killem and feedem, which effectively kill the
crop yield and increase it, respectively. All other action columns do
nothing. I’ve also added a new element to paras
that
modifies how much a user can increase crop yield (previously, I was
allowing users to only double crop production on a cell). Testing confirms that when users value crop yield and can greatly increase it
by feedem
, they will find this option and do so to increase
crop yield.
Resolution of Issue #21:
paras
now used everywhere
I have now cleaned the code so that paras is used consistently across all G-MSE functions. This resolves Issue #21 and
makes the code more readable. Likewise, I have also cleaned up the
functions in a few places and introduced get_rand_int
for
easier sampling.
Next steps: Making it easier for users
The next steps as outlined on Friday are to do the following:
I don’t think that this will be too time-consuming because there is likely to be very little trouble-shooting and debugging for the above. Once all of this is done though, I will want to add the browser interface for G-MSE. This will be challenging, but the recently developed elementR package can provide some inspiration for working with shiny in a package that requires a lot of options to be set.
Rewritten do_actions
successful
I am largely satisfied with a rewrite of the do_actions
function, which affects the way that users perform actions on resources
by changing the rules to make actions simultaneous instead of sequential
by user. Instead of having one user perform actions on resources, then
another user perform actions, etc., the new do_actions
program instead just grabs the ACTIONS
array after the
genetic algorithm is called for all users and randomly performs
actions until no more actions exist. In other words, the order in
which the actions of all users are performed is effectively randomised
so that, for example, one user does not have an advantage of acting last
and therefore moving all of their resources to a neighbour’s territory
after their neighbour has performed all of their actions in a time step.
This implementation is probably slightly less efficient, but probably
not too much.
Landscape actions are not implemented at the moment, and will need to be rewritten, though this should be considerably easier as there are fewer actions to perform and the actions occur directly on the
landscape. The existing landscape_actions
(still not
deleted from user.c
) might be easy to edit even. Once this
is done, the whole model should be in place without any major issues;
I’m not sure if Issue #26 is
actually a bug, or just a consequence to be expected from a low-seeded
genetic algorithm, but the algorithm works either way, and not having a
seed is probably always a bad idea.
There are a few things that are definitely left to do.
Rewrite the landscape_actions function in the user.c file.
Add more elements to paras
. Some added elements might even help with the
plotting later and the management function (predicting growth rates) –
maybe make a paras_rec
that is a two dimensional array with
time_max
rows.

One weird thing to address, which I actually don’t think is a bug: sometimes when resource movement is very low (ca 1), the resources become highly autocorrelated on the landscape. I hypothesise that this is caused by some stake-holders doing a relatively poor job of killing resources at some point in the past, leading to a threshold of population growth that is localised and out of control. This happens often, but not consistently to the same agent, and sometimes to more than one agent in a simulation; it would be good to check to make sure that this is the correct interpretation of the patterns from the model.
One potential idea is to also give the manager a bit more information, in addition to allowing them to see growth rates of species empirically measured, to also see the enactment of policy in relation to how it is set. For example, if a manager sets a cost of 10 for killing, does that over or undershoot the target – the degree to which the target is over or undershot could be multiplied by the existing value to get a clearer prediction (e.g., if the manager wants 50 resources to die, but the way that those resources are distributed only allows 30 deaths because some resources are autocorrelated among different user’s land).
Resolved Issue #22
I’ve finally tracked down the bug that causes multiple resource types
to crash in the user function. The error was in the
land_to_counts
, which had conditions in the main
while
loop that couldn’t be met and were unnecessary (might have to add one more at some point if we want multiple landscape layers to work – later though). I’ve removed this condition, and also initialised
the COST
and ACTION
arrays without the
extraneous rows caused by landscape levels being repeated for resource
types; if nothing else, these were distracting, but I could see them
causing bugs later. As of commit 102018fc0457e510f87e812a97681860bed1a382,
G-MSE should be, in theory, functional with multiple resources, though I
still have the rewrite of do_actions
to do.
Major test fails – rewrite of do_actions
needed
The user function has a major bug that is causing strange things to
happen to resources. Consistently, resources are piling up on one or
another user’s land – I’ve found little rhyme or reason why, but it is
caused at least partly by the location-specific nature of user actions
(in other words, once u_loc = 0
and users can affect any
resource on the landscape, no spatial pattern exists). Note that this
happens even when users cannot move resources (e.g., only kill
on their land), so it’s not just that the last agent to act clears all
the resources from their landscape. The resources always seem to collect
on one owner’s land, and it’s not consistent whose (nor is there any apparent connection between the spatial distribution and the agent actions).
This bug gives me an excuse to re-write
do_actions
, which I probably needed to do anyway
because Issue
#22 is still unresolved. A rewrite of do_actions
and
everything downstream might fix the resource type specification error.
As of now the do_actions
function is called for each agent
sequentially, and each agent then performs their actions on each type of
resource and landscape level by moving through rows of the
ACTION
array (with error if
there is more than one type of resource). Hence, one user does all of
their business, then another, and so on.
It would be much better to do this all simultaneously, and it
shouldn’t take too much computation time or coding time. Instead of
going through agents sequentially, the idea is to copy the entire
ACTION
array (all agents having gone through their genetic
algorithms) into a function. Next, calculate the total number of actions
to be performed. Then, sample a random row, column, and layer of
ACTION
, which will be associated with a randomly selected
agent. The lucky winner will then randomly sample rows of the
RESOURCE
array until they find one that they can affect
(e.g., is on their land, has not been killed, and is of the correct
type); if they exhaustively search all resources but cannot find one to
affect, then they don’t perform the action – note the element
in the copied ACTION
array should not necessarily be set to
zero because another agent might subsequently kick a resource onto their
land to kill; it should decrement the action by one though (else a clear
risk of infinite loop). Landscape actions can proceed the same way; the
random selection simulates people doing things simultaneously over the
course of a season.
Introduce Issue #26: Genetic algorithm seed reliance
For some reason, the initial seed of the genetic algorithm appears to be having an effect that it shouldn’t. When there are no individuals seeded in the genetic algorithm from the previous generation, the agents appear to go under-budget. It’s not clear why this is the case. Oddly, managers appear to use a budget of 250 despite it being set at 300 given any seed greater than zero. When the seed is zero, the budget for setting costs drops to ca 100 for reasons that are not at all clear to me. For stake-holders, the cost drops to a fraction of its set budget (about a 30th of it). Yet, the stake-holder cost is still too low even when a seed of 20 is set; most stake-holders spend ca 1/6 of their budget when they should be forced to spend all of it.
Stake-holders are helping resources when they should not
Stake-holders are helping resources. This was caused by some issues
in resource_actions in the user model, which have now been
partially resolved in commit f1ce95e092739e6e53df05b326c491d917679eb9.
Essentially, resources were being helped out too much (i.e., growth
rates went from 0.05 to 1 when helping – changed now to 0.05 times two –
increasing birth rate 100 percent), and sometimes being helped out even
after having been killed or castrated. This is still
happening, as is evident when looking at
RESOURCE_REC
. Resolving it is priority one.
Cleanup and toward resolving Issue #21
I have done some more clean-up of the manager.c
file,
mainly reducing the number of arguments passed to functions using the
paras
vector. I’ve also removed some more hard-coded
values, particularly by defining columns for things like resource types
by holding column numbers in paras
. I’m not sure whether I
want to do this for the action and cost array cols 7-12 in
set_action_costs
yet. It might be a good idea. One
thing to keep an eye on is the para[66]
value,
which now is just the number of resources (also 1 minus the
lookup
table rows). It holds together for now, and nicely
can be affected globally, but I need to pay attention to how it’s affecting management in the set_action_costs
function.
Note on managing-observing trade-off
We could introduce a trade-off between observation and allocating
costs for the manager in G-MSE, as in Milner-Gulland (2011). Running this through the
genetic algorithm could be a challenge – somehow the observation
intensity would need to be put into the fitness function. Storing it
would be fairly trivial – could just use bankem
, but
converting observation time to manager fitness would require some more
thought.
Introduce Issue #25: Agent’s action error
For some reason, some initial testing seemed to suggest that resource population growth increases with the number of stake-holders, even if those stake-holders are hostile to the resource. Some further testing confirmed that stake-holders don’t engage in actions when there are more than two of them – it’s possible that I hard-coded something during testing, but it needs to be fixed. For now, I’m shifting the default testing options to 3 stake-holders to isolate the issue.
Resolve Issue #25: Agent’s action error
That was quick. What happened was an issue with the COST and ACTION arrays – I basically had the code to initialise
three but not four stake-holders accurately. When a fourth was
initialised (or nine, in the case of one test), the stake-holder did
nothing because it was devoting itself to costly nonsense actions from
the start and couldn’t get out of them. Resources then did better
because there were fewer agents able to affect them (those agents owning
a smaller amount of land). When this is resolved, a fourth stake-holder
performs the expected actions and the population dynamics oscillate even
more as a result because more total actions are being performed
(and on more land, as I’ve set it).
More progress toward resolving Issue #21
I’ve now reduced the number of arguments and hard-coded values in the
functions of observation.c
, leaving only the
transect
and sample_fixed_res
functions to go.
Overall, I do think that this makes the code more readable, and
everything goes back to the paras
vector, which will be
useful later in input and output during software testing and use.
Progress toward resolving Issue #21
The functions in resource.c
now take the
paras
vector as an argument where practical (most of the
time). This cleans the code up quite a bit and has the nice side effect
of giving me an excuse to also remove some of the hard coded values
(even if they don’t change, this is probably a good idea).
Concrete plans for cleaning up the code
Now that the main engine of G-MSE is in place, there are a few things that I want to do in the next week or two to clean up the code.
Pass function arguments through the para vector as originally intended. It shouldn’t take too
much extra work to do this, and it can be done systematically for each c
file by adding any new arguments to para
if they are not in
it already.

Improve scenario set-up in gmse. It’s not necessary for running
the model, but it would be very nice to somehow create some options to
build the COST
and ACTION
arrays for some
simple scenarios – perhaps even have a way to edit these arrays easily
within the code (then, eventually, as input into
gmse()
).

Following these things, it would also be helpful to do the following.
Resolved Issue #24: Resources retain helpem and feedem
Issue #24
appears to be resolved, although it was a bit trickier than anticipated to do so. I created three additional columns in the RESOURCE array to store the change in the baseline values
of birthrate, death probability, and offspring number. As far as I can
tell, there is no longer any carryover in these demographic values, nor
do parents pass on their adjusted values to their offspring. Fixing this
required several changes to user.c
and
resource.c
. As a consequence, death rate caused by killing
is now completely independent of carrying capacity (as seems sensible).
Another thing to decide is if increases in birth rate or
offspring number caused by user actions should also be independent of
carrying capacity; that is, when users helpem
or
feedem
, are they increasing the carrying capacity itself,
or just the population growth rate to carrying capacity (as of
now, it’s the latter).
A working example – but still some debugging to do
After resolving Issue #24, some initial testing shows that the model appears to be working as intended (more testing is obviously needed). The below shows a scenario in which one resource has a small effect on crop yield. The upper left panel shows resources on the landscape. The upper right panel shows land ownership (the blue is public – manager owned – land). The middle left shows population abundance (black) and its estimate (blue); carrying capacity is 400 (red dashed line), but the manager is trying to keep the population around 200 (dashed blue line) – mean percent yield of the crop is shown in the orange line. The mid right panel shows yields of each plot. The lower left panel shows the changing policy set by the manager – red lines show the cost of stakeholders killing or castrating resources (very high values effectively prohibit it). Green shows the cost of moving (scaring) resources off the stake-holder’s land, and blue shows the cost of helping the resource (increasing its birthrate or offspring production). The lower right shows what stake-holders do in response to policy – colours show actions corresponding to the same colour costs in the lower left panel.
So in the above example, we have the manager effectively prohibiting killing or castrating resources until about generation 18, when the population gets higher than desired. At this point, the manager switches to allow killing and castrating, and makes moving and helping resources more costly – stake-holders respond to this by doing a bit more killing and castrating, and the population goes down in response.
Looking good, but still need to clean the code
The above example is encouraging, but there is still quite a bit of
clean-up to do. More unit testing is necessary to make sure that all
resources are doing what they should, and I think the interaction
between resources and landscape could be made a bit better in the
resource model. Also, setting the initial costs and utilities is quite
messy – I need to fix this up a bit so that there is at least one easy
place to do this in the code, then an easy way to do it as an argument
in the gmse
function. It would also be nice not to have
managers or stake-holders be quite so short-sighted – but having
decisions be made based on history will require quite a bit more work,
though the structure is there for it to be done in the code.
One more debugging in the genetic algorithm
A bug in the code was causing managers to set their marginal fitnesses
to zero within the genetic algorithm. The reason for this was that the
functions crossover
and mutation
allowed for
util
, u_loc
, and u_land
columns
to be changed when the zero column of the action array was positive –
i.e., when the actions corresponded to affecting other agents’ utilities in some way. The reason for this coding is so that agents can potentially affect one another’s utilities (e.g., a stake-holder lobbying
the manager), but it does not make sense for stake-holders to affect
their own utilities. The bug was caused because when the manager mutated
(or crossed over) to change their own utilities in some way, the high
cost recognised this as over-budget and set the value to zero, hence
replacing the marginal utility set in the manager model. This was easily
fixed by not allowing an agent to affect its own utility values (i.e.,
disallow the utility columns to be changed when the first column of the
ACTION
array equal’s the agent’s own ID). This would have
caused an issue later anyway, so it’s better to spot it now. Re-running
the model, the bug is fixed and the manager marginal utilities are
retained in the appropriate row of ACTION
(see commit 4dacbe83ed1be0d1216b692a1db18f5323ed22f2).
Another thing that needs debugging
For some reason, managers are going way over budget in allocating actions. Fortunately, they're at least allocating their actions well, but I need to find out why their budget looks more like 500 when I set it to 100. Note that this only happens when managers want more of a resource, not fewer. Perhaps the marginal utility is getting added into the budget? Yes, this appears to be the case and has been fixed with commit d60312da590630fc2a680a57b8daed8e6d6bfafd, and now the costs no longer go over the manager's budget.
Valgrind summary
Some initial testing revealed that some memory might have been poorly allocated; allocating space for an int instead of a double in the genetic algorithm was flagged by valgrind. After fixing this (see commit 57d0c78de7e421687870749549d309cf85d31dab), valgrind returns no errors or leaks.
==8048==
==8048== HEAP SUMMARY:
==8048== in use at exit: 194,900,716 bytes in 19,004 blocks
==8048== total heap usage: 12,387,437 allocs, 12,368,433 frees, 2,482,975,628 bytes allocated
==8048==
==8048== LEAK SUMMARY:
==8048== definitely lost: 0 bytes in 0 blocks
==8048== indirectly lost: 0 bytes in 0 blocks
==8048== possibly lost: 0 bytes in 0 blocks
==8048== still reachable: 194,900,716 bytes in 19,004 blocks
==8048== suppressed: 0 bytes in 0 blocks
==8048== Reachable blocks (those to which a pointer was found) are not shown.
==8048== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==8048==
==8048== For counts of detected and suppressed errors, rerun with: -v
==8048== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I dare say that this might nearly be an alpha version of the software. I just need to get some more clever ways to input array values, and make sure that I resolve Issue #22.
Issue #24: Resources retain helpem and feedem
Resources are retaining their values of helpem and feedem after being helped for one generation. Worse, they are passing their inherited characteristics on to their offspring. This needs to be changed so that agent actions have the temporary effect of increasing offspring survival probability or reproduction – else populations will never run the risk of crashing.
Some success with manager fitness function debugging – more testing needed
After much time working on debugging the manager fitness function, I believe that all of the bugs are worked out of it, and that the managers are now responding dynamically to agent actions and resource abundances. I now need to test the whole function in multiple different ways to confirm this, and to make sure that the manager sets policy as predicted for some very simple scenarios.
For example, managers now make helpem and feedem costly when resource abundance is higher than what managers want it to be, but stake-holders want more resources. One potential issue I've already noticed – if managers make stake-holder actions so costly that they never perform them, then the manager might operate under the assumption that they will never perform the action even if costs drop. It might therefore be necessary to add an increment to the total actions (e.g., add 10 to each, just to give managers the ability to consider the possibility) or somehow have managers tie predicted actions to stake-holder utilities (I don't like this as much – too speculative and computationally intense).
Debugging the manager fitness function in the genetic algorithm
Today I have spent my time attempting to completely debug the newly created manager_fitness function and its sub-functions. Unfortunately, one bug still appears to remain. For some reason, the function adds actions to the first row of the POPULATION array. This issue has been isolated, and I'm almost sure that it is caused by something in manager_fitness. Tomorrow, the goal is to fix this so that actions are applied correctly where the row's first column is 1 (the manager agentID).
New Issue #23: Revise predicted consequences of user and manager actions
In the genetic algorithm functions res_to_counts and policy_to_counts, the projected consequences of actions need to be fine-tuned. As of now, res_to_counts predicts one fewer resource from each movem, killem, and castem action, and one more resource from each feedem and helpem action. In policy_to_counts, it predicts one fewer resource for killem and one more resource for feedem and helpem. Really, there should probably at least be an option to use more precise estimates of what will happen. For the user function, this matters a bit less because stake-holders typically just want more or less of a resource. Managers, however, are trying to hit a middle ground a lot of the time; it is also more reasonable to assume that they have demographic information on the resources of interest.
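The crude per-action projection described above can be sketched as a standalone calculation (an illustration only; the function name and flat arguments are mine, and the real res_to_counts reads these counts out of the population array):

```c
#include <assert.h>

/* Crude projection from res_to_counts (see Issue #23): each movem,
 * castem, or killem action is assumed to remove exactly one resource,
 * and each feedem or helpem action to add exactly one. */
double res_count_change(double movem, double castem, double killem,
                        double feedem, double helpem){
    return (feedem + helpem) - (movem + castem + killem);
}
```

So, for instance, 4 killem actions and 1 feedem action project a net change of -3 resources.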
More writing and re-writing the manager genetic algorithm
I have completed an initial draft of the manager fitness function manager_fitness and its associated sub-functions policy_to_counts and sum_array_layers. The function manager_fitness might need to be pruned a bit by adding a third sub-function, as it's a bit long at the moment.
/* =============================================================================
 * This is a preliminary function that checks the fitness of each agent by
 * passing through a loop to payoffs_to_fitness
 * fitnesses: Array to order fitnesses of the agents in the population
 * population: array of the population that is made (malloc needed earlier)
 * pop_size: The size of the total population (layers to population)
 * ROWS: Number of rows in the COST and ACTION arrays
 * agent_array: The agent array
 * jaco: The jacobian matrix of resource and landscape interactions
 * interact_table: Lookup table for figuring out rows of jaco and types
 * interest_num: The number of rows and cols in jac, and rows in lookup
 * ========================================================================== */
void manager_fitness(double *fitnesses, double ***population, int pop_size,
                     int ROWS, double **agent_array, double **jaco,
                     int **interact_table, int interest_num, int agentID,
                     double ***COST, double ***ACTION, int COLS, int layers){
    int agent, i, action_row, manager_row, type1, type2, type3;
    double *count_change, foc_effect, change_dev, max_dev;
    double *dev_from_util, *utils, **merged_acts, **merged_costs, **act_change;

    count_change  = malloc(interest_num * sizeof(double));
    utils         = malloc(interest_num * sizeof(double));
    dev_from_util = malloc(pop_size * sizeof(double));
    merged_acts   = malloc(ROWS * sizeof(double *));
    for(i = 0; i < ROWS; i++){
        merged_acts[i] = malloc(COLS * sizeof(double));
    }
    merged_costs = malloc(ROWS * sizeof(double *));
    for(i = 0; i < ROWS; i++){
        merged_costs[i] = malloc(COLS * sizeof(double));
    }
    act_change = malloc(ROWS * sizeof(double *));
    for(i = 0; i < ROWS; i++){
        act_change[i] = malloc(COLS * sizeof(double));
    }
    sum_array_layers(ACTION, merged_acts, 0, ROWS, COLS, layers);
    sum_array_layers(COST, merged_costs, 1, ROWS, COLS, layers);
    max_dev = 0;
    for(agent = 0; agent < pop_size; agent++){
        change_dev = 0; /* Reset the deviation for each agent */
        for(action_row = 0; action_row < interest_num; action_row++){
            count_change[action_row] = 0; /* Initialise at zero */
            utils[action_row]        = 0; /* Same for utilities */
        }
        for(action_row = 0; action_row < interest_num; action_row++){
            manager_row = 0;
            if(population[action_row][0][agent] < -1){
                type1 = population[action_row][1][agent];
                type2 = population[action_row][2][agent];
                type3 = population[action_row][3][agent];
                /* Find the manager's row for this resource type */
                while(!(population[manager_row][0][agent] == agentID &&
                        population[manager_row][1][agent] == type1   &&
                        population[manager_row][2][agent] == type2   &&
                        population[manager_row][3][agent] == type3)
                ){
                    manager_row++;
                }
            }
            policy_to_counts(population, merged_acts, agent, merged_costs,
                             act_change, action_row, manager_row, COLS);
            foc_effect  = 0.0;
            foc_effect -= act_change[action_row][9]; /* See Issue #23 */
            foc_effect += act_change[action_row][10];
            foc_effect += act_change[action_row][11];
            for(i = 0; i < interest_num; i++){
                count_change[i] += foc_effect * jaco[action_row][i];
            }
            utils[action_row] = population[manager_row][4][agent];
        }
        for(i = 0; i < interest_num; i++){ /* Minimises dev from marg util */
            change_dev += (count_change[i]-utils[i])*(count_change[i]-utils[i]);
        }
        if(change_dev > max_dev){
            max_dev = change_dev;
        }
        dev_from_util[agent] = change_dev;
    }
    for(agent = 0; agent < pop_size; agent++){
        fitnesses[agent] = max_dev - dev_from_util[agent];
    }
    for(i = 0; i < ROWS; i++){
        free(act_change[i]);
    }
    free(act_change);
    for(i = 0; i < ROWS; i++){
        free(merged_costs[i]);
    }
    free(merged_costs);
    for(i = 0; i < ROWS; i++){
        free(merged_acts[i]);
    }
    free(merged_acts);
    free(dev_from_util);
    free(utils);
    free(count_change);
}
The policy_to_counts function feeds new actions back to the main manager fitness function based on the new costs imposed by managers. We assume that new actions are proportional to the percent increase or reduction in costs (e.g., twice as many killem actions if the manager makes it cost half as much). In cases where the cost drops to zero (debating whether I want this to be possible – probably not), we assume the new cost is 0.5 and calculate accordingly.
/* =============================================================================
 * This function updates a temporary action array for changes in policy
 * population: The population array of agents in the genetic algorithm
 * merged_acts: The action 2D array of summed elements across 3D ACTION
 * agent: The agent (layer) in the population being simulated
 * merged_costs: The mean cost paid for each element in the ACTION array
 * act_change: The array of predicted new actions given new costs
 * action_row: The row where the action and old costs are located
 * manager_row: The row where the new costs from the manager are located
 * COLS: The number of columns in the ACTION and COST arrays
 * ========================================================================== */
void policy_to_counts(double ***population, double **merged_acts, int agent,
                      double **merged_costs, double **act_change,
                      int action_row, int manager_row, int COLS){
    int col;
    double old_cost, new_cost, cost_change, new_action;

    for(col = 0; col < COLS; col++){
        old_cost = merged_costs[action_row][col];
        new_cost = population[manager_row][col][agent];
        if(new_cost == 0){
            new_cost = 0.5; /* Need to avoid Inf increase in cost somehow */
        }
        cost_change = old_cost / new_cost;
        new_action  = merged_acts[action_row][col] * cost_change;
        act_change[action_row][col] = floor(new_action);
    }
}
The function sum_array_layers is basically an apply function in R, except that it only works with the COST or ACTION arrays, and only in one dimension.
/* =============================================================================
 * This function sums (or averages) a row of COST or ACTION across all layers
 * array: The 3D array that is meant to be summed or averaged
 * out: The 2D array where the summed/average values are to be stored
 * get_mean: TRUE (1) or FALSE (0) indicating whether to get mean vs sum
 * ROWS: Number of rows in array
 * COLS: Number of cols in array
 * total_layers: How many layers there are in array (depth)
 * ========================================================================== */
void sum_array_layers(double ***array, double **out, int get_mean, int ROWS,
                      int COLS, int layers){
    int row, col, layer;

    for(row = 0; row < ROWS; row++){
        for(col = 0; col < COLS; col++){
            out[row][col] = 0; /* Initialise before accumulating */
            if(get_mean == 1){
                for(layer = 0; layer < layers; layer++){
                    out[row][col] += (array[row][col][layer] / layers);
                }
            }else{
                for(layer = 0; layer < layers; layer++){
                    out[row][col] += array[row][col][layer];
                }
            }
        }
    }
}
I have not tested any of these functions at all. They almost certainly contain some bugs at the moment, so a lot of work will be needed to debug them and make sure that they are actually doing what I want them to do. Tomorrow might be a good time for a thorough debugging and memory leak checks. If all this works, though, managers should be able to dynamically change costs in response to stake-holders to manage resources – once the appropriate call from manager.c is in place (it hasn't been coded yet, but this should be trivial to write). Note that the git history immediately prior to commit 79446e394133bb9e6b4792d334ab863e32ef0881 will show some attempts at getting the above functions working in different ways. I settled on the above after restructuring the code considerably for both speed and readability.
Linking manager marginal utilities and manager actions remains difficult, but I have decided on the following plan to move forward.
It will be useful to develop a very simple criterion for assessing the fitness of adjusting costs in strategy_fitness in game.c. Do this in the switch statement under case 1:, but include an if statement so that if(act_type == agentID), the genetic algorithm knows that it's affecting all other user actions in the -2 row. A new function policy_to_counts will be created in game.c, which takes in the ACTION and COST arrays. This new function will assume two things.
1. The proportion of +, -, and 0 actions (from the perspective of a stake-holder) will not change – i.e., stake-holders will try to achieve the same ends in the next time step as they did in the previous time step. The movem column will be defined as - if util_loc = 1 and util_land = 1, else it will be defined as 0 (again, from the stake-holder perspective; from the manager's perspective this is always 0 – at least, I can't think of any reason why we would want it not to be zero).
2. Stake-holders will invest in whatever +, -, or 0 action is least costly. Hence, to loosely predict stake-holder actions, the manager could simply assume that the stake-holder invests a proportion of their total budget in the least costly action, and, based on the manager's set cost, puts their budget into those actions accordingly. For example, if we had a farmer that wanted to increase crop yield by reducing resource abundance, and had to choose between movem, killem, and castem with costs of 10, 2, and 5, respectively, then the farmer would put all of their budget into killem (or a high proportion, at least). This requires the manager (whose actions are already set within the genetic algorithm) to get a proportion for each of the stake-holders' actions, then divvy their actions out based on the revised costs.
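This least-cost prediction rule can be sketched as follows (a hypothetical helper, not a GMSE function; the flat costs vector stands in for the COST array):

```c
#include <assert.h>

/* A stake-holder wanting fewer resources is assumed to dump their whole
 * budget into the cheapest '-' action. Returns the number of whole
 * actions affordable and records which column was chosen. */
int predict_actions(double budget, const double *costs, int n, int *chosen){
    int col, best = 0;
    for(col = 1; col < n; col++){ /* find the cheapest action column */
        if(costs[col] < costs[best]){
            best = col;
        }
    }
    *chosen = best;
    return (int)(budget / costs[best]); /* whole actions affordable */
}
```

With the costs from the farmer example (movem 10, killem 2, castem 5) and a budget of 100, the rule predicts 50 killem actions.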
The initial plan: getting something to work
Let's try all of the above again. We're trying to get from cost adjustments to fitness. We have the cost adjustments in hand; the manager population in the genetic algorithm is in the process of selecting which of these adjustments are best. The difficulty is now translating the cost adjustments into stake-holder actions, and figuring out just how good we want managers to be at assessing stake-holder actions. One extreme is to run the genetic algorithm within the manager's decision-making to figure out how stake-holders will respond to policy change with a high degree of accuracy; this would take a massive amount of computation time and be a bit unrealistic in that it would kind of assume that managers can read the minds of stake-holders.
Another extreme is to assume the sum total of each action will not change and to adjust costs accordingly. Perhaps, to start, we could define a new array within the new function policy_to_counts, **sum_actions, which would sum up all stake-holder actions for each resource type.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 10 | 10 | 10 | 32 | 2 | 16 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 301 | 10 | 10 | 4 | 1 | 0 | 0 | 1 | 1 |
The hypothetical sum_actions array above adds up all of the rows in the ACTION array where column 1 equals -2. For each resource we then get a picture of what is going to happen in the next generation (to some extent it is unrealistic to assume that managers have even this much detailed information, but then again, these are actions from the previous time step, so it's perhaps not too much of a stretch to assume that the manager has some idea of what actions were taken by stake-holders). We also get a picture of the sum utilities for each resource type. To project the consequences of manager cost adjustment, managers can compute the proportional change in cost (which will require reading the COST array into the fitness function) and assume that the proportion of actions changes accordingly. For example, if the manager makes it half as costly to killem for resource type1 = 1 above, then they could assume that killem will be 32 total actions in the next generation. These sum total actions, adjusted by manager changes in costs, could then be run through the interaction array to project the change in resource abundance – fitness could be assessed by minimising the difference between the projected change in resources and the marginal utilities.
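The proportional projection can be illustrated with the killem numbers above (a standalone sketch; the function name is mine, and the 0.5 floor for a zero cost follows the convention described for policy_to_counts):

```c
#include <assert.h>

/* Predicted actions scale with old_cost / new_cost; a zero new cost is
 * replaced by 0.5 so the ratio stays finite. */
double project_actions(double old_actions, double old_cost, double new_cost){
    if(new_cost <= 0.0){
        new_cost = 0.5; /* avoid an infinite predicted increase */
    }
    /* the cast truncates toward zero, flooring non-negative counts */
    return (double)((long)(old_actions * (old_cost / new_cost)));
}
```

Halving the cost of killem thus doubles the predicted total from 16 to 32 actions.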
This isn't perfect prediction. Sometimes stake-holders will probably radically change behaviour after some cost threshold is met, but I think this is kind of okay (at the very least, managers will respond in the next generation).
Other ideas
I will start coding the above plan, but there are probably other reasonable options to consider. I would like to also add the option of enacting policy via a second resource – representing resources as something like hunting licenses. The effects of these licenses could be understood through the interaction matrix (essentially, they'd be like introducing a predator, but one that goes through stake-holders). The manager could set the number of hunting licenses using the feedem (increases birth rate by action number) and castem (prevents a resource from reproducing) columns (licenses would otherwise have a birth rate and death rate of one, so each replaces itself in the next generation) – birth type would also need to be changed so that it is not selected from a random Poisson. The bankem column could be interpreted as buying a license, somehow.
Implementation of the initial plan
This was somewhat difficult because of the way that marginal utilities are handled in the manager.c file. A new vector was needed to grab the correct utilities and actions for adjusting costs, and it was easier and more readable to just write a separate manager_fitness function (it can still be called by non-managers, though I'm struggling to think of when this would be desirable). The manager_fitness function is unfinished.
/* =============================================================================
 * This is a preliminary function that checks the fitness of each agent by
 * passing through a loop to payoffs_to_fitness
 * fitnesses: Array to order fitnesses of the agents in the population
 * population: array of the population that is made (malloc needed earlier)
 * pop_size: The size of the total population (layers to population)
 * ROWS: Number of rows in the COST and ACTION arrays
 * agent_array: The agent array
 * jaco: The jacobian matrix of resource and landscape interactions
 * interact_table: Lookup table for figuring out rows of jaco and types
 * interest_num: The number of rows and cols in jac, and rows in lookup
 * ========================================================================== */
void manager_fitness(double *fitnesses, double ***population, int pop_size,
                     int ROWS, double **agent_array, double **jaco,
                     int **interact_table, int interest_num, int agentID){
    int agent, i, action_row, manager_row, type1, type2, type3;
    double change_dev, *count_change, *utilities;

    count_change = malloc(interest_num * sizeof(double));
    utilities    = malloc(interest_num * sizeof(double));
    for(agent = 0; agent < pop_size; agent++){
        for(i = 0; i < interest_num; i++){
            count_change[i] = 0; /* Initialise all count changes at zero */
            utilities[i]    = 0; /* Same for utilities */
        }
        action_row = 0;
        while(population[action_row][0][agent] < -1){
            type1 = population[action_row][1][agent];
            type2 = population[action_row][2][agent];
            type3 = population[action_row][3][agent];
            manager_row = 0;
            /* Find the manager's row for this resource type */
            while(!(population[manager_row][0][agent] == agentID &&
                    population[manager_row][1][agent] == type1   &&
                    population[manager_row][2][agent] == type2   &&
                    population[manager_row][3][agent] == type3)
            ){
                manager_row++;
            }
            action_row++; /* Move on to the next -2 (resource) row */
        }
        /* Get the marginal utilities into utilities by running
         * policy_to_counts, and get the count_change the same way. The above
         * runs through this for each agent and for each resource. Here,
         * still within the agent loop, we need to get the vectors summed
         * appropriately into a reasonable fitness metric (keeping in mind
         * that it's not just ordinal).
         */
        fitnesses[agent] = 0;
        for(i = 0; i < interest_num; i++){ /* Minimises dev from marg util */
            change_dev = (count_change[i] - utilities[i]) *
                         (count_change[i] - utilities[i]) + 1;
            fitnesses[agent] += (1 / change_dev);
        }
    }
    free(utilities);
    free(count_change);
}
Likewise, a sub-function that manager_fitness will call also needs some work.
/* =============================================================================
 * This function updates count change and utility arrays for changes in policy
 * population: The population array of agents in the genetic algorithm
 * interact_table: The lookup table for figuring out how resources interact
 * int_num: The number of rows and cols in jac, and rows in the lookup
 * utilities: A vector of the utilities of each resource/landscape level
 * agent: The agent in the population whose fitness is being assessed
 * layers: The number of layers (z dimension) in the COST and ACTION arrays
 * COST: The cost array, for comparison with how costs change with actions
 * ACTION: The action array to summarise current stake-holder actions
 * agentID: The ID of the agent doing policy (should probably always be 1)
 * ========================================================================== */
void policy_to_counts(double ***population, int **interact_table, int int_num,
                      double *utilities, int agent, int layers, double **jaco,
                      double *count_change, double ***COST, double ***ACTION,
                      int agentID, int ROWS, int action_row, int manager_row){
    int col, layer, i;
    double old_cost, new_cost, cost_change, new_action, mean_cost, sum_actions;
    double *hold_actions;

    hold_actions = malloc(13 * sizeof(double));
    for(i = 0; i < 13; i++){
        hold_actions[i] = population[action_row][i][agent];
    }
    for(col = 7; col < 13; col++){
        sum_actions = 0;
        mean_cost   = 0;
        for(layer = 0; layer < layers; layer++){
            sum_actions += ACTION[action_row][col][layer];
            mean_cost   += (COST[action_row][col][layer] / layers);
        }
        old_cost = mean_cost;
        new_cost = population[manager_row][col][agent];
        if(new_cost == 0){
            new_cost = 0.5; /* Avoid dividing by a zero cost */
        }
        cost_change = old_cost / new_cost;
        new_action  = sum_actions * cost_change;
        population[action_row][col][agent] = floor(new_action);
    }
    res_to_counts(population, interact_table, int_num, count_change, utilities,
                  jaco, action_row, agent);
    for(i = 0; i < 13; i++){
        population[action_row][i][agent] = hold_actions[i];
    }
    free(hold_actions);
}
The history of struggling with these two functions in a way that is accurate, readable, and efficient is in the git history. I’ll consider both functions with fresh eyes tomorrow with the goal of getting something working.
We have now reached a point where we have a clear link from manager utility to a manager's desired change in resources. The util column of a manager (layer = 1) action array defines how many resources of a particular type the manager wants there to be when column 1 equals -2 (added below for clarity).
-2.000000 1.000000 0.000000 0.000000 100.000000 1.000000 1.000000
-1.000000 1.000000 0.000000 0.000000 1.000000 1.000000 1.000000
1.000000 1.000000 0.000000 0.000000 -330.696014 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
From this util value and the estimated abundance from the observation model, we can get to the marginal utility, which is placed in the same action array layer where column 1 equals the manager ID. We now need this value to have some effect; e.g., in the above, where the population size is 330 individuals more than the manager wants, the manager needs to adjust the cost array in some way that has the predicted effect of lowering population size by roughly this amount. The way that the genetic algorithm can learn to do this is by assuming that the action array (which will have been the actions run in the last user model) represents what stake-holders will do when constrained appropriately by costs. So, for example, we can consider the ACTION array below.
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 100 1 1 0 0 0 0 20 0
[2,] -1 1 0 0 1 1 1 0 0 1 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 1
[4,] 2 1 0 0 0 0 0 0 1 1 0 0 0
[5,] 3 1 0 0 0 0 0 0 0 0 0 0 0
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 1 1 1 81 0 0 0 0 0
[2,] -1 1 0 0 100 1 1 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 0 0 0
[5,] 3 1 0 0 0 0 0 0 0 0 0 0 0
, , 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 1 1 1 72 2 0 0 0 1
[2,] -1 1 0 0 100 1 1 1 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 1 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 1 0 0
[5,] 3 1 0 0 0 0 0 0 0 1 0 0 0
The first layer is the manager, and the second two are stake-holders that have util = 100 for landscape layer 1, and util = 1 for the resource defined by type1 = 1, type2 = 0, and type3 = 0. In the above example, the resource might be geese disturbing crops, and the stake-holders might be farmers. In both cases, the stake-holders devote all or nearly all of their budget to moving the resource (column 8 corresponds to movem). The manager can then project the total number of resources increased or decreased by these actions, and – perhaps eventually – whose land they will be on (the code is there for the manager to prefer them on public or private land, but this might need to be implemented later). Assuming that the manager just cares about total resource abundance for now, they should be able to recognise that movem will not decrease total resource abundance; hence, the manager might prefer stake-holders to switch to killem (column 10).
It’s this switch that is a challenge for the model. It’s easy now to have the manager recognise that stake-holder actions are not optimal in terms of policy – more actions are being devoted to something that doesn’t kill resources, and more would be better placed by increasing stake-holder column 10 values. Still, how much the manager should lower costs to get the desired abundance is unclear (and even more so if we were to add multiple resources). The manager can’t exactly use the stake-holder utilities for the resource per se either, because the actions are determined by how resources interact; we also don’t want to run a sub-genetic algorithm for the manager to anticipate stake-holder actions, as this would end up being computationally intense.
Perhaps the manager should simply recognise the plus-minus-neutral effects of each column (from columns 8-13 above, 0 - - + + 0). This gets part of the way there; if the manager wants resources killed, then they could crank up the costs associated with all 0 and + actions (though perhaps this shouldn't be allowed for bankem, as it would effectively prohibit stake-holders from inaction). The magnitude of costs for - actions such as killem or castem could then be decided by assuming that stake-holders would transfer their - and 0 actions (again, excluding bankem) to whichever - column has the lowest cost.
Maybe managers should make a judgement a priori about what stake-holders are trying to do, classifying them as either wanting more or less of a particular resource. Wanting less of a resource would be associated with high values in the action columns for killem and castem, and also movem, but only if u_loc = 1 and u_land = 1 for the stake-holder (i.e., if the desire and ability to move resources depends on the resource being on their land). Wanting more of a resource would be associated with high values in the action columns for feedem and helpem. The manager could then assume that stake-holders would allocate their total budget of actions proportionally to -, +, or 0 columns, but without discrimination between columns. It could be easy to summarise budget and action totals as in the table below.
Action type | Total budget |
---|---|
Increasing | 500 |
Decreasing | 200 |
Neutral | 300 |
In the above, a net 300 more resources would appear if the manager does nothing (ignoring the resource model and the consequences of carrying capacity for the moment). Note that the actions should really be run through the interaction array so that interactions between two resources could be hypothetically projected. This wouldn't be much extra work – the increase in a resource could just be multiplied by the appropriate column in the Jacobian matrix. Also note that costs are only one way to adjust resources – another would be having something like licenses to kill be a resource that stake-holders might want to buy; the manager could make more of these, and they could themselves be modelled dynamically, affecting the Jacobian matrix.
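The projection from the table above can be sketched as follows (the function name and scalar Jacobian element are illustrative; the real calculation would loop over a full row of the interaction matrix):

```c
#include <assert.h>

/* Net resource change is the increasing minus the decreasing action
 * budgets, scaled by one element of the interaction (Jacobian) matrix;
 * neutral actions contribute nothing. */
double project_net_change(double increasing, double decreasing,
                          double jaco_element){
    double foc_effect = increasing - decreasing; /* neutral ignored */
    return foc_effect * jaco_element;
}
```

With the table's totals (500 increasing, 200 decreasing) and a direct effect of 1, the projected net change is the 300 noted above; an interacting resource with a Jacobian element of -0.5 would be projected to lose 150.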
Manager model looped resources and mark-recapture
The density estimate within the manager model manager.c now returns accurate abundances for multiple resources (storing them in a vector est_abun). This was confirmed by shutting off the user function (see Issue #22).
Additionally, the mark-recapture estimator has been successfully initialised in manager.c, and works for multiple resources. There was a bit of a hiccup here because the test printouts of abundances were consistently different from what was seen in the R plot. This turned out to be a minor error in R, not C (one too many columns were being read in R to estimate resources marked). By fixing the error in R, both R and C estimates now match and are accurate. The next step is to return estimates for transect-based sampling abundances.
The mark-recapture analysis uses Chapman estimation, which is calculated in two functions. The function rmr_est calls chapman_est for each individual resource, inputting the results into the abun_est vector.
/* =============================================================================
 * This function calculates mark-recapture-based (Chapman) abundance estimates
 * obs_array: The observation array
 * para: A vector of parameters needed to handle the obs_array
 * obs_array_rows: Number of rows in the observation array obs_array
 * obs_array_cols: Number of cols in the observation array obs_array
 * abun_est: Vector where abundance estimates for each type are placed
 * interact_table: Lookup table to get all types of resource values
 * int_table_rows: The number of rows in the interact_table
 * trait_number: The number of traits in the resource array
 * ========================================================================== */
void rmr_est(double **obs_array, double *para, int obs_array_rows,
             int obs_array_cols, double *abun_est, int **interact_table,
             int int_table_rows, int trait_number){
    int resource, type1, type2, type3;
    double estimate;

    for(resource = 0; resource < int_table_rows; resource++){
        abun_est[resource] = 0;
        if(interact_table[resource][0] == 0){ /* Change when turn off type? */
            type1 = interact_table[resource][1];
            type2 = interact_table[resource][2];
            type3 = interact_table[resource][3];
            estimate = chapman_est(obs_array, para, obs_array_rows,
                                   obs_array_cols, trait_number, type1, type2,
                                   type3);
            abun_est[resource] = estimate;
        }
    }
}
The function chapman_est itself does all of the maths for estimating population abundance from mark-recapture data in the OBSERVATION array.
/* =============================================================================
* This function calculates RMR (chapman) for one resource type
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* obs_array_rows: Number of rows in the observation array obs_array
* obs_array_cols: Number of cols in the observation array obs_array
* trait_number: The number of traits in the resource array
* type1: Resource type 1
* type2: Resource type 2
* type3: Resource type 3
* ========================================================================== */
double chapman_est(double **obs_array, double *para, int obs_array_rows,
int obs_array_cols, int trait_number, int type1, int type2,
int type3){
int row, col;
int total_marks, recaptures, mark_start, recapture_start;
int *marked, sum_marked, n, K, k;
double estimate, floored_est;
total_marks = (int) para[11];
recaptures = (int) para[10];
mark_start = trait_number + 1;
recapture_start = mark_start + (total_marks - recaptures);
if(total_marks < 2 || recaptures < 1){
printf("ERROR: Not enough marks or recaptures for management");
return 0;
}
n = 0;
marked = malloc(obs_array_rows * sizeof(int));
for(row = 0; row < obs_array_rows; row++){
marked[row] = 0;
if(obs_array[row][1] == type1 &&
obs_array[row][2] == type2 &&
obs_array[row][3] == type3
){
for(col = mark_start; col < recapture_start; col++){
if(obs_array[row][col] > 0){
marked[row] = 1;
n++;
break;
}
}
}
}
K = 0;
k = 0;
for(row = 0; row < obs_array_rows; row++){
if(obs_array[row][1] == type1 &&
obs_array[row][2] == type2 &&
obs_array[row][3] == type3
){
for(col = recapture_start; col < obs_array_cols; col++){
if(obs_array[row][col] > 0){
K++;
if(marked[row] > 0){
k++;
}
break;
}
}
}
}
estimate = ((double)(n + 1) * (K + 1) / (k + 1)) - 1.0;
floored_est = floor(estimate);
free(marked);
return floored_est;
}
No confidence intervals are calculated at the moment, since I’m not sure how the simulated manager would use the uncertainty, but if we eventually want real people to be able to ‘play’ the game as managers, then it shouldn’t be too difficult to add confidence intervals to all population size estimates within the C functions of manager.c.
Transect estimation of resource abundances
Manager estimation of abundances collected from transect-type sampling (i.e., case 2 and case 3) is considerably easier than the density-based and mark-recapture metrics. The number of times a resource is observed is simply stored in the 12th column (in C; 13th in R) of the observation matrix. The transect_est function does the job for any number of resources all in one go.
/* =============================================================================
* This function calculates transect-based abundance estimates
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* obs_array_rows: Number of rows in the observation array obs_array
* abun_est: Vector where abundance estimates for each type are placed
* interact_table: Lookup table to get all types of resource values
* int_table_rows: The number of rows in the interact_table
* ========================================================================== */
void transect_est(double **obs_array, double *para, int obs_array_rows,
double *abun_est, int **interact_table, int int_table_rows){
int resource, observation, type1, type2, type3;
for(resource = 0; resource < int_table_rows; resource++){
abun_est[resource] = 0;
if(interact_table[resource][0] == 0){ /* Change when turn off type? */
type1 = interact_table[resource][1];
type2 = interact_table[resource][2];
type3 = interact_table[resource][3];
for(observation = 0; observation < obs_array_rows; observation++){
if(obs_array[observation][1] == type1 &&
obs_array[observation][2] == type2 &&
obs_array[observation][3] == type3
){
abun_est[resource] += obs_array[observation][12];
}
}
}
}
}
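Since transect estimation is just a tally, a toy version is easy to sketch in isolation. The types and counts arrays below are hypothetical stand-ins for the observation array’s type1 column and its count column; the function name is illustrative:

```c
/* Sum the observation counts for rows matching one resource type, as
 * transect_est does for each row of the interaction table. 'types' stands
 * in for the type1 column; 'counts' for the times-observed column. */
double transect_total(const int *types, const double *counts, int rows,
                      int type1){
    double total = 0.0;
    int row;
    for(row = 0; row < rows; row++){
        if(types[row] == type1){
            total += counts[row]; /* Accumulate counts for matching rows */
        }
    }
    return total;
}
```

For observations of types {1, 1, 2, 1} with counts {3, 2, 5, 1}, the type-1 estimate is 3 + 2 + 1 = 6.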
Abundances now need to be compared to manager utilities. For now, I’m just going to assume that the agent with agentID = 1 is the head manager (other type1 = 0 agents can be ‘managers’ collecting data, but I don’t see how or why we would want multiple managers bargaining over resources with different util values; not yet at least, and probably not ever).
Getting marginal utilities for management and putting them in
ACTION
Back to the big picture, I have now finished the first five (easiest) of the tasks below.
/* 1. Get summary statistics for resources from the observation array */
/* 2. Place estimated resource abundances in a vector the size of int_d0 */
/* 3. Initialise new vector of size int_d0 with temp utilities of manager */
/* 4. Subtract abundances from temp utilities to get marginal utilities */
/* 5. Insert the marginal utilities into the agent = 1 col1 of ACTION */
/* 6. Run the genetic algorithm (add extension to interpret cost effects) */
/* 7. Put in place the new ACTION array from 6 */
/* 8. Adjust the COST array appropriately from the new manager actions */
Essentially, the manager.c function now gets estimates for the abundances of each resource, then places those estimates in a temporary vector. Elements in this vector (corresponding to resource abundances) are then subtracted from the manager’s utility values (corresponding to desired resource counts). What’s left is the marginal utility of resources: if there are more resources than the manager desires, then the marginal utility is negative, and if there are fewer, then the marginal utility is positive. The marginal utility is then placed back into the first layer of the ACTION array (corresponding to the manager) where column 1 equals 1 (i.e., interpreted as actions of the manager affecting their own costs; the existing values aren’t really being used because the concept doesn’t make a lot of sense, and they are really just place-holders for where values mean things in other layers of the array). Hence the util column then includes values for the ideal resource abundance (where column 1 equals -2; util is in column 5) and the marginal utility given estimated resource abundance (where column 1 equals 1). See below.
-2.000000 1.000000 0.000000 0.000000 100.000000 1.000000 1.000000
-1.000000 1.000000 0.000000 0.000000 1.000000 1.000000 1.000000
1.000000 1.000000 0.000000 0.000000 -330.696014 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000
In the above, the manager sees 100 as the ideal population size, but there are ca. 430 resources of type1 = 1, type2 = 0, type3 = 0 in the population. Hence the manager would like to see ca. 330 fewer of these kinds of resources. The -330.696014 printed from a test simulation above will allow the genetic algorithm to adjust the COST array accordingly, decreasing COST columns that correspond to the killing or castrating (but not moving, I suppose) of resources.
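The marginal-utility step described above is a single subtraction; a minimal sketch (the function name is mine, and the numbers come from the printout above):

```c
/* Marginal utility: desired abundance (util) minus estimated abundance.
 * Negative means the manager wants fewer of the resource; positive, more. */
double marginal_utility(double util, double abun_est){
    return util - abun_est;
}
```

With util = 100 and an estimated abundance of 430.696014, the marginal utility is -330.696014, matching the value placed in the ACTION array above.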
Resolved Issue: #20
Issue: #20 has now been resolved. The res_type has now been removed from the observation model, and observations simply occur for all unique resource types; if some are not needed, then they are not analysed. Doing this required very little modification for transect-type sampling (case 2 and case 3), but considerably more for density-based sampling (case 0) and especially mark-recapture (case 1). In these cases, I decided to split the sampling functions up more clearly. Testing revealed some initial errors, but these were ironed out and fixed. Currently, the correct OBSERVATION array is returned, although this array is not analysed correctly when plotting in R for more than one resource (the code for this has not yet been written).
NOTE: there is no code written to ignore subdivisions yet. I’m not sure whether or not we’ll actually want this, but the option could simply be something placed in para and checked in the subfunctions in observation.c.
Major changes to observation.c
In resolving Issue: #20, I have re-worked the code in observation.c to be more readable. Previously, the switch(methods) in the main observation function called either density-based estimation or mark-recapture, but both of these functions, confusingly, called the same mark_res sub-function; now mark_res is only called for density-based estimation. Hence, each method of observation has its own (considerably smaller) sub-function, each of which calls another sub-function. For example, with density-based estimation, the following function is called times_obs times.
/* =============================================================================
* Density method of estimation
* ===========================================================================*/
/* =============================================================================
* This simulates agents observing and marking resources on the landscape
* Inputs include:
* resource_array: data frame of resources to be marked and/or recaptured
* agent_array: data frame of agents, potentially doing the marking
* land: The landscape on which interactions occur
* paras: vector of parameter values
* res_rows: Total number of resources that can be sampled
* a_row: Total number agents that could possibly sample
* obs_col: The number of columns in the observational array
* a_type: The type of agent that is doing the marking
* by_type: The type column that is being used
* find_type: The type of finding that observers do (view-based or rand)
* Output:
* Accumulated markings of resources by agents
* ========================================================================== */
void mark_res(double **resource_array, double **agent_array, double ***land,
double *paras, int res_rows, int a_row, int obs_col, int a_type,
int by_type, int find_type){
int agent;
int edge; /* How does edge work? (Affects agent vision & movement) */
int ldx, ldy;
int move_t;
int sample_num; /* Times resources observed during one time step */
edge = (int) paras[1]; /* What type of edge is on the landscape */
sample_num = (int) paras[11];
ldx = (int) paras[12]; /* dimensions of landscape -- x and y */
ldy = (int) paras[13];
move_t = (int) paras[14]; /* Type of movement being used */
for(agent = 0; agent < a_row; agent++){
if(agent_array[agent][by_type] == a_type){
mark_in_view(resource_array, agent_array, paras, res_rows, agent,
find_type, obs_col);
}
if(sample_num > 1){
a_mover(agent_array, 4, 5, 6, edge, agent, land, ldx, ldy, move_t);
}
}
}
The above function calls mark_in_view
, which marks all
resources in the agent’s view (regardless of type, which will get sorted
out later).
/* =============================================================================
* This simulates an individual agent doing some field work (observing)
* Inputs include:
* resource_array: data frame of resources to be marked and/or recaptured
* agent_array: data frame of agents, potentially doing the marking
* paras: vector of parameter values
* res_rows: Total number of rows in the res_adding data frame
* worker: The row of the agent that is doing the working
* find_proc: The procedure used for finding and marking resources
* obs_col: The number of columns in the observation array
* Output:
* The resource_array is marked by a particular agent
* ========================================================================== */
void mark_in_view(double **resource_array, double **agent_array, double *paras,
int res_rows, int worker, int find_proc, int obs_col){
int xloc; /* x location of the agent doing work */
int yloc; /* y location of the agent doing work */
int view; /* The 'view' (sampling range) around agent's location */
int edge; /* What type of edge is being used in the simulation */
int resource; /* Index for resource array */
int r_x; /* x location of a resource */
int r_y; /* y location of a resource */
int seeme; /* Test if observer sees/captures the resource */
int ldx; /* Landscape dimension on the x-axis */
int ldy; /* Landscape dimension on the y-axis */
int EucD; /* Is vision based on Euclidean distance? */
double min_age; /* Minimum age at which sampling can occur */
xloc = (int) agent_array[worker][4];
yloc = (int) agent_array[worker][5];
view = (int) agent_array[worker][8];
edge = (int) paras[1];
ldx = (int) paras[12];
ldy = (int) paras[13];
EucD = (int) paras[20];
min_age = paras[16];
for(resource = 0; resource < res_rows; resource++){
if(resource_array[resource][11] >= min_age){
r_x = resource_array[resource][4];
r_y = resource_array[resource][5];
seeme = binos(xloc, yloc, r_x, r_y, edge, view, ldx, ldy, EucD);
agent_array[worker][10] += seeme;
resource_array[resource][obs_col] += seeme;
resource_array[resource][12] += seeme;
}
}
}
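The binos function itself is not shown here; a heavily simplified view test of the kind it performs might look like the following (this is a hypothetical sketch assuming Euclidean vision only; the real binos also handles landscape edges and the non-Euclidean vision option):

```c
/* Hypothetical simplification of a binos-style check: is the resource at
 * (r_x, r_y) within 'view' cells of the agent at (xloc, yloc), by
 * Euclidean distance? Edge wrapping and other vision types are ignored. */
int in_view(int xloc, int yloc, int r_x, int r_y, int view){
    int dx = xloc - r_x;
    int dy = yloc - r_y;
    return (dx * dx + dy * dy) <= (view * view); /* 1 if seen, else 0 */
}
```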
The mark-recapture technique, in contrast, calls the new function sample_fixed_res once (times_obs is taken care of in the sub-function).
/* =============================================================================
* Mark re-capture method of estimation
* ===========================================================================*/
/* =============================================================================
* This simulates the capture-mark-recapture of a resource type
* Inputs include:
* resource_array: data frame of resources to be marked and/or recaptured
* agent_array: data frame of agents, potentially doing the marking
* land: The landscape on which interactions occur
* paras: vector of parameter values
* lookup: The table listing resources and landscape layers to lookup
* res_rows: Total number of resources that can be sampled
* agent_number: Total number of agents in the agent array
* a_type: The type of agent that is doing the marking
* trait_number: The number of traits (columns) of the resource array
* lookup_rows: The number of rows in the lookup table
* Output:
* Accumulated markings of resources by agents
* ========================================================================== */
void sample_fixed_res(double **resource_array, double **agent_array,
double ***land, double *paras, int **lookup, int res_rows,
int agent_number, int a_type, int trait_number,
int lookup_rows){
int edge_type, move_type, fixed_sample, times_obs, move_res, by_type;
int land_x, land_y;
int obs_iter, agent;
int row, type1, type2, type3;
edge_type = (int) paras[1];
move_type = (int) paras[2];
fixed_sample = (int) paras[10];
land_x = (int) paras[12];
land_y = (int) paras[13];
by_type = (int) paras[17];
move_res = (int) paras[19];
if(fixed_sample < 1){
printf("ERROR: Fixed sample must be >= 1 \n ... Making = 1 \n");
paras[10] = 1;
fixed_sample = 1;
}
for(row = 0; row < lookup_rows; row++){
if(lookup[row][0] == 0){
obs_iter = trait_number + 1;
times_obs = (int) paras[11];
type1 = lookup[row][1];
type2 = lookup[row][2];
type3 = lookup[row][3];
while(times_obs > 0){
for(agent = 0; agent < agent_number; agent++){
if(agent_array[agent][by_type] == a_type){
mark_fixed(resource_array, agent_array, paras, res_rows,
agent, obs_iter, type1, type2, type3);
}
}
obs_iter++;
times_obs--;
if(move_res == 1){ /* Move resources if need for new sample */
res_mover(resource_array, 4, 5, 6, res_rows, edge_type,
land, land_x, land_y, move_type);
}
}
}
}
}
The sub-function mark_fixed
marks a fixed number of a
specific type of resource.
/* =============================================================================
* This simulates an individual agent marking a fixed number of resources
* Inputs include:
* resource_array: data frame of resources to be marked and/or recaptured
* agent_array: data frame of agents, potentially doing the marking
* paras: vector of parameter values
* res_rows: Total number of rows in the res_adding data frame
* worker: The row of the agent that is doing the working
* obs_col: The number of columns in the observation array
* type1: Resource type 1 being marked
* type2: Resource type 2 being marked
* type3: Resource type 3 being marked
* Output:
* Specific resources in resource_array are marked by a particular agent
* ========================================================================== */
void mark_fixed(double **resource_array, double **agent_array, double *paras,
int res_rows, int worker, int obs_col, int type1, int type2,
int type3){
int xloc; /* x location of the agent doing work */
int yloc; /* y location of the agent doing work */
int view; /* The 'view' (sampling range) around agent's location */
int edge; /* What type of edge is being used in the simulation */
int resource; /* Index for resource array */
int r_x; /* x location of a resource */
int r_y; /* y location of a resource */
int seeme; /* Test if observer sees/captures the resource */
int ldx; /* Landscape dimension on the x-axis */
int ldy; /* Landscape dimension on the y-axis */
int fixn; /* If procedure is to sample a fixed number; how many? */
int count; /* Index for sampling a fixed number of resource */
int sampled; /* The resource randomly sampled */
int type_num; /* Number of the type of resource to be fixed sampled */
int EucD; /* Is vision based on Euclidean distance? */
double sampl; /* Random uniform sampling of a resource */
double min_age; /* Minimum age at which sampling can occur */
xloc = (int) agent_array[worker][4];
yloc = (int) agent_array[worker][5];
view = (int) agent_array[worker][8];
edge = (int) paras[1];
ldx = (int) paras[12];
ldy = (int) paras[13];
EucD = (int) paras[20];
min_age = paras[16]; /* Kept as a double, matching mark_in_view */
fixn = (int) paras[10];
type_num = 0;
for(resource = 0; resource < res_rows; resource++){
if(resource_array[resource][1] == type1 &&
resource_array[resource][2] == type2 &&
resource_array[resource][3] == type3 &&
resource_array[resource][11] >= min_age
){
type_num++;
}
}
if(type_num > fixn){ /* If more resources than the sample number */
/* Temp tallies are used here to sample without replacement */
for(resource = 0; resource < res_rows; resource++){
if(resource_array[resource][1] == type1 &&
resource_array[resource][2] == type2 &&
resource_array[resource][3] == type3
){
resource_array[resource][13] = 0; /* Start untallied */
}
}
count = fixn;
sampl = 0;
while(count > 0){
do{ /* Find an un-tallied resource in the array */
sampl = runif(0, 1) * res_rows;
sampled = (int) sampl;
} while(resource_array[sampled][13] == 1 ||
resource_array[sampled][1] != type1 ||
resource_array[sampled][2] != type2 ||
resource_array[sampled][3] != type3 ||
resource_array[sampled][11] < min_age ||
sampled == res_rows /* In case sample returns 1 */
);
resource_array[sampled][obs_col]++; /* Marks accumulate */
resource_array[sampled][12]++;
resource_array[sampled][13] = 1; /* Tally is noted */
count--;
}
agent_array[worker][10] += fixn;
}else{ /* Else all of the resources should be marked */
for(resource = 0; resource < res_rows; resource++){
if(resource_array[resource][1] == type1 &&
resource_array[resource][2] == type2 &&
resource_array[resource][3] == type3 &&
resource_array[resource][11] >= min_age
){
resource_array[resource][obs_col]++; /* Mark all */
resource_array[resource][12]++;
}
}
agent_array[worker][10] += type_num; /* All resources marked */
}
}
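The tally-based sampling in the while loop above can be sketched in isolation. This toy version uses C’s rand() in place of R’s runif() and drops the type and age filters, so it is only a sketch of the rejection idea, not the GMSE code:

```c
#include <stdlib.h>

/* Sample fixn distinct rows without replacement by rejection, as in
 * mark_fixed: draw a random row, reject it if already tallied, and stop
 * once fixn rows are marked. The caller must guarantee rows > fixn. */
void sample_fixed(int *marks, int *tally, int rows, int fixn){
    int count, sampled;
    count = fixn;
    while(count > 0){
        do{ /* Find an un-tallied row */
            sampled = rand() % rows;
        } while(tally[sampled] == 1);
        marks[sampled]++;   /* Marks accumulate */
        tally[sampled] = 1; /* Tally is noted: no replacement */
        count--;
    }
}
```

After a call, exactly fixn rows carry one mark each, which is the invariant the real function relies on when adding fixn to the agent’s tally.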
This still isn’t the most readable code, but it’s better than what it was before. Eventually, I would prefer to get rid of as many of the function arguments as possible and place these as elements within para. The identities of para elements could then be explained in the function more clearly (and consistently). This would lead to less bulky functions and a bit clearer structure for the code, but I think it can be implemented later as the code is given a more general clean-up.
Issue: #22 User function crashes with multiple resources
For some reason, the user.c
call appears to crash when
there is more than one resource. I only noticed this after the overhaul
of the observation model, but I doubt they are related. More likely, one
of the many arrays with dimensions that depend on resource number is
being built or called improperly, leading to a segmentation fault. This
should be fixed, of course, but I suspect the problem is not too far
buried.
Next steps still to get abundance estimates from the observation array in manager.c
The above issues were progress, but they have set back the pace of the manager model a bit. The next item on the agenda is still to get individual abundance estimates for each unique resource type in the manager model. The density estimate is completed, and the whole thing should work because the code is already written to loop through the interaction table and get abundance estimates for each unique resource. This should be tested by including more than one resource and turning off the user function (see Issue: #22). Once it works, then I need to do the same thing for mark-recapture and transect-based estimates of abundance. Then, I’ll try to get through items 2-5 from the list from Monday.
Issue: #20 Remove res_type from observation model
Currently, the observation model only records resources into the observation array if they are of a particular type1, which is specified in the para vector and used to produce a data array with only one type of specified resource. Originally, this seemed like a good idea, but after spending some initial time writing the management model, I don’t think there is any need nor good reason to restrict observation to a specific resource type. Instead, all types should be marked and moved to the observation array. Then, if management analysis wants statistics for only one type of resource, it’s very easy to use an if statement to check that the type is appropriate. It’s much easier to ignore parts of the array than to make more than one array when needed through multiple calls of the observation function.
To fix this, it shouldn’t be much more than a simple removal of specifying res_type values in observation.c. When there is only one resource type, all calculations should proceed normally, but when more resources are introduced, an if is needed for both management and plotting (different groups of resources could even be made, ignoring subdivisions, by skipping the if when the type specified to look at equals -1).
Issue: #21 Improve code readability using para
Originally, I had the idea to use the global vector para as a way of storing information easily and using it across all of the models. The vector para would store key information about pretty much everything, then be dynamically updated as need be from higher-level functions in the model. In the last two months of coding, I have been specifying parameter names in functions explicitly, which has made sense during the coding process for my own writing, but it will be beneficial to clean all of this up later by reading para into these sub-functions, which otherwise sometimes have about a dozen arguments. Most functions would then have considerably fewer arguments, and the variables stored as elements in para could be immediately defined within sub-functions and used by name thereafter. The whole program would then have a similar feel of reading in key arrays and vectors and then specifying the key variables within sub-functions.
1. Get summary statistics for resources from the observation array
I have now written the functions for case 0 (density-based estimation) getting abundance estimates from the observation array, as outlined yesterday. This took slightly longer than anticipated because it turns out that there was a minor error in R’s estimation of abundances due to an incorrect column being summed. I had to figure out why C and R did not agree on the same abundance estimates; they now do (i.e., both independent codings get the same estimate from the same observation array). I now need to do the other three types of observation and get abundance estimates from them. In the highest-level function for this part of the manager model, the estimate_abundances function is called.
/* =============================================================================
* This function uses the observation array to estimate resource abundances
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* interact_table: Lookup table to get all types of resource values
* agent_array: Agent array, including managers (agent type 0)
* agents: Total number of agents (rows) in the agents array
* obs_x: Number of rows in the observation array
* obs_y: Number of cols in the observation array
* abun_est: Vector where abundance estimates for each type are placed
* int_table_rows: The number of rows in the interact_table
* ========================================================================== */
void estimate_abundances(double **obs_array, double *para, int **interact_table,
double **agent_array, int agents, int obs_x, int obs_y,
double *abun_est, int int_table_rows){
int estimate_type, recaptures;
estimate_type = (int) para[8];
switch(estimate_type){
case 0:
dens_est(obs_array, para, agent_array, agents, obs_x, obs_y,
abun_est, interact_table, int_table_rows);
break;
case 1:
recaptures = (int) para[10];
break;
case 2:
break;
case 3:
break;
default:
break;
}
}
The above function will call sub-functions based on estimate type (0 to 3). Currently, only the density function dens_est has been written and tested.
/* =============================================================================
* This function calculates density-based abundance estimates
* obs_array: The observation array
* para: A vector of parameters needed to handle the obs_array
* agent_array: Agent array, including managers (agent type 0)
* agents: Total number of agents (rows) in the agents array
* obs_array_rows: Number of rows in the observation array obs_array
* obs_array_cols: Number of cols in the observation array obs_array
* abun_est: Vector where abundance estimates for each type are placed
* interact_table: Lookup table to get all types of resource values
* int_table_rows: The number of rows in the interact_table
* ========================================================================== */
void dens_est(double **obs_array, double *para, double **agent_array,
int agents, int obs_array_rows, int obs_array_cols,
double *abun_est, int **interact_table, int int_table_rows){
int i, resource;
int view, a_type, land_x, land_y, type1, type2, type3;
int vision, area, cells, times_obs, tot_obs;
double prop_obs, estimate;
a_type = (int) para[7]; /* What type of agent does the observing */
times_obs = (int) para[11];
land_x = (int) para[12];
land_y = (int) para[13];
view = 0;
for(i = 0; i < agents; i++){
if(agent_array[i][1] == a_type){
view += agent_array[i][8];
}
}
vision = (2 * view) + 1;
area = vision * vision * times_obs;
cells = land_x * land_y; /* Plus one needed for zero index */
tot_obs = 0;
for(resource = 0; resource < int_table_rows; resource++){
abun_est[resource] = 0;
if(interact_table[resource][0] == 0){ /* Change when turn off type? */
type1 = interact_table[resource][1];
type2 = interact_table[resource][2];
type3 = interact_table[resource][3];
tot_obs = res_obs(obs_array, obs_array_rows, obs_array_cols, type1,
type2, type3);
prop_obs = (double) tot_obs / area;
estimate = prop_obs * cells;
abun_est[resource] = estimate;
}
}
}
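The extrapolation in dens_est reduces to the proportion of sampled cells with observations, scaled up to the whole landscape; a sketch with hypothetical numbers (the function name is mine):

```c
/* Density-based extrapolation as in dens_est: tot_obs observations over an
 * effective sampled area of ((2*view)+1)^2 cells per observation occasion,
 * scaled up to the full land_x * land_y landscape. */
double density_estimate(int view, int times_obs, int land_x, int land_y,
                        int tot_obs){
    int vision = (2 * view) + 1;              /* Width of viewing window  */
    int area   = vision * vision * times_obs; /* Total cells sampled      */
    int cells  = land_x * land_y;             /* Cells on the landscape   */
    return ((double) tot_obs / area) * cells;
}
```

For example, with view = 4 and one observation occasion, vision = 9 and area = 81; seeing 81 resources on a 100 x 100 landscape therefore extrapolates to an estimate of 10000.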
The dens_est function above calls the res_obs function, which returns the number of observations for a specific resource type.
/* =============================================================================
* This function counts the total observations of one resource type
* obs_array: The observation array
* obs_rows: Number of rows in the observation array obs_array
* obs_cols: Number of cols in the observation array obs_array
* type1: Resources of type 1 being observed
* type2: Resources of type 2 being observed
* type3: Resources of type 3 being observed
* ========================================================================== */
int res_obs(double **obs_array, int obs_rows, int obs_cols, int type1,
int type2, int type3){
int i, j, obs_count;
obs_count = 0;
for(i = 0; i < obs_rows; i++){
if( (obs_array[i][1] == type1 || obs_array[i][1] < 0) &&
(obs_array[i][2] == type2 || obs_array[i][2] < 0) &&
(obs_array[i][3] == type3 || obs_array[i][3] < 0)
){
for(j = 15; j < obs_cols; j++){
obs_count += obs_array[i][j];
}
}
}
return obs_count;
}
The point of the above break-down, aside from making things more readable, is that we might want to get abundance estimates for each resource type; at least have G-MSE produce them even if we pretend that managers cannot see them. When Issue: #20 is resolved, all resources will then be estimated. NOTE: This could be an issue because if a fixed number of resource types are sampled, as with mark-recapture, then it could sample different resources. It might be best to change mark-recapture so that it takes a fixed_obs for each unique resource type, somehow. The point is that it’s easier and more computationally efficient to ignore some data (and not allow managers to notice it) than it is to have to run observation multiple times to re-collect using the same protocol.
Eventually, I also want the obs_array[i][1, 2, or 3] to be able to take -1 as a value here somehow; basically, I want the counts to ignore one of a resource’s types. For example, we could imagine wanting to have separate sexes indicated by type2 = 0 or type2 = 1 in column 2 of the obs_array, but perhaps not want managers to actually use this when estimating abundance (alternatively, observations could be combined later).
Manager function to genetic algorithm link
There is a minor conceptual issue regarding the implementation of the genetic algorithm with the manager function. The manager’s actions need to be based on the OBSERVATION array, but stake-holders need not use this information. There are two options for implementing the genetic algorithm regarding observations.
1. The OBSERVATION array could just be read into the genetic algorithm and not used for stake-holders. This might require the user function to initially be called after the manager function so that an OBSERVATION array exists (or a dummy could be made easily enough in the user.R function).
2. Use the ACTION and COST arrays in the genetic algorithm to zero in on the actions. For example, if there are too many resources, then the util could be adjusted within manager.c (or manager.R) to be negative, hence making the genetic algorithm select strategies that lower costs on killem actions proportional to how many the manager wants killed.
I think that option 2 is actually a bit faster, and will probably be easier to implement in terms of coding.
Isolating effects of uncertainty
It is worth pointing out in passing that option 2 above offers a very straight-forward way of looking at uncertainty with respect to management decisions. When passing resource abundances to update temporary util values for managers, we could compare the estimates of abundances produced from the observation model to the actual abundances from the resource model. This could be a very simple option in the software, and it might be useful to run the genetic algorithm twice for managers in each time step to simulate side-by-side how decisions would be made in the presence and absence of uncertainty.
Initialising the manager model
To get the ball rolling on the manager function’s implementation of
the genetic algorithm, now is as good of a time as any to initialise
manager.R
and manager.c
, since the arguments
passed to game.c
need to be coaxed into the right form via
the manager model. It’s important to keep in mind that I still
need to implement the lobbying option for stake-holders, but I
think that this will be easier once the manager’s genetic algorithm is
built. It’s also noteworthy that we’re probably not going to need
managers to adjust stake-holders’ utilities. So really, in a pinch,
there are three types of actions that are really going to be important,
probably always.
To this list of three essential types of actions, there are a few additional actions that would be good to have, ideally fitting seamlessly within the general framework of the model, but which could if necessary be add-ons for future development.
Beyond these, there are a few more possible options that I can’t, at the moment, see why anyone would want to model. I’m not entirely sure these really make sense, actually.
Framework for manager actions
The framework for manager actions in both R and C is now entirely
built – data structures can be read in and out, so now all that is left
is to do the modelling. I’ve commented what will happen within
manager.c
; each of the numbers below might or might not
represent unique sub-functions.
/* Do the biology here now */
/* ====================================================================== */
/* 1. Get summary statistics for resources from the observation array */
/* 2. Place estimated resource abundances in a vector the size of int_d0 */
/* 3. Initialise new vector of size int_d0 with temp utilities of manager */
/* 4. Subtract abundances from temp utilities to get marginal utilities */
/* 5. Insert the marginal utilities into the agent = 1 col1 of ACTION */
/* 6. Run the genetic algorithm (add extension to interpet cost effects) */
/* 7. Put in place the new ACTION array from 6 */
/* 8. Adjust the COST array appropriately from the new manager actions */
/* This code switches from C back to R */
/* ====================================================================== */
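Steps 2–4 of the plan above amount to a vector subtraction of estimated abundances from the manager's temporary utilities. A minimal sketch, with a hypothetical function name and signature rather than the eventual manager.c code:

```c
/* Sketch of steps 2-4: subtract estimated resource abundances (step 2)
 * from the manager's temporary utilities (step 3) to get marginal
 * utilities (step 4). A negative value means the resource is over-abundant
 * relative to the manager's target. Hypothetical helper, not manager.c.
 * estimates:  estimated abundance of each resource type
 * temp_utils: the manager's temporary (target) utilities
 * marginal:   output vector of marginal utilities
 * int_d0:     number of resource types
 */
void marginal_utilities(double *estimates, double *temp_utils,
                        double *marginal, int int_d0){
    int i;
    for(i = 0; i < int_d0; i++){
        marginal[i] = temp_utils[i] - estimates[i];
    }
}
```

Step 5 would then copy these marginal utilities into the manager's rows of the ACTION array before the genetic algorithm is run.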
With all of these in place, the end result should be a new
COST
array based on manager actions. It will be important
to make sure that the manager’s costs are defined appropriately
so that the manager doesn’t start doing actions themselves. This could
actually be a bit of a problem; if we want the manager to do things
themselves, then it’s hard to see why they wouldn’t just perform the
actions instead of adjusting the costs. Then again, perhaps this is kind
of the point? Maybe sufficiently high costs of actions and sufficiently
low costs of policy adjustment should cause the genetic
algorithm to naturally find policy as a better means of achieving what
the manager wants. In fact, this seems almost certain; if managers in
the real world could achieve all policy aims single-handedly, then
that’s what they would probably be hired to do. In the real world,
changing policy is more effective – it’s also possible that we could
allow them to do their own direct actions to resources in the
user
model, like the stake-holders. Costs of setting policy
could then be independent from costs of doing actions by changing
manager COST
between models.
Concrete plan for manager fitness function
The next step in the coding is to allow managers to generate policy by using their utilities to affect the costs of other agents. This will require that managers recognise how the actions of other agents will affect resources and the landscape, then adjust costs to encourage agents to act in a particular way. There are several things to keep in mind here.
- The bankem column could be used to suck up actions in the genetic
algorithm if doing nothing is advantageous.
- It’s not obvious what (the cost of killem aside) should stop managers
from feeding resources on public land. Maybe something does, but it
doesn’t seem like it should have to be this way.
Some solutions to account for the above
Given that the manager already has a special status in the rest of
G-MSE, maybe it’s not too much of a stretch to make their cost-adjusting
actions apply to all non-managers by default by making all
cost-adjusting rows in the ACTION
array (on the manager’s
layer) equivalent. Or, even better, the first row,
which corresponds to the manager’s own costs (or any agent’s
own cost row, since the cost of adjusting their own cost
doesn’t really come into play – or really make much sense), could simply
define the cost of affecting all stake-holders’ cost
values.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 101 | 101 | 101 | 3 | 8 | 4 | 4 | 2 | 1 |
-2 | 2 | 0 | 0 | 101 | 101 | 101 | 4 | 3 | 6 | 2 | 3 | 1 |
-1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
3 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
3 | 2 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
Assume that the above COST
array layer corresponds to
the manager: agent = 1
. In the above COST
array, we have three agents (agent = 1
is a manager, while
agent = 2
and agent = 3
are stake-holders),
two resource types, and one landscape layer. So in rows 4 and 5 above
where the first column agent = 1
, we have, essentially,
costs of what it takes to affect the entire array of stake-holders’
actions (all layers where agent = -2
or
agent = -1
) on a particular resource. This value can then
be used to directly implement actions in the first two rows of all
layers of the ACTION
array. Note that the
structure of the code does not allow managers to make policy on
landscape use – only resources, which might or might not have to be
changed. What we’re essentially saying with this is that a manager
cannot tell a farmer to not kill or fertilise crops on the farmer’s own
land. Only resources are affected by manager policy (we could, of
course, make crops resources – though this would be a bit of a
time-consuming workaround). We could find a way around this if need be, but
I can’t think of many situations in which we would want a manager to be
able to tell a stake-holder that they can’t increase their own crop
yield or kill their crops.
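The single-row idea above can be sketched as one value being broadcast into every stake-holder's layer of the COST array. The flattened [layer][row][col] layout and all names below are hypothetical, not the actual G-MSE data structures:

```c
/* Sketch: the manager sets one policy value (a new cost for some action on
 * some resource row), and it is written into every stake-holder's layer of
 * the COST array at once. Layout and names are hypothetical.
 * costs:      flattened 3D COST array, layer 0 being the manager's own
 * new_cost:   the cost the manager sets by policy
 * rows, cols: dimensions of each COST layer
 * layers:     total number of agent layers
 * t_row:      the resource row the policy applies to
 * t_col:      the action column (e.g., killem) being regulated
 */
void set_policy_cost(double *costs, double new_cost, int rows, int cols,
                     int layers, int t_row, int t_col){
    int layer;
    for(layer = 1; layer < layers; layer++){ /* skip the manager's layer */
        costs[(layer * rows + t_row) * cols + t_col] = new_cost;
    }
}
```

The manager's own layer is deliberately skipped, matching the point above that the cost of adjusting one's own costs doesn't really come into play.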
Issue #19: Cost and Action arrays: landscape level initialisation
The number of landscape layers in the COST
and
ACTION
arrays is too high – the utility_layer
adds one in for every unique resource instead of for every unique
landscape layer. This is an easy fix by adding an option to the function
to specify landscape layers.
More concrete ideas
A general algorithm-sketch for the managers could be as follows in the fitness function of the genetic algorithm. Note that this will be called in the manager model, not the user model, so there isn’t a worry about the stake-holders updating their actions at the wrong time – their actions will be static while the manager is considering what to do.
- Read the manager’s utilities for each resource from the ACTION array,
column util.
- Project how resource abundances will change given the ACTION matrix of
stake-holders (e.g., from all rows where agent = -2). Hence, resources
will be decreased by some columns and increased by others based on the
past actions of stake-holders. This might or might not also need to
include resource abundance projections based on birth and death rates –
ideally this would be the case, with projections using estimates from
the observation model, but maybe just use the abundances as a first
step? Note, the ‘projected’ birth rate need not use the RESOURCE array
explicitly, but could be estimated and applied from the history of
observation – in fact, this would probably be better.
The algorithm above should be fairly fast, and while it won’t provide
the optimal solution for managers, it isn’t actually intended
to do so. The point is to find an adaptive strategy based on the tools
available to the manager and the limited information that the manager
realistically has about resource abundances and stake-holder strategies.
Following from the above, I dare say that the fitness function for
stake-holders affecting the manager’s util might not be too
difficult, but I’ll need to think carefully about the best way to
implement it.
Resolved issue #12
I have finally resolved issue #12, which
was always a bit annoying but never terribly serious. The problem was
that density-based estimation as done in Nuno et
al. (2013) would only plot correctly when
times_observe = 1
; that is, when managers went out to
observe a sample of the population exactly once per time step. Obviously
we want the option to allow multiple trips to sample in a single time
step, as sampling once (unless the number of cells viewed on the
landscape includes almost the entire landscape) leads to highly variable
results – and even more so now that resource distributions tend to
become clumped on the landscape when agents scare them off their land.
Previously the proportion estimate used the number of unique
resources observed, but what we really needed was the unique
observations. By simply summing all values in columns
16
to 16 + times_observe - 1 of the
of the
observation array, we get the total number of observations.
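The fix described above can be sketched as a simple column sum over the observation array; the function name, zero-based indexing, and flattened layout here are illustrative assumptions, not the actual observation code:

```c
/* Sketch of the issue #12 fix: total the observations themselves rather
 * than counting unique resources observed. obs is a flattened observation
 * array with rows resources and cols columns per resource; first_col is
 * the first observation column and times_observe the number of sampling
 * trips (the real code uses columns from 16 onward).
 */
double total_observations(double *obs, int rows, int cols, int first_col,
                          int times_observe){
    int row, col;
    double total = 0.0;
    for(row = 0; row < rows; row++){
        for(col = first_col; col < first_col + times_observe; col++){
            total += obs[row * cols + col]; /* 1 if seen on that trip */
        }
    }
    return total;
}
```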
Quick note about agents affecting each other’s costs
It’s worth noting that there is no need to save any additional data
structure to have agents affect each other’s costs – at least not at the
moment. This is because the user and manager models are separated in
time within the broader model loop. When a manager uses the genetic algorithm, the
ACTION
array that they have to use has already been updated
for stake-holders, so the managers are effectively seeing the
most recent actions of stake-holders and will be able to adjust costs
and use the recent actions to predict changes (can perhaps assume
proportional allocation of actions, so if the manager makes one action
more costly, then the stake-holder will shift to increase other actions
– or perhaps this is too much to predict; maybe managers should just
assume that increasing cost will decrease an action as if the action
were isolated. How strategic do we assume managers think?). Likewise,
stake-holders are seeing the manager’s most recent priorities
and might lobby them accordingly.
Resolved issue #18
The landscape actions within the user model are now affected by the interaction matrix from the appropriate diagonal element. This effectively adjusts the effect of a user’s actions to increase a cell value by some magnitude. For example, if an agent wants to increase their crop yield, they will now do so by increasing the current cell yield by the yield times whatever the appropriate element is in the interaction matrix (default could be one – doubling yield). Initial testing shows that this works as intended; stake-holders interested in maximising crop yield do so reliably when they can increase yield on a cell twenty-fold; mean crop yield on the whole landscape increases in turn. When the increase is smaller (50%), then a range of strategies appears possible – one stake-holder chose to kill resources while the other chose to directly increase yield (dependent on costs, which varied among stake-holders). Next, the plan is to address some of the minor clean-up tasks (bulleted list from yesterday) before getting to the ultimate goal of allowing agents to affect one another’s costs in the genetic algorithm.
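The arithmetic described above reduces to scaling the current yield by the matching diagonal element; a hypothetical helper, not the actual user.c code:

```c
/* Sketch of the issue #18 resolution: an action increases a cell's value
 * by the current yield times the matching diagonal element of the
 * interaction (Jacobian) matrix. With effect = 1 the yield doubles; with
 * effect = 0.5 it increases by 50%. Hypothetical helper, not user.c.
 */
double increase_cell_yield(double current_yield, double effect){
    return current_yield + current_yield * effect;
}
```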
Resolved issue #11
I appear to have resolved issue #11 by
calling the a_mover
function in observation.c
from within the anecdotal
function. This gives the option
to move agents when the R function anecdotal
is called
during a time step. Later I might also consider giving the option to
specify moving agents onto land that they own; this would probably be
best accomplished by calling send_agents_home
, which is
currently in user.c
, but could be moved to
utilities.c
. Tests of anecdotal
in R confirm
that it is moving agents as expected.
I have begun to implement actions on the landscape as an option. For now, these actions will include increasing crop yield directly in some way (magnitude to be affected by the interaction array, see below), and killing crops.
New issue #18: Make landscape actions from interaction array
It will be helpful to link the appropriate element of the interaction
array (Jacobian matrix) to the actions in the
landscape_actions
function in user.c
. As of
now, the amount of increase in crop yield (and decrease) is hard-coded
in the function, but it really should be linked with the appropriate
diagonal element in the interaction array – increasing or decreasing a
cell’s value by the magnitude in the array element.
More testing, success
More testing shows that the genetic algorithm and user function are working as intended, and I have looped the genetic algorithm so that it is run for all simulated agents with no issues – even when landscape dimensions or agent number changes. There are, however, some minor things that need to be tweaked.
- observe_type = 0 still isn’t plotting correctly. I’m not sure why this is, but it seems to produce underestimates of population abundance that are off consistently by half the times_obs. It will be a good idea to get this settled, finally.
The minor bug from yesterday has been resolved. The following bit of code needed to be within the larger loop that cycled through the population of agents in the genetic algorithm.
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
The count changes and utilities were not being initialised at zero,
meaning that count_change
was cumulative over agents. When
this is fixed, and agents highly value the resource
(utility = 100
), they either evolve to feedem
or helpem
as much as possible, as reflected in the
ACTION
array.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 100 1 1 0 0 0 92 0 0
[2,] -1 1 0 0 1 1 1 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 0 1 0
So with this very simplified test, the function is doing what it is supposed to do. Next, I fixed the utility of the landscape to 100 to see if the agent recognises that it can increase crop yield by killing or scaring the resource via the interaction array.
Test of killing resource to maximise crop yield
After some further debugging, the agents in the genetic algorithm now figure out to kill resources when resources destroy crops.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 1 1 1 94 0 0 0 0 0
[2,] -1 1 0 0 100 1 1 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 0 0 1
In the above ACTION
array, utility of the crop yield is
100, and the interaction matrix indicates that resource of
type1 = 0
decreases crop yield on a cell by one half. In
response to this, the stake-holders find the solution of killing
resources on their land (indicated by the 94 in the eighth
column above). The code for doing this is not terribly readable at the
moment.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type, type1, type2, type3, interest_row;
double agent_fitness, *count_change, foc_effect;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(agent = 0; agent < pop_size; agent++){
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(row = 0; row < ROWS; row++){
foc_effect = 0;
act_type = (int) population[row][0][agent];
type1 = population[row][1][agent];
type2 = population[row][2][agent];
type3 = population[row][3][agent];
utility = population[row][4][agent];
movem = population[row][7][agent];
castem = population[row][8][agent];
killem = population[row][9][agent];
feedem = population[row][10][agent];
helpem = population[row][11][agent];
switch(act_type){
case -2:
foc_effect -= movem; /* Times birth to account for repr? */
foc_effect -= castem; /* But only remove E offspring? */
foc_effect -= killem; /* But also remove E offspring? */
foc_effect += feedem; /* But should less mortality */
foc_effect += helpem; /* But should affect offspring? */
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][0] == 0 &&
interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
}
for(i = 0; i < interest_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = utility;
break;
case -1:
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][0] == 1 &&
interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
}
utilities[interest_row] = utility;
break; /* Add landscape effects here */
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
}
free(utilities);
free(count_change);
}
The above can be greatly simplified and made clearer, with the goal
of simple fitness functions for the case in which agents directly
affect resources or crops. The indirect interactions will be
sub-functions in the above called in the switch
where
case
is greater than zero.
Breaking down the strategy_fitness
function
I’ve broken down the strategy_fitness
function into
three more manageable functions that can be further developed as
necessary. The strategy_fitness
function now calls
functions that update the count_change
and
utilities
arrays as a result of direct actions to resources
and the landscape.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type;
double *count_change;
double *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(agent = 0; agent < pop_size; agent++){
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(row = 0; row < ROWS; row++){
act_type = (int) population[row][0][agent];
switch(act_type){
case -2:
res_to_counts(population, interact_table, interest_num,
count_change, utilities, jaco, row, agent);
break;
case -1:
land_to_counts(population, interact_table, interest_num,
utilities, row, agent);
break;
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
}
free(utilities);
free(count_change);
}
The case -2
calls res_to_counts
below.
/* =============================================================================
* This function updates count change and utility arrays for direct actions on
* resources
* population: The population array of agents in the genetic algorithm
* interact_table: The lookup table for figuring out how resources interact
* int_num: The number of rows and cols in jac, and rows in the lookup
* count_change: A vector of how counts have changed as a result of actions
* utilities: A vector of the utilities of each resource/landscape level
* jaco: The interaction table itself (i.e., Jacobian matrix)
* row: The row of the interaction and lookup table being examined
* agent: The agent in the population whose fitness is being assessed
* ========================================================================== */
void res_to_counts(double ***population, int **interact_table, int int_num,
double *count_change, double *utilities, double **jaco,
int row, int agent){
int i, interest_row;
double foc_effect;
foc_effect = 0.0;
foc_effect -= population[row][7][agent]; /* Times birth account for repr?*/
foc_effect -= population[row][8][agent]; /* But only remove E offspring? */
foc_effect -= population[row][9][agent]; /* But also remove E offspring? */
foc_effect += population[row][10][agent]; /* But should less mortality */
foc_effect += population[row][11][agent]; /* But should affect offspring? */
interest_row = 0;
while(interest_row < int_num){
if(interact_table[interest_row][0] == 0 &&
interact_table[interest_row][1] == population[row][1][agent] &&
interact_table[interest_row][2] == population[row][2][agent] &&
interact_table[interest_row][3] == population[row][3][agent]
){
break;
}else{
interest_row++;
}
}
for(i = 0; i < int_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = population[row][4][agent];
}
And the case -1
calls land_to_counts
below.
/* =============================================================================
* This function updates count change and utility arrays for direct actions on
* a landscape
* population: The population array of agents in the genetic algorithm
* interact_table: The lookup table for figuring out how resources interact
* int_num: The number of rows and cols in jac, and rows in the lookup
* utilities: A vector of the utilities of each resource/landscape level
* row: The row of the interaction and lookup table being examined
* agent: The agent in the population whose fitness is being assessed
* ========================================================================== */
void land_to_counts(double ***population, int **interact_table, int int_num,
double *utilities, int row, int agent){
int interest_row;
interest_row = 0;
while(interest_row < int_num){
if(interact_table[interest_row][0] == 1 &&
interact_table[interest_row][1] == population[row][1][agent] &&
interact_table[interest_row][2] == population[row][2][agent] &&
interact_table[interest_row][3] == population[row][3][agent]
){
break;
}else{
interest_row++;
}
}
utilities[interest_row] = population[row][4][agent];
}
For each sub-case, how the population array is interpreted can be
specialised. For example, if castem
doesn’t really mean
anything on the landscape, then it can simply be ignored and agents will
adapt by not doing it. In this sense, these two sub-functions become
easy things to tinker with for translating actions to utilities.
Version v0.0.9: A working genetic algorithm
I moved the function do_actions
and its dependency
resource_actions
to the user.c
file so that
the actions of a particular agent could be performed on the
actual population after the genetic algorithm simulated and
selected an adaptive strategy. As a test drive, I simulated the actions
of only one stake-holder who is trying to maximise crop yield, and whose
only avenue for doing so is getting resources off their land one way or
another. The figure below shows the output.
The figure tells an interesting story. The light blue individual in the right-hand panels represents a farmer, who has quickly figured out that little black dots on their farm are decreasing crop yield, which the farmer wants to maximise. Initially, the growing population of black dots causes crop yield to decline, but by generation 8 or 9, the farmer has opted to scare these dots to public land. Consequently, mean farm yield over all land goes up a bit (due to intraspecific competition between black dots), and almost all of the crop damage occurs on the public land (dark blue) while the farmer’s land (light blue) has better yield. The spatial distribution of the black dots is very easy to see – all of the black dots have been ‘scared’ into the public land.
This is exciting – we have a working model in which a genetic
algorithm is being used to identify and enact a stake-holder’s strategy
given their specific interests. The only major conceptual
hurdle now is likely to be the manager’s response, enacting
policy by affecting costs of actions, and stake-holders’ actions that
affect other agents’ costs (e.g., lobbying a manager). This isn’t even
much of a jump though – really, the framework is in place and a lot of
the work from here is just grunt work in terms of coding the specifics
of what options will be available to what agents. It shouldn’t be
too long before we have a working model of conflict that can be
applied to real-world case studies. Hence, I’m calling this
v0.0.9 and pushing to master. This
implementation of the genetic algorithm is also not noticeably slower
than previous versions – it took about a second to run the above; my
goal is to keep it low.
The next step will be to figure out what options should be available
for directly affecting the landscape, and what needs to be done to apply
the genetic algorithm to costs of other agents’ actions (hooks
for this are already coded in switch
functions of the
genetic algorithm). I would also like to build the
manager.c
function with the ability to empirically derive
the Jacobian matrix, and (eventually) make it possible for agents to
consider the histories of each other’s actions (shouldn’t be too
much of a stretch, but this is an extension of the genetic algorithm
that can come later).
It’s worth pointing out that the interaction array from yesterday’s
make_interaction_array
function can be defined more
generally as a Jacobian
matrix. I think it’s worth doing this for the sake of clarity and
generality, and thinking about the elements of the array as first order
partial derivatives.
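Concretely, if N_k is the abundance of resource type (or landscape level) k, the elements can be read as follows; this is a sketch of the intended interpretation in my own notation, not a formula taken from the code:

```latex
J_{ki} = \frac{\partial N_i}{\partial N_k}, \qquad
\Delta N_i \approx \sum_{k} J_{ki} \, \Delta N_k
```

A direct change to type k caused by an agent's actions (the foc_effect in the fitness function) then propagates to every type i through row k of the matrix, which is how count_change is accumulated in the code.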
Planning with the Jacobian matrix
Note that one benefit of individual-based modelling is that each
individual can be unique – for example, an individual’s consumption rate
of crops does not have to be completely defined by its type; there can
be individual variation within types too. Hence, it is probably
undesirable to have the Jacobian matrix of type and landscape cell
layers define how individual interactions should occur. There
should be some variation and uncertainty at least as an option. Hence,
the interaction array should probably be calculated a
posteriori as much as possible – ideally from looking at
interactions on the landscape (e.g., by the eventual
manager.c
), or perhaps a function should go through the
resource and landscape arrays and figure out the average interaction for
each data type somewhere within G-MSE (but not from within the
genetic algorithm; it would take too long). These details can be worked
out later in the manager.c
file, or perhaps somehow with
the anecdotal
function, which I think now can be made more
general. For now, I’m going to manually set values in
the matrix and use them to build an efficient genetic algorithm.
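One very cheap a posteriori estimator, under the strong assumption that effects are additive and homogeneous across individuals, would average the observed change per individual present. This is purely illustrative, not planned G-MSE code:

```c
/* Sketch of an a posteriori estimate of one interaction element: the mean
 * per-capita effect of a resource type on a landscape value, assuming
 * additive, homogeneous effects. Purely illustrative, not G-MSE code.
 * delta_land: total observed change in the landscape value
 * res_count:  number of resources of the focal type present
 * Returns 0 when no resources were present (no information to estimate).
 */
double estimate_interaction(double delta_land, int res_count){
    if(res_count <= 0){
        return 0.0;
    }
    return delta_land / (double) res_count;
}
```

Averaging such estimates over cells and time steps would then give an empirical element of the interaction array, with individual variation showing up as scatter around the mean.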
Dealing with issues of order in the fitness function
Resource type order needs to be identified for all resource types and landscape layers. The easiest way to do this is to just have a new array that lists all resource and landscape types, such as the below.
Res | Type 1 | Type 2 | Type 3 |
---|---|---|---|
1 | 1 | 0 | 0 |
1 | 2 | 0 | 0 |
0 | 1 | 0 | 0 |
The first column just identifies whether or not the row refers to a resource or a landscape level. The second through fourth columns identify a type (2 and 3 are always zero for landscape levels). This strikes me as the clearest way of keeping track of which rows go with which types in both the Jacobian matrix and the resource array; eventually, I may want to include additional columns associated with each row.
The table is initialised with a simple function now in
initialise.R
to be called from the main
gmse.R
.
#' Initialise array of resource and landscape-level interactions
#'
#'@param resources the resource array
#'@param landscape the landscape array
#'@export
make_interaction_table <- function(resources, landscape){
resource_types <- unique(resources[,2:4]);
resource_part <- matrix(data=0, nrow=dim(resource_types)[1], ncol=4);
resource_part[,2:4] <- resource_types;
landscape_count <- dim(landscape)[3] - 2; # Again, maybe all in later?
landscape_part <- matrix(data = 0, nrow = landscape_count, ncol = 4);
landscape_part[,1] <- 1;
landscape_part[,2] <- 1:landscape_count;
the_table <- rbind(resource_part, landscape_part);
return(the_table);
}
The table, along with the Jacobian matrix, is now passed to the user function and into the genetic algorithm where it can be used by the fitness function.
A revised fitness function
A revised fitness function is below, which has not passed unit tests because it doesn’t appear to be maximising utility correctly. There is likely one or more minor bugs in the code that need to be fixed, and it would be better anyway to break the below down into a couple of smaller functions.
/* =============================================================================
* This is a preliminary function that checks the fitness of each agent by
* passing through a loop to payoffs_to_fitness
* fitnesses: Array to order fitnesses of the agents in the population
* population: array of the population that is made (malloc needed earlier)
* pop_size: The size of the total population (layers to population)
* ROWS: Number of rows in the COST and ACTION arrays
* COLS: Number of columns in the COST and ACTION arrays
* agent_array: The agent array
* jaco: The jacobian matrix of resource and landscape interactions
* interact_table: Lookup table for figuring out rows of jaco and types
* interest_num: The number of rows and cols in jac, and rows in lookup
* ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
int ROWS, int COLS, double **agent_array, double **jaco,
int **interact_table, int interest_num){
int agent, i, row, act_type, type1, type2, type3, interest_row;
double agent_fitness, *count_change, foc_effect;
double movem, castem, killem, feedem, helpem;
double utility, *utilities;
count_change = malloc(interest_num * sizeof(double));
utilities = malloc(interest_num * sizeof(double));
for(i = 0; i < interest_num; i++){
count_change[i] = 0; /* Initialise all count changes at zero */
utilities[i] = 0; /* Same for utilities */
}
for(agent = 0; agent < pop_size; agent++){
for(row = 0; row < ROWS; row++){
foc_effect = 0;
act_type = (int) population[row][0][agent];
type1 = (int) population[row][1][agent];
type2 = (int) population[row][2][agent];
type3 = (int) population[row][3][agent];
utility = population[row][4][agent];
movem = population[row][7][agent];
castem = population[row][8][agent];
killem = population[row][9][agent];
feedem = population[row][10][agent];
helpem = population[row][11][agent];
switch(act_type){
case -2:
foc_effect -= movem; /* Times birth to account for repr? */
foc_effect -= castem; /* But only remove E offspring? */
foc_effect -= killem; /* But also remove E offspring? */
foc_effect += feedem; /* But should less mortality */
foc_effect += helpem; /* But should affect offspring? */
interest_row = 0;
while(interest_row < interest_num){
if(interact_table[interest_row][1] == type1 &&
interact_table[interest_row][2] == type2 &&
interact_table[interest_row][3] == type3
){
break;
}else{
interest_row++;
}
} /* Found the right row in the look-up table */
for(i = 0; i < interest_num; i++){
count_change[i] += foc_effect * jaco[interest_row][i];
}
utilities[interest_row] = utility;
break;
case -1:
break; /* Add landscape effects here */
default:
break;
}
}
fitnesses[agent] = 0;
for(i = 0; i < interest_num; i++){
fitnesses[agent] += count_change[i] * utilities[i];
}
/* The below will be removed -- once a minor bug is found */
/* fitnesses[agent] = population[0][12][agent]; */
}
free(utilities);
free(count_change);
}
Nevertheless, this is definitely some progress – and the code is still fast. The next step is to print output from the above function to track down what is incorrect.
An additional thought that could be useful for the genetic algorithm: it might make sense for the AGENT array to also include abundances of each resource type and landscape level in columns at the end of the array, in an order matching the 2D array described yesterday. Something like the former anecdotal function could then be used to fill in abundance values as appropriate (e.g., matching resources on the agent’s owned land, on public land, or near the agent’s location).
Initialise interaction array
A new function in R initialises an array of interactions among resource types and landscape layers.
#' Initialise array of resource and landscape-level interactions
#'
#'@param resources the resource array
#'@param landscape the landscape array
#'@export
make_interaction_array <- function(resources, landscape){
resource_types <- unique(resources[,2:4]);
resource_count <- dim(resource_types)[1];
landscape_count <- dim(landscape)[3] - 2; # Maybe put all of them in later?
total_dims <- resource_count + landscape_count;
INTERACTIONS <- matrix(data = 0, nrow = total_dims, ncol = total_dims);
name_vec <- NULL;
for(i in 1:dim(resource_types)[1]){
name_vec <- c( name_vec,
paste(resource_types[i,1],
resource_types[i,2],
resource_types[i,3],
sep = "" )
);
}
name_vec <- c(name_vec, as.character(paste("L",1:landscape_count,sep="")));
rownames(INTERACTIONS) <- name_vec;
colnames(INTERACTIONS) <- name_vec;
return(INTERACTIONS);
}
Specific values can be added outside the make_interaction_array function and updated as need be by G-MSE.
It’s a bit painful, but I’m going to delete some major pieces of code in the genetic algorithm (which will obviously be preserved in version control). The following functions are slowing things down, and given the new approach outlined in option 3 from 13 APR, I’m going to remove them and focus on the ACTION array only, assuming that agents act as if their actions will yield the intended results.
/* =============================================================================
* This function calculates an individual agent's fitness
* ========================================================================== */
double calc_agent_fitness(double ***population, int ROWS, int COLS,
int landowner, double ***landscape,
double **resources, int res_number, int land_x,
int land_y, int land_z, int trait_number,
double *fitnesses, double *paras){
int agent, resource, resource_new, trait, row, col, xloc, yloc, zloc;
int res_on_land, res_nums_added, res_nums_subtracted, res_num_total;
double *payoff_vector, *payoffs_after_actions, *payoff_change;
double **TEMP_RESOURCE, **TEMP_ACTION, ***TEMP_LANDSCAPE;
double **ADD_RESOURCES, **NEW_RESOURCES;
double a_fitness;
payoff_vector = malloc(ROWS * sizeof(double));
payoffs_after_actions = malloc(ROWS * sizeof(double));
payoff_change = malloc(ROWS * sizeof(double));
/* --- Make temporary resource, action, and landscape arrays below --- */
TEMP_RESOURCE = malloc(res_number * sizeof(double *));
for(resource = 0; resource < res_number; resource++){
TEMP_RESOURCE[resource] = malloc(trait_number * sizeof(double));
}
for(resource = 0; resource < res_number; resource++){
for(trait = 0; trait < trait_number; trait++){
TEMP_RESOURCE[resource][trait] = resources[resource][trait];
}
}
TEMP_ACTION = malloc(res_number * sizeof(double *));
for(row = 0; row < ROWS; row++){
TEMP_ACTION[row] = malloc(COLS * sizeof(double));
}
for(row = 0; row < ROWS; row++){
for(col = 0; col < COLS; col++){
TEMP_ACTION[row][col] = population[row][col][landowner];
}
}
TEMP_LANDSCAPE = malloc(land_x * sizeof(double **));
for(xloc = 0; xloc < land_x; xloc++){
TEMP_LANDSCAPE[xloc] = malloc(land_y * sizeof(double *));
for(yloc = 0; yloc < land_y; yloc++){
TEMP_LANDSCAPE[xloc][yloc] = malloc(land_z * sizeof(double));
}
}
for(zloc = 0; zloc < land_z; zloc++){
for(yloc = 0; yloc < land_y; yloc++){
for(xloc = 0; xloc < land_x; xloc++){
TEMP_LANDSCAPE[xloc][yloc][zloc] = landscape[xloc][yloc][zloc];
}
}
}
/* ----------------------------------------------------------- */
calc_payoffs(TEMP_ACTION, ROWS, landscape, TEMP_RESOURCE, res_number,
landowner, land_x, land_y, payoff_vector);
do_actions(landscape, TEMP_RESOURCE, land_x, land_y, TEMP_ACTION, ROWS,
landowner, res_number, COLS);
/* ===== Below re-creates key parts of the resource model ===== */
project_res_abund(TEMP_RESOURCE, paras, res_number);
res_nums_added = 0;
res_nums_subtracted = 0;
for(resource = 0; resource < res_number; resource++){
res_nums_added += TEMP_RESOURCE[resource][10];
if(TEMP_RESOURCE[resource][8] < 0){
res_nums_subtracted += 1;
}
}
ADD_RESOURCES = malloc(res_nums_added * sizeof(double *));
for(resource = 0; resource < res_nums_added; resource++){
ADD_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
res_place(ADD_RESOURCES, TEMP_RESOURCE, res_nums_added, res_number,
trait_number, 10, 11);
res_num_total = res_number + res_nums_added - res_nums_subtracted;
NEW_RESOURCES = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
NEW_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
resource_new = 0;
for(resource = 0; resource < res_number; resource++){
if(TEMP_RESOURCE[resource][8] >= 0){
for(trait=0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] =
TEMP_RESOURCE[resource][trait];
}
resource_new++;
}
}
for(resource = 0; resource < res_nums_added; resource++){
for(trait = 0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] = ADD_RESOURCES[resource][trait];
}
resource_new++;
}
res_landscape_interaction(NEW_RESOURCES, 1, 1, 8, res_num_total, 14,
TEMP_LANDSCAPE, 1);
/* ============================================================*/
calc_payoffs(TEMP_ACTION, ROWS, landscape, NEW_RESOURCES, res_num_total,
landowner, land_x, land_y, payoffs_after_actions);
a_fitness = payoffs_to_fitness(TEMP_ACTION, ROWS, payoffs_after_actions);
/* ----------------------------------------------------------- */
for(resource = 0; resource < res_num_total; resource++){
free(NEW_RESOURCES[resource]);
}
free(NEW_RESOURCES);
for(resource = 0; resource < res_nums_added; resource++){
free(ADD_RESOURCES[resource]);
}
free(ADD_RESOURCES);
for(xloc = 0; xloc < land_x; xloc++){
for(yloc = 0; yloc < land_y; yloc++){
free(TEMP_LANDSCAPE[xloc][yloc]);
}
free(TEMP_LANDSCAPE[xloc]);
}
free(TEMP_LANDSCAPE);
for(row = 0; row < ROWS; row++){
free(TEMP_ACTION[row]);
}
free(TEMP_ACTION);
for(resource = 0; resource < res_number; resource++){
free(TEMP_RESOURCE[resource]);
}
free(TEMP_RESOURCE);
free(payoff_change);
free(payoffs_after_actions);
free(payoff_vector);
return a_fitness;
}
The above re-creation of the resource model was particularly slow – essentially running a big chunk of resource.c 2000 times (once for each of 100 simulated agents in the genetic algorithm for 20 generations). With more stake-holders or longer convergence times, this would become very time-consuming without much benefit. The above function calls project_res_abund (below), which is no longer needed.
/* =============================================================================
* This function looks at the resources and projects how many new resources
* there will be after deaths and births.
* resources: The resource array
* paras: Relevant parameter values
* res_number: The number of rows in the resource array
* ========================================================================== */
void project_res_abund(double **resources, double *paras, int res_number){
int birthtype, deathtype;
int birth_K, death_K;
int resource;
birthtype = (int) paras[3];
deathtype = (int) paras[4];
birth_K = (int) paras[5];
death_K = (int) paras[6];
res_add(resources, res_number, 9, birthtype, birth_K);
res_remove(resources, res_number, 8, deathtype, death_K);
}
The function calc_agent_fitness also calls calc_payoffs, which can be removed.
/* =============================================================================
* This function calculates each payoff for rows in the action matrix
* population: array of the population that is made (malloc needed earlier)
* ROWS: Number of rows in the COST and ACTION arrays
* landscape: The landscape array
* resources: The resource array
* res_number: The number of rows in the resource array
* landowner: The agent ID of interest -- also the landowner
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* payoff_vector: A vector of payoffs for each row of the action array
* ========================================================================== */
void calc_payoffs(double **population, int ROWS, double ***landscape,
double **resources, int res_number, int landowner,
int land_x, int land_y, double *payoff_vector){
int xloc, yloc, yield_layer;
int resource, row;
int landscape_specific;
int res_count;
double cell_yield;
for(row = 0; row < ROWS; row++){
payoff_vector[row] = 0;
res_count = 0; /* Must be reset for each row of the action array */
if(population[row][0] == -2){
for(resource = 0; resource < res_number; resource++){
if(population[row][1] == resources[resource][1] &&
population[row][2] == resources[resource][2] &&
population[row][3] == resources[resource][3]
){
landscape_specific = population[row][6];
if(landscape_specific == 0){
res_count++;
}else{
xloc = resources[resource][4];
yloc = resources[resource][5];
if(landscape[xloc][yloc][2] == landowner){
res_count++;
}
}
}
}
payoff_vector[row] += res_count;
}
if(population[row][0] == -1){
yield_layer = population[row][1];
for(xloc = 0; xloc < land_x; xloc++){
for(yloc = 0; yloc < land_y; yloc++){
if(landscape[xloc][yloc][2] == landowner){
cell_yield = landscape[xloc][yloc][yield_layer];
payoff_vector[row] += cell_yield;
}
}
}
}
if(population[row][0] > -1){
payoff_vector[row] = 0;
}
}
}
I’m leaving in the functions do_actions and resource_actions, which, while not part of the genetic algorithm now, might be used in user.c to enact the strategies selected by the genetic algorithm.
In place of all these functions, I’m going to write a modified version of payoffs_to_fitness (below, which will also be removed) called actions_to_fitness, which will need the ACTION array and RESOURCE array to return a value the_fitness.
/* =============================================================================
* This function translates resource abundances and crop yields to the fitness
* of an agent
* action: The action array
* ROWS: Number of rows in the COST and ACTION arrays
* payoffs: Payoffs associated with each row of the action array
* ========================================================================== */
double payoffs_to_fitness(double **action, int ROWS, double *payoffs){
int row;
double utility, abundance, the_fitness;
the_fitness = 0.0; /* Must be initialised before accumulating */
for(row = 0; row < ROWS; row++){
utility = action[row][4];
abundance = payoffs[row];
the_fitness += utility * abundance;
}
return the_fitness;
}
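The function is effectively a dot product between the utility column of the action array and the payoff vector. Below is a minimal self-contained sketch of the same calculation, with the utility column passed directly rather than the full ACTION array (the function name and values are mine, for illustration only):

```c
#include <assert.h>

/* Weighted sum of payoffs by per-row utility. This mirrors what
 * payoffs_to_fitness computes, but takes the utility column directly
 * instead of indexing column 4 of the action array.
 */
double utility_weighted_fitness(const double *utility, const double *payoffs,
                                int rows){
    int row;
    double the_fitness = 0.0; /* Initialise before accumulating */
    for(row = 0; row < rows; row++){
        the_fitness += utility[row] * payoffs[row];
    }
    return the_fitness;
}
```

With utilities {2, -1} and payoffs {3, 4}, the fitness is 2 * 3 + (-1) * 4 = 2.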
The idea will be to have agents assume that their actions will have the intended results (e.g., killing 5 resources) without using the entire resource model to project whether or not this is really expected (e.g., if only 3 resources are available to kill). Since the ACTION array includes utilities, we can multiply the assumed action effects by utility to calculate fitness. One necessary added complication is that there needs to be some way to model indirect effects on fitness, for example, if resources increase or decrease crop cell values (or other resource abundances) or vice versa. There needs to be some way for agents to recognise that they can, e.g., kill resources to increase crop yield. Rather than go through the computationally intense task of replicating full interactions within the genetic algorithm, I think it would be better to have G-MSE create a 2D array that identifies the effect of each resource type or landscape layer on each other resource type or landscape layer. This array wouldn’t need to be re-created ex nihilo every time the genetic algorithm is run, but could instead be either produced at a higher level from the parameters of the genetic algorithm, or perhaps calculated somehow in the manager function (not yet written). Hence, the consequences of an action on any given resource type or landscape layer could be followed through the 2D array instead of by re-creating the resource algorithm. This would also allow us to directly manipulate error, if for example some stake-holders don’t recognise certain consequences of affecting one or another resource. For proof of concept, only a two by two array needs to be used.
| | Resource_1 | Landscape_1 |
|---|---|---|
| Resource_1 | 0 | -0.5 |
| Landscape_1 | 0.1 | 0 |
Where the rows above are the focal thing of interest and the columns show what the focal thing has an effect on, per capita. This could be challenging because the per capita effect might vary with resource abundances, and might be factored through other parameters (e.g., landscape cells affecting resource birth or death). Getting the expected change in abundance could be a bit challenging, though it would certainly be less computationally intense than the way I was doing it before. I’ll start with defined parameter values for proof of concept, but I do think that this array would be best built in the manager model, perhaps with multiple options that can incorporate error and uncertainty.
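To illustrate, following an action through the two-by-two array above amounts to multiplying the focal change in abundance by the focal row’s per capita effects. A sketch under the assumption that per capita effects are constants (function and variable names here are mine, not GMSE’s):

```c
#include <assert.h>

#define N_TYPES 2 /* Rows/columns follow the table: Resource_1, Landscape_1 */

/* Accumulate the indirect consequences of changing the focal row's
 * abundance by foc_effect: change[i] += foc_effect * jaco[focal_row][i].
 */
void propagate_effect(double jaco[N_TYPES][N_TYPES], int focal_row,
                      double foc_effect, double *change){
    int i;
    for(i = 0; i < N_TYPES; i++){
        change[i] += foc_effect * jaco[focal_row][i];
    }
}
```

Killing five of Resource_1 (foc_effect = -5) with the values above gives a change of -5 * -0.5 = +2.5 on Landscape_1, i.e., the crop yield increase that the agent should be able to recognise without re-running the resource model.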
Double-check resource functions
It’s now time to simulate the recreation of the RESOURCE array within the genetic algorithm, so it was useful to re-check the resource functions to remember what the RESOURCE array looks like after the resource model portion of G-MSE, and why. The genetic algorithm needs to simulate births and deaths, making the code below from resources.c particularly relevant.
for(resource = 0; resource < rows; resource++){
res_adding[resource][realised] = 0;
rand_pois = rpois(res_adding[resource][add]);
res_adding[resource][realised] = rand_pois;
added += (int) rand_pois;
}
The above is in a switch statement that is currently superfluous but might later model different types of reproduction. Hence it is probably best to just run the whole function res_add, which will add the number of new resources each existing resource produces to column 10 in C.
In fact, it will probably be considerably cleaner and more readable to just make the biology-centred part of the whole resource function in resource.c its own function, resource_dynamics, then run resource_dynamics in the genetic algorithm with appropriate links. As a bonus, this would take care of the landscape-level effects of resources too.
Calculate agent fitness function almost finished
The function calc_agent_fitness is almost complete, and will be an initial draft of the genetic algorithm after I write in the code to translate resource abundances and crop yields to realised utilities. The meat of the function (excluding initialisation and memory management) is below.
/* ----------------------------------------------------------- */
calc_payoffs(TEMP_ACTION, ROWS, landscape, TEMP_RESOURCE, res_number,
landowner, land_x, land_y, payoff_vector);
do_actions(landscape, TEMP_RESOURCE, land_x, land_y, TEMP_ACTION, ROWS,
landowner, res_number, COLS);
/* ===== Below re-creates key parts of the resource model ===== */
project_res_abund(TEMP_RESOURCE, paras, res_number);
res_nums_added = 0;
res_nums_subtracted = 0;
for(resource = 0; resource < res_number; resource++){
res_nums_added += TEMP_RESOURCE[resource][10];
if(TEMP_RESOURCE[resource][8] < 0){
res_nums_subtracted += 1;
}
}
ADD_RESOURCES = malloc(res_nums_added * sizeof(double *));
for(resource = 0; resource < res_nums_added; resource++){
ADD_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
res_place(ADD_RESOURCES, TEMP_RESOURCE, res_nums_added, res_number,
trait_number, 10, 11);
res_num_total = res_number + res_nums_added - res_nums_subtracted;
NEW_RESOURCES = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
NEW_RESOURCES[resource] = malloc(trait_number * sizeof(double));
}
resource_new = 0;
for(resource = 0; resource < res_number; resource++){
if(TEMP_RESOURCE[resource][8] >= 0){
for(trait=0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] =
TEMP_RESOURCE[resource][trait];
}
resource_new++;
}
}
for(resource = 0; resource < res_nums_added; resource++){
for(trait = 0; trait < trait_number; trait++){
NEW_RESOURCES[resource_new][trait] = ADD_RESOURCES[resource][trait];
}
resource_new++;
}
res_landscape_interaction(NEW_RESOURCES, 1, 1, 8, res_num_total, 14,
TEMP_LANDSCAPE, 1);
/* ============================================================*/
calc_payoffs(TEMP_ACTION, ROWS, landscape, NEW_RESOURCES, res_num_total,
landowner, land_x, land_y, payoffs_after_actions);
/* Need a calc_utilities function */
/* ----------------------------------------------------------- */
The next step is to write the calc_utilities function. Overall, the whole program is noticeably slower, so I will want to optimise a bit if possible. I also need to do some unit testing for all of this to make sure that the genetic algorithm is doing what I intend it to do.
Moving forward: optimisation and error in ACTION
Having completed some initial coding and testing, there is a lot to do on everything downstream of calc_agent_fitness. The function doesn’t appear to alter agent utilities somehow, and slows down the simulations dramatically, from about half a second to several minutes to get through 100 generations. Options for addressing this include one that removes the calc_payoffs function and changes the nature of the payoffs_to_fitness function to directly assess fitness by assuming actions will be successful. I find the third option most tempting. Perhaps there will be a case
for extreme accuracy in predicting the effects of actions, but I think that it’s unlikely that we will lose much if we assume that agent actions are successful in the genetic algorithm. This also builds in the kind of error that would seem to be realistic in terms of human behaviour. It will be necessary, however, to still have agents link resource abundance with changes on the landscape – e.g., the indirect fitness benefit in terms of crop production increase caused by killing a resource needs to be realised in some way. The best way might be to rewrite res_landscape_interaction somehow to link the two without looping through each landscape cell for each resource. I don’t know the best way to do this yet – perhaps something in an observation model that estimates mean crop loss due to resources of type X?
Resolved Issue #17
I am now closing Issue #17, introduced yesterday, as the issue is resolved such that the action array columns util, u_loc, and u_land are now not touched by the genetic algorithm where the first column of the action array takes a negative value.
Moving on to castration
I have tested to confirm that the moving (i.e., scaring) action is working and actually moving resources as intended. I am now moving on to the code for castrating (decreasing birth rate to zero) resources. As with moving, there is really no analog on the landscape for this (since crops modelled using the landscape don’t reproduce explicitly – if we wanted them to, we could just model them as a different kind of resource), so I am also only writing a function for this for resources. Any positive values in the ACTION array therefore have no effect on landscape rows (i.e., where the first column equals -1).
The castration function (up and working) reuses a lot of code from the moving function, which initially led me to try to make all of the actions part of one function.
/* =============================================================================
* This function causes the agents to castrate a resource
* land: The landscape array
* resources: The resource array
* owner: The agent ID of interest -- also the landowner
* u_loc: Whether or not an agent's actions depend on owning land cell
* casts_left: The number of remaining times an agent will castrate
* res_number: The total number of resources in the resources array
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* res_type1: Type 1 category of resources being castrated
* res_type2: Type 2 category of resources being castrated
* res_type3: Type 3 category of resources being castrated
* ========================================================================== */
void castrate_resource(double ***land, double **resources, int owner, int u_loc,
int casts_left, int res_number, int land_x, int land_y,
int res_type1, int res_type2, int res_type3){
int xpos, ypos, xloc, yloc;
int cell, cast;
int resource, t1, t2, t3;
resource = 0;
while(casts_left > 0 && resource < res_number){
t1 = (int) resources[resource][1];
t2 = (int) resources[resource][2];
t3 = (int) resources[resource][3];
if(t1 == res_type1 && t2 == res_type2 && t3 == res_type3){
xpos = (int) resources[resource][4];
ypos = (int) resources[resource][5];
cell = land[xpos][ypos][2];
cast = check_if_can_act(u_loc, cell, owner);
if(cast == 1){
resources[resource][9] = 0;
casts_left--;
}
}
resource++;
}
}
Nevertheless, having these modular actions makes the code a bit more readable, and combining all of them would require multiple while loops within the function anyway – the resource type check could be pulled out, but then this would defeat the whole point of being able to switch the order of actions. Then again, combining them could make it easier to avoid having the same resource experience multiple actions – something that is probably undesirable.
Even more importantly, there is an issue here that all of these actions will start out with resource = 0, so the first resource will by default experience multiple actions wherever this is possible. Clearly this needs to be either randomised or done systematically in some way. I think that the best solution is to create a function to sample without replacement, put that function in utilities.c, then use it to select resources to be acted on – hence each resource will only experience one action. In the unlikely event that there are more actions than resources, it would be useful to somehow randomise which actions are taken – perhaps smaller action-specific functions should operate within the larger function ordering actions. In any case, the above castrate_resource function and the moving function should change.
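A sketch of the sampling-without-replacement utility described above, as a partial Fisher–Yates shuffle (the function name is my placeholder, and rand() stands in for R’s unif_rand() so that the example is self-contained):

```c
#include <assert.h>
#include <stdlib.h>

/* Fill 'sampled' with k distinct indices drawn without replacement from
 * 0..(n - 1) using a partial Fisher-Yates shuffle, so each resource can
 * be selected for at most one action.
 */
void sample_without_replacement(int *sampled, int k, int n){
    int i, j, tmp, *pool;
    pool = malloc(n * sizeof(int));
    for(i = 0; i < n; i++){
        pool[i] = i;
    }
    for(i = 0; i < k; i++){
        j = i + rand() % (n - i); /* Uniform draw over the unswapped tail */
        tmp = pool[i];
        pool[i] = pool[j];
        pool[j] = tmp;
        sampled[i] = pool[i];
    }
    free(pool);
}
```

Note that rand() % (n - i) carries a small modulo bias; in GMSE proper the draw would come from R’s RNG.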
Major restructure of actions successful
In working through the separate user actions, I found it challenging to code things such that correct resources were affected, but these actionable resources were not affected in any particular order (e.g., all scaring first, then killing, etc.). If there were a particular order, then it’s possible that users could systematically run out of resources to do things to (perhaps because they exhausted the resources on their land) and hence always move resources but not kill them for arbitrary reasons. To work around this, it is necessary to randomly select an action and perform it on an actionable resource. This is solved with a new function resource_actions, which initial testing finds to work as intended.
/* =============================================================================
* This function enacts all user actions in a random order
* resources: The resource array
* row: The row of the action array (should be 0)
* action: The action array
* can_act: Binary vector length res_number where 1 if resource actionable
* res_number: The number of rows in the resource array
* land_x: The x dimension of the landscape
* land_y: The y dimension of the landscape
* ========================================================================== */
void resource_actions(double **resources, int row, double **action,
int *can_act, int res_number, int land_x, int land_y){
int resource, xloc, yloc, i;
int util, u_loc, u_land;
int movem, castem, killem, feedem, helpem;
int *actions, total_actions, action_col, sample;
actions = malloc(5 * sizeof(int));
total_actions = 0;
for(i = 0; i < 5; i++){
action_col = i + 7;
actions[i] = action[row][action_col];
total_actions += action[row][action_col];
}
resource = 0;
while(resource < res_number && total_actions > 0){
if(can_act[resource] == 1){
do{ /* Sampling avoids having some actions always first */
sample = (int) floor( runif(0, 5) );
}while(sample == 5 || actions[sample] == 0);
/* Enact whichever action was randomly sampled */
switch(sample){
case 0: /* Move resource */
xloc = (int) floor( runif(0, land_x) );
yloc = (int) floor( runif(0, land_y) );
resources[resource][4] = xloc;
resources[resource][5] = yloc;
actions[0]--;
break;
case 1: /* Castrate resource */
resources[resource][9] = 0;
actions[1]--;
break;
case 2: /* Kill resource */
resources[resource][8] = 1;
actions[2]--;
break;
case 3: /* Feed resource (increase birth-rate)*/
resources[resource][9]++;
actions[3]--;
break;
case 4: /* Help resource (increase offspring number directly) */
resources[resource][10]++;
actions[4]--;
break;
default:
break;
}
total_actions--;
}
resource++;
}
free(actions);
}
The above function is called by do_actions within a switch statement. Recall that the resources array here is a temporary array that will later be used to assess the impact of actions with respect to user utility, to ultimately assign each agent in the genetic algorithm a fitness value. Some functions within resource.c are going to need to be used for this.
Since I have added multiple functions that allocate memory, now is probably a good time to check for any errors or memory leaks.
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
After running valgrind, all appears to be clear.
==5787== HEAP SUMMARY:
==5787== in use at exit: 95,076,495 bytes in 17,685 blocks
==5787== total heap usage: 12,369,394 allocs, 12,351,709 frees, 2,246,425,240 bytes allocated
==5787==
==5787== LEAK SUMMARY:
==5787== definitely lost: 0 bytes in 0 blocks
==5787== indirectly lost: 0 bytes in 0 blocks
==5787== possibly lost: 0 bytes in 0 blocks
==5787== still reachable: 95,076,495 bytes in 17,685 blocks
==5787== suppressed: 0 bytes in 0 blocks
==5787== Reachable blocks (those to which a pointer was found) are not shown.
==5787== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5787==
==5787== For counts of detected and suppressed errors, rerun with: -v
==5787== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The changes have now been pushed from a local branch to dev. The next thing to work on is to get key parameters from the temporary resource array that was changed (how many resources added, lost, moved, etc.). After this information is collected, another calc_payoffs can be run on the changed array to get an updated estimate of key values and compare before and after user actions.
I’ve decided that u_loc should actually refer to the actions of a particular agent being taken on their own land (u_loc = 1) or on all land (u_loc = 0). I’m debating whether a third option, u_loc = -1, should be available for forcing action to only occur on public land. I have also created a more readable structure for the do_actions function in game.c, which is going to be a bit on the long side, and will therefore need to be written in a way that is easy to follow, going through each action and performing the action through a series of nested while loops.
Cost issue: Issue #17
The action array has three columns, util, u_loc, and u_land, which represent the utility of a resource, whether or not actions on the resource are restricted to the user’s land, and whether or not the utility of the resource is dependent on it being on the user’s land. Currently, any positive values correspond to some cost in the cost array, which means that they are changed to zero when the cost is high. In essence, these three columns represent identity, while the remaining columns to the right represent actions. Ideally, we don’t want the users to be affecting, or the constrain_costs function changing, the util, u_loc, and u_land columns – only the ones to the right.
What needs to happen next is for the util, u_loc, and u_land columns to be untouchable by the genetic algorithm when the first column in the action array (agent) is negative – corresponding to direct actions of the user on resources or landscape layers. The remaining util, u_loc, and u_land values should be touchable. Hence, within constrain_costs, it is necessary to block adjustment to the relevant columns.
I appear to have found a fix for this, but I’m going to wait a day before I call the issue resolved. The fix basically involved telling the program not to touch columns below 7 if the first column of the row is negative.
start_col = 4;
if(population[row][0][agent] < 0){
start_col = 7;
}
The new variable start_col then defines the column to start on when considering whether or not to constrain costs. The above also needs to appear in the functions affecting population initialisation, mutation, and crossover of the genetic algorithm. I’m not sure if there is a more elegant or more readable solution, but the above appears to work fine. The appropriate columns are untouchable in rows where the first column is negative. The constraining part of the constrain_costs function also looks a bit messy.
while(tot_cost > budget){
do{ /* This do assures xpos never equals ROWS (unlikely) */
xpos = (int) floor( runif(0,ROWS) );
}while(xpos == ROWS);
if(population[xpos][0][agent] > 0){
do{
ypos = (int) floor( runif(4,COLS) );
}while(ypos == COLS);
}else{
do{
ypos = (int) floor( runif(7,COLS) );
}while(ypos == COLS);
}
if(population[xpos][ypos][agent] > 0){
population[xpos][ypos][agent]--;
tot_cost -= COST[xpos][ypos][layer];
}
}
I think the messiness is mostly caused by the do loops, which are there as a safety precaution against the unlikely event that the random number selected exactly equals ROWS or COLS, and hence causes a segfault.
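One way to tidy this would be a small helper that can never return the upper bound, replacing each do loop with a single call (a sketch only; the helper name is mine, and rand() stands in for R’s runif() so the example is self-contained):

```c
#include <assert.h>
#include <stdlib.h>

/* Draw an integer uniformly from [low, high - 1]; 'high' itself can never
 * be returned, so indexing an array of length 'high' with the result
 * cannot run off the end.
 */
int rand_int_below(int low, int high){
    return low + rand() % (high - low);
}
```

Then ypos = rand_int_below(4, COLS); would replace the corresponding do loop in constrain_costs.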
Next steps
With the util, u_loc, and u_land column situation seemingly resolved in the action array, I’ve done some initial testing again on move_resource. The move_resource function now appears to only move resources when it’s supposed to (i.e., when they’re on the land and the action array says to move them – assuming u_loc = 1). Now, before moving on, I should check to make sure that resources are actually being moved in the resources array. Once this is finished, I will double check Issue #17, then move on to a castrate_resource function.
I have re-arranged the fitness function structure to calculate
fitness payoffs more clearly. One top level
strategy_fitness
function will calculate all strategy
fitness in the genetic algorithm by looping through
calc_agent_fitness
for each agent in the population
(note: this is each agent in the genetic algorithm
population, from which the new strategy for one agent
in the bigger G-MSE will be selected). The
calc_agent_fitness
will itself call the
calc_payoffs
function (see below) to get a vector with the same number of rows as the ACTION
and COST
arrays. Each
element will eventually represent a change in the resource or landscape,
corresponding to some utility value which will make it possible to
calculate and compare overall fitness of the strategy.
void calc_payoffs(double ***population, int ROWS, double ***landscape,
                  double **resources, int res_number, int landowner,
                  int land_x, int land_y, double *payoff_vector, int agent){
    int xloc, yloc, yield_layer;
    int resource, row;
    int landscape_specific;
    int res_count;
    double cell_yield;

    for(row = 0; row < ROWS; row++){
        payoff_vector[row] = 0;
        if(population[row][0][agent] == -2){
            res_count = 0; /* Must be reset for each resource-valuing row */
            for(resource = 0; resource < res_number; resource++){
                if(population[row][1][agent] == resources[resource][1] &&
                   population[row][2][agent] == resources[resource][2] &&
                   population[row][3][agent] == resources[resource][3]
                ){
                    landscape_specific = (int) population[row][6][agent];
                    if(landscape_specific == 0){
                        res_count++;
                    }else{
                        xloc = (int) resources[resource][4];
                        yloc = (int) resources[resource][5];
                        if(landscape[xloc][yloc][2] == landowner){
                            res_count++;
                        }
                    }
                }
            }
            payoff_vector[row] += res_count;
        }
        if(population[row][0][agent] == -1){
            yield_layer = (int) population[row][1][agent];
            for(xloc = 0; xloc < land_x; xloc++){
                for(yloc = 0; yloc < land_y; yloc++){
                    if(landscape[xloc][yloc][2] == landowner){
                        cell_yield = landscape[xloc][yloc][yield_layer];
                        payoff_vector[row] += cell_yield;
                    }
                }
            }
        }
        if(population[row][0][agent] > -1){
            payoff_vector[row] = 0; /* Rows acting on other agents: no direct payoff yet */
        }
    }
}
The above will need to be called twice in calc_agent_fitness so that the difference between vector elements can be calculated.
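Once both payoff vectors exist, the comparison step might look something like the following. This is only a sketch; the utility weighting is my assumption about how the differences will eventually map to fitness, not code from the project:

```c
/* Utility-weighted change in payoffs between the two calc_payoffs calls;
 * the utility vector here is a hypothetical set of weights pulled from the
 * util column of the ACTION array. */
double fitness_from_payoffs(const double *before, const double *after,
                            const double *utility, int rows){
    int row;
    double fitness = 0.0;
    for(row = 0; row < rows; row++){
        fitness += utility[row] * (after[row] - before[row]);
    }
    return fitness;
}
```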
Use of memcpy to copy whole arrays
I have saved a bit of hassle by switching from multiple loops to a simple use of memcpy in C, which works as follows in the calc_agent_fitness function.
void calc_agent_fitness(double ***population, int ROWS, int COLS, int landowner,
                        double ***landscape, double **resources, int res_number,
                        int land_x, int land_y, int trait_number,
                        double *fitnesses){
    int agent, resource;
    double *payoff_vector;
    double **TEMP_RESOURCE;

    payoff_vector = malloc(ROWS * sizeof(double));
    TEMP_RESOURCE = malloc(res_number * sizeof(double *));
    for(resource = 0; resource < res_number; resource++){
        TEMP_RESOURCE[resource] = malloc(trait_number * sizeof(double));
        /* Copy row by row; a single memcpy of the top-level pointer would
         * only alias the resources array rather than duplicate its contents */
        memcpy(TEMP_RESOURCE[resource], resources[resource],
               trait_number * sizeof(double));
    }
    for(resource = 0; resource < 10; resource++){
        printf("%f\t%f\t || %f\t%f\n", resources[resource][0],
               resources[resource][1], TEMP_RESOURCE[resource][0],
               TEMP_RESOURCE[resource][1]);
    }
    /*
    calc_payoffs(population, ROWS, landscape, resources, res_number, landowner,
                 land_x, land_y, payoff_vector, agent);
    */
    for(resource = 0; resource < res_number; resource++){
        free(TEMP_RESOURCE[resource]);
    }
    free(TEMP_RESOURCE);
    free(payoff_vector);
}
The temporary array TEMP_RESOURCE needs to be made and remade so many times, and memcpy appears to be slightly faster than for loops. Nevertheless, I fear that use of memcpy might make the code less readable, and its performance advantage could depend on the hardware and compiler, which I don’t want to rely on. For now, I’m going to do this the more readable way.
Using a do_actions function
A do_actions function will enact the actions of one (usually out of 100) member of the population in the genetic algorithm. So the general procedure will be to do the following.
1. In strategy_fitness, loop through each member of the population (each agent ACTION array), running calc_agent_fitness.
2. In calc_agent_fitness, copy a dummy version of the resource, action, and landscape arrays.
3. In calc_agent_fitness, run calc_payoffs to get the payoffs before performing actions.
4. In calc_agent_fitness, run do_actions, which causes a change in the temporary resource array as a consequence of the temporary action array.
5. In calc_agent_fitness, re-run calc_payoffs to get new payoffs after having performed the actions.
This should give a fitness value that is then returned to strategy_fitness (might want to have calc_agent_fitness return an int), which will store all fitnesses in a vector after the above loops. More bells and whistles can be added later, but when this is finished, it should be a working genetic algorithm for modelling complex stake-holder behaviour.
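The before/act/after shape of this procedure can be sketched end-to-end with toy stand-ins. Here toy_calc_payoffs and toy_do_actions are hypothetical one-liners standing in for the real calc_payoffs and do_actions, just to show the structure of calc_agent_fitness:

```c
#include <string.h>

#define N_ROWS 4

/* Hypothetical stand-ins for the real calc_payoffs and do_actions */
void toy_calc_payoffs(const double *resources, double *payoffs){
    memcpy(payoffs, resources, N_ROWS * sizeof(double));
}
void toy_do_actions(double *resources, int agent){
    resources[0] += (double) agent; /* this agent's strategy alters resource 0 */
}

/* Steps 2-5 above: copy a dummy resource array, take payoffs before, enact
 * the actions on the copy, take payoffs after, and score the difference. */
double toy_agent_fitness(const double *resources, int agent){
    double scratch[N_ROWS], before[N_ROWS], after[N_ROWS];
    double fitness = 0.0;
    int row;
    memcpy(scratch, resources, N_ROWS * sizeof(double)); /* dummy copy */
    toy_calc_payoffs(scratch, before);
    toy_do_actions(scratch, agent);
    toy_calc_payoffs(scratch, after);
    for(row = 0; row < N_ROWS; row++){
        fitness += after[row] - before[row];
    }
    return fitness; /* the original resources array is untouched */
}
```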
Working through implementing the ideas for the fitness function from
yesterday. I’ve linked some key parameters
now through user
and ga
so they can be run in
strategy_fitness
(namely the resource number and agent_ID,
landowner
). Now it’s important to note that building local
resources has to be conditional – if an agent has no land, they cannot
do things on their land. And if their interests are global, this needs
to be considered too. I think a collection of small functions called
according to parameter options is needed, and landscape specific changes
are really a subset of general actions, so maybe there’s a better way to
do this. Really though, it would be nice to have a way for the cost of
performing actions on land owned versus land
not owned to be different. Then again, it would be nice to have
different utilities for resources on and off your land, but this could
get very complex very quickly (might, however, be interesting in that
maybe a farmer values crops positively on their own land but negatively
on the land of other farmers). I think it will also shake out when the manager actions affecting costs come into play – a manager will naturally raise the cost of an action like shooting if it somehow becomes untied from location in a way that affects other stake-holders or management decisions. The bottom line is that I think it’s okay for now to run the ga with the constraint that if a resource/landscape utility value is tied to land ownership, then the actions should also be tied to owned landscape cells. If not, then actions should happen either on all cells or only on public land.
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
                      int ROWS, int COLS, double ***landscape,
                      double **resources, double **agent_array,
                      int res_number, int landowner){
    int xloc, yloc;
    int agent, resource;
    int res_on_land;
    double **RESOURCE_LOCAL;
    /* Need something here -- check if:
     *
     * 1) agent has landscape-specific utility
     * 2) agent actually owns some land
     *
     * If neither are true, then RESOURCE_LOCAL should not be built, and actions
     * of stake-holders should be interpreted accordingly (e.g., agents could be
     * allowed to do some actions on public land, or not at all -- perhaps an
     * option added to paras?
     */
    res_on_land = 0; /* Make a sub-function returning an int for this */
    for(resource = 0; resource < res_number; resource++){
        xloc = (int) resources[resource][4];
        yloc = (int) resources[resource][5];
        if(landscape[xloc][yloc][2] == landowner){
            res_on_land++;
        }
    }
    for(agent = 0; agent < pop_size; agent++){
        fitnesses[agent] = population[0][12][agent];
    }
}
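The comment above flags a sub-function returning an int for the res_on_land count. A minimal sketch of what that might look like (the function name is hypothetical; columns 4-5 hold the x, y locations and landscape layer 2 holds the owner ID, as in the code above):

```c
/* Hypothetical sub-function: count how many resources sit on cells owned by
 * `landowner`. Columns 4-5 of the resources array are x, y locations and
 * layer 2 of the landscape array records cell ownership. */
int count_res_on_land(double **resources, int res_number,
                      double ***landscape, int landowner){
    int resource, res_on_land = 0;
    for(resource = 0; resource < res_number; resource++){
        int xloc = (int) resources[resource][4];
        int yloc = (int) resources[resource][5];
        if((int) landscape[xloc][yloc][2] == landowner){
            res_on_land++;
        }
    }
    return res_on_land;
}
```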
Definitions of the COST and ACTION arrays
A quick reminder to myself of what’s going on in the COST and ACTION arrays, as this is most relevant for the fitness functions. In the example ACTION array below there are two agents (very simple – just a manager and a stake-holder) and one resource. The table below is one layer of a 3D two-layer array, where each layer identifies actions for one unique agent.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The way to read the above in the code is as follows. All of the actions are for the one stake-holder (i.e., assume we’re looking at the second layer in the 3D array). The first row of actions is special because it represents the degree to which a stake-holder themselves values a resource of type1 = 1, type2 = 0, and type3 = 0, and the actions they will take toward it. Value is indicated by util; whether or not that value is ‘visible’ to the agent (which may be implemented in different ways, but for now means within some distance of their location) is indicated by u_loc; and whether or not that value is tied to being on land that the stake-holder owns is indicated by u_land. Actions are indicated by the remaining columns, and if u_land = TRUE, then we assume that actions are restricted to resources on the agent’s owned land – though I am tempted to change u_loc to mean whether or not actions are (1) or are not (0) restricted in this way.
So the first row where agent = -2
represents values and actions of the focal agent (indicated by this
layer of the 3D array) for a particular type of resource (note that more
rows where agent = -2
would be needed for more resources).
The second row where agent = -1
refers
specifically to the values and actions of a particular layer of the
landscape, which is indicated in the type1
column (making
type2
and type3
effectively useless in this
row, for now). Right now this is type1 = 1
, which is the
index (for C – R is of course 2) where the values of crop production are
stored; I’m not sure if it’s worth adding more rows for additional
layers later, but this framework at least allows the possibility for
other landscape properties to be valued. Of course, whenever
agent = -1
and we’re looking at the landscape, actions such
as movem
and castem
will need to have
different meanings – or no meaning, but feedem
and
killem
could be fairly straightforward. The
third row is action taken to the agent whose ID is
1
with reference to resource type1 = 1
,
type2 = 0
, and type3 = 0
. Any nonzero values
here in util
, u_loc
, and u_land
cause the focal agent to change the value of another agent,
while any nonzero values in movem
, castem
, …,
bankem
cause the focal agent to change the cost of
another agent taking a particular action (i.e., it affects the other
agent’s layer at agent = -2
), increasing it or decreasing
it (NOTE: I just noticed that I really need to set up these
tables with values for type1 = -1
or something here – to
allow agents to affect other agents costs of actions affecting the
landscape). So in theory the agent in the above table could
change the manager’s values (e.g., modelling lobbying) or the cost of
them performing actions (e.g., modelling something like protesting or
lobbying third parties?). They could also in theory change their own values and costs of performing actions, though I think this should
almost always be prohibited by making the cost of doing so effectively
infinite, which brings me to the cost array.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 101 | 101 | 101 | 3 | 8 | 4 | 4 | 2 | 1 |
-1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
1 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
2 | 1 | 0 | 0 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
Assume that each agent gets a total budget of 100
. In
that case, all of the table elements equal to 101 are effectively off
limits because they are too costly (I might actually want to make them
1000 just in case some agent gets crafty and tries to lower another
agent’s cost). So in the above, the stake-holder can only take six
possible actions, all of which directly affect resources and not other
stake-holders’ values or costs. By setting it up this way, the genetic
algorithm will converge on the best set of these actions and the
ACTION
array above will never change values where
COST
elements are 101
. Later, we might
decrease the value of util
in row 3 to allow the
stake-holder to lobby the manager. Managers (different layer) would
simply have cost arrays that have lower values allowing them to affect
actions of stake-holders. With these tables now (I hope) completely clear, the code writing itself should become much smoother – I’ve anchored to the title immediately above because I know it will be necessary to come back to these two difficult-to-remember tables.
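The ‘six possible actions’ point can be checked mechanically: any COST cell above the budget is locked. A small sketch of that affordability check (the function name is mine, not from the code base):

```c
/* Count the columns of a COST row whose single-increment cost fits within the
 * budget; with a budget of 100, every 101 entry is permanently off limits. */
int count_usable_actions(const double *cost_row, int n_cols, double budget){
    int col, usable = 0;
    for(col = 0; col < n_cols; col++){
        if(cost_row[col] <= budget){
            usable++;
        }
    }
    return usable;
}
```

With the stake-holder’s row {101, 101, 101, 3, 8, 4, 4, 2, 1} (columns util through bankem) and a budget of 100, this returns 6, matching the six actions noted above.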
Objectives for fitness function
Eventually we’ll want to find some kind of multi-objective fitness function (Lee 2012), especially for the managers. For now I’m going to simplify and just make fitness a simple matter of the deviation of abundance from utility – where for a landscape, ‘abundance’ is replaced by owned-cell crop yield – then sum over all resources and landscape layers. For now, utilities can be set unreasonably high – so conservationists might want 1000000 geese and farmers might want 1000000 in yield, much higher than is possible, so that more is always better. The fitness function will then assign fitnesses minimising the deviation from utility somehow. The deviation-from-utility approach will eventually allow managers to have more reasonable goals – allowing the genetic algorithm to find more flexible and dynamic strategies.
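In symbols, the simplification above might read as follows (a sketch; the notation is mine, not from the code):

```latex
% Fitness of a strategy s: u_i is the (possibly unreasonably high) utility
% target for resource or landscape layer i, and n_i(s) is the abundance (or
% owned-cell crop yield) that results from the strategy. The genetic
% algorithm then favours strategies with the highest F(s).
F(s) = -\sum_{i} \left| n_i(s) - u_i \right|
```

With targets set far above what is achievable, minimising the deviation reduces to ‘more is always better’, as described above.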
Quick check on neural networks
I want to make sure I understand neural networks well enough to be
able to explain why I’m not using them (yet). Daniel Shiffman’s book chapter
in The Nature of Code helps out here. Because the simulated
agents aren’t trying to recognise a particular pattern, I don’t think a
neural network is how I would describe the COST
and
ACTION
arrays – nor would a more explicit network structure
going from input (e.g., costs and resources abundances) to output
(actions) be terribly useful for current purposes. Nevertheless,
a neural network will be useful if combined with empirical data
to mimic human behaviour. For example, if we want to make an agent that predicts stake-holders’ actions based off of empirically collected data from behavioural games, then a neural network could be fed input and then calibrated to the ‘correct’ behaviour observed in humans through correlations with specific conditions. This would, in effect, create an artificial bot that does what a human would do based on correlating situations with actions. I’m not sure how much data it would take to parameterise effectively, but I suspect a lot – we would probably need to have dozens of people act as stake-holders or managers within G-MSE and collect their actions.
Back to fitness functions
I now need to complete the genetic algorithm with a useful fitness function. The fitness function should calculate the change in resources caused by a stake-holder’s actions, then match them to utility. Note that this doesn’t require all resources to be calculated to figure out what total utility is before and after an agent has acted – only how the agent’s actions have increased or decreased resources, and the weight (utility) assigned to each.
A starting point is to do some clean-up. There are several values
hard coded in to the ga()
function that need to be assigned
variables to be set in gmse.R
. Once I have the
ga()
function a bit more readable, then I can move on to
the specifics of the fitness function.
Having done the clean-up, I now note again that the utility from an action isn’t always direct – e.g., killing one resource might increase crop yield. Somehow, the action of removing an individual from the population must therefore be recognised by the agent as increasing yield by a particular amount. There are a few ways that I could think to do this:
Have stake-holders correlate resources with crop production on cells. This would be the most complex way of doing things – probably the most flexible too, but I’m not sure if it would actually be the most realistic. Not for farmers watching their crops being eaten at least; the cause and effect is something I think stake-holders could probably observe pretty clearly.
Give stake-holders complete access to the resource array and have them figure out exactly how much damage their land is going to sustain by seeing the number of resources on it and the amount of damage that each resource causes per cell (column 14 in C, 15 in R). Maybe this is the best starting point, though it does seem a bit too exact; no farmer is going to know exactly how many animals are on their farm and exactly how much damage they will do. Still, perhaps we assume this and add in error later.
Give stake-holders access to the resource array column in which
crop damage is specified, then have them associate mean damage per cell
with each resource type. Do not, however, give them access to resource
locations, and require that they instead estimate the density of
resources on their landscape in the same way that managers might in an
observation model type 0
(i.e., look at a few cells on
their property, then infer the total number of resources and how much
damage they’ll do). I like this because it seems reasonable that a
farmer could know roughly how much damage an animal does to their crop
in the area, but probably doesn’t have the time or ability to sample
every corner of their land to find exactly the number of animals on it.
It also doesn’t give stake-holders a superior ability to estimate local
population size.
Ignore the resource array – just have the stake-holders act some
way (e.g., invest in killing stuff on the land or feeding stuff), then
run a sub-routine mimicking the population landscape interaction (e.g.,
call res_landscape_interaction
from resource.c
directly). If some resources are created or destroyed, then this would
need to be accounted for by making a dummy resource array. Perhaps the
following:
1. Identify resources in the RESOURCE array on the stake-holder’s land.
2. Make a new array RESOURCE_LOCAL with only the resources on the stake-holder’s land.
3. Adjust the RESOURCE_LOCAL array in relevant columns (e.g., birth, remove_pr, etc.).
4. Run res_add and res_remove to get the number of individuals being added or removed.
5. Run the res_landscape_interaction function to find the effect of the added and removed individuals on the landscape.

Although I initially thought option 3 was pretty good, I’m now
leaning toward option 4 as being the best one to try out first; it seems
more flexible. Eventually, of course, we can specify options for
different ways of calculating fitness, but I think it’s best to pick one
option first and go with it. I think option 4 will be slightly slower
than option 3, but I’m curious as to exactly how much slower. Hence,
I’ll try number 4 first, then potentially move to 3 as a default if it’s
too clunky. I have to keep in mind as well that managers are probably
going to need to run the user
functions to set policy
eventually (unless I can find a work-around that gets managers to
anticipate stake-holder actions in making policy), and this will likely
slow things down considerably. Time isn’t much of an issue now, but I want to keep things as efficient as possible. Also important, I need to make sure that there is some if statement that only deals with the landscape if the stake-holder owns land. If they don’t own any, then their actions need to be restricted accordingly – maybe to lobbying the manager or only doing things on public land? As a next step, I will attempt to write the code for option 4 above, perhaps excluding (for now) stake-holders that own no land.
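Step 2 of option 4 – building RESOURCE_LOCAL – might look roughly like the following. This is a sketch with fixed toy sizes: the owner grid stands in for layer 2 of the landscape array, columns 4-5 hold the x, y locations as in the notes above, and the function name is hypothetical:

```c
#include <string.h>

#define TRAITS 6  /* toy trait count; the real resource array has more columns */

/* Copy into `local` only the resources sitting on cells owned by `landowner`;
 * returns the number of rows in RESOURCE_LOCAL. The 2x2 `owner` grid is a
 * stand-in for landscape[x][y][2]. */
int build_resource_local(double resources[][TRAITS], int res_number,
                         int owner[2][2], int landowner,
                         double local[][TRAITS]){
    int resource, count = 0;
    for(resource = 0; resource < res_number; resource++){
        int xloc = (int) resources[resource][4];
        int yloc = (int) resources[resource][5];
        if(owner[xloc][yloc] == landowner){
            memcpy(local[count], resources[resource],
                   TRAITS * sizeof(double));
            count++;
        }
    }
    return count;
}
```

Steps 3-5 would then adjust and act on the `local` rows only, leaving the full resource array untouched until the strategy is actually chosen.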
More thoughts on genetic algorithms
I’ve come back to thinking more about how to write the fitness
function of the G-MSE genetic algorithm, and about the relationship
between evolution and individual learning, more generally. Watson and Szathmary (2016) argue that learning
and (adaptive) evolution are formally linked. In practice, they note
that ‘’In a good model space, desirable future behaviours should be
similar (nearby) to behaviours that were useful in the past. For
example, perhaps ’eating apples’ should be close to ‘eating pears’ but
far from ‘eating red things’.’’ Watson and
Szathmary (2016) also note that ‘’The representation of
associations or correlations has the same fundamental relationship to
learning as transistors have to electronics or logic gates to
computation (and synapses to neural networks). Although mechanisms to
learn a single correlation between two features can be trivial, these
are also sufficient, when built up in appropriate networks, to learn
arbitrarily complex functions’’. A potentially confusing aspect of
this with respect to G-MSE is that we have two scales of time of
interest. The first scale is within a single time step (i.e., inside the
user model), and the second scale is over multiple time steps
(population model \(\to\) observation
model \(\to\) management model \(\to\) user model). Most of the time, when
we focus on learning, we’re talking about the program learning to
make a decision within a time step rather than stake-holders
learning to make decisions across time steps. I’m not opposed to
modelling the latter, but the former needs to come first in software
development. So when we model learning through the genetic algorithm,
it’s the iterative processes in ga()
– there is less worry,
I think, about the correlations that Watson and
Szathmary (2016) describe; rather, the associations are explicit.
A value in the ACTION
array is associated
with a particular outcome that can be tied to stake-holder interests.
More abstract learning over G-MSE generations can be added in later with
estimates of correlations between actions and outcomes.
Major updates merged to master
I have merged all of the recent updates on the genetic algorithm to
the master branch. We now have a bug-free G-MSE model
v0.0.8
that has all of the necessary framework of proper
machine learning once a fitness function is written that links costs and
utilities of each agent to agent actions. There are a few things that
will need to be updated thereafter, which I am putting off until later
when the full genetic algorithm is complete and I am sure how it should
be called by user.c
. As of now, the function runs only once
for the first agent in the AGENT
array. Eventually, the
function ga
will need to be looped within
user.c
for each stake-holder (and called in
manager.c
, not yet written, for the manager). I also still
need to pass the parameter vector to ga
with values for the
genetic algorithm which are currently hard coded into
ga
.
After some additional debugging of the find_descending_order function in utilities.c (which was returning the incorrect index and therefore not selecting for high-fitness strategies), I have a working genetic algorithm with a very simple fitness function.
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
                      int ROWS, int COLS, double ***landscape,
                      double **resources, double **agent_array){
    int agent;
    for(agent = 0; agent < pop_size; agent++){
        fitnesses[agent] = population[0][12][agent];
    }
}
Essentially, the above function checks row zero and column 12 in an
agent’s action array, and defines fitness as whatever value is in this
array element. Fitness cannot increase indefinitely because of the cost
constraints from the COST
array. Hence the genetic
algorithm should increase fitness up to the point where it can’t any
longer because it is constrained by costs. We can see this over 20
generations of the genetic algorithm (note, this is different from simulation time steps – each simulated time step of G-MSE includes, in this example, a genetic algorithm in which strategies update over 20 generations). The plot below therefore represents an agent ‘’evolving’’ the best strategy for one G-MSE time step.
The ACTION
array for the zero agent (the only one run
for a genetic algorithm in test simulations) showed a corresponding
change in each simulated G-MSE time step, with agents having the actions
below (or very similar actions).
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] -2 1 0 0 0 0 0 0 0 0 0 0 12
[2,] -1 1 0 0 0 0 0 0 0 0 0 0 0
[3,] 1 1 0 0 0 0 0 0 0 0 0 0 0
[4,] 2 1 0 0 0 0 0 0 0 0 0 0 0
In the above, the agent’s only action is to invest all of their
energy to doing the action in ACTION[1,13]
(bankem
), as predicted given the simple fitness function
assigned a priori. Hence, with a working genetic algorithm for
agents, what is necessary now is to clarify the fitness function to
reflect agent utilities. Some clean-up is also necessary to call genetic
algorithm specific parameters from the main gmse.R
file –
right now there are some hard-coded values in the ga
function, and user.c
doesn’t loop through multiple agents
(or check and use only stake-holders).
Part of the problem from last Friday was
that the arrays fitnesses
and winners
were
uninitialised in the genetic algorithm before being used. Fixing this
and running Valgrind returns no errors and no memory leaks.
==32451==
==32451== HEAP SUMMARY:
==32451== in use at exit: 89,001,346 bytes in 13,024 blocks
==32451== total heap usage: 5,218,764 allocs, 5,205,740 frees, 621,820,827 bytes allocated
==32451==
==32451== LEAK SUMMARY:
==32451== definitely lost: 0 bytes in 0 blocks
==32451== indirectly lost: 0 bytes in 0 blocks
==32451== possibly lost: 0 bytes in 0 blocks
==32451== still reachable: 89,001,346 bytes in 13,024 blocks
==32451== suppressed: 0 bytes in 0 blocks
==32451== Reachable blocks (those to which a pointer was found) are not shown.
==32451== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==32451==
==32451== For counts of detected and suppressed errors, rerun with: -v
==32451== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So we have an error that Valgrind can’t figure out for some reason.
It’s worth noting that the crash never occurs on the first simulation;
it always takes a couple re-runs of gmse
in succession for it
to crash. It could just be overloading Rstudio, but I want to keep
pushing to figure it out. I’m now going to test by simply running 100
times in succession.
test <- NULL;
for(i in 1:100){
test <- gmse( observe_type = 0,
agent_view = 20,
res_death_K = 400,
plotting = FALSE,
hunt = FALSE,
start_hunting = 95,
fixed_observe = 1,
times_observe = 1,
land_dim_1 = 100,
land_dim_2 = 100,
res_consume = 0.5
)
print(i);
}
Unfortunately, the above crashed in the first loop upon running. Then
in the second attempt, it crashed on the 11th loop. When I stop running
the genetic algorithm, it never crashes though, so I can at least start
to isolate the problem. I have a feeling it’s in the
utilities.c file.
Except that now the crash occurs when the ga
is
commented out. The issue actually appears to be somewhere else in
user.c
because I can run the above for
i in 1:1000
and not get an error if I don’t call
user.c
from R at all. Now I need to try to really examine
user.c
and see what’s happening. I’m going to start by not
calling send_agents_home
or count_cell_yield
within the user function (while still calling the genetic algorithm) to
see if a crash occurs.
The problem, as it turns out, was in the function
send_agents_home
. After much hassle and multiple times
running Valgrind, I found that initialising agent_xloc
or
agent_yloc
to the agent’s array values would occasionally
produce a segfault because the values were not always within the
landscape. This was corrected by initialising these values to zero
before ‘’sending agents home’’, but I’m not sure why it arose at all in
the first place. Where the agents are has never been a focus, except for
when manager agents (type1 = 0
) are observing (the code for
which is stable). To solve the problem more flexibly, I’ve replaced a
straight assignment with the below code.
agent_xloc = (int) agent_array[agent][4];
agent_yloc = (int) agent_array[agent][5];
if(agent_xloc < 0 || agent_xloc >= xdim){
    agent_xloc = 0;
}
if(agent_yloc < 0 || agent_yloc >= ydim){
    agent_yloc = 0;
}
Now in the very rare cases where agent locations are off the map (and
it might be worth figuring out why – perhaps they’re getting moved
somewhere arbitrarily and not moved back?), they will be placed on a
cell that they own. This was the point of the function anyway, so it’s
not a huge deal. It’s still a bit odd though, and I’m not sure why it
was affecting only about one in thirty simulations. I’ll consider
Issue #16: Potential bug: In user.c
closed
now, and move on to the genetic algorithm again.
Placing tournament winners into a new array
At the end of the tournament function, we have a vector of winners
with high fitness. These winners represent the array layers that need to
comprise the new 3D array, which will be the start of the next
generation of the genetic algorithm. Hence the need for a
place_winners
function to make a new
POPULATION
array to replace the old one. This could be done
by individually replacing elements of a NEW_POPULATION
into
the old array POPULATION
, but a handy swapping of pointers
can do this without the multiple loops.
/* =============================================================================
 * Swap pointers to rewrite ARRAY_B into ARRAY_A for an array of any dimension
 * ========================================================================== */
void swap_arrays(void **ARRAY_A, void **ARRAY_B){
    void *TEMP_ARRAY;
    TEMP_ARRAY = *ARRAY_A;
    *ARRAY_A   = *ARRAY_B;
    *ARRAY_B   = TEMP_ARRAY;
}
The above function works for 2D and 3D arrays by running the below.
swap_arrays((void*)&MAT1, (void*)&MAT2);
We can see the arrays swapped in the output (the first 3 columns before the “|” partition denotes layer 1, and after denotes layer 2, so the array is \(3 \times 3 \times 2\) dimensions).
=========================================
---------------- Pre-swap MAT 1 ------------
0 0 1 | 6 5 0
2 8 6 | 9 2 4
1 2 1 | 9 2 5
---------------- Pre-swap MAT 2 ------------
1 4 3 | 8 8 6
1 5 8 | 3 2 4
3 9 2 | 8 3 8
---------------- Post-swap MAT 1 ------------
1 4 3 | 8 8 6
1 5 8 | 3 2 4
3 9 2 | 8 3 8
---------------- Post-swap MAT 2 ------------
0 0 1 | 6 5 0
2 8 6 | 9 2 4
1 2 1 | 9 2 5
Since this works, we can use swap_arrays
to write a
concise function for placing the new individuals.
Potential bug: In user.c
I can’t tell if I’m just overloading R by running the simulation too many times too quickly (clicking too fast), or if there’s actually a bug here. But when I comment out the below lines of code in the send_agents_home function of user.c, things seem fine.
while(agent_ID != landowner){
    do{
        agent_xloc = (int) floor( runif(0, xdim) );
    }while(agent_xloc == xdim);
    do{
        agent_yloc = (int) floor( runif(0, ydim) );
    }while(agent_yloc == ydim);
    landowner = (int) landscape[agent_xloc][agent_yloc][layer];
}
When I re-run the code quickly in succession, the above (I think) will very rarely crash the G-MSE program. I can’t figure out why yet. It’s logged as an issue now. Valgrind report below.
==15500== Invalid read of size 8
==15500== at 0xC298756: is_number_on_landscape (user.c:19)
==15500== by 0xC298811: send_agents_home (user.c:50)
==15500== by 0xC299166: user (user.c:303)
Valgrind doesn’t appear to like the comparing of a landscape value
(double
) with an int
, so I’m going to change
this now. So the function is_number_on_landscape
now
defines land_num = (int) landscape[xval][yval][layer];
instead of calling the landscape value directly. I have also gotten rid
of the sub-function is_number_on_landscape
, but the crash
still sometimes happens. It’s possible that this was actually two bugs
though, one affecting the ga
. From Valgrind below now
(invalid read is gone).
==16758== Conditional jump or move depends on uninitialised value(s)
==16758== at 0xC29819E: sort_vector_by (utilities.c:63)
==16758== by 0xC29A1E1: tournament (game.c:280)
==16758== by 0xC29A66D: ga (game.c:415)
==16758== by 0xC29914F: user (user.c:294)
==16758== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== Uninitialised value was created by a heap allocation
==16758== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16758== by 0xC29A583: ga (game.c:390)
==16758== by 0xC29914F: user (user.c:294)
==16758== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==16758== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==16758==
This all goes back to sort_vector_by, which I should probably look at and potentially rewrite. The sort function is called by the tournament function.
Some progress on the genetic algorithm
Despite this, there has been progress on the genetic algorithm. Enough that I want to merge the local branch to dev and rev, but not master yet. The place_winners function appears to work fine.
void place_winners(double ****population, int *winners, int pop_size, int ROWS,
                   int COLS){
    int i, row, col, winner;
    double a_value;
    double ***NEW_POP;
    NEW_POP = malloc(ROWS * sizeof(double **));
    for(row = 0; row < ROWS; row++){
        NEW_POP[row] = malloc(COLS * sizeof(double *));
        for(col = 0; col < COLS; col++){
            NEW_POP[row][col] = malloc(pop_size * sizeof(double));
        }
    }
    for(i = 0; i < pop_size; i++){
        winner = winners[i];
        for(row = 0; row < ROWS; row++){
            for(col = 0; col < COLS; col++){
                a_value = (*population)[row][col][winner];
                NEW_POP[row][col][i] = a_value;
            }
        }
    }
    swap_arrays((void*)&(*population), (void*)&NEW_POP);
    for(row = 0; row < ROWS; row++){
        for(col = 0; col < COLS; col++){
            free(NEW_POP[row][col]);
        }
        free(NEW_POP[row]);
    }
    free(NEW_POP);
}
Once I get the bugs worked out of it, the genetic algorithm should start to work. Then a fitness function needs to be made that is more realistic. Fortunately, all of the bugs now appear to be isolated in the genetic algorithm, but I might need to keep testing to be sure.
Initialise new function to constrain costs in the genetic algorithm
A new function has been written to constrain costs in the genetic algorithm when they go over budget as a consequence of crossover and mutation.
/* =============================================================================
 * This function will ensure that the actions of individuals in the population
 * are within the cost budget after crossover and mutation has taken place
 * Necessary variable inputs include:
 *     population: array of the population that is made (malloc needed earlier)
 *     COST: A 3D array of costs of performing actions
 *     layer: The 'z' layer of the COST and ACTION arrays to be initialised
 *     pop_size: The size of the total population (layers to population)
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     budget: The budget that random agents have to work with
 * ========================================================================== */
void constrain_costs(double ***population, double ***COST, int layer,
                     int pop_size, int ROWS, int COLS, double budget){
    int xpos, ypos;
    int agent, row, col;
    double tot_cost, action_val, action_cost;
    for(agent = 0; agent < pop_size; agent++){
        tot_cost = 0;
        for(row = 0; row < ROWS; row++){
            for(col = 4; col < COLS; col++){
                action_val  = population[row][col][agent];
                action_cost = COST[row][col][layer];
                tot_cost   += (action_val * action_cost);
            }
        }
        while(tot_cost > budget){
            do{ /* This do assures xpos never equals ROWS (unlikely) */
                xpos = floor( runif(0,ROWS) );
            }while(xpos == ROWS);
            do{
                ypos = floor( runif(4,COLS) );
            }while(ypos == COLS);
            if(population[xpos][ypos][agent] > 0){
                population[xpos][ypos][agent]--;
                tot_cost -= COST[xpos][ypos][layer];
            }
        }
    }
}
The function has been tested, and works as intended. When the sum of the action elements of an individual multiplied by the cost of each action (tot_cost in the above function) is higher than the allowable budget, actions are randomly removed until the total cost is at or under budget. Note that lower-cost actions are not removed preferentially, so as not to bias evolution toward low-cost actions.
Initial thoughts on the fitness function
Having now completed functions modelling crossover, mutation, and cost-constraints in C, there are two functions left in the genetic algorithm that are needed. The second is a tournament function modelling selection – this will be relatively easy to code once I have individual fitnesses in the population. The first is the fitness function, which will be very complex – so much so that I’m planning to write a very quick simplified version of the fitness function before expanding it out to deal with more difficult questions. What has to happen with the fitness function is that each simulated individual in the population has to use whatever information is available to an agent (e.g., manager observations, anecdotal surveys, past decisions of other agents, landscape status, etc.) to predict what the future status of the resources and landscape will be, then assign a fitness to that prediction. Utilities of each resource are in the (truncated) action and cost arrays, as below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 1 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Above we have the utilities of each resource type (type1), but I’m just realising that the utilities of the landscape are absent. There isn’t really anything in the above table, for example, to say that a stake-holder assigns a utility to the value of a given landscape cell. But this needs to be the case if we want something like crop yield (perhaps I should more generally be calling it “food security”) to be modelled as part of the landscape. I think the best solution for this is to include the landscape in type1 as a negative integer. The landscape layer identifying crop yield is 1 in C (2 in R) – if I placed a new row of type1 = -1 in the COST and ACTION arrays for each agent, then the negative could simply indicate that we are looking at the LANDSCAPE array instead of the RESOURCE array. I also don’t think more than one layer of landscape will ever be used, so I’m not seeing a confusing mess of negative and positive types. The corresponding action columns (movem, castem, etc.) could have interpretations for landscape; some of them, such as feedem, are obvious, while others could just be ignored because they don’t really apply. In the end the arrays would then look something like the below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem | bankem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
-1 | -1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 1 |
1 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
Maybe not the most elegant solution, but it keeps everything on a single array and the interpretations of types are fairly straightforward. I’ll implement this next as new array initialisations, then build a prototype fitness function that attempts to maximise crop yield through feedem (not sure if this should actually be an action in the model).
Manager summary missing
In working with the fitness function in the user model, I realised that the manager information was obviously missing, so this will have to be added in later (it should be easy to do so). One reason for doing the user model first is that the manager model (particularly the genetic algorithm) is going to get much more complicated. Nevertheless, the manager model’s use of the genetic algorithm necessitates that the genetic algorithm be able to use both the OBSERVATION array and the manager’s OBS_SUMMARY of the array. Different users will have access to different information, but I’m starting small to make sure everything is built clearly.
/* =============================================================================
 * This is a preliminary function that checks the fitness of each agent -- as of
 * now, fitness is just defined by how much action is placed into bankem (last
 * column). Things will get much more complex in a bit, but there needs to be
 * some sort of framework in place to first check to see that everything else is
 * working so that I can isolate the fitness function's effect later.
 *     fitnesses: Array to order fitnesses of the agents in the population
 *     population: array of the population that is made (malloc needed earlier)
 *     pop_size: The size of the total population (layers to population)
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     landscape: The landscape array
 *     resources: The resource array
 *     agent_array: The agent array
 * ========================================================================== */
void strategy_fitness(double *fitnesses, double ***population, int pop_size,
                      int ROWS, int COLS, double ***landscape,
                      double **resources, double **agent_array){
    int agent;
    for(agent = 0; agent < pop_size; agent++){
        fitnesses[agent] += population[0][12][agent];
    }
}
The above function therefore simply returns the last column (bankem) as the individual’s fitness. I’m now going to maximise this using a tournament approach to selection, as suggested by Hamblin (2013).
Functioning tournament function
After some toiling with swaps and pointers, I’ve managed to come up with a somewhat concise and clear function that randomly samples sampleK individuals from the population and selects the chooseK individuals with the highest fitness.
/* =============================================================================
 * This function takes an array of fitnesses and returns an equal size array of
 * indices, the values of which will define which new individuals will make it
 * into the next population array, and in what proportions.
 *     fitnesses: Array to order fitnesses of the agents in the population
 *     winners: Array of the winners of the tournament
 *     pop_size: The size of the total population (layers to population)
 *     sampleK: The size of the subset of fitnesses sampled to compete
 *     chooseK: The number of individuals selected from the sample
 * ========================================================================== */
void tournament(double *fitnesses, int *winners, int pop_size,
                int sampleK, int chooseK){
    int samp;
    int *samples;
    int placed;
    int rand_samp;
    double *samp_fit;
    samples  = malloc(sampleK * sizeof(int));
    samp_fit = malloc(sampleK * sizeof(double));
    placed = 0;
    while(placed < pop_size){ /* Note sampling is done with replacement */
        for(samp = 0; samp < sampleK; samp++){
            do{
                rand_samp      = floor( runif(0, pop_size) );
                samples[samp]  = rand_samp;
                samp_fit[samp] = fitnesses[rand_samp];
            }while(rand_samp == pop_size);
        }
        sort_vector_by(samples, samp_fit, sampleK);
        if( (chooseK + placed) >= pop_size){
            chooseK = pop_size - placed;
        }
        samp = 0;
        while(samp < chooseK && placed < pop_size){
            winners[placed] = samples[samp];
            placed++;
            samp++;
        }
    }
    free(samp_fit);
    free(samples);
}
Note that in writing the above, I had to write a simple sort (sort_vector_by) and a swap function in utilities.c. I also need to write some error messages into the above (or in ga itself); chooseK cannot be larger than sampleK. Next up will be to iterate the ga functions and make sure that fitnesses asymptote to high values. The framework for the genetic algorithm will then be in place, and it will be time to switch to the complex part of more interesting fitness functions.
Initialisation of action populations
A new function has been written to initialise a population of agents, duplicated from a single agent in the larger G-MSE model, to be used for the genetic algorithm. Initial testing of this function shows that it returns appropriate arrays, in which actions are selected appropriately based on their cost values in the COST array.
/* =============================================================================
 * This function will initialise a population from the ACTION and COST arrays, a
 * particular focal agent, and specification of how many times an agent should
 * be exactly replicated versus how many times random values should be used.
 * Necessary variable inputs include:
 *     ACTION: A 3D array of action values
 *     COST: A 3D array of costs of performing actions
 *     layer: The 'z' layer of the COST and ACTION arrays to be initialised
 *     pop_size: The size of the total population (layers to population)
 *     carbon_copies: The number of identical agents used as seeds
 *     budget: The budget that random agents have to work with
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     population: array of the population that is made (malloc needed earlier)
 * ========================================================================== */
void initialise_pop(double ***ACTION, double ***COST, int layer, int pop_size,
                    int budget, int carbon_copies, int ROWS, int COLS,
                    double ***population){
    int xpos, ypos;
    int agent;
    int row, col;
    double lowest_cost;
    double budget_count;
    /* First read in pop_size copies of the ACTION layer of interest */
    for(agent = 0; agent < pop_size; agent++){
        for(row = 0; row < ROWS; row++){
            population[row][0][agent] = ACTION[row][0][layer];
            population[row][1][agent] = ACTION[row][1][layer];
            population[row][2][agent] = ACTION[row][2][layer];
            population[row][3][agent] = ACTION[row][3][layer];
            if(agent < carbon_copies){
                for(col = 4; col < COLS; col++){
                    population[row][col][agent] = ACTION[row][col][layer];
                }
            }else{
                for(col = 4; col < COLS; col++){
                    population[row][col][agent] = 0;
                }
            }
        }
        lowest_cost  = min_cost(COST, layer, budget, ROWS, COLS);
        budget_count = budget;
        if(lowest_cost <= 0){
            printf("Lowest cost is too low (must be positive) \n");
            break;
        }
        while(budget_count > lowest_cost){
            do{
                do{ /* This do assures xpos never equals ROWS (unlikely) */
                    xpos = floor( runif(0,ROWS) );
                }while(xpos == ROWS);
                do{
                    ypos = floor( runif(4,COLS) );
                }while(ypos == COLS);
            }while(COST[xpos][ypos][layer] > budget_count);
            population[xpos][ypos][agent]++;
            budget_count -= COST[xpos][ypos][layer];
        } /* Should now make random actions allowed by budget */
    }
}
The above function calls the min_cost function, which simply examines the COST array to find the lowest-cost action; initialise_pop then keeps adding random actions until the remaining budget can no longer afford any of them.
/* =============================================================================
 * This function will find the minimum cost of an action in the COST array
 * for a particular agent (layer). Inputs include:
 *     COST: A full 3D COST array
 *     layer: The layer on which the minimum is going to be found
 *     budget: The total budget that the agent has to work with (initialises)
 *     rows: The total number of rows in the COST array
 *     cols: The total number of cols in the COST array
 * ========================================================================== */
double min_cost(double ***COST, int layer, double budget, int rows, int cols){
    int i, j;
    double the_min;
    the_min = budget;
    for(i = 0; i < rows; i++){
        for(j = 0; j < cols; j++){
            if(COST[i][j][layer] < the_min){
                the_min = COST[i][j][layer];
            }
        }
    }
    return the_min;
}
We now have a functioning way to initialise a population of agents that will later go through a genetic algorithm to select the best actions. In working through this, I’ve seen that an earlier idea of mine (not sure if I wrote this down below) might be useful – have a column in both COST and ACTION that is simply bankem – essentially stashing costs in a way that doesn’t do anything. This might be important for situations in which an agent actually benefits by doing nothing, or when we want some general way to consider the benefits of stake-holder actions that affect utility but have no effect on resources or other stake-holders (e.g., holiday time).
Add new bankem action on COST and ACTION arrays
I have added a new action bankem onto the COST and ACTION arrays, which was not too difficult at all in practice. I envision this category of actions as (probably) always having a cost equal to one. Essentially, it’s a way to shift unspent costs to a category, which might or might not affect the agent’s overall utility.
Initialise a new crossover function
I have written a crossover function that, for each individual in the population, assigns a crossover partner (e.g., as would occur in sexual reproduction). With the partner assigned, the function then swaps ACTION array elements with some fixed probability (the uniform crossover method). I don’t see any reason to consider multiple types of crossover at this point, so I believe this method will be sufficient.
/* =============================================================================
 * This function will use the initialised population from initialise_pop to make
 * the population array undergo crossing over at random locations for
 * individuals in the population. Note that we'll later keep things in budget
 * Necessary variable inputs include:
 *     population: array of the population that is made (malloc needed earlier)
 *     pop_size: The size of the total population (layers to population)
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     pr: Probability of a crossover site occurring at an element.
 * ========================================================================== */
void crossover(double ***population, int pop_size, int ROWS, int COLS,
               double pr){
    int agent, row, col;
    int cross_partner;
    double do_cross;
    double agent_val, partner_val;
    for(agent = 0; agent < pop_size; agent++){
        do{
            cross_partner = floor( runif(0, pop_size) );
        }while(cross_partner == agent || cross_partner == pop_size);
        for(row = 0; row < ROWS; row++){
            for(col = 4; col < COLS; col++){
                do_cross = runif(0,1);
                if(do_cross < pr){
                    agent_val   = population[row][col][agent];
                    partner_val = population[row][col][cross_partner];
                    population[row][col][agent]         = partner_val;
                    population[row][col][cross_partner] = agent_val;
                }
            }
        }
    }
}
Originally, I was going to use a swap function to swap agent and partner values. The swap function is still in the utilities.c file, but I think the above code is more readable.
I think it will make more sense to deal with the budget after mutation. That is, as a result of crossover and mutation, some individuals might go over budget on their actions. Randomly removing actions from over-budget individuals is best done after mutation to prevent redundancy; a constrain_cost function was originally written in R, so I can use it as a template.
Mutation function created
I have written a function to cause random mutations in the population array during the genetic algorithm.
/* =============================================================================
 * This function will use the initialised population from initialise_pop to make
 * the population array undergo mutations at random elements in their array
 * Necessary variable inputs include:
 *     population: array of the population that is made (malloc needed earlier)
 *     pop_size: The size of the total population (layers to population)
 *     ROWS: Number of rows in the COST and ACTION arrays
 *     COLS: Number of columns in the COST and ACTION arrays
 *     pr: Probability of a mutation occurring at an element.
 * ========================================================================== */
void mutation(double ***population, int pop_size, int ROWS, int COLS,
              double pr){
    int agent, row, col;
    double do_mutation;
    double half_pr;
    half_pr = 0.5 * pr;
    /* Mutate each action element up or down with total probability pr */
    for(agent = 0; agent < pop_size; agent++){
        for(row = 0; row < ROWS; row++){
            for(col = 4; col < COLS; col++){
                do_mutation = runif(0,1);
                if( do_mutation < half_pr ){
                    population[row][col][agent]--;
                }
                if( do_mutation > (1 - half_pr) ){
                    population[row][col][agent]++;
                }
                if( population[row][col][agent] < 0 ){
                    population[row][col][agent] *= -1;
                } /* Change sign if mutates to a negative value */
            }
        }
    }
}
I might or might not want to tweak this later on because I’m not sure if this type of mutation is aggressive enough to search for adaptive strategies. This issue will be greatly mitigated by the seeding of random action arrays and crossover, but I might want to come back to allow mutation to a wider range of numbers later. For now, there is simply a probability of a mutation occurring at each element; if a mutation occurs, the action value will either increase by one or decrease by one (if the original value was zero, it will increase to one). It’s tempting to allow for bigger jumps, but if they are too big then they will regularly go over budget and hence cause the whole array to reshuffle again (essentially creating a random array and removing a potential opportunity for increased fitness).
The next function that needs to be written is one that constrains the costs to be at or under budget after crossover and mutation, then a fitness function is needed (which will probably require several sub-functions to keep the code readable).
Separate ACTION and COST arrays
I’ve now separated the arrays that affect an agent’s actions from the agent’s costs (from a total budget) for performing these actions. The indices of these arrays will match at all times, such that COST[i][j][k] will be the cost of an agent k performing ACTION[i][j][k]. Each agent will therefore have its own 2D layer that will include rows of other agents and columns of utilities and actions. This adds an extra array to a considerable number of things that we already need to keep track of, but I think it is less confusing than what I was doing before, and in the end separating costs from actions will be worth it. Ideally, all of this would just be some special struct in C, but, as mentioned yesterday, this won’t work because R and C need to work seamlessly.
This is much more comprehensible in another respect: the genetic algorithm only needs to deal with the ACTION array, using the COST array as a reference. The readability of the code alone will probably make this worthwhile. As another bonus, while re-writing the code, it is now obvious that it is unnecessary to mutate, crossover, etc., only a select few rows; in the ACTION array, they are all fair game as determined by COST (columns 0-3 cannot be changed, but this is easy to remember).
Working call to game.c, but bad action return
There is now a working game.c file that user.c functions call, with proper header files to link. For some reason, the action arrays returned right now are incorrect, so this is the next thing that needs to be fixed. In general, I think it will be a good idea to make sure that calls from gmse.R are maintained without crashing.
Begin working on the genetic algorithm
I have now initialised the file game.c, which will hold everything related to the genetic algorithm, including multiple functions for running each individual process. The file will include a high-level function that brings in five arrays:
- The UTILITY array. The whole thing will need to be read in because agents need to have the option to affect one another’s arrays (e.g., the potential to affect the cost of each other’s actions). I’ll need to be careful, eventually, regarding the order of agent actions to make sure that the order in which stake-holders are put through the genetic algorithm doesn’t affect resulting agent strategies (or, if this is inevitable, then stake-holder order should be randomised).
- The AGENTS array, which will be necessary for agents to look up one another’s (and their own) locations, yield, etc.
- The RESOURCES array, which will be needed for agents to look up how many resources there are of each type, where they are located, and what consequences of these agents might be expected.
- The para array of parameter values, which will be needed for any specifications of the genetic algorithm (e.g., mutation and crossover rate) we might want to implement from R.
- The LANDSCAPE array, which needs to be read in to identify both the owners of cells and the yield from cells, and anything else that might be of interest.

A couple of other challenges that I need to keep in mind (but do not want to implement yet):
- Histories of the arrays (excluding parameters, for now) will need to be included. Or, at least, the histories back to some arbitrary point in time. The reason for this is that we’ll eventually want agents to be able to look back on past decisions and adjust their behaviours to maximise their own utilities. This will get nasty, and I think the best thing to do might be to read in histories as separate arrays (e.g., have a UTILITY and a UTILITY_REC), or at least immediately separate them after reading them into the ga function. Nevertheless, doing so will be a challenge, in the case of UTILITY requiring a 4D array that agents will search through. I will build the framework of the genetic algorithm with this in mind, making it flexible enough to expand into histories. This needs to be done in C, else it will be extremely slow, and it might take some time even with good coding in C.
- The ga function needs to be callable from R and C. This isn’t actually difficult, but worth mentioning because I think it will be helpful for users of the G-MSE R package. Really, the ga function will be called by default within user.c and manager.c, being linked to each in compiling – keeping the genetic algorithm code in its own file seems like a good idea.

I think it will be best to force ga to specify a single agent whose fitness will be maximised (as this agent will need to be replicated 100ish times for the evolution of a single agent to be simulated). If nothing else, this will make the code easier to follow. Hence, the main functions of both manager.c and user.c will call ga (linked with the game header file, #include "game.c"), reading in all of the five arrays above and specifying for which agent it is running the genetic algorithm. In manager.c, for example, only type1 = 0 agents will be run, while these agents will be excluded in user.c.
Progress while coding the initialisation of a population
I think it makes sense to keep these functions general and very explicit about what can and cannot be tweaked. For example, given a 2D array, I am using x0, x1, y0, and y1 as indices that determine where to start and stop in terms of changing things. For example, this function, which will be called from the initialise_pop function, specifies all points in which to search the UTILITY layer for the lowest possible cost (needed for later).
/* =============================================================================
 * This function will find the minimum cost of an action in the UTILITY array
 * for a particular agent (layer). Inputs include:
 *     UTILITY: A full 3D utility array
 *     layer: The layer on which the minimum is going to be found
 *     budget: The total budget that the agent has to work with (initialises)
 *     x0, x1: The start and stop rows of the search
 *     y0, y1: The start and stop columns of the search
 * ========================================================================== */
double min_cost(double ***UTILITY, int layer, double budget, int x0, int x1,
                int y0, int y1){
    int i, j;
    double the_min;
    the_min = budget;
    for(i = x0; i < x1; i++){
        for(j = y0; j < y1; j++){
            if(UTILITY[i][j][layer] < the_min){
                the_min = UTILITY[i][j][layer];
            }
        }
    }
    return the_min;
}
This requires more input, but I think it’s also clearer what is meant to happen. The above function compiles without error.
Change to the UTILITY array
Having started coding in C, I’ve decided that it will be much easier to code if I switch what is represented in the first four rows of a layer of the UTILITY array. Now, the first two rows, in which agent = -2, will be the focal agent’s costs, while rows 3 and 4 will be the focal agent’s actions. This will make it easier to code for the manager’s actions later.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
1 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
3 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
3 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
The reason this is easier is because now I can just randomise elements in the genetic algorithm below some value of rows. Agents should never be able to change their own costs, but can always change their own actions (agent = -1), and potentially the actions of other agents (agent > -1). Fortunately, this doesn’t require any extra coding of the initialisation of UTILITY – I just need to note that I’m doing it this way from now on.
Scrap the above idea completely
It was better the way it was – I confused myself with the 3 dimensions. The only actions on resources are in the focal agent’s first two rows. The second two rows will always be the costs of the focal agent for performing the first two rows of actions, and every other row is a cost associated with adjusting the cost of each other agent – but the actual change that is made where these costs are not infinite (i.e., for the managers) will be made in other layers of the UTILITY array.
Here’s how it will work: agents can do things to resources (movem, castem, killem, feedem, helpem) at a cost. What they do is specified by the first two rows of their UTILITY layer (agent = -2). The cost of doing each of these is specified in the second two rows (agent = -2). They can also potentially change the cost of other agents doing things to resources; this is determined by the other remaining rows. But the tricky bit is that their actions need to take effect in the other layers of UTILITY. Hence, we need to somehow hold the actions as they apply to UTILITY without affecting the UTILITY array itself throughout the process of the genetic algorithm (if we start changing UTILITY, then we need some way to test changes with respect to agent fitness and then put the array back as it was – actions therefore need to be recorded).
I didn’t want to do this, but I think it might actually be necessary to have two arrays instead of one UTILITY array. These two arrays would include:
- A COST array, which would be a 3D array (layers are agents) that identifies the cost of each agent changing something that affects agent actions.

agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
1 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
1 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
2 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 1 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
3 | 2 | 0 | 0 | Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
Inf |
The agent = -1 rows here would just be the direct cost of the focal agent in the layer affecting resources.
An ACTION array, which would be a 3D array of dimensions identical to that of COST, and which would determine what an agent actually does.

agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 |
1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The benefit here is that the elements would line up completely, so that it would be easy to keep track of actions and costs, and the ACTION array would be all that needs to be tweaked for the genetic algorithm.
It would be nice to specify a new struct in C for all of this, but that wouldn’t change the fact that everything needs to read in and out seamlessly with R, so I don’t think that this is possible.
Regrouping and finding a way forward on the utility functions
Reviewing my old thoughts on getting the genetic algorithm to work and getting agents to do something to maximise their own utilities. The first thing to do is to initialise a UTILITY array. I don’t see any way around this – what is needed is a three-dimensional array where each z layer is an agent. A single agent’s utility and decision-making process is therefore represented in a matrix like the one below.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
1 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
1 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
3 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
3 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
Each agent will need to have a total cost budget, which will be specified in the AGENT array in its own column. In the UTILITY array above, the rows where agent = -2 (column 1) identify the actions of an agent – these are the things that an agent can do to resources. In the above example, the agent is not doing anything to resources (all are zeros). The rows where agent = -1 indicate the costs of doing things that affect the resources (i.e., the columns where agent = -2). The agent represented by this z layer of the 3D array can therefore spend from their total budget where agent = -1 to add actions where agent = -2, which in turn affects resources in one way or another. All of the remaining rows (agent = 1 to agent = 3) define actions that would affect the costs of other agents.
Essentially, these values (all currently Inf) represent the cost of changing another agent’s cost by 1. So if we imagine a manager that wants to change the cost of movem for a stake-holder from 5 to 10, and their cost value in the table is 0.5, then it will cost 2.5 from their budget to increase (or decrease) this amount. Note that there is also the opportunity for stake-holders to directly affect the utilities of other stake-holders – for a cost. I’m not going to play around with these options yet because it will get very complicated. Instead, I will now write a function for initialising this array in R. Once the simple case of a genetic algorithm for affecting resources based on utilities and budgets is up and running, then I will start doing more complex things like having stake-holders affect one another’s utilities and costs.
Note that column 1 refers to the agent ID, not the agent type. Hence, agent = 1 will be a manager, not a stake-holder. It’s possible that there could be other managers too, but the status of an agent can be accessed with the AGENT array.
Initial function making utility array
The function below returns all of the necessary information for the table above, but with random numbers placed in all columns after type3.
make_utilities <- function(AGENTS, RESOURCES){
UTILITY <- NULL;
agent_IDs <- c(-2, -1, unique(AGENTS[,1]) );
agent_number <- length(agent_IDs);
res_types <- unique(RESOURCES[,2:4]);
unique_types <- dim(res_types)[1];
types_data <- lapply(X = 1:agent_number,
FUN = function(quick_rep_list) res_types);
column_1 <- sort( rep(x = agent_IDs, times = unique_types) );
columns_2_4 <- do.call(what = rbind, args = types_data);
static_types <- cbind(column_1, columns_2_4);
dynamic_vals <- sample(x = 1:10, size = 8 * dim(static_types)[1],
replace = TRUE);
dynamic_types <- matrix(data = dynamic_vals, nrow = dim(static_types)[1],
ncol = 8);
colnames(static_types) <- c("agent", "type1", "type2", "type3");
colnames(dynamic_types) <- c("util", "u_loc", "u_land", "movem", "castem",
"killem", "feedem", "helpem");
UTILITY <- cbind(static_types, dynamic_types);
return( UTILITY );
}
I’m not sure of the best way to handle the currently random numbers in the function, except that these values might need to be put into the array by the user, who will want to specify which agents care about which resources and how much it will cost to change things. Better, the user could perhaps eventually just specify the utilities of each stake-holder for each resource type (this is less to input). Then, once the genetic algorithm for the manager is up and running, all of the costs will be initialised by the manager, somehow – with default costs for the manager to affect stake-holder costs. This scheme would minimise user input and have the costs arise organically from the model and management system, while the utilities would be specified by the user. For now though, I’ll have to input the cost values by hand.
Function tweak to make 3D array
The previous function wasn’t quite right because it only made one layer of the 3D UTILITY array. Really, a layer needs to be created for each agent, as below.
#' Utility initialisation
#'
#' Function to initialise the utilities of the G-MSE model
#'
#'@param AGENTS The agent array
#'@param RESOURCES The resource array
#'@export
make_utilities <- function(AGENTS, RESOURCES){
agent_IDs <- c(-2, -1, unique(AGENTS[,1]) );
agent_number <- length(agent_IDs);
res_types <- unique(RESOURCES[,2:4]);
UTIL_LIST <- NULL;
agent <- 1;
agents <- agent_number - 2;
while(agent <= agents){
UTIL_LIST[[agent]] <- utility_layer(agent_IDs, agent_number, res_types);
agent <- agent + 1;
}
dim_u <- c( dim(UTIL_LIST[[1]]), length(UTIL_LIST) );
UTILITY <- array(data = unlist(UTIL_LIST), dim = dim_u);
return( UTILITY );
}
#' Utility layer for initialisation
#'
#' Function to initialise a layer of the UTILITY array of the G-MSE model
#'
#'@param agent_IDs Vector of agent IDs to use (including -1 and -2)
#'@param agent_number The number of agents to use (length of agent_IDs)
#'@param res_types The number of unique resource types (cols 2-4 of RESOURCES)
#'@export
utility_layer <- function(agent_IDs, agent_number, res_types){
LAYER <- NULL;
unique_types <- dim(res_types)[1];
types_data <- lapply(X = 1:agent_number,
FUN = function(quick_rep_list) res_types);
column_1 <- sort( rep(x = agent_IDs, times = unique_types) );
columns_2_4 <- do.call(what = rbind, args = types_data);
static_types <- cbind(column_1, columns_2_4);
dynamic_vals <- sample(x = 1:10, size = 8 * dim(static_types)[1],
replace = TRUE); # TODO: Change me?
dynamic_types <- matrix(data = dynamic_vals, nrow = dim(static_types)[1],
ncol = 8);
colnames(static_types) <- c("agent", "type1", "type2", "type3");
colnames(dynamic_types) <- c("util", "u_loc", "u_land", "movem", "castem",
"killem", "feedem", "helpem");
LAYER <- cbind(static_types, dynamic_types);
return( LAYER );
}
So when there are two agents, the make_utilities function returns a 3D array of 4 rows, 12 columns, and 2 layers.
, , 1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] -2 1 0 0 9 2 1
[2,] -1 1 0 0 7 3 1
[3,] 1 1 0 0 9 8 4
[4,] 2 1 0 0 8 5 1
[,8] [,9] [,10] [,11] [,12]
[1,] 8 8 8 3 9
[2,] 2 7 10 2 1
[3,] 5 10 6 3 8
[4,] 2 6 6 1 5
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] -2 1 0 0 1 8 9
[2,] -1 1 0 0 3 7 2
[3,] 1 1 0 0 6 2 4
[4,] 2 1 0 0 4 3 7
[,8] [,9] [,10] [,11] [,12]
[1,] 9 5 3 9 10
[2,] 6 5 7 5 7
[3,] 1 2 2 2 9
[4,] 4 8 9 7 10
I’ll record changes in the UTILITY array over time to track social changes and game strategy. For now, the next goal is to write a genetic algorithm that will work on the UTILITY array (with input from the AGENT, LANDSCAPE, and RESOURCE arrays) to optimise stake-holder actions. The simplest case will be maximising crop yield.
Plans for the genetic algorithm, short and long term
In the short term, it is therefore necessary to write a set of functions for a genetic algorithm, starting first with the functions written in R on 7 FEB 2017 to show proof of concept. I will use these on the UTILITY arrays that I made today and show how agent actions can be simulated to maximise a simple scenario – trying to make as much crop yield as possible, where resources decrease yield if they are on the land. The most difficult part of this will be the fitness function. Essentially, stake-holder agents are going to need to learn or know the relationship between resources and their crop yields, then do something to affect the resources. There are two ways that the relationship between resource and crop yield could be implemented in the model:
The first way is to use the consume column in the RESOURCES array. This is pretty straightforward to implement. Each agent could simply count the number of resources on its cells, look at the landscape cell values, then calculate the proportion by which their crop yield is predicted to decrease and act accordingly to maximise yield (e.g., by killing resources). This is probably the first implementation to try.
Bringing in the manager will, of course, make things even more complex. I think the best order to do all of this is to focus on 1 above first, then build managers into the model with 1, and then work on thinking about how to implement 2.
The plotting of \(2 \times 2\) figures that include maps of land ownership and individual stake-holder yields is now complete for observation types 2 and 3. With this complete, I will now turn to writing yesterday’s R function in C (which needs to happen anyway – may as well do it now to keep things fast). Once this is complete, then it will be easier to start building a genetic algorithm for maximising the utility of one stake-holder. Ignoring manager decision-making and conflicting stake-holders for the time being, I will focus on a stake-holder type with a relatively clear goal: maximise crop yield. Using the utility matrices and genetic algorithm notes from earlier, I’ll be able to write a general function in C that affects user behaviour.
User function now written in C
The user function that was written originally in R has now been coded in C. This makes it much faster to first place agents on their own land (if they own land), then count up their yield from the landscape. Testing of this function finds that everything appears to work normally for all observation types and different land dimensions.
I have run valgrind to check for memory leaks again (since it’s been a while).
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
No memory leaks were reported.
==26147== HEAP SUMMARY:
==26147== in use at exit: 104,719,416 bytes in 18,583 blocks
==26147== total heap usage: 5,168,708 allocs, 5,150,125 frees, 953,760,506 bytes allocated
==26147==
==26147== LEAK SUMMARY:
==26147== definitely lost: 0 bytes in 0 blocks
==26147== indirectly lost: 0 bytes in 0 blocks
==26147== possibly lost: 0 bytes in 0 blocks
==26147== still reachable: 104,719,416 bytes in 18,583 blocks
==26147== suppressed: 0 bytes in 0 blocks
==26147== Reachable blocks (those to which a pointer was found) are not shown.
==26147== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==26147==
==26147== For counts of detected and suppressed errors, rerun with: -v
==26147== ERROR SUMMARY: 196884 errors from 2 contexts (suppressed: 0 from 0)
Next, I can start to make the users actually do things that might maximise their own yield (e.g., shoot resources or farm cells more effectively). I plan to write a flexible genetic algorithm function in C. The function itself could be called from a higher-level function so as to be used directly in R (though I don’t plan to do this for normal G-MSE operations, it might be useful to include direct R call options once the package is complete).
New landscape layer identifying land ownership
There is now a new layer of landscape, and I have tweaked things to make the current default three layers. These layers include:
When the cell owner is 0, this effectively means the land is under manager (e.g., public) control. The new initialise landscape function now allows the user to explicitly set the proportion of cells that should go to each owner (vector input).
make_landscape <- function(model, rows, cols, cell_types, cell_val_mn,
cell_val_sd, cell_val_max = 1, cell_val_min = 0,
layers = 3, ownership = 0, owner_pr = NULL){
the_land <- NULL;
if(model == "IBM"){
if(rows < 2){
stop("Landscape dimensions in IBM must be 2 by 2 or greater");
}
if(cols < 2){ # Check to make sure the landcape is big enough
stop("Landscape dimensions in IBM must be 2 by 2 or greater");
}
cell_count <- cols * rows;
the_terrain <- sample(x = cell_types, size = cell_count,
replace = TRUE);
the_terrain2 <- rnorm(n = cell_count, mean = cell_val_mn,
sd = cell_val_sd);
if( length(ownership) == 1 ){
who_owns <- sample(x = 0:ownership, size = cell_count,
replace = TRUE);
the_terrain3 <- sort(who_owns); # Make contiguous for now
}else{
who_owns <- sample(x = ownership, size = cell_count,
replace = TRUE, prob = owner_pr);
the_terrain3 <- sort(who_owns);
}
the_terrain2[the_terrain2 > cell_val_max] <- cell_val_max;
the_terrain2[the_terrain2 < cell_val_min] <- cell_val_min;
alldata <- c(the_terrain, the_terrain2, the_terrain3);
the_land <- array(data = alldata, dim = c(rows, cols, layers));
}
if( is.null(the_land) ){
stop("Invalid model selected (Must be 'IBM')");
}
return(the_land);
}
Hence in the above, if ownership = 0, then the layer is effectively ignored; if it is a scalar, then ownership of landscape cells is divided equally among integer values from zero to the scalar. However, the most thorough way to set ownership will be by setting ownership to a vector of possible owners and owner_pr to their relative proportions of cells owned. Addition of this landscape layer has been tested and runs without error.
Linking cell yield with agents
I have now begun a user R function, which currently (1) moves agents to somewhere on their owned landscape (if not already there) and (2) calculates the amount of their total yield from the landscape and stores this total amount in the AGENTS array.
user <- function(resource = NULL,
agent = NULL,
landscape = NULL,
paras = NULL,
model = "IBM"
) {
check_model <- 0;
if(model == "IBM"){
# Relevant warnings below if the inputs are not of the right type
if(!is.array(resource)){
stop("Warning: Resources need to be in an array");
}
if(!is.array(agent)){
stop("Warning: Agents need to be in an array");
}
if(!is.array(landscape)){
stop("Warning: Landscape needs to be in an array");
} # TODO: make sure paras is right length below
if(!is.vector(paras) | !is.numeric(paras)){
stop("Warning: Parameters must be in a numeric vector");
}
# If all checks out, then run the population model
#======================================================================
# TEMPORARY R CODE TO DO USER ACTIONS (WILL BE RUN FROM C EVENTUALLY)
#======================================================================
for(agent_ID in 1:dim(agent)[1]){
owned_cells <- sum(landscape[,,3] == agent_ID);
# --- Put the agent on its own land
if(owned_cells > 0){ # If the agent owns some land
a_xloc <- agent[agent_ID, 5];
a_yloc <- agent[agent_ID, 6];
while(agent[agent_ID,1] != landscape[a_xloc, a_yloc, 3]){
a_xloc <- sample(x = 1:dim(landscape)[1], size = 1);
a_yloc <- sample(x = 1:dim(landscape)[2], size = 1);
}
agent[agent_ID, 5] <- a_xloc;
agent[agent_ID, 6] <- a_yloc;
}
# --- count up yield on cells
agent_yield <- 0;
xdim <- dim(landscape[,,3])[1]
ydim <- dim(landscape[,,3])[2]
for(i in 1:xdim){
for(j in 1:ydim){
if(landscape[i,j,3] == agent[agent_ID,1]){
agent_yield <- agent_yield + landscape[i,j,2];
}
}
}
agent[agent_ID, 15] <- agent_yield
}
USER_OUT <- list(resource, landscape, agent);
# TODO: User actions are next...
#======================================================================
check_model <- 1;
}
if(check_model == 0){
stop("Invalid model selected (Must be 'IBM')");
}
return(USER_OUT);
}
It might be useful to also have a column in the AGENTS array that records percent capacity of yield for stake-holders, perhaps by saving the original landscape (before resources remove yield) and calculating a proportion. A couple of notes: the indicated code above will need to be put into C – it’s much too slow in R already. Also, for some reason, if I don’t store a_xloc and a_yloc back into the appropriate agent[agent_ID, 5] and agent[agent_ID, 6], respectively, a weird bug appears. The actual resource population (but not its estimate) flatlines after 20 or so generations at some value. This is very weird because the file gmse.R doesn’t even return the resource or landscape arrays – not yet. I’m not sure why a bug in this code affects population demographics, but fixing it also appears to correct the problem completely. This is something to watch out for, however.
Plotting owned landscape and stake-holder yield
The figure below shows some new output for G-MSE. The left column of the figure is familiar, but the right column now provides some feedback for five simulated stake-holders that own roughly equal amounts of land. The actual plots of land are shown in the upper right, while the individual yields for each stake-holder’s plots are shown over time in the lower right.
As of now, this image is only produced for the first two observation functions (case 0 and 1), so I need to replicate it in the other two observation functions. Eventually, it would be better to just have one function for plotting so that any changes made would really be global.
Tracking crop yield over time
Given that resources now can affect the second layer of the landscape, which can model the percent crop yield (or anything else), we can now plot the mean percent yield per cell (orange) over time along with resource abundance (black) and its estimate (blue). The figure below shows this for an example in which each independent visit by a resource reduces crop yield by 50% (e.g., the individual consumes half of the resources on a cell if it arrives there at a time step).
This has now only been coded for the mark-recapture plot, so the next task is to fill this out for all of the plot types, then add a new layer of the landscape that will designate each cell with a number that identifies the owner of the land, or if the land is public (type 0). This will allow me to link crop yield to a specific agent.
Fix read in and out of landscape array from R to C
While testing the resource-landscape interaction, there was an issue with the landscape array being read into C correctly. When R sends an array or vector into C, it is sending the contents of a list (i.e., what might be a \(2 \times 2\) array in R gets read in as if each element were in a list of four elements). The structure of the array then needs to be correctly defined in C so that it matches what it was in R. This requires placing the contents of the elements coming in from R in the correct order with respect to pointers in C, and this occurs in reverse order, so if we had a table in R
 | Y1 | Y2 |
---|---|---|
X1 | 1 | 2 |
X2 | 3 | 4 |
The list would be read in (apparently) as [1, 3, 2, 4], so if we want to read this into an array in C, and we prefer to make a pointer to the X1 and X2 locations (which is easier for my brain because it allows array[i][j] to refer to individual i and trait j), then we need to read in the array as follows:
the_array = malloc(x_size * sizeof(double *));
for(i = 0; i < x_size; i++){
the_array[i] = malloc(y_size * sizeof(double));
}
vec_pos = 0;
for(j = 0; j < y_size; j++){
for(i = 0; i < x_size; i++){
the_array[i][j] = R_ptr[vec_pos];
vec_pos++;
}
}
This is not quite intuitive at first, but doing it this way gets R and C on the same page. For example, here is the RESOURCES array moving from R to C and back again. Printed in each environment, the array is the same (note, they could be differently structured and still be technically consistent – e.g., if all arrays were transposed – but this would be a nightmare to code).
> RESOURCES[1:4,1:4]
IDs type1 type2 type3
[1,] 1 1 0 0
[2,] 2 1 0 0
[3,] 3 1 0 0
[4,] 4 1 0 0
> RESOURCE_NEW <- resource(resource = RESOURCES,
+ landscape = LANDSCAPE_r,
+ paras = paras,
+ move_res = TRUE,
+ model = "IBM"
+ );
1.000000 1.000000 0.000000 0.000000
2.000000 1.000000 0.000000 0.000000
3.000000 1.000000 0.000000 0.000000
4.000000 1.000000 0.000000 0.000000
> RESOURCES <- RESOURCE_NEW[[1]];
> RESOURCES[1:4,1:4]
[,1] [,2] [,3] [,4]
[1,] 1 1 0 0
[2,] 2 1 0 0
[3,] 3 1 0 0
[4,] 4 1 0 0
When reading in the landscape, this got confusing because the same thing had to be done in three dimensions; initially I lost track of the pointers, causing the layers to mix. This has been resolved now, and I have tested to ensure that landscape elements are identical when read into C and when returned back into R.
> LANDSCAPE_r[1:4,1:4,1:2]
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 2 2
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
, , 2
          [,1]       [,2]       [,3]      [,4]
[1,] 1.540700 -1.7960987  2.7759525 0.6596141
[2,] 1.483312  0.5855166 -0.4789347 0.8117666
[3,] 1.579536  1.0600302  2.2923279 2.0330554
[4,] 1.745043  0.2437264  0.6171671 1.1496975
> RESOURCE_NEW <- resource(resource = RESOURCES,
+ landscape = LANDSCAPE_r,
+ paras = paras,
+ move_res = TRUE,
+ model = "IBM"
+ );
1.000000 2.000000 2.000000 2.000000
2.000000 2.000000 2.000000 2.000000
1.000000 1.000000 1.000000 2.000000
1.000000 1.000000 2.000000 2.000000
1.540700 -1.796099 2.775952 0.659614
1.483312 0.585517 -0.478935 0.811767
1.579536 1.060030 2.292328 2.033055
1.745043 0.243726 0.617167 1.149697
> RESOURCES <- RESOURCE_NEW[[1]];
> RESOURCE_REC[[time]] <- RESOURCES;
>
> LANDSCAPE_r <- RESOURCE_NEW[[2]];
> LANDSCAPE_r[1:4,1:4,1:2]
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 2 2 2
[2,] 2 2 2 2
[3,] 1 1 1 2
[4,] 1 1 2 2
, , 2
[,1] [,2] [,3] [,4]
[1,] 1.540700 -1.7960987 2.7759525 0.6596141
[2,] 1.483312 0.5855166 -0.4789347 0.8117666
[3,] 1.579536 1.0600302 2.2923279 2.0330554
[4,] 1.745043 0.2437264 0.6171671 1.1496975
>
The biological interactions (i.e., the function from 3 MAR) now does what it is supposed to do, and I will move on to make the landscape interactions more interesting.
Allow layers to change by themselves each generation
Given that some resources will affect layers of the landscape, modelling consumption of biomass on cells, it is necessary to also include a function that changes the landscape cell values without any input from resources. This can model the growth of biomass on cells between time steps. I’ve therefore written a new function that does this in R (I don’t think this will be complex enough to require it in C).
update_landscape <- function(model = "IBM", landscape, layer, mean_change,
sd_change = 0, max_val = 1, min_val = 0){
the_land <- NULL;
if(model == "IBM"){
xlength <- dim(landscape[,,layer])[1];
ylength <- dim(landscape[,,layer])[2];
lsize <- xlength * ylength;
adj_vals <- rnorm(n = lsize, mean = mean_change, sd = sd_change);
adj_layer <- matrix(data = adj_vals, nrow = xlength, ncol = ylength);
new_layer <- landscape[,,layer] + adj_layer;
new_layer[new_layer > max_val] <- max_val;
new_layer[new_layer < min_val] <- min_val;
landscape[,,layer] <- new_layer;
the_land <- landscape;
}else{
stop("Invalid model selected (Must be 'IBM')");
}
return(the_land);
}
One feature of the G-MSE model is now that, in addition to a hard imposed carrying capacity on resource types, it is also possible to make the carrying capacity a natural function of the landscape. For example, we might force individuals on the landscape to consume a certain amount of resources on the landscape to survive or reproduce. Hence, as landscape cell values decrease modelling the consumption of biomass, fewer individual resources can survive or reproduce.
Ideally, it will then be possible to parameterise the model using data for, e.g., how much damage to biomass a goose can do to a patch of land. As of now, by default, I’m just assuming that it decreases crop yield by 10%, and increases its own survival probability by the same when it lands on a cell.
For some reason, a function that I wrote to reset the landscape values screwed with the resource abundances (flat-lined after 20 gens for no clear reason). I’ve reverted to a simpler function, and will build up off of this tomorrow, but it would be nice to know why the R function was affecting the population dynamics even when it returned the same landscape that it took in. Tomorrow, I will build up a new function with similar features piece by piece to make sure it works. Then, I will do some initial simulations modelling crop growth as affected by resources on a landscape, and resource dynamics in turn affected by crops. Things to add after include:
I’m not sure which to tackle first just yet – perhaps the former because the latter doesn’t seem necessary now.
Resource-landscape interactions
Having now resolved the issue concerning multi-layered landscapes, it’s time to actually use one of these layers in the model. The goal here is to do the following:
It would be nice if, for example, individuals could have their probability of death decrease if they are on a cell of high value (modelling increased food consumption), or their probability of giving birth (or number of offspring) increase. Movement rules could also allow individuals to gravitate towards high value cells (or stop when landing on one), thereby modelling behavioural change to move toward areas where opportunities for foraging (or nesting, or something else) are high. This could affect consumption of food on different landscape types (e.g., cropland) and hence make it possible to also model management strategies of diversionary feeding.
To incorporate the above, a new function in C is going to be needed that models the interaction between resources and landscapes. This function will require input of:
the resource array
the landscape array
I will program this in a flexible way within C, and use some default features that will probably decrease a trait and landscape value by a uniform proportion each time (which seems intuitively more reasonable than a uniform value if we’re thinking about probabilities of mortality and the proportion of food on a landscape eaten). Key options will be called from R.
Progress on resource-landscape interactions
The initial code to allow interaction is written in the form of the following function, located on a local branch (not pushed to GitHub).
/* =============================================================================
* This function reads in resources and landscape values, then determines how
* each should affect the other based on resource position and trait values
* Inputs include:
* resource_array: resource array of individuals to interact
* resource_type_col: which type column defines the type of resource
* resource_type: type of resources to do the interacting
* resource_col: the column of the resources that affects or is affected
* rows: the number of resources (represented by rows) in the array
* resource_effect: the column of the resources of landscape effect size
* landscape: landscape array of cell values that affect individuals
* landscape_layer: layer of the landscape that is affected
* ========================================================================== */
void res_landscape_interaction(double **resource_array, int resource_type_col,
int resource_type, int resource_col, int rows,
int resource_effect, double ***landscape,
int landscape_layer){
int resource;
int x_pos, y_pos;
double c_rate;
double current_val;
double esize;
for(resource = 0; resource < rows; resource++){
if(resource_array[resource][resource_type_col] == resource_type){
x_pos = resource_array[resource][4];
y_pos = resource_array[resource][5];
c_rate = resource_array[resource][14];
landscape[x_pos][y_pos][landscape_layer] *= (1 - c_rate);
current_val = resource_array[resource][resource_col];
esize = resource_array[resource][resource_effect];
resource_array[resource][resource_col] += (1 - current_val) * esize;
}
}
}
This needs to be tested more carefully – for some reason both layers are being affected, and I need to make sure that the landscape is being read in correctly.
RESOLVED ISSUE #14: Success on multi-layered landscapes
Initial testing suggests that I have successfully coded landscapes into G-MSE that have more than one layer. The G-MSE program now initialises (for the moment) landscapes that have depth of two layers, such as the below.
## , , 1
##
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 5 8 8 2 2 9 5 3 7 6
## [2,] 9 6 9 8 7 9 10 2 9 8
## [3,] 7 1 1 10 8 6 9 8 5 5
## [4,] 8 7 6 8 1 9 10 4 5 9
## [5,] 5 10 8 10 7 7 8 5 8 5
## [6,] 5 4 3 5 1 10 9 1 6 9
## [7,] 7 3 7 5 8 5 2 1 3 2
## [8,] 3 7 10 7 7 3 8 10 9 1
## [9,] 2 9 1 5 10 1 6 5 10 9
## [10,] 6 2 10 10 7 6 7 3 7 1
##
## , , 2
##
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0.82 0.98 0.72 0.70 0.19 0.72 0.97 0.52 0.60 0.75
## [2,] 0.95 0.79 0.26 0.45 0.81 0.03 0.60 0.04 0.85 0.10
## [3,] 0.71 0.74 0.31 0.52 0.65 0.11 0.24 0.39 0.15 0.35
## [4,] 0.59 0.87 0.73 0.31 0.13 0.50 0.94 0.47 0.38 0.01
## [5,] 0.87 0.58 0.14 0.64 0.15 0.52 0.47 0.81 0.68 0.96
## [6,] 0.66 0.25 0.20 0.22 0.87 0.17 0.86 0.91 0.05 0.63
## [7,] 0.83 0.96 0.35 0.40 0.83 0.16 0.34 0.13 0.59 0.97
## [8,] 0.32 0.64 0.23 0.86 0.16 0.02 0.65 0.88 0.61 0.68
## [9,] 0.81 0.07 0.47 0.49 0.84 0.57 0.58 0.63 0.21 0.19
## [10,] 0.31 0.94 0.43 0.61 0.60 0.81 0.57 0.91 0.57 0.69
Hence, we can now have different layers representing different aspects of the landscape. For example, the first layer of the array above (,,1) could represent the kind of terrain type for each cell, while the second layer (,,2) could represent the potential crop yield of the cell. The resource function now returns both the resource array and the multi-layer landscape, meaning the code structure is now in place to do some actual biology. We might have resources located on a particular cell increase or decrease the values of one or more layers. This can then be returned as information to agents or retained somehow. It might get fairly memory-intensive if G-MSE is saving every layer of landscape for every time step, so it’s worth thinking about how to use the dynamic landscape in each generation.
Running valgrind on the new program reassures me that I’ve not done anything too bone-headed in allocating memory for a three-dimensional landscape array.
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < gmse.R
It all appears to be freed successfully, with no memory leaks picked up.
==27105==
==27105== HEAP SUMMARY:
==27105== in use at exit: 78,219,488 bytes in 16,679 blocks
==27105== total heap usage: 2,791,027 allocs, 2,774,348 frees, 621,207,897 bytes allocated
==27105==
==27105== LEAK SUMMARY:
==27105== definitely lost: 0 bytes in 0 blocks
==27105== indirectly lost: 0 bytes in 0 blocks
==27105== possibly lost: 0 bytes in 0 blocks
==27105== still reachable: 78,219,488 bytes in 16,679 blocks
==27105== suppressed: 0 bytes in 0 blocks
==27105== Reachable blocks (those to which a pointer was found) are not shown.
==27105== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==27105==
==27105== For counts of detected and suppressed errors, rerun with: -v
==27105== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Hence, I have merged from a local branch to branch dev. I have also read the new landscape into the observation function.
Where all of this is going now
Now that I have the hang of returning multiple elements from C to R
simultaneously (which is accomplished basically by allocating a SEXP
as a list (VECSXP), each element of which can be an array), it will be
easier to think about the code
more holistically – each part of the model can potentially take in and
return every type of object, hence there are no restrictions on what one
model can affect. To model geese, which is probably the first type of
conflict I’m tempted to look at, I can use the population model to allow
landscape layers to affect the probability of geese mortality and for
geese (the RESOURCE array) in turn to affect the landscape, and hence
crop output. Next week, I’m hoping to get the required code for doing
this in place, and to also get some feedback regarding how utility
functions of agents should be modelled at the upcoming workshop after my presentation. The
game-theoretic component can probably be a work in progress though, and
it should be possible to model geese without adding these complexities
until after receiving feedback, though thinking about the game-theoretic
algorithm and data structures continues to be a high priority of
mine.
Future decision-making algorithm
A recent paper by Miyasaka et al. (2017) looks at land use in a social-ecological system using an agent-based model and some interesting decision-making rules. Individuals “calculate utility (expressed in terms of probability) for all land-use and location options […] and select the option with the highest utility”. Basically, agents in this model rank probabilities of all land-use options, then try the one with the highest probability, then go down the line if the first is not successful (I assume that the payoffs after success are identical, though this isn’t entirely clear to me). Agents also shift decisions and labour allocation based on similar households (imitation).
Coding goals for the day
movement_dir: Causes movement in the x or y direction
edge_effect: Does something at the edge when encountered
Further progress
Goal 1 has been completed, and, as I’ve been tempted to do before, I
have added a new utilities.c
file for holding functions
that need to be called by other C files (e.g., moving resources).
The next thing I have done (on an unpushed branch,
now merged) is to tweak the observation function in C to
return a list with two elements. The first element is the set of
observations in array form (as before), and the second element is the
AGENTS array. With the new C code, I can now return
multiple things from C to R with the same function, which opens up
some new possibilities, in particular allowing the landscape to change
along with the resource and observation arrays within the same C
function. It also potentially makes the anecdotal function obsolete,
as the change in the AGENTS array can just be made within
observation instead of calling a new R function.
The next challenge is to get a multi-layered landscape in and out of C from R. I’m not sure how many ways there are to do this, exactly, but the simplest might actually be to write the three-dimensional array so that it is read in the same way as the two-dimensional arrays. This will require a rather nasty set of loops allocating pointers to pointers to pointers (i.e., ***land), but the idea should be easy enough.
Moving back to machine learning in G-MSE
Having now completed a new R package modelling a genetic algorithm for iterated games played on \(2 \times 2\) symmetric payoff matrices, and applied this package to an upcoming presentation on the future of G-MSE, I now turn back to coming up with a functional genetic algorithm for G-MSE software.
Additional issues
Currently, there are five outstanding issues
in G-MSE. Issues 9, 11, and 12 are rather
trivial and easy to implement. Issues 10 (dealing with
the implementation of multiple resources) and 14 (dealing with
multiple landscape dimensions) require more serious consideration.
Perhaps partly because of the recent ConFooBio focus on geese and the
sizeable special
issue in Ambio that just came out, I am thinking about first coding
in multiple layers to LANDSCAPE. As noted in issue 10, this
can be done fairly straightforwardly in R and C – the different layers
can simply be list elements in R (so LANDSCAPE[[i]] is one
layer that is actually a matrix of real values) and read in as a
three-dimensional array in C. I was able to do this while making the
gamesGA package, with the agents array being set in R, then being
unlisted with unlist() before being changed into array form and read
into C in fitness.R.
This isn’t a particularly elegant solution, but it’s one that could work
with code such as the following:
land_r_vec <- unlist(LANDSCAPE_r);
land_r_arr <- matrix(data = land_r_vec, nrow = dim(LANDSCAPE_r)[2]);
Alternatively, and probably better in the long run for efficiency (though I doubt that the above call would lose much), the lists could be read directly into C and returned as lists. I’m not sure if this is possible, but if it is, I’ll have to make use of C data structures that are read in from R’s C interface.
The reason that I’m keen to start with the landscape layer
implementation is that I think this might be the best way to initially
model crop production. The more flexible way to do it would be to put
crops in the RESOURCE array, but this would require much
more memory and computation time than I really think is necessary for
what, in all cases that I can currently conceive, really comes down to
just a real number at a location. By adding this real number to the
landscape and letting it be increased or decreased by agents and
resources, we can have the most straightforward method of modelling
crop production as affected by farmers, managers, and animals. Another
layer, however, is potentially needed mapping x and y
locations to ownership by a particular stake-holder. Thus, I can
imagine a landscape with three layers, the combination of which will
let us address some basic questions concerning conflict between
farmers and conservationists:
## [[1]]
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 4 2 4 2 3 4 3 1 3 3
## [2,] 2 3 2 4 3 1 4 2 2 1
## [3,] 4 3 4 2 4 2 2 1 1 2
## [4,] 2 2 1 1 3 1 1 1 2 1
## [5,] 2 1 4 2 2 1 4 4 2 1
## [6,] 4 3 1 4 1 1 4 3 1 4
## [7,] 3 1 2 3 1 3 1 4 1 1
## [8,] 2 2 4 1 4 2 1 1 4 3
## [9,] 3 2 1 4 1 2 3 4 3 3
## [10,] 2 1 4 1 2 1 1 2 4 4
##
## [[2]]
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.3048322 0.93854880 0.360625346 0.26640245 0.66067493 0.98128527
## [2,] 0.3052339 0.02232611 0.592283911 0.91224653 0.78735033 0.18406582
## [3,] 0.3242061 0.40284640 0.329406406 0.05475348 0.91237753 0.81071997
## [4,] 0.5196604 0.12795754 0.597356118 0.10584243 0.97882973 0.33254518
## [5,] 0.7028296 0.03562418 0.719840874 0.89725788 0.13097754 0.17327250
## [6,] 0.6050400 0.33825595 0.591730594 0.51221232 0.04735206 0.50750169
## [7,] 0.2728841 0.72059192 0.006115102 0.03907956 0.31064451 0.07749048
## [8,] 0.2913193 0.35976381 0.293800810 0.20000116 0.38692792 0.95217694
## [9,] 0.1910186 0.81892451 0.575090826 0.45219661 0.57712307 0.21928149
## [10,] 0.4101885 0.27199997 0.087702182 0.25655657 0.72658493 0.59904482
## [,7] [,8] [,9] [,10]
## [1,] 0.5726239 0.68926847 0.46690671 0.8454668
## [2,] 0.9903127 0.99658248 0.49921442 0.1771480
## [3,] 0.9529401 0.04276868 0.33234136 0.5294589
## [4,] 0.9148535 0.19191359 0.96206345 0.5343843
## [5,] 0.8408531 0.67922100 0.04351657 0.9981276
## [6,] 0.1142356 0.71855988 0.77360567 0.1917698
## [7,] 0.1074951 0.07479546 0.56325225 0.2720286
## [8,] 0.6252215 0.72698941 0.48490239 0.6062893
## [9,] 0.1422347 0.59858735 0.51697953 0.8531933
## [10,] 0.1991840 0.29108627 0.61746590 0.6977964
##
## [[3]]
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 1 1 2 1 2 2 1 2 1 2
## [2,] 1 1 1 1 1 1 2 1 1 2
## [3,] 2 1 2 1 1 1 1 2 2 2
## [4,] 1 1 2 2 2 1 2 2 1 2
## [5,] 1 2 2 2 1 2 1 2 2 2
## [6,] 1 2 1 2 1 2 1 2 2 2
## [7,] 1 1 2 1 2 1 2 1 2 2
## [8,] 1 1 1 2 1 1 2 1 2 2
## [9,] 1 1 1 2 2 2 2 1 2 1
## [10,] 1 2 2 1 2 1 2 1 2 2
Where, above, the first element (i.e., layer) is terrain type (already in G-MSE), the second element is the production potential of crops, and the third element is the stake-holder that owns the land (a zero could be included for public land). It would make sense if the land were contiguous – I don’t see a good algorithm for this, so it might be necessary to make one. I could imagine something that breaks down the map into equal segments (e.g., like programs created to avoid gerrymandering, but easier because we don’t have to worry about population size – at the moment). Of course, if the number of cells does not evenly divide by the number of simulated farmers, then some farmers are going to have bigger farms than others, but perhaps we want this to be the case? It would be nice to be able to specify the variation in farm size. In fact, it would probably be good to use this to also incorporate the total amount of farm space. Something like the below
landscape      <- list();  # Placeholder landscape so the example runs
landscape[[3]] <- matrix(data = 0, nrow = 10, ncol = 10);
farmers     <- 10;
total_cells <- dim(landscape[[3]])[1] * dim(landscape[[3]])[2];
pr_farmland <- 0.9;
farm_cells  <- floor( pr_farmland * total_cells );
extras      <- farm_cells - farmers; # Every farmer needs at least one cell
farm_prp    <- rep(x = extras / farmers, times = farmers); # Vec can change
farm_cells  <- sample(x = farmers, size = extras,
                      prob = farm_prp, replace = TRUE);
farm_cells  <- c(farm_cells, 1:farmers);
cell_table  <- table(farm_cells);
print(cell_table);
The above cell_table therefore shows how many cells each
farmer gets. Some sort of (simple) cluster algorithm is needed to
distribute each farmer’s allocated cells to an area of the landscape.
Remaining cells will be 0 cells, indicating other land.
Note that the pr_farmland could also be a function of the
number of cell types in landscape[[1]]. I haven’t decided
if this should be done in R (as above) or C. I’m leaning toward C (or,
at least, kicking things to C when they get complicated) because I can
imagine the need to specify some detail in these landscapes.
Landscape connections to birth rate
Given the above proposed additions to the landscape, it’s worth
perhaps considering an option in the resource function to
link resource birth rate to properties of the landscape. As of now,
carrying capacity is just assumed to be a static parameter of the
model, but it would be particularly interesting if the parameter value
could change based on properties of the landscape. For example, all of
the values of landscape[[2]] could be summed up (perhaps
multiplied by a scalar) to determine the number of offspring produced
on any given cell. If, instead of random uniform values between zero
and one, landscape[[2]] represented something more concrete,
such as kilograms of edible biomass produced (or whatever),
this could be directly translated to offspring reared. Of course, with
the geese, this opens up some issues – mainly that breeding is done
somewhere else; perhaps landscape[[2]] should instead
affect survival rather than birth rate?
It’s important to think about the scale here too – as a habit, I’ve been thinking of cells as kind of mid-sized things, perhaps a square kilometer at most, but it might be better to think of them as much larger, so each cell could be its own farm with potentially many geese. Of course, we’ll want the option to do both, but given the scale of the geese scenario, I’m thinking bigger might be better. It also would be useful to have managers be able to allocate refuge space from their budgets.
Big picture notes regarding G-MSE presentation
In presenting G-MSE, I think it is important to emphasise that game theory is the standard, formal, tool for understanding conflict between rational agents. Hence, it is the natural tool for addressing issues of cooperation and conflict in conservation (Colyvan et al. 2011; Lee 2012; Kark et al. 2015; Adami et al. 2016; Tilman et al. 2016). It’s important to also recognise that game theory is broader in scope than the application of standard pay-off matrices, and includes extensions such as adaptive dynamics. Where strategies are complex, machine learning techniques such as the use of genetic algorithms can be used to find adaptive strategies for games (e.g., Balmann and Happe 2000; Tu et al. 2000). And a game-theoretic framework is entirely compatible with agent-based modelling (Tesfatsion et al. 2017). Bonabeau (2002), citing Axelrod, argues strongly for an agent-based approach to game theory. Hence, a good summary of the key concepts of G-MSE is shown below.
The big green circle is the engine that drives the decision-making of rational agents (i.e., managers and stake-holders) under complex choices and payoffs.
New considerations for machine learning in gmse
Finishing the gamesGA R package has
given me a bit more perspective on the eventual structure of the
genetic algorithm of gmse. Taking into account the history of
interactions between two agents was straightforward in the case of
Prisoner’s dilemma, or any symmetrical \(2
\times 2\) payoff matrix. Because there were only two options to
consider (‘cooperate’ and ‘defect’), every locus of each agent’s
strategy just represented a response to a different interaction history.
By changing the default parameters, I was also able to recreate the
results of Darwen and Yao (1995), who
found that strategy evolution under the following payoff values tended
to result in long periods of defection or cooperation punctuated by
rapid transitions from one to the other.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 3, 3 | 0, 5 |
| Focal player defects | 5, 0 | 1, 1 |
Typical evolution of strategies given the above payoff matrix and a three-move memory history with 100 rounds per opponent looks like the below. Periods of low fitness show areas where most strategies have evolved to defect, while periods of high fitness show areas where most strategies cooperate.
This simple example highlights something that is potentially important for understanding conflict in conservation scenarios: cooperation (and conflict) might be fragile, with rapid shifts from one strategy dominating to another taking over without much external pressure.
The fragility or robustness of conflict in conservation could have major influences on policy, particularly where we’re interested in coming up with long-term sustainable solutions. Two key questions immediately come to mind:
Will results from gamesGA scale up with
complexity? That is, real-world conflicts are much more complex
than this simplified model, so will this complexity make existing
cooperation and conflict more robust, or more fragile? We can draw a
comparison here, perhaps, with the community ecology literature, where
the question has long been posed: are more complex communities
more stable and more productive? There is a lot of recent
literature on this, both from theoretical and empirical studies, and
stretching all the way back to the early works of Elton
and May.
Applying similar ideas to social-ecological systems could be useful – it
could be that, like community ecology, there are qualities of such
systems that make cooperation or conflict more robust (I’ve been
particularly interested in degeneracy). I’ll revisit some of the
community ecology literature to remind myself what the key points
are.
I’m wondering whether a couple of papers could be especially useful to
the conservation literature – one could be a perspective piece just on
the application of game theory to understanding conflict and management,
and things that will need to be taken into account (more on this later –
but would include time lags, interactions among stake-holders,
degeneracy of effects, etc.); the idea of applying game theory to
management questions and conflict is now familiar to ConFooBio, but a
lot of the thinking we’ve done could risk being lost if not published as
a lead-in to more complex modelling, behavioural games, or time-series
analyses. A second paper could be a basic starting point for addressing
how robust cooperation and conflict are predicted to be – this might be
answerable without the full power of a complex gmse
software, focusing instead on modelling some simplified scenarios (using
a version of gmse
with a more simplified ‘g’), then
(probably) concluding that additional work will be needed to really get
at complex real-world case studies (which we’ll do with gmse).
Other thoughts on strategy
It’s also worth noting that gamesGA
does not allow for
some strategies that might be relevant, such as the ‘win-stay,
lose-shift’ strategy (Nowak et al. 1995).
There is also considerable work on the robustness of
cooperation in games such as Prisoner’s dilemma – a lot of which
consider spatial effects (local networks, grid-based cooperation)
explicitly. I’ve not found anything that looks at this in the context of
conservation though, so I think there could well be scope for a
high-level perspective paper here. It could also be worth considering
other types of games, such as the snowdrift (chicken, hawk-dove) game,
in which both cooperation and conflict are potential outcomes.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 2, 2 | 3, 1 |
| Focal player defects | 1, 3 | 0, 0 |
The above payoff matrix produces even more fragile results (shown below).
These results become more robust, though, when the payoff differences are more severe, so that mutual defection is much worse. For example, consider the following payoff matrix.
| | Opponent cooperates | Opponent defects |
|---|---|---|
| Focal player cooperates | 12, 12 | 13, 11 |
| Focal player defects | 11, 13 | 0, 0 |
Defection given the payoffs above has a much more difficult time getting a foothold because any time defectors become sufficiently frequent in a population, their fitness drops dramatically compared to cooperators.
The point is that the differences between payoff values will matter by increasing the risk associated with defection (or cooperation). Note that values near zero in the first plot implied a population of mostly defectors, whereas an equal magnitude of change in mean fitness does not correspond to as great a difference in the proportion of cooperators and defectors in the second figure.
A perspective paper on theory of conflict and cooperation in conservation?
Given that there is little to nothing on application of game theory
to conflict and cooperation in conservation, it strikes me that a
forward-looking paper could be useful for establishing some things –
perhaps making use of the gamesGA
R
package as a conceptual tool for demonstrating some key points.
Relevant topics include:
Background on conflict and cooperation in the context of conservation, particularly biodiversity and food security
Background on game theory, and its use in understanding the logic of conflict and cooperation in both biological and social systems
Key questions in applying game theory to conflict and cooperation in conservation social-ecological systems.
The gamesGA point about robustness
Specific points regarding the complexity inherent to predicting social-ecological conflict: how to address these in a way that can be beneficial for management recommendations
- [Machine learning](https://en.wikipedia.org/wiki/Machine_learning) and [genetic algorithms](https://en.wikipedia.org/wiki/Genetic_algorithm): flexible tools for understanding and predicting complex strategies. Discuss how these tools are already used successfully in other contexts.
- [Agent-based modelling](https://en.wikipedia.org/wiki/Agent-based_model) as an approach to simulating realistic scenarios of cooperation and conflict
- Explicit consideration of [degeneracy](https://en.wikipedia.org/wiki/Degeneracy_(biology)) as a potential management strategy for increasing the robustness of cooperation [see recent work by @Man2016]
I’m not sure if this is the best structure or not, but I think I could see a paper like this succeeding in setting up the importance for future work.
gamesGA: a new R package that can also be run from a browser
In preparation for two upcoming workshop talks, I have developed a new R package to
demonstrate the potential of machine learning and genetic algorithms in
understanding human conflicts between food security and biodiversity.
Package installation
instructions are available on the GitHub repository, and the program
can also be run directly from a
browser courtesy of shiny.
This package is mostly to serve as a proof of concept; while limited in
its applications (though it could later be incorporated into
gmse, if desired), it demonstrates a relevant and flexible
application of machine learning to game theory. Further, the fact that
the processing time of simulations is very rapid – not even noticeable
when run from a browser (note, the fitness function is coded in C;
had it been coded in R, most simulations would take up to a minute) –
shows that it is realistic to simulate multiple genetic algorithms
(underlying multiple agent strategies) within a program. I have no
desire to upload gamesGA to CRAN unless it is
requested.
In the coming days, I will continue to put together a forward-looking talk that outlines how management strategy evaluation can be combined with game theory (making use of genetic algorithms to understand behaviour) to better understand and potentially help resolve conflicts over food security and biodiversity. I think that it will be reasonable to argue that the range of strategies predicted by even a simple iterated Prisoner’s dilemma (or any other two-player, two-decision symmetric payoff scenario) probably reflects, reasonably well, the kind of variation in human behaviour that might be predicted in real systems. Most humans will not act completely rationally, therefore we might expect a lot of uncertainty in human behaviour where conflict arises; most strategies will be aligned with the interests of individual stake-holders, even if they are not perfect at maximising stake-holder interests.
A central purpose of G-MSE software will be to provide a
user-friendly yet flexible tool for simulating the management of
populations, with particular emphasis on a mechanistic simulation of
uncertainty and interactions among managers and stake-holders. Hence,
the software will be able to address key questions concerning conflict
in all of the specific ConFooBio case studies, but also provide a
general framework for developing social-ecological theory. My hope is to
introduce v1.0 by the end of the year, which will take advantage of shiny to let users run simulations
and view results within a browser, giving as many users as possible
access to the key features of G-MSE. Because shiny is called directly
from R, users who are familiar with R will be able to use functions
within a gmse
package (the name is not yet taken on the CRAN
list). All of the code underlying G-MSE, and its complete
development history, will be available on GitHub for
maintenance, further development, and collaboration (currently, the
repository is private, meaning it is viewable by invite only, but I’ve
no qualms with making it public). My goal now is to summarise
what aspects of G-MSE have already been developed, and to outline my
plans for future development. Feedback at this stage is very welcome,
particularly concerning what features of the software are most (or
least) important. The figure below illustrates a general
overview of G-MSE. The left panel represents how users will interact
with the software, and the right panel represents the model itself,
which uses the G-MSE concept proposed in the ConFooBio ERC
proposal.
We now have a working, stable (i.e., bug-free, as far as I can tell), resource model (blue box above) and observation model (yellow box above). Details of how these models are coded and used can be found in the notes below, and I am happy to summarise them. For now, I will avoid the technical details and focus on what these models can do; the code is written with future development in mind, meaning that if there is a feature that is not in either of these models that should be, adding it will almost certainly be a matter of inserting a bit of additional code rather than re-coding major chunks of the model. I’ll start by talking about the resource model.
The resource model is, by default, individual-based.
What this means is that each resource is represented as a discrete
entity with a number of different attributes. I use ‘resource’ as a
general term because these resources can really be anything that we want
them to be; potential resources include grouse, hen harriers, geese,
fish, elephants, crops, hunting licenses, etc. Basically, anything that
we want to represent discretely that is not an agent (a manager
or stake-holder) can be considered a resource. Each resource has its own
ID, and can take a natural number in each of three type categories
(i.e., type1 can take any natural number to group resources
in some way, as can type2 and type3). Types
could be used for different populations of resources within the same
simulation (e.g., hen harriers and grouse; wild and farmed salmon), or
further define life-history stages, sex, or something else.
Resources occupy some x and y location on a
landscape. The landscape can be of any length and width combination,
and has a torus edge whereby opposite edges are joined, so that
resources moving off of one side appear on the other (I can easily add
a hard boundary, or a reflecting edge if desired). Currently, the
landscape has one layer
(more could be added), with cells on the landscape taking any real
number. I’m not using cell values at the moment, but these could
represent anything from terrain types to environmental variables. During
one iteration of the resource() function, resources move
according to one of four pre-specified rules:
- movement in the x and y direction selected from a uniform distribution
- movement in the x and y direction selected from a Poisson distribution
- movement in the x and y direction selected from a uniform distribution (Duthie and Falcy 2013)
After movement, each resource can potentially reproduce according to
a growth
parameter. The number of offspring that a resource
produces is determined by using this as the rate parameter in sampling
from a Poisson distribution. A carrying capacity can be applied to birth
such that if too many offspring are born, then offspring are randomly
removed until carrying capacity is reached. Offspring resources have
identical traits to their parent resources. It is also possible to not
allow any birth for some resources.
After birth, resources that were not just born can be removed (i.e., death) in one of three ways, one of which removes resources according to each resource’s remove trait.
The resource model then returns the new set of individuals; we therefore have the basic processes of movement, birth, and death. These processes could be made more complex (e.g., sexual reproduction, more complex movement rules – toward or away from other resources, perhaps), and any number of other processes might be added into the resource model, including interactions between resources, if desired. I’m considering what we have now as a starting point.
The observation model simulates the process of data collection (but not data analysis, which is done elsewhere – eventually probably in the manager model). It basically generates an uncertain snapshot of the real population(s) by sampling from it in one of four ways (A-D):
The figure below shows the dynamics of a real population (black line) with a carrying capacity on death of 800 (dotted red line), as estimated by each method (blue lines, panels A-D).
We can run 100 time steps of 800 resources in a trivial amount of time (less than half a second) using any observation method. Of course, things slow down when adding more resources or generations, but even hundreds of thousands of resources can be simulated over 100 time steps in under a minute.
In addition to the resource and observation models,
I have played around with a few more minor things that can be called on
when desired. This includes a function called anecdotal
that allows agents (managers and stake-holders) to see any
resources within their local view – essentially mimicking
anecdotal observation through seeing how many resources are around them
at any given time (this might later affect stake-holder attitudes or
decisions).
The most interesting other thing that I’ve added is a prompt for user input. Basically, after a certain number of time steps (or right from the start of the simulation), an option in the program allows the user to act like a stake-holder or manager. After a time step has finished, the user is prompted with a message like the following on the R command line:
Year: 95
The manager says the population size is 181
You observe 11 animals on the farm
Enter the number of animals to shoot
The user then types in how many animals that they wish to shoot, and these animals are removed from the population.
A detailed journal of recent development history is below. Here I will summarise how I plan to
complete the software, and the rationale behind some (tentative)
decisions. Right now, I am focused on getting through the main engine of
G-MSE (red, green, and orange boxes from the first figure above), with the primary challenge of
integrating game theory into G-MSE. The manager and
users models are unique in that both require agents to make decisions
that potentially affect each other and the resources. I am simulating
agents as discrete individuals, but unlike resources, agents have
different traits and are represented by different data structures in the
code. Like resources, however, agents can take on any number of three
different category types. Category type1
is the type of
most importance, which is used primarily for distinguishing among
managers and different types of stake-holders. The manager(s) is always
of type1 = 0, and plays a special role in observing the
population, and will make policy decisions based on the observational
model and (eventually) the numbers and past behaviour of stake-holders
through the manager model. Other type1
agents act as
stake-holders instead of managers, and act through the user model.
I’ve spent some time trying to decide how to incorporate game theory into the G-MSE software. There is more than one way to do it. My first thought was to model games using the traditional \(2 \times 2\) payoff structure, with managers setting the payoffs and two stake-holders acting as players trying to maximise their gains. Given this sort of structure, solving for optimal strategies can be easily accomplished, and we can certainly add this type of mathematical solution as an option in G-MSE. The utility of this kind of mathematical approach starts to unravel, however, when games become more complex (discussion and references all below, mostly from late January). In particular, solving for optimal strategies and equilibria (of which there can be multiple) can become increasingly difficult, or even intractable, given any of the following:
Any realistic social-ecological conflict is probably going to include one or more of the above complications. While I really like simple mathematical and conceptual models (particularly those that provide unifying explanations), and believe that they are especially useful in developing theory, I don’t believe that the case studies we are interested in will be tractable if we exclude the above bulleted possibilities. Nor will the software be very flexible if we confine users to very simple games. Hence, I think a different approach is needed to model the strategies of rational agents.
An increasingly used method of simulating complex, goal-oriented strategies is machine learning. The idea behind the machine learning approach is to teach a program to learn, so that the program can figure out how to solve a problem (e.g., find the best strategy) without actually being told the solution; figuring out the best solution directly would be effectively impossible because there are too many possible solutions to explore and compare. One technique for narrowing down the possibilities and arriving at a very good (though possibly not the best) solution is to simulate the process of natural selection using a genetic algorithm. The genetic algorithm starts off with a random set of simulated genomes (genotypes), each of which maps to a random strategy (phenotype). It then allows for recombination between genomes, and mutation, and checks each strategy to see how good it is at solving the problem at hand (e.g., maximising the payoff in a game). The most fit strategies reproduce, and more generations are simulated until some criterion is met (e.g., the mean fitness of strategies is no longer improving, or 100 generations have passed). Once this criterion is met, the evolved strategies will have been selected to solve the problem.
Additionally, a machine learning approach can use data to learn how to behave in a particular scenario. Google uses this in some of its software, perhaps most familiarly in Gmail’s sorting of incoming emails into different categories. Most timely, and perhaps most excitingly for those of us who are interested in games, a machine learning algorithm has very recently been developed that can consistently beat professional poker players. From the linked article in MIT Technology Review,
‘’DeepStack learned to play poker by playing hands against itself. After each game, it revisits and refines its strategy, resulting in a more optimized approach. Due to the complexity of no-limit poker, this approach normally involves practicing with a more limited version of the game. The DeepStack team coped with this complexity by applying a fast approximation technique that they refined by feeding previous poker situations into a deep-learning algorithm.’’
My goal is to apply a genetic algorithm to G-MSE, which will allow manager and stake-holder behaviour to be modelled for any potential objectives (e.g., maximise crop yield, keep populations near carrying capacity, keep all stake-holders happy, etc.) and will allow for multiple types of actions (e.g., hunt, scare, plant crops, protect offspring, pester the manager, forbid stake-holders). I’m not yet sure if this is realistic or not, but I think the genetic algorithm approach will at least get us further than anything else (save for some sort of brilliant new conceptual theory that shows how we can avoid the aforementioned complications). I’ve drafted a prototype genetic algorithm in R, which conceptually looks like the following figure.
The end result of this kind of implementation of G-MSE could allow us to:
This concludes the summary. There are a lot of challenges to implementing the genetic algorithm, but a very initial prototype below shows the idea. My hope is to have a beta version of G-MSE up and running sometime in the summer (with a polished version later in the year), and to continue to build upon the model as needed to allow for new scenarios and improved genetic algorithms. I am very open to feedback on what is and is not important for initial versions of G-MSE.
Prototype of genetic algorithm in R
I have constructed a prototype of a genetic algorithm, written in R, but deliberately avoiding most base R functions that are not available (or that I won’t want to use) in c. Once I have a prototype that I’m happy with, I will write it up in c and start to implement it into G-MSE. There are a few tricks that I’m going to want to use, particularly to swap arrays in the tournament, which I believe can be accomplished just by swapping all pointer addresses. Additional optimisation ideas might be found here; I’ll probably need to be careful to keep this process speedy, but even the initial R code is fairly efficient, so I’m optimistic.
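The pointer-swap trick mentioned above could be sketched in c roughly as follows; this is a minimal illustration of the idea, and the names are placeholders rather than anything from the GMSE source.

```c
/* Swap two populations by exchanging their top-level pointers rather
 * than copying elements -- an O(1) operation regardless of how large
 * the underlying 2D arrays are. Names here are illustrative only.   */
void swap_populations(double ***pop_a, double ***pop_b){
    double **temp;

    temp   = *pop_a;
    *pop_a = *pop_b;
    *pop_b = temp;
}
```

After a call such as `swap_populations(&population, &winners);`, the variable `population` points at the winners' data with no element-by-element copying.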
I’ve broken the R code down into five basic functions, representing
the boxes from the most recent conceptual diagram from 3 FEB, with the exception of ‘replace’, which
is done automatically within ‘tournament’. The first function identifies
the focal agent in UTILITY, then initialises a population of 100 agents,
10 of which are identical to the focal agent, and 90 of which are
identical in all except their five action columns, which are
randomised instead (note, the whole file is recorded by git in
scratch.R
, which might later be overwritten). We also need a min_cost function to run in initialise_pop to allocate actions according to costs cleanly.
# Return the minimum cost among the five cost columns (7, 9, 11, 13, 15)
min_cost <- function(budget_total, util, row){
    the_min <- budget_total;
    for(check in 1:5){
        index <- (2 * check) + 5; # Cost columns are 7, 9, 11, 13, 15
        if(util[row, index] < the_min){
            the_min <- util[row, index];
        }
    }
    return( as.numeric(the_min) );
}
# Add row 10X to 90 random (first brown box)
initialise_pop <- function(UTILITY, focal_agent, population){
    for(agent in 1:dim(population)[1]){
        if(agent < clone_seed){
            for(u_trait in 1:dim(population)[2]){
                population[agent, u_trait] <- UTILITY[focal_agent, u_trait];
            }
        }else{ # No need to bother with a loop here -- unroll to save some time
            population[agent, 1]  <- UTILITY[focal_agent, 1];
            population[agent, 2]  <- UTILITY[focal_agent, 2];
            population[agent, 3]  <- UTILITY[focal_agent, 3];
            population[agent, 4]  <- UTILITY[focal_agent, 4];
            population[agent, 5]  <- UTILITY[focal_agent, 5];
            population[agent, 6]  <- UTILITY[focal_agent, 6];
            population[agent, 7]  <- UTILITY[focal_agent, 7];
            population[agent, 9]  <- UTILITY[focal_agent, 9];
            population[agent, 11] <- UTILITY[focal_agent, 11];
            population[agent, 13] <- UTILITY[focal_agent, 13];
            population[agent, 15] <- UTILITY[focal_agent, 15];
            population[agent, 8]  <- 0;
            population[agent, 10] <- 0;
            population[agent, 12] <- 0;
            population[agent, 14] <- 0;
            population[agent, 16] <- 0;
            lowest_cost  <- min_cost(budget_total = budget_total, util = UTILITY,
                                     row = focal_agent);
            budget_count <- budget_total;
            while(budget_count > lowest_cost){
                affect_it <- 2 * floor( runif(n = 1) * 5 ); # In c, do{ }while(!=6)
                cost_col  <- affect_it + 7;
                act_col   <- affect_it + 8;
                the_cost  <- population[agent, cost_col];
                if(budget_count - the_cost > 0){
                    population[agent, act_col] <- population[agent, act_col] + 1;
                    budget_count <- budget_count - the_cost;
                } # Inf possible if keeps looping and can't remove 1
            }
        }
    }
    return(population);
}
After the initial population of agents is made, we include functions through which the genetic algorithm will loop, simulating key evolutionary processes. The first such process is crossing over.
# Crossover (second brown box)
# Would really help to define the SWAP function in c here -- use int trick
crossover <- function(population){
    agents     <- dim(population)[1];
    cross_prob <- 0.1;
    act_cols   <- c(8, 10, 12, 14, 16); # The five action columns
    for(agent in 1:agents){
        for(act_col in act_cols){
            if(runif(n = 1) < cross_prob){
                cross_with <- floor( runif(n = 1) * agents ) + 1;
                temp       <- population[cross_with, act_col];
                population[cross_with, act_col] <- population[agent, act_col];
                population[agent, act_col]      <- temp;
            }
        }
    }
    return(population);
}
Crossing over is followed by mutation.
# Mutation (third brown box)
# Note that negative values equate to zero -- there can be a sort of threshold
# evolution, therefore, a la Duthie et al. 2016 Evolution
mutation <- function(population, mutation_prob){
    mutation_prob <- mutation_prob * 0.5;
    act_cols      <- c(8, 10, 12, 14, 16); # The five action columns
    for(agent in 1:dim(population)[1]){
        for(act_col in act_cols){
            draw <- runif(n = 1); # One draw decides both mutation directions
            if(draw < mutation_prob){
                population[agent, act_col] <- population[agent, act_col] - 1;
            }
            if(draw > (1 - mutation_prob) ){
                population[agent, act_col] <- population[agent, act_col] + 1;
            }
        }
    }
    return(population);
}
The function below ensures that the costs of agents’ actions do not exceed the total budget. If they do after crossover and mutation, then actions are randomly removed until the costs are within the budget.
# Need to incorporate selection on *going over budget* and *negative values*
constrain_cost <- function(population){
    act_cols  <- c(8, 10, 12, 14, 16); # Action columns
    cost_cols <- c(7, 9, 11, 13, 15);  # Corresponding cost columns
    for(agent in 1:dim(population)[1]){
        over <- 0;
        for(i in 1:5){
            if(population[agent, act_cols[i]] < 0){
                population[agent, act_cols[i]] <- 0;
            }
            over <- over + (population[agent, act_cols[i]] *
                            population[agent, cost_cols[i]]);
        }
        while(over > budget_total){
            affect_it <- 2 * floor( runif(n = 1) * 5 ); # Must be a better way
            cost_col  <- affect_it + 7;
            act_col   <- affect_it + 8;
            if(population[agent, act_col] > 0){
                the_cost <- population[agent, cost_col];
                population[agent, act_col] <- population[agent, act_col] - 1;
                over <- over - the_cost;
            }
        }
    }
    return(population);
}
After mutation, a fitness function checks the fitness of each agent. This will eventually be a complex function balancing actions according to costs and utility, but for now I’ve just given the agent with the highest value in the 16th column the highest fitness (i.e., maximise helpem).
# Fitness -- this is the most challenging function
# Just as proof of concept, let's just say fitness is maximised by helpem (16)
strat_fitness <- function(population){
    fitness <- rep(0, dim(population)[1]);
    for(agent in 1:length(fitness)){
        fitness[agent] <- population[agent, 16];
    }
    return(fitness);
}
Finally, we have tournament selection, which also replaces the original population. Tournament selection proceeds by randomly selecting four agents from the population, and passes the one of the four with the highest fitness to the next population. This kind of tournament selection seems effective, and will be more efficient in c than some other tournament types (e.g., best 4 out of 10), I think.
# Tournament selection on population
tournament <- function(population, fitness){
    agents  <- dim(population)[1];
    traits  <- dim(population)[2];
    winners <- matrix(data = 0, nrow = agents, ncol = traits);
    for(agent in 1:agents){
        r1 <- floor( runif(n = 1) * agents ) + 1;
        r2 <- floor( runif(n = 1) * agents ) + 1;
        r3 <- floor( runif(n = 1) * agents ) + 1;
        r4 <- floor( runif(n = 1) * agents ) + 1;
        wins <- r1;
        if(fitness[wins] < fitness[r2]){
            wins <- r2;
        }
        if(fitness[wins] < fitness[r3]){
            wins <- r3;
        }
        if(fitness[wins] < fitness[r4]){
            wins <- r4;
        }
        for(trait in 1:traits){
            winners[agent, trait] <- population[wins, trait];
        }
    }
    return(winners);
}
We can therefore simulate the genetic algorithm with the following code, which simulates 30 iterations (i.e., generations) of crossover, mutation, and selection.
clone_seed   <- 11;
budget_total <- 100;
focal_agent  <- 2;
# Add three agents, representing three stake-holders, to the utility array
a0 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
a1 <- c(1, 0, 0, 2, 0, 0, 8, 5, 30, 0, 20, 0, 10, 0, 10, 0);
a2 <- c(2, 0, 0, -1, 1, 1, 0, 0, 50, 0, 0, 1, 1, 2, 2, 1);
UTILITY    <- rbind(a0, a1, a2);
population <- matrix(data = 0, ncol = 16, nrow = 100);
population <- initialise_pop(UTILITY = UTILITY, focal_agent = 2,
                             population = population);
mean_fit   <- NULL;
iterations <- 30;
while(iterations > 0){
    population <- crossover(population = population);
    population <- mutation(population = population, mutation_prob = 0.2);
    population <- constrain_cost(population = population);
    fitness    <- strat_fitness(population);
    population <- tournament(population = population, fitness = fitness);
    mean_fit   <- c(mean_fit, mean(fitness));
    iterations <- iterations - 1;
}
The plot below shows that the algorithm converges on the best-fitness strategy (mean_fit) quite rapidly. Note that a strategy fitness of 10 is the highest possible, because agents have a total budget of 100 and each helpem costs 10 from this total budget. The rapid convergence is encouraging – the time taken from start to finish for this genetic algorithm is only 0.182 seconds in R (note also that it found the solution in about half that time), and it will of course be much, much faster in c. Things will get slower as fitness functions become more complicated, and convergence might take a while when multiple things are being optimised at once. Also note that the genetic algorithm will need to be run for multiple agents, slowing the process down.
Re-structuring the UTILITY array
I’m now noticing that there is an error in the genetic algorithm as applied to G-MSE. While the algorithm shows a proof-of-concept well, agents aren’t actually individual rows in UTILITY; they’re list elements made up of data frames. Hence, the code that I just constructed needs to be applied not to an individual row, but to lists.
I think that the above point might be a good excuse to improve upon the data frame itself, and specifically to incorporate costs in a more effective way, then improve upon the algorithm. It’s always important to keep in mind that the goal of the genetic algorithm is to teach agents to learn to maximise their own utility. Different agents will do this in different ways, so we need to keep everything broad – one idea might be to incorporate everyone’s UTILITY in the utility list; this could help out with the manager too. So now the data frame would look like the below.
agent | type1 | type2 | type3 | util | cost_util | … | cost_h | helpem |
---|---|---|---|---|---|---|---|---|
0 | 1 | 0 | 0 | 2 | 101 | … | 101 | 0 |
0 | 2 | 0 | 0 | 0 | 101 | … | 101 | 0 |
1 | 1 | 0 | 0 | 2 | 0 | … | 10 | 0 |
1 | 2 | 0 | 0 | -1 | 1 | … | 2 | 1 |
2 | 1 | 0 | 0 | 0 | 101 | … | 101 | 0 |
2 | 2 | 0 | 0 | 1 | 101 | … | 101 | 0 |
Something is still not quite perfect yet. Note that I’ve added a cost_util, which could be the cost of lobbying another stake-holder, or the manager. We could then see stake-holders using some of their budget to affect manager utilities. Each agent has its own array too – so the costs need to uniquely reflect the cost to agent index of affecting another agent’s action or utility.
Maybe this is the wrong way to go – the first two rows might be the utility and then the costs of the focal agent (as partially imposed by other agents), where subsequent rows could just be the costs of imposing on all other agents? The array could then look something like the below. Note that values of Inf, in the code, should just be some value that is higher than the cost of any agent, making it impossible for such values to be altered.
agent | type1 | type2 | type3 | util | u_loc | u_land | movem | castem | killem | feedem | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|
-2 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-2 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
-1 | 1 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 2 | 3 | 3 |
-1 | 2 | 0 | 0 | 0 | 1 | 0 | 5 | 20 | 12 | 5 | 10 |
0 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
0 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
1 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
1 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 1 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
2 | 2 | 0 | 0 | Inf | Inf | Inf | Inf | Inf | Inf | Inf | Inf |
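As noted above, the Inf cells would, in the code, just be encoded as some value higher than any possible cost, since c integer arrays have no Inf. A minimal sketch of that encoding is below; COST_IMPOSSIBLE and can_afford are names assumed here for illustration only, not GMSE identifiers.

```c
/* A sentinel cost larger than any realistic total budget stands in
 * for the 'Inf' cells in the table above: actions carrying this cost
 * can never be afforded, so they can never be altered.              */
#define COST_IMPOSSIBLE 100001

/* Returns 1 if an agent with the given remaining budget can afford
 * one unit of an action with the given cost, else 0.                */
int can_afford(int budget_remaining, int action_cost){
    if(action_cost >= COST_IMPOSSIBLE){
        return 0; /* The action is effectively outlawed */
    }
    return budget_remaining >= action_cost;
}
```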
Note that agent number -2 refers to the utility values of the focal agent – in its rawest form, what the agent wants or values. This is defined for each type of resource (type1). In the above, for example, if we consider resources where type1 = 1 to be crops and type1 = 2 to be geese, then we might have a farmer represented (note – the farmer likes crops, but is neutral on geese per se – I’m just assuming that geese are fine as long as they don’t affect crop production). The farmer can also specify whether the utility of each resource is dependent upon its location u_loc being within its view, or some other range – I’m a bit nervous about this column, as I fear that it might constrain the kinds of questions that can be addressed. It might be good to actually make this any natural number, rather than a TRUE or FALSE, and then have utility attached to a natural number on some layer of the landscape, so that one layer of LANDSCAPE can hold the number of whoever owns it (zero for the manager); a -1 could always just code for within view. The u_land column specifies whether or not utility is attached to the value of some landscape layer (e.g., perhaps representing quality of land, or production in some cases). And finally, the variables movem, castem, killem, feedem, and helpem are all actions – what the focal agent does (initialised at zero above).
The agent number -1 rows are identical for all but the last five columns, which refer to the cost values for affecting each of the actions; cost is drawn from some total budget. Finally, the remaining rows are the costs for changing all other agents’ costs with respect to all resource types (note that before now, we’ve effectively only had the first four rows – this adds to things). This means that the focal agent might, potentially, increase or decrease the cost of a different stake-holder performing an action – this will mostly be applied when the manager is the focal agent, changing, for example, how much it costs for stake-holders to scare or hunt resources. But the model allows stake-holders to affect each other’s costs too, if this is useful. Focal agents might also try to affect others’ utility values; for example, a stake-holder might pay some cost to try to close the gap between their utility value for a particular resource and the manager’s (or a rival stake-holder’s) value (i.e. ‘lobbying’). Note that a focal agent’s costs are represented twice, once where agent = -1, and once where agent equals the agent’s natural number. However, the latter represents the ‘cost’ of changing its own cost, which I’ve outlawed by setting it to Inf. There are potentially some other uses for this redundancy. It might be useful in the future if we want to simulate negotiations, so that an agent has their original cost/utility values, and a copy of what they are after they have been altered in some way.
The data frame above is therefore one of three data frames (one for
each agent), each of which is an element in a list. The data frame above
shows the utilities, actions, and costs of a focal agent, and the costs
of affecting other agents’ costs. In the above, affecting other agents’ costs is always forbidden, so these values are all Inf. A manager should be able to adjust these costs to enact policy – for example, by outlawing the killing of resources.
Modelling crop yields and compensation
The compensation scheme and importance of government funding that
Saro has noted in her summary of geese and farming conflicts leads me to
believe that some direct form of compensation needs to be included in
the genetic algorithm. Adjusting the government funding is simple enough – we can just change the budget of the manager. How compensation – and farming more generally – will work is a bit more complicated. Here are three ways that I see could work:
1. Use a layer of LANDSCAPE to represent maximum farm yield. Reduce this yield for each organism on the land – assign one type of stake-holder to a patch of land using type as an index of an individual farmer (note that the extra rows in the AGENTS array might need to be ignored for determining agent actions).
2. Make the crop a type of RESOURCE, thereby allowing it to interact more directly with other resources. The downside is that the RESOURCE array could get quite big, and more loops would probably be required to manage it.
3. Add utility and cost columns in the UTILITY array.
Of course, there’s no reason that all of these options can’t be implemented depending on the situation; they aren’t mutually exclusive in the code. I’m inclined to try the first option as a default. Adding a real number to each landscape cell could represent expected crop yield, and this number could change depending on the presence of resources and the actions of farmers. Note that the landscape cells are already initialised with a real number that was meant to represent types of landscape – this can just be changed to a real number that represents crop yield. The files resource.c and observation.c already read the landscape in c as an array of type double; it’s really just a matter of using the landscape that is already available.
Crop yield could therefore affect utility – in that some utility value is assigned to (and multiplied by) the value of each cell. Presence of an organism could decrease this value (sidenote: we’ll need to think about order of operations in the model). Compensation could directly off-set the loss of utility. So we could take the data frame from Friday:
type1 | type2 | type3 | util | u_loc | u_land | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 0 | 0 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 1 | 1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
The column util could just be the direct effect that resources of type1 have on the agent, u_loc could be whether or not utility is affected by location, and, if it is, then u_land could be the resource’s effect on the landscape layer (for now, there is only one layer of landscape values, but another column should probably be added to specify the layer). Hence, a goose, to a farmer that doesn’t care about geese at all but wants high crop yield, could be represented by util = 0, u_loc = 0, and u_land = -1. A farmer that kind of likes geese but also wants crop yield could be something like util = 1, u_loc = 0, and u_land = -1. Note that the last farmer likes geese, but does not care where the geese are (u_loc = 0), and does not want their effects on the farmer’s land (u_land = -1). I think that this is probably the right way to go, though optimising fitness given these multiple interests will be a challenge.
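As a rough illustration of how util and u_land might combine with a cell’s crop yield, consider the sketch below. The linear form and all names are assumptions made here for illustration – this is not the actual GMSE fitness function.

```c
/* Payoff an agent gets from one landscape cell: each resource on the
 * cell shifts the cell's yield by u_land (e.g., -1 per goose), and
 * the agent's payoff is its direct utility for the resources present
 * plus its utility for the remaining yield. Yield cannot go below 0. */
double cell_payoff(double util, double u_land, int res_count,
                   double max_yield){
    double yield = max_yield + (u_land * (double) res_count);

    if(yield < 0.0){
        yield = 0.0;
    }
    return (util * (double) res_count) + yield;
}
```

Under this sketch, a farmer with util = 0 and u_land = -1 loses one unit of payoff per goose, while a farmer with util = 1 and u_land = -1 breaks even on each goose.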
It might also be useful to have a compensation column, which managers could affect, though this could be done in other ways too. Also, I think I have interpreted util in a couple of different ways throughout the notes, so it might be worth having a different column – perhaps target, or something that identifies the target value that affects util in some way. If a resource is at carrying capacity, for example, we might want some mechanism in the model by which stake-holders no longer want it to increase.
Given Saro’s notes, it might also be worth just allowing an option for a traditional \(2 \times 2\) payoff matrix too. This would allow another use of a genetic algorithm – to simulate the bargaining process as in Tu et al. (2000).
Minor error correction
At 12:50, I have corrected a typo in the function
ind_to_land()
, which was making it impossible to plot
non-square landscapes correctly.
Late night (or early morning) idea
One way to solve the first major concern at the end of yesterday’s notes could be to do the following:
Create a new list strategy
in which each element of the
list corresponds to an individual agent,
strategy[[agent_i]]
. The element itself would be a data
frame that is eight columns wide and 1000 (ish) rows down. The first six
of eight columns would identify a particular kind of individual
– an agent or a resource. The columns would indicate a type as
follows:
In the above, any negative values would indicate all individuals (i.e., disregard ID if column 1 is -2). The remaining columns would define:
It could be a messy optimisation procedure, but this would ensure that agents could pinpoint which individuals to target, what to target, and by how much. The cost of doing so could be factored in perhaps by either: only enacting rows until actions are below the cost or constraining all actions to have an effect that is lower than the cost (e.g., by normalisation). There would need to be some error checks in it, but this could be the most flexible way to handle the search algorithm. The new structure would be either a list of data frames or (perhaps interpreted in c) a 3D array that is \(1000 \times 8 \times agent_{number}\).
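The proposed 3D structure could be allocated in c along the lines below. This is only a sketch: the dimension sizes (e.g., \(1000 \times 8\)) come from the rough numbers above, and the function names are placeholders.

```c
#include <stdlib.h>

/* Allocate the proposed strategy 'genome' as a 3D int array of
 * dimensions agents x rows x cols (e.g., n x 1000 x 8), the c
 * counterpart of a list of data frames in R.                        */
int ***make_strategy(int agents, int rows, int cols){
    int ***strategy = malloc(agents * sizeof(int **));
    for(int a = 0; a < agents; a++){
        strategy[a] = malloc(rows * sizeof(int *));
        for(int r = 0; r < rows; r++){
            strategy[a][r] = calloc(cols, sizeof(int)); /* init to 0 */
        }
    }
    return strategy;
}

/* Free the structure in the reverse order of allocation */
void free_strategy(int ***strategy, int agents, int rows){
    for(int a = 0; a < agents; a++){
        for(int r = 0; r < rows; r++){
            free(strategy[a][r]);
        }
        free(strategy[a]);
    }
    free(strategy);
}
```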
Another look at the search algorithm
The above proposal seems reasonable, though I now need to think a bit
more about the implementation. If doing something is the
consequence of an if
statement in the code, then
inapplicable values (e.g., non-existent types or locations) simply don’t
add to the actions (or cost) – they would just be junk. Alternatively,
they could add to the cost, and therefore be selected out in favour of
better actions. I kind of like the latter more, for now, because I
suspect it would cause convergence to happen more quickly and make the
‘genome’ more readable.
I also think that the entire strategy array should probably be an int, with random sampling during a mutation – note that this is a change from my earlier notes (anything before yesterday), in which I was planning to use double values within the AGENTS array. The previous plan would have just been a mess to implement, and this way we have a separate structure that acts as a ‘genome’ for strategies (I’ve not decided on how the utilities of strategies will be held yet – probably a separate UTILITY array). The first seven columns will always be integers anyway, and it will be faster and easier to understand if mutation just causes an integer change – just sample with new_mutation = floor( runif(0, 1) * maxcol ). Or, because we don’t have to do this biologically, maybe we come up with something like the pseudocode below:
mutate = runif(0, 1);
if(mutate < 0.05){
    effect = floor( runif(0, 1) * maxcol );
    value -= effect;
}
if(mutate > 0.95){
    effect = floor( runif(0, 1) * maxcol );
    value += effect;
}
Note that the above only draws a second random number when a mutation actually occurs, and it is fairly aggressive in searching. My other thought was to just have mutation cause either value-- or value++. My fear is that this could result in local maxima issues because ‘jumping over’ a type would be impossible. This could be fixed by something like the below:
mutate = runif(0, 1);
if(mutate < 0.05){
    effect = floor( mutate * 100 );
    value -= effect;
}
if(mutate > 0.95){
    effect = floor( (1 - mutate) * 100 );
    value += effect;
}
This avoids calling runif more than necessary, and avoids local maxima by letting mutations jump over types. It should also result in a mutation rate of 0.1, with equal probability of incrementing or decrementing by up to 4 (a draw giving effect = 0 leaves the value unchanged). But I’m still not terribly excited about the idea of making it add or subtract from value above, as types are not ordinal.
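For reference, a runnable c version of the jump-mutation idea above might look like the following. This is a sketch: rand() stands in for a proper random number generator, and the rates and scaling by 100 follow the pseudocode.

```c
#include <stdlib.h>

/* One uniform draw decides both whether a mutation occurs (rate 0.1)
 * and how far the value jumps (0 to 4 in either direction), so no
 * second random number is needed.                                   */
int mutate_value(int value){
    double mutate = (double) rand() / ((double) RAND_MAX + 1.0);

    if(mutate < 0.05){
        value -= (int) (mutate * 100.0);         /* Jump down 0 to 4 */
    }
    if(mutate > 0.95){
        value += (int) ((1.0 - mutate) * 100.0); /* Jump up 0 to 4   */
    }
    return value;
}
```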
Here’s another idea: maybe start with the aforementioned utility list/array. This array could be a list of arrays UTILITY[[agent]][row][col] in R, and a UTILITY[agent][row][col] 3D array in c, in which each agent is a list element or dimension. Rows could exhaust all possible types, with their utilities in the final column, such that the element corresponding to one agent, e.g., could be:
type0 | type1 | type2 | type3 | utility | cost |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 1000 |
0 | 1 | 0 | 0 | 0 | 1000 |
0 | 2 | 0 | 0 | 0 | 1000 |
1 | 1 | 0 | 0 | 2 | 8 |
1 | 2 | 0 | 0 | -1 | 12 |
In the above, there are three types of agents (type0 = 0), which includes the manager (type1 = 0) and two stake-holders (type1 = 1 and type1 = 2). There are also two different types of resources (type0 = 1), which include type1 = 1 and type1 = 2. Types 2 and 3 are unused for both agents and resources. Each type has a utility – how much the particular agent values the identified agent or resource (though I’m not sure how this will be interpreted for agents, particularly when the identity becomes self-referential). The cost identifies how much expenditure is required to affect the agent or resource – note that this already creates the problem that different attributes of resources and agents should cost different amounts to affect. At the very least, we don’t want the cost of culling versus scaring to have to be the same. At the same time, we don’t want the output of this software to be so messy that end users won’t be able to interpret what is going on – the possibilities should correspond to clear management options, I think (though we also don’t want to constrain the model to force it to do what we presume is best; we want it to find novel solutions, where possible).
Ideally, it would be nice if both managers and stake-holders could potentially affect each other’s costs, but this creates a kind of infinite regress problem – the need for meta-costs for how much it costs to affect another agent’s cost; this is probably too much. Instead, maybe costs are a function of the manager’s utility, and all lobbying occurs on utility. This avoids the ‘cost of costs’ problem – the only thing we lose is that one stake-holder might not be able to directly affect how easy it is for another to hunt or scare, or do anything else – though they might still affect each other’s utilities? Even this seems to get a bit too complex.
Maybe there’s a starting point that gets the model working but also leaves room for further development. What if the UTILITY array looked more like this:
type1 | type2 | type3 | utility | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
Now in the above, only resources are actually considered in the
UTILITY
array. There are five basic things that an agent
can do to a resource – two ways to benefit it and three ways to
have a negative affect on it. Agents can move resources
(movem
), castrate resources (castem
), or kill
resources (killem
). And agents can feed resources
(feedem
) or help resources (helpem
). Doing
each of these things comes with an associated cost. It’s important not
to take these categories too literally, but for now they could loosely
correspond to: a resource’s xloc
and yloc
(movem
), and resource attributes affecting reproduction and survival
(castem
, killem
, feedem
, and helpem
). I’m not sure how this last one will work yet. This sacrifices some of
the generality of the code, but in the context of what G-MSE is for, I
don’t think that we lose much, and it’s easy to see how we could add
columns to UTILITY
later as necessary. Then, following the
diagram from yesterday, when managers use the
genetic algorithm, they could affect their own cost columns and
everyone else’s as a function of their own utilities and the
population sizes of each resource. The costs would then be updated for
the stake-holders, who could adjust their own parameters accordingly to
maximise their utility.
Note that the index of each array in the list UTILITY
will correspond to the ID in the AGENT
array. This could be
useful, if we want, for example, to eventually let stake-holders affect
one another. The first agent is also always the manager (and it’s hard
to see a situation where we have more than one, but even if so, we can
always have one head manager), so stake-holders can lobby the
manager by indexing the zero index of UTILITY
with
ease.
Focusing first on the stake-holder genetic algorithm
The genetic algorithm can be fairly straightforward, and (I think), fairly efficient given the data structure above. All of the five aforementioned columns can be mutated by any integer value, and unlike the case in which types were random, the numbers are ordinal so that the following code isn’t too much of a problem:
mutate = runif(0, 1); /* uniform draw on [0, 1]; rbinom(0, 1) would always return 0 */
if(mutate < 0.05){
value--;
}
if(mutate > 0.95){
value++;
}
So the code would do the following in c for a single agent:
malloc
a 3D array that is 100 deep, and copy the
agent’s entire UTILITY
data frame each time. Given the above, we need to update the internal structure of the genetic algorithm to the below:
The result is simpler, and therefore it should be faster and easier
to implement. I’m hoping that it will be possible to code this to be as
flexible as possible – enough to really allow for some complex
interactions with agents affecting agents in different ways, but I think
this will have to be addressed when I actually start writing the code.
For the moment, I think the above is a good balance for stake-holder
genetic algorithms. I also think that spatial effects will
better emerge organically through restricting stake-holders to affecting
only their own cell (or the view around their cell). If we let the
genetic algorithm try to evolve to find the locations
where an agent should do something, I think it would slow down
considerably. We can always use view = 100
, or turn off
spatial implementation, to have stake-holders affect across the whole
region, and it’s hard to see why we would want stake-holders to
arbitrarily pick out parts of the map to care about (if it’s caused by
the presence of another resource, then the algorithm should find the
right actions based on the resource’s utility).
Looking specifically at the fitness function
Let’s assume that costs (‘policy’ in the updated figure) are fixed for now, and the genetic algorithm takes these as a given. Then, instead of accounting for every other agent’s actions (like the manager might have to do – figure this one out later), each stake-holder could just check to see how their actions affect the abundance or local density of each resource. In fact, why don’t we add another column:
type1 | type2 | type3 | util | u_loc | cost_m | movem | cost_c | castem | cost_k | killem | cost_f | feedem | cost_h | helpem |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 2 | 0 | 8 | 5 | 30 | 0 | 20 | 0 | 10 | 0 | 10 | 0 |
2 | 0 | 0 | -1 | 1 | 0 | 0 | 50 | 0 | 0 | 1 | 1 | 2 | 2 | 1 |
For space, utility
has been shortened to
util
, but I’ve also added a column u_loc
,
which codes for whether or not the utility of the resource depends on
that resource being ‘local’, in whatever sense this could be relevant.
For example, a zero could be a simple FALSE
, while a 1
could be TRUE
in the sense that resources do not affect
utility or therefore fitness unless they are on the same cell as the
agent, or within view
. Better, values within this column
might correspond to different definitions of ‘local’ – it could mean
utility is important when the resource is on any cell owned by the same
type of agent, or on a cell with a particular landscape property.
Now, to implement this, we can’t really just use the
RESOURCE
array, because neither managers nor stake-holders
should have access to it. We have to use either the observation array
(or summary statistics from it) or the agent array. When stake-holders
have an estimate of how many resources there are, their utility is
affected and they can act in a certain way. Note that I’ve placed some
odd utility values in the rows above, but maybe they should actually be
much different, reflecting the ideal number of resources –
perhaps more utility values are needed too:
I wonder if a second utility parameter is needed so that utility does not need to increase or decrease linearly with resource abundance. Perhaps up to 3-5
util
and u_loc
values are needed – unneeded ones can be ignored later.
The above could see stake-holders switch strategies when satisfied.
It might also be worth having some sort of a dummy resource, or a
reluctance to spend cost
if not necessary – stake-holders
have other things to do, and if everything is working fine, then they
don’t need to do anything in the model, including use up costs,
and perhaps there should be some pressure against it anyway.
In any case, I could see two types of information about resources being most relevant for stake-holders’ decisions, thereby affecting the fitness function:
anecdotal
). I can’t, offhand, think of why anything else would be absolutely
necessary – not as a starting point, at least. Stake-holders could use
one or both estimates to make their decisions, so those values and
UTILITY
could be read into the fitness function. The
fitness function would then estimate how resource abundances would
change as a consequence of the stake-holder’s action, with higher
fitness being awarded to actions that match utility values.
To make things even more complicated, it might be necessary to record
the UTILITY
actions over time – eventually, stake-holders
should be able to correlate them with changes in utility (both as a
consequence of their own actions and other stake-holder responses) and
use this in the genetic algorithm. For now, I think establishing a
record-keeping method is enough. I’ll worry about how to
incorporate the game history into decision making after I have a simpler
working model.
Additional thoughts
While this is going on, I’ll want to keep in mind the four categories of Tu et al. (2000), which I mentioned on 30 JAN 2017. These define conflict by different utility functions. This might be especially relevant because next I’ll need to figure out how the manager is going to strategise given both stake-holders and resources.
For next week: Consider writing a prototype for the genetic algorithm in R. Make sure that it works and troubleshoot any unforeseen issues before trying to write the whole thing in c.
I have now started using GitHub projects to better organise G-MSE development. This appears to improve the workflow a bit, which is good because the workflow is probably going to get more complicated once I start coding the genetic algorithm and integrating it into G-MSE. Below is an updated overview of how I expect G-MSE to work in light of the genetic algorithm approach.
Note that the Genetic algorithm (in green above) is being used twice,
once by managers and once by users (stake-holders). What is happening
here is that managers are taking in the observation model and updating
their management plan by tapping into the genetic algorithm. Likewise,
after the managers do this, the stake-holders respond by also tapping
into the same genetic algorithm to update their off-take. The
flow of the model here needs to be planned carefully. I think
the best places to start after the observation model is by reading in
the observation data into the manager model. Managers can then use the
observation data for analysis in manager.R
, which will call
manager.c
(I don’t think that this will need to be a very
intense program, but using c for the analysis will at least allow us to
build upon the code in a complex way, if need be). The output from the
manager model should then be the kinds of summary statistics that are
relevant for policy-making. This could be as simple as an estimate of
population size or density, or perhaps the abundance of different age
classes. To keep things flexible, I think that a new object is needed –
a new list or array – as output that is relevant for policy making.
Perhaps if the output is just a scalar, then this can be interpreted as
an abundance or density estimate, while if it is a vector, then it can
represent ages or types?
Perhaps what we really want to do is tidy up the observational data?
The manager.c
function could serve as a special type of
apply
or tapply
R function, which takes in
different column arguments and then calculates summary statistics for
the specified column. So, if no column is specified, then
manager.c
just estimates the population size or density of
the entire population based on the observed data provided (using the
mark-recapture or density procedures currently implemented by the
chapman_est()
and dens_est()
R functions
written in gmse.R
). If, instead, we specify a type
column, then the estimate is done for each value of that type. If a
type and an age column, for example, are specified, then estimates for
all unique combinations of type and age are performed. The output is
then an array with fewer rows – each row corresponds to an averaged out
resource array where some columns don’t mean anything (e.g., ID). This
will require a lot of error checking, as bad user input could cause the
manager to do odd things – in fact, I think only column types and age (4
columns total) should be allowed to be uniquely estimated; even that is
a bit much. The zero column of the array could then stand for an
estimate instead of an ID in the output. For the simplest of cases where
only one resource is being estimated, therefore, the relevant array
manage
would have a zero index
manage[0][0]
.
There is a bit of fuzziness here that should perhaps be better
cleared up with planning. Currently the observation.R
function requires a type and category type to be specified. In other
words, the observation function has been constructed (deliberately) to
look at one type at a time. This is probably good – no need to change
it, but for a simulation where multiple resources are being managed,
observation()
will need to be run multiple times. I think
that the best thing to do in this case is to store observation data
frames in a list, such that OBSERVATION[[1]]
stores type 1
and OBSERVATION[[2]]
type 2. This will keep the high-level
resource types separate; within the list elements, other types
(e.g. sex) and age can be managed, and we can pass each list element to
the manager model separately to produce some output. The output can also
be stored as a list MANAGED[[1]]
, with all list elements
subsequently being merged into one data frame – ideally in c,
but possibly before in R – to go into the genetic algorithm. Again, a
lot of the time all of this will just end up as one vector, with really
just one number being of interest, but we want the model to stay
flexible so that we can deal with eventual demands. The end result will
be all of the information the manager is actually going to use
to make decisions – which may then be separated by resource type(s) and
ages.
G-MSE will run an independent observation model for each type of resource of interest. The output of observations will be represented by a list of arrays in R. Each array in the list will then independently be run by the manager model, each run of which will return an array of summary statistics that will be added to a new list. The list (or an array of concatenated elements – merged data frames) will then be read into the genetic algorithm. Additionally, the manager model should also affect the landscape list/array – this will give the option of using resource distribution in making decisions; a simple increment for each time a resource is observed on an x y location should do.
Difficulties remain with the genetic algorithm
This brings us to the genetic algorithm itself. Once the ‘’managed’’ data has been finalised by the manager model (or after the manager model has been run for each resource), then the genetic algorithm will take the managed data, the agent array, the landscape, and the parameter vector and output a new agent array in which elements of the manager’s row have potentially been changed. This models the process of the manager potentially receiving summary statistics, information about stake-holders, distributions of resources and stake-holders on the landscape (along with other landscape-level properties), and other globally relevant parameters and potentially adjusting their policy and even interests accordingly. Actions, interests, and costs are encoded in the agent’s rows:
I’m not sure if we need managers to be able to have their
own actions given the point about zero costs, but I think leaving this
option open is easy, even if it’s rarely used. Having three blocks of
column types (actions, utilities, and costs) might also allow
stake-holders to affect each others costs, potentially, so it’s worth
planning this way. I think the best way to do this is to probably have
the dimensions of AGENTS
be adjustable, based on how many
columns are needed. Four columns would actually be needed to
optimise.
All of these values can be optimised as a consequence of data or other agents (for example, a high population size might cause the manager to allow more resources to be hunted or scared – but more or angrier stake-holders might also causes this to happen). The interests of managers, obviously, should change before the actions, as how they act will depend on what they are interested in.
The actions of managers, as changed through the genetic algorithm, will be directly interpreted by users as policy. Hence the relevant row(s) of the agent array will feed into the user model. These rows will affect the costs of stake-holder actions (recall that each stake-holder has a total budget). The stake-holders react to these policies and simultaneously adjust their own actions (and potentially utility).
A major challenge here is the sheer number of things that an
agent could potentially do, and getting all of those options into the
AGENTS
array. Things that a set of columns is going to have
to specify for an agent include:
It seems as though there should be an action to directly affect the
landscape in some way. E.g., fencing – though I suppose a fence could
just be a type of resource in the RESOURCE
array. The
problem is that if we allow users to directly affect resources
or the landscape, then there has to be some sort of switch in
the code to allow this. If, however, things like
fences or crops are resources, and therefore in the
resource array, then agent actions could be restricted to subtracting or
adding resources by adjusting the resource array. Then again, if a
particular resource (e.g., a fence) is not in the resource
array, then there is nothing to adjust. Of course, code-wise, is there
really a difference between a fence and scaring? They both adjust the x
y location of a resource. Maybe we really don’t need much to do with the
landscape – just work with a displacement cost, leaving
how resources are displaced to be abstract and interpreted by
the end user of the software.
Just working with the idea that displacement is all the same (maybe leave some hooks in the code for adjusting the landscape), what we really need to know then is the following:
Complicating things even more, costs might be different when adding or subtracting values from resources – e.g., it might be not so costly to greatly decrease the birth rate of an organism, but increasing it by the same amount should be near impossible. I’m not yet sure the best way to structure the arrays, or anything else, to handle all of these complications, so this will be a major project in the near future (added to GitHub projects).
Once I resolve the issue of how to structure the data so that the
appropriate values on AGENTS
, RESOURCES
, and
possibly even LAND
arrays can be tweaked through the
genetic algorithm, the internal structure will look something like the
below.
Essentially, the relevant row from AGENTS
will be
brought into the genetic algorithm; ten copies of the row will be added
to 90 copies with random numbers. The fitness of all 100 copies will be
checked in the fitness function – the fitness function will adjust the
resource and agent values according to the copy being assessed, and then
some function needs to be called to predict what will happen to resources
and (possibly) agents and the landscape. Originally, I thought that this
might be accomplished using the resource function itself, but I don’t
think this is best anymore – mainly because it misses an error step
(i.e., agents shouldn’t be able to perfectly predict effects on
resources). Perhaps it should just loop back through the observation
data? Or the resources within view? It could then consider the effect of
resources being removed on the agent’s own utility. It could be something
as simple as: directly scare or remove resources from a location –
does this decrease undesired resources at the location? Then lobby the
manager: does the change in manager policy affect undesired resources
at the location? In other words, should we just have agents look at
the direct and immediate effects of what they’re doing on a particular
location? In the case of the manager, the location could be the entire
landscape, perhaps incorporating birth and death rates into the fitness
function?
For tomorrow, it might be worth just working through some of the things that will definitely need to be coded. Alternatively, there are two definite challenges that remain for using the genetic algorithm:
More thinking about agent fitness functions
While I have a general idea of how to implement the genetic algorithm
now, how agents make decisions and act on them is still not clear from a
modelling perspective, so more critical thinking needs to be done here
before any coding. Unlike the resource
and
observation
models, I also think it might be better, given
the complexity, to write a prototype of the code in R to show proof of
concept before optimising the code in c. One thing that I think every
agent needs will be some sort of total budget (note, this
budget is not necessarily currency – at the moment, I’m thinking about
it more like a time budget; it’s also possible we’ll need two budgets,
giving the option of one used explicitly for time and the other for
currency, but I’m keeping it simple for now). This will give us the
option of constraining agents’ behaviours if desired so that agents
cannot take unrealistic actions to increase their utility, and instead
might have to consider trade-offs between different actions. For
example, a farmer might be able to either tend crops, scare or kill
organisms, build fences, or lobby the manager to increase utility, even
though the best thing to do would be all four. A utility
function would then determine how a combination of actions maps to
utility, and a genetic algorithm could find the optimal behaviour to get
the highest utility. We might consider different stake-holders, or
different types of stake-holders, to have different total budgets from
which to make decisions – these budgets could also be affected by, and
affect, RESOURCES
.
In the software, what this might look like is each AGENT
having the opportunity to modify the following:
Note that this way of conceptualising the implementation of actions is broad enough to include managers (who might lobby stake-holders, or intimidate them through laws to not do something). There might be other things to consider, but this suggests to me at least seven potential variables that an agent could affect, and agents will need to maximise their utility using a genetic algorithm that tweaks all of these parameter values – ideally it would also take into account past actions of other agents to predict utility.
Agents might also be spatially restricted in their ability to perform
any of these actions, thus making strategy dependent upon location
(e.g., a farmer might not be able to hunt in certain areas). Here the
option to define type2
agents could come in handy – one
agent might be represented by multiple rows of the AGENT
array with actions for each type2
, but each row having
a unique xloc
and yloc
, thereby representing
land owned. Managers could own all land, or just public land if
they cannot do anything on stake-holder land. Some agents might have
locations of -1 (or lower), meaning that they cannot do anything that
requires control of land.
Implementing this type of system could be challenging, and will
require that the landscape be a three-dimensional array (or list)
with the third dimension or [[layer]]
list element
representing a different layer of the landscape. I’ll make this
an ISSUE later.
The game implementation of G-MSE will require several additional
AGENT
columns corresponding to the bullets above, but also
type specifications. These columns would correspond to the
G1
to Gn
columns suggested earlier. More concretely, they will
look something like this:
IDs | type1 | type2 | … | see2 | see3 | … | budget | lobby_type_1 | lobby_col_1 | lobby_val_1 | … | farm_product |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | … | 0 | 0 | … | 1000 | 0 | 0 | 0 | … | 4.5 |
1 | 1 | 0 | … | 0 | 0 | … | 100 | 2 | 14 | 11.4 | … | 19.6 |
2 | 1 | 0 | … | 0 | 0 | … | 100 | 2 | 14 | 16.1 | … | 10.3 |
… | … | … | … | … | … | … | … | … | … | … | … | … |
N-1 | 2 | 0 | … | 0 | 0 | … | 75 | 0 | 12 | 3.2 | … | 12.3 |
N | 2 | 0 | … | 0 | 0 | … | 75 | 0 | 12 | 4.2 | … | 8.8 |
Hence in the above, AGENTS
has a budget column, and
columns for each type of actions that can be performed. For lobbying,
agents can select a type to lobby (lobby_type_1
), the
column to try to affect (lobby_col_1
), and a value to
affect it by (lobby_val_1
). This raises an issue that an
agent might want to lobby multiple types of agents, or even multiple
columns of multiple types of agents. It might be worth thinking
about if there is a better way to organise what parameters can be
affected and how. Of course, we can make global changes that
change the number of columns in AGENTS
, giving all of the
columns needed, but maybe there is a better way to do it. Between
see3
and budget
, columns will include utility
values on resource types and perhaps cell types on the landscape (these
are what lobby_val_1
affects – agents should also be able to
potentially affect each others lobby_val_1
, but probably
not lobby types or columns – I can’t see how realistic it would be to
convince a different stake-holder to do something with the same value to
a different type of agent or a different resource). It
might just have to be the other agents’ own cells (or all those of an
agent’s type, if we represent type as an individual), or the cells
within view
.
This setup could offer some interesting insights – potentially figuring out the conditions under which it benefits stake-holders to take different actions for themselves (increasing yield, hunting, scaring, etc.), or taking different types of actions (e.g., doing work for oneself, lobbying managers, harassing other stake-holders). Perhaps it’s possible that conflicts could lead to energy being invested in different types of actions depending upon different costs of those actions (is it easier to lobby the manager or shoot an organism?), or arms races could develop that don’t make a whole lot of sense until we understand the history of the conflict (easy to bother another stake-holder, which causes a retaliation, which escalates, etc., with not much action taken to manage). A key here will be to adequately parameterise how much investment each type of action requires so that we have an idea of the kinds of trade-offs that stake-holders experience. Again, in the absence of these trade-offs, it seems like stake-holders should and would try to do everything – interacting with managers and other stake-holders, adding fences, maximising yield, hunting, etc. But I don’t think such an unlimited model would reflect how people actually budget their time and money (i.e., I don’t think that the assumption that agents have unlimited resources is realistic or useful, and that it would be both more realistic and more useful to allow for limited budgets).
To summarise briefly here – what we’re going to do
is have those columns in the AGENTS
array, then use a
genetic algorithm allowing each agent to tweak these values, which will
produce the effect of changing other values of the AGENTS
,
RESOURCES
, and LANDSCAPE
arrays – constrained
to a certain budget
– to affect the focal agent’s utility.
This requires a utility function for the focal agent to somehow predict
the consequences of these actions (perhaps by simulating a run of the
G-MSE to predict what happens in the next generation if only
their actions were to be applied). This could get computationally
intense, but I don’t see a speedier option just yet.
Implementation of agent strategies
More needs to be planned for the input and output of agent
strategies. That is, what variables should and should not be available
to managers and stake-holders when optimising strategies through the
genetic algorithm, and how should these variables be incorporated into a
strategy that causes agents to take one or more actions? Note, there are
plenty of resources for incorporating multiple objectives into genetic
algorithms (Fonseca and Fleming 1993; Horn et al.
1993; e.g., Fonseca and Fleming 1998; Jaszkiewicz 2002), so
agents can be complex in their utility functions. What I’m talking more
about is what do agents get to consider when optimising to maximise
their own utility functions? And what kinds of actions do
stake-holders engage in upon formulating a strategy? Once
the answers to these questions are clear, it will be possible to start the
process of coding manager
and user
functions. Some potential things to consider as variables
affecting manager and stake-holder strategies:
RESOURCE
array if resources are meant to be known
(e.g., if hunting licenses or crop yields are modelled as
resources). Some things to consider as potential actions (outputs) of
manager
and user
functions:
birth_rate = 0
), perhaps at
some cost that should be considered explicitly? There are probably some
high-level decisions to consider here, and it would be ideal to have
many possibilities to choose from. Neither of these lists is exhaustive, and the input and output options could get very complex. I think that this is okay as long as it doesn’t cause the program to be too inefficient, intractable, or unrealistic. We want the options available to managers and stake-holders to reflect those of real systems as much as possible, but it is also worth thinking about whether some options can be safely pruned out of the software, or at least tabled for a later time.
Note, it might be that for most stake-holders, the strategy is really obvious – always act in such a way as to maximise the resources that you’re interested in – no need to optimise much then because the action to take is clear. For managers, however, I can imagine that the decision will always be a bit more challenging, requiring trade-offs between the interests of different stake-holders in determining policy.
Also Note, there should be no need to tell managers
what kind of approach to take with respect to policy (though this should
be an option, of course). The genetic algorithm should be able to handle
this sort of thing – indeed, we might just see very different approaches
come out of this model organically as a consequence of different
resource abundances and distributions and stake-holder interactions. For
example, between time steps, we might see managers switch from
establishing a global hunting quota to prohibiting hunting and
constructing fences (protected areas of landscape) instead; all we need
to do is allow some sort of switch
to affect the manager’s
general approach, then incorporate this switch
variable
into the genetic algorithm.
Use of genetic algorithms in ecology and evolution
Hamblin (2013) has a nice methods paper on the use of genetic algorithms, focused especially on an ecology and evolution audience. He cites a highly relevant book by Sean Luke, which includes a general introduction to genetic algorithms, but also chapters on coevolution (competing strategies), multiobjective optimisation, and policy optimisation (Luke 2015). Luke (2015) is particularly cited for a quote on the utility of metaheuristics (which includes genetic algorithms), which I’ll just include here in full:
‘’Metaheuristics are applied to I know it when I see it problems. They’re algorithms used to find answers to problems when you have very little to help you: you don’t know what the optimal solution looks like, you don’t know how to go about finding it in a principled way, you have very little heuristic information to go on, and brute-force search is out of the question because the space is too large. But if you’re given a candidate solution to your problem, you can test it and assess how good it is. That is, you know a good one when you see it.’’
I think this probably applies well to G-MSE. Hamblin (2013) notes that ‘’fitness evaluation’’ is the largest performance bottleneck, so it is probably not worth investing too much energy on optimising the specifics of structure types, or crossover, mutation, and reproduction algorithms; instead, more attention might be paid to making speedy assessments of the fitness (payoffs) of agent strategies. It’s also possible to control recombination (I’m going to call it that sometimes – ‘’crossover’’ strikes me as a bit of a confused term from the computer science literature) and mutation frequency through a parameter, so they could effectively be turned off if the parameter were set to zero. Hamblin (2013) notes that mutation type (e.g., random per locus or chromosome) is not terribly important (but it’s worth pointing out that the mutation rates from the literature search in Table 3 are generally much lower than Luo et al. (2014) mentioned – 0.1 still seems reasonable to me), but recombination parameters can be important – one point crossover (i.e., forcing cross-over to happen once for all individuals) can break up good linkage combinations – better to just use uniform (probabilistic) crossover. Table 2 of Hamblin (2013) shows that population sizes around 100-200 (though some are much lower, and nearly always less than or equal to 2000) are common, with run lengths commonly around 500 (1000 is also common); reals are about as common as binaries. The most popular selection algorithm is truncation, making up well over half of ecology and evolutionary biology applications of genetic algorithms (Table 1 of Hamblin 2013). To my surprise, truncation selection is not the consensus recommendation for genetic algorithms (and proportional methods are quite bad when multiple strategies are near an optimum, resulting in premature convergence). The recommended selection method according to Correia (2010) is actually tournament selection.
The algorithm is described by the quote below:
‘’It randomly picks k individuals from the population and copies the fittest of them to the mating pool. All the k individuals go back to the population. The process is repeated until the mating pool has the desired size.’’
So tournament selection is not probabilistic – in that sense, it is like truncation selection, but there is an extra sampling step that is iterated until the new generation is formed. If k is the same size as the mating pool, then this is effectively truncation selection, so really tournament selection is a generalisation of this that will be useful to code. Hamblin (2013) also cites a book chapter by Syswerda (I’m still waiting on the full text, but the link has all of it) that shows that overlapping generations (termed ‘’steady state’’ in the computer science literature) perform better than non-overlapping (termed ‘’generational’’) algorithms. This can be easy to implement – allow selected agents to be placed in a new array, but apply mutation and crossover to only half of it. This will fit especially well with G-MSE given that agent strategies might not be expected to change much from one time step (of the model, not the genetic algorithm) to the next. Hence, the optimal solution from the previous time step will be included in the next time step, and if nothing changes, then convergence will occur as soon as possible.
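The tournament mechanism quoted above could be sketched roughly as follows in c; the function names, the use of rand(), and sampling with replacement are my own illustrative assumptions, not GMSE code:

```c
#include <stdlib.h>

/* Rough sketch (not GMSE code) of tournament selection as quoted above:
 * pick k individuals at random, copy the fittest into the mating pool,
 * return all k to the population; repeat until the pool is full.       */
int tournament_pick(const double *fitness, int pop_size, int k){
    int best = rand() % pop_size;          /* first random entrant    */
    for(int i = 1; i < k; i++){
        int cand = rand() % pop_size;      /* further random entrants */
        if(fitness[cand] > fitness[best]){
            best = cand;                   /* keep the fittest so far */
        }
    }
    return best; /* index of the tournament winner */
}

void fill_mating_pool(const double *fitness, int pop_size, int k,
                      int *pool, int pool_size){
    for(int i = 0; i < pool_size; i++){
        pool[i] = tournament_pick(fitness, pop_size, k);
    }
}
```

Note that if the k entrants were drawn without replacement and k matched the population size, the winner would always be the single fittest individual, which is the sense in which tournament selection generalises truncation.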
It will obviously be important to run diagnostic tests on the G-MSE genetic algorithm. Hamblin (2013) recommends,
‘’Plots of mean population fitness (and its variance) and the fitness of the best individual over time can be important for both diagnostic and reporting purposes; populations that reach a single solution (close to zero variance) within a few generations are a clear sign of premature convergence, likely stemming from a problem in the balance of exploration and exploitation (selection too strong, too little mutation/crossover, population size too small, etc).’’
Testing shouldn’t be too difficult – the results of genetic algorithms can be printed from c to a file, then read in by R and presented in a figure. Hamblin (2013) suggests that genetic algorithms are robust, so it’s unlikely that parameter value choices will cause major problems or affect things greatly, but it’s worth doing all of the quality checks.
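For instance, a small helper along these lines (my own sketch with assumed names, not part of GMSE) could log the per-generation statistics that Hamblin (2013) recommends plotting, for later reading in R with read.table:

```c
#include <stdio.h>

/* Sketch (assumed names, not GMSE code): per-generation fitness statistics
 * for diagnosing premature convergence; log_fitness appends one line per
 * generation to a plain-text file that R can read with read.table().     */
double mean_fitness(const double *fitness, int n){
    double sum = 0.0;
    for(int i = 0; i < n; i++){
        sum += fitness[i];
    }
    return sum / n;
}

double var_fitness(const double *fitness, int n){
    double mean = mean_fitness(fitness, n);
    double ss   = 0.0;
    for(int i = 0; i < n; i++){
        ss += (fitness[i] - mean) * (fitness[i] - mean);
    }
    return ss / n; /* variance near zero within a few generations is the
                      warning sign of premature convergence noted above  */
}

void log_fitness(FILE *out, int generation, const double *fitness, int n){
    fprintf(out, "%d %f %f\n", generation, mean_fitness(fitness, n),
            var_fitness(fitness, n));
}
```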
More review of utility functions in genetic algorithms
I’m turning now to the use of utility functions, particularly the use of them in genetic algorithms and games. It appears that these can be found in economics and business. For example, Luo et al. (2014) address an optimisation problem for product demand using a utility function to be maximised and a genetic algorithm. Luo et al. (2014) use fuzzy numbers to model market segments, which include three numbers representing most pessimistic, most likely, and most optimistic values. The authors use conjoint analysis, apparently a technique to figure out what people will pay for, combined with a ‘part-worth utility model’. Utility is modelled as a USD amount, and as a linear function (summation) of the product of weights, part-worth utilities, and a binary variable linking product profiles and product attributes – summations over levels and product attributes. I’m not too worried about the details here, just that total utility is measured in currency in this case, and is calculated as weighted sub-utilities – this kind of logic is relevant for G-MSE.
Luo et al. (2014) then go on to model how utility determines a consumer’s choice of product (essentially, consumers pick the product of highest utility, or none at all). Several constraints on product choice and product attribute-profiles are noted in the model, but the genetic algorithm is implemented using int coding – one chromosome has consumer choice and product configuration sections. Genes within the consumer choice section each represent consumers in a particular market segment – values of these genes correspond to choice of different product profiles (if the value is zero, no product is chosen). The product configuration section contains subsections related to product profiles; each subsection has genes whose integer values indicate the level selected for a product attribute. A population’s gene values are initialised randomly, and individual fitness is calculated using the linear models introduced prior to the genetic algorithm. The authors use a uniform crossover procedure, which might be a useful type of algorithm – apparently searching a lot of strategy space, though the costs and benefits of different crossover methods are still unclear to me. Parameters for the genetic algorithm seemed unusual, to me at least; Luo et al. (2014) set a population size of 30, a crossover probability of 0.7, and a mutation probability of 0.4. I would have considered the population size much too low, and the mutation probability much too high, but it’s worth keeping in mind that perhaps these parameter combinations are useful in genetic algorithms even if they appear odd biologically – it’s worth experimenting with them, at least (their Figure 2 suggests that my presumed ideal parameters might be on the low side for crossover and mutation). Surprisingly (at least, to me), Luo et al. (2014) concluded that their algorithm had the best performance (in terms of profit maximisation) when ‘’crossover probability was 0.7 and mutation probability was 0.7’’. The authors used MATLAB to implement the genetic algorithm, so they might have been constrained by computational efficiency – it took 85 seconds for 50 generations on a Pentium IV processor; had the analysis been run in c, it surely would have been faster.
Luo et al. (2014) do note that ‘’low mutation probability (e.g. 0.1) is a good choice’’ for genetic algorithms (as a biologist, of course, this seems very, very high!), but their problem was an exception because the space that needed to be explored was very large. The general take-home I get from this is that the relatively low mutation and recombination rates that we observe as biologists are probably not appropriate for a good genetic algorithm; higher ones should be used by default – of course, this will require citation to reassure reviewers that this is standard practice.
Tu et al. (2000) look at genetic algorithms for negotiations among agents using utility functions, which is exactly the kind of thing that we’re interested in for G-MSE. In addition to being a useful resource for showing an overlap between utility functions and genetic algorithms, this conference proceedings is very interesting in that it has interacting agents, and considers negotiation as ‘’a search for an optimal negotiation outcome with respect to the utility functions of each partner’’ (Tu et al. 2000). I’m not sure if we’ve proposed it this way before, but given that I’ve been conceptualising the manager in G-MSE as a special kind of agent (and, in that sense, similar to stake-holders, but following its utility to make rules rather than work within rules to maximise utility), it would be very interesting if we could use a genetic algorithm and the manager agent’s utility function to optimise negotiation outcomes in addition to management outcomes – or perhaps, define ideal management outcomes as the optimal negotiation outcomes that maximise the interests of stake-holders. We could then use the manager genetic algorithm as a tool in real-world case studies where real or simulated stake-holders play the role of agents.
The use of automated negotiation strategies in online commerce appears to follow a protocol using simple sequential rules and threshold utility values. Tu et al. (2000) created a generic framework for a genetic algorithm, implemented using Java. The three functions needed included mutation, crossover, and reproduction. The algorithm for selection seems a bit unclear. It appears that parent individuals (i.e., reproduction) are chosen based on probability, while selection of offspring simply draws the highest-fitness offspring to become the next generation of parents? (’’The parent individuals are chosen with a probability proportional to their fitness and the operators are chosen randomly. From the new population of size \(\lambda\), the \(\mu\) individuals with the highest fitness are propagated into the next generation as parents’’). This isn’t entirely clear.
The method by which agents reach a consensus is really interesting as a way that an agreement – e.g., a policy – is reached. It occurs to me that there might need to be some utility in inaction as well – rather, some cost associated with doing something as a consequence of low utility, though I’m not yet entirely sure how this would be implemented practically. Stake-holders have other interests, of course. The authors consider four types of scenarios on which negotiations take place:
Tu et al. (2000) tweaked crossover and mutation probabilities to get best results (unfortunately, the exact values they used aren’t reported anywhere in the proceedings, that I can find).
Sunday musings
As a bit of an aside, I’m thinking about how biological degeneracy might fit in to the efficacy of management policies, given that multiple independent agents might affect a biological system in different ways. I think that degeneracy is interesting and probably greatly under-considered across all biological scales, but it appears entirely absent as a theoretical or practical consideration in conservation and the maintenance of ecosystem function. Man et al. (2016) very recently developed the theory to quantify degeneracy, doing so while simulating networks of complex neuronal systems characterised by non-linearity – specifically comparing degeneracy to redundancy and complexity, which were also defined mathematically. I think there’s a lot of room for theoretical development on degeneracy, and a lot of scope for the application of degeneracy theory to big questions in evolutionary ecology, community ecology, and conservation biology. The modelling in G-MSE is general enough to be potentially able to address these kinds of questions, perhaps using the mathematical definitions introduced by Man et al. (2016) for analysis of simulation results.
I’ve been doing a bit more literature review on the subject of genetic algorithms, particularly as applied to economic and social-ecological questions (e.g., Balmann and Happe 2000; Ascough et al. 2008). Given the need to keep things computationally efficient while also repeatedly updating agent strategies, I think it’s worth defining AGENTS as an integer array (I’m not sure why RESOURCES can’t also be one, actually, so it might be worth checking on this) instead of a double. Supporting this:
- There is nothing in the AGENT array that needs to be a non-integer. The closest thing is a parameter affecting movement, but this can be made into an int, I should think. It might also help if the parameter affecting error was continuous, though I’m not yet convinced it must be – error could just be the probability of error from zero to 100, interpreted as 0 to 1.0 by increments of 0.01.
- There is nothing in RESOURCES that needs to be a non-integer either. The probabilities of removal (i.e., death) and growth (i.e., birth) are the closest, but I don’t know if there’s any good reason to have these be especially precise – i.e., why not just have an int value from zero to 100, corresponding to a 0.01 to 1.0 probability of mortality later? That way, the whole array could be int. I suspect the same can be done for the birth parameter, though the case is certainly less convincing than for the agent array.

NEW ISSUE 13: Switch agent array to type int
In light of the above reasoning, I think I’ll plan to switch AGENTS to an int type, then see how this affects things. Using integers to define ‘genotypes’ that affect agent strategies would permit the use of bitwise operators to increase speed at a very computationally intense part of the model (genetic algorithm mutation and selection). The size of an int must be at least 16 bits in c, so a signed int could correspond to \(2^{15} - 1 = 32,767\) unique values – plenty, I would think, for coding a sufficient number of strategies. I’ll want to do a bit more digging to see how much this could be expected to speed up the genetic algorithm (see here). Of course, if it’s trivial, then using double and columns affecting behaviour is probably just fine. But if speed is an issue, a vector of int values could really be better than several columns of double values; I’m just not sure what would have to be sacrificed yet. Quick random number sampling will be needed.
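As a purely illustrative sketch of what the bitwise idea might look like (my own, not anything GMSE currently does), an int genotype could pack several on/off strategy loci, with mutation as a single bit flip:

```c
/* Illustrative only (not GMSE code): an int genotype packing binary
 * strategy loci. Reading a locus is a shift-and-mask; mutating it is
 * an XOR bit flip, which is about as cheap as an operation can be.   */
int get_locus(int genotype, int locus){
    return (genotype >> locus) & 1;    /* 0 or 1 at position 'locus' */
}

int flip_locus(int genotype, int locus){
    return genotype ^ (1 << locus);    /* mutate by flipping one bit */
}
```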
Having second thoughts about binary encoding
I’m not entirely convinced yet, actually, that binary instead of real encoding is needed. One advantage of real encoding, besides that it fits a bit more easily into the current data structure I’m using, is that it might converge on optimal strategies sooner even if the bitwise calculations are faster (Salomon 1996). Note that phenotypes in bitwise encoding are affected by both the position and value of bytes, whereas phenotypes in real encoding are only affected by the value of real numbers (Kumar 2013). There are some techniques to map binary values to real numbers, though I’ve not yet found anything comparing the efficiency of binary versus real encoding, but Salomon (1996) argued that real encoding was the best choice for applying genetic algorithms to optimisation – I think this might be the way to go, though I’ll want to think about how crossing over and mutation will work efficiently. I’m not entirely sure I do want to finish issue 13. In the end, using int instead of double could cut the memory in half, but this would be almost useless for the AGENT array – if it could be done for the RESOURCES array, it might be more useful, but R doesn’t differentiate, so it really won’t matter that much, if at all.
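If real encoding wins out, uniform (probabilistic) crossover and per-locus mutation remain straightforward; here is a minimal sketch of my own, with assumed rates and a unif01() helper – not the GMSE implementation:

```c
#include <stdlib.h>

/* Sketch (assumptions, not GMSE code): real-encoded strategies with
 * uniform crossover and per-locus mutation. unif01() is an assumed
 * helper returning a uniform draw on [0, 1).                        */
static double unif01(void){
    return (double) rand() / ((double) RAND_MAX + 1.0);
}

void uniform_crossover(double *a, double *b, int n_loci, double pr_cross){
    for(int i = 0; i < n_loci; i++){
        if(unif01() < pr_cross){ /* swap this locus between the parents */
            double tmp = a[i];
            a[i] = b[i];
            b[i] = tmp;
        }
    }
}

void mutate(double *a, int n_loci, double pr_mut, double step){
    for(int i = 0; i < n_loci; i++){
        if(unif01() < pr_mut){ /* perturb the real value; no bit flipping */
            a[i] += step * (2.0 * unif01() - 1.0);
        }
    }
}
```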
REMOVING ISSUE 13: Convinced myself that this was a bad idea
Note that Balmann and Happe (2000) write that ‘’population size usually ranges between 10 to 50’’, though from a population-genetics perspective, this seems too small to me.
Fleshing out the use of Genetic Algorithms for G-MSE
I’m becoming more convinced that some sort of genetic algorithm is the best way to model the strategies of all agents, including managers and stake-holders. Here is a rough overview of how I see the next step of the software development process:
1. Insert columns into the AGENT data frame that represent utility values associated with each type of resource. This will effectively quantify how much of each resource managers and stake-holders want. For example, while managers might prefer a balance of resources (perhaps the average of stake-holders?), stake-holders might prefer to maximise only one resource with little or no concern for another (or to actually prefer some resource quantities to be minimised). The utility values of each agent will be used as variables in a utility function, which will calculate agents’ satisfaction (or happiness or contentedness) with a current situation of resource quantities (note: this utility function need not be linear – for some stake-holders, I’d expect it to be more log-linear, but it might be good to try different functions and ask real stake-holders what they think). Hence, a function calc_utility will be needed.
2. Insert another set of columns into AGENT that influences agent actions – how managers and stake-holders will do something in their environment. This can be thought of as analogous to genes affecting an organism’s phenotype in an evolutionary model, but will have different types of effects for agents: a manager function will therefore be needed for managers, and a user function will therefore be needed for stake-holders.
3. The second set of AGENT columns affecting manager and stake-holder actions will be updated before every decision using a genetic algorithm. This will require a separate opt_utility function. This general function will work as follows:
   1. Use calc_utility to calculate the utility of the agent of interest.
   2. Use the manager or user function for managers or stake-holders (perhaps need an R and c version of these functions – c for here, R for later), respectively, to temporarily simulate each offspring’s decision if used in one or more previous time steps (e.g., by using the current AGENT values).
   3. Use calc_util to find the utility associated with the simulated decision in 2 – this effectively tests each pseudo-agent to see if their action variables are good at maximising utility.

The above genetic algorithm can be used both for managers maximising utility through establishing game rules and for stake-holders maximising utility by affecting resources. The idea is to have the general opt_utility optimise what an agent does to maximise their utility through the use of a general genetic algorithm (perhaps simulating human planning, if it were as good as adaptation by natural selection, which I don’t think it is).
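The evaluate-simulate-reselect loop above can be caricatured with a toy stand-in (entirely my own; calc_utility here is a made-up quadratic preference and the hill-climbing acceptance rule is a placeholder, not the planned GMSE genetic algorithm):

```c
#include <stdlib.h>

/* Toy stand-in (not GMSE code) for the opt_utility idea: evaluate an
 * agent's current utility, perturb a candidate strategy, simulate its
 * payoff, and keep it only if utility improves.                       */
double calc_utility(double resources, double preferred){
    double diff = resources - preferred;
    return -(diff * diff); /* highest (zero) when resources match preference */
}

double opt_utility(double strategy, double preferred, int iterations){
    double best = calc_utility(strategy, preferred);
    for(int i = 0; i < iterations; i++){
        double candidate = strategy +
            0.1 * ((double) rand() / RAND_MAX - 0.5); /* mutate strategy   */
        double util = calc_utility(candidate, preferred); /* simulate payoff */
        if(util > best){ /* selection: keep strategies that raise utility  */
            best     = util;
            strategy = candidate;
        }
    }
    return strategy;
}
```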
This entire process will need to go into one c function for reasons of efficiency – we’re going to add some time onto these simulations, but I think it will be worth it provided we:
So, perhaps, the manager.r function will take in all of the necessary information and send it to c, and then c will go through the entire process of potentially automating manager interpretation of observation data and decisions about making game rules based on manager utility values. Other management options will of course be available.
Then, the user.r function will likewise take all of the necessary information and send it to c to go through the entire process of automating stake-holder interpretation of the manager’s rules and updated actions based on stake-holder utility values. Other user options will of course be available.
This removes the need for a specific game arena, games.R, because the game is defined by manager.r and effectively played by users in user.r. The novelty is that we’re using evolutionary game theory under the hood in both management and stake-holder actions to infer broader patterns about how cooperation and conflict might arise when all parties are acting according to their own interest.
I think this is getting on the right track, and I am starting to see how the code will look and run. We also might want to include a spatial component to all of this, affecting both manager and user actions. For example, perhaps some stake-holders can only have their utility functions affected by or act in resources within certain areas of the landscape.
NEW ISSUE 12: Observe multiple times for density estimator
Currently, estimating total population size using a sub-sample of observed area and assuming that the density of this sub-sample reflects global density (method = case 0) only works when one sub-sample is taken. There are multiple ways of fixing this so that the population size estimate takes into account multiple sub-samples. It would be a good idea to think about the most efficient way to do this and program it into R (perhaps with tapply to start, but eventually in the manager.c function, maybe).
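One simple way of pooling the sub-samples – a sketch with assumed names and an equal-area-per-sample assumption, not necessarily what manager.c will end up doing – is to average density across sub-samples before scaling up to the whole landscape:

```c
/* Sketch (assumptions, not GMSE code): pool multiple sub-samples by
 * averaging their densities, then scale by total landscape area under
 * the same assumption that sample density reflects global density.    */
double estimate_population(const int *counts, int n_samples,
                           double area_per_sample, double total_area){
    double mean_density = 0.0;
    for(int i = 0; i < n_samples; i++){
        mean_density += (double) counts[i] / area_per_sample;
    }
    mean_density /= n_samples;
    return mean_density * total_area;
}
```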
NEW ISSUE 11: Permanently move agents
Allow agents to move in each time step, permanently, in some way. This might be best done through the anecdotal function. As of now, they go back to their original place at the end of each time step, and it would be good to have an option to let them move all around the landscape.
Agent-based modelling in economics – potentially useful ideas
Phan (2003) briefly summarises the emerging (at least, at the time emerging) field of Agent-based Computational Economics, noting that agent-based models can complement mathematical theory in economics especially when equilibrium conditions cannot be easily computed or attained by agents. Relating agent-based models to cognitive economics, Phan (2003) notes that the latter ‘’is an attempt to take into account the incompleteness of information in the individual decision making process’’, which seems especially relevant to G-MSE. The program SWARM might be useful to explore – written in java though. Software like SWARM, MODULECO, and CORMAS appear to have a similar interface to the one G-MSE has (or will have), but I think that writing G-MSE from the ground up was definitely the right choice. This makes G-MSE more targeted to a specific social-ecological problem, allowing it to be written in a way that is computationally efficient, but can also be accessible through a browser by end users without proficiency in R (regarding efficiency, current simulation times for the model itself are: 100 time steps = 0.241 seconds, 1000 time steps = 3.179 seconds, and 10000 time steps = 27.740 seconds; I can’t imagine anyone would want simulations longer than 1000 time steps, but the efficiency allows many replicate simulations in a time frame that will not be an issue for serious research – especially if run in parallel. Things do slow a bit when more individuals are needed, but I’ve simulated 100 time steps with over 100000 individuals and found the simulation to take only 22.8 seconds. Memory might be an issue, but I’m currently storing entire resource and observation histories – an option to not do this would cut back massively).
Phan (2003) discusses how agents might optimise behaviour over the course of some number of iterations, which appears analogous to evolution of traits, except that it’s one individual essentially working through a trial-and-error process of finding the best behaviour to adopt to maximise some sort of utility function (in this case, profit). Beliefs are reported over time as numeric values that affect behaviour. Phan (2003) likewise considers the situation in which individuals buy or don’t buy something to maximise a surplus via a maximisation function that multiplies a binary variable to the difference between costs and benefits of a good.
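In that notation, the surplus reduces to a binary choice variable multiplying net benefit, so the maximising choice is to buy exactly when benefit exceeds cost; a minimal sketch with assumed names:

```c
/* Sketch of the surplus-maximisation rule discussed above:
 * surplus = buy * (benefit - cost), with buy in {0, 1};
 * a maximising consumer buys only when benefit exceeds cost. */
double surplus(int buy, double benefit, double cost){
    return buy * (benefit - cost);
}

int best_choice(double benefit, double cost){
    return surplus(1, benefit, cost) > surplus(0, benefit, cost);
}
```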
Marks (1992), in a now fairly dated paper, looked at modelling generalised prisoner’s dilemmas, which involve continuous rather than discrete strategies, and discusses solutions for optimal strategies, including evolutionarily stable strategies as pioneered by John Maynard Smith. The ideas in Marks (1992) overlap with G-MSE, in that there are agents (perhaps rational agents) attempting to maximise something through interaction. Marks (1992) first introduces the oligopoly problem, stating, ‘’with a small number of competitive sellers, what is the equilibrium pattern of price and quantity across these sellers, if any?’’ The analogy to managers and stake-holders would seem to be appropriate, perhaps: given a small number of stake-holders, what is the equilibrium value of a set of resources (including population size, farm yield, etc.), if any? To do this we need to understand the agency of the stake-holders and the rules of the game as set by managers.
Marks (1992) considers an economic model of a generalised prisoner’s dilemma with three players, considering the genetic algorithm, a machine-learning technique that makes it unnecessary for a human being to consider a strategy (i.e., the strategies are derived from the conditions of the model). This is the kind of avenue that we want to go down. In fact Marks (1992) puts it quite clearly in the block below:
‘’Mathematically, the problem of generating winning strategies is equivalent to solving a multi-dimensional, non-linear optimization with many local optima. In population genetic terms, it is equivalent to selecting for fitness’’
Hence the overlap between evolutionary game theory and adaptive dynamics models with models that produce optimal strategies for maximising utility in economic situations appears to be quite large, as presumed. Therefore, using evolutionary game theory would appear to be a reasonable way of selecting stake-holder strategies in G-MSE. Delving a bit more into this literature might make the jargon clearer, and identify any subtle differences in the maths or algorithms though. And I’m still not sure how this fits in with machine learning (e.g., if machine learning is just adaptive dynamics under the hood – a quick search doesn’t give an answer to this, so I think it will be necessary to do a bit more reading to understand the two; Marks (1992) differentiates: ‘’[…] advent of [Genetic Algorithms] (and machine learning) means […]’’). Here is an interesting example from a course in machine learning, where the instructor first looks at genetic algorithms – the instructor describes them as the ‘’least practical’’ of machine learning algorithms in the course, but the instructor is also an engineer, so perhaps they’ll be more practical (probably more general, if I’m thinking correctly) for solving G-MSE type problems.
Perhaps one c function (e.g., adaptive.c) could go through a learning process of maximising utility for each type of agent (doing this for each agent might get intense, depending on how many agents there are). The rules of the game could be passed from game.c to adaptive.c, where adaptive.c also takes in the array of AGENTS. From the starting point of each agent’s traits, agents within the program could reproduce themselves with mutation, then selection could minimise some cost function until some sort of maximum is achieved that results in agent trait values that have the highest return on utility. The program adaptive.c could therefore take in AGENTS:
IDs | type1 | type2 | … | see2 | see3 | G1 | G2 | … | Gn |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | … | 0 | 0 | 0 | 0 | … | 0 |
1 | 1 | 0 | … | 0 | 0 | 0.2 | 1.1 | … | -0.1 |
2 | 1 | 0 | … | 0 | 0 | 1.0 | -0.1 | … | -2.7 |
… | … | … | … | … | … | … | … | … | … |
N-1 | 2 | 0 | … | 0 | 0 | 0.4 | -1.1 | … | 0.9 |
N | 2 | 0 | … | 0 | 0 | 2.1 | 3.0 | … | 0.5 |
Where the table above is the data frame of AGENTS as it currently exists, with additional columns G1 to Gn that could hold real numbers that affect agent behaviour. A dummy data frame could be created that allows for evo_time generations of reproduction with mutation and selection for minimising a cost function in an attempt to find appropriate values affecting components of an agent’s strategy. I’m not sure how long such an algorithm would take, but I suspect that it could be optimised to not be painfully long – different criteria could be set, e.g., to allow for a maximum number of evolving generations (the aforementioned instructor suggests 1000) or some convergence criterion.
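Those two stopping rules – a generation cap and a convergence criterion – could be combined in a small helper (names and thresholds here are my own assumptions):

```c
#include <math.h>

/* Sketch (assumed names/thresholds): decide whether the genetic algorithm
 * should keep evolving, stopping at either a cap on generations or when
 * the best fitness has effectively stopped improving.                    */
int keep_evolving(int generation, int max_generations,
                  double best_now, double best_prev, double tol){
    if(generation >= max_generations){
        return 0; /* hit the generation cap (e.g., 1000) */
    }
    if(fabs(best_now - best_prev) < tol){
        return 0; /* improvement below tolerance: treat as converged */
    }
    return 1;
}
```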
Essentially, each agent or type of agent would go through a process of learning an optimal strategy by creating a lineage of strategies, the descendants of which would be selected by strategy performance. Note that given a convergence criterion, strategies might not need to evolve much in each time step of G-MSE – the best strategy might be stable over time in some situations (and if we don’t want strategies to change over time steps, the question of optimal strategy could be solved when initialising agents – still, the idea of allowing dynamic strategies seems interesting, and might be important if management is also changing).
While for some simulations we’ll want to take the time to allow evolution of optimal strategies, in others we might even embrace imperfect strategies evolving as a consequence of short evolution times – this might mimic the limited time that stake-holders have to consider a particular problem.
A general summary of G-MSE as it exists at the moment
A summary of some of the challenges of putting the ‘G’ in G-MSE
A summary of some ideas for moving forward with G-MSE
Short-term plan
I’m going to finish developing thoughts on evolutionary game theory, then move onto looking at game theory from an economic perspective. I think the biggest thing to consider on the immediate horizon is what kind of approach will be used to simulate agents (stake-holders) playing games and making decisions. Once this is clear, the details can follow. Some sort of utility function will be used. Of particular consideration is how much complexity should be incorporated – or, perhaps – how much mechanistic detail.
Leombruni and Richiardi (2005) make some interesting points regarding the use of mathematical versus agent-based models, noting the tractability issues with mathematical (and game-theoretic) models as things become more complex due to unique individuals needing to be represented.
Quick efficiency fix
The best way to manage memory in R is going to be by avoiding rbind altogether and working instead with lists, as made very clear by the following quick experiment in scratch.r:
################################################################################
# Testing list versus array efficiency
# ARRAY FIRST:
sam <- sample(x = 1:100, size = 14000, replace = TRUE);
dat <- matrix(data=sam, ncol=14);
obs <- NULL;
proc_start <- proc.time();
time <- 1000;
while(time > 0){
obs <- rbind(obs, dat);
time <- time - 1;
}
proc_end <- proc.time();
time_taken <- proc_end - proc_start;
# TIME TAKEN: 14.09 seconds
# NOW LIST:
sam <- sample(x = 1:100, size = 14000, replace = TRUE);
dat <- matrix(data=sam, ncol=14);
obs <- list();
proc_start <- proc.time();
time <- 1000;
i <- 1;
while(time > 0){
obs[[i]] <- dat;
i <- i + 1;
time <- time - 1;
}
proc_end <- proc.time();
time_taken <- proc_end - proc_start;
# TIME TAKEN: 0.005 seconds
################################################################################
The output being deposited into a list is much, much faster. Enough to make me want to fix this immediately. Doing so was trivial – it was just a matter of replacing RESOURCE_REC <- rbind(RESOURCE_REC, RESOURCES) with RESOURCE_REC[[time]] <- RESOURCES, then editing the plotting functions accordingly given the new data type. The result is that simulations are now much faster, especially when time is high, simulating many time steps. One hundred time steps used to take 10-12 seconds for some observation times – they now all take under a second. For more time steps, the efficiency difference would grow even larger, because rbind copies the entire accumulated object on every call, so total time grows much faster than linearly with the number of time steps. The massively increased efficiency occurs because R no longer allocates a whole new massive chunk of memory for each new recorded data frame – it just stores each data frame as a new list element, without copying everything recorded so far.
CONCLUSION THE TIME IT TAKES TO RUN 100 TIME STEPS HAS DECREASED BY AN ORDER OF MAGNITUDE BY SWITCHING FROM DATA FRAMES TO LISTS IN R (NOW LESS THAN 1 SECOND)
Note that plotting still happens slowly, deliberately, because we’re putting the system to sleep for a tenth of a second in each time step to make the animation smooth. When plotting is turned off, this no longer happens.
Proof of concept: Interactive user input as a stake-holder
The code below runs the gmse program in a way that is interactive. I have run the simulation and specified that the hunting begins in time step 95.
> sim <- gmse( observe_type = 0,
+ agent_view = 10,
+ res_death_K = 400,
+ plotting = TRUE,
+ hunt = TRUE,
+ start_hunting = 95
+ );
This produces the following output. When prompted by the line ‘’Enter the number of animals to shoot’’, I have typed in a number and hit enter accordingly.
Year: 95
The manager says the population size is 181
You observe 11 animals on the farm
Enter the number of animals to shoot
10
Year: 96
The manager says the population size is 408
You observe 11 animals on the farm
Enter the number of animals to shoot
10
Year: 97
The manager says the population size is 272
You observe 6 animals on the farm
Enter the number of animals to shoot
10
You can't shoot animals that you can't see
6 animals shot
Year: 98
The manager says the population size is 226
You observe 10 animals on the farm
Enter the number of animals to shoot
0
Year: 99
The manager says the population size is 294
You observe 9 animals on the farm
Enter the number of animals to shoot
5
The output of this also shows the spatial distribution of resources and a population graph over time. My hope was to allow the gmse.so file to be sourced directly from a link so that it could be run by anyone remotely, but I think that this will take a bit more work – worth keeping in mind for later.
I am still trying to get a clear picture of how to incorporate management, user, and game-theoretic modelling components. Given uncertainty in all of these components, some unified approach would seem beneficial. Franco et al. (2016) have recently introduced a comprehensive approach to evaluate effects of disturbance on coral reefs using a Bayesian Belief Network (BBN) approach. This approach ‘’offers a methodological framework to address uncertainty’’. This approach requires some defined outcome state, the probabilities of realisation of which are calculated. Use of BBNs requires an acyclic graph and conditional probability tables. It’s not entirely clear to me how BBNs would be incorporated into the G-MSE simulations, except maybe as a type of observation model? With the simulation, we can look at causality directly and thereby quantify direct and indirect effects, and measurement error. It could, however, be useful to know how well BBNs perform using simulated populations, simulated observational data, and appropriate analysis based on BBNs, as would be used on empirically derived data.
For coauthors, add the G-MSE files onto a public Dropbox so that they can be sourced and run remotely. There are also some useful resources for embedding R in a website. This might be faster than using Shiny, at least at first, so it could be useful for initial demonstrations. It might be useful to show a prototype of G-MSE, or what it might be:
## [1] "Managers estimate the population size is 4230"
## [1] "You encounter 35 animals around your farm"
## [1] "Estimated loss of yield is at 5%"
## [1] "Enter how many animals you intend to hunt"
Demonstrating this (and it would be quick to implement) might be useful for showing how management and games work.
Side note about computation efficiency
Note that it would really be faster to convert to a list type in R if anything computationally intense needs to be done (e.g., binding rows). C does not appear to let me read in a list via .Call, only a vector, so it’s worth thinking later about whether doing some things on the R side will be faster.
Updated scratch.R to show how option 2 could work, though the change itself might be more inefficient than binding or other operations.
Issues related to agent-based complex modelling of human decisions
An (2012) reviews humans as agents in agent-based models of social-ecological systems. An (2012) ties this in with complexity theory, and distinguishes agent-based from individual-based models in a useful way – with agent-based models being defined more by attention to decision-making processes (as in models of human behaviour). An (2012) asks,
- What methods, in what manner, have been used to model human decision-making and behavior?
- What are the potential strengths and caveats of these methods?
- What improvements can be made to better model human decisions in coupled human and natural systems?
An (2012) reviews nine different types of decision models, and notes that different types of decision models can be mixed and matched, as we’ll likely need to do for G-MSE. I’m not sure that we can assume that stake-holders are the same types of decision-makers. For example, I suspect that farmers might be better represented by a microeconomic model of decision making, with a focus on maximising some sort of revenue or yield. An (2012) notes the use of utility functions here (seeming to link with some of my earlier thoughts), including one in which ecological indicators are included in place of just money (Nautiyal and Kaechele 2009). Apparently, econometric work by McFadden (1973) is foundational to looking at decisions based on utility, modelling decisions as a probability of an agent choosing an option. An (2012) notes that decisions are unlikely to be completely rational, and humans will tend to seek ‘’satisfactory rather than optimal utility’’.
A second of the nine types of decision models includes the psychosocial and cognitive models, which attempt to model individuals’ thoughts based on beliefs and goals – institutions can also be modelled this way, though we might think of institutions as collections of the same type of individual for the purposes of G-MSE coding.
One type of modelling that could be especially interesting is what An (2012) defines as ‘’participatory agent-based modelling’’, wherein real stake-holders tell the modeller what they would do under some set of conditions, then the model runs with those decisions. This has been used, apparently, in an agricultural setting (Naivinit et al. 2010), and would be a very interesting addition to G-MSE. If we could have an option for letting a user take over the role of an agent in the model and play against a computer, it could be interesting – though I’d still tend to want to develop some game-theoretic algorithm that grounds predictions of stake-holder behaviour, rather than relying solely on empirically derived data (i.e., asking people what they would do). This could be accomplished in a couple of ways, in principle – one being through the use of a standalone C program (i.e., not linking with R) that prompts the user for input using the scanf function and repeatedly updates the simulation with information in every cycle of the G-MSE loop. The same effect can be accomplished in R with the following code as an example of the concept:
act_agent <- function(times){
    while(times > 0){
        cat("\n\n\n How many geese do you shoot? \n\n");
        shot_char  <- readLines(con = stdin(), 1);  # Read the user's input
        shot_num   <- as.numeric(shot_char);
        gross_prod <- rpois(n = 1, lambda = 100);   # Random gross production
        net_prod   <- gross_prod - (2 * shot_num);  # Each goose shot costs 2
        cat("\n");
        output     <- paste("Net production = ", net_prod);
        print(output);
        times      <- times - 1;
    }
}
If you read the function into R, then run it (e.g., act_agent(times = 2)), it will ask for input for times iterations, prompting once per iteration of the while loop. An option in G-MSE would be nice to allow:
All of these would be fun, and An (2012) notes that they are often quite useful. Ideally it would be nice to make the program more user-friendly than a command line interface, but that seems like a concern for a version 2.0, after an initial version has been released. More helpfully, using some sort of loop could make for easy input of the R options in the gmse function – it could ask, in plain language, for users to insert the numbers that are currently only input within gmse() itself (e.g., gmse(time_max = 100)).
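As a rough sketch of the standalone C idea mentioned above (the scanf prompt loop), the following mirrors the act_agent R example – the function names, the fixed gross production of 100, and the cost of two production units per goose are all illustrative, not part of any GMSE code:

```c
#include <stdio.h>

/* Net production after shooting, mirroring the R example above,
 * where each goose shot costs two units of production. */
int net_production(int gross_prod, int shot_num){
    return gross_prod - (2 * shot_num);
}

/* Prompt loop, as in the standalone scanf idea: ask how many geese
 * were shot, report net production, repeat. Stops cleanly on EOF
 * or non-numeric input. */
void run_prompts(int times){
    int shot_num;
    while(times > 0){
        printf("\n How many geese do you shoot? \n");
        if(scanf("%d", &shot_num) != 1){
            break;
        }
        /* A fixed gross production of 100 stands in for rpois(1, 100) */
        printf("Net production = %d\n", net_production(100, shot_num));
        times--;
    }
}
```

A real version would replace the fixed gross production with the simulated state of the resource model at each cycle of the G-MSE loop.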
It’s possible that we could develop a type of rudimentary artificial intelligence by collecting data on user decisions (i.e., make a ‘bot’ that mimics human decisions). For example, we could have 100 people act as agents in G-MSE, collect data on the decisions that they make when trying to act like a stake-holder, then construct an algorithm based on real user decisions in different situations (alternatively, or in addition, we could also look at actual past decisions from the case studies to make an algorithm). This could be an interesting approach, albeit a somewhat atheoretical one – it doesn’t excite me quite as much, but it might be worth considering because the end result might predict human behaviour better than theory-driven approaches (as humans don’t always act rationally or think things through carefully – I don’t think a citation is needed for this; it’s 20 JAN 2017, and the current time is 17:00 GMT, or 12:00 EST). It could also be interesting to compare different types of approaches (i.e., have a theory-based approach and an empirically-based approach option). An (2012) warns, though, that ‘’Even though also based on data, researchers usually have to go through relatively complex data compiling, computation, and/or statistical analysis to obtain such rules’’. An (2012) also notes that this kind of data collection does not necessarily identify why decisions are being made. Hence, I do think game theory will be absolutely important, with agents using underlying utility functions to maximise their own utilities as a consequence of games.
Some notes on the asymmetric nature of stake-holder games
Games between stake-holders, modelled by agents in G-MSE, are typically, if not always, going to be asymmetric. This means that the stake-holders are distinguished by more than their strategies – they are likely to have their own unique payoffs defined by their identities (e.g., as a conservationist, a farmer, etc.). It would seem as though the only way around this – if it’s even possible – might be to make identity part of the game itself. In other words, let agents attempt to maximise some general payoff by deciding to take on a particular role, and then a strategy given their chosen role. It’s an interesting thought, but I don’t think it makes much sense for the practical application of G-MSE. In the context of the games that we’re interested in, stake-holders effectively are conservationists, farmers, hunters, etc. (or some mixture of these roles). Hence, I think we need to work with the idea that the games our stake-holders play and that G-MSE will model are going to be asymmetric.
Maynard Smith and Parker (1976) outlined three specific ways that games might be asymmetric (they were thinking about animal contests, but the general principles apply):
Pay-off asymmetry: Different players might stand to gain different amounts in the game – e.g., perhaps mutual cooperation returns a higher benefit for one player than another, or defection on the part of one player has a more negative effect on its opponent than vice versa.
Resource asymmetry: Intrinsic differences between players might give one player an inherent advantage, allowing them to dominate in an interaction (i.e., there might not be much of a conflict because one side can always win).
Uncorrelated asymmetry: Discussed earlier; Maynard Smith and Parker (1976) define this as asymmetries that ‘’do not affect either the payoffs or the’’ resources that might give one player an intrinsic advantage.
The authors offer some general conclusions about asymmetric games with unequal payoffs, but these are really more about encounters of conflict, and perhaps not so applicable to G-MSE. They state that, where payoffs are unequal but all parties have access to information, it is best to ‘’play high when you have more to gain and zero when you have less to gain’’. In other words, if there is a lot to gain by sticking it out and fighting hard in an interaction, do it – if there’s not much to gain, then back off. Such contests are the central focus of Maynard Smith and Parker (1976), but the general conclusion that ‘’mixed strategies will be the exception’’ when contests are asymmetrical would seem to apply more broadly. Given the many ways that a game can be asymmetrical – rather, that a symmetrical game could be changed to asymmetrical – it would seem likely that there are more ways to cause a strategy to become pure than not pure, because there are more ways of adjusting payoffs to make one strategy the clear winner. This could simplify the game theory in G-MSE, in a sense, if mixed strategies do not require much consideration.
McAvoy and Hauert (2015) recently emphasised the importance of asymmetry in evolutionary games, noting that ‘’cooperation may be tied to individual energy or strength, which is, in turn, determined by a player’s role’’. This would seem to apply to social-ecological conflicts as well – cooperation might reasonably be tied to the power (economic, political, etc.) of stake-holders, meaning that it might be important to take this into account in G-MSE modelling. For something like the Prisoner’s dilemma, we can represent an asymmetry using subscripts, so the standard game would be represented by a payoff matrix,
\[ \left( \begin{array}{ccc} & C & D \\ C & R, R & S, T \\ D & T, S & P, P \end{array} \right). \]
where the above satisfies \(T > R > P > S\). An asymmetric game can be represented by,
\[ \left( \begin{array}{ccc} & C & D \\ C & R_{i}, R_{j} & S_{i}, T_{j} \\ D & T_{i}, S_{j} & P_{i}, P_{j} \end{array} \right). \]
The above is for two different types of players, \(i\) and \(j\). Note that I tried working through the same basic concept with slightly different notation earlier on, with each matrix element being defined by a utility function that is unique to each agent type. In the code, this will all be defined by agent types and their respective traits (columns in the agent_array), but it’s good to link this up with theory and the general properties of asymmetric games.
McAvoy and Hauert (2015) go into the Prisoner’s Dilemma and Snowdrift games given environmental and genotypic asymmetry. Such asymmetries can complicate the evolution of strategies and, perhaps more relevant for G-MSE, can cause different types of agents to experience different types of games as a result of asymmetry:
’‘[…] Thus, based on the social dilemma implied by the ranking of the payoffs, a player who incurs a cost of \(c_{1}\) for cooperating is always playing a Snowdrift Game while a player who incurs a cost of \(c_{2}\) is always playing a Prisoner’s Dilemma. It follows that ecological asymmetry can account for multiple social dilemmas being played within a single population, even if the players all use the same set of strategies’’ (McAvoy and Hauert 2015 p. 9).
The above quote is with respect to payoff asymmetries caused by space, but the point is that the asymmetry of the payoff matrix can lead to different players experiencing different games and therefore having different – potentially conflicting – strategies.
We might also apply the concept of genotypic asymmetry with the process of cultural updating, which occurs when the ‘genotypes’ (perhaps stake-holder types) do not change, but the strategies of players can be updated over time. Note that while genetic asymmetry can be reduced to a broader symmetric game given genetic updating (i.e., births and deaths of players of particular types), this is probably not applicable to G-MSE.
Some thoughts on the application of game theory
I’m trying to step back a bit to consider the manager and user models, which will both affect and/or be affected by the game-theoretic component of the model. I’ve considered how the game-theoretic component will fit into G-MSE more generally, and also a bit of how it might be implemented and applied in the context of stake-holder actions. Overall, this will require three C files to be closely integrated, but the application (perhaps even development, if necessary) of game theory requires a lot of thought.
The model will be more general if we allow agents to take any number of actions, but the number of games that are possible increases exponentially with the number of different actions that agents can take (Zeeman 1980). If only two actions are possible (e.g., cooperate and defect), then there are only four types of games that can be played (Prisoner’s dilemma, Snowdrift, Anti-coordination, and Harmony). The number of games increases to 20 for three actions and 228 for four actions (Adami et al. 2016). If we want the software to somehow identify the type of game being played – rather, if game type identification is to be an essential part of the program – then agent actions will probably need to be limited (there is, of course, always the option to identify games only when there are sufficiently few actions). If most conflicts can be described by a small number of types of agents with a small number of types of actions (and this seems reasonable, perhaps, especially if we think of actions qualitatively), then constraining the software to such cases might be preferable (at least, as a starting point). The benefit is that we might then make clearer predictions for management, e.g.: right now, stake-holders are playing a Snowdrift game, but by adopting an alternative management decision, they will transition to playing Harmony.
This is appealing, but I think it also relies on payoff matrices being symmetric, meaning that players are distinguished by their strategies and nothing else (McAvoy and Hauert 2015). In the types of games that interest us, this almost certainly won’t be true. The games we’re interested in at ConfooBio will typically be characterised by uncorrelated asymmetry; that is, situations in which agents know that they are of a certain type and will receive payoffs associated with that type of agent. Hence, the payoff structure might look like a Prisoner’s dilemma to one stake-holder, but Harmony to another (i.e., the optimal strategy is always cooperate for one, but always defect for another because each knows the type of agent that they are and how payoffs differ between types).
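To make the point concrete that one payoff matrix can present different games to different players, here is a minimal C sketch – not GMSE code; the payoff orderings and values are invented for illustration – that classifies each player’s perspective from their own \(T, R, P, S\) values:

```c
/* Payoffs for one player in a 2x2 game: T (temptation), R (reward),
 * P (punishment), S (sucker), as in the matrices above. */
typedef struct { double T, R, P, S; } payoffs;

/* Prisoner's dilemma ordering T > R > P > S: defection dominates. */
int is_prisoners_dilemma(payoffs p){
    return p.T > p.R && p.R > p.P && p.P > p.S;
}

/* Harmony ordering (R > T and S > P): cooperation dominates
 * regardless of what the opponent does. */
int is_harmony(payoffs p){
    return p.R > p.T && p.S > p.P;
}

/* With an asymmetric bilateral matrix, one player's payoffs can
 * satisfy the PD ordering while the other's satisfy Harmony, so the
 * same interaction is a different game for each player:
 *
 *   payoffs i = {5.0, 3.0, 1.0, 0.0};   -> Prisoner's dilemma
 *   payoffs j = {2.0, 4.0, 1.0, 3.0};   -> Harmony
 */
```

This is exactly the situation described above: player \(i\) should always defect, player \(j\) should always cooperate, and neither is playing a mixed strategy.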
I’m starting to work through these ideas with an initial focus on evolutionary games, as this is the application of game theory with which I’m most familiar, and because I think some of the general developments of evolutionary game theory are probably applicable for our purposes. I’ll also need to read more widely into economics and the social sciences, but some recent work by Adami et al. (2016) and McAvoy and Hauert (2015) seems relevant.
Adami et al. (2016) argue that the optimal strategies predicted by simple mathematical games are unlikely to be very useful for predicting agent actions given the complexities associated with real-world decisions; such complexity notably includes stochasticity, which applies to games among all kinds of agents from ‘’microbes to day traders’’ (Adami et al. 2016). Stochasticity can affect the stability of strategies (see also Adami and Hintze 2013). If strategies are conditional or based on memory of previous encounters, then the number of traits (Adami et al. 2016 assume loci, modelling genetics, but the same applies to agents making decisions) required to model decisions increases rapidly – 21 total traits are needed for conditional expression of strategy when agents can remember the previous two games. In practice, I suspect that there is some helpful way to simplify this – perhaps not every detail of the history of interactions and possible conditions really is needed (or even would be expected) to model stake-holder behaviour. Instead, I suspect that game history could be boiled down into one or two representative variables that, among other things, are likely to influence agent behaviour. Agents are perhaps better thought of as modelling stake-holders guided primarily by heuristics rather than optimally rational behaviour? Hence the agent_array might better be thought of as containing variables underlying human values and traits in the context of games rather than as solutions to games. A couple of recent and potentially relevant papers on decision rules in complex environments include Fawcett et al. (2014) and McNamara et al. (2014). Adami et al. (2016) conclude that ‘’[w]hile evolutionary games can be described succinctly in mathematical terms, they can only be solved exactly for the simplest of cases’’. Adami et al. (2016) were specifically considering games in an evolutionary context, but I don’t think that their conclusion is limited to evolutionary game theory. In the case of decision-making stake-holders, the complexity associated with stochasticity and uncertainty, the possibility of more than two actions and payoffs, and the asymmetry of payoff matrices would all seem to contribute to the difficulty or impossibility of solving for exact solutions. Hence, when scenarios are complex in G-MSE (as we probably need them to be), it is unlikely that analytic solutions will be of much use. Moreover, stake-holders won’t evolve in the same sense as biological organisms, so some techniques used in evolutionary game theory will be unavailable – or will have to be modified. It might be worth thinking more about identifying the consequences of practical or observed strategies, or types of strategies, rather than trying to somehow solve for the best strategies. The Axelrod experiments kind of did this before a lot of complex techniques became available to analyse evolutionary games. Users proposed strategies, which were put into a tournament – the point wasn’t so much to solve the iterated Prisoner’s dilemma as to explore different strategies for playing the game.
In this browser app, you can play the iterated prisoner’s dilemma against ‘Lucifer’, an automated agent that responds to your decisions.
NEW ISSUE 9: Observation Error It would be useful to incorporate observation error into the simulations more directly. This could be affected by one or more variables attached to each agent, which would potentially cause the mis-identification (e.g., incorrect return of seeme) or mis-labelling (incorrect traits read into the observation array) of resources. This could be done in either of two ways:
Cause the errors to happen in ‘real time’ – that is, while the observations are happening in the simulation. This would probably be slightly inefficient, but have the benefit of being able to assign errors specifically to agents more directly.
Wait until the resource_array is marked in the observation function, then introduce errors to the array itself, including errors to whether or not resources are recorded and what their trait values are. These errors would then be read into the obs_array, which is returned by the function.
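A minimal C sketch of the second option, assuming for illustration a flat column of 0/1 marks rather than the real resource_array layout (the function name and the error model are hypothetical):

```c
#include <stdlib.h>

/* Sketch of post-hoc observation error: take an already-marked
 * column of 0/1 detections and flip each mark with a fixed error
 * probability, producing missed detections and false positives.
 * The layout and error model are illustrative only. */
void add_mark_errors(int *marks, int n, double error_pr, unsigned int seed){
    int i;
    srand(seed); /* seeded here so runs are reproducible */
    for(i = 0; i < n; i++){
        if((double) rand() / RAND_MAX < error_pr){
            marks[i] = 1 - marks[i]; /* flip: miss or false positive */
        }
    }
}
```

The first option described above would instead apply an agent-specific error rate at the moment each observation is made, which is why it could assign errors to agents more directly.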
NEW ISSUE 10: Multiple resource types
The resource-wide parameter values (e.g., carrying capacities, movement types) will need to be either:
- read into the resource function as necessary, and/or
- passed as a vector to the gmse function, the length of which could determine how many times resource is called in one time step (one for each type of resource, potentially, if carrying capacity is type specific – or carrying capacity could be applied within a type in C – perhaps more efficient, but this would require reading in multiple K values somehow, either through the paras vector or in the resources array – or something else). How to do this best will need to consider both computational efficiency and clarity/ease of coding.
Note that:
res_remove can already be called in a type-specific way by resource, so it might just be better to call resource once and somehow input variable numbers of K into C. I’ll need to think more about this, but it could be something like assigning each individual a competition coefficient alpha for how it is affected by each other type of individual. Intra-type competition could then be modelled generally, with K defined by its inverse. Meanwhile, inter-type competition coefficients could also be useful.
Along these lines, it’s also worth considering an option allowing only one resource per cell (equating to a local alpha and K of one). This might be worth making its own issue later.
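The alpha idea above could be sketched as follows in C; the two-type limit, the function names, and the alpha values are all assumptions for illustration, not GMSE code:

```c
/* Sketch of type-specific competition coefficients: the density an
 * individual of a given type effectively experiences is a weighted
 * sum over all type abundances, with intra-type alpha = 1/K
 * recovering a type-specific carrying capacity K. Two types only,
 * for illustration. */
double effective_density(int type, double alpha[2][2], double N[2]){
    return alpha[type][0] * N[0] + alpha[type][1] * N[1];
}

/* Intra-type carrying capacity as the inverse of intra-type alpha,
 * as suggested above. */
double carrying_capacity(double alpha_intra){
    return 1.0 / alpha_intra;
}
```

Density-dependent death for a focal individual could then be driven by its type’s effective density relative to one, so a single resource call could handle multiple K values without multiple passes.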
If we were to call resource multiple times, we would also need to paste arrays together in R. This wouldn’t be terrible, but it could lose some efficiency unnecessarily, and I don’t see the benefit.
MEMORY LEAK CHECK OF R CODE
I have tried running simulations at very high population sizes (>100000) to see how the simulation would react. Upon seeing quite a bit of memory being used up, I ran the following valgrind command:
R -d "valgrind --tool=memcheck --leak-check=yes" --vanilla < gmse.R
The program valgrind found a lot of large memory allocations and deallocations (as expected):
Warning: set address range perms: large range
The leak summary was as follows:
==14507== LEAK SUMMARY:
==14507== definitely lost: 133,373,728 bytes in 469 blocks
==14507== indirectly lost: 11,472,512 bytes in 55 blocks
==14507== possibly lost: 120,863,992 bytes in 563 blocks
==14507== still reachable: 2,319,742,586 bytes in 12,127 blocks
==14507== suppressed: 0 bytes in 0 blocks
==14507== Reachable blocks (those to which a pointer was found) are not shown.
==14507== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==14507==
==14507== For counts of detected and suppressed errors, rerun with: -v
If we shift to look only at one run of the resource model, which is run in the new script scratch.R, we get:
==14689== LEAK SUMMARY:
==14689== definitely lost: 3,584 bytes in 4 blocks
==14689== indirectly lost: 0 bytes in 0 blocks
==14689== possibly lost: 0 bytes in 0 blocks
==14689== still reachable: 28,837,506 bytes in 13,346 blocks
==14689== suppressed: 0 bytes in 0 blocks
==14689== Reachable blocks (those to which a pointer was found) are not shown.
==14689== To see them, rerun with: --leak-check=full --show-leak-kinds=all
And if we include one run of the observation model too, we get:
==14721== LEAK SUMMARY:
==14721== definitely lost: 6,296 bytes in 8 blocks
==14721== indirectly lost: 0 bytes in 0 blocks
==14721== possibly lost: 0 bytes in 0 blocks
==14721== still reachable: 28,948,434 bytes in 13,355 blocks
==14721== suppressed: 0 bytes in 0 blocks
==14721== Reachable blocks (those to which a pointer was found) are not shown.
==14721== To see them, rerun with: --leak-check=full --show-leak-kinds=all
A bit more worrisomely, if I run an old R script (a simple individual-based model), I get the following:
==15050== LEAK SUMMARY:
==15050== definitely lost: 0 bytes in 0 blocks
==15050== indirectly lost: 0 bytes in 0 blocks
==15050== possibly lost: 0 bytes in 0 blocks
==15050== still reachable: 36,846,063 bytes in 15,996 blocks
==15050== suppressed: 0 bytes in 0 blocks
==15050== Reachable blocks (those to which a pointer was found) are not shown.
Originally, I feared that this might suggest a problem with my C code, or its call to R. All the memory allocated appears to be freed, though. Some searching online suggests that valgrind is not always perfect on this front:
‘’You may be surprised to see that valgrind believes that R has leaked memory - unfortunately, it is not perfect, and in this particular case the memory is not so much ’leaked’ as it is ‘cached for the duration of that R session’, and valgrind fails to detect that ‘ownership’ of a particular block of memory is transfered.’’
This is likely what happened (given the original warning). In fact, if we run valgrind and try to track the origin of the leak with --track-origins=yes, it complains in exactly this way – about memory that is allocated but definitely freed:
R -d "valgrind --tool=memcheck --leak-check=yes --track-origins=yes" --vanilla < scratch.R
Below, for example, valgrind is complaining about line 468 in the resource.c file:
==15171== 1,560 bytes in 1 blocks are definitely lost in loss record 165 of 1,867
==15171== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15171== by 0xC2959DE: resource (resource.c:468)
==15171== by 0x4F0A57F: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F4272E: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F43DDC: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F422FC: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F45FB5: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F44E47: ??? (in /usr/lib/R/lib/libR.so)
==15171== by 0x4F42520: Rf_eval (in /usr/lib/R/lib/libR.so)
This line allocates memory for the res_new array:
res_new = malloc(res_num_total * sizeof(double *));
for(resource = 0; resource < res_num_total; resource++){
res_new[resource] = malloc(trait_number * sizeof(double));
}
This appeared to have been freed correctly but, on inspection, each malloc to an inner array was missing a corresponding free. I have fixed this (with thanks to this StackOverflow thread), and now the entire gmse.R program produces the following valgrind output:
==15405== LEAK SUMMARY:
==15405== definitely lost: 0 bytes in 0 blocks
==15405== indirectly lost: 0 bytes in 0 blocks
==15405== possibly lost: 0 bytes in 0 blocks
==15405== still reachable: 1,544,824,322 bytes in 12,119 blocks
==15405== suppressed: 0 bytes in 0 blocks
CONCLUSION MEMORY LEAK HAS BEEN IDENTIFIED AND FIXED
While this wasn’t a huge deal for small scale simulations, for simulations with huge arrays caused by large population sizes, this would have made a difference. The code has therefore been corrected and pushed to dev.
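In sketch form, the fix pairs every inner malloc with its own free; the names follow the resource.c snippet above, but this is an illustrative reconstruction rather than the exact GMSE code:

```c
#include <stdlib.h>

/* Allocate a res_num_total x trait_number array of doubles, as in
 * the resource.c snippet above (sizes illustrative). */
double **alloc_res(int res_num_total, int trait_number){
    int resource;
    double **res_new = malloc(res_num_total * sizeof(double *));
    for(resource = 0; resource < res_num_total; resource++){
        res_new[resource] = malloc(trait_number * sizeof(double));
    }
    return res_new;
}

/* The fix: each row malloc needs its own matching free before the
 * top-level pointer is freed -- the inner loop was the missing part. */
void free_res(double **res_new, int res_num_total){
    int resource;
    for(resource = 0; resource < res_num_total; resource++){
        free(res_new[resource]);
    }
    free(res_new);
}
```

Freeing only the top-level pointer leaks every row, which is exactly the ‘definitely lost’ pattern valgrind reported above.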
With all of this in mind, it is worth thinking about the R side of memory management as it becomes more relevant (see R memory management advice). It might be worth switching to a list structure for input and output so that entire frames are not copied for each operation (which I assume R is doing for the rbind() function). It might also be worth thinking about running rm() and gc() in tandem to release memory during the major loop – or also getting rid of some components of the data frame on the fly. The gmse.R program could potentially switch from a list to an array after the major simulation loop finishes and plotting or returning the array is necessary.
It appears that I’m correct regarding the use of rbind() (or c() or cbind()) – these are terribly inefficient with respect to what’s happening under the hood when R calls C (or C++). I’ve downloaded Svetlana Eden’s Efficiency tips for basic R loops, which might be a useful reference when working on the R side of optimisation. The rbinds should really be avoided, if possible. One way to do this, if nothing else, would be to write to a file instead of cbind (not sure if this would be helpful for a shiny app). StackOverflow suggests using rbindlist, but this would introduce dependencies that I’d prefer to avoid. In the end, it might be worth it to just write a quick add_data.c script in c for the sole purpose of joining old and new arrays. Alternatively, this might not be so important – in the end, it might not even be necessary to record the entire observation history; at least, not in the way it’s currently being done. The history might instead only record a few key things from each time period.
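A hypothetical add_data.c-style join could look like the following; the flat row-major layout, the function name, and the column convention are all assumptions for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the add_data.c idea: append new rows to an old flat
 * row-major array in a single allocation and two copies, instead of
 * repeated rbind() copies on the R side. Caller frees the result. */
double *join_rows(const double *old, int old_rows,
                  const double *new_rows, int n_new, int cols){
    double *out = malloc((old_rows + n_new) * cols * sizeof(double));
    memcpy(out, old, old_rows * cols * sizeof(double));
    memcpy(out + old_rows * cols, new_rows, n_new * cols * sizeof(double));
    return out;
}
```

Even this still copies the old array once per call, so preallocating for the whole simulation (or recording less history, as suggested above) would remain the better option.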
RESOLVED ISSUE 6: Sampling ability with agent number
This issue has been resolved to my satisfaction. I did this using the second option for addressing it. Now for case 3, in which blocks of the landscape are iteratively sampled (and resources potentially move in between iterations), a transect_eff defining transect efficiency is set equal to the number of observing agents (working_agents). The transect_eff is a counter which, after it has counted down to zero, will permit resource movement. Hence, if there is only one agent observing, transect_eff hits zero and movement happens after every iteration; if there are two agents observing, then transect_eff hits zero after two iterations, then movement occurs and transect_eff is reset to working_agents.
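The counter logic described above can be sketched in C (illustrative, not the actual observation.c code; the function name is hypothetical):

```c
/* Sketch of the transect_eff counter: resource movement is only
 * permitted once per full sweep of all observing agents. Returns 1
 * when movement may happen this iteration, and resets the counter
 * to working_agents for the next sweep. */
int may_move(int *transect_eff, int working_agents){
    (*transect_eff)--;
    if(*transect_eff <= 0){
        *transect_eff = working_agents;
        return 1;
    }
    return 0;
}
```

With one observing agent the counter hits zero every iteration, so movement happens every time; with two agents it happens every second iteration, matching the behaviour described above.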
RESOLVED ISSUE 8: Clear up method sampling type in observation model This issue has been resolved, albeit with cases in a different order than suggested (the original suggestion, it turns out, was not ideal). Cases are now:
0. Sampling within a range of view (i.e., don’t rely on fix_mark > 0 for switching methods)
1. Sampling fix_mark times randomly on the landscape
2. Linear transect
3. Square transect
Of course, there is always room for more, but these are now four clear observation methods. Separating case 0 from case 1 is especially useful now. The variable fix_mark is now just ignored for all cases except 1. In the code, both case 0 and case 1 still look similar, and both dig deep through the mark_res and field_work functions to differentiate between observation methods, but I don’t think that this is necessarily a bad thing – a different argument to mark_res differentiates them now, at least in the observation function, so it’s not too difficult to trace through what is going on. Note that both cases 0 and 1 add a new column for each times_obs, which isn’t done for the transect methods.
The specially created branch fix_home_bug has been merged. I will keep it alive for a while before removing it entirely.
Update – 14:33: after rewriting the gmse.R code to make an easier catch-all function, with appropriate analysis, I’ve noticed that the binos function in observation.c is defining distance in a way that is no longer really compatible with the point of case 0 (i.e., sample a small area and extrapolate based on the density). The binos function was using the Euclidean distance, making, e.g., 3 cells away diagonally farther than 3 cells away left or right (or up or down). This might be useful later, so I’m going to keep it in as an option, but I’m also going to make the default within view cells in any direction, such that a block forms around the focal individual and diagonal distances are not assumed to be longer than length and width. This is the more common way of simulating things, and it makes movement and observation estimates easier – I think the only reason to change it back to Euclidean distance would be if we had an actual map and really needed to be precise with the distances of things on it.
I have also simplified the master R file gmse.R
to allow for one function to do all of the work, using several default
options for simulations. Below, the main gmse()
function is shown with its default values.
################################################################################
# PRIMARY FUNCTION (gmse) FOR RUNNING A SIMULATION
# NOTE: RELIES ON SOME OTHER FUNCTIONS BELOW: MIGHT WANT TO READ WHOLE FILE
################################################################################
gmse <- function( time_max = 100, # Max number of time steps in sim
land_dim_1 = 100, # x dimension of the landscape
land_dim_2 = 100, # y dimension of the landscape
res_movement = 1, # How far do resources move
remove_pr = 0.0, # Density independent resource death
lambda = 0.9, # Resource growth rate
agent_view = 10, # Number cells agent view around them
agent_move = 50, # Number cells agent can move
res_birth_K = 10000, # Carrying capacity applied to birth
res_death_K = 400, # Carrying capacity applied to death
edge_effect = 1, # What type of edge on the landscape
res_move_type = 2, # What type of movement for resources
res_birth_type = 2, # What type of birth for resources
res_death_type = 2, # What type of death for resources
observe_type = 0, # Type of observation used
fixed_observe = 1, # How many obs (if type = 1)
times_observe = 1, # How many times obs (if type = 0)
obs_move_type = 1, # Type of movement for agents
res_min_age = 1, # Minimum age recorded and observed
res_move_obs = TRUE, # Move resources while observing
Euclidean_dist = FALSE, # Use Euclidean distance in view
plotting = TRUE # Plot the results
){}
Using the function defined above, with most parameters set to default values, I looked at the four different observation types below given the following parameters.
# A: Sample of a 10 by 10 region to estimate density
# Simulation time: 1.8 seconds
gmse( observe_type = 0,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
# B: Mark 30 resources 4 times, recapture 30 4 times
# Simulation time: 2.1 seconds
gmse( observe_type = 1,
fixed_observe = 30,
times_observe = 8,
res_death_K = 800,
plotting = TRUE
);
# C: Sample agent_view rows at a time -- all across
# Simulation time: 2.3 seconds
gmse( observe_type = 2,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
# D: Sample agent_view by agent_view blocks -- all across
# Simulation time: 6.5 seconds
gmse( observe_type = 3,
agent_view = 10,
res_death_K = 800,
plotting = TRUE
);
These four simulations A-D, which had identical population models but different observation modes, produced the four graphs below.
Overall, these simulations have been stable throughout
testing, and I am (finally) merging the dev
branch to
master
, pushing to GitHub, and declaring this
v0.0.5
.
A couple of updates have been made, or need to be fixed. I’ll do these tomorrow, as they probably won’t require much more than a few hours in the morning.
I’ve created a new temporary branch, fix_home_bug
, after
noticing a crash from my home laptop. It seems that I hadn’t initialised
the added
variable at zero in the res_add
function of the resources.c
file. At the office computer,
it seemed to initialise it at zero automatically (or I’d not played with
the right parameters to get it to crash), but at home, it was often
getting initialised to very high values and crashing. I’ve fixed the
issue on the new branch, but it needs to be merged.
NEW UNRESOLVED ISSUE #8: Clear up method sampling type in
observation model The method
sampling for
case 0
is too confusing. Sometimes it means randomly
sampled fix_mark
individuals from the population, and
sometimes it means sample within a particular range of view. Change this
so that the switch functions have four clear cases, one of which samples fix_mark times randomly on the landscape. This will avoid a lot of hassle, even if the code for cases 0 and 3 ends up looking the same, or very similar. It’s just very confusing to manage as it is now.
ISSUE #6 STILL NEEDS RESOLVING I was working on this when I found the bug resolved on the new branch. It shouldn’t take too long, and it should be an easy fix while I take care of issue 8.
TIME ISSUES: While the simulations run quickly in the office computer, 100 time steps now take about 8 seconds for the loop on my Lenovo Thinkpad X201 – something to be aware of as the coding continues.
FOR TOMORROW: Make a summary that includes an example of all 4 types of observation models and their appropriate analyses (quickly fix the plotting to do the correct analyses automatically):
case 0
View-based sampling in which the density is
sampled and applied to the whole size of the landscape (as in Nuno et al. 2013) case 1
Mark-recapture sampling where there is some fixed number marked at each
time and estimates show Chapman style analysis case 2
Sampling along a linear transect as resources move, and
case 3
Sampling using blocks as resources move.
Some updated code is on the fix_home_bug
branch, which
can be merged into the dev
branch once it’s done and is
stable after some testing in the office (i.e., try to crash it).
Below is a bit of additional coding, which resulted in two new ways (really just one flexible way) that observation can occur. There are a couple of trivial fixes and additions to make (see new issues 6 and 7), but these should be easy to implement. For now, it’s time to take a step back and plan a bit more generally, especially with respect to implementing the game-theoretic component of the modelling.
RESOLVED ISSUE #5: Sweep observation This issue has
now been resolved. There are now two additional ways to observe
populations, as guided by the method
variable used in the
main switch
of the observational model. In biological
terms, the observational model allows us to sample in the following two
ways:
By sampling view
rows at a time, starting from the
top of the landscape and working down to the bottom. Each time a new row
is sampled, resources on the landscape can move (resource movement can
also be turned off if desired). Hence, it is possible for observers to
miss or double count resources. The bigger view
is, the
fewer iterations of sampling are needed to make it all the way across
the landscape, hence fewer total times resources will move over the
course of sampling.
Identical to 1, but instead of sampling a full row and working
down, observers start in the upper left corner of the landscape and
sample around a view
by view
block, and hence
a total of view^2
cells. Sampling proceeds with blocks
across rows until sampling of the very right side of the landscape has
occurred. After sampling all to the end of the right side, observers
move down, sampling another row of view
by
view
blocks just beneath the first. This continues until
the entire landscape has been sampled, and roughly simulates an observer
working their way through the whole landscape over time (time in which
resources might move).
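The block sweep of case 2 can be sketched as a nested loop over view-by-view blocks; the flat count grid and function name are assumptions, not GMSE's actual data structures:

```c
/* Sweep the landscape in view-by-view blocks, left to right and then
 * down, summing the resource counts in every cell covered.  The spot
 * where resources would be allowed to move between blocks is marked;
 * with a static grid the count is exact. */
int sweep_count(int *land, int land_x, int land_y, int view){
    int total = 0;
    for(int by = 0; by < land_y; by += view){        /* rows of blocks  */
        for(int bx = 0; bx < land_x; bx += view){    /* blocks in a row */
            for(int y = by; y < by + view && y < land_y; y++){
                for(int x = bx; x < bx + view && x < land_x; x++){
                    total += land[y * land_x + x];   /* count this cell */
                }
            }
            /* <-- in GMSE, resources may move here, so moving
             *     individuals can be missed or double counted */
        }
    }
    return total;
}
```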
Note: The first case is redundant, and therefore will probably be removed later, but it helped as a scaffold for the more general procedure and takes up little space; for now, I’ll leave it.
Testing on both of the above cases was successful (see the figure
below). In each case, if resources are not allowed to move, then
observers predict resource abundance with 100 percent accuracy (i.e.,
they sweep through the landscape and count all of the stationary
resources). If resources can move, there is a bit of (normally
distributed, it appears, and should be – can look later) error around
the actual abundance. Either of these two methods of observation works
fairly efficiently until view
gets very low (ca 2), in
which case a lot of sampling happens in each generation.
After each sampling, resources moved an average of ca 5 cells away, with a distribution as shown below (the figure below shows the distance that an individual moves in one time step – between successive iterations of observer sampling along a transect).
I did not code sampling using the initially considered method, with
agents physically moved to locations and then looking around. Instead,
resources are just considered counted if they are within the row or
block under consideration. To account for multiple agents sampling,
view
is actually first multiplied by the number of
agents sampling (only 1 for now). This makes sense for case 1,
but for case 2, sampling ability actually increases with the square of
agent number, so this will need to be changed (Adding a new
issue).
INTRODUCE NEW ISSUE #6: Sampling ability with agent number
In case two of the observational model, the length and width of a sampling block will both increase linearly with the number of agents doing the sampling; hence, sampling area increases with the square of the number of observers, which is probably unrealistic. There are two ways to potentially address this, one being a countdown incremented by += (int) agent_array[agent][8]: only allow resources to move when this countdown hits zero, and reset it thereafter. Hence, observers will observe n more blocks if there are n more observers.
INTRODUCE NEW ISSUE #7: DENSITY TYPE SAMPLING
Of course, it will be easy to make this kind of transect sampling
random instead of comprehensive over the landscape. This can be
done by simply randomly choosing the positions of blocks on a landscape some obs_iter number of times. This could allow an estimate of
population size by considering density (i.e., assume that the
number counted in a sampled block reflects the density of the larger
landscape of known size), as was done by Nuno et
al. (2013). This shouldn’t take much time to code and test.
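The extrapolation step of this density-type sampling could look roughly like the following; the function name and the flat array of per-block counts are assumptions made for illustration:

```c
/* Density-type estimate: given counts from obs_iter randomly placed
 * view-by-view blocks, assume the mean density inside the blocks holds
 * across the whole land_x by land_y landscape (cf. Nuno et al. 2013).
 * Random block placement is omitted here. */
double density_estimate(const int *block_counts, int obs_iter,
                        int view, int land_x, int land_y){
    double seen = 0.0;
    for(int i = 0; i < obs_iter; i++){
        seen += block_counts[i];                 /* resources per block   */
    }
    double cells_sampled = (double) obs_iter * view * view;
    double density = seen / cells_sampled;       /* resources per cell    */
    return density * land_x * land_y;            /* scale up to landscape */
}
```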
I’m going to start referring to issues that are introduced and resolved in the gmse GitHub repository by number.
RESOLVED ISSUE #4: Repeat calls of resource within
resource.R Now poorly named given the solution. The result is a
brief update on the addition of a bit of a side function. The function
anecdotal
is now available in the
observation.c
file, and is called from the
anecdotal()
function in the file anecdotal.R
.
All this function does is cause agents of one or all types to count the
number of a particular resource within the agents’ view. It is similar
to the observation
function, but instead of returning an
array of observations of resources (augmented with columns for different
observations periods – see 10 JAN) that is
intended to be used by R separately, the anecdotal
function
adds the number of resources viewed in an agent’s vicinity to a column
in the agent array. The function is therefore meant to add to an agent’s general mood or impression of the quantity of a resource, based on anecdotal evidence for what’s going on around their location.
We can imagine such anecdotal evidence as affecting the opinions and
behaviours of stake-holders.
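A minimal sketch of this counting step, assuming illustrative array layouts and column indices (the real anecdotal function in observation.c may differ):

```c
#include <stdlib.h>

/* Count how many resources fall within an agent's view (square block)
 * and store the count in a column of the agent's row -- the "anecdotal
 * impression" described above.  Column indices are assumptions. */
void anecdotal_count(double **agents, int agent_row, int count_col,
                     double **res, int res_n, int x_col, int y_col,
                     int ax, int ay, int view){
    int seen = 0;
    for(int i = 0; i < res_n; i++){
        int dx = (int) res[i][x_col] - ax;
        int dy = (int) res[i][y_col] - ay;
        if(abs(dx) <= view && abs(dy) <= view){   /* within view block   */
            seen++;
        }
    }
    agents[agent_row][count_col] = (double) seen; /* write to agent row  */
}
```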
INTRODUCE NEW ISSUE #5: Sweep observation Related to discussions with Jeremy and Tom regarding the Islay geese, need to have a kind of observational model in which agents move to take measurements, but resources move along roughly the same time scale. This can of course be accomplished one way if we:
Add an if(resource_movement == 1) type criterion at the tail end before the break (to avoid unnecessary movement). This will require also including the resource movement function (currently in resource.c) in the observation.c file. May as well just dump the whole thing in, in the interest of modularity, though if it stays the same, it will be tempting to create a utils.c file of some sort. This resource movement option can be applied to the existing method case 0, as appears in the switch function of the observation function.
To do a sweep of the landscape while allowing resources to move, I think we’ll want a completely different method of population size estimation (most upstream switch function). What this method will do is:
Start observers at location x = 0 on the landscape.
Observe resources in locations x to x+view (i.e., observe view rows).
Set x = x+view.
Repeat until x+view is greater than the y dimension land_y, then observe the remaining rows x to land_y.
The procedure above will simulate observations over a time that is
proportional to their view
(and thus ability to census) –
the more time it takes, the more the resources can move and potentially
lead to measurement error. The observational array returned will still
be output in the same way – resources will be marked as with the
case 0
option and read out as an observational array.
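The band arithmetic of the planned sweep can be sketched as follows; counting and resource movement are stubbed out, and the function name is illustrative:

```c
/* Row sweep: starting at x = 0, observe `view` rows at a time, letting
 * resources move between passes, with a possibly narrower final band
 * when x + view overshoots land_y.  Returns the number of passes, which
 * is also the number of movement opportunities during a census. */
int sweep_bands(int land_y, int view){
    int bands = 0;
    int x = 0;
    while(x < land_y){
        int upper = x + view;
        if(upper > land_y){
            upper = land_y;      /* final band: rows x to land_y only */
        }
        /* observe rows [x, upper); resources would then move here */
        bands++;
        x = upper;
    }
    return bands;
}
```

The bigger `view` is, the fewer bands (and hence the fewer movement opportunities), matching the measurement-error argument above.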
Note: It would be nice to eventually allow for
blocks rather than long linear transects to be sampled,
as square blocks might more realistically correspond to the kind of
sampling that would be done by a real observer. I don’t think that this
would make too much difference in terms of finding sampling error, as
there is no bias to resource movement in one direction; hence, the
turnover of resources for any particular number of cells will be the
same for any N
cells sampled. It also stands to reason that
this error should be normally distributed as the number of sampling
attempts becomes large, and the error should be mean centred around the
actual population size, since the probability of missing and double
counting would seem to cancel out exactly. This might eventually lead to analytical estimates of observation error actually being reasonable under some conditions.
Plan for the near future
I will try to implement this new idea tomorrow, as I don’t think it
will take much more than a day’s work, if that. Then, it’s really time
to take a step back and think – need to read Nuno
et al. (2013) in more detail first, perhaps tonight, and
potentially also add the observation model procedure used therein as
different implementations of the observation method case – this should be very similar to the solution for ISSUE #5, except through the use of random
sampling of area and density measuring of resources. We’ll then be in a
position of having a stable resource and observation model with a few
different options for observation, and I’ll need to think more carefully
about the big picture, and how to proceed with the rest of the
model.
We now have a working G-MSE v0.0.4
, which includes a
stable population model and a stable observation model. The figure below
shows the visual output of the new version, with the landscape in the
top panel (note: different tan colours don’t mean anything yet – the
landscape is effectively uniform); resources (i.e., individuals in
the population) are represented in black. In the bottom panel, the solid
black line shows the actual change in (adult) population size over time,
stabilising around a carrying capacity of 400 (red dotted line). The
dark solid cyan line shows an estimate of the population size from the
observation model, simulated through mark-recapture (other types of
observation are available, see below). The shading around this line
shows 95% confidence interval
estimates. More details about this specific estimate below.
I’ve made a few minor updates to the population model code, and
included one new type of movement that is allowed – borrowed from
individual-based modelling literature on plant-pollinator-exploiter
interactions (Bronstein et al. 2003; Duthie and
Falcy 2013). This type of movement makes use of an individual’s
movement parameter move
by having an individual move
Poisson(move)
times each time step, and with each movement
travelling up to move
cells away (Euclidean distance). This
type of movement is case 0:
in the mover
function in resource.c
.
This update includes the major addition of the
observation.c
file, called by observation.R
to
simulate the sampling of resources (i.e., individuals) from the
population model. The file observation.R
holds the
observation()
function, which returns a data frame
of observed resources. The observation function thereby simulates the
process of acquiring observational data, but not analysing those data. Analysis of these data is left to R, or to a (not yet written) C function (note, current analyses are fairly simple).
The function observation.R
requires the
following three data frames:
resources: holds all of the resources simulated.
landscape: holds the landscape on which resources and agents are located.
agent: holds all of the agents simulated (this also includes at least one manager of type 0 – even if the manager does not eventually participate in games).
The observation.R function also requires the paras vector, which holds all parameters that might be important throughout the simulation.
Optional inputs include:
type, which specifies the type of resource being observed (default = 1).
fix_mark, which either sets a fixed number of resources to be sampled during an observation (positive integer value) or sets an observer to ‘‘observe’’ all resources in its view (0 or FALSE).
times, which sets how many times an observer will make observations during a time step (must be > 0).
samp_age, which defines the minimum age at which resources are sampled (the default is set to 1, meaning that resources just added are not sampled – could conceptualise this as sampling only adults; for now, it also makes the initial testing easier because carrying capacity has not yet been applied to juveniles before observation – can change this, of course).
agent_type, which identifies which agents are doing the observing. The default value is 0, which identifies the managers in the model. For most purposes, we will only need to have managers doing the observing, but there is definitely some utility in allowing other agents to do their own observing; more on this below.
model, which currently has to be “IBM”. Eventually it might be nice to allow observation.R to shunt observations to something not individual-based, such as Nilsen’s model, or another analytical equivalent, but not yet.
The file observation.R
calls the function
observation
in the file observation.c. This C file follows the following general protocol:
The function observation is called, which does the following:
Reads in the key inputs (as passed from the observation.R function).
Calls mark_res a total of times times – each time simulating a unique trip to do field work. mark_res is a general function for marking individuals. Other functions can eventually be called instead of, or in addition to, mark_res, but the function is already very flexible, so it’s hard to imagine what other function might be needed – mark_res is currently the default and only function called. Details on the function are below.
Builds the observational array obs_array. This array includes a row for every resource observed and all of the columns that also exist in the resource array (e.g., identifying resource location, identity number, types, life-history parameter values, etc.). Additionally, the observational array also includes a column for each times – the number of times that observations are made. These columns hold values of 0 or 1, which indicate whether (1) or not (0) a resource was observed during a particular observation (can think of times as outings in the field, each producing a column of whether a resource was spotted/marked/recaptured or not).
Converts obs_array into a format that can be returned to R.
The function mark_res is called by observation, and does the following (with inputs passed from observation):
field_work causes the agent to go out and do some observational field work.
a_mover causes the agent to move according to some specified rules, as stored in the parameter vector and agent array. The default is simple uniform movement some Euclidean distance away after doing field work – setting up for field work in a different location. The code is almost identical to the code that moves resources in resource.c, so I’ll not explain this here.
The function field_work simulates the process of an agent looking for and tagging resources in some way (this can later be interpreted as viewing, tagging, marking, recapturing, etc.). There are currently two different tagging procedures possible (with the option to build more):
Marking all resources within an agent’s view via the binos function (simulating, e.g., binoculars).
Marking a fixed number fix_mark of resources on the landscape (note: which resources is not a function of space).
After the observation function is run, we thereby have an observational data frame in which rows are individual resources, and
columns include traits of those resources (same as in the resource data
frame) and whether or not the resource was observed during a particular
simulated outing. Through a combination of specifications for
times
and fix_mark
options, the observational data
frame can then be interpreted in multiple ways and used in a simulated
analysis:
There are multiple ways to interpret the observation results. Examples of this are as follows (for now, I’m assuming that there is one observer, but we can substitute the below with any number of observers):
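One standard analysis of such mark-recapture data is the Chapman estimate; the notebook's own chapman_est R function is not shown here, so this C version is only an assumed equivalent of the textbook two-sample formula:

```c
/* Chapman estimate of population size from two sampling occasions:
 * n1 individuals marked in the first sample, n2 caught in the second,
 * of which m2 were recaptures.  N_hat = (n1+1)(n2+1)/(m2+1) - 1. */
double chapman_est(int n1, int n2, int m2){
    return ((double)(n1 + 1) * (double)(n2 + 1)) / (double)(m2 + 1) - 1.0;
}
```

For example, marking 30, catching 30, and recapturing 5 gives an estimate of roughly 159 individuals.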
Details of the technique used to produce the above figure include the following: gmse.R figures out what the estimate of the population size would be for each time step. The analysis uses a very simple chapman_est function that I wrote in R. This function, or something like it, might be later incorporated as part of the observation model itself (likely by having observation.R call a different c file or R function), or in the manager model, or somewhere in between. I haven’t decided.
For now, it’s time to take another step back and take stock of what
needs to be done next. A manager model and user model will need to start
looking at multiple resources for making decisions, and somehow both
potentially feed into a game-theoretic model. The complexity involved
with the integration of management, games, and user actions should be a
bit mitigated by all of these eventual functions revolving mostly around
the agent array, with some input from the observation array. Of course,
at least one type of agent will need access to the observational data as
input (perhaps only to ignore it, sometimes), and users will need access
to the resource array for off-take and other things. Some careful
planning is needed for what happens next. I am particularly becoming
aware that the flexibility of this model, while definitely a good thing,
has the potential to tempt me into creating a lot of end user options
that no one will actually want. It might be a good idea to
develop a list at some point separating key options that we definitely
want to be visible to all end users from more obscure options that are
available to us by editing the central gmse.R script. It’s also
likely that a model of this scope will require a well written R function
that translates different combinations of user-friendly inputs into an R
list, which can then be interpreted by the script that calls
resource.R
, observation.R
,
manager.R
, game.R
, and user.R
,
and which places inputs into the vector para
appropriately.
It’s worth noting that the
flexibility of the observation
function might be used to
address social questions that interest us. I’ve been mainly
conceptualising the observation model as something done by a
disinterested third party – a manager rather than a stake-holder per se.
The manager would make some decision that then affected payoffs in a
game among stake-holders. We can do this of course, but we can also
allow the stake-holders themselves to observe, perhaps less thoroughly
and with more potential for bias (as we assume that they have less time
and expertise). For example, we might imagine some stake-holders to
estimate population size or change over time for themselves by observing
all of the resources within a short distance around their location –
perhaps (incorrectly) biased by large population changes (e.g., way more
geese around my location this year than last – estimate a lot of total
geese this year overall). These observations could feed into the game
and user models.
Also – and this might require some tweaking – the flexibility of the
type columns (type1, type2, type3) means that observing can be flexible
too. We could allow each individual to observe, or groups of individuals
of the same type to observe. NEW: We can also
specify the type of individuals doing the observing by any category,
including individual ID. This means that we can tell specific agents (assuming they are represented by rows) to observe, or loop
through the function with specific agents. The agent’s type (or ID) is
stored in the observation output, indicating which agent did the
observing if data frames get amalgamated from looping the
observation
function.
As a quick update, I now have a working population model for G-MSE, and have reached the point where it will probably be better for me to take a step back and plan a bit, then work on other aspects of the full model rather than add more bells and whistles to the population sub-component. The development that I have done includes five files (happy to send these for the curious):
gmse.R – A master file that I’m currently using to call everything else
landscape.R – A file that constructs an \(m \times n\) landscape (in the code, this is a simple 2D array, the elements of which can contain any real number). Currently, there is an option to make this landscape any size and randomly place any number of ‘resources’ onto it, if desired. In the past, I have used some code to produce autocorrelation of values on the landscape; if it suits us, I can rewrite this code (to improve the readability) for application to G-MSE. I also think it would be useful to have the option of reading in an image (i.e., a map) and converting it to an array to be used as the landscape (e.g., JPG, BMP, etc.) – I suspect some stakeholders might find this especially useful, as it might help them see the applicability more clearly. Also, I’ve left hooks in the R file to allow eventual development of a non-spatial model.
initialise.R – A file that generates a single ‘RESOURCE’ array, which will hold everything that might be of value to stakeholders; this includes, most obviously, individuals in populations of conservation interest, but can also be used to represent things like hunting licenses or crop plots. The idea is to have a data structure that provides maximum flexibility – individuals can be represented as rows (or sets of rows) within the array, and their types and attributes can be indexed by column:
## IDs type_1 type_2 x_loc y_loc move time remov_pr growth offspr age
## res_1 1 2 0 1 3 2 0 0.1 1.1 0 0
## res_2 2 2 0 15 12 2 0 0.1 1.1 0 0
## res_3 3 1 0 1 11 2 0 0.1 1.1 0 0
## res_4 4 1 0 16 13 2 0 0.1 1.1 0 0
## res_5 5 2 0 16 20 2 0 0.1 1.1 0 0
## res_6 6 2 0 3 13 2 0 0.1 1.1 0 0
## res_7 7 1 0 1 9 2 0 0.1 1.1 0 0
## res_8 8 1 0 16 11 2 0 0.1 1.1 0 0
## res_9 9 2 0 9 13 2 0 0.1 1.1 0 0
## res_10 10 2 0 19 19 2 0 0.1 1.1 0 0
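For illustration, the column layout in the table above might be indexed from C like this; the enum names are my own shorthand, as the actual code presumably uses plain numeric indices:

```c
/* Illustrative column indices for the RESOURCE array (see table above).
 * Each resource is a row of doubles; columns hold its ID, types,
 * location, and life-history parameters. */
enum res_col { IDS, TYPE_1, TYPE_2, X_LOC, Y_LOC, MOVE,
               TIME, REMOV_PR, GROWTH, OFFSPR, AGE };

double get_trait(double **resource, int row, enum res_col col){
    return resource[row][col];
}
```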
resource.R – This file has only one real job, and that is to read
in the RESOURCE
array, LANDSCAPE
array,
PARAMETER
vector, and MODEL TYPE
(currently
only individual-based model, “IBM”), and then call the appropriate
resource model. This intermediary R file allows us to be flexible in
re-routing the whole G-MSE to different population models, if need be.
We could even mix and match the extent to which components use simple
equation-based modelling (e.g., as in Nilsen’s
MSE), and which use the more computationally-intense agent-based
simulation (though I really don’t think computation time will be much of
an issue, even with the agent-based model). Currently, all this R file
is doing is calling the C code and the file resource.c – or, more
accurately, it is calling the compiled file resource.so, which allows R
to link to C.
resource.c – This is the file that does all of the heavy lifting
in terms of simulating resources on a landscape; it is written in C to
make the computation run (much) more quickly (probably by two orders of
magnitude). The file includes several C functions, one of which links
them all by running the resource() function, which reads in the
RESOURCE
and LANDSCAPE
arrays, and a
PARAMETER
vector (containing any key parameter values) from
R, and returns a new RESOURCE
array (hence, landscape and
parameter values are unchanged). A rough outline of what this key
function does is as follows:
- Reads and edits all of the key input into a form that C can store and use.
- Calls function add_time, which writes a time step and adds an age to all rows (see table above).
- Calls function mover to move individuals some Euclidean distance according to a parameter (see above) and movement rules (currently: uniform probability of cell distances, Poisson probability of distances). This program also uses a parameter to determine what happens at the edge of the landscape – currently, either nothing happens (i.e., individuals are just ‘out of view’) or the landscape wraps around as a torus (i.e., if you leave on the left side, you come back on the right).
- Calls the functions res_add and res_place to simulate the addition of new resources (e.g., birth of individuals) and place them in a new array, respectively. Currently, old rows (e.g., individuals) directly create new rows according to a growth parameter (see table above), simulating birth, but this can be changed. A carrying capacity can also be applied to addition of new rows. New rows are also identical to their ‘parent’ rows in everything except ID and age, but this can also be changed.
- Calls the function remove to remove some of the old rows from the input array – currently removal of rows occurs with some fixed probability (remov_pr, see table above), or probabilistically based on a set carrying capacity.
- Combines the rows of the original RESOURCE array that were not removed with the newly created resources to make one single array (might want to make this its own function later, for readability).
- Reads and edits all of the key output back into a form that can be recognised by R as a data frame.
Note: There is plenty of room for expanding this population model, and adding components such as immigration and emigration, interaction of resources, more complex movement, spatial heterogeneity of birth and death, sexual reproduction, disturbance, etc. This is just what I consider to be a minimal individual-based model useful for simulating a population. The code appears to be stable, though a bit more error checking would be useful, and some warnings need to be added to the code – also, as of now, it is possible to have divergent growth of population size, maxing out the computer’s memory and causing the program to crash. Some safeguard against this needs to be written in.
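The calling order above can be summarised in a skeleton; the helper signatures are assumptions (the outline's removal function is called remove, renamed here to avoid the C standard library name), with only the sequence taken from the outline:

```c
/* Skeleton of the resource() pipeline described above.  The helpers
 * stand in for GMSE's add_time, mover, res_add/res_place, and remove;
 * their bodies are stubs, and only the calling order is real. */
typedef struct { double **rows; int n; } res_array;

void add_time(res_array *r)  { (void) r; /* stamp time, add age       */ }
void mover(res_array *r)     { (void) r; /* move, apply edge effects  */ }
void res_add(res_array *r)   { (void) r; /* births under capacity     */ }
void res_remove(res_array *r){ (void) r; /* deaths: fixed pr or K     */ }

int resource_step(res_array *r){
    add_time(r);    /* 1. write time step, age all rows                */
    mover(r);       /* 2. movement with edge-effect rules              */
    res_add(r);     /* 3. add offspring rows                           */
    res_remove(r);  /* 4. remove rows by remov_pr or carrying capacity */
    return r->n;    /* survivors + offspring form the returned array   */
}
```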
A small script can help us see the output of what’s going on in the population, both in terms of individual movement and change in population abundance over time. The run time of the below population is negligible – all of the data underlying the 100 time steps shown in the figure below is produced in a tenth of a second (4 JAN Update: Assuming instead a carrying capacity of 40000, closer to the ball-park of the Islay geese, 100 time steps takes 11 seconds). The upper panel of the figure below shows a landscape (light and dark brown – these colours don’t mean anything at the moment, but could represent different landscape properties) with individuals (black) that move around, reproduce, and die in each time step. The lower panel shows the abundance of these individuals as they increase to carrying capacity (red dashed line), whereafter the population size remains stable (of course, simulating a bigger population takes a bit more time – it takes about nine tenths of a second to simulate 100 time steps at a carrying capacity of 4000).
Towards a Game-theoretic Management Strategy Evaluation (G-MSE)
Game-theory modelling (game.c; green box above)
Some side-notes that might be of use
Potentially relevant conferences and workshops
I would like to develop one general, efficient, open-source, and user- and developer-friendly program for G-MSE that would be a general tool for applying game theory and management strategy evaluation to specific problems of conflict among stake-holders. I’m somewhat flexible on the development, but my preference would be to have software that is:
Open-source, with all version-controlled development history being publicly available on GitHub.
Written primarily or entirely in C (for efficiency and portability)
Easily called from R using an R package (see also) and appropriate R functions (as many scientists would likely want to integrate the program with other R packages and their own code or data). Note that this could be tricky for Windows users. See details on the most flexible way to call R from C.
Usable with a browser-based GUI (or perhaps an app, though I’d have to learn how to do this), probably ‘shiny’ on top of R.
Useful for scientists or stake-holders unfamiliar with R, or command line code more generally
Perhaps useful as a teaching tool for students or the general public
Could look similar to this: https://tomhopper.shinyapps.io/TB_Cases_shiny/, the code repository of which is available here: https://github.com/tomhopper/TB_Cases_shiny. Each tab could have a different set of related inputs and outputs, which together could produce a full report in the browser.
Comparable in scope to something like RangeShifter: http://rsdevs.github.io/RSwebsite/ (Bocedi et al. 2014)
MAJOR POINTS: Some major points fleshed out given the thinking below:
Question: The objects (i.e., populations, resources, commodities) will often be represented as discrete entities (individual animals in populations, but also things like licenses sold and crop patches saved or raided – which could have individual locations). Should the stake-holders also be modelled as (potentially multiple) discrete entities? This is easy to see if, e.g., stake-holders are potential hunters that do or do not buy licenses and engage in hunting, but maybe conservationists could also be considered as discrete – each individually affecting the decision of an organisation in a game.
Given the question above: Stake-holders could then also be represented by a data frame, which could generalise the model to allow many individual stake-holders to play a game (or not, if data frame is single row, or scalar). This could then more naturally incorporate mixed strategies (some will take one strategy, some another) and uncertainty. In the case that it is some sort of organisation making a decision, this would allow the individual stake-holders to collectively affect a single action or policy. This would appear to drift more into the realm of agent-based computational economics, which might be a good thing given the goals of ConFooBio. This could allow for maximum flexibility too, if agents could also be discrete individuals making decisions.
Should the model therefore be focused on at least four data frames modelling individuals? At least two modelling individual species or resources of interest (and at least one being a population of conservation interest), and at least two modelling individuals with interests in the former?
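A minimal sketch of what these data frames might look like (column names here are invented for illustration; GMSE's actual columns may differ):

```r
# Hypothetical layout: agents and resources as data frames, one row per
# discrete individual (columns are illustrative, not GMSE's own structure).
stakeholders <- data.frame(id       = 1:4,
                           type     = c("farmer", "farmer", "hunter", "manager"),
                           utility  = c(0, 0, 0, 0),   # Realised payoff so far
                           strategy = c(1, 1, 2, 1));  # Index into a strategy set
resources    <- data.frame(id    = 1:3,
                           x     = c(2, 5, 9),         # Location on the landscape
                           y     = c(7, 1, 4),
                           age   = c(1, 3, 2),
                           alive = c(TRUE, TRUE, TRUE));
# A single organisation collapses to a one-row data frame, and a purely
# numerical model to a 1 x 1 data frame interpreted as a scalar
organisation <- stakeholders[stakeholders$type == "manager", ];
nrow(organisation); # 1
```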
I think that the agent-based model is really going to be the default one to use, with other models being useful only if the end user is really tied to them in some way. In general, to find emergent phenomena and predict dynamics and decisions accurately, I think it will be useful to keep in mind the maxim of keeping situation rules simple while allowing agents to be complex (Volker Grimm said something like this in one of his talks or publications, and given the ConFooBio focus, I think it’s especially applicable).
Before getting into specifics, it will be useful to walk through the G-MSE model conceptually to figure out what kinds of approaches are going to be most useful for the following:
Each of these needs a general framework that will be most usefully applied to real-world problems of conflict. Ideally, these models will be modular – i.e., not depend on the type of modelling being done in other areas of G-MSE. That way, we might, e.g., decide to substitute an entirely different kind of natural resources model (e.g., simple numerical Lotka-Volterra versus spatially explicit individual-based model), but still be able to generate input/output in each component to be used by the next.
Nevertheless, there needs to be some conceptual framework that is consistent, in addition to the five above modules. I’ve written down some of these ideas, deliberately avoiding Nilsen’s MSEtools repository for now. Some potential things that are common to G-MSE:
The model is therefore going to need to generally hold two or more variables or objects that represent populations or resources (including biomass) that can both be affected by any of the sub-components (note: even something like fishing licenses sold can be observed, perhaps with trivial error – we can therefore apply the same process of MSE to both populations and the things with which they are in conflict).
In any case, there will be a need to model how properties of the population change from one time step to the next. Properties of interest for populations might include:
It would seem as though properties for conflicting resources would be more likely to boil down to one number (e.g., crop yield, licenses sold), but maybe not. We could, for example, assign a location to farms and licenses, or units of biomass in some way.
I think an individual-based model that represents individuals and resources with a table is probably the best way to go in most cases. We can perhaps broaden this out so that the observation model will recognise a table (IBM), a vector (classes), or a number (just size), with some indication of the type of data being returned, but most of the time a full table will be the way to go (in fact, we could probably just make everything a data frame, and have \(1 \times 1\) data frames be interpreted as scalar, and \(1 \times n\) data frames be interpreted as a vector). The information about the population will represent all of the relevant information about the natural population being modelled, so it can pass all of this information onto the observation model, which can then run some function to search through it and extract parameters of interest (with error, potentially). Within this model, we’ll want functions to model birth, death, immigration, and emigration.
For scalar or vector inputs, observation error could be more directly simulated – just with a parameter for bias and error (e.g., around population size, or sizes of each age or stage class).
Alternatively, a different, more general way of doing it might be to instead simulate some length of time \(t_{obs}\) for modelling the process of observation. Then each time step could include a probability of observing an individual. This might be even better because I think it would be more generalisable. In the case of the IBM, individuals could be observed following a Poisson process at each time step that:
The benefit here is that a scalar or vector could be modelled in the same way, just by sampling from a Poisson distribution to find observation number at each time step of some number of individuals (potentially of different ages or classes).
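A toy sketch of that Poisson observation process, with made-up rates (not the eventual observe.c code), showing that class-structured counts and scalars are handled identically:

```r
# Toy observation process sketch: over t_obs time steps, the number of
# individuals observed in each step is Poisson, so scalar, vector (class
# counts), and IBM inputs can all be sampled the same way.
set.seed(7);
N        <- c(juvenile = 120, adult = 80); # True class sizes (made-up values)
t_obs    <- 20;                            # Observation time steps
obs_rate <- 0.02;                          # Per-step detection rate (made up)
# Each row holds the observations of each class in one time step
obs <- t(sapply(1:t_obs, function(i) rpois(length(N), lambda = obs_rate * N)));
colnames(obs) <- names(N);
est <- colSums(obs) / (t_obs * obs_rate);  # Naive estimate of class sizes
est
```

A scalar population is just the length-one case of N, so no separate observation code is needed for it.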
The observation model will then output something that affects both the game that agents play and, therefore, the actions of users.
One job of the management model will be to calculate statistics associated with the uncertainty surrounding these observations (e.g., confidence intervals), which will affect management decisions that are simulated.
TODO: Need to figure out how management decisions are going to be implemented. These decisions will feed directly into the game model, and possibly the user model.
This part is especially tricky. Need some common framework to convert the dynamic things (resource, population) into a utility function, then into a payoff matrix (or perhaps something even more general). Questions that need addressing before building the model:
We also want to include uncertainty in the games.
The general structure of the program itself, I think, could fit into Figure 1 of Bunnefeld et al. (2011) (the TREE paper), with a game-theoretic component added into the management and harvester-operating models. Would game theory among agents then be applied to the harvesters who are making decisions? A basic computational model would then proceed as follows:
Master file: gmse.R [also create standalone gmse.c with int main(void)]
initialise.R: code within R to organise key data frames
STAKEHOLDER_1 (stake-holders can be discrete)
STAKEHOLDER_2 (rows = individuals; cols = attributes)
RESOURCE_1 (note: resources can be populations)
RESOURCE_2 (rows = individuals; cols = attributes)
LANDSCAPE (start with an \(m \times m\) matrix)
resources.c: sub-functions affect dynamics of RESOURCE_1, RESOURCE_2, and LANDSCAPE
move(double RESOURCE): move individuals or resources on LANDSCAPE
reproduce(double RESOURCE): new resources added based on some rules
die(double RESOURCE): resources removed based on some rules
immigrate(double RESOURCE): resources added by different rules (later?)
emigrate(double RESOURCE): resources removed by different rules (later?)
interact(double RESOURCE_1, double RESOURCE_2): resources interact on RESOURCE_1, RESOURCE_2, and LANDSCAPE
observe.c: sub-functions affecting simulated data collection
manager.c: sub-functions affecting management decision model
game.c: sub-functions affecting game played based on management decisions
user.c: sub-functions affecting implementation of users given game.c
summary.R: Summarise information and plot (also create C standalone)
Note: The C standalone will also need the file gmse_util.c, for all of the other components (e.g., random number generation) which would normally be done in R. In R, these components can be incorporated with the appropriate R.h and Rmath.h header files.
Note: The RESOURCE_2 array will have to be optional, because in some scenarios, two stake-holders might simply be in conflict over the use of one resource.
Note that Erlend Nilsen has constructed the basic MSE framework in R already, and I've forked his repository on GitHub as a potential starting point. I've also starred a repository for calling C from R, as I think that this will be necessary. I'd like a standalone version of the model in C, but the focus should probably first be on writing the computationally intensive code in C while immediately making it callable from R; cloning and making a C standalone can come later (maybe avoid using too much of Rmath.h so that a C standalone is easier).
This would allow a harvester operating module or function to fit within the broad simulation or program, G-MSE.
The spatial aspect of some of the key case studies (e.g., Nellemann et al. 2000), and the importance of space more broadly in ecological processes, suggests to me that the G-MSE program will need to have a spatial component – landscapes need to be a part of it, perhaps?
Overall, based on the ERC proposal and Bunnefeld et al. (2011), the model will function something like the below (subject to change):
As long as not too many generations are run (e.g., not too much more than 100), I am cautiously optimistic that this program will be able to include an individual-based model of a focal population, and all of the other game-theoretic components, and not take more than a few minutes to run and produce simulated results (obviously less if it is called directly from C, but I'm shooting for calling this from shiny in a browser). For end users, dynamic graph production can make the wait time a bit more interesting, if it's possible. For us, the time it will take to run the model from C, especially if using the cluster, will be trivial.
For the natural resources model, it might be nice to have an option of burning in several time steps before starting the loop (if, e.g., no empirical data are available, and the model instead relies on parameters plugged into a Lotka-Volterra or Ricker model). Or, if data are available, long-term demographic data could be used and assumed to represent the true population dynamics (i.e., just use these data to simulate N individuals) before starting the G-MSE model loop. It is worth thinking about how much population structure we might want to add – my inclination is to make the software as flexible as possible (e.g., allow sex, age, etc., to be attributes of discrete individuals), but this will depend on other aspects of the model.
In the interest of making this model as general as possible, I believe that we'll eventually want to use an extensive-form game to allow for the sequence of moves to affect stake-holder actions. Nevertheless, just to get the basic framework underway, I think we can start out with a normal-form game, with the intent of generalising the model later (the code will be modular enough to allow this). Generalisation should be easy if we have a separate function to keep track of the game tree, and then allow agents to access the game tree (or parts of it, in the case of incomplete information) to make decisions about how to act. An extensive-form game package exists in R, published by Kenkel and Signorino (2014) with code available on GitHub, but the focus of this package is for 'estimating recursive, sequential games, and not simultaneous move games or dynamic games with infinite time horizons'. Since the excluded cases (simultaneous-move and dynamic games) probably describe the kinds of games that ConFooBio is interested in, I think the games package will be a useful reference, but not something to apply directly. It incorporates uncertainty, which could be something useful to return to for further reference.
A couple of other (Java-based) examples of games are available on GitHub, such as GTE, which has a GUI web application and a corresponding published paper (Savani and Stengel 2014). This model leads me to think that it's probably best to give each player two matrices (which would also allow the do.call function to be used; probably easier to deal with in R). Another Java extensive-form games package exists, though it seems less useful for ConFooBio purposes.
Some notation to try out: For the purpose of the below, to keep things simple, I’m going to just start with payoff matrices, and assume that history of interactions is not yet used in decisions.
To further simplify, I am going to assume that there are only two players. The general payoff matrices can be represented as below (loosely following the notation of Débarre et al. (2014)):
\[ {\bf A^{1}} = \left( \begin{array}{cc} U^{1}_{a} & U^{1}_{b} \\ U^{1}_{c} & U^{1}_{d} \end{array} \right), {\bf A^{2}} = \left( \begin{array}{cc} U^{2}_{a} & U^{2}_{b} \\ U^{2}_{c} & U^{2}_{d} \end{array} \right). \]
In the above \(a\), \(b\), \(c\), and \(d\) are all different possible outcomes that depend upon the decisions of players 1 and 2. We can think about these in terms of the actions \(X^{1}_{i}\) and \(X^{2}_{i}\), and put these into the familiar payoff table below,
| Player 1   | Player 2: Strategy 3       | Player 2: Strategy 4       |
|------------|----------------------------|----------------------------|
| Strategy 1 | \(a \to \{U^{1}, U^{2}\}\) | \(b \to \{U^{1}, U^{2}\}\) |
| Strategy 2 | \(c \to \{U^{1}, U^{2}\}\) | \(d \to \{U^{1}, U^{2}\}\) |
For doing the maths though, individual matrices will be used. Note that to keep things general, the above strategies are unique to each player. I think that this will be relevant to ConFooBio because each actor will have a unique role. Hence, a vector \(I\) can represent all possible options for action, with players (normally) only having access to a subset \(i \in I\), though we might conceive of some players being able to do the same thing despite having different roles.
Making payoff matrices a list with \(M\) elements of vectors is probably the best way to go in R, with \(M = 2\) players for most of what we'll do. Each player \(m\) will have its own options for acting within the list element M[[m]].
M <- 2; # Number of players in the game
S <- list(); # Strategy vectors (elements all possible strategies)
A <- list(); # Payoff vectors (elements all possible strategy combinations)
For now, let's just assume that each player has two possible strategies, and we'll just use the traditional matrix to calculate Nash equilibria; for future reference, Avis et al. (2009) might be useful for quick calculation of Nash equilibria for two-player games. Continuing with the above, here's a basic setup for the Prisoner's dilemma:
S[[1]] <- c("C","D"); # Cooperate or defect strategies (change to numeric?);
S[[2]] <- c("C","D");
A[[1]] <- c(3,0,5,1); # Payoffs for player 1
A[[2]] <- c(3,5,0,1); # Payoffs for player 2
A1 <- matrix(data=A[[1]], nrow=length(S[[1]]), byrow=FALSE);
A2 <- matrix(data=A[[2]], nrow=length(S[[2]]), byrow=FALSE);
print(A1); # Note the traditional Prisoner's dilemma payoff structure
## [,1] [,2]
## [1,] 3 5
## [2,] 0 1
print(A2);
## [,1] [,2]
## [1,] 3 0
## [2,] 5 1
Now check to see if the best possible response for each player is the same regardless of its opponent’s strategy.
best1 <- apply(A1,1,which.max); # Best strategies for Player 1
best2 <- apply(A2,2,which.max); # Best strategies for Player 2
tabl1 <- tabulate(best1); # Frequency of bests
tabl2 <- tabulate(best2);
str1 <- tabl1 / sum(tabl1); # Frequency of each strategy
str2 <- tabl2 / sum(tabl2);
summ1 <- matrix(data=str1,nrow=1); # Summary vector of strategies
summ2 <- matrix(data=str2,nrow=1);
colnames(summ1) <- S[[1]];
colnames(summ2) <- S[[2]];
rownames(summ1) <- "Proportion";
rownames(summ2) <- "Proportion";
print(summ1); print(summ2);
## C D
## Proportion 0 1
## C D
## Proportion 0 1
One goal will be to develop a function that can return optimal strategies for each player, including mixed strategies, for any given \(2 \times 2\) payoff matrix. The function below does not do this; it needs to be fixed. A starting point for looking at appropriate algorithms is Avis et al. (2009), who come up with an efficient solution.
Before investing too much time in this, let's make sure that finding equilibrium solutions makes sense in the context of games with uncertainty. We might need a different approach, e.g., if the payoffs themselves are uncertain and the optimal strategies reflect this uncertainty.
One package in R can solve Nash equilibria, though its documentation is not excellent. There's also a repository that can do it in C, but that might take more time than it is worth – the paper underlying it is Miltersen and Sørensen (2009). A benefit here is that it uses extensive-form games and computes quasi-perfect equilibria, which specifically assume that a player's opponent is not perfect and account for past mistakes.
## XXX FIXIT: There is an error in calculating what each should play -- it is tabulating the frequency of best plays, but when mixed strategies occur, it returns a 1/2, 1/2 instead of the proportion based on the value.
solve.nash <- function(){ #Function to be made to solve Nash equilibrium
return(NULL);
}
game <- function(payoff1, payoff2){
if(length(payoff1) != length(payoff2)){
print("WARNING: Payoff vectors must be the same length");
return(NULL);
}
    if(min(payoff1) < 0){
        payoff1 <- payoff1 - min(payoff1); # Subtract to shift payoffs non-negative
    }
    if(min(payoff2) < 0){
        payoff2 <- payoff2 - min(payoff2);
    }
if(is.matrix(payoff1)==FALSE){
payoff1 <- matrix(data=payoff1, nrow=2, byrow=TRUE);
}
if(is.matrix(payoff2)==FALSE){
payoff2 <- matrix(data=payoff2, nrow=2, byrow=TRUE);
}
S <- list();
S[[1]] <- c("Strategy_1","Strategy_2");
S[[2]] <- c("Strategy_3","Strategy_4");
best1 <- apply(payoff1,1,which.max); # Best strategies for Player 1
best2 <- apply(payoff2,2,which.max); # Best strategies for Player 2
tabl1 <- tabulate(best1); # Frequency of bests
tabl2 <- tabulate(best2);
expe1 <- apply(payoff1,2,sum) * tabl1;
expe2 <- apply(payoff2,1,sum) * tabl2;
str1 <- expe1 / sum(expe1); # Frequency of each strategy
str2 <- expe2 / sum(expe2);
summ1 <- matrix(data=str1,nrow=1); # Summary vector of strategies
summ2 <- matrix(data=str2,nrow=1);
colnames(summ1) <- S[[1]];
colnames(summ2) <- S[[2]];
rownames(summ1) <- "Proportion";
rownames(summ2) <- "Proportion";
strategy_pr <- list(player1=summ1,player2=summ2);
return(strategy_pr);
}
We can now use the function above to figure out and return strategies for any given payoff vectors from \(a\), \(b\), \(c\), and \(d\) for each player (1 and 2).
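For the \(2 \times 2\) case specifically, the interior mixed equilibrium can be obtained directly from the indifference conditions, which might be a starting point for fixing the tabulation issue flagged above. The sketch below uses my own convention (rows of both payoff matrices index player 1's strategies, columns index player 2's) and is not GMSE code:

```r
# Hypothetical 2x2 solver sketch (not GMSE code): support enumeration finds
# pure-strategy Nash equilibria; the indifference conditions give the
# interior mixed equilibrium when it exists.
solve_2x2 <- function(A1, A2){
    pure <- list();
    for(i in 1:2){
        for(j in 1:2){
            # (i, j) is a pure equilibrium if neither player gains by deviating
            if(A1[i, j] >= A1[3 - i, j] & A2[i, j] >= A2[i, 3 - j]){
                pure[[length(pure) + 1]] <- c(row = i, col = j);
            }
        }
    }
    # Mixed equilibrium: player 1 plays row 1 with probability p that makes
    # player 2 indifferent between its columns; player 2 plays column 1 with
    # probability q that makes player 1 indifferent between its rows.
    dp    <- A2[1, 1] - A2[1, 2] - A2[2, 1] + A2[2, 2];
    dq    <- A1[1, 1] - A1[2, 1] - A1[1, 2] + A1[2, 2];
    mixed <- NULL;
    if(dp != 0 & dq != 0){
        p <- (A2[2, 2] - A2[2, 1]) / dp;
        q <- (A1[2, 2] - A1[1, 2]) / dq;
        if(p > 0 & p < 1 & q > 0 & q < 1){
            mixed <- c(p_row1 = p, q_col1 = q);
        }
    }
    return(list(pure = pure, mixed = mixed));
}

# Matching pennies has no pure equilibrium; the mixed one is p = q = 1/2
P1 <- matrix(c(1, -1, -1, 1), nrow = 2);
P2 <- -P1;
solve_2x2(P1, P2);
```

For degenerate games (dp or dq equal to zero) there can be a continuum of equilibria, which this sketch simply skips; Avis et al. (2009) covers the general case.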
u <- shinyUI(pageWithSidebar(
headerPanel(""),
sidebarPanel(
textInput('vec1', 'Player 1: a, b, c, d', "3, 5, 0, 1"),
textInput('vec2', 'Player 2: a, b, c, d', "3, 0, 5, 1")
),
mainPanel(
h4('Proportion strategy is optimally played: (DOES NOT WORK YET)'),
verbatimTextOutput("oid1")
)
))
s <- shinyServer(function(input, output) {
output$oid1<-renderPrint({
p1 <- as.numeric(unlist(strsplit(input$vec1,",")))
p2 <- as.numeric(unlist(strsplit(input$vec2,",")))
pay <- game(payoff1=p1, payoff2=p2)
o1 <- as.numeric(pay$player1)
o2 <- as.numeric(pay$player2)
cat("Player 1 (Strategy 1, 2):\n")
print(o1)
cat("\n\n")
cat("Player 2 (Strategy 3, 4):\n")
print(o2)
}
)
}
)
#shinyApp(ui = u, server = s)
How do we quantify costs and benefits in situations in which there is conflict between conservation and food security? Game theoretic models rely on numeric values being maximised by individual agents, with games promoting cooperation or conflict depending on equilibrium solutions when each agent maximises its value. But for conservation and food security, the values do not seem to be straightforwardly assigned – how do we compare something like extinction risk against food production (or, e.g., tourism income)? It seems that we need to either figure out how to play games in which payoffs are in different, difficult-to-compare currencies, or figure out how to standardise disparate payoff types into a common currency to model games.
Note that there is a whole literature surrounding utility and utility functions, most of which appears to be based in economics. This is probably the best thing to tap into, although the question of what kind of utility functions to use (e.g., ordinal, continuous, etc.) is still something that will need to be worked out.
Could figure out some sort of way to rank order or bin preferences for each agent (added note: this might link up with Jeremy's idea of attitude in some way?). This might also help with dealing with uncertainty, because the uncertainty of outcomes could be expressed as the likelihood or probability of hitting a rank or getting into a bin. Successful cooperation could then be defined by increasing, or perhaps maximising, the ranks or bins of each agent. I actually played around with an idea like this in philosophy (ethics theory), in which 'maximise well-being' is sometimes considered a fundamental concept, but one that is hard to pin down (i.e., could have links to environmental ethics). As a bonus, the ranks or bins could be easier for real-world agents to understand.
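As a toy illustration of how uncertainty could be expressed as the probability of hitting a bin (all numbers and bin labels made up):

```r
# Toy sketch: an agent's uncertain outcome (e.g., yield) mapped to ordinal
# bins, so uncertainty becomes a probability of landing in each rank.
set.seed(1);
draws <- rnorm(10000, mean = 55, sd = 10);         # Uncertain outcome (made up)
bins  <- cut(draws, breaks = c(-Inf, 40, 60, Inf), # Bin boundaries (made up)
             labels = c("poor", "acceptable", "good"));
round(prop.table(table(bins)), 3); # Pr(landing in each bin)
```

An agent comparing two strategies could then compare the full bin distributions rather than a single expected payoff.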
In any case, a game-theoretic model will need some sort of numbers to work with (even if they are just ordinal preferences), so I think this will be a key question early on.
Should we have tables, such as the hypothetical one below? This is the typical way that games are modelled, but it assumes that different agents are playing the same game. If there are conflicts among more than two types of agents (i.e., agents with three or more unique interests), then fitting games into two-by-two boxes could be difficult (this was mentioned in the project proposal).
| Agent 1    | Agent 2: Strategy 1 | Agent 2: Strategy 2 |
|------------|---------------------|---------------------|
| Strategy 1 | A1 pay, A2 pay      | A1 pay, A2 pay      |
| Strategy 2 | A1 pay, A2 pay      | A1 pay, A2 pay      |
I also think it is important to recognise early on that these games are unlikely to be symmetrical – the payoffs are unlikely to form the kinds of simple prisoner's dilemmas that lead to both agents having the same effective strategy (see also Colyvan et al. 2011).
Note, I don’t think that this means that simple Nash equilibria are impossible to find – the solutions might just look a bit odd, depending on the payoff values in the matrices.
We should not, however, overlook the possibility of solutions that are optimally cooperative when played iteratively but not cooperative when played once. The Prisoner's dilemma is the classic example; see the Axelrod experiments and the work on reciprocal altruism by Wilkinson (1990), Carter and Wilkinson (2013), Carter and Wilkinson (2015), and Trivers (1985). Dawkins (1976) also had a chapter on this, I think.
Given the above, we should also, perhaps, consider that payoffs might change over time (e.g., one year to the next) with changing environmental conditions (defined very loosely as anything outside of the agent's control that structures the payoff matrix), and that agents might capitalise on this stochasticity to maximise net gains. Further, payoffs might change in a non-linear way such that one way of maximising them is to let one agent 'win' in one year and another agent 'win' in the next year. This could benefit all if the payout in a given year has a huge benefit for the 'winner', but not an abnormally large loss for the 'loser' (probably should use different terminology than 'winner' and 'loser'); in subsequent years, the other agent might find themselves in a situation where they have an abnormally high amount to gain from 'winning' while the first agent does not have an unusually bad year by 'losing'. Note that, I think, this implies that the changing payoff structure of a game over time might be dynamic in a way that is not purely a zero-sum situation; i.e., gains are non-additive (in the previous example, 'sub-additive') over time. Non-additivity could work the opposite way too – it might be that when it is an unusually good time to 'win', it is an even worse time for the other agent to 'lose'. I'd need to flesh out this idea more; it has conceptual connections to the community ecology (species interactions) literature.
As a concrete example of the above: maybe the conditions are particularly good for hen harrier conservation in the current year (i.e., a population is poised to grow especially well, or rebound in some critical way) – so good that maximising gains now would well compensate for the expected losses if grouse hunters enforced control in the subsequent few years. Perhaps banking these conservation gains would be the best solution, if at a later date the conditions would be such as to cause grouse hunters to benefit disproportionately from targeted control at a time in which the losses of control to conservation would not be especially severe. The net result of all this could be that each agent benefits by maximising its gains when times are tough at the cost of suffering higher losses when times are good. Again, this depends on variation in the payoff structure over time, and on the payoffs varying in such a way as to cause sub-additive growth in gains. It also might require more certainty about gains than is reasonable.
Need to think about uncertainty more.
The following recreates Nilsen’s MSE modelling work.
The manager model receives the single estimate of population size (density or abundance), then returns a total allowable catch. A second function models hunter frustration, and is meant to be run after the first function. The second function checks to see if hunter frustration is within a set of bounds; if it is, then the function returns the original total allowable catch. If it is not, then the function adjusts the total allowable catch.
The user model (called the implementation model) includes four separate functions, including a very simple one, which just samples from a random binomial or Poisson distribution around the total allowable catch.
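The manager-then-user hand-off can be sketched as follows; these are stand-ins for Nilsen's HarvDec1 and Impl1 with invented names, parameter values, and frustration logic, not his actual code:

```r
# Toy sketch of the decision and implementation steps (stand-ins for
# Nilsen's HarvDec1/Impl1; names and adjustment rules are illustrative only).
set.seed(3);
decide_tac <- function(est, qu = 0.2, frustration = 0, bounds = c(-1, 1)){
    tac <- qu * est;                              # Proportional total allowable catch
    if(frustration < bounds[1]) tac <- tac * 1.2; # Hunters frustrated: raise TAC
    if(frustration > bounds[2]) tac <- tac * 0.8; # Managers frustrated: lower TAC
    return(tac);
}
implement <- function(tac, p = 0.7){              # Users realise the quota imperfectly
    rbinom(n = 1, size = floor(tac), prob = p);
}
tac  <- decide_tac(est = 100, frustration = 0);   # Within bounds, so TAC = 20
harv <- implement(tac);                           # Realised harvest <= 20
c(TAC = tac, harvest = harv)
```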
Hence, we can put four of these functions together to simulate a very simple MSE model:
pop_abund <- 100;
harvest <- 20;
growth_rate <- 1;
K <- 200;
pr_harvest <- 0.7;
time <- 1;
time_end <- 30;
track <- matrix(data=0, nrow=time_end, ncol=5);
while(time <= time_end){
pop_vars <- PopMod1(X_t0=pop_abund, sigma2_e=0.2, N_Harv=harvest, K=K,
r_max = growth_rate);
pop_abund <- as.numeric(pop_vars[4]);
obs_vars <- obs_mod1(scale="Abund", value=pop_abund, bias=1, cv=0.4);
if(obs_vars < 0){ # Nilsen's model allows estimate to be negative
obs_vars <- 0; # Make it so that negative equates to est. of extinction
}
har_vars <- HarvDec1(HD_type="A", qu=0.2, PopState_est=obs_vars);
imp_vars <- Impl1(TAC=floor(har_vars), ModType="B", p=pr_harvest);
track[time,] <- c(time, pop_abund, obs_vars, har_vars, imp_vars);
time <- time + 1;
}
colnames(track) <- c("time", "Pop. Size", "Pop. Est.", "Harv. Rate", "Harv.");
Running the above code, we can look at how key population and management quantities change over time:
The below figure shows all of these quantities over time.
We can re-run the code at any point and essentially recreate a run of Nilsen’s MSE model. The hard work is now to come up with a G-MSE, which will allow for much more individual complexity through an agent-based approach.
The function do.call in R calls a function with its arguments supplied as a list: do.call("f", args) calls the function f with the elements of the list args as its arguments (e.g., do.call("sum", list(1, 2, 3)) returns 6; to apply f separately to every element of a list, lapply is the tool instead). This is a base R function.
Scottish Ecology, Environment, and Conservation Conference ("The conference aims to bring together researchers in ecology, conservation, and environmental sciences across Scotland" – "The conference is primarily for PhD, Masters and advanced undergraduate students"). University of Aberdeen: 3-4 APR 2017. 6 FEB abstract submission deadline.
Modelling Biological Evolution 2017: Developing Novel Approaches (topics include: Evolutionary Game Theory and Solving Social Dilemmas). http://www.math.le.ac.uk/people/ag153/homepage/MBE_2017/MBE_2017_1.htm University of Leicester: 4-5 APR 2017. 1 FEB 2017 registration and abstract submission deadline.
Workshop on behavioural game theory (topic is Psychological Game Theory). https://www.uea.ac.uk/economics/news-and-events/workshop-on-behavioural-game-theory-2017 University of East Anglia (Norwich): 5-6 JUL 2017. 28 FEB 2017 submission deadline (no workshop fee).
Game theory and management (topics include: game theory and management applications, cooperative games and applications, dynamic games and applications, stochastic games and applications). http://gsom.spbu.ru/en/gsom/research/conferences/gtm/ Saint Petersburg University: 28-30 JUN 2017.
6th workshop on stochastic methods in game theory ("Many decision problems involve elements of uncertainty and of strategy. Most often the two elements cannot be easily disentangled. The aim of this workshop is to examine several aspects of the interaction between strategy and stochastics. Various game theoretic models will be presented, where stochastic elements are particularly relevant either in the formulation of the model itself or in the computation of its solutions." Example topics include: large games and stochastic and dynamic games). https://sites.google.com/site/ericegametheory2017/home Sicily, Italy: 5-13 MAY 2017.
13th European Meeting on Game Theory (SING13) (topics include: cooperative games and their applications, dynamic games, stochastic games, learning and experimentation in games, computational game theory, game theory applications in fields such as management). http://www.lamsade.dauphine.fr/sing13/ Paris, France: 5-7 JUL 2017. 28 FEB abstract submission deadline.
Adami, C., Schossau, J., & Hintze, A. (2016). Evolutionary game theory using agent-based methods. Physics of Life Reviews, 19, 1–26. https://doi.org/10.1016/j.plrev.2016.08.015
An, L. (2012). Modeling human decisions in coupled human and natural systems: Review of agent-based models. Ecological Modelling, 229, 25–36. https://doi.org/10.1016/j.ecolmodel.2011.07.010
Ascough, J. C., Maier, H. R., Ravalico, J. K., & Strudley, M. W. (2008). Future research challenges for incorporation of uncertainty in environmental and ecological decision-making. Ecological Modelling, 219(3–4), 383–399. https://doi.org/10.1016/j.ecolmodel.2008.07.015
Bautista, C., Naves, J., Revilla, E., Fernández, N., Albrecht, J., Scharf, A. K., … Selva, N. (2016). Patterns and correlates of claims for brown bear damage on a continental scale. Journal of Applied Ecology. https://doi.org/10.1111/1365-2664.12708
Bennett, E. M. (2017). Changing the agriculture and environment conversation. Nature Ecology and Evolution, 1(January), 1–2. https://doi.org/10.1038/s41559-016-0018
Bischof, R., Nilsen, E. B., Brøseth, H., Männil, P., Ozoliņš, J., & Linnell, J. D. C. (2012). Implementation uncertainty when using recreational hunting to manage carnivores. Journal of Applied Ecology, 49(4), 824–832. https://doi.org/10.1111/j.1365-2664.2012.02167.x
Bjerketvedt, D. K., Reimers, E., Parker, H., & Borgstrøm, R. (2014). The Hardangervidda wild reindeer herd: a problematic management history. Rangifer, 34(1), 57–72.
Bonabeau, E. (2002). Agent-based modeling: methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99, 7280–7287. https://doi.org/10.1073/pnas.082080899
Bunnefeld, N., & Keane, A. (2014). Managing wildlife for ecological, socioeconomic, and evolutionary sustainability. Proceedings of the National Academy of Sciences, 111(36), 12964–12965. https://doi.org/10.1073/pnas.1413571111
Bunnefeld, N., Hoshino, E., & Milner-Gulland, E. J. (2011). Management strategy evaluation: A powerful tool for conservation? Trends in Ecology and Evolution, 26(9), 441–447. https://doi.org/10.1016/j.tree.2011.05.003
Chollett, I., Garavelli, L., O’Farrell, S., Cherubin, L., Matthews, T. R., Mumby, P. J., & Box, S. J. (2016). A Genuine Win-Win: Resolving the “Conserve or Catch” Conflict in Marine Reserve Network Design. Conservation Letters, 0(0), 1–9. https://doi.org/10.1111/conl.12318
Cobano, J. A., Conde, R., Alejo, D., & Ollero, A. (2011). Path planning method based on Genetic Algorithms and the Monte-Carlo method to avoid aerial vehicle collisions under uncertainties. In Proceedings of the IEEE International Conference on Robotics and Automation (pp. 4429–4434). https://doi.org/10.1109/ICRA.2011.5980246
Colyvan, M., Justus, J., & Regan, H. M. (2011). The conservation game. Biological Conservation, 144(4), 1246–1253. https://doi.org/10.1016/j.biocon.2010.10.028
Duffy, R., St John, F. A. V., Büscher, B., & Brockington, D. (2016). Toward a new understanding of the links between poverty and illegal wildlife hunting. Conservation Biology, 30(1), 14–22. https://doi.org/10.1111/cobi.12622
Elston, D. A., Spezia, L., Baines, D., & Redpath, S. M. (2014). Working with stakeholders to reduce conflict-modelling the impact of varying hen harrier Circus cyaneus densities on red grouse Lagopus lagopus populations. Journal of Applied Ecology, 51(5), 1236–1245. https://doi.org/10.1111/1365-2664.12315
Eythórsson, E., Tombre, I. M., & Madsen, J. (2017). Goose management schemes to resolve conflicts with agriculture: Theory, practice and effects. Ambio, 46(S2), 231–240. https://doi.org/10.1007/s13280-016-0884-4
Farmer, J. D., & Foley, D. (2009). The economy needs agent-based modelling. Nature, 460(August), 685–686. https://doi.org/10.1038/460685a
Franco, C., Hepburn, L. A., Smith, D. J., Nimrod, S., & Tucker, A. (2016). A Bayesian Belief Network to assess rate of changes in coral reef ecosystems. Environmental Modelling and Software, 80, 132–142. https://doi.org/10.1016/j.envsoft.2016.02.029
Hake, M., Mansson, J., & Wiberg, A. (2010). A working model for preventing crop damage caused by increasing goose populations in Sweden. Ornis Svecica, 20(3-4), 225–233.
Hamblin, S. (2013). On the practical usage of genetic algorithms in ecology and evolution. Methods in Ecology and Evolution, 4(2), 184–194. https://doi.org/10.1111/2041-210X.12000
Heinonen, J. P. M., Palmer, S. C. F., Redpath, S. M., & Travis, J. M. J. (2014). Modelling hen harrier dynamics to inform human-wildlife conflict resolution: A spatially-realistic, individual-based approach. PLoS ONE, 9(11). https://doi.org/10.1371/journal.pone.0112492
Hindar, K., Fleming, I. A., McGinnity, P., & Diserud, O. (2006). Genetic and ecological effects of salmon farming on wild salmon: modelling from experimental results. ICES Journal of Marine Science, 63(7), 1234–1247. https://doi.org/10.1016/j.icesjms.2006.04.025
Janssen, M. A., Holahan, R., Lee, A., & Ostrom, E. (2010). Lab experiments for the study of socio-ecological systems. Science, 328, 613–618. https://doi.org/10.1126/science.1229223
Karlsson, S., Diserud, O. H., Fiske, P., & Hindar, K. (2016). Widespread genetic introgression of escaped farmed Atlantic salmon in wild salmon populations. ICES Journal of Marine Science, fsw121. https://doi.org/10.1093/icesjms/fsw121
Liu, Y., Diserud, O. H., Hindar, K., & Skonhoft, A. (2013). An ecological-economic model on the effects of interactions between escaped farmed and wild salmon (Salmo salar). Fish and Fisheries, 14(2), 158–173. https://doi.org/10.1111/j.1467-2979.2012.00457.x
Luo, X., Yang, W., Kwong, C., Tang, J., & Tang, J. (2014). Linear programming embedded genetic algorithm for product family design optimization with maximizing imprecise part-worth utility function. Concurrent Engineering, 22(4), 309–319. https://doi.org/10.1177/1063293X14553068
Man, M., Zhang, Y., Ma, G., Friston, K., & Liu, S. (2016). Quantification of degeneracy in Hodgkin-Huxley neurons on Newman-Watts small world network. Journal of Theoretical Biology, 402, 62–74. https://doi.org/10.1016/j.jtbi.2016.05.004
Manfredo, M. J., Bruskotter, J. T., Teel, T. L., Fulton, D., Schwartz, S. H., Arlinghaus, R., … Sullivan, L. (2016). Why social values cannot be changed for the sake of conservation. Conservation Biology, in press. https://doi.org/10.1111/cobi.12855
Mansson, J., Nilsson, L., & Hake, M. (2013). Territory size and habitat selection of breeding Common Cranes (Grus grus) in a boreal landscape. Ornis Fennica, 90(2), 65–72.
Marks, R. E. (1992). Breeding hybrid strategies: optimal behaviour for oligopolists. Journal of Evolutionary Economics, 2(1), 17–38. https://doi.org/10.1007/BF01196459
McAvoy, A., & Hauert, C. (2015). Asymmetric evolutionary games. PLoS Computational Biology, 11(8), e1004349. https://doi.org/10.1371/journal.pcbi.1004349
McCann, R. K., Marcot, B. G., & Ellis, R. (2006). Bayesian belief networks: applications in ecology and natural resource management. Canadian Journal of Forest Research, 36, 3053–3062.
Miyasaka, T., Le, Q. B., Okuro, T., Zhao, X., & Takeuchi, K. (2017). Agent-based modeling of complex social–ecological feedback loops to assess multi-dimensional trade-offs in dryland ecosystem services. Landscape Ecology. https://doi.org/10.1007/s10980-017-0495-x
Nellemann, C., Jordhoy, P., Stoen, O. G., & Strand, O. (2000). Cumulative impacts of tourist resorts on wild reindeer (Rangifer tarandus tarandus) during winter. Arctic, 53(1), 9–17. https://doi.org/10.14430/arctic829
Nellemann, C., Vistnes, I., Jordhoy, P., Strand, O., & Newton, A. (2003). Progressive impact of piecemeal infrastructure development on wild reindeer. Biological Conservation, 113(2), 307–317. https://doi.org/10.1016/S0006-3207(03)00048-X
Olaussen, J. O., & Skonhoft, A. (2008). On the economics of biological invasion: An application to recreational fishing. Natural Resource Modeling, 21(4), 625–653. https://doi.org/10.1111/j.1939-7445.2008.00026.x
Rumpff, L., Duncan, D. H., Vesk, P. A., Keith, D. A., & Wintle, B. A. (2011). State-and-transition modelling for Adaptive Management of native woodlands. Biological Conservation, 144(4), 1224–1235. https://doi.org/10.1016/j.biocon.2010.10.026
Strand, O., Nilsen, E. B., Solberg, E. J., & Linnell, J. C. D. (2012). Can management regulate the population size of wild reindeer (Rangifer tarandus) through harvest? Canadian Journal of Zoology, 90, 163–171. https://doi.org/10.1139/z11-123
Tilman, A. R., Watson, J. R., & Levin, S. (2016). Maintaining cooperation in social-ecological systems. Theoretical Ecology. https://doi.org/10.1007/s12080-016-0318-8
Tu, M. T., Wolff, E., & Lamersdorf, W. (2000). Genetic algorithms for automated negotiations: a FSM-based application approach. Proceedings 11th International Workshop on Database and Expert Systems Applications, 1029–1033. https://doi.org/10.1109/DEXA.2000.875153
Wam, H. K., Bunnefeld, N., Clarke, N., & Hofstad, O. (2016). Conflicting interests of ecosystem services: Multi-criteria modelling and indirect evaluation to trade off monetary and non-monetary measures. Ecosystem Services.
Wang, P., Poe, G. L., & Wolf, S. A. (2017). Payments for ecosystem services and wealth distribution. Ecological Economics, 132, 63–68. https://doi.org/10.1016/j.ecolecon.2016.10.009
Wright, G. D., Andersson, K. P., Gibson, C. C., & Evans, T. P. (2016). Decentralization can help reduce deforestation when user groups engage with local government. Proceedings of the National Academy of Sciences, 201610650. https://doi.org/10.1073/pnas.1610650114