Title: | Formal Psychological Models of Categorization and Learning |
---|---|
Description: | Formal psychological models of categorization and learning, independently-replicated data sets against which to test them, and simulation archives. |
Authors: | Andy Wills, Lenard Dome, Charlotte Edmunds, Garrett Honke, Angus Inkster, René Schlegelmilch, Stuart Spicer |
Maintainer: | Andy Wills <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2024-10-29 04:31:36 UTC |
Source: | https://github.com/ajwills72/catlearn |
Formal psychological models, independently-replicated data sets against which to test them, and simulation archives.
For a complete list of functions, use library(help = "catlearn").
For a complete table of simulations, use data(thegrid).
All functions are concisely documented; use the help function, e.g. ?shin92.
For more detailed documentation, see the references listed by the help documentation.
For a tutorial introduction, see Wills et al. (2016a).
For a guide to contributing to this package, see Catlearn Research Group (2016).
Andy Wills
Maintainer: Andy Wills <[email protected]>
Catlearn Research Group (2016). Contributing to catlearn. http://catlearn.r-forge.r-project.org/intro-catlearn.pdf
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2016). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation.
Logistic function to convert output activations to rating of outcome probability (see e.g. Gluck & Bower, 1988).
act2probrat(act, theta, beta)
act |
Vector of output activations |
theta |
Scaling constant |
beta |
Bias constant |
The contents of this help file are relatively brief; a more extensive tutorial on using act2probrat can be found in Spicer et al. (n.d.).
The function takes the output activation of a learning model
(e.g. slpRW), and converts it into a rating of the subjective
probability that the outcome will occur. It does this separately for
each activation in the vector act
. It uses a logistic function
to do this conversion (see e.g. Gluck & Bower, 1988, Equation 7). This
function can produce a variety of monotonic mappings from activation
to probability rating, determined by the value set for the two
constants:
theta
is a scaling constant; as its value rises, the function
relating activation to rating becomes less linear and at high values
approximates a step function.
beta
is a bias parameter; it is the value of the output
activation that results in an output rating of P = 0.5. For example,
if you wish an output activation of 0.4 to produce a rated probability
of 0.5, set beta to 0.4.
Returns a vector of probability ratings.
As this function returns probabilities, the numbers returned are always in the range 0-1. If the data you are fitting use a different range, convert them. For example, if your data are ratings on a 0-10 scale, divide them by 10. If your data are something other than probability estimates (e.g. you asked participants to use negative ratings to indicate preventative relationships), don't use this function unless you are sure it is doing what you intend.
Andy Wills
Gluck, M.A. & Bower, G.H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117, 227-247.
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (n.d.). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
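A minimal usage sketch (the theta and beta values below are illustrative choices, not recommended defaults):
## Hypothetical output activations from a learning model such as slpRW
act <- c(0.1, 0.25, 0.4, 0.6, 0.9)
## Convert to ratings of outcome probability; with beta = 0.4, an
## activation of 0.4 maps to a rated probability of 0.5
act2probrat(act, theta = 5, beta = 0.4)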
Changes a nominal-dimension input representation (e.g. 3 1 2) into a
padded representation (e.g. 001 100 010). This form of input
representation is required by e.g. slpSUSTAIN.
convertSUSTAIN(input, dims)
input |
A matrix containing the nominal-dimension input representation. Each row is a trial and each column is a stimulus dimension. |
dims |
A vector of the number of nominal values for each
dimension. For example, if there are three dimensions with three,
one and two possible values, then |
Returns a matrix containing the padded input representation.
Lenard Dome, Andy Wills
## Create a dummy training matrix with four dimensions. The first
## two dimensions have two possible nominal values, while the
## third and fourth have three possible nominal values.
dummy <- cbind(matrix(sample(1:2, 20, replace=TRUE), ncol = 2),
               matrix(sample(1:3, 20, replace=TRUE), ncol = 2))

## Specify the number of nominal values for each dimension
dims <- c(2, 2, 3, 3)

## Convert the input representation into a binary padded representation
convertSUSTAIN(dummy, dims)
In some category-learning experiments, category members are distortions of an underlying base pattern. Where this is the case, 'category breadth' refers to the magnitude of such distortions. Broad categories take longer to learn than narrow categories. Once trained to an errorless criterion, the effect of category breadth on performance on novel items depends on category size. For small categories, narrow categories are better than broad ones. For larger categories, the reverse is true. Homa & Vosburgh (1976) provide the data for this CIRP.
data(homa76)
A data frame with the following columns:
Experimental phase (within-subjects). Takes values : 'train','imm'. The training phase is 'train', 'imm' is the immediate test phase.
Category breadth (between-subjects). Takes values : 'mixed', 'uni-low'
Stimulus type (within-subjects). Takes values : 'proto', 'low', 'med', 'high', 'old-low', 'old-med', 'old-high', 'rand'. All refer to novel stimuli in the test phase, except those beginning 'old-', which are stimuli from the training phase presented during the test phase. 'low', 'med', 'high' refer to distortion level. 'proto' are prototypes. 'rand' are a set of 10 random stimuli, generated from prototypes unrelated to those used in training. These random stimuli are not mentioned in the Method of the paper, but are mentioned in the Results section - they are presented at the end of the test session. Empty cell for training phase.
Category size (within-subjects). Takes values : 3, 6, 9. NA for training phase, where category size is not a meaningful variable given that the DV is blocks to criterion. Also NA for old stimuli; Homa & Vosburgh's (1976) Results section collapses across category size for old stimuli
For test phases: probability of a correct response, except for random stimuli, where 'val' is the probability with which the random stimuli were placed into the specified category. For training phase: number of blocks to criterion
Wills et al. (n.d.) discuss the derivation of this CIRP. In brief, the effects have been independently replicated. Homa & Vosburgh (1976) was selected as the only experiment to contain all three independently replicated effects.
Homa & Vosburgh's experiment involved the supervised classification of nine-dot random dot patterns. Stimuli had three different levels of distortion from the prototype - low (3.5 bit), medium (5.6 bit), and high (7.7 bit). There were three categories in training, one with 3 members, one with 6 members, and one with 9 members. Participants were either trained on stimuli that were all low distortion (narrow categories), or on an equal mix of low, medium, and high distortion stimuli (broad categories). Training was to an errorless criterion. The test phase involved the presentation of the prototypes, old stimuli, and novel stimuli of low, medium, and high distortion.
The data for the prototype, and other novel test stimuli, were estimated
from Figure 1 of Homa & Vosburgh (1976), using plot
digitizer
(Huwaldt, 2015). The data for old stimuli were estimated from
Figure 3, using the same procedure. The data for the training phase,
and for random stimuli, were reported in the text of Homa & Vosburgh
(1976) and are reproduced here. All data are averages across participants.
Homa & Vosburgh's (1976) experiment also includes results for further test phases, delayed by either 1 week, or 10 weeks, from the day of training. These data are not the focus of this category breadth CIRP and have not been included.
Homa, D. & Vosburgh, R. (1976). Category breadth and the abstraction of prototypical information. Journal of Experimental Psychology: Human Learning and Memory, 2, 322-330.
Huwaldt, J.A. (2015). Plot Digitizer [software]. https://plotdigitizer.sourceforge.net/
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
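A minimal sketch for loading and inspecting the data set; str and summary show the column coding described above:
data(homa76)
## Inspect the structure and coding of the data frame
str(homa76)
summary(homa76)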
In the inverse base-rate effect, participants are trained that a compound of two cues (I + PC) leads to a frequently-occurring outcome (C), while another two-cue compound (I + PR) leads to a rarely-occurring outcome (R). The key results are that, at test, participants tend to respond 'C' to cue I on its own, but 'R' to the cue compound (PC + PR). This latter response is striking because PC and PR had been perfectly predictive of diseases C and R respectively, and disease C is more common, so the optimal response to PC + PR is 'C'. Participants respond in opposition to the underlying disease base rates.
data(krus96)
A data frame with the following columns:
Symptom presented. Takes values: I, PC, PR, PC+PR, I+PC+PR, I+PCo, I+PRo, PC+PRo, I+PC+PRo, as defined by Kruschke (1996).
Response made. Takes values: C, R, Co, Ro, as defined by Kruschke (1996).
Mean probability of response, averaged across participants.
Wills et al. (n.d.) discuss the classification of these data as an Auxiliary Phenomenon, rather than a CIRP (Canonical Independently Replicated Phenomenon). In brief, these particular results have been independently replicated, but are arguably not the best exemplar of the known phenomena in this area (in particular, they lack a demonstration of the shared-cue effect in IBRE). Auxiliary Phenomena may be included in catlearn if they are the subject of a simulation archived in catlearn.
The data are from Experiment 1 of Kruschke (1996), which involved the diagnosis of hypothetical diseases (F, G, H, J) on the basis of symptoms presented as text (e.g. "ear aches, skin rash"). Participants were trained with feedback across 15 blocks of 8 trials each. They were then tested without feedback on 18 test stimuli, each presented twice.
The data are as shown in Table 2 of Kruschke (1996). The data are mean response probabilities for each stimulus in the test phase, averaged across the two presentations of the stimulus, the two copies of the abstract design, and across participants.
Andy J. Wills, René Schlegelmilch
Kruschke, J.K. (1996). Base rates in category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 3-26.
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
Runs a simulation of the krus96
AP using the
slpEXIT
model implementation and
krus96train
as the input representation.
krus96exit (params = c(2.87, 2.48, 4.42, 4.42, .222, 1.13, .401))
params |
A vector containing values for c, P, phi, l_gain,
l_weight, l_ex, and sigma_bias (i.e. the sigma for the bias unit), in
that order. See |
A simulation using slpEXIT
and
krus96train
. The stored exemplars are the four stimuli
present during the training phase, using the same representation as in
krus96train
.
Other parameters of slpEXIT are set as follows: iterations = 10, sigma
for the non-bias units = 1. These values are conventions of modeling
with EXIT, and should not be considered as free parameters. They are
set within the krus96exit function, and hence can't be changed without
re-writing the function.
This simulation is discussed in Spicer et al. (n.d.). It produces the same response probabilities (within rounding error) as the simulation reported in Kruschke (2001), with the same parameters.
This simulation uses 56 simulated participants, the same number as used by Kruschke (2001). Kruschke reports using the same trial randomizations as used for his 56 real participants. These randomizations were not published, so we could not reproduce that part of his simulation. It turns out that the choice of the set of 56 randomizations matters; it affects some of the predicted response probabilities. We chose a random seed that reproduced Kruschke's response probabilities to within rounding error. As luck would have it, Kruschke's reported response probabilities (and hence this simulation) are the same (within rounding error) as the results of large-sample (N = 500) simulations we have run.
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in krus96
.
René Schlegelmilch, Andy Wills
Kruschke, J. K. (2001). The inverse base rate effect is not explained by eliminative inference. Journal of Experimental Psychology: Learning, Memory & Cognition, 27, 1385-1400.
Spicer, S.G., Schlegelmilch, R., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (n.d.). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
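A usage sketch; with the default (published best-fit) parameters this reproduces the archived simulation, and the observed data in krus96 are in the same order and format for comparison:
## Run the archived EXIT simulation of the krus96 AP
out <- krus96exit()
out
## Observed data, for comparison
data(krus96)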
Create randomized training blocks for AP krus96
, in a format
suitable for the slpEXIT
model, and other models that use the
same input representation format.
krus96train(blocks = 15, subjs = 56, ctxt = TRUE, seed = 1)
blocks |
Number of training blocks to generate. Omit this
argument to get the same number of blocks (15) as used in
|
subjs |
Number of simulated subjects to be run. |
ctxt |
If |
seed |
Sets the random seed. |
A data frame is produced, with one row for each trial, and with the following columns:
ctrl
- Set to 1 (reset model) for trial 1 of each simulated
subject, set to zero (normal trial) for all other training trials, and
set to 2 for test trials (i.e. those with no feedback).
block
- training block
stim
- Stimulus code, as described in Kruschke (1996).
x1, x2, ...
- symptom representation. Each column represents
one symptom, in the order I1, PC1, PR1, I2, PC2, PR2, context. 1 =
symptom present, 0 = symptom absent
t1, t2, ...
- Disease representation. Each column represents
one disease, in the order C1, R1, C2, R2. 1 = disease present. 0 =
disease absent.
Although the trial ordering is random, a random seed is used, so multiple calls of this function with the same parameters should produce the same output. This is usually desirable for reproducibility and stability of non-linear optimization. To get a different order, use the seed argument to set a different seed.
This routine was originally developed to support Wills et al. (n.d.).
A data frame, where each row is one trial, and the columns contain model input.
René Schlegelmilch, Andy Wills
Kruschke, J.K. (1996). Base rates in category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 3-26.
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
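A usage sketch; two simulated subjects are used here purely to keep the example small, while the other arguments keep their defaults:
## Generate randomized training for two simulated subjects
tr <- krus96train(subjs = 2)
head(tr)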
Creates randomized training blocks for Experiment 1 in Medin et
al. (1987), in a format that is suitable for slpALCOVE
,
slpSUSTAIN
, and other models that use either of those
input-representation formats.
medin87train(blocks = 2, subjs = 2, seed = 7649, missing = 'pad')
subjs |
Number of simulated participants to run. |
blocks |
Number of blocks to generate. The ten trial types are randomized within each block. |
seed |
Set random seed. |
missing |
If set to 'geo', output missing dimension flags (see below). If set to 'pad', use the padded stimulus representation format of slpSUSTAIN. |
A matrix is produced, with one row for each trial, and with the following columns:
ctrl
- Set to 4 on the first trial for each participant - 4 resets
the model to the initial state and does unsupervised learning afterwards.
Set to 3 for unsupervised trials - normal unsupervised learning
trial.
blk
- Training block.
stim
- Stimulus number, ranging from 1 to 10. The numbering scheme
is the same as in Medin et al. (1987, Fig. 1).
x1, x2, ...
- input representation. Where
missing='geo'
, x1, x2, and x3 are returned, each set at 1 or
0. This is the binary dimensional representation required by models
such as slpALCOVE, where e.g. x2 is the value on the second
dimension. Where missing='pad'
, w1, w2, x1, x2, y1, y2, z1,
z2, are returned. This is the padded representation required by
models such as slpSUSTAIN; e.g. y1 and y2 represent the two possible
values on dimension 3, so if y1 is black, y2 is white, and the
stimulus is white, then [y1, y2] = [0, 1].
Although the trial ordering is random, a random seed is used, so multiple calls of this function with the same parameters should produce the same output. This is usually desirable for reproducibility and stability of non-linear optimization. To get a different order, use the seed argument to set a different seed.
R by C matrix, where each row is one trial, and the columns contain model input.
Lenard Dome, Andy Wills
Medin, D. L., Wattenmaker, W. D., & Hampson, S. E. (1987). Family resemblance, conceptual cohesiveness, and category construction. Cognitive Psychology, 19(2), 242–279.
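A usage sketch, generating the padded (slpSUSTAIN-style) representation; the argument values below simply echo the defaults:
## Two blocks for each of two simulated participants, padded format
tr <- medin87train(blocks = 2, subjs = 2, missing = 'pad')
head(tr)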
Instantiation frequency is the number of times a stimulus has been observed as a member of a specific category (Barsalou, 1985). Increasing instantiation frequency of a stimulus increases categorization accuracy for that stimulus ('direct' effect), and for other similar stimuli ('indirect' effect). Experiment 1 of Nosofsky (1988) provides the data for this CIRP.
data(nosof88)
A data frame with the following columns:
Experimental condition, see 'details'. 1 = 'B', 2 = 'E2', 3 = 'E7'
Stimulus number, see Nosofsky (1988), Figure 1. Takes values: 1-12
Mean probability, across participants, of responding that the item belongs to category 2.
Wills et al. (n.d.) discuss the derivation of this CIRP. In brief,
both the direct and indirect effects have been independently
replicated. Experiment 1 of Nosofsky (1988) was selected due to the
availability of a multidimensional scaling solution for the stimuli,
see nosof88train
.
Experiment 1 of Nosofsky (1988) involved the classification of Munsell chips of fixed hue (5R) varying in brightness (value) and saturation (chroma). Instantiation frequency was manipulated between subjects. In condition B, all stimuli were equally frequent. In condition E2 (E7), stimulus 2 (7) was approximately five times as frequent as each of the other stimuli. In condition E2 (E7), stimulus 4 (9) indexes the indirect effect. There were three blocks of training. Block length was 48 trials for condition B and 63 trials for conditions E2 and E7. The training phase was followed by a transfer phase, which is not included in this CIRP (see Nosofsky, 1988, for details).
The data are as shown in Table 1 of Nosofsky (1988). The data are mean response probabilities for each stimulus in the training phase, averaged across blocks and participants.
Andy J. Wills [email protected]
Nosofsky, R.M. (1988). Similarity, frequency, and category representations, Journal of Experimental Psychology: Learning, Memory and Cognition, 14, 54-65.
Barsalou, L.W. (1985). Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory & Cognition, 11, 629-654.
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
Runs a simulation of the nosof88
CIRP using the
slpALCOVE
model implementation as an exemplar model and
nosof88train
as the input representation.
nosof88exalcove(params = NULL)
params |
A vector containing values for c, phi, la, and lw, in
that order, e.g. params = c(2.1, 0.6, 0.09, 0.9). See
|
An exemplar-based simulation using slpALCOVE
and
nosof88train
. The co-ordinates for the radial-basis units
are taken from the multidimensional scaling solution for these stimuli
reported by Nosofsky (1987).
Other parameters of slpALCOVE are set as follows: r = 2, q = 1,
initial alpha = 1 / (number of input dimensions), initial w = 0. These
values are conventions of modeling with ALCOVE, and should not be
considered as free parameters. They are set within the
nosof88exalcove function, and hence can't be changed without
re-writing the function.
This simulation is reported in Wills & O'Connell (n.d.).
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in nosof88
.
Andy Wills & Garret O'Connell
Nosofsky, R.M. (1987). Attention and learning processes in the identification and categorization of integral stimuli, Journal of Experimental Psychology: Learning, Memory and Cognition, 13, 87-108.
Wills, A.J. & O'Connell (n.d.). Averaging abstractions. Manuscript in preparation.
nosof88, nosof88oat, nosof88train, slpALCOVE
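A usage sketch; the parameter values are the illustrative ones given above, not a claim about the best fit (see nosof88exalcove_opt for the archived best-fitting values):
## Run the exemplar-ALCOVE simulation and test its ordinal adequacy
out <- nosof88exalcove(params = c(2.1, 0.6, 0.09, 0.9))
nosof88oat(out)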
Uses nosof88exalcove
to find best-fitting parameters for
the ex-ALCOVE model for the nosof88
CIRP.
nosof88exalcove_opt(recompute = FALSE)
recompute |
When set to TRUE, the function re-runs the optimization. When set to FALSE, the function returns a stored copy of the results of the optimization. |
This function is an archive of the optimization procedure used to
derive the best-fitting parameters for the nosof88exalcove
simulation; see Spicer et al. (2017) for a tutorial introduction to
the concept of simulation archives.
Optimization used the L-BFGS-B method from the optim
function of the standard R stats
package. The objective
function was sum of squared errors. Please inspect the source code for
further details (e.g. type nosof88exalcove_opt
). The
optimization was repeated for 16 different sets of starting values.
Where recompute = TRUE
, the function can take many hours to
run, depending on your system, and there is no progress bar. You can
use Task Manager (Windows) or equivalent if you want some kind of
visual feedback that the code is working hard. The code uses all the
processor cores on the local machine, so speed of execution is a
simple function of clock speed times processor cores. So, for example,
a 4 GHz i7 processor (8 virtual cores) will take a quarter of the time
to run this compared to a 2 GHz i5 processor (4 virtual cores).
A vector containing the best-fitting values for c, phi, la, and lw, in
that order. See slpALCOVE
for an explanation of these
parameters.
Andy Wills
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (2017). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
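A usage sketch; retrieving the stored result is effectively instantaneous, and the returned parameters can be passed straight to the simulation:
## Retrieve the archived best-fitting parameters (no recomputation)
best <- nosof88exalcove_opt(recompute = FALSE)
## Run the corresponding simulation with those parameters
out <- nosof88exalcove(params = best)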
Tests whether a model output passes the ordinal adequacy criteria for
the nosof88
CIRP.
nosof88oat(dta, xtdo=FALSE)
dta |
Matrix containing model output. The matrix must have the
same format, row order, and column names, as |
xtdo |
eXTenDed Output: Either |
This function implements the Wills & O'Connell (n.d.) ordinal adequacy
tests for the nosof88
CIRP. Specifically, a model passes
this test if it passes all four component tests: 1. E2(2) > B(2), 2.
E7(7) > B(7), 3. E2(4) > B(4), 4. E7(9) > B(9). These tests refer to
classification accuracy for particular stimuli in particular
experimental conditions. For example, E7(9) indicates stimulus 9 in
experimental condition E7.
Alternatively, by setting xtdo
to TRUE
, this function
returns the summary model predictions reported by Wills & O'Connell
(n.d.).
Where xtdo=FALSE
, this function returns TRUE if the ordinal
adequacy tests are passed, and FALSE otherwise.
Where xtdo=TRUE
, this function returns a summary matrix. The
columns are stimulus numbers. The rows ('B','E') indicate the baseline
(equal frequency) condition ('B') and the experimental conditions ('E2'
or 'E7', depending on the column).
Andy Wills and Garret O'Connell
Wills, A.J. & O'Connell (n.d.). Averaging abstractions. Manuscript in preparation.
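A usage sketch; assuming the observed data in nosof88 are in the required format (as the argument description above implies), they can themselves be passed through the test and should, by definition, pass:
data(nosof88)
nosof88oat(nosof88)
## Extended output: a summary matrix rather than TRUE/FALSE
nosof88oat(nosof88, xtdo = TRUE)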
Runs a simulation of the nosof88
CIRP using the
slpALCOVE
model implementation as a prototype model and
nosof88train
as the input representation.
nosof88protoalcove(params = NULL)
params |
A vector containing values for c, phi, la, and lw, in
that order, e.g. params = c(2.1, 0.6, 0.09, 0.9). See
|
A prototype-based simulation using slpALCOVE
and
nosof88train
. There is one radial-basis unit for each
category, representing the prototype. These prototypes are calculated
by taking the mean of the co-ordinates of the stimuli in a category,
with the stimulus co-ordinates coming from the multidimensional
scaling solution reported by Nosofsky (1987). The calculations of the
means are weighted by the instantiation frequency of the
stimuli. Hence, the prototypes for each condition of the experiment
are different.
Other parameters of slpALCOVE are set as follows: r = 2, q = 1,
initial alpha = 1 / (number of input dimensions), initial w = 0. These
values are conventions of modeling with ALCOVE, and should not be
considered as free parameters. They are set within the
nosof88protoalcove function, and hence can't be changed without
re-writing the function.
This simulation is reported in Wills & O'Connell (n.d.).
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in nosof88
.
Andy Wills & Garret O'Connell
Nosofsky, R.M. (1987). Attention and learning processes in the identification and categorization of integral stimuli, Journal of Experimental Psychology: Learning, Memory and Cognition, 13, 87-108.
Wills, A.J. & O'Connell (n.d.). Averaging abstractions. Manuscript in preparation.
nosof88, nosof88oat, nosof88train, slpALCOVE
Uses nosof88protoalcove
to find best-fitting parameters
for the proto-ALCOVE model for the nosof88
CIRP.
nosof88protoalcove_opt(recompute = FALSE)
recompute |
When set to TRUE, the function re-runs the optimization. When set to FALSE, the function returns a stored copy of the results of the optimization. |
This function is an archive of the optimization procedure used to
derive the best-fitting parameters for the
nosof88protoalcove
simulation; see Spicer et al. (2017)
for a tutorial introduction to the concept of simulation archives.
Optimization used the L-BFGS-B method from the optim
function of the standard R stats
package. The objective
function was sum of squared errors. Please inspect the source code for
further details (e.g. type nosof88protoalcove_opt
). The
optimization was repeated for 16 different sets of starting values.
Where recompute = TRUE
, the function can take many hours to
run, depending on your system, and there is no progress bar. You can
use Task Manager (Windows) or equivalent if you want some kind of
visual feedback that the code is working hard. The code uses all the
processor cores on the local machine, so speed of execution is a
simple function of clock speed times processor cores. So, for example,
a 4 GHz i7 processor (8 virtual cores) will take a quarter of the time
to run this compared to a 2 GHz i5 processor (4 virtual cores).
A vector containing the best-fitting values for c, phi, la, and lw, in
that order. See slpALCOVE
for an explanation of these
parameters.
Andy Wills
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (2017). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
Create randomized training blocks for CIRP nosof88
, in a
format suitable for the slpALCOVE
model, and any other
model that uses the same input representation format. The stimulus
co-ordinates come from a MDS solution reported by Nosofsky (1987) for
the same stimuli.
nosof88train(condition = 'B', blocks = 3, absval = -1, subjs = 1, seed = 4182, missing = 'geo')
condition |
Experimental condition 'B', 'E2', or 'E7', as defined by Nosofsky (1988). |
blocks |
Number of blocks to generate. Omit this argument to get the same number of blocks as the published study (3). |
absval |
Teaching value to be used where category is absent. |
subjs |
Number of simulated subjects to be run. |
seed |
Sets the random seed |
missing |
If set to 'geo', output missing dimension flags (see below) |
A matrix is produced, with one row for each trial, and with the following columns:
ctrl
- Set to 1 (reset model) for trial 1, set to zero (normal
trial) for all other trials.
cond
- 1 = condition B, 2 = condition E2, 3 = condition E7
blk
- training block
stim
- stimulus number (as defined by Nosofsky, 1988)
x1, x2
- input representation. These are the co-ordinates of an
MDS solution for these stimuli (see Nosofsky, 1987).
t1, t2
- teaching signal (1 = category present, absval = category
absent)
m1, m2
- Missing dimension flags (always set to zero in this
experiment, indicating all input dimensions are present on all
trials). Only produced if missing = 'geo'
.
Although the trial ordering is random, a random seed is used, so multiple calls of this function with the same parameters should produce the same output. This is usually desirable for reproducibility and stability of non-linear optimization. To get a different order, use the seed argument to set a different seed.
This implementation assumes a block length of 64 trials for conditions E2 and E7, rather than the 63 trials reported by Nosofsky (1988).
This routine was originally developed to support simulations reported in Wills & O'Connell (n.d.).
R by C matrix, where each row is one trial, and the columns contain model input.
Andy Wills & Garret O'Connell
Nosofsky, R.M. (1987). Attention and learning processes in the identification and categorization of integral stimuli, Journal of Experimental Psychology: Learning, Memory and Cognition, 13, 87-108.
Nosofsky, R.M. (1988). Similarity, frequency, and category representations, Journal of Experimental Psychology: Learning, Memory and Cognition, 14, 54-65.
Wills, A.J. & O'Connell (n.d.). Averaging abstractions. Manuscript in preparation.
nosof88, nosof88oat, slpALCOVE
Shepard et al. (1961) stated that, where there are two equal-sized categories constructed from the eight stimuli that can be produced by varying three binary stimulus dimensions, there are only six logically distinct category structures. Shepard et al. (1961) labeled these structures as Types I through VI (see e.g. Nosofsky et al., 1994, Figure 1, for details). The CIRP concerns the relative difficulty of learning these category structures, as indexed by classification accuracy. The result, expressed in terms of accuracy, is:
I > II > [III, IV, V] > VI
The experiment reported by Nosofsky et al. (1994) provides the data for this CIRP.
data(nosof94)
A data frame with the following columns:
Type of category structure, as defined by Shepard et al. (1961). Takes values : 1-6
Training block. Takes values: 1-16
Mean error probability, averaged across participants
Wills et al. (n.d.) discuss the derivation of this CIRP. In
brief, the effect has been independently replicated. Nosofsky et
al. (1994) was selected as the CIRP because it had acceptable sample
size (N=40 per Type), and included simulations of the results with a
number of different formal models. Inclusion of this dataset in
catlearn
thus permits a validation of catlearn
model
implementations against published simulations.
In Nosofsky et al. (1994) the stimuli varied in shape (squares or triangles), type of interior line (solid or dotted), and size (large or small). Each participant learned two problems. Each problem was trained with feedback, to a criterion of four consecutive sub-blocks of eight trials with no errors, or for a maximum of 400 trials.
The data are as shown in the first 16 rows of Table 1 of Nosofsky et al. (1994). Only the first 16 blocks are reported, for comparability with the model fitting reported in that paper. Where a participant reached criterion before 16 blocks, Nosofsky et al. assumed they would have made no further errors if they had continued.
Andy J. Wills [email protected]
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
Shepard, R.N., Hovland, C.I., & Jenkins, H.M. (1961). Learning and memorization of classifications. Psychological Monographs, 75, Whole No. 517.
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
Runs a simulation of the nosof94
CIRP using the
slpALCOVE
model implementation as an exemplar model and
nosof94train
as the input representation. This
simulation replicates the one reported by Nosofsky et al. (1994).
nosof94bnalcove(params = c(6.33,0.011,0.409,0.179))
params |
A vector containing values for c, phi, la, and lw, in
that order. See |
An exemplar-based simulation using slpALCOVE
and
nosof94train
. The co-ordinates for the radial-basis units
are assumed, and use the same binary representation as the abstract
category structure.
The defaults for params
are the best fit of the model to the
nosof94
CIRP. The derivation of this fit is described by
Nosofsky et al. (1994).
The other parameters of slpALCOVE are set as follows: r = 1, q = 1,
initial alpha = 1 / number of dimensions, initial w = 0. These values
are conventions of modeling with ALCOVE, and should not be considered
as free parameters. They are set within the nosof94bnalcove function,
and hence can't be changed without re-writing the function.
This is a replication of the simulation reported by Nosofsky et al. (1994). Compared to other published simulations with the ALCOVE model, their simulation is non-standard in a number of respects:
1. A background noise ('BN') decision rule is used (other simulations use an exponential ratio rule).
2. As a consequence of #1, absence of a category label is represented by a zero (other simulations use -1).
3. The sum of the attentional weights is constrained to be 1 on every trial (other simulations do not apply this constraint).
The current simulation replicates these non-standard aspects of the Nosofsky et al. (1994) simulation.
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in nosof94
.
Andy Wills
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
nosof94, nosof94oat, nosof94train, slpALCOVE, nosof94bnalcove
Runs a simulation of the nosof94
CIRP using the
slpALCOVE
model implementation as an exemplar model and
nosof94train
as the input representation.
nosof94exalcove(params = NULL)
params |
A vector containing values for c, phi, la, and lw, in
that order, e.g. params = c(2.1, 0.6, 0.09, 0.9). See
|
N.B.: This simulation uses a standard version of ALCOVE. For a
replication of the ALCOVE simulation of these data reported by
Nosofsky et al. (1994), which is non-standard in a number of respects,
see nosof94bnalcove
.
An exemplar-based simulation using slpALCOVE
and
nosof94train
. The co-ordinates for the radial-basis units
are assumed, and use the same binary representation as the abstract
category structure.
Other parameters of slpALCOVE are set as follows: r = 1, q = 1,
initial alpha = 1/3, initial w = 0. These values are conventions of
modeling with ALCOVE, and should not be considered as free
parameters. They are set within the nosof94exalcove function, and
hence can't be changed without re-writing the function.
This simulation is reported in Wills & O'Connell (n.d.).
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in nosof94
.
Andy Wills
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
Wills, A.J. & O'Connell (n.d.). Averaging abstractions. Manuscript in preparation.
nosof94, nosof94oat, nosof94train, slpALCOVE, nosof94bnalcove
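A usage sketch; the parameter values are illustrative (see nosof94exalcove_opt for the archived best-fitting values), and because the output has the same format as nosof94 it can be plotted with nosof94plot:
out <- nosof94exalcove(params = c(2.1, 0.6, 0.09, 0.9))
## Plot the predicted learning curves
nosof94plot(out, title = 'ex-ALCOVE predictions')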
Uses nosof94exalcove
to find best-fitting parameters for
the ex-ALCOVE model for the nosof94
CIRP.
nosof94exalcove_opt(recompute = FALSE, xtdo = FALSE)
recompute |
When set to TRUE, the function re-runs the optimization. When set to FALSE, the function returns a stored copy of the results of the optimization. |
xtdo |
eXTenDed Output; where set to TRUE, some further details of the optimization procedure are printed to the console. |
This function is an archive of the optimization procedure used to
derive the best-fitting parameters for the nosof94exalcove
simulation; see Spicer et al. (2017) for a tutorial introduction to
the concept of simulation archives.
Optimization used the L-BFGS-B method from the optim
function of the standard R stats
package. The objective
function was sum of squared errors. Please inspect the source code for
further details (e.g. type nosof94exalcove_opt
). The
optimization was repeated for 15 different sets of starting values.
Where recompute = TRUE
, the function can take many hours to
run, depending on your system, and there is no progress bar. You can
use Task Manager (Windows) or equivalent if you want some kind of
visual feedback that the code is working hard. The code uses all the
processor cores on the local machine, so speed of execution is a
simple function of clock speed times processor cores. So, for example,
a 4 GHz i7 processor (8 virtual cores) will take a quarter of the time
to run this compared to a 2 GHz i5 processor (4 virtual cores).
A vector containing the best-fitting values for c, phi, la, and lw, in
that order. See slpALCOVE
for an explanation of these
parameters.
Andy Wills
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (2017). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
Tests whether a model output passes the ordinal adequacy criteria for
the nosof94
CIRP.
nosof94oat(dta, xtdo=FALSE)
dta |
Matrix containing model output. The matrix must have the
same format, row order, and column names, as |
xtdo |
eXTenDed Output: Either |
This function implements a standard ordinal adequacy test for the
nosof94
CIRP. Specifically, a model passes this test if
the mean errors (averaged across blocks) obey the following:
I < II < [III, IV, V] < VI
Note that '[III, IV, V]' indicates that these three problems can be in any order of difficulty (or all be of equal difficulty), as long as all three are harder than Problem II and all three are easier than Problem VI.
Alternatively, by setting xtdo
to TRUE
, this function
returns the mean classification error by Problem type.
Where xtdo=FALSE
, this function returns TRUE if the ordinal
adequacy tests are passed, and FALSE otherwise.
Where xtdo=TRUE
, this function returns a summary matrix,
containing mean errors (across blocks) for each of the six problem
types.
Andy Wills
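A usage sketch; assuming the observed data in nosof94 are in the required format, they can themselves be passed through the test:
data(nosof94)
nosof94oat(nosof94)
## Mean errors by problem type
nosof94oat(nosof94, xtdo = TRUE)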
Produce a line graph similar to that shown in Nosofsky et al. (1994, Figures 1, 6-9).
nosof94plot(results,title = 'Nosofsky et al. (1994)')
results |
Mean error probability by block and problem, in the same format
as data set |
title |
Title to appear at top of plot |
Andy Wills
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
data(nosof94)
nosof94plot(nosof94)
Runs a simulation of the nosof94
CIRP using the
slpSUSTAIN
model implementation and
nosof94train
as the input representation.
nosof94sustain(params = c(9.01245, 1.252233, 16.924073, 0.092327))
params |
A vector containing values for r, beta, d, and eta, in
that order, e.g. params = c(8.1, 1.5, 9.71, 0.8). See
|
NOTE: The underlying slpSUSTAIN function is currently written in R, and hence this simulation will take several minutes to run. slpSUSTAIN may be converted to C++ in a future release, which will reduce the run time of this simulation to a few seconds.
A simulation using slpSUSTAIN
and
nosof94train
, i.e. a simulation of Nosofsky et al. (1994)
with the Love et al. (2004) SUSTAIN model.
Other parameters of slpSUSTAIN are set as follows: tau = 0, initial
lambda = 1, initial w = 0, initial cluster centered on the first
stimulus presented to the simulated subject. These values are
conventions of modeling with SUSTAIN, and should not be considered as
free parameters. They are set within the nosof94sustain function, and
hence can't be changed without re-writing the function.
The simulation uses 100 simulated subjects. Like the simulations
nosof94exalcove
and nosof94protoalcove
, all simulated
participants complete 16 blocks of training. This differs from the
Nosofsky et al. (1994) experiment, in which participants are trained to
a criterion of four consecutive errorless 8-trial subblocks.
The simulation by Gureckis (2014) builds this criterion-based training
into their simulation by using a random number generator to turn the
response probability on each trial into a correct or incorrect
response. This feature of the Gureckis (2014) simulation is not
incorporated here, because the instability in output this generates
makes parameter optimization (e.g. via optim
) less reliable.
A comparison of 10,000 simulated participants in the Gureckis (2014) simulation with 1,000 simulated participants in the current simulation reveals a mean difference in the 96 reported response probabilities of less than 0.01.
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in nosof94
.
Lenard Dome, Andy Wills
Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: a network model of category learning. Psychological Review, 111, 309-332.
Gureckis, T. M. (2014). sustain_python. https://github.com/NYUCCL/sustain_python
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
nosof94, nosof94oat, nosof94train, slpALCOVE, nosof94bnalcove
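A usage sketch; as noted above, with slpSUSTAIN currently written in R this takes several minutes to run:
## Run the SUSTAIN simulation with its default (published best-fit) parameters
out <- nosof94sustain()
## Test ordinal adequacy against the nosof94 CIRP
nosof94oat(out)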
Create randomized training blocks for CIRP nosof94
, in a format
suitable for the slpALCOVE
or slpSUSTAIN
models, and
other models that use the same input representation formats.
nosof94train(cond = 1, blocks = 16, absval = -1, subjs = 1, seed = 7624, missing = 'geo', blkstyle = 'accurate')
cond |
Category structure type (1-6), as defined by Shepard et al. (1961). |
blocks |
Number of blocks to generate. Omit this argument to get the same number of blocks (16) as used in the simulations reported by Nosofsky et al. (1994). |
absval |
Teaching value to be used where category is absent. |
subjs |
Number of simulated subjects to be run. |
seed |
Sets the random seed. |
missing |
If set to 'geo', output missing dimension flags (see
below). If set to 'pad', use the padded stimulus representation format
of slpSUSTAIN. If set to 'pad', set |
blkstyle |
If set to 'accurate', reproduce the randomization of this experiment, as described in Nosofsky et al. (1994). If set to 'eights', use instead the randomization used in the Gureckis (2016) simulation of this experiment. |
A matrix is produced, with one row for each trial, and with the following columns:
ctrl
- Set to 1 (reset model) for trial 1 of each simulated
subject, set to zero (normal trial) for all other trials.
blk
- training block
stim
- Stimulus number, ranging from 1 to 8. The numbering scheme
is the same as in Nosofsky et al. (1994, Figure 1), under the mapping
of dim_1_left = 0, dim_1_right = 1, dim_2_front = 0, dim_2_back = 1,
dim_3_bottom = 0, dim_3_top = 1.
x1, x2, ...
- input representation. Where missing='geo'
,
x1, x2, and x3 are returned, each set at 1 or 0. This is the binary
dimensional representation required by models such as slpALCOVE, where
e.g. x2 is the value on the second dimension. Where
missing='pad'
, x1, x2, y1, y2, z1, z2, are returned. This is the
padded representation required by models such as slpSUSTAIN; e.g. y1 and
y2 represent the two possible values on dimension 2, so if y1 is black,
y2 is white, and the stimulus is white, then [y1, y2] = [0, 1].
t1, t2
- Category label (1 = category present, absval = category
absent)
m1, m2, m3
- Missing dimension flags (always set to zero in this
experiment, indicating all input dimensions are present on all
trials). Only produced if missing = 'geo'
.
Although the trial ordering is random, a random seed is used, so multiple calls of this function with the same parameters should produce the same output. This is usually desirable for reproducibility and stability of non-linear optimization. To get a different order, use the seed argument to set a different seed.
This routine was originally developed to support Wills et al. (n.d.).
R by C matrix, where each row is one trial, and the columns contain model input.
Andy Wills, Lenard Dome
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
Gureckis, T. (2016). https://github.com/NYUCCL/sustain_python
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
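A usage sketch, generating the default slpALCOVE-style representation for category structure Type IV:
tr <- nosof94train(cond = 4)
head(tr)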
Category size is the number of examples of a category that have been presented to the participant. The category-size effect (e.g. Homa et al., 1973) is the phenomenon that, as category size increases, the accuracy of generalization to new members of that category also increases. The equal-frequency conditions of Experiment 3 of Shin & Nosofsky (1992) provides the data for this CIRP.
data(shin92)
A data frame with the following columns:
Experimental condition (category size). Takes values : 3, 10
Category membership of stimulus. Takes values: 1, 2
Stimulus code, as defined by Shin & Nosofsky (1992). Stimuli beginning 'RN' or 'URN' are the 'novel' stimuli. Stimuli beginning 'P' are prototypes. The remaining stimuli are the 'old' (training) stimuli.
Mean probability, across participants, of responding that the item belongs to category 2.
Wills et al. (2017) discuss the derivation of this CIRP, with Wills et
al. (n.d.) providing further details. In brief, the effect has been
independently replicated. Experiment 3 of Shin & Nosofsky (1992) was
selected due to the availability of a multi-dimensional scaling
solution for the stimuli, see shin92train
.
Experiment 3 of Shin & Nosofsky (1992) involved the classification of nine-vertex polygon stimuli drawn from two categories. Category size was manipulated between subjects (3 vs. 10 stimuli per category). Participants received eight blocks of training, and three test blocks.
The data are as shown in Table 10 of Shin & Nosofsky (1992). The data are mean response probabilities for each stimulus in the test phase, averaged across test blocks and participants.
Andy J. Wills [email protected]
Shin, H.J. & Nosofsky, R.M. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278-304.
Wills et al. (n.d.). Benchmarks for category learning. Manuscript in preparation.
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66, 79-115.
Runs a simulation of the shin92
CIRP using the
slpALCOVE
model implementation as an exemplar model and
shin92train
as the input representation.
shin92exalcove(params = NULL)
params |
A vector containing values for c, phi, la, and lw, in
that order, e.g. params = c(2.1, 0.6, 0.09, 0.9). See
|
An exemplar-based simulation using slpALCOVE
and
shin92train
. The co-ordinates for the radial-basis units
are derived from the test stimuli in shin92train
. The
output is the average of 100 simulated subjects.
The defaults for params
are the best fit of the model to the
shin92
CIRP. They were derived through minimization of
SSE using non-linear optimization from 16 different initial
states (using code not included in this archive).
The other parameters of slpALCOVE are set as follows: r = 2, q = 1,
initial alpha = 1 / (number of input dimensions), initial w = 0. These
values are conventions of modeling with ALCOVE, and should not be
considered as free parameters. They are set within the shin92exalcove
function, and hence can't be changed without re-writing the function.
This simulation was reported in Wills et al. (2017).
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in shin92
.
Andy Wills & Garret O'Connell
Shin, H.J. & Nosofsky, R.M. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278–304.
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66, 79-115.
Uses shin92exalcove
to find best-fitting parameters for
the ex-ALCOVE model for the shin92
CIRP.
shin92exalcove_opt(params = c(2, 1, 0.25, 0.75), recompute = FALSE, trace = 0)
params |
A vector containing the initial values for c, phi, la,
and lw, in that order. See |
recompute |
When set to TRUE, the function re-runs the optimization (which takes about 25 minutes on a 2.4 GHz processor). When set to FALSE, the function returns a stored copy of the results of the optimization (which is instantaneous). |
trace |
Sets the level of tracing information (i.e. information
about the progress of the optimization), as defined by the
|
This function is an archive of the optimization procedure used to
derive the best-fitting parameters for the shin92exalcove
simulation; see Spicer et al. (2017) for a tutorial introduction to
the concept of simulation archives.
Optimization used the L-BFGS-B method from the optim
function of the standard R stats
package. The objective
function was sum of squared errors. Please inspect the source code for
further details (e.g. type shin92exalcove_opt
).
This function was run 16 times from different starting points, using 8 threads on a Core i7 3.6 GHz processor. The default parameters of this function are those for the best fit from those 16 starting points. The 16 starting points were
pset <- rbind(
c(2,1,.25,.25),c(2,1,.25,.75),c(2,1,.75,.25),c(2,1,.75,.75),
c(2,3,.25,.25),c(2,3,.25,.05),c(2,3,.75,.25),c(2,3,.75,.75),
c(8,1,.25,.25),c(8,1,.25,.75),c(8,1,.75,.25),c(8,1,.75,.75),
c(8,3,.25,.25),c(8,3,.25,.75),c(8,3,.75,.25),c(8,3,.75,.75)
)
not all of which converged successfully.
A vector containing the best-fitting values for c, phi, la,
and lw, in that order. See slpALCOVE
for an explanation
of these parameters.
Andy Wills
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (2017). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
Tests whether a model output passes the ordinal adequacy criterion for
the shin92
CIRP.
shin92oat(dta, xtdo=FALSE)
dta |
Matrix containing model output. The matrix must have the
same format, row order, and column names, as that returned by
|
xtdo |
eXTenDed Output: Either |
This function implements the Wills et al. (2017) ordinal adequacy
test for the shin92
CIRP. Specifically, a model passes
this test if response accuracy is higher for novel items from the
size-10 condition than novel items from the size-3 condition.
Alternatively, by setting xtdo
to TRUE
, this function
returns the summary model predictions reported by Wills et al. (2017).
Where xtdo=FALSE
, this function returns TRUE if the ordinal
adequacy test is passed, and FALSE otherwise.
Where xtdo=TRUE
, this function returns a summary matrix. The rows
are the two category sizes, the columns are the three principal stimulus
types (old, prototype, new), and the values are predicted accuracy
scores.
Andy Wills and Garret O'Connell
Shin, H.J. & Nosofsky, R.M. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278–304.
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66, 79-115.
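A usage sketch chaining the archived simulation and the ordinal adequacy test; shin92exalcove with its default (best-fit) parameters may take a little while, as it averages 100 simulated subjects:
out <- shin92exalcove()
## Does the model pass the category-size test?
shin92oat(out)
## Summary predictions by category size and stimulus type
shin92oat(out, xtdo = TRUE)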
Runs a simulation of the shin92
CIRP using the
slpALCOVE
model implementation as a prototype model and
shin92train
as the input representation.
shin92protoalcove(params = NULL)
params |
A vector containing values for c, phi, la, and lw, in
that order, e.g. params = c(2.1, 0.6, 0.09, 0.9). See
|
A prototype-based simulation using slpALCOVE
and
shin92train
. The co-ordinates for the radial-basis units
for the two prototypes are derived from the arithmetic means of the
test stimuli in shin92train
. The output is the average of
100 simulated subjects.
The defaults for params
are the best fit of the model to the
shin92
CIRP. They were derived through minimization of
SSE using non-linear optimization from 16 different initial
states (using code not included in this archive).
The other parameters of slpALCOVE are set as follows: r = 2, q = 1,
initial alpha = 1 / (number of input dimensions), initial w = 0. These
values are conventions of modeling with ALCOVE, and should not be
considered as free parameters. They are set within the
shin92protoalcove function, and hence can't be changed without
re-writing the function.
This simulation was reported in Wills et al. (2017).
A matrix of predicted response probabilities, in the same order and
format as the observed data contained in shin92
.
Andy Wills & Garret O'Connell
Shin, H.J. & Nosofsky, R.M. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278–304.
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66, 79-115.
Uses shin92protoalcove
to find best-fitting parameters for
the proto-ALCOVE model for the shin92
CIRP.
shin92protoalcove_opt(params = c(2,1,.25,.75), recompute = FALSE, trace = 0)
params |
A vector containing the initial values for c, phi, la,
and lw, in that order. See |
recompute |
When set to TRUE, the function re-runs the optimization (which takes about 10 minutes on a 2.4 GHz processor). When set to FALSE, the function returns a stored copy of the results of the optimization (which is instantaneous). |
trace |
Sets the level of tracing information (i.e. information
about the progress of the optimization), as defined by the
|
This function is an archive of the optimization procedure used to
derive the best-fitting parameters for the shin92protoalcove
simulation; see Spicer et al. (2017) for a tutorial introduction to
the concept of simulation archives.
Optimization used the L-BFGS-B method from the optim
function of the standard R stats
package. The objective
function was sum of squared errors. Please inspect the source code for
further details (e.g. type shin92protoalcove_opt
).
This function was run 16 times from different starting points, using 8 threads on a Core i7 3.6 GHz processor. The default parameters of this function are those for the best fit from those 16 starting points. The 16 starting points were
pset <- rbind(
c(2,1,.25,.25),c(2,1,.25,.75),c(2,1,.75,.25),c(2,1,.75,.75),
c(2,3,.25,.25),c(2,3,.25,.05),c(2,3,.75,.25),c(2,3,.75,.75),
c(8,1,.25,.25),c(8,1,.25,.75),c(8,1,.75,.25),c(8,1,.75,.75),
c(8,3,.25,.25),c(8,3,.25,.75),c(8,3,.75,.25),c(8,3,.75,.75)
)
not all of which converged successfully.
A vector containing the best-fitting values for c, phi, la, and lw, in that order. See slpALCOVE for an explanation of these parameters.
Andy Wills
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (2017). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
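As a brief usage sketch, the stored result of the optimization can be retrieved without re-running it; re-running requires recompute = TRUE and takes around ten minutes.

library(catlearn)
## Return the stored best-fitting parameters (instantaneous)
best <- shin92protoalcove_opt(recompute = FALSE)
best   # c, phi, la, lw
## Re-running the optimization from one starting point (slow):
## shin92protoalcove_opt(params = c(2, 1, .25, .75), recompute = TRUE, trace = 1)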
Creates randomized training and transfer blocks for the shin92 CIRP, in a format suitable for the slpALCOVE model, and any other model that uses the same input representation format. The stimulus co-ordinates come from an MDS solution reported by Shin & Nosofsky (1992).
shin92train(condition = 'equal3', learn.blocks = 8, trans.blocks = 3, absval = -1, format = 'mds', subjs = 1, seed = 8416, missing = 'geo')
condition |
Experimental condition 'equal3', 'equal10', 'unequal3', or 'unequal10', as defined by Shin & Nosofsky (1992). |
learn.blocks |
Number of training blocks to generate. Omit this argument to get the same number of training blocks as the published study (8). |
trans.blocks |
Number of transfer blocks to generate. Omit this argument to get the same number of transfer blocks as the published study (3). |
absval |
Teaching value to be used where category is absent. |
format |
Specifies format used for input representation. Only one format is currently supported, so this option is provided solely to support future development. |
subjs |
Number of simulated subjects to be run. |
seed |
Sets the random seed |
missing |
If set to 'geo', output missing dimension flags (see below) |
A matrix is produced, with one row for each trial, and with the following columns:
ctrl
- Set to 1 (reset model) for trial 1, set to zero (normal
trial) for all other training trials, and set to 2 (freeze learning) for
all transfer trials.
cond
- 1 = equal3, 2 = equal10, 3 = unequal3, 4 = unequal10
phase
- 1 = training, 2 = transfer
blk
- block of trials
stim
- stimulus number; these correspond to the rows in Tables A3
and A4 of Shin & Nosofsky (1992)
x1 ... x6
- input representation. These are the co-ordinates of
an MDS solution for these stimuli (see Shin & Nosofsky, 1992, Tables A3
and A4). Note: Size 3 conditions have a four-dimensional MDS solution,
so the output is x1 ... x4
t1, t2
- teaching signal (1 = category present, absval = category
absent)
m1 ... m6
- Missing dimension flags (always set to zero in this
experiment, indicating all input dimensions are present on all
trials). Note: ranges from m1 to m4 for Size 3 conditions. Only produced
if missing = 'geo'
.
Although the trial ordering is random, a random seed is used, so multiple calls of this function with the same parameters should produce the same output. This is usually desirable for reproducibility and stability of non-linear optimization. To get a different order, use the seed argument to set a different seed.
This function was originally developed to support simulations reported in Wills et al. (2017).
R by C matrix, where each row is one trial, and the columns contain model input.
Andy Wills
Shin, H.J. & Nosofsky, R.M. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278-304.
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66.
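As a minimal usage sketch, the following generates the trial matrix for one simulated subject in the equal3 condition and inspects its first few rows (the argument values shown are just the documented defaults):

library(catlearn)
tr <- shin92train(condition = 'equal3', subjs = 1)
head(tr)   # ctrl, cond, phase, blk, stim, x1 ..., t1, t2, m1 ...
dim(tr)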
Kruschke's (1992) category learning model.
slpALCOVE(st, tr, dec = 'ER', humble = TRUE, attcon = FALSE, absval = -1, xtdo = FALSE)
st |
List of model parameters |
tr |
R-by-C matrix of training items |
dec |
String defining decision rule to be used |
humble |
Boolean specifying whether a humble or strict teacher is to be used |
attcon |
Boolean specifying whether attention is constrained |
absval |
Real number specifying teaching value for category absence |
xtdo |
Boolean specifying whether to write extended information to the console (see below). |
The coverage in this help file is relatively brief; Catlearn Research Group (2016) provides an introduction to the mathematics of the ALCOVE model, whilst a more extensive tutorial on using slpALCOVE can be found in Wills et al. (2016).
The function works as a stateful list processor. Specifically, it takes a matrix as an argument, where each row is one trial for the network, and the columns specify the input representation, teaching signals, and other control signals. It returns a matrix where each row is a trial, and the columns are the response probabilities at the output units. It also returns the final state of the network (attention and connection weights), hence its description as a 'stateful' list processor.
Argument st
must be a list containing the following items:
colskip
- skip the first N columns of the tr array, where N =
colskip. colskip should be set to the number of optional columns you
have added to matrix tr, PLUS ONE. So, if you have added no optional
columns, colskip = 1. This is because the first (non-optional) column
contains the control values, below.
c
- specificity constant (Kruschke, 1992, Eq. 1). Positive real
number. Scales psychological space.
r
- distance metric (Kruschke, 1992, Eq. 1). Set to 1
(city-block) or 2 (Euclidean).
q
- similarity gradient (Kruschke, 1992, Eq. 1). Set to 1
(exponential) or 2 (Gaussian).
phi
- decision constant. For decision rule ER
, it is
referred to as mapping constant phi, see Kruschke (1992, Eq. 3). For
decision rule BN
, it is referred to as the background noise
constant b, see Nosofsky et al. (1994, Eq. 3).
lw
- associative learning rate (Kruschke, 1992, Eq. 5) . Real
number between 0 and 1.
la
- attentional learning rate (Kruschke, 1992, Eq. 6). Real
number between 0 and 1.
h
- R by C matrix of hidden node locations in psychological
space, where R = number of input dimensions and C = number of hidden
nodes.
alpha
- vector of length N giving initial attention weights for
each input dimension, where N = number of input dimensions. If you are
not sure what to use here, set all values to 1.
w
- R by C matrix of initial associative strengths, where R =
number of output units and C = number of hidden units. If you are not
sure what to use here, set all values to zero.
Argument tr
must be a matrix, where each row is one trial
presented to the network. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- vector of control codes. Available codes are: 0 = normal
trial, 1 = reset network (i.e. set attention weights and associative
strengths back to their initial values as specified in h and w (see
below)), 2 = Freeze learning. Control codes are actioned before the
trial is processed.
opt1, opt2, ...
- optional columns, which may have any names
you wish, and you may have as many as you like, but they must be
placed after the ctrl column, and before the remaining columns (see
below). These optional columns are ignored by this function, but you
may wish to use them for readability. For example, you might include
columns for block number, trial number, and stimulus ID number. The
argument colskip (see above) must be set to the number of optional
columns plus 1.
x1, x2, ...
- input to the model, there must be one column for
each input unit. Each row is one trial.
t1, t2, ...
- teaching signal to model, there must be one
column for each output unit. Each row is one trial. If the stimulus is a
member of category X, then the teaching signal for output unit X must be
set to +1, and the teaching signal for all other output units must be
set to absval
.
m1, m2, ...
- missing dimension flags, there must be one column
for each input unit. Each row is one trial. Where m = 1, that input unit
does not contribute to the activation of the hidden units on that
trial. This permits modelling of stimuli where some dimensions are
missing on some trials (e.g. where modelling base-rate neglect,
Kruschke, 1992, p. 29–32). Where m = 0, that input unit contributes as
normal. If you are not sure what to use here, set to zero.
Argument dec
, if specified, must take one of the following
values:
ER
specifies an exponential ratio rule (Kruschke, 1992, Eq. 3).
BN
specifies a background noise ratio rule (Nosofsky et al.,
1994, Eq. 3). Any output activation lower than zero is set to zero
before entering into this rule.
Argument humble
specifies whether a humble or strict teacher is
to be used. The function of a humble teacher is specified in Kruschke
(1992, Eq. 4b). In this implementation, the value -1 in Equation 4b is
replaced by absval
.
Argument attcon
specifies whether attention should be constrained
or not. If you are not sure what to use here, set to FALSE. Some
implementations of ALCOVE (e.g. Nosofsky et al., 1994) constrain the sum
of the attentional weights to always be 1 (personal communication,
R. Nosofsky, June 2015). The implementation of attentional constraint in
slpALCOVE is the same as that used by Nosofsky et al. (1994), and
present as an option in the source code available from Kruschke's
website (Kruschke, 1991).
Argument xtdo
(eXTenDed Output), if set to TRUE, will output to
the console the following information on every trial: (1) trial number,
(2) attention weights at the end of that trial, (3) connection weights
at the end of that trial, one row for each output unit. This output can
be quite lengthy, so diverting the output to a file with the sink
command prior to running slpALCOVE
with extended output is
advised.
Returns a list containing three components: (1) matrix of response probabilities for each output unit on each trial, (2) attentional weights after final trial, (3) connection weights after final trial.
Andy Wills
Catlearn Research Group (2016). Description of ALCOVE. http://catlearn.r-forge.r-project.org/desc-alcove.pdf
Kruschke, J. (1991). ALCOVE.c. Retrieved 2015-07-20, page since removed, but archival copy here: https://web.archive.org/web/20150605210526/http://www.indiana.edu/~kruschke/articles/ALCOVE.c
Kruschke, J. (1992). ALCOVE: an exemplar-based connectionist model of category learning. Psychological Review, 99, 22-44
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
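To make the st and tr formats concrete, here is a minimal toy sketch (not an archived simulation): two binary input dimensions, four radial-basis units placed on the four possible stimuli, and two categories. All parameter values are arbitrary illustrative choices, not fits to any data set.

library(catlearn)

## Toy problem: category is determined by the value of x1.
st <- list(colskip = 1,
           c = 2, r = 1, q = 1, phi = 2,                 # arbitrary illustrative values
           lw = 0.1, la = 0.1,
           h = matrix(c(0,0, 0,1, 1,0, 1,1), nrow = 2),  # input dims x hidden units
           alpha = c(0.5, 0.5),                          # 1 / (number of input dimensions)
           w = matrix(0, nrow = 2, ncol = 4))            # output units x hidden units

## Columns: ctrl, x1, x2, t1, t2, m1, m2 (teaching: +1 = present, -1 = absval)
block <- matrix(c(1, 0, 0,  1, -1, 0, 0,
                  0, 0, 1,  1, -1, 0, 0,
                  0, 1, 0, -1,  1, 0, 0,
                  0, 1, 1, -1,  1, 0, 0),
                ncol = 7, byrow = TRUE)
tr <- block[rep(1:4, 10), ]                   # ten blocks of training
tr[, 1] <- c(1, rep(0, nrow(tr) - 1))         # reset the network on trial 1 only
colnames(tr) <- c('ctrl', 'x1', 'x2', 't1', 't2', 'm1', 'm2')

out <- slpALCOVE(st, tr)
out[[1]]   # response probabilities for each output unit on each trial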
A model often attributed to Bush & Mosteller (1951); more precisely, this is the separable error-term learning equation discussed by authors such as Mackintosh (1975) and Le Pelley (2004); see Note 1.
slpBM(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
R matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below) |
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model, such as the elemental representation of the training stimuli, and the presence or absence of an outcome. It returns the output activation on each trial (a.k.a. sum of associative strengths of cues present on that trial), as a vector. The slpBM function also returns the final state of the model - a vector of associative strengths between each stimulus and the outcome representation.
Argument st
must be a list containing the following items:
lr
- the learning rate (fixed for a given simulation), as denoted
by, for example, theta
in Equation 1 of Mackintosh (1975). If you
want different elements to differ in salience (different alpha values)
use the input activations (x1, x2, ..., see below) to represent
element-specific salience.
w
- a vector of initial associative strengths. If you are not
sure what to use here, set all values to zero.
colskip
- the number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns you have
added to the tr matrix, PLUS ONE. So, if you have added no optional
columns, colskip=1. This is because the first (non-optional) column
contains the control values (details below).
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are: 0 = normal
trial; 1 = reset model (i.e. set associative strengths (weights) back to
their initial values as specified in w (see above)); 2 = Freeze
learning. Control codes are actioned before the trial is processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the remaining
columns (see below). These optional columns are ignored by the
function, but you may wish to use them for readability. For example, you
might choose to include columns such as block number, trial number and
condition. The argument colskip (see above) must be set to the number of
optional columns plus one.
x1, x2, ...
- activation of any number of input elements. There
must be one column for each input element. Each row is one trial. In
simple applications, one element is used for each stimulus (e.g. a
simulation of blocking (Kamin, 1969), A+, AX+, would have two inputs,
one for A and one for X). In simple applications, all present elements
have an activation of 1 and all absent elements have an activation of
0. However, slpBM supports any real number for activations, e.g. one
might use values between 0 and 1 to represent differing cue saliences.
t
- Teaching signal (a.k.a. lambda). Traditionally, 1 is used to
represent the presence of the outcome, and 0 is used to represent the
absence of the outcome, although slpBM supports any real values for lambda.
Argument xtdo
(eXTenDed Output) - if set to TRUE, function will
return the associative strengths for the end of each trial (see Value).
Returns a list containing two components (if xtdo = FALSE) or three components (if xtdo = TRUE, xout is also returned):
st |
Vector of final associative strengths |
suma |
Vector of output activations for each trial |
xout |
Matrix of associative strengths at the end of each trial |
1. Bush & Mosteller's (1951) Equation 2 outputs response probability,
not associative strength. Also, it has two learning rate parameters,
a
and b
. At least to a first approximation, b
serves a similar function to beta-outcome-absent
in Rescorla &
Wagner (1972), and a-b
is similar to
beta-outcome-present
in that same model.
Lenard Dome, Stuart Spicer, Andy Wills
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58(5), 313-323.
Kamin, L.J. (1969). Predictability, surprise, attention and conditioning. In Campbell, B.A. & Church, R.M. (eds.), Punishment and Aversive Behaviour. New York: Appleton-Century-Crofts, 1969, pp.279-296.
Le Pelley, M.E. (2004). The role of associative history in models of associative learning: A selective review and a hybrid model, Quarterly Journal of Experimental Psychology, 57B, 193-243.
Mackintosh, N.J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement, Psychological Review, 82, 276-298.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Spicer, S., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (n.d.). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
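To make the st and tr formats concrete, here is a minimal sketch of the blocking design mentioned above (A+ training followed by AX+ training); the learning rate and trial numbers are arbitrary illustrative choices, not fits to any data set.

library(catlearn)

## Columns: ctrl, x1 (cue A), x2 (cue X), t (outcome)
stage1 <- matrix(rep(c(0, 1, 0, 1), 10), ncol = 4, byrow = TRUE)   # A+
stage2 <- matrix(rep(c(0, 1, 1, 1), 10), ncol = 4, byrow = TRUE)   # AX+
tr <- rbind(stage1, stage2)
tr[1, 1] <- 1                               # reset the model on trial 1
colnames(tr) <- c('ctrl', 'x1', 'x2', 't')

st <- list(lr = 0.1, w = c(0, 0), colskip = 1)
out <- slpBM(st, tr)
out$st     # final associative strengths for A and X

Because the error term is separable, cue X still acquires strength during the AX+ stage in this sketch, in contrast to summed-error models such as Rescorla-Wagner (see slpRW).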
COmpetition between Verbal and Implicit Systems model of category learning (Ashby et al. 1998), as described in Ashby et al. (2011). The current implementation supports two-category experiments, and uses only single-dimension, not-below-chance, rules in the Explicit system.
slpCOVIS(st, tr, crx = TRUE, respt = FALSE, rgive = TRUE, xtdo = FALSE)
st |
List of model parameters |
tr |
R-by-C matrix of training items |
crx |
Boolean. Explicit System. If set to TRUE, the current rule is included in the random selection of a rule to receive a weight increase from the Poisson distribution. If set to FALSE, the current rule is not included in this random selection. |
respt |
Set to FALSE for the behaviour described in Note 5; behaviour when TRUE is undocumented |
rgive |
Set to TRUE; FALSE is undocumented |
xtdo |
Set to FALSE; TRUE is undocumented |
The coverage in this help file is relatively brief; for a more extensive tutorial, see Inkster et al. (n.d.).
The function works as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row is one trial for the network, and the columns specify the input representation. It returns a List containing the predictions made by the model and the final state of the model, hence its description as a 'stateful' list processor.
Argument st
must be a list containing the following
information. Parameter names given in brackets in the descriptions
below follow the naming conventions of Ashby et al. (2011), and
Edmunds & Wills (2016). Equation numbers are from Ashby et al. (2011);
where there is no equation, the page number is given instead.
Explicit system variables:
envar
- (sigma^2_E) - p. 68 - Variance of the noise distribution
used to determine which response the explicit system makes on the
current trial. See Note 4, below.
decbound
- (C) - Eq. 1 - location of the decision bound on a
single dimension. In the current implementation of slpCOVIS, this
location is the same for all dimensions.
corcon
- (delta_c) - Eq. 2 - constant by which to increase
current rule saliency in the case of a correct response.
errcon
- (delta_e) - Eq. 3 - constant by which to decrease
current rule saliency in the case of an incorrect response.
perscon
- (gamma) - Eq. 4 - perseveration constant, i.e. value
to add to the salience of the current rule to obtain its rule weight.
lambda
- (lambda) - Eq. 5 - Mean of the Poisson
distribution. A value randomly sampled from the Poisson distribution
is added to a randomly-selected rule when calculating the weights for
new rule selection.
decsto
- (a) - Eq. 7 - decision stochasticity when using rule
weights to select the rule for the next trial. For Ashby et
al. (2011)'s implementation, a = 1. For other uses, see Edmunds &
Wills (2016).
Procedural system variables:
sconst
- (alpha) - Eq. 8 - scaling constant for cortical unit
activation. See Note 3, below.
invar
- (sigma^2_p) - Eq. 9 - Variance of the
normally-distributed noise used to calculate striatal unit activation.
dbase
- (D_base) - Eq. 10 - baseline dopamine level.
alphaw
- (alpha_w) - Eq. 10 - Learning rate parameter in force
when striatal activation is above the NMDA threshold, and dopamine is
above baseline.
betaw
- (beta_w) - Eq. 10 - Learning rate parameter in force
when striatal activation is above the NMDA threshold, and dopamine is
below baseline.
gammaw
- (gamma_w) - Eq. 10 - Learning rate parameter in force
when striatal activation is between the AMPA and NMDA thresholds.
nmda
- (theta_NMDA) - Eq. 10 - Activation threshold for
post-synaptic NMDA.
ampa
- (theta_AMPA) - Eq. 10 - Activation threshold for
post-synaptic AMPA. See Note 1, below.
wmax
- (w_max) - Eq. 10 - Intended upper weight limit for a
cortico-striatal link. See Note 2, below.
prep
- ( P_(n-1) ) - Eq. 12 - predicted reward value
immediately prior to first trial. If unsure, set to zero.
prer
- ( R_(n-1) ) - Eq. 12 - obtained reward value immediately
prior to first trial. If unsure, set to zero.
Competition / decision system variables:
emaxval
- p.77 - The maximum possible value of the Explicit
system's discriminant variable. For example, if the stimulus value
varies from zero to one, and C (see above) is 0.5, then the maximum
value is 1-0.5 = 0.5
etrust
- (theta_E) - Eq. 15 - trust in the explicit system
immediately prior to first trial. If unsure, set to .99.
itrust
- (theta_P) - p. 77 - trust in the procedural system
immediately prior to first trial. If unsure, set to .01. See also Note
7, below.
ocp
- (delta_OC) - Eq. 15 - constant used to increase trust in
the Explicit system after it suggests a response that turns out to be
correct.
oep
- (delta_OE) - Eq. 16 - constant used to decrease trust in
the Explicit system after it suggests a response that turns out to be
incorrect.
Initial state of model:
initrules
- vector of length stimdim
, representing the
initial salience of each single-dimensional rule in the Explicit
system.
crule
- a number indicating which rule is in use immediately
prior to the first trial (1 = dimension 1, 2 = dimension 2, etc). If
this is not meaningful in the context of your simulation, set it to
zero, and ensure ctrl = 1 in the first row of your training matrix
(see below). This will then randomly pick an initial rule.
initsy
- matrix of stimdim
rows and two columns -
contains the initial values for the cortico-striatal connection
strengths.
scups
- matrix of stimdim
columns and as many rows as
you wish to have cortical input units. Each row represents the
position of a cortical unit in N-dimensional stimulus space.
And finally, a couple of things slpCOVIS needs to interpret your tr matrix (see below):
stimdim
- number of stimulus dimensions in the input
representation.
colskip
- skip the first N columns of the tr array, where N =
colskip. colskip should be set to the number of optional columns you
have added to matrix tr, PLUS ONE. So, if you have added no optional
columns, colskip = 1. This is because the first (non-optional) column
contains the control values, see below.
Argument tr
must be a matrix, where each row is one trial
presented to the network. Trials are always presented to the model in
the order specified. The columns must be as described below, in the
order described below:
ctrl
- vector of control codes. Available codes are: 0 = normal
trial, 1 = reset network (i.e. set back to the state defined in list
st
and randomly select an initial rule for the Explicit System
using Eq. 7), 2 = Freeze learning. Control codes are actioned before the
trial is processed.
opt1, opt2, ...
- optional columns, which may have any names
you wish, and you may have as many as you like, but they must be
placed after the ctrl column, and before the remaining columns (see
below). These optional columns are ignored by this function, but you
may wish to use them for readability. For example, you might include
columns for block number, trial number, and stimulus ID number. The
argument colskip (see above) must be set to the number of optional
columns plus 1.
x1, x2, ...
- stimulus input to the model; there must be one
column for each stimulus dimension.
t1
- teaching signal to model. If the correct response is
Category 1, t = 1. If the correct response is Category 2, t =
-1. Experiments with something other than two categories are not
supported in the current implementation.
optend1, optend2, ...
- optional columns, which may have any
names you wish, and you may have as many as you like, but they must be
placed after the t1 column. These optional columns are ignored by this
function, but may help with cross-compatibility with other model
implementations. For example, the additional 't' and 'm' columns of
input representations generated for slpALCOVE will be safely ignored
by slpCOVIS.
Returns a List containing eight components:
foutmat |
A two-column matrix, representing the model's response on each trial. For any given trial, [1,0] indicates a Category 1 response; [0,1] indicates a Category 2 response. Responses are reported in this manner to facilitate cross-compatibility with models that produce response probabilities on each trial. |
frules |
Explicit system - rule saliences after final trial |
fsystr |
Procedural system - cortico-striatal synaptic strengths after final trial |
fetrust |
Decision system - trust in explicit system after final trial |
fitrust |
Decision system - trust in procedural system after final trial |
frule |
Explicit system - rule used by explicit system on final trial |
fprep |
Implicit system - predicted reward value on final trial |
fprer |
Implicit system - obtained reward value on final trial |
1. Ashby et al. (2011) state (p. 74) that the intended operation of COVIS is theta_NMDA > theta_AMPA, but the values they report are theta_NMDA = .0022, theta_AMPA = .01.
2. Ashby et al. (2011) did not specify a value for w_max; Edmunds & Wills (2016) assumed the intended value was 1.
3. Ashby et al. (2011) do not use Eq. 8 in their simulation, they manually set sensory cortex activation to 1 for the presented stimulus and 0 for all the others (p. 78). They thus do not have a value for alpha. Edmunds & Wills (2016) set alpha to 0.14, which produces similar behaviour for 0,1 coded stimulus dimensions, without having to manually set the activations.
4. In Ashby et al. (2011) and Edmunds & Wills (2016), sigma^2_E is set to zero. In this implementation of slpCOVIS, positive values should also work but have not been extensively tested.
5. In the descriptions provided by Ashby et al. (2011, p. 69 & p. 75), there is some ambiguity about the meaning of the term 'response' - does this mean the response of a system (e.g. the Explicit system), or the overall response (i.e. the output of the decision system). In the current implementation, the response of the Explicit System is compared to the feedback to determine whether the Explicit System was correct or incorrect, and the response of the Procedural System is compared to the feedback to determine whether the Procedural System was correct or incorrect.
6. It seems that in Ashby et al.'s (2011) simulations, each dimension generates only one single-dimension rule for a two-category problem, rather than two as one might expect (e.g. small = A, large = B, but also large = A, small = B). Rules that would produce below-chance responding are excluded from the rule set.
7. Ashby et al. (2011) state that theta_E + theta_P = 1. However, slpCOVIS does not perform this check on the initial state, so it is important to check this manually.
Angus Inkster, Andy Wills, Charlotte Edmunds
Ashby, F.G., Alfonso-Reese, L.A., Turken, A.U. & Waldron, E.M. (1998). A neuropsychological theory of multiple systems in category learning. Psychological Review, 105, 442-481.
Ashby, F.G., Paul, E.J., & Maddox, W.T. (2011). COVIS. In Pothos, E.M. & Wills, A.J. (2011). Formal approaches in categorization. Cambridge, UK: Cambridge University Press.
Edmunds, C.E.R., & Wills, A.J. (2016). Modeling category learning using a dual-system approach: A simulation of Shepard, Hovland and Jenkins (1961) by COVIS. In A. Papfragou, D. Grodner, D. Mirman, & J.C. Trueswell (Eds.). Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 69-74). Austin, TX: Cognitive Science Society.
Inkster, A.B., Edmunds, C.E.R., & Wills, A.J. (n.d.). A distributed-collaboration resource for dual-process modeling in category learning. Manuscript in preparation.
Pothos, E.M. & Wills, A.J. (2011). Formal approaches in categorization. Cambridge: Cambridge University Press.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
Stewart and Morin's (2007) extension of Nosofsky's (1984, 2011) Exemplar-based Generalized Context Model. The implementation also contains O'Bryan et al.'s (2018) version of the Similarity-Dissimilarity Generalized Context Model; see Note 1.
slpDGCM(st, test, dec = "BIAS", exemplar_mute = FALSE, exemplar_decay = TRUE)
st |
List of model parameters |
test |
Test matrix. |
dec |
Decision mechanism. Either 'BIAS' (the default) or 'NOISE'; see the beta, gamma, and base items under Details. |
exemplar_mute |
If |
exemplar_decay |
If TRUE (the default), exemplar memory strengths decay over test trials, as specified in Equation 4 of Stewart and Morin (2007). |
This implementation houses the two versions of the DGCM. In order to use the instantiation of the DGCM described in O'Bryan et al. (2018), set exemplar_decay = FALSE and exemplar_mute = TRUE. The default settings of the function will run the model that corresponds to Stewart and Morin (2007).
The function works as a stateful list processor. Specifically, it takes a data frame as an argument, where each row is one trial for the model, and the columns specify the input representation, teaching signals, and other control signals. It returns two matrices containing, for each trial, response probabilities and the accumulated evidence for each category. It also returns the final state of the network (e.g. memory decay), hence its description as a 'stateful' list processor, see Note 1.
This implementation assumes that, when exemplar_decay = TRUE, memory strengths for all exemplars are equal to each other at the beginning of the test phase. In future releases, we plan to implement a feature that allows initial memory strengths to be treated as freely varying parameters.
st
must be a list containing the following items:
attentional_weights
- vector of attentional weights, where sum of
all elements equal to 1.
c
- generalization constant.
r
- The Minkowski metric parameter r gives a city block
metric when r = 1 (used for separable-dimension stimuli) and
a Euclidean metric when r = 2 (used for integral-dimension
stimuli).
s
- similarity and dissimilarity weighting. If 0, evidence for a
category will be purely based on the dissimilarity between
current input vector and all exemplars from the other
categories. If it is 1, evidence for a given category will be
solely based on similarity to its own exemplars.
t
- exemplar weighting. If exemplar_decay = FALSE, it is a
vector of exemplar-specific memory strengths. If
exemplar_decay = TRUE (default), it is a vector of
exemplar-specific memory strengths that will update according
to the function as specified in Equation 4 in Stewart and
Morin (2007).
beta
- category bias vector. Only used when dec
set to
BIAS, otherwise ignored. Currently, there is no restriction
in place on what values are allowed in this implementation,
but Stewart and Morin (2007) specify that elements of
beta
should sum to 1.
base
- a vector of baseline level of similarity. This parameter
will control how much noise will spread over all categories
in the background-noise decision rule. It is only used if
dec
is set to NOISE.
gamma
- decision scaling constant. Only used when dec
is
set to BIAS.
theta
- decay rate. If exemplar_decay = FALSE
, theta is
ignored.
colskip
- the number of optional columns to skip in test plus
one. If you have no optional columns, set it to one.
outcomes
- the number of categories.
exemplars
- a matrix of exemplars and their corresponding
category indicated by a single integer.
test
must be a data.matrix
with the following columns:
opt1, opt2, ...
- any number of optional columns, the names of
which can be chosen by the user. These optional columns are ignored
by the slpDGCM function, but you may wish to use them for
readability.
x1, x2, x3, ...
- input to the model, there must be one column
for each input unit. Each row is one trial. DGCM uses a nominal
stimulus representation, which means that features are coded as
either 0 (absent) or 1 (present).
If exemplar_decay = FALSE
, returns a list of the following
matrices:
v
A matrix of evidence accumulated for each category (columns)
on each trial (rows) as output by Equation 3 in Stewart and Morin
(2007).
p
A matrix of response probabilities. Category responses
(columns) for each trial (rows).
If exemplar_decay = TRUE
, the function also returns memory
decay for each trial, decay
.
1. O'Bryan et al.'s (2018) version of the DGCM is not a stateful list processor, but we decided to include it in the same implementation. In fact, Stewart and Morin's (2007) version only qualifies as a stateful list processor because of its memory decay function.
Lenard Dome, Andy Wills
Nosofsky, R. M. (1984). Choice, similarity, and the context theory of classification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 104.
O'Bryan, S. R., et al. (2018). Model-based fMRI reveals dissimilarity processes underlying base rate neglect. eLife, 7, e36395.
Stewart, N., & Morin, C. (2007). Dissimilarity is used as evidence of category membership in multidimensional perceptual categorization: A test of the similarity–dissimilarity generalized context model. Quarterly Journal of Experimental Psychology, 60, 1337-1346.
## Replicate O'Bryan et al. (2018)

# Exemplars
stim = matrix(c(
  1,1,0,0,0,0, 1,
  1,0,1,0,0,0, 2,
  0,0,0,1,1,0, 3,
  0,0,0,1,0,1, 4), ncol = 7, byrow = TRUE)

# Transfer/test stimuli
# This is a row for each unique transfer stimulus
tr = matrix(c(
  1, 1, 0, 0, 0, 0, #0,1,2
  1, 0, 1, 0, 0, 0, #3
  0, 0, 0, 1, 1, 0, #4,5,6
  0, 0, 0, 1, 0, 1, #7
  1, 0, 0, 0, 0, 0, #8
  0, 0, 0, 1, 0, 0, #9
  0, 1, 0, 0, 0, 0, #10
  0, 0, 1, 0, 0, 0, #11
  0, 0, 0, 0, 1, 0, #12
  0, 0, 0, 0, 0, 1, #13
  0, 1, 1, 0, 0, 0, #14, 15
  0, 0, 0, 0, 1, 1, #16, 17
  1, 0, 0, 0, 1, 0, #18
  1, 0, 0, 0, 0, 1, #19
  0, 1, 0, 1, 0, 0, #20
  0, 0, 1, 1, 0, 0, #21
  0, 0, 1, 0, 1, 0, #22, 23
  0, 1, 0, 0, 0, 1  #24, 25
  ), ncol = 6, byrow = TRUE)

# parameters from paper
aweights = c(0.27692188, 0.66524089, 0.88723335, 0.16967400, 0.71206208, 0.87939732)

st <- list(attentional_weights = aweights/sum(abs(aweights)),
           c = 9.04906080, s = 0.94614863, b = 0.02250668,
           t = c(3, 1, 3, 1), beta = c(1, 1, 1, 1)/4,
           gamma = 1, theta = 0.4, r = 1,
           colskip = 1, outcomes = 4, exemplars = stim)

slpDGCM(st, tr, exemplar_decay = FALSE, exemplar_mute = TRUE, dec = "NOISE")
DIVergent Autoencoder (Kurtz, 2007; 2015) artificial neural network category learning model
slpDIVA(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
R-by-C matrix of training items |
xtdo |
When set to TRUE, produce extended output |
This function works as a stateful list processor (Wills et al., 2017). Specifically, it takes a matrix as an argument, where each row is one trial for the network, and the columns specify the input representation, teaching signals, and other control signals. It returns a matrix where each row is a trial, and the columns are the response probabilities for each category. It also returns the final state of the network (connection weights and other parameters), hence its description as a 'stateful' list processor.
Argument st
must be a list containing the following items:
st
must contain the following principal model parameters:
learning_rate
- Learning rate for weight updates through
backpropagation. The suggested learning rate default is
learning_rate = 0.15
beta_val
- Scalar value for the Beta parameter. beta_val
controls the degree of feature focusing (not unlike attention) that
the model uses to make classification decisions (see: Conaway & Kurtz,
2014; Kurtz, 2015). beta_val = 0
turns feature focusing off.
phi
- Scalar value for the phi parameter. phi
is a
real-valued mapping constant, see Kruschke (1992, Eq. 3).
st
must also contain the following information about network
architecture:
num_feats
- Number of input features.
num_hids
- Number of hidden units. A rough rule of thumb for
this hyperparameter is to start with num_hids = 2 and add
additional units if the model fails to converge.
num_cats
- Number of categories.
continuous
- A Boolean value to indicate if the model should
work in continuous input or binary input mode. Set continuous =
TRUE
when the inputs are continuous.
st
must also contain the following information about the
initial state of the network:
in_wts
- A matrix of initial input-to-hidden weights with
num_feats + 1
rows and num_hids
columns. Can be set to
NULL
when the first line of the tr
matrix includes
control code 1, ctrl = 1
.
out_wts
- An array of initial hidden-to-output weights with
num_hids + 1 rows, num_feats columns, and with the third
dimension being num_cats in extent. Can be set to NULL
when the first line of the tr
matrix includes control code 1,
ctrl = 1
.
st
must also contain the following information so that it can
reset these weights to random values when ctrl = 1 (see below):
wts_range
- A scalar value for the range of the
randomly-generated weights. The suggested weight range default is
wts_range = 1
wts_center
- A scalar value for the center of the
randomly-generated weights. This is commonly set to wts_center =
0
st
must also contain the following parameters that describe
your tr
array:
colskip
- Skip the first N columns of the tr array, where
N = colskip
. colskip
should be set to the number of
optional columns you have added to matrix tr
, PLUS ONE. So, if
you have added no optional columns, colskip = 1
. This is
because the first (non-optional) column contains the control values,
below.
Argument tr
must be a matrix, where each row is one trial
presented to the network. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- column of control codes. Available codes are: 0 = normal
learning trial, 1 = reset network (i.e. initialize a new set of
weights following the st
parameters), 2 = Freeze
learning. Control codes are actioned before the trial is processed.
opt1, opt2, ...
- optional columns, which may have any names
you wish, and you may have as many as you like, but they must be
placed after the ctrl
column, and before the remaining columns
(see below). These optional columns are ignored by this function, but
you may wish to use them for readability. For example, you might
include columns for block number, trial number, and stimulus ID
number. The argument colskip
(see above) must be set to the
number of optional columns plus 1.
x1, x2, ...
- input to the model, there must be one column for
each input unit. Each row is one trial. Dichotomous inputs should be
in the format -1, 1
. Continuous inputs should be scaled to the
range of -1, 1
. As the model's learning objective is to
accurately reconstruct the inputs, the input to the model is also the
teaching signal. For testing under conditions of missing information,
input features can be set to 0 to negate the contribution of the
feature(s) for the classification decision of that trial.
t1, t2, ...
- Category membership of the current
stimulus. There must be one column for each category. Each row is one
trial. If the stimulus is a member of category X, then the value in
the category X column must be set to +1
, and the values for all
other category columns must be set to -1
.
Returns a list containing two components: (1) matrix of response
probabilities for each category on each trial, (2) an st
list
object that contains the model's final state. A weight initialization
history is also available when the extended output parameter is set
xtdo = TRUE
in the slpDIVA
call.
A faster (Rcpp) implementation of slpDIVA is planned for a future release of catlearn.
Garrett Honke, Nolan B. Conaway, Andy Wills
Conaway, N. B., & Kurtz, K. J. (2014). Now you know it, now you don't: Asking the right question about category knowledge. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the Thirty-Sixth Annual Conference of the Cognitive Science Society (pp. 2062-2067). Austin, TX: Cognitive Science Society.
Kruschke, J. (1992). ALCOVE: an exemplar-based connectionist model of category learning. Psychological Review, 99, 22-44
Kurtz, K.J. (2007). The divergent autoencoder (DIVA) model of category learning. Psychonomic Bulletin & Review, 14, 560-576.
Kurtz, K. J. (2015). Human Category Learning: Toward a Broader Explanatory Account. Psychology of Learning and Motivation, 63.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation, 66, 79-115.
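As a rough sketch of the input format, the following trains the model on a toy problem in which category membership is given by the first of three binary features. All hyperparameter values are arbitrary illustrative choices, and the weights are generated by the ctrl = 1 reset on the first trial.

library(catlearn)

## Toy problem: three binary features (coded -1/1); category = value of x1.
feats <- as.matrix(expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1)))
t1 <- ifelse(feats[, 'x1'] == 1, 1, -1)     # category membership coded +1 / -1
t2 <- -t1
block <- cbind(ctrl = 0, feats, t1 = t1, t2 = t2)

tr <- block[rep(1:8, 5), ]                  # five blocks of training
tr[1, 'ctrl'] <- 1                          # initialize fresh weights on trial 1

st <- list(learning_rate = 0.15, beta_val = 0, phi = 1,
           num_feats = 3, num_hids = 2, num_cats = 2,
           continuous = FALSE,
           in_wts = NULL, out_wts = NULL,   # generated at the ctrl = 1 reset
           wts_range = 1, wts_center = 0,
           colskip = 1)

out <- slpDIVA(st, tr)
out[[1]]   # response probabilities for each category on each trial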
EXemplar-based attention to distinctive InpuT model (Kruschke, 2001)
slpEXIT(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
R-by-C matrix of training items |
xtdo |
if |
The contents of this help file are relatively brief; a more extensive tutorial on using slpEXIT can be found in Spicer et al. (n.d.).
The function works as a stateful list processor. Specifically, it takes a data frame as an argument, where each row is one trial for the network, and the columns specify the input representation, teaching signals, and other control signals. It returns a matrix where each row is a trial, and the columns are the response probabilities at the output units. It also returns the final state of the network (cue -> exemplar, and cue -> outcome weights), hence its description as a 'stateful' list processor.
References to Equations refer to the equation numbers used in the Appendix of Kruschke (2001).
Argument tr
must be a data frame, where each row is one trial
presented to the network, in the order of their occurrence.
tr
requires the following columns:
x1, x2, ...
- columns for each cue (1
= cue present,
0
= cue absent). These columns have to start with x1
ascending with features ..., x2, x3, ...
at adjacent
columns. See Notes 1, 2.
t1, t2, ...
- columns for the teaching values indicating the
category feedback on the current trial. Each category needs a single
teaching signal in a dummy coded fashion, e.g., if the first category
is the correct category for that trial, then t1
is set to
1
, else it is set to 0
. These columns have to start with
t1
ascending with categories ..., t2, t3, ...
at
adjacent columns.
ctrl
- vector of control codes. Available codes are: 0 = normal
trial, 1 = reset network (i.e. reset connection weights to the values
specified in st
). 2 = freeze learning. Control codes are
actioned before the trial is processed.
opt1, opt2, ...
- optional columns, which may have any name you
wish. These optional columns are ignored by this function, but you may
wish to use them for readability. For example, you might include
columns for block number, trial number, and stimulus ID.
Argument st
must be a list containing the following items:
nFeat
- integer indicating the total number of possible
stimulus features, i.e. the number of x1, x2, ...
columns in
tr
.
nCat
- integer indicating the total number of possible
categories, i.e. the number of t1, t2, ...
columns in
tr
.
phi
- response scaling constant - Equation (2)
c
- specificity parameter. Defines the narrowness of
receptive field in exemplar node activation - Equation (3).
P
- Attentional normalization power (attentional capacity) -
Equation (5). If P
equals 1
then the attention weights
will satisfy the constraint that attention strength for currently
present features will sum to one. The sum of attention strengths for
present features grows as a function of P
.
l_gain
- attentional shift rate - Equation (7)
l_weight
- learning rate for feature to category associations.
- Equation (8)
l_ex
- learning rate for exemplar_node to gain_node associations
- Equation (9)
iterations
- number of iterations of shifting attention on each
trial (see Kruschke, 2001, p. 1400). If you're not sure what to use
here, set it to 10.
sigma
- Vector of cue saliences, one for each cue. If you're
not sure what to put here, use 1 for all cues except the bias cue. For
the bias cue, use some value between 0 and 1.
w_in_out
- matrix with nFeat
columns and nCat
rows,
defining the input-to-category association weights, i.e. how much each
feature is associated to a category (see Equation 1). The nFeat
columns follow the same order as x1, x2, ...
in tr
,
and likewise, the nCat
rows follow the order of
t1, t2, ...
.
exemplars
- matrix with nFeat
columns and n rows, where
n is the number of exemplars, such that each row represents a single
exemplar in memory, and their corresponding feature values.
The nFeat
columns follow the same order as x1, x2, ...
in tr
. The n-rows follow the same order as in the
w_exemplars
matrix defined below. See Note 3.
w_exemplars
- matrix which is structurally equivalent to
exemplars
. However, the matrix represents the associative weight
from the exemplar nodes to the gain nodes, as given in Equation 4.
The nFeat
columns follow the same order as
x1, x2, ...
in tr
. The n-rows follow the same order
as in the exemplars
matrix.
Returns a list containing three components (if xtdo = FALSE) or four
components (if xtdo = TRUE, g
is also returned):
p |
Matrix of response probabilities for each outcome on each trial |
w_in_out |
Matrix of final cue -> outcome associative strengths |
w_exemplars |
Matrix of final cue -> exemplar associative strengths |
g |
Vector of gains at the end of the final trial |
1. Code optimization in slpEXIT means it's essential that every cue is
either set to 1 or to 0. If you use other values, it won't work
properly. If you wish to represent cues of unequal salience, use
sigma
.
2. EXIT simulations normally include a 'bias' cue, i.e. a cue that is
present on all trials. You will need to explicitly include this in
your input representation in tr
. For an example, see the output
of krus96train
.
3. The bias cue should be included in these exemplar representations,
i.e. they should be the same as the representation of the stimuli in
tr
. For an example, see the output of krus96train
.
René Schlegelmilch, Andy Wills, Angus Inkster
Kruschke, J. K. (1996). Base rates in category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 3-26.
Kruschke, J. K. (2001). The inverse base rate effect is not explained by eliminative inference. Journal of Experimental Psychology: Learning, Memory & Cognition, 27, 1385-1400.
Spicer, S.G., Schlegelmilch, R., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (n.d.). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
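The output of krus96train provides a full-scale input representation. As a rough, untested sketch of the format only, the following sets up two cues plus a bias cue, two categories, and two exemplars; all parameter values are arbitrary illustrative choices.

library(catlearn)

nFeat <- 3   # two cues (x1, x2) plus a bias cue (x3) present on every trial
nCat  <- 2

exemplars <- matrix(c(1, 0, 1,    # stimulus A: cue 1 + bias
                      0, 1, 1),   # stimulus B: cue 2 + bias
                    ncol = nFeat, byrow = TRUE)

st <- list(nFeat = nFeat, nCat = nCat,
           phi = 2, c = 1, P = 1,
           l_gain = 1, l_weight = 0.1, l_ex = 0.05,
           iterations = 10,
           sigma = c(1, 1, 0.1),           # bias cue given lower salience
           w_in_out = matrix(0, nrow = nCat, ncol = nFeat),
           exemplars = exemplars,
           w_exemplars = matrix(0, nrow = nrow(exemplars), ncol = nFeat))

## Ten trials alternating A -> category 1 and B -> category 2
tr <- data.frame(ctrl = c(1, rep(0, 9)),
                 x1 = rep(c(1, 0), 5), x2 = rep(c(0, 1), 5), x3 = 1,
                 t1 = rep(c(1, 0), 5), t2 = rep(c(0, 1), 5))

out <- slpEXIT(st, tr)
out$p   # response probabilities for each category on each trial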
Gluck and Bower (1988) adaptive least-mean-square (LMS) network
slpLMSnet(st, tr, xtdo = FALSE, dec = "logistic")
st |
List of model parameters |
tr |
Numerical matrix of training items, use |
xtdo |
Boolean specifying whether to include extended information in the output (see below) |
dec |
Specify what response rule to use. |
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix as an argument. Each row represents a single trial. Each column represents different types of information required by the implementation of the model, such as the elemental representation of stimuli, teaching signals, and other variables specifying the model's behaviour (e.g. freezing learning).
Argument st
must be a list containing the following items:
beta
- the learning rate (fixed for a given simulation) for the
LMS learning rule. The upper bound of this parameter is not
specified; for illustration, the example below uses beta = 0.025.
theta
- is a positive scaling constant. When theta rises, the
logistic choice function will become less linear. When theta is
high, the logistic function will approximate the behaviour of a step
function.
bias
- is a bias parameter. It is the value of the output
activation that results in an output probability rating of P =
0.5. For example, if you wish an output activation of 0.4 to produce a
rated probability of 0.5, set beta to 0.4. If you are not sure what to
use here, set it to 0. The bias parameter is not part of the original
Gluck and Bower (1988) LMS network, see Note 1.
w
- is a matrix of initial connection weights, where each row is
an outcome, and each column is a feature or cue. If you are not sure
what to use here, set all values to 0.
outcomes
- is the number of possible categories or outcomes.
colskip
- the number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns
PLUS ONE. So, if you have added no extra columns, colskip = 1.
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are: 0 = normal
trial; 1 = reset model (i.e. set associative strengths (weights)
back to their initial values as specified in w (see above)); 2 =
Freeze learning. Control codes are actioned before the trial is
processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the
remaining columns (see below). These optional columns are ignored by
the slpLMSnet function, but you may wish to use them for
readability. For example, you might choose to include columns such
as block number, trial number and condition. The argument colskip
(see above) must be set to the number of optional columns plus one.
x1, x2, ...
- activation of input nodes of corresponding features.
Feature patterns are usually represented as a bit array. Each element in the
bit array encodes the activations of the input nodes given the presence or
absence of the corresponding features. These activations can take on either
1 or 0, present and absent features respectively. For example, Medin and
Edelson's (1988) inverse base-rate effect with stimuli AB and AC can be
represented as [1 1 0] and [1 0 1] respectively. In a more unconventional
scenario, you can set activation to vary between present 1 and absent -1,
see Note 2. slpLMSnet can also support any positive or negative real number
for activations, e.g. one might use values between 0 and 1 to represent the
salience of the features.
d1, d2, ...
- teaching input signals indicating the category feedback
on the current trial. It is a bit array, similar to the activations of
input nodes. If there are two categories and the stimulus on the current
trial belongs to the first, then this would be represented in tr
as
[1 0], on edge cases see Note 3. The length of this array must be provided
via outcomes
in st
.
Returns a list with the following items if xtdo = FALSE
:
p |
A matrix with either the probability rating for each
outcome on each trial if |
nodeActivation |
Output node activations on each trial, as output by Equation 3 in Gluck and Bower (1988). |
connectionWeightMatrix |
A connection weight matrix, W, where
each row represents the corresponding element in the teaching
signals array in |
If xtdo = TRUE
, the following item also returned:
squaredDifferences |
The least mean squared differences between desired and actual activations of output nodes on each trial (Eq. 4 in Gluck and Bower, 1988). This metric is an indicator of the network's performance, which is measured by its accuracy. |
1. The bias
parameter is not part of the original Gluck and
Bower (1988) model. bias
in the current implementation helps
comparisons between simulations using the act2probrat
logistic choice function. Set bias to 0 for operation as specified
in Gluck & Bower (1988). Also note that, where there is more than
one output node, the same bias value is subtracted from the output
of each node. This form of decision mechanism is not present in the
literature as far as we are aware, although using a negative bias
value would, in multi-outcome cases, approximate a 'background
noise' decision rule, as used in, for example, Nosofsky et
al. (1994).
2. slpLMSnet can support both positive and negative real numbers as input node activations. For example, one might wish to follow Markman's (1989) suggestion that the absence of a feature element is encoded as -1 instead of 0.
3. slpLMSnet can process a bit array of teaching signals, where the model is told that the stimulus belongs to more than one category. slpLMSnet uses matrix operations to update weights, so it can encode and update multiple teaching signals on the same trial.
Lenard Dome, Andy Wills
Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 117, 227-247.
Markman, A. B. (1989). LMS rules and the inverse base-rate effect: Comment on Gluck and Bower (1988). Journal of Experimental Psychology: General, 118, 417-421.
Medin, D. L., & Edelson, S. M. (1988). Problem structure and the use of base-rate information from experience. Journal of Experimental Psychology: General, 117, 68-85.
Nosofsky, R.M., Gluck, M.A., Palmeri, T.J., McKinley, S.C. and Glauthier, P. (1994). Comparing models of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961). Memory and Cognition, 22, 352-369.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
## load catlearn
library(catlearn)

## create st with initial state
st <- list(beta = 0.025,    # learning rate
           theta = 1,       # decision scaling parameter
           bias = 0,        # decision bias parameter
           # initial weight matrix,
           # row = number of categories,
           # col = number of cues
           w = matrix(rep(0, 6*4), nrow = 4, ncol = 6),
           outcomes = 4,    # number of possible outcomes
           colskip = 3)

## create inverse base-rate effect tr for 1 subject and without bias cue
tr <- krus96train(subjs = 1, ctxt = FALSE)

# run simulation and store output
out <- slpLMSnet(st, data.matrix(tr))
out$connectionWeightMatrix
Mackintosh's (1975) attentional learning model, as implemented by Le Pelley et al. (2016).
slpMack75(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
Matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below) |
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model, such as the elemental representation of the training stimuli, and the presence or absence of an outcome. It returns the output activation on each trial (a.k.a. sum of associative strengths of cues present on that trial), as a vector. The slpMack75 function also returns the final state of the model - a vector of associative and attentional strengths between each stimulus and the outcome representation.
Argument st
must be a list containing the following items:
lr
- the associative learning rate (fixed for a given
simulation), as denoted by theta
in Equation 1 of Mackintosh
(1975).
alr
- the attentional learning rate parameter. It can be set without
limit (see alpha below), but we recommend setting this parameter to somewhere
between 0.1 and 1.
w
- a vector of initial associative strengths. If you are not
sure what to use here, set all values to zero.
alpha
- a vector of initial attentional strengths. If the
updated value is above 1 or below 0.1, it is capped at 1 or 0.1,
respectively.
colskip
- the number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns you have
added to the tr matrix, PLUS ONE. So, if you have added no optional
columns, colskip=1. This is because the first (non-optional) column
contains the control values (details below).
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are:
0 = normal trial; 1 = reset model (i.e. set associative strengths back to their initial values as specified in w); 2 = freeze learning; 3 = reset associative weights to initial state, but keep attentional strengths in alpha; 4 = reset attentional strengths to initial state, but keep associative weights.
Control codes are actioned before the trial is processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the remaining
columns (see below). These optional columns are ignored by the
function, but you may wish to use them for readability. For example, you
might choose to include columns such as block number, trial number and
condition. The argument colskip (see above) must be set to the number of
optional columns plus one.
x1, x2, ...
- activation of any number of input elements. There
must be one column for each input element. Each row is one trial. In
simple applications, one element is used for each stimulus (e.g. a
simulation of blocking (Kamin, 1969), A+, AX+, would have two inputs,
one for A and one for X). In simple applications, all present elements
have an activation of 1 and all absent elements have an activation of
0. However, slpMack75 supports any real number for activations, e.g. one
might use values between 0 and 1 to represent differing cue saliences.
t
- Teaching signal (a.k.a. lambda). Traditionally, 1 is used to
represent the presence of the outcome, and 0 is used to represent the
absence of the outcome, although slpMack75 supports any real values for lambda.
If you are planning to use multiple outcomes, see Note 2.
Argument xtdo
(eXTenDed Output) - if set to TRUE, function will
additionally return trial-level data including attentional strengths and
the updated associative strengths after each trial (see Value).
Returns a list containing three components (if xtdo = FALSE) or five components (if xtdo = TRUE, xoutw and xouta are also returned):
suma |
Vector of summed associative strength for each trial. |
w |
Vector of final associative strengths. |
alpha |
Vector of final attentional weights. |
xoutw |
Matrix of trial-level data of the associative strengths at the end of the trial, after each has been updated. |
xouta |
Matrix of trial-level data of the attentional strengths at the end of the trial, after each has been updated. |
1. Mackintosh (1975) did not formalise how to update the cues' associability, but described when associability increases or decreases in Equations 4 and 5. He assumed that the change in alpha would reflect the difference between the prediction error generated by the current cue and the combined influence (a sum) of all other cues. Le Pelley et al. (2016) provided a linear function in their Equation 2 that adheres to this description. This expression is probably the simplest way to express Mackintosh's somewhat vague description in mathematical terms, and a linear function is also easier to implement computationally. So we decided to use Equation 2 from Le Pelley et al. (2016) for updating attentional strengths.
2. At present, only single-outcome experiments are officially supported. If you want to simulate a two-outcome study, consider using +1 for one outcome, and -1 for the other outcome. Alternatively, run a separate simulation for each outcome.
Lenard Dome, Andy Wills, Tom Beesley
Kamin, L.J. (1969). Predictability, surprise, attention and conditioning. In Campbell, B.A. & Church, R.M. (eds.), Punishment and Aversive Behaviour. New York: Appleton-Century-Crofts, 1969, pp.279-296.
Le Pelley, M. E., Mitchell, C. J., Beesley, T., George, D. N., & Wills, A. J. (2016). Attention and associative learning in humans: An integrative review. Psychological Bulletin, 142(10), 1111–1140. https://doi.org/10.1037/bul0000064
Mackintosh, N.J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement, Psychological Review, 82, 276-298.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
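The help file for this function does not include a worked example, so the sketch below is purely illustrative: the parameter values and trial sequence are assumptions, not taken from any registered simulation. It follows the st and tr layout documented above, using a simple blocking design (Kamin, 1969).

## Illustrative sketch: ten A+ trials followed by ten AX+ trials.
library(catlearn)

st <- list(lr = 0.1,             # associative learning rate
           alr = 0.3,            # attentional learning rate
           w = c(0, 0),          # initial associative strengths for A and X
           alpha = c(0.5, 0.5),  # initial attentional strengths for A and X
           colskip = 1)          # no optional columns

## Columns: ctrl, x1 (cue A), x2 (cue X), t (teaching signal)
tr <- cbind(ctrl = rep(0, 20),
            x1 = rep(1, 20),
            x2 = c(rep(0, 10), rep(1, 10)),
            t = rep(1, 20))

out <- slpMack75(st, tr, xtdo = TRUE)
out$w       # final associative strengths
out$alpha   # final attentional strengths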
Gillan et al.'s (2015) model-free / model-based hybrid Reinforcement Learning model (see Note 1).
slpMBMF(st, tr, xtdo = FALSE)
slpMBMF(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
Matrix of training items |
xtdo |
Boolean. When TRUE, extended output is provided, see below |
The contents of this help file are relatively brief; a more extensive discussion of this model can be found in the supplementary materials of Gillan et al. (2015).
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model. It returns a matrix of predicted response probabilities for each stage 1 action on each trial. The slpMBMF function also returns the final Q values for the model.
The current implementation of slpMBMF deals only with relatively simple Reinforcement Learning experiments, of which Gillan et al. (2015, Exp. 2) is one example. Specifically, each trial has two stages. In the first stage of the trial, there is a single state, and the participant can emit one of x actions. In the second stage, there are y states. A reward follows (or doesn't) without a further action from the participant.
A hybrid MB/MF model thus has 2x Q-values at stage 1 (x for the model-based system, and x for the model-free system), and y Q-values at stage 2 (one for each state; there are no actions at stage 2, and the MB and MF systems evaluate stage 2 Q-values the same way in this model). See Note 3.
Argument st
must be a list containing the following items:
alpha
- the model-free learning rate (range: 0-1)
lambda
- the eligibility trace parameter (range: 0-1)
w
- A number between 0 and 1, representing the relative
contribution of the model-based and model-free parts of the model to
the response (0 = pure model-free, 1 = pure model-based).
beta
- Decision stochasticity parameter
p
- Decision perseveration (p > 0) or switching (p < 0)
parameter
tprob
- A 2 x 2 matrix of transition probabilities, used by the
model-based system. The rows are the actions at stage 1. The columns
are the states at stage 2. The cells are transition probabilities
(e.g. tprob[2,1] is the probability of arriving at stage 2 state #1
given action #2 at stage 1).
q1.mf
- A vector of initial model-free Q values for the actions
at stage 1.
q1.mb
- A vector of initial model-based Q values for the
actions at stage 1.
q2
- A vector of initial Q values for the states at stage 2
(the MB and MF systems share common Q values at stage 2).
If you are unsure what initial Q values to use, set all to 0.5.
Argument tr
must be a matrix, where each row is one
trial. Trials are always presented to the model in the order
specified. The matrix must contain the following named columns (other
columns will be ignored):
s1.act
- The action made by the participant at stage 1, for each trial; must be an integer in the range 1-x.
s2.state
- State of environment at stage 2, for each trial; must be an integer in the range 1-y.
t
- Reward signal for trial; must be a real number. If you're
unsure what to use here, use 1 = rewarded, 0 = not rewarded.
When xtdo = FALSE, returns a list containing these components:
out
- Matrix of response probabilities, for each stage 1 action
on each trial.
q1.mf
- A vector of final model-free Q values for the actions at
stage 1.
q1.mb
- A vector of final model-based Q values for the
actions at stage 1
q2
- A vector of final Q values for the states at stage 2
(the MB and MF systems share common Q values at stage 2).
When xtdo = TRUE, the list also contains the following model-state information :
xout
- A matrix containing the state of the model at the end of
each trial. Each row is one trial. It has the following columns:
q1.mb.1, q1.mb.2, ...
- One column for each model-based Q
value at stage 1.
q1.mf.1, q1.mf.2, ...
- One column for each model-free Q
value at stage 1.
q2.1, q2.2, ...
- One column for each Q value at stage 2.
q1.h.1, q1.h.2, ...
- One column for each hybrid Q value at
stage 1.
s1.d.mf
- Model-free delta at stage 2, wrt. stage 1 action.
s2.d.mf
- Model-free delta at outcome.
In addition, when xtdo = TRUE, the list also contains the following information that is not used by the model (but which might be handy as potential neural regressors).
s1.d.mb
- Model-based delta at stage 2, wrt. stage 1
action.
s1.d.h
- Hybrid delta (based on stage 1 hybrid Q values) at
stage 2, wrt. stage 1 action.
s1.d.diff
- s1.d.mf - s1.d.mb
1. Gillan et al.'s (2015) choice rule, at least as stated in their supplementary materials, would lead to the response probabilities being infinite on switch trials, which is presumably an error. The current implementation uses Daw et al. (2011, suppl. mat., Eq. 2).
2. Gillan et al. (2015) decay Q values for unselected actions by (1-alpha). This is not part of the current implementation.
3. In the current implementation of the model, x must be 2 and y must be 2; otherwise the model will fail or behave unpredictably. If you'd like to develop a more general version of this implementation, contact the author.
Andy Wills ([email protected]), Tom Sambrook
Daw, N.D., Gershman, S.J., Seymour, B., Dayan, P., & Dolan, R.J. (2011). Model-based influences on humans' choices and striatal prediction errors. Neuron, 69, 1204-1215.
Gillan, C.M., Otto, A.R., Phelps, E.A. & Daw, N.D. (2015). Model-based learning protects against forming habits. Cogn. Affect. Behav. Neurosci., 15, 523-536.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
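A minimal sketch of the documented interface follows. The parameter values and the eight-trial sequence are illustrative assumptions, not taken from Gillan et al. (2015); two stage-1 actions and two stage-2 states are used, as required by Note 3.

## Illustrative sketch: two actions, two stage-2 states.
library(catlearn)

st <- list(alpha = 0.4,      # model-free learning rate
           lambda = 0.6,     # eligibility trace parameter
           w = 0.5,          # MB/MF mixture weight
           beta = 5,         # decision stochasticity
           p = 0.2,          # perseveration
           tprob = matrix(c(0.7, 0.3,
                            0.3, 0.7), nrow = 2, byrow = TRUE),
           q1.mf = c(0.5, 0.5),
           q1.mb = c(0.5, 0.5),
           q2 = c(0.5, 0.5))

## Fabricated trial sequence: stage-1 action, stage-2 state, reward
tr <- cbind(s1.act   = c(1, 2, 1, 1, 2, 2, 1, 2),
            s2.state = c(1, 2, 2, 1, 2, 1, 1, 2),
            t        = c(1, 0, 1, 1, 0, 0, 1, 0))

out <- slpMBMF(st, tr)
out$out    # response probabilities for each stage-1 action, per trial
out$q1.mb  # final model-based Q values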
This is Model 4 from Paskewitz and Jones (2020). Model 4 is a Neural Network with Competitive Attentional Gating - a fragmented version of EXIT (Kruschke, 2001) lacking exemplar-based rapid attentional shifts.
slpNNCAG(st, tr, xtdo = FALSE)
slpNNCAG(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
R matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below). |
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model, such as the elemental representation of the training stimuli, and the presence or absence of an outcome.
Argument st
must be a list containing the following items:
P
- attention normalization constant.
phi
- decision-making constant, also referred to as the specificity constant.
lambda
- learning rate.
mu
- attentional learning rate.
outcomes
- The number of categories.
w
- a matrix of initial weights, where the number of rows equals the number of categories and the number of columns equals the number of stimuli.
eta
- a vector of cue saliences, with one element for each cue. In edge cases, salience is capped at a lower bound (see Note 1).
colskip
- The number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns you have
added to the tr matrix, PLUS ONE. So, if you have added no optional
columns, colskip=1. This is because the first (non-optional) column
contains the control values (details below).
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are: 0 = normal
trial; 1 = reset model (i.e. set matrix of initial weights and vector of
salience back to their initial values as specified in st
); 2 =
Freeze learning. Control codes are actioned before the trial is
processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the remaining
columns (see below). These optional columns are ignored by the function,
but you may wish to use them for readability. For example, you might
choose to include columns such as block number, trial number and
condition. The argument colskip (see above) must be set to the number of
optional columns plus one.
x1, x2, ...
- columns for each cue (1
= cue present,
0
= cue absent). There must be one column for each input
element. Each row is one trial. In simple applications, one element is
used for each stimulus (e.g. a simulation of blocking (Kamin, 1969), A+,
AX+, would have two inputs, one for A and one for X). In simple
applications, all present elements have an activation of 1 and all
absent elements have an activation of 0. However, slpNNCAG supports any
real number for activations.
t1, t2, ...
- columns for the teaching values indicating the
category feedback on the current trial. Each category needs a single
teaching signal in a dummy coded fashion, e.g., if there are four
categories and the current stimulus belongs to the second category, then
we would have [0, 1, 0, 0]
.
Returns a list containing three components (if xtdo = FALSE) or four components (if xtdo = TRUE).
if xtdo = FALSE
:
p |
Response probabilities for each trial (rows) and each category (columns). |
final_eta |
Salience at the end of training. |
final_weights |
An outcomes x stimuli matrix of weights at the end of training. |
if xtdo = TRUE
, the following values are also returned:
model_predictions |
The matrix for trial-level predictions of the model as specified by Equation 5 in Paskewitz and Jones (2020). |
eta |
The updated salience at the end of each trial. |
1. If there is only one stimulus present on a given trial and certain of the quantities above are 0, Equation 12 of Paskewitz & Jones (2020) breaks down. In order to avoid this, the affected quantities are capped at the lower limit of 0.01.
2. This model is implemented in C++ for speed.
Lenard Dome, Andy Wills
Kamin, L.J. (1969). Predictability, surprise, attention and conditioning. In Campbell, B.A. & Church, R.M. (eds.), Punishment and Aversive Behaviour. New York: Appleton-Century-Crofts, 1969, pp.279-296.
Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45(6), 812-863.
Paskewitz, S., & Jones, M. (2020). Dissecting EXIT. Journal of Mathematical Psychology, 97, 102371.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
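A minimal sketch of the documented interface follows. The parameter values and trial sequence are illustrative assumptions, and the orientation of w (categories as rows, cues as columns) is an assumption based on conventions elsewhere in the package.

## Illustrative sketch: two cues, two categories, six trials.
library(catlearn)

st <- list(P = 1,          # attention normalization constant
           phi = 2,        # decision-making constant
           lambda = 0.5,   # learning rate
           mu = 0.1,       # attentional learning rate
           outcomes = 2,   # number of categories
           w = matrix(0, nrow = 2, ncol = 2),  # assumed: categories x cues
           eta = c(0.5, 0.5),                  # initial cue saliences
           colskip = 1)                        # no optional columns

## Columns: ctrl, x1, x2 (cues), t1, t2 (dummy-coded category feedback)
tr <- cbind(ctrl = rep(0, 6),
            x1 = c(1, 1, 0, 1, 0, 1),
            x2 = c(0, 1, 1, 0, 1, 1),
            t1 = c(1, 1, 0, 1, 0, 1),
            t2 = c(0, 0, 1, 0, 1, 0))

out <- slpNNCAG(st, tr)
out$p           # response probabilities per trial
out$final_eta   # salience at the end of training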
This is Model 5 from Paskewitz and Jones (2020). Model 5 is a Neural Network with Rapid Attentional Shifts that also contains a competitive attentional gating mechanism. It is a fragmented version of EXIT (Kruschke, 2001) lacking exemplar-mediated attention.
slpNNRAS(st, tr, xtdo = FALSE)
slpNNRAS(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
R matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below). |
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model, such as the elemental representation of the training stimuli, and the presence or absence of an outcome.
Argument st
must be a list containing the following items:
P
- attention normalization constant.
phi
- decision-making constant, also referred to as the specificity constant.
lambda
- learning rate.
mu
- attentional learning rate.
rho
- attentional shift rate. Attention shifts ten times per trial.
outcomes
- The number of categories.
w
- a matrix of initial weights, where the number of rows equals the number of categories and the number of columns equals the number of stimuli.
eta
- a vector of cue saliences, with one element for each cue. In edge cases, salience is capped at a lower bound (see Note 1).
colskip
- The number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns you have
added to the tr matrix, PLUS ONE. So, if you have added no optional
columns, colskip=1. This is because the first (non-optional) column
contains the control values (details below).
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are: 0 = normal
trial; 1 = reset model (i.e. set matrix of initial weights and vector of
salience back to their initial values as specified in st
); 2 =
Freeze learning. Control codes are actioned before the trial is
processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the remaining
columns (see below). These optional columns are ignored by the function,
but you may wish to use them for readability. For example, you might
choose to include columns such as block number, trial number and
condition. The argument colskip (see above) must be set to the number of
optional columns plus one.
x1, x2, ...
- columns for each cue (1
= cue present,
0
= cue absent). There must be one column for each input
element. Each row is one trial. In simple applications, one element is
used for each stimulus (e.g. a simulation of blocking (Kamin, 1969), A+,
AX+, would have two inputs, one for A and one for X). In simple
applications, all present elements have an activation of 1 and all
absent elements have an activation of 0. However, slpNNRAS supports any
real number for activations.
t1, t2, ...
- columns for the teaching values indicating the
category feedback on the current trial. Each category needs a single
teaching signal in a dummy coded fashion, e.g., if there are four
categories and the current stimulus belongs to the second category, then
we would have [0, 1, 0, 0]
.
Returns a list containing three components (if xtdo = FALSE) or four components (if xtdo = TRUE).
if xtdo = FALSE
:
p |
Response probabilities for each trial (rows) and each category (columns). |
final_eta |
Salience at the end of training. |
final_weights |
An outcomes x stimuli matrix of weights at the end of training. |
if xtdo = TRUE
, the following values are also returned:
model_predictions |
The matrix for trial-level predictions of the model as specified by Equation 5 in Paskewitz and Jones (2020). |
eta |
The updated salience at the end of each trial. |
1. If there is only one stimulus present on a given trial and certain of the quantities above are 0, Equation 12 breaks down. In order to avoid this, the affected quantities are capped at the lower limit of 0.01.
2. This model is implemented in C++ for speed.
Lenard Dome, Andy Wills
Kamin, L.J. (1969). Predictability, surprise, attention and conditioning. In Campbell, B.A. & Church, R.M. (eds.), Punishment and Aversive Behaviour. New York: Appleton-Century-Crofts, 1969, pp.279-296.
Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45(6), 812-863.
Paskewitz, S., & Jones, M. (2020). Dissecting EXIT. Journal of Mathematical Psychology, 97, 102371.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
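The interface is the same as slpNNCAG, with the addition of rho. A minimal sketch follows, with the same caveats as the slpNNCAG sketch above: the values are illustrative and the orientation of w is an assumption.

## Illustrative sketch: two cues, two categories, six trials.
library(catlearn)

st <- list(P = 1, phi = 2, lambda = 0.5, mu = 0.1,
           rho = 2,                            # attentional shift rate
           outcomes = 2,
           w = matrix(0, nrow = 2, ncol = 2),  # assumed: categories x cues
           eta = c(0.5, 0.5),
           colskip = 1)

tr <- cbind(ctrl = rep(0, 6),
            x1 = c(1, 1, 0, 1, 0, 1),
            x2 = c(0, 1, 1, 0, 1, 1),
            t1 = c(1, 1, 0, 1, 0, 1),
            t2 = c(0, 0, 1, 0, 1, 0))

out <- slpNNRAS(st, tr)
out$p   # response probabilities per trial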
Rescorla & Wagner's (1972) theory of Pavlovian conditioning.
slpRW(st, tr, xtdo = FALSE)
slpRW(st, tr, xtdo = FALSE)
st |
List of model parameters |
tr |
Matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below) |
The contents of this help file are relatively brief; a more extensive tutorial on using slpRW can be found in Spicer et al. (n.d.).
The function operates as a stateful list processor (slp; see Wills et al., 2017). Specifically, it takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents the different types of information required by the model, such as the elemental representation of the training stimuli, and the presence/absence of an outcome. It returns the output activation on each trial (a.k.a. sum of associative strengths of cues present on that trial), as a vector. The slpRW function also returns the final state of the model - a vector of associative strengths between each stimulus and the outcome representation.
Argument st
must be a list containing the following items:
lr
- the learning rate (fixed for a given simulation). In order to
calculate lr, calculate the product of Rescorla-Wagner parameters alpha
and beta. For example, if you want alpha = 0.1 and beta = 0.2, set lr =
0.02. If you want different elements to differ in salience (different
alpha values) use the input activations (x1, x2, ..., see below) to
represent element-specific salience. For example, if alpha_A = 0.4,
alpha_X = 0.2, and beta = 0.1, then set lr = 0.1, and the activations of
A and X to 0.4 and 0.2, respectively.
w
- a vector of initial associative strengths. If you are not
sure what to use here, set all values to zero.
colskip
- the number of optional columns to be skipped in the tr
matrix. colskip should be set to the number of optional columns you have
added to the tr matrix, PLUS ONE. So, if you have added no optional
columns, colskip=1. This is because the first (non-optional) column
contains the control values (details below).
Argument tr
must be a matrix, where each row is one trial
presented to the model. Trials are always presented in the order
specified. The columns must be as described below, in the order
described below:
ctrl
- a vector of control codes. Available codes are: 0 = normal
trial; 1 = reset model (i.e. set associative strengths (weights) back to
their initial values as specified in w (see above)); 2 = Freeze
learning. Control codes are actioned before the trial is processed.
opt1, opt2, ...
- any number of preferred optional columns, the
names of which can be chosen by the user. It is important that these
columns are placed after the control column, and before the remaining
columns (see below). These optional columns are ignored by the slpRW
function, but you may wish to use them for readability. For example, you
might choose to include columns such as block number, trial number and
condition. The argument colskip (see above) must be set to the number of
optional columns plus one.
x1, x2, ...
- activation of any number of input elements. There
must be one column for each input element. Each row is one trial. In
simple applications, one element is used for each stimulus (e.g. a
simulation of blocking (Kamin, 1969), A+, AX+, would have two inputs, one for A and
one for X). In simple applications, all present elements have an
activation of 1 and all absent elements have an activation of
0. However, slpRW supports any real number for activations, e.g. one
might use values between 0 and 1 to represent differing cue saliences.
t
- Teaching signal (a.k.a. lambda). Traditionally, 1 is used to
represent the presence of the outcome, and 0 is used to represent the
absence of the outcome, although slpRW supports any real values for lambda.
Argument xtdo
(eXTenDed Output) - if set to TRUE, function will
return associative strength for the end of each trial (see Value).
Returns a list containing two components (if xtdo = FALSE) or three components (if xtdo = TRUE, xout is also returned):
suma |
Vector of output activations for each trial |
st |
Vector of final associative strengths |
xout |
Matrix of associative strengths at the end of each trial |
Stuart Spicer, Lenard Dome, Andy Wills
Kamin, L.J. (1969). Predictability, surprise, attention and conditioning. In Campbell, B.A. & Church, R.M. (eds.), Punishment and Aversive Behaviour. New York: Appleton-Century-Crofts, 1969, pp.279-296.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Spicer, S.G., Jones, P.M., Inkster, A.B., Edmunds, C.E.R. & Wills, A.J. (n.d.). Progress in learning theory through distributed collaboration: Concepts, tools, and examples. Manuscript in preparation.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
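A minimal sketch of the documented interface follows, using the blocking design mentioned above. The parameter values and trial numbers are illustrative assumptions, not a registered simulation.

## Illustrative sketch: ten A+ trials followed by ten AX+ trials.
library(catlearn)

st <- list(lr = 0.1,         # learning rate (alpha x beta)
           w = c(0, 0),      # initial associative strengths for A and X
           colskip = 1)      # no optional columns

## Columns: ctrl, x1 (cue A), x2 (cue X), t (teaching signal)
tr <- cbind(ctrl = rep(0, 20),
            x1 = rep(1, 20),
            x2 = c(rep(0, 10), rep(1, 10)),
            t = rep(1, 20))

out <- slpRW(st, tr)
out$suma   # summed associative strength on each trial
out$st     # final associative strengths; little is learned about X (blocking)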
Supervised and Unsupervised STratified Adaptive Incremental Network (Love, Medin & Gureckis, 2004)
slpSUSTAIN(st, tr, xtdo = FALSE, ties = "random")
slpSUSTAIN(st, tr, xtdo = FALSE, ties = "random")
st |
List of model parameters |
tr |
Matrix of training items |
xtdo |
Boolean specifying whether to include extended information in the output (see below) |
ties |
Model behaviour where multiple clusters have the same
highest activations. Options are: random (the default) or first; see Details. |
This function works as a stateful list processor (slp; see Wills et al., 2017). It takes a matrix (tr) as an argument, where each row represents a single training trial, while each column represents some information required by the model, such as the stimulus representation, indications of supervised/unsupervised learning, etc. (details below).
Argument st
must be a list containing the following items:
r
- Attentional focus parameter, always non-negative.
beta
- Cluster competition parameter, always non-negative.
d
- Decision consistency parameter, always non-negative.
eta
- Learning rate parameter, see Note 1.
tau
- Threshold parameter for cluster recruitment under
unsupervised learning conditions (Love et al., 2004, Eq. 11). If every
trial is a supervised learning trial, set tau to 0. slpSUSTAIN can
accommodate both supervised and unsupervised learning within the same
simulation, using the ctrl
column in tr
(see below).
lambda
- Vector containing the initial receptive field tuning
value for each stimulus dimension; the order corresponds to the order
of dimensions in tr
, below. For a stimulus with three
dimensions, where all receptive fields are equally tuned, lambda = [1,
1, 1].
cluster
- A matrix of the initial positions of each recruited
cluster. If set to NA, cluster = NA
, then each time the network is
reset, a single cluster will be created, centered on the stimulus
presented on the current trial.
w
- A matrix of initial connection weights. If set to NA as
w = NA
then, each time the network is reset,
zero-strength weights to a single cluster will be created.
dims
- Vector containing the length of each dimension
(excluding category dimension, see tr
, below), i.e. the number
of nominal spaces in the representation of each dimension. For Figure
1 of Love et al. (2004), dims = [2, 2, 2].
maxcat
- optional. If set, maxcat is an integer specifying the
maximum number of clusters to be recruited during unsupervised
learning. A similar restriction has been used by Love et al. (2004) to
simulate an unsupervised free-sorting task from Experiment 1 in Medin,
Wattenmaker, & Hampson (1987). In this experiment, participants needed
to sort items into two predefined categories. This parameter will only
be used during unsupervised learning. If it is not set, or if it is
set to 0, there is no maximum to the number of clusters that can be
created.
colskip
- Number of optional columns skipped in tr
,
PLUS ONE. So, if there are no optional columns, set colskip to 1.
Argument tr
must be a matrix, where each row is one trial
presented to the model. Columns are always presented in the order
specified below:
ctrl
- A vector of control codes. The control codes are
processed prior to the trial and prior to updating cluster's position,
lambdas and weights (Love et al., 2004, Eq. 12, 13 and 14,
respectively). The available values are:
0 = do supervised learning.
1 = reset network and then do supervised learning.
2 = freeze supervised learning.
3 = do unsupervised learning.
4 = reset network and then do unsupervised learning.
5 = freeze unsupervised learning
'Reset network' means revert w
, cluster
,and
lambda
back to the values passed in st
.
Unsupervised learning in slpSUSTAIN
is at an early stage of
testing, as we have not yet established any CIRP for unsupervised
learning.
opt1, opt2, ...
- optional columns, which may have any names
you wish, and you may have as many as you like, but they must be
placed after the ctrl column, and before the remaining columns (see
below). These optional columns are ignored by this function, but you
may wish to use them for readability. For example, you might include
columns for block number, trial number, and stimulus ID number.
x1, x2, y1, y2, y3, ...
- Stimulus representation. The
columns represent the kth nominal value for ith dimension. It's a
'padded' way to represent stimulus dimensions and category membership
(as category membership in supervised learning is treated as an
additional dimension) with varying nominal length, see McDonnell &
Gureckis (2011), Fig. 10.2A. All dimensions for the trial are
represented in this single row. For example, if for the presented
stimulus, dimension 1 is [0 1] and dimension 2 is [0 1 0] with
category membership [0 1], then the input representation is [0 1 0 1 0
0 1].
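For instance, a single tr row matching the example just given might be constructed as follows; the column names and the optional stim column are illustrative assumptions (with one optional column, colskip in st would be 2).

## ctrl = 0 (supervised trial), optional stim ID, then the padded
## representation [0 1 | 0 1 0 | 0 1] from the example above.
tr <- matrix(c(0, 1,
               0, 1,        # dimension 1
               0, 1, 0,     # dimension 2
               0, 1),       # category membership
             nrow = 1)
colnames(tr) <- c("ctrl", "stim", "x1", "x2", "y1", "y2", "y3", "c1", "c2")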
Argument ties
can be either random
or first
. It specifies how the model behaves when there are multiple winning clusters with the same activation (see Note 3):
random
- The model randomly selects one cluster from the ones
that have the same activations. To increase the reproducibility of
your simulation, set a specific random seed before calling
slpSUSTAIN
(use e.g. set.seed
).
first
- The model selects the cluster that was first recruited
from the clusters that have the same activations. Up to and including
version 0.7.1 of catlearn
, this was the default behaviour of
slpSUSTAIN
.
Returns a list with the following items if xtdo = FALSE
:
probs |
Matrix of probabilities of making each response within
the queried dimension (e.g. column 1 = category A; column 2 = category
B), see Love et al. (2004, Eq. 8). Each row is a single trial and
columns are in the order presented in tr. |
lambda |
Vector of the receptive field tunings for each stimulus
dimension, after the final trial. The order of dimensions corresponds
to the order they are presented in tr. |
w |
Matrix of connection weights, after the final
trial. Each row is a separate cluster, reported in order of
recruitment (first row is the first cluster to be recruited). The
columns correspond to the columns on the input representation
presented (see tr). |
cluster |
Matrix of recruited clusters, with their positions in
stimulus space. Each row is a separate cluster, reported in order of
recruitment. The columns correspond to the columns on the input
representation presented (see tr). |
If xtdo = TRUE
, xtdo
is returned instead of
probs
:
xtdo |
A matrix of extended trial-by-trial output, returned in place of probs. |
1. Love et al. (2004) do not explicitly set a range for the learning rate; we recommend a range of 0-1.
2. The specification of SUSTAIN states that under supervised learning, a new cluster is recruited each time the model predicts category membership incorrectly. This new cluster is centered on the current stimulus. The implementation in slpSUSTAIN adds the stipulation that a new cluster is NOT recruited if it already exists, i.e. if its location in stimulus space is identical to the location of an existing cluster. Instead, it selects the existing cluster and updates as normal. Love et al. (2004) do not specify model behaviour under such conditions, so this is an assumption of our implementation. We'd argue that this is a reasonable implementation - without it SUSTAIN would add clusters indefinitely under conditions where the stimulus -> category associations are probabilistic rather than deterministic.
3. In some cases, two or more clusters can have identical activations because the presented stimulus is equally similar to multiple clusters. Love et al. (2004) do not specify how the model will behave in these cases. In our implementation, we make the assumption that the model picks randomly between the most highly activated clusters (given that they have the same activations). This, we felt, was in line with the approximation of lateral inhibition in the SUSTAIN specification (Love et al., 2004, Eq. 6).
Lenard Dome, Andy Wills
Love, B. C., & Gureckis, T.M. (2007). Models in Search of a Brain. Cognitive, Affective, & Behavioral Neuroscience, 7, 90-108.
Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: a network model of category learning. Psychological Review, 111, 309-332.
McDonnell, J. V., & Gureckis, T. M. (2011). Adaptive clustering models of categorization. In E. M. Pothos & A. J. Wills (Eds.), Formal Approaches in Categorization, pp. 220-252.
Medin, D. L., Wattenmaker, W. D., & Hampson, S. E. (1987). Family resemblance, conceptual cohesiveness, and category construction. Cognitive Psychology, 19(2), 242-279.
Wills, A.J., O'Connell, G., Edmunds, C.E.R., & Inkster, A.B.(2017). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. Psychology of Learning and Motivation, 66, 79-115.
Calculate sum of squared errors
ssecl(obs,exp)
ssecl(obs,exp)
obs |
Vector of observed values |
exp |
Vector of expected values |
Returns sum of the squared differences.
Andy Wills
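A trivial illustration, following the definition above:

ssecl(c(0.2, 0.8), c(0, 1))   # (0.2-0)^2 + (0.8-1)^2 = 0.08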
Nosofsky's (1984, 2011) Generalized Context Model; an exemplar-based model of categorization.
stsimGCM(st)
stsimGCM(st)
st |
List of model parameters |
Argument st
must be a list containing the following required
items: training_items
, tr
, nCats
, nFeat
,
sensitivity
, weights
, choice_bias
, p
,
r_metric
, mp
, and gamma
nCats
- integer indicating the number of categories
nFeat
- integer indicating the number of stimulus dimensions
tr
- the stimuli presented to the model, for which the choice
probabilities will be predicted. tr
has to be a matrix or
dataframe with one row for each stimulus. tr
requires the
following columns.
x1, x2, ...
- columns for each dimension carrying the
corresponding values (have to be coded as numeric values) for each
exemplar (trial) given in the row. Columns have to start with
x1
ascending with dimensions ..., x2, x3, ...
at
adjacent columns.
tr
may have any number of additional columns with any desired
name and position, e.g. for readability. As long as the feature columns
x1, x2, ...
are given as defined (i.e. not scattered, across
the range of matrix columns), the output is not affected by optional
columns.
training_items
- all unique exemplars assumed to be stored in
memory; has to be a matrix or dataframe with one row for each exemplar.
The rownames have to start with 1 in ascending order.
training_items
requires the following columns:
x1, x2, ...
- columns for each feature dimension carrying
the corresponding values (have to be coded as numeric values) for each
exemplar (row). Columns have to start with x1
ascending with
dimensions ..., x2, x3, ...
at adjacent columns.
cat1, cat2, ...
- columns that indicate the category
assignment of each exemplar (row). For example, if the exemplar in row
2 belongs to category 1 the corresponding cell of cat1
has to
be set to 1
, else 0
. Columns have to start with
cat1
ascending with categories ..., cat2, cat3, ...
at adjacent columns.
mem
- (optional) one column that indicates whether an exemplar
receives an extra memory weight, yes = 1
, no = 0
.
For each exemplar (row) in the training_items
with mem
set to 0
the corresponding memory strength parameter is set to 1.
When mem
for an exemplar is set to 1
the memory strength
parameter is set as defined in mp
, see below.
training_items
may have any number of additional columns with any
desired name and position, e.g. for readability. As long as the feature
columns x1, x2, ...
and cat1, cat2, ...
are given as
defined (i.e. not scattered, across the range of matrix columns), the
output is not affected by optional columns.
NOTE: The current model can be implemented as a prototype model if
the training_items
only carry one row for each category representing
the values of the corresponding prototypes (e.g. see Minda & Smith, 2011).
mp
- memory strength parameter (optional). Can take any numeric
value between -Inf and +Inf. The default is 1, i.e. all exemplars have
the same memory strength. There are two ways of specifying mp
,
i.e. either globally or exemplar specific:
When globally setting mp
to a single integer,
e.g. to 5, then all exemplars in training_items
with mem
= 1 will receive a memory strength 5 times higher than the memory strengths
for the remaining exemplars.
For setting exemplar specific memory strengths mp
has to be
a vector of length n, where n is the overall number of of exemplars with
mem
= 1 in the training_items
. The order of memory strengths
defined in this vector exactly follows their row-wise ascending order of
appearance in the training_items
. E.g. if there are two exemplars
with mem
= 1 in the training_items
, the first one in row 2
and the second one in row 10, then setting mp
to c(3,2) will result
in assigning a memory strength of 3 to the first exemplar (in row 2) and
a memory strength of 2 to the second exemplar (in row 10). The memory
strengths for all other exemplars will be set to 1. See Note 1.
sensitivity
- sensitivity parameter c; can take any value
between 0 (all exemplars are equally similar) and +infinity
(towards being insensitive to large differences). There are two ways
of specifying sensitivity
, i.e. either globally or
exemplar specific: When globally setting sensitivity
to a single value, e.g. sensitivity
=3, then the same parameter is
applied to all exemplars. On the other hand, exemplar specific
sensitivity parameters can be used by defining sensitivity
as
a vector of length n, where n is the number of rows in
training_items
. The sensitivity
vector values then represent
the sensitivity parameters for all exemplars in training_items
at
the corresponding row positions. E.g. if there are 3 exemplars (rows) in
training_items
, then setting sensitivity
to c(1,1,3)
assigns sensitivity
= 1 to the first two exemplars, and
sensitivity
= 3 for the third exemplar. See Note 2.
weights
- dimensional attention weights. Order corresponds
to the definitions of x1, x2, ...
in tr
and
training_items
. Has to be a vector of length n-1, where n equals nFeat
, e.g. of length 2 when
there are three features, leaving out the last dimension. A
constraint in the GCM is that all attentional weights sum to 1. Thus,
the sum of the n-1 weights has to be equal to or smaller than 1. The
last, n-th, weight is then computed within the model as: 1 - (sum of the
n-1 feature weights). Setting all weights to 1/nFeat
gives equal weights. See Note 3.
choice_bias
- Category choice biases. Has to be a vector with
length n-1, where n equals nCats
, leaving out
the last category bias, under the constraint that all biases sum to 1.
Order corresponds to the definitions of cat1, cat2
in the
training_items
. The sum of the n-1 choice biases has to be equal
to or smaller than 1. Setting all biases to 1/nCats
gives no choice bias. The bias for the last category is then computed in the
model as: 1 - (sum of the nCats
-1 choice biases). See Note 3.
gamma
- decision constant/ response scaling. Can take any
value between 0 (towards more probabilistic) and +infinity (towards
deterministic choices). Nosofsky (2011) suggests setting gamma higher
than 1 when individual participants' data are considered. See Note 2.
r_metric
- distance metric. Set to 1 (city-block) or 2
(Euclidean). See Nosofsky (2011), and Note 4, for more details.
p
- similarity gradient. Set to 1 (exponential) or 2 (Gaussian).
See Nosofsky (2011), for more details.
A matrix of probabilities for category responses (columns) for each
stimulus (rows) presented to the model (e.g. test trials). Stimuli
and categories are in the same order as presented to the model in
st
, see below.
1. Please note that setting mp = 1 or e.g. mp = 5 globally will yield identical response probabilities. Crucially, memory strength is indistinguishable from the category choice bias parameter if (and only if) mp's vary between categories without varying within categories. The memory strength parameter can therefore be interpreted in terms of an exemplar choice bias (potentially related to categorization accuracy). In addition, if exemplar specific mp's are assigned during parameter fitting, one might want to calculate the natural log of the corresponding estimates, enabling direct comparisons between mp's pointing in different directions, e.g. -log(.5) = log(2), for loss and gain respectively, which are equal in magnitude but opposite in direction.
2. Theoretically, increasing global sensitivity indicates that categorization mainly relies on the most similar exemplars, usually making choices less probabilistic. Thus sensitivity c is likely to be correlated with gamma. See Navarro (2007) for a detailed discussion. However, it is possible to assume exemplar specific sensitivities, or specificity. Then, exemplars with lower sensitivity parameters will have a stronger impact on stimulus similarity, and thus on categorization behavior. See Rodrigues & Murre (2007) for a related study.
3. Setting only the n-1 instead of all n feature weights (or bias parameters) eases model fitting procedures, in which the last weight always is a linear combination of the n-1 weights.
4. See Tversky & Gati (1982) for further info on r. In brief, if r=2 (usually termed Euclidean distance), a large difference on only one feature outweighs small differences on all features. In contrast, if r=1 (usually termed City-Block or Manhattan distance), both aspects contribute to an equal extent to the distance. Thus, r=2 comes with the assumption that small differences on all features may be less noticeable than a large difference on one feature, which may depend on the confusability of the stimuli or on the nature of the task domain (perceptual or abstract).
Rene Schlegelmilch, Andy Wills
Minda, J. P., & Smith, J. D. (2011). Prototype models of categorization: Basic formulation, predictions, and limitations. Formal approaches in categorization, 40-64.
Navarro, D. J. (2007). On the interaction between exemplar-based concepts and a response scaling process. Journal of Mathematical Psychology, 51(2), 85-98.
Nosofsky, R. M. (1984). Choice, similarity, and the context theory of classification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(1), 104.
Nosofsky, R. M. (2011). The generalized context model: An exemplar model of classification. In Pothos, E.M. & Wills, A.J. Formal approaches in categorization. Cambridge University Press.
Rodrigues, P. M., & Murre, J. M. (2007). Rules-plus-exception tasks: A problem for exemplar models?. Psychonomic Bulletin & Review, 14(4), 640-646.
Tversky, A., & Gati, I. (1982). Similarity, separability, and the triangle inequality. Psychological Review, 89(2), 123.
## Two categories with two training items each, and repeatedly presented
## transfer/test items (from nosof94train()). Each item has three
## features with two (binary) values; memory strength (st$mp and the
## 'mem' column in st$training_items are optional) is
## equal for all exemplars

st <- list(sensitivity = 3, weights = c(.2, .3), choice_bias = c(1/3),
           gamma = 1, mp = 1, r_metric = 1, p = 1, nCats = 2, nFeat = 3)

## training item definitions
st$training_items <- as.data.frame(
    t(matrix(cbind(c(1,0,1,1,1,0,0), c(1,1,0,2,1,0,0),
                   c(0,1,0,5,0,1,0), c(0,0,1,1,0,1,0)),
             ncol = 4, nrow = 7,
             dimnames = list(c("stim", "x1", "x2", "x3", "cat1", "cat2", "mem"),
                             c(1:4)))))

st$tr <- nosof94train()

## get the resulting predictions for the test items
## columns of the output correspond to category numbers as defined
## above; rows correspond to the rows of st$tr (the test items)
stsimGCM(st)

## Example 2
## Same settings as above, except: memory strength is 5 times higher
## for some exemplars
st$mp <- 5

## which exemplars?
## training item definitions
st$training_items <- as.data.frame(
    t(matrix(cbind(c(1,0,1,1,1,0,1), c(1,1,0,2,1,0,0),
                   c(0,1,0,5,0,1,0), c(0,0,1,1,0,1,1)),
             ncol = 4, nrow = 7,
             dimnames = list(c("stim", "x1", "x2", "x3", "cat1", "cat2", "mem"),
                             c(1:4)))))
## exemplars in rows 1 and 4 will receive a memory strength of 5

## get predictions
stsimGCM(st)

## Example 3
## Same settings as above, except: memory strength is item specific
## for the two exemplars, i.e. the memory strength boost is not the same
## for both exemplars (3 for the first, in row 1, and 5 for the
## second exemplar, in row 4)
st$mp <- c(3, 5)

## get predictions
stsimGCM(st)
Records results of all ordinal adequacy tests registered in the catlearn package.
data(thegrid)
data(thegrid)
A data frame with the following columns:
Unique identifier number for each entry into the grid. When making a new entry, use the next available integer.
The CIRP (Canonical Independently Replicated Phenomenon) against which a model was tested. This must correspond precisely to the name of a data set in the catlearn package.
A one-word description of the model being tested. Simulations in the same row of The Grid must have precisely the same one-word description. Note, this is not the name of the function used to run the simulation, nor the name of the model implementation function. It is a descriptive term, defined by the modeler.
Indicates the result of the simulation. 1 = passes ordinal adequacy test, 0 = fails ordinal adequacy test, OES = outside explanatory scope (in other words, this is not a result the model was designed to accommodate), 'pending' = the function listed in 'sim' is currently being written or tested.
The name of the catlearn function used to run the simulation.
The name of the catlearn function used to perform the Ordinal Adequacy Test.
The Grid is a means of centrally recording the results of model simulations within the catlearn package. For further discussion, see Wills et al. (2016).
Andy J. Wills [email protected]
citation('catlearn')
Wills, A.J., O'Connell, G., Edmunds, C.E.R. & Inkster, A.B. (2016). Progress in modeling through distributed collaboration: Concepts, tools, and category-learning examples. The Psychology of Learning and Motivation.
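A trivial way to inspect the registry from an R session:

library(catlearn)
data(thegrid)
head(thegrid)   # first few registered simulations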