Core – Parts and Usage Overview

The code is divided up into files to contain pieces with similar purposes or concepts in the algorithm. Each file has its own single module for defining a namespace used when importing its names into other files. Each module exports members intended for public access, but the code in this project explicitly names its imports to maintain clarity in what is used and where it comes from.

Missions.jl

InformativeSampling.MissionsModule

This module contains functions for initializing mission data and the function for running an entire search mission. It is the entry point to the actual informative sampling: it contains the main loop and most of the usage of Samples and belief models.

Main public types and functions:

source
InformativeSampling.Missions.MissionMethod

The main function that runs the informative sampling routine. In each iteration, a sample location is selected, a sample is collected, the belief model is updated, and visuals are optionally shown. The run finishes when the designated number of samples has been collected.

Inputs:

  • func: any function to be run at the end of the update loop, useful for visualization or saving data (default does nothing)
  • samples: a vector of samples, this can be used to jump-start a mission or resume a previous mission (default empty)
  • beliefs: a vector of beliefs, this pairs with the previous argument (default empty)
  • seed_val: the seed for the random number generator, an integer (default 0)
  • sleep_time: the amount of time to wait after each iteration, useful for visualizations (default 0)

Outputs:

  • samples: a vector of new samples collected
  • beliefs: a vector of probabilistic representations of the quantities being searched for, one for each sample collection

Examples

using Missions: synMission

mission = synMission(num_samples=10) # create the specific mission
samples, beliefs = mission(visuals=true, sleep_time=0.5) # run the mission
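The update loop described above can be sketched in plain Julia. Everything below is a self-contained stand-in, not the package's actual API: `run_sketch` and its helpers are hypothetical, with the location selection, sampler, and belief-model update replaced by stubs.

```julia
# Minimal self-contained sketch of the mission update loop; every helper here
# is a stand-in for the real package internals, not the actual API.
function run_sketch(; num_samples=3, func=(args...) -> nothing, sleep_time=0)
    samples = Float64[]
    beliefs = Vector{Vector{Float64}}()
    for _ in 1:num_samples
        loc = rand(2)                  # stand-in for selecting a sample location
        y = sum(loc)                   # stand-in for collecting a sample
        push!(samples, y)
        push!(beliefs, copy(samples))  # stand-in for the belief-model update
        func(samples, beliefs)         # user hook, e.g. visualization or saving
        sleep(sleep_time)
    end
    return samples, beliefs
end

samples, beliefs = run_sketch(num_samples=3)
```

As in the real mission, one belief is stored per sample collection, and `func` runs at the end of each iteration.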
source
InformativeSampling.Missions.MissionType

Fields:

  • occupancy::Any: an occupancy map, true in cells that are occupied

  • sampler::Any: a function that returns a measurement value for any input

  • num_samples::Any: the number of samples to collect in one run

  • sampleCostType::Any: a constructor for the function that returns the (negated) value of taking a sample (default DistScaledEIGF)

  • weights::Any: weights for picking the next sample location

  • start_locs::Any: the locations that should be sampled first (default [])

  • prior_samples::Any: any samples taken previously (default empty)

  • kernel::Any: the kernel to be used in the belief model (default multiKernel)

  • means::Any: a tuple of whether to use a non-zero mean for each quantity and whether to learn the means (default (true, false))

  • noise::Any: a named tuple of the noise value(s) and whether they are further learned (default (0.0, false))

  • use_cond_pdf::Any: whether or not to use the conditional distribution of the data to train the belief model (default false)

  • hyp_drop::Any: a tuple of whether to drop hypotheses along with the settings for doing so (default (false, 10, 5, 0.4))

Defined as a keyword struct, so all arguments are passed in as keywords:

mission = Mission(; occupancy,
                  sampler,
                  num_samples,
                  sampleCostType,
                  weights,
                  start_locs,
                  prior_samples,
                  noise,
                  kernel)
source
InformativeSampling.Missions.replayMethod
replay(func, M::Mission, full_samples, beliefs; sleep_time)

Replays a mission that has already taken place. Mainly for visualization purposes.

Inputs:

  • func: any function to be run at the end of the update loop, useful for visualization or saving data (default does nothing)
  • full_samples: a vector of samples
  • beliefs: a vector of beliefs
  • sleep_time: the amount of time to wait after each iteration, useful for visualizations (default 0)

source

Samples.jl

InformativeSampling.Samples.GridMapsSamplerType

Handles samples of the form (location, quantity) to give the value from the right map. Internally a tuple of GridMaps.

Constructor can take in a tuple or vector of GridMaps or each GridMap as a separate argument.

Examples

ss = GridMapsSampler(GridMap(zeros(5, 5)), GridMap(ones(5, 5)))

loc = [.2, .75]
ss(loc) # result: [0, 1]
ss((loc, 2)) # result: 1
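The lookup behavior can be sketched without the package. The `SketchSampler` type and `cellindex` helper below are hypothetical stand-ins that assume maps span the unit square with nearest-cell lookup; the real GridMap type is more general.

```julia
# Self-contained sketch of the sampler behavior, assuming maps span the unit
# square with nearest-cell lookup; the real GridMap type is more general.
struct SketchSampler
    maps::Tuple{Vararg{Matrix{Float64}}}
end

# Map a location in [0, 1]^2 to the nearest cell index of matrix `m`.
cellindex(m, loc) = CartesianIndex(
    clamp(round(Int, loc[1] * (size(m, 1) - 1)) + 1, 1, size(m, 1)),
    clamp(round(Int, loc[2] * (size(m, 2) - 1)) + 1, 1, size(m, 2)),
)

# Called with a location: one value per map.
(s::SketchSampler)(loc::AbstractVector) = [m[cellindex(m, loc)] for m in s.maps]
# Called with (location, quantity): the value from that one map.
(s::SketchSampler)(t::Tuple) = s.maps[t[2]][cellindex(s.maps[t[2]], t[1])]

ss = SketchSampler((zeros(5, 5), ones(5, 5)))
ss([.2, .75])       # → [0.0, 1.0]
ss(([.2, .75], 2))  # → 1.0
```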
source
InformativeSampling.Samples.SampleType

Struct to hold the input and output of a sample.

Fields:

  • x::Tuple{Vector{Float64}, Int64}: the sample input, usually a location and sensor id

  • y::Any: the sample output or observation, a scalar

source
InformativeSampling.Samples.selectSampleLocationMethod
selectSampleLocation(sampleCost, bounds) -> AbstractArray

The optimization of choosing a best single sample location.

Inputs:

  • sampleCost: a function from sample location to cost (x->cost(x))
  • bounds: map lower and upper bounds

Returns the sample location, a vector
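As a rough illustration of what this optimization does, here is a hypothetical brute-force stand-in that evaluates the cost on a coarse grid inside the bounds; the package presumably uses a proper optimizer, so this is only a sketch.

```julia
# Hypothetical brute-force stand-in for selectSampleLocation: evaluate the
# sample-cost function on a coarse grid inside the bounds, keep the minimizer.
function select_location_sketch(sampleCost, bounds; n=51)
    lower, upper = bounds
    best_loc, best_cost = lower, Inf
    for x1 in range(lower[1], upper[1], length=n), x2 in range(lower[2], upper[2], length=n)
        c = sampleCost([x1, x2])
        if c < best_cost
            best_loc, best_cost = [x1, x2], c
        end
    end
    return best_loc
end

# Example cost: squared distance from (0.5, 0.5), so that point is selected.
bounds = ([0.0, 0.0], [1.0, 1.0])
loc = select_location_sketch(x -> sum(abs2, x .- 0.5), bounds)
```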

source
InformativeSampling.Samples.takeSamplesMethod
takeSamples(loc, sampler) -> Any

Pulls ground truth values from a given location and constructs Sample objects to hold each input and its measurement.

Inputs:

  • loc: the location to sample
  • sampler: a function that returns ground truth values
  • quantities: (optional) a vector of integers which represent which quantities to sample, defaults to all of them

Outputs a vector of Samples containing input x and measurement y

source

SampleCosts.jl

InformativeSampling.SampleCostsModule

This module holds a variety of SampleCost functions used by Samples.jl in selecting a new sample location. The purpose is to pick the location that will minimize the given function.

Each sample cost function in this module is a subtype of the abstract SampleCost type. Their common interface consists of two functions:

  • values(sampleCost, location): returns the values of the terms (μ, σ, τ, P);
    this is typically what each subtype will override;
    in all of these, μ = belief model mean, σ = belief model std, τ = travel distance, P = proximity value;
    all of these are for the specific location
  • sampleCost(location): the actual sample cost at the location; it has a default method explained in SampleCost;
    for the equations the location is denoted $x$

Many of these were experimental cost functions and aren't recommended; the main one recommended for use is DistScaledEIGF, the mission default.

Others can be useful if one wants to do some more experimentation. Note that all of these cost functions are currently hardcoded to use the first quantity as the objective quantity unless otherwise stated. Unless it appears explicitly in the cost equation, the travel distance is used only to forbid unreachable locations (their cost will be Inf).

Main public types and functions:

source
InformativeSampling.SampleCosts.DerivVarType

Uses the norm of the derivative of the belief model mean and the belief model variance:

\[C(x) = - w_1 \, {\left\lVert \frac{\partial μ}{\partial x}(x) \right\rVert}^2 - w_2 \, σ^2(x)\]

source
InformativeSampling.SampleCosts.DistLogEIGFType

A variation on EIGF that takes the logarithm of the variance and adds a distance cost term that is normalized by the average of the region dimensions:

\[C(x) = - w_1 \, (μ(x) - y(x_c))^2 - w_2 \, \log(σ^2(x)) + w_3 \, β \, \frac{τ(x)}{\left\lVert \boldsymbol{\ell}_d \right\rVert_1}\]

where $β$ is a parameter to delay the distance effect until a few samples have been taken.

source
InformativeSampling.SampleCosts.DistProxType

Combines the average mean value, average standard deviation, travel distance, and proximity as terms:

\[C(x) = - w_1 \, μ_{\mathrm{ave}}(x) - w_2 \, σ_{\mathrm{ave}}(x) + w_3 \, τ(x) + w_4 \, P(x)\]

where $P(x) = \sum_i(\frac{\min(\boldsymbol{\ell}_d)}{4 \, \mathrm{dist}_i})^3$. Averages are performed over all quantities.
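The proximity term can be computed directly from its definition. The sketch below is illustrative only; `region_dims` stands in for $\boldsymbol{\ell}_d$, the region side lengths.

```julia
using LinearAlgebra: norm

# Sketch of the proximity term P(x) = Σᵢ (min(ℓ_d) / (4 distᵢ))³, which grows
# sharply near existing sample locations. `region_dims` stands in for ℓ_d.
proximity_sketch(x, sample_locs, region_dims) =
    sum((minimum(region_dims) / (4 * norm(x .- xi)))^3 for xi in sample_locs)

x = [0.5, 0.5]
P = proximity_sketch(x, [[0.5, 0.25], [0.0, 0.0]], [1.0, 1.0])
```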

source
InformativeSampling.SampleCosts.DistScaledDerivVarType

Uses the norm of the derivative of the belief model mean and the belief model variance, then scales it all by a normalized travel distance:

\[C(x) = \frac{- w_1 \, {\left\lVert \frac{\partial μ}{\partial x}(x) \right\rVert}^2 - w_2 \, σ^2(x)} {1 + β \, \frac{τ(x)}{\left\lVert \boldsymbol{\ell}_d \right\rVert_1}}\]

where $β$ is a parameter to delay the distance effect until a few samples have been taken.

source
InformativeSampling.SampleCosts.DistScaledEIGFType

Augments EIGF with a factor to scale by a normalized travel distance:

\[C(x) = \frac{- w_1 \, (μ(x) - y(x_c))^2 - w_2 \, σ^2(x)} {1 + β \, \frac{τ(x)}{\left\lVert \boldsymbol{\ell}_d \right\rVert_1}}\]

where $β$ is a parameter to delay the distance effect until a few samples have been taken.
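As a worked sketch of the scaling, the function below evaluates the equation with illustrative numbers; it is not the package's API, just the formula written out.

```julia
# Worked sketch of the DistScaledEIGF cost: the EIGF value divided by a
# normalized travel distance. All numbers here are illustrative, not package API.
function dist_scaled_eigf_sketch(μ, σ², y_near, τ, dims; w=(1.0, 1.0), β=1.0)
    eigf = -w[1] * (μ - y_near)^2 - w[2] * σ²   # base EIGF cost
    return eigf / (1 + β * τ / sum(abs, dims))  # ‖ℓ_d‖₁ normalization
end

c = dist_scaled_eigf_sketch(1.2, 0.5, 1.0, 2.0, [10.0, 10.0])
```

A longer travel distance τ shrinks the magnitude of the (negative) cost, making far-away locations less attractive.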

source
InformativeSampling.SampleCosts.EIGFType

Expected Informativeness for Global Fit (EIGF). This function is adapted from [Lam] by adding weights to choose the balance between exploration and exploitation. It has the form:

\[C(x) = - w_1 \, (μ(x) - y(x_c))^2 - w_2 \, σ^2(x)\]

where $x_c$ is the nearest collected sample location.
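The weight balance can be seen by evaluating the formula directly; the `eigf` function below is just the equation written out with illustrative numbers, not the package's API.

```julia
# Sketch of the EIGF cost with weights balancing exploitation (first term,
# predicted change from the nearest sample) and exploration (second term,
# model variance). Numbers are illustrative only.
eigf(μ, σ², y_near; w=(1.0, 1.0)) = -w[1] * (μ - y_near)^2 - w[2] * σ²

# With a heavier exploration weight, a high-variance location can win (lower
# cost) even where the predicted change is small:
explore = eigf(1.0, 0.8, 1.0; w=(1.0, 2.0))  # → -1.6
exploit = eigf(2.0, 0.1, 1.0; w=(1.0, 2.0))  # → -1.2
```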

source
InformativeSampling.SampleCosts.InfoGainType

Derived from the idea of information gain across the region. Returns the entropy of a set of points (10x10 grid) given the new sample location. Minimizing this entropy is equivalent to maximizing information gain since the entropy before the sample is always the same.

This function is very computationally expensive, which is why the test grid is set at 10x10.

It has the form:

\[C(x) = \log |Σ|\]

source
InformativeSampling.SampleCosts.LogLikelihoodType

Idea derived from the log likelihood of the query location. Similar to EIGF, the measured value from the nearest sample is used:

\[C(x) = - w_1 \, \left( \frac{μ(x) - y(x_c)}{σ_n} \right)^2 - w_2 \, \log (σ^2(x))\]

where $x_c$ is the nearest collected sample location, and $σ_n$ is the noise.

This function was seen to work well but hasn't undergone extensive tests. There is still some question about the theory and the use of noise parameter vs signal amplitude parameter.

source
InformativeSampling.SampleCosts.LogLikelihoodFullType

A test of the log likelihood idea but using a weighted sum of all measured sample values, not just the nearest one:

\[C(x) = - w_1 \, \frac{1}{\sum_i k(x, x_i)} \sum_i k(x, x_i) \left( \frac{μ(x) - y(x_i)}{σ_n} \right)^2 - w_2 \, \log (σ^2(x))\]

where $x_i$ is each collected sample location, and $σ_n$ is the noise.

This function's performance wasn't satisfactory.

source
InformativeSampling.SampleCosts.LogNormedType

Combines the average belief value and the log of the average uncertainty value of all quantities. All belief and uncertainty values are first normalized by the maximum belief value of that quantity. It has the form:

\[C(x) = - w_1 \, μ_{\textrm{norm-ave}}(x) - w_2 \, \log (σ_{\textrm{norm-ave}}(x))\]

source
InformativeSampling.SampleCosts.MIPTType

A simple cost function that doesn't use a belief model but works purely on distances. It returns the negated distance to the nearest sample:

\[C(x) = - \min_i \left\lVert x - x_i \right\rVert\]

Useful when maximizing distance between samples.
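Since this cost uses no belief model, it is short enough to sketch directly; the `mipt` function below is just the formula, not the package's API.

```julia
using LinearAlgebra: norm

# Sketch of the MIPT cost: negated distance to the nearest existing sample,
# so minimizing it spreads new samples as far from old ones as possible.
mipt(x, sample_locs) = -minimum(norm(x .- xi) for xi in sample_locs)

locs = [[0.0, 0.0], [1.0, 1.0]]
mipt([0.5, 0.5], locs)  # equidistant center: the better (lower) cost
mipt([0.1, 0.1], locs)  # near an existing sample: higher cost, worse choice
```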

source
InformativeSampling.SampleCosts.SampleCostType

A SampleCost is typically constructed through SampleCostType(occupancy, samples, beliefModel, quantities, weights)

A pathCost is constructed automatically from the other arguments.

This object can then be called to get the cost of sampling at a location: sampleCost(x)

source
InformativeSampling.SampleCosts.SampleCostMethod

Cost to take a new sample at a location $x$. This is a fallback method that calculates a simple linear combination of all the values of a SampleCost.

Has the form:

\[C(x) = - w_1 \, μ(x) - w_2 \, σ(x) + w_3 \, τ(x) + w_4 \, P(x)\]

source
InformativeSampling.SampleCosts.VarTraceType

Similar to InfoGain but uses only the variances rather than the full covariance matrix; that is, the trace instead of the log determinant. This reduces computation but is still more costly than a single point estimate. This uses a 20x20 grid of test points.

It has the form:

\[C(x) = \textrm{tr}(Σ)\]

source
InformativeSampling.SampleCosts.valuesMethod
values(sc::InformativeSampling.SampleCosts.SampleCost, loc)

Returns the values to be used to calculate the sample cost (belief mean, standard deviation, travel distance, sample proximity).

Each concrete subtype of SampleCost needs to implement this method.

This can be a useful function to inspect values during optimization.

source

ROSInterface.jl

InformativeSampling.ROSInterfaceModule

This module contains the interface for passing data to and from other ROS nodes. It sets up an informative_sampling node and provides methods to handle the data. This is designed specifically for communication with Swagbot.

Main public types and functions:

source
InformativeSampling.ROSInterface.ROSSamplerType

A struct that stores information for communicating with Swagbot.

Objects of this type can be used as samplers in missions, meaning they can be called with a SampleInput to return its value. This object also has a length, equal to the number of its subscriptions, and can be iterated over to get the name of each one.

Fields:

  • data_topics::Vector{T} where T<:Union{String, Tuple{String, String}}: vector of topic names that will be subscribed to in order to receive measurements

  • done_topic::Any: topic name that publishes a message to signify the traveling is done

  • pub_topic::Any: the publisher topic name

  • publisher::Any: the publisher topic object, created automatically from a given name

source
InformativeSampling.ROSInterface.ROSSamplerMethod
ROSSampler(
    data_topics,
    done_topic,
    pub_topic
) -> InformativeSampling.ROSInterface.ROSSampler

Creating a ROSSampler object requires a vector of topics to subscribe to for measurement data, essentially the list of sensors onboard the robot to listen to. Each element of this list should be a 2-tuple of topics that will transmit the value and error for each sensor. This constructor initializes a ROS node and sets up a publisher to pub_topic.

source
InformativeSampling.ROSInterface.ROSSamplerMethod
function (R::ROSSampler{String})(new_index::SampleInput)

Returns a single value from the sample location of the chosen quantity. It does this by first publishing the next location to sample. Once the location is sampled, it calls out to each topic in sequence and waits for its message.

Currently unused.

function (R::ROSSampler{NTuple{2, String}})(new_index::SampleInput)

Returns a single value and its error from the sample location of the chosen quantity. It does this by first publishing the next location to sample. Once the location is sampled, it calls out to each topic in sequence and waits for its message.

Currently unused.

source
InformativeSampling.ROSInterface.ROSSamplerMethod
function (R::ROSSampler{String})(new_loc::Location)

Returns a vector of values from the sample location, one for each sensor measurement available. It does this by first publishing the next location to sample. Once the location is sampled, it calls out to each topic in sequence and waits for its message.

Examples

data_topics = [
    "/value1",
    "/value2"
]

done_topic = "sortie_finished"
pub_topic = "latest_sample"

sampler = ROSSampler(data_topics, done_topic, pub_topic)

location = [.1, .3]
value1, value2 = sampler(location)
function (R::ROSSampler{NTuple{2, String}})(new_loc::Location)

Returns a vector of (value, error) pairs from the sample location, one for each sensor measurement available. It does this by first publishing the next location to sample. Once the location is sampled, it calls out to each topic in sequence and waits for its message.

Examples

data_topics = [
    ("/value1", "/error1"),
    ("/value2", "/error2")
]

done_topic = "sortie_finished"
pub_topic = "latest_sample"

sampler = ROSSampler(data_topics, done_topic, pub_topic)

location = [.1, .3]
(value1, error1), (value2, error2) = sampler(location)
source
  • [Lam] Lam, C Q (2008) Sequential adaptive designs in computer experiments for response surface model fit (Doctoral dissertation). The Ohio State University.
  • [Liu] Liu H, Cai J, Ong Y (2017) An adaptive sampling approach for kriging metamodeling by maximizing expected prediction error. Comput Chem Eng 106:171–182