cgm.core

The core module contains the basic building blocks of a Causal Graphical Model.

Module Contents

Classes

HasComparison

HasParents

ComparableHasParents

HasVariable

Variable

A variable has a name and can take on a finite number of states.

DAG_Node

A DAG (Directed Acyclic Graph) node is a variable in a DAG. A node can have multiple parents and multiple children, but no cycles can be created.

CPDSpec

Helper class that holds the specification of a CPD created with the | operator.

CG_Node

A Causal Graph Node

Factor

A factor is a function that has a list of variables in its scope, and maps every combination of variable values to a real number. In this implementation the mapping is stored as a np.ndarray. For example, if this factor’s scope is the variables {A, B, C}, and each of these is a binary variable, then to access the value of the factor for [A=1, B=0, C=1], the entry can be accessed at self.values[1, 0, 1]. If the ndarray isn’t specified, a random one will be created.

CPD

Conditional Probability Distribution

DAG

Mutable Directed Acyclic Graph.

CG

Causal Graph. Contains a list of CG_Nodes. The information about connectivity is stored in the DAG.

Data

DCovariant

D

V

API

cgm.core.DCovariant

‘TypeVar(…)’

cgm.core.D

‘TypeVar(…)’

class cgm.core.HasComparison

Bases: typing.Protocol

__lt__(other: cgm.core.HasComparison) bool
class cgm.core.HasParents

Bases: typing.Protocol[cgm.core.DCovariant]

property parents: FrozenSet[cgm.core.DCovariant]
property ancestors: FrozenSet[cgm.core.DCovariant]
class cgm.core.ComparableHasParents

Bases: cgm.core.HasParents[cgm.core.DCovariant], cgm.core.HasComparison, typing.Protocol[cgm.core.DCovariant]

class cgm.core.HasVariable

Bases: typing.Protocol

property name: str
property num_states: int
__lt__(other: cgm.core.HasVariable) bool
cgm.core.V

‘TypeVar(…)’

class cgm.core.Variable(name: str, num_states: int)

Bases: cgm.core.HasVariable

A variable has a name and can take on a finite number of states.

Initialization

_name: str

None

_num_states: int

None

property name: str
property num_states: int
__repr__() str
__lt__(other) bool
__eq__(other) bool
__hash__() int
class cgm.core.DAG_Node

Bases: cgm.core.HasParents[cgm.core.D], cgm.core.HasVariable, typing.Generic[cgm.core.D]

A DAG (Directed Acyclic Graph) node is a variable in a DAG. A node can have multiple parents and multiple children, but no cycles can be created.

variable: cgm.core.Variable

None

dag: DAG[D]

None

__post_init__()
property name: str

Return the name of the variable.

property parents: FrozenSet
property ancestors: FrozenSet
property num_states: int

Return the number of states the variable can take on.

__repr__() str
__lt__(other) bool
__eq__(other) bool
__hash__() int
class cgm.core.CPDSpec

Helper class that holds the specification of a CPD created with the | operator.

child: CG_Node

None

parents: list[CG_Node]

None

class cgm.core.CG_Node

Bases: cgm.core.HasParents, cgm.core.HasVariable

A Causal Graph Node

A CG_Node is a variable in a Bayesian Network. A node is associated with a single conditional probability distribution (CPD), which is a distribution over the variable given its parents. If the node has no parents, this CPD is a distribution over all the states of the variable.

Example:

  g = cgm.CG()
  A = g.node('A', 2)
  B = g.node('B', 2)
  C = g.node('C', 2)
  phi1 = g.P(A | B)
  phi2 = g.P(B | C)
  phi3 = g.P(C)

dag_node: cgm.core.DAG_Node[cgm.core.CG_Node]

None

cg: CG

None

classmethod from_params(name: str, num_states: int, cg: cgm.core.CG) cgm.core.CG_Node

Create a new CG_Node with default CPD.

property parents: frozenset[cgm.core.CG_Node]
property ancestors: frozenset[cgm.core.CG_Node]
property variable: cgm.core.Variable
property name: str
property num_states: int
property dag: DAG[CG_Node]
property cpd: Optional[cgm.core.CPD]

Get the CPD associated with this node.

__repr__() str
__lt__(other) bool
__eq__(other) bool
__hash__() int
__or__(parents) cgm.core.CPDSpec

Enable syntax like A | [B, C] for CPD creation
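The A | [B, C] syntax relies on Python's operator overloading. The following is a minimal sketch of the mechanism only: Node and Spec are hypothetical stand-ins for CG_Node and CPDSpec, and the way cgm itself handles a single node versus a list on the right-hand side is an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for CG_Node."""
    name: str

    def __or__(self, parents):
        # Accept either a single node or a list of nodes on the right-hand side.
        if isinstance(parents, Node):
            parents = [parents]
        return Spec(child=self, parents=list(parents))


@dataclass
class Spec:
    """Hypothetical stand-in for CPDSpec: just records child and parents."""
    child: Node
    parents: list


A, B, C = Node("A"), Node("B"), Node("C")
spec = A | [B, C]   # child A with parents B and C
single = A | B      # child A with the single parent B
```

The operator only records which node is the child and which are the parents; building the actual distribution is left to whatever consumes the specification.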

exception cgm.core.ScopeShapeMismatchError(expected_shape, actual_shape)

Bases: Exception

Exception raised when the shape of a factor’s scope does not match the shape of its stored values array.

Initialization

exception cgm.core.NonUniqueVariableNamesError(non_unique_names)

Bases: Exception

Exception raised when the variables in a factor’s scope do not have unique names.

Initialization

class cgm.core.Factor(scope: Sequence[cgm.core.V], values: numpy.ndarray | int | float | None = None, rng: numpy.random.Generator | None = None)

Bases: typing.Generic[cgm.core.V]

A factor is a function that has a list of variables in its scope, and maps every combination of variable values to a real number. In this implementation the mapping is stored as a np.ndarray. For example, if this factor’s scope is the variables {A, B, C}, and each of these is a binary variable, then to access the value of the factor for [A=1, B=0, C=1], the entry can be accessed at self.values[1, 0, 1]. If the ndarray isn’t specified, a random one will be created.

All variables in the scope must have unique names.

Factors ϕ1 and ϕ2 can be multiplied and divided with ϕ1 * ϕ2 and ϕ1 / ϕ2. A factor can be marginalized over a subset of its scope. For example, to marginalize out variables A and B, call ϕ.marginalize([A, B]).

Example:

A = cgm.Variable('A', 2)
B = cgm.Variable('B', 2)
C = cgm.Variable('C', 2)
phi1 = cgm.Factor([A, B, C])
phi2 = cgm.Factor([B, C])
phi3 = cgm.Factor([B, C])
phi1 * phi2
phi1 / phi2
phi1.marginalize([A, B])

Args:
  scope: A list of variables that are in the scope of the factor.
  values: The values of the factor. If None, random values will be generated. If a scalar, all values will be set to that scalar.
  rng: A numpy random number generator. Only used if values is None.
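The storage convention can be illustrated with plain NumPy, independently of cgm. In this sketch scope_names, num_states, and the table contents are illustrative; the only assumption is that axis i of the array corresponds to the i-th variable in the scope.

```python
import numpy as np

# Hypothetical stand-in for a factor over three binary variables A, B, C.
scope_names = ("A", "B", "C")
num_states = (2, 2, 2)

# One array axis per scope variable, in scope order.
values = np.arange(8, dtype=float).reshape(num_states)

# Value of the factor for the assignment A=1, B=0, C=1.
entry = values[1, 0, 1]

# A scalar `values` argument is described as filling the whole table.
constant = np.full(num_states, 0.5)

# With values=None random values are generated; passing a seeded
# Generator makes such a table reproducible.
rng = np.random.default_rng(0)
random_table = rng.random(num_states)
```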

Initialization

classmethod get_null()

Return a factor with no scope and a single value of 1.0.

property values: numpy.ndarray

Return the values of the factor.

property shape: tuple[int, ...]

Return the shape of the factor’s values array.

property scope: tuple[cgm.core.V, ...]

Return the scope of the factor.

permute_scope(new_scope: Sequence[cgm.core.V]) cgm.core.Factor

Set the scope of the factor according to the specified permutation.

Must be a permutation of the original scope.

set_scope(new_scope: Sequence[cgm.core.V]) cgm.core.Factor

Set the scope of the factor to the specified scope.

_check_input()
__repr__()
property table: cgm._format.FactorTableView

Access the factor’s table representation.

Returns: FactorTableView object that can be used either as a property (for default view) or as a method (for custom views)

_get_random_values(rng: numpy.random.Generator)
_normalize_dimensions(other: cgm.core.Factor) tuple[numpy.ndarray, numpy.ndarray, tuple[cgm.core.V, ...]]

Expand and permute the dimensions of the two factors to match.

This is required for factor multiplication, division, addition, and subtraction.
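One way to picture this step is with plain NumPy arrays and tuples of variable names. The sketch below is not the cgm implementation: align is a hypothetical helper, and the ordering of the shared scope (first factor's variables, then the remainder) is an assumption.

```python
import numpy as np


def align(scope_a, a, scope_b, b):
    """Return arrays for a and b whose axes follow one shared scope."""
    # Shared scope: a's variables first, then b's variables not already present.
    union = tuple(scope_a) + tuple(v for v in scope_b if v not in scope_a)

    def expand(scope, arr):
        # Reorder the existing axes to the order they have in the shared scope.
        present = [v for v in union if v in scope]
        arr = np.transpose(arr, [scope.index(v) for v in present])
        # Add length-1 axes for variables this factor does not mention.
        shape = [arr.shape[present.index(v)] if v in scope else 1 for v in union]
        return arr.reshape(shape)

    return expand(scope_a, a), expand(scope_b, b), union


phi1 = np.arange(8, dtype=float).reshape(2, 2, 2)   # scope (A, B, C)
phi2 = np.array([[1.0, 2.0], [3.0, 4.0]])           # scope (C, B)
a, b, scope = align(("A", "B", "C"), phi1, ("C", "B"), phi2)
product = a * b  # broadcasting pairs up matching assignments
```

Once both arrays share an axis order, ordinary elementwise operators combine the entries that belong to the same assignment.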

__mul__(other: Factor | int | float) cgm.core.Factor

Factor product as defined in PGM Definition 4.2 (Koller 2009).
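For two factors ϕ1(A, B) and ϕ2(B, C), the product is the table ψ(a, b, c) = ϕ1(a, b) · ϕ2(b, c). A small worked instance in plain NumPy, with made-up numbers and without using cgm, might look like this:

```python
import numpy as np

# phi1 has scope (A, B); phi2 has scope (B, C). All variables are binary.
phi1 = np.array([[0.5, 0.8],
                 [0.1, 0.0]])
phi2 = np.array([[0.5, 0.7],
                 [0.1, 0.2]])

# psi(a, b, c) = phi1(a, b) * phi2(b, c): the shared variable B is matched,
# while A and C are combined in every possible way.
psi = np.einsum("ab,bc->abc", phi1, phi2)
```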

__rmul__(other: int | float) cgm.core.Factor
__add__(other: Factor | int | float) cgm.core.Factor
__radd__(other: int | float) cgm.core.Factor
__sub__(other: Factor | int | float) cgm.core.Factor
__truediv__(other: cgm.core.Factor) cgm.core.Factor
marginalize(variables: List[cgm.core.V]) Factor[V]

Sum over all possible states of a list of variables. Example: phi1.marginalize([A, B])
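In array terms, marginalizing a variable amounts to summing over its axis. The sketch below uses plain NumPy with a hypothetical marginalize helper and variable names as strings; it illustrates the operation rather than reproducing the cgm code.

```python
import numpy as np

scope = ("A", "B", "C")
values = np.arange(8, dtype=float).reshape(2, 2, 2)


def marginalize(scope, values, names):
    """Sum out the named variables; return the reduced scope and table."""
    axes = tuple(scope.index(n) for n in names)
    new_scope = tuple(v for v in scope if v not in names)
    return new_scope, values.sum(axis=axes)


new_scope, reduced = marginalize(scope, values, ["A", "B"])
```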

marginalize_cpd(cpd: cgm.core.CPD) cgm.core.Factor

Marginalize out a conditional probability distribution.

Sum over all possible states of the CPD's variables, weighted by how probable each state is under the CPD.

Example:

g = cgm.CG()
X = g.node('X', 2)
Y = g.node('Y', 2)
phi1 = cgm.Factor([X, Y])
cpd = g.P(Y | X)
phi2 = phi1.marginalize_cpd(cpd)
print(phi2)
# ϕ(X)
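A rough NumPy picture of this operation, under the assumption that it multiplies the factor by the CPD and then sums out the CPD's child (which is what the ϕ(X) result in the example suggests), is sketched below. The arrays and numbers are invented, and the actual cgm behavior may differ in detail.

```python
import numpy as np

phi = np.array([[1.0, 3.0],
                [2.0, 4.0]])            # phi[x, y]
p_y_given_x = np.array([[0.25, 0.5],
                        [0.75, 0.5]])   # p[y, x]; each column sums to 1

# Weight each entry of phi by P(Y=y | X=x), then sum over y.
result = np.einsum("xy,yx->x", phi, p_y_given_x)
```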
max(variable: cgm.core.V) cgm.core.Factor

Return the maximum of the factor over the states of the given variable. Example: phi1.max(A)

argmax(variable: cgm.core.V) cgm.core.Factor

Find the state of the given variable that maximizes the factor. Example: phi1.argmax(A)
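Assuming both methods reduce over the axis that belongs to the given variable, max and argmax correspond to the following plain NumPy operations. The table and names are illustrative only.

```python
import numpy as np

scope = ("A", "B")
values = np.array([[0.2, 0.9],
                   [0.6, 0.4]])   # values[a, b]

axis = scope.index("A")
max_over_a = values.max(axis=axis)        # best value for each state of B
argmax_over_a = values.argmax(axis=axis)  # state of A that achieves it
```

The result has one entry per remaining assignment, here one per state of B.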

abs() cgm.core.Factor

Returns the absolute value of the factor.

normalize() cgm.core.Factor

Returns a new factor with the same distribution whose sum is 1.
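In NumPy terms this is a division by the total of the table, as in the short sketch below. The numbers are made up, and the sketch assumes the table has a nonzero sum.

```python
import numpy as np

values = np.array([[1.0, 3.0],
                   [2.0, 2.0]])

# Divide every entry by the grand total so the table sums to 1
# while the ratios between entries stay the same.
normalized = values / values.sum()
```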

increment_at_index(index: tuple[int, ...], amount) None

Increment the value of the factor at a particular index by amount.

condition(condition_dict: dict[cgm.core.V, int]) cgm.core.Factor

Condition on a set of variables.

Condition on a set of variables at particular values of those variables. condition_dict is a dictionary where each key is a variable to condition on and the value is an integer representing the index to condition on.

The scope of the returned factor will exclude all the variables conditioned on.
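Conditioning can be pictured as fixing an index along each conditioned axis. The following plain NumPy sketch uses a hypothetical condition helper and string variable names in place of cgm objects.

```python
import numpy as np

scope = ("A", "B", "C")
values = np.arange(8, dtype=float).reshape(2, 2, 2)


def condition(scope, values, assignment):
    """Fix the variables in `assignment` (name -> state index)."""
    # Use the given index for conditioned variables and a full slice otherwise.
    index = tuple(assignment.get(v, slice(None)) for v in scope)
    new_scope = tuple(v for v in scope if v not in assignment)
    return new_scope, values[index]


new_scope, sliced = condition(scope, values, {"B": 1})
```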

class cgm.core.CPD(scope: Sequence[cgm.core.CG_Node], values: numpy.ndarray | None = None, child: cgm.core.CG_Node | None = None, rng: numpy.random.Generator | None = None, virtual: bool = False)

Bases: cgm.core.Factor[cgm.core.CG_Node]

Conditional Probability Distribution

This is a type of factor with additional constraints. One variable in its scope is the child node; the others are the parents. For every assignment of the parents, the CPD must sum to 1 over the states of the child node. Additionally, the CPD cannot introduce cycles in the DAG.

Example:

  g = cgm.CG()
  A = g.node('A', 2)
  B = g.node('B', 2)
  C = g.node('C', 2)
  phi1 = g.P(A | B)
  phi2 = g.P(B | C)
  phi3 = g.P(C)
  print(g)

Initialization

Create a conditional probability distribution.

Args:
  scope: The scope of the CPD. The scope sets the order of the dimensions in the underlying array.
  values: The values of the CPD. If None, random values will be generated.
  child: The child node of the CPD. If child is None, the first variable in the scope is assumed to be the child.
  rng: A numpy random number generator used to set the values. Only used if values is None.
  virtual: If True, the CPD is not added to the DAG. This is useful for creating derived CPDs.
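A conditional distribution P(child | parents) sums to 1 over the child's states for each assignment of the parents. The sketch below shows one way to turn a random table into such a distribution with plain NumPy; the shapes and the choice of axis 0 for the child follow the "first variable in the scope" default, and everything else is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scope order (A, B) with A as the child, so axis 0 indexes the child's states.
raw = rng.random((2, 3))          # A has 2 states, B has 3 states
child_axis = 0

# Divide by the sum over the child axis so that, for every state of B,
# the probabilities of A's states add up to 1.
cpd = raw / raw.sum(axis=child_axis, keepdims=True)
```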

property child: cgm.core.CG_Node

Return the child node of the CPD.

property parents: frozenset[cgm.core.CG_Node]

Return the parents of the CPD.

property dag: DAG[CG_Node]

Return the DAG that the CPD is associated with.

_assert_nocycles()
_normalize()
normalize()
sample(num_samples: int, rng: numpy.random.Generator) tuple[numpy.ndarray, numpy.random.Generator]

Sample from the distribution
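For a node without parents, sampling reduces to drawing state indices from a discrete distribution. The sketch below covers only that case, using NumPy's Generator.choice; the sample function here is a hypothetical stand-in, and how cgm handles CPDs that have parents is not shown.

```python
import numpy as np


def sample(p, num_samples, rng):
    """Draw state indices from a discrete distribution p."""
    draws = rng.choice(len(p), size=num_samples, p=p)
    # Hand the generator back so the caller can continue the same stream.
    return draws, rng


p_c = np.array([0.2, 0.5, 0.3])   # prior over a three-state variable
samples, rng = sample(p_c, 1000, np.random.default_rng(42))
```

Returning the generator alongside the samples mirrors the signature above and lets callers chain further draws reproducibly.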

condition(condition_dict: dict[cgm.core.CG_Node, int]) cgm.core.CPD

Condition on a set of variables.

Condition on a set of variables at particular values of those variables. condition_dict is a dictionary where each key is a variable to condition on and the value is an integer representing the index to condition on.

The scope of the returned factor will exclude all the variables conditioned on.

marginalize_cpd(cpd: cgm.core.CPD) cgm.core.CPD

Marginalize out a distribution over a parent variable.

Sum over all possible states of a set of parent variables, weighted by the probability of each parent state.

set_scope(new_scope: Sequence[cgm.core.CG_Node]) cgm.core.CPD

Set the scope of the factor to the specified scope.

permute_scope(new_scope: Sequence[cgm.core.CG_Node]) cgm.core.CPD

Set the scope of the factor according to the specified permutation.

Must be a permutation of the original scope.

__repr__()
property table: cgm._format.FactorTableView

Access the CPD’s table representation.

class cgm.core.DAG(nodes: Sequence[cgm.core.D | None] | None = None)

Bases: typing.Generic[cgm.core.D]

Mutable Directed Acyclic Graph.

Initialization

property nodes: list[cgm.core.DAG_Node[cgm.core.D]]
get_parents(node: cgm.core.DAG_Node[cgm.core.D]) frozenset[cgm.core.DAG_Node[cgm.core.D]]

Return the parents of a node.

add_node(node: cgm.core.DAG_Node[cgm.core.D] | cgm.core.D, parents: FrozenSet[cgm.core.DAG_Node[cgm.core.D] | cgm.core.D] | set[cgm.core.DAG_Node[cgm.core.D] | cgm.core.D], replace: bool = False) None

Add a node to the graph.

_ancestor_dict() dict[cgm.core.DAG_Node[cgm.core.D], frozenset[cgm.core.DAG_Node[cgm.core.D]]]

Return a dictionary of ancestors for each node.

get_ancestors(node: cgm.core.DAG_Node[cgm.core.D]) frozenset[cgm.core.DAG_Node[cgm.core.D]]

Return the ancestors of a node.
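Ancestors are all nodes reachable by repeatedly following parent links, and the same traversal can be used to reject edges that would create a cycle. The sketch below works on a plain dictionary of parent sets; ancestors and creates_cycle are hypothetical helpers, not the cgm methods.

```python
def ancestors(parents, node):
    """Collect every node reachable by following parent links upward."""
    found = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return frozenset(found)


def creates_cycle(parents, child, new_parents):
    """True if making `new_parents` the parents of `child` would close a loop."""
    return any(p == child or child in ancestors(parents, p) for p in new_parents)


# C -> B -> A: each key maps a node name to the names of its parents.
parents = {"A": {"B"}, "B": {"C"}, "C": set()}
a_ancestors = ancestors(parents, "A")
```

Adding A as a parent of C would be rejected here, because C is already an ancestor of A.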

__repr__()
class cgm.core.CG

Causal Graph. Contains a list of CG_Nodes. The information about connectivity is stored in the DAG.

dag: cgm.core.DAG[cgm.core.CG_Node]

‘field(…)’

_cpd_dict: dict[cgm.core.CG_Node, cgm.core.CPD]

‘field(…)’

get_cpd(node: cgm.core.CG_Node) cgm.core.CPD | None

Get the CPD associated with a node.

set_cpd(node: cgm.core.CG_Node, cpd: cgm.core.CPD) None

Associate a CPD with a node.

node(name: str, num_states: int) cgm.core.CG_Node

Create a new node and return it.

property nodes: list[cgm.core.CG_Node]

Returns the list of CG_Nodes in the graph.

While the underlying DAG stores DAG_Node objects, this property reconstructs and returns the original CG_Node objects.

__repr__()
P(spec_or_node: cgm.core.CPDSpec | cgm.core.CG_Node, values: numpy.ndarray | None = None, **kwargs) cgm.core.CPD

Create a CPD using probability notation.

Args:
  spec_or_node: Either a CPDSpec from the | operator or a single node for priors.
  values: Optional values for the CPD.
  **kwargs: Additional arguments passed to the CPD constructor.