cgm.core
¶
The core module contains the basic building blocks of a Causal Graphical Model.
Module Contents¶
Classes¶
- Variable: A variable has a name and can take on a finite number of states.
- DAG_Node: A DAG (Directed Acyclic Graph) node is a variable in a DAG. A node can have multiple parents and multiple children, but no cycles can be created.
- CPDSpec: Helper class to hold the specification of a CPD created using the | operator.
- CG_Node: A Causal Graph Node.
- Factor: A factor is a function that has a list of variables in its scope, and maps every combination of variable values to a real number. In this implementation the mapping is stored as a np.ndarray. For example, if this factor’s scope is the variables {A, B, C}, and each of these is a binary variable, then the value of the factor for [A=1, B=0, C=1] can be accessed at self.values[1, 0, 1]. If the ndarray isn’t specified, a random one will be created.
- CPD: Conditional Probability Distribution.
- DAG: Mutable Directed Acyclic Graph.
- CG: Causal Graph. Contains a list of CG_Nodes. The information about connectivity is stored in the DAG.
Data¶
API¶
- cgm.core.DCovariant¶
‘TypeVar(…)’
- cgm.core.D¶
‘TypeVar(…)’
- class cgm.core.HasComparison¶
Bases:
typing.Protocol
- __lt__(other: cgm.core.HasComparison) bool ¶
- class cgm.core.HasParents¶
Bases:
typing.Protocol[cgm.core.DCovariant]
- property parents: FrozenSet[cgm.core.DCovariant]¶
- property ancestors: FrozenSet[cgm.core.DCovariant]¶
- class cgm.core.ComparableHasParents¶
Bases:
cgm.core.HasParents[cgm.core.DCovariant], cgm.core.HasComparison, typing.Protocol[cgm.core.DCovariant]
- class cgm.core.HasVariable¶
Bases:
typing.Protocol
- property name: str¶
- property num_states: int¶
- __lt__(other: cgm.core.HasVariable) bool ¶
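The protocols above are structural, so a class can satisfy them without inheriting from them. The following is a minimal sketch of that idea, not code from cgm: `HasVariableSketch`, `Coin`, and `sorted_names` are hypothetical names, and the protocol is re-declared locally so the example runs without the library.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasVariableSketch(Protocol):
    """Local stand-in mirroring the documented HasVariable members."""

    @property
    def name(self) -> str: ...

    @property
    def num_states(self) -> int: ...

    def __lt__(self, other: "HasVariableSketch") -> bool: ...


@dataclass(frozen=True)
class Coin:
    """Hypothetical class; it never inherits from the protocol."""

    name: str
    num_states: int = 2

    def __lt__(self, other: "Coin") -> bool:
        # Order by name so that collections of coins can be sorted.
        return self.name < other.name


def sorted_names(items: list) -> list:
    # sorted() only needs __lt__, which the protocol requires.
    return [item.name for item in sorted(items)]


coins = [Coin("B"), Coin("A")]
names = sorted_names(coins)
is_variable_like = isinstance(coins[0], HasVariableSketch)
```

Requiring `__lt__` in the protocol is what lets generic code sort nodes or variables deterministically without knowing their concrete type.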
- cgm.core.V¶
‘TypeVar(…)’
- class cgm.core.Variable(name: str, num_states: int)¶
Bases:
cgm.core.HasVariable
A variable has a name and can take on a finite number of states.
Initialization
- _name: str¶
None
- _num_states: int¶
None
- property name: str¶
- property num_states: int¶
- __repr__() str ¶
- __lt__(other) bool ¶
- __eq__(other) bool ¶
- __hash__() int ¶
- class cgm.core.DAG_Node¶
Bases:
cgm.core.HasParents[cgm.core.D], cgm.core.HasVariable, typing.Generic[cgm.core.D]
A DAG (Directed Acyclic Graph) node is a variable in a DAG. A node can have multiple parents and multiple children, but no cycles can be created.
- variable: cgm.core.Variable¶
None
- __post_init__()¶
- property name: str¶
Return the name of the variable.
- property parents: FrozenSet¶
- property ancestors: FrozenSet¶
- property num_states: int¶
Return the number of states the variable can take on.
- __repr__() str ¶
- __lt__(other) bool ¶
- __eq__(other) bool ¶
- __hash__() int ¶
- class cgm.core.CPDSpec¶
Helper class to hold specification of a CPD created using | operator
- class cgm.core.CG_Node¶
Bases:
cgm.core.HasParents, cgm.core.HasVariable
A Causal Graph Node
A CG_Node is a variable in a Bayesian Network. A node is associated with a single conditional probability distribution (CPD), which is a distribution over the variable given its parents. If the node has no parents, this CPD is a distribution over all the states of the variable.
Example:
    import cgm

    g = cgm.CG()
    A = g.node('A', 2)
    B = g.node('B', 2)
    C = g.node('C', 2)
    phi1 = g.P(A | B)
    phi2 = g.P(B | C)
    phi3 = g.P(C)
- dag_node: cgm.core.DAG_Node[cgm.core.CG_Node]¶
None
- classmethod from_params(name: str, num_states: int, cg: cgm.core.CG) cgm.core.CG_Node ¶
Create a new CG_Node with default CPD.
- property parents: frozenset[cgm.core.CG_Node]¶
- property ancestors: frozenset[cgm.core.CG_Node]¶
- property variable: cgm.core.Variable¶
- property name: str¶
- property num_states: int¶
- property cpd: Optional[cgm.core.CPD]¶
Get the CPD associated with this node.
- __repr__() str ¶
- __lt__(other) bool ¶
- __eq__(other) bool ¶
- __hash__() int ¶
- __or__(parents) cgm.core.CPDSpec ¶
Enable syntax like A | [B, C] for CPD creation
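The `A | [B, C]` syntax relies on Python's `__or__` hook. Below is a minimal sketch of how such an operator can collect a child and its parents into a specification object. `NodeSketch` and `SpecSketch` are hypothetical stand-ins for CG_Node and CPDSpec; the fields of the real CPDSpec are not documented here, so the `child` and `parents` attributes are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class SpecSketch:
    """Hypothetical stand-in for CPDSpec: a child and its parents."""

    child: NodeSketch
    parents: tuple


@dataclass(frozen=True)
class NodeSketch:
    """Hypothetical stand-in for CG_Node."""

    name: str

    def __or__(self, parents: Union[NodeSketch, Sequence[NodeSketch]]) -> SpecSketch:
        # Accept a single node (A | B) or a list of nodes (A | [B, C]).
        if isinstance(parents, NodeSketch):
            parents = [parents]
        return SpecSketch(child=self, parents=tuple(parents))


A, B, C = NodeSketch("A"), NodeSketch("B"), NodeSketch("C")
spec = A | [B, C]
single = A | B
```

The operator only records the request; in this sketch nothing is validated or added to a graph until some later call consumes the specification.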
- exception cgm.core.ScopeShapeMismatchError(expected_shape, actual_shape)¶
Bases:
Exception
Exception raised when the shape of a factor’s scope does not match the shape of its stored values array.
Initialization
- exception cgm.core.NonUniqueVariableNamesError(non_unique_names)¶
Bases:
Exception
Exception raised when the variables in a factor’s scope do not have unique names.
Initialization
- class cgm.core.Factor(scope: Sequence[cgm.core.V], values: numpy.ndarray | int | float | None = None, rng: numpy.random.Generator | None = None)¶
Bases:
typing.Generic[cgm.core.V]
A factor is a function that has a list of variables in its scope, and maps every combination of variable values to a real number. In this implementation the mapping is stored as a np.ndarray. For example, if this factor’s scope is the variables {A, B, C}, and each of these is a binary variable, then the value of the factor for [A=1, B=0, C=1] can be accessed at self.values[1, 0, 1]. If the ndarray isn’t specified, a random one will be created.
All variables in the scope must have unique names.
Factors ϕ1 and ϕ2 can be multiplied and divided by ϕ1 * ϕ2 and ϕ1 / ϕ2. A factor can be marginalized over a subset of its scope. For example, to marginalize out variables A and B, call ϕ.marginalize([A, B]).
Example:
    import cgm

    A = cgm.Variable('A', 2)
    B = cgm.Variable('B', 2)
    C = cgm.Variable('C', 2)
    phi1 = cgm.Factor([A, B, C])
    phi2 = cgm.Factor([B, C])
    phi3 = cgm.Factor([B, C])
    phi1 * phi2                 # factor product
    phi1 / phi2                 # factor division
    phi1.marginalize([A, B])    # sum out A and B
Args:
    scope: A list of variables that are in the scope of the factor.
    values: The values of the factor. If None, random values will be generated. If a scalar, all values will be set to that scalar.
    rng: A numpy random number generator. Only used if values is None.
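The storage layout described above can be illustrated with plain NumPy. This is a sketch that assumes only what the description states (one array axis per scope variable, in scope order). It does not use cgm, and the way the library actually draws its random values is not specified here.

```python
import numpy as np

# Scope order (A, B, C) fixes the axis order of the array.
scope = ("A", "B", "C")
num_states = {"A": 2, "B": 2, "C": 2}
shape = tuple(num_states[name] for name in scope)

# A table with one entry per joint assignment of A, B and C.
values = np.arange(8, dtype=float).reshape(shape)

# Entry for the assignment A=1, B=0, C=1.
entry = values[1, 0, 1]

# Scalar initialisation: every entry takes the same value.
ones = np.full(shape, 1.0)

# Random initialisation from a seeded generator, for reproducibility.
rng = np.random.default_rng(0)
random_values = rng.random(shape)
```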
Initialization
- classmethod get_null()¶
Return a factor with no scope and a single value of 1.0.
- property values: numpy.ndarray¶
Return the values of the factor.
- property shape: tuple[int, ...]¶
Return the shape of the factor’s values array.
- property scope: tuple[cgm.core.V, ...]¶
Return the scope of the factor.
- permute_scope(new_scope: Sequence[cgm.core.V]) cgm.core.Factor ¶
Set the scope of the factor according to the specified permutation.
Must be a permutation of the original scope.
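Because each scope variable corresponds to one array axis, permuting the scope amounts to transposing the array. A minimal NumPy sketch under that assumption follows; the variable names are plain strings chosen for illustration, not cgm objects.

```python
import numpy as np

old_scope = ("A", "B", "C")
new_scope = ("C", "A", "B")
values = np.arange(2 * 3 * 4).reshape(2, 3, 4)  # axes follow old_scope

# A permutation must contain exactly the same variables.
assert sorted(new_scope) == sorted(old_scope)

# For each axis of the new array, look up where it lived in the old one.
axes = [old_scope.index(name) for name in new_scope]
permuted = np.transpose(values, axes)
```

After the transpose, `permuted[c, a, b]` holds the same number as `values[a, b, c]`, so the function is unchanged and only the axis order differs.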
- set_scope(new_scope: Sequence[cgm.core.V]) cgm.core.Factor ¶
Set the scope of the factor to the specified scope.
- _check_input()¶
- __repr__()¶
- property table: cgm._format.FactorTableView¶
Access the factor’s table representation.
Returns: FactorTableView object that can be used either as a property (for default view) or as a method (for custom views)
- _get_random_values(rng: numpy.random.Generator)¶
- _normalize_dimensions(other: cgm.core.Factor) tuple[numpy.ndarray, numpy.ndarray, tuple[cgm.core.V, ...]] ¶
Expand and permute the dimensions of the two factors to match.
This is required for factor multiplication, division, addition, and subtraction.
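One way to picture this alignment step is sketched below in plain NumPy. The helper `align` is hypothetical and is not the library's implementation: it reorders each array to a shared union scope and inserts size-1 axes for missing variables, so that ordinary broadcasting can combine the two tables.

```python
import numpy as np


def align(scope1, values1, scope2, values2):
    """Return both arrays expanded to a shared scope (illustrative helper)."""
    # Union scope: scope1 first, then anything new from scope2.
    union = list(scope1) + [v for v in scope2 if v not in scope1]

    def expand(scope, values):
        # Move existing axes into union order, then add size-1 axes
        # for variables this factor does not mention.
        present = [v for v in union if v in scope]
        moved = np.transpose(values, [scope.index(v) for v in present])
        shape = [moved.shape[present.index(v)] if v in scope else 1
                 for v in union]
        return moved.reshape(shape)

    return expand(scope1, values1), expand(scope2, values2), tuple(union)


phi1 = np.arange(6.0).reshape(2, 3)    # scope (A, B)
phi2 = np.arange(12.0).reshape(4, 3)   # scope (C, B)
a, b, union = align(("A", "B"), phi1, ("C", "B"), phi2)
product = a * b                        # broadcasts to scope (A, B, C)
```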
- __mul__(other: Factor | int | float) cgm.core.Factor ¶
Factor product as defined in PGM Definition 4.2 (Koller 2009).
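For two factors ϕ1(A, B) and ϕ2(B, C), the factor product is ψ(A, B, C) = ϕ1(A, B) · ϕ2(B, C): entries are multiplied wherever the shared variable B takes the same state. The sketch below shows that definition with NumPy alone. The numbers are illustrative, and the code is not taken from cgm.

```python
import numpy as np

phi1 = np.array([[0.5, 0.8],
                 [0.1, 0.0],
                 [0.3, 0.9]])   # scope (A, B), shape 3 x 2
phi2 = np.array([[0.5, 0.7],
                 [0.1, 0.2]])   # scope (B, C), shape 2 x 2

# psi(a, b, c) = phi1(a, b) * phi2(b, c); B is matched, not summed out.
psi = np.einsum("ab,bc->abc", phi1, phi2)

# The same product written with explicit broadcasting.
psi_broadcast = phi1[:, :, None] * phi2[None, :, :]
```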
- __rmul__(other: int | float) cgm.core.Factor ¶
- __add__(other: Factor | int | float) cgm.core.Factor ¶
- __radd__(other: int | float) cgm.core.Factor ¶
- __sub__(other: Factor | int | float) cgm.core.Factor ¶
- __truediv__(other: cgm.core.Factor) cgm.core.Factor ¶
- marginalize(variables: List[cgm.core.V]) Factor[V] ¶
Sum over all possible states of a list of variables. Example: phi3.marginalize([A, B])
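In array terms, marginalizing is a sum over the axes that belong to the removed variables. A minimal NumPy sketch, assuming one axis per scope variable and using strings in place of cgm variables:

```python
import numpy as np

scope = ("A", "B", "C")
values = np.arange(8, dtype=float).reshape(2, 2, 2)

# Sum out A and B; only C remains in the scope.
to_remove = ["A", "B"]
axes = tuple(scope.index(name) for name in to_remove)
marginal = values.sum(axis=axes)
remaining_scope = tuple(name for name in scope if name not in to_remove)
```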
- marginalize_cpd(cpd: cgm.core.CPD) cgm.core.Factor ¶
Marginalize out a conditional probability distribution.
Sum over all possible states of the variables in the CPD, weighted by the probability the CPD assigns to each state.
Example:
    import cgm

    g = cgm.CG()
    X = g.node('X', 2)
    Y = g.node('Y', 2)
    phi1 = cgm.Factor([X, Y])
    cpd = cgm.CPD([Y, X])   # Y is the child (first in scope), X its parent
    phi2 = phi1.marginalize_cpd(cpd)
    print(phi2)  # ϕ(X)
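One reading of this operation, for a factor ϕ1(X, Y) and a CPD P(Y | X), is ϕ2(X) = Σ_y ϕ1(X, y) · P(y | X). The NumPy sketch below follows that reading. It is an interpretation of the description rather than the library's code, and the axis layout of the CPD array is an assumption.

```python
import numpy as np

# phi1 has scope (X, Y); cpd holds P(Y | X) with axes (Y, X).
phi1 = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
cpd = np.array([[0.9, 0.2],
                [0.1, 0.8]])  # each column sums to 1 over Y

# Weight each phi1(x, y) by P(y | x), then sum out Y.
phi2 = np.einsum("xy,yx->x", phi1, cpd)
```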
- max(variable: cgm.core.V) cgm.core.Factor ¶
Return the maximum value of the factor over the states of the given variable. Example: phi3.max(A)
- argmax(variable: cgm.core.V) cgm.core.Factor ¶
Find the state of the variable that maximizes the factor. Example: phi3.argmax(A)
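The difference between max and argmax can be seen on a small table. This sketch uses NumPy directly and assumes the result keeps one entry per assignment of the remaining variables; the exact form of the Factor returned by cgm is not shown here.

```python
import numpy as np

scope = ("A", "B")
values = np.array([[0.1, 0.7],
                   [0.6, 0.2]])

axis = scope.index("A")
max_over_a = values.max(axis=axis)        # best value for each state of B
argmax_over_a = values.argmax(axis=axis)  # state of A that achieves it
```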
- abs() cgm.core.Factor ¶
Returns the absolute value of the factor.
- normalize() cgm.core.Factor ¶
Returns a new factor with the same distribution whose sum is 1.
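Normalizing a factor in this sense divides every entry by the total, so relative weights are preserved. A short NumPy sketch of that arithmetic, with illustrative numbers:

```python
import numpy as np

values = np.array([[1.0, 3.0],
                   [2.0, 2.0]])

# Divide by the grand total so that all entries together sum to 1.
normalized = values / values.sum()
```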
- increment_at_index(index: tuple[int, ...], amount) None ¶
Increment the value of the factor at a particular index by amount.
- condition(condition_dict: dict[cgm.core.V, int]) cgm.core.Factor ¶
Condition on a set of variables.
Condition on a set of variables at particular values of those variables. condition_dict is a dictionary where each key is a variable to condition on and the value is an integer representing the index to condition on.
The scope of the returned factor will exclude all the variables conditioned on.
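Conditioning fixes some axes at a given index and keeps the others whole. A minimal NumPy sketch of that indexing, using strings as stand-ins for the variables in condition_dict:

```python
import numpy as np

scope = ("A", "B", "C")
values = np.arange(24).reshape(2, 3, 4)
condition = {"B": 2, "A": 1}

# Build an index that fixes conditioned axes and keeps the rest whole.
index = tuple(condition.get(name, slice(None)) for name in scope)
conditioned = values[index]

# Conditioned variables drop out of the scope.
new_scope = tuple(name for name in scope if name not in condition)
```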
- class cgm.core.CPD(scope: Sequence[cgm.core.CG_Node], values: numpy.ndarray | None = None, child: cgm.core.CG_Node | None = None, rng: numpy.random.Generator | None = None, virtual: bool = False)¶
Bases:
cgm.core.Factor[cgm.core.CG_Node]
Conditional Probability Distribution
This is a type of factor with additional constraints. One variable in its scope is the child node; the others are the parents. Summed over the states of the child node, the CPD must equal 1 for every particular assignment of the parents. Additionally, the CPD cannot introduce cycles in the DAG.
Example:
    import cgm

    g = cgm.CG()
    A = g.node('A', 2)
    B = g.node('B', 2)
    C = g.node('C', 2)
    phi1 = g.P(A | B)
    phi2 = g.P(B | C)
    phi3 = g.P(C)
    print(g)
Initialization
Create a conditional probability distribution.
Args:
    scope: The scope of the CPD. The scope sets the order of the dimensions in the underlying array.
    values: The values of the CPD. If None, random values will be generated.
    child: The child node of the CPD. If child is None, the first variable in the scope is assumed to be the child.
    rng: A numpy random number generator used to set the values. Only used if values is None.
    virtual: If True, the CPD is not added to the DAG. This is useful for creating derived CPDs.
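A conditional probability table is normalized along the child's axis: for each assignment of the parents, the entries over the child's states sum to 1. The sketch below turns a random table into one that satisfies this, assuming the child occupies axis 0 as it would when it comes first in the scope. It is an illustration with NumPy, not the library's normalization routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Child has 2 states (axis 0); its single parent has 3 states (axis 1).
raw = rng.random((2, 3))
child_axis = 0

# Divide each parent column by its total over the child's states.
cpd = raw / raw.sum(axis=child_axis, keepdims=True)
column_sums = cpd.sum(axis=child_axis)
```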
- property child: cgm.core.CG_Node¶
Return the child node of the CPD.
- property parents: frozenset[cgm.core.CG_Node]¶
Return the parents of the CPD.
- _assert_nocycles()¶
- _normalize()¶
- normalize()¶
- sample(num_samples: int, rng: numpy.random.Generator) tuple[numpy.ndarray, numpy.random.Generator] ¶
Sample from the distribution
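The documented return type pairs the samples with the generator, which lets callers keep threading the same random state through later calls. The sketch below shows that pattern for the simplest case, a distribution with no parents. `sample_states` is a hypothetical helper, and how cgm samples when parents are present is not described here.

```python
import numpy as np


def sample_states(probs, num_samples, rng):
    """Draw state indices from a 1-D distribution; return samples and rng."""
    samples = rng.choice(len(probs), size=num_samples, p=probs)
    return samples, rng


rng = np.random.default_rng(42)
probs = np.array([0.2, 0.5, 0.3])
samples, rng = sample_states(probs, 1000, rng)
```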
- condition(condition_dict: dict[cgm.core.CG_Node, int]) cgm.core.CPD ¶
Condition on a set of variables.
Condition on a set of variables at particular values of those variables. condition_dict is a dictionary where each key is a variable to condition on and the value is an integer representing the index to condition on.
The scope of the returned factor will exclude all the variables conditioned on.
- marginalize_cpd(cpd: cgm.core.CPD) cgm.core.CPD ¶
Marginalize out a distribution over a parent variable.
Sum over all possible states of a set of parent variables, weighted by how probable the parent is.
- set_scope(new_scope: Sequence[cgm.core.CG_Node]) cgm.core.CPD ¶
Set the scope of the factor to the specified scope.
- permute_scope(new_scope: Sequence[cgm.core.CG_Node]) cgm.core.CPD ¶
Set the scope of the factor according to the specified permutation.
Must be a permutation of the original scope.
- __repr__()¶
- property table: cgm._format.FactorTableView¶
Access the CPD’s table representation.
- class cgm.core.DAG(nodes: Sequence[cgm.core.D | None] | None = None)¶
Bases:
typing.Generic[cgm.core.D]
Mutable Directed Acyclic Graph.
Initialization
- property nodes: list[cgm.core.DAG_Node[cgm.core.D]]¶
- get_parents(node: cgm.core.DAG_Node[cgm.core.D]) frozenset[cgm.core.DAG_Node[cgm.core.D]] ¶
Return the parents of a node.
- add_node(node: cgm.core.DAG_Node[cgm.core.D] | cgm.core.D, parents: FrozenSet[cgm.core.DAG_Node[cgm.core.D] | cgm.core.D] | set[cgm.core.DAG_Node[cgm.core.D] | cgm.core.D], replace: bool = False) None ¶
Add a node to the graph.
- _ancestor_dict() dict[cgm.core.DAG_Node[cgm.core.D], frozenset[cgm.core.DAG_Node[cgm.core.D]]] ¶
Return a dictionary of ancestors for each node.
- get_ancestors(node: cgm.core.DAG_Node[cgm.core.D]) frozenset[cgm.core.DAG_Node[cgm.core.D]] ¶
Return the ancestors of a node.
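Ancestors are the nodes reachable by repeatedly following parent links, and the same set gives a simple acyclicity test for a new edge. The sketch below works on a plain dictionary mapping each node to its parents. Both functions are hypothetical illustrations rather than the DAG class's internals.

```python
def ancestors(node, parents):
    """All nodes reachable by following parent links from `node`."""
    found = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return frozenset(found)


def would_create_cycle(child, new_parent, parents):
    """An edge new_parent -> child closes a cycle if child is already
    an ancestor of new_parent, or if the two are the same node."""
    return child == new_parent or child in ancestors(new_parent, parents)


# Chain C -> B -> A, stored as node -> set of parents.
parents = {"A": {"B"}, "B": {"C"}, "C": set()}
a_ancestors = ancestors("A", parents)
cycle = would_create_cycle("C", "A", parents)    # A -> C would loop back
no_cycle = would_create_cycle("A", "C", parents)  # C -> A is a shortcut
```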
- __repr__()¶
- class cgm.core.CG¶
Causal Graph. Contains a list of CG_Nodes. The information about connectivity is stored in the DAG.
- dag: cgm.core.DAG[cgm.core.CG_Node]¶
‘field(…)’
- _cpd_dict: dict[cgm.core.CG_Node, cgm.core.CPD]¶
‘field(…)’
- get_cpd(node: cgm.core.CG_Node) cgm.core.CPD | None ¶
Get the CPD associated with a node.
- set_cpd(node: cgm.core.CG_Node, cpd: cgm.core.CPD) None ¶
Associate a CPD with a node.
- node(name: str, num_states: int) cgm.core.CG_Node ¶
Create a new node and return it.
- property nodes: list[cgm.core.CG_Node]¶
Returns the list of CG_Nodes in the graph.
While the underlying DAG stores DAG_Node objects, this property reconstructs and returns the original CG_Node objects.
- __repr__()¶
- P(spec_or_node: cgm.core.CPDSpec | cgm.core.CG_Node, values: numpy.ndarray | None = None, **kwargs) cgm.core.CPD ¶
Create a CPD using probability notation.
Args:
    spec_or_node: Either a CPDSpec from the | operator or a single node for priors.
    values: Optional values for the CPD.
    **kwargs: Additional arguments passed to the CPD constructor.