GTN¶
Documentation for GTN, a library for automatic differentiation with weighted finite-state tranducers (WFSTs).
Building and Installing¶
Build Requirements¶
A C++ compiler with good C++14 support (e.g. g++ >= 5.0)
cmake – version 3.5.1 or later, and
make
Building and Installing with CMake¶
Building¶
Currently, GTN must be built and installed from source.
First, clone gtn from its repository on Github:
git clone https://github.com/facebookresearch/gtn.git && cd gtn
Create a build directory and run CMake and make:
mkdir -p build && cd build
cmake ..
make -j $(nproc)
Run tests with:
make test
Install with:
make install
Build Options¶
Options |
Configurations |
Default Value |
---|---|---|
GTN_BUILD_TESTS |
ON, OFF |
ON |
GTN_BUILD_BENCHMARKS |
ON, OFF |
ON |
GTN_BUILD_EXAMPLES |
ON, OFF |
ON |
GTN_BUILD_PYTHON_BINDINGS |
ON, OFF |
OFF |
CMAKE_BUILD_TYPE |
Debug |
Linking your Project with CMake¶
Once flashlight is built and installed, including it in another project is simple with a CMake imported target. In your CMake list, add:
find_package(gtn REQUIRED)
# Create myCompiledTarget, etc.
# ...
target_link_libraries(myCompiledTarget PUBLIC gtn::gtn)
Your target’s files will be linked with the library and can include headers (e.g. #include <gtn/gtn.h>
) directly.
Python Bindings¶
Basic Usage¶
Construction¶
Start by constructing a gtn::Graph
and adding nodes and arcs.
Graph graph;
// Add a start node
graph.addNode(true);
// Add an accept node
graph.addNode(false, true);
// Add an internal node
graph.addNode();
// Add an arc from node 0 to 2 with label 0
graph.addArc(0, 2, 0);
// Add an arc from node 0 to 2 with input label 1 and output label 1
graph.addArc(0, 2, 1, 1);
// Add an arc from node 2 to 1 with input label 0, output label 0 and weight 2
graph.addArc(2, 1, 0, 0, 2);
We can visualize the graph
with draw()
.
By default a gtn::Graph
requires a gradient. To disable gradient
computation use the constructer parameter (e.g. Graph(false)
) or the method
Graph::setCalcGrad()
.
Nodes and arcs have unique integer ids in the graph
. When you add a node or
an arc you can capture the id as the return value. This can make adding arcs
from node ids simpler in some cases.
Graph g;
auto n1 = g.addNode();
auto n2 = g.addNode();
auto a1 = g.addArc(n1, n2, 0, 1);
Input and Output¶
The GTN framework provides some convenience functions to visualize graphs. We
can print them to stdout
or write them to Graphviz’s .dot
format with draw()
.
// print graph to stdout
std::cout << graph << std::endl;
// Draw the graph in dot format to file
draw(graph, "simple_fsa.dot", {{0, "a"}, {1, "b"}, {2, "c"}});
Graph’s in .dot
format can be compiled to different images formats. For
example, to make a PDF image use
dot -Tpdf graph.dot -o graph.pdf
We can save graphs in binary with save()
or as a text file with
saveTxt()
. Similarly we can load graphs in binary format with
load()
or from a text file with loadTxt()
.
// save in binary format
save("myGraph.bin", graph);
// save in text format
saveTxt("myGraph.txt", graph);
// load in binary format
graph = load("myGraph.bin");
// load in text format
graph = loadTxt("myGraph.txt");
We can save and load graphs from input streams as well as files. The text format of the graph is described below.
std::stringstream in(
// First line is space separated start states
"0\n"
// Second line is space separated accept states
"1\n"
// The remaining lines are a list of arcs:
// <source node> <dest node> <ilabel> [olabel] [weight]
// where the olabel defaults to the ilabel if it is not specified
// and the weight defaults to 0.0 if it is not specified.
"0 2 0\n" // olabel = 0, weight = 0.0
"0 2 1 1\n" // olabel = 1, weight = 0.0
"2 1 0 0 2\n"); // olabel = 0, weight = 2.0
Graph other_graph = load(in);
Comparisons¶
There are several ways to compare two graphs in GTN. Use equal()
to
check for an exact match between two graphs. In this case nodes with the same
id should have the same arcs. The function isomorphic()
checks that
the graph structures are the same regardless of the actual node ids. This can
be expensive for large graphs.
WFSAs and WFSTs of different structure can be equivalent in the sense that they
accept or transduce the same paths with the same scores. To check this
condition, use randEquivalent()
. The function
randEquivalent()
uses a Monte Carlo sampling strategy to assess the
equivalence between two graphs and may incorrectly assume the graphs are
equivalent if not enough samples are used.
Example Graphs¶
Graphs in GTN can have multiple start and accept nodes. Start nodes have bold circles and accepting nodes are denoted by concentric circles in the figure below.
Graphs can also have cycles, though not every function supports cyclic graphs.
GTN also allows \(\epsilon\) transitions in graphs. Use
epsilon
to add \(\epsilon\) labels for arcs.
Common Functions¶
GTN supports a range of functions on WFSAs and WFSTs from the simple rational functions like concatenation to the powerful composition operation. Unless explicitly noted, every function in GTN is differentiable with respect to its inputs.
Union¶
Use union_()
to compute the union of a std::vector
of graphs.
The union of two graphs \(\mathcal{A}_1 + \mathcal{A}_2\) accepts any sequence accepted by either input graph.
Concatenation¶
Use concat()
to compute the concatenation of two graphs or a
std::vector
of graphs.
The concatenation of two graphs, \(\mathcal{A}_1\mathcal{A}_2\) accepts any path \({\bf p}{\bf r}\) such that \({\bf p}\) is accepted by \(\mathcal{A}_1\) and \({\bf r}\) is accepted by \(\mathcal{A}_2\).
Closure¶
Use closure()
to compute the Kleene closure of a graph.
The Kleene closure, \(\mathcal{A}^*\), accepts any sequence accepted by the original graph repeated 0 or more times (0 repeats is the empty sequence, \(\epsilon\).).
Intersection¶
Use intersect()
to compute the intersection of two acceptors.
The intersection, \(\mathcal{A}_1 \circ \mathcal{A}_2\) accepts any path which is accepted by both input graphs. The score for a path in the intersected graph is the sum of the scores of the path from each input graph.
Compose¶
Use compose()
to compute the composition of two transducers.
The composition, \(\mathcal{T}_1 \circ \mathcal{T}_2\) transduces \({\bf p} \rightarrow {\bf u}\) if the first input transduces \({\bf p} \rightarrow {\bf r}\) and the second graph transduces \({\bf r} \rightarrow {\bf u}\). As in intersection, the score of the transduction in the composed graph is the sum of the scores of the transduction from each input graph.
Forward Score¶
Use forwardScore()
to compute the forward score of a graph.
The forward algorithm computes the log-sum-exp of the scores of all accepting
paths in a graph. The graph must not have any cycles. Use
Graph::item()
on the output of forwardScore()
to access
the score.
In the above example the graph has three paths:
\(0 \rightarrow 1 \rightarrow 2 \rightarrow 3\) with a score of 4.6
\(0 \rightarrow 2 \rightarrow 3\) with a score of 5.3
\(1 \rightarrow 2 \rightarrow 3\) with a score of 3.5
The resulting forward score is \(\log(e^{4.6} + e^{5.3} + e^{3.5}) = 5.81\).
Viterbi Score¶
Use viterbiScore()
to compute the Viterbi score of a graph.
The Viterbi algorithm gives the max of the scores of all accepting paths in
a graph. The graph must not have any cycles. Use Graph::item()
on the
output of viterbiScore()
to access the score.
The Viterbi score of the figure in the section Forward Score is 5.3, the highest score over all paths.
Viterbi Path¶
Use viterbiPath()
to compute the Viterbi path of a graph.
The Viterbi algorithm gives the highest scoring path of all accepting paths in
a graph. The graph must not have any cycles. The output of
viterbiPath()
is a chain graph representing the highest scoring path.
Autograd Basics¶
Whenever we apply a function on gtn::Graph
(s), if at least one of
its inputs requires a gradient, the output gtn::Graph
records its
inputs, and a method by which to compute its gradient. The gradients are
computed with a GradFunc
, a function that takes the gradient with respect
to the gtn::Graph
and computes the gradient with respect to its
inputs.
To recursively compute the gradient of a gtn::Graph
with respect
to all of its inputs in the computation graph, call gtn::backward()
on the gtn::Graph
. This iteratively computes the gradients by
repeatedly applying the chain rule in a reverse topological ordering of the
computation graph.
Here is a simple example
auto g1 = Graph(true);
g1.addNode(true);
g1.addNode(false, true);
g1.addArc(0, 1, 0, 0, 1.0);
auto g2 = Graph(false);
g2.addNode(true);
g2.addNode(false, true);
g2.addArc(0, 1, 0, 0, 2.0);
auto g3 = Graph(true);
g3.addNode(true);
g3.addNode(false, true);
g3.addArc(0, 1, 0, 0, 3.0);
auto g4 = negate(subtract(add(g1, g2), g3));
g4.backward();
auto g1Grad = g1.grad(); // g1Grad.item() = -1
auto g3Grad = g3.grad(); // g3Grad.item() = 1
Warning
Calling g2.grad()
will throw an exception since calcGrad
is set to false
for the graph
Retaining the Computation Graph
The gtn::backward()
function takes an optional boolean parameter,
retainGraph
, which is false
by default. When the argument is false,
each gtn::Graph
is cleared from the computation graph during the
backward pass as soon as it is no longer needed. This reduces peak memory
usage while computing gradients. Setting retainGraph
to true
is not
recommended unless there is a need to call backward multiple times without
redoing the forward computation.
Let’s consider the computation graph that is built with the above example
Assuming we need to call gtn::backward()
only once on this graph, we
can see that the intermediate Graph gSub
can be deleted as soon as as the
gradients of g3
and gSum
are computed. This will happen when
retainGraph
is set to false
.
Interfacing with PyTorch¶
Adding a GTN function or layer to PyTorch is just like adding any custom extension to PyTorch. For the details, take a look at an example which constructs a custom loss function in PyTorch with GTN. We’ll go over the example here at a high-level with attention to the bits specific to GTN.
First declare the class which should inherit from torch.autograd.Function
.
class GTNLossFunction(torch.autograd.Function):
"""
A minimal example of adding a custom loss function built with GTN graphs to
PyTorch.
The example is a sequence criterion which computes a loss between a
frame-level input and a token-level target. The tokens in the target can
align to one or more frames in the input.
"""
The GTNLossFunction
requires static forward
and a backward
methods.
The forward method, with some additional comments, is copied below:
@staticmethod
def forward(ctx, inputs, targets):
B, T, C = inputs.shape
losses = [None] * B
emissions_graphs = [None] * B
# Move data to the host as GTN operations run on the CPU:
device = inputs.device
inputs = inputs.cpu()
targets = targets.cpu()
# Compute the loss for the b-th example:
def forward_single(b):
emissions = gtn.linear_graph(T, C, inputs.requires_grad)
# *NB* A reference to the `data` should be held explicitly when
# using `data_ptr()` otherwise the memory may be claimed before the
# weights are set. For example, the following is undefined and will
# likely cause serious issues:
# `emissions.set_weights(inputs[b].contiguous().data_ptr())`
data = inputs[b].contiguous()
emissions.set_weights(data.data_ptr())
target = GTNLossFunction.make_target_graph(targets[b])
# Score the target:
target_score = gtn.forward_score(gtn.intersect(target, emissions))
# Normalization term:
norm = gtn.forward_score(emissions)
# Compute the loss:
loss = gtn.subtract(norm, target_score)
# We need the save the `loss` graph to call `gtn.backward` and we
# need the `emissions` graph to access the gradients:
losses[b] = loss
emissions_graphs[b] = emissions
# Compute the loss in parallel over the batch:
gtn.parallel_for(forward_single, range(B))
# Save some graphs and other data for backward:
ctx.auxiliary_data = (losses, emissions_graphs, inputs.shape)
# Put losses back in a torch tensor and move them back to the device:
return torch.tensor([l.item() for l in losses]).to(device)
To perform the backward computation, we save losses
, a list which holds the
individual graphs storing the loss for each example. We also save the
emissions_graphs
so that we can access the gradients in order to construct
the full gradient with respect to the input
tensor.
GTN provides parallel_for()
to run computations in parallel. We us it
here to parallelize the forward computation and backward computations over
examples in the batch. Using parallel_for()
requires some care. For
example notice the lines:
losses = [None] * B
emissions_graphs = [None] * B
These lists must be preconstructed so that threads can insert into the list
rather than constructively appending to it, which could cause a race condition
during the execution of forward_single
.
The backward
method is very simple. It just calls backward()
on
each loss
graph and accumulating the gradients into a
torch.Tensor
.
@staticmethod
def backward(ctx, grad_output):
losses, emissions_graphs, in_shape = ctx.auxiliary_data
B, T, C = in_shape
input_grad = torch.empty((B, T, C))
# Compute the gradients for each example:
def backward_single(b):
gtn.backward(losses[b])
emissions = emissions_graphs[b]
grad = emissions.grad().weights_to_numpy()
input_grad[b] = torch.from_numpy(grad).view(1, T, C)
# Compute gradients in parallel over the batch:
gtn.parallel_for(backward_single, range(B))
return grad_output.unsqueeze(1).unsqueeze(1) * input_grad.to(grad_output.device), None
Graph¶
-
class Graph¶
A
Graph
class to perform automatic differentiation with weighted finite-state acceptors (WFSAs) and transducers (WFSTs).Example:
Graph graph; graph.addNode(true); // Add a start node graph.addNode(); // Add an internal node graph.addNode(false, true); // Add an accept node // Add an arc from node 0 to 1 with ilabel 0, olabel 1 and weight 2.0 graph.addArc(0, 1, 0, 1, 2.0); // Add an arc from node 1 to 2 with ilabel 1, olabel 2 and weight 1.0 graph.addArc(1, 2, 1, 2, 1.0); // Compute the Viterbi score of the graph auto score = viterbiScore(graph); print(score); // Print the score graph to std out backward(score); // Compute the gradient graph.grad(); // Access the gradient graph.zeroGrad(); // Clear the gradient
All operations are in the log or tropical semirings. The default score for an arc is
0
(e.g. the multiplicative identity) and the additive identity is-infinity
. Path scores are accumulated with log-sum-exp or max operations and the score for a path is accumulated with addition.
Graph-level Methods¶
-
Graph(bool calcGrad = true)¶
Construct a
Graph
.- Parameters
calcGrad – Whether or not to compute gradients with respect to this graph when calling
gtn::backward
.
-
int addNode(bool start = false, bool accept = false)¶
Adds a node to the graph.
- Parameters
start – Indicates if the node is a starting node.
accept – Indicates if the node is an accepting node.
- Returns
The id of the node (used for e.g. adding arcs).
-
size_t addArc(size_t srcNode, size_t dstNode, int label)¶
Add a arc between two nodes.
This assumes the graph is an acceptor, the input label on the arc is the same as the output label.
- Parameters
srcNode – The id of the source node.
dstNode – The id of the destination node.
label – The arc label.
- Returns
The id of the added arc.
-
size_t addArc(size_t srcNode, size_t dstNode, int ilabel, int olabel, float weight = 0.0)¶
Add a arc between two nodes.
- Parameters
srcNode – The id of the source node.
dstNode – The id of the destination node.
ilabel – The arc input label.
olabel – The arc output label.
weight – The arc weight.
- Returns
The id of the added arc.
-
inline size_t numArcs() const¶
The number of arcs in the graph.
-
inline size_t numNodes() const¶
The number of nodes in the graph.
-
inline size_t numStart() const¶
The number of starting nodes in the graph.
-
inline size_t numAccept() const¶
The number of accepting nodes in the graph.
-
float item() const¶
Get the weight on a single arc graph.
-
static Graph deepCopy(const Graph &src)¶
A deep copy of a graph
src
which is not recorded in the autograd tape.For a version which is recorded in the autograd tape see
gtn::clone
.
-
void arcSort(bool olabel = false)¶
Sort the arcs entering and exiting a node in increasing order by arc in label or out label if
olabel == true
.This function is intended to be used prior to calls to
intersect
andcompose
to improve the efficiency of the algorithm.
-
inline void markArcSorted(bool olabel = false)¶
Mark a graph’s arcs as sorted.
If
olabel == false
then the graph will be marked as sorted by arc input labels, otherwise it will be marked as sorted by the arc output labels.
-
inline bool ilabelSorted() const¶
Check if the arcs entering and exiting every node are sorted by input label.
-
inline bool olabelSorted() const¶
Check if the arcs entering and exiting every node are sorted by output label.
-
inline float *weights()¶
Returns an array of weights from a graph.
The array will contain
Graph::numArcs()
elements.
-
inline const float *weights() const¶
A
const
version ofGraph::weights
.
-
void setWeights(const float *weights)¶
Set the arc weights on a graph.
The
weights
array must haveGraph::numArcs()
elements.
-
void labelsToArray(int *out, bool ilabel = true)¶
Extract an array of labels from a graph.
The array should have space for
Graph::numArcs()
elements.- Parameters
out – [out] A pointer to the buffer to populate with labels.
ilabel – [in] Retreive ilabels if true, otherwise gets olabels.
-
std::vector<int> labelsToVector(bool ilabel = true)¶
Extract a
std::vector
of labels from the graph.See
Graph::labelsToArray
.
Autograd Methods¶
-
void addGrad(std::vector<float> &&other)¶
Add a
std::vector
of gradients to the gradient graph weights without making a copy ofother
.The
Graph::addGrad
methods are intended for use by the autograd. This overload is used with anrvalue
orstd::move
to avoid an extra copy:graph.addGrad(std::move(graphGrad));
-
void addGrad(const std::vector<float> &other)¶
Add a
std::vector
of gradients to the gradient graph weights.The
Graph::addGrad
methods are intended for use by the autograd.
-
void addGrad(const Graph &other)¶
Add a
Graph
of gradients to the gradient graph.The
Graph::addGrad
methods are intended for use by the autograd.
-
inline bool calcGrad() const¶
Check if a graph requires a gradient.
-
inline bool isGradAvailable() const¶
Check if a graph’s gradient is computed.
-
const Graph &grad() const¶
A
const
version ofGraph::grad
.
-
void setCalcGrad(bool calcGrad)¶
Specify if the gradient for this graph should be computed.
-
void zeroGrad()¶
Clear the graph’s gradients.
-
std::uintptr_t id()¶
A unique identifier for a graph.
Intended for use by the autograd.
-
inline GradFunc gradFunc()¶
Get the gradient function of a graph.
Intended for use by the autograd.
-
inline void setGradFunc(GradFunc gradFunc)¶
Set the gradient function of a graph.
Intended for use by the autograd.
-
inline std::vector<Graph> &inputs() const¶
Get the vector of inputs used in the autograd computation graph.
Intended for use by the autograd.
Node Accessors¶
-
inline const std::vector<int> &start() const¶
Get the indices of the start nodes of the graph.
-
inline const std::vector<int> &accept() const¶
Get the indices of the accepting nodes of the graph.
-
inline bool isStart(size_t i) const¶
Check if the
i
-th node is a start node.
-
inline bool isAccept(size_t i) const¶
Check if the
i
-th node is an accepting node.
-
inline void makeAccept(size_t i)¶
Make the the
i
-th node an accepting node.
-
inline size_t numOut(size_t i) const¶
The number of outgoing arcs from the
i
-th node.
-
inline const std::vector<int> &out(size_t i) const¶
Get the indices of outgoing arcs from the
i
-th node.
-
inline int out(size_t i, size_t j) const¶
Get the index of the
j
-th outgoing arc from thei
-th node.
-
inline size_t numIn(size_t i) const¶
The number of incoming arcs to the
i
-th node.
-
inline const std::vector<int> &in(size_t i) const¶
Get the indices of incoming arcs to the
i
-th node.
-
inline size_t in(size_t i, size_t j) const¶
Get the index of the
j
-th incoming arc to thei
-th node.
Arc Accessors¶
-
inline int srcNode(size_t i) const¶
The destination node of the
i
-th arc.
-
inline int dstNode(size_t i) const¶
The source node of the
i
-th arc.
-
inline int label(size_t i) const¶
The label of the
i
-th arc (use this for acceptors).
-
inline int ilabel(size_t i) const¶
The input label of the
i
-th arc.
-
inline int olabel(size_t i) const¶
The output label of the
i
-th arc.
-
inline float weight(size_t i) const¶
The weight of the
i
-th arc.
-
inline void setWeight(size_t i, float weight)¶
Set the weight of the
i
-th arc.
Autograd¶
-
namespace gtn¶
Functions
-
void backward(Graph g, const Graph &grad, bool retainGraph = false)¶
Compute the gradients of any inputs w.r.t
g
using the chain rule to incorporategrad
, a gradient of another function with respect tograph
.If
retainGraph = false
(default) then the autograd graph will not be saved and un-referenced nodes in the autograd graph may be destroyed.- Parameters
g – The graph to compute gradients with respect to.
grad – A seed gradient, typically set to be a gradient of another function with respect to graph.
retainGraph – Whether or not to save the autograd graph. Setting this to False is more memory efficient as temporary Graphs created during the forward computation may be destroyed.
-
void backward(Graph g, bool retainGraph = false)¶
Compute the gradients of any inputs w.r.t
g
.If
retainGraph = false
(default) then the autograd graph will not be saved and un-referenced nodes in the autograd graph may be destroyed.See the overload
gtn::backward()
.
-
void backward(Graph g, const Graph &grad, bool retainGraph = false)¶
Functions¶
-
enum Projection¶
Projection type used with
gtn::clone
.Values:
-
enumerator NONE¶
Keep both input and output labels.
-
enumerator INPUT¶
Keep only the input lables.
-
enumerator OUTPUT¶
Keep only the output lables.
-
enumerator NONE¶
-
Graph clone(const Graph &g, Projection projection = Projection::NONE)¶
Performs a deep clone of a graph with an option to project to either the input or output labels.
The operation is recorded in the autograd tape. For a version which is not recorded on the autograd tape, see
Graph::deepCopy
.
-
Graph projectInput(const Graph &g)¶
Removes the output labels from the graph and records the operation in the autograd tape.
This function makes a copy of the input graph.
-
Graph projectOutput(const Graph &g)¶
Removes the input labels from the graph and records the operation in the autograd tape.
This function makes a copy of the input graph.
-
Graph concat(const Graph &g1, const Graph &g2)¶
Create the concatenation of two graphs.
This operation is recorded in the autograd tape.
Equivalent to
concat({g1, g2})
, seegtn::concat(const std::vector<Graph>&)
.
-
Graph concat(const std::vector<Graph> &graphs)¶
Concatenate a vector of graphs.
This operation is recorded in the autograd tape.
If
x_i
is a sequence accepted (orx_i:y_i
is transduced) bygraphs[i]
then the concatenated graph accepts the sequencex_1x_2...x_n
ifgraphs
containsn
graphs. The score of the pathx_1...x_n
is the sum of the scores of the individualx_i
ingraphs[i]
. The concatenated graph is constructuted by connecting every accepting state ofgraphs[i-1]
to every starting state ofgraphs[i]
with an epsilon transition. The starting state of the concatenated graphs are starting states ofgraphs[0]
and the accepting states are accepting states ofgraphs.back()
.Note the concatenation of 0 graphs
gtn::concat({})
is the graph which accepts the empty string (epsilon). The concatentation of a single graph is equivalent to a clone.
-
Graph remove(const Graph &g, int label = epsilon)¶
Create the equivalent graph without epsilon transitions.
If label is specified then instead of removing epsilon transitions, arcs with the matching label are removed. The removed arc labels are treated as if they were epsilon transitions. Note this is different than simply pruning the arc.
-
Graph remove(const Graph &g, int ilabel, int olabel)¶
Create the equivalent graph without
ilabel:olabel
transitions.The removed arc labels are treated as if they were epsilon transitions. Note this is different than simply pruning the arc.
-
Graph compose(const Graph &g1, const Graph &g2)¶
Compose two transducers.
This operation is recorded in the autograd tape. If x:y is transduced by
g1
andy:z
is transduced byg2
then the composition will transducex:z
. The arc scores are added in the composed graph.
-
Graph intersect(const Graph &g1, const Graph &g2)¶
Intersect two acceptors.
This operation is recorded in the autograd tape. This function only works on acceptors, calling it on a
graph
wheregraph.ilabel(a) != graph.olabel(a)
for somea
is undefined and may yield incorrect results. The intersected graph accepts any pathx
which is accepted by bothg1
andg2
. The arc scores are added in the intersected graph.The result of
gtn::compose
will yield an equivalent result, however; this function should be preferred since the implementation may be faster.
-
Graph forwardScore(const Graph &g)¶
Compute the forward score of a graph.
Returns the score in a scalar graph which can be accessed with
Graph::item()
. This is equivalent to the shortest distance from the start nodes to the accept nodes in the log semiring. NB: This assumes the input graph is acyclic.
-
Graph viterbiScore(const Graph &g)¶
Compute the Viterbi score of a graph.
Returns the score in a scalar graph which can be accessed with
Graph::item()
. This is equivalent to the shortest distance from the start nodes to the accepting nodes in the tropical semiring. NB: This assumes the input graph is acyclic.
-
Graph viterbiPath(const Graph &g)¶
Compue the Viterbi shortest path of a graph and return it in a single chain graph with the labels and weights of the shortest path.
This is equivalent to the shortest path from the start nodes to the accepting nodes in the tropical semiring. NB: This assumes the input graph is acyclic.
-
Graph sample(const Graph &g, size_t maxLength = 1000)¶
Attempt to sample an accepting path from the graph.
If the graph has dead-ends or does accept any paths, then an empty path may be returned. Note that the empty path “{}” is different from the path which is the empty string “{ε}”.
- Parameters
g – The graph to sample from.
maxLength – The maximum length of a sampled path.
Parallel¶
-
template<typename FuncType, typename ...Args>
auto parallelMap(FuncType &&function, Args&&... inputs)¶ Executes a function in parallel.
- Parameters
function – [in] A function pointer to execute in parallel
...inputs – [in] variadic arguments of iterable/indexable containers, such as
std::vector
s, i.e.vector<T1>, vector<T2>,...
. Types must match the input types of function exactly, i.e.function
must take argumentsT1, T2,...
.
- Returns
a vector of type
T
whereT
is the type of the output type offunction
. If the given function returnsvoid
, the return type isvoid
.
Creations¶
Creation functions facilitate making graphs with common predetermined structure.
Utils¶
Comparisons¶
-
bool randEquivalent(const Graph &g1, const Graph &g2, size_t numSamples = 100, double tol = 1e-4, size_t maxLength = 1000)¶
Compare two graphs by sampling paths and checking they have the same scores in both graphs.
- Parameters
g1 – A graph to be compared.
g2 – A graph to be compared.
numSamples – The number of samples to use. The more samples the more likely the result is accurate.
tol – The largest allowed absolute difference between the path score from each graph.
maxLength – The maximum length of sampled paths.
Input and Output¶
-
using SymbolMap = std::unordered_map<int, std::string>¶
User defined map of integer to arc label strings for use with e.g.
-
void saveTxt(const std::string &fileName, const Graph &g)¶
Save a graph to a file in txt format.
The first line will be a space separated list of the start nodes (
Graph::start
) and the second line will be a space separated list of accepting nodes (Graph::accept
).The following lines contain the arcs in the format:srcNode dstNode ilabel olabel weight
-
Graph loadTxt(const std::string &fileName)¶
Load a graph from a file in txt format.
The expected format is the first two lines contain a list of space separated start and accept nodes respectively. The following lines should contain the arcs in the format:
wheresrcNode dstNode ilabel [olabel=ilabel] [weight=0.0]
[x=y]
indicate optional values forx
with a default value ofy
.For example:
is a two node graph with an arc from start node 0 to accept node 1 with input and output label of 1,0 1 0 1 1
is a two node graph with an arc from node 0 to node 1 with input label 1 and output label 2, and0 1 0 1 1 2
is a two node graph with an arc from node 0 to node 1 with input label 1, output label 2, and a weight of 3.0.0 1 0 1 1 2 3.0
The returned Graph will contain “maxNodeIdx” nodes where “maxNodeIdx” is the largest index of a node in the file regardless of if there are arcs for every node.
-
std::ostream &operator<<(std::ostream &out, const Graph &g)¶
overload
<<
operator for working withstd::cout
onGraph
-
void draw(const Graph &g, const std::string &filename, const SymbolMap &isymbols = SymbolMap(), const SymbolMap &osymbols = SymbolMap())¶
Write a graph in the Graphviz DOT format to a file.
Arc labels are of the format
ilabel/olabel:weight
. If the output symbols are not specified then theolabel
is omitted and arc labels are of the formatilabel:weight
. If the input symbols are not specified then integer ids are used as the label.Compile to pdf with:
dot -Tpdf graph.dot -o graph.pdf
- Parameters
g – The graph to draw
filename – The name of the file to write to (e.g. graph.dot).
isymbols – A map of integer ids to strings used for arc input labels
osymbols – A map of integer ids to strings used for arc output labels
Installing Python Bindings¶
The python package may be installed with pip
:
pip install gtn
To build and install from source, first clone the repo:
git clone https://github.com/facebookresearch/gtn.git
Setup your environment:
conda create -n gtn_env
conda activate gtn_env
Install dependencies:
cd bindings/python
conda install setuptools
Use one of the following commands for installation:
python setup.py install
or, to install in editable mode (for dev):
python setup.py develop
Running Python Tests¶
Python binding tests can be run with make test
, or with
python -m unittest discover bindings/python/test
Run a simple example:
python bindings/python/examples/simple_graph.py
gtn¶
Autograd¶
- backward(g, grad, retain_graph=False)¶
Compute the gradients of any inputs with respect to
graph
.- Parameters
graph (Graph) – The graph to compute gradients with respect to.
grad (Graph) – A seed gradient, typically set to be a gradient of another function with respect to
graph
.retain_graph (bool) – Whether or not to save the autograd graph. Setting this to False is more memory efficient as temporary graphs created during the forward computation may be destroyed.
- backward(g, retain_graph=False)
Same as
backward()
but with the initialgrad
set to be ones.
Functions¶
- add(g1, g2)¶
Add two scalar graphs.
- closure(g)¶
Compute the (Kleene) closure of the graph. This operation is recorded in the autograd tape.
- compose(g1, g2)¶
Compose two transducers. This operation is recorded in the autograd tape. If
x:y
is transduced byg1
andy:z
is transduced byg2
then the composed graph will transducex:z
. The arc scores are added in the composed graph.Both
compose()
andintersect()
can be much faster when operating on graphs with sorted arcs. SeeGraph.arc_sort()
.
- concat(graphs)¶
Concatenate a list of graphs. This operation is recorded in the autograd tape.
If
x_i
is a sequence accepted (orx_i:y_i
is transduced) bygraphs[i]
then the concatenated graph accepts the sequencex_1x_2...x_n
ifgraphs
containsn
graphs. The score of the pathx_1...x_n
is the sum of the scores of the individualx_i
ingraphs[i]
. The concatenated graph is constructed by connecting every accepting state ofgraphs[i-1]
to every starting state ofgraphs[i]
with an \(\epsilon\) transition. The starting state of the concatenated graphs are starting states ofgraphs[0]
and the accepting states are accepting states ofgraphs[-1]
.Note the concatenation of 0 graphs
gtn::concat([])
is the graph which accepts the empty string (\(\epsilon\)). The concatenation of a single graph is equivalent to a clone.
- concat(g1, g2)
Equivalent to
concat([g1, g2])
, seeconcat()
.
- forward_score(g)¶
Compute the forward score of a graph. Returns the score in a scalar graph which can be accessed with
Graph.item()
. This operation is recorded in the autograd tape.The forward score is equivalent to the shortest distance from the start nodes to the accept nodes in the log semiring.
NB:
graph
must be acyclic.
- intersect(g1, g2)¶
Intersect two acceptors. This operation is recorded in the autograd tape. This function only works on acceptors, calling it on a
graph
wheregraph.ilabel(a) != graph.olabel(a)
for somea
is undefined and may yield incorrect results. The intersected graph accepts any pathx
which is accepted by bothg1
andg2
. The arc scores are added in the intersected graph.The result of
compose()
will yield an equivalent result, however; this function should be preferred since the implementation may be faster.Both
compose()
andintersect()
can be much faster when operating on graphs with sorted arcs. SeeGraph.arc_sort()
.
- negate(g)¶
Negate a scalar graph.
- project_input(other)¶
Removes the output labels from the graph and records the operation in the autograd tape. This function makes a copy of the input graph.
- project_output(other)¶
Removes the input labels from the graph and records the operation in the autograd tape. This function makes a copy of the input graph.
- remove(other, label=gtn.epsilon)¶
Construct the equivalent graph without \(\epsilon\) transitions. The \(\epsilon\) closure of each node in the graph is computed and the required transitions are added to yield the \(\epsilon\)-free equivalent graph. If
label
is specified then instead of removing epsilon transitions, arcs with the matching label are removed. The removed arc labels are treated as if they were \(\epsilon\) transitions.
- subtract(g1, g2)¶
Subtract one scalar graph from another.
- union(graphs)¶
Construct the union of a list of graphs.
- viterbi_score(g)¶
Compute the Viterbi score of a graph. Returns the score in a scalar graph which can be accessed with
Graph.item()
. This operation is recorded in the autograd tape.This is equivalent to the shortest distance from the start nodes to the accepting nodes in the tropical semiring.
NB:
graph
must be acyclic.
- viterbi_path(g)¶
Compue the Viterbi shortest path of a graph and return it in a single chain graph with the labels and weights of the shortest path. This operation is recorded in the autograd tape.
The Viterbi shorted path is equivalent to the shortest path from the start nodes to the accepting nodes in the tropical semiring.
NB:
graph
must be acyclic.
Creations¶
- scalar_graph(weight, calc_grad=True)¶
Creates a scalar graph - a graph with a single arc between two nodes with a given weight value and an \(\epsilon\) label (
epsilon
).
- linear_graph(M, N, calc_grad=True)¶
Create a linear chain graph with
M + 1
nodes andN
edges between each node. The labels of the edges between each node are the integers[0, ..., N - 1]
.
Comparisons¶
- equal(g1, g2)¶
Checks if two graphs are exactly equal (not isomorphic).
- isomorphic(g1, g2)¶
Checks if two graphs are isomorphic. This function will be extremely slow for large graphs.
Parallel¶
- parallel_for(function, int_list)¶
Computes the result of a given function that takes an int argument in parallel given some list of ints over which to process.
Returns nothing, even if the passed function has a return value.
Input and Output¶
- draw(g1, file_name, isymbols={}, osymbols={})¶
Draw a graph to an image. This function requires a working installation of Graphviz. Arc labels are of the format
ilabel/olabel:weight
. If the output symbols are not specified then theolabel
is omitted and arc labels are of the formatilabel:weight
. If the input symbols are not specified then integer ids are used as the label.The format of the image is determined by the file_name extension and can be any dot supported extension (check with
dot -T?
).
- load(file_name)¶
Load a graph from a file. The first two lines contain the list of space separated start and accept nodes respectively. The following lines contain the arcs in the format:
srcNode dstNode ilabel [olabel=ilabel] [weight=0.0]
where
[x=y]
indicate optional values forx
with a default value ofy
.For example:
0 1 0 1 1
is a two node graph with an arc from start node 0 to accept node 1 with input and output label of 1,
0 1 0 1 1 2
is a two node graph with an arc from node 0 to node 1 with input label 1 and output label 2, and
0 1 0 1 1 2 3.0
is a two node graph with an arc from node 0 to node 1 with input label 1, output label 2, and a weight of 3.0.
- write_dot(g, file_name, isymbols={}, osymbols={})¶
gtn.Graph¶
- class Graph(calc_grad=True)¶
A weighted finite-state acceptor or transducer.
- Parameters
calc_grad (bool) – Specify if a gradient is required for the graph.
- add_node(start=False, accept=False)¶
Add a node to the graph.
- add_arc(src_node, dst_node, label)
Add an accepting arc to the graph.
- add_arc(src_node, dst_node, ilabel, olabel, weight=0.0)¶
Add a transducing arc to the graph.
- arc_sort(olabel=False)¶
Sort the arcs entering and exiting a node by label.
This function is intended to be used prior to calls to
intersect()
andcompose()
to improve the efficiency of the algorithms.- Parameters
olabel (bool) – Sort by increasing order on the arc input label (default) or output label if
olabel == True
.
- mark_arc_sorted(olabel=False)¶
Mark the arcs entering and exiting the nodes of a graph as sorted. This method is intended to be used when a graph is constructed in sorted order to avoid paying for a call to
arc_sort()
.- Parameters
olabel (bool) – Mark as sorted by input label (default) or output label if
olabel == True
.
- item()¶
Get the weight on a single arc graph.
- num_arcs()¶
Get the number of arcs in the graph.
- num_nodes()¶
Get the number of nodes in the graph.
- num_start()¶
Get the number of starting nodes in the graph.
- num_accept()¶
Get the number of accepting nodes in the graph.
- weights()¶
Get a pointer to an array of the graph’s arc weights.
- weights_to_numpy()¶
Get a 1D
numpy.ndarray
with the graph’s arc weights.
- set_weights(weights)¶
Set all of the arc weights of the graph.
- Parameters
weights (int or list or numpy.ndarray) – The weights of the arcs to set. An
int
type is treated as the pointer to the first entry of an array of weights.
- labels_to_list(ilabel=True)¶
Get the graph’s arc labels as a list.
- Parameters
ilabel (bool) – If True return the input labels, otherwise return the output labels.
- zero_grad()¶
Clear the graph’s gradient.