fastISM package
fastISM takes a Keras model as input. The main steps of fastISM are as follows:
- One-time Initialization (fastISM.fast_ism_utils.generate_models()):
  - Obtain the computational graph from the model. This is done in fastISM.flatten_model.get_flattened_graph().
  - Chunk the computational graph into segments that can be run as a unit. This is done in fastISM.fast_ism_utils.segment_model().
  - Augment the model to create an “intermediate output model” (referred to as intout_model in the code) that returns intermediate outputs at the end of each segment for reference input sequences. This is done in fastISM.fast_ism_utils.generate_intermediate_output_model().
  - Create a second “mutation propagation model” (referred to as fast_ism_model in the code) that largely resembles the original model, but incorporates as additional inputs the necessary flanking regions from outputs of the IntOut model on reference input sequences between segments. This is done in fastISM.fast_ism_utils.generate_fast_ism_model().
- For each batch of input sequences:
  - Run the intout_model on the sequences (unperturbed) and cache the intermediate outputs at the end of each segment. This is done in fastISM.fast_ism.FastISM.pre_change_range_loop_prep().
  - For each positional mutation:
    - Introduce the mutation in the input sequences.
    - Run the fast_ism_model, feeding as input the appropriate slices of the intout_model outputs. This is done in fastISM.fast_ism.FastISM.get_ith_output().
See How fastISM Works for a more intuitive understanding of the algorithm.
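The steps above can be sketched on a toy example. The block below is not fastISM code: it assumes a single "valid" 1D convolution with stride 1 over a scalar sequence stands in for one model segment, and every function name in it is hypothetical. It only illustrates the idea of caching the unperturbed output once and recomputing just the output positions that a mutation can affect.

```python
# Toy illustration only. A single 'valid' 1D convolution stands in for a
# model segment; none of these helper names come from fastISM.

def conv1d_valid(seq, kernel):
    """Plain 'valid' 1D convolution (cross-correlation) with stride 1."""
    k = len(kernel)
    return [sum(seq[j + t] * kernel[t] for t in range(k))
            for j in range(len(seq) - k + 1)]


def naive_ism(seq, kernel, replace_with=0.0):
    """Re-run the whole layer once per mutated position."""
    outputs = []
    for i in range(len(seq)):
        mutated = list(seq)
        mutated[i] = replace_with
        outputs.append(conv1d_valid(mutated, kernel))
    return outputs


def cached_ism(seq, kernel, replace_with=0.0):
    """Compute the reference output once, then refresh only the output
    positions whose receptive field contains the mutated position."""
    k = len(kernel)
    reference = conv1d_valid(seq, kernel)  # cached unperturbed output
    outputs = []
    for i in range(len(seq)):
        mutated = list(seq)
        mutated[i] = replace_with
        # Output position j is affected when j <= i <= j + k - 1.
        lo = max(0, i - k + 1)
        hi = min(len(reference), i + 1)
        # Only this slice of the input (mutation plus flanks) is needed.
        patch = conv1d_valid(mutated[lo:hi + k - 1], kernel)
        result = list(reference)
        result[lo:hi] = patch
        outputs.append(result)
    return outputs


seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [0.5, -1.0, 2.0]
fast_outputs = cached_ism(seq, kernel)
naive_outputs = naive_ism(seq, kernel)
```

In this toy setting each mutation recomputes at most len(kernel) output positions instead of the full output, while producing the same result as the naive loop.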
ism_base module
This module contains a base ISM class, from which the NaiveISM and FastISM classes inherit. It also includes the implementation of NaiveISM.
fast_ism module
This module contains the FastISM class.
- class fastISM.fast_ism.FastISM(model, seq_input_idx=0, change_ranges=None, early_stop_layers=None, test_correctness=True)
  Bases: fastISM.ism_base.ISMBase
  - cleanup()
  - get_ith_output(inp_batch, i, idxs_to_mutate)
  - pre_change_range_loop_prep(inp_batch, num_seqs)
  - prepare_intout_output(intout_output, num_seqs)
  - prepare_ith_input(padded_inputs, i, idxs_to_mutate)
  - run_model(inputs)
  - test_correctness(batch_size=10, replace_with=0, atol=1e-06)
    Verify that outputs are correct by comparing them against Naive ISM. The check runs on small examples so that it does not take too long. For that reason it does not compare runtime against the Naive ISM implementation, which would require bigger inputs to offset overheads.
    TODO: ensure generated data is on GPU already before calling either method (for speedup)
  - time_batch(seq_batch)
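The kind of check that test_correctness describes, matching fast outputs against naive outputs within an absolute tolerance, can be sketched as follows. This is a minimal stand-alone illustration on nested Python lists, assuming both outputs have the same shape; outputs_match is a hypothetical helper, not a fastISM function, and the library's own check may differ.

```python
import math


def outputs_match(fast_out, naive_out, atol=1e-06):
    """Return True if two nested lists of floats agree within atol.

    Hypothetical helper for illustration; not part of fastISM.
    """
    if len(fast_out) != len(naive_out):
        return False
    for fast_row, naive_row in zip(fast_out, naive_out):
        if len(fast_row) != len(naive_row):
            return False
        for f, n in zip(fast_row, naive_row):
            # Pure absolute tolerance, mirroring the atol argument.
            if not math.isclose(f, n, rel_tol=0.0, abs_tol=atol):
                return False
    return True


close_enough = outputs_match([[1.0, 2.0]], [[1.0 + 5e-7, 2.0]])
too_far = outputs_match([[1.0, 2.0]], [[1.0 + 1e-5, 2.0]])
```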
fast_ism_utils module
- class fastISM.fast_ism_utils.GraphSegment(start_node, input_seqlen, input_perturbed_ranges)
  Bases: object
  - input_unperturbed_width()
  - output_perturbed_width()
  - update_forward_output(input_unperturbed_slices, input_unperturbed_padding, output_seqlen, output_perturbed_ranges)
  - update_num_filters(num_out_filters)
- class fastISM.fast_ism_utils.SliceAssign(a_dim, b_dim)
  Bases: tensorflow.python.keras.engine.base_layer.Layer
  - call(inputs)
    GOAL: a[:,i:min(i+b.shape[1], a.shape[1])] = b
    Clip b if i+b.shape[1] exceeds the width of a, so that the width of the output is guaranteed to be the same as that of a. This can happen when a layer’s output (b) feeds into multiple layers, but some layers don’t need all positions of b (which can happen near the edges). See test_skip_then_mxp of test/test_simple_skip_conn_architectures.py.
    For Cropping1D layers, i can also be negative, which needs to be handled separately.
    Parameters: inputs ([type]) – [description]
    Returns: [description]
    Return type: [type]
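The clipped slice assignment described for SliceAssign.call can be illustrated on plain Python lists. The sketch below is not the layer's implementation: slice_assign is a hypothetical helper that treats a and b as lists of rows (batch by width), and its handling of a negative i (dropping the first -i columns of b) is an assumption about what "handled separately" could mean.

```python
def slice_assign(a, b, i):
    """Return a copy of a with b written in from column i, clipped to a's width.

    a, b: lists of rows (batch x width). Hypothetical helper, not fastISM code.
    """
    out = []
    for a_row, b_row in zip(a, b):
        width = len(a_row)
        start = i
        if start < 0:
            # Assumption: a negative offset drops the first -i columns of b.
            b_row = b_row[-start:]
            start = 0
        # Clip so the output keeps exactly the width of a.
        end = min(start + len(b_row), width)
        new_row = list(a_row)
        if end > start:
            new_row[start:end] = b_row[:end - start]
        out.append(new_row)
    return out


inside = slice_assign([[0, 0, 0, 0, 0]], [[1, 2, 3]], 1)
clipped = slice_assign([[0, 0, 0, 0, 0]], [[1, 2, 3]], 3)
negative = slice_assign([[0, 0, 0, 0, 0]], [[1, 2, 3]], -1)
```

Here inside places all of b, clipped drops the last column of b because it would run past the right edge of a, and negative drops the first column of b under the assumption stated above.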
- fastISM.fast_ism_utils.compute_segment_change_ranges(model, nodes, edges, inbound_edges, node_to_segment, stop_segment_idxs, input_seqlen, input_filters, input_change_ranges, seq_input_idx)
  For each segment, given the input change range, compute (using ChangeRangesBase.forward_compose):
  - input range of intermediate output required
  - offsets for input tensor wrt intermediate output
  - output seqlen
  - output change range
  - number of filters in output.
  Starts only from the sequence input that is changed. Does not deal with alternate inputs.
  Propagates forward through the network one segment at a time until a segment in stop_segment_idxs is hit. Computes the change ranges for each segment and propagates them to the next segment.
- fastISM.fast_ism_utils.generate_fast_ism_model(model, nodes, edges, inbound_edges, outputs, node_to_segment, stop_segment_idxs, alternate_input_segment_idxs, segments)
- fastISM.fast_ism_utils.generate_fast_ism_subgraph(current_node, node_edge_to_tensor, input_tensors, input_specs, nodes, edges, inbound_edges, node_to_segment, stop_segment_idxs, alternate_input_segment_idxs, segments)
- fastISM.fast_ism_utils.generate_intermediate_output_model(model, nodes, edges, inbound_edges, outputs, node_to_segment, stop_segment_idxs)
- fastISM.fast_ism_utils.generate_intermediate_output_subgraph(current_node, node_to_tensor, output_tensor_names, nodes, edges, inbound_edges, node_to_segment, stop_segment_idxs)
- fastISM.fast_ism_utils.generate_models(model, seqlen, num_chars, seq_input_idx, change_ranges, early_stop_layers=None)
- fastISM.fast_ism_utils.label_alternate_input_segment_idxs(current_node, nodes, edges, node_to_segment, stop_segment_idxs, alternate_input_segment_idxs, segment_idx)
- fastISM.fast_ism_utils.label_stop_descendants(current_node, nodes, edges, node_to_segment, segment_idx)
- fastISM.fast_ism_utils.process_alternate_input_node(current_node, node_edge_to_tensor, input_tensors, input_specs, nodes, edges, inbound_edges, node_to_segment, alternate_input_segment_idxs)
- fastISM.fast_ism_utils.resolve_multi_input_change_ranges(input_change_ranges_list)
  For AGGREGATE_LAYERS such as Add, the different inputs have different change ranges. For the change ranges, take the largest range over all input ranges, e.g.
  [ [(1,3), (4,6)], [(2,4), (4,5)] ] -> [(1,4), (3,6)]
  where the first inner list holds the change ranges of input1 and the second those of input2.
  Parameters: input_change_ranges_list (list[list[tuple]]) – list of list of tuples. Inner lists must have same length, where each ith tuple corresponds to ith mutation in the input (ith input change range).
  Returns: Resolved input change ranges. All ranges must have the same width.
  Return type: list[tuple]
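One way to reproduce the documented example is sketched below. It is an illustration, not the library's code: resolve_ranges is a hypothetical name, and the rule used to equalize widths (widen each merged range to the largest width, shifting it left when it would run past the largest end) is an assumption chosen because it matches the example output.

```python
def resolve_ranges(input_change_ranges_list):
    """Merge per-input change ranges and give every range the same width.

    Hypothetical sketch; the widening rule is an assumption that matches
    the documented example, not necessarily fastISM's exact logic.
    """
    num_ranges = len(input_change_ranges_list[0])
    # Inner lists must have the same length (one range per mutation).
    assert all(len(r) == num_ranges for r in input_change_ranges_list)

    # Step 1: take the largest range over all inputs for each mutation.
    merged = [
        (min(r[i][0] for r in input_change_ranges_list),
         max(r[i][1] for r in input_change_ranges_list))
        for i in range(num_ranges)
    ]

    # Step 2: widen every range to the maximum width.
    max_width = max(end - start for start, end in merged)
    max_end = max(end for _, end in merged)
    resolved = []
    for start, _ in merged:
        if start + max_width <= max_end:
            resolved.append((start, start + max_width))
        else:
            # Shift left so the range does not run past the largest end.
            resolved.append((max_end - max_width, max_end))
    return resolved


example = resolve_ranges([[(1, 3), (4, 6)], [(2, 4), (4, 5)]])
```

For the documented input, the merge step gives [(1, 4), (4, 6)], and widening the second range to width 3 yields (3, 6).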
- fastISM.fast_ism_utils.segment_model(model, nodes, edges, inbound_edges, seq_input_idx, early_stop_layers)
- fastISM.fast_ism_utils.segment_subgraph(current_node, nodes, edges, inbound_edges, node_to_segment, stop_segment_idxs, segment_idx, num_convs_in_cur_segment)
- fastISM.fast_ism_utils.update_stop_segments(current_node, nodes, edges, node_to_segment, stop_segment_idxs)
change_range module
- class fastISM.change_range.ChangeRangesBase(config)
  Bases: object
  Base class for layer-specific computations of which output indices change when a list of changed input indices is specified. Conversely, given the ranges of output indices that the layer needs to produce, it computes the input ranges required to produce them.
  In addition, given an input….
  TODO: document better and with examples!
  - backward(output_select_ranges)
  - forward(input_seqlen, input_change_ranges)
    Change ranges are given as a list of tuples, e.g. [(0,1), (1,2), (2,3)…] for single bp ISM.
  - static forward_compose(change_ranges_objects_list, input_seqlen, input_change_ranges)
  - validate_config()
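The forward and backward computations described for ChangeRangesBase can be illustrated for the simplest case. The sketch below assumes a 1D convolution with stride 1, dilation 1 and "valid" padding, and treats each range as half-open [start, end), which is how the single bp example reads. The function names are hypothetical and the code does not reproduce the library's classes; in particular it does not equalize range widths across mutations.

```python
def conv1d_forward_range(input_seqlen, change_range, kernel_size):
    """Which output positions change when inputs in [start, end) change?

    Assumes stride 1, dilation 1, 'valid' padding. Hypothetical helper.
    """
    start, end = change_range
    output_seqlen = input_seqlen - kernel_size + 1
    # Output j reads inputs j .. j + kernel_size - 1.
    out_start = max(0, start - kernel_size + 1)
    out_end = min(output_seqlen, end)
    return output_seqlen, (out_start, out_end)


def conv1d_backward_range(output_range, kernel_size):
    """Which input positions are needed to produce outputs in [start, end)?"""
    start, end = output_range
    return (start, end + kernel_size - 1)


seqlen = 10
# Single bp ISM: one width-1 change range per input position.
single_bp_ranges = [(i, i + 1) for i in range(seqlen)]

out_len, out_range = conv1d_forward_range(seqlen, single_bp_ranges[4], 3)
needed_inputs = conv1d_backward_range(out_range, 3)
```

For a mutation at input position 4 and a kernel of width 3, the affected outputs are positions 2 to 4, and producing them requires input positions 2 to 6, that is the mutated base plus two flanking positions on each side.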
- class fastISM.change_range.Conv1DChangeRanges(config)
  Bases: fastISM.change_range.ChangeRangesBase
  - backward(output_select_ranges)
  - forward(input_seqlen, input_change_ranges)
    Change ranges are given as a list of tuples, e.g. [(0,1), (1,2), (2,3)…] for single bp ISM.
  - validate_config()
- class fastISM.change_range.Cropping1DChangeRanges(config)
  Bases: fastISM.change_range.ChangeRangesBase
  - backward(output_select_ranges)
  - forward(input_seqlen, input_change_ranges)
    Change ranges are given as a list of tuples, e.g. [(0,1), (1,2), (2,3)…] for single bp ISM.
  - validate_config()
- class fastISM.change_range.Pooling1DChangeRanges(config)
  Bases: fastISM.change_range.ChangeRangesBase
  - backward(output_select_ranges)
  - forward(input_seqlen, input_change_ranges)
    Change ranges are given as a list of tuples, e.g. [(0,1), (1,2), (2,3)…] for single bp ISM.
  - validate_config()
- fastISM.change_range.get_int_if_tuple(param, idx=0)
- fastISM.change_range.not_supported_error(message)
flatten_model module
This module implements the functions required to take an arbitrary Keras model and reduce it to a graph representation that is then manipulated by fast_ism_utils.
- fastISM.flatten_model.get_flattened_graph(model, is_subgraph=False)
  [summary]
  Parameters:
  - model ([type]) – [description]
  - is_subgraph (bool, optional) – [description], defaults to False
  Returns: [description]
  Return type: [type]
- fastISM.flatten_model.is_bipartite(edges)
- fastISM.flatten_model.is_consistent(edges, inbound_edges)
- fastISM.flatten_model.is_input_layer(layer)
  Checks if layer is an input layer.
  Parameters: layer (tf.keras.layers) – A Keras layer
  Returns: True if layer is input layer, else False
  Return type: bool
- fastISM.flatten_model.list_replace(l, old, new)
- fastISM.flatten_model.node_is_layer(node_name)
- fastISM.flatten_model.strip_subgraph_names(name, subgraph_names)
  subgraph_name1/subgraph_name2/layer/name -> layer/name
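The name mapping documented for strip_subgraph_names can be illustrated with a short stand-alone function. This is a guess at the behavior based only on the one-line example: strip_prefixes is a hypothetical name, and it assumes subgraph_names is a collection of subgraph name strings that may appear as leading path components.

```python
def strip_prefixes(name, subgraph_names):
    """Drop leading path components that are known subgraph names.

    Hypothetical sketch based on the documented example; not fastISM code.
    """
    parts = name.split("/")
    # Keep at least one component so the result is never empty.
    while len(parts) > 1 and parts[0] in subgraph_names:
        parts = parts[1:]
    return "/".join(parts)


stripped = strip_prefixes(
    "subgraph_name1/subgraph_name2/layer/name",
    ["subgraph_name1", "subgraph_name2"],
)
```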
- fastISM.flatten_model.viz_graph(nodes, edges, outpath)