hong2p.thor module¶
Functions for working with ThorImage / ThorSync outputs, including for dealing with naming conventions we use for outputs of these programs.
-
hong2p.thor.assign_frame_times_to_blocks(frame_times, rtol=1.5)[source]¶ Takes array of frame times to (start, stop) indices for each block.
- Parameters
frame_times (np.array) – as output by get_frame_times. should have a shape of (movie.shape[0],).
rtol (float) – (optional, default=1.5) time differences between frames must be at least this multiplied by the median time difference in order for a block to be called there.
Notes: This function defines blocks (periods of continuous acquisition) as regions of frame_times where the time difference between frames remains essentially constant. A large jump in the time between two frames defines the start of a new block. The returned indices are suitable for indexing the first dimension of the movie, the output of get_frame_times, etc. Stop indices are included as part of the block, so add one when using them as the end of a slice.
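The block definition in the notes can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the name blocks_from_frame_times and the example frame times are hypothetical, and the only behavior taken from the description is that a gap of at least rtol times the median frame interval starts a new block, with inclusive stop indices.

```python
from statistics import median


def blocks_from_frame_times(frame_times, rtol=1.5):
    """Return a list of (start, stop) index pairs, with stop inclusive.

    Hypothetical helper for illustration only.
    """
    if len(frame_times) == 0:
        return []
    diffs = [b - a for a, b in zip(frame_times, frame_times[1:])]
    if not diffs:
        # A single frame forms a single block.
        return [(0, 0)]
    cutoff = rtol * median(diffs)
    blocks = []
    start = 0
    for i, dt in enumerate(diffs):
        # A gap of at least the cutoff between frame i and frame i + 1
        # ends the current block at i and starts a new one at i + 1.
        if dt >= cutoff:
            blocks.append((start, i))
            start = i + 1
    blocks.append((start, len(frame_times) - 1))
    return blocks


# Made-up frame times (seconds): two blocks separated by a long pause.
frame_times = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2]
blocks = blocks_from_frame_times(frame_times)

# Stop indices are inclusive, so add one when slicing.
start, stop = blocks[1]
second_block_times = frame_times[start:(stop + 1)]
```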
-
hong2p.thor.assign_frames_to_blocks(df, thorimage_dir, **kwargs)[source]¶ Takes ThorSync+Image data to (start, stop) indices for each block.
Args: df: as output by load_thorsync_hdf5
thorimage_dir: path to a directory created by ThorImage
**kwargs: passed through to get_frame_times
See documentation of assign_frame_times_to_blocks for more details on the definition of blocks and the properties of the output.
-
hong2p.thor.assign_frames_to_odor_presentations(thorsync_input, thorimage_dir, odor_onset_to_frame_rel=None, odor_onset_to_frame_const=None, odor_timing_names=None, check_all_frames_assigned=True, check_no_discontinuity=True, **kwargs)[source]¶ Returns list of (start, first_odor_frame, end) frame indices
One 3-tuple per odor presentation.
Frames are indexed as they are along the first dimension of the movie, including for volumetric data (where a scalar index of this dimension will produce a volume) and/or data collected via frame averaging.
End frames are included in range, and thus getting a presentation must be done like movie[start_i:(end_i + 1)] rather than movie[start_i:end_i].
Not all frames are necessarily included in a presentation, and presentations do not overlap.
- Parameters
thorsync_input (str | pd.DataFrame) – path to directory created by ThorSync or a dataframe as would be created by passing such a directory to load_thorsync_hdf5.
thorimage_dir (str) – path to directory created by ThorImage that corresponds to the same experiment as the ThorSync data in thorsync_input.
odor_onset_to_frame_rel (float) – (NOT IMPLEMENTED) factor of averaged volumes/frames per second used to determine how long after odor onset to call first odor frame.
No first odor frames will be before: (odor onset time + odor_onset_to_frame_rel * averaged volumes/frames per second + odor_onset_to_frame_const)
odor_onset_to_frame_const (float) – (NOT IMPLEMENTED) seconds after odor onset to call first odor frame. mainly to compensate for known lag between valve opening and odor arriving at the animal.
**kwargs – passed through to get_frame_times.
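Because end frames are inclusive, slicing needs an explicit + 1. The sketch below uses a plain list as a stand-in for the movie and made-up (start, first_odor_frame, end) tuples. The helper split_presentation is hypothetical, and treating the frames before first_odor_frame as a pre-odor baseline is an assumption based on the name of that index.

```python
# Hypothetical stand-in for a movie: one entry per frame along the first dimension.
movie = [f"frame{i}" for i in range(10)]

# Made-up (start, first_odor_frame, end) tuples in the format described above.
presentations = [(0, 2, 4), (5, 7, 9)]


def split_presentation(movie, start, first_odor_frame, end):
    """Return (baseline, response) frames for one presentation.

    `end` is inclusive, so the response slice stops at end + 1.
    """
    baseline = movie[start:first_odor_frame]
    response = movie[first_odor_frame:(end + 1)]
    return baseline, response


baseline, response = split_presentation(movie, *presentations[0])
```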
-
hong2p.thor.cnmf_metadata_from_thor(filename)[source]¶ Takes TIF filename to key settings from XML needed for CNMF.
-
hong2p.thor.fps_from_thor(df)[source]¶ Takes a DataFrame and returns fps from ThorImage XML.
df must have a ‘thorimage_dir’ column (that can be either a relative or absolute path, as long as it’s under raw_data_root), which is expected to only contain one unique value.
Only the path in the first row is used.
-
hong2p.thor.get_col_onset_indices(df, possible_col_names, threshold=None)[source]¶ Returns arrays onsets, offsets with appropriate indices in df.
- Parameters
possible_col_names (str or tuple) – can be either exact column name in df or an iterable of column names, where the first matching a column in df will be used.
**kwargs – passed through to threshold_crossings.
-
hong2p.thor.get_col_onset_offset_indices(df, possible_col_names, checks=True, threshold=None)[source]¶ Returns arrays onsets, offsets with appropriate indices in df.
- Parameters
possible_col_names (str or tuple) – can be either exact column name in df or an iterable of column names, where the first matching a column in df will be used.
threshold (float) – passed to threshold_crossings under the same name.
Raises OnsetOffsetNumMismatch if checks=True and the number of onsets and offsets differ.
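The possible_col_names convention (a single name, or an iterable where the first name present in df wins) could be resolved roughly as follows. This is an illustrative sketch, not the library's code: first_matching_column is a hypothetical helper, the column names are invented, and raising KeyError when nothing matches is an assumption.

```python
def first_matching_column(columns, possible_col_names):
    """Return the first candidate name that is present in `columns`.

    Hypothetical helper; raising KeyError on no match is an assumption.
    """
    if isinstance(possible_col_names, str):
        # A single exact column name is treated as a one-element tuple.
        possible_col_names = (possible_col_names,)
    for name in possible_col_names:
        if name in columns:
            return name
    raise KeyError(f"none of {tuple(possible_col_names)} found in columns")


# Invented column names, standing in for df.columns.
columns = ["time_s", "frame_out", "valve"]
chosen = first_matching_column(columns, ("odor_pin", "valve", "frame_out"))
```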
-
hong2p.thor.get_col_onset_offset_times(df, possible_col_names, **kwargs)[source]¶ Returns arrays onsets, offsets with appropriate values from df.time_s.
- Parameters
df (DataFrame) – must have a column ‘time_s’, as generated by load_thorsync_hdf5.
possible_col_names (str or tuple) – can be either exact column name in df or an iterable of column names, where the first matching a column in df will be used.
**kwargs – passed to get_col_onset_offset_indices
-
hong2p.thor.get_flyback_indices(n_frames, z, n_flyback, series=None)[source]¶ Returns indices of XY frames during piezo flyback, or empty array if none
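As a rough illustration of what such indices might look like, the sketch below assumes that n_frames counts all XY frames (flyback included) and that each volume consists of z imaging frames followed by n_flyback flyback frames. That frame ordering is an assumption, not something stated here, and flyback_indices_sketch is a hypothetical name.

```python
def flyback_indices_sketch(n_frames, z, n_flyback):
    """Return indices of flyback frames, or an empty list if there are none.

    Assumes each volume is z imaging frames followed by n_flyback flyback frames.
    """
    frames_per_volume = z + n_flyback
    return [i for i in range(n_frames) if i % frames_per_volume >= z]


# 3 volumes of 3 imaging frames + 1 flyback frame each.
indices = flyback_indices_sketch(n_frames=12, z=3, n_flyback=1)
```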
-
hong2p.thor.get_frame_times(df, thorimage_dir, time_ref='mid', min_block_duration_s=3.0, acquisition_trigger_names=None, warn=True, _debug=False, _wont_use_df_after=False)[source]¶ Returns seconds from start of ThorSync recording for each frame.
- Parameters
df (DataFrame) – as returned by load_thorsync_hdf5
thorimage_dir – path to ThorImage directory to load metadata from
time_ref ('mid' | 'end') –
min_block_duration_s (float) – (default=3.0) minimum time (in seconds) between onset and offset of acquisition trigger. Shorter blocks that precede all acceptable-length blocks will simply be disregarded, with a warning. Shorter blocks following any acceptable-length blocks will currently trigger an error.
Returns a np.array that should be of length equal to the number of frames actually saved by ThorImage (i.e. <output>.shape should be equal to (movie.shape[0],)).
-
hong2p.thor.get_thorimage_dims(xml)[source]¶ Takes etree XML root object to (xy, z, c) dimensions of movie.
XML object should be as returned by get_thorimage_xmlroot.
-
hong2p.thor.get_thorimage_fps(thorimage_directory, **kwargs)[source]¶ Takes ThorImage dir to fps of recording.
- before_averaging (bool): (default=False) pass True to return the fps before any averaging.
All kwargs are passed through to get_thorimage_fps_xml.
-
hong2p.thor.get_thorimage_fps_xml(xml, before_averaging=False)[source]¶ Takes XML root object to fps of recording.
xml: etree XML root object as returned by get_thorimage_xmlroot.
- before_averaging (bool): (default=False) pass True to return the fps before any averaging.
-
hong2p.thor.get_thorimage_n_averaged_frames_xml(xml)[source]¶ Returns how many frames ThorImage averaged for a single output frame.
-
hong2p.thor.get_thorimage_n_frames(xml, without_flyback=False, num_volumes=False)[source]¶ Returns the number of XY planes (# of timepoints) in the recording.
This is the number of frames after any averaging configured in ThorImage.
Any flyback frames are included.
If additional color channels are enabled but other parameters remain the same, this number will not change.
- Parameters
without_flyback – if True, subtract the number of flyback frames (if any)
num_volumes – if True, return number of volumes instead of number of XY frames. since there are a fixed number of flyback frames per volume, this option will return the same number regardless of without_flyback.
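The relationship between the two options can be shown with a small arithmetic sketch. It assumes every volume contains z imaging frames plus a fixed n_flyback flyback frames; count_frames and its arguments are hypothetical and do not read any ThorImage XML.

```python
def count_frames(n_frames_with_flyback, z, n_flyback,
                 without_flyback=False, num_volumes=False):
    """Illustrative frame/volume arithmetic (hypothetical helper).

    Assumes each volume is z imaging frames plus n_flyback flyback frames.
    """
    frames_per_volume = z + n_flyback
    n_volumes, remainder = divmod(n_frames_with_flyback, frames_per_volume)
    if remainder:
        raise ValueError("frame count is not a whole number of volumes")
    if num_volumes:
        # The volume count does not depend on whether flyback is subtracted.
        return n_volumes
    if without_flyback:
        return n_volumes * z
    return n_frames_with_flyback


total = count_frames(12, z=3, n_flyback=1)
imaging_only = count_frames(12, z=3, n_flyback=1, without_flyback=True)
volumes = count_frames(12, z=3, n_flyback=1, num_volumes=True)
```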
-
hong2p.thor.get_thorimage_pixelsize_um(xml)[source]¶ Takes etree XML root object to XY pixel size in um.
Pixel size in X is the same as pixel size in Y.
XML object should be as returned by get_thorimage_xmlroot.
-
hong2p.thor.get_thorimage_power_regtype_and_level(xml)[source]¶ Returns regtype, power_level where regtype is either ‘pockel’|’non_pockel’
- Return type
Tuple[str,float]
-
hong2p.thor.get_thorimage_time(xml)[source]¶ Takes etree XML root object to recording start time.
XML object should be as returned by get_thorimage_xmlroot.
- Return type
datetime
-
hong2p.thor.get_thorimage_xml_path(thorimage_dir)[source]¶ Takes ThorImage output dir to (expected) path to its XML output.
Raises IOError if either thorimage_dir or Experiment.xml contained within it do not exist.
- Return type
str
-
hong2p.thor.get_thorimage_xmlroot(thorimage_dir_or_xmlroot)[source]¶ Takes ThorImage output dir to object w/ XML data.
Returns the input without doing anything if it is already the same type of XML object that would be returned, to allow writing functions that can either be given paths to ThorImage directories or re-use an already loaded representation of its XML.
- Return type
Element
-
hong2p.thor.get_thorimage_z_stream_frames(xml)[source]¶ Returns number of different Z depths measured in ThorImage recording.
Does NOT include any flyback frames there may be.
- Return type
int
-
hong2p.thor.get_thorimage_z_xml(xml)[source]¶ Returns number of different Z depths measured in ThorImage recording.
Does NOT include any flyback frames there may be.
- Return type
int
-
hong2p.thor.get_thorsync_h5(thorsync_dir)[source]¶ Returns path to ThorSync .h5 output given a directory created by ThorSync
-
hong2p.thor.get_thorsync_samplerate_hz(thorsync_dir)[source]¶ Returns int sample rate (Hz) of ThorSync HDF5 data in thorsync_dir.
-
hong2p.thor.get_thorsync_time(thorsync_dir)[source]¶ Returns modification time of ThorSync XML.
Not perfect, but it doesn’t seem any ThorSync outputs have timestamps.
-
hong2p.thor.get_thorsync_xml_path(thorsync_dir)[source]¶ Takes ThorSync output dir to (expected) path to its XML output.
-
hong2p.thor.is_thorimage_dir(d, verbose=False)[source]¶ True if dir has expected ThorImage outputs, False otherwise.
Currently looks for .raw output rather than any TIFFs.
- Return type
bool
-
hong2p.thor.is_thorimage_raw(f)[source]¶ True if filename indicates file is ThorImage raw output.
- Return type
bool
-
hong2p.thor.is_thorsync_dir(d, verbose=False)[source]¶ True if dir has expected ThorSync outputs, False otherwise.
- Return type
bool
-
hong2p.thor.load_thorimage_metadata(thorimage_dir, return_xml=False)[source]¶ Returns (fps, xy, z, c, n_flyback, raw_output_path) for ThorImage dir.
Returns xml as an additional final return value if return_xml is True.
-
hong2p.thor.load_thorsync_hdf5(thorsync_dir, datasets=None, exclude_datasets=None, drop_gctr=True, return_dataset_names_only=False, skip_dict_rename=False, skip_normalization=False, rename_dict=None, use_tqdm=False, verbose=False, _debug=False)[source]¶ Loads ThorSync .h5 output within thorsync_dir into a pd.DataFrame
A column ‘time_s’ will be added, which is derived from ‘GCtr’, and represents the time (in seconds) from the start of the ThorSync recording.
- Parameters
datasets (iterable of str | None) – Load only datasets with these names. Do not include the group names preceding the dataset name. Pass only one of either this or exclude_datasets. Names are checked after any renaming via rename_dict or normalization.
exclude_datasets (iterable of str | False | None) – Load only datasets except those with these names. Do not include ‘gctr’ here. Defaults to hdf5_default_exclude_datasets if neither this nor datasets is passed. If False, all datasets are loaded.
drop_gctr (bool) – (default=True) Drop ‘/Global/GCtr’ data (would be returned as column ‘gctr’) after using it to calculate ‘time_s’ column.
rename_dict (None or dict) – (default=None) a dict of original->new name. If not passed, hdf5_dataset_rename_dict is used. Applied before any further operations on the column (dataset) names.
These HDF5 files have the following hierarchical structure, where leaves of this tree are “Datasets” and their parents are “Groups” (via inspection of a ThorSync 3.0 output):
- Global:
  - GCtr: (from ThorSync 3.0 manual) “ThorSync records data into a table with clock cycles beginning with 0. The time of acquisition can be determined by dividing the clock cycle by the frequency of the data collection set at 20 MHz. Thus, each sequential clock cycle represents an increment of 0.05 μs.”
    Note that this 20 MHz is not the same as the sampling rate specified in the ThorSync XML output. See commented example at end of this function.
- DI:
  - Frame In: completely zero in the file I was exploring
  - Frame Out: may have one high pulse (==2 for some reason; low==0) per frame. Seems to only be low briefly before returning high again, perhaps just for one / a few samples? It may be possible there are cases where there are more high pulses here than there are frames in the movie, perhaps in cases with averaging or multiple separate acquisition periods.
- CI:
  - Frame Counter
- AI:
  - <one entry for each user-configured analog input>
Three changes will be made in translating HDF5 dataset names to DataFrame column names:
1. If any dataset name is in the keys of rename_dict, it will be replaced with the corresponding value, unless skip_dict_rename is passed.
2. Names except those under the group ‘AI’ (mostly user configurable inputs) will be lowercased, unless skip_normalization is passed.
3. All names will have any spaces converted to underscores, unless skip_normalization is passed.
- Return type
DataFrame
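The clock-cycle conversion and the three naming rules can be sketched in a few lines. This is a simplified illustration, not the function's implementation: gctr_to_seconds and column_name are hypothetical helpers, the AI dataset name is invented, and applying normalization to already-renamed names is an assumption.

```python
GCTR_CLOCK_HZ = 20_000_000  # 20 MHz, per the ThorSync 3.0 manual quoted above


def gctr_to_seconds(gctr):
    """Convert GCtr clock cycles to seconds from the start of the recording."""
    return [count / GCTR_CLOCK_HZ for count in gctr]


def column_name(group, dataset, rename_dict=None):
    """Translate an HDF5 dataset name into a column name (hypothetical helper)."""
    rename_dict = rename_dict or {}
    # 1. Apply any explicit rename first.
    name = rename_dict.get(dataset, dataset)
    # 2. Lowercase everything except names under the 'AI' group.
    if group != "AI":
        name = name.lower()
    # 3. Convert spaces to underscores.
    return name.replace(" ", "_")


times_s = gctr_to_seconds([0, 10_000_000, 20_000_000])
frame_out_col = column_name("DI", "Frame Out")
gctr_col = column_name("Global", "GCtr")
ai_col = column_name("AI", "Piezo Monitor")  # invented analog input name
```

Each clock cycle here corresponds to 1 / 20 MHz = 0.05 μs, matching the manual excerpt.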
-
hong2p.thor.pair_thor_dirs(thorimage_dirs, thorsync_dirs, use_mtime=False, use_ranking=True, check_against_naming_conv=False, check_unique_thorimage_nums=None, verbose=False, ignore_prepairing=None, ignore=None)[source]¶ Takes lists (not necessarily same len) of dirs, and returns a list of lists of matching (ThorImage, ThorSync) dirs (sorted by experiment time).
- Parameters
check_against_naming_conv (bool) – (default=False) If True, check ordering from pairing is consistent with ordering derived from our naming conventions for Thor software output.
check_unique_thorimage_nums (bool) – If True, check numbers parsed from ThorImage directory names, as-per convention, are unique. Requires check_against_naming_conv to be True. Defaults to True if check_against_naming_conv is True, else defaults to False.
ignore_prepairing (None | iterable of str) – An optional iterable of substrings. If any are present in the name of a Thor directory, that directory will be excluded from consideration in pairing. This is mainly to keep the (fragile) implementation that requires equal numbers of ThorImage and ThorSync directories for pairing working if some particular experiments named a certain way only have data from one. Will also try appending these to ignore if there are uneven numbers of directories and use_ranking=True.
ignore (None | iterable of str) – As ignore_prepairing, but ignore will happen after pairing. Both the ThorImage and ThorSync directories of a pair will be checked for these substrings and if any match the pair is not returned. This is mainly intended to ignore known-bad data.
Raises ValueError if two dirs of one type match to the same one of the other, but just returns shorter list of pairs if some matches can not be made. These errors currently just cause skipping of pairing for the particular (date, fly) pair above (though maybe this should change?).
Raises AssertionError when assumptions are violated in a way that should trigger re-evaluating the code.
- Return type
List[Tuple[Path,Path]]
-
hong2p.thor.pair_thor_subdirs(parent_dir, verbose=False, **kwargs)[source]¶ Raises ValueError/AssertionError when pair_thor_dirs does.
Above, the former causes skipping of automatic pairing, whereas the latter is not handled and will intentionally cause failure, to prevent incorrect assumptions from leading to incorrect results.
- Return type
List[Tuple[Path,Path]]
-
hong2p.thor.parse_thorimage_notes(xml, *, debug=False)[source]¶ Returns dict of metadata, with <key>: <val> lines and rest parsed separately.
- Parameters
thorimage_dir_or_xml – path to ThorImage output directory or XML Element containing parsed contents of the corresponding Experiment.xml file.
Lines not matching the <key>: <val> format will be appended together under the ‘prose’ key in the returned dict.
It is assumed there will be a single line with the YAML path from olf, and this line is not included in output (should be handled separately, via util.stimulus_yaml_from_thorimage, and would only add noise in dealing with what remains here).
- Return type
dict
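A minimal sketch of the <key>: <val> versus prose split is shown below. It is not the library's parser: parse_notes_sketch is a hypothetical name, the note contents are invented, joining prose lines with newlines is an assumption, and the special handling of the YAML path line is deliberately left out.

```python
def parse_notes_sketch(notes):
    """Split free-text notes into key/value pairs and remaining prose.

    Hypothetical helper; does not handle the YAML path line described above.
    """
    parsed = {}
    prose_lines = []
    for line in notes.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, val = line.partition(":")
        if sep and key.strip() and val.strip():
            parsed[key.strip()] = val.strip()
        else:
            # Lines without a '<key>: <val>' shape are collected as prose.
            prose_lines.append(line)
    parsed["prose"] = "\n".join(prose_lines)
    return parsed


# Invented notes text for illustration.
notes = "odor: ethyl acetate\nfly seemed healthy\npower: 10"
metadata = parse_notes_sketch(notes)
```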
-
hong2p.thor.read_movie(thorimage_dir, discard_flyback=True, discard_channel_b=False, checks=True, _debug=False)[source]¶ Returns (t,[z,]y,x) indexed timeseries as a numpy array.
-
hong2p.thor.thor_subdirs(parent_dir, absolute_paths=True)[source]¶ Returns a length-2 tuple, where the first element is all ThorImage children and the second element is all ThorSync children (of parent_dir).
- Return type
Tuple[List[Path],List[Path]]
-
hong2p.thor.thorimage_subdirs(parent_dir)[source]¶ Returns a list of any immediate child directories of parent_dir that have all expected ThorImage outputs.
- Return type
List[Path]
-
hong2p.thor.thorimage_xml(fn_taking_xml)[source]¶ Converts an attribute lookup fn taking XML to allow ThorImage directory input.
-
hong2p.thor.thorsync_num(thorsync_dir)[source]¶ Returns number in suffix of ThorSync output directory name as an int.
- Return type
int
-
hong2p.thor.thorsync_subdirs(parent_dir)[source]¶ Returns a list of any immediate child directories of parent_dir that have all expected ThorSync outputs.
- Return type
List[Path]
-
hong2p.thor.threshold_crossings(signal, threshold=None, onsets=True, offsets=True)[source]¶ Returns indices where signal goes from < threshold to > threshold as onsets, and where signal goes from > threshold to < threshold as offsets.
Cases where the signal equals the threshold at a single index are ignored. This shouldn’t happen and may indicate electrical problems for our application.
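The crossing rule can be illustrated with a pure-Python sketch. It is not the library's implementation: crossings_sketch is a hypothetical name, the signal is invented, and reporting the index of the first sample past the threshold (rather than the last sample before it) is an assumption.

```python
def crossings_sketch(signal, threshold):
    """Return (onsets, offsets) index lists for strict threshold crossings.

    Reports the index of the first sample on the far side of the threshold.
    """
    onsets, offsets = [], []
    for i in range(1, len(signal)):
        prev, curr = signal[i - 1], signal[i]
        if prev < threshold < curr:
            onsets.append(i)
        elif prev > threshold > curr:
            offsets.append(i)
        # Samples exactly equal to the threshold satisfy neither
        # strict comparison, so they are ignored.
    return onsets, offsets


# Invented digital-like signal with two pulses.
signal = [0, 0, 5, 5, 0, 0, 5, 0]
onsets, offsets = crossings_sketch(signal, threshold=2.5)
```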
-
hong2p.thor.tif2xml_root(filename)[source]¶ Returns etree root of ThorImage XML settings from TIFF filename, assuming TIFF was named and placed according to a certain convention.
Path can be to analysis output directory, as long as raw data directory exists.
-
hong2p.thor.xmlroot(xml_path)[source]¶ Loads contents of xml_path into xml.etree.ElementTree and returns root.
Use calls to <node>.find(<child name>) to traverse down tree and at leaves, use <leaf>.attrib[<attribute name>] to get values. There are other functions too, but see xml documentation for more information.
- Return type
Element
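The traversal pattern described above looks roughly like this with the standard library. The XML snippet is a made-up stand-in parsed from a string, and the element and attribute names (LSM, frameRate, pixelX) are placeholders for illustration rather than a documented ThorImage schema.

```python
import xml.etree.ElementTree as ET

# Made-up XML standing in for the contents of an Experiment.xml file.
xml_text = """<ThorImageExperiment>
  <LSM frameRate="30.0" pixelX="256" pixelY="256" />
</ThorImageExperiment>"""

root = ET.fromstring(xml_text)

# Traverse down the tree with find(), then read values from attrib at the leaf.
lsm = root.find("LSM")
frame_rate = float(lsm.attrib["frameRate"])
pixel_x = int(lsm.attrib["pixelX"])

# find() returns None when no matching child exists, so check before using it.
missing = root.find("NoSuchElement")
```

Attribute values are always strings, so numeric settings need an explicit conversion as shown.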