hong2p.olf module
Functions for loading YAML metadata created by my tom-f-oconnell/olfactometer repo, and dealing with the resulting representations of odors delivered during an experiment.
These functions are kept here, rather than in the olfactometer repo, because that repo has other somewhat heavy dependencies that the analysis side of things will generally not need.
-
hong2p.olf.abbrev(odor_str, abbrevs=None, *, component_delim=' + ', conc_delim='@')
Abbreviates odor name in input, when an abbreviation is available.
- Parameters
odor_str (str) – odor name, optionally followed by olf.conc_delimiter and concentration information
abbrevs (Optional[Dict[str,str]]) – dict mapping from input names to the names (abbreviations) you want. If not passed, the dict olf.odor2abbrev is used.
- Return type
str
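The behavior described above can be sketched roughly as follows. This is not the library's implementation: `abbrev_sketch` and the contents of the example `odor2abbrev` dict are hypothetical, and the sketch assumes that only the name of each component is abbreviated while any concentration text is kept verbatim.

```python
from typing import Dict, Optional

# Hypothetical mapping, standing in for olf.odor2abbrev.
odor2abbrev: Dict[str, str] = {'ethyl acetate': 'ea'}


def abbrev_sketch(odor_str: str, abbrevs: Optional[Dict[str, str]] = None, *,
                  component_delim: str = ' + ', conc_delim: str = '@') -> str:
    """Abbreviates each odor name in odor_str when an abbreviation is available."""
    if abbrevs is None:
        abbrevs = odor2abbrev

    out_parts = []
    for component in odor_str.split(component_delim):
        # Split off the concentration part (if any), keeping it unchanged.
        name, delim, conc = component.partition(conc_delim)
        stripped = name.strip()
        if stripped in abbrevs:
            # Replace only the name, preserving whitespace around it.
            name = name.replace(stripped, abbrevs[stripped])
        out_parts.append(name + delim + conc)

    return component_delim.join(out_parts)
```

With the hypothetical mapping above, `abbrev_sketch('ethyl acetate @ -2')` gives `'ea @ -2'`, and names without an abbreviation pass through unchanged.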
-
hong2p.olf.add_abbrevs_from_odor_lists(odor_lists, name2abbrev=None, yaml_path=None, *, if_abbrev_mismatch='warn', verbose=False)
Adds name->abbreviation mappings in odor_lists to odor2abbrev input.
- Parameters
yaml_path (Union[str,Path,None]) – included in some print/warning messages, but not loaded.
- Return type
None
-
hong2p.olf.format_odor(odor_dict, conc=True, name_conc_delim=None, conc_key='log10_conc', cast_int_concs=False)
Takes a dict representation of an odor to a pretty str.
odor_dict is expected to have at least a ‘name’ key, but ‘log10_conc’ (or conc_key) will also be used if available, unless conc=False.
- Parameters
cast_int_concs (bool) – if True, will convert (log10) concentrations to integers if they are np.isclose to their nearest integer.
>>> odor = {'name': 'ethyl acetate', 'log10_conc': -2}
>>> format_odor(odor)
'ethyl acetate @ -2'
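A minimal sketch of this formatting, under stated assumptions rather than as the actual implementation: `format_odor_sketch` is a hypothetical name, the default delimiter `' @ '` is inferred from the example output above (the real `name_conc_delim` defaults to None), and `math.isclose` stands in for the `np.isclose` check so that the sketch needs only the standard library.

```python
import math


def format_odor_sketch(odor_dict, conc=True, name_conc_delim=' @ ',
                       conc_key='log10_conc', cast_int_concs=False):
    """Formats a dict representing one odor as '<name> @ <log10 conc>'."""
    name = odor_dict['name']
    # Fall back to the bare name if no concentration is wanted or available.
    if not conc or conc_key not in odor_dict:
        return name

    log10_conc = odor_dict[conc_key]
    # Cast e.g. -2.0 to -2 when it is numerically an integer.
    if cast_int_concs and math.isclose(log10_conc, round(log10_conc)):
        log10_conc = int(round(log10_conc))

    return f'{name}{name_conc_delim}{log10_conc}'
```

The `cast_int_concs` option matters mostly when concentrations were loaded as floats, where `-2.0` would otherwise show up in labels as `'-2.0'`.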
-
hong2p.olf.format_odor_list(odor_list, *, delim=' + ', **kwargs)
Takes a list of dicts representing odors for one trial to a pretty str.
- Return type
str
-
hong2p.olf.is_odor_component_level(level_name)
Returns True if column/level name or Series-key is named to store odor metadata.
Values for matching keys should store strings representing one, of potentially multiple, component odors presented (simultaneously) on a given trial. My convention for representing multiple components presented together on one trial is to make multiple variables (e.g. columns), named such as [‘odor1’, ‘odor2’, …], with a different suffix number for each component.
- Return type
bool
-
hong2p.olf.is_odor_var(var_name)
Returns True if column/level name or Series-key is named to store odor metadata.
Values for matching keys should store strings representing one, of potentially multiple, component odors presented (simultaneously) on a given trial. My convention for representing multiple components presented together on one trial is to make multiple variables (e.g. columns), named such as [‘odor1’, ‘odor2’, …], with a different suffix number for each component.
- Return type
bool
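The naming convention above could be checked with something like the sketch below. `is_odor_var_sketch` is a hypothetical name, and the exact pattern is an assumption: the sketch accepts only `'odor'` followed by one or more digits, which may be stricter or looser than what the library accepts.

```python
import re


def is_odor_var_sketch(var_name) -> bool:
    """Returns True for names like 'odor1', 'odor2', ..."""
    # Assumption: a matching name is exactly 'odor' plus a numeric suffix.
    return isinstance(var_name, str) and re.fullmatch(r'odor\d+', var_name) is not None


# Picking out the odor columns from a list of column names.
columns = ['panel', 'odor1', 'odor2', 'delta_f']
odor_columns = [c for c in columns if is_odor_var_sketch(c)]
```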
-
hong2p.olf.odor_index_sort_key(level, sort_names=True, names_first=True, name_order=None, require_in_name_order=False, warn=True, _debug=False)
- Parameters
level (Index) – one level from a pd.MultiIndex with odor metadata. Elements should be odor strings (in the format parsed by parse_odor_name() and parse_log10_conc()).
sort_names (bool) – whether to use odor names as part of sort key. If False, only sorts on concentrations.
names_first (bool) – if True, sorts on names primarily, otherwise sorts on concentrations primarily. Ignored if sort_names is False.
name_order (Optional[List[str]]) – list of odor names to use as a fixed order for the names. Concentrations will be sorted within each name.
require_in_name_order (bool) – if True, raises ValueError if odors not in name_order are present. Otherwise sorts such odors alphabetically after those in name_order.
warn (bool) – if True and require_in_name_order=False, warns about which odors were not in name_order
- Return type
Index
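The interaction of sort_names, names_first and name_order can be illustrated on plain Python lists. This is a simplified sketch, not the library's code: the real function takes and returns a pandas Index, while the hypothetical `sort_odor_strs` below sorts a list of strings directly, assumes every string contains a concentration (no solvent entries), and omits require_in_name_order and warn.

```python
from typing import List, Optional, Tuple


def parse(odor_str: str) -> Tuple[str, float]:
    """Splits '<name> @ <log10 conc>' into (name, log10 conc)."""
    name, _, conc = odor_str.partition('@')
    return name.strip(), float(conc)


def sort_odor_strs(odor_strs: List[str], *, sort_names: bool = True,
                   names_first: bool = True,
                   name_order: Optional[List[str]] = None) -> List[str]:
    def key(odor_str: str):
        name, conc = parse(odor_str)
        if not sort_names:
            # Only concentrations take part in the sort.
            return (conc,)
        if name_order is None:
            # Plain alphabetical order on names.
            name_key = (0, 0, name)
        elif name in name_order:
            # Fixed order given by position in name_order.
            name_key = (0, name_order.index(name), '')
        else:
            # Names missing from name_order go last, alphabetically.
            name_key = (1, 0, name)
        return (name_key, conc) if names_first else (conc, name_key)

    return sorted(odor_strs, key=key)
```

For example, `sort_odor_strs(['B @ -2', 'A @ -2', 'A @ -3'], name_order=['B', 'A'])` puts `'B @ -2'` first and then the two `'A'` entries in order of increasing concentration.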
-
hong2p.olf.odor_lists_to_multiindex(odor_lists, *, sort_components=True, pad_to_n_odors=None, **format_odor_kwargs)
- Parameters
pad_to_n_odors (Optional[int]) – if int, returned MultiIndex will have at least this many levels dedicated to odor components (+ the 1 ‘repeat’ level always included).
- Return type
MultiIndex
-
hong2p.olf.odordict_sort_key(odor_dict)
Returns a hashable key for sorting odors by name, then concentration.
- Return type
Tuple[str,float]
-
hong2p.olf.pad_odor_index_to_n_components(df, n)
Pads dataframe odor index, so that it has n ‘odor<n>’ component levels.
- Parameters
n (int) – target number of odor levels
Odors presented together (e.g. in one trial, mixed in air) should each have their own level in the odor MultiIndex, with olf.solvent_str used to fill when a given trial had fewer components presented at once.
- Return type
DataFrame
-
hong2p.olf.pad_odor_indices_to_max_components(dfs)
Pads the odor index of each dataframe to the max number of input component levels.
- Return type
Sequence[DataFrame]
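The padding idea behind the two functions above can be shown without pandas. In this sketch each trial is a plain list of odor strings rather than a row of a DataFrame MultiIndex; the function names are hypothetical, and `'solvent'` is an assumed value for olf.solvent_str.

```python
from typing import List, Sequence

solvent_str = 'solvent'  # assumed value of olf.solvent_str


def pad_to_n_components(trial_odors: Sequence[str], n: int) -> List[str]:
    """Pads one trial's odors with solvent_str so that it has n components."""
    if len(trial_odors) > n:
        raise ValueError(f'trial already has more than {n} components')
    return list(trial_odors) + [solvent_str] * (n - len(trial_odors))


def pad_all_to_max_components(trials: Sequence[Sequence[str]]) -> List[List[str]]:
    """Pads every trial to the largest number of components in any trial."""
    if not trials:
        return []
    n = max(len(trial) for trial in trials)
    return [pad_to_n_components(trial, n) for trial in trials]
```

After padding, every trial has the same number of components, which is what lets each component occupy its own ‘odor<n>’ level.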
-
hong2p.olf.panel_odor_orders(df, panel2name_order=None, **kwargs)
Returns dict of panel names to ordered unique odor strs (with concentration).
- Parameters
df (DataFrame) – DataFrame with a ‘panel’ column and at least one column matching is_odor_var
panel2name_order (Optional[Dict[str,List[str]]]) – dict mapping panels to lists of odor names, each in the desired order
**kwargs – passed through to sort_odors
-
hong2p.olf.parse_log10_conc(odor_str, *, require=False)
Takes formatted odor string to float log10 vol/vol concentration.
Returns None if input does not contain olf.conc_delimiter.
- Parameters
odor_str (str) – contains odor name, and generally also concentration
require (bool) – if True, raises ValueError if olf.conc_delimiter is not in input
>>> parse_log10_conc('ethyl acetate @ -2')
-2
- Return type
Optional[float]
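A rough sketch of the parsing described above, assuming `'@'` as the delimiter (as stated in the parse_odor_name docs) and that everything after it is a parseable float. `parse_log10_conc_sketch` is a hypothetical name, not the library function.

```python
from typing import Optional

conc_delimiter = '@'  # olf.conc_delimiter, per the parse_odor_name docs


def parse_log10_conc_sketch(odor_str: str, *, require: bool = False) -> Optional[float]:
    """Returns the log10 concentration in odor_str, or None if there is none."""
    if conc_delimiter not in odor_str:
        if require:
            raise ValueError(f'{odor_str!r} does not contain {conc_delimiter!r}')
        return None

    # float() tolerates the whitespace that surrounds the delimiter.
    _, _, conc_part = odor_str.partition(conc_delimiter)
    return float(conc_part)
```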
-
hong2p.olf.parse_odor_list(trial_odors_str, *, delim=' + ', **parse_odor_kwargs)
- Return type
Sequence[OdorDict] (OdorDict is a NewType wrapping dict)
-
hong2p.olf.parse_odor_name(odor_str, *, require_conc=True)
Takes formatted odor string to just the name of the odor.
Returns None if input matches olf.solvent_str, but otherwise raises ValueError if odor_str does not contain olf.conc_delimiter.
- Parameters
odor_str (str) – contains odor name and concentration. Name and concentration must be separated by olf.conc_delimiter (‘@’), with whitespace on either side of it.
require_conc (bool) – if False, will return odor_str if it contains no olf.conc_delimiter
>>> parse_odor_name('ethyl acetate @ -2')
'ethyl acetate'
>>> parse_odor_name(solvent_str) is None
True
- Return type
Optional[str]
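The three cases above (solvent, missing delimiter, normal odor string) can be sketched as follows. `parse_odor_name_sketch` is a hypothetical name and `'solvent'` is an assumed value for olf.solvent_str; the order in which the cases are checked is also an assumption.

```python
from typing import Optional

conc_delimiter = '@'      # olf.conc_delimiter, as stated above
solvent_str = 'solvent'   # assumed value of olf.solvent_str


def parse_odor_name_sketch(odor_str: str, *, require_conc: bool = True) -> Optional[str]:
    """Returns the odor name in odor_str, or None for the solvent."""
    if odor_str == solvent_str:
        return None

    if conc_delimiter not in odor_str:
        if require_conc:
            raise ValueError(f'{odor_str!r} does not contain {conc_delimiter!r}')
        # Without a concentration, the whole string is taken as the name.
        return odor_str

    name, _, _ = odor_str.partition(conc_delimiter)
    return name.strip()
```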
-
hong2p.olf.remove_consecutive_repeats(odor_lists)
Returns a list without any consecutive repeats, and the int number of consecutive repeats.
Raises ValueError if there is a variable number of consecutive repeats.
Assumes that all elements of odor_lists are repeated the same number of times, for each consecutive group of repeats. As long as every group of repeats is consecutive and contains the full n_repeats, it is ok for a particular odor (e.g. solvent control) to be repeated n_repeats times in each of several different positions.
>>> without_repeats, n = remove_consecutive_repeats(['a','a','a','b','b','b'])
>>> without_repeats
['a', 'b']
>>> n
3
>>> without_repeats, n = remove_consecutive_repeats(['a','a','b','b','a','a'])
>>> without_repeats
['a', 'b', 'a']
>>> n
2
>>> without_repeats, n = remove_consecutive_repeats(['a','a','a','b','b'])
Traceback (most recent call last):
ValueError: variable number of consecutive repeats
Wanted to also take a list-of-lists-of-dicts, where each dict represents one odor and each internal list represents all of the odors on one trial, but neither the internal lists nor the dicts they contain would be hashable, and thus they cannot work with Counter as-is.
- Return type
Tuple[List[Hashable],int]
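One way to get the behavior shown in the examples above is with `itertools.groupby`, sketched below. This is an alternative technique, not the Counter-based approach the note above refers to, and `remove_consecutive_repeats_sketch` is a hypothetical name. Returning `([], 0)` for empty input is an assumption.

```python
from itertools import groupby
from typing import Any, List, Sequence, Tuple


def remove_consecutive_repeats_sketch(items: Sequence[Any]) -> Tuple[List[Any], int]:
    """Collapses runs of equal adjacent elements; returns (elements, run length)."""
    # Each run of equal, adjacent elements becomes (element, length of run).
    runs = [(item, sum(1 for _ in group)) for item, group in groupby(items)]

    run_lengths = {n for _, n in runs}
    if len(run_lengths) > 1:
        raise ValueError('variable number of consecutive repeats')

    without_repeats = [item for item, _ in runs]
    n_repeats = run_lengths.pop() if run_lengths else 0
    return without_repeats, n_repeats
```

Because `groupby` compares elements with `==` rather than hashing them, this sketch also accepts lists of lists of dicts. One limitation: if the same element is presented in two adjacent groups, `groupby` merges them into a single longer run, which would then raise ValueError.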
-
hong2p.olf.sort_odor_list(odor_list)
Returns a sorted list of dicts representing odors for one trial.
Name takes priority over concentration, so with the same set of odor names in each trial’s odor_list, this should produce a consistent ordering (and the same indices can be used, assuming all lists have equal length).
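A minimal sketch of sorting by name and then concentration, using a (name, concentration) tuple as the key in the spirit of odordict_sort_key. Both function names are hypothetical, and the handling of a missing ‘log10_conc’ key (sorting it before any concentration) is an assumption.

```python
from typing import Dict, List, Tuple


def odordict_sort_key_sketch(odor_dict: Dict) -> Tuple[str, float]:
    """Returns (name, log10 conc) so that name takes priority when sorting."""
    # Assumption: an odor without a concentration sorts before any concentration.
    return (odor_dict['name'], odor_dict.get('log10_conc', float('-inf')))


def sort_odor_list_sketch(odor_list: List[Dict]) -> List[Dict]:
    """Returns the odors for one trial, sorted by name and then concentration."""
    return sorted(odor_list, key=odordict_sort_key_sketch)
```

Because tuples compare element by element, concentration only breaks ties between odors with the same name.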
-
hong2p.olf.sort_odors(df, *, panel_order=None, panel2name_order=None, panel=None, if_panel_missing='warn', axis=None, _debug=False, **kwargs)
Sorts DataFrame by odor index/columns.
- Parameters
df (DataFrame) – should have columns/index-level names where olf.is_odor_var(<col name>) returns True
panel_order (Optional[List[str]]) – list of str panel names. If passed, must also provide panel2name_order. Will sort panels first, then odors within each panel.
panel2name_order (Optional[Dict[str,List[str]]]) – maps str panel names to lists of odor names, each in the desired order. If passed, must also pass panel_order.
panel (Optional[str]) – specifies the panel for input data, if it does not have separate index level(s) / column indicating which panel each odor belongs to. Must have a matching key in panel2name_order. All data will be assumed to belong to this panel.
if_panel_missing – ‘warn’|’err’|None
axis (Optional[str]) – if None, detect which axes to sort (and may sort both). Otherwise, expecting ‘columns’|’index’.
**kwargs – passed through to odor_index_sort_key().
Notes: The index is checked first, and if it contains odor information, the sort is on that. Otherwise, matching columns are checked and sorted on.
By default, sorts by name, then by concentration within each name. solvent_str is treated as less than all odors.
>>> df = pd.DataFrame({
...     'odor1': ['B @ -2', 'A @ -2', 'A @ -3'],
...     'odor2': ['solvent'] * 3,
...     'delta_f': [1.1, 1.2, 0.9]
... }).set_index(['odor1', 'odor2'])
Names are sorted alphabetically by default, then within each name they are sorted by concentration. Pass sort_names=False to only sort on concentration, or names_first=False to sort on concentrations first.
>>> sort_odors(df)
                delta_f
odor1  odor2
A @ -3 solvent      0.9
A @ -2 solvent      1.2
B @ -2 solvent      1.1
>>> sort_odors(df, name_order=['B','A'])
                delta_f
odor1  odor2
B @ -2 solvent      1.1
A @ -3 solvent      0.9
A @ -2 solvent      1.2
- Return type
DataFrame
-
hong2p.olf.strip_concs_from_odor_str(odor_str, **kwargs)
Works with input representing either single components or air mixtures of multiple components.
- Parameters
**kwargs – passed thru to format_odor
- Return type
str
-
hong2p.olf.yaml_data2odor_lists(yaml_data, *, sort=True)
Returns a list-of-lists of dictionary representations of odors.
Each dictionary will have at least the key ‘name’ and generally also ‘log10_conc’.
The i-th list contains all of the odors presented simultaneously on the i-th odor presentation.
- Parameters
yaml_data (dict) – parsed contents of stimulus YAML file
sort (bool) – (default=True) whether to sort odors within each trial. Irrelevant if only a single odor is ever presented on each trial.