The mwtab API Reference
Routines for working with mwTab format files used by the Metabolomics Workbench.
This package includes the following modules:
mwtab
- This module provides the MWTabFile class, which is a Python dictionary representation of a Metabolomics Workbench mwTab file. Data can be accessed directly from the MWTabFile instance using bracket accessors.
cli
- This module provides the command-line interface for the mwtab package.
tokenizer
- This module provides the tokenizer() generator, which yields tuples of key-value pairs from mwTab files.
fileio
- This module provides the read_files() generator to open files from different sources (single file or multiple files on a local machine, directory or archive of files, URL address of a file).
converter
- This module provides the Converter class, which is responsible for converting mwTab formatted files into their JSON representation and vice versa.
mwschema
- This module provides JSON schema definitions for mwTab formatted files, i.e. it specifies required and optional keys as well as data types.
validator
- This module provides routines to validate mwTab formatted files against the schema definitions and to check files for self-consistency.
mwrest
- This module provides the GenericMWURL class, which is a Python dictionary representation of a Metabolomics Workbench REST URL. The class is used to validate query parameters and to generate a URL path that can be used to request data from Metabolomics Workbench through its REST API.
mwtab.mwtab
This module provides the MWTabFile class, which stores the data from a single mwTab formatted file in the form of an OrderedDict. Data can be accessed directly from the MWTabFile instance using bracket accessors.
The data is divided into a series of “sections”, each of which contains a number of “key-value”-like pairs. The file also contains a specially formatted SUBJECT_SAMPLE_FACTORS block and blocks of data between *_START and *_END.
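The layout described above can be pictured with plain `collections.OrderedDict` objects. This is only a sketch: `mwfile_like` is a hypothetical stand-in, not a real `MWTabFile` instance, the section and key names are borrowed from the schema shown under `mwtab.validator`, and the values are invented for illustration.

```python
from collections import OrderedDict

# Stand-in for a parsed file: each section maps to ordered key-value pairs.
# This is NOT a real MWTabFile; the values are illustrative only.
mwfile_like = OrderedDict([
    ("PROJECT", OrderedDict([
        ("PROJECT_TITLE", "Example project"),
        ("INSTITUTE", "Example institute"),
    ])),
    ("SUBJECT", OrderedDict([
        ("SUBJECT_TYPE", "Plant"),
    ])),
])

# Bracket accessors walk from the section name to the key.
title = mwfile_like["PROJECT"]["PROJECT_TITLE"]
section_names = list(mwfile_like.keys())

print(title)
print(section_names)
```

Because the container is ordered, iterating over it returns sections in the order in which they were stored.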
class mwtab.mwtab.MWTabFile(source, *args, **kwds)
MWTabFile class that stores data from a single mwTab formatted file in the form of collections.OrderedDict.

  read(filehandle)
  Read data into a MWTabFile instance.
  Parameters: filehandle (io.TextIOWrapper, gzip.GzipFile, bz2.BZ2File, zipfile.ZipFile) – file-like object.
  Returns: None
  Return type: None

  write(filehandle, file_format)
  Write MWTabFile data into file.
  Parameters:
  - filehandle (io.TextIOWrapper) – file-like object.
  - file_format (str) – Format to use to write data: mwtab or json.
  Returns: None

  writestr(file_format)
  Write MWTabFile data into string.
  Parameters: file_format (str) – Format to use to write data: mwtab or json.
  Returns: String representing the MWTabFile instance.
  Return type: str

  print_file(f=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, file_format='mwtab')
  Print MWTabFile into a file or stdout.
  Parameters:
  - f (io.StringIO) – writable file-like stream.
  - file_format (str) – Format to use: mwtab or json.
  Returns: None

  print_subject_sample_factors(section_key, f=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, file_format='mwtab')
  Print mwtab SUBJECT_SAMPLE_FACTORS section into a file or stdout.
  Parameters:
  - section_key (str) – Section name.
  - f (io.StringIO) – writable file-like stream.
  - file_format (str) – Format to use: mwtab or json.
  Returns: None

  print_block(section_key, f=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, file_format='mwtab')
  Print mwtab section into a file or stdout.
  Parameters:
  - section_key (str) – Section name.
  - f (io.StringIO) – writable file-like stream.
  - file_format (str) – Format to use: mwtab or json.
  Returns: None
The mwtab command-line interface
Usage:
    mwtab -h | --help
    mwtab --version
    mwtab convert (<from-path> <to-path>) [--from-format=<format>] [--to-format=<format>] [--validate] [--mw-rest=<url>] [--verbose]
    mwtab validate <from-path> [--mw-rest=<url>] [--verbose]
    mwtab download url <url> [--to-path=<path>] [--verbose]
    mwtab download study all [--to-path=<path>] [--input-item=<item>] [--output-format=<format>] [--mw-rest=<url>] [--validate] [--verbose]
    mwtab download study <input-value> [--to-path=<path>] [--input-item=<item>] [--output-item=<item>] [--output-format=<format>] [--mw-rest=<url>] [--validate] [--verbose]
    mwtab download (study | compound | refmet | gene | protein) <input-item> <input-value> <output-item> [--output-format=<format>] [--to-path=<path>] [--mw-rest=<url>] [--verbose]
    mwtab download moverz <input-item> <m/z-value> <ion-type-value> <m/z-tolerance-value> [--to-path=<path>] [--mw-rest=<url>] [--verbose]
    mwtab download exactmass <LIPID-abbreviation> <ion-type-value> [--to-path=<path>] [--mw-rest=<url>] [--verbose]
    mwtab extract metadata <from-path> <to-path> <key> ... [--to-format=<format>] [--no-header]
    mwtab extract metabolites <from-path> <to-path> (<key> <value>) ... [--to-format=<format>] [--no-header]

Options:
    -h, --help                 Show this screen.
    --version                  Show version.
    --verbose                  Print which files are being processed.
    --validate                 Validate the mwTab file.
    --from-format=<format>     Input file format, available formats: mwtab, json [default: mwtab].
    --to-format=<format>       Output file format [default: json].
                               Available formats for convert: mwtab, json.
                               Available formats for extract: json, csv.
    --mw-rest=<url>            URL to MW REST interface [default: https://www.metabolomicsworkbench.org/rest/].
    --context=<context>        Type of resource to access from MW REST interface, available contexts: study, compound, refmet, gene, protein, moverz, exactmass [default: study].
    --input-item=<item>        Item to search Metabolomics Workbench with.
    --output-item=<item>       Item to be retrieved from Metabolomics Workbench.
    --output-format=<format>   Format for item to be retrieved in, available formats: mwtab, json.
    --no-header                Omit the header at the top of csv formatted files.

For extraction, <to-path> can take a "-", which will use stdout.

mwtab.cli.cli(cmdargs)
Implements the command line interface.
Parameters: cmdargs (dict) – dictionary of command line arguments.
mwtab.tokenizer
This module provides the tokenizer() lexical analyzer for mwTab format syntax. It is implemented as a Python generator-based state machine that generates (yields) tokens one at a time when next() is invoked on a tokenizer() instance.
Each token is a tuple of “key-value”-like pairs, a tuple of SUBJECT_SAMPLE_FACTORS, or a tuple of data deposited between *_START and *_END blocks.

mwtab.tokenizer.tokenizer(text)
A lexical analyzer for mwTab formatted files.
Parameters: text (str) – mwTab formatted text.
Returns: Tuples of data.
Return type: collections.namedtuple
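To illustrate the generator-based state machine idea, here is a toy tokenizer for a deliberately simplified, tab-separated text. It is a sketch under stated assumptions, not the library's implementation: `toy_tokenizer`, the `Token` fields, and the sample input are all hypothetical, and real mwTab syntax has more cases than this handles.

```python
from collections import namedtuple

# Hypothetical token type for this sketch; the real tokenizer's fields may differ.
Token = namedtuple("Token", ["kind", "key", "value"])


def toy_tokenizer(text):
    """Yield tokens one at a time from a simplified mwTab-like text."""
    block_name = None  # name of the *_START/*_END block being read, if any
    block_rows = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue
        if block_name is not None:
            # State: inside a data block, collect rows until the matching *_END.
            if line == block_name + "_END":
                yield Token("block", block_name, block_rows)
                block_name, block_rows = None, []
            else:
                block_rows.append(line.split("\t"))
        elif line.endswith("_START"):
            # Transition: enter the block state.
            block_name = line[:-len("_START")]
        else:
            # State: ordinary key-value line.
            key, _, value = line.partition("\t")
            yield Token("pair", key, value)


sample = "A\t1\nDATA_START\nx\ty\n1\t2\nDATA_END\nB\t2\n"
tokens = toy_tokenizer(sample)
first = next(tokens)   # tokens are produced lazily, one per next() call
rest = list(tokens)
print(first)
print(rest)
```

Producing tokens lazily means a consumer can stop early or process very large files without holding every token in memory.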
mwtab.fileio
This module provides routines for reading mwTab formatted files from different kinds of sources:
- Single mwTab formatted file on a local machine.
- Directory containing multiple mwTab formatted files.
- Compressed zip/tar archive of mwTab formatted files.
- URL address of mwTab formatted file.
- ANALYSIS_ID of mwTab formatted file.

mwtab.fileio.read_files(*sources, **kwds)
Construct a generator that yields file instances.
Parameters: sources – One or more strings representing path to file(s).
mwtab.converter
This module provides functionality for converting between the Metabolomics Workbench mwTab formatted file and its equivalent JSONized representation.
The following conversions are possible:
- Local files:
- One-to-one file conversions:
- textfile - to - textfile
- textfile - to - textfile.gz
- textfile - to - textfile.bz2
- textfile.gz - to - textfile
- textfile.gz - to - textfile.gz
- textfile.gz - to - textfile.bz2
- textfile.bz2 - to - textfile
- textfile.bz2 - to - textfile.gz
- textfile.bz2 - to - textfile.bz2
- textfile / textfile.gz / textfile.bz2 - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
- Many-to-many files conversions:
- Directories:
- directory - to - directory
- directory - to - directory.zip
- directory - to - directory.tar
- directory - to - directory.tar.bz2
- directory - to - directory.tar.gz
- directory - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- Zipfiles:
- zipfile.zip - to - directory
- zipfile.zip - to - zipfile.zip
- zipfile.zip - to - tarfile.tar
- zipfile.zip - to - tarfile.tar.gz
- zipfile.zip - to - tarfile.tar.bz2
- zipfile.zip - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- Tarfiles:
- tarfile.tar - to - directory
- tarfile.tar - to - zipfile.zip
- tarfile.tar - to - tarfile.tar
- tarfile.tar - to - tarfile.tar.gz
- tarfile.tar - to - tarfile.tar.bz2
- tarfile.tar - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- tarfile.tar.gz - to - directory
- tarfile.tar.gz - to - zipfile.zip
- tarfile.tar.gz - to - tarfile.tar
- tarfile.tar.gz - to - tarfile.tar.gz
- tarfile.tar.gz - to - tarfile.tar.bz2
- tarfile.tar.gz - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- tarfile.tar.bz2 - to - directory
- tarfile.tar.bz2 - to - zipfile.zip
- tarfile.tar.bz2 - to - tarfile.tar
- tarfile.tar.bz2 - to - tarfile.tar.gz
- tarfile.tar.bz2 - to - tarfile.tar.bz2
- tarfile.tar.bz2 - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- URL files:
- One-to-one file conversions:
- analysis_id - to - textfile
- analysis_id - to - textfile.gz
- analysis_id - to - textfile.bz2
- analysis_id - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
- textfileurl - to - textfile
- textfileurl - to - textfile.gz
- textfileurl - to - textfile.bz2
- textfileurl.gz - to - textfile
- textfileurl.gz - to - textfile.gz
- textfileurl.gz - to - textfile.bz2
- textfileurl.bz2 - to - textfile
- textfileurl.bz2 - to - textfile.gz
- textfileurl.bz2 - to - textfile.bz2
- textfileurl / textfileurl.gz / textfileurl.bz2 - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
- Many-to-many files conversions:
- Zipfiles:
- zipfileurl.zip - to - directory
- zipfileurl.zip - to - zipfile.zip
- zipfileurl.zip - to - tarfile.tar
- zipfileurl.zip - to - tarfile.tar.gz
- zipfileurl.zip - to - tarfile.tar.bz2
- zipfileurl.zip - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- Tarfiles:
- tarfileurl.tar - to - directory
- tarfileurl.tar - to - zipfile.zip
- tarfileurl.tar - to - tarfile.tar
- tarfileurl.tar - to - tarfile.tar.gz
- tarfileurl.tar - to - tarfile.tar.bz2
- tarfileurl.tar - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- tarfileurl.tar.gz - to - directory
- tarfileurl.tar.gz - to - zipfile.zip
- tarfileurl.tar.gz - to - tarfile.tar
- tarfileurl.tar.gz - to - tarfile.tar.gz
- tarfileurl.tar.gz - to - tarfile.tar.bz2
- tarfileurl.tar.gz - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
- tarfileurl.tar.bz2 - to - directory
- tarfileurl.tar.bz2 - to - zipfile.zip
- tarfileurl.tar.bz2 - to - tarfile.tar
- tarfileurl.tar.bz2 - to - tarfile.tar.gz
- tarfileurl.tar.bz2 - to - tarfile.tar.bz2
- tarfileurl.tar.bz2 - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
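The pattern in the list above is that single files convert to single files, collections (directories and archives) convert to collections, and mixing the two raises `TypeError`. The following sketch encodes that rule by file extension. The function names and the `is_directory` flags are hypothetical helpers introduced here, and the real converter may decide differently, for example by inspecting the path on disk.

```python
# Suffixes taken from the conversion list above.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tar.bz2")


def kind_of(path, is_directory=False):
    """Classify a path as 'many' (directory or archive) or 'one' (single file)."""
    if is_directory or path.endswith(ARCHIVE_SUFFIXES):
        return "many"
    return "one"


def check_conversion(from_path, to_path, from_is_dir=False, to_is_dir=False):
    """Return the conversion kind, or raise TypeError for a mixed conversion."""
    source = kind_of(from_path, from_is_dir)
    target = kind_of(to_path, to_is_dir)
    if source == "one" and target == "many":
        raise TypeError("One-to-many conversion")
    if source == "many" and target == "one":
        raise TypeError("Many-to-one conversion")
    return source + "-to-" + target


print(check_conversion("analysis.txt", "analysis.json.gz"))
print(check_conversion("studies.tar.bz2", "studies.zip"))
```

Note that `.tar.gz` counts as an archive while a bare `.gz` counts as a single compressed file, which matches the distinction the list draws between `tarfile.tar.gz` and `directory.gz`.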
class mwtab.converter.Translator(from_path, to_path, from_format=None, to_format=None, validate=False)
Translator abstract class.

class mwtab.converter.MWTabFileToMWTabFile(from_path, to_path, from_format=None, to_format=None, validate=False)
Translator concrete class that can convert between mwTab and JSON formats.
mwtab.validator
This module contains routines to validate the consistency of mwTab formatted files, e.g. to make sure that Samples and Factors identifiers are consistent across the file and that all required key-value pairs are present.

mwtab.validator.validate_file
(mwtabfile, section_schema_mapping={'ANALYSIS': Schema({'ANALYSIS_TYPE': <class 'str'>, Optional('LABORATORY_NAME'): <class 'str'>, Optional('OPERATOR_NAME'): <class 'str'>, Optional('DETECTOR_TYPE'): <class 'str'>, Optional('SOFTWARE_VERSION'): <class 'str'>, Optional('ACQUISITION_DATE'): <class 'str'>, Optional('ANALYSIS_PROTOCOL_FILE'): <class 'str'>, Optional('ACQUISITION_PARAMETERS_FILE'): <class 'str'>, Optional('PROCESSING_PARAMETERS_FILE'): <class 'str'>, Optional('DATA_FORMAT'): <class 'str'>, Optional('ACQUISITION_ID'): <class 'str'>, Optional('ACQUISITION_TIME'): <class 'str'>, Optional('ANALYSIS_COMMENTS'): <class 'str'>, Optional('ANALYSIS_DISPLAY'): <class 'str'>, Optional('INSTRUMENT_NAME'): <class 'str'>, Optional('INSTRUMENT_PARAMETERS_FILE'): <class 'str'>, Optional('NUM_FACTORS'): <class 'str'>, Optional('NUM_METABOLITES'): <class 'str'>, Optional('PROCESSED_FILE'): <class 'str'>, Optional('RANDOMIZATION_ORDER'): <class 'str'>, Optional('RAW_FILE'): <class 'str'>}), 'CHROMATOGRAPHY': Schema({Optional('CHROMATOGRAPHY_SUMMARY'): <class 'str'>, 'CHROMATOGRAPHY_TYPE': <class 'str'>, 'INSTRUMENT_NAME': <class 'str'>, 'COLUMN_NAME': <class 'str'>, Optional('FLOW_GRADIENT'): <class 'str'>, Optional('FLOW_RATE'): <class 'str'>, Optional('COLUMN_TEMPERATURE'): <class 'str'>, Optional('METHODS_FILENAME'): <class 'str'>, Optional('SOLVENT_A'): <class 'str'>, Optional('SOLVENT_B'): <class 'str'>, Optional('METHODS_ID'): <class 'str'>, Optional('COLUMN_PRESSURE'): <class 'str'>, Optional('INJECTION_TEMPERATURE'): <class 'str'>, Optional('INTERNAL_STANDARD'): <class 'str'>, Optional('INTERNAL_STANDARD_MT'): <class 'str'>, Optional('RETENTION_INDEX'): <class 'str'>, Optional('RETENTION_TIME'): <class 'str'>, Optional('SAMPLE_INJECTION'): <class 'str'>, Optional('SAMPLING_CONE'): <class 'str'>, Optional('ANALYTICAL_TIME'): <class 'str'>, Optional('CAPILLARY_VOLTAGE'): <class 'str'>, Optional('MIGRATION_TIME'): <class 'str'>, Optional('OVEN_TEMPERATURE'): <class 
'str'>, Optional('PRECONDITIONING'): <class 'str'>, Optional('RUNNING_BUFFER'): <class 'str'>, Optional('RUNNING_VOLTAGE'): <class 'str'>, Optional('SHEATH_LIQUID'): <class 'str'>, Optional('TIME_PROGRAM'): <class 'str'>, Optional('TRANSFERLINE_TEMPERATURE'): <class 'str'>, Optional('WASHING_BUFFER'): <class 'str'>, Optional('WEAK_WASH_SOLVENT_NAME'): <class 'str'>, Optional('WEAK_WASH_VOLUME'): <class 'str'>, Optional('STRONG_WASH_SOLVENT_NAME'): <class 'str'>, Optional('STRONG_WASH_VOLUME'): <class 'str'>, Optional('TARGET_SAMPLE_TEMPERATURE'): <class 'str'>, Optional('SAMPLE_LOOP_SIZE'): <class 'str'>, Optional('SAMPLE_SYRINGE_SIZE'): <class 'str'>, Optional('RANDOMIZATION_ORDER'): <class 'str'>, Optional('CHROMATOGRAPHY_COMMENTS'): <class 'str'>}), 'COLLECTION': Schema({'COLLECTION_SUMMARY': <class 'str'>, Optional('COLLECTION_PROTOCOL_ID'): <class 'str'>, Optional('COLLECTION_PROTOCOL_FILENAME'): <class 'str'>, Optional('COLLECTION_PROTOCOL_COMMENTS'): <class 'str'>, Optional('SAMPLE_TYPE'): <class 'str'>, Optional('COLLECTION_METHOD'): <class 'str'>, Optional('COLLECTION_LOCATION'): <class 'str'>, Optional('COLLECTION_FREQUENCY'): <class 'str'>, Optional('COLLECTION_DURATION'): <class 'str'>, Optional('COLLECTION_TIME'): <class 'str'>, Optional('VOLUMEORAMOUNT_COLLECTED'): <class 'str'>, Optional('STORAGE_CONDITIONS'): <class 'str'>, Optional('COLLECTION_VIALS'): <class 'str'>, Optional('STORAGE_VIALS'): <class 'str'>, Optional('COLLECTION_TUBE_TEMP'): <class 'str'>, Optional('ADDITIVES'): <class 'str'>, Optional('BLOOD_SERUM_OR_PLASMA'): <class 'str'>, Optional('TISSUE_CELL_IDENTIFICATION'): <class 'str'>, Optional('TISSUE_CELL_QUANTITY_TAKEN'): <class 'str'>}), 'METABOLOMICS WORKBENCH': Schema({'VERSION': <class 'str'>, 'CREATED_ON': <class 'str'>, Optional('STUDY_ID'): <class 'str'>, Optional('ANALYSIS_ID'): <class 'str'>, Optional('PROJECT_ID'): <class 'str'>, Optional('HEADER'): <class 'str'>, Optional('DATATRACK_ID'): <class 'str'>}), 'MS': 
Schema({'INSTRUMENT_NAME': <class 'str'>, 'INSTRUMENT_TYPE': <class 'str'>, 'MS_TYPE': <class 'str'>, 'ION_MODE': <class 'str'>, Optional('MS_COMMENTS'): <class 'str'>, Optional('CAPILLARY_TEMPERATURE'): <class 'str'>, Optional('CAPILLARY_VOLTAGE'): <class 'str'>, Optional('COLLISION_ENERGY'): <class 'str'>, Optional('COLLISION_GAS'): <class 'str'>, Optional('DRY_GAS_FLOW'): <class 'str'>, Optional('DRY_GAS_TEMP'): <class 'str'>, Optional('FRAGMENT_VOLTAGE'): <class 'str'>, Optional('FRAGMENTATION_METHOD'): <class 'str'>, Optional('GAS_PRESSURE'): <class 'str'>, Optional('HELIUM_FLOW'): <class 'str'>, Optional('ION_SOURCE_TEMPERATURE'): <class 'str'>, Optional('ION_SPRAY_VOLTAGE'): <class 'str'>, Optional('IONIZATION'): <class 'str'>, Optional('IONIZATION_ENERGY'): <class 'str'>, Optional('IONIZATION_POTENTIAL'): <class 'str'>, Optional('MASS_ACCURACY'): <class 'str'>, Optional('PRECURSOR_TYPE'): <class 'str'>, Optional('REAGENT_GAS'): <class 'str'>, Optional('SOURCE_TEMPERATURE'): <class 'str'>, Optional('SPRAY_VOLTAGE'): <class 'str'>, Optional('ACTIVATION_PARAMETER'): <class 'str'>, Optional('ACTIVATION_TIME'): <class 'str'>, Optional('ATOM_GUN_CURRENT'): <class 'str'>, Optional('AUTOMATIC_GAIN_CONTROL'): <class 'str'>, Optional('BOMBARDMENT'): <class 'str'>, Optional('CDL_SIDE_OCTOPOLES_BIAS_VOLTAGE'): <class 'str'>, Optional('CDL_TEMPERATURE'): <class 'str'>, Optional('DATAFORMAT'): <class 'str'>, Optional('DESOLVATION_GAS_FLOW'): <class 'str'>, Optional('DESOLVATION_TEMPERATURE'): <class 'str'>, Optional('INTERFACE_VOLTAGE'): <class 'str'>, Optional('IT_SIDE_OCTOPOLES_BIAS_VOLTAGE'): <class 'str'>, Optional('LASER'): <class 'str'>, Optional('MATRIX'): <class 'str'>, Optional('NEBULIZER'): <class 'str'>, Optional('OCTPOLE_VOLTAGE'): <class 'str'>, Optional('PROBE_TIP'): <class 'str'>, Optional('RESOLUTION_SETTING'): <class 'str'>, Optional('SAMPLE_DRIPPING'): <class 'str'>, Optional('SCAN_RANGE_MOVERZ'): <class 'str'>, Optional('SCANNING'): <class 'str'>, 
Optional('SCANNING_CYCLE'): <class 'str'>, Optional('SCANNING_RANGE'): <class 'str'>, Optional('SKIMMER_VOLTAGE'): <class 'str'>, Optional('TUBE_LENS_VOLTAGE'): <class 'str'>, Optional('MS_RESULTS_FILE'): Or(<class 'str'>, <class 'dict'>)}), 'MS_METABOLITE_DATA': Schema({'Units': <class 'str'>, 'Data': Schema([{Or('Metabolite', 'Bin range(ppm)'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}]), 'Metabolites': Schema([{Or('Metabolite', 'Bin range(ppm)'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}]), Optional('Extended'): Schema([{'Metabolite': <class 'str'>, Optional(<class 'str'>): <class 'str'>, 'sample_id': <class 'str'>}])}), 'NM': Schema({'INSTRUMENT_NAME': <class 'str'>, 'INSTRUMENT_TYPE': <class 'str'>, 'NMR_EXPERIMENT_TYPE': <class 'str'>, Optional('NMR_COMMENTS'): <class 'str'>, Optional('FIELD_FREQUENCY_LOCK'): <class 'str'>, Optional('STANDARD_CONCENTRATION'): <class 'str'>, 'SPECTROMETER_FREQUENCY': <class 'str'>, Optional('NMR_PROBE'): <class 'str'>, Optional('NMR_SOLVENT'): <class 'str'>, Optional('NMR_TUBE_SIZE'): <class 'str'>, Optional('SHIMMING_METHOD'): <class 'str'>, Optional('PULSE_SEQUENCE'): <class 'str'>, Optional('WATER_SUPPRESSION'): <class 'str'>, Optional('PULSE_WIDTH'): <class 'str'>, Optional('POWER_LEVEL'): <class 'str'>, Optional('RECEIVER_GAIN'): <class 'str'>, Optional('OFFSET_FREQUENCY'): <class 'str'>, Optional('PRESATURATION_POWER_LEVEL'): <class 'str'>, Optional('CHEMICAL_SHIFT_REF_CPD'): <class 'str'>, Optional('TEMPERATURE'): <class 'str'>, Optional('NUMBER_OF_SCANS'): <class 'str'>, Optional('DUMMY_SCANS'): <class 'str'>, Optional('ACQUISITION_TIME'): <class 'str'>, Optional('RELAXATION_DELAY'): <class 'str'>, Optional('SPECTRAL_WIDTH'): <class 'str'>, Optional('NUM_DATA_POINTS_ACQUIRED'): <class 'str'>, Optional('REAL_DATA_POINTS'): <class 'str'>, Optional('LINE_BROADENING'): <class 'str'>, Optional('ZERO_FILLING'): <class 'str'>, Optional('APODIZATION'): <class 'str'>, 
Optional('BASELINE_CORRECTION_METHOD'): <class 'str'>, Optional('CHEMICAL_SHIFT_REF_STD'): <class 'str'>, Optional('BINNED_INCREMENT'): <class 'str'>, Optional('BINNED_DATA_NORMALIZATION_METHOD'): <class 'str'>, Optional('BINNED_DATA_PROTOCOL_FILE'): <class 'str'>, Optional('BINNED_DATA_CHEMICAL_SHIFT_RANGE'): <class 'str'>, Optional('BINNED_DATA_EXCLUDED_RANGE'): <class 'str'>, Optional('NMR_RESULTS_FILE'): Or(<class 'str'>, <class 'dict'>)}), 'NMR_BINNED_DATA': Schema({'Units': <class 'str'>, 'Data': Schema([{Or('Metabolite', 'Bin range(ppm)'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}])}), 'NMR_METABOLITE_DATA': Schema({'Units': <class 'str'>, 'Data': Schema([{Or('Metabolite', 'Bin range(ppm)'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}]), 'Metabolites': Schema([{Or('Metabolite', 'Bin range(ppm)'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}]), Optional('Extended'): Schema([{'Metabolite': <class 'str'>, Optional(<class 'str'>): <class 'str'>, 'sample_id': <class 'str'>}])}), 'PROJECT': Schema({'PROJECT_TITLE': <class 'str'>, Optional('PROJECT_TYPE'): <class 'str'>, 'PROJECT_SUMMARY': <class 'str'>, 'INSTITUTE': <class 'str'>, Optional('DEPARTMENT'): <class 'str'>, Optional('LABORATORY'): <class 'str'>, 'LAST_NAME': <class 'str'>, 'FIRST_NAME': <class 'str'>, 'ADDRESS': <class 'str'>, 'EMAIL': <class 'str'>, 'PHONE': <class 'str'>, Optional('FUNDING_SOURCE'): <class 'str'>, Optional('PROJECT_COMMENTS'): <class 'str'>, Optional('PUBLICATIONS'): <class 'str'>, Optional('CONTRIBUTORS'): <class 'str'>, Optional('DOI'): <class 'str'>}), 'SAMPLEPREP': Schema({'SAMPLEPREP_SUMMARY': <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_ID'): <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_FILENAME'): <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_COMMENTS'): <class 'str'>, Optional('PROCESSING_METHOD'): <class 'str'>, Optional('PROCESSING_STORAGE_CONDITIONS'): <class 'str'>, Optional('EXTRACTION_METHOD'): <class 'str'>, 
Optional('EXTRACT_CONCENTRATION_DILUTION'): <class 'str'>, Optional('EXTRACT_ENRICHMENT'): <class 'str'>, Optional('EXTRACT_CLEANUP'): <class 'str'>, Optional('EXTRACT_STORAGE'): <class 'str'>, Optional('SAMPLE_RESUSPENSION'): <class 'str'>, Optional('SAMPLE_DERIVATIZATION'): <class 'str'>, Optional('SAMPLE_SPIKING'): <class 'str'>, Optional('ORGAN'): <class 'str'>, Optional('ORGAN_SPECIFICATION'): <class 'str'>, Optional('CELL_TYPE'): <class 'str'>, Optional('SUBCELLULAR_LOCATION'): <class 'str'>}), 'STUDY': Schema({'STUDY_TITLE': <class 'str'>, Optional('STUDY_TYPE'): <class 'str'>, 'STUDY_SUMMARY': <class 'str'>, 'INSTITUTE': <class 'str'>, Optional('DEPARTMENT'): <class 'str'>, Optional('LABORATORY'): <class 'str'>, 'LAST_NAME': <class 'str'>, 'FIRST_NAME': <class 'str'>, 'ADDRESS': <class 'str'>, 'EMAIL': <class 'str'>, 'PHONE': <class 'str'>, Optional('NUM_GROUPS'): <class 'str'>, Optional('TOTAL_SUBJECTS'): <class 'str'>, Optional('NUM_MALES'): <class 'str'>, Optional('NUM_FEMALES'): <class 'str'>, Optional('STUDY_COMMENTS'): <class 'str'>, Optional('PUBLICATIONS'): <class 'str'>, Optional('SUBMIT_DATE'): <class 'str'>}), 'SUBJECT': Schema({'SUBJECT_TYPE': <class 'str'>, 'SUBJECT_SPECIES': <class 'str'>, Optional('TAXONOMY_ID'): <class 'str'>, Optional('GENOTYPE_STRAIN'): <class 'str'>, Optional('AGE_OR_AGE_RANGE'): <class 'str'>, Optional('WEIGHT_OR_WEIGHT_RANGE'): <class 'str'>, Optional('HEIGHT_OR_HEIGHT_RANGE'): <class 'str'>, Optional('GENDER'): <class 'str'>, Optional('HUMAN_RACE'): <class 'str'>, Optional('HUMAN_ETHNICITY'): <class 'str'>, Optional('HUMAN_TRIAL_TYPE'): <class 'str'>, Optional('HUMAN_LIFESTYLE_FACTORS'): <class 'str'>, Optional('HUMAN_MEDICATIONS'): <class 'str'>, Optional('HUMAN_PRESCRIPTION_OTC'): <class 'str'>, Optional('HUMAN_SMOKING_STATUS'): <class 'str'>, Optional('HUMAN_ALCOHOL_DRUG_USE'): <class 'str'>, Optional('HUMAN_NUTRITION'): <class 'str'>, Optional('HUMAN_INCLUSION_CRITERIA'): <class 'str'>, 
Optional('HUMAN_EXCLUSION_CRITERIA'): <class 'str'>, Optional('ANIMAL_ANIMAL_SUPPLIER'): <class 'str'>, Optional('ANIMAL_HOUSING'): <class 'str'>, Optional('ANIMAL_LIGHT_CYCLE'): <class 'str'>, Optional('ANIMAL_FEED'): <class 'str'>, Optional('ANIMAL_WATER'): <class 'str'>, Optional('ANIMAL_INCLUSION_CRITERIA'): <class 'str'>, Optional('CELL_BIOSOURCE_OR_SUPPLIER'): <class 'str'>, Optional('CELL_STRAIN_DETAILS'): <class 'str'>, Optional('SUBJECT_COMMENTS'): <class 'str'>, Optional('CELL_PRIMARY_IMMORTALIZED'): <class 'str'>, Optional('CELL_PASSAGE_NUMBER'): <class 'str'>, Optional('CELL_COUNTS'): <class 'str'>, Optional('SPECIES_GROUP'): <class 'str'>}), 'SUBJECT_SAMPLE_FACTORS': Schema([{'Subject ID': <class 'str'>, 'Sample ID': <class 'str'>, 'Factors': <class 'dict'>, Optional('Additional sample data'): {Optional('RAW_FILE_NAME'): <class 'str'>, Optional(<class 'str'>): <class 'str'>}}]), 'TREATMENT': Schema({'TREATMENT_SUMMARY': <class 'str'>, Optional('TREATMENT_PROTOCOL_ID'): <class 'str'>, Optional('TREATMENT_PROTOCOL_FILENAME'): <class 'str'>, Optional('TREATMENT_PROTOCOL_COMMENTS'): <class 'str'>, Optional('TREATMENT'): <class 'str'>, Optional('TREATMENT_COMPOUND'): <class 'str'>, Optional('TREATMENT_ROUTE'): <class 'str'>, Optional('TREATMENT_DOSE'): <class 'str'>, Optional('TREATMENT_DOSEVOLUME'): <class 'str'>, Optional('TREATMENT_DOSEDURATION'): <class 'str'>, Optional('TREATMENT_VEHICLE'): <class 'str'>, Optional('ANIMAL_VET_TREATMENTS'): <class 'str'>, Optional('ANIMAL_ANESTHESIA'): <class 'str'>, Optional('ANIMAL_ACCLIMATION_DURATION'): <class 'str'>, Optional('ANIMAL_FASTING'): <class 'str'>, Optional('ANIMAL_ENDP_EUTHANASIA'): <class 'str'>, Optional('ANIMAL_ENDP_TISSUE_COLL_LIST'): <class 'str'>, Optional('ANIMAL_ENDP_TISSUE_PROC_METHOD'): <class 'str'>, Optional('ANIMAL_ENDP_CLINICAL_SIGNS'): <class 'str'>, Optional('HUMAN_FASTING'): <class 'str'>, Optional('HUMAN_ENDP_CLINICAL_SIGNS'): <class 'str'>, Optional('CELL_STORAGE'): <class 'str'>, 
Optional('CELL_GROWTH_CONTAINER'): <class 'str'>, Optional('CELL_GROWTH_CONFIG'): <class 'str'>, Optional('CELL_GROWTH_RATE'): <class 'str'>, Optional('CELL_INOC_PROC'): <class 'str'>, Optional('CELL_MEDIA'): <class 'str'>, Optional('CELL_ENVIR_COND'): <class 'str'>, Optional('CELL_HARVESTING'): <class 'str'>, Optional('PLANT_GROWTH_SUPPORT'): <class 'str'>, Optional('PLANT_GROWTH_LOCATION'): <class 'str'>, Optional('PLANT_PLOT_DESIGN'): <class 'str'>, Optional('PLANT_LIGHT_PERIOD'): <class 'str'>, Optional('PLANT_HUMIDITY'): <class 'str'>, Optional('PLANT_TEMP'): <class 'str'>, Optional('PLANT_WATERING_REGIME'): <class 'str'>, Optional('PLANT_NUTRITIONAL_REGIME'): <class 'str'>, Optional('PLANT_ESTAB_DATE'): <class 'str'>, Optional('PLANT_HARVEST_DATE'): <class 'str'>, Optional('PLANT_GROWTH_STAGE'): <class 'str'>, Optional('PLANT_METAB_QUENCH_METHOD'): <class 'str'>, Optional('PLANT_HARVEST_METHOD'): <class 'str'>, Optional('PLANT_STORAGE'): <class 'str'>, Optional('CELL_PCT_CONFLUENCE'): <class 'str'>, Optional('CELL_MEDIA_LASTCHANGED'): <class 'str'>})}, verbose=False, metabolites=True)[source]¶ Validate
mwTab formatted file.
Returns: Validated file.
mwtab.mwrest
This module provides routines for accessing the Metabolomics Workbench REST API.
See https://www.metabolomicsworkbench.org/tools/MWRestAPIv1.0.pdf for details.

mwtab.mwrest.analysis_ids(base_url='https://www.metabolomicsworkbench.org/rest/')
Method for retrieving a list of analysis ids for every current analysis in Metabolomics Workbench.
Parameters: base_url (str) – Base url to Metabolomics Workbench REST API.
Returns: List of every available Metabolomics Workbench analysis identifier.
Return type: list

mwtab.mwrest.study_ids(base_url='https://www.metabolomicsworkbench.org/rest/')
Method for retrieving a list of study ids for every current study in Metabolomics Workbench.
Parameters: base_url (str) – Base url to Metabolomics Workbench REST API.
Returns: List of every available Metabolomics Workbench study identifier.
Return type: list

mwtab.mwrest.generate_mwtab_urls(input_items, base_url='https://www.metabolomicsworkbench.org/rest/', output_format='txt')
Method for generating URLs to be used to retrieve mwtab files for analyses and studies through the REST API of the Metabolomics Workbench database.
Returns: Metabolomics Workbench REST URL string(s).

mwtab.mwrest.generate_urls(input_items, base_url='https://www.metabolomicsworkbench.org/rest/', **kwds)
Method for creating a generator which yields validated Metabolomics Workbench REST urls.
Returns: Metabolomics Workbench REST URL string(s).
class mwtab.mwrest.GenericMWURL(rest_params, base_url='https://www.metabolomicsworkbench.org/rest/')
GenericMWURL class that stores and validates parameters specifying a Metabolomics Workbench REST URL.
Metabolomics REST API requests are performed using URL requests in the form of
https://www.metabolomicsworkbench.org/rest/context/input_specification/output_specification
where:
- if context = “study” | “compound” | “refmet” | “gene” | “protein”
  - input_specification = input_item/input_value
  - output_specification = output_item/[output_format]
- elif context = “moverz”
  - input_specification = input_item/input_value1/input_value2/input_value3
    - input_item = “LIPIDS” | “MB” | “REFMET”
    - input_value1 = m/z_value
    - input_value2 = ion_type_value
    - input_value3 = m/z_tolerance_value
  - output_specification = output_format
    - output_format = “txt”
- elif context = “exactmass”
  - input_specification = input_item/input_value1/input_value2
    - input_item = “LIPIDS” | “MB” | “REFMET”
    - input_value1 = LIPID_abbreviation
    - input_value2 = ion_type_value
  - output_specification = None
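The URL grammar above can be sketched as a small path builder. This is an illustration only, not the `GenericMWURL` implementation: `build_rest_url` and its arguments are hypothetical, the sketch covers only the standard contexts and `moverz`, and the example item names (`analysis_id`, `mwtab`) and numeric values are assumed sample inputs rather than values confirmed by this reference.

```python
BASE_URL = "https://www.metabolomicsworkbench.org/rest/"

STANDARD_CONTEXTS = {"study", "compound", "refmet", "gene", "protein"}
MOVERZ_INPUT_ITEMS = {"LIPIDS", "MB", "REFMET"}


def build_rest_url(context, input_item, input_values,
                   output_item=None, output_format=None, base_url=BASE_URL):
    """Assemble context/input_specification/output_specification into a URL."""
    parts = [context, input_item]
    if context in STANDARD_CONTEXTS:
        # input_item/input_value, then output_item/[output_format]
        if len(input_values) != 1 or output_item is None:
            raise ValueError("expected one input value and an output item")
        parts += [str(input_values[0]), output_item]
        if output_format:
            parts.append(output_format)
    elif context == "moverz":
        # input_item/m/z value/ion type/m/z tolerance, then the fixed "txt" format
        if input_item not in MOVERZ_INPUT_ITEMS or len(input_values) != 3:
            raise ValueError("moverz needs a known input item and three values")
        parts += [str(value) for value in input_values]
        parts.append("txt")
    else:
        raise ValueError("context not covered by this sketch: " + context)
    return base_url.rstrip("/") + "/" + "/".join(parts)


study_url = build_rest_url("study", "analysis_id", ["AN000001"], "mwtab", "txt")
moverz_url = build_rest_url("moverz", "MB", [635.52, "M+H", 0.5])
print(study_url)
print(moverz_url)
```

Validating the pieces before joining them keeps malformed requests from ever reaching the remote service, which is the role the class description assigns to `GenericMWURL`.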
class mwtab.mwrest.MWRESTFile(source)
MWRESTFile class that stores data from a single file download through Metabolomics Workbench’s REST API.
Mirrors MWTabFile.

  read(filehandle)
  Read data into a MWRESTFile instance.
  Parameters: filehandle (io.TextIOWrapper, gzip.GzipFile, bz2.BZ2File, zipfile.ZipFile) – file-like object.
  Returns: None
  Return type: None

  write(filehandle)
  Write MWRESTFile data into file.
  Parameters: filehandle (io.TextIOWrapper) – file-like object.
  Returns: None
  Return type: None
mwtab.mwextract
This module provides a number of functions and classes for extracting and saving data and metadata stored in mwTab formatted files in the form of MWTabFile.

class mwtab.mwextract.ItemMatcher(full_key, value_comparison)
ItemMatcher class that can be called to match items from mwTab formatted files in the form of MWTabFile.

class mwtab.mwextract.ReGeXMatcher(full_key, value_comparison)
ReGeXMatcher class that can be called to match items from mwTab formatted files in the form of MWTabFile using regular expressions.

mwtab.mwextract.generate_matchers(items)
Construct a generator that yields matchers (ItemMatcher or ReGeXMatcher).
Parameters: items (iterable) – Iterable object containing key value pairs to match.
Returns: Yields a Matcher object for each given item.
Return type: ItemMatcher or ReGeXMatcher

mwtab.mwextract.extract_metabolites(sources, matchers)
Extract metabolite data from mwTab formatted files in the form of MWTabFile.
Parameters:
- sources (generator) – Generator of mwtab file objects (MWTabFile).
- matchers (generator) – Generator of matcher objects (ItemMatcher or ReGeXMatcher).
Returns: Extracted metabolites dictionary.
Return type: dict
mwtab.mwextract.extract_metadata(mwtabfile, keys)
Extract metadata from mwTab formatted files in the form of MWTabFile.
Returns: Extracted metadata dictionary.

mwtab.mwextract.write_metadata_csv(to_path, extracted_values, no_header=False)
Write extracted metadata dict into csv file.
Example:
    "metadata","value1","value2"
    "SUBJECT_TYPE","Human","Plant"
Returns: None
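The CSV layout in the example can be reproduced with the standard `csv` module. This sketch is not the library's `write_metadata_csv`: `metadata_to_csv_text` is a hypothetical helper that returns a string instead of writing to `to_path`, it assumes the extracted values are sets of strings keyed by metadata name, and it assumes that `no_header=True` suppresses the header row, as the parameter name suggests.

```python
import csv
import io


def metadata_to_csv_text(extracted_values, no_header=False):
    """Render a {key: set_of_values} dict as fully quoted CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    if not no_header:
        # One "valueN" column per value in the longest row.
        width = max((len(values) for values in extracted_values.values()), default=0)
        writer.writerow(["metadata"] + ["value{}".format(i + 1) for i in range(width)])
    for key, values in extracted_values.items():
        # Sets are unordered, so sort for reproducible output (a choice of this sketch).
        writer.writerow([key] + sorted(values))
    return buffer.getvalue()


with_header = metadata_to_csv_text({"SUBJECT_TYPE": {"Human", "Plant"}})
without_header = metadata_to_csv_text({"SUBJECT_TYPE": {"Human", "Plant"}}, no_header=True)
print(with_header)
```

Quoting every field, as in the documented example, keeps metabolite or metadata values that contain commas from being split into extra columns.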
mwtab.mwextract.write_metabolites_csv(to_path, extracted_values, no_header=False)
Write extracted metabolites data dict into csv file.
Example:
    "metabolite_name","num-studies","num_analyses","num_samples"
    "1,2,4-benzenetriol","1","1","24"
    "1-monostearin","1","1","24"
    ...
Returns: None
class mwtab.mwextract.SetEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
SetEncoder class for encoding Python sets (set) into JSON serializable objects (list).

  default(obj)
  Method for encoding Python objects. If the object passed is a set, converts the set to a JSON serializable list; otherwise calls the base implementation.
  Parameters: obj (object) – Python object to be json encoded.
  Returns: JSON serializable object.
  Return type: dict, list, tuple, str, int, float, bool, or None
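The technique described here, overriding `default()` on a `json.JSONEncoder` subclass, can be sketched with the standard library alone. `SetToListEncoder` is a hypothetical name used for this illustration and is not the library's `SetEncoder`; sorting the set is an added assumption to make the output deterministic.

```python
import json


class SetToListEncoder(json.JSONEncoder):
    """Illustrative encoder: serialize sets as lists, defer everything else."""

    def default(self, obj):
        if isinstance(obj, set):
            # Sorted for deterministic output; an assumption of this sketch.
            return sorted(obj)
        # Let the base class raise TypeError for unsupported types.
        return super().default(obj)


extracted = {"SUBJECT_TYPE": {"Plant", "Human"}}
encoded = json.dumps(extracted, cls=SetToListEncoder)
print(encoded)
```

`json.dumps` only calls `default()` for objects it cannot serialize on its own, so dictionaries, lists, and strings pass through unchanged.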
mwtab.mwextract.write_json(to_path, extracted_dict)
Write extracted data or metadata dict into json file.
Metabolites example:
    {
        "1,2,4-benzenetriol": {
            "ST000001": {
                "AN000001": [
                    "LabF_115816",
                    ...
                ]
            }
        }
    }
Metadata example:
    {
        "SUBJECT_TYPE": [
            "Plant",
            "Human"
        ]
    }
Returns: None
mwtab.mwschema
This module provides schema definitions for different sections of the mwTab Metabolomics Workbench format.
The module exposes the following schema objects. Each is documented with the same description: “Entry point of the library, use this class to instantiate validation schema for the data that will be validated.”
- mwtab.mwschema.metabolomics_workbench_schema
- mwtab.mwschema.project_schema
- mwtab.mwschema.study_schema
- mwtab.mwschema.analysis_schema
- mwtab.mwschema.subject_schema
- mwtab.mwschema.subject_sample_factors_schema
- mwtab.mwschema.collection_schema
- mwtab.mwschema.treatment_schema
- mwtab.mwschema.sampleprep_schema
- mwtab.mwschema.chromatography_schema
- mwtab.mwschema.ms_schema
- mwtab.mwschema.nmr_schema
- mwtab.mwschema.ms_metabolite_data_schema
- mwtab.mwschema.nmr_binned_data_schema