The mwtab API Reference

Routines for working with mwTab format files used by the Metabolomics Workbench.

This package includes the following modules:

mwtab
This module provides the MWTabFile class, which is a Python dictionary representation of a Metabolomics Workbench mwTab file. Data can be accessed directly from the MWTabFile instance using bracket accessors.
cli
This module provides the command-line interface for the mwtab package.
tokenizer
This module provides the tokenizer() generator that generates tuples of key-value pairs from mwtab files.
fileio
This module provides the read_files() generator to open files from different sources (single file/multiple files on a local machine, directory/archive of files, URL address of a file).
converter
This module provides the Converter class that is responsible for the conversion of mwTab formatted files into their JSON representation and vice versa.
mwschema
This module provides JSON schema definitions for the mwTab formatted files, i.e. specifies required and optional keys as well as data types.
validator
This module provides routines to validate mwTab formatted files based on schema definitions as well as checks for file self-consistency.

mwtab.mwtab

This module provides the MWTabFile class that stores the data from a single mwTab formatted file in the form of an OrderedDict. Data can be accessed directly from the MWTabFile instance using bracket accessors.

The data is divided into a series of “sections”, each of which contains a number of “key-value”-like pairs. The file also contains a specially formatted SUBJECT_SAMPLE_FACTORS block and blocks of data between *_START and *_END.
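As a rough illustration of this layout (a plain-Python sketch, not the library's own code), the nested structure can be pictured as an OrderedDict of sections, each holding key-value pairs. The section and key names are taken from the schema mapping shown under mwtab.validator; the values are made-up placeholders.

```python
from collections import OrderedDict

# Hypothetical, hand-built stand-in for the data an MWTabFile holds.
# Values are placeholders, not taken from a real study.
mwfile_like = OrderedDict([
    ("METABOLOMICS WORKBENCH", OrderedDict([
        ("VERSION", "1"),
        ("CREATED_ON", "2016-09-17"),
    ])),
    ("PROJECT", OrderedDict([
        ("PROJECT_TITLE", "Example project"),
        ("INSTITUTE", "Example institute"),
    ])),
])

# Bracket accessors: first the section, then the key within it.
title = mwfile_like["PROJECT"]["PROJECT_TITLE"]

# Sections keep the order in which they were inserted.
section_names = list(mwfile_like.keys())

print(title)
print(section_names)
```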

class mwtab.mwtab.MWTabFile(source, *args, **kwds)[source]

MWTabFile class that stores data from a single mwTab formatted file in the form of collections.OrderedDict.

read(filehandle)[source]

Read data into a MWTabFile instance.

Parameters:filehandle (io.TextIOWrapper, gzip.GzipFile, bz2.BZ2File, zipfile.ZipFile) – file-like object.
Returns:None
Return type:None
write(filehandle, file_format)[source]

Write MWTabFile data into file.

Parameters:
  • filehandle (io.TextIOWrapper) – file-like object.
  • file_format (str) – Format to use to write data: mwtab or json.
Returns:

None

Return type:

None

writestr(file_format)[source]

Write MWTabFile data into string.

Parameters:file_format (str) – Format to use to write data: mwtab or json.
Returns:String representing the MWTabFile instance.
Return type:str
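To make the json string output more concrete, here is a minimal standard-library sketch of serializing an ordered mapping to a JSON string and reading it back with the key order preserved. It does not call mwtab; the indentation and the sample data are assumptions chosen for illustration, and the library's actual layout may differ.

```python
import json
from collections import OrderedDict

# Placeholder section data; not from a real mwTab file.
data = OrderedDict([
    ("PROJECT", OrderedDict([
        ("PROJECT_TITLE", "Example project"),
        ("INSTITUTE", "Example institute"),
    ])),
])

# Serialize to a JSON string (the indent value is an arbitrary choice here).
json_text = json.dumps(data, indent=4)

# Parse it back, asking the decoder to rebuild OrderedDict objects
# so that the key order of every section survives the round trip.
restored = json.loads(json_text, object_pairs_hook=OrderedDict)

print(json_text)
```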
print_file(f=sys.stdout, file_format='mwtab')[source]

Print MWTabFile into a file or stdout.

Parameters:
  • f (io.StringIO) – writable file-like stream to print to (a file or stdout).
  • file_format (str) – Format to use: mwtab or json.
  • tw (int) – Tab width.
Returns:

None

Return type:

None

print_block(section_key, f=sys.stdout, file_format='mwtab')[source]

Print mwtab section into a file or stdout.

Parameters:
  • section_key (str) – Section name.
  • f (io.StringIO) – writable file-like stream.
  • file_format (str) – Format to use: mwtab or json.
Returns:

None

Return type:

None

The mwtab command-line interface

Usage:
    mwtab -h | --help
    mwtab --version
    mwtab convert (<from-path> <to-path>) [--from-format=<format>] [--to-format=<format>] [--validate] [--mw-rest=<url>] [--verbose]
    mwtab validate <from-path> [--mw-rest=<url>] [--verbose]
Options:
-h, --help Show this screen.
--version Show version.
--verbose Print which files are being processed.
--validate Validate the mwTab file.
--from-format=<format>
 Input file format, available formats: mwtab, json [default: mwtab].
--to-format=<format>
 Output file format, available formats: mwtab, json [default: json].
--mw-rest=<url>
 URL to MW REST interface [default: http://www.metabolomicsworkbench.org/rest/study/analysis_id/{}/mwtab/txt].
mwtab.cli.cli(cmdargs)[source]
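The default --mw-rest value contains a {} placeholder in the position of the analysis id. How the package fills it in is not documented here; the sketch below simply assumes ordinary str.format substitution, and AN000001 is a hypothetical identifier used only for illustration.

```python
# Default template from the --mw-rest option; "{}" marks the analysis id slot.
MW_REST = "http://www.metabolomicsworkbench.org/rest/study/analysis_id/{}/mwtab/txt"

# Hypothetical analysis id, used only to show the substitution.
analysis_id = "AN000001"

# Assumption: the placeholder is filled with plain str.format.
url = MW_REST.format(analysis_id)
print(url)
```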

mwtab.tokenizer

This module provides the tokenizer() lexical analyzer for mwTab format syntax. It is implemented as a Python generator-based state machine that generates (yields) tokens one at a time when next() is invoked on a tokenizer() instance.

Each token is a tuple of “key-value”-like pairs, tuple of SUBJECT_SAMPLE_FACTORS or tuple of data deposited between *_START and *_END blocks.

mwtab.tokenizer.tokenizer(text)[source]

A lexical analyzer for the mwtab formatted files.

Parameters:text (str) – mwtab formatted text.
Returns:Tuples of data.
Return type:collections.namedtuple
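The generator-based approach can be sketched with a toy tokenizer. This is not the library's implementation: it assumes a simplified syntax in which section headers start with # and key-value lines are tab-separated, it ignores SUBJECT_SAMPLE_FACTORS and *_START/*_END blocks entirely, and the KeyValue token type and toy_tokenizer name are hypothetical.

```python
from collections import namedtuple

# Hypothetical token type; the real tokenizer's field names may differ.
KeyValue = namedtuple("KeyValue", ["key", "value"])


def toy_tokenizer(text):
    """Yield one KeyValue token per non-empty line of simplified input."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Section header, e.g. "#PROJECT".
            yield KeyValue("HEADER", line[1:])
        else:
            # Key and value are separated by the first tab character.
            key, _, value = line.partition("\t")
            yield KeyValue(key.strip(), value.strip())


sample = "#PROJECT\nPROJECT_TITLE\tExample project\nINSTITUTE\tExample institute\n"

tokens = toy_tokenizer(sample)
first = next(tokens)      # tokens are produced lazily, one per next() call
remaining = list(tokens)  # drain whatever is left
```

Because the function is a generator, nothing is parsed until the caller asks for the next token, which keeps memory use flat for large inputs.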

mwtab.fileio

This module provides routines for reading mwTab formatted files from different kinds of sources:

  • Single mwTab formatted file on a local machine.
  • Directory containing multiple mwTab formatted files.
  • Compressed zip/tar archive of mwTab formatted files.
  • URL address of mwTab formatted file.
  • ANALYSIS_ID of mwTab formatted file.
mwtab.fileio.read_files(*sources, **kwds)[source]

Construct a generator that yields file instances.

Parameters:sources – One or more strings representing path to file(s).
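The source kinds listed above suggest some form of dispatch on the source string. The sketch below shows one plausible way to tell them apart using only the standard library. It is an assumption-laden illustration, not the logic of read_files(): the classify_source name, the order of the checks, and the AN-plus-six-digits pattern for an ANALYSIS_ID are all hypothetical.

```python
import os
import re
from urllib.parse import urlparse


def classify_source(source):
    """Return a label for the kind of source a string appears to be."""
    # Anything with a network scheme is treated as a URL.
    if urlparse(source).scheme in ("http", "https", "ftp"):
        return "url"
    # Assumed identifier pattern, e.g. "AN000001".
    if re.fullmatch(r"AN\d{6}", source):
        return "analysis_id"
    # Existing directory on the local machine.
    if os.path.isdir(source):
        return "directory"
    # Multi-file archives, recognized by extension only.
    if source.lower().endswith((".zip", ".tar", ".tar.gz", ".tar.bz2")):
        return "archive"
    # Everything else is treated as a single (possibly compressed) file.
    return "file"


print(classify_source("https://example.org/ST000001_AN000001.txt"))
print(classify_source("AN000001"))
```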

mwtab.converter

This module provides functionality for converting between the Metabolomics Workbench mwTab formatted file and its equivalent JSONized representation.

The following conversions are possible:

Local files:
  • One-to-one file conversions:
    • textfile - to - textfile
    • textfile - to - textfile.gz
    • textfile - to - textfile.bz2
    • textfile.gz - to - textfile
    • textfile.gz - to - textfile.gz
    • textfile.gz - to - textfile.bz2
    • textfile.bz2 - to - textfile
    • textfile.bz2 - to - textfile.gz
    • textfile.bz2 - to - textfile.bz2
    • textfile / textfile.gz / textfile.bz2 - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
  • Many-to-many files conversions:
    • Directories:
      • directory - to - directory
      • directory - to - directory.zip
      • directory - to - directory.tar
      • directory - to - directory.tar.bz2
      • directory - to - directory.tar.gz
      • directory - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
    • Zipfiles:
      • zipfile.zip - to - directory
      • zipfile.zip - to - zipfile.zip
      • zipfile.zip - to - tarfile.tar
      • zipfile.zip - to - tarfile.tar.gz
      • zipfile.zip - to - tarfile.tar.bz2
      • zipfile.zip - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
    • Tarfiles:
      • tarfile.tar - to - directory
      • tarfile.tar - to - zipfile.zip
      • tarfile.tar - to - tarfile.tar
      • tarfile.tar - to - tarfile.tar.gz
      • tarfile.tar - to - tarfile.tar.bz2
      • tarfile.tar - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
      • tarfile.tar.gz - to - directory
      • tarfile.tar.gz - to - zipfile.zip
      • tarfile.tar.gz - to - tarfile.tar
      • tarfile.tar.gz - to - tarfile.tar.gz
      • tarfile.tar.gz - to - tarfile.tar.bz2
      • tarfile.tar.gz - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
      • tarfile.tar.bz2 - to - directory
      • tarfile.tar.bz2 - to - zipfile.zip
      • tarfile.tar.bz2 - to - tarfile.tar
      • tarfile.tar.bz2 - to - tarfile.tar.gz
      • tarfile.tar.bz2 - to - tarfile.tar.bz2
      • tarfile.tar.bz2 - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
URL files:
  • One-to-one file conversions:
    • analysis_id - to - textfile
    • analysis_id - to - textfile.gz
    • analysis_id - to - textfile.bz2
    • analysis_id - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
    • textfileurl - to - textfile
    • textfileurl - to - textfile.gz
    • textfileurl - to - textfile.bz2
    • textfileurl.gz - to - textfile
    • textfileurl.gz - to - textfile.gz
    • textfileurl.gz - to - textfile.bz2
    • textfileurl.bz2 - to - textfile
    • textfileurl.bz2 - to - textfile.gz
    • textfileurl.bz2 - to - textfile.bz2
    • textfileurl / textfileurl.gz / textfileurl.bz2 - to - textfile.zip / textfile.tar / textfile.tar.gz / textfile.tar.bz2 (TypeError: One-to-many conversion)
  • Many-to-many files conversions:
    • Zipfiles:
      • zipfileurl.zip - to - directory
      • zipfileurl.zip - to - zipfile.zip
      • zipfileurl.zip - to - tarfile.tar
      • zipfileurl.zip - to - tarfile.tar.gz
      • zipfileurl.zip - to - tarfile.tar.bz2
      • zipfileurl.zip - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
    • Tarfiles:
      • tarfileurl.tar - to - directory
      • tarfileurl.tar - to - zipfile.zip
      • tarfileurl.tar - to - tarfile.tar
      • tarfileurl.tar - to - tarfile.tar.gz
      • tarfileurl.tar - to - tarfile.tar.bz2
      • tarfileurl.tar - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
      • tarfileurl.tar.gz - to - directory
      • tarfileurl.tar.gz - to - zipfile.zip
      • tarfileurl.tar.gz - to - tarfile.tar
      • tarfileurl.tar.gz - to - tarfile.tar.gz
      • tarfileurl.tar.gz - to - tarfile.tar.bz2
      • tarfileurl.tar.gz - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
      • tarfileurl.tar.bz2 - to - directory
      • tarfileurl.tar.bz2 - to - zipfile.zip
      • tarfileurl.tar.bz2 - to - tarfile.tar
      • tarfileurl.tar.bz2 - to - tarfile.tar.gz
      • tarfileurl.tar.bz2 - to - tarfile.tar.bz2
      • tarfileurl.tar.bz2 - to - directory.gz / directory.bz2 (TypeError: Many-to-one conversion)
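The two TypeError cases in the list above follow a simple rule: a single file cannot be written to a multi-file archive, and a collection of files cannot be written to a single compressed file. A minimal sketch of that rule, keyed on the target extension, is shown below. The check_conversion function and its source_is_many flag are hypothetical and are not part of mwtab.converter.

```python
ARCHIVE_EXTS = (".zip", ".tar", ".tar.gz", ".tar.bz2")  # hold many files
SINGLE_COMPRESSED_EXTS = (".gz", ".bz2")                # hold one file


def check_conversion(source_is_many, to_path):
    """Raise TypeError for the combinations the list marks as invalid.

    source_is_many: True for a directory or archive, False for a single file.
    """
    target = to_path.lower()
    to_archive = target.endswith(ARCHIVE_EXTS)
    # ".tar.gz" also ends with ".gz", so archives are excluded first.
    to_single_compressed = (not to_archive) and target.endswith(SINGLE_COMPRESSED_EXTS)

    if not source_is_many and to_archive:
        raise TypeError("One-to-many conversion")
    if source_is_many and to_single_compressed:
        raise TypeError("Many-to-one conversion")
    return True


print(check_conversion(False, "study.json.gz"))  # single file to .gz is allowed
print(check_conversion(True, "studies.tar.gz"))  # directory to .tar.gz is allowed
```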
class mwtab.converter.Translator(from_path, to_path, from_format=None, to_format=None, validate=False)[source]

Translator abstract class.

class mwtab.converter.MWTabFileToMWTabFile(from_path, to_path, from_format=None, to_format=None, validate=False)[source]

Translator concrete class that can convert between mwTab and JSON formats.

class mwtab.converter.Converter(from_path, to_path, from_format='mwtab', to_format='json', validate=False)[source]

Converter class to convert mwTab files from mwTab to JSON or from JSON to mwTab format.

convert()[source]

Convert file(s) from mwTab format to JSON format or from JSON format to mwTab format.

Returns:None
Return type:None
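Based only on the signatures documented above, a conversion from mwTab to JSON might be wrapped as follows. The convert_mwtab_to_json helper is hypothetical, and the import is deferred into the function so that the sketch can be loaded even where the mwtab package is not installed.

```python
def convert_mwtab_to_json(from_path, to_path, validate=False):
    """Convert mwTab file(s) at from_path into JSON file(s) at to_path."""
    # Imported lazily so this sketch loads without mwtab installed.
    from mwtab.converter import Converter

    converter = Converter(
        from_path=from_path,
        to_path=to_path,
        from_format="mwtab",
        to_format="json",
        validate=validate,
    )
    # convert() is documented to return None; results are written to to_path.
    converter.convert()
```

Calling convert_mwtab_to_json("input.txt", "output.json") would then be expected to write the JSON representation to output.json; both paths are placeholders.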

mwtab.validator

This module contains routines to validate the consistency of mwTab formatted files, e.g. to make sure that Samples and Factors identifiers are consistent across the file and that all required key-value pairs are present.

mwtab.validator.validate_section(section, schema)[source]

Validate section of mwTab formatted file.

Parameters:
  • section – Section of MWTabFile.
  • schema – Schema definition.
Returns:

Validated section.

Return type:

collections.OrderedDict
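The idea of checking a section against required and optional keys can be sketched in plain Python. This stand-in does not use the schema objects that validate_section() expects; the check_section function is hypothetical, and the key names are copied from the METABOLOMICS WORKBENCH entry of the default section_schema_mapping shown under validate_file().

```python
# Key names taken from the METABOLOMICS WORKBENCH schema entry.
REQUIRED = ("VERSION", "CREATED_ON")
OPTIONAL = ("DATATRACK_ID", "HEADER", "STUDY_ID", "ANALYSIS_ID")


def check_section(section, required, optional):
    """Return the section unchanged if it satisfies the key rules."""
    missing = [key for key in required if key not in section]
    if missing:
        raise ValueError("Missing required keys: {}".format(missing))

    allowed = set(required) | set(optional)
    unknown = [key for key in section if key not in allowed]
    if unknown:
        raise ValueError("Unexpected keys: {}".format(unknown))

    # Every value in this section is expected to be a string.
    non_str = [key for key, value in section.items() if not isinstance(value, str)]
    if non_str:
        raise ValueError("Non-string values for keys: {}".format(non_str))

    return section


# Placeholder values, not from a real study.
header = {"VERSION": "1", "CREATED_ON": "2016-09-17", "STUDY_ID": "ST000001"}
validated = check_section(header, REQUIRED, OPTIONAL)
```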

mwtab.validator.validate_file(mwtabfile, section_schema_mapping={'ANALYSIS': Schema({Optional('PROCESSED_FILE'): <class 'str'>, Optional('OPERATOR_NAME'): <class 'str'>, Optional('ANALYSIS_COMMENTS'): <class 'str'>, Optional('SOFTWARE_VERSION'): <class 'str'>, Optional('ACQUISITION_TIME'): <class 'str'>, Optional('NUM_METABOLITES'): <class 'str'>, Optional('NUM_FACTORS'): <class 'str'>, Optional('RAW_FILE'): <class 'str'>, Optional('INSTRUMENT_PARAMETERS_FILE'): <class 'str'>, Optional('ANALYSIS_DISPLAY'): <class 'str'>, Optional('DATA_FORMAT'): <class 'str'>, Optional('DETECTOR_TYPE'): <class 'str'>, Optional('RANDOMIZATION_ORDER'): <class 'str'>, Optional('INSTRUMENT_NAME'): <class 'str'>, Optional('ACQUISITION_PARAMETERS_FILE'): <class 'str'>, Optional('LABORATORY_NAME'): <class 'str'>, Optional('ACQUISITION_DATE'): <class 'str'>, Optional('PROCESSING_PARAMETERS_FILE'): <class 'str'>, 'ANALYSIS_TYPE': <class 'str'>, Optional('ACQUISITION_ID'): <class 'str'>, Optional('ANALYSIS_PROTOCOL_FILE'): <class 'str'>}), 'CHROMATOGRAPHY': Schema({Optional('SAMPLE_LOOP_SIZE'): <class 'str'>, Optional('RETENTION_TIME'): <class 'str'>, Optional('FLOW_RATE'): <class 'str'>, Optional('WEAK_WASH_VOLUME'): <class 'str'>, Optional('SAMPLE_INJECTION'): <class 'str'>, Optional('INTERNAL_STANDARD'): <class 'str'>, 'CHROMATOGRAPHY_TYPE': <class 'str'>, Optional('CHROMATOGRAPHY_SUMMARY'): <class 'str'>, Optional('STRONG_WASH_VOLUME'): <class 'str'>, Optional('SOLVENT_A'): <class 'str'>, Optional('WASHING_BUFFER'): <class 'str'>, Optional('TIME_PROGRAM'): <class 'str'>, Optional('OVEN_TEMPERATURE'): <class 'str'>, Optional('CHROMATOGRAPHY_COMMENTS'): <class 'str'>, Optional('INTERNAL_STANDARD_MT'): <class 'str'>, Optional('METHODS_FILENAME'): <class 'str'>, Optional('MIGRATION_TIME'): <class 'str'>, Optional('TARGET_SAMPLE_TEMPERATURE'): <class 'str'>, Optional('STRONG_WASH_SOLVENT_NAME'): <class 'str'>, Optional('ANALYTICAL_TIME'): <class 'str'>, Optional('METHODS_ID'): <class 'str'>, 
Optional('COLUMN_PRESSURE'): <class 'str'>, 'COLUMN_NAME': <class 'str'>, Optional('SAMPLING_CONE'): <class 'str'>, Optional('SHEATH_LIQUID'): <class 'str'>, Optional('FLOW_GRADIENT'): <class 'str'>, Optional('SOLVENT_B'): <class 'str'>, Optional('WEAK_WASH_SOLVENT_NAME'): <class 'str'>, Optional('INJECTION_TEMPERATURE'): <class 'str'>, Optional('TRANSFERLINE_TEMPERATURE'): <class 'str'>, Optional('RUNNING_BUFFER'): <class 'str'>, Optional('SAMPLE_SYRINGE_SIZE'): <class 'str'>, 'INSTRUMENT_NAME': <class 'str'>, Optional('PRECONDITIONING'): <class 'str'>, Optional('RUNNING_VOLTAGE'): <class 'str'>, Optional('RANDOMIZATION_ORDER'): <class 'str'>, Optional('CAPILLARY_VOLTAGE'): <class 'str'>, Optional('COLUMN_TEMPERATURE'): <class 'str'>, Optional('RETENTION_INDEX'): <class 'str'>}), 'COLLECTION': Schema({Optional('COLLECTION_VIALS'): <class 'str'>, Optional('COLLECTION_PROTOCOL_COMMENTS'): <class 'str'>, Optional('TISSUE_CELL_IDENTIFICATION'): <class 'str'>, Optional('COLLECTION_PROTOCOL_FILENAME'): <class 'str'>, Optional('COLLECTION_LOCATION'): <class 'str'>, Optional('SAMPLE_TYPE'): <class 'str'>, 'COLLECTION_SUMMARY': <class 'str'>, Optional('COLLECTION_FREQUENCY'): <class 'str'>, Optional('STORAGE_VIALS'): <class 'str'>, Optional('COLLECTION_TUBE_TEMP'): <class 'str'>, Optional('COLLECTION_PROTOCOL_ID'): <class 'str'>, Optional('COLLECTION_DURATION'): <class 'str'>, Optional('COLLECTION_TIME'): <class 'str'>, Optional('STORAGE_CONDITIONS'): <class 'str'>, Optional('ADDITIVES'): <class 'str'>, Optional('BLOOD_SERUM_OR_PLASMA'): <class 'str'>, Optional('COLLECTION_METHOD'): <class 'str'>, Optional('VOLUMEORAMOUNT_COLLECTED'): <class 'str'>, Optional('TISSUE_CELL_QUANTITY_TAKEN'): <class 'str'>}), 'METABOLITES': Schema({'METABOLITES_START': {'Fields': <class 'list'>, 'DATA': [{'metabolite_name': <class 'str'>, Optional('moverz_quant'): <class 'str'>, Optional('other_id'): <class 'str'>, Optional('inchi_key'): <class 'str'>, Optional('ri'): <class 'str'>, 
Optional('pubchem_id'): <class 'str'>, Optional('kegg_id'): <class 'str'>, Optional('ri_type'): <class 'str'>, Optional('other_id_type'): <class 'str'>}]}}), 'METABOLOMICS WORKBENCH': Schema({'VERSION': <class 'str'>, Optional('DATATRACK_ID'): <class 'str'>, Optional('HEADER'): <class 'str'>, Optional('STUDY_ID'): <class 'str'>, 'CREATED_ON': <class 'str'>, Optional('ANALYSIS_ID'): <class 'str'>}), 'MS': Schema({Optional('SCANNING_RANGE'): <class 'str'>, Optional('COLLISION_ENERGY'): <class 'str'>, Optional('LASER'): <class 'str'>, Optional('SKIMMER_VOLTAGE'): <class 'str'>, Optional('INTERFACE_VOLTAGE'): <class 'str'>, Optional('SAMPLE_DRIPPING'): <class 'str'>, Optional('IONIZATION'): <class 'str'>, Optional('PRECURSOR_TYPE'): <class 'str'>, Optional('DRY_GAS_FLOW'): <class 'str'>, Optional('DRY_GAS_TEMP'): <class 'str'>, Optional('RESOLUTION_SETTING'): <class 'str'>, Optional('IONIZATION_POTENTIAL'): <class 'str'>, 'ION_MODE': <class 'str'>, Optional('MASS_ACCURACY'): <class 'str'>, Optional('TUBE_LENS_VOLTAGE'): <class 'str'>, Optional('REAGENT_GAS'): <class 'str'>, Optional('FRAGMENT_VOLTAGE'): <class 'str'>, Optional('ION_SOURCE_TEMPERATURE'): <class 'str'>, Optional('AUTOMATIC_GAIN_CONTROL'): <class 'str'>, 'MS_TYPE': <class 'str'>, 'INSTRUMENT_NAME': <class 'str'>, Optional('ACTIVATION_TIME'): <class 'str'>, Optional('IT_SIDE_OCTOPOLES_BIAS_VOLTAGE'): <class 'str'>, Optional('COLLISION_GAS'): <class 'str'>, Optional('FRAGMENTATION_METHOD'): <class 'str'>, Optional('DESOLVATION_TEMPERATURE'): <class 'str'>, Optional('ION_SPRAY_VOLTAGE'): <class 'str'>, Optional('PROBE_TIP'): <class 'str'>, Optional('SCAN_RANGE_MOVERZ'): <class 'str'>, Optional('CAPILLARY_TEMPERATURE'): <class 'str'>, Optional('SCANNING'): <class 'str'>, Optional('ACTIVATION_PARAMETER'): <class 'str'>, Optional('SOURCE_TEMPERATURE'): <class 'str'>, Optional('CDL_SIDE_OCTOPOLES_BIAS_VOLTAGE'): <class 'str'>, Optional('IONIZATION_ENERGY'): <class 'str'>, Optional('SCANNING_CYCLE'): <class 
'str'>, Optional('MS_RESULTS_FILE'): Or(<class 'str'>, <class 'dict'>), Optional('SPRAY_VOLTAGE'): <class 'str'>, Optional('CDL_TEMPERATURE'): <class 'str'>, Optional('DATAFORMAT'): <class 'str'>, Optional('GAS_PRESSURE'): <class 'str'>, Optional('ATOM_GUN_CURRENT'): <class 'str'>, Optional('MS_COMMENTS'): <class 'str'>, Optional('HELIUM_FLOW'): <class 'str'>, Optional('MATRIX'): <class 'str'>, Optional('DESOLVATION_GAS_FLOW'): <class 'str'>, Optional('BOMBARDMENT'): <class 'str'>, Optional('CAPILLARY_VOLTAGE'): <class 'str'>, Optional('OCTPOLE_VOLTAGE'): <class 'str'>, 'INSTRUMENT_TYPE': <class 'str'>, Optional('NEBULIZER'): <class 'str'>}), 'MS_METABOLITE_DATA': Schema({'MS_METABOLITE_DATA:UNITS': <class 'str'>, 'MS_METABOLITE_DATA_START': {'Factors': <class 'list'>, 'DATA': <class 'list'>, 'Samples': <class 'list'>}}), 'NMR': Schema({Optional('NUM_DATA_POINTS_ACQUIRED'): <class 'str'>, Optional('BINNED_DATA_EXCLUDED_RANGE'): <class 'str'>, Optional('ZERO_FILLING'): <class 'str'>, Optional('ACQUISITION_TIME'): <class 'str'>, Optional('PULSE_SEQUENCE'): <class 'str'>, Optional('LINE_BROADENING'): <class 'str'>, Optional('BASELINE_CORRECTION_METHOD'): <class 'str'>, Optional('DUMMY_SCANS'): <class 'str'>, Optional('RELAXATION_DELAY'): <class 'str'>, Optional('BINNED_DATA_PROTOCOL_FILE'): <class 'str'>, Optional('NMR_TUBE_SIZE'): <class 'str'>, Optional('NMR_SOLVENT'): <class 'str'>, 'NMR_EXPERIMENT_TYPE': <class 'str'>, Optional('PULSE_WIDTH'): <class 'str'>, Optional('REAL_DATA_POINTS'): <class 'str'>, Optional('APODIZATION'): <class 'str'>, Optional('BINNED_INCREMENT'): <class 'str'>, Optional('SHIMMING_METHOD'): <class 'str'>, Optional('RECEIVER_GAIN'): <class 'str'>, Optional('CHEMICAL_SHIFT_REF_CPD'): <class 'str'>, Optional('BINNED_DATA_CHEMICAL_SHIFT_RANGE'): <class 'str'>, Optional('NMR_PROBE'): <class 'str'>, Optional('TEMPERATURE'): <class 'str'>, Optional('OFFSET_FREQUENCY'): <class 'str'>, Optional('WATER_SUPPRESSION'): <class 'str'>, 
Optional('BINNED_DATA_NORMALIZATION_METHOD'): <class 'str'>, Optional('SPECTRAL_WIDTH'): <class 'str'>, Optional('PRESATURATION_POWER_LEVEL'): <class 'str'>, Optional('CHEMICAL_SHIFT_REF_STD'): <class 'str'>, 'SPECTROMETER_FREQUENCY': <class 'str'>, 'INSTRUMENT_NAME': <class 'str'>, Optional('NUMBER_OF_SCANS'): <class 'str'>, Optional('FIELD_FREQUENCY_LOCK'): <class 'str'>, Optional('NMR_COMMENTS'): <class 'str'>, Optional('STANDARD_CONCENTRATION'): <class 'str'>, 'INSTRUMENT_TYPE': <class 'str'>, Optional('POWER_LEVEL'): <class 'str'>}), 'NMR_BINNED_DATA': Schema({'NMR_BINNED_DATA_START': {'Fields': <class 'list'>, 'DATA': <class 'list'>}}), 'PROJECT': Schema({'FIRST_NAME': <class 'str'>, Optional('FUNDING_SOURCE'): <class 'str'>, 'PHONE': <class 'str'>, 'LAST_NAME': <class 'str'>, 'EMAIL': <class 'str'>, 'ADDRESS': <class 'str'>, 'PROJECT_SUMMARY': <class 'str'>, Optional('PUBLICATIONS'): <class 'str'>, Optional('DEPARTMENT'): <class 'str'>, Optional('DOI'): <class 'str'>, 'PROJECT_TITLE': <class 'str'>, Optional('CONTRIBUTORS'): <class 'str'>, Optional('LABORATORY'): <class 'str'>, Optional('PROJECT_COMMENTS'): <class 'str'>, Optional('PROJECT_TYPE'): <class 'str'>, 'INSTITUTE': <class 'str'>}), 'SAMPLEPREP': Schema({Optional('ORGAN'): <class 'str'>, Optional('PROCESSING_STORAGE_CONDITIONS'): <class 'str'>, Optional('EXTRACT_STORAGE'): <class 'str'>, Optional('EXTRACTION_METHOD'): <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_ID'): <class 'str'>, 'SAMPLEPREP_SUMMARY': <class 'str'>, Optional('EXTRACT_ENRICHMENT'): <class 'str'>, Optional('EXTRACT_CONCENTRATION_DILUTION'): <class 'str'>, Optional('CELL_TYPE'): <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_COMMENTS'): <class 'str'>, Optional('PROCESSING_METHOD'): <class 'str'>, Optional('SAMPLE_RESUSPENSION'): <class 'str'>, Optional('SUBCELLULAR_LOCATION'): <class 'str'>, Optional('SAMPLE_SPIKING'): <class 'str'>, Optional('SAMPLEPREP_PROTOCOL_FILENAME'): <class 'str'>, Optional('ORGAN_SPECIFICATION'): <class 
'str'>, Optional('EXTRACT_CLEANUP'): <class 'str'>, Optional('SAMPLE_DERIVATIZATION'): <class 'str'>}), 'STUDY': Schema({Optional('NUM_MALES'): <class 'str'>, 'FIRST_NAME': <class 'str'>, Optional('TOTAL_SUBJECTS'): <class 'str'>, 'STUDY_SUMMARY': <class 'str'>, 'PHONE': <class 'str'>, 'LAST_NAME': <class 'str'>, Optional('STUDY_COMMENTS'): <class 'str'>, 'ADDRESS': <class 'str'>, Optional('NUM_FEMALES'): <class 'str'>, Optional('PUBLICATIONS'): <class 'str'>, Optional('DEPARTMENT'): <class 'str'>, 'STUDY_TITLE': <class 'str'>, 'EMAIL': <class 'str'>, Optional('LABORATORY'): <class 'str'>, Optional('STUDY_TYPE'): <class 'str'>, Optional('NUM_GROUPS'): <class 'str'>, Optional('SUBMIT_DATE'): <class 'str'>, 'INSTITUTE': <class 'str'>}), 'SUBJECT': Schema({'SUBJECT_TYPE': <class 'str'>, Optional('HUMAN_TRIAL_TYPE'): <class 'str'>, Optional('CELL_STRAIN_DETAILS'): <class 'str'>, Optional('HUMAN_RACE'): <class 'str'>, Optional('AGE_OR_AGE_RANGE'): <class 'str'>, Optional('SUBJECT_COMMENTS'): <class 'str'>, Optional('ANIMAL_FEED'): <class 'str'>, Optional('HUMAN_NUTRITION'): <class 'str'>, Optional('HUMAN_MEDICATIONS'): <class 'str'>, Optional('ANIMAL_WATER'): <class 'str'>, 'SUBJECT_SPECIES': <class 'str'>, Optional('HUMAN_ALCOHOL_DRUG_USE'): <class 'str'>, Optional('GENDER'): <class 'str'>, Optional('ANIMAL_HOUSING'): <class 'str'>, Optional('HUMAN_EXCLUSION_CRITERIA'): <class 'str'>, Optional('CELL_PASSAGE_NUMBER'): <class 'str'>, Optional('SPECIES_GROUP'): <class 'str'>, Optional('HUMAN_ETHNICITY'): <class 'str'>, Optional('TAXONOMY_ID'): <class 'str'>, Optional('HEIGHT_OR_HEIGHT_RANGE'): <class 'str'>, Optional('HUMAN_INCLUSION_CRITERIA'): <class 'str'>, Optional('HUMAN_LIFESTYLE_FACTORS'): <class 'str'>, Optional('CELL_BIOSOURCE_OR_SUPPLIER'): <class 'str'>, Optional('ANIMAL_ANIMAL_SUPPLIER'): <class 'str'>, Optional('WEIGHT_OR_WEIGHT_RANGE'): <class 'str'>, Optional('ANIMAL_LIGHT_CYCLE'): <class 'str'>, Optional('GENOTYPE_STRAIN'): <class 'str'>, 
Optional('HUMAN_PRESCRIPTION_OTC'): <class 'str'>, Optional('CELL_PRIMARY_IMMORTALIZED'): <class 'str'>, Optional('CELL_COUNTS'): <class 'str'>, Optional('HUMAN_SMOKING_STATUS'): <class 'str'>, Optional('ANIMAL_INCLUSION_CRITERIA'): <class 'str'>}), 'SUBJECT_SAMPLE_FACTORS': Schema({'SUBJECT_SAMPLE_FACTORS': [{'subject_type': <class 'str'>, 'factors': <class 'str'>, 'additional_sample_data': <class 'str'>, 'local_sample_id': <class 'str'>}]}), 'TREATMENT': Schema({Optional('PLANT_ESTAB_DATE'): <class 'str'>, Optional('TREATMENT_PROTOCOL_COMMENTS'): <class 'str'>, Optional('ANIMAL_ENDP_TISSUE_PROC_METHOD'): <class 'str'>, Optional('ANIMAL_ENDP_CLINICAL_SIGNS'): <class 'str'>, Optional('TREATMENT_COMPOUND'): <class 'str'>, Optional('PLANT_TEMP'): <class 'str'>, Optional('ANIMAL_ENDP_TISSUE_COLL_LIST'): <class 'str'>, Optional('HUMAN_ENDP_CLINICAL_SIGNS'): <class 'str'>, Optional('PLANT_GROWTH_LOCATION'): <class 'str'>, Optional('CELL_GROWTH_CONFIG'): <class 'str'>, Optional('PLANT_WATERING_REGIME'): <class 'str'>, Optional('PLANT_HARVEST_METHOD'): <class 'str'>, Optional('TREATMENT_ROUTE'): <class 'str'>, Optional('TREATMENT_PROTOCOL_FILENAME'): <class 'str'>, Optional('CELL_INOC_PROC'): <class 'str'>, Optional('TREATMENT'): <class 'str'>, Optional('CELL_HARVESTING'): <class 'str'>, Optional('ANIMAL_ANESTHESIA'): <class 'str'>, Optional('ANIMAL_FASTING'): <class 'str'>, Optional('TREATMENT_DOSEVOLUME'): <class 'str'>, Optional('CELL_PCT_CONFLUENCE'): <class 'str'>, Optional('CELL_GROWTH_RATE'): <class 'str'>, Optional('CELL_STORAGE'): <class 'str'>, Optional('CELL_MEDIA_LASTCHANGED'): <class 'str'>, Optional('TREATMENT_DOSE'): <class 'str'>, Optional('PLANT_METAB_QUENCH_METHOD'): <class 'str'>, Optional('PLANT_PLOT_DESIGN'): <class 'str'>, Optional('TREATMENT_PROTOCOL_ID'): <class 'str'>, Optional('PLANT_LIGHT_PERIOD'): <class 'str'>, Optional('PLANT_GROWTH_SUPPORT'): <class 'str'>, Optional('CELL_GROWTH_CONTAINER'): <class 'str'>, 'TREATMENT_SUMMARY': <class 'str'>, 
Optional('CELL_ENVIR_COND'): <class 'str'>, Optional('TREATMENT_VEHICLE'): <class 'str'>, Optional('TREATMENT_DOSEDURATION'): <class 'str'>, Optional('PLANT_NUTRITIONAL_REGIME'): <class 'str'>, Optional('PLANT_HUMIDITY'): <class 'str'>, Optional('ANIMAL_VET_TREATMENTS'): <class 'str'>, Optional('ANIMAL_ACCLIMATION_DURATION'): <class 'str'>, Optional('PLANT_GROWTH_STAGE'): <class 'str'>, Optional('HUMAN_FASTING'): <class 'str'>, Optional('PLANT_HARVEST_DATE'): <class 'str'>, Optional('CELL_MEDIA'): <class 'str'>, Optional('PLANT_STORAGE'): <class 'str'>, Optional('ANIMAL_ENDP_EUTHANASIA'): <class 'str'>})}, validate_samples=True, validate_factors=True)[source]

Validate entire mwTab formatted file one section at a time.

Parameters:
  • mwtabfile (MWTabFile) – Instance of MWTabFile.
  • section_schema_mapping (dict) – Dictionary that provides mapping between section name and schema definition.
  • validate_samples (bool) – Make sure that sample ids are consistent across the file.
  • validate_factors (bool) – Make sure that factors are consistent across the file.
Returns:

Validated file.

Return type:

collections.OrderedDict
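A sample-consistency check of the kind enabled by validate_samples could look roughly like the sketch below. Which blocks the real validator compares and how it reports problems are not stated here, so this is an assumption: the undeclared_samples function is hypothetical, and it compares the local_sample_id entries of SUBJECT_SAMPLE_FACTORS with the Samples list of MS_METABOLITE_DATA_START, using the key names from the default schema mapping.

```python
def undeclared_samples(mwfile_like):
    """Return sample ids used in the data block but never declared."""
    declared = {
        entry["local_sample_id"]
        for entry in mwfile_like["SUBJECT_SAMPLE_FACTORS"]["SUBJECT_SAMPLE_FACTORS"]
    }
    used = set(
        mwfile_like["MS_METABOLITE_DATA"]["MS_METABOLITE_DATA_START"]["Samples"]
    )
    return sorted(used - declared)


# Placeholder data; sample ids are invented for illustration.
example = {
    "SUBJECT_SAMPLE_FACTORS": {
        "SUBJECT_SAMPLE_FACTORS": [
            {"local_sample_id": "S1", "factors": "Treatment:control"},
            {"local_sample_id": "S2", "factors": "Treatment:treated"},
        ]
    },
    "MS_METABOLITE_DATA": {
        "MS_METABOLITE_DATA_START": {
            "Samples": ["S1", "S2", "S3"],
            "Factors": [],
            "DATA": [],
        }
    },
}

problems = undeclared_samples(example)
print(problems)
```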

mwtab.mwschema

This module provides schema definitions for different sections of the mwTab Metabolomics Workbench format.

mwtab.mwschema.metabolomics_workbench_schema

Schema definition for the METABOLOMICS WORKBENCH section.

mwtab.mwschema.project_schema

Schema definition for the PROJECT section.

mwtab.mwschema.study_schema

Schema definition for the STUDY section.

mwtab.mwschema.analysis_schema

Schema definition for the ANALYSIS section.

mwtab.mwschema.subject_schema

Schema definition for the SUBJECT section.

mwtab.mwschema.subject_sample_factors_schema

Schema definition for the SUBJECT_SAMPLE_FACTORS section.

mwtab.mwschema.collection_schema

Schema definition for the COLLECTION section.

mwtab.mwschema.treatment_schema

Schema definition for the TREATMENT section.

mwtab.mwschema.sampleprep_schema

Schema definition for the SAMPLEPREP section.

mwtab.mwschema.chromatography_schema

Schema definition for the CHROMATOGRAPHY section.

mwtab.mwschema.ms_schema

Schema definition for the MS section.

mwtab.mwschema.nmr_schema

Schema definition for the NMR section.

mwtab.mwschema.metabolites_schema

Schema definition for the METABOLITES section.

mwtab.mwschema.ms_metabolite_data_schema

Schema definition for the MS_METABOLITE_DATA section.

mwtab.mwschema.nmr_binned_data_schema

Schema definition for the NMR_BINNED_DATA section.