Quick Start Tutorial
Basic Workflow
Pioneer performs three major steps:
- Convert vendor MS files into the Arrow format using PioneerConverter.
- Build in silico spectral libraries using FASTA files and the Koina server.
- Search DIA experiments using a spectral library and the MS data files.
Pioneer Converter
Pioneer operates on MS/MS data stored in the Apache Arrow IPC format. Use the bundled PioneerConverter via the CLI to convert Thermo RAW files:
pioneer convert-raw /path/to/raw/or/folder
This subcommand accepts either a single .raw file or a directory of files. See the PioneerConverter repository for additional options such as thread count and output paths.
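Before converting a whole directory, it can help to confirm which files will be picked up. The following is a minimal sketch, not part of Pioneer: list_raw_files and convert_if_present are hypothetical helper names, and the sketch assumes that only .raw files directly inside the directory matter (whether convert-raw descends into subdirectories is not stated here).

```shell
#!/usr/bin/env bash
# Hypothetical helper: list Thermo .raw files directly inside a directory.
# Matching is case-insensitive, so both .raw and .RAW are found.
list_raw_files() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -iname '*.raw' | sort
}

# Hypothetical helper: call the documented convert-raw subcommand only
# when the directory contains at least one .raw file.
convert_if_present() {
    local dir="$1"
    local n
    # Arithmetic expansion strips any padding that wc -l may emit.
    n=$(( $(list_raw_files "$dir" | wc -l) ))
    if [ "$n" -eq 0 ]; then
        echo "no .raw files found in $dir" >&2
        return 1
    fi
    echo "converting $n file(s) in $dir"
    pioneer convert-raw "$dir"
}
```

With these functions defined in the current shell, convert_if_present /path/to/raw/folder prints the file count and then runs the conversion, or returns a non-zero status if the folder contains no .raw files.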
mzML to Arrow IPC (Sciex)
For mzML-formatted data, use:
pioneer convert-mzml /path/to/mzml/or/folder
Starting Pioneer
After installation, Pioneer is accessed from the command line. Running pioneer --help displays the available subcommands:
pioneer [options] <subcommand> [subcommand-args...]
Subcommands include search, predict, params-search, params-predict, convert-raw, and convert-mzml. The first launch may download dependencies on Windows or Linux, while macOS performs a one-time Gatekeeper check.
A minimal end-to-end workflow is:
pioneer params-predict lib_dir lib_name fasta_dir --params-path=predict_params.json
pioneer predict predict_params.json
pioneer convert-raw raw_dir
pioneer params-search library.poin ms_data_dir results_dir --params-path=search_params.json
pioneer search search_params.json
This sequence builds a predicted spectral library, converts vendor files to Arrow, generates search parameters, and searches the experiment.
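The five commands can also be wrapped in a small script that stops at the first failing step. This is a sketch under stated assumptions: run_step, run_workflow, and the DRY_RUN variable are hypothetical names introduced here, not Pioneer features, and the directory and file names are the placeholders from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: run the five documented steps in order, stopping at the first failure.

# Print a command, then run it unless DRY_RUN=1.
# DRY_RUN is a convenience variable of this sketch, not a Pioneer option.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        return 0
    fi
    "$@"
}

# Chain the steps with && so a failing step prevents the later ones.
run_workflow() {
    run_step pioneer params-predict lib_dir lib_name fasta_dir --params-path=predict_params.json &&
    run_step pioneer predict predict_params.json &&
    run_step pioneer convert-raw raw_dir &&
    run_step pioneer params-search library.poin ms_data_dir results_dir --params-path=search_params.json &&
    run_step pioneer search search_params.json
}
```

Calling DRY_RUN=1 run_workflow prints the commands without executing them. Note that running all steps back to back uses the generated parameter templates unmodified; in practice you would normally pause after params-predict and params-search to edit the JSON files.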
params-predict and params-search create template JSON files. Edit these configurations to suit your experiment before running predict or search. See Parameter Configuration for a description of each option.
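Hand-editing a JSON template can easily introduce a syntax error such as a trailing comma. A quick check before launching a long run might look like the sketch below. check_json is a hypothetical helper, and it assumes python3 is available on the PATH; it only verifies that the file parses as JSON and does not validate Pioneer's parameter names or values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file parses as JSON.
# Uses python3's standard json.tool module; python3 on PATH is an assumption.
check_json() {
    local file="$1"
    if python3 -m json.tool "$file" > /dev/null 2>&1; then
        echo "ok: $file"
    else
        echo "invalid JSON: $file" >&2
        return 1
    fi
}

# Example use after editing the template:
#   check_json search_params.json && pioneer search search_params.json
```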