crux-generate-peptides
Usage:
crux-generate-peptides [options] <protein input>
Description:
Given a protein database as input, generate a list of peptides in the database that meet certain criteria (e.g. mass, length) as output. The user can specify whether to use a pre-computed peptide index database or a text fasta file.
Input:
- <protein input> – The name of the file in fasta format or the directory containing the protein index from which to parse proteins.
Output:
The program prints to standard output a series of header lines, describing how the peptides were generated, and according to what criteria, as follows:
# PROTEIN DATABASE: <protein input filename>
# OPTIONS: <options used to generate the file>
Then the program prints a series of tab-delimited lines with the following fields, where [ ] indicates an optional field:
<mass> <protein-id> <peptide-start> <peptide-length> [peptide-sequence]
Options:
- --output-sequence <T|F> – Include the peptide sequence in the output. Default = F.
- --sort <mass|length|lexical|none> – Specify the order in which peptides are printed to standard output. Default = none.
- --min-mass <float> – The minimum neutral mass of the peptides to output. Default = 200.
- --max-mass <float> – The maximum neutral mass of the peptides to output. Default = 7200.
- --min-length <int> – The minimum length of the peptides to output. Default = 6.
- --max-length <int> – The maximum length of the peptides to output. Default = 50.
- enzyme <trypsin|chymotrypsin|elastase|clostripain|cyanogen-bromide|iodosobenzoate|proline-endopeptidase|staph-protease|modified-chymotrypsin|elastase-trypsin-chymotrypsin|no-enzyme> – Enzyme to use for in silico digestion of protein sequences. Used in conjunction with the options digestion and missed-cleavages. Use 'no-enzyme' for non-specific digestion. Digestion rules are written as: enzyme name [cuts after one of these residues]|{but not before one of these residues}. trypsin [RK]|{P}, elastase [ALIV]|{P}, chymotrypsin [FWY]|{P}, clostripain [R]|[], cyanogen-bromide [M]|[], iodosobenzoate [W]|[], proline-endopeptidase [P]|[], staph-protease [E]|[], modified-chymotrypsin [FWYL]|{P}, elastase-trypsin-chymotrypsin [ALIVKRWFY]|{P}, aspn []|[D] (cuts before D). Default = trypsin.
- custom-enzyme <residues before cleavage | residues after cleavage> – Specify rules for in silico digestion of protein sequences. Overrides the enzyme option. Two lists of residues are given, each enclosed in square brackets or curly braces, separated by a |. The first list contains the residues required or prohibited before the cleavage site, and the second list contains the residues after the cleavage site. Residues required for digestion are enclosed in square brackets, '[' and ']'. Residues that prevent digestion are enclosed in curly braces, '{' and '}'. Use X to indicate all residues. For example, trypsin cuts after R or K but not before P, which is represented as [RK]|{P}. AspN cuts after any residue but only before D, which is represented as [X]|[D].
- --digestion <full-digest|partial-digest> – Degree of digestion used to generate peptides. With full-digest, both ends of a peptide must conform to the enzyme specificity rules; with partial-digest, only one end must. Used in conjunction with the enzyme option when enzyme is not set to 'no-enzyme'. Default = full-digest.
- --missed-cleavages <T|F> – Allow missed cleavage sites within a peptide. Used when an enzyme is specified; when T, the output includes peptides containing one or more potential cleavage sites. Default = F.
- --isotopic-mass <average|mono> – Specify the type of isotopic masses to use when calculating the peptide mass. Default = average.
- --unique-peptides <T|F> – For peptides appearing in multiple proteins, store a reference to only one of those proteins. Default = F.
- --verbosity <int> – Specify the verbosity of the current processes from 0-100. By default, verbosity is set to 100.
- --use-index <T|F> – Specify whether a pre-computed on-disk index should be used for retrieving the peptides instead of using a fasta file.
- --parameter-file <filename> – A file containing command-line or additional parameters. See the parameter documentation page for details.
- --verbosity <0-100> – Specify the verbosity of the current processes. Each level prints the following messages, including all those at lower verbosity levels: 0-fatal errors, 10-non-fatal errors, 20-warnings, 30-information on the progress of execution, 40-more progress information, 50-debug info, 60-detailed debug info. Default = 30.
- --version T – Print the version number and quit. Please note that you must include the 'T' after --version.
Parameter file options:
- mod <mass change>:<aa list>:<max per peptide> – Consider modifications on any amino acid in aa list, with at most max-per-peptide in one peptide. This parameter may be included with different values multiple times so long as the total number of mod, cmod, and nmod parameters does not exceed 11.
- cmod <mass change>:<max distance from protein C-terminus> – Consider modifications on the C-terminus of any peptide whose C-terminus is no more than max-distance residues from the protein C-terminus. Use -1 to consider the C-terminus of all peptides regardless of position in the protein. This parameter may be included with different values multiple times so long as the total number of mod, cmod, and nmod parameters does not exceed 11. The same modifications must be given for any post-search process (compute-q-values, q-ranker).
- nmod <mass change>:<max distance from protein N-terminus> – Consider modifications on the N-terminus of any peptide whose N-terminus is no more than max-distance residues from the protein N-terminus. Use -1 to consider the N-terminus of all peptides regardless of position in the protein. This parameter may be included with different values multiple times so long as the total number of mod, cmod, and nmod parameters does not exceed 11. The same modifications must be given for any post-search process (compute-q-values, q-ranker).
- max-mods <n> – The maximum number of modifications that can be applied to a single peptide. Default = no limit.
- max-aas-modified <n> – The maximum number of modified amino acids that can appear in one peptide. Each amino acid can be modified multiple times. Default = no limit.
- <A-Z> <float> – Specify static modifications. This is a mass change applied to the given amino acid (in single-letter code, A through Z) for every peptide in which it occurs. Use the mod option to generate peptides both with and without the mass change. Default C=57.
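The colon-separated values of the mod, cmod, and nmod parameters, and the limit of 11 on their combined count, can be illustrated with a small Python sketch. This is not Crux code: the names ResidueMod, TerminalMod, and parse_modifications are hypothetical, the example values are made up, and the input is assumed to be already split into (name, value) pairs, because the parameter file syntax itself is described on the parameter documentation page rather than here.

```python
from dataclasses import dataclass

# Limit on the combined number of mod, cmod, and nmod parameters,
# as stated in the option descriptions.
MAX_MOD_PARAMETERS = 11


@dataclass(frozen=True)
class ResidueMod:
    """Hypothetical container for mod <mass change>:<aa list>:<max per peptide>."""
    mass_change: float
    residues: str
    max_per_peptide: int


@dataclass(frozen=True)
class TerminalMod:
    """Hypothetical container for cmod/nmod <mass change>:<max distance>."""
    terminus: str      # "C" or "N"
    mass_change: float
    max_distance: int  # -1 means any position in the protein


def parse_modifications(params):
    """Parse (name, value) pairs whose names are 'mod', 'cmod', or 'nmod'."""
    if len(params) > MAX_MOD_PARAMETERS:
        raise ValueError("more than 11 mod/cmod/nmod parameters")
    parsed = []
    for name, value in params:
        parts = value.split(":")
        if name == "mod":
            if len(parts) != 3:
                raise ValueError(f"mod needs three fields: {value!r}")
            mass, residues, max_per = parts
            if not (residues.isalpha() and residues.isupper()):
                raise ValueError(f"aa list must be uppercase letters: {value!r}")
            parsed.append(ResidueMod(float(mass), residues, int(max_per)))
        elif name in ("cmod", "nmod"):
            if len(parts) != 2:
                raise ValueError(f"{name} needs two fields: {value!r}")
            mass, distance = parts
            parsed.append(TerminalMod(name[0].upper(), float(mass), int(distance)))
        else:
            raise ValueError(f"unknown modification parameter: {name!r}")
    return parsed


if __name__ == "__main__":
    # Made-up values, for illustration only.
    for mod in parse_modifications([("mod", "79.9:STY:2"), ("cmod", "-17.0:-1")]):
        print(mod)
```

Keeping residue and terminal modifications in separate types mirrors the two different field layouts in the option descriptions; a real reader of the parameter file may represent them differently.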
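The digestion rule notation used by the enzyme and custom-enzyme options (for example [RK]|{P} for trypsin) can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the Crux implementation: the function names are hypothetical, an empty list such as [] is assumed to mean "no constraint" (as in clostripain [R]|[]), only full digestion is covered, and peptide start positions are 0-based here, which may not match the program's output.

```python
def parse_rule(rule):
    """Split a rule like '[RK]|{P}' into two (residues, required) constraints."""
    sides = [side.strip() for side in rule.split("|")]
    if len(sides) != 2:
        raise ValueError(f"expected two lists separated by '|': {rule!r}")
    constraints = []
    for side in sides:
        if len(side) >= 2 and side[0] == "[" and side[-1] == "]":
            constraints.append((set(side[1:-1]), True))   # required residues
        elif len(side) >= 2 and side[0] == "{" and side[-1] == "}":
            constraints.append((set(side[1:-1]), False))  # prohibited residues
        else:
            raise ValueError(f"list must use [] or {{}}: {side!r}")
    return constraints[0], constraints[1]


def allows(constraint, residue):
    """Return True if the residue satisfies one side of the rule."""
    residues, required = constraint
    if required:
        # Assumption: an empty required list imposes no constraint; X matches all.
        return not residues or "X" in residues or residue in residues
    return "X" not in residues and residue not in residues


def cleavage_sites(sequence, rule):
    """Positions i such that the enzyme cuts between sequence[i-1] and sequence[i]."""
    before, after = parse_rule(rule)
    return [i for i in range(1, len(sequence))
            if allows(before, sequence[i - 1]) and allows(after, sequence[i])]


def full_digest(sequence, rule, missed_cleavages=False, min_length=6, max_length=50):
    """Return (start, peptide) pairs whose two ends both lie on cleavage sites
    or protein termini. Length defaults follow --min-length and --max-length."""
    bounds = [0] + cleavage_sites(sequence, rule) + [len(sequence)]
    peptides = []
    for a in range(len(bounds) - 1):
        # Without missed cleavages, a peptide ends at the very next site.
        last = len(bounds) if missed_cleavages else a + 2
        for b in range(a + 1, last):
            start, end = bounds[a], bounds[b]
            if min_length <= end - start <= max_length:
                peptides.append((start, sequence[start:end]))
    return peptides


if __name__ == "__main__":
    protein = "AAKPGGRCCKDD"  # made-up sequence for illustration
    print(cleavage_sites(protein, "[RK]|{P}"))
    print(full_digest(protein, "[RK]|{P}", min_length=1))
```

In the made-up sequence, the K followed by P is not a cleavage site under the trypsin rule, which is the effect of the {P} list. Mass filtering and partial digestion are left out to keep the sketch short.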