ParaLoop - The documentation
The copyright notice
paraloop is governed by the CeCILL license under French law
The ParaLoop Quick reference card
How are switches and parameters read?
Switches and parameters are read in the order listed below. Once a switch or parameter
has been set at some step, it is NEVER SET again: imposed values are therefore set at the beginning of the process,
and default values at the end.
- The switches on the command line
- $LIPMLIB/../etc/paraloop.root.cfg
- f1.cfg (see the switch --cfg)
- f2.cfg
- f3.cfg
- $HOME/.paralooprc
- $LIPMLIB/../etc/paraloop.cfg
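The "first set wins" rule above can be illustrated with a minimal Python sketch. It is not ParaLoop's own code: the configuration sources are modelled as plain dictionaries, and the values shown are hypothetical.

```python
def resolve(sources):
    """Merge settings so that the first source to set a name wins."""
    settings = {}
    for source in sources:
        for name, value in source.items():
            # setdefault keeps an existing value: once set, never set again
            settings.setdefault(name, value)
    return settings


# Hypothetical values, listed from the first source read to the last
command_line = {"PARALOOP_ncpus": "4"}
root_cfg = {"PARALOOP_Scheduler": "PBS"}
user_rc = {"PARALOOP_ncpus": "8", "PARALOOP_queue": "short"}
defaults = {"PARALOOP_ncpus": "2", "PARALOOP_queue": "long",
            "PARALOOP_log_level": "01"}

settings = resolve([command_line, root_cfg, user_rc, defaults])
print(settings)
```

In this sketch the command line value of PARALOOP_ncpus is kept, and the default file only supplies the names that nothing earlier has set.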
Switches and parameters useful for the end user
Files and directories
Parameter | Switch | Default | Meaning |
- | --cfg=f1.cfg,f2.cfg,f3.cfg | - | List of configuration files, the first specified is read first |
PARALOOP_max_file_size | | 1 Gb | The maximum output file size. When the output exceeds this size, another file is created |
PARALOOP_error_directory | | PARALOOP_error | The error directory |
PARALOOP_lock_directory | | PARALOOP_lock | The lock directory |
Messages and log files
Parameter | Switch | Default | Meaning |
- | --verbose | No | Add some messages, displayed to the console |
- | --quiet | No | Nearly no message |
PARALOOP_log_level | | 01 | 0 = log nearly nothing; 01 = log normally; 012 = log more |
Input, output
Parameter | Switch | Default | Meaning |
| --input | | The name of the input file |
| --start | 0 | The start record number (0 means first record) |
| --end | End of file | The end record number |
| --interleaved | no | Distribute the data in a round-robin algorithm |
| --output | | The name of the output file |
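As an illustration of what a round-robin distribution means for --interleaved, here is a short Python sketch. It is only a model of the general technique, not ParaLoop's implementation; the function name is hypothetical and records are represented by their numbers.

```python
def interleaved(records, ncpus):
    """Deal the records to ncpus children in turn, like dealing cards."""
    # Child i receives records i, i + ncpus, i + 2 * ncpus, ...
    return [records[i::ncpus] for i in range(ncpus)]


# Seven records distributed over three child processes
shares = interleaved(list(range(7)), 3)
print(shares)
```

With this kind of distribution, each child receives records taken from the whole file rather than one contiguous block.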
Plugins
Parameter | Switch | Default | Meaning |
| --plugins | | Display the list of available plugins |
| --program | | The plugin to use |
| --db | | Used by some plugins (Blast) |
| --wait | | Do not return, wait for every child to finish |
| --waitonly lock_directory | | Just wait until every child finishes |
| --waitonly lock_file | | Just wait until some specific child finishes |
Processors and queues
Parameter | Switch | Default | Meaning |
PARALOOP_ncpus | --ncpus | Set by the administrator | The number of cpus to use (the number of child processes to run) |
| --local | no | Run on the local machine, without sending the jobs to the cluster nodes |
PARALOOP_fair_time_limit | | Set by the administrator | Only implemented with queues. After this time has elapsed, the job is submitted again, then interrupted, giving your colleagues a chance to work. |
PARALOOP_account | --account | | Only implemented with PBS. The account, passed to the qsub utility. |
PARALOOP_queue | --queue | Set by the administrator | The execution queue |
PARALOOP_qsub_params | | Set by the administrator | Additional parameters passed to qsub |
Interrupting
The best way to interrupt the jobs is to create a file called paraloop.stp
in the lock directory.
You may also interrupt one specific job by creating a file called paraloop.X.stp
in the lock directory, where X is the number of the starting record.
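The stop files can be created by any means. As a minimal sketch, the hypothetical Python helper below creates them as empty files, on the assumption that only the presence of the file matters. A temporary directory stands in for the real lock directory.

```python
import tempfile
from pathlib import Path


def request_stop(lock_directory, start_record=None):
    """Create a stop file in the lock directory and return its path.

    Without start_record, every job is asked to stop (paraloop.stp);
    with it, only the job starting at that record (paraloop.X.stp).
    """
    if start_record is None:
        name = "paraloop.stp"
    else:
        name = f"paraloop.{start_record}.stp"
    stop_file = Path(lock_directory) / name
    stop_file.touch()  # assumption: an empty file is enough
    return stop_file


# Demonstration in a temporary directory standing in for the lock directory
with tempfile.TemporaryDirectory() as lock_dir:
    print(request_stop(lock_dir).name)
    print(request_stop(lock_dir, start_record=1000).name)
```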
Restarting
Parameter | Switch | Default | Meaning |
| --restart lock_directory | | Restart the interrupted jobs |
| --restart lock_file | | Restart only this job |
Advanced switches and parameters
Parameter | Switch | Default | Meaning |
| --monoprocessor | | Run this job on only one processor |
| --start | | Start record to be processed |
| --end | | End record to be processed |
| --step | | Step used in the main loop. If step is < 0, the end record must be less than the start record; this does not work with every plugin |
| --slice_size | | Instead of specifying --start and --end, it is sometimes more convenient to specify slices. In this case, the step parameter is always 1, and --slice_size is the size of each slice. |
| --slice_nb | | The slice number |
| --slice_offset | | Add this offset for the calculation of the start of slice |
Parameters of the Shell plugin
Please see the Shell documentation for details about this plugin.
Parameter | Default | Meaning |
PARALOOP_Shell_interpreter | /bin/sh | path to the default shell interpreter |
Parameters of the Bioperl plugin
Please see the Bioperl documentation for details about this plugin.
Parameter | Default | Meaning |
PARALOOP_Bioperl_path | | path to the external script, run at each iteration |
PARALOOP_Bioperl_params | '' | parameters passed to this script |
PARALOOP_Bioperl_input_format | fasta | Format of the input file, read by the external script |
Parameters of the Blast plugin
Please see the Blast documentation for details about this plugin.
Parameter | Switch | Default | Meaning |
PARALOOP_Blast_origin | | ncbi | ncbi for blast ncbi, wu for wu blast |
PARALOOP_Blast_path | | blastall if Blast_origin is ncbi, blastp if Blast_origin is wu |
PARALOOP_Blast_params | | -p blastp if Blast_origin is ncbi, '' if Blast_origin is wu |
PARALOOP_Blast_chunk | | 1 | The sequences are grouped in chunks of N sequences, N is given by this parameter |
PARALOOP_db | --db | | The database |
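The grouping controlled by PARALOOP_Blast_chunk can be pictured with the following Python sketch. It only illustrates chunking in general; the function name and the sequence identifiers are hypothetical, and the plugin's own code may differ.

```python
def chunks(sequences, chunk_size):
    """Yield successive groups of at most chunk_size sequences."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for i in range(0, len(sequences), chunk_size):
        # The last group may be shorter than chunk_size
        yield sequences[i:i + chunk_size]


# Five hypothetical sequence identifiers grouped two by two
groups = list(chunks(["s1", "s2", "s3", "s4", "s5"], 2))
print(groups)
```

With the default value of 1, each group would hold a single sequence.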
Parameters useful for the administrator
These parameters may be set in two files, with two different meanings:
.../etc/paraloop.root.cfg
- These parameters cannot be overridden by the user.
.../etc/paraloop.cfg
- These parameters are default values; they can be overridden by the users.
Parameter | Default | Meaning |
PARALOOP_Scheduler | | The Scheduler to use:
- System for a multiprocessor machine
- PBS for a machine equipped with the PBS queuing system
- Rsystem for a cluster without any queuing system
|
PARALOOP_no_local_mode | no | If specified, the users will not be able to use the --local switch, which forces them to use the queuing system. |
PARALOOP_fair_time_limit | | Set this parameter to keep the users from monopolizing the processors |
PARALOOP_max_file_size | | If the output file grows too large, it is closed and a new file is opened |
PARALOOP_PBS_ncpus | | The default number of cpus when using PBS |
PARALOOP_System_ncpus | | The default number of cpus when using System (or the --local switch) |
PARALOOP_Rsystem_ncpus | | The default number of cpus when using Rsystem |
Parameters for the PBS Scheduler
Parameter | Default | Meaning |
PARALOOP_account | | The account name, passed to qsub |
PARALOOP_qsub_params | '' | Additional parameters passed to qsub |
PARALOOP_queue | | The execution queue |
Parameters for the Rsystem scheduler
Parameter | Default | Meaning |
PARALOOP_Rsystem_nodes | | The list of nodes constituting the cluster. Example:
node1,node2,node3 |
PARALOOP_Rsystem_rsh | rsh | The program used to send files to the nodes or execute commands on them; may be ssh |
PARALOOP_Rsystem_tmp | /tmp | The name of a temporary directory. This directory must be local to the node, it cannot be shared |
The PARALOOP documentation
The main documentation
Document | Description |
User documentation | The user documentation, including a tutorial for writing plugins |
The plugins
Document | Description |
Plugin | The abstract class at the top of the plugin hierarchy |
BpInput | The abstract class used for reading files with bioperl |
LnInput | The abstract class used for reading text files |
Bioperl | A general plugin to execute a treatment on bioperl files |
Shell | A general plugin to execute some lines of scripts, one line per processor |
Blast | A specialized plugin to execute a blast (ncbi or wu) on every sequence found in the input file |
Dummy | This dummy plugin can be used as a template for writing your own plugins |
The schedulers
Document | Description |
Scheduler | The abstract class at the top of the Scheduler hierarchy |
PBS | A scheduler useful when you have PBS-Pro (or other systems?) installed |
System | This scheduler is used with multiprocessor SMP machines, or when you use the --local switch |
Rsystem | This scheduler is used with clusters which do NOT have any batch system installed |
The other objects or modules