diff --git a/README.md b/README.md
index 59ac1f6..dd880ea 100755
--- a/README.md
+++ b/README.md
@@ -87,6 +87,16 @@ Then one can add `-L$(CUDA_HOME)/lib64/stubs` to [this line](./Makefile#L29) in
 ```
 Why would you want to do that anyway?
+## Known issues
+
+- This is not an issue, but when multiple consecutive GPU jobs are queued,
+  after the first job runs, there is a small delay for the next GPU job to run
+  in order to ensure that the same GPUs are not claimed by different jobs.
+  There was an issue causing this delay significantly longer as reported in [`#2`](https://github.com/justanhduc/task-spooler/issues/2)
+  but has been fixed in [176d0b76](https://github.com/justanhduc/task-spooler/commit/176d0b76).
+  To avoid the delay, you can use `-g` to indicate the exact GPU IDs for the job.
+
+
 ## Mailing list
 I created a GoogleGroup for the program. You look for the archive and the join methods in the taskspooler google group page.
@@ -103,11 +113,7 @@ When the job finishes, the client notifies the server. At this time, the server
 Moreover the client can take advantage of many information from the server: when a job finishes, where does the job output go to, etc.
-## Download
-
-Download the latest version (GPLv2+ licensed): ts-1.0.tar.gz - v1.0 (2016-10-19) - Changelog
-
-Look at the version repository if you are interested in its development.
+## History
 Андрей Пантюхин (Andrew Pantyukhin) maintains the BSD port.
@@ -121,10 +127,11 @@ Gnomeye maintains the AUR package.
 Eric Keller wrote a nodejs web server showing the status of the task spooler queue (github project).
+Duc Nguyen took the project and develops a GPU-support version.
 ## Manual
-**NOTE**: `man ts` is not updated (yet).
+See below or `man ts` for more details.
 ```
 usage: ts [action] [-ngfmdE] [-L ] [-D ] [cmd...]
@@ -184,8 +191,8 @@ Options adding jobs:
 ## Thanks
 **Author**
- - Lluís Batlle i Rossell,
- - Duc Nguyen,
+- Duc Nguyen,
+- Lluís Batlle i Rossell,
 **Acknowledgement**
 * To Raúl Salinas, for his inspiring ideas
diff --git a/main.c b/main.c
index 916cf54..167d900 100755
--- a/main.c
+++ b/main.c
@@ -25,7 +25,7 @@ int server_socket;
 static char getopt_env[] = "POSIXLY_CORRECT=YES";
 static char *old_getopt_env;
-static char version[] = "Task Spooler v1.1 - a task queue system for the unix user.\n"
+static char version[] = "Task Spooler v1.2 - a task queue system for the unix user.\n"
     "Copyright (C) 2007-2020 Duc Nguyen - Lluis Batlle i Rossell";
@@ -446,10 +446,10 @@ static void print_help(const char *cmd) {
     printf(" -T send SIGTERM to all running job groups.\n");
     printf(" -u [id] put that job first. The last added, if not specified.\n");
     printf(" -U swap two jobs in the queue.\n");
-    printf(" -B in case of full queue on the server, quit (2) instead of waiting.\n");
     printf(" -h show this help\n");
     printf(" -V show the program version\n");
     printf("Options adding jobs:\n");
+    printf(" -B in case of full clients on the server, quit instead of waiting.\n");
     printf(" -n don't store the output of the command.\n");
     printf(" -E Keep stderr apart, in a name like the output file, but adding '.e'.\n");
     printf(" -z gzip the stored output (if not -n).\n");
diff --git a/ts.1 b/ts.1
index 51c8685..1d34625 100755
--- a/ts.1
+++ b/ts.1
@@ -5,14 +5,14 @@
 .\" that should have been distributed together with this file.
 .\"
 .\" Note: I took the gnu 'ls' man page as an example.
-.TH TS 1 2016-03 "Task Spooler 1.0"
+.TH TS 1 2021-05 "Task Spooler 1.2"
 .SH NAME
 ts \- task spooler. A simple unix batch system
 .SH SYNOPSIS
 .BI "ts [" actions "] [" options "] [" command... ]
 .sp
 Actions:
-.BI "[\-KClhV]
+.BI "[\-KClhVTRq]
 .BI "[\-t ["id ]]
 .BI "[\-c ["id ]]
 .BI "[\-p ["id ]]
@@ -25,11 +25,18 @@ Actions:
 .BI "[\-i ["id ]]
 .BI "[\-U <"id - id >]
 .BI "[\-S ["num ]]
+.BI "[\--get_gpu_wait]
+.BI "[\--set_gpu_wait ["sec ]]
+.BI "[\-a/--get_label ["id ]]
+.BI "[\-F/--full_cmd ["id ]]
+
 .sp
 Options:
-.BI "[\-nfgmd]"
+.BI "[\-nEfzmd]"
 .BI "[\-L <"label >]
-.BI "[\-D <"id >]
+.BI "[\-D <"id1,id2,... >]
+.BI "[\-G/--gpus ["num ]]
+.BI "[\--gpus_indices <"id1,id2,... >]
 .SH DESCRIPTION
 .B ts
@@ -42,8 +49,8 @@ Calling with a command will add that command to the queue, and
 calling it without commands or parameters will show the task list.
 .SH COMMAND OPTIONS
-When adding a job to ts, we can specify how it will be run and how will the
-results be collected:
+When adding a job to ts, we can specify how it will be run and how the
+results will be collected:
 .TP
 .B "\-n"
 Do not store the standard output/error in a file at
@@ -51,7 +58,7 @@ Do not store the standard output/error in a file at
 - let it use the file descriptors decided by the calling process. If it is not
 used, the
 .B jobid
-for the new task will be outputed to stdout.
+for the new task will be output to stdout.
 .TP
 .B "\-g"
 Pass the output through gzip (only if
@@ -80,11 +87,14 @@ the queue. It makes more comfortable distinguishing similar commands with
 different goals.
 .TP
 .B "\-d"
-Run the command only if the command before finished well (errorlevel = 0). This new
-task enqueued depends on the result of the previous command. If the task is not run,
-it is considered as failed for further dependencies.
+Run the command only after the last command finished.
+It does not depend on how its dependency ends.
+.TP
+.B "\-D "
+Run the command only after the specified job IDs finished.
+It does not depend on how its dependencies end.
 .TP
-.B "\-D "
+.B "\-W "
 Run the command only if the job of given id finished well (errorlevel = 0). This new
 task enqueued depends on the result of the previous command. If the task is not run,
 it is considered as failed for further dependencies.
@@ -108,6 +118,12 @@ Run the command only if there are \fbnum\fB slots free in the queue. Without it,
 the job will run if there is one slot free. For example, if you use the queue
 to feed cpu cores, and you know that a job will take two cores, with \fB\-N\fB
 you can let ts know that.
+.TP
+.B "\-G/--gpus "
+Run the job with \fbnum\fB GPUs.
+.TP
+.B "\-g/--gpu_indices "
+Run the job with the specified GPU IDs. GPU IDs should be separated by commas.
 .SH ACTIONS
 Instead of giving a new command, we can use the parameters for other purposes:
 .TP
@@ -124,6 +140,9 @@ It is not reliable to think that
 .B ts -K
 will finish when the server is really killed. By now it is a race condition.
 .TP
+.B "\-T"
+Send SIGTERM to all running job groups.
+.TP
 .B "\-C"
 Clear the results of finished jobs from the queue.
 .TP
@@ -133,6 +152,28 @@ This is the default behaviour if
 .B ts
 is called without options.
 .TP
+.B "\-q/--last_queue_id"
+Show the job ID of the last added.
+.TP
+.B "\-R/--count_running"
+Return the number of running jobs
+.TP
+.B "\--get_gpu_wait"
+Get the time to wait before running the next GPU job. This time delay is to ensure
+two closely consecutive jobs do not share GPUs as GPUs' memory is not filled up
+immediately after a job claims.
+.TP
+.B "\--set_gpu_wait "
+Set time to wait before running the next GPU job (30 seconds by default).
+This time delay is to ensure two closely consecutive jobs do not share GPUs
+as GPUs' memory is not filled up immediately after a job claims.
+.TP
+.B "\-a/--get_label "
+Show the job label. Of the last added, if not specified.
+.TP
+.B "\-F/--full_cmd "
+Show the full command. Of the last added, if not specified.
+.TP
 .B "\-t [id]"
 Show the last ten lines of the output file of the named job, or the last
 running/run if not specified. If the job is still running, it will keep on
@@ -310,11 +351,11 @@ is created, which you can submit to the developer in order to fix the bug.
 .SH SEE ALSO
 .BR at (1)
 .SH AUTHOR
-Lluis Batlle i Rossell
+Duc Nguyen and Lluis Batlle i Rossell
 .SH NOTES
 This page describes
 .B ts
-as in version 1.0. Other versions may differ. The file
+as in version 1.2. Other versions may differ. The file
 .B TRICKS
 found in the distribution package can show some ideas on special uses of
 .B ts.