To use the package with the socket cluster functions, you would call the respective constructor [makeClusterFunctionsSocket()](https://mlr-org.github.io/batchtools/reference/makeClusterFunctionsSocket):
@@ -109,7 +109,7 @@ The development of [BatchJobs](https://github.com/tudo-r/BatchJobs/) and [BatchE
* batchtools does not use SQLite anymore.
Instead, all the information is stored directly in the registry using [data.tables](https://cran.r-project.org/package=data.table) acting as an in-memory database. As a side effect, many operations are much faster.
* Nodes do not have to access the registry.
-
[submitJobs()](https://mlr-org.github.io/batchtools/reference/submitJobs) stores a temporary object of type [JobCollection](https://mllg.github.io/batchtools/reference/JobCollection) on the file system which holds all the information necessary to execute a chunk of jobs via [doJobCollection()](https://mllg.github.io/batchtools/reference/doJobCollection) on the node.
+
[submitJobs()](https://mlr-org.github.io/batchtools/reference/submitJobs) stores a temporary object of type [JobCollection](https://mlr-org.github.io/batchtools/reference/JobCollection) on the file system which holds all the information necessary to execute a chunk of jobs via [doJobCollection()](https://mlr-org.github.io/batchtools/reference/doJobCollection) on the node.
This avoids file system locks because each job accesses only one file exclusively.
* `ClusterFunctionsMulticore` now uses the parallel package for multicore execution.
* `ClusterFunctionsSSH` can still be used to emulate a scheduler-like system which respects the work load on the local machine.
@@ -234,15 +234,15 @@ getStatus()
```
The resulting output includes the number of jobs in the registry, how many have been submitted, have started to execute on the batch system, are currently running, have successfully completed, and have terminated due to an R exception.
After jobs have successfully terminated, we can load their results on the master.
-
This can be done in a simple fashion by using either [`loadResult()`](https://mlr-org.github.io/batchtools/reference/loadResult), which returns a single result exactly in the form it was calculated during mapping, or by using [`reduceResults()`](https://mllg.github.io/batchtools/reference/reduceResults), which is a version of `Reduce()` from the base package for registry objects.
+
This can be done in a simple fashion by using either [`loadResult()`](https://mlr-org.github.io/batchtools/reference/loadResult), which returns a single result exactly in the form it was calculated during mapping, or by using [`reduceResults()`](https://mlr-org.github.io/batchtools/reference/reduceResults), which is a version of `Reduce()` from the base package for registry objects.
```{r}
waitForJobs()
mean(sapply(1:10, loadResult))
reduceResults(function(x, y) x + y) / 10
```
If you are absolutely sure that your function works, you can take a shortcut and use *batchtools* in an `lapply` fashion using [`btlapply()`](https://mlr-org.github.io/batchtools/reference/btlapply).
-
This function creates a temporary registry (but you may also pass one yourself), calls [`batchMap()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList), wait for the jobs to terminate with [`waitForJobs()`](https://mllg.github.io/batchtools/reference/waitForJobs) and then uses [`reduceResultsList()`](https://mllg.github.io/batchtools/reference/reduceResultsList) to return the results.
+
This function creates a temporary registry (but you may also pass one yourself), calls [`batchMap()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList), wait for the jobs to terminate with [`waitForJobs()`](https://mlr-org.github.io/batchtools/reference/waitForJobs) and then uses [`reduceResultsList()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList) to return the results.
```{r, R.options=list(batchtools.verbose=FALSE)}
res = btlapply(rep(1e5, 10), piApprox)
@@ -393,7 +393,7 @@ If everything turns out fine, we can proceed with the calculation.
## Submitting and Collecting Results
-
To submit the jobs, we call [`submitJobs()`](https://mlr-org.github.io/batchtools/reference/submitJobs) and wait for all jobs to terminate using [`waitForJobs()`](https://mllg.github.io/batchtools/reference/waitForJobs).
+
To submit the jobs, we call [`submitJobs()`](https://mlr-org.github.io/batchtools/reference/submitJobs) and wait for all jobs to terminate using [`waitForJobs()`](https://mlr-org.github.io/batchtools/reference/waitForJobs).
```{r}
submitJobs()
waitForJobs()
@@ -464,12 +464,12 @@ After you have submitted jobs and suspect that something is going wrong, the fir
getStatus()
```
The status message shows that two of the jobs could not be executed successfully.
-
To get the IDs of all jobs that failed due to an error we can use [`findErrors()`](https://mlr-org.github.io/batchtools/reference/findJobs) and to retrieve the actual error message, we can use [`getErrorMessages()`](https://mllg.github.io/batchtools/reference/getErrorMessages).
+
To get the IDs of all jobs that failed due to an error we can use [`findErrors()`](https://mlr-org.github.io/batchtools/reference/findJobs) and to retrieve the actual error message, we can use [`getErrorMessages()`](https://mlr-org.github.io/batchtools/reference/getErrorMessages).
```{r}
findErrors()
getErrorMessages()
```
-
If we want to peek into the R log file of a job to see more context for the error we can use [`showLog()`](https://mlr-org.github.io/batchtools/reference/showLog) which opens a pager or use [`getLog()`](https://mllg.github.io/batchtools/reference/showLog) to get the log as character vector:
+
If we want to peek into the R log file of a job to see more context for the error we can use [`showLog()`](https://mlr-org.github.io/batchtools/reference/showLog) which opens a pager or use [`getLog()`](https://mlr-org.github.io/batchtools/reference/showLog) to get the log as character vector:
-
1. Create a Registry with [`makeRegistry()`](https://mlr-org.github.io/batchtools/reference/makeRegistry) (or [`makeExperimentRegistry()`](https://mllg.github.io/batchtools/reference/makeExperimentRegistry)) or load an existing from the file system with [`loadRegistry()`](https://mllg.github.io/batchtools/reference/loadRegistry).
-
2. Define computational jobs with [`batchMap()`](https://mlr-org.github.io/batchtools/reference/batchMap) or [`batchReduce()`](https://mllg.github.io/batchtools/reference/batchReduce) if you used [`makeRegistry()`](https://mllg.github.io/batchtools/reference/makeRegistry) or define with [`addAlgorithm()`](https://mllg.github.io/batchtools/reference/addAlgorithm), [`addProblem()`](https://mllg.github.io/batchtools/reference/addProblem) and [`addExperiments()`](https://mllg.github.io/batchtools/reference/addExperiments) if you started with [`makeExperimentRegistry()`](https://mllg.github.io/batchtools/reference/makeExperimentRegistry).
+
1. Create a Registry with [`makeRegistry()`](https://mlr-org.github.io/batchtools/reference/makeRegistry) (or [`makeExperimentRegistry()`](https://mlr-org.github.io/batchtools/reference/makeExperimentRegistry)) or load an existing from the file system with [`loadRegistry()`](https://mlr-org.github.io/batchtools/reference/loadRegistry).
+
2. Define computational jobs with [`batchMap()`](https://mlr-org.github.io/batchtools/reference/batchMap) or [`batchReduce()`](https://mlr-org.github.io/batchtools/reference/batchReduce) if you used [`makeRegistry()`](https://mlr-org.github.io/batchtools/reference/makeRegistry) or define with [`addAlgorithm()`](https://mlr-org.github.io/batchtools/reference/addAlgorithm), [`addProblem()`](https://mllg.github.io/batchtools/reference/addProblem) and [`addExperiments()`](https://mllg.github.io/batchtools/reference/addExperiments) if you started with [`makeExperimentRegistry()`](https://mllg.github.io/batchtools/reference/makeExperimentRegistry).
It is advised to test some jobs with [`testJob()`](https://mlr-org.github.io/batchtools/reference/testJob) in the interactive session and with `testJob(external = TRUE)` in a separate R process.
Note that you can add additional jobs if you are using an [`ExperimentRegistry`](https://mlr-org.github.io/batchtools/reference/makeExperimentRegistry).
3. If required, query the data base for job ids depending on their status, parameters or tags (see [`findJobs()`](https://mlr-org.github.io/batchtools/reference/findJobs)).
-
The returned tables can easily be combined in a set-like fashion with data base verbs: union ([`ojoin()`](https://mlr-org.github.io/batchtools/reference/JoinTables) for outer join), intersect ([`ijoin()`](https://mllg.github.io/batchtools/reference/JoinTables) for inner join), difference ([`ajoin()`](https://mllg.github.io/batchtools/reference/JoinTables) for anti join).
+
The returned tables can easily be combined in a set-like fashion with data base verbs: union ([`ojoin()`](https://mlr-org.github.io/batchtools/reference/JoinTables) for outer join), intersect ([`ijoin()`](https://mlr-org.github.io/batchtools/reference/JoinTables) for inner join), difference ([`ajoin()`](https://mlr-org.github.io/batchtools/reference/JoinTables) for anti join).
4. Submit jobs with [`submitJobs()`](https://mlr-org.github.io/batchtools/reference/submitJobs). You can specify job resources here.
If you have thousands of fast terminating jobs, you want to [`chunk()`](https://mlr-org.github.io/batchtools/reference/chunk) them first.
-
If some jobs already terminated, you can estimate the runtimes with [`estimateRuntimes()`](https://mlr-org.github.io/batchtools/reference/estimateRuntimes) and chunk jobs into heterogeneous groups with [`lpt()`](https://mllg.github.io/batchtools/reference/chunk) and [`binpack()`](https://mllg.github.io/batchtools/reference/chunk).
+
If some jobs already terminated, you can estimate the runtimes with [`estimateRuntimes()`](https://mlr-org.github.io/batchtools/reference/estimateRuntimes) and chunk jobs into heterogeneous groups with [`lpt()`](https://mlr-org.github.io/batchtools/reference/chunk) and [`binpack()`](https://mlr-org.github.io/batchtools/reference/chunk).
5. Monitor jobs. [`getStatus()`](https://mlr-org.github.io/batchtools/reference/getStatus) gives a summarizing overview.
-
Use [`showLog()`](https://mlr-org.github.io/batchtools/reference/showLog) and [`grepLogs()`](https://mllg.github.io/batchtools/reference/grepLogs) to investigate log file.
+
Use [`showLog()`](https://mlr-org.github.io/batchtools/reference/showLog) and [`grepLogs()`](https://mlr-org.github.io/batchtools/reference/grepLogs) to investigate log file.
Run jobs in the currently running session with [`testJob()`](https://mlr-org.github.io/batchtools/reference/testJob) to get a `traceback()`.
6. Collect (partial) results. [`loadResult()`](https://mlr-org.github.io/batchtools/reference/loadResult) retrieves a single result from the file system.
[`reduceResults()`](https://mlr-org.github.io/batchtools/reference/reduceResults) mimics `Reduce()` and allows to apply a function to many files in an iterative fashion.
-
[`reduceResultsList()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList) and [`reduceResultsDataTable()`](https://mllg.github.io/batchtools/reference/reduceResultsList) collect results into a `list` or `data.table`, respectively.
+
[`reduceResultsList()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList) and [`reduceResultsDataTable()`](https://mlr-org.github.io/batchtools/reference/reduceResultsList) collect results into a `list` or `data.table`, respectively.
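The six workflow steps listed above can be sketched end to end as follows. This is a minimal illustration, not part of the vignette: it assumes a default installation where jobs run in the current session, and the toy function `square` and the variable names are hypothetical.

```r
library(batchtools)

# 1. Create a registry; file.dir = NA places it in a temporary directory.
reg = makeRegistry(file.dir = NA, seed = 1, make.default = FALSE)

# 2. Define jobs: one job per element of x (`square` is a hypothetical toy function).
square = function(x) x^2
ids = batchMap(fun = square, x = 1:4, reg = reg)

# 3. Query job ids by status: nothing has been submitted yet.
not.submitted = findNotSubmitted(reg = reg)

# 4. Submit the jobs.
submitJobs(ids, reg = reg)

# 5. Monitor: block until all jobs have terminated, then summarize.
waitForJobs(reg = reg)
getStatus(reg = reg)

# 6. Collect results: a single result, a Reduce()-style sum, and a list.
first = loadResult(1, reg = reg)
total = reduceResults(function(x, y) x + y, reg = reg)
all.results = reduceResultsList(reg = reg)
```

Passing `reg` explicitly and setting `make.default = FALSE` keeps the sketch from replacing a default registry that may already exist in the session.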
Most users develop and prototype their experiments on a desktop box in their preferred IDE and later deploy to a large computing cluster.
-
This can be done by prototyping locally ([`testJob()`](https://mlr-org.github.io/batchtools/reference/testJob) or submit subsets via [`submitJobs()`](https://mllg.github.io/batchtools/reference/submitJobs)).
+
This can be done by prototyping locally ([`testJob()`](https://mlr-org.github.io/batchtools/reference/testJob) or submit subsets via [`submitJobs()`](https://mlr-org.github.io/batchtools/reference/submitJobs)).
To deploy to the cluster, just copy the file directory (as reported by `reg$file.dir`) to the remote system.
Next, log in on the cluster (typically via `ssh`), `cd` to the copied directory and call `loadRegistry("<file.dir.on.remote>", "<work.dir.on.remote>", writeable = TRUE)`.
This function will (a) source the local configuration file so that you can talk to the cluster (verify by checking the output of `reg$cluster.functions`) and (b) adjust the paths to the new system if argument `update.paths` is set.
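As a rough sketch of the reload step described above, the snippet below re-opens a registry with write access and inspects the cluster functions. It runs locally, and a temporary directory stands in for the copied registry directory; on a real cluster the paths would be the ones on the remote system, and the variable names here are hypothetical.

```r
library(batchtools)

# Stand-in for the registry directory that was copied to the remote system.
file.dir = tempfile("registry")
makeRegistry(file.dir = file.dir, make.default = FALSE)

# Re-open the registry with write access so jobs can be submitted from here.
reg = loadRegistry(file.dir = file.dir, work.dir = getwd(),
  writeable = TRUE, make.default = FALSE)

# Check which cluster functions were picked up from the configuration file.
print(reg$cluster.functions)
```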