File Transfer from Discover to Dirac

Recommended methods to transfer data between Discover and Dirac:

- In a datamove job, use cp to copy from $ARCHIVE.
- On a login node, use cp to copy from $ARCHIVE.
- In a datamove job, use bbscp through dirac.
- Use scp and rsync to copy to/from Dirac.

File transfer between Discover and Dirac using the datamove partition

One way to prepare data for a large compute run on Discover is to first submit a batch job to the datamove partition to copy large data files from the archive or an external location to the Discover file systems. Jobs submitted to the datamove partition run on a "gateway" node. The gateway nodes have external network interfaces, which allow access to the archive (Dirac) and other external systems. They also have access to Discover's local GPFS cluster-wide file systems.

You can submit a job to the datamove partition to migrate data into the Discover environment, making it visible to Discover's compute nodes. You can then submit a compute job to analyze the data you moved. The results of your compute job can then be transferred back out of Discover via an additional datamove job. You can submit all three jobs at once (a datamove job to move data into the system, a compute job, and a datamove job to move data back off the system) and use Slurm's dependency functions to ensure the jobs execute in the correct order.

Copying Large Files from $ARCHIVE to $NOBACKUP using Slurm

First, ensure that the file you are moving is online and not on tape.

IMPORTANT: Retrieve your files from tape before attempting to use them! Use dmget.

Once the files in the archive are in the dual state, they are ready to be copied to your $NOBACKUP, where you will be able to use them.

IMPORTANT: Do not untar anything in the archive area! Untarring files in /archive creates many small files in the archive area, which will then be written to many tapes.

The best way to copy from your $ARCHIVE to your $NOBACKUP is with a datamove node, either in a batch job or interactively. The following examples assume $ARCHIVE and $NOBACKUP expand to some hypothetical pathnames. In a Slurm batch job submitted to the datamove queue, the copy command for a big file is:

cp /archive/u/myplace/bigfile /discover/nobackup/myplace/

As a rule of thumb, set the wall time limit to 2 seconds per gigabyte, and multiply that by 2 for good measure.

Alternatively, if you are logged onto a Discover node, start an interactive shell on a datamove node and then issue the copy command. The -t flag sets the time limit to 1 hour. While the job waits, salloc prints messages such as:

salloc: job 8164701 queued and waiting for resources
srun.slurm: cluster configuration lacks support for cpu binding

$ cp /archive/u/myplace/bigfile /discover/nobackup/myplace/

Moving Large Files to $ARCHIVE using cp on Discover

It is better to transfer a few large files than many small ones between the tape cache (/archive) and your nobackup area. When copying large files from Discover to Dirac, we recommend using cp rather than mv. Because mv deletes files after copying, there is a risk of losing your files if the system hangs due to a storage shortage or other issues.
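The tape-staging step above might look like the following sketch, assuming the DMF client tools (dmls, dmget) are available on the node and using the article's example paths:

```shell
# Check the DMF state of archived files; entries marked offline are on tape
# only and must be staged before use.
dmls -l /archive/u/myplace/

# Stage the file back to the tape cache; this blocks until it is on disk.
dmget /archive/u/myplace/bigfile

# Re-check the state: the file should now be in the dual (disk + tape) state
# and is safe to copy to $NOBACKUP.
dmls -l /archive/u/myplace/bigfile
```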
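The batch route can be sketched as a minimal Slurm job script. The partition name `datamove`, the job name, and the file size used to pick the time limit are assumptions; adjust them for your account and data:

```shell
#!/usr/bin/env bash
#SBATCH --job-name=archive-copy
#SBATCH --partition=datamove          # assumed name of the datamove partition
#SBATCH --time=00:07:00               # hypothetical 100 GB file: 100 * 2 s * 2 = 400 s
#SBATCH --output=archive-copy.%j.out

# Stage the file off tape first, then copy it into the nobackup area.
dmget /archive/u/myplace/bigfile
cp /archive/u/myplace/bigfile /discover/nobackup/myplace/
```

Submit it with `sbatch` as you would any other job script.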
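The wall-time rule of thumb (2 seconds per gigabyte, then doubled) is easy to compute in the shell; the helper name and the 300 GB example size below are hypothetical:

```shell
# Recommended wall-time in seconds: 2 s per GB, multiplied by 2 for good measure.
walltime_seconds() {
  local size_gb=$1
  echo $(( size_gb * 2 * 2 ))
}

walltime_seconds 300   # a 300 GB file -> 1200 s (20 minutes)
```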
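The interactive route might look like the session below. Only salloc's queue messages survive in the article, so the exact salloc invocation (partition name, -t flag) is an assumption; check your site's documentation:

```shell
# Request an interactive shell on a datamove node for one hour (-t 1:00:00).
$ salloc --partition=datamove -t 1:00:00
salloc: job 8164701 queued and waiting for resources

# Once the shell starts on the gateway node, stage and copy the file.
$ dmget /archive/u/myplace/bigfile
$ cp /archive/u/myplace/bigfile /discover/nobackup/myplace/
```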
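The three-job workflow (stage in, compute, stage out) can be chained with Slurm's dependency support as sketched below; the script names are hypothetical:

```shell
# Submit stage-in, compute, and stage-out jobs so that each waits for the
# previous one to finish successfully (afterok). --parsable makes sbatch
# print only the job ID, which we capture for the next submission.
in_id=$(sbatch --parsable --partition=datamove stage_in.slurm)
run_id=$(sbatch --parsable --dependency=afterok:${in_id} compute.slurm)
sbatch --partition=datamove --dependency=afterok:${run_id} stage_out.slurm
```

With afterok, a failed stage-in leaves the compute job pending rather than running against missing data.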