Accessing files on Data Den


Globus Server Endpoint

Data Den supports the use of Globus servers to provide high-performance transfers, data sharing, and access to Data Den from off campus. To access Data Den via Globus, request that your volume be added to Globus.

Bundling Files

Because of the design of Data Den, projects will often need to be bundled into larger single-file archives. The most common tool for this is tar. Tar can optionally compress the data as well, but compression can take much longer.

The following command bundles all files in directory, stores them in the file bundle.tar.bz2, and compresses the archive with bzip2. It also creates a small text file, bundle.index.txt, that can be stored alongside the bundle to quickly look up which files it contains.

tar -cvjf bundle.tar.bz2 directory | tee bundle.index.txt

To extract the bundle:

tar -xvjf bundle.tar.bz2

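Optionally, omit -j to skip compression and save time, and omit -v to stop tar from printing each file name as it runs. Note that bundle.index.txt is built from the -v output, so omitting -v when creating the bundle leaves the index empty.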
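The index file makes it possible to find a file by name and then pull only that file out of the bundle. The following is a minimal sketch, assuming GNU tar and the bundle and index names used above; the function names find_in_index and extract_member are hypothetical and not part of any ARC tooling.

```shell
#!/bin/sh
# Hypothetical helpers for working with a bundle and its index file.

# Print the index lines that contain the given text (fixed string, not a regex).
find_in_index() {
    # $1 = index file, $2 = text to look for
    grep -F -- "$2" "$1"
}

# Extract a single member from a bundle into the current directory.
# GNU tar detects the compression type on its own when reading a file,
# so the same call works for bundle.tar and bundle.tar.bz2.
extract_member() {
    # $1 = bundle file, $2 = member path exactly as it appears in the index
    tar -xf "$1" "$2"
}
```

For example, find_in_index bundle.index.txt results.csv shows the full member path, which can then be passed to extract_member bundle.tar.bz2 as the second argument.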

Compressing an archive can be accelerated on multi-core systems using parallel compressors such as pigz (for gzip) and lbzip2 (for bzip2). The following will work on all ARC systems:

tar --use-compress-program=lbzip2 -cvf bundle.tar.bz2 directory | tee bundle.index.txt

To extract the bundle:

tar --use-compress-program=lbzip2 -xvf bundle.tar.bz2
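On machines outside ARC, lbzip2 may not be installed. The sketch below picks lbzip2 when it is available and otherwise falls back to plain bzip2; both write the same .bz2 format. The function name pick_bzip2 is hypothetical.

```shell
#!/bin/sh
# Print the name of the bzip2-compatible compressor to use:
# lbzip2 (parallel) if it is on the PATH, otherwise plain bzip2.
pick_bzip2() {
    if command -v lbzip2 >/dev/null 2>&1; then
        echo lbzip2
    else
        echo bzip2
    fi
}

# Example use with the bundling command from above:
#   tar --use-compress-program="$(pick_bzip2)" -cvf bundle.tar.bz2 directory | tee bundle.index.txt
```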

Storage Resource Software

If you are unsure which of our storage services should host your data, we have written software that you can download and run to analyze your files. It reports how much of your data is stored in large files, how much has been accessed recently, and the distribution of file sizes and access times. Use the Storage Guidance Tool to find file statistics and get a storage recommendation.

This software does not examine the contents of any data files; it only scans file attributes. It also does not store any file names after searching through the filesystem.

If you have any questions about this software, or are unsure about any of the recommendations the tool gives you, email us at arcts-support@umich.edu.

Data Den Policies


Policies

Small File Limitation and Optimal Size

Data Den’s underlying technology does not work well with small files. Because of this design, each 1 TB of Data Den capacity allows only 10,000 files, and only files of 100 MByte or larger are migrated. The optimal file size ranges from 10 to 200 GBytes.
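Before uploading, it can be useful to check how many files in a project fall below the 100 MByte threshold and are therefore candidates for bundling. The following is a minimal sketch; the function name count_small_files is hypothetical, and it assumes "100 MByte" means 100 × 1024 × 1024 bytes, which the policy text does not specify.

```shell
#!/bin/sh
# Count regular files under a directory that are smaller than a byte threshold.
# Assumption: the default threshold is 100 * 1024 * 1024 = 104857600 bytes.
count_small_files() {
    # $1 = directory to scan, $2 = threshold in bytes (optional)
    limit=${2:-104857600}
    # With the "c" suffix, find compares exact byte counts.
    find "$1" -type f -size "-${limit}c" | wc -l
}
```

Running count_small_files directory prints the number of files under the default threshold. A large count suggests bundling the directory with tar before upload.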

Maximum File Size

The maximum file size is 8 TByte, but files should optimally not be larger than 1 TByte. Larger archives can be split before uploading to Data Den with the split -b 200G filename command.
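Pieces produced by split must be joined again, in order, after they are recalled. The following is a minimal sketch of both directions; the function names and the .part. suffix are illustrative choices, not an ARC convention.

```shell
#!/bin/sh
# Split a large archive into fixed-size pieces named <file>.part.aa, <file>.part.ab, ...
split_archive() {
    # $1 = file to split, $2 = piece size (optional, defaults to 200G as above)
    split -b "${2:-200G}" "$1" "$1.part."
}

# Rebuild the original file from its pieces. The shell expands the glob
# in sorted order, which matches the order in which split wrote the pieces.
join_archive() {
    # $1 = name of the original file
    cat "$1".part.* > "$1"
}
```

Recording a checksum of the original file (for example with sha256sum) before splitting, and comparing it after joining, confirms that the pieces were reassembled correctly.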

Sensitive Data — ePHI/HIPAA/ITAR/EAR/CUI

Data Den does not currently support PHI or the other sensitive data types listed above. It is scheduled to be reviewed for support at a later date.

System Abuse

Abuse of Data Den, whether intentional or not, may result in performance or access being limited in order to preserve performance and access for other users. Abuse generally takes the form of excessive recalls of data that is better suited to active storage. If this happens, staff will contact the affected users to engineer a more appropriate solution.

Getting Started

Data Den Research Archive Storage moves to ITS Service Request System

The ability to request Data Den Research Archive Storage services is moving from Minder to the ITS Service Request System (SRS). You should begin placing orders for Data Den Research Archive Storage through the SRS on July 1, 2020.

ARC will transfer your Shortcode(s) to the SRS. Contact arcts-support@umich.edu if you wish to review your information or make changes.

Benefits of the SRS include:

  • You can be more involved in the process
  • User-friendly interface with easy-to-use workflows
  • Automatic validation of Shortcodes
  • With authorization, you can:
    • Place orders and review storage
    • View comprehensive reports
    • Review and update billing

Get started

Use your UMICH (Level-1) credentials and Duo two-factor authentication to log in via the ITS Service Request (SRS) Portal or the ITS General Computing provision page.

SRS support

Information about the SRS and how to use it can be found throughout the SRS system.

Here are some additional resources:

Med School Researcher Storage Information

Researchers in the Medical School can take advantage of free or subsidized storage options through their respective academic units.

If you have any questions, please contact us at arcts-support@umich.edu.

What is Data Den?

The rate for the Data Den service starting on August 1, 2020, will be $39.96 per replicated terabyte (TB) per year ($3.33/TB/month).
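As a rough illustration of the arithmetic, the sketch below estimates a yearly charge from the rate stated above. The function name yearly_cost is hypothetical, and the sketch assumes the charge scales linearly with capacity; actual billing increments are not described here.

```shell
#!/bin/sh
# Estimate the yearly Data Den charge for a number of replicated terabytes,
# using the $39.96 per TB per year rate quoted above.
# Assumption: the charge is simply terabytes multiplied by the rate.
yearly_cost() {
    # $1 = terabytes (may be fractional)
    awk -v tb="$1" 'BEGIN { printf "%.2f\n", tb * 39.96 }'
}
```

For example, yearly_cost 10 prints the estimated charge in dollars for 10 TB.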

Data Den is a service for preserving electronic data generated from research activities. It is a low-cost, highly durable storage system and is the largest storage system operated by ARC.

Data Den is a disk-caching, tape-backed archive optimized for data that is not regularly accessed for extended periods of time (weeks to years). Data Den is not meant to replace active storage services like Turbo and Locker. It is best used for data that has been set aside and is not being used at all, but still needs to be stored. Data Den is best accessed through the Globus data management sharing system, which moves data into and out of the service. Data Den is only available in a replicated format.

Data Den can be part of a well-organized data management plan providing international data sharing, encryption, and data durability.