Data Archival and Curation

All hardware is connected via a 10 Gbps Science DMZ, a dedicated high-volume data transfer network operated exclusively to support experimental and computational research. Data archival is fully automated: the data acquisition systems (e.g., PXI units) save data to local storage, which is continuously backed up to long-term replicated storage at the data center through an ownCloud platform, a Dropbox-like interface built on the Globus GridFTP transfer service.
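As a rough illustration of how an acquisition host could push a newly written data file into the ownCloud store, the sketch below issues a WebDAV PUT against ownCloud's standard remote.php/webdav endpoint. The server URL, credentials, file name, and directory are placeholders; in practice the transfer is handled automatically by the ownCloud sync client rather than by hand-written scripts.

import pathlib
import requests

OWNCLOUD_URL = "https://owncloud.example.edu/remote.php/webdav"  # placeholder server
AUTH = ("lab_user", "app-password")                              # placeholder credentials

def archive_file(local_path: str, remote_dir: str = "pxi_runs") -> None:
    """Upload one locally acquired data file to the replicated ownCloud storage."""
    name = pathlib.Path(local_path).name
    data = pathlib.Path(local_path).read_bytes()
    resp = requests.put(f"{OWNCLOUD_URL}/{remote_dir}/{name}",
                        data=data, auth=AUTH, timeout=60)
    resp.raise_for_status()  # ownCloud returns 201 on first upload, 204 on overwrite

if __name__ == "__main__":
    archive_file("run_0042.tdms")  # hypothetical PXI acquisition file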

The long-term replicated storage hardware resides in the same data center as the university’s 150-teraflop HiPerGator supercomputer. System specifications include 16,284 CPUs, 2.88 petabytes of shared disk space, and a Mellanox FDR 56 Gbps InfiniBand compute interconnect. The Linux-based system runs the same scheduling platform (Moab/TORQUE) as the Texas Advanced Computing Center’s Stampede supercomputer and supports a wide variety of software libraries and compiler suites (Intel and GNU). Users can run real-time scripts (in MATLAB, Python, etc.) to process data on the fly, eliminating the wait between acquiring results and reviewing them or adjusting an experiment. The storage solution can also automatically archive data to offsite locations (through WebDAV), so users never need to transfer data manually.
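The following is a minimal sketch of the kind of on-the-fly processing script described above: a Python loop that polls an incoming-data directory on the shared file system and handles each new file as it arrives. The directory path, file extension, and processing step are hypothetical stand-ins for a real analysis routine.

import time
from pathlib import Path

INCOMING = Path("/orange/lab/incoming")  # hypothetical shared-storage directory
seen = set()

def process(data_file: Path) -> None:
    """Placeholder analysis step; a real script would run the lab's own routines here."""
    print(f"processing {data_file.name} ({data_file.stat().st_size} bytes)")

while True:
    for data_file in INCOMING.glob("*.tdms"):  # newly arrived acquisition files
        if data_file not in seen:
            process(data_file)
            seen.add(data_file)
    time.sleep(5)  # poll every few seconds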