08 Nov Philip Lijnzaad shares his HPC experiences
The High-Performance Computing (HPC) facility speeds up computational tasks of any size. The facility has proven to be very stable, and is maintained by a group of expert and friendly admins. Stop running computational tasks on your own PC, and benefit from the HPC facility. Philip Lijnzaad already does so and shares his experience.
The availability of a High-Performance Computing (HPC) facility is an absolute necessity for academics who work with large datasets. The facility's FairShare policy ensures that users who consume few resources automatically get higher priority. Even when tens of thousands of computing jobs are waiting, which is not unusual, small jobs can run within a few minutes.
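The idea behind a fair-share policy can be illustrated with a minimal sketch. It is not the facility's actual scheduler configuration: the `Job` class, the `fair_share_order` function, the user names and the usage figures are all hypothetical, and real schedulers weigh more factors than recent usage alone. The sketch only shows why a light user's job can jump ahead of a long queue.

```python
from dataclasses import dataclass


@dataclass
class Job:
    owner: str  # hypothetical user name
    name: str   # hypothetical job name


def fair_share_order(pending, recent_cpu_hours):
    """Order pending jobs so that owners with less recent usage run first.

    Assumption: priority depends only on the owner's recent CPU hours.
    sorted() is stable, so jobs of one owner keep their submission order.
    """
    return sorted(pending, key=lambda job: recent_cpu_hours.get(job.owner, 0.0))


# Hypothetical queue: one heavy user has 10,000 jobs waiting,
# and a light user submits a single job after all of them.
pending = [Job("heavy_user", f"align_{i}") for i in range(10_000)]
pending.append(Job("light_user", "map_sample"))

# Hypothetical recent usage in CPU hours.
recent_cpu_hours = {"heavy_user": 50_000.0, "light_user": 400.0}

queue = fair_share_order(pending, recent_cpu_hours)
print(queue[0].owner, queue[0].name)
```

In this simplified model the light user's job moves to the front of the queue even though it was submitted last, which is the behaviour the policy is meant to produce.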
“This data is so voluminous that computing in the cloud is currently not an option. Apart from confidentiality issues, it would simply take too long to transfer data to and from the cloud.” Philip Lijnzaad talks about his research in molecular biology, in which data from next-generation sequencing (NGS) methods play a prominent role. Philip is a senior bioinformatician at the Princess Máxima Centre for Pediatric Oncology, working in the Holstege group.
Efficient use of time and hardware
“My main usage stems from mapping: finding the NGS sequences, typically a few million, back in a reference data set. My previous work with yeast used ChIP-seq data to pinpoint protein-DNA binding locations in the genome, and MNase-seq data to determine nucleosome positions. Currently I am analysing data from human single-cell RNA sequencing experiments. Both the yeast genome and the human transcriptome are relatively small. Since the computational tasks scale with the size of the reference data set, I am only a minor user of the facility now. This year to date, I have run around 6,000 jobs, using 400 CPU hours in total. It would have been cumbersome to do this on a small server or a stand-alone desktop computer. Most of the computational burden occurs at the time the data become available, and also at the time a new data analysis idea is hatched. With the HPC facility, one group’s peak activity can take place in another group’s off-peak time, making much more efficient use of time and hardware.”
For more information about the current HPC facility, see the HPC facility wiki or contact Jeroen de Ridder.