BB: When it comes to eResearch infrastructure as a scientist what I really want is to be left alone in my office without any help, using a rock-solid HPC service, through a high performance network that offers peak performance to my desk, and that is reliable, always available, and cost effective. As a shared infrastructure, NeSI is not there yet in terms of reliability and availability; however, it is a young organisation and is improving. REANNZ is more mature and therefore performs more reliably, yet the network connection from the edge of the university to my desk is not always of the same quality.
BB: We don’t have a long tradition of running shared eResearch infrastructure facilities here in New Zealand, and that shows in the quality of services currently available. Unforeseen downtime and unplanned outages still seem quite common in some of the new facilities, and it is important for this to improve for science users who need reliable performance for weeks at a time. This means that investment in people who are trained and skilled in offering high quality eResearch services is vital if asset investments are to be fully leveraged and engaged with.
BB: Considering data infrastructure and storage, I believe a balanced approach between onshore and offshore resources is required. There is no way for an NZ provider to compete with the major offshore data infrastructure providers such as Amazon. Where it makes sense to go with a good deal from an overseas provider, we should go for it.
BB: From an HPC perspective, I can see high capability systems (such as those at NIWA) remaining onshore; however, I suspect that a large fraction of NZ’s current HPC work could go overseas with very little disruption, especially work that does not require large amounts of RAM and/or fast inter-node communication. While issues such as data security and code support could be potential barriers, New Zealand research will ultimately make extensive use of internationally based services for high capacity systems.
BB: In developing people and talent in eResearch, having access to the physical hardware boxes is relatively unimportant. Software and software-defined systems are becoming more central to the future of eResearch. While we do need some hardware in NZ in the capability compute space, investing in people in NZ is far more important than investing in hardware.
BB: Different disciplines have very different needs from HPC. For example, in genomics there is a requirement to analyse large datasets in a repetitive fashion, so it’s important that the data be closely associated with the computation. In computational physics (and perhaps chemistry) we often have relatively small datasets (or at least small inputs/outputs) but perform extremely complex calculations; thus we have quite different economics when it comes to data logistics and the type of HPC support we are looking for. Again it comes back to people: we need people trained in how to deal with data and remote/cloud computing.
BB: Traditional eResearch communities have been around for a long time – the geeks of science – where the need was for a terminal that offered direct access to the inner workings of the computational system you were using. Understanding, manipulating and optimising the computer system was fundamental to the research you were trying to do. In the last decade or so we now see new eResearch users appearing from different disciplines (humanities, genomics) who are looking at eResearch infrastructure and resources in quite a different way. Increasingly these people want eResearch as a service. Many of these users are demanding and developing new ways of engaging with the computational system or the instruments to get their desired results.
BB: The expected NeSI appointment of an application specialist at Otago will play a very important role in catalysing new users into the shared national infrastructure. This will increase the awareness and understanding of what’s available, how to access it, and how it can advance our research.
BB: When it comes to accessing large instruments that they understand, researchers are more willing to go off their own campuses, but again their needs are usually for reasonably charged, timely access to instruments, shared or otherwise. Consolidated infrastructure only works for scientists if it is up and available, reliable, and reasonably priced.
BB: The move towards Open Data has potential to create considerable additional overhead for researchers, especially if the requirement to be “open” means data must be in particular formats or frameworks. In addition, data preservation needs to be reasonable – storing all data created is unlikely to be necessary. For example, in computational physics we may run a simulation that produces terabytes of data, which we assemble to produce a “movie” of a few hundred megabytes. It is the movie we analyse to determine the quality of the simulation, and what variables to manipulate, then we run another simulation. Preserving the movie might be worthwhile; preserving the underlying data is probably not.