Summary: As part of an ongoing dialog about which enhancements to UCI's Research Cyberinfrastructure are needed, OIT has created this straw-man proposal for review and comment. Please send feedback to dana.roode@uci.edu.
As a top-ranked academic institution, UCI continues to make impressive progress in building its reputation for the discovery and dissemination of knowledge through excellence in research, teaching, and creative expression. UCI’s most notable distinctions in research include its 3 Nobel Laureates, 24 National Academy members, and 33 American Academy of Arts and Sciences members.
As UCI continues to grow into the next decade, researchers will expect, and indeed deserve, unfettered access to an appropriate level of advanced research Cyberinfrastructure (CI). In the present context, CI refers to the coupled integration of networking, storage, compute cycles, software, and professional support for the construction and execution of advanced computations, archiving and analyzing data, the publication of scientific results, and grant writing.
UCI’s CI needs encompass a spectrum of hardware, software, and services ranging from the mundane to the extreme. Our current research CI is a patchwork of capabilities, and appropriate enhancements are needed to accommodate research needs. Requiring researchers to pay for upgrades or, alternatively, to establish their own CI independently is not cost-effective and is a continuing source of frustration and concern for many of our faculty.
A brief summary of the CI support that OIT currently provides to academic units, departments, research groups, individual researchers, and others is available here: http://www.nacs.uci.edu/fac_staff.html
In prioritizing CI needs, we seek high-impact, low-cost opportunities that leverage economies of scale, are sufficiently flexible to adapt to rapidly changing technology, and are sustainable, in part, by direct recharge.
Some examples of these opportunities include: providing professional assistance and programming support, providing access to a variety of high-performance computational systems and large-data storage, implementing a UC-wide CI grid, and expanding the campus Academic Data Center. Advanced networking is omitted from this proposal, as it is largely addressed through ongoing network upgrade efforts.
1. Professional Assistance and Programming Support: Researchers should be free to focus their efforts on the discovery and dissemination of knowledge, not on evaluating hardware and software tradeoffs or devising methods to interact with increasingly complex data sets. Conversations with faculty researchers and IT-support personnel indicate a recurring, unmet need for access to discipline-specific consultants and programming-support personnel who could provide advice, guidance, and direct support for the effective use of research software, advanced programming techniques and methodologies, and the development of large databases and information repositories. Researchers engaged in compute-intensive research would benefit through increased productivity and more efficient use of scarce research-support resources. We propose adding graduate student consultants who will help provide a variety of programming support services to the research community. An OIT specialist would provide additional advice and assistance and would facilitate the activities of the student programmers. Short-term consulting would be available to the campus free of charge; long-term research projects would be sustained via recharge or grant support.
Initial Action: Hire 1 FTE Research Computing Specialist and 2.5 FTEs as graduate student programmers. Estimated budget: $140,000, recurring.
2. Data Storage: Many researchers desire access to a well-managed, large-capacity, hierarchical storage infrastructure for archiving data. This infrastructure would provide primary file storage, data-sharing services, and short-term backup storage. The basic building blocks would leverage cost-effective commodity hardware and open-source software, and be arranged as single-unit building blocks, or “bricks”. The storage bandwidth (read and write) would range from ~2 MB/s to ~200 MB/s, depending on locality and protocol. If needed, individual groups, departments, or academic units could add storage capacity to this infrastructure to meet their evolving needs. System administration would be provided by academic-unit personnel, OIT personnel, or a combination of the two. Components for a prototype storage system are currently being tested in OIT: http://www.nacs.uci.edu/rcs/storage/storage_project.html
Initial Action: In collaboration with academic units, specify configurations for a nominal 16 TB “Storage Brick”, evaluate commercial vs. commodity hardware/software tradeoffs, and stress test components against user requirements. Estimated budget: $70,000, non-recurring.
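To put the quoted bandwidth range in perspective, the following minimal sketch estimates how long it would take to fill or read back one nominal 16 TB brick at each end of the ~2 MB/s to ~200 MB/s range. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a sustained transfer rate with no protocol overhead; the function name transfer_seconds is hypothetical and used only for illustration.

```python
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes)
# and a constant, sustained transfer rate with no protocol overhead.
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def transfer_seconds(capacity_tb: float, bandwidth_mb_per_s: float) -> float:
    """Return the seconds needed to move capacity_tb at the given bandwidth."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return capacity_tb * BYTES_PER_TB / (bandwidth_mb_per_s * BYTES_PER_MB)


brick_tb = 16  # nominal "Storage Brick" capacity from the proposal
for rate in (2, 200):  # low and high ends of the quoted bandwidth range
    seconds = transfer_seconds(brick_tb, rate)
    print(f"{rate:>3} MB/s: {seconds / 3600:,.1f} hours ({seconds / 86400:.1f} days)")
```

Under these assumptions, a full brick moves in roughly 22 hours at ~200 MB/s but takes about 93 days at ~2 MB/s, which suggests that locality and protocol would matter a great deal for bulk archiving and backup.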
3. Compute Cluster Support: For the past 5 years, campus researchers have enjoyed access to the UCI Medium Performance Computing (MPC) Cluster. The MPC was started with a modest investment from the School of Physical Sciences and OIT. MPC provides a high-speed parallel-execution compute environment for researchers to prove and sustain their high-performance computing needs. Researchers can become MPC partners by adding capacity to the MPC cluster in exchange for system administration and co-location in the campus Academic Data Center (see item 5 below). MPC currently has in excess of 300 nodes, around a dozen partners, and several dozen active user accounts belonging to faculty, post-doctoral researchers, and graduate and undergraduate students. For the last several years, MPC has had only a part-time administrator, and it has been a challenge to provide a robust quality of service. Moreover, the MPC hardware has become dated and its performance has diminished relative to today’s standards. Alternate cluster support models, such as standardized standalone cluster support, should also be made available to UCI researchers to help address their individual needs. Central cluster support must also facilitate the establishment of a planned UCI/UC grid to maximize the cycles available to researchers.
Initial Action: Hire 1 FTE system administrator, with grid-computing expertise, to supplement current OIT MPC staff. In addition, replace 32 CPUs each year in MPC as an ongoing investment. Estimated budget: $120,000, recurring.
4. Grid Computing: System-wide discussions are investigating the establishment of a unified, system-wide grid framework consisting of storage, compute, and software resources, all accessible through a middleware framework called the “UC Grid”. The UC Grid middleware is intended to authenticate users based on their campus network IDs (e.g. UCInetID) and provide them access to a community of grid systems maintained by personnel distributed across the UC. With the UC Grid, UCI researchers would be able to add their CI resources to the Grid and both provide access to, and receive access from, other Grid collaborators. The overall benefit would be to increase the compute cycles available to researchers at any specific time, while increasing the utilization of systems distributed across the UC. Our UCLA collaborators have developed a prototype middleware framework for this Grid: http://www.ats.ucla.edu/news/default.htm, and UCI is exploring how we can participate in this opportunity. The key elements needed to implement this capability have been identified and price estimates have been established.
Initial Action: Purchase grid appliances and network switch components needed to provide cluster access control and hire 1 FTE to modify and develop software for the UC Grid. Estimated budget: $100,000, recurring.
5. Academic Data Center Expansion: The UCI Academic Data Center (ADC) is a unique campus resource, providing high-quality electrical power, air-conditioning, and security to campus units, researchers, and other academic entities for co-location of their CI resources. The ADC is administered for the campus by OIT and is operating close to its capacity in terms of space, power, and cooling. Expanding the capacity of the existing ADC is a less costly alternative to constructing new facilities to accommodate campus growth. If the adjacent OIT instructional labs were relocated elsewhere on campus, the ADC floor space could be approximately doubled. Power and cooling upgrades would allow us to accommodate the increased demands of new co-location clients, who are purchasing clusters of ever-increasing power density. The main features of the current ADC are provided here: http://www.nacs.uci.edu/computing/AcademicDataCenter.html.
Action: Expand the ADC floor space in phased developments, by moving OIT instructional labs to other locations on campus, augmenting the power-distribution system with additional capacity (UPS, backup generator, etc.), and increasing the HVAC capacity. Estimated budget: to be determined.