UoM::RCS::User Documentation


Access to UoM-Related Computation Grid Systems Made Easy (Yes, Really)

1. Overview

What do you mean by a Grid system?
Wikipedia defines Grid computing at great length. Here we are simply concerned with:
  • Clusters of HPC clusters of Linux machines — otherwise standard clusters are linked into "super-clusters" by means of Globus middleware. Globus helps make the cluster of clusters into one big HPC resource.
Authentication (login) to such clusters is usually, and often exclusively, through X.509 certificates. From the user perspective, these are very much like traditional SSH keys (though there is more going on in the background).
Be specific, what systems are we talking about?
University of Manchester staff and research students may apply for access to:
  • The National Grid Service (NGS), which consists of four main clusters, at the Universities of Leeds, Manchester and Oxford and at the Rutherford Appleton Laboratory; there are also numerous affiliate clusters at other sites around the UK.
  • The North-West Grid (NW-Grid), which consists of clusters at the Universities of Lancaster, Liverpool and Manchester and at the Daresbury Laboratory.
Details of both systems are given below [Section 2.].
Why should I care? How much does it cost?
Both the NGS and NW-Grid offer significant HPC resources which are free at the point of use and for which the application procedure [Section 5.] is lightweight.
How do I run my jobs?
Using the computational Grid systems at UoM mentioned below [Section 2.] you can:
  • run your computational jobs, unchanged, in the traditional way, if you so wish;
  • but you can also do much more [Section 6.].
To use the systems you need two things
In order to authenticate (log in) to these grid systems, each user will need:
  • an e-Science certificate, which from the user perspective is similar to a traditional SSH key;
  • Java and either the GSI-SSHTerm Java application or an installed implementation of the Globus client software. Only GSI-SSHTerm is covered here, since it is easy to use, simple to install and works just like any other terminal application (cf. xterm).
Why can't standard SSH access be made available?

2. Grid HPC Resources Available through RCS

2.1. NGS: the National Grid Service

The four core clusters comprise:

The only way to access the NGS is as described below using an e-Science certificate and gsi-sshterm [Section 4.] (or, for more advanced users, via the Globus client).

2.2. NW-Grid: The North-West (England) Grid

  • The current clusters are based on dual-processor, dual-core Opteron nodes with at least 8 GB of memory per node, plus some larger nodes with 16 GB and 32 GB: Daresbury has 96 nodes, Lancaster 48, Liverpool 104 and Manchester 25.
  • A significant upgrade to the clusters at both Lancaster and Liverpool is underway (2008/April), and Manchester has recently purchased 16-core, 32 GB RAM nodes, which are expected to be in service in May 2008.
  • A dedicated fibre link exists between the four sites, which will soon directly connect all compute nodes.
  • Finally, Daresbury has recently purchased an IBM BlueGene-L with 2048 cores, which will be accessible as part of NW-Grid.

For University of Manchester users, access to the Manchester cluster via traditional OpenSSH is available. Access to the other clusters requires an e-Science certificate and GSI-SSHTerm [Section 4.] (or, for more advanced users, the Globus client).

2.3. UoM Campus Grid


3. UK e-Science Certificates

What's a UK e-Science certificate?
A UK e-Science certificate is an X.509 certificate issued by the UK e-Science Certification Authority (CA). If you want to know the details, Wikipedia describes these well; if you don't and simply want to use one, think of it as something similar to a traditional passphrase-protected SSH key with which you can authenticate to certain systems.

(There are significant differences, however: certificates are issued by a certification authority and, like a driver's licence, can expire or be revoked.)
Why should I get one?
Because one is required to gain access to many "grid"-based HPC clusters on which you are entitled to an account.
How do I get one?
Unfortunately, the process is a little long-winded. In outline:
  1. Request a certificate from ca.grid-support.ac.uk.
  2. Meet, face-to-face, a representative of the UoM Registration Authority (RA) in order to verify your identity.
  3. Assuming the RA representative approves your application, you will receive an email indicating that your certificate is ready for download using your Web browser.
  4. Following the instructions in the email, download the certificate into your browser; then export/save it into a file named (for example) eSciCA.cert. The browser will add the suffix .p12, indicating PKCS12 format. Your browser will prompt you for a password, which is used to protect the file. Do not forget this password!
For more details visit the dedicated Web page.
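
Once the certificate has been exported, it can be worth checking that the file is readable and seeing when the certificate expires. The following is only a sketch: it assumes the openssl command-line tool is installed, and the function name show_certificate_summary is made up for this example.

```shell
# Print the subject and expiry date of the user certificate held in a
# PKCS12 file. openssl prompts for the password chosen at export time.
show_certificate_summary() {
    p12_file=$1
    # Extract the client certificate only (no private key) and hand it
    # to "openssl x509" for display.
    openssl pkcs12 -in "$p12_file" -clcerts -nokeys |
        openssl x509 -noout -subject -enddate
}

# Example (file name as suggested in step 4 above):
# show_certificate_summary "$HOME/eSciCA.cert.p12"
```

If the password is wrong or the file is damaged, openssl reports an error instead of printing the subject and end date.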
How do I use it?
Your certificate is used with GSI-SSHTerm to authenticate (login) to grid-based compute resources.

4. Getting access the easy way: GSI-SSHTerm

Commonly, to gain access to computational grids based on the Globus middleware stack, users download and install the Globus client software. This is not easy since Globus is no longer included in any major Linux distribution.

The Globus client software includes a version of SSH, gsissh, which can use e-Science certificates for authentication and also "real" grid tools which facilitate the submission and running of jobs without ever "logging in" to the compute resource.

GSI-SSHTerm, described here, is a standalone Java application which combines both virtual terminal functionality (cf. xterm) and that of gsissh. This is sufficient to access these computational grids and submit jobs, though not to make full use of their capabilities [Section 6.].

Prerequisites

  1. First ensure your e-Science certificate is installed (correctly!) as described above.
  2. Ensure you have an up-to-date Sun Java runtime-environment (JRE) installed on your machine, e.g.,
        apt-get install sun-java6-jre
        apt-get install sun-java6-bin
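
To confirm which Java runtime will actually be used, look at what java -version reports. The helper below is an illustrative sketch, not part of any of the tools described here: the function name is invented, and it assumes that the GCJ runtime identifies itself with "gij", "gcj" or "libgcj" and that the Sun JRE mentions "HotSpot". Check the real output on your own machine.

```shell
# Classify the text printed by "java -version" (read from standard input).
# Prints "gcj", "sun" or "unknown". The strings matched here are
# assumptions about what each runtime prints, not a guaranteed interface.
classify_java_runtime() {
    version_text=$(cat)
    case "$version_text" in
        *gij*|*gcj*|*libgcj*) echo "gcj" ;;
        *HotSpot*)            echo "sun" ;;
        *)                    echo "unknown" ;;
    esac
}

# "java -version" writes to standard error, hence the 2>&1:
# java -version 2>&1 | classify_java_runtime
```

A result of "gcj" suggests that the wrong java is first on your PATH.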

gridproxyinit and GSI-SSHTerm

  1. Download gridproxyinit.jar.tar.gz, a Java-based certificate proxy, required to help GSI-SSHTerm use your certificate, and unpack it:
        prompt> tar xzvf gridproxyinit.jar.tar.gz
            #
            # ...creates a directory "gridproxyinit" and extracts files into this new
            #    directory...
            #
  2. Download jws.jnlp (local mirror), or visit the NGS to get it.

    This file relates to the Java Network Launch Protocol; it is used to automatically download required Java .jar files and then start the GSI-SSHTerm Java application (see below).
  3. Invoke/run gridproxyinit-run.jar:
        prompt> cd gridproxyinit
        prompt> java -jar ./gridproxyinit-run.jar
            #
            # ...ensure this is the Sun JRE "java", not e.g. that from GCJ,
            #    for example, 
            #        /usr/lib/jvm/java-6-sun/bin/java
            #    not
            #        /usr/lib/jvm/java-gcj/bin/java
    From the radio buttons, select PK12 (this is not the default) and under Options select the location of your certificate (e.g., /home/<username>/eSciCA.cert.p12), enter the password you used to protect it and click Create.
  4. Start the GSI-SSHTerm application using jws.jnlp:
        prompt> javaws jws.jnlp
    The first time you "run" jws.jnlp the GSI-SSHTerm application will be downloaded — a window displaying the name and publisher of the application, and download progress, will open — then you will see a Security Warning window open: click Run (assuming you trust the download!) and the GSI-SSHTerm application window should open after a few seconds.

    To open a connection to a cluster on which you have an account, click on File, New Connection, enter a hostname (e.g., man2.nw-grid.ac.uk) and click Ok.
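
Before running steps 3 and 4 it can save some confusion to check that both downloads are where you expect them. This is a minimal sketch which assumes that jws.jnlp was saved in the same directory in which gridproxyinit.jar.tar.gz was unpacked; the function name is hypothetical.

```shell
# Check that the files needed for steps 3 and 4 are in place.
# $1 = directory holding the unpacked "gridproxyinit" directory and
#      jws.jnlp (defaults to the current directory).
# Returns 0 if both files exist, 1 otherwise.
check_gsi_sshterm_files() {
    base_dir=${1:-.}
    missing=0
    for f in "$base_dir/gridproxyinit/gridproxyinit-run.jar" "$base_dir/jws.jnlp"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_gsi_sshterm_files . && echo "ready for steps 3 and 4"
```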

5. How do I apply for an account?

The application procedures for obtaining accounts on both the NGS and the NW-Grid are relatively lightweight. To apply for an account:

  • on the NW-Grid, follow the "User Registration" link (top right) of the NW-Grid home page;
  • on the NGS, follow the "Apply for Access" link on the NGS home page.

6. Getting more from Grid systems: what's the point of all this Grid stuff?

7. Installing Globus

The simplest way to install Globus client software on a Linux machine is to use the VDT distro of Globus. The steps are described at www.grid-support.ac.uk (Introductory Statement):

  1. Prerequisites: VDT uses a package-manager called Pacman, so the first step is to install that; this in turn requires Python (at least v2.3). Installation of Pacman is described in the VDT documentation. If Pacman does not find a suitable installation of Python, it attempts to download and build one as part of the install/setup process.
  2. As of 2007/March/15, the next step, Base VDT Globus Installation, contains some misleading instructions, so beware:
    • When telling Pacman about the repository, ensure there is no space after cache:, i.e., pacman -cache:http://www.cs.wisc.edu/vdt/vdt_121_cache.
    • The command pacman -get Globus is wrong; it should be pacman -get http://vdt.cs.wisc.edu/vdt_1310_cache:GSIOpenSSH.
  3. Bearing in mind the above two points, install Globus as described. Then, also as described, extract the user certificate (public key) and private user key from your UK e-Science certificate for use with Globus. That page goes on to describe extracting a host certificate and key; these are not required for a Globus client and can only be extracted from a requested/downloaded server/host certificate. The UK Grid Support site includes alternative descriptions and explanations of the extraction procedure which may prove clearer: Installing your e-Science Certificate and Private Key (part of Setting up Your User Environment), or Step 2 in Preparing your User Certificate for use by Globus Toolkit.
  4. Optional Extras: this step describes how to install GSI-SSH, a customised version of OpenSSH which can use certificates for authentication. GSI-SSH is not strictly necessary for use of the NGS or NW-Grid, but it is very commonly used, so install it. (Installation of GITS is also mentioned; this can be ignored.)
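
The extraction mentioned in step 3 is commonly done with openssl. The sketch below is not taken from the pages referred to above. It assumes the conventional Globus file names usercert.pem and userkey.pem and the usual permissions (644 and 400), and the function name is invented for illustration.

```shell
# Split a PKCS12 certificate file into the two PEM files used by Globus.
# $1 = PKCS12 file, $2 = destination directory (conventionally ~/.globus);
# any further arguments are passed to openssl unchanged.
extract_globus_credentials() {
    if [ $# -lt 2 ]; then
        echo "usage: extract_globus_credentials FILE.p12 DEST_DIR [openssl options]" >&2
        return 2
    fi
    p12_file=$1
    dest_dir=$2
    shift 2
    mkdir -p "$dest_dir" || return 1
    # User certificate (public part) only.
    openssl pkcs12 -in "$p12_file" -clcerts -nokeys \
        -out "$dest_dir/usercert.pem" "$@" || return 1
    # Private key only, created under a restrictive umask. openssl asks
    # for a passphrase with which to protect the extracted key.
    ( umask 077 && openssl pkcs12 -in "$p12_file" -nocerts \
        -out "$dest_dir/userkey.pem" "$@" ) || return 1
    chmod 644 "$dest_dir/usercert.pem"
    chmod 400 "$dest_dir/userkey.pem"
}

# Example:
# extract_globus_credentials "$HOME/eSciCA.cert.p12" "$HOME/.globus"
```

The private key is written read-only for its owner because Globus tools generally refuse to use a key file that other users can read.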

8. Globus Gateway Systems

9. Globus VMWare Images

A. Troubleshooting gridproxyinit-run.jar and GSI-SSHTerm

We need access to the Java console to see what's going on. Start GSI-SSHTerm using javaws's -viewer option:

  javaws -viewer jws.jnlp
This opens the Java Application Cache Viewer. Click on
  Edit --> Preferences --> Advanced
then double-click on Java Console and ensure the Show Console radio button is selected; click Apply and then OK. Back in the JACV window, select GSI-SSHTerm and click Launch Online. After a few seconds you should have three windows: the JACV, a Java Console and GSI-SSHTerm.

B. A Buglet With GSI-SSHTerm

Selecting File --> New Connection can fail, with the following exception shown in the Java console:

Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
	at uk.ac.rl.esc.browser.Browser.getProfiles(Browser.java:687)
	at uk.ac.rl.esc.browser.Browser.getBrowserList(Browser.java:940)
	at com.sshtools.sshterm.GSIAuthTab.setConnectionProfile(GSIAuthTab.java:170)
	at com.sshtools.common.ui.SshToolsConnectionPanel.setConnectionProfile(SshToolsConnectionPanel.java:133)
	at com.sshtools.common.ui.SshToolsConnectionPanel.showConnectionDialog(SshToolsConnectionPanel.java:191)
	at com.sshtools.sshterm.SshTerminalPanel.actionPerformed(SshTerminalPanel.java:882)
	at com.sshtools.common.ui.StandardAction.actionPerformed(StandardAction.java:148)

Tracing file opens with strace shows GSI-SSHTerm looking for a Firefox profile:

  victim> strace -f -e trace=open javaws jws.jnlp

  [pid 24257] open("/usr/lib", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 15
  [pid 24257] open("/root/.mozilla", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 15
  [pid 24257] open("/root/.mozilla/firefox/profiles.ini", O_RDONLY|O_LARGEFILE) = 15
  Process 24259 detached

The failure depends on whether a Firefox profile exists: after starting Firefox once and shutting it down, everything works; after rm -rf ~/.mozilla/firefox it fails again; after running Firefox once more it works again.
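
Given the above, a simple guard before launching GSI-SSHTerm is to test for the profile index file seen in the strace output. This is a sketch only: the function name is hypothetical, and it assumes that the presence of profiles.ini is what matters, which the observations above suggest but do not prove.

```shell
# Return success if a Firefox profile index exists under the given home
# directory (default: $HOME).
firefox_profile_present() {
    home_dir=${1:-$HOME}
    [ -f "$home_dir/.mozilla/firefox/profiles.ini" ]
}

# Example:
# firefox_profile_present || echo "Start Firefox once, close it, then retry javaws jws.jnlp" >&2
```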