Johannes Habich's Blog: Windows HPC

Topic

My activities at RRZE and HPC in general, CV

Recent posts

Tuesday, 1 December 2009

Windows HPC2008 Cluster Operational

Today the Windows HPC2008 Cluster of RRZE went into operation.

If you are interested in getting access to the system, contact hpc@rrze.uni-erlangen.de

Initial information on login and usage can be found here:

Windows HPC2008 Cluster Launch Slides

Thursday, 19 November 2009

Java: A quest with unattended installation

Some guidelines for unattended Java installation in a Win2008 HPC cluster environment:
  • Deactivate UAC on all nodes; otherwise the nodes will simply hang, waiting for a UAC confirmation that will never come. You can avoid this by doing the first Java installation by hand via an RDesktop login. Afterwards all subsequent unattended installations succeed. We currently have no clue why. Perhaps some kind of adaptive UAC?
    Best practice is to deactivate UAC via a registry key:
    %windir%\system32\reg.exe ADD HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System /v EnableLUA /t REG_DWORD /d 0 /f
  • Reboot the nodes so that the registry change takes effect.
  • Everything will now run smoothly, provided the user installing Java has logged in to the nodes at least once. With 20+ cluster nodes this is a problem, however. The basic point is that without a prior login no user directory has been created yet, and neither have the temp and AppData paths.
    Java kindly ignores any variable defined by the OS, e.g. TEMP or TMP, and picks its own temp directories, which leads to C:\Users\Username\AppData\LocalLow\Temp and many more.
    So the installation fails once more unless these directories exist.
    So you have to create them yourself:
    mkdir C:\Users\%USERNAME%\AppData\LocalLow\Temp\
  • After that the usual unattended JRE deployment should proceed.
Note that any previous login to the nodes, and any earlier Java installation on them, can change all of the behaviour described above.
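
The directory-creation step can also be scripted per user profile. What follows is only a minimal Python sketch: the function name ensure_java_temp_dir and its parameters are made up for illustration and are not part of any Java or Windows tooling, and the profile root (C:\Users on the nodes) is assumed to be passed in explicitly.

```python
from pathlib import Path


def ensure_java_temp_dir(users_root, username):
    """Create <users_root>/<username>/AppData/LocalLow/Temp if it is missing.

    users_root is assumed to be the profile root, e.g. C:/Users on a node.
    Returns the path of the Temp directory.
    """
    temp_dir = Path(users_root) / username / "AppData" / "LocalLow" / "Temp"
    # parents=True builds the whole chain of missing directories;
    # exist_ok=True makes repeated runs harmless.
    temp_dir.mkdir(parents=True, exist_ok=True)
    return temp_dir
```

On a node this would be called as ensure_java_temp_dir("C:/Users", "Username") before the unattended JRE installer is started. Whether LocalLow\Temp is the only directory Java insists on is not known; the post itself mentions "many more".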

Wednesday, 18 November 2009

Windows HPC 2008 Cluster Launch

RRZE recently extended its Windows high-performance computing resources.
Along with the upgrade to the latest release, Windows HPC Server 2008, the hardware has been upgraded significantly:

16 dual-socket nodes with hexa-core AMD Istanbul Opteron processors (Dell Blade Center enclosure), equipped with 32 GB of RAM, deliver a peak performance of 2 TFLOP/s.
[Image: AMD Istanbul die]
Interested users are invited to join the official launch on December 1st, 2009, in RRZE room 1.026.
After a quick tour of the new Job Scheduler, the main part is organized as a hands-on session where everyone can get comfortable with the new environment.

Registration via email to hpc@rrze.uni-erlangen.de is required to attend.

[Image: Dell Blade Server enclosure]

Designated trademarks and brands are the property of their respective owners

Monday, 5 October 2009

Ganglia 3.1.2 running as a service after all

With the help of srvany.exe from the Windows Server 2003 Resource Kit Tools you can run any executable as a service on Win2008 and Win2008 R2 as well.

First create a service that runs only srvany.exe:

sc create GMOND binpath= c:\programme\ganglia\srvany.exe

Edit the service settings in the registry under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\GMOND

Add a subkey named Parameters.
Inside Parameters, create a string value named Application.
Edit Application and put the call to Ganglia into the value data field, e.g.
c:\programme\ganglia\gmond.exe -c "c:\programme\ganglia\gmond-node.conf"

Start the service via the MMC services snap-in or with sc start GMOND and it should be running.

(There should also be a way to do this with the cygwin service creation tool cygrunsrv. Thanks to Nigel for pointing that out.)
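
The srvany setup can be put into a small script so that it can be repeated on every node. The sketch below is a hedged illustration, not the procedure actually used at RRZE: the helper name srvany_service_commands is hypothetical, and it uses reg add in place of editing the registry by hand. It only builds the argument lists and does not run anything.

```python
def srvany_service_commands(service, srvany_path, app_command_line):
    """Return the command lines (as argument lists) for a srvany-based service.

    service          -- service name, e.g. "GMOND"
    srvany_path      -- full path to srvany.exe
    app_command_line -- the real program plus its arguments
    """
    params_key = ("HKLM\\SYSTEM\\CurrentControlSet\\Services\\"
                  + service + "\\Parameters")
    return [
        # sc.exe expects "binpath=" and its value as two separate tokens.
        ["sc", "create", service, "binpath=", srvany_path],
        # reg add creates the Parameters subkey and the Application value
        # in one step; /f overwrites an existing value without prompting.
        ["reg", "add", params_key, "/v", "Application",
         "/t", "REG_SZ", "/d", app_command_line, "/f"],
    ]
```

Each list could then be handed to subprocess.run(cmd, check=True) from an elevated prompt on the node, followed by sc start GMOND.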

Wednesday, 29 April 2009

Ganglia 3.1.2 for Windows HPC2008

Recent tests on Microsoft Windows HPC 2008 of the Windows port of Ganglia, obtained from the APR Consulting web page, revealed a problem.
After a few minutes of runtime, the Ganglia executable eats up more and more memory until the system starts to swap and finally becomes unstable, crashes, or is no longer reachable.
Unable to deploy Ganglia to the cluster, I tested different releases from APR. None of them had the problem on Win2003 x64; however, all of them showed the same memory leak on HPC2008 x64 or just didn't work at all.
So finally we compiled our own Cygwin-based gmond.exe binary and came up with a pretty stable version, with just one flaw:
So far, installation as a service doesn't work, neither with gmondservice.exe from APR Consulting nor with the native Windows tool sc.exe.
However, installing it with schtasks.exe as a scheduled task that runs once at startup and then daemonizes (a daemon is what Linux calls a service) works fine.
In addition, simply swapping the executable or the config file now results in an updated Ganglia once the node reboots or a task restart is triggered, instead of having to remove and reinstall a service.
All deployment steps can easily be done with the clusrun extension, which is essential for cluster administration.
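
As a rough illustration of the scheduled-task approach, the following Python sketch builds the schtasks.exe command lines for creating the startup task and for restarting it after a swap of the executable. The task name, the choice to run as SYSTEM, and both helper names are assumptions made for this example; the post does not state the exact options that were used.

```python
def startup_task_command(task_name, command_line, run_as="SYSTEM"):
    """Build a schtasks call that runs command_line once at system startup."""
    return ["schtasks", "/Create",
            "/TN", task_name,      # task name (hypothetical, e.g. "gmond")
            "/TR", command_line,   # program to run, including its arguments
            "/SC", "ONSTART",      # trigger: at system startup
            "/RU", run_as,         # account to run under (assumption)
            "/F"]                  # overwrite an existing task of that name


def restart_task_commands(task_name):
    """Build the two schtasks calls that stop and then start the task again."""
    return [["schtasks", "/End", "/TN", task_name],
            ["schtasks", "/Run", "/TN", task_name]]
```

The resulting command lines could be distributed to all nodes with clusrun.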

Small tutorial
(all links are below, drop a comment if something is missing/wrong)
  • Download a Ganglia version (3.1.2 Langley worked very well).
  • Download and install Cygwin with the gcc and g++ compilers and the additional packages mentioned in the README.WIN file of the Ganglia package. Currently these are:
    libapr1, expat, diffutils, gcc, make, python, sharutils, sunrpc
    and for libconfuse:
    libiconv
  • Run ./configure, make and make install in the root directory of the confuse library.
  • You may have to exclude the examples from the build. To do so, replace the line
    SUBDIRS = m4 po src examples tests doc
    with
    SUBDIRS = m4 po src tests doc
    The examples threw an error on my system.
  • Run ./configure --with-libconfuse=/usr/local --enable-static-build and make in the root directory of Ganglia.
  • With some additional DLL files from Cygwin, your build is now runnable. Just start gmond.exe, check the Event Viewer to see which DLLs are missing, and place them in the same folder or in a folder that is in the PATH.
Please note that this is an x86_32 binary and not x64, because Cygwin is not x64.
It should, however, be possible to build a native x64 Ganglia with Windows Services for Unix.
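
The SUBDIRS edit for the confuse library can be automated if the build has to be repeated. The Python sketch below is one possible way to do it; the function name is hypothetical, and since the post does not say which makefile contains the SUBDIRS line, the code works on the text of whatever file is passed in.

```python
def drop_examples_from_subdirs(makefile_text):
    """Return makefile_text with 'examples' removed from the SUBDIRS line."""
    result = []
    for line in makefile_text.splitlines(keepends=True):
        if line.startswith("SUBDIRS") and "=" in line:
            name, _, value = line.partition("=")
            # Keep every listed directory except the examples.
            kept = [d for d in value.split() if d != "examples"]
            newline = "\n" if line.endswith("\n") else ""
            line = name.rstrip() + " = " + " ".join(kept) + newline
        result.append(line)
    return "".join(result)
```

All other lines are passed through unchanged, so the function can be applied to the whole file and the result written back before running make.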

Links:
Corresponding discussion in HPC2008 MS Forum
Cygwin
Ganglia
confuse library
APR Consulting web page


Thursday, 22 January 2009

Where is my "My Desktop" button

In order to get the "My Desktop" button back, e.g. on Windows terminal servers, just execute the following command:

regsvr32 /n /i:U shell32

After the next reboot or a restart of the Quick Launch bar, the icon should appear.


Tuesday, 16 December 2008

Windows CCS Cluster Upgrade

Recently the Windows CCS Cluster of the RRZE got a small upgrade.

One of the initial nodes rejoined the cluster, so there are now 28 Opteron cores available again.
Because CFD is used for production runs, the user home was recently upgraded and the quota was extended to 10 GB per user.
Furthermore, for special purposes and for a limited amount of time, an extra project home with up to 120 GB of space is available for extensive usage.

Friday, 7 November 2008

Running MPI Jobs on Windows CCS

In order to run only one MPI process per allocated node on the Windows CCS Cluster, you have to tweak the environment variable CCP_NODES set by the scheduler. It lists each allocated node together with the number of processors allocated on it.
As a consequence, four MPI processes are started on each four-core node.

To reduce this to one process per node, call your program the following way from inside the scheduler:
mpiexec.exe -hosts %CCP_NODES: 4= 1%

%CCP_NODES: 4= 1% is a cmd.exe string substitution: every occurrence of " 4" in CCP_NODES is replaced by " 1", so each hostname is passed to mpiexec with a process count of one.
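
To see what the substitution does to the variable, here is a small Python sketch. It assumes that CCP_NODES has the form "node count, then hostname and processor count for each node"; the sample value and both function names are made up for illustration. The first function mimics the cmd.exe replacement literally, the second does the same job per token and therefore does not depend on every node having exactly four processors.

```python
def cmd_substitute(value, old=" 4", new=" 1"):
    """Mimic cmd.exe's %VAR:old=new% expansion: replace every occurrence."""
    return value.replace(old, new)


def one_process_per_node(ccp_nodes):
    """Rewrite '<n> <host> <procs> ...' so that every host gets one process."""
    tokens = ccp_nodes.split()
    node_count = int(tokens[0])
    # After the leading count the tokens alternate: host, procs, host, procs ...
    hosts = tokens[1::2]
    if len(hosts) != node_count:
        raise ValueError("unexpected CCP_NODES layout: " + ccp_nodes)
    return " ".join([str(node_count)] + [host + " 1" for host in hosts])


sample = "2 node01 4 node02 4"   # hypothetical value of CCP_NODES
print(cmd_substitute(sample))
print(one_process_per_node(sample))
```

Note that the literal replacement would also hit any hostname that happens to start with the digit 4, which the token-based variant avoids.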

Wednesday, 10 September 2008

HPC Server 2008 launch

The official launch of HPC Server 2008 takes place on 16 October 2008 at Frankfurt Rhein-Main Airport. More information is available on the official HPC 2008 launch website.

Tuesday, 2 September 2008

Windows HPC Deployment

The Windows High Performance Cluster Competence Center at RWTH Aachen offers tutorials for administrators on Windows HPC 2008 deployment. More detailed information is available on their web page.
