Ancestors

Toot

Written by Hanno Rein on 2025-01-29 at 02:51

Many thanks to my (former) students @zyrxvo, Dang, Pejvak and Nick, who helped disassemble 16 nodes of the NIAGARA supercomputer today. Everything is now at UTSC and once we've put it back together, my group will have its own small compute cluster with 640 cores.

=> View attached media | View attached media | View attached media | View attached media

=> More information about this toot | More toots from hannorein@mastodon.social

Descendants

Written by Hanno Rein on 2025-01-31 at 22:08

So far I'm not having much luck getting the nodes up and running. In short, I'm not sure how to access the Lenovo XClarity Controller (XCC). By default it is supposed to try DHCP first and then fall back to a static IP if that fails. I do see some traffic on the Ethernet port, including a DHCP request, but strangely the DHCP offers are not accepted. The default static IP doesn't work either. So maybe the default behaviour was changed? It's not clear to me how to reset the XCC to factory defaults.


Written by Hanno Rein on 2025-01-31 at 22:16

The System Management Module (SMM) has its own ethernet port. I think I've successfully reset the SMM to factory defaults. However, the SMM's network port is completely silent - no traffic at all. Which I think is the default. There are instructions in the manual on how to enable the SMM network, but those require me to reach the XCC. So I'm a bit stuck. 🤷‍♂️

Any hints would be appreciated! 😉


Written by Hanno Rein on 2025-01-31 at 22:28

Side note: The servers use 240V. I didn't want to wait for the university to put a 240V circuit in at work and I don't have an easily accessible 240V socket at home either. So I built this MacGyver-style contraption which combines two 120V outlets on different circuits into one 240V. And yes, this is definitely not up to code for many different reasons and you should absolutely not try this at home yourself.

=> View attached media | View attached media


Written by Hanno Rein on 2025-02-04 at 02:27

Finally getting somewhere!

=> View attached media


Written by Hanno Rein on 2025-02-04 at 23:40

I've put everything in the rack today.

=> View attached media


Written by Oliver Stueker on 2025-01-29 at 12:38

@hannorein

Really Cool!

It’s nice to see that these Niagara nodes get a second life while making space for Trillium!

However, now just imagine that Trillium will have 192 cores per node, so you could cram these 640 cores into 3 1/3 nodes. 🤯

#HPC #SciNet #Alliance #DRAC

@zyrxvo

=> More information about this toot | More toots from ostueker@mast.hpc.social

Written by Hanno Rein on 2025-01-29 at 12:45

@ostueker @zyrxvo Yeah, it'll be interesting to compare the performance per core with my specific code. (I suspect not much change, despite the 7-year gap between the two.)

Small downside of the new Trillium cluster: I will probably never be able to give those a second life because everything is watercooled...


Written by Oliver Stueker on 2025-01-29 at 23:58

@hannorein

No, I don’t expect a huge difference in per-core performance either, though I haven’t compared benchmarks yet. Niagara's Skylake CPUs already supported #AVX512 (they even waited a bit for Skylake to be released), which gave a good #performance boost over AVX2. But core density is the new MHz. 😉

Though you may want to recompile your code if you’re currently directly linking to #MKL since Trillium has AMD CPUs. BLIS is faster on those.

#HPC @zyrxvo


Written by simonbp on 2025-01-31 at 22:39

@hannorein Though it is totally why North American households are supplied with 240V three-phase AC, to enable easy conversion to 120V AC single-phase, 240V dual-phase, and 240V triple-phase based on the wiring. Though unless you install an appliance, you'd never know.

=> More information about this toot | More toots from simonbp@social.linux.pizza

Written by Hanno Rein on 2025-01-31 at 22:51

@simonbp Growing up in Europe, I initially thought the North American system is really stupid, but I am beginning to like it... Not the plugs though. They still suck.


Written by Aaron Sawdey, Ph.D. on 2025-01-31 at 22:56

@simonbp @hannorein Households in the US don't usually have 3-phase; they have 120/240. But most 240V things I have seen will also run on 208V, which is what you get across 2 phases of 3-phase. You usually have to go to commercial buildings to actually find 3-phase power.

=> More information about this toot | More toots from acsawdey@fosstodon.org
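
Editor's sketch, not part of the original thread: the 240V and 208V figures in the toot above follow from treating each 120V leg as a phasor and taking the magnitude of the difference between two legs. The snippet assumes ideal 120V RMS legs that are 180° apart for residential 120/240 service and 120° apart for three-phase service; the function name `line_to_line_voltage` is made up for this illustration.

```python
import cmath
import math


def line_to_line_voltage(leg_voltage: float, phase_angle_deg: float) -> float:
    """RMS voltage between two legs whose phasors differ by phase_angle_deg."""
    a = cmath.rect(leg_voltage, 0.0)
    b = cmath.rect(leg_voltage, math.radians(phase_angle_deg))
    # The voltage across the two legs is the magnitude of the phasor difference.
    return abs(a - b)


# Assumption: ideal 120 V RMS legs in both cases.
split_phase = line_to_line_voltage(120.0, 180.0)  # residential 120/240
three_phase = line_to_line_voltage(120.0, 120.0)  # two legs of three-phase

print(f"split-phase: {split_phase:.0f} V")
print(f"three-phase: {three_phase:.0f} V")
```

Under these assumptions the first case gives 240V and the second gives 120·√3 ≈ 208V, which is presumably where the two numbers quoted above come from.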

Written by Steven Rieder on 2025-01-31 at 22:57

@hannorein 😬

=> More information about this toot | More toots from rieder@mastodon.social

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113909373115444306