CoreWeave ARENA Demo

CoreWeave ARENA is a production-ready AI lab engineered to enable teams to move into full-scale production with confidence. In this demo, Corey Sanders walks through the CoreWeave ARENA experience, starting with gaining access to your CoreWeave ARENA environment and orienting around the guided lab workflow.

You’ll see how teams use CoreWeave ARENA to run real evaluation workflows, submit jobs, and review baseline outputs before iterating to optimize. The walkthrough also covers validating communication performance with NCCL tests on Slurm/SUNK and using CoreWeave Mission Control visibility to understand what’s happening during execution. By the end, you’ll have a practical view of how CoreWeave ARENA brings together the tools and signals needed to evaluate readiness and keep experimenting with confidence.

1

00:00:04,400 --> 00:00:05,066

Hi there.

2

00:00:05,066 --> 00:00:06,200

My name is Corey Sanders

3

00:00:06,200 --> 00:00:07,366

and I'm here to give you a quick

4

00:00:07,366 --> 00:00:08,466

walkthrough on the new

5

00:00:08,466 --> 00:00:11,233

ARENA experience on the CoreWeave cloud.

6

00:00:11,233 --> 00:00:12,033

Now, it started by me

7

00:00:12,033 --> 00:00:13,433

receiving an invitation

8

00:00:13,433 --> 00:00:14,600

from the solution architects

9

00:00:14,600 --> 00:00:15,366

at CoreWeave

10

00:00:15,366 --> 00:00:17,266

who had set up an environment for me

11

00:00:17,266 --> 00:00:19,233

to be able to do some of these tests

12

00:00:19,233 --> 00:00:21,266

and some example walkthroughs on the

13

00:00:21,266 --> 00:00:22,800

the cloud that's been

14

00:00:22,800 --> 00:00:23,933

that's been made for me.

15

00:00:23,933 --> 00:00:25,100

And so it started with me

16

00:00:25,100 --> 00:00:26,066

being able to log

17

00:00:26,066 --> 00:00:27,600

right in here into the console.

18

00:00:27,600 --> 00:00:28,466

And so I logged in,

19

00:00:28,466 --> 00:00:31,133

and you can see here

20

00:00:31,133 --> 00:00:32,433

I have a cluster already up and running

21

00:00:32,433 --> 00:00:33,300

that's been set up for me,

22

00:00:33,300 --> 00:00:34,066

which is lovely.

23

00:00:34,066 --> 00:00:36,866

This is a Kubernetes cluster running

24

00:00:36,866 --> 00:00:38,633

on the CoreWeave Kubernetes Service.

25

00:00:38,633 --> 00:00:40,300

And on top of this,

26

00:00:40,300 --> 00:00:41,500

they've also set up

27

00:00:41,500 --> 00:00:43,033

a SUNK environment for me,

28

00:00:43,033 --> 00:00:43,900

which we'll go through

29

00:00:43,900 --> 00:00:44,533

a little bit later,

30

00:00:44,533 --> 00:00:46,633

which is a Slurm solution.

31

00:00:46,633 --> 00:00:48,133

It's very unique on CoreWeave

32

00:00:48,133 --> 00:00:49,300

that allows me to do

33

00:00:49,300 --> 00:00:50,466

Slurm operations

34

00:00:50,466 --> 00:00:52,900

on top of a Kubernetes based

35

00:00:52,900 --> 00:00:55,100

infrastructure environment.

36

00:00:55,100 --> 00:00:56,433

And so really the only thing

37

00:00:56,433 --> 00:00:57,200

I need to do to be able

38

00:00:57,200 --> 00:00:58,233

to get access here

39

00:00:58,233 --> 00:01:00,500

is to bring a public key

40

00:01:00,500 --> 00:01:01,800

in, with the private key

41

00:01:01,800 --> 00:01:03,200

on my local machine,

42

00:01:03,200 --> 00:01:05,066

so that I can access the environment.

43

00:01:05,066 --> 00:01:07,000

And so I've already created a key.

44

00:01:07,000 --> 00:01:09,100

And so I just want to

45

00:01:09,100 --> 00:01:11,600

get the public key out here.

46

00:01:11,600 --> 00:01:12,633

And so I'm going to copy

47

00:01:12,633 --> 00:01:13,233

and paste it,

48

00:01:13,233 --> 00:01:16,733

and go right back into the console.

49

00:01:16,966 --> 00:01:18,100

It's just down here.

50

00:01:18,100 --> 00:01:19,400

If I click on my name and click

51

00:01:19,400 --> 00:01:21,200

settings and scroll down,

52

00:01:21,200 --> 00:01:22,266

you can see here

53

00:01:22,266 --> 00:01:23,866

SSH Public Key for Slurm.

54

00:01:23,866 --> 00:01:26,933

Type that in, save it, and away I go.
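
The key step described here can be sketched as follows. This is a minimal sketch and not the commands used in the demo: the key path `~/.ssh/id_ed25519_arena` and the comment `arena-demo` are hypothetical, and the empty passphrase is only there to keep the example non-interactive. Only the `.pub` file is pasted into the console; the private key stays on the local machine.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical key location, not taken from the demo.
KEY_PATH="${KEY_PATH:-$HOME/.ssh/id_ed25519_arena}"

mkdir -p "$(dirname "$KEY_PATH")"

# Create the key pair only if it does not exist yet.
# -N "" sets an empty passphrase (assumption, for a non-interactive sketch).
if [ ! -f "$KEY_PATH" ]; then
  ssh-keygen -q -t ed25519 -N "" -C "arena-demo" -f "$KEY_PATH"
fi

# Print the public half; this is the text to copy into the console.
cat "$KEY_PATH.pub"
```

In practice a passphrase-protected key is usually the safer choice; drop `-N ""` to be prompted for one.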

55

00:01:27,200 --> 00:01:28,700

And that's it for me

56

00:01:28,700 --> 00:01:30,566

in the console for now.

57

00:01:30,566 --> 00:01:33,733

Now once I get that key up and running,

58

00:01:33,733 --> 00:01:36,800

the next big step is I could go in

59

00:01:36,800 --> 00:01:38,966

and actually SSH in to the machine,

60

00:01:38,966 --> 00:01:41,366

into the login node here for my

61

00:01:41,366 --> 00:01:42,533

for my Slurm environment.

62

00:01:42,533 --> 00:01:43,766

So here you can see I'm logged

63

00:01:43,766 --> 00:01:44,400

in, gives me

64

00:01:44,400 --> 00:01:45,433

a little bit of detail of what

65

00:01:45,433 --> 00:01:46,166

the CoreWeave Slurm

66

00:01:46,166 --> 00:01:48,066

HPC cluster looks like,

67

00:01:48,066 --> 00:01:49,500

the things that have been pre-installed

68

00:01:49,500 --> 00:01:50,300

for me and set up,

69

00:01:50,300 --> 00:01:52,366

you can sort of see I'm running here,

70

00:01:52,366 --> 00:01:53,266

but instead of actually

71

00:01:53,266 --> 00:01:54,233

doing this through SSH,

72

00:01:54,233 --> 00:01:55,366

here's what I'm going to do.

73

00:01:55,366 --> 00:01:55,800

I'm actually

74

00:01:55,800 --> 00:01:57,466

going to run this through marimo

75

00:01:57,466 --> 00:01:58,866

because it helps me understand

76

00:01:58,866 --> 00:02:00,333

what I'm doing step by step

77

00:02:00,333 --> 00:02:01,566

and actually be able to

78

00:02:02,933 --> 00:02:03,933

run the command

79

00:02:03,933 --> 00:02:04,966

directly from that environment.

80

00:02:04,966 --> 00:02:06,666

So let me spin up a marimo

81

00:02:06,666 --> 00:02:07,566

deployment here.

82

00:02:07,566 --> 00:02:08,566

It's going to be running

83

00:02:08,566 --> 00:02:11,266

on that login node. Okay.

84

00:02:11,266 --> 00:02:13,133

And so let me just copy that

85

00:02:13,133 --> 00:02:16,133

and let me open back up into,

86

00:02:16,233 --> 00:02:17,933

my console here.

87

00:02:17,933 --> 00:02:19,466

Bingo. Bango.

88

00:02:19,466 --> 00:02:22,100

And I have my marimo environment up

89

00:02:22,100 --> 00:02:23,266

and running

90

00:02:23,266 --> 00:02:25,366

as part of my SUNK deployment

91

00:02:25,366 --> 00:02:26,566

on the CoreWeave cloud.

92

00:02:26,566 --> 00:02:27,066

Okay,

93

00:02:27,066 --> 00:02:29,800

so let me just run my prerequisites here.

94

00:02:29,800 --> 00:02:30,633

I just have to make sure

95

00:02:30,633 --> 00:02:32,366

bash execution can happen.

96

00:02:32,366 --> 00:02:34,033

And so here we are. Welcome to SUNK.

97

00:02:34,033 --> 00:02:35,733

This allows me to play

98

00:02:35,733 --> 00:02:36,700

with the environment

99

00:02:36,700 --> 00:02:37,600

for SUNK deployment

100

00:02:37,600 --> 00:02:39,000

that's been set up for me.

101

00:02:39,000 --> 00:02:40,533

Within seconds,

102

00:02:40,533 --> 00:02:42,366

you can see I'm up and running

103

00:02:42,366 --> 00:02:43,966

and ready to run some commands.

104

00:02:43,966 --> 00:02:45,733

So the first thing, you know, I can see:

105

00:02:45,733 --> 00:02:47,400

Hey, what's my storage look like?

106

00:02:47,400 --> 00:02:48,866

Running these commands

107

00:02:48,866 --> 00:02:50,000

gives me exactly the storage

108

00:02:50,000 --> 00:02:50,833

that's been set up for me

109

00:02:50,833 --> 00:02:51,833

in the environment.
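
The transcript does not show which storage commands the notebook runs. As a rough sketch, assuming standard POSIX tools are available on the login node, a storage check could look like this (the function name `show_storage` is hypothetical):

```shell
# Assumption: generic tools only; the notebook's real commands are not in the transcript.
show_storage() {
  # All mounted filesystems, with human-readable sizes.
  df -h
  # The filesystem that backs the home directory.
  df -h "$HOME"
}

show_storage
```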

110

00:02:51,833 --> 00:02:54,933

I can see what resources do I have?

111

00:02:55,066 --> 00:02:56,566

Right. And what are the defaults here?

112

00:02:56,566 --> 00:02:57,766

So I've got the partition

113

00:02:57,766 --> 00:02:58,833

the "all" partition.

114

00:02:58,833 --> 00:03:01,200

And it gives me four nodes.

115

00:03:01,200 --> 00:03:02,900

These are H100s

116

00:03:02,900 --> 00:03:03,800

which will be, you know, pretty

117

00:03:03,800 --> 00:03:05,866

nice for me to run a couple of jobs on.

118

00:03:05,866 --> 00:03:07,433

And so I can even go into some details

119

00:03:07,433 --> 00:03:08,700

on those partitions

120

00:03:08,700 --> 00:03:09,333

and take a look

121

00:03:09,333 --> 00:03:11,466

and see things like Default=YES.

122

00:03:11,466 --> 00:03:13,166

And so what other

123

00:03:13,166 --> 00:03:14,366

sort of configurations

124

00:03:14,366 --> 00:03:16,866

have been set for me ahead of time?
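
The resource and partition checks described above map onto standard Slurm commands. A minimal sketch, assuming it runs on the Slurm login node; the exact commands in the demo notebook are not shown in the transcript, and the function name `show_slurm_overview` is hypothetical:

```shell
show_slurm_overview() {
  # Skip quietly when the Slurm CLI is not installed (for example on a laptop).
  if ! command -v sinfo >/dev/null 2>&1; then
    echo "sinfo not found; run this on the Slurm login node" >&2
    return 0
  fi

  # Partitions, node counts, and node states.
  sinfo

  # Compact view: partition, availability, time limit, node count, generic resources (GPUs).
  sinfo -o "%P %a %l %D %G"

  # Per-partition settings, including the Default=YES/NO flag.
  scontrol show partition
}

show_slurm_overview
```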

125

00:03:16,866 --> 00:03:17,933

And so this makes,

126

00:03:17,933 --> 00:03:19,200

of course, the running of these jobs

127

00:03:19,200 --> 00:03:20,200

so much easier.

128

00:03:20,200 --> 00:03:22,466

I'm interacting with Slurm right now.

129

00:03:22,466 --> 00:03:24,833

And so I can go in and take a deeper

130

00:03:24,833 --> 00:03:26,866

look here at some of the configuration.

131

00:03:26,866 --> 00:03:28,266

And so this is sort of one

132

00:03:28,266 --> 00:03:29,400

of the many configuration files,

133

00:03:29,400 --> 00:03:30,600

but this is a key one for Slurm.

134

00:03:30,600 --> 00:03:34,066

And so you can see things like PreemptType

135

00:03:34,066 --> 00:03:35,666

for policy.

136

00:03:35,666 --> 00:03:37,000

So you can understand

137

00:03:37,000 --> 00:03:37,466

sort of

138

00:03:37,466 --> 00:03:38,700

what, you know,

139

00:03:38,700 --> 00:03:41,166

what's expected for preemption.
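
The preemption and priority settings mentioned here can also be read from the running configuration rather than from the file itself. A minimal sketch, assuming the Slurm CLI is on the path; the function name `show_preempt_and_priority` is hypothetical and the demo's own commands are not shown:

```shell
show_preempt_and_priority() {
  # Skip quietly when the Slurm CLI is not installed.
  if ! command -v scontrol >/dev/null 2>&1; then
    echo "scontrol not found; run this on the Slurm login node" >&2
    return 0
  fi

  # Print the active configuration and keep only the parameters whose names
  # start with Preempt or Priority (for example PreemptType, PreemptMode, PriorityType).
  scontrol show config | grep -E -i '^(Preempt|Priority)'
}

show_preempt_and_priority
```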

140

00:03:41,166 --> 00:03:43,833

You can see how jobs are sort of prioritized.

141

00:03:43,833 --> 00:03:45,200

A whole bunch of different settings

142

00:03:45,200 --> 00:03:45,766

and controls

143

00:03:45,766 --> 00:03:46,666

here around

144

00:03:46,666 --> 00:03:47,433

sort of how you're going

145

00:03:47,433 --> 00:03:49,033

to run your Slurm job

146

00:03:49,033 --> 00:03:49,900

on this environment.

147

00:03:49,900 --> 00:03:50,766

I'm going to leave it all

148

00:03:50,766 --> 00:03:52,966

as it has been set for me.

149

00:03:52,966 --> 00:03:55,366

But I could go in and modify, right?

150

00:03:55,366 --> 00:03:56,500

I could even create,

151

00:03:56,500 --> 00:03:58,033

you know, in my own little,

152

00:03:58,033 --> 00:03:59,833

window here and cat out,

153

00:03:59,833 --> 00:04:01,600

do it, you know, do an SSH

154