In [1]:
%matplotlib inline
import pandas as pd
import socket
host = socket.getfqdn()

from core import  load, zoom, calc, save,plots,monitor
In [2]:
#reload funcs after updating ./core/*.py
import importlib
importlib.reload(load)
importlib.reload(zoom)
importlib.reload(calc)
importlib.reload(save)
importlib.reload(plots)
importlib.reload(monitor)
Out[2]:
<module 'core.monitor' from '/ccc/work/cont003/gen7420/odakatin/monitor-sedna/notebook/core/monitor.py'>

If you submit the job with job scheduler, above¶

below are list of enviroment variable one can pass

%env local='2"¶

local : if True run dask local cluster, if not true, put number of workers setted in the 'local' if no 'local ' given, local will be setted automatically to 'True'

%env ychunk='2'¶

%env tchunk='2'¶

controls chunk. 'False' sets no modification from original netcdf file's chunk.¶

ychunk=10 will group the original netcdf file to 10 by 10¶

tchunk=1 will chunk the time coordinate one by one¶

%env file_exp=¶

'file_exp': Which 'experiment' name is it?¶

. this corresopnds to intake catalog name without path and .yaml¶

%env year=¶

for Validation, this correspoinds to path/year/month 's year¶

for monitoring, this corresponids to 'date' having * means do all files in the monitoring directory¶

setting it as 0[0-9] &1[0-9]& *[2-3][0-9], the job can be separated in three lots.¶

%env month=¶

for monitoring this corresponds to file path path-XIOS.{month}/¶

#

%env control=FWC_SSH¶

name of control file to be used for computation/plots/save/ & how it is called from Monitor.sh¶

Monitor.sh calls M_MLD_2D

and AWTD.sh, Fluxnet.sh, Siconc.sh, IceClim.sh, FWC_SSH.sh

  • AWTD.sh M_AWTMD

  • Fluxnet.sh M_Fluxnet

  • Siconc.sh M_Ice_quantities
  • IceClim.sh M_IceClim M_IceConce M_IceThick

FWC_SSH.sh M_FWC_2D M_FWC_integrals M_FWC_SSH M_SSH_anomaly

Integrals.sh M_Mean_temp_velo M_Mooring M_Sectionx M_Sectiony

%env save= proceed saving? True or False , Default is setted as True¶

%env plot= proceed plotting? True or False , Default is setted as True¶

%env calc= proceed computation? or just load computed result? True or False , Default is setted as True¶

%env save=False¶

%env lazy=False¶

For debugging this cell can help¶

%env file_exp=SEDNA_DELTA_MONITOR %env year=2012 %env month=01

0[1-2]¶

%env ychunk=10 %env ychunk=False %env save=False %env plot=True %env calc=True # %env lazy=False

False¶

%env control=M_Fluxnet

M_Sectiony ok with ychunk=False local=True lazy=False¶

In [3]:
%%time
# 'savefig': Do we save output in html? or not. keep it true. 
savefig=True
client,cluster,control,catalog_url,month,year,daskreport,outputpath = load.set_control(host)
!mkdir -p $outputpath
!mkdir -p $daskreport
client
local True
using host= irene4225.c-irene.mg1.tgcc.ccc.cea.fr starting dask cluster on local= True workers 16
10000000000
False
tgcc local cluster starting
This code is running on  irene4225.c-irene.mg1.tgcc.ccc.cea.fr using  SEDNA_DELTA_MONITOR file experiment, read from  ../lib/SEDNA_DELTA_MONITOR.yaml  on year= 2012  on month= 01  outputpath= ../results/SEDNA_DELTA_MONITOR/ daskreport= ../results/dask/6419174irene4225.c-irene.mg1.tgcc.ccc.cea.fr_SEDNA_DELTA_MONITOR_01M_FWC_2D/
CPU times: user 3.82 s, sys: 719 ms, total: 4.54 s
Wall time: 1min 42s
Out[3]:

Client

Client-9d0a340b-13d9-11ed-9cb7-080038b93113

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

LocalCluster

a76772c3

Dashboard: http://127.0.0.1:8787/status Workers: 64
Total threads: 256 Total memory: 251.06 GiB
Status: running Using processes: True

Scheduler Info

Scheduler

Scheduler-587df3fd-8a40-4d94-b796-de4d099dc8ce

Comm: tcp://127.0.0.1:32983 Workers: 64
Dashboard: http://127.0.0.1:8787/status Total threads: 256
Started: 1 minute ago Total memory: 251.06 GiB

Workers

Worker: 0

Comm: tcp://127.0.0.1:37595 Total threads: 4
Dashboard: http://127.0.0.1:35772/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:37908
Local directory: /tmp/dask-worker-space/worker-3k7ne9bi

Worker: 1

Comm: tcp://127.0.0.1:35067 Total threads: 4
Dashboard: http://127.0.0.1:35203/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44098
Local directory: /tmp/dask-worker-space/worker-w24jyamx

Worker: 2

Comm: tcp://127.0.0.1:40234 Total threads: 4
Dashboard: http://127.0.0.1:41262/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45365
Local directory: /tmp/dask-worker-space/worker-oth5v22m

Worker: 3

Comm: tcp://127.0.0.1:46090 Total threads: 4
Dashboard: http://127.0.0.1:33179/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:34694
Local directory: /tmp/dask-worker-space/worker-kpplhjws

Worker: 4

Comm: tcp://127.0.0.1:38094 Total threads: 4
Dashboard: http://127.0.0.1:36938/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:37536
Local directory: /tmp/dask-worker-space/worker-zivp0h93

Worker: 5

Comm: tcp://127.0.0.1:36444 Total threads: 4
Dashboard: http://127.0.0.1:33368/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:39543
Local directory: /tmp/dask-worker-space/worker-kigtqszd

Worker: 6

Comm: tcp://127.0.0.1:42737 Total threads: 4
Dashboard: http://127.0.0.1:36574/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35544
Local directory: /tmp/dask-worker-space/worker-em5hrv1p

Worker: 7

Comm: tcp://127.0.0.1:42399 Total threads: 4
Dashboard: http://127.0.0.1:35642/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35728
Local directory: /tmp/dask-worker-space/worker-4f7wdr__

Worker: 8

Comm: tcp://127.0.0.1:38041 Total threads: 4
Dashboard: http://127.0.0.1:44336/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:40933
Local directory: /tmp/dask-worker-space/worker-kmuu6q9c

Worker: 9

Comm: tcp://127.0.0.1:45781 Total threads: 4
Dashboard: http://127.0.0.1:36158/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45322
Local directory: /tmp/dask-worker-space/worker-i31zu4l6

Worker: 10

Comm: tcp://127.0.0.1:37101 Total threads: 4
Dashboard: http://127.0.0.1:34058/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:42982
Local directory: /tmp/dask-worker-space/worker-pvype8wo

Worker: 11

Comm: tcp://127.0.0.1:32974 Total threads: 4
Dashboard: http://127.0.0.1:37614/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35660
Local directory: /tmp/dask-worker-space/worker-4phrn6n9

Worker: 12

Comm: tcp://127.0.0.1:46781 Total threads: 4
Dashboard: http://127.0.0.1:37562/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44880
Local directory: /tmp/dask-worker-space/worker-f_grekw6

Worker: 13

Comm: tcp://127.0.0.1:42007 Total threads: 4
Dashboard: http://127.0.0.1:44719/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:34550
Local directory: /tmp/dask-worker-space/worker-iv5kn_08

Worker: 14

Comm: tcp://127.0.0.1:39522 Total threads: 4
Dashboard: http://127.0.0.1:43421/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43980
Local directory: /tmp/dask-worker-space/worker-3afjx3s3

Worker: 15

Comm: tcp://127.0.0.1:38055 Total threads: 4
Dashboard: http://127.0.0.1:42905/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:42920
Local directory: /tmp/dask-worker-space/worker-s5d4ai4r

Worker: 16

Comm: tcp://127.0.0.1:42740 Total threads: 4
Dashboard: http://127.0.0.1:37438/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43293
Local directory: /tmp/dask-worker-space/worker-nvhv7dry

Worker: 17

Comm: tcp://127.0.0.1:46685 Total threads: 4
Dashboard: http://127.0.0.1:39554/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45911
Local directory: /tmp/dask-worker-space/worker-zis2eyxr

Worker: 18

Comm: tcp://127.0.0.1:44490 Total threads: 4
Dashboard: http://127.0.0.1:36450/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:37291
Local directory: /tmp/dask-worker-space/worker-1aanjo57

Worker: 19

Comm: tcp://127.0.0.1:39028 Total threads: 4
Dashboard: http://127.0.0.1:34765/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43738
Local directory: /tmp/dask-worker-space/worker-izk74vp0

Worker: 20

Comm: tcp://127.0.0.1:36076 Total threads: 4
Dashboard: http://127.0.0.1:38342/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45741
Local directory: /tmp/dask-worker-space/worker-581pvx35

Worker: 21

Comm: tcp://127.0.0.1:36902 Total threads: 4
Dashboard: http://127.0.0.1:34048/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:42919
Local directory: /tmp/dask-worker-space/worker-jp4mc86i

Worker: 22

Comm: tcp://127.0.0.1:42931 Total threads: 4
Dashboard: http://127.0.0.1:38429/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:32947
Local directory: /tmp/dask-worker-space/worker-zsv0ax61

Worker: 23

Comm: tcp://127.0.0.1:37556 Total threads: 4
Dashboard: http://127.0.0.1:33317/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44391
Local directory: /tmp/dask-worker-space/worker-f43w1zzf

Worker: 24

Comm: tcp://127.0.0.1:37126 Total threads: 4
Dashboard: http://127.0.0.1:40357/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:46514
Local directory: /tmp/dask-worker-space/worker-dnx3s9yb

Worker: 25

Comm: tcp://127.0.0.1:37877 Total threads: 4
Dashboard: http://127.0.0.1:37434/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:42503
Local directory: /tmp/dask-worker-space/worker-f0qblfo_

Worker: 26

Comm: tcp://127.0.0.1:38308 Total threads: 4
Dashboard: http://127.0.0.1:44074/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:34730
Local directory: /tmp/dask-worker-space/worker-a_so_gbb

Worker: 27

Comm: tcp://127.0.0.1:42524 Total threads: 4
Dashboard: http://127.0.0.1:39967/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:41176
Local directory: /tmp/dask-worker-space/worker-gtuf6hri

Worker: 28

Comm: tcp://127.0.0.1:34796 Total threads: 4
Dashboard: http://127.0.0.1:33250/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44898
Local directory: /tmp/dask-worker-space/worker-64ircrco

Worker: 29

Comm: tcp://127.0.0.1:33837 Total threads: 4
Dashboard: http://127.0.0.1:35805/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:39823
Local directory: /tmp/dask-worker-space/worker-r7one_kf

Worker: 30

Comm: tcp://127.0.0.1:35204 Total threads: 4
Dashboard: http://127.0.0.1:39086/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43379
Local directory: /tmp/dask-worker-space/worker-ujrt2awb

Worker: 31

Comm: tcp://127.0.0.1:43284 Total threads: 4
Dashboard: http://127.0.0.1:35106/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:36134
Local directory: /tmp/dask-worker-space/worker-45h_mhnw

Worker: 32

Comm: tcp://127.0.0.1:45449 Total threads: 4
Dashboard: http://127.0.0.1:37624/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:37218
Local directory: /tmp/dask-worker-space/worker-o_0gg0s6

Worker: 33

Comm: tcp://127.0.0.1:35922 Total threads: 4
Dashboard: http://127.0.0.1:34244/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35363
Local directory: /tmp/dask-worker-space/worker-wizggho4

Worker: 34

Comm: tcp://127.0.0.1:45847 Total threads: 4
Dashboard: http://127.0.0.1:34130/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35968
Local directory: /tmp/dask-worker-space/worker-d3dllls_

Worker: 35

Comm: tcp://127.0.0.1:43605 Total threads: 4
Dashboard: http://127.0.0.1:38695/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:40430
Local directory: /tmp/dask-worker-space/worker-2_e0h8pc

Worker: 36

Comm: tcp://127.0.0.1:43632 Total threads: 4
Dashboard: http://127.0.0.1:35483/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:38603
Local directory: /tmp/dask-worker-space/worker-hq3ft6mk

Worker: 37

Comm: tcp://127.0.0.1:37066 Total threads: 4
Dashboard: http://127.0.0.1:35210/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:37195
Local directory: /tmp/dask-worker-space/worker-6mwbeg9r

Worker: 38

Comm: tcp://127.0.0.1:43615 Total threads: 4
Dashboard: http://127.0.0.1:39824/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44109
Local directory: /tmp/dask-worker-space/worker-csekpovk

Worker: 39

Comm: tcp://127.0.0.1:44235 Total threads: 4
Dashboard: http://127.0.0.1:38344/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:34049
Local directory: /tmp/dask-worker-space/worker-vkc2dfcy

Worker: 40

Comm: tcp://127.0.0.1:46595 Total threads: 4
Dashboard: http://127.0.0.1:40519/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:34707
Local directory: /tmp/dask-worker-space/worker-qg_1y4ua

Worker: 41

Comm: tcp://127.0.0.1:43103 Total threads: 4
Dashboard: http://127.0.0.1:43720/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:38612
Local directory: /tmp/dask-worker-space/worker-vxmugyte

Worker: 42

Comm: tcp://127.0.0.1:35535 Total threads: 4
Dashboard: http://127.0.0.1:40856/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:42388
Local directory: /tmp/dask-worker-space/worker-d6ji6sdj

Worker: 43

Comm: tcp://127.0.0.1:39097 Total threads: 4
Dashboard: http://127.0.0.1:46299/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:38403
Local directory: /tmp/dask-worker-space/worker-jlqrmyai

Worker: 44

Comm: tcp://127.0.0.1:40891 Total threads: 4
Dashboard: http://127.0.0.1:38110/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43965
Local directory: /tmp/dask-worker-space/worker-0ra1i6ao

Worker: 45

Comm: tcp://127.0.0.1:46401 Total threads: 4
Dashboard: http://127.0.0.1:36485/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:33268
Local directory: /tmp/dask-worker-space/worker-h1td809p

Worker: 46

Comm: tcp://127.0.0.1:36671 Total threads: 4
Dashboard: http://127.0.0.1:36557/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45676
Local directory: /tmp/dask-worker-space/worker-h5rm0dsf

Worker: 47

Comm: tcp://127.0.0.1:33654 Total threads: 4
Dashboard: http://127.0.0.1:45491/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:40040
Local directory: /tmp/dask-worker-space/worker-1fbxan8x

Worker: 48

Comm: tcp://127.0.0.1:32835 Total threads: 4
Dashboard: http://127.0.0.1:46604/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43535
Local directory: /tmp/dask-worker-space/worker-adt1_qg_

Worker: 49

Comm: tcp://127.0.0.1:45908 Total threads: 4
Dashboard: http://127.0.0.1:34163/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:40694
Local directory: /tmp/dask-worker-space/worker-02ybbmw6

Worker: 50

Comm: tcp://127.0.0.1:40709 Total threads: 4
Dashboard: http://127.0.0.1:33721/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:36798
Local directory: /tmp/dask-worker-space/worker-n85jj67y

Worker: 51

Comm: tcp://127.0.0.1:46201 Total threads: 4
Dashboard: http://127.0.0.1:40217/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:43573
Local directory: /tmp/dask-worker-space/worker-vkd84mcf

Worker: 52

Comm: tcp://127.0.0.1:42525 Total threads: 4
Dashboard: http://127.0.0.1:40439/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:40545
Local directory: /tmp/dask-worker-space/worker-eizjgm76

Worker: 53

Comm: tcp://127.0.0.1:36535 Total threads: 4
Dashboard: http://127.0.0.1:34486/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:41046
Local directory: /tmp/dask-worker-space/worker-y6jy5b38

Worker: 54

Comm: tcp://127.0.0.1:39801 Total threads: 4
Dashboard: http://127.0.0.1:32922/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:39862
Local directory: /tmp/dask-worker-space/worker-opr3iblh

Worker: 55

Comm: tcp://127.0.0.1:46546 Total threads: 4
Dashboard: http://127.0.0.1:36334/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:46673
Local directory: /tmp/dask-worker-space/worker-2zv6bk04

Worker: 56

Comm: tcp://127.0.0.1:44994 Total threads: 4
Dashboard: http://127.0.0.1:45837/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45051
Local directory: /tmp/dask-worker-space/worker-s4iozfm1

Worker: 57

Comm: tcp://127.0.0.1:38918 Total threads: 4
Dashboard: http://127.0.0.1:44395/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:41889
Local directory: /tmp/dask-worker-space/worker-m7pia4_t

Worker: 58

Comm: tcp://127.0.0.1:39363 Total threads: 4
Dashboard: http://127.0.0.1:46608/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:46834
Local directory: /tmp/dask-worker-space/worker-i3cqr1fk

Worker: 59

Comm: tcp://127.0.0.1:38910 Total threads: 4
Dashboard: http://127.0.0.1:33880/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:44979
Local directory: /tmp/dask-worker-space/worker-zmjmw7k7

Worker: 60

Comm: tcp://127.0.0.1:44423 Total threads: 4
Dashboard: http://127.0.0.1:34365/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:41442
Local directory: /tmp/dask-worker-space/worker-ifk40wx4

Worker: 61

Comm: tcp://127.0.0.1:39561 Total threads: 4
Dashboard: http://127.0.0.1:44474/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:45184
Local directory: /tmp/dask-worker-space/worker-lwr8poac

Worker: 62

Comm: tcp://127.0.0.1:35542 Total threads: 4
Dashboard: http://127.0.0.1:44745/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:38434
Local directory: /tmp/dask-worker-space/worker-eoi39s27

Worker: 63

Comm: tcp://127.0.0.1:46769 Total threads: 4
Dashboard: http://127.0.0.1:46379/status Memory: 3.92 GiB
Nanny: tcp://127.0.0.1:35120
Local directory: /tmp/dask-worker-space/worker-11p9d0of

read plotting information from a csv file¶

In [4]:
df=load.controlfile(control)
#Take out 'later' tagged computations
#df=df[~df['Value'].str.contains('later')]
df
Out[4]:
Value Inputs Equation Zone Plot Colourmap MinMax Unit Oldname Unnamed: 10
FWC_2D gridS.vosaline,param.mask,param.e3t,param.e1te2t calc.FWC2D_UFUNC(data) BBFG maps Spectral_r (0,24) m S-1

Computation starts here¶

Each computation consists of

  1. Load NEMO data set
  2. Zoom data set
  3. Compute (or load computed data set)
  4. Save
  5. Plot
  6. Close
In [5]:
%%time
import os
calcswitch=os.environ.get('calc', 'True') 
lazy=os.environ.get('lazy','False' )
loaddata=((df.Inputs != '').any()) 
print('calcswitch=',calcswitch,'df.Inputs != nothing',loaddata, 'lazy=',lazy)
data = load.datas(catalog_url,df.Inputs,month,year,daskreport,lazy=lazy) if ((calcswitch=='True' )*loaddata) else 0 
data
calcswitch= True df.Inputs != nothing True lazy= False
../lib/SEDNA_DELTA_MONITOR.yaml
using param_xios reading  ../lib/SEDNA_DELTA_MONITOR.yaml
using param_xios reading  <bound method DataSourceBase.describe of sources:
  param_xios:
    args:
      combine: nested
      concat_dim: y
      urlpath: /ccc/work/cont003/gen7420/odakatin/CONFIGS/SEDNA/SEDNA-I/SEDNA_Domain_cfg_Tgt_20210423_tsh10m_L1/param_f32/x_*.nc
      xarray_kwargs:
        compat: override
        coords: minimal
        data_vars: minimal
        parallel: true
    description: SEDNA NEMO parameters from MPI output  nav_lon lat fails
    driver: intake_xarray.netcdf.NetCDFSource
    metadata:
      catalog_dir: /ccc/work/cont003/gen7420/odakatin/monitor-sedna/notebook/../lib/
>
{'name': 'param_xios', 'container': 'xarray', 'plugin': ['netcdf'], 'driver': ['netcdf'], 'description': 'SEDNA NEMO parameters from MPI output  nav_lon lat fails', 'direct_access': 'forbid', 'user_parameters': [{'name': 'path', 'description': 'file coordinate', 'type': 'str', 'default': '/ccc/work/cont003/gen7420/odakatin/CONFIGS/SEDNA/MESH/SEDNA_mesh_mask_Tgt_20210423_tsh10m_L1/param'}], 'metadata': {}, 'args': {'urlpath': '/ccc/work/cont003/gen7420/odakatin/CONFIGS/SEDNA/SEDNA-I/SEDNA_Domain_cfg_Tgt_20210423_tsh10m_L1/param_f32/x_*.nc', 'combine': 'nested', 'concat_dim': 'y'}}
0 read gridS ['vosaline']
lazy= False
using load_data_xios_kerchunk reading  gridS
using load_data_xios_kerchunk reading  <bound method DataSourceBase.describe of sources:
  data_xios_kerchunk:
    args:
      consolidated: false
      storage_options:
        fo: file:////ccc/cont003/home/ra5563/ra5563/catalogue/DELTA/201201/gridS_0[0-5][0-9][0-9].json
        target_protocol: file
      urlpath: reference://
    description: CREG025 NEMO outputs from different xios server in kerchunk format
    driver: intake_xarray.xzarr.ZarrSource
    metadata:
      catalog_dir: /ccc/work/cont003/gen7420/odakatin/monitor-sedna/notebook/../lib/
>
      took 23.14039945602417 seconds
0 merging gridS ['vosaline']
param mask will be included in data
param e3t will be included in data
param e1te2t will be included in data
param nav_lat will be included in data
param nav_lon will be included in data
param mask2d will be included in data
CPU times: user 29 s, sys: 8.74 s, total: 37.7 s
Wall time: 1min 21s
Out[5]:
<xarray.Dataset>
Dimensions:        (t: 31, z: 150, y: 6540, x: 6560)
Coordinates:
    time_centered  (t) object dask.array<chunksize=(1,), meta=np.ndarray>
  * t              (t) object 2012-01-01 12:00:00 ... 2012-01-31 12:00:00
  * y              (y) int64 1 2 3 4 5 6 7 ... 6535 6536 6537 6538 6539 6540
  * x              (x) int64 1 2 3 4 5 6 7 ... 6555 6556 6557 6558 6559 6560
  * z              (z) int64 1 2 3 4 5 6 7 8 ... 143 144 145 146 147 148 149 150
    mask           (z, y, x) bool dask.array<chunksize=(150, 13, 6560), meta=np.ndarray>
    e3t            (z, y, x) float64 dask.array<chunksize=(150, 13, 6560), meta=np.ndarray>
    e1te2t         (y, x) float64 dask.array<chunksize=(13, 6560), meta=np.ndarray>
    nav_lat        (y, x) float32 dask.array<chunksize=(13, 6560), meta=np.ndarray>
    nav_lon        (y, x) float32 dask.array<chunksize=(13, 6560), meta=np.ndarray>
    mask2d         (y, x) bool dask.array<chunksize=(13, 6560), meta=np.ndarray>
Data variables:
    vosaline       (t, z, y, x) float32 dask.array<chunksize=(1, 150, 13, 6560), meta=np.ndarray>
Attributes: (12/26)
    CASE:                    DELTA
    CONFIG:                  SEDNA
    Conventions:             CF-1.6
    DOMAIN_dimensions_ids:   [2, 3]
    DOMAIN_halo_size_end:    [0, 0]
    DOMAIN_halo_size_start:  [0, 0]
    ...                      ...
    nj:                      13
    output_frequency:        1d
    start_date:              20090101
    timeStamp:               2022-Jan-17 19:00:16 GMT
    title:                   ocean T grid variables
    uuid:                    d8db76f6-a436-451a-9ab1-72dc892753af
xarray.Dataset
    • t: 31
    • z: 150
    • y: 6540
    • x: 6560
    • time_centered
      (t)
      object
      dask.array<chunksize=(1,), meta=np.ndarray>
      bounds :
      time_centered_bounds
      long_name :
      Time axis
      standard_name :
      time
      time_origin :
      1900-01-01 00:00:00
      Array Chunk
      Bytes 248 B 8 B
      Shape (31,) (1,)
      Count 32 Tasks 31 Chunks
      Type object numpy.ndarray
      31 1
    • t
      (t)
      object
      2012-01-01 12:00:00 ... 2012-01-...
      axis :
      T
      bounds :
      time_counter_bounds
      long_name :
      Time axis
      standard_name :
      time
      time_origin :
      1900-01-01 00:00:00
      array([cftime.DatetimeNoLeap(2012, 1, 1, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 2, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 3, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 4, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 5, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 6, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 7, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 8, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 9, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 10, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 11, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 12, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 13, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 14, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 15, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 16, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 17, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 18, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 19, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 20, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 21, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 22, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 23, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 24, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 25, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 26, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 27, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 28, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 29, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 30, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 31, 12, 0, 0, 0, has_year_zero=True)],
            dtype=object)
    • y
      (y)
      int64
      1 2 3 4 5 ... 6537 6538 6539 6540
      array([   1,    2,    3, ..., 6538, 6539, 6540])
    • x
      (x)
      int64
      1 2 3 4 5 ... 6557 6558 6559 6560
      array([   1,    2,    3, ..., 6558, 6559, 6560])
    • z
      (z)
      int64
      1 2 3 4 5 6 ... 146 147 148 149 150
      array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14,
              15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  27,  28,
              29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,  40,  41,  42,
              43,  44,  45,  46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,
              57,  58,  59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,
              71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,  84,
              85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95,  96,  97,  98,
              99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
             113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
             127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140,
             141, 142, 143, 144, 145, 146, 147, 148, 149, 150])
    • mask
      (z, y, x)
      bool
      dask.array<chunksize=(150, 13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 5.99 GiB 12.20 MiB
      Shape (150, 6540, 6560) (150, 13, 6560)
      Count 1632 Tasks 544 Chunks
      Type bool numpy.ndarray
      6560 6540 150
    • e3t
      (z, y, x)
      float64
      dask.array<chunksize=(150, 13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 47.95 GiB 97.60 MiB
      Shape (150, 6540, 6560) (150, 13, 6560)
      Count 1632 Tasks 544 Chunks
      Type float64 numpy.ndarray
      6560 6540 150
    • e1te2t
      (y, x)
      float64
      dask.array<chunksize=(13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 327.32 MiB 666.25 kiB
      Shape (6540, 6560) (13, 6560)
      Count 1632 Tasks 544 Chunks
      Type float64 numpy.ndarray
      6560 6540
    • nav_lat
      (y, x)
      float32
      dask.array<chunksize=(13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 163.66 MiB 333.12 kiB
      Shape (6540, 6560) (13, 6560)
      Count 1632 Tasks 544 Chunks
      Type float32 numpy.ndarray
      6560 6540
    • nav_lon
      (y, x)
      float32
      dask.array<chunksize=(13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 163.66 MiB 333.12 kiB
      Shape (6540, 6560) (13, 6560)
      Count 1632 Tasks 544 Chunks
      Type float32 numpy.ndarray
      6560 6540
    • mask2d
      (y, x)
      bool
      dask.array<chunksize=(13, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 40.91 MiB 83.28 kiB
      Shape (6540, 6560) (13, 6560)
      Count 1632 Tasks 544 Chunks
      Type bool numpy.ndarray
      6560 6540
    • vosaline
      (t, z, y, x)
      float32
      dask.array<chunksize=(1, 150, 13, 6560), meta=np.ndarray>
      cell_methods :
      time: mean (interval: 40 s)
      interval_operation :
      40 s
      interval_write :
      1 d
      long_name :
      salinity
      online_operation :
      average
      standard_name :
      sea_water_practical_salinity
      units :
      1e-3
      Array Chunk
      Bytes 743.18 GiB 48.80 MiB
      Shape (31, 150, 6540, 6560) (1, 150, 13, 6560)
      Count 34272 Tasks 16864 Chunks
      Type float32 numpy.ndarray
      31 1 6560 6540 150
  • CASE :
    DELTA
    CONFIG :
    SEDNA
    Conventions :
    CF-1.6
    DOMAIN_dimensions_ids :
    [2, 3]
    DOMAIN_halo_size_end :
    [0, 0]
    DOMAIN_halo_size_start :
    [0, 0]
    DOMAIN_number :
    0
    DOMAIN_number_total :
    544
    DOMAIN_position_first :
    [1, 1]
    DOMAIN_position_last :
    [6560, 13]
    DOMAIN_size_global :
    [6560, 6540]
    DOMAIN_size_local :
    [6560, 13]
    DOMAIN_type :
    box
    NCO :
    netCDF Operators version 4.9.1 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)
    description :
    ocean T grid variables
    history :
    Tue Jan 18 17:23:11 2022: ncks -4 -L 1 SEDNA-DELTA_1d_gridS_201201-201201_NOZIP_0000.nc /ccc/scratch/cont003/gen7420/talandel/SEDNA/SEDNA-DELTA-S/SPLIT/1d/2012/01/SEDNA-DELTA_1d_gridS_201201-201201_0000.nc Tue Jan 18 17:22:45 2022: ncrcat -n 31,2,1 SEDNA-DELTA_1d_gridS_0000_01.nc SEDNA-DELTA_1d_gridS_201201-201201_NOZIP_0000.nc
    ibegin :
    0
    jbegin :
    0
    name :
    /ccc/scratch/cont003/ra5563/talandel/ONGOING-RUNS/SEDNA-DELTA-XIOS.46/SEDNA-DELTA_1d_gridS
    ni :
    6560
    nj :
    13
    output_frequency :
    1d
    start_date :
    20090101
    timeStamp :
    2022-Jan-17 19:00:16 GMT
    title :
    ocean T grid variables
    uuid :
    d8db76f6-a436-451a-9ab1-72dc892753af
In [6]:
%%time
monitor.auto(df,data,savefig,daskreport,outputpath,file_exp='SEDNA'
            )
#calc= True
#save= True
#plot= False
Value='FWC_2D'
Zone='BBFG'
Plot='maps'
cmap='Spectral_r'
clabel='m'
clim= (0, 24)
outputpath='../results/SEDNA_DELTA_MONITOR/'
nc_outputpath='../nc_results/SEDNA_DELTA_MONITOR/'
filename='SEDNA_maps_BBFG_FWC_2D'
data=monitor.optimize_dataset(data)
#2 Zooming Data
data= zoom.BBFG(data)
data=monitor.optimize_dataset(data)
<xarray.Dataset>
Dimensions:        (t: 31, z: 150, y: 5264, x: 6560)
Coordinates:
    time_centered  (t) object dask.array<chunksize=(1,), meta=np.ndarray>
  * t              (t) object 2012-01-01 12:00:00 ... 2012-01-31 12:00:00
  * y              (y) int64 1277 1278 1279 1280 1281 ... 6537 6538 6539 6540
  * x              (x) int64 1 2 3 4 5 6 7 ... 6555 6556 6557 6558 6559 6560
  * z              (z) int64 1 2 3 4 5 6 7 8 ... 143 144 145 146 147 148 149 150
    mask           (z, y, x) bool dask.array<chunksize=(150, 8, 6560), meta=np.ndarray>
    e3t            (z, y, x) float64 dask.array<chunksize=(150, 8, 6560), meta=np.ndarray>
    e1te2t         (y, x) float64 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    nav_lat        (y, x) float32 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    nav_lon        (y, x) float32 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    mask2d         (y, x) bool dask.array<chunksize=(8, 6560), meta=np.ndarray>
Data variables:
    vosaline       (t, z, y, x) float32 dask.array<chunksize=(1, 150, 8, 6560), meta=np.ndarray>
Attributes: (12/26)
    CASE:                    DELTA
    CONFIG:                  SEDNA
    Conventions:             CF-1.6
    DOMAIN_dimensions_ids:   [2, 3]
    DOMAIN_halo_size_end:    [0, 0]
    DOMAIN_halo_size_start:  [0, 0]
    ...                      ...
    nj:                      13
    output_frequency:        1d
    start_date:              20090101
    timeStamp:               2022-Jan-17 19:00:16 GMT
    title:                   ocean T grid variables
    uuid:                    d8db76f6-a436-451a-9ab1-72dc892753af
xarray.Dataset
    • t: 31
    • z: 150
    • y: 5264
    • x: 6560
    • time_centered
      (t)
      object
      dask.array<chunksize=(1,), meta=np.ndarray>
      bounds :
      time_centered_bounds
      long_name :
      Time axis
      standard_name :
      time
      time_origin :
      1900-01-01 00:00:00
      Array Chunk
      Bytes 248 B 8 B
      Shape (31,) (1,)
      Count 32 Tasks 31 Chunks
      Type object numpy.ndarray
      31 1
    • t
      (t)
      object
      2012-01-01 12:00:00 ... 2012-01-...
      axis :
      T
      bounds :
      time_counter_bounds
      long_name :
      Time axis
      standard_name :
      time
      time_origin :
      1900-01-01 00:00:00
      array([cftime.DatetimeNoLeap(2012, 1, 1, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 2, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 3, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 4, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 5, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 6, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 7, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 8, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 9, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 10, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 11, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 12, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 13, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 14, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 15, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 16, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 17, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 18, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 19, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 20, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 21, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 22, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 23, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 24, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 25, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 26, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 27, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 28, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 29, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 30, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 31, 12, 0, 0, 0, has_year_zero=True)],
            dtype=object)
    • y
      (y)
      int64
      1277 1278 1279 ... 6538 6539 6540
      array([1277, 1278, 1279, ..., 6538, 6539, 6540])
    • x
      (x)
      int64
      1 2 3 4 5 ... 6557 6558 6559 6560
      array([   1,    2,    3, ..., 6558, 6559, 6560])
    • z
      (z)
      int64
      1 2 3 4 5 6 ... 146 147 148 149 150
      array([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14,
              15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,  26,  27,  28,
              29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39,  40,  41,  42,
              43,  44,  45,  46,  47,  48,  49,  50,  51,  52,  53,  54,  55,  56,
              57,  58,  59,  60,  61,  62,  63,  64,  65,  66,  67,  68,  69,  70,
              71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81,  82,  83,  84,
              85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95,  96,  97,  98,
              99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
             113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
             127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140,
             141, 142, 143, 144, 145, 146, 147, 148, 149, 150])
    • mask
      (z, y, x)
      bool
      dask.array<chunksize=(150, 8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 4.82 GiB 11.26 MiB
      Shape (150, 5264, 6560) (150, 12, 6560)
      Count 2071 Tasks 439 Chunks
      Type bool numpy.ndarray
      6560 5264 150
    • e3t
      (z, y, x)
      float64
      dask.array<chunksize=(150, 8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 38.59 GiB 90.09 MiB
      Shape (150, 5264, 6560) (150, 12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float64 numpy.ndarray
      6560 5264 150
    • e1te2t
      (y, x)
      float64
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 263.46 MiB 615.00 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float64 numpy.ndarray
      6560 5264
    • nav_lat
      (y, x)
      float32
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 131.73 MiB 307.50 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float32 numpy.ndarray
      6560 5264
    • nav_lon
      (y, x)
      float32
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 131.73 MiB 307.50 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float32 numpy.ndarray
      6560 5264
    • mask2d
      (y, x)
      bool
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 32.93 MiB 76.88 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type bool numpy.ndarray
      6560 5264
    • vosaline
      (t, z, y, x)
      float32
      dask.array<chunksize=(1, 150, 8, 6560), meta=np.ndarray>
      cell_methods :
      time: mean (interval: 40 s)
      interval_operation :
      40 s
      interval_write :
      1 d
      long_name :
      salinity
      online_operation :
      average
      standard_name :
      sea_water_practical_salinity
      units :
      1e-3
      Array Chunk
      Bytes 598.18 GiB 45.04 MiB
      Shape (31, 150, 5264, 6560) (1, 150, 12, 6560)
      Count 42145 Tasks 13609 Chunks
      Type float32 numpy.ndarray
      31 1 6560 5264 150
  • CASE :
    DELTA
    CONFIG :
    SEDNA
    Conventions :
    CF-1.6
    DOMAIN_dimensions_ids :
    [2, 3]
    DOMAIN_halo_size_end :
    [0, 0]
    DOMAIN_halo_size_start :
    [0, 0]
    DOMAIN_number :
    0
    DOMAIN_number_total :
    544
    DOMAIN_position_first :
    [1, 1]
    DOMAIN_position_last :
    [6560, 13]
    DOMAIN_size_global :
    [6560, 6540]
    DOMAIN_size_local :
    [6560, 13]
    DOMAIN_type :
    box
    NCO :
    netCDF Operators version 4.9.1 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)
    description :
    ocean T grid variables
    history :
    Tue Jan 18 17:23:11 2022: ncks -4 -L 1 SEDNA-DELTA_1d_gridS_201201-201201_NOZIP_0000.nc /ccc/scratch/cont003/gen7420/talandel/SEDNA/SEDNA-DELTA-S/SPLIT/1d/2012/01/SEDNA-DELTA_1d_gridS_201201-201201_0000.nc Tue Jan 18 17:22:45 2022: ncrcat -n 31,2,1 SEDNA-DELTA_1d_gridS_0000_01.nc SEDNA-DELTA_1d_gridS_201201-201201_NOZIP_0000.nc
    ibegin :
    0
    jbegin :
    0
    name :
    /ccc/scratch/cont003/ra5563/talandel/ONGOING-RUNS/SEDNA-DELTA-XIOS.46/SEDNA-DELTA_1d_gridS
    ni :
    6560
    nj :
    13
    output_frequency :
    1d
    start_date :
    20090101
    timeStamp :
    2022-Jan-17 19:00:16 GMT
    title :
    ocean T grid variables
    uuid :
    d8db76f6-a436-451a-9ab1-72dc892753af
#3 Start computing 
data= calc.FWC2D_UFUNC(data)
monitor.optimize_dataset(data)
add optimise here once otimise can recognise
<xarray.Dataset>
Dimensions:        (t: 31, y: 5264, x: 6560)
Coordinates:
    time_centered  (t) object dask.array<chunksize=(1,), meta=np.ndarray>
  * t              (t) object 2012-01-01 12:00:00 ... 2012-01-31 12:00:00
  * y              (y) int64 1277 1278 1279 1280 1281 ... 6537 6538 6539 6540
  * x              (x) int64 1 2 3 4 5 6 7 ... 6555 6556 6557 6558 6559 6560
    e1te2t         (y, x) float64 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    nav_lat        (y, x) float32 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    nav_lon        (y, x) float32 dask.array<chunksize=(8, 6560), meta=np.ndarray>
    mask2d         (y, x) bool dask.array<chunksize=(8, 6560), meta=np.ndarray>
Data variables:
    FWC2D          (t, y, x) float32 dask.array<chunksize=(1, 8, 6560), meta=np.ndarray>
xarray.Dataset
    • t: 31
    • y: 5264
    • x: 6560
    • time_centered
      (t)
      object
      dask.array<chunksize=(1,), meta=np.ndarray>
      Array Chunk
      Bytes 248 B 8 B
      Shape (31,) (1,)
      Count 32 Tasks 31 Chunks
      Type object numpy.ndarray
      31 1
    • t
      (t)
      object
      2012-01-01 12:00:00 ... 2012-01-...
      array([cftime.DatetimeNoLeap(2012, 1, 1, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 2, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 3, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 4, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 5, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 6, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 7, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 8, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 9, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 10, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 11, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 12, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 13, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 14, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 15, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 16, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 17, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 18, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 19, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 20, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 21, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 22, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 23, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 24, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 25, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 26, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 27, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 28, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 29, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 30, 12, 0, 0, 0, has_year_zero=True),
             cftime.DatetimeNoLeap(2012, 1, 31, 12, 0, 0, 0, has_year_zero=True)],
            dtype=object)
    • y
      (y)
      int64
      1277 1278 1279 ... 6538 6539 6540
      array([1277, 1278, 1279, ..., 6538, 6539, 6540])
    • x
      (x)
      int64
      1 2 3 4 5 ... 6557 6558 6559 6560
      array([   1,    2,    3, ..., 6558, 6559, 6560])
    • e1te2t
      (y, x)
      float64
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 263.46 MiB 615.00 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float64 numpy.ndarray
      6560 5264
    • nav_lat
      (y, x)
      float32
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 131.73 MiB 307.50 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float32 numpy.ndarray
      6560 5264
    • nav_lon
      (y, x)
      float32
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 131.73 MiB 307.50 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type float32 numpy.ndarray
      6560 5264
    • mask2d
      (y, x)
      bool
      dask.array<chunksize=(8, 6560), meta=np.ndarray>
      Array Chunk
      Bytes 32.93 MiB 76.88 kiB
      Shape (5264, 6560) (12, 6560)
      Count 2071 Tasks 439 Chunks
      Type bool numpy.ndarray
      6560 5264
    • FWC2D
      (t, y, x)
      float32
      dask.array<chunksize=(1, 8, 6560), meta=np.ndarray>
      long_name :
      Fresh Water Content 2D
      Array Chunk
      Bytes 3.99 GiB 307.50 kiB
      Shape (31, 5264, 6560) (1, 12, 6560)
      Count 128819 Tasks 13609 Chunks
      Type float32 numpy.ndarray
      6560 5264 31
#4 Saving  SEDNA_maps_BBFG_FWC_2D
data=save.datas(data,plot=Plot,path=nc_outputpath,filename=filename)
start saving data
saving data in a file
t (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
slice(0, 1, None)
2022-08-04 11:43:34,302 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker.  Process memory: 3.81 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:34,345 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:37126 (pid=7551) exceeded 99% memory budget. Restarting...
2022-08-04 11:43:34,529 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37126
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:51936 remote=tcp://127.0.0.1:37126>: ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37126 after 30 s
2022-08-04 11:43:34,960 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:43:35,027 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker.  Process memory: 3.79 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:35,192 - distributed.worker_memory - WARNING - Worker is at 94% memory usage. Resuming worker. Process memory: 3.71 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:53,984 - distributed.nanny - WARNING - Worker process still alive after 3.9999975204467777 seconds, killing
2022-08-04 11:44:11,411 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:44:11,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:44:11,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:44:42,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:44:42,118 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:44:42,118 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:45:14,113 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:45:14,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:45:14,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:45:42,133 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:32974 (pid=7566) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:42,380 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:35204 (pid=7529) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:42,413 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32974
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:56706 remote=tcp://127.0.0.1:32974>: Stream is closed
2022-08-04 11:45:42,789 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:40234 -> tcp://127.0.0.1:35204
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:40234 remote=tcp://127.0.0.1:51652>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:42,790 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35204
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:46266 remote=tcp://127.0.0.1:35204>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:43,113 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:43,119 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:47,147 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35204
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:46384 remote=tcp://127.0.0.1:35204>: Stream is closed
2022-08-04 11:45:50,177 - distributed.worker_memory - WARNING - Worker is at 100% memory usage. Pausing worker.  Process memory: 3.96 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:50,228 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:40234 (pid=7514) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:32835 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:32835 remote=tcp://127.0.0.1:57448>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33654 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:33654 remote=tcp://127.0.0.1:44918>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:38041 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:38041 remote=tcp://127.0.0.1:42922>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51650 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:42399 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:42399 remote=tcp://127.0.0.1:56642>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46781 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:46781 remote=tcp://127.0.0.1:43566>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36671 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:36671 remote=tcp://127.0.0.1:60050>: Stream is closed
2022-08-04 11:45:50,595 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44423 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:44423 remote=tcp://127.0.0.1:49478>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,596 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43605 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:43605 remote=tcp://127.0.0.1:51542>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,599 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:35922 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:35922 remote=tcp://127.0.0.1:55064>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,588 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51688 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51734 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52226 remote=tcp://127.0.0.1:40234>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,590 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51726 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,592 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51658 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,604 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51732 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,717 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:50,934 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:35535 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:35535 remote=tcp://127.0.0.1:46884>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:51,011 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46546 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:46546 remote=tcp://127.0.0.1:59298>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:45:51,241 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51724 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:51,333 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33837 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:33837 remote=tcp://127.0.0.1:39534>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:45:51,313 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51728 remote=tcp://127.0.0.1:40234>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:51,635 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:45781 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:45781 remote=tcp://127.0.0.1:40034>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:02,125 - distributed.nanny - WARNING - Worker process still alive after 3.999999237060547 seconds, killing
2022-08-04 11:46:02,155 - distributed.nanny - WARNING - Worker process still alive after 3.9999996185302735 seconds, killing
2022-08-04 11:46:09,761 - distributed.nanny - WARNING - Worker process still alive after 3.9999988555908206 seconds, killing
2022-08-04 11:46:11,818 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:11,890 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:13,294 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43615
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43615 after 30 s
2022-08-04 11:46:13,338 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:13,509 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:13,586 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:14,801 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:15,155 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:15,353 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:15,503 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:15,771 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:16,377 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:16,503 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:16,494 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:16,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:16,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:16,506 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:16,585 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:16,674 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:17,191 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:17,240 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:17,241 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:17,243 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:17,208 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:17,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:18,307 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:18,347 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:18,371 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:18,412 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:18,493 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:18,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:18,529 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:18,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:18,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:18,729 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:18,730 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:19,014 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:19,200 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,197 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,435 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:19,435 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:19,399 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:19,461 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,461 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:19,719 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:19,713 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:19,875 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:19,875 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:19,876 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:20,058 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,020 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,106 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:20,293 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:20,294 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:20,740 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,970 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:20,954 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:21,025 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:21,134 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:21,268 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:21,269 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:21,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:21,351 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,371 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:21,454 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:21,387 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:21,457 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:21,458 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:21,458 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:21,681 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,744 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:21,744 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:22,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:22,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:22,465 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:22,443 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:22,882 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:22,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:22,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:23,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:23,610 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:42,098 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:42,120 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:43,378 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43615
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43615 after 30 s
2022-08-04 11:46:44,044 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:44,045 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:44,147 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:45,337 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:45,514 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:46,096 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:46,112 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:46,128 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:47,325 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:47,326 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:47,326 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:47,515 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:47,693 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:47,693 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:47,830 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:48,083 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:48,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:48,237 - distributed.scheduler - WARNING - Received heartbeat from unregistered worker 'tcp://127.0.0.1:36902'.
2022-08-04 11:46:48,400 - distributed.worker - ERROR - Scheduler was unaware of this worker 'tcp://127.0.0.1:36902'. Shutting down.
2022-08-04 11:46:48,400 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:37066 -> tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:37066 remote=tcp://127.0.0.1:48872>: Stream is closed
2022-08-04 11:46:48,406 - distributed.nanny - ERROR - Worker process died unexpectedly
2022-08-04 11:46:48,797 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:49,042 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52658 remote=tcp://127.0.0.1:36902>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:49,336 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:49,655 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:49,823 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:49,870 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
    async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
    await asyncio.wait_for(comm.write(local_info), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:49,914 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:49,943 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:49,992 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:34796 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:34796 remote=tcp://127.0.0.1:49372>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:50,097 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
    frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:50,259 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44490 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:44490 remote=tcp://127.0.0.1:57862>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:50,808 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46401 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:46401 remote=tcp://127.0.0.1:56638>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:54,190 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52624 remote=tcp://127.0.0.1:36902>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:54,207 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
    async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
    await asyncio.wait_for(comm.write(local_info), time_left())
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
    raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:55,281 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:35542 (pid=7530) exceeded 99% memory budget. Restarting...
2022-08-04 11:46:55,636 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:46:56,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
    n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:54216 remote=tcp://127.0.0.1:35542>: Stream is closed
2022-08-04 11:46:56,570 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43605 -> tcp://127.0.0.1:35542
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
    num_bytes = self.write_to_fd(self._write_buffer.peek(size))
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
    return self.socket.send(data)  # type: ignore
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:43605 remote=tcp://127.0.0.1:52964>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:47:04,811 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker.  Process memory: 3.84 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:47:06,855 - distributed.worker_memory - WARNING - Worker is at 86% memory usage. Resuming worker. Process memory: 3.40 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:47:19,635 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
    comm = await asyncio.wait_for(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2acd6da66890>: ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
    comm = await rpc.connect(worker)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
    return await connect_attempt
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
    comm = await connect(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
    raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:36902 after 30 s
slice(1, 2, None)
slice(2, 3, None)
slice(3, 4, None)
slice(4, 5, None)
slice(5, 6, None)
slice(6, 7, None)
slice(7, 8, None)
slice(8, 9, None)
slice(9, 10, None)
slice(10, 11, None)
slice(11, 12, None)
2022-08-04 11:54:09,309 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:39097 (pid=7574) exceeded 99% memory budget. Restarting...
2022-08-04 11:54:09,538 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:45624 remote=tcp://127.0.0.1:39097>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,537 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46201 -> tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:46201 remote=tcp://127.0.0.1:45220>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,534 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:41322 remote=tcp://127.0.0.1:39097>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,557 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:40654 remote=tcp://127.0.0.1:39097>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,576 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 315, in write
    raise StreamClosedError()
tornado.iostream.StreamClosedError: Stream is closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
    response = await get_data_from_worker(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
    return await retry_operation(_get_data, operation="get_data_from_worker")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
    return await retry(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
    return await coro()
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
    response = await send_recv(
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 916, in send_recv
    await comm.write(msg, serializers=serializers, on_error="raise")
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 326, in write
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:40250 remote=tcp://127.0.0.1:39097>: Stream is closed
2022-08-04 11:54:09,578 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:37595 -> tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:37595 remote=tcp://127.0.0.1:48340>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,579 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:35067 -> tcp://127.0.0.1:39097
Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
    response = await comm.read(deserializers=serializers)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
    convert_stream_closed_error(self, e)
  File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
    raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://127.0.0.1:35067 remote=tcp://127.0.0.1:59496>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:54:09,681 - distributed.nanny - WARNING - Restarting worker
slice(12, 13, None)
slice(13, 14, None)
slice(14, 15, None)
slice(15, 16, None)
slice(16, 17, None)
slice(17, 18, None)
slice(18, 19, None)
slice(19, 20, None)
slice(20, 21, None)
slice(21, 22, None)
slice(22, 23, None)
slice(23, 24, None)
slice(24, 25, None)
slice(25, 26, None)
slice(26, 27, None)
slice(27, 28, None)
slice(28, 29, None)
slice(29, 30, None)
slice(30, 31, None)
CPU times: user 13min 47s, sys: 3min 25s, total: 17min 12s
Wall time: 24min 35s