2022-08-04 11:43:34,302 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker. Process memory: 3.81 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:34,345 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:37126 (pid=7551) exceeded 99% memory budget. Restarting...
2022-08-04 11:43:34,529 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37126
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:51936 remote=tcp://127.0.0.1:37126>: ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37126 after 30 s
2022-08-04 11:43:34,960 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:43:35,027 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker. Process memory: 3.79 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:35,192 - distributed.worker_memory - WARNING - Worker is at 94% memory usage. Resuming worker. Process memory: 3.71 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:53,984 - distributed.nanny - WARNING - Worker process still alive after 3.9999975204467777 seconds, killing
2022-08-04 11:44:11,411 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:44:11,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:44:11,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:44:42,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:44:42,118 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:44:42,118 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:45:14,113 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44423
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44423 after 30 s
2022-08-04 11:45:14,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43632
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43632 after 30 s
2022-08-04 11:45:14,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32835
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:32835 after 30 s
2022-08-04 11:45:42,133 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:32974 (pid=7566) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:42,380 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:35204 (pid=7529) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:42,413 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:32974
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:56706 remote=tcp://127.0.0.1:32974>: Stream is closed
2022-08-04 11:45:42,789 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:40234 -> tcp://127.0.0.1:35204
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:40234 remote=tcp://127.0.0.1:51652>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:42,790 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35204
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:46266 remote=tcp://127.0.0.1:35204>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:43,113 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:43,119 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:47,147 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35204
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:46384 remote=tcp://127.0.0.1:35204>: Stream is closed
2022-08-04 11:45:50,177 - distributed.worker_memory - WARNING - Worker is at 100% memory usage. Pausing worker. Process memory: 3.96 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:50,228 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:40234 (pid=7514) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:32835 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:32835 remote=tcp://127.0.0.1:57448>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33654 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33654 remote=tcp://127.0.0.1:44918>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:38041 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:38041 remote=tcp://127.0.0.1:42922>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51650 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:42399 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:42399 remote=tcp://127.0.0.1:56642>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46781 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:46781 remote=tcp://127.0.0.1:43566>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,589 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36671 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36671 remote=tcp://127.0.0.1:60050>: Stream is closed
2022-08-04 11:45:50,595 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44423 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:44423 remote=tcp://127.0.0.1:49478>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,596 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43605 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:43605 remote=tcp://127.0.0.1:51542>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,599 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:35922 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:35922 remote=tcp://127.0.0.1:55064>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,588 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51688 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,588 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51734 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,589 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52226 remote=tcp://127.0.0.1:40234>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:50,590 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51726 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,592 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51658 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,604 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51732 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:50,717 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:50,934 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:35535 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:35535 remote=tcp://127.0.0.1:46884>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:51,011 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46546 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:46546 remote=tcp://127.0.0.1:59298>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:45:51,241 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51724 remote=tcp://127.0.0.1:40234>: Stream is closed
2022-08-04 11:45:51,333 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33837 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33837 remote=tcp://127.0.0.1:39534>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:45:51,313 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:51728 remote=tcp://127.0.0.1:40234>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:51,635 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:45781 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:45781 remote=tcp://127.0.0.1:40034>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:02,125 - distributed.nanny - WARNING - Worker process still alive after 3.999999237060547 seconds, killing
2022-08-04 11:46:02,155 - distributed.nanny - WARNING - Worker process still alive after 3.9999996185302735 seconds, killing
2022-08-04 11:46:09,761 - distributed.nanny - WARNING - Worker process still alive after 3.9999988555908206 seconds, killing
2022-08-04 11:46:11,818 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:11,890 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:13,294 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43615
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43615 after 30 s
2022-08-04 11:46:13,338 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:13,509 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:13,586 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:14,801 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:15,155 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:15,353 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:15,503 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:15,771 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:16,377 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:16,503 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:16,494 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:16,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:16,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:16,506 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:16,585 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:16,674 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:17,191 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:17,240 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:17,241 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:17,243 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:17,208 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:17,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:18,307 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:18,347 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:18,371 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:18,412 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:18,493 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:18,495 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:18,529 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:18,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:18,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:18,729 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:18,730 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:19,014 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:19,200 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,197 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,435 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:19,435 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:19,399 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:19,461 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:19,461 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:19,719 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:19,713 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:19,875 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:19,875 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:19,876 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:20,058 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,020 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,106 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:20,293 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:20,294 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:20,740 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:20,970 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:20,954 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:21,025 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:21,134 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,213 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37877
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37877 after 30 s
2022-08-04 11:46:21,268 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:21,269 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:21,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:21,351 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,371 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:21,454 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:21,387 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:21,457 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:21,458 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:21,458 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:21,681 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:21,744 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:21,744 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:22,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:22,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:22,465 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:22,443 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:22,882 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:22,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:22,945 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:23,544 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:23,610 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:42,098 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38308
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38308 after 30 s
2022-08-04 11:46:42,120 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39028
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39028 after 30 s
2022-08-04 11:46:43,378 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43615
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43615 after 30 s
2022-08-04 11:46:44,044 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45847
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45847 after 30 s
2022-08-04 11:46:44,045 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:39522
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:39522 after 30 s
2022-08-04 11:46:44,147 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:45,337 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36076
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36076 after 30 s
2022-08-04 11:46:45,514 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:46,096 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:46,112 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36902 after 30 s
2022-08-04 11:46:46,128 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:47,325 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46769
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46769 after 30 s
2022-08-04 11:46:47,326 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:47,326 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:47,515 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:47,693 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35542 after 30 s
2022-08-04 11:46:47,693 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45449
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45449 after 30 s
2022-08-04 11:46:47,830 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44235
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44235 after 30 s
2022-08-04 11:46:48,083 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:48,115 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:48,237 - distributed.scheduler - WARNING - Received heartbeat from unregistered worker 'tcp://127.0.0.1:36902'.
2022-08-04 11:46:48,400 - distributed.worker - ERROR - Scheduler was unaware of this worker 'tcp://127.0.0.1:36902'. Shutting down.
2022-08-04 11:46:48,400 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:37066 -> tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:37066 remote=tcp://127.0.0.1:48872>: Stream is closed
2022-08-04 11:46:48,406 - distributed.nanny - ERROR - Worker process died unexpectedly
2022-08-04 11:46:48,797 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:49,042 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52658 remote=tcp://127.0.0.1:36902>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:49,336 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43103
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43103 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42931
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42931 after 30 s
2022-08-04 11:46:49,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:34796
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:34796 after 30 s
2022-08-04 11:46:49,655 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46401
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46401 after 30 s
2022-08-04 11:46:49,823 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42737
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42737 after 30 s
2022-08-04 11:46:49,870 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46090
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46090 after 30 s
2022-08-04 11:46:49,914 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:49,943 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:49,992 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:34796 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:34796 remote=tcp://127.0.0.1:49372>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:50,097 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44490
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44490 after 30 s
2022-08-04 11:46:50,259 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44490 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:44490 remote=tcp://127.0.0.1:57862>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:50,808 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46401 -> tcp://127.0.0.1:40234
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:46401 remote=tcp://127.0.0.1:56638>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:54,190 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:52624 remote=tcp://127.0.0.1:36902>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:54,207 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38918
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38918 after 30 s
2022-08-04 11:46:55,281 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:35542 (pid=7530) exceeded 99% memory budget. Restarting...
2022-08-04 11:46:55,636 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:46:56,532 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:54216 remote=tcp://127.0.0.1:35542>: Stream is closed
2022-08-04 11:46:56,570 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43605 -> tcp://127.0.0.1:35542
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:43605 remote=tcp://127.0.0.1:52964>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:47:04,811 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker. Process memory: 3.84 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:47:06,855 - distributed.worker_memory - WARNING - Worker is at 86% memory usage. Resuming worker. Process memory: 3.40 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:47:19,635 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36902
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2acd6da66890>: ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:36902 after 30 s