2022-08-04 11:43:24,153 - distributed.worker_memory - WARNING - Worker is at 95% memory usage. Pausing worker. Process memory: 3.74 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:24,283 - distributed.worker_memory - WARNING - Worker is at 93% memory usage. Resuming worker. Process memory: 3.65 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:24,473 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker. Process memory: 3.82 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:24,674 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker. Process memory: 3.79 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:24,713 - distributed.worker_memory - WARNING - Worker is at 91% memory usage. Resuming worker. Process memory: 3.60 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:24,833 - distributed.worker_memory - WARNING - Worker is at 94% memory usage. Resuming worker. Process memory: 3.70 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:25,100 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker. Process memory: 3.78 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:25,267 - distributed.worker_memory - WARNING - Worker is at 97% memory usage. Pausing worker. Process memory: 3.83 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:25,310 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:33065 (pid=172687) exceeded 99% memory budget. Restarting...
2022-08-04 11:43:25,316 - distributed.worker_memory - WARNING - Worker is at 94% memory usage. Resuming worker. Process memory: 3.69 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:25,572 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:33065
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:48092 remote=tcp://127.0.0.1:33065>: Stream is closed
2022-08-04 11:43:25,825 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker. Process memory: 3.78 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:25,959 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:43:25,992 - distributed.worker_memory - WARNING - Worker is at 92% memory usage. Resuming worker. Process memory: 3.64 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:43:45,003 - distributed.nanny - WARNING - Worker process still alive after 3.9999975204467777 seconds, killing
2022-08-04 11:45:06,596 - distributed.worker_memory - WARNING - Worker is at 95% memory usage. Pausing worker. Process memory: 3.76 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:06,676 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:33890 (pid=172807) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:06,949 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:07,591 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:33890
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:56444 remote=tcp://127.0.0.1:33890>: Stream is closed
2022-08-04 11:45:13,467 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:36646 (pid=172725) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:13,735 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:40991 -> tcp://127.0.0.1:36646
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:40991 remote=tcp://127.0.0.1:39102>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:13,751 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36646
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39832 remote=tcp://127.0.0.1:36646>: Stream is closed
2022-08-04 11:45:14,168 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:14,976 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36646
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:39962 remote=tcp://127.0.0.1:36646>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36646 after 30 s
2022-08-04 11:45:17,242 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 214, in read
async def read(self, deserializers=None):
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:18,405 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:19,987 - distributed.worker_memory - WARNING - Worker is at 96% memory usage. Pausing worker. Process memory: 3.80 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:20,092 - distributed.worker_memory - WARNING - Worker is at 93% memory usage. Resuming worker. Process memory: 3.69 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:20,806 - distributed.worker_memory - WARNING - Worker is at 100% memory usage. Pausing worker. Process memory: 3.93 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:45:20,818 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:40991 (pid=172742) exceeded 99% memory budget. Restarting...
2022-08-04 11:45:21,101 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33970 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33970 remote=tcp://127.0.0.1:57670>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,102 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44714 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:44714 remote=tcp://127.0.0.1:53970>: Stream is closed
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:39398 remote=tcp://127.0.0.1:40991>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:21,107 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:37697 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:37697 remote=tcp://127.0.0.1:37460>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,107 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43076 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:43076 remote=tcp://127.0.0.1:51470>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39382 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39182 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39206 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39178 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,105 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:38616 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:38616 remote=tcp://127.0.0.1:42054>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,106 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36067 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36067 remote=tcp://127.0.0.1:42316>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,145 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39384 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,108 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39198 remote=tcp://127.0.0.1:40991>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,109 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39166 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,174 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:45:21,102 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39170 remote=tcp://127.0.0.1:40991>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,103 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:39418 remote=tcp://127.0.0.1:40991>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:21,103 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:39416 remote=tcp://127.0.0.1:40991>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:21,103 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:39396 remote=tcp://127.0.0.1:40991>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:45:21,145 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39210 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,403 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39204 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,628 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:32888 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:32888 remote=tcp://127.0.0.1:50948>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,697 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39208 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,823 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39188 remote=tcp://127.0.0.1:40991>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,832 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:40737 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:40737 remote=tcp://127.0.0.1:52106>: Stream is closed
2022-08-04 11:45:21,801 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:46297 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:46297 remote=tcp://127.0.0.1:41306>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,822 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:43548 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:43548 remote=tcp://127.0.0.1:55332>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,877 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:41396 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:41396 remote=tcp://127.0.0.1:51134>: Stream is closed
2022-08-04 11:45:21,860 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39212 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,875 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39200 remote=tcp://127.0.0.1:40991>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:21,880 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39194 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:21,908 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39180 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:22,025 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39144 remote=tcp://127.0.0.1:40991>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:45:22,072 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33466 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33466 remote=tcp://127.0.0.1:38240>: Stream is closed
2022-08-04 11:45:22,075 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39202 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:23,430 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36890 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36890 remote=tcp://127.0.0.1:54956>: Stream is closed
2022-08-04 11:45:23,433 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39190 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:23,625 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:39192 remote=tcp://127.0.0.1:40991>: Stream is closed
2022-08-04 11:45:25,968 - distributed.nanny - WARNING - Worker process still alive after 3.999998664855957 seconds, killing
2022-08-04 11:45:33,174 - distributed.nanny - WARNING - Worker process still alive after 3.999999237060547 seconds, killing
2022-08-04 11:45:37,757 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:45:37,846 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41243
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41243 after 30 s
2022-08-04 11:45:38,955 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43074 after 30 s
2022-08-04 11:45:40,186 - distributed.nanny - WARNING - Worker process still alive after 3.9999994277954105 seconds, killing
2022-08-04 11:45:40,120 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36762
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36762 after 30 s
2022-08-04 11:45:40,201 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:45:44,228 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42621
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42621 after 30 s
2022-08-04 11:45:44,229 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:45:44,229 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44428
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44428 after 30 s
2022-08-04 11:45:44,229 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38909
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38909 after 30 s
2022-08-04 11:45:45,137 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41690
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41690 after 30 s
2022-08-04 11:45:46,017 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41979
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41979 after 30 s
2022-08-04 11:45:46,082 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:45:46,526 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:45:46,526 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:45:49,600 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:45:49,529 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:45:49,633 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43074 after 30 s
2022-08-04 11:45:49,934 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:45:49,990 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:45:50,023 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45561
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45561 after 30 s
2022-08-04 11:45:50,439 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:45:50,488 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41243
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41243 after 30 s
2022-08-04 11:45:50,530 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:45:50,439 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:45:50,530 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41361
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41361 after 30 s
2022-08-04 11:45:50,530 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:45:51,040 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:45:51,074 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:45:51,174 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:45:51,489 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:45:51,524 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:45:51,525 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43593
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43593 after 30 s
2022-08-04 11:45:51,525 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44428
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44428 after 30 s
2022-08-04 11:45:51,525 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44368
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44368 after 30 s
2022-08-04 11:45:51,782 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:45:51,783 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:45:51,781 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36029
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36029 after 30 s
2022-08-04 11:45:51,853 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44635
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44635 after 30 s
2022-08-04 11:45:51,883 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:45:51,923 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36762
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36762 after 30 s
2022-08-04 11:45:51,988 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:45:52,047 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:45:52,095 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:45:52,096 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:45:52,346 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:45:52,453 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:45:53,201 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41456 after 30 s
2022-08-04 11:45:53,234 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41361
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41361 after 30 s
2022-08-04 11:45:53,540 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42621
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42621 after 30 s
2022-08-04 11:45:53,540 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44368
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44368 after 30 s
2022-08-04 11:45:53,540 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38909
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38909 after 30 s
2022-08-04 11:45:53,593 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:45:53,732 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:45:53,771 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43593
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43593 after 30 s
2022-08-04 11:45:53,772 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45561
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45561 after 30 s
2022-08-04 11:45:53,772 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:45:55,052 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44635
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44635 after 30 s
2022-08-04 11:45:55,093 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44775
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44775 after 30 s
2022-08-04 11:45:57,219 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:09,446 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41243
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41243 after 30 s
2022-08-04 11:46:09,450 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:46:10,938 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43074 after 30 s
2022-08-04 11:46:13,174 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36762
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36762 after 30 s
2022-08-04 11:46:13,175 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:46:15,445 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41690
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41690 after 30 s
2022-08-04 11:46:16,155 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:46:16,155 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42621
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42621 after 30 s
2022-08-04 11:46:16,156 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38909
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38909 after 30 s
2022-08-04 11:46:16,156 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44428
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44428 after 30 s
2022-08-04 11:46:17,411 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:17,411 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:46:17,412 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41979
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41979 after 30 s
2022-08-04 11:46:17,412 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:46:20,534 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41243
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41243 after 30 s
2022-08-04 11:46:20,535 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:46:20,536 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41361
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41361 after 30 s
2022-08-04 11:46:20,631 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:46:20,631 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43074 after 30 s
2022-08-04 11:46:21,021 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:46:21,224 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:46:21,225 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45561
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45561 after 30 s
2022-08-04 11:46:21,380 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:46:21,403 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:46:21,404 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:46:21,403 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41160
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41160 after 30 s
2022-08-04 11:46:21,459 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:46:21,459 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:46:21,731 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:46:23,021 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:46:23,022 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:23,201 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44635
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44635 after 30 s
2022-08-04 11:46:23,219 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:23,219 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:46:23,220 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36762
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36762 after 30 s
2022-08-04 11:46:23,220 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:23,251 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:35512
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:35512 after 30 s
2022-08-04 11:46:23,284 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:46:23,285 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:46:23,285 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44428
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44428 after 30 s
2022-08-04 11:46:23,285 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44368
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44368 after 30 s
2022-08-04 11:46:23,285 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43593
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43593 after 30 s
2022-08-04 11:46:23,381 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37001
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37001 after 30 s
2022-08-04 11:46:23,381 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:37066
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:37066 after 30 s
2022-08-04 11:46:23,402 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42359
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42359 after 30 s
2022-08-04 11:46:23,459 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36029
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:36029 after 30 s
2022-08-04 11:46:24,935 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41456 after 30 s
2022-08-04 11:46:24,936 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41361
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41361 after 30 s
2022-08-04 11:46:25,204 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44368
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44368 after 30 s
2022-08-04 11:46:25,210 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42138
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42138 after 30 s
2022-08-04 11:46:25,210 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44635
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44635 after 30 s
2022-08-04 11:46:25,211 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:44775
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:44775 after 30 s
2022-08-04 11:46:25,460 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:42621
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:42621 after 30 s
2022-08-04 11:46:25,460 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:38909
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:38909 after 30 s
2022-08-04 11:46:25,463 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43530
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43530 after 30 s
2022-08-04 11:46:25,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43593
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43593 after 30 s
2022-08-04 11:46:25,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:45561
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:45561 after 30 s
2022-08-04 11:46:25,464 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:46045
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 328, in connect
handshake = await asyncio.wait_for(comm.read(), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:46045 after 30 s
2022-08-04 11:46:27,292 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41690
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:55434 remote=tcp://127.0.0.1:41690>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:27,567 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36762 -> tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 971, in _handle_write
num_bytes = self.write_to_fd(self._write_buffer.peek(size))
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1148, in write_to_fd
return self.socket.send(data) # type: ignore
BrokenPipeError: [Errno 32] Broken pipe
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36762 remote=tcp://127.0.0.1:32792>: BrokenPipeError: [Errno 32] Broken pipe
2022-08-04 11:46:29,019 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:41690 -> tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
TimeoutError: [Errno 110] Connection timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:41690 remote=tcp://127.0.0.1:55434>: TimeoutError: [Errno 110] Connection timed out
2022-08-04 11:46:29,051 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:42113 -> tcp://127.0.0.1:41979
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
bytes_read = self.read_from_fd(buf)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/iostream.py", line 1140, in read_from_fd
return self.socket.recv_into(buf, len(buf))
ConnectionResetError: [Errno 104] Connection reset by peer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:42113 remote=tcp://127.0.0.1:43490>: ConnectionResetError: [Errno 104] Connection reset by peer
2022-08-04 11:46:29,042 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:43074
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 264, in write
async def write(self, msg, serializers=None, on_error="message"):
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 418, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 420, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:43074 after 30 s
2022-08-04 11:46:29,280 - distributed.nanny - WARNING - Worker process still alive after 3.9999988555908206 seconds, killing
2022-08-04 11:46:33,093 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41979
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 315, in write
raise StreamClosedError()
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 326, in write
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:40848 remote=tcp://127.0.0.1:41979>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41979 after 30 s
2022-08-04 11:46:39,853 - distributed.worker_memory - WARNING - Worker tcp://127.0.0.1:41456 (pid=172671) exceeded 99% memory budget. Restarting...
2022-08-04 11:46:40,154 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:50220 remote=tcp://127.0.0.1:41456>: Stream is closed
2022-08-04 11:46:40,161 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 233, in read
n = await stream.read_into(chunk)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2705, in _get_data
response = await send_recv(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 918, in send_recv
response = await comm.read(deserializers=deserializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) Ephemeral Worker->Worker for gather local=tcp://127.0.0.1:50018 remote=tcp://127.0.0.1:41456>: Stream is closed
2022-08-04 11:46:40,178 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36323 -> tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 223, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1692, in get_data
response = await comm.read(deserializers=serializers)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 239, in read
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36323 remote=tcp://127.0.0.1:51772>: Stream is closed
2022-08-04 11:46:40,330 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:41456
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 315, in write
raise StreamClosedError()
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 329, in connect
await asyncio.wait_for(comm.write(local_info), time_left())
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 326, in write
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:50268 remote=tcp://127.0.0.1:41456>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 333, in connect
raise OSError(
OSError: Timed out during handshake while connecting to tcp://127.0.0.1:41456 after 30 s
2022-08-04 11:46:40,352 - distributed.nanny - WARNING - Restarting worker
2022-08-04 11:46:44,572 - distributed.worker_memory - WARNING - Worker is at 95% memory usage. Pausing worker. Process memory: 3.75 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:46:45,320 - distributed.worker_memory - WARNING - Worker is at 88% memory usage. Resuming worker. Process memory: 3.48 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:46:49,049 - distributed.worker_memory - WARNING - Worker is at 95% memory usage. Pausing worker. Process memory: 3.74 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:46:52,496 - distributed.worker_memory - WARNING - Worker is at 81% memory usage. Resuming worker. Process memory: 3.21 GiB -- Worker memory limit: 3.92 GiB
2022-08-04 11:46:55,691 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:33890
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2b652cf68250>: ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:33890 after 30 s
2022-08-04 11:46:55,700 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 449, in connect
stream = await self.client.connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/tcpclient.py", line 275, in connect
af, addr, stream = await connector.start(connect_timeout=timeout)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:46:55,769 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2b53a58f5fc0>: ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:46:58,335 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:36646
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2ba2b7254070>: ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:36646 after 30 s
2022-08-04 11:46:58,975 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 449, in connect
stream = await self.client.connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/tornado/tcpclient.py", line 275, in connect
af, addr, stream = await connector.start(connect_timeout=timeout)
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:40991 after 30 s
2022-08-04 11:46:59,037 - distributed.worker - ERROR - Worker stream died during communication: tcp://127.0.0.1:40991
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 291, in connect
comm = await asyncio.wait_for(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
return fut.result()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 461, in connect
convert_stream_closed_error(self, e)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/tcp.py", line 142, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc.__class__.__name__}: {exc}") from exc
distributed.comm.core.CommClosedError: in <distributed.comm.tcp.TCPConnector object at 0x2b8400015090>: ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 1983, in gather_dep
response = await get_data_from_worker(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2725, in get_data_from_worker
return await retry_operation(_get_data, operation="get_data_from_worker")
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 383, in retry_operation
return await retry(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/utils_comm.py", line 368, in retry
return await coro()
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/worker.py", line 2702, in _get_data
comm = await rpc.connect(worker)
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1371, in connect
return await connect_attempt
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/core.py", line 1307, in _connect
comm = await connect(
File "/ccc/cont003/home/ra5563/ra5563/monitor/lib/python3.10/site-packages/distributed/comm/core.py", line 317, in connect
raise OSError(
OSError: Timed out trying to connect to tcp://127.0.0.1:40991 after 30 s