Monday, March 11, 2024

server timing out for a large download - fix was server-side - nginx

While I was downloading a large file from noirlab.edu, only 4.5 GB had come through when a "server error" was reported. When I tried to resume the download, the server would not respond. (The server itself was still reachable: a fresh download started in a different location would begin normally. Only the resume was failing.)

Unfortunately, with the noirlab link, we're unable to resume partial downloads with aria2c after 12 hours or so. The download itself also seems to stop after a few hours, with a server error message.

I've copy-pasted the relevant parts of a log file produced by aria2 below. They show that the server times out when the client requests either
(a) more than one chunk, or
(b) a chunk near the end of the file.
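
The two cases above can also be checked without aria2. The following is only a rough sketch, assuming plain HTTP Range requests are a fair stand-in for what aria2 sends; the helper names (range_header, tail_range, probe) and the 30-second timeout are my own choices for illustration, not anything taken from aria2 or from the server.

```python
import socket
import urllib.error
import urllib.request


def range_header(start, end):
    """Build an HTTP Range header value for bytes start..end (inclusive)."""
    return f"bytes={start}-{end}"


def tail_range(total_size, length):
    """Range header value for the last `length` bytes of the file."""
    return range_header(total_size - length, total_size - 1)


def probe(url, range_value, timeout=30):
    """Request one byte range and report what happened.

    Returns the HTTP status code (206 means the range was honoured,
    200 means the server ignored it), or "timeout" / "error: ...".
    The 30 s default timeout is an arbitrary assumption.
    """
    req = urllib.request.Request(url, headers={"Range": range_value})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            resp.read(1)  # make sure the body actually starts flowing
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except (socket.timeout, TimeoutError):
        return "timeout"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return f"error: {exc.reason}"
```

Calling probe(url, range_header(0, 1023)) and then probe(url, tail_range(262602447953, 1024)) would compare a chunk at the start of the file with one at the very end (the size is the Content-Length from the log below).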

2024-02-15 22:09:56.738161 [DEBUG] [SocketCore.cc:993] Securely connected to noirlab.edu (54.200.108.162:443) with TLSv1.2
2024-02-15 22:09:56.738161 [INFO] [HttpConnection.cc:128] CUID#7 - Requesting:
GET /public/media/archives/videos/dome_4kmaster/big-astronomy.zip HTTP/1.1
User-Agent: aria2/1.36.0
Accept: */*,application/metalink4+xml,application/metalink+xml
Want-Digest: SHA-512;q=1, SHA-256;q=1, SHA;q=0.1

2024-02-15 22:09:57.336946 [DEBUG] [WinTLSSession.cc:435] WinTLS: Read request: 16384 buffered: 14200
2024-02-15 22:09:57.336946 [INFO] [HttpConnection.cc:163] CUID#7 - Response received:
HTTP/1.1 200 OK
Date: Thu, 15 Feb 2024 16:39:56 GMT
Content-Type: application/zip
Content-Length: 262602447953
Connection: keep-alive
Server: nginx
Last-Modified: Wed, 01 Nov 2023 21:27:23 GMT
ETag: "6542c2bb-3d24535c51"
Access-Control-Allow-Origin: *
X-Cache-Status: BYPASS
Accept-Ranges: bytes

(Lots of lines for CUID#2 to CUID#12 and so on are omitted here ...)

2024-02-15 22:11:09.882632 [DEBUG] [AbstractCommand.cc:181] CUID#13 - socket: read:0, write:0, hup:0, err:0

2024-02-15 22:11:10.885437 [DEBUG] [AbstractCommand.cc:181] CUID#13 - socket: read:0, write:0, hup:0, err:0

(Lots more lines with socket read:0 etc., and finally ...)

2024-02-15 22:12:04.827538 [DEBUG] [AbstractCommand.cc:325] CUID#13 - Marking IP address 54.149.252.235 as bad
2024-02-15 22:12:04.827538 [INFO] [AbstractCommand.cc:366] CUID#13 - Restarting the download. URI=https://noirlab.edu/public/media/archives/videos/dome_4kmaster/big-astronomy.zip
  -> [AbstractCommand.cc:340] errorCode=2 Timeout.
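
As a side check on the response headers in the log above: nginx's default ETag for a static file is the last-modified time and the file size, both in hexadecimal, joined by a hyphen. A small sketch (the function name decode_nginx_etag is mine) shows that the ETag agrees with the Last-Modified and Content-Length headers, so the server is describing one consistent file of about 244 GiB.

```python
from datetime import datetime, timezone


def decode_nginx_etag(etag):
    """Split an nginx-style ETag into (last-modified datetime, size in bytes)."""
    mtime_hex, size_hex = etag.strip('"').split("-")
    mtime = datetime.fromtimestamp(int(mtime_hex, 16), tz=timezone.utc)
    return mtime, int(size_hex, 16)


mtime, size = decode_nginx_etag('"6542c2bb-3d24535c51"')
print(mtime.isoformat())       # 2023-11-01T21:27:23+00:00, same as Last-Modified
print(size)                    # 262602447953, same as Content-Length
print(round(size / 2**30, 2))  # 244.57 (GiB)
```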

I reported this to the people who run the website, and they made an IP-address-only link available for me to try. That worked: resuming succeeded even after many hours.
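
For anyone wanting to confirm that resume really works against a link like this, here is a minimal sketch of what a resume looks like at the HTTP level: ask for the bytes from the end of the partial file onward, then check that the reply is a 206 starting at exactly that offset. The function names are hypothetical, and I am not claiming this is precisely what aria2 does internally.

```python
import re


def resume_headers(bytes_on_disk, etag=None):
    """Headers for continuing a download from `bytes_on_disk` onward."""
    headers = {"Range": f"bytes={bytes_on_disk}-"}
    if etag is not None:
        # If-Range: send the partial body only if the file is unchanged,
        # otherwise the server answers 200 with the whole file.
        headers["If-Range"] = etag
    return headers


def parse_content_range(value):
    """Parse 'bytes START-END/TOTAL' into (start, end, total).

    The 'bytes START-END/*' form (unknown total) is deliberately not
    handled in this sketch and raises ValueError.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if match is None:
        raise ValueError(f"unexpected Content-Range: {value!r}")
    start, end, total = (int(g) for g in match.groups())
    return start, end, total


def resume_is_valid(status, content_range, bytes_on_disk):
    """True if the reply continues exactly where the partial file ends."""
    if status != 206 or content_range is None:
        return False
    start, _end, _total = parse_content_range(content_range)
    return start == bytes_on_disk
```

A reply of 200 instead of 206 means the server is sending the whole file again, so appending it to the partial file would corrupt the download.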

[#b2e54e 64GiB/244GiB(26%) CN:15 DL:1.0MiB ETA:50h51m51s]
