Downloading with the correct filename using cURL and Wget

Some websites have download links like:

https://example.com/get?file_id=123
https://example.com/get/foobar-1.0.tar.gz?blah=mlah

The first one is probably a redirect to the actual location, with a clear filename in the URL, and/or it serves the content with a Content-Disposition header that tells the client what the filename is.  The second one is just there to annoy me.

However, they cause an issue with Wget and cURL (with -O), which would save them as:

get?file_id=123
foobar-1.0.tar.gz?blah=mlah

That is not what we want for the downloaded files.  Both tools use only the URL you provide to decide the name of the saved file.
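As a rough illustration of that naming rule, the sketch below mimics it in plain shell: it keeps everything after the last slash of the URL, query string included.  The function name default_name is made up for this example, and the real tools handle more edge cases; their exact behavior may also differ between versions.

```shell
# Hypothetical helper: approximate the default name described above.
default_name() {
    # Strip the longest prefix ending in "/", keeping the last segment as is
    # (the query string is NOT removed).
    printf '%s\n' "${1##*/}"
}

default_name 'https://example.com/get?file_id=123'
default_name 'https://example.com/get/foobar-1.0.tar.gz?blah=mlah'
```

The two calls print the same two names listed above, which is why the saved files end up with a query string in their names.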

In a web browser, such links are handled properly as long as the response headers are set correctly: the save dialog gives you the correct filenames.
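To make the header part concrete, here is a minimal sketch of pulling the filename out of a Content-Disposition header line with sed.  The helper name cd_filename and the sample header are assumptions for illustration.  It only understands the plain filename=... form, quoted or not, and ignores the encoded filename*= variant that real clients also support.

```shell
# Hypothetical helper: extract the filename from one Content-Disposition line.
# Handles filename="name" and filename=name only; prints nothing otherwise.
cd_filename() {
    printf '%s\n' "$1" |
        sed -n 's/^[Cc]ontent-[Dd]isposition:.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*$/\1/p'
}

cd_filename 'Content-Disposition: attachment; filename="foobar-1.0.tar.gz"'
```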

Both cURL and Wget can handle these cases, but the solutions don't come without warnings.  Be sure to read wget(1) and curl(1) before implementing the solutions mentioned below.

Wget

I love using Wget because it saves downloads by default, whereas cURL writes to standard output unless you pass -O.  In the past, I manually passed -O <manual-input-filename> to Wget with dummy names like a.tar.gz.  It didn't matter what I named them, because they were getting unpacked anyway.

But it got to the point where I felt I was wasting lots of keystrokes, and the solution is quite easy.  For Wget, put the following lines in ~/.wgetrc:

content_disposition = on
trust_server_names = on


The first option needs no explanation of its purpose, but wget(1) warns that it's still experimental and:

This can currently result in extra round-trips to the server for a "HEAD" request, and is known to suffer from a few bugs, which is why it is not currently enabled by default.

The second option uses the redirected URL to decide the saved filename.  With both set, you just run Wget as usual:

$ wget <link>

Wget will save the files under the intended filenames for you.
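If you would rather not change ~/.wgetrc, the same two settings are also available as command-line options, --content-disposition and --trust-server-names.  A minimal sketch of a shell function that applies them per invocation follows; the name wgetn is made up for this example.

```shell
# Hypothetical wrapper: same effect as the two ~/.wgetrc lines,
# but only for downloads started through this function.
wgetn() {
    wget --content-disposition --trust-server-names "$@"
}

# Usage, with the example link from earlier:
#   wgetn 'https://example.com/get?file_id=123'
```

Plain wget keeps its stock behavior this way, so the warnings from wget(1) only apply when you explicitly ask for it.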

cURL

As for cURL, it's a bit wordy with options:

$ curl -OJL <link>

-O is --remote-name, -J is --remote-header-name, and -L is --location, which follows the Location header of 3XX redirects.  For -J, curl(1) warns:

Exercise judicious use of this option, especially on Windows. A rogue server could send you the name of a DLL or other file that could possibly be loaded automatically by Windows or some third party software.

There is ~/.curlrc, but I haven't dug into it.  I prefer to keep cURL's default behavior as is, because sometimes you want to see the content directly.
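One way to save the keystrokes without touching ~/.curlrc is a small shell function, sketched below.  The name curlsave is made up for this example.  It uses --remote-name-all instead of -O so that the remote name is applied to every URL given, not only the first one.

```shell
# Hypothetical wrapper: download to files named by the server,
# while plain `curl` keeps writing to standard output.
curlsave() {
    curl --remote-name-all -JL "$@"
}

# Usage, with the example link from earlier:
#   curlsave 'https://example.com/get?file_id=123'
```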
