Large file transfer with Gemini

Gemini operates on top of the TLS protocol, and the TLS protocol provides the 'close_notify' message as a mechanism to indicate a graceful end-of-transmission. That is to say, receipt of a close_notify message by a client signals that data has been transferred successfully from the server and no truncation has occurred.

The Gemini protocol mandates the use of close_notify in its exchanges, and this makes Gemini fairly well suited for the transfer of large files (at least if the server is properly written, see note 1 below). If a close_notify message is received then the client can be sure that the data that the server expected to send was actually received.
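
As a rough client-side illustration, here is a minimal Python sketch that separates a clean close_notify from a dropped connection. It assumes Python's standard ssl module, and it skips certificate validation on the assumption that the server uses a self-signed certificate; the function names are made up for this example.

```python
import socket
import ssl


def read_until_close(tls_sock, chunk_size=65536):
    """Read everything the peer sends; report whether close_notify arrived.

    Returns (data, complete). Assumes the socket was wrapped with
    suppress_ragged_eofs=False, so that an empty read can only mean
    the peer sent close_notify.
    """
    chunks = []
    try:
        while True:
            chunk = tls_sock.recv(chunk_size)
            if not chunk:  # clean end of stream: close_notify received
                return b"".join(chunks), True
            chunks.append(chunk)
    except (ssl.SSLError, ConnectionError):
        # EOF, reset, or other TLS failure without close_notify:
        # treat whatever arrived as possibly truncated.
        return b"".join(chunks), False


def fetch(host, path="/", port=1965):
    """Request gemini://host/path and return (header, body, complete)."""
    # Assumption: no CA validation, since many Gemini servers use
    # self-signed certificates (a real client would pin them instead).
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port)) as raw:
        # suppress_ragged_eofs=False makes a missing close_notify raise
        # an error instead of looking like a normal end of stream.
        with ctx.wrap_socket(raw, server_hostname=host,
                             suppress_ragged_eofs=False) as tls:
            tls.sendall(f"gemini://{host}{path}\r\n".encode("utf-8"))
            data, complete = read_until_close(tls)
    header, _, body = data.partition(b"\r\n")
    return header.decode("utf-8"), body, complete
```

A caller would keep the body only when complete is True, and discard or retry the download otherwise.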

In some senses, Gemini is *better* suited to large file transfer than to its usual use case, communication around a blogosphere. The high startup cost of a TLS connection means that you get a higher data-to-overhead ratio when you transfer large files compared to small text files. The one drawback for Gemini is that there is no universal way of indicating the expected data size a priori (see note 2), so the data must be treated as a raw byte stream.

Requirements for servers

For a server to be well-suited for large file transfer over Gemini it must:



If either of these points is violated then there is no way for the client to guarantee that the data it received is correct. Other issues such as data integrity and packet ordering are handled by the lower-level protocols (TLS, TCP), and need not be a concern of the Gemini server itself.

My libpxd reference server polluxd satisfies both of these points, and there are others that do as well. I believe gmid is a good example, though I have not looked at it in some time.

Additional suggestions

If data integrity is of high concern then there should be another mechanism in place, e.g. signing the transmitted data or providing a similarly named file containing the hashes of the target file. Another possibility is to send the hash as a mimetype parameter (akin to note 2 below), but interpreting and respecting that parameter would be up to the client software.
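
A minimal Python sketch of the hash-file idea follows. It assumes the hash file holds a SHA-256 digest in the common "<hex digest>  <filename>" layout produced by tools like sha256sum; that layout, the choice of SHA-256, and the function names are assumptions for illustration, not something Gemini prescribes.

```python
import hashlib


def sha256_hex(chunks):
    """Hash an iterable of byte chunks without joining them in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def parse_hash_file(text):
    """Take the first whitespace-separated field as the hex digest."""
    return text.split()[0].lower()


def verify(chunks, hash_file_text):
    """Return True if the downloaded data matches the published digest."""
    return sha256_hex(chunks) == parse_hash_file(hash_file_text)
```

The client would fetch both the target file and its hash file, then call verify() on the two before trusting the download.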

Notes

1 - There are several Gemini servers out there (e.g. gmnisrv) that do not issue a close_notify message, in violation of the Gemini protocol. Don't use these servers for large data transfer unless you have some out-of-band mechanism for guaranteeing data integrity. The authors should fix these servers if they are still active projects.

2 - Gemini responses that return success (2x status code) can have a mimetype and associated parameters appended after the code. Per the mimetype spec (RFC 2045) there is no unified set of parameters associated with all mimetypes, and the Gemini protocol itself notes that a "Client MUST deal with MIME parameters that are not understood by simply ignoring them." I therefore see no reason that a response of something like '20 application/octet-stream; size=10240' would not be possible. It would of course be up to a client to interpret that metadata properly, but I think there's a case to be made that the *mechanism* is there.
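
To make the idea concrete, here is a small Python sketch of how a client might read such a header. The 'size' parameter is purely hypothetical (it is not part of the Gemini specification or any mimetype registration), and the parsing is deliberately naive.

```python
def parse_header(header):
    """Split '<status> <meta>' into (status, mimetype, params).

    Naive on purpose: splits on ';' and so does not handle quoted
    parameter values that themselves contain a semicolon.
    """
    status, _, meta = header.strip().partition(" ")
    mimetype, *raw_params = [part.strip() for part in meta.split(";")]
    params = {}
    for item in raw_params:
        name, sep, value = item.partition("=")
        if sep:
            params[name.strip().lower()] = value.strip().strip('"')
    return int(status), mimetype.lower(), params


def expected_size(params):
    """Return the hypothetical 'size' parameter as an int, or None."""
    value = params.get("size", "")
    return int(value) if value.isascii() and value.isdigit() else None
```

A client that understands the parameter could compare expected_size() against the number of bytes it actually received; a client that does not would simply ignore it, as the protocol requires.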

