So I Read the Netnews Specs

Two things in the internet world have always fascinated me: newsgroups and mailinglists. I didn't grow up in the BBS world of the 1990s, but the idea of distributed messaging among multiple different hosts led me to experiment with a lot of different federated/distributed protocols and networks. Mailinglists, on the other hand, take an already ubiquitous network, email, that is not nearly as weighed down by binary sharing, copyright infringement, illegal activity, and bad public perception, and create a well-working, but not perfect, forum system out of it for working with git patches, Q&As, announcements, etc. The two environments couldn't be more different.

So it was definitely a surprise to me that Netnews articles are actually in valid plaintext email format, that many of the same headers are used by both mailinglists and news articles, and that threading in mailinglists and newsgroups is exactly the same. What, then, distinguishes newsgroups from mailinglists?

Netnews articles were originally defined in RFC 850, and then again in RFCs 1036, 5536, and 5537. The RFCs define the news article format that bases itself on the internet mail format, the use of control messages, the flood-fill algorithm used to propagate messages over a netnews network, and methods of duplicate suppression using the Path header and the tracking of message IDs (and optionally the Ihave and Sendme control messages, which are also implemented in NNTP generally). They do not, however, define the protocol or method of transferring news articles from one host to another. This process is often done with UUCP or NNTP, and may even be done via SMTP/email, as all valid news articles are also valid email messages.

One aspect of differentiation between news articles and emails is how they are organized into forum spaces: mailinglists vs. newsgroups. In a mailinglist system, a user sends an email *to* a mailinglist address, and the SMTP server then forwards that email to *other users via email*. In newsgroups, a news message is created with a From header that may (or may not) be a valid email address, as well as a Newsgroups header that specifies the newsgroups the message is a part of, which additionally allows one to crosspost the same article into multiple newsgroups. The netnews article is then propagated via various means to netnews hosts, and either forwarded via email or downloaded by netnews readers.

Importantly, mailinglist messages are sent to an email address, while news articles specify newsgroups in a header. Mailinglists store the emails and push them to end-user email addresses; newsgroups store them and allow them to be downloaded or pushed to end-user email addresses.

The other main aspect of differentiation is that netnews articles keep track of the path when they are propagated in a network, but emails do *not* keep track of the path when they are forwarded throughout the network. While this path is not *necessary* for making sure one server doesn't receive duplicates, it is one good network traffic optimization that news articles have over email.
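
As a rough illustration of how the Path header and Message-ID tracking work together, here is a minimal Python sketch of flood-fill between three fully linked hosts. The Host and Article classes, the link helper, and the offers counter are all made up for this example; they are not taken from the RFCs or from any real news server.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    message_id: str
    body: str
    path: tuple[str, ...] = ()  # most recent host first, like "b!a"


@dataclass
class Host:
    name: str
    peers: list["Host"] = field(default_factory=list)
    seen: set[str] = field(default_factory=set)
    offers: int = 0  # how many times this host was handed an article

    def receive(self, article: Article) -> None:
        self.offers += 1
        if article.message_id in self.seen:
            return  # duplicate suppression by Message-ID
        self.seen.add(article.message_id)
        # Prepend our own name to the path before passing the article on.
        stamped = Article(article.message_id, article.body,
                          (self.name,) + article.path)
        for peer in self.peers:
            if peer.name in stamped.path:
                continue  # Path optimization: this peer has already handled it
            peer.receive(stamped)


def link(a: Host, b: Host) -> None:
    """Make two hosts feed each other (hypothetical helper)."""
    a.peers.append(b)
    b.peers.append(a)


a, b, c = Host("a"), Host("b"), Host("c")
link(a, b)
link(b, c)
link(a, c)
a.receive(Article("<1@a.example>", "hello"))  # inject at host a
```

In this triangle, b and c never send the article back to a because a is already in the path, but a still offers it to c after b has delivered it. That last duplicate is only caught by the set of seen Message-IDs, which is why the path works as an optimization on top of ID tracking and not as a replacement for it.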

Netnews articles also specify a number of other optional headers, like Followup-To, Control, Injection-Info, Expires, Archive, Supersedes, and Distribution, that could all be used in emails just as well, but likely are not.

Even with all of these differences, Netnews articles are more similar to the emails of mailinglists than people generally like to admit, and I will show that by going through some of the core similarities.

Firstly, both netnews articles and emails have a From header, a Date, a Subject, a Message-ID, and a message body. Netnews articles specify a stricter subset of valid values for these fields, but they are essentially the same.
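
To make the overlap concrete, here is a small sketch that feeds a news article to Python's standard email parser. The article text, addresses, group names, and Message-ID are invented for illustration.

```python
from email import message_from_string

# A made-up article; every address, group, and ID here is hypothetical.
ARTICLE = """\
Path: news.example.org!not-for-mail
From: Jane Doe <jane@example.org>
Newsgroups: comp.protocols.example,misc.test
Subject: Hello, netnews
Date: Tue, 15 Oct 2024 19:33:00 +0000
Message-ID: <abc123@example.org>

This body is plain text, just like an email body.
"""

msg = message_from_string(ARTICLE)

# Headers shared by email and netnews are read the usual way.
shared = {name: msg[name] for name in ("From", "Date", "Subject", "Message-ID")}

# News-only headers are just extra header fields to the email parser.
newsgroups = [group.strip() for group in msg["Newsgroups"].split(",")]
body = msg.get_payload()
```

The parser needs no special handling here: Newsgroups and Path simply show up as additional header fields next to the ones email already uses.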

The biggest similarity is the one that people seem to emphasize as the most different between newsgroups and mailinglists: Threading. Both Netnews and mailinglist emails organize messages into threads using Subjects, Message-IDs, and the References header. The References header references the Message-IDs that it is responding to. A new thread is created on a new subject, and a new subject must *not* use a References header. Clients can organize newsgroup and mailinglist messages using pretty much the exact same method. Netnews clients might additionally take into account the Newsgroups header of each message in a thread, but that's pretty much the biggest difference.
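
A minimal sketch of that shared threading method, assuming each message has already been parsed into a dict of header names to values. The build_threads function is hypothetical and ignores the Subject-based fallbacks that real clients often add.

```python
from collections import defaultdict


def build_threads(messages):
    """Group messages into reply trees using Message-ID and References."""
    by_id = {m["Message-ID"]: m for m in messages}
    children = defaultdict(list)
    roots = []
    for m in messages:
        refs = m.get("References", "").split()
        # References lists ancestors oldest first, so walk it backwards
        # and attach to the nearest ancestor we actually have.
        parent = next((r for r in reversed(refs) if r in by_id), None)
        if parent is None:
            roots.append(m["Message-ID"])  # no known ancestor: a new thread
        else:
            children[parent].append(m["Message-ID"])
    return roots, dict(children)


messages = [
    {"Message-ID": "<1@x.example>", "Subject": "Topic"},
    {"Message-ID": "<2@x.example>", "Subject": "Re: Topic",
     "References": "<1@x.example>"},
    {"Message-ID": "<3@x.example>", "Subject": "Re: Topic",
     "References": "<1@x.example> <2@x.example>"},
    {"Message-ID": "<4@x.example>", "Subject": "Re: Topic",
     "References": "<1@x.example> <missing@x.example>"},
]
roots, children = build_threads(messages)
```

Nothing in this function cares whether the dicts came from a mailinglist archive or a newsgroup spool, which is the point: the same code threads both.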

Netnews is so similar to mailinglists that one could create a Netnews system using SMTP and a POP3-like protocol; SMTP would be the equivalent of NNTP's host-to-host transit mode in transferring articles between hosts, and the POP3-like protocol would be equivalent to NNTP's reader mode, allowing users to pull down the articles of a "newsgroup" mailbox. One could even keep track of the forwarding path of an email just like Netnews does with Path in order to do the same optimization of tracking whether a server has already been sent a message.

In fact, there are two different ways you could create a netnews system over email: 1. still using the netnews format and sending all netnews articles to one particular email address of every host, or 2. by just using the regular email format, adding a Path header and a couple of those optional headers, and forwarding emails to *mailinglist* addresses. Each "newsgroup" would then have its own mail address that people can send to and pull down emails from. Every host would propagate to other hosts that it knows of by forwarding messages to them. This is functionally very similar to two mailinglists being subscribed to each other.

For example, I could make an email address comp.protocols.gemini@auragem.letz.dev and connect it up to comp.protocols.gemini@satch.xyz, implement duplicate message detection/suppression and the Path header for optimizing the floodfill algorithm, and that's a functioning "netnews" system with a very similar organization to Usenet. Threading would work basically the same in already-existing clients, and people could subscribe to get messages pushed to their end-user email addresses, or they could use some other protocol (POP3, Gemini, etc.) to download messages from the "newsgroup" mailbox. The only main flaw would be the lack of crossposting.

How then does this involve misfin? Creating a netnews-like system over misfin is actually very similar to how one would do it over email, and a benefit is that misfin already keeps track of the path that messages are forwarded through during propagation via the senders line. This means we can already use the same optimization during the floodfill algorithm that netnews uses. The downside is that misfin doesn't have Message-IDs, which makes two things harder, but not impossible: duplicate detection and threading. Duplicate detection can be done by using the first sender and the first timestamp (and optionally the hash of the message body, if one so wishes). Threading can be done by just assuming every message with the same subject is part of the same thread within a particular mailbox/mailinglist/newsgroup.
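
Here is a minimal sketch of those two workarounds. It assumes messages have already been parsed into a hypothetical ParsedMessage structure with the original sender and timestamp listed first; it does not parse the misfin wire format and is not how any existing misfin server is known to do it. Trimming and case-folding the subject is an extra assumption of mine.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedMessage:
    # Hypothetical, already-parsed view of a message (not the wire format).
    senders: tuple[str, ...]     # original sender first (assumed ordering)
    timestamps: tuple[str, ...]  # original timestamp first (assumed ordering)
    subject: str
    body: str


def dedup_key(msg: ParsedMessage, include_body: bool = False) -> str:
    """Stand-in for a Message-ID: first sender plus first timestamp."""
    parts = [msg.senders[0], msg.timestamps[0]]
    if include_body:
        parts.append(hashlib.sha256(msg.body.encode("utf-8")).hexdigest())
    return "\x00".join(parts)


def accept_new(messages):
    """Keep only the first copy of each message, by dedup_key."""
    seen, fresh = set(), []
    for msg in messages:
        key = dedup_key(msg)
        if key not in seen:
            seen.add(key)
            fresh.append(msg)
    return fresh


def threads_by_subject(messages):
    """Treat every message with the same subject as one thread."""
    threads = defaultdict(list)
    for msg in messages:
        # Normalizing the subject this way is an assumption of this sketch.
        threads[msg.subject.strip().casefold()].append(msg)
    return dict(threads)


original = ParsedMessage(("alice@a.example",), ("2024-10-15T19:33:00Z",),
                         "Hello", "hi")
forwarded = ParsedMessage(("alice@a.example", "list@b.example"),
                          ("2024-10-15T19:33:00Z", "2024-10-15T19:34:00Z"),
                          "Hello", "hi")
other = ParsedMessage(("bob@c.example",), ("2024-10-15T20:00:00Z",),
                      "hello ", "yo")

fresh = accept_new([original, forwarded, other])
threads = threads_by_subject(fresh)
```

The forwarded copy shares its first sender and first timestamp with the original, so it is dropped as a duplicate even though its senders list has grown along the way.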

But what are the benefits of misfin over netnews and regular email? I believe misfin is much more resistant to spam because of the way client certificates are used and verified, both by end users and by hosts of these mailinglists/newsgroup addresses. Every message within this "Newsfin" (news-misfin) system would have to come from a working and verifiable misfin address, which I think is a benefit rather than a downside to this approach over the Netnews approach. Hosts that do not properly verify the immediate (previous) sender, or who spoof senders, would very easily be cut off, along with all of their mailboxes, from the network. I also think it makes a lot of sense that the same protocol for mails is used for forum systems. This allows a great level of interoperability between forum messages and mailing, including how messages can be forwarded to other users, or even to aggregator mailinglists. Lastly, misfin does not have the risk of different file formats and binaries being sent over the protocol, making it much more secure, safe, and space-efficient than Usenet's current 400 TiB of articles and binaries and complicated newsgroup hierarchy.

My misfin-server software has already implemented the duplicate message detection *and* the floodfill algorithm with the same optimization that Netnews details in its RFCs, by using the senders list as the network path of a message. The things still left to do are finding a way to do Control messages (and making these secure), and making linking hosts up to each other easier. Lastly, I would like to continue working with Satch on finishing up the GMAP spec and extending it so that it can be used to download from public mailboxes (of mailinglists/newsfin addresses). This would essentially be the misfin/newsfin equivalent of NNTP's reader mode.

P.S. It is important to note that NNTP itself has other optimizations, like batch forwarding of messages to a host, that misfin wouldn't necessarily have. However, I do wonder how necessary batching is for misfin messages in general, especially now that networks have improved significantly since Netnews was originally created in the 1980s.

Posted in: s/misfin

🚀 clseibold

2024-10-15 · 9 months ago · 👍 johano, arma, jecxjo

5 Comments ↓

👾 jecxjo · 2024-10-15 at 19:33:

Love this idea! Then again, I'm still reading newsgroups in spite of it being flooded with spam.

I'm wondering if the thread support can be defined around the subject line. Hashing the message you're threading from. There should be a simple solution to making the huge nest of newsgroups that both support misfin but also can be more feature rich with a smarter client.

I also like the idea that the system would be misfin compatible while also being independent. This would mean we can use misfin clients while not making normal misfin protocol support threading. Make diving into the nested monster of threaded conversations an opt-in by actively choosing the service. Some newsgroup threads end up nothing but branches.

🚀 clseibold [OP] · 2024-10-15 at 19:51:

@jecxjo Thanks. I plan on starting the first Newsfin "group" today, which I think will be called "meta.announcements" (where "meta." will become the hierarchy for any topic about newsfin itself). Objections or alternative ideas are welcome!

As for threading: my initial thoughts were that having the same subject should be fine enough to create a thread, and I'm not completely sure if we need threads that branch off each other. However, we *could* do branches via your method by placing a hash in a subject line. The hash would be of the original sender and timestamp of the first message of the thread you are branching from, perhaps. Idk.

We don't have to extend regular misfin to add simple threading if we just assume all messages of the same subject are part of the same thread. If we want branching off, that also wouldn't require any extension to misfin if we used the hash in the subject line method above. If we want something more advanced, then there are other ways like adding an identifier in the message body itself, or I guess we could introduce a different more advanced protocol that interoperates with misfin (although, I feel like this shouldn't be necessary).

👾 jecxjo · 2024-10-15 at 22:44:

i think i misunderstood your previous post about threading. if all you were looking for was "group by subject and time" then yeah that's a no brainer. I thought by threading you meant the whole big tree of replies to replies.

Excited for the first... newsgroup? to be created.

🚀 clseibold [OP] · 2024-10-15 at 23:06:

@jecxjo Yeah, the previous post's threading was different, because it wasn't for mailinglists/newsgroups, it was for an end-user's mailbox. From the perspective of a mailinglist, because everyone can pretty much see or access every thread, it is assumed that the use of the same subject all goes to the same thread.

From the perspective of an end-user's mailbox, you might have overlapping subjects from different people, and you don't want those mails to be put into the same threads. That's why you take into account the participants in the conversation as well. Replies and sub-threads were not really considerations in any of the threading that I was picturing or talking about.

Anyways, I'm going to get the newsfin group set up soon. I think I'm going to name it "newsfin.announce" though. Since this system is using misfin addresses for each newsfin group, I don't want to pick names that could conflict with common misfin address names on pubnix servers, etc.

🐐 satch · 2024-10-16 at 10:23:

@clseibold let’s get GMAP wrapped up!
