Google News RSS isn't really RSS

2024-12-31 | #google #rss #newswaffle | @Acidus

As I wrote recently, Yahoo News RSS feeds used by NewsWaffle are essentially dead:

Yahoo News' Topic-specific RSS feeds are dead

This is a shame since I do find news aggregation quite useful. If I want to read technology articles, I want to go to a single place and get a consolidated list of stories across multiple sites. That's what Yahoo News was doing. Its death has led me to look for different ways to power the news aggregator for NewsWaffle.

The obvious answer is Google News.

Back in August 2022, when I implemented the news aggregation function in NewsWaffle, I explained why I did it using Yahoo instead of Google News:

Why not build this using Google News? After all, Google News is, by far, the best news aggregator out there. In fact, many of my favorite smol web news sites, like 68k.news, are just wrappers around Google News. The answer is I'm trying hard to remove Google services from my personal life.

And I largely have been able to remove Google from my personal life. So while I wasn't thrilled to be willingly adopting a Google service, my naive hope was that I could easily swap out Yahoo News's RSS feeds for Google News's RSS feed, and this would be the quickest path to reintroducing this feature.

Sadly, this doesn't work, because Google News's RSS feeds require a web browser with JavaScript enabled to get to the actual news articles.

Here is how most RSS feeds work:

RSS item ➡️ request direct link to article

Here is how Yahoo's RSS feeds worked:

RSS item ➡️ request yahoo.com URL ➡️ 302 redirect to actual news article.
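
Both of these flows are easy for a non-browser client to handle. As a minimal sketch (not NewsWaffle's actual code), assuming a hypothetical `fetch` callable that takes a URL and returns a status code plus a dict of response headers, following redirects by hand looks roughly like this:

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}

def resolve_link(url, fetch, max_hops=5):
    """Follow HTTP redirects until a non-redirect response is reached.

    `fetch` is a hypothetical callable: fetch(url) -> (status, headers dict).
    """
    for _ in range(max_hops):
        status, headers = fetch(url)
        location = headers.get("Location")
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        # Not a redirect: this URL is the article itself.
        return url
    raise RuntimeError("too many redirects")
```

A direct link comes back unchanged after one request, and a Yahoo-style link resolves after a single extra hop. No HTML parsing or script execution is involved in either case.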

But here is how Google News's RSS feeds "work":

RSS item ➡️ request google.com URL ➡️ HTML response: 1.2 MB of HTML content, plus another 400 KB of CSS and JS content. The page then runs obfuscated JavaScript that makes HTTP calls to hidden, undocumented APIs, which send back an obfuscated payload that the JavaScript decrypts before redirecting you to the final page.

In short, Google News's RSS feeds require you to use a modern JavaScript-enabled web browser to resolve the RSS items to the actual news story.
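
A feed consumer can't recover the real URL without that browser, but it can at least notice the problem up front. Here is a rough sketch, using only Python's standard library, that pulls the item links out of an RSS document and flags the ones that point back at the aggregator's own domain instead of a publisher. The sample feed and its URLs are made up for illustration; they are not real Google News items.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def item_links(rss_xml):
    """Return the <link> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("link", default="").strip() for item in root.iter("item")]

def points_at_aggregator(link, aggregator_domain):
    """True if the link's host is the aggregator's domain or a subdomain of it."""
    host = urlparse(link).hostname or ""
    return host == aggregator_domain or host.endswith("." + aggregator_domain)

# Hypothetical feed: one item links to the aggregator, one to a publisher.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example</title>
  <item><title>A</title><link>https://news.google.com/made-up-item-id</link></item>
  <item><title>B</title><link>https://publisher.example/story</link></item>
</channel></rss>"""

for link in item_links(SAMPLE_FEED):
    label = "aggregator URL" if points_at_aggregator(link, "google.com") else "direct link"
    print(f"{label}: {link}")
```

This only tells you which items will need extra work to resolve. It does nothing to decode them.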

This. Is. Fucking. Insane.

Worse, Google keeps changing the exact mechanism used to reveal the URL. The most popular project for this on GitHub tries five different decoding methods under the hood, and still has to deal with things like rate limiting.

Google News Url Decoder on GitHub

I suspect this constant change is exactly why the amazing 68k.news website recently stopped working. The list of news articles is correct, but if you try to click one, you get an error.

68k.news Headlines from the Future.

I enjoy hacking together projects on Gemini, but I don't want something that requires a lot of fuss and maintenance. And playing cat-and-mouse against Google over something as stupid as RSS feed URLs looks like a losing battle and a colossal pain in the ass.

So... instead, I guess I'm going to need to figure out how to do news aggregation myself.

