Changes to AuraGem Search

I'm currently in the process of moving my search engine database from Firebird to PostgreSQL. The crawler data for the entire search engine will also be rebuilt from scratch.

The process is almost complete; I just need to port the SQL for the actual search queries over to PostgreSQL's version of searching.
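As a rough illustration of what a ported search query could look like, here is a minimal sketch using PostgreSQL's built-in full-text search functions (`to_tsvector`, `websearch_to_tsquery`, `ts_rank`). The `pages` table, its `url`, `title`, and `body` columns, and the `build_search_query` helper are all hypothetical and not taken from AuraGem. The `%(name)s` placeholders assume a psycopg-style driver.

```python
def build_search_query(user_text, limit=20):
    """Return (sql, params) for a ranked PostgreSQL full-text search.

    Assumption: a hypothetical "pages" table with "url", "title" and
    "body" columns. The user's text is passed as a bound parameter and
    is never interpolated into the SQL string.
    """
    sql = """
        SELECT url, title,
               ts_rank(to_tsvector('english', title || ' ' || body),
                       websearch_to_tsquery('english', %(q)s)) AS rank
        FROM pages
        WHERE to_tsvector('english', title || ' ' || body)
              @@ websearch_to_tsquery('english', %(q)s)
        ORDER BY rank DESC
        LIMIT %(limit)s
    """
    params = {"q": user_text, "limit": limit}
    return sql, params


# Usage sketch: hand both values to the driver's execute() call, e.g.
#   cursor.execute(*build_search_query("gemini protocol"))
sql, params = build_search_query("gemini protocol", limit=5)
```

In a real setup the `to_tsvector(...)` expression would probably be stored in its own column with a GIN index rather than recomputed per query, but that is a design choice outside this sketch.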

I will also be redoing a lot of my crawler code to make it more memory-efficient. Previously, the crawler stored GBs worth of crawl data (URLs of pages crawled and to be crawled) in memory. Now that geminispace has grown extremely large in the number of URLs, this has been causing my system to swap *a lot* (like, 10GB worth of swap, lol).
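One way to keep the crawled and to-be-crawled URLs out of memory is to put them in an on-disk table and only pull one row at a time. The sketch below is not the actual AuraGem crawler code; it uses SQLite from Python's standard library purely so the example is self-contained, and the `DiskFrontier` class and its table layout are hypothetical.

```python
import sqlite3


class DiskFrontier:
    """URL frontier stored in a SQLite database instead of in-memory sets."""

    def __init__(self, path):
        # Pass a file path to keep the data on disk.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS frontier ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " url TEXT NOT NULL UNIQUE,"
            " crawled INTEGER NOT NULL DEFAULT 0)"
        )
        self.db.commit()

    def add(self, url):
        """Queue a URL; return False if it was already known."""
        cur = self.db.execute(
            "INSERT OR IGNORE INTO frontier (url) VALUES (?)", (url,)
        )
        self.db.commit()
        return cur.rowcount == 1

    def next_url(self):
        """Return the oldest uncrawled URL and mark it crawled, or None."""
        row = self.db.execute(
            "SELECT id, url FROM frontier WHERE crawled = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute("UPDATE frontier SET crawled = 1 WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]
```

The `UNIQUE` constraint does the "have I seen this URL before?" check that an in-memory set would otherwise do, so memory use stays roughly flat no matter how many URLs are queued.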

These changes will hopefully improve the search engine's speed overall.

🚀 clseibold

Jun 20 · 2 weeks ago

2 Comments ↓

👻 ps · Jun 21 at 01:01:

Take a look at Manticore as well; I've used it for my personal search engine.

It's based on Sphinx and does not require an external SQL DB in most cases.

https://github.com/manticoresoftware

🚀 clseibold [OP] · Jun 21 at 01:58:

@ps Thanks, but PostgreSQL is working just fine, and the move to it is pretty much finished now.

The performance issues are primarily caused by the swapping problem, which comes from the way I originally chose to do my crawler.

While PostgreSQL might be more efficient than Firebird, the main reason for the switch is that PostgreSQL gets better updates, is better supported and more widely used, and has better tools (pgAdmin is better than anything you can get for Firebird). It also supports way more features.

