“I found it very weird that there essentially is no way to browse the web in an open manner. So that’s what I am trying to build,” the founder of Stract said.
There is a business around it, and the project doesn’t really have any track record, so no trust has built up. I would tread carefully.
So have many others, except they didn’t start a company based on it. As soon as it is part of a company, it is no longer free and open
Why? It depends on the business model, even RMS says it’s ok to make money with open source
Ok, now, to make this the be-all and end-all of search engines…
Can it be self-hosted?
Looks like it from the readme!
Amazing, will try this out on the Pi then.
I was wondering the same, but I didn’t find any information on how it builds the search index. I guess it takes quite a while until it’s usable. Also, it is probably very dependent on the speed of the internet connection and the available storage.
From the GitHub page linked in this post:
We recommend everyone to use the hosted version at stract.com, but you can also follow the steps outlined in CONTRIBUTING.md to setup the engine locally.
I wonder how it compares with searxng. I do like that it’s written in Rust instead of Python.
It’s got a fully independent search index according to the README. SearxNG, LibreX, LibreY, etc. just takes results from multiple search engines and combines them.
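To illustrate the difference: a meta-search engine has no index of its own, it just merges the ranked results it gets back from upstream engines. A toy sketch of one common merging approach (reciprocal rank fusion; the engine names and URLs below are made up):

```python
# Toy sketch of meta-search result merging, the SearxNG/LibreX approach:
# no crawling or indexing of our own, just combine upstream rankings.

def fuse(result_lists, k=60):
    """Reciprocal-rank fusion: score each URL by summing 1/(k + rank)
    over every upstream engine that returned it, then sort by score."""
    scores = {}
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from two upstream engines:
engine_a = ["a.com", "b.com", "c.com"]
engine_b = ["b.com", "d.com", "a.com"]
print(fuse([engine_a, engine_b]))
# -> ['b.com', 'a.com', 'd.com', 'c.com']
```

URLs that several engines agree on float to the top; an independent index like Stract’s skips this step entirely because it ranks against its own crawl.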
Mh, but there are (were?) other search engines where you could crawl the web yourself. I remember doing that for the lolz; can’t remember the name, though.
Ah, the one I was thinking of is mentioned here: https://h4.io/@helioselene/111908397221160157
What are the actual plausible outcomes here?
- The search engine becomes successful and requires monetization to pay for the hosting/indexing costs
- The search engine does not become successful and the ever increasing cost of indexing the entire internet forces monetization or shut down
- You self host your own version, in which case you need to start indexing yourself (see problem #2)
I think what would be interesting is to get everyone who self-hosts this to do part of the indexing. As in, find some way to split the indexing work across the self-hosted instances running this search engine, then make sure “the internet” is divided somewhat reasonably between them. Kind of like what crypto does, but producing search indexes instead of nothing.
That would give random strangers (at least partial) control over what is indexed and how and you’d have to trust them all. I’m not sure that’s a great idea.
There are ways to get around this. Give every indexing job to multiple nodes, decide the result by majority vote between those nodes, and penalize (i.e. exclude) nodes that repeatedly produce results that don’t match the majority. Basically what distributed-computing research has done for decades.
Getting the details of such a system right wouldn’t be easy, but it’s far from impossible.
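As a rough sketch of the voting scheme just described (all names and thresholds here are made up for illustration): each job goes to several nodes, the majority result wins, and a node that keeps disagreeing with the majority gets excluded.

```python
# Hypothetical sketch of majority-vote verification for distributed indexing:
# every job is replicated to several nodes; the majority result is accepted,
# and nodes that repeatedly deviate from the majority are banned.
from collections import Counter

MAX_STRIKES = 2  # disagreements allowed before a node is excluded

def settle_job(results, strikes, banned):
    """results: {node_id: index_result}. Returns the majority result,
    or None if there is no strict majority (job should be re-issued)."""
    majority, votes = Counter(results.values()).most_common(1)[0]
    if votes <= len(results) // 2:
        return None
    for node, result in results.items():
        if result != majority:
            strikes[node] = strikes.get(node, 0) + 1
            if strikes[node] >= MAX_STRIKES:
                banned.add(node)
    return majority

strikes, banned = {}, set()
# Node C returns a tampered result on two jobs in a row and gets banned:
settle_job({"A": "page-hash-1", "B": "page-hash-1", "C": "spam"}, strikes, banned)
settle_job({"A": "page-hash-2", "B": "page-hash-2", "C": "spam"}, strikes, banned)
print(banned)  # -> {'C'}
```

The replication factor is the knob: higher means more redundant crawling, but a single bad node can do less damage before the majority catches it.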