Most people either use Google as their search engine, or one of the “privacy friendly” ones (DDG, Qwant, Brave, Startpage, …), or use self-hosted or publicly available metasearch engines like SearXNG or Whoogle.

This website lists which search engines have their own indexes, and which depend on big providers.

Why YSK?

It is good for your privacy to not use a big provider like Google, which now prefers to serve you AI-generated summaries based on a few giant websites, and this is not good for an open web.

I am also a person who almost always searches “(insert query) reddit” to get better results, because I mostly do not want SEO spam, and Reddit results used to be human-generated content. Now even that is hit and miss. Also, Reddit made a deal with Google, so newer results from Reddit can only be found through Google.

Then we have the “privacy friendly” ones, which most of the time are wrappers around other, bigger indexes. For example, DDG famously uses Bing, Brave “supplements” (read “supplements” as “almost always”) its results from Google, Startpage is basically a Google frontend, etc. Brave, Qwant, and a few others also claim to have their own indexes, but these are small and not as rich as Google's or Bing's. Also, when you think about it: what is their business model? How do they get money to pay for the search APIs? Most either serve ads or have some form of tracking. Also, Bing has “kinda” closed its search API (I am not really clear about this), so many of these privacy friendly options will have to either switch to Google or serve results only from their own indexes.

Metasearch engines kinda seem like better options, as you can run SearXNG on your own machine or use a public instance, but they still have problems. You are still bringing the big providers traffic, which keeps their advertising clients happy and makes those clients prefer them over smaller search engines. If you use a public instance, that is good for your privacy, but a public instance generates a lot of traffic and often gets banned or rate limited, so you cannot rely on it. If you use your personal instance (I did this for a long time), you are still tracked, as your IP is still visible. You avoid their annoying UI and popups but are still tracked.
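To make the personal-instance case concrete, here is a minimal sketch of querying a self-hosted SearXNG instance from Python. It assumes the instance listens on `http://localhost:8888` (a made-up address) and that JSON output has been enabled in its settings, which is not always the default. The helper names are only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a personal SearXNG instance on this address, with the "json"
# output format enabled in its settings. Adjust to your own setup.
BASE_URL = "http://localhost:8888"


def build_search_url(query, base_url=BASE_URL):
    """Build the search URL asking the instance for JSON results."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url}/search?{params}"


def parse_results(payload):
    """Turn a JSON response body into (title, url, engine) tuples."""
    data = json.loads(payload)
    return [
        (r.get("title", ""), r.get("url", ""), r.get("engine", ""))
        for r in data.get("results", [])
    ]


def search(query, base_url=BASE_URL):
    """Send the query to the instance and return the parsed results."""
    with urllib.request.urlopen(build_search_url(query, base_url), timeout=10) as resp:
        return parse_results(resp.read().decode("utf-8"))
```

The `engine` field shows which upstream provider answered each result, which is a quick way to see how much of what you get still comes from the big indexes. The upstream requests still leave from your own IP.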

So what should you use?

Only you can decide this. I would prefer something which has a reasonable business model: if they do advertising, it should ideally be non-tracking. Ideally their client and server code should be FOSS (so you can verify their claims), or they should have paid plans or APIs if you do not want ads.

For example, Kagi has only paid plans, but I do not prefer or use them because they are expensive (5 dollars for 300 searches per month or something similar; I am from a third world country, and 5 dollars is a lot, plus 300 searches seems too few to me). But that is subjective, and your privacy has a price, so this is not necessarily an objectively bad thing. But their code is closed source, and they do not completely use their own indexes.

I have also used Mullvad’s Leta search engine for about a month, and it is now effectively a frontend for Brave Search or Google (you can choose). Their business plan initially was that Leta was only available to their VPN clients, and the VPN subscription would cover the search cost. Now they have it available for free, so I do not really understand their business plan (maybe the number of clients they have is large enough, and the number of Leta users small enough, that they can afford to run Leta at a loss, possibly as advertisement for Mullvad). Mullvad to me is a good privacy-centric company. I am not their client, but they seem to be trustworthy. You can try them, but you would still be supporting some big provider.

You can also try the independent search providers listed in the article. They are often small and serve bad results (subjectively speaking; your taste in search engines is also heavily tuned to Google-like results because of years of exposure), but using them supports the open web. You will often find that these smaller providers do not have good indexes for big websites; sometimes that is intentional, and sometimes it is a byproduct of them being careful, or of the websites banning/rate limiting them.

I have now started trying Stract, and will try others too. You should also consider trying some independent search engines.

In my personal case, I have an offline setup where large sections of Wikipedia and a few other websites (like programming language docs or my favorite manga wiki; I will be adding much of Stack Overflow soon) are available offline, and I use my custom launcher to search through them (faster than searching them online). I bookmark a lot of sites (~2000) and do this to stop searching for the same stuff over and over again. This has cut at least 30-40% of all my searches. But I still need a search engine for anything I do not currently have, or stuff I do not/cannot get. I am trying Stract because it is open source; they seem to have some fine plans for business in the future (non-tracking ads related to the current search term, or a subscription service; currently they are running on previous funding from NLnet); the search results are acceptable (not good, but serviceable); and finally, it is written in Rust (I am a Rust fan). I am not affiliated with the project, just spreading a good word because I just found them and could not find much about them online.
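For anyone who wants to try the bookmark idea, here is a minimal sketch of searching bookmarks locally before reaching for a search engine. It is not the launcher described above, just one simple way to do it: a small inverted index over bookmark titles and URLs, where a query matches a bookmark only if every query word appears in it. The function names and the (title, url) tuple layout are assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(bookmarks):
    """Map each word to the positions of the bookmarks that contain it."""
    index = defaultdict(set)
    for pos, (title, url) in enumerate(bookmarks):
        for token in tokenize(title + " " + url):
            index[token].add(pos)
    return index


def search(index, bookmarks, query):
    """Return bookmarks containing every query word, in their original order."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return [bookmarks[i] for i in sorted(hits)]
```

With a couple of thousand bookmarks the index fits easily in memory and can be rebuilt on every launch, so there is no need for a database.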

PS: I am not used to writing much, and not a good typist. Please forgive the brevity. Feel free to correct me, both on spelling and content.

  • Tiger@sh.itjust.works
    19 hours ago

    Thank you for the post and insights. I’m a very happy paying Kagi user. For me it’s so worth it to never see ads and enjoy better privacy. The results are always great as far as I know (I’m not a pro at comparing them with Google but they seem just as good or better).

    I especially like the AI summaries (even though I’m a skeptic about overusing AI generally) because many sites block my IP because of being in China and using weird VPNs and proxies, so the summaries let me know the info without having to open the page up.

    • sga@lemmings.world (OP)
      9 hours ago

      In this particular case, have you tried running uncensored models locally, maybe ones trained primarily for the West, such as some Mistral model? I am not saying Western models would give you correct, better, or unbiased answers, but if you use multiple sources, you can probably form some conclusions yourself. You could keep a combo of smaller quants of small models (8B should run reasonably if you have 16 GiB of memory, even 12-14B models can work, and you can download 2-3 models from different providers).

      Side questions: Are you allowed to use Hugging Face in China? Similarly, is all your Lemmy life through VPNs, or is Lemmy still small enough not to be censored yet? Also, if you do not mind answering, how is the infamous firewall implemented? Is it a blacklist banning certain stuff, a whitelist allowing only selected stuff, or some combination? For example (and I am not accusing China of censoring Wikipedia here, I just do not know enough and am trying to form a hypothetical example): ban the whole of Wikipedia but whitelist certain articles, or whitelist all of Wikipedia and ban only certain articles, or something like ban English Wikipedia and whitelist some articles, but keep all of Chinese Wikipedia whitelisted and then possibly block some. Also, how is VPN usage treated by the government? Is it discouraged or encouraged, and if you use it a lot, and possibly mix some IRL activity and VPN activity, is it pursued in some way?

      Sorry for asking so many questions, but I was just curious. In any case, if you feel uncomfortable, do not answer any question. Thank you

      • Tiger@sh.itjust.works
        4 hours ago

        It’s a blacklist censoring stuff, so if the domain you want to access is obscure and not mainstream you might get it. If it’s at all political in nature though, or an open discussion forum, or a news site it’s most probably blocked.

        Anyone who needs the global internet needs a VPN. The govt is so powerful with their filtering and knowledge they can block any vpn at any time, basically. You might be able to guarantee a way through if you have a very dialed in and obscure configuration, otherwise you’re just getting lucky on borrowed time.

        Only a small handful of commercial VPNs from the west work, and they are only ones that are very purposefully made with stealth and obfuscation modes. Those modes blend traffic and go to great lengths to hide the fact that it’s a VPN tunnel occurring. To punctuate that, off the shelf Wireguard or OpenVPN are immediately detected and blocked with no chance to get through at all.

        Also it varies by region: if you’re at a fancy hotel in Shanghai you might have an easier time than if you’re in the countryside. Also you have an easier time on mobile networks.

        About Lemmy, it’s obscure enough that I think you don’t need a VPN to browse it a little, depending on the instance. But a lot of stuff in the view coming from global URLs does get blocked, so the general experience is broken and I need a VPN on.

        Generally, I need so much global internet that’s blocked all the time for work and stuff that I need multiple VPN solutions and provider accounts active at any time in case any goes down (this is very much the case recently, my usual ones are not working and I have to use backup ones) and am always testing out new ways. On average I have to spend 10% of my working time just trying to get a good connection online.

        • sga@lemmings.world (OP)
          3 hours ago

          Thanks for sharing. I have heard pretty crazy stuff, and this seems more believable.

          “off the shelf Wireguard or OpenVPN are immediately detected and blocked with no chance to get through at all”

          Is it possible for you guys to buy a VPS outside and somehow pay for it (that would probably be the hard part), or maybe find some friends or diaspora outside and have them pay for a VPS so you can set up your personal VPN? You can make it somewhat obfuscated, and mostly hide it as some work related stuff.

          Do you guys use tor or i2p? I would presume you are “allowed” but still have to hide it.

          “On average I have to spend 10% of my working time just trying to get a good connection online.”

          That is really sad, sorry to hear about that.

          • Tiger@sh.itjust.works
            2 hours ago

            Yes, I buy and use VPSes outside of China and test with them and stuff, but that kind of doesn’t matter if they detect the protocol and block it anyway, regardless of IP address; you need the obfuscated service. My group has tested building our own and had mixed success but the free that work well with better.

            What’s really the best is to use what local IT guys use, really local and esoteric stuff, and that is what I’m using now on desktop because the otherwise best western ones (Mullvad, Astrill) are having issues.

            I’ve used Tor here, but don’t need it enough to know much about it and it’s been a while.

            Also you gotta know here that the VPNs you do get to work might have been hacked or actually be operated by the government, my devices or traffic surely accessible to them.

  • hansolo@lemmy.today
    2 days ago

    Some caveats here.

    DDG is a wrapper for Bing results, but the DDG business model is to use only the search term to serve ads mixed in to the Bing results. Startpage is similar, with Google results instead. So you have to dodge SEO pages occasionally.

    SearX isn’t bad, but you have to trust the instance owner a bit.

    In short, any user should alternate and use multiple search engines. Spread your risk out.

    • sga@lemmings.world (OP)
      2 days ago

      This is one part of what I wanted to convey. For example: Bing has some results, they uprank sites they are paid by, then send them to DDG, and DDG adds its own ads and sends them to you. You are effectively using Bing with better privacy (I don’t mind ads, I can recognise/avoid them). The other problem with DDG is the thing that I brought up: Bing is planning to (or is already underway to, depending on the source of information) scrap its search API and mostly replace it with an “AI powered API”. Most users of the Bing API would have to find alternatives.

      Same with SearX: you are more private (assuming you use a public instance), but it is not great for the open web. The up/downranking done at the source by Bing/Google cannot be avoided. Hiding results can be done on your side (afaik there are some scripts/extensions to hide results from certain sites), but you still cannot uprank something that was never even sent to you. This hurts smaller websites, like articles or forums that are not big platforms or doing heavy SEO.

      One should most definitely use more than 1 search engine. You can even find interest specific engines.

  • Vinny_93@lemmy.world
    2 days ago

    First off. I hate ads. I do everything in my power to avoid seeing them.

    But for my search engine, I tolerate them. Why? Because 80% of the profits from this ad revenue is used to plant trees to combat global warming.

    I’ve been using Ecosia for quite a while now and as it uses Bing for most of the algorithmic stuff, you won’t notice a huge difference. But there is some custom indexing in there.

    Ecosia passes all kinds of integrity tests and certifications, so I do not question whether they are legit or not. I just feel my search efforts are best spent on Ecosia, who find a way to actually do some good in the world with something as heinous as ad revenue.

      • LWD@lemm.ee
        2 days ago

        The number of trees Ecosia plants after you see

        1 ad: 0

        100 ads: 0

        1 million ads: 0

        Like every other ad provider, Ecosia only makes money after you click on the ads. So staring at them doesn’t do you (or them) any good.

    • sga@lemmings.world (OP)
      2 days ago

      I am not going to say anything about Ecosia specifically, since I do not know much about them. All I can say is that most of the ad revenue goes towards Bing API pricing, then their servers and other costs of doing business (for example rent, electricity, regulations, etc.), then their employees, and only then, if they have some amount of money remaining, would they possibly use it to plant trees. Most businesses do not want to spend their profits immediately, and generally invest part of them back to build some runway (ad revenue is not consistent), invest in employees (insurance or incentives), or pay back any loans. Even if they have some VC funding or are receiving grants, they would have to be profitable by themselves, and only then can they plant trees. Keeping this in mind, my guess would be that they are not planting that much.

      • Vinny_93@lemmy.world
        2 days ago

        I should’ve mentioned that of course, they cover costs first and then invest their profits in good causes. From an article by Utopia.org:

        “Ecosia donates 100% of its profits to climate action projects. Of this, a minimum of 80% goes to tree-planting projects. This focus on planting trees is for the benefit of the planet, animals and people in general.”

        They surpassed 100 million trees in 2020 and according to their website they are well past 200 million at the moment. The 80% figure also comes from themselves and they have been certified after intensive scrutiny.

        Source