Hello and welcome to yet another technology-fuelled rant from yours truly!
Today’s topic is search engines, those little websites all browsers use that allow you to find other websites, given a query by the user (that’s you!). Search engines, and doing internet research in general has become synonymous with Google, with it now being common to suggest to other people to “google” something. In this post, I wish to give you a breakdown for why this might not be so good, what can be done, and what the drawbacks are. Let’s get into it.
Why ditch the default?
Every modern browser has a search engine that it uses by default. All popular browsers happen to use either Google or Bing, the only exception being the Brave browser (if achieving 1% market share is sufficient to be “popular”). This is of course because Google and Microsoft pay top dollar to convince browser makers to set their search engine as the default on the browser in question. Interestingly, this may be about to change, as a recent court ruling seems to have found this practice to be monopolistic behavior. However, should this ruling actually stop Google from paying billions of dollars to browser companies, you better get ready to wave bye-bye to Firefox, since the Mozilla foundation gets the majority of their money from these deals… We will see what happens.
Either way, I hope it’s obvious why being the default matters – people generally don’t switch. This phenomenon of human behavior clearly is bad for fostering competition, and there are lots of ideas out there on how best to address this without banning things just for being popular. Of course, a large corporation trying to maximize profits is expected, so why bother switching for reasons other than wanting to be a hipster? Here are some arguments you might consider:
- Search Engine Optimization exists, meaning bigger search engines might be getting worse every year. Search Engine Optimization is the practice of tailoring a website to please the ranking algorithm of the engine you are targeting, such that your website will rank more highly for more key-words than your less optimized rivals. This means business gurus are constantly tinkering with the popular search engines, trying to reverse-engineer the design principles of the algorithm and thus game the system. This is partly why, when you search for information about a product, you always get the same set of generic product review websites, which all say the same generic things. It’s a constant cat-and-mouse game with big-search engines, since everyone wants to optimize for them. When the big companies try to intervene against having their engines gamed, it inadvertently promotes something else, until the gurus catch up. This may explain why if you search for a recipe you often see content that accidentally does well with these ultra-engineered algorithms such as those 20 page long cooking blog-posts in which some lady rambles on about her life before giving you that pancake recipe you wanted. If instead you are a search provider, noone cares about you, so your index will inadvertedly surface much more “natural” content for queries.
- You do not wish to support business practices that emphasize advertisers rather than users. Many big-tech companies make a lot of their money from appealing to advertisers. To be able to charge a premium, they help advertisers target the audiences of their desire. Of course, to target a specific group, you need to know the makeup and identities of the group’s members. This set of incentives practically invites questionable levels of data collection about users, with significant potential for misuse. Thankfully, many governments around the world seem to have started thinking about this problem, but there’s still a long and rocky path ahead of us. You might not want to support these practices, which would be a great reason to look for another service with a different business model.
- One or two companies controlling what information can and can’t be found is reason for concern. This is a purely theoretical argument to draw your attention to the potential problems an over-concentrated market could promote. If a company with 90% market share on access to information decides that something does not exist, perhaps because its existence goes against the company’s profit-maximizing business interests, would a meaningful number of people ever find out? I hope you will agree that in the best case this level of market dominance is questionable, and in the worst case reminiscent of a George Orwell novel. Again, I don’t really think that these companies have bad intentions, but the point is that we should not have to rely on their good-will. Your ability to choose as a thoughtful human-being can help avert the possibility of bad outcomes, so why not exercise that sweet, sweet free-will?
- You want to find out about the features that the big providers don’t have. This is the most fun/least cynical reason I’ll talk about. Basically, each search engine has a different design philosophy. Google clearly values performance and a minimal aesthetic, while Microsoft seems to not shy away from giving users lots of options and additional information. The smaller players also have interesting features that I’ll talk about later on in more detail, but often revolve around privacy, new technologies, and new ways for exploration. Who knows, maybe one of them has something you never realized you needed in your life?
The alternatives come in a few flavors
So what’s out there? The smaller players in the search engine space generally don’t track their users and can be broadly categorized as follows:
1) The Front-ends
These search engines are probably the most common and can be summarized as giving you the results of one of the big players, typically minus the user tracking. They don’t usually give you the exact same results as the big providers, as they vary in the degree to which they use and mix the rankings provided by their big-tech counterparts, but they are generally pretty similar. DuckDuckGo, Startpage and Qwant all fall in this category. DuckDuckGo and Qwant are more closely related to Bing, while Startpage leans more heavily on Google. So if you already use Google or Bing, switching to one of these should provide a very similar user experience for basic search, minus the tracking and plus whatever unique features that search engine has. I’ll talk more about what makes each unique later. Yahoo! and Ecosia are also in this category, but I don’t consider them in any depth here, as Yahoo! is heavily reliant on Bing and Ecosia is more of a charity than a search engine.
2) The Meta searchers
The next category is that of meta-search engines, which as the name may have suggested, compile the results of multiple independent search engines for you. Basically, they will ask each from a set of engines to provide results to your query, and then combine these rankings, with respect to the relevancy scores associated with the results. This is very similar to 1) with the difference that the engines I ascribe to this category clearly display where each result hails from. This is unlike 1) where the result sources are not easily determined by the user. The main player here seems to be SearXNG, which is free software anyone can host themselves, with private people sharing access to their servers by listing them publicly on searx.space. Technically there is also MetaGer, but it really seems to act more as a Yahoo! front-end, which again, seems to be a Bing front-end, so I don’t consider it a serious contender in this category.
3) The Independents
Search engines in this category have the most work cut out for themselves by creating their own index without the help of existing players. In case you didn’t know, search engines are technically a type of AI that is constantly trying to re-rank web pages so that the page listed first is the one you are most likely to click on, given your query. Naturally, that means that independent search engines have a lot of catching up work to do since they do not use the many years of user-generated feedback that the goliaths have. Brave Search is by far the most popular one in this category, but there is also Mojeek. There is also Yep, but it’s currently very bare-bones, so I don’t consider it further in this post.
4) The Weird ones
With the recent boom of Large-Language-Models (LLMs), there have been a lot of companies trying to give internet search a new spin. These generally emphasize an AI generated summary of the results pages, with the actual results listing being more in the background. While this certainly is useful for quick, low-stakes questions, I don’t find them very useful for doing serious research. Perplexity AI is the most well-known one, but the original player in this space is probably You.com. It should be noted that Perplexity does track users, so is not the most privacy-conscious choice. Because this category of search tools is so different, they are harder to compare, so I will omit them from the following sections. Another search engine that could be considered and is fairly well-known is Yandex. In my opinion, it might not be the most trustworthy choice for unbiased information, though be it a different bias than what the big tech companies are said to have.
A very flawed empirical comparison
In this section, I want to give you a small, empirical comparison of my own experience when using each of these engines. I will consider Google, Bing, DuckDuckGo, Startpage, Qwant, SearXNG, Brave, and Mojeek. For a set of categories, I will come up with one common query and one niche query where I have clear expectations about what the results page should show me. I only consider traditional search, so no images/video searches. The categories are: people, places, finance, news, science.
I will evaluate the first 5 results on the page, and then score the results for accuracy, (lack of) redundancy, and the provision of other helpful information (complementarity). Sufficient performance will grant one point for each rating, while insufficient performance will give no points, meaning any of my ten test queries can yield up to three points.
This will not be a blind test, so take the results with a grain of salt. Additionaly, my criteria may oversimplify differences dramatically. Again, it’s just meant to give you an impression. If we wanted a truly scientific comparison, it would take something in the style of lmsys but for search engines.
So here are my results: First place is Bing with 28/30 narrowly beating second-place Google (27/30) as it provided better context for the finance queries I tested. I was expecting the big players to be close, but did not expect Bing to beat Google in my metrics, as my previous impression had been that it is less accurate.
To my surprise, DuckDuckGo was on par with Google, also achieving 27/30, thanks to still being pretty consistent at providing helpful context. On third place are Startpage and Qwant, both achieving 23/30. Both suffered from my complementarity category, as it was rarer for them to provide helpful widgets.
On fourth place is Brave search with 22/30, by far the best search engine that is completely independent, but also struggling to provide helpful context for many queries. However, Brave search was the most impressive out of all the engines when it came to news stories, with very helpful widgets to compare narratives.
On fifth place is Mojeek with 15/30, which did not always provide accurate results and struggled to provide context. While this appears as poor performance, I have to say that the results Mojeek provided were noticeably more original than all the other engines, which I had not expected.
Lastly is SearXNG with only 9/30 points, thanks to flaky accuracy, big problems with redundancy and no helpful context. This was the only engine that managed to score 0/3 for one of my test searches. I was very surprised to see a meta-search engine flop this hard, as it seems like a great idea on paper.
In sum, if you take my results at face value, then Bing is the best for depth, DuckDuckGo is the best privacy alternative to Google/Bing, Brave is the best for independent results and news, Mojeek is the best for exploration, and SearXNG may better be avoided if you care about finding useful information.
Cool new toys
Now I want to briefly introduce you to some of my favorite features with these smaller search engines. This is not a comprehensive list of unique features, as I don’t like all of them, so please do explore the functionality each tool provides yourself.
- Bangs are commands in the form of e.g.,
!site query
that allow you to push a query to another page’s search engine. Here’s an example: Let’s say you want to read the Wikipedia page of Abraham Lincoln, you would first go to Wikipedia’s site and then type in “Abraham Lincoln” in their search field. With bangs you can instead just do!w abraham lincoln
and get taken to the result directly. There are thousands of pages that can be indexed like this, and once you get used to it, it’s hard to go without. Want to look for a Youtube video?!yt
. Can’t find what you’re looking for?!start
. This feature makes smaller players much more viable, because you can easily query another search engine whenever it under-performs. DuckDuckGo and Brave search both have this feature. - Custom Indexes are a feature unique to Brave search and Mojeek. They allow any user to re-weight the search engine’s index according to their preferences and make it accessible to other users. This means you can make a “goggle” (“focus” for Mojeek) that essentially turns that instance into a search engine for academic articles, or one that helps you uncover niché blogs about animal husbandry – your creativity is the limit. While I love this feature in theory, seemingly neither Brave nor Mojeek have invested all that much into polishing this into something most people would be able to use. However, if refined, this would consistute the most powerful tool for internet exploration ever made.
- DuckDuckGo currently is the only smaller search engine that also gives you access to a range of language models for free. I don’t mean those AI generated page summaries but a separate space for you to chat with these models. The functionality is a bit limited in this dedicated space, and the models available are obviously not the state of the art ones. However, if DuckDuckGo keeps improving this feature, it might eventually become a real game-changer!
- Both Startpage and DuckDuckGo provide you with methods to anonymously view content. The approach Startpage takes is more general, with you being able to view any page inside a proxy. This means it becomes harder to track you, as the website is talking to the proxy, not directly to you. DuckDuckGo, on the other hand, lets you view Youtube videos on their video results page, allowing you to watch them without Youtube being able to associate them to your Google account. Personally, I think these are pretty niché features, but maybe you’ll like them?
Some hard truths
Although I generally find these smaller engines quite exciting, it’s not all rosy:
- While normal search is pretty passable on most smaller search engines, things can start to noticeably under-perform once you branch out from that. All non-independent search engines I tried seemed to heavily rely on Bing for images, meaning there is still a long way to go if this is an important feature to you. That being said, DuckDuckGo’s implementation stands out as being the only one actually good enough to satisfy most users. The independent search engines are really not the best at finding images, both in terms of quantity and quality.
- Maps is still dominated by the big providers, especially since Qwant dropped out of the race. DuckDuckGo uses Apple maps as their default maps provider, while Startpage actually lets you choose between Google maps and Bing maps for your mapping needs. The others do not currently provide maps. Qwant used to have their own independent mapping service, which I actually felt had the potential to eventually compete with Google maps, but unfortunately they recently abandoned this feature. My hope is that Brave will eventually make a maps service, as they seem to be the most dynamic company I discussed here in terms of feature expansion.
The moral of the story
In sum, the main advantage of the big players seems to be maintaining consistency for fringe searches more than anything, but as we’ve seen, privacy-focused alternatives like DuckDuckGo are already playing serious catch-up to the giants.
I personally use Brave search as my default, as it gives me a nice mix of original results and unique features, while using DuckDuckGo as a back-up for those really obscure queries. Using backups is extremely easy thanks to Bangs. In fact, Brave search even embeds buttons to perform the same search on other engines on their own results page whenever it’s not so confident it found what you wanted (Mojeek also has this).
So in essence, I choose to use the search engine that I find most interesting and interoperable, rather than the one that is strictly the most accurate. Maybe you should give this approach to internet exploration a shot too? In either case, I hope you found this informative, and that I managed to arouse your curiosity about the world of alternative tools that’s out there. Thanks for reading! 🙂
If you enjoyed this post, consider checking out my website (http://lukmayer.github.io), where I have additional content and post things first!