ChatGPT vs Gemini: which AI is best?

Even the most cursory glance at the internet will show that AI is probably the single biggest topic at the moment, and interest in it shows no sign of abating. With a growing number of chatbots, text editors and image generators available, much of the content that we read and see, and many of the responses to our queries, are non-human in origin. But are all of these systems comparable? Are some better than others? In this article, we’ll have a good look at two of the front-runner AI engines and put them up against each other.

What is AI?

In simple terms, advances in AI have made it possible for data, rather than a programmer, to dictate how a program behaves, through the use of progressive learning algorithms. By sifting through data for patterns and trends, these algorithms learn new tasks. They may pick up new skills, like playing chess, or learn which products to recommend online. Computer programmes powered by artificial intelligence can swiftly search vast amounts of online material, filter it, and then use it to answer questions. Not only that, they can write coherent sentences and generate documents that look just like content made by humans. Apart from those written by me, which are really good.
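
To make the idea of data dictating the program a little more concrete, here is a minimal Python sketch. It is a toy illustration only and is not how ChatGPT or Gemini are actually trained; the function name `train`, the learning rate and the number of steps are all assumptions picked for this example. The program is never told the rule "y = 2x + 1". It is only shown examples, and it nudges two numbers step by step until its guesses fit them.

```python
# Toy example: learn the rule y = 2x + 1 from examples instead of hard-coding it.
# Everything here (names, learning rate, step count) is illustrative only.

# The "data": ten example inputs and the answers we want the program to match.
data = [(x, 2 * x + 1) for x in range(10)]


def train(examples, steps=5000, lr=0.01):
    """Fit y = w * x + b to the examples using plain gradient descent."""
    w, b = 0.0, 0.0  # start with no knowledge of the rule at all
    n = len(examples)
    for _ in range(steps):
        # Measure how wrong the current guess is, averaged over all examples.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        # Nudge both numbers a little in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(data)
print(round(w, 2), round(b, 2))  # the learned values end up very close to 2 and 1
```

Large language models work on the same broad principle of adjusting numbers to fit examples, only with billions of numbers and text instead of two numbers and a straight line.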

The contenders

OpenAI, an AI research company, released ChatGPT in November 2022. The company was founded in 2015, with Elon Musk and Sam Altman among its founders and early leaders, and has a number of investors, the most prominent of which is Microsoft. As one of the first AI chatbots to reach a mass audience, ChatGPT is the one that most people recognise and will cite if asked about AI.

Gemini is the next-generation generative AI model family from the Google DeepMind and Google Research teams, and it has now reached the internet and several Google products. All Gemini models are trained to be “natively multimodal”, that is, capable of processing and using data beyond text alone. Gemini underwent a great deal of pretraining and fine-tuning on a wide range of media, including text in several languages, codebases, photos and videos. It is available in three tiers: Gemini Ultra, the top-level model; Gemini Pro, a “lite” model; and Gemini Nano, which runs on mobile devices like the Pixel 8 Pro.

These are both very powerful tools, but which one is the best, if any? Let’s have a deep dive and see how they hold up against each other.

Language Models

The first thing to note is that both Gemini and ChatGPT are large language models (LLMs), more powerful and extensive than anything previously made publicly available. Keep in mind that while we refer to them as chatbots, the two have slightly different intended user experiences. ChatGPT is built to facilitate natural-sounding conversations and problem-solving, much like having a one-on-one chat with a subject-matter expert. In contrast, Gemini appears to be built to optimise data processing and task automation for the benefit of the user. ChatGPT benefits from having been around for longer, and has therefore had more exposure to natural language and more time to learn from it. Gemini, while newer, has had access to all of Google’s web experience and content. However, that doesn’t necessarily make it more powerful. Anyone who uses multiple search engines will know that different engines (Yahoo, Bing, Yandex, Baidu and so on) can give startlingly different results for the same query, so not being tied solely to Google may actually be an advantage.

The general consensus is that ChatGPT currently has the edge over Gemini, but that may not be the case in six months’ time. 

Information Retrieval Capabilities

This is one of the core requirements of AI, and if a system can’t do it well or effectively, then it will be fairly ineffectual as an intelligent entity. 

When it comes to this particular function, the two AI giants are considerably different from one another. By default, Gemini takes into account all of the information that is available to it, which includes the internet, the massive library of information and expertise that Google maintains, as well as its own training data.

In contrast, ChatGPT will frequently try to answer a question by relying solely on what it has learned from its training data, which can result in information that is no longer current. The user can get around this by instructing it to search the internet for the most recent and accurate data. However, that introduces an additional step, which Gemini has demonstrated is not really required.

This one really has to go to Gemini, as its data always seems fresh and considered, while there are instances when ChatGPT looks like it has just regurgitated information from a website verbatim. Don’t let that put you off though; both systems are incredibly powerful and tend to return good results, although the quality of the answer depends on the quality of the prompt. The old computer adage about “rubbish in – rubbish out” still holds.

Multi-Modal Capabilities

The term multi-modal relates to the ability of an AI program to sort, assess, and process information in the form of written content, images, and audio data. The first release of ChatGPT was great at processing text but less so at doing the same with other forms of information. However, since the system was upgraded to the GPT-4 engine, it has become a lot better at multi-modal processing: it can now work with images and audio as well, and it generates images through the specialist DALL-E add-on, with the same astounding results. Gemini works with Google’s Imagen 2 engine for image generation and was designed from its inception to be just as good at handling other forms of data.

These two engines generate content in different ways and with markedly different results. Given an identical prompt, ChatGPT is more consistent at creating an image that closely resembles what the user asked for. On the other hand, Imagen 2 and the Gemini engine are marginally superior at producing photorealistic and extremely detailed images, a difference that is noticeable when the results are placed next to those of ChatGPT.

Overall, ChatGPT tends to give more consistent results across the three types of input (text, images and audio), while Gemini is slightly better at creating images. Consistency wins and this round goes to ChatGPT.

The Results

However you look at it, the results are pretty close between these two giants of AI. To be fair, neither of them is faultless. Both continue to experience hallucinations and will, on a pretty regular basis, supply information that is completely incorrect. It is not known whether this is because the internet itself is inaccurate or because the engines are not casting their search nets wide enough. Either way, it means that you are generally better off checking the facts yourself, which rather misses the point of AI. If your objective is just to create written content, then both engines have their strengths and weaknesses.

If you are a dedicated user of the Google ecosystem, Gemini’s ability to interface with Gmail and Google Docs is likely to be a major selling point. Similarly, if you are an experienced coder and your primary requirement is writing code, you should thoroughly investigate Gemini.

On the other hand, at the time of writing, ChatGPT is superior for activities such as writing and preparing documents, summarising, generating general-purpose images, and learning through chats. Because of this, it remains the best option currently available.

If you are looking for a strong and useful AI engine, then ChatGPT is probably the one to go for, but it should be looking over its shoulder, ’cos Gemini is charging up fast behind it.