AI Deep Research Showdown

AI chatbots are constantly evolving, and many now offer "deep research" options that allow them to research specific topics for you. These bots act as autonomous AI agents, searching the web on your behalf, finding appropriate online resources, and then providing you with a detailed report based on their findings. The goal is to save you the time of sifting through hundreds or thousands of websites yourself.

Deep research is rapidly becoming a common feature across AI services. You can find it in OpenAI’s ChatGPT, Google Gemini, Perplexity AI, and even xAI’s Grok (which calls it DeepSearch). Microsoft has introduced its own take on deep research with two AI agents, Researcher and Analyst; however, they require a Microsoft 365 Copilot license on top of an Enterprise or Business subscription, so they aren’t yet available to average Copilot users.

This certainly sounds like a useful skill. But how do the different AI services perform when put to the test? To find out, I tried the deep research functions of ChatGPT, Gemini, Perplexity AI, and Grok. I submitted the same query to each service, asking them to "explore how time travel has been depicted in movies and television and how it reflects our values, fears, and desires."

Here’s how each AI’s research mode works and how they handled my topic.

ChatGPT

OpenAI’s ChatGPT offers two different deep research modes: full and light. The full version delivers a detailed, in-depth report, but it can take up to 30 minutes to find the best sources and present its findings. The light version delivers a shorter, less penetrating report but typically completes in a few minutes. Which one you can use and how many queries you can submit depend on your plan.

ChatGPT Plus, Team, and Edu users get 25 queries a month (10 full and 15 light), Enterprise users get 10 (all full), Pro users get 250 (125 full and 125 light), and free users get five (all light). Once you reach your limit for full deep research, your queries will automatically default to light.
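To make the allowances and the fallback to light mode easier to follow, here is a minimal Python sketch. The numbers are the plan limits listed above; the names QUOTAS and pick_mode are hypothetical, made up for this illustration, and are not part of any OpenAI API.

```python
from typing import Optional

# Monthly deep research allowances per plan, as listed above: (full, light).
QUOTAS = {
    "plus": (10, 15),
    "team": (10, 15),
    "edu": (10, 15),
    "enterprise": (10, 0),
    "pro": (125, 125),
    "free": (0, 5),
}


def pick_mode(plan: str, full_used: int, light_used: int) -> Optional[str]:
    """Return the mode the next query would run in, or None if none are left.

    Hypothetical helper: it only models the rule described in the text,
    where queries default to light once the full allowance is used up.
    """
    full_limit, light_limit = QUOTAS[plan]
    if full_used < full_limit:
        return "full"
    if light_used < light_limit:
        return "light"  # full allowance exhausted, so fall back to light
    return None


if __name__ == "__main__":
    print(pick_mode("plus", 3, 0))   # still within the full allowance
    print(pick_mode("plus", 10, 4))  # full used up, light remains
    print(pick_mode("free", 0, 5))   # nothing left this month
```

A Plus user who has already run 10 full queries would get "light" for the next one, which matches the automatic default described above.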

Whether you’re using the full or light version, the process is the same. You can use ChatGPT on the web or in the desktop app for Windows or macOS. Type or speak your query at the prompt, select the Deep Research button, and then submit your request. If the full version is in effect, prepare to wait a while for a response. If the light version is running, you won’t have to wait nearly as long.

I submitted the query about time travel in film and TV to both full and light deep research, using my Plus subscription in the first case and a free account in the second. Both used the GPT-4o model. Both also asked me to clarify the type of analysis I wanted, such as choosing between a thematic approach and a more historical one, and whether to include only classic movies and TV shows or more modern ones as well.

The full version took around 17 minutes to search the web and compile its results, but it delivered a detailed, in-depth report with several examples and a helpful chart of TV shows and films. The light version took only around eight minutes from start to finish, but it delivered a shorter, less penetrating report, essentially a CliffsNotes version of the full report. Both reports addressed my topic and were interesting to read, but full deep research gets the nod for its thoroughness.

Google Gemini

Gemini’s deep research mode is available to subscribers and free users alike. Subscribers typically get 20 queries per day, though that number may vary. Free users are limited to five queries per month.

To enable deep research, click the dropdown in the upper-left corner that lists the current model. Subscribers can choose between 2.0 Flash, 2.5 Pro (experimental), and Deep Research with 2.5 Pro. Free users can select either 2.0 Flash or Deep Research. After you select the desired model, a Deep Research button should appear below the prompt. Type your question at the prompt, select the Deep Research button (if it isn’t already highlighted), and then submit your request.

After I submitted the query about time travel, Gemini’s deep research quickly generated an outline of how it planned to tackle my topic, which I could either tweak or approve as is. I gave it the thumbs up, and Gemini started its research on the web.

The AI kept me apprised of its progress at each step, indicating what it was doing, what websites it was consulting, and how the report was shaping up. The entire process took around 10 minutes.

The resulting report was deep, thorough, and lengthy. I liked the table listing examples of the movies discussed. Gemini’s writing style was more academic than ChatGPT’s, which was less formal and a bit more fun to read. But Gemini still proved equal to the task.

Perplexity AI

Perplexity’s deep research mode is available to paid subscribers and free users. Pro subscribers get up to 500 queries per day, while free users receive five queries daily. At the prompt, type your question, select the Research button, and then submit your request.

I set Perplexity’s research mode to chew on the same time travel topic. Here, the AI kept me apprised of its progress, telling me what specific subtopics it was researching and what websites it was analyzing. Perplexity only took around five minutes to compile its findings and submit its report. But the results were disappointing.

The report itself was much shorter than the ones generated by the other AIs. Each topic or element only received a few paragraphs, most of which lacked any in-depth analysis. The report was okay as a quick read. But it reminded me of homework turned in by a kid who just wanted to get it done without investing too much time or effort.

Grok AI

xAI’s Grok 3 offers two deep research modes: DeepSearch and DeeperSearch. DeepSearch looks at a wide range of online sources, though not all of them are useful or reliable. This mode can also work very quickly. DeeperSearch is an upgraded version of DeepSearch that leverages more high-quality resources and takes longer to run but typically delivers a more in-depth report.

Whichever mode you choose, X Premium+ subscribers enjoy an unlimited number of queries, while free and Basic users are limited to 10 DeepSearch queries per day.

To try the feature, type your query at the Grok prompt, click the down arrow for DeepSearch, and then select either DeepSearch or DeeperSearch. When ready, submit your request.

I tossed the same time travel query at Grok. In DeepSearch mode, the AI completed the entire process in just a minute and a half, a record for speed. DeeperSearch took a little longer at two and a half minutes. Given the speed, I fully expected to receive a lousy report. But the results surprised me. In both modes, Grok delivered a report that was interesting and informative, albeit brief. The research listed various examples, a helpful chart of TV shows and films, and some clever analysis. Not bad at all.

So, which AI performed the best? I’d have to declare ChatGPT the winner. Though it took the longest, its report was the most thorough, the best written, and the most interesting. Otherwise, Grok is definitely worth a try if you’re in a hurry. Of course, all of this is based on just one query. For other topics, I might nominate a different champion. But these results are still worth considering the next time you need an AI to handle your own deep research.
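For a side-by-side view of speed, the approximate run times reported in the tests above can be put into a small Python sketch and sorted. The figures are the ones from this single query; the names RUN_TIMES and rank_by_speed are hypothetical, and the ranking says nothing about report quality.

```python
# Approximate run times in minutes, as reported for the one test query above.
RUN_TIMES = {
    "ChatGPT (full)": 17,
    "ChatGPT (light)": 8,
    "Gemini": 10,
    "Perplexity": 5,
    "Grok DeepSearch": 1.5,
    "Grok DeeperSearch": 2.5,
}


def rank_by_speed(times):
    """Return (service, minutes) pairs ordered from fastest to slowest."""
    return sorted(times.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, minutes in rank_by_speed(RUN_TIMES):
        print(f"{name:<18} {minutes:>4} min")
```

Sorted this way, Grok's two modes come first and ChatGPT's full mode comes last, which is the trade-off between speed and thoroughness described above.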

Now, let’s look more closely at the differences among the AI models and at their strengths and weaknesses when handling complex research tasks. We will focus on four key aspects:

  • Quality and Diversity of Information Sources: Is the AI able to identify and utilize information from various reliable sources?
  • Depth and Insight of Analysis: Does the AI merely repeat information, or can it provide profound analysis and valuable insights?
  • Clarity and Readability of Reports: Are the AI-generated reports easy to understand, logically clear, and well-structured?
  • Processing Time and Efficiency: Is the time required for the AI to complete the research task reasonable, and how does it compare to other models?

Assessing each model against these aspects gives a clearer picture of its capabilities and of which model is best suited to specific research needs.

ChatGPT Deep Dive

First, let’s review ChatGPT’s performance in deep research. As mentioned earlier, ChatGPT offers two different deep research modes: full and light. The full version provides a more in-depth and thorough analysis but takes longer to complete. The light version is faster but compromises on depth and detail.

In terms of information sources, ChatGPT appears to be able to access a wide range of online resources, including academic journals, news articles, blogs, and websites. However, in some cases, it may rely on less reliable sources, which can affect the accuracy and credibility of its reports.

In terms of depth and insight of analysis, ChatGPT’s full version is generally able to provide profound analysis and valuable insights. It can identify relationships between different sources and present well-reasoned arguments. However, the light version tends to lack this depth and may provide more superficial analysis.

In terms of clarity and readability of reports, ChatGPT is generally able to generate reports that are easy to understand, logically clear, and well-structured. Its writing style is less formal than Gemini’s, which makes its reports a bit more fun to read.

In terms of processing time and efficiency, ChatGPT’s full version takes a relatively long time to complete research tasks. This may be due to its more thorough analysis and reliance on a wider range of sources. The light version is faster but compromises on depth and detail.

Google Gemini’s Approach

Next, let’s look at Google Gemini’s performance in deep research. Gemini offers a deep research mode that is available to both subscribers and free users. It allows users to adjust or approve the AI’s outline for tackling the topic.

In terms of information sources, Gemini appears to be able to access a range of online resources similar to ChatGPT’s. However, it may screen sources more strictly for reliability, which may improve the accuracy and credibility of its reports.

In terms of depth and insight of analysis, Gemini’s deep research is generally able to provide profound analysis and valuable insights. It can identify relationships between different sources and present well-reasoned arguments. However, its writing style may be more academic than ChatGPT’s, which can reduce its appeal.

In terms of clarity and readability of reports, Gemini is generally able to generate reports that are logically clear and well-structured. However, its academic style may make them heavier going for some readers.

In terms of processing time and efficiency, Gemini’s deep research is generally faster than ChatGPT’s full version (around 10 minutes versus 17 in my test). This may be due to its more efficient analysis and reliance on a more streamlined set of sources.

Perplexity AI’s Perspective

Now, let’s assess Perplexity AI’s performance in deep research. Perplexity offers a deep research mode that is available to both paid subscribers and free users. It informs users of the specific subtopics it is researching and the websites it is analyzing.

In terms of information sources, Perplexity appears to be able to access a range of online resources similar to those of ChatGPT and Gemini. However, it may screen sources more strictly for reliability, which may improve the accuracy and credibility of its reports.

In terms of depth and insight of analysis, Perplexity’s deep research often lacks depth and detail. It may provide more superficial analysis and may not be able to identify relationships between different sources.

In terms of clarity and readability of reports, Perplexity is generally able to generate reports that are easy to understand, logically clear, and well-structured. However, their brevity may make them less appealing.

In terms of processing time and efficiency, Perplexity’s deep research is generally faster than ChatGPT and Gemini. This may be due to its more efficient analysis and reliance on a more streamlined set of sources.

Grok AI: Speed and Sources

Finally, let’s look at Grok AI’s performance in deep research. Grok offers two deep research modes: DeepSearch and DeeperSearch. DeepSearch looks at a wide range of online resources, while DeeperSearch leverages more high-quality resources and takes longer to run.

In terms of information sources, Grok’s DeepSearch may rely on less reliable sources, which can affect the accuracy and credibility of its reports. DeeperSearch focuses more on high-quality sources.

In terms of depth and insight of analysis, Grok’s deep research is generally able to provide interesting and informative reports, despite their shorter length. It can identify relationships between different sources and present well-reasoned arguments.

In terms of clarity and readability of reports, Grok is generally able to generate reports that are easy to understand, logically clear, and well-structured. However, their brevity may make them less appealing.

In terms of processing time and efficiency, Grok’s deep research is the fastest of all the models. This may be due to its more efficient analysis and reliance on a more streamlined set of sources.

In conclusion, each AI model has its own strengths and weaknesses in deep research. ChatGPT offers the most thorough and in-depth analysis but takes the longest to complete. Gemini offers an analysis similar to ChatGPT’s but with a more academic writing style. Perplexity AI is faster but lacks depth and detail. Grok AI is the fastest but, in DeepSearch mode, may rely on less reliable sources.

Ultimately, the best AI model for you will depend on your specific research needs. If you need the most thorough and in-depth analysis and don’t mind waiting longer, then ChatGPT may be the best choice. If you want a faster turnaround and can accept a more academic style or some compromises on detail, then Gemini or Perplexity AI may be better options. If you need the fastest analysis and don’t mind a briefer report that may draw on less reliable sources, then Grok AI may be the best choice.