Google shows AI answers at the top of search results, so users might not visit the websites that provide the data for those answers.
Several website owners say they are powerless to prevent Google’s AI from summarizing their content.
Publishers have revealed that the Google tool used for generating AI answers is also responsible for indexing web pages for search results.
If sites blocked Google, a unit of Alphabet Inc., the way they have blocked some of its AI competitors, they would become harder to find on the internet.
Google’s dominance in search, which a federal court recently ruled is an illegal monopoly, is giving it an advantage in the growing competition over AI technology, according to search startups and publishers.
Publishers are in a difficult position as they must decide if they want to let AI use their content, potentially making their websites irrelevant, or risk being excluded from Google search, an important source of website traffic.
Joe Ragazzo, the publisher of Talking Points Memo, described the situation as a major crisis for these companies.
Both options are bad, he said.
Either publishers walk away and face instant death, or they partner with Google and likely suffer a slow death, because it will eventually abandon them too, he said.
Google stated that AI Overviews, the summaries shown at the top of Google search results, are part of its ongoing effort to provide better-quality information and support opportunities for publishers and businesses.
A Google spokesperson said that the company sends large numbers of visitors to websites every day and wants to keep this relationship going.
The spokesperson added that AI Overviews are making search more useful, leading people to search more often and creating new opportunities for content to be discovered.
Since its early days, Google has used a crawler called Googlebot to scan websites and build a vast index of the internet.
The index has made it difficult for companies, including well-funded ones like Microsoft, to create rival search engines.
Generative AI’s increasing popularity has led to a surge of new startups trying to provide search tools that use AI to give short, precise answers to users’ queries.
Google is worried about the future of its search engine because chatbots are becoming very popular.
Before these new companies can challenge the search giant’s business, they need to crawl the web thoroughly.
That is a difficult task.
To avoid spending money, computing power, and storage on being crawled, website owners publish a file, known as robots.txt, with rules for bots.
Google and Microsoft’s Bing are typically given more leeway because they direct traffic to websites through their search engines.
New search startups can’t guarantee high traffic initially, so they are starting to pay publishers to use their content, according to Alex Rosenberg, the CEO of Tako, an AI startup.
Tech companies are now paying for content and access in order to stay competitive, Rosenberg said.
Google, he added, doesn’t need to do that.
While media companies and AI startups are making deals, Google has not actively participated.
Google has indicated to publishers in private that, except for a $60 million deal with Reddit, it is not interested in negotiating, according to two sources who wished to remain anonymous.
Media companies don’t have much power in these discussions.
Google recently introduced AI Overviews, a feature that uses artificial intelligence to provide brief answers to user questions at the top of search results.
Publishers worried about how the answers might affect their website traffic and had no clear way to address those concerns.
Google has a separate crawler for certain AI products, like its chatbot Gemini.
But Googlebot serves both AI Overviews and Google search.
A company spokesperson said that Googlebot governs AI Overviews because AI and the company’s search engine are closely connected.
The spokesperson mentioned that their search results page displays information in different ways, such as images and graphics.
Google stated that publishers can prevent specific pages or parts of pages from showing up in AI Overviews in search results. However, this action would also probably stop those snippets from appearing in other Google search features, such as web link listings.
Many publishers rely on search engines for a large portion of their traffic and do not want to shrink their audience.
Marc McCollum, who leads innovation at Raptive, which represents publishers and influencers, says that Google’s stance does not fully acknowledge the danger this presents to content creators who depend on search visibility for their income.
Creators who choose to opt out may end up lowering their visibility in search results, potentially impacting their reach and earnings negatively.
Kyle Wiens, the CEO of iFixit, a website that provides free online repair guides for consumer electronics, said the site’s relationship with Google is more precarious than its relationships with other AI companies.
Wiens said in an email that he can block ClaudeBot, the crawler developed by generative AI startup Anthropic, from indexing the site without harming the business.
Blocking Googlebot, by contrast, could cost the site traffic and potential customers, he said.
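The difference Wiens describes comes down to which crawlers a site names in its robots.txt file. The following is a minimal sketch in Python, using the standard library's `urllib.robotparser`, of a file that refuses ClaudeBot while still admitting Googlebot. The rules and the example.com address are assumptions made for illustration, not iFixit's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: refuse Anthropic's ClaudeBot, admit Googlebot.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: Googlebot
Allow: /
"""

# Placeholder page address, used only for illustration.
PAGE_URL = "https://example.com/guides/phone-repair"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


claudebot_allowed = is_allowed(ROBOTS_TXT, "ClaudeBot", PAGE_URL)
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", PAGE_URL)

print(f"ClaudeBot allowed: {claudebot_allowed}")
print(f"Googlebot allowed: {googlebot_allowed}")
```

The check only reports what the file asks of each crawler. Whether a bot honors the request is up to its operator.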
Google’s agreement with Reddit, where many users discuss specific topics passionately, provides the company with valuable data for AI models.
The agreement coincided with changes Google made to increase the visibility of forum results in search, which led to a significant increase in traffic to Reddit.
A Reddit spokesperson mentioned that better product quality and speed have helped increase website traffic.
Perplexity, a search startup, is in talks with Reddit about using its content. However, a person familiar with the situation says that the price set by Google’s deal is too high for a startup to meet.
Google announced that the agreement with Reddit is a broad partnership that includes more than just training data.
The Reddit spokesperson refused to provide any comments about their business talks.
Perplexity declined to comment.
Other search startups have decided that the data is simply out of reach.
Vladimir Prelovac, the founder of a search startup called Kagi, said that it would take 20 years of the company’s current revenue just to pay Reddit.
He said he is not even considering that as a possibility.
Startup companies are not the only ones facing difficulties.
OpenAI recently released SearchGPT, a trial version of its popular chatbot designed for search.
Popular websites such as Amazon, Goodreads, and Uniqlo have blocked OpenAI’s GPT crawler from accessing their sites, which could create problems for the company’s search ambitions.
OpenAI said that websites can still show up in its search results even if they opt out of having their content used for AI training.
Prelovac stated that about 50% of Kagi’s expenses are for obtaining search data through crawling and other methods.
Having a comprehensive index is essential for a search engine to provide users with a thorough look at what’s on the internet.
Prelovac mentioned that the data becomes even more crucial for companies wanting to use AI models like ChatGPT to directly address users’ questions.
Prelovac said that generative AI models are not very intelligent by themselves.
Access to a search index is necessary for generating high-quality AI output, he said.
Richard Socher, founder of search startup You.com, stated that startups are compelled to make difficult decisions because of the common use of robots.txt files that provide crawling guidelines.
Because the files are not considered legally binding, Socher said, companies can scrape public data as long as doing so does not require log-in information or subscriber credentials.
He said You.com does its best to avoid putting too much strain on any website it crawls.
If a website’s robots.txt file lets only Google crawl it and no one else, that effectively entrenches Google’s dominance in search, he said.
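A robots.txt file of that kind can be sketched in a few lines. The Python example below, again using the standard library's `urllib.robotparser`, assumes a hypothetical file that admits Googlebot and turns away every other bot through a wildcard rule. "ExampleStartupBot" is an invented crawler name and the example.com address is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything,
# and the wildcard entry turns away all other bots.
GOOGLE_ONLY_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

# Crawler names to test; "ExampleStartupBot" is invented for illustration.
CRAWLERS = ["Googlebot", "bingbot", "ClaudeBot", "ExampleStartupBot"]


def admitted_crawlers(robots_txt: str, crawlers: list[str], url: str) -> list[str]:
    """Return the crawlers that the rules permit to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [name for name in crawlers if parser.can_fetch(name, url)]


admitted = admitted_crawlers(
    GOOGLE_ONLY_ROBOTS_TXT, CRAWLERS, "https://example.com/articles/"
)
print(admitted)
```

Under these assumed rules only Googlebot is admitted, so a newcomer that follows the file cannot add the site to its index.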
Neeva, a search company founded by former Google employees and later acquired by Snowflake Inc., last year advocated “crawl neutrality” to make it easier for startups to build their own search indexes.
Bloomberg reported that the Justice Department is exploring ways to address Google’s monopoly in online search, such as requiring the company to share data with competitors or possibly breaking it up.
One idea that has gained a lot of interest is making Google share the data it gathers with others or allowing its competitors access to its search index.
In the European Union, the Digital Markets Act requires Google to share certain search query data.
Wiens, the iFixit CEO, believes that Google’s search dominance gives it an advantage over other AI companies, which he sees as a key issue in the antitrust case.
He said separating Google’s search business from its AI work would help clarify things.
DuckDuckGo stated that changes in search technology are making the antitrust issues surrounding Google’s search index more troubling.
Kamyl Bazbaz, DuckDuckGo’s senior vice-president of public affairs, stated that search indexes are crucial in the era of generative AI.
TPM’s Ragazzo stated that no matter what happens with the antitrust case, it is crucial for publishers to manage their own direction and not depend too much on any single tech platform like Google.
Ragazzo stated that forming genuine relationships with readers is essential for creating a publication that can endure over time.