Amazon reportedly investigating Perplexity AI after accusations it scrapes websites without consent

Amazon Web Services has started an investigation to determine whether Perplexity AI is breaking its rules, according to Wired. To be precise, the company’s cloud division is reportedly looking into allegations that the service is using a crawler, hosted on its servers, that ignores the Robots Exclusion Protocol. This protocol is a web standard under which developers place a robots.txt file on a domain with instructions on whether bots can or can’t access a particular page. Complying with those instructions is voluntary, but crawlers from reputable companies have generally respected them since web developers started implementing the standard in the ’90s.
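For readers unfamiliar with how a compliant crawler consults robots.txt, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the bot names (`ExampleBot`, `OtherBot`) and the example.com URLs are hypothetical and are not taken from Wired, Perplexity or any other site mentioned here; the sketch only illustrates the voluntary check the protocol relies on.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("ExampleBot", "https://example.com/article"),
        ("OtherBot", "https://example.com/article"),
        ("OtherBot", "https://example.com/private/page"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent} -> {url}: {verdict}")
```

Nothing in the protocol enforces the result: a crawler that never runs a check like this, or that disregards the answer, can still request the page, which is why compliance is described as voluntary.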

In an earlier piece, Wired reported that it discovered a virtual machine that was bypassing its website’s robots.txt instructions. That machine was hosted on an Amazon Web Services server using the IP address 44.221.181.252 that’s “certainly operated by Perplexity.” It reportedly visited other Condé Nast properties hundreds of times over the past three months to scrape their content, as well. The Guardian, Forbes and The New York Times had also detected it visiting their publications multiple times, Wired said. To confirm whether Perplexity truly was scraping its content, Wired entered headlines or short descriptions of its articles into the company’s chatbot. The tool then responded with results that closely paraphrased its articles “with minimal attribution.” 

A recent Reuters report claimed that Perplexity isn’t the only AI company that’s bypassing robots.txt files to gather content used to train large language models. However, it seems like Wired only provided Amazon with information on Perplexity AI’s crawler. “AWS’s terms of service prohibit abusive and illegal activities and our customers are responsible for complying with those terms,” Amazon Web Services told us in a statement. “We routinely receive reports of alleged abuse from a variety of sources and engage our customers to understand those reports.” The spokesperson also added that the company’s cloud division told Wired it was investigating information the publication provided as it does all reports of potential violations. 

Perplexity spokesperson Sara Platnick told Wired that the company has already responded to Amazon’s inquiries and denied that its crawlers are bypassing the Robots Exclusion Protocol. “Our PerplexityBot, which runs on AWS, respects robots.txt, and we confirmed that Perplexity-controlled services are not crawling in any way that violates AWS Terms of Service,” she said. Platnick told us that Amazon looked into Wired’s media inquiry only as part of a standard protocol for investigating reports of abuse of its resources. The company had apparently not heard from Amazon about any type of investigation before Wired contacted it. Platnick admitted to Wired, however, that PerplexityBot will ignore robots.txt when a user includes a specific URL in their chatbot inquiry.

Aravind Srinivas, the CEO of Perplexity, also previously denied that his company is “ignoring the Robot Exclusions Protocol and then lying about it.” Srinivas did admit to Fast Company that Perplexity uses third-party web crawlers on top of its own, and that the bot Wired identified was one of them.

Update, June 28, 2024, 2:20PM ET: We have updated this post to add Perplexity’s statement to Engadget.

Update, June 28, 2024, 8:27PM ET: We have updated this post to add a statement from Amazon Web Services.

This article originally appeared on Engadget at https://www.engadget.com/amazon-investigating-perplexity-ai-after-accusations-it-scrapes-websites-without-consent-133003374.html?src=rss
