AI Scraper Pest
If you've recently found evidence in your web logs of a persistent scraper
hitting your content with no identifying user agent string, I have the
answer.
Artesian Solutions (https://www.artesian.co), an AI-powered sales
intelligence platform based in Boston, USA and Reading, near London, UK, have
unleashed a crawler that ignores "robots.txt" (it does not even request the
file) and is placing servers worldwide under heavy load.
The robot uses a standard web browser user agent string to mask its
activities:
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36"
While Artesian's website is hosted on Google Cloud, the rogue IP address to
look for in your web logs is 77.89.171.98. It is owned by Fluidata (now
known as FluidOne, https://www.fluidone.co.uk) but assigned to Artesian
Solutions as part of a 32-address block.
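To check your own logs for this block, something like the following Python sketch may help. It assumes the client IP is the first whitespace-separated field on each line (as in Apache's common and combined log formats) and uses the 77.89.171.96 - 77.89.171.127 range from the WHOIS record in this post, written as a /27. The function name `hits_from_block` is made up for illustration.

```python
import ipaddress

# 32-address block taken from the WHOIS record in this post
# (77.89.171.96 - 77.89.171.127), expressed as a /27.
ARTESIAN_BLOCK = ipaddress.ip_network("77.89.171.96/27")


def hits_from_block(log_lines, block=ARTESIAN_BLOCK):
    """Return the log lines whose client IP falls inside `block`.

    Assumption: the client IP is the first whitespace-separated field,
    as in Apache's common and combined log formats.
    """
    matches = []
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # skip blank lines
        try:
            client = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address (e.g. a hostname)
        if client in block:
            matches.append(line)
    return matches
```

Passing an open log file works as well as a list, since the function only iterates over lines; the log path itself depends on your server setup.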
Formal complaints to both Fluidata and Artesian have gone unanswered.
Blocking the address (or the whole block) via the ".htaccess" file is highly
recommended to save your bandwidth.
RIPE WHOIS record for the block:

inetnum: 77.89.171.96 - 77.89.171.127
netname: ASS010-80082
descr: Artesian Solutions IP Assignment
country: GB
admin-c: FAT7-RIPE
tech-c: FAT7-RIPE
status: ASSIGNED PA
mnt-by: FLUID-MNT
mnt-lower: FLUID-MNT
mnt-routes: FLUID-MNT
created: 2011-01-18T16:15:05Z
last-modified: 2011-01-18T16:15:05Z
source: RIPE

role: Fluidata Admin Team
address: 2 More London Riverside
address: London
address: SE1 2AP
abuse-mailbox: abuse@fluidata.co.uk
admin-c: CR2693-RIPE
tech-c: PD2904-RIPE
tech-c: DF3424-RIPE
tech-c: CR2693-RIPE
nic-hdl: FAT7-RIPE
created: 2006-04-08T19:51:04Z
last-modified: 2010-12-20T10:59:56Z
source: RIPE # Filtered
mnt-by: FLUID-MNT
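
As a rough sketch of how the inetnum range above could be turned into an ".htaccess" block rule: the Python below collapses the range into CIDR notation with the standard `ipaddress` module and prints a deny rule. It assumes Apache 2.4 with mod_authz_core and an AllowOverride setting that permits `Require` in ".htaccess"; older Apache 2.2 setups use different directives. The helper names `cidr_blocks` and `htaccess_rules` are made up for illustration.

```python
import ipaddress


def cidr_blocks(first, last):
    """Collapse an inclusive address range into the CIDR networks covering it."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


def htaccess_rules(blocks):
    """Build an Apache 2.4 (mod_authz_core) snippet denying the given networks.

    Assumption: the server runs Apache 2.4 and allows Require in .htaccess.
    """
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {block}" for block in blocks]
    lines.append("</RequireAll>")
    return "\n".join(lines)


# Range copied from the inetnum line of the WHOIS record.
print(htaccess_rules(cidr_blocks("77.89.171.96", "77.89.171.127")))
```

For this range the helper yields the single network 77.89.171.96/27, so the printed rule contains one `Require not ip` line covering all 32 addresses.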