

18 January 2019

AI Scraper Pest


Anonymous sends:

If you've recently found evidence in your web logs of a persistent scraper
of your content, one that sends no identifying user agent string, I have
the answer.

Artesian Solutions (https://www.artesian.co), an AI-powered sales
intelligence platform based in Boston, USA and Reading, near London, UK, have
unleashed a crawler that ignores "robots.txt" (it does not even request the
file) and is placing servers worldwide under great load.

The robot masks its activities by sending a standard web browser user agent
string:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36"

While Artesian's website is hosted on Google Cloud, the rogue IP address to
look for in your web logs is [77.89.171.98]. It is owned by Fluidata (now
known as FluidOne, https://www.fluidone.co.uk) but assigned to Artesian
Solutions as part of a 32-IP block.
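
To check your logs against the whole assignment rather than the single
address, something like the following Python sketch may help. It assumes an
access log in Apache's common or combined format, where the client address
is the first whitespace-separated field. The function name `hits_from_block`
and the sample log lines are hypothetical, made up for illustration. The
network is derived from the inetnum range in the WHOIS record reproduced
further down.

```python
import ipaddress
from collections import Counter

# Range taken from the WHOIS record (inetnum: 77.89.171.96 - 77.89.171.127).
FIRST = ipaddress.ip_address("77.89.171.96")
LAST = ipaddress.ip_address("77.89.171.127")

# This aligned 32-address range collapses to a single CIDR network (a /27).
BLOCK = next(ipaddress.summarize_address_range(FIRST, LAST))


def hits_from_block(lines, block=BLOCK):
    """Count requests per client IP for addresses inside `block`.

    Assumes the client IP is the first whitespace-separated field of
    each line, as in Apache's common and combined log formats.
    """
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # blank line
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address (e.g. a hostname)
        if addr in block:
            counts[str(addr)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not real log data.
    sample = [
        '77.89.171.98 - - [18/Jan/2019:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
        '203.0.113.7 - - [18/Jan/2019:10:00:01 +0000] "GET / HTTP/1.1" 200 512',
        '77.89.171.98 - - [18/Jan/2019:10:00:02 +0000] "GET /a HTTP/1.1" 200 90',
    ]
    for ip, n in hits_from_block(sample).most_common():
        print(ip, n)
```

On a real server you would pass an open log file instead of the sample list,
for example `hits_from_block(open("access.log"))`, where the path is whatever
your host uses.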

Formal complaints to both Fluidata and Artesian have gone unanswered.

Blocking via the ".htaccess" file is highly recommended to save your
bandwidth.
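
As a rough sketch of what such a block could look like, the Python snippet
below prints deny rules covering the whole 32-IP assignment. Which syntax
applies depends on your server: "Require not ip" inside "<RequireAll>" is the
Apache 2.4 form, while "Order"/"Deny from" is the older 2.2 form. The
function name `htaccess_rules` is hypothetical, and it is an assumption that
your host permits these directives in ".htaccess" at all.

```python
import ipaddress

# The /27 equivalent of the WHOIS range 77.89.171.96 - 77.89.171.127.
BLOCK = ipaddress.ip_network("77.89.171.96/27")


def htaccess_rules(network, apache_24=True):
    """Return .htaccess text that denies `network` and allows everyone else."""
    if apache_24:
        # Apache 2.4 authorization syntax (mod_authz_core).
        lines = [
            "<RequireAll>",
            "    Require all granted",
            f"    Require not ip {network}",
            "</RequireAll>",
        ]
    else:
        # Apache 2.2 syntax: Allow is evaluated first, then Deny overrides it.
        lines = [
            "Order Allow,Deny",
            "Allow from all",
            f"Deny from {network}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(htaccess_rules(BLOCK), end="")
```

Blocking the full /27 rather than only 77.89.171.98 is a design choice made
here on the assumption that the crawler could move to a neighbouring address
in the same assignment; pass a single address instead if you prefer the
narrower rule.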



inetnum:        77.89.171.96 - 77.89.171.127
netname:        ASS010-80082
descr:          Artesian Solutions IP Assignment
country:        GB
admin-c:        FAT7-RIPE
tech-c:         FAT7-RIPE
status:         ASSIGNED PA
mnt-by:         FLUID-MNT
mnt-lower:      FLUID-MNT
mnt-routes:     FLUID-MNT
created:        2011-01-18T16:15:05Z
last-modified:  2011-01-18T16:15:05Z
source:         RIPE

role:           Fluidata Admin Team
address:        2 More London Riverside
address:        London
address:        SE1 2AP
abuse-mailbox:  abuse@fluidata.co.uk
admin-c:        CR2693-RIPE
tech-c:         PD2904-RIPE
tech-c:         DF3424-RIPE
tech-c:         CR2693-RIPE
nic-hdl:        FAT7-RIPE
created:        2006-04-08T19:51:04Z
last-modified:  2010-12-20T10:59:56Z
source:         RIPE # Filtered
mnt-by:         FLUID-MNT