langcliffe
Well-known member
A forum member recently commented that having to click a dummy image after following a direct link to an image in the BCRA Online Archive is inconvenient. To address this concern, I'll explain the reasons behind this implementation.
The BCRA Online Archive houses nearly 40,000 images, totaling over 70 gigabytes. That is more than a standard web server can accommodate, so the images are kept in cloud storage. When an image is requested, the archive sends a request to the cloud, downloads the image to the archive's own cache, and serves it from there.
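For anyone curious how a fetch-then-cache flow like this behaves, here is a minimal sketch in Python. It is not the archive's actual code: the class name `ReadThroughCache`, the `fake_cloud` function, and the in-memory dictionary are all assumptions made for illustration. It only shows that a repeated request for the same image is served from the cache without a second transfer from the cloud.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve images from a local cache, fetching from the cloud only on a miss."""

    def __init__(self, fetch_from_cloud: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_cloud      # hypothetical cloud download function
        self._cache: Dict[str, bytes] = {}  # stand-in for the archive's cache
        self.bytes_transferred = 0          # data pulled from the cloud so far

    def get(self, image_id: str) -> bytes:
        if image_id not in self._cache:
            # Cache miss: download once and remember the result.
            data = self._fetch(image_id)
            self.bytes_transferred += len(data)
            self._cache[image_id] = data
        return self._cache[image_id]


def fake_cloud(image_id: str) -> bytes:
    """Pretend cloud store: every image is 1000 bytes (illustrative only)."""
    return b"x" * 1000


cache = ReadThroughCache(fake_cloud)
first = cache.get("survey-001")   # fetched from the "cloud"
second = cache.get("survey-001")  # served from the cache
print(cache.bytes_transferred)    # 1000, not 2000
```

The point of the design is that only cache misses count towards the data transferred from the cloud, which is why large numbers of requests for different images are the expensive case.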
A key factor in the BCRA's cloud storage contract is the amount of data transferred. Automated scraping, which is becoming more prevalent as AI knowledge bases are built, could push the transfer volume high enough to make the costs unviable, so measures have been implemented to deter it.
To thwart automated scraping, the archive software incorporates several features. One is a CAPTCHA-like measure: anyone arriving at an image via a direct link must click a dummy image before the real one is shown. This requires human interaction and helps prevent automated bots from scraping the archive. While it may be a minor inconvenience, it is a necessary step to ensure the archive remains available.
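As a rough illustration of how a click-through gate of this kind can work, here is a sketch in Python. It is only an assumption about one possible approach, not a description of the archive software: the function names, the signed-token scheme, and the five-minute lifetime are all hypothetical. A direct link with no valid token gets the dummy image; clicking it would send back a short-lived signed token, and only then is the real image served.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret
TOKEN_LIFETIME_SECONDS = 300          # assumed validity window


def _sign(image_id: str, expires: int) -> str:
    message = f"{image_id}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(image_id: str, now: float) -> str:
    """Token embedded in the dummy image's click target."""
    expires = int(now) + TOKEN_LIFETIME_SECONDS
    return f"{expires}:{_sign(image_id, expires)}"


def token_is_valid(image_id: str, token: str, now: float) -> bool:
    try:
        expires_text, signature = token.split(":", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    if now > expires:
        return False  # token has expired
    expected = _sign(image_id, expires)
    # Constant-time comparison of the supplied and expected signatures.
    return hmac.compare_digest(signature.encode(), expected.encode())


def handle_direct_link(image_id: str, token: Optional[str] = None,
                       now: Optional[float] = None) -> str:
    """Return which image would be served for this request."""
    now = time.time() if now is None else now
    if token is not None and token_is_valid(image_id, token, now):
        return "real image"
    return "dummy image"
```

In this sketch a token is tied to one image and expires after a few minutes, so a bot cannot collect one token and reuse it across the whole archive. A bot that simulates the click would still get through, which is presumably why a measure like this is only one of several.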