AI Generated Photographs

The Old Ruminator

I live in a world where it's now hard to define what is real anymore. I have had numerous calls from "my bank" this week telling me that my account had been hacked (my mate has just lost £7,000, but got it refunded). Yesterday's call asked for me by name and named my bank. Other calls have my local code. I was so worried that I paid the bank a visit. Yes, all scams, and I was told to ignore everything. So how do I know a real one?

So onward to photographs, and a similar trend. The internet is awash with fake images from Hubble etc., such as this one.



Total nonsense of course, but the trend is now spreading to caving images. Some are so perfect in every way that they cannot be "real". Rather like ChatGPT text, which, when submitted as student test answers, mostly went undetected. So we say "wow", or that dreadful word "stunning", when seeing some of these fantastic caving images. Sadly they are exactly that, but also utterly false. I do minimal editing and don't shoot in RAW. Why push the image beyond the bounds of reality? Of course I can't show specific caving images here without upsetting somebody, but you will recognise them as they pop up on social media. So how far do we go with it? Ultimately the camera may decide, as they become more intuitive and start to reflect the way images are presented. The AI will learn what you like best to see and give you that result.
 
The state of AI and AI-assisted CGI now means you can't believe anything. I watched a video on YouTube yesterday showing a very large aircraft taking off from an unfeasibly short runway and just avoiding skimming the sea before generating sufficient lift. Apart from the infeasibility, the only clue to it being fake was that the very realistic waves on the sea weren't breaking in the right way when they hit the cliff (I only spotted that because I'm a sea kayaker and have watched many real waves!).

The same thing probably happened with radio and fake voices. Society will adapt eventually, but much damage may occur before that happens. Ways for sources to be trusted need to re-evolve, as technology has moved too fast. This hasn't been helped by some of our politicians taking advantage of the public's trust and normalising blatant lies.
 
Speaking from the perspective of someone who works at an institute where such things are developed, used and pushed to the next level... generative AI is fascinating.

We previously thought human creativity was the last bastion, but it turns out that with enough data an algorithm can learn the “rules” simply by being shown sequences of things and what came next. Unlike traditional machine learning, we are not giving the algorithm a specific target and some predictors to learn the relationship between; we are just getting it to learn patterns discovered in the dataset by trawling through it. Such models are called “foundation models”.

For large language models, they are fed huge quantities of text and learn the rules of “language”. This is why LLMs generate one word at a time: that is how they were trained, on a sequence of words and the word that came next. It turns out you can use the same technique in a huge number of areas beyond text; images are just one, and there are many others at the cutting edge which will appear soon. But because LLMs only learn the rules of language, they never learn what is realistic or correct, just which words are related to which other words. This is where we get the concept of hallucinations from.
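To make that concrete, here is a toy sketch in Python of what those (sequence, next word) training pairs look like. The whitespace "tokeniser" here is just a stand-in for the subword tokenisers real LLMs use, and the sentence is mine:

    # Toy sketch of next-word training pairs. Real LLMs use subword
    # tokenisers and billions of documents; this only shows the shape
    # of the data the model learns from.
    text = "the cave passage opens into a large chamber"
    words = text.split()  # crude whitespace "tokeniser"

    # Each training example is (all the words so far, the word that followed).
    pairs = [(words[:i], words[i]) for i in range(1, len(words))]

    for context, target in pairs:
        print(" ".join(context), "->", target)

The model's entire job during training is to get better at predicting the right-hand column from the left; everything else is emergent.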

An example of a hallucination in the context of text is a model response which is factually incorrect, or in an image it could be something that cannot exist (think 6 fingers). Now if you know the subject area, you can identify these things. But a lot of people won’t, hence they are great for misinformation.

The thing that is going to get really interesting is that these things naturally hallucinate (hallucinations are part of the normal functioning of the model), and they can churn out so much real-seeming data onto the internet that we are polluting the very source of the information such algorithms are trained on. Over time, hallucinations which were once oddities will be repeated, and errors and misinformation will be repeated too. The same will occur with bias, racism and all of the other nasty stuff that exists on the internet.

These things will progress further as we fire more compute at them, but most likely there will be a point where they begin to deteriorate unless we develop new algorithms.

So when it comes to generated images, there will be small telltale signs that signify hallucinations; they can just be very hard to find. For example, with your Hubble picture, any astronomer who has used those images will know for a fact that Hubble can't produce images like that (sensor artefacts and the like are usually present, such as the streaks coming from stars).
 
Very interesting but also worrisome. Here is another from the AI Space Factory. Supposedly the Earth from the ISS.



Lots of supposed Ukrainian battle films online now, but some are obviously war games. Will we truly arrive at a point where we don't know what is real anymore? Lots of perfect ladies popping up now (literally); the old brush-up for magazines has morphed into something far more frightening. Probably the easiest way to fake a caving image is to mess around with scale by adding a tiny figure to a passage view, thus making the passage appear huge. My selfies will soon be my head on top of a super-fit 30-year-old (or maybe that's happened already).
 
So that image you posted is another great example. Any astronaut would look at that and pick out engineering that doesn't exist, or a geographer would look at the Earth and see a made-up landmass that they don't recognise.

I wonder, with generated caving images, what hallucinations we might be able to pick out that non-cavers would just believe were real. If you want to try it out, you can use the DALL·E models from OpenAI, which generate synthetic images from text (edit: you used to be able to access them for free online; now they are part of a paid ChatGPT service).
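For anyone wanting to experiment programmatically rather than through ChatGPT, something like the following works with OpenAI's Python package at the time of writing; the model name and interface may change, it needs a paid API key, and the prompt is just my example:

    # Minimal sketch: generate a synthetic image from text with OpenAI's
    # image API. Assumes `pip install openai` and an OPENAI_API_KEY
    # environment variable tied to a paid account.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment

    response = client.images.generate(
        model="dall-e-3",  # image model name at time of writing; may change
        prompt="two cavers in a vast limestone chamber lit by headlamps",
        n=1,
        size="1024x1024",
    )

    print(response.data[0].url)  # temporary URL of the generated image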
 
Although people do exist with 6 fingers
Absolutely, but unless you ask for it, the model is generating that as its “typical” impression of a human hand that comes attached to a human body, learned from data. And multiple fingers are not that common.

That’s why, when it comes to generating people, that image of a person could honestly have a real-life counterpart, but it would be purely coincidental. It is easier to spot hallucinations in things we know are absolute, like the shape of the Earth’s landmasses or the shape of the ISS. For people it is harder, but with weird things like multiple fingers or missing limbs, you can be more confident than not that it is fake.

We have to remember, these algorithms are not aiming to recreate something that exists in reality; they aim to create new content which can coincidentally be similar to reality. So fundamentally everything that comes out of them is “fake”, but it can coincidentally marry with reality. We can have debates about how we define “fake”.
 
Edwardov, would you say that the models currently given publicity are a dead end? Should we be building models that learn to use logic, rather than replicating patterns in the data they're trained on?
 
Currently, we seem to be progressing by throwing more and more data and compute at problems to extract the “knowledge” captured in the data. In essence this is very “brute force”, and it will hit physical limits unless we figure out a smarter way of doing things.

While generative AI has blown everyone away, it is actually a very simple concept. Give a big enough model (big enough to learn enough patterns) enough pairs of a sequence of words (tokens in LLM land) and what comes next, and it can learn the rules of language: what naturally follows what, from lots of examples. What is amazing is that from this basic concept you get weird emergent behaviour where you can converse with it and ask questions to get plausible answers. What made it possible was a lot of compute and data.

But really, the area is advancing a lot and LLMs seemingly came out of nowhere, so who knows. There is a general feeling, though, that everyone is rushing down the brute-force route just to ride the hype wave, to the detriment of the environment (if you want some shocking numbers, look up the energy consumption of a single training run of an LLM).

I believe the next big thing that will really advance us is causality. Data only captures correlation. We can figure out what things might be related to other things, but with lots of data the chance of a relationship being purely random increases. Figuring out what causes what in data will really cause a stir, because then you can have models based on reality, not just on correlation in which reality has been muddled. Plus, we could then better understand why a model has done something; this currently eludes us, especially for these very complicated models where it can be impossible.
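As a toy illustration of why correlation alone is such shaky ground: two completely independent random walks will very often "correlate" strongly, which is exactly the sort of spurious relationship a purely data-driven model can latch onto. The numbers here are my own made-up simulation:

    # Two independent random walks frequently show a strong "correlation"
    # despite having no causal connection whatsoever.
    import numpy as np

    rng = np.random.default_rng(42)
    strong = 0
    for _ in range(100):
        a = np.cumsum(rng.standard_normal(500))  # random walk 1
        b = np.cumsum(rng.standard_normal(500))  # random walk 2, independent of 1
        if abs(np.corrcoef(a, b)[0, 1]) > 0.5:
            strong += 1

    print(strong, "of 100 unrelated pairs had |correlation| > 0.5")

A model trained only on observational data has no way to tell these apart from genuine causal relationships; that is the gap causal methods aim to close.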
 
Quite apart from the disturbing trend that it's becoming impossible to tell what's true and what isn't, I am concerned about the huge amount of processing power and raw data being used to generate nonsense. All these servers and data farms exist in the real world, built from real (often scarce) materials, and consume vast amounts of power and water. Yes, much of the power may be from renewable sources, but it's being diverted from more useful applications, like heating homes, say.
Just something to think about when you're playing 'AI Wezzit'?
 
Yes, there is a real ethical dilemma emerging for those who develop these systems. Previously, model performance trumped any environmental concerns, but now there is growing awareness of the environmental damage, so you have to pick your problems and gauge whether the solution is worth it as part of your ethics assessment. At least companies are now moving towards GPUs, which are far more efficient than CPUs for this work, and regulation may come to require the publication of environmental impact and energy usage, as some companies are doing already.
 
just that RAW, being more easily edited, allows more latitude for embellishments
But most of the controls in Adobe Camera Raw (which I use) simply affect the colour, brightness and all the usual stuff. There are a few 'erase' or 'heal' tools, but they're for removing spots and glitches, not faking photos. I think there's a real 'wrong end of the stick' issue here. Granted, in Photoshop itself there are now a few AI-based features that would allow you to 'fake' photos, but whether the original was shot in RAW or not would make zero difference to that - it would just be better quality.

RAW formats simply allow a much greater dynamic range of brightness and colour in the captured photo - that's it. So difficult shots, in terms of bright lights and dark shadows, are now easier to recover. As a photographer, I want my output to look as good as possible - that just seems obvious to me. For example, it would be very difficult indeed to do something like this without shooting in RAW.

_IMG0765.jpg
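For the technically curious, the extra latitude is mostly bit depth: a 12- or 14-bit RAW file keeps tonal detail in the shadows that an 8-bit JPEG has already thrown away. A rough sketch of shadow recovery using the rawpy library (my choice of library; the filename and settings here are hypothetical, not the poster's workflow):

    # Sketch: recover shadow detail from a RAW file's extra bit depth.
    # Assumes the rawpy and imageio libraries (pip install rawpy imageio);
    # the input filename is hypothetical.
    import rawpy
    import imageio.v3 as iio

    with rawpy.imread("IMG0765.CR2") as raw:
        rgb = raw.postprocess(
            use_camera_wb=True,   # keep the camera's white balance
            no_auto_bright=True,  # stop LibRaw auto-exposing the result
            exp_shift=2.0,        # +1 stop of linear exposure to open shadows
            output_bps=16,        # 16-bit output for further editing headroom
        )

    iio.imwrite("recovered.tiff", rgb)  # lossless 16-bit TIFF

Notice there is nothing here that could 'fake' content: every control is a global tone or colour adjustment of pixels the sensor actually captured.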
 
Other calls have my local code.
Just going back to this in your OP. The phone number displayed on your phone means absolutely nothing: it's very easy to fake, and there are mobile apps that do it. Scammers tend to use a local code because they think you're more likely to assume it's somebody you know. They even do it with mobile codes, which is a straight giveaway that they're calling from a country that uses geographical mobile codes, unlike the UK.
 
Although people do exist with 6 fingers
Indeed. The sister of one of my former students was born with six digits on each hand, but had one digit surgically removed from each hand as a child. As I was teaching her sister genetics, the family history of polydactyly made an interesting, if slightly unclear, topic.

I used to joke that Gary Sobers would have been an even better cricketer if he had not cut his own extra fingers off.
 