AI Image Generation

A place where you can chat about anything that isn't to do with games!
User avatar
orange808
Posts: 3213
Joined: Sat Aug 20, 2016 5:43 am

Re: AI Image Generation

Post by orange808 »

Well, I've seen roop in action today. The end is nigh for artists; it won't be long before this evolves even further. It's scarily close to reproducing almost any character with very little manual work: just add concept art to train the model.

Getting faces right is one of the things machine learning has struggled with. What strikes me is how consistent the model now is at generating people across multiple poses and actions, because roop forces a face onto the character (replacing a face that was already very close). It's not ready to produce finished comic books, art assets, or commercial art just yet (or to handle multiple characters or beasts), but the end is nigh. I shouldn't be surprised, yet I can't shake the feeling of shock.

We already know that resizing images evades detection, so I'm not expecting anyone to save us with tech. I'd say that being able to hack code will save a person, but the machine is getting better at that, too. I wonder if this will make a pleb indie like me obsolete?

Can't stop it. I just hope the tools don't all get locked behind paywalls. If we're going to automate, it would be nice if it leveled the playing field, versus becoming part of some proprietary framework that's unavailable or charges a lot of money. Cold comfort, but it's better than the boss and the shareholders getting even richer off our misfortune. A lot of people are going to be squeezed out really soon.
We apologise for the inconvenience
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

I've mentioned it in the apocalypse thread, but I'll reiterate it here too:

OpenAI has a robotics product coming soon, to be revealed this summer: an android with a language model, called Neo, built by 1X. (Mandatory meme link on that idea.)

The language models, as crude as they are, are a massive breakthrough. Just having a language-to-action interface, being able to come up with answers to "ought"-style questions... these were completely intractable problems before. The language models make what was impossible possible. Calling them "proto-AGI" isn't completely unfounded, I think.
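For what it's worth, a "language-to-action interface" can be sketched in a few lines: the model emits text, and a thin parser maps recognized verbs onto callable actions. This is a minimal sketch only; the action names and the canned "model reply" are invented for illustration.

```python
# Toy language-to-action interface: a "model" emits text commands,
# and a dispatcher maps each recognized verb onto an executable function.
# The command vocabulary and the canned reply below are made up.

def move_to(target):
    return f"moving to {target}"

def pick_up(item):
    return f"picking up {item}"

ACTIONS = {"move_to": move_to, "pick_up": pick_up}

def run_plan(model_output):
    """Parse 'verb arg' lines from the model and execute each known action."""
    log = []
    for line in model_output.strip().splitlines():
        verb, _, arg = line.partition(" ")
        if verb in ACTIONS:  # silently skip anything the model hallucinates
            log.append(ACTIONS[verb](arg))
    return log

# Pretend the language model answered a "how do I restock shelf 3?" question:
reply = "move_to shelf_3\npick_up box_7\nfly_to moon"
print(run_plan(reply))  # ['moving to shelf_3', 'picking up box_7']
```

The unknown `fly_to` verb is dropped on the floor, which is exactly the kind of guard a real robot controller would need around a model's output.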

They're going to be rather crap in the beginning (I don't think one can even deadlift a mere 100 lbs), but this is as bad as they'll ever be. Self-driving cars have the same issue as kill bots: since the risk is killing people, they have to be almost perfect before anyone trusts them. That's a problem labor and non-lethal bots don't have; dropping a box doesn't have quite the same consequences as a fifty-car pile-up on the interstate.

And of course on the instrumental goal of power-seeking: the machines won't even have to pull a fast one on us to assume control. Labor, war and companionship (that's also a euphemism that includes 'sex'): they'll be doing everything. We'll completely disempower ourselves, by ourselves. (That's a callback to a movie quote: "I say your civilization, because as soon as we started thinking for you it really became our civilization, which is of course what this is all about.")



I remember a poster in another thread here assumed another "AI winter" was inevitable, and I disagreed, since things finally seemed to be approaching the point of actually being useful for something. I had internalized what doubling means: useless at first, eventually superhuman. The Mother Jones gif is a classic illustration:

Image

There's no reason that lake can't keep overfilling for some time, hence why every nerd conflates reaching AGI with reaching superintelligence; we'll surpass human-level systems very quickly. (And of course an AGI system is effectively superhuman to begin with, since it can't get bored, can process faster... and can be much more specialized. You can't exactly scoop out the entirety of your motor functions and dedicate them to something else.)
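The doubling arithmetic behind that gif is easy to check for yourself: a quantity that doubles each step looks negligible for most of its run, then overwhelms everything in the last couple of steps. The capacity figure below is arbitrary.

```python
# Exponential doubling: a quantity that doubles each step spends most
# of its history looking like nothing, then explodes at the end.
capacity = 1_000_000
amount, steps = 1, 0
while amount < capacity:
    amount *= 2
    steps += 1
# One step before overflowing, the "lake" was at most half full.
print(steps, amount)  # 20 1048576
```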

This was back when the state of the art was low-res images of birds and flowers, mind you. I don't think This Person Does Not Exist had come out yet. So it wasn't unreasonable at the time to think things would stall out.
User avatar
orange808
Posts: 3213
Joined: Sat Aug 20, 2016 5:43 am

Re: AI Image Generation

Post by orange808 »

BryanM wrote:I've mentioned it in the apocalypse thread, but I'll reiterate it here too:

OpenAI has a robotics product coming up soon, to be revealed this summer. An android with a language model (Mandatory meme link on that idea.), called Neo by 1X.
I could be wrong, but both domestic robotics and virtual reality make me giggle. Sounds like poor OpenAI has a case of the Bushnell Robot Fever (similar to Cook/Zuck VR Fever). :lol:

Waste of time and money. :lol:

Bushnell managed to seed so many valuable future technologies, and he fell in love with the least immediately feasible one. What a shame.

Language models are nice enough, I suppose. But a domestic robot has to do more than speak. I can hook a language model up to cell phones with a simple thin-client app and provide conversation. I don't see why I need an expensive, creepy mannequin on roller skates to listen through a microphone and talk through a speaker. ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯ Add a camera and put Max Headroom on the boob tube if you want to look at a face and believe it's looking at you.

I'm sorry, but useful augmented reality and domestic humanoid robotics are very distant dreams. I don't know what Mother Jones was using as a measurement, but those technologies require unbelievably powerful, compact computing (and an unprecedented battery breakthrough) to become feasible. We don't even understand the computing power or process behind consciousness and thought, so I fail to see how we can compare our brains to silicon. There's no baseline measurement.

Maybe OpenAI is making a smart patent play. Later on, someone else will solve the robotics problems in different and better ways once the tech is actually feasible. That's when OpenAI can leap in with an overly broad, ancient patent and demand a ridiculous ransom for their useless, outdated work. That would be particularly ironic for a company with "Open" in the name, but not unexpected.
We apologise for the inconvenience
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

1X's current robot is already rolling around in some places as a mall cop. It's a horrible beast to gaze upon: a centaur of a human forever fused to a Segway, with clamps for hands. And that face; please stop givin' robots faces, pliz. The Neo is a light-labor unit meant to be used as a stockboy or whatever. It's normal-lookin', at least.

You're thinking of the colloquial, everyday use of the word "language": a dead, useless thing, instead of the gigachad arbitrary tool we programmers know it to be. It's one of the mind's control centers, not just mouth noises to kill time with. The Minecraft bot driven by a language model writing command subroutines is an example of the kinds of things it's capable of in the physical world.

Navigating a body. Understanding directions. Forming a list of goals. Ticking off a list of goals.

Nobody is buying these things for their homes until they have human skin and fat on them and appear and behave like a human. Industry, and then war, will come first.

There's no utility in having a TV screen strapped to your face, or in the crypto/NFT "innovation" of open ledgers. But noting that there are very stupid people and scam artists in the world isn't an argument or a rebuttal; it's an excuse to shut your brain off and not engage with possibilities one finds uncomfortable. The clowns Zuck and Musk are great representatives of capitalists, but they're neither nerds nor engineers.

There is a benefit in firing employees and replacing them with machines. There's a break-even point; the machines will continue not to be worth the cost and upkeep until they are. The transition will be very rapid once that point is passed; that's how things work.
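The break-even point is just arithmetic. A toy sketch, with every figure invented for illustration:

```python
# Toy break-even calculation for replacing labor with a machine.
# All figures are invented for illustration only.
unit_cost = 150_000          # up-front price of the robot
annual_upkeep = 10_000       # maintenance, power, downtime
annual_labor_saved = 40_000  # wages the robot displaces per year

net_saving_per_year = annual_labor_saved - annual_upkeep
years_to_break_even = unit_cost / net_saving_per_year
print(years_to_break_even)  # 5.0
```

Shave the unit cost or raise the displaced wages and that number collapses fast, which is the "very rapid transition" point.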



... ah, this is nostalgic, really. People used to say the internet would never catch on, and the Watchtower ran many articles warning innocent Jehovah's Witnesses about the risks of internet pornography. (I can't find the jpg of the article with the '70s guy with the mustache, someone halp ;_;) Something's finally happening again; the reactionary pushback is comforting...
User avatar
orange808
Posts: 3213
Joined: Sat Aug 20, 2016 5:43 am

Re: AI Image Generation

Post by orange808 »

In case you misunderstood: I completely understand developing purely industrial robotics for controlled work environments with a dedicated use case. I believe in robots, but not androids.

As for machine learning models "thinking": they don't do any of that; at least, not to my eye. A model is navigating decision trees in a very optimised and elegant way, but it's still working through trees. It's still not thinking. Not really. ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯

As for a completely open ledger, it could have utility if people could handle it. Unfortunately, most people do things they don't want others to know about and that other people won't understand. Then again, I probably wouldn't be happy if my significant other were hiring a sex worker or cheating on me; I wouldn't understand. Anyhow, if we could see what everyone earns and pays out, it would really hurt the masters of the universe. Super PACs with no secrets would be a terrible inconvenience for the right. Absolute transparency would have benefits. It's never going to happen, though.

Then again, it's not really that big of a step for plebs. Everything you and I purchase is already mostly stored in an open ledger of sorts--and the data is getting more accurate by the day. So we get the disadvantages without any of the perks for society. Only the lucky rich few get privacy.
We apologise for the inconvenience
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

A twitter conversation about how illustrators know deep down that they're fucked.
orange808 wrote:I completely understand developing purely industrial robotics for controlled work environments with a dedicated use case. I believe in robots, but not androids.
We're moving into a phase where machines built for human spaces are starting to be viable.

Take those robot-arms-on-a-rail chefs. They take up space, they have to be bolted to the ceiling, and what they're able to do is very limited. In contrast, an android is limited solely by its mind; it's not tethered to a four-foot rail.

Which is more worth $150,000 is pretty clear. They have to at least be a force multiplier for human overseers before they can start taking physical jobs, which are hard for machines to do.

"Sim2Real" research has been focusing a lot on human warehouses, as opposed to the made-for-Roombas warehouses Amazon uses.
orange808 wrote:About machine learning models that are "thinking", it doesn't do any of that; at least, not to my eye. It's navigating decision trees in a very optimised and elegant way, but it's still working through trees using a model. It's still not thinking. Not really. ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯
It's navigating a messy latent space with tens of thousands of dimensions. "Optimized" and "elegant" are some of the last words I'd personally use to describe a gigantic opaque list of numbers pooped out of a meat grinder.

I might have said this here before, but our brains aren't one cohesive engine. They're built out of hundreds or thousands of nodes, each with its own functions and goals; each a sub-intelligence that contributes to the gestalt whole. Is the part of your brain that thinks up words more intelligent than your brainstem?

The only decent objective definition of "what is consciousness" (not qualia, which is a whole other kettle of philosophy) I've been able to come to is "controlled hallucination."

The sensory feed we're constantly processing while awake? That's the bulk of being alive and what a brain handles.

The stuff we run through our heads: rotating shapes, making up words, stories, music, pictures, movies, games, etc.? Our brains... are absolute garbage at this. It's always the first thing I point out when giving a basic "how to illustrate crap" tutorial: you can't see the finished product perfectly all at once in your head; you have to start with broad layouts and postures, and refine the details as you go. (It makes me feel like the guy in the Mario 64 A-button videos explaining parallel universes.)

But I certainly feel more "conscious" when processing that kind of information, than I do about the neurons that work on making my hand grab a glass of water. The unconscious is the domain of the solved and automatic. The conscious is the domain of the unsolved and fun. (I can kind of see why shadow people dislike thinking. It is a simple way to live, to reduce oneself to an animal... and apparently an efficient use of a brain. It really is kind of creepy to think we're like two different species: docile cattle, and ADHD maniacs at the extreme ends.)

Anyway, what I'm trying to get at is that it's a matter of degree and type, and to try to keep an open mind. In moments where it's reacting to a stimulus and producing an output by running through its latent space, I think that is a kind of "thinking". Lesser and alien than ours, for sure; stripped of so many functions, completely absent so much understanding; but it's something. Maybe.

To dismiss them entirely is to believe our own minds are neither physical nor computational.
User avatar
Lander
Posts: 891
Joined: Tue Oct 18, 2022 11:15 pm
Location: Area 1 Mostly

Re: AI Image Generation

Post by Lander »

Algebraic data structures as we understand them - lists, trees, and so on - are quite trivial compared to the malleable composite representations used to drive these megascale cloud-edge-whatever AI data mungers.

I'm reminded of a slide from a Simon Peyton Jones lecture about building a usable (yet useless *scholarly wink*) pure functional language; a graph with effort on one axis and abstraction on the other, effectively going from ye gods how through hack hack hack and terminating at wow! why was that ever a problem?

If an average human mind can hold four-ish facts in active reasoning memory before reaching overload and needing to build an abstraction to bring the problem down to their level of cognition, what of a machine that can hold more facts - rigid mathematical ones, at least - and understand abstraction primitives intrinsically so as to generate them trivially or derive new ones by inference?

It resembles compsci nerds deriving complex programs for the intentionally obtuse Brainfuck language, and being able to verify them as correct only with the help of an iterative proving machine. Replace the nerd with a machine that problems-for-solves instead of solving-for-problems, and perhaps somewhere amid all the nonsense overcomplexity will be something that tiptoes toward the technology / magic line.

Though I'm not sure we'd know it if we saw it - I can conceive of a 500-layer structure of assorted category theory primitives, but sure as hell can't imagine what you'd use one for, let alone whatever new stuff might end up layered atop it or established within as substructure.
User avatar
orange808
Posts: 3213
Joined: Sat Aug 20, 2016 5:43 am

Re: AI Image Generation

Post by orange808 »

It's not magic.

Not related to matrix math on Nvidia hardware, but: FPGAs aren't magic, either. FPGA is emulation. You've been seduced. You're in awe. Don't be. These are all machines running code. Everyone gets seduced so easily.

In the case of machine learning, we're actually discussing regression trees, but there's no reason to deep-dive into the subject. I doubt any of us is actively developing bleeding-edge machine learning software anyway. I haven't got a clue what kinds of optimisations and clever hacks are used to maximise performance. I can't write an "AI" model, and you can't either. I can still grasp the basic concept, and you can, too. It's not magic, though. It's code.

The real key is math. Nvidia silicon has revolutionised accelerated matrix mathematics. We're brute-forcing "AI" with trees. There's no magic; this isn't completely new, clever code doing magic. It's trees and matrix math on new, fast silicon. Today's devs make progress because they have more options and tools; they didn't invent incredible new concepts. The GPU does matrix math at speeds we didn't have decades ago. It's not magic.
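The matrix math in question really is small at its core: a single neural-network layer is one matrix-vector product plus a nonlinearity, and GPUs just do it at enormous scale. A minimal sketch in plain Python, with arbitrary weights:

```python
# One neural-network layer, spelled out: output = relu(W @ x + b).
# The weights and bias here are arbitrary; a GPU runs exactly this
# kind of multiply-accumulate, only millions of times wider.
def layer(W, b, x):
    out = []
    for row, bias in zip(W, b):
        s = sum(w * xi for w, xi in zip(row, x)) + bias
        out.append(max(0.0, s))  # ReLU nonlinearity
    return out

W = [[0.5, -1.0], [2.0, 1.0]]
b = [0.0, -1.0]
print(layer(W, b, [1.0, 1.0]))  # [0.0, 2.0]
```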

/rant
BryanM wrote:To dismiss them entirely is to believe our own minds are not physical nor computational
My mind doesn't perform matrix multiplication without a piece of paper and a few minutes. I don't see machine learning minds. I see specialist machines doing what they're told to do.

Not directly related to "AI" trees, but still instructive: we've had random number algorithms for decades. If you encounter one on a slot machine, the outcome is impossible to predict without privileged information. The machine is not thinking or creating anything; it's doing what it's told. You see random outcomes created from nowhere, but that's not what's really happening.
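The slot-machine point can be demonstrated with Python's own PRNG: given the seed (the privileged information), the "random" outcomes are completely reproducible.

```python
import random

# Two "machines" seeded identically produce identical "random" outcomes:
# deterministic code doing what it's told, not creation from nowhere.
machine_a = random.Random(1234)
machine_b = random.Random(1234)

spins_a = [machine_a.randint(0, 9) for _ in range(5)]
spins_b = [machine_b.randint(0, 9) for _ in range(5)]
print(spins_a == spins_b)  # True
```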
We apologise for the inconvenience
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

I mean, brains are obviously matrices. There wouldn't be a need for so many neurons and synapses otherwise. Obviously you'd need math to model how a signal travels through the connectome; what else would you use? And obviously it's blind-idiot programming from 'ole evolution: it's not like your brain formats itself and invents an exciting, novel new way to move your arm 2 degrees every single time.

This "matrix math" you're talking about forms little blind-idiot subroutines and functions. Tracing these things is its own entire subfield. They're programs that generate programs after being fed information and pruned for fitness.

I'm not sure why you'd think brains are non-deterministic and can create something from nothing, though. It's math and programming all the way down.

Image
User avatar
Lander
Posts: 891
Joined: Tue Oct 18, 2022 11:15 pm
Location: Area 1 Mostly

Re: AI Image Generation

Post by Lander »

Nothing is truly magic, given that magic is a shorthand designed to explain things we don't understand well enough to enumerate in full. (Or have no intent of explaining to begin with, in the case of fiction.)

You could take the reduction further and say all computation is fundamentally just adding numbers, or NANDing discrete signals.
How that occurs - be it on silicon, an abacus, or grey matter - is merely another layer of implementation detail (nee abstraction) underpinning the broader computation.

A human sitting down to solve a math problem could be viewed as such; the act of reasoning is itself a computation, but nested within many other higher orders of computation that involve perceiving the world, locomotion, motor skills, etc., all executed via a biological machine underpinned by a dense network of neurons and performing I/O via meat.
The conscious mind needing to be trained in order to understand math - despite the unconscious mind implicitly solving for many forms of it in order to even reach the desk - is a matter of encapsulation; monkey backend not compatible with thinky frontend.
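The NAND reduction is concrete enough to write down: every basic Boolean gate, and from there a full adder, derives from the one primitive. A minimal sketch:

```python
# Everything reduces to NAND: derive the basic gates, then a full adder.
def nand(a, b): return 1 - (a & b)
def not_(a):    return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))
def xor(a, b):  return and_(or_(a, b), nand(a, b))

def full_adder(a, b, carry_in):
    """One-bit addition built entirely from NAND-derived gates."""
    s1 = xor(a, b)
    total = xor(s1, carry_in)
    carry_out = or_(and_(a, b), and_(s1, carry_in))
    return total, carry_out

print(full_adder(1, 1, 1))  # (1, 1): 1 + 1 + 1 = 3 = binary 11
```

Chain enough of these and you have an ALU; the substrate underneath (silicon, abacus beads, neurons) is, as said above, implementation detail.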
User avatar
orange808
Posts: 3213
Joined: Sat Aug 20, 2016 5:43 am

Re: AI Image Generation

Post by orange808 »

BryanM wrote:This "matrix math" you're talking about forms little blind idiot subroutines and functions
I guess it's okay to throw terminology around a little, but I admit that triggers my OCD. By definition, a function is a high-level construction that provides encapsulation (and the term once implied returning a value). For ML programs (and most others), code also needs to be thread-safe and properly designed to handle race conditions and collisions. A function buffers its arguments (although call-by-reference--"global" in the current scope--is an option) and returns a value to aid encapsulation.

When you say subroutine, I picture assembler pushing the program counter to the stack, updating the program counter, executing until it hits a return, popping the stored address, restoring the program counter, and continuing. This is down on the metal. The hardware itself isn't providing encapsulation; it only offers specific instructions for updating the program counter. Not to be confused with an interrupt, which pushes more information to the stack, although they have similarities.

Every iteration cycle of an ML program is an execution loop. I wouldn't call a loop of main a subroutine.

Iteration of the output is the point and design of the process. We start with a seeded random number generator (of some sort) and use that number to select a starting point in our model for working the tree. The machine does what it's told. By definition, the functions each serve a specific purpose (and call one another when convenient). Every function is a blind idiot function: it does what you tell it to do. The output varies because we used "randomness" to seed the process.
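That seeded-iteration loop can be sketched abstractly. The "refinement" step below is a stand-in for whatever a real model's update actually does, not any particular system:

```python
import random

# Deterministic iterative generation: the seed fixes the starting point,
# and every subsequent refinement step is fully determined from there.
# 'state = (state + rng.random()) / 2' is a placeholder for whatever a
# real model's update step would compute.
def generate(seed, steps=4):
    rng = random.Random(seed)
    state = rng.random()  # seeded starting point
    for _ in range(steps):
        state = (state + rng.random()) / 2  # deterministic "refinement"
    return state

# Same seed, same output: the apparent variety comes only from the seed.
print(generate(42) == generate(42))  # True
```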

Maybe I'm just jaded or I've hacked away at code for too long. I'm impressed, but I don't have stars in my eyes.
We apologise for the inconvenience
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

The twitter reply guys being completely unimpressed by Ameca drawing a kitty are a riot.

/internet anons see Jesus walking on water

"Big deal."

"My aunt can walk on water on Thursdays."

"I could walk on water too, but I don't feel like it."

"I'll worry when he can do something useful, like cure my leprosy."

/Jesus cures that guy's leprosy

"Hoo-hum, it was going to clear up on its own anyway."


Just absolutely zero interest in the field, the rate of change, or projecting where the line will be in the future. This is at least an order of magnitude beyond where we were seven years ago.

And "an order of magnitude" is kind of a conservative way to look at things. Like I said, it's a qualitative difference. Like going from 0 to 0.001.
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

A masterpiece.

Spoiler
Image

... to think it was only a couple of years ago I was harvesting images for a hypothetical AI Waifu Labyrinth game to pick up JavaScript. (Along with writing a long, novel-like diary of this guy basically trapped alone on a moon with the xerox'd, creepy monster people his random character-creation engine kept cranking out.) I knew at the time the tools would get a lot better, but now it seems truly idiotic to put too much effort in. Hundreds of hours of careful curation, in-painting, etc. feel like they could be obsolete in a week.

This isn't the kind of thing we've felt from technology since that final sprint from the '80s to the early '00s...

Mostly used to make fake nugget ads starring pirate skeletons.

Jonathan Frakes is still my favorite.


... but the real irony is using it to make '80s-style anime stills. A lot of the output is used for '80s-vibe stuff, and I don't think it's nostalgia alone; people are sick of how everything else we're exposed to by the owners of entertainment capital looks the same.
User avatar
BIL
Posts: 19108
Joined: Thu May 10, 2007 12:39 pm
Location: COLONY

Re: AI Image Generation

Post by BIL »

Jesus, that is the perfect expression of how I feel when I'm jogging and encounter a piece of dog shit: just utter contempt for all humanity, and a desire for it to share a single throat that might lie beneath a well-aimed heel. AI is terrifying! :o
User avatar
Lemnear
Posts: 529
Joined: Wed May 31, 2023 9:49 am
Contact:

Re: AI Image Generation

Post by Lemnear »

Just for fun, I asked some AIs to make cover art and/or a loading screen, or simply art, for a shmup game (prompt: a more apocalyptic alternative WW2 in the style of Battle Garegga):
Spoiler
Image


Image


Image


Image


Image

Futuristic:

Image


Image


A weird screenshot:

Image


Second Prompt Series: Japanish Background SHMUPS 16Bit, Nuclear Attack.

Image


Image


Image


Image


Image


Image


Image


Image


Image


Image
User avatar
BryanM
Posts: 6152
Joined: Thu Feb 07, 2008 3:46 am

Re: AI Image Generation

Post by BryanM »

A mushroom cloud dispersing as cherry blossoms is pretty metal.

LoRAs are really cool for making the generic models more usable. Pixel art is naturally impossible to get out of these things unless you stuff it into the magical meat grinder and crank the handle.

How human-like the language models became just from predicting the next token was something else. Hopefully the next generations will merge multiple types of intelligence together; after all, that's how our own brains work.
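"Predicting the next token" can be shown at toy scale with a bigram counter; real models do the same job with learned weights over long contexts rather than raw counts. The training sentence below is invented:

```python
from collections import Counter, defaultdict

# Toy next-token predictor: count which word follows which, then always
# predict the most frequent follower. Real language models perform the
# same prediction task with learned weights instead of raw counts.
corpus = "the cat sat on the mat and the cat slept".split()

follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def predict(word):
    """Return the most frequently observed next word."""
    return follows[word].most_common(1)[0][0]

print(predict("the"))  # cat -- seen twice after 'the', vs once for 'mat'
```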

One of the things that drives me up the wall with DeepMind is they haven't been taking solving Montezuma's Revenge seriously.
User avatar
BrainΦΠΦTemple
Posts: 211
Joined: Tue Dec 25, 2018 9:52 pm
Location: ΩΘΔΣδΞΨ
Contact:

Re: AI Image Generation

Post by BrainΦΠΦTemple »

Image
Image
Image

AI is gettin' insAne =x
nO-miss superplAyz i \m/ash in shmupz + mOsh w/ ur mom
berlin schOol albums | sOundcloud
new albUm:Kristallgeist
"Here is a molding synthesis creator with a strong personality. It needs to be better known." --rockliquias.com's reviEw of "kristallgeist"