The Allure of GenAI

Unless you’ve been living under a rock, you’ll know the talk of the town has been Artificial Intelligence, or AI for short. And much as some might contest it, it will NEVER replace real humanity because it has no soul. At its core, it’s essentially a randomizer whose randomness is reduced by what it can predict, using a collective “brain” (I say brain, but it’s more like a database of human historical works).

But that also means it’s extremely powerful. You’re basically asking a hive mind of the whole world when you drop a question into a Claude, Gemini or ChatGPT window. ’Cause chances are, there are many others who had the same question as you.

Generative AI (or GenAI) uses that same algorithm and has gone through a pretty intense lifecycle. One of the most popular GenAI videos is the infamous “Will Smith eating spaghetti”.

It was so outrageous and comical that it became the unofficial benchmark for GenAI in 2023.

It’s been three years since. Not only has AI gotten quicker and more powerful (being able to process more and more information in one go), it’s honestly mind-blowing to see how much the Will Smith video has progressed.

But the other end of the spectrum is that a lot of creative work from designers and artists has gone into the “brain”. There are a whole lot of ethical problems arising from this rapid expansion, and there weren’t governance guardrails in place. Oh wait, let me rephrase: there were guardrails to prevent harmful creations, but none to protect human creativity… well, unless you had a lot of money to throw at IP, copyrights and lawsuits. But that’s another topic for another day.

I’m not here to contest whether GenAI should be used. I see the allure of it, and all the social media amplification of “you can do this in one prompt”, but the truth is far from that. What I’m here to do is share my experience using GenAI, from inception to concept to production, and you can judge for yourself if it’s worth the effort.

So let’s talk about Frames and Viz. This is a YouTube series that draws on my actual experience with video games, both playing and working on them (all 30 years of my life), as well as my dataviz knowledge from the past decade. I find a video game that has good practices in design and UX/UI, and dissect it to see how we can use them in data visualization too.

The issue was that, as a solo creator, I know firsthand how much work goes into video editing, interviewing, recording and scheduling, having done it for 2-3 years for #SecretsOfTheViz. I enjoy chatting with folks and learning from them, but I definitely don’t enjoy the tedious behind-the-scenes work to get it published.

I knew that I wanted to do it audio-only because it’s much easier (and less jarring) to cut and stitch audio than it is for video. But having nothing on the screen is boring because I’m a visual guy.

Creating the first look

So I toyed around with the idea of an avatar. The thing about avatars is they have to capture your vibe, your personality and your looks (if possible). Remember what I said about AI being a predictive engine? Statistically, multiple births (twins, triplets, etc.) make up only a few percent of all births. Loosely speaking, that puts the odds of AI generating something that looks like you in the low single digits, and those odds do go up if you feed it your photos or you have a large online presence that probably got fed into the “brain”. This is also why your generated face (e.g. “put me in an astronaut suit”, “make me wear different clothes”, different poses, etc.) always looks uncanny and a little off.

It took me over 50 generations across different tools (Gemini, ChatGPT, Higgsfield, ElevenLabs, etc.), but I didn’t feel any of them really captured what I was envisioning.

Recreating that consistency

If you work at a design studio, you’ll have character design sheets, drafts, measurements, moodboards and a whole lot of guidelines to keep the character consistent. With AI, it’s a little different because, again, there’s nothing out there for AI to infer from when modifying your current asset, since it’s sorta original (at least to AI). So needless to say, I had a hard time telling AI to render my avatar in different poses, emotions, angles, etc. And I was also burning through my credits/tokens/currency to generate all these failed images. But here’s the silver lining: folks with that creative mind and eye will always be ahead of AI, there’s no doubt about that.

Taking a page from video games

So I went back to the drawing board and identified what elements I wanted to capture. My blue hair (I recently dyed all my hair blue, then went with blue highlights, and I really liked that look), my casual look with denim jacket, t-shirt, jeans and sneakers (more specifically my white/red Air Jordans). And most importantly, my two prized toys (yes toys 😂): my lightsaber and my Iron Man gauntlet. Those were my identity, the things people who knew me knew I loved. And by quantifying these, I realized it’s not too different from creating an avatar in video games like World of Warcraft, where you can adjust hairstyles, body shapes, height and facial features from a library of presets.

It still captures your vibe, just not exactly how you look. But it becomes your identity. It’s the same reason I reuse the same styles across different games.
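
To make the preset idea concrete, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration (the `AvatarPreset` name, its fields, the assumed art style and the prompt wording), and it doesn’t call any real GenAI tool. It only shows how a fixed set of identity elements can be written down once and reused while the pose and emotion change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvatarPreset:
    """Hypothetical container for the identity elements that never change."""
    hair: str
    outfit: tuple[str, ...]
    props: tuple[str, ...]
    art_style: str = "anime-inspired illustration"  # assumed style, for illustration only

    def describe(self) -> str:
        """Join the fixed elements into one reusable description."""
        return (
            f"{self.art_style}, {self.hair}, wearing {', '.join(self.outfit)}, "
            f"with {' and '.join(self.props)}"
        )


def build_prompt(preset: AvatarPreset, pose: str, emotion: str) -> str:
    """Combine the fixed identity with the parts that vary per image."""
    return f"{preset.describe()}; pose: {pose}; emotion: {emotion}"


# The identity elements listed above, captured once.
me = AvatarPreset(
    hair="blue hair",
    outfit=("denim jacket", "t-shirt", "jeans", "white/red sneakers"),
    props=("a lightsaber", "an Iron Man gauntlet"),
)

# Only the pose and emotion change between generations.
for pose, emotion in [("standing", "confident"), ("mid-swing", "focused")]:
    print(build_prompt(me, pose, emotion))
```

The identity half of every prompt stays word-for-word identical, which is roughly what a character creator does when it locks your presets and only swaps the animation.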

Creating the first draft (again)

Have you wondered why avatar systems always have the same facial presets (Asian, Black, White, Hispanic, etc.)? It’s an average representation of the population. And guess what: that’s effectively the AI brain too! What other medium has tons of assets (both official and fandom)? That’s right: animation, manga and comics. And AI can (to a good degree) predict where to draw the facial features, body shape, etc. Human ingenuity has always been about ideas that come from remixing, recreation and innovation.

So if you think about it in this context, I’m okay with the average of the drawing style (let’s not bring the ethical aspect in here) combined with my own mix (my identifiable elements).

Maintaining consistency (again)

And now with GenAI, I can bring that to life and keep consistency like a design studio would: by creating character sheets, variants and references that I can then feed back to the “brain” to create more stuff.

Case in point: the trailer, created using the Kling and Seedance models on Higgsfield. I’m gonna sound like a broken record, but in a nutshell, it took somewhere between 30 and 50 generated videos to get to this version.

And this published version is stitched together from the segments that worked, and I still had to do sound design and music in Adobe After Effects. It’s not the “one prompt” job many influencers promise.

Does this mean it’s original? I don’t know the right legal answer to that, but it’s me… for now, until it becomes so common that it turns into the average.

Does this mean it’s worth the squeeze to do this via AI? It probably would have cost more (and rightfully so, you’re paying for creativity) to go with an actual design studio. But the allure of GenAI is rapid prototyping and filling in adjacent knowledge gaps (I learnt about camera angles, types of slashes, types of explosions, types of movement).

So that’s my very long two cents’ worth. Hope that gives you some insight into this whole over-promised utopia of AI generation and workflow.

That said, I would still love to pay someone to actually do a better job, so if you know someone, or you think you can do a better job than this AI slop, buzz me. We need to keep humans in the loop as we embrace AI, or everything becomes average.