mjb and mj...ch4: reflecting on AI and AI-infused image creation


Clarification Note: The text of this post was crafted completely by ChatGPT AI with a touch of my own. The image in the post was crafted by Midjourney AI that followed my prompts.



Recently, there have been intense discussions about the essence of AI, particularly within the cutting-edge domains of text-to-text and text-to-image systems. Esteemed minds consistently share their thoughts on where AI stands now and where it might be headed. However, the rapid progress in AI fields makes it incredibly challenging for a human to keep pace. It may very well require an AI system itself to thoroughly explore the crucial aspects of this subject.

I have neither the aspirations nor the expertise to participate in the discussions of these distinguished experts. Instead, this serves as my humble reflection, coming from the perspective of an enthusiastic neophyte who uses two of today's leading AI platforms: Midjourney and ChatGPT. My journey begins with a longstanding history in photography and digital image manipulation, supported by an educational foundation in physics, math, and engineering, alongside a self-cultivated grounding in the liberal arts, particularly inclined towards belles-lettres. This blend of experiences has strongly influenced how I view the AI tools I have at my disposal. I make no claim that my perspective is universal, or even definitively accurate, if such a perspective can indeed be presented.

For around six months, I've immersed myself in using both tools. Some of the outcomes of my efforts, featured within this blog, might offer a clear peek into my creative process, i.e., into 'what' I am doing with the tools. However, when faced with the question of 'why' I am doing it, whether asked by others or arising from my own self-reflection, I find myself unable to give a definite answer. Clearly, I'm not pursuing this for practical or financial gain. Undeniably, I take immense pleasure in this pursuit, but that's as far as my explanation goes, and this unanswered question bothers me.

Almost every day, I come up with resolutions that, though sincere, turn out to be fleeting. This quest, in and of itself, is an engaging intellectual adventure, compelling me to voraciously consume and analyze a wide range of information across various fields of knowledge, and engage in deep conversations with ChatGPT. Astonishingly, I've discovered these exchanges to be more fruitful than those with the humans accessible to me.

And here I am, sharing how I currently perceive the creation of compelling AI-enhanced images.

So far, I have managed to come up with a list of the current reasons 'why' I am doing it:

1. Addiction to achieving compelling results much faster than in any other art form I practice.

2. Aesthetic pleasure in generating compelling images

3. Intellectual enjoyment that comes from playing with words using loose rules, often leading to unexpected and exciting results.

4. Art Therapy Practice.

5. Having my personal 'museum on demand' that is always accessible.

6. Opportunity to use, all at once, skills from fields that previously appeared unrelated to me.

7. Appreciation of people whose views I value.

8. And finally, its uselessness; as Oscar Wilde once wrote, 'We can forgive a man for making a useful thing as long as he does not admire it. The only excuse for making a useless thing is that one admires it intensely. All art is quite useless.'

Upon assembling this list and examining it, something struck me: "Wait a minute, isn't this a vague description of the reasons Castalians played the Game?" I'm referring to the Glass Bead Game described in 'The Glass Bead Game' by Hermann Hesse.

In case you have never heard of it: "The Glass Bead Game" was first published in 1943 in Switzerland, and it stands out as one of Hermann Hesse's more overtly speculative and intellectually complex works. Hesse was awarded the Nobel Prize in Literature in 1946, and "The Glass Bead Game" was a significant factor in his being awarded the prize.

The novel is set in a distant future where society is organized around intellectual and aesthetic pursuits. The central focus of this society is the titular game, which combines elements of music, mathematics, philosophy, and art. The novel explores themes of individualism, intellectualism, and the search for higher meaning. In the book (also published in English as "Magister Ludi", the title held by the Master of the Game), Hesse deliberately described the game in abstract and symbolic terms, leaving many details open to interpretation and allowing readers to imagine it in their own way.

This is how one (myself included) may interpret the Game:

Players:

The game is typically played by highly educated individuals, known as Castalians.

Components:

It involves the use of special symbols, often represented by 'glass beads', which are arranged to create complex patterns and structures on the 'board'. Each symbol likely represents a concept or idea from various fields of knowledge. No specific detail regarding the 'beads' and the 'board' was provided; the terms serve rather as metaphors for the actual devices. However, during the game the players were 'observing the evolving patterns' of the 'beads' composition, which implies that the 'beads' and the 'board' were not just neutral pieces of material but highly advanced devices that contained, or could access, vast arrays of information and could interact with each other. One has to keep in mind that Hesse had envisioned the Game by the early 1940s. He could not possibly provide even fictional details about the Game's specifics, as the first electronic computers (potential prototypes for the game components) appeared only in the mid-to-late 1940s and were a far cry from modern devices. Hesse's focus was on the conceptual and symbolic aspects of the Game rather than the physical manipulation of 'beads'.

Integration of Knowledge:

The game involves integrating knowledge from a wide range of disciplines including mathematics, music, philosophy, literature, and more. Players use 'beads' to explore the relationships and connections between these different fields.

Goal and Purpose:

The primary purpose of the game is intellectual and artistic. It serves as a means for players to explore and express complex ideas, often in abstract and symbolic ways.

The goal is to achieve a sense of visual harmony and aesthetic beauty in the arrangement of the 'beads'. The patterns created should be aesthetically pleasing and intellectually stimulating. Watching the "visual poem" would involve observing the evolving patterns and contemplating the meaning and connections behind them. The patterns created would serve as a visual representation of the synthesis of knowledge from diverse fields. It's a way of exploring the relationships between seemingly unrelated concepts.

"The Glass Bead Game" was written in a very different technological context, and Hesse's intentions were likely focused on broader philosophical and intellectual themes rather than specific predictions about AI. Nonetheless, the novel's themes could be a fertile ground for discussions about the role of advanced technology in shaping the pursuit of knowledge and understanding. It may be interpreted and discussed in the context of the AI era and the potential implications of advanced technology on intellectual pursuits.

To my surprise, I was not able to find any meaningful discussion of Hesse's role as a prominent AI visionary. It might be that there is none. I suspect that, as leading AI developers mostly belong to the 20-to-40-year-old generation, they might never have heard of him. The novel is not easy to read because of its dense and introspective style, as well as its philosophical themes. People who enjoy this book are usually scholars who study philosophy and related subjects, rather than scientists and technocrats. Also, it's quite long, and the way the story is told demands close attention. Or perhaps its time has not yet come.

Symphony of forms and colors, embodying the essence of 'The Glass Bead Game'

Before we dive into the subject of the post, i.e., comparing AI-powered image creation with the complexity of the Glass Bead Game, it's important to acknowledge the current state of Midjourney development. Right now, Midjourney, while promising, is far from the level of the fully developed Glass Bead Game. Rather, it resembles an embryonic phase in the evolution of the Game, showing hints of what an AI system could become.

Feature Breakdown

The Glass Bead Game:

Initiator:

Human - Player(s)/Castalian(s)

Initiation Method:

Human-provided arrangement of special symbols, 'beads', on a flat 'board'. The 'beads' and the 'board' could be interpreted as metaphors for highly advanced devices that contain or have access to extensive arrays of information, and possess the capability to interact with one another.

Processing Method:

There is no definitive description of how the Game processes information. The novel just provides a literary and philosophical concept, allowing readers to contemplate the nature of knowledge, creativity, and the synthesis of diverse disciplines.

Domains of Knowledge:

Specific domains are not explicitly defined, but various references suggest the inclusion of the following:

concepts from mathematics, physics, and other scientific disciplines;

musical compositions, musical theories, and artistic elements;

philosophical ideas and concepts;

historical events, works of literature and poetry, as well as literary theories;

concepts from religious and spiritual traditions;

language, linguistics, and the structure of communication;

ethics and morality;

political theories, sociological concepts, and analyses of social structures;

elements of technology.

Outcome:

The exact outcome or goal of the game is left deliberately vague and open to interpretation. The main goal of the Glass Bead Game is often thought to be reaching a deeper understanding and insight. It involves exploring and combining complex ideas from different areas of knowledge. This game helps people see connections between things that might seem unrelated at first. It's like a journey to bring together diverse elements in a harmonious and unified way. The result can be a stunning and intricate pattern of interconnected ideas, resembling a work of art.


Midjourney AI Platform

Initiator:

Human - Prompter/Prompt Engineer

Initiation Method:

Human-written text prompt containing words, phrases, and characters.

Processing Method:

Midjourney has not published its architecture, but it is generally understood to combine a language model with a diffusion model to create unique images from text prompts. The language model interprets the meaning of the prompt and converts it into guidance for the diffusion process. The image generator begins with a canvas of visual noise and progressively refines it through latent diffusion, resulting in an image representing the objects and concepts described in the prompt.
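For readers who like to see an idea in motion, here is a toy sketch of "start from noise and refine toward what the prompt asks for". It is emphatically not how Midjourney works internally, which is not public. The functions `embed_prompt`, `generate`, and `distance` are hypothetical names invented for this illustration, a hash stands in for the language model, and the "image" is just a short list of numbers.

```python
import hashlib
import random


def embed_prompt(prompt, size=16):
    """Hypothetical stand-in for the language model: turn a prompt into a
    fixed list of numbers in [0, 1]. A real system learns this mapping."""
    digest = hashlib.sha256(prompt.lower().encode("utf-8")).digest()
    return [digest[i] / 255 for i in range(size)]


def generate(prompt, steps=50, seed=0):
    """Toy refinement loop: begin with pure noise and nudge the canvas
    toward the prompt's guide vector, adding less noise at every step."""
    rng = random.Random(seed)
    guide = embed_prompt(prompt)
    canvas = [rng.random() for _ in guide]  # the canvas of visual noise
    for step in range(steps):
        noise_scale = 0.1 * (1 - (step + 1) / steps)  # noise fades to zero
        canvas = [
            pixel + 0.2 * (target - pixel) + rng.gauss(0, noise_scale)
            for pixel, target in zip(canvas, guide)
        ]
    return canvas


def distance(a, b):
    """Mean absolute difference between two equally sized lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

Calling `generate` twice with the same prompt but different `seed` values gives results that are close to each other yet not identical, which loosely mirrors how one prompt yields a family of related images rather than a single fixed picture.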

Domains of Knowledge:

Midjourney is trained on a massive dataset of text and images, and it learns to associate certain words and phrases with certain visual concepts.

Outcome:

When a user provides Midjourney with a prompt, it draws on the associations between words and visual concepts that it learned during training to produce an image that aligns with the prompt. It does not pick an existing picture from a stored pool; each image is generated anew from random noise, which is why the same prompt produces a different result every time.


To put it simply, Midjourney, with its blend of language-model AI and diffusion-model AI, can certainly perform the specific task it was built for. But it doesn't capture the complete, imaginative nature of playing "The Glass Bead Game." This special intellectual pursuit isn't about just getting things done; it's about seeking deep understanding and enlightenment through the fusion of diverse knowledge.