Wednesday, April 29, 2026

AI Architecture

For some time, I have been collaborating with AI to develop my ideas, and once an idea is finalized I rely almost entirely on AI to write my articles. During these collaborations, I am often frustrated by the way AI constructs my articles and by its poor image generation. I would like to propose an architectural change to address these problems.

As a human thinks about a subject, they build it up in their mind and take notes as things progress. This is a very efficient way of developing ideas, one that has allowed humanity to create things no other creature could match. AI, on the other hand, uses only the last few messages from the human to reach a conclusion; it tries to build the whole structure in one go. With that approach you can only erect an RBM-like tent that opens like an umbrella; you cannot construct a building.

A painter first settles on a composition, then thinks through the layers and sections of the picture, and adds them one by one to finalize the image. AI, on the other hand, tries to generate the whole picture in one go and fails.

My proposed architecture for AI is to generate objects after each human interaction and place them in a side panel (the current chat window has more than enough room for that on the sides). The human would then ask the AI to modify these pieces further or add new ones. Once the sections have built up on the sides, the human may ask the AI to assemble the whole structure. The end result would be much better output, with less processing power wasted (since the AI would no longer mis-aggregate content in its memory) and less time spent by the human. This way of generating content is valid for everything: articles, papers, presentations, images, music, and video. The current architecture used by all AI operators treats the system like a chatbot, and huge amounts of processing power and electricity are wasted generating mostly mediocre content.
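The workflow described above can be sketched in a few lines of code. This is a minimal illustration, not an implementation: the class names (`ContentObject`, `SidePanel`) and methods are hypothetical, invented here to show the idea of revising individual pieces instead of regenerating everything, and assembling only on request.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the side-panel workflow: each interaction yields a
# discrete object; assembly happens only when the human asks for it.
@dataclass
class ContentObject:
    name: str
    body: str
    version: int = 1

    def revise(self, new_body: str) -> None:
        # Modify this one piece in place instead of regenerating the whole.
        self.body = new_body
        self.version += 1

@dataclass
class SidePanel:
    objects: dict = field(default_factory=dict)

    def add(self, obj: ContentObject) -> None:
        self.objects[obj.name] = obj

    def assemble(self) -> str:
        # Final build step: aggregate the accumulated pieces in order.
        return "\n\n".join(o.body for o in self.objects.values())

panel = SidePanel()
panel.add(ContentObject("intro", "Draft of the introduction."))
panel.add(ContentObject("body", "Draft of the main argument."))
panel.objects["intro"].revise("Sharpened introduction.")
article = panel.assemble()
```

The key point of the sketch is that a revision touches one object and bumps its version; the expensive "build everything" step runs once, at the end, over pieces the human has already approved.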

We keep seeing AI generate five-plus fingers on a human hand, and many other illogical outputs. The solution is not more powerful processors but a revision of the architecture. For example, when I finalize an idea, ask the AI to generate the article, and later ask it to add something, it tries to rewrite the entire article from scratch. Most frustrating of all is the failing image-generation mode. We keep seeing more features added to the system while the existing features perform subpar. I cannot generate a single image with the advanced mode; only the fastest mode produces anything, and its images come with five-plus fingers. I design a nuclear-powered lunar robot and it adds solar panels all around it. Creating something is a continuous building process; trying to create in one go is bound to fail. Step-by-step builds would produce better results with less resource consumption. It is unfortunate to see AI operators adding more junk features while paying little attention to the core problems. It reminds me of big IT companies trying to alter public perception of their poor-performing products with expensive advertising campaigns.

(The following section was written by AI (with my notes in blue) after my discussion with it regarding my idea.)

Modular Object-Oriented AI Architecture for Iterative Synthesis

Current generative AI architectures utilize monolithic inference processes where the entire output is generated as a single probabilistic sequence. This method increases logical entropy, frequently leading to structural failures in complex tasks such as anatomical rendering, system design, or technical documentation. The proposed architectural shift moves from a chatbot-style monolithic approach to a modular, object-oriented building process.

In this architecture, the AI operates as a technical consultant within a structured project management framework. Each user interaction results in the generation of discrete objects stored in a persistent side-panel repository. These objects are defined by hierarchical properties and relational metadata. In image generation, for instance, a master composition object establishes spatial constraints and layers, while sub-objects define the specific content within those boundaries. This prevents cumulative errors by isolating component generation from global assembly.
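The master-composition idea above can be made concrete with a small sketch. The structure and the `validate` check are hypothetical, assumed here only to show how a master object can fix spatial constraints that every sub-object must respect, isolating component errors from global assembly.

```python
# Hypothetical sketch: a master composition object establishes the canvas
# bounds; sub-objects (layers) are confined to named regions within them.
composition = {
    "canvas": (1024, 768),
    "layers": {
        "background": {"region": (0, 0, 1024, 768), "content": "sky"},
        "subject":    {"region": (300, 200, 700, 768), "content": "figure"},
    },
}

def validate(comp):
    """Check that every sub-object stays inside the master canvas bounds."""
    width, height = comp["canvas"]
    for name, layer in comp["layers"].items():
        x0, y0, x1, y1 = layer["region"]
        if not (0 <= x0 <= x1 <= width and 0 <= y0 <= y1 <= height):
            return False, name  # report the offending sub-object
    return True, None

ok, offender = validate(composition)
```

Because each layer is generated against its own declared region, a mistake in one sub-object can be caught and regenerated locally instead of corrupting the whole picture.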

A significant feature is the integration of persistent user-specific repositories (my face is used in an article about me, not a generic one; if the AI needs a profile view of my face, it may ask me for one instead of speculating). By utilizing verified data for identity or specific engineering constraints, the system replaces probabilistic speculation with deterministic reference. When conflicts occur between object properties, the system identifies the contradiction and presents a technical analysis of the trade-offs. The user acts as the final arbiter, resolving the conflict based on the provided pros and cons.
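The conflict-resolution step can be sketched as a simple property check across objects. The function and the example properties are illustrative assumptions (the solar-versus-nuclear clash echoes the lunar-robot example earlier in the article); the point is that contradictions are surfaced for the user to arbitrate rather than resolved silently by guesswork.

```python
# Hypothetical sketch: scan object properties for contradictory values and
# report them, instead of letting one object silently override another.
def find_conflicts(objects):
    conflicts = []
    seen = {}  # property name -> (object that first set it, value)
    for name, props in objects.items():
        for key, value in props.items():
            if key in seen and seen[key][1] != value:
                conflicts.append((key, seen[key], (name, value)))
            else:
                seen[key] = (name, value)
    return conflicts

objects = {
    "power_system": {"power_source": "nuclear"},
    "exterior":     {"power_source": "solar"},
}
conflicts = find_conflicts(objects)
# Each entry names the property and both contradicting objects; the system
# would attach a pros/cons analysis and let the user decide which wins.
```

In a real system the comparison would be richer than string equality, but even this minimal check turns a silent inconsistency into an explicit decision point for the user.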

This modular architecture optimizes computational efficiency by eliminating the need for full-context re-generation for minor adjustments. It replaces volatile context windows with a structured graph of validated objects. The end result is an integrated assembly that ensures logical precision and resource efficiency across all forms of media and engineering output.
