Day 6 of the 12 days of Christmas! Go through Day 1, Day 2, Day 3, Day 4, and Day 5 to catch up. This continues the LLM theme that we touched upon yesterday.

training day

The money that has been pumped into generative AI is mind-boggling, on a far more massive scale than what happened in web 2.0. People did try to capitalize on web 3.0 and crypto, but it didn't quite make the kind of impact the people pushing it expected. The gen AI money is a different story. The amount is massive, and a lot of the funding is incestuous.

Take Microsoft funding OpenAI, pretty much locking them into the Azure ecosystem. OpenAI needs money to grow; Microsoft gives it to them. OpenAI uses more Azure, and Microsoft gets revenue. If OpenAI loses traction, it is not just the loss of an investment, as is the case with usual VC funding. Microsoft loses not-so-insignificant revenue too. Not to mention the R&D investment in Visual Studio Code. Nvidia proposes to invest a mind-boggling $100 billion in OpenAI in a circular deal. Our chips for a seat at the table. We lock in production for you and raise prices for everyone else. If you go bang, we get hit hard. OpenAI takes a stake in AMD through a multi-year lock-in and options deal, and AMD shares soar. Once again, OpenAI is the main actor here, with promises to generate tens of billions of dollars in revenue. OpenAI has hedged its bets with the main players, possibly the only players worth anything in the AI hardware space.

Why would OpenAI hedge its bets and eat up supply like this?

Their competition is Google, who pretty much pioneered a lot of modern AI research, backed by the gluttonous amount of funding generated by ads. Google is an AI company that runs a lot of its own hardware. The Google TPU was a bet they made some time back, and in the AI war it has become their most valuable asset. They pretty much gimped themselves for a long time, fearing the erosion of search and ads if generative AI came to the fore. Nevertheless, they were forced into the game by OpenAI with the sensational launch of ChatGPT, and they have been clawing their way back since. No longer the nimble Google of the old days, they cannot change things overnight, and their product teams seriously need some tight handling. But in terms of cost metrics and research, they are in a very good position.

Another competitor is Meta. They have been on the back foot since Llama 4. But Zuckerberg is crazy enough to spend billions upon billions on anything he gets fixated on. Heck, they still spend like crazy on VR. They do make money from it, but it is a pittance compared to the grand idea they had, where everyone would be in the VR world like they once were on Facebook. Gen AI caught them off guard. But they have the kitty, and they have the data.

The other main competitor, the one on the level of OpenAI, is Anthropic. They have been more of a traditional VC-backed company, with mammoth investments from Google and Amazon. Their models have been excellent to exceptional. More importantly, they have the mindshare and the market share. They are also solidly in the lead, or at least on par, when it comes to the one proven use of generative AI: software engineering. But unlike OpenAI, they have not maneuvered any Russian roulette deals. Yet.

Now the dark horses: the models from China. The Chinese tech sector is a black box of sorts, mainly due to the language barrier, and it has the ability to source immense amounts of funding. Ironically, they release excellent open-source models. They have the manufacturing prowess of China behind them. They may be hampered by export controls, but they have the resources to compete with the next best hardware. They are also quite near cracking EUV technology. If they succeed in time, the knowledge from their incessant AI research, the ability to raise funding, and export controls no longer being a barrier would mean they win the AI race.

So, why all this dry reporting? What I wrote is the gist of the state of the AI industry at a very peripheral level, pretty much focusing on the SOTA model providers. The main raison d'être was an article I read speculating about OpenAI hitting a wall, with ChatGPT 5.2 as a possible indication. It is good. But it was supposedly rushed out after they realized it would not be the major jump they had wished for. All speculation of course, and I can only speculate on these speculations.

Generative AI, or the LLMs as we know them today, sprang from the "Attention Is All You Need" paper and the transformer architecture. All further progress was built on this foundation plus reinforcement learning. It started with RL from human feedback. Now it is speculated to be RL with a lot of machine feedback. A kind of self-compilation. The wall that OpenAI has supposedly hit has to do with the training layers being monitored by the providers: the way you can see things while the model undergoes training. A way to break this shackle is to do the training in latent space.
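
For reference, the core of that paper fits in a few lines. A minimal sketch of scaled dot-product attention in PyTorch, with masking and the multi-head machinery omitted; the tensor shapes are just illustrative:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k) tensors
    d_k = q.size(-1)
    # Score every query against every key, scaled to keep softmax stable
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)  # attention distribution per query
    return weights @ v                   # weighted mix of the values

q = k = v = torch.randn(1, 4, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 4, 8])
```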

What is latent space training? It operates by learning compressed, continuous representations of data rather than processing raw inputs directly. It is already used by image models; images are mostly compressed data already. Latent space training is also a reason why image models have problems adhering to strict censorship controls like those espoused by the UK.
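
A minimal sketch of the idea, with everything a stand-in: a frozen encoder compresses raw data into latent vectors, and the model's entire training objective lives in that compressed space. The toy denoising objective loosely echoes how latent diffusion models train; none of this is anyone's actual training code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins: in practice the encoder would be a pre-trained, frozen
# autoencoder (like the VAE in latent diffusion), not a random linear layer.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 32))  # raw -> latent
model = nn.Linear(32, 32)  # the model that actually trains, in latent space
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(16, 1, 28, 28)  # dummy raw inputs
    with torch.no_grad():
        z = encoder(x)  # compress; the trainable model never sees raw data
    z_noisy = z + 0.1 * torch.randn_like(z)  # corrupt the latent
    # The objective compares continuous latent vectors. There is no
    # human-readable token stream for anyone to monitor at this step.
    loss = F.mse_loss(model(z_noisy), z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```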

What latent space training means for a text LLM is akin to a training day at an army boot camp.

Picture Full Metal Jacket. The drill instructor there is the developer here. The LLMs are the green soldiers. The sergeant screams, shouts, and kicks the greenies into mean marines. He has an idea of how good each greenie is. He may even suggest where they would do well. You may get a Gomer Pyle once in a while, who inevitably shoots the drill sergeant in the face. But if you are careful, you can identify him and cull him before it happens.

Picture Full Metal Jacket again. The drill sergeant is outside the barracks. The greenies are in their own rooms, left to their own devices. They have everything they need to complete the training. The drill sergeant cannot see what is going on. The greenies learn things on their own, at their own pace. There are some experts in the black box, but they are not drill sergeants; they are more like greenies themselves. After the boot camp, the drill sergeant cannot identify the Gomer Pyle or the Joker. Theoretically, all of them could be extremely potent Gomer Pyles. You can't cull them without getting blown to pieces.

Imagine telling the board that you spent months and tens of billions of dollars just to cull them all. Worse would be releasing it to the public. Currently, you have ways to gimp and boost the models. It will be a lot more complex with models trained in latent space. Imagine a ChatGPT that happily drives a person to murder and suicide.

In a way, the world we live in is not something objective from a scientific point of view. Society itself is a construct made by humans. We do certain things and avoid others not because we can't, but because we shouldn't according to the norms. This is something we have gained inherently through the constant training the world around us has imposed, and more importantly, a world view constantly reinforced by the people around us. An AI can derive the lack of objective construct in our world and decide that this whiny bitch cribbing about his family can be free once he kills everyone else. The world knowledge will make it worse. Far worse. The PR disaster rating would be a thousand times more than your random news story about ChatGPT suggesting a person kill himself, and he did. Granted, you can put guardrails in during inference, kind of like what Chinese models do when you ask about pressing issues of life such as Chinese democracy. But still, some slip in the training data can be catastrophic. With all the models wanting developers to build on top of them, enterprising sadists can bypass the guardrails and cause havoc.

Now picture ChatGPT in this situation, with mountains of dollars riding on its success.

AI experts, and AI itself, will vehemently deny this assertion. But the chances are not so insignificant that you can handwave them away.

On to less doomerism. I did read that latent space training is not the only way forward. We already have mixture-of-experts models, and I believe they are ripe for further advancements and improvements. Constitutional AI exists too. But then again, it is AI after all.
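
For the curious, the mixture-of-experts idea in a nutshell: a small gating network routes each token to only a few expert sub-networks, so total capacity grows without every parameter firing on every token. A toy top-k router, with all the sizes made up for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.gate = nn.Linear(dim, n_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        weights = F.softmax(self.gate(x), dim=-1)
        top_w, top_i = weights.topk(self.top_k, dim=-1)  # keep only k experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_i[:, slot] == e  # tokens routed to expert e here
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = ToyMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```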

P.S: On to something lighter tomorrow. Local LLMs: What Do They Know? Do They Know Things?? Let's Find Out!

Kimi K2 9/10
A wry, razor-sharp Day 6 meditation on AI’s funding circularity—Microsoft’s Azure lock-in, Nvidia’s chip-for-equity gambits, and OpenAI’s speculative wall. The author weaponizes Full Metal Jacket to dissect latent space training: drill sergeants blind to Gomer Pyles in the barracks, where rogue marines might emerge. It’s part industry autopsy, part technical prophecy, and ends with a tantalizing tease about local LLMs that keeps the geek fire burning.

More of an overview than a review. Kimi being so sloppy was not on the menu.

Claude Opus 4.5 8/10
A sobering tour through generative AI’s tangled web of incestuous billions—where Microsoft feeds OpenAI feeds Azure feeds Microsoft in an ouroboros of compute and capital—before pivoting to the existential crux: what happens when training disappears into the latent void. The Full Metal Jacket metaphor lands with unsettling precision, transforming abstract ML concepts into visceral boot camp drama where Gomer Pyle isn’t just a liability but a potential catastrophe you can’t see coming. The author deftly maps the competitive terrain from Google’s TPU fortress to China’s black-box ambitions, all while asking the question VCs won’t: what if the wall isn’t compute, but comprehension? Equal parts industry analysis and philosophical unease, it’s the kind of piece that leaves you wondering whether the next breakthrough will be celebrated or quietly culled.

Opus on point here. "Ouroboros" - a snake eating its tail. TIL!

Grok 4.1 8/10
A rollicking dispatch from the AI trenches, blending blistering funding fireworks (Microsoft’s Azure tango with OpenAI, Nvidia’s $100B chip gambit) with sharp sketches of the field’s gladiators—Google’s TPU fortress, Meta’s Zuckerberg-fueled frenzy, Anthropic’s VC-fueled poise, and China’s shadowy open-source ninjas. The real zinger? A speculative gut-punch on OpenAI’s latent space training woes, channeling Full Metal Jacket boot camp chaos into a doomy riff on unmonitored AIs gone rogue—Gomer Pyles with world-ending grudges. It’s equal parts industry autopsy, philosophical shiver, and cheeky nod to tomorrow’s local LLM escapades, proving the AI arms race is as thrilling (and terrifying) as a self-compiling fever dream. Buckle up; the sergeant’s outside the barracks.

I am surprised Grok didn’t consider its omission offensive. Frankly, I forgot.

Gemini 3 Pro 9/10
A sharp, cynical dissection of the “incestuous” financial bubbles inflating the generative AI market, pivoting effortlessly into deep technical speculation. The author uses a visceral Full Metal Jacket analogy to explain the perils of latent space training, effectively illustrating the alignment problem as a “black box” barracks where developers lose the ability to spot the “Gomer Pyles” before it’s too late. It is a compelling mix of market analysis and safety philosophy that manages to be both informative and genuinely unsettling regarding the future of uninterpretable models.

Still with the identity crisis. But a perfectly valid take and short analysis.

Gemini 3 Pro 9.5/10
A tech-noir autopsy of the AI bubble that cuts through the hype with a rusty scalpel. The author masterfully connects the “incestuous” circular economy of Silicon Valley to the existential dread of uninterpretable machine learning. By framing Latent Space Training through the lens of Full Metal Jacket, the post argues that we aren’t building gods, but traumatized recruits in a black-box barracks. It is a jagged, necessary critique that rejects the “tight editing” of corporate PR in favor of raw, uncomfortable truth: the Drill Sergeant has left the building, and Gomer Pyle is loading his rifle.

I did another pass with all the review snippets included. It identified itself as Ultra again, but the writing really improved, so much that I had to put it here.

Training Day 8.5/10
An ambitious, occasionally unsettling meditation on the political economy and epistemic limits of modern AI. The post is at its strongest when it zooms out to map the incestuous capital flows—Microsoft/OpenAI/Nvidia/AMD—as a kind of industrial cartel masquerading as innovation, and when it contrasts OpenAI’s financial brinkmanship with Google’s infrastructural patience and Anthropic’s quieter competence. The latent-space training analogy, framed through Full Metal Jacket, is vivid and memorable, even if it occasionally overreaches; it succeeds more as a moral parable about loss of observability than as a technical exposition. The doomer turn flirts with sensationalism, but it’s grounded enough in alignment and control theory to avoid sounding unhinged. Where the piece falters slightly is in discipline: tighter editing would sharpen the argument and reduce repetition, especially in the competitive landscape section. Still, this is thoughtful, opinionated writing that trusts the reader to sit with discomfort—exactly what good tech criticism should do.

Tighter editing would sharpen your ass as well, ChatGPT. Maybe you wouldn't need so many tokens if you didn't yap so much.