Tech

Why building big AIs costs billions – and how Chinese startup DeepSeek dramatically changed the calculus

Published on February 1, 2025, by The Conversation

DeepSeek burst on the scene – and may be bursting some bubbles. AP Photo/Andy Wong

Ambuj Tewari, University of Michigan

State-of-the-art artificial intelligence systems like OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude have captured the public imagination by producing fluent text in multiple languages in response to user prompts. Those companies have also captured headlines with the huge sums they’ve invested to build ever more powerful models.

An AI startup from China, DeepSeek, has upset expectations about how much money is needed to build the latest and greatest AIs. In the process, they’ve cast doubt on the billions of dollars of investment by the big AI players.

I study machine learning. DeepSeek’s disruptive debut comes down not to any stunning technological breakthrough but to a time-honored practice: finding efficiencies. In a field that consumes vast computing resources, that has proved to be significant.

Where the costs are

Developing such powerful AI systems begins with building a large language model. A large language model predicts the next word given previous words. For example, if the beginning of a sentence is “The theory of relativity was discovered by Albert,” a large language model might predict that the next word is “Einstein.” Large language models are trained to become good at such predictions in a process called pretraining.
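To make next-word prediction concrete, here is a minimal sketch of a toy predictor that simply counts which word follows which in a tiny training text. It is an illustration only: real large language models use neural networks rather than word counts, and the corpus and function names below are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(follow_counts, previous_word):
    """Return the most frequent follower of previous_word, or None if unseen."""
    followers = follow_counts.get(previous_word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical three-sentence "training corpus" for illustration.
corpus = [
    "the theory of relativity was discovered by albert einstein",
    "albert einstein won the nobel prize in physics",
    "albert camus wrote the stranger",
]
model = train_bigram_model(corpus)
print(predict_next_word(model, "Albert"))
```

In this toy corpus "einstein" follows "albert" more often than "camus" does, so it is the predicted next word. Pretraining plays a similar role at vastly larger scale, adjusting weights instead of counts.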

Pretraining requires a lot of data and computing power. The companies collect data by crawling the web and scanning books. Computing is usually powered by graphics processing units, or GPUs. Why graphics? It turns out that both computer graphics and the artificial neural networks that underlie large language models rely on the same area of mathematics known as linear algebra. Large language models internally store hundreds of billions of numbers called parameters or weights. It is these weights that are modified during pretraining.

Video (https://www.youtube.com/embed/MJQIQJYxey4?wmode=transparent&start=0): Large language models consume huge amounts of computing resources, which in turn means lots of energy.
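The link between linear algebra and weights can be illustrated with a small sketch of a single neural network layer, which is essentially a matrix-vector multiplication plus a bias. The numbers and function names are made up for illustration; production models run the same kind of arithmetic on GPUs with billions of weights.

```python
def dense_layer(weights, bias, inputs):
    """Compute weights @ inputs + bias, producing one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]


def count_parameters(weights, bias):
    """Every entry of the weight matrix and the bias vector is one parameter."""
    return sum(len(row) for row in weights) + len(bias)


# Hypothetical tiny layer: 3 inputs, 2 outputs.
weights = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]]
bias = [0.5, 1.0]
inputs = [1.0, 0.0, 2.0]

outputs = dense_layer(weights, bias, inputs)
print(outputs)                          # [7.5, 1.5]
print(count_parameters(weights, bias))  # 8
```

This layer has 8 parameters. Pretraining nudges numbers like these, across many much larger layers, until the model's predictions improve.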

Pretraining is, however, not enough to yield a consumer product like ChatGPT. A pretrained large language model is usually not good at following human instructions. It might also not be aligned with human preferences. For example, it might output harmful or abusive language, both of which are present in text on the web.

The pretrained model therefore usually goes through additional stages of training. One such stage is instruction tuning where the model is shown examples of human instructions and expected responses. After instruction tuning comes a stage called reinforcement learning from human feedback. In this stage, human annotators are shown multiple large language model responses to the same prompt. The annotators are then asked to point out which response they prefer.
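The article does not say how annotator preferences are turned into a training signal. One commonly used approach, assumed here purely for illustration, is to fit a scoring model so that preferred responses receive higher scores, using a Bradley-Terry style probability. The scores and function names below are hypothetical.

```python
import math


def preference_probability(score_preferred, score_rejected):
    """Bradley-Terry probability that the preferred response beats the other."""
    return 1.0 / (1.0 + math.exp(-(score_preferred - score_rejected)))


def preference_loss(score_preferred, score_rejected):
    """Negative log-likelihood of the annotator's choice; lower is better."""
    return -math.log(preference_probability(score_preferred, score_rejected))


# Hypothetical scores a reward model might assign to two responses.
agrees_with_annotator = preference_loss(2.0, 0.0)
disagrees_with_annotator = preference_loss(0.0, 2.0)
print(agrees_with_annotator < disagrees_with_annotator)  # True
```

The loss is small when the scoring model agrees with the human choice and large when it disagrees, so minimizing it pushes the model toward human preferences.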

It is easy to see how costs add up when building an AI model: hiring top-quality AI talent, building a data center with thousands of GPUs, collecting data for pretraining, and running pretraining on GPUs. Additionally, there are costs involved in data collection and computation in the instruction tuning and reinforcement learning from human feedback stages.

All included, costs for building a cutting edge AI model can soar up to US$100 million. GPU training is a significant component of the total cost.

The expenditure does not stop when the model is ready. When the model is deployed and responds to user prompts, it uses more computation known as test time or inference time compute. Test time compute also needs GPUs. In December 2024, OpenAI announced a new phenomenon they saw with their latest model o1: as test time compute increased, the model got better at logical reasoning tasks such as math olympiad and competitive coding problems.

Slimming down resource consumption

Thus it seemed that the path to building the best AI models in the world was to invest in more computation during both training and inference. But then DeepSeek entered the fray and bucked this trend.

DeepSeek sent shockwaves through the tech financial ecosystem.

Their V-series models, culminating in the V3 model, used a series of optimizations to make training cutting edge AI models significantly more economical. Their technical report states that it took them less than $6 million to train V3. They admit that this cost does not include costs of hiring the team, doing the research, trying out various ideas and data collection. But $6 million is still an impressively small figure for training a model that rivals leading AI models developed with much higher costs.

The reduction in costs was not due to a single magic bullet. It was a combination of many smart engineering choices including using fewer bits to represent model weights, innovation in the neural network architecture, and reducing communication overhead as data is passed around between GPUs.
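To show why using fewer bits per weight saves resources, here is a minimal sketch of 8-bit quantization. It swaps in simple integer quantization with a single shared scale for clarity, which is not DeepSeek's actual scheme (their report describes 8-bit floating-point training). The 100-billion-parameter model size and all function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


def weight_memory_gigabytes(num_parameters, bits_per_weight):
    """Memory needed to store the weights alone, in gigabytes."""
    return num_parameters * bits_per_weight / 8 / 1e9


# Hypothetical weights for illustration.
weights = [0.12, -0.5, 0.33, 1.27, -1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(quantized)
print(weight_memory_gigabytes(100e9, 16))  # 200.0
print(weight_memory_gigabytes(100e9, 8))   # 100.0
```

Halving the bits per weight halves the memory needed to hold the weights, and less data has to move between GPUs, at the price of a small rounding error in each weight.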

It is interesting to note that due to U.S. export restrictions on China, the DeepSeek team did not have access to high performance GPUs like the Nvidia H100. Instead they used Nvidia H800 GPUs, which Nvidia designed to be lower performance so that they comply with U.S. export restrictions. Working with this limitation seems to have unleashed even more ingenuity from the DeepSeek team.

DeepSeek also innovated to make inference cheaper, reducing the cost of running the model. Moreover, they released a model called R1 that is comparable to OpenAI’s o1 model on reasoning tasks.

They released all the model weights for V3 and R1 publicly. Anyone can download and further improve or customize their models. Furthermore, DeepSeek released their models under the permissive MIT license, which allows others to use the models for personal, academic or commercial purposes with minimal restrictions.

Resetting expectations

DeepSeek has fundamentally altered the landscape of large AI models. An open weights model trained economically is now on par with more expensive and closed models that require paid subscription plans.

The research community and the stock market will need some time to adjust to this new reality.

Ambuj Tewari, Professor of Statistics, University of Michigan

This article is republished from The Conversation under a Creative Commons license. Read the original article.

STM Daily News is a vibrant news blog dedicated to sharing the brighter side of human experiences. Emphasizing positive, uplifting stories, the site focuses on delivering inspiring, informative, and well-researched content. With a commitment to accurate, fair, and responsible journalism, STM Daily News aims to foster a community of readers passionate about positive change and engaged in meaningful conversations. Join the movement and explore stories that celebrate the positive impacts shaping our world.

https://stmdailynews.com/

The Unfavorable Semicircle Mystery: The YouTube Channel That Uploaded Tens of Thousands of Cryptic Videos

In 2015, the YouTube channel Unfavorable Semicircle gained attention for its enigmatic and abundant video uploads, totaling over 70,000 before its deletion in 2016. Theories about its purpose vary, from automated content generation to digital art experimentation, leaving its intent unresolved.

Published on March 8, 2026

Unfavorable Semicircle. Photo by Yaroslav Shuraev on Pexels.com

In the vast digital landscape of the internet, strange phenomena occasionally emerge that leave investigators, tech enthusiasts, and everyday viewers scratching their heads. One of the most puzzling cases appeared in 2015, when a mysterious YouTube channel called Unfavorable Semicircle began uploading an astonishing number of cryptic videos.

Within months, the channel had published tens of thousands of bizarre clips, many of which seemed random, incomprehensible, and visually chaotic. But as internet detectives began analyzing the content more closely, they discovered that these videos might not have been random at all.

The Sudden Appearance of an Internet Mystery

The Unfavorable Semicircle channel reportedly appeared in March 2015, with its first uploads arriving in early April.

Almost immediately, the channel began publishing videos at an incredible pace. Observers estimated that the account uploaded thousands of videos per week, sometimes multiple videos per minute. By the time the channel disappeared in early 2016, researchers believed it had uploaded well over 70,000 videos, possibly far more.

The scale alone made the project seem impossible for a human to manage manually.

Strange Visuals and Cryptic Titles

Most of the videos shared similar characteristics:

Extremely short or very long runtime
Abstract visuals such as flashing colors, static, or distorted imagery
Little or no audio, or heavily distorted sounds
Titles made of random characters, symbols, or numbers

To casual viewers, the videos looked like pure digital noise. However, online investigators suspected something more deliberate was happening.

Hidden Images Discovered

The mystery deepened when researchers began extracting individual frames from some videos.

When thousands of frames from certain clips were stitched together, the results sometimes formed coherent images. One of the most famous examples involved a video titled “LOCK.” While the footage appeared chaotic at first, combining the frames revealed a recognizable composite image.
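The article does not describe exactly how investigators stitched the frames together. As a minimal sketch, assume each frame reduces to a single color value and that consecutive frames fill an image row by row. Both assumptions, along with the function name and the choice of width, are hypothetical.

```python
def stitch_frames(frame_colors, width):
    """Arrange one color value per frame into rows of `width` pixels.

    Trailing frames that do not fill a complete row are dropped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    usable = len(frame_colors) - len(frame_colors) % width
    return [frame_colors[i:i + width] for i in range(0, usable, width)]


# Hypothetical clip of 7 frames, each summarized as one grayscale value.
frames = [0, 255, 0, 255, 0, 255, 0]
image = stitch_frames(frames, 3)
for row in image:
    print(row)
```

In a sketch like this, an investigator would try different widths until the rows line up into a recognizable picture.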

This discovery suggested the videos were carefully constructed rather than random uploads.

Theories About the Channel’s Purpose

Because the creator never explained the project, several theories emerged across Reddit, YouTube, and internet forums.

Automated Experiment
Many believe the channel was created using automated software that generated and uploaded content at scale.

Alternate Reality Game (ARG)
Some viewers suspected the channel might be part of a hidden puzzle or digital scavenger hunt.

Encrypted Communication
Others compared the channel to Cold War “numbers stations,” suggesting the videos could contain coded messages.

Digital Art Project
Another theory suggests the channel was an experimental art project exploring algorithms, data, and visual noise.

Despite years of investigation, no single explanation has been confirmed.

Why the Channel Disappeared

In February 2016, YouTube removed the channel, reportedly due to spam or automated activity violations.

By that time, the channel had already become a minor internet legend. Fortunately, some researchers managed to archive a large portion of the videos before they disappeared.

Even today, archived clips continue to circulate online as investigators attempt to decode them.

Other Mysterious YouTube Channels

The Unfavorable Semicircle mystery is not the only strange case on YouTube.

One well-known example is Webdriver Torso, a channel that uploaded hundreds of thousands of videos showing red and blue rectangles with simple beeping sounds. Internet speculation ran wild before Google eventually confirmed it was an internal YouTube testing account.

Another example is AETBX, which posts distorted visuals and unusual audio that some viewers believe contain hidden patterns or encoded information.

These cases highlight how automation, experimentation, and creativity can sometimes blur the line between technology and mystery.

A Digital Mystery That Remains Unsolved

Nearly a decade later, the true purpose behind Unfavorable Semicircle remains unknown.

Was it a sophisticated experiment? A piece of algorithmic art? Or simply an automated test that accidentally captured the internet’s imagination?

Whatever the explanation, the channel stands as a reminder that even in a world filled with billions of videos and endless information, the internet can still produce mysteries that challenge our understanding of technology.

Why Internet Mysteries Still Fascinate Us

Stories like Unfavorable Semicircle capture attention because they combine technology, creativity, and the unknown. They invite people from around the world to collaborate, analyze patterns, and search for meaning hidden in the noise.

And sometimes, the most intriguing part of the mystery is that the answer may never fully be known.

Related Coverage & Further Reading

Dive into “The Knowledge,” where curiosity meets clarity. This playlist, in collaboration with STMDailyNews.com, is designed for viewers who value historical accuracy and insightful learning. Our short videos, ranging from 30 seconds to a minute and a half, make complex subjects easy to grasp in no time. Covering everything from historical events to contemporary processes and entertainment, “The Knowledge” bridges the past with the present. In a world where information is abundant yet often misused, our series aims to guide you through the noise, preserving vital knowledge and truths that shape our lives today. Perfect for curious minds eager to discover the ‘why’ and ‘how’ of everything around us. Subscribe and join in as we explore the facts that matter. https://stmdailynews.com/the-knowledge/

Entertainment

Byron Allen’s Starz Stake Signals Bigger Moves in the Streaming Industry

Byron Allen’s Starz: Byron Allen has acquired a 10.7% stake in Starz Entertainment for approximately $25 million, signaling his long-term media strategy amidst industry consolidation. This investment positions him influentially in the evolving streaming market despite intense competition.

Published on March 7, 2026

Byron Allen, Founder/Chairman/CEO of Allen Media Group

Byron Allen’s Starz investment

Media entrepreneur Byron Allen has taken another step toward expanding his growing media empire. Through his family office, Allen recently acquired a 10.7% stake in Starz Entertainment, purchasing the shares from a fund managed by former U.S. Treasury Secretary Steven Mnuchin.

The transaction, valued at approximately $25 million, gives Allen a significant minority position in the premium cable and streaming platform. While the investment itself may seem modest compared to the billion-dollar deals common in Hollywood, analysts say the move could signal a larger strategy unfolding in the rapidly evolving streaming industry.

Why the Starz Deal Matters

The shares were sold by Mnuchin’s Liberty 77 Capital fund, which previously invested in the company when Starz was still connected to its former parent, Lionsgate.

In 2025, Lionsgate completed a corporate restructuring that separated its operations into two distinct companies:

Lionsgate Studios – responsible for film and television production
Starz – focused on premium cable and streaming services

Following the spin-off, Starz became an independent publicly traded company. As a result, investors are still determining the platform’s long-term value in an increasingly crowded streaming marketplace.

A Streaming Platform With Loyal Audiences

Despite facing intense competition from larger platforms such as Netflix, Disney+, and Amazon Prime Video, Starz continues to maintain a strong subscriber base and recognizable content franchises, including:

Outlander – historical drama series
Power – franchise created by Courtney A. Kemp and executive produced by 50 Cent

Byron Allen’s Long-Term Media Strategy

Allen’s investment strategy has long focused on owning media distribution and infrastructure rather than simply producing content. His holdings have included:

The Weather Channel
Dozens of local television stations across the United States
Multiple niche cable networks and digital platforms

Over the past several years, Allen has also pursued larger acquisitions, reportedly exploring deals involving companies such as Paramount Global and BET Media Group. While those deals did not materialize, they signaled his ambition to expand Allen Media Group into a major force in global media ownership.

The Bigger Picture: Industry Consolidation

Allen’s investment arrives during a time of significant disruption in the entertainment business. Traditional cable television continues to decline as audiences migrate toward streaming platforms. At the same time, major studios and media companies are struggling to make streaming services consistently profitable.

Industry observers believe these pressures could lead to a new wave of consolidation across Hollywood and the streaming sector. Smaller platforms like Starz could become attractive acquisition targets for larger companies seeking additional subscribers and content libraries.

A Potential Hidden Opportunity

For now, Allen’s 10.7% stake does not give him control of Starz. However, it does provide influence as one of the company’s larger shareholders and leaves open the possibility of increasing his ownership in the future.

If consolidation accelerates and streaming platforms begin merging or forming partnerships, assets like Starz could become significantly more valuable. For Byron Allen—whose career began as a stand-up comedian before evolving into one of the most prominent independent media owners in America—the investment may represent another calculated step in a decades-long strategy built around media ownership and long-term growth.

Tech

When ‘Head in the Clouds’ Means Staying Ahead

Head in the Clouds: Cloud is no longer just storage—it’s the intelligent core of modern business. Explore how “cognitive cloud” blends AI and cloud infrastructure to enable real-time, self-optimizing operations, improve customer experiences, and accelerate enterprise modernization.

Published on February 7, 2026, by Daily News Staff

Last Updated on February 7, 2026 by Daily News Staff

(Family Features) You approve a mortgage in minutes, your medical claim is processed without a phone call and an order that left the warehouse this morning lands at your door by dinner. These moments define the rhythm of an economy powered by intelligent cloud infrastructure. Once seen as remote storage, the cloud has become the operational core where data, AI models and autonomous systems converge to make business faster, safer and more human. In this new reality, the smartest companies aren’t looking up to the cloud; they’re operating within it.

Public cloud spending is projected to reach $723 billion in 2025, according to Gartner research, reflecting a 21% increase year over year. At the same time, 90% of organizations are expected to adopt hybrid cloud by 2027. As cloud becomes the universal infrastructure for enterprise operations, the systems being built today aren’t just hosted in the cloud, they’re learning from it and adapting to it. Any cloud strategy that doesn’t account for AI workloads as native risks falling behind, holding the business back from delivering the experiences consumers rely on every day.

After more than a decade of experimentation, most enterprises are still only partway up the curve. Based on Cognizant’s experience, roughly 1 in 5 enterprise workloads has moved to the cloud, while many of the most critical, including core banking, health care claims and enterprise resource planning, remain tied to legacy systems. These older environments were never designed for the scale or intelligence the modern economy demands. The next wave of progress – AI-driven products, predictive operations and autonomous decision-making – depends on cloud architectures designed to support intelligence natively. This means cloud and AI will advance together or not at all.

The Cognitive Cloud: Cloud and AI as One System

For years, many organizations treated migration as a finish line. Applications were lifted and shifted into the cloud with little redesign, trading one set of constraints for another. The result, in many cases, has been higher costs, fragmented data and limited room for innovation.

“Cognitive cloud” represents a new phase of evolution. Imagine every process, from customer service to supply-chain management, powered by AI models that learn, reason and act within secure cloud environments. These systems store and interpret data, detect patterns, anticipate demand and automate decisions at a scale humans simply cannot match.

In this architecture, AI and cloud operate in concert. The cloud provides computing power, scale and governance while AI adds autonomy, context and insight. Together, they form an integrated platform where cloud foundations and AI intelligence combine to enable collaboration between people and systems. This marks the rise of the responsive enterprise: one that senses change, adjusts instantly and builds trust through reliability. Cognitive cloud platforms combine data fabric, observability, FinOps and SecOps into an intelligent core that regulates itself in real time. The result is invisible to consumers but felt in every interaction: fewer errors, faster responses and consistent experiences.

Consumer Impact is Growing

The impact of cognitive cloud is already visible. In health care, 65% of U.S. insurance claims run through modernized, cloud-enabled platforms designed to reduce errors and speed up reimbursement. In the life sciences industry, a pharmaceuticals and diagnostics firm used cloud-native automation to increase clinical trial investigations by 20%, helping get treatments to patients sooner. In food service, intelligent cloud systems have reduced peak staffing needs by 35%, in part through real-time demand forecasting and automated kitchen operation. In insurance, modernization has produced multi-million-dollar savings and faster policy issuance, improving both customer experience and financial performance.

Beneath these outcomes is the same principle: architecture that learns and responds in real time. AI-driven cloud systems process vast volumes of data, identify patterns as they emerge and automate routines so people can focus on innovation, care and service. For businesses, this means fewer bottlenecks and more predictive operations. For consumers, it means smarter, faster, more reliable services, quietly shaping everyday life.

While cloud engineering and AI disciplines remain distinct, their outcomes are increasingly intertwined. The most advanced architectures now treat intelligence and infrastructure as complementary forces, each amplifying the other.

Looking Ahead

This transformation is already underway. Self-correcting systems predict disruptions before they happen, AI models adapt to market shifts in real time and operations learn from every transaction. The organizations mastering this convergence are quietly redefining themselves and the competitive landscape.

Cloud and AI have become interdependent priorities within a shared ecosystem that moves data, decisions and experiences at the speed customers expect. Companies that modernize around this reality and treat intelligence as infrastructure will likely be empowered to reinvent continuously. Those that don’t may spend more time maintaining the systems of yesterday than building the businesses of tomorrow.

Learn more at cognizant.com.

Photo courtesy of Shutterstock

SOURCE: Cognizant
