Tech

Why building big AIs costs billions – and how Chinese startup DeepSeek dramatically changed the calculus

DeepSeek burst on the scene – and may be bursting some bubbles. AP Photo/Andy Wong

Ambuj Tewari, University of Michigan

State-of-the-art artificial intelligence systems like OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude have captured the public imagination by producing fluent text in multiple languages in response to user prompts. Those companies have also captured headlines with the huge sums they’ve invested to build ever more powerful models.

An AI startup from China, DeepSeek, has upset expectations about how much money is needed to build the latest and greatest AIs. In the process, they’ve cast doubt on the billions of dollars of investment by the big AI players.

I study machine learning. DeepSeek’s disruptive debut comes down not to any stunning technological breakthrough but to a time-honored practice: finding efficiencies. In a field that consumes vast computing resources, that has proved to be significant.

Where the costs are

Developing such powerful AI systems begins with building a large language model. A large language model predicts the next word given previous words. For example, if the beginning of a sentence is “The theory of relativity was discovered by Albert,” a large language model might predict that the next word is “Einstein.” Large language models are trained to become good at such predictions in a process called pretraining.
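Next-word prediction can be illustrated with a deliberately tiny sketch. The code below is not how a large language model works internally; it simply counts which word follows which in a made-up three-sentence corpus and predicts the most frequent follower. The function names and the corpus are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def predict_next_word(model, previous_word):
    """Return the most frequent follower of previous_word, or None if unseen."""
    followers = model.get(previous_word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy training text, made up for this illustration.
corpus = [
    "the theory of relativity was discovered by albert einstein",
    "albert einstein won the nobel prize in physics",
    "albert camus wrote the stranger",
]
model = train_bigram_model(corpus)
print(predict_next_word(model, "Albert"))  # einstein
```

A real large language model replaces the lookup table with a neural network that conditions on all previous words, not just the last one, but the training goal is the same: get better at guessing what comes next.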

Pretraining requires a lot of data and computing power. The companies collect data by crawling the web and scanning books. Computing is usually powered by graphics processing units, or GPUs. Why graphics? It turns out that both computer graphics and the artificial neural networks that underlie large language models rely on the same area of mathematics known as linear algebra. Large language models internally store hundreds of billions of numbers called parameters or weights. It is these weights that are modified during pretraining.

Large language models consume huge amounts of computing resources, which in turn means lots of energy.
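The link between weights and linear algebra can be made concrete with a small sketch. It assumes a single fully connected layer with no biases, which is a simplification, and the helper names are made up for illustration. The layer's work is one matrix-vector multiplication, which is the kind of operation GPUs are built to do quickly.

```python
def matvec(weights, inputs):
    """Multiply a weight matrix (a list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def count_parameters(layer_sizes):
    """Number of weights in a stack of fully connected layers (biases ignored)."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# A layer mapping 3 inputs to 2 outputs stores 2 x 3 = 6 weights.
weights = [[0.5, -1.0, 2.0],
           [1.5, 0.0, -0.5]]
inputs = [1.0, 2.0, 3.0]

print(matvec(weights, inputs))    # [4.5, 0.0]
print(count_parameters([3, 2]))   # 6
```

Pretraining nudges the numbers in `weights` so that the outputs become more useful. A large language model does the same thing with hundreds of billions of such numbers.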

Pretraining is, however, not enough to yield a consumer product like ChatGPT. A pretrained large language model is usually not good at following human instructions. It might also not be aligned with human preferences. For example, it might output harmful or abusive language, both of which are present in text on the web.

The pretrained model therefore usually goes through additional stages of training. One such stage is instruction tuning, where the model is shown examples of human instructions and expected responses. After instruction tuning comes a stage called reinforcement learning from human feedback. In this stage, human annotators are shown multiple large language model responses to the same prompt and asked to point out which response they prefer.
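How a stated preference becomes a training signal is not spelled out here. One widely used approach, assumed for this sketch, is to train a separate reward model with a pairwise (Bradley-Terry style) loss: the loss is small when the reward model scores the annotator's preferred response higher than the rejected one. The function names and the example scores are hypothetical.

```python
import math

def preference_probability(reward_preferred, reward_other):
    """Modeled probability that the preferred response beats the other one."""
    return 1.0 / (1.0 + math.exp(-(reward_preferred - reward_other)))

def preference_loss(reward_preferred, reward_other):
    """Negative log-likelihood of the annotator's choice under the model."""
    return -math.log(preference_probability(reward_preferred, reward_other))

# Hypothetical reward scores for two responses to the same prompt.
agrees_with_annotator = preference_loss(reward_preferred=2.0, reward_other=0.0)
disagrees_with_annotator = preference_loss(reward_preferred=0.0, reward_other=2.0)

print(agrees_with_annotator < disagrees_with_annotator)  # True
```

Minimizing this loss over many annotated pairs pushes the reward model toward human preferences, and the language model is then tuned to produce responses that score well.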

It is easy to see how costs add up when building an AI model: hiring top-quality AI talent, building a data center with thousands of GPUs, collecting data for pretraining, and running pretraining on GPUs. Additionally, there are costs involved in data collection and computation in the instruction tuning and reinforcement learning from human feedback stages.

All included, costs for building a cutting-edge AI model can soar up to US$100 million. GPU training is a significant component of the total cost.

The expenditure does not stop when the model is ready. When the model is deployed and responds to user prompts, it uses more computation, known as test time or inference time compute. Test time compute also needs GPUs. In December 2024, OpenAI announced a new phenomenon they saw with their latest model o1: as test time compute increased, the model got better at logical reasoning tasks such as math olympiad and competitive coding problems.
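Why more test time compute can help is easiest to see with a simple stand-in technique. The sketch below is not a description of how o1 works. It assumes the model is sampled several times independently on the same question and the majority answer is kept, then computes the chance that more than half of the samples are correct. The 60% single-sample accuracy is a made-up number.

```python
from math import comb

def majority_correct_probability(p_correct, n_samples):
    """Probability that more than half of n independent samples are correct."""
    needed = n_samples // 2 + 1
    return sum(
        comb(n_samples, k) * p_correct**k * (1 - p_correct) ** (n_samples - k)
        for k in range(needed, n_samples + 1)
    )

# Hypothetical model that answers correctly 60% of the time per sample.
for n_samples in (1, 3, 9, 27):
    print(n_samples, round(majority_correct_probability(0.6, n_samples), 3))
```

Each extra sample costs another run of the model on a GPU, so accuracy gained this way is paid for in inference compute.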

Slimming down resource consumption

Thus it seemed that the path to building the best AI models in the world was to invest in more computation during both training and inference. But then DeepSeek entered the fray and bucked this trend.

DeepSeek sent shockwaves through the tech financial ecosystem.

Their V-series models, culminating in the V3 model, used a series of optimizations to make training cutting-edge AI models significantly more economical. Their technical report states that it took them less than $6 million to train V3. They admit that this cost does not include the costs of hiring the team, doing the research, trying out various ideas and collecting data. But $6 million is still an impressively small figure for training a model that rivals leading AI models developed at much higher cost.

The reduction in costs was not due to a single magic bullet. It was a combination of many smart engineering choices including using fewer bits to represent model weights, innovation in the neural network architecture, and reducing communication overhead as data is passed around between GPUs.
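The effect of using fewer bits per weight can be estimated with back-of-the-envelope arithmetic. The sketch below assumes a hypothetical model with 100 billion weights, a round number chosen for illustration and not a figure for V3, and counts only the memory needed to store the weights themselves.

```python
def weight_memory_gb(num_parameters, bits_per_weight):
    """Gigabytes needed to store the weights at a given precision."""
    bytes_total = num_parameters * bits_per_weight / 8
    return bytes_total / 1e9

HYPOTHETICAL_PARAMETERS = 100e9  # assumed size, for illustration only

for bits in (32, 16, 8):
    print(f"{bits}-bit weights: {weight_memory_gb(HYPOTHETICAL_PARAMETERS, bits):.0f} GB")
```

Halving the bits per weight halves the memory the weights occupy and the volume of data moved between GPUs, which is why lower precision pairs naturally with the other efficiency measures listed above.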

It is interesting to note that due to U.S. export restrictions on China, the DeepSeek team did not have access to high performance GPUs like the Nvidia H100. Instead they used Nvidia H800 GPUs, which Nvidia designed to be lower performance so that they comply with U.S. export restrictions. Working with this limitation seems to have unleashed even more ingenuity from the DeepSeek team.

DeepSeek also innovated to make inference cheaper, reducing the cost of running the model. Moreover, they released a model called R1 that is comparable to OpenAI’s o1 model on reasoning tasks.

They released all the model weights for V3 and R1 publicly. Anyone can download and further improve or customize their models. Furthermore, DeepSeek released their models under the permissive MIT license, which allows others to use the models for personal, academic or commercial purposes with minimal restrictions.

Resetting expectations

DeepSeek has fundamentally altered the landscape of large AI models. An open-weights model trained economically is now on par with more expensive, closed models that require paid subscription plans.

The research community and the stock market will need some time to adjust to this new reality.

Ambuj Tewari, Professor of Statistics, University of Michigan

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Breaking News

BREAKING: NASA’s Artemis II Countdown Underway as Moon Mission Launch Window Opens

Last Updated on April 1, 2026 by Daily News Staff

Published: April 1, 2026 | By: STM Daily News

Artemis II countdown is underway. Rocket on launch pad ready for launch.

Source: NASA/John Kraus

Artemis II countdown is underway

CAPE CANAVERAL, Fla. — The countdown has officially begun for Artemis II, NASA’s highly anticipated return to crewed lunar missions, marking a historic step toward sending humans back to the Moon for the first time in more than 50 years.

At precisely 4:44 p.m. EDT, the countdown clock started ticking at Kennedy Space Center, targeting a 6:24 p.m. launch on Wednesday, April 1. The mission will be the first crewed flight of NASA’s powerful Space Launch System (SLS) rocket and Orion spacecraft.

🚀 Final Preparations Underway

Inside the Rocco Petrone Launch Control Center, engineers and launch teams are actively powering up flight systems, verifying communications, and preparing for one of the most complex fueling operations ever attempted.

The rocket will be loaded with hundreds of thousands of gallons of super-cooled liquid hydrogen and liquid oxygen, a delicate process requiring precise timing and coordination.

Meanwhile, at Launch Complex 39B, crews are filling the sound suppression system—a massive water tank designed to release a high-volume deluge at liftoff, protecting the rocket from extreme acoustic energy generated during launch.

Source: NASA / Bill Ingalls

👨‍🚀 Crew in Quarantine Ahead of Launch

The four-person crew remains in quarantine at the Neil A. Armstrong Operations and Checkout Building, undergoing final medical checks and mission briefings.

  • Reid Wiseman – Commander
  • Victor Glover – Pilot
  • Christina Koch – Mission Specialist
  • Jeremy Hansen – Mission Specialist (Canadian Space Agency)

Glover, a Southern California native and Ontario High School graduate, is set to make history as the first Black astronaut to travel to lunar space—bringing a powerful local connection to this global mission.

The crew is following a controlled sleep and nutrition schedule while receiving continuous updates on launch conditions and spacecraft readiness.

🌤️ Weather Conditions 80% Favorable

NASA and U.S. Space Force weather teams are closely monitoring conditions ahead of fueling operations. Current forecasts show an 80% chance of favorable weather, with concerns focused on potential cloud cover and high winds.

Weather will continue to be evaluated as the countdown progresses.

📺 How to Watch the Launch Live

NASA will provide live coverage throughout launch day:

  • 7:45 a.m. EDT – Tanking operations coverage begins (NASA YouTube)
  • 12:50 p.m. EDT – Full launch coverage begins on NASA+

Viewers can also follow along via NASA’s official social media platforms for real-time updates.

🚀 Artemis II Mission Snapshot

  • Mission: Artemis II
  • Agency: NASA
  • Launch Vehicle: Space Launch System (SLS)
  • Spacecraft: Orion
  • Launch Site: Kennedy Space Center (LC-39B)
  • Mission Duration: ~10 days
  • Objective: Crewed lunar flyby (no landing)
  • Commander: Reid Wiseman
  • Pilot: Victor Glover
  • Mission Specialists: Christina Koch, Jeremy Hansen

🌕 A Mission Decades in the Making

Artemis II will send astronauts on a 10-day journey around the Moon and back, serving as a critical test flight for future lunar landings under NASA’s Artemis program.

The mission is designed to validate deep space navigation, life support systems, and spacecraft performance—laying the groundwork for Artemis III, which aims to return humans to the lunar surface.

As the countdown continues, all eyes are now on Florida’s Space Coast for what could become one of the most significant spaceflight milestones of the 21st century.

🧾 Sources & References

  • NASA – Artemis II Mission Updates and Press Materials
  • NASA Kennedy Space Center Launch Operations Briefings
  • NASA Artemis Program Overview
  • Official NASA Broadcast and Launch Coverage

❓ Frequently Asked Questions

What is Artemis II?

Artemis II is NASA’s first crewed mission in its Artemis program, sending astronauts on a flight around the Moon to test systems for future lunar landings.

When is the Artemis II launch?

The mission is targeting a launch on April 1, 2026, from Kennedy Space Center in Florida.

Will Artemis II land on the Moon?

No, Artemis II is a lunar flyby mission designed to test spacecraft systems before a future landing mission.

Who is Victor Glover?

Victor Glover is a NASA astronaut and Artemis II pilot who will become the first Black astronaut to travel to lunar space.

Stay with STM Daily News for continuing coverage of Artemis II and NASA’s return to the Moon.

Food

CropX Launches CropX Vision, an AI Tool for Vineyard Water Stress Monitoring

CropX has launched CropX Vision, a new AI-powered vineyard monitoring tool that helps growers measure water stress from a single canopy photo.
CropX Vision enables vineyard growers to measure leaf water potential directly from canopy images, delivering scalable, AI-powered vine water stress insights from a single picture

CropX Technologies has launched CropX Vision, a new AI-powered vineyard monitoring solution designed to help growers measure vine water stress using a single canopy image.

The new tool uses computer vision and agronomic modeling to estimate leaf water potential from a smartphone photo, giving growers and agronomists a faster and more scalable way to assess plant stress across entire vineyard blocks. The company says the goal is to support better irrigation decisions throughout the growing season.

CropX Vision is available globally on both iOS and Android. The platform is also integrated into the broader CropX application, allowing users to combine canopy-based stress insights with other agronomic data in one place.

According to CropX, the technology offers an in-season alternative to traditional pressure chamber measurements, which can be more time-consuming and limited in sampling range. Instead of relying on specialized equipment, growers can capture a single image in the field and receive plant-level water stress insights.

The product builds on technology originally developed by Tule Technologies, a California-based precision irrigation company acquired by CropX in 2023. Tule’s canopy sensing technology has already been used in California vineyards, and CropX is now expanding that capability to growers worldwide.

CropX says the global release reflects its continued focus on data-driven tools that help growers improve productivity while managing water more efficiently.

CropX Vision is now available for download via the app stores.

For more information, visit CropX Vision.

Consumer Corner

Unilever, Google Cloud Strike Five-Year AI Partnership to Reshape Consumer Goods Marketing

Unilever and Google Cloud announced a five-year partnership to expand AI, cloud, and data capabilities, aiming to reshape brand discovery, marketing intelligence, and consumer goods commerce.

Unilever and Google Cloud have announced a five-year partnership aimed at accelerating the consumer goods giant’s digital transformation through AI, cloud infrastructure, and data modernization.

The deal centers on helping Unilever strengthen brand discovery, marketing measurement, and AI-driven customer engagement across its global portfolio, which includes Dove, Vaseline, and Hellmann’s. A major focus is the rise of conversational and agentic commerce, where consumers increasingly discover and shop for products through AI-powered interactions rather than traditional search and browsing.

As part of the agreement, Unilever will migrate its integrated data and cloud platform to Google Cloud, creating what the companies describe as an AI-first digital backbone. That system is intended to help Unilever move faster on demand generation, turn data into actionable insights, and respond more quickly to shifts in the market.

The partnership is built around three pillars:

  • Agentic commerce and marketing intelligence
  • An integrated data and cloud foundation
  • Advanced AI adoption across the business

Unilever leadership framed the move as part of a broader shift in which technology is becoming central to value creation in the fast-moving consumer goods sector. Google Cloud said the collaboration will use advanced AI models, including Gemini, to help modernize business processes and improve agility.

Unilever said it generated €50.5 billion in sales in 2025, operates in more than 190 countries, and reaches 3.7 billion people every day.

What to watch for

  • How “agentic commerce” changes consumer brand discovery
  • Whether other major CPG companies follow with similar AI-cloud partnerships
  • How AI-backed marketing measurement impacts ad efficiency and conversion

Source: PR Newswire / Google Cloud
