Tech
Why building big AIs costs billions – and how Chinese startup DeepSeek dramatically changed the calculus

Ambuj Tewari, University of Michigan
State-of-the-art artificial intelligence systems like OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude have captured the public imagination by producing fluent text in multiple languages in response to user prompts. Those companies have also captured headlines with the huge sums they’ve invested to build ever more powerful models.
An AI startup from China, DeepSeek, has upset expectations about how much money is needed to build the latest and greatest AIs. In the process, it has cast doubt on the billions of dollars of investment by the big AI players.
I study machine learning. DeepSeek’s disruptive debut comes down not to any stunning technological breakthrough but to a time-honored practice: finding efficiencies. In a field that consumes vast computing resources, that has proved to be significant.
Where the costs are
Developing such powerful AI systems begins with building a large language model. A large language model predicts the next word given previous words. For example, if the beginning of a sentence is “The theory of relativity was discovered by Albert,” a large language model might predict that the next word is “Einstein.” Large language models are trained to become good at such predictions in a process called pretraining.
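The next-word prediction task can be illustrated with a toy sketch. Real large language models use neural networks with billions of weights rather than word counts, so this is only a minimal illustration of the prediction task itself, not of how any production model works:

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in a tiny
# corpus, then predict the most frequently observed follower.
corpus = (
    "the theory of relativity was discovered by albert einstein . "
    "the theory of evolution was proposed by charles darwin ."
).split()

follower_counts = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    follower_counts[word][next_word] += 1

def predict_next(word):
    """Return the most frequently observed next word, or None."""
    counts = follower_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("albert"))   # einstein
print(predict_next("theory"))   # of
```

Pretraining a real model plays the same game at enormous scale: instead of incrementing counters, it adjusts billions of weights so the model's predictions match the text it sees.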
Pretraining requires a lot of data and computing power. The companies collect data by crawling the web and scanning books. Computing is usually powered by graphics processing units, or GPUs. Why graphics? It turns out that both computer graphics and the artificial neural networks that underlie large language models rely on the same area of mathematics known as linear algebra. Large language models internally store hundreds of billions of numbers called parameters or weights. It is these weights that are modified during pretraining. Large language models consume huge amounts of computing resources, which in turn means lots of energy.
Pretraining is, however, not enough to yield a consumer product like ChatGPT. A pretrained large language model is usually not good at following human instructions. It might also not be aligned with human preferences. For example, it might output harmful or abusive language, which is abundant in text on the web.
The pretrained model therefore usually goes through additional stages of training. One such stage is instruction tuning where the model is shown examples of human instructions and expected responses. After instruction tuning comes a stage called reinforcement learning from human feedback. In this stage, human annotators are shown multiple large language model responses to the same prompt. The annotators are then asked to point out which response they prefer.
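The two post-training stages can be illustrated by the shape of their data. This is a hedged sketch: the field names and the example loss function are illustrative, not any particular lab's actual format or training code. The loss shown is the standard Bradley-Terry pairwise preference loss commonly used to train reward models:

```python
import math

# Instruction tuning: (instruction, expected response) pairs.
instruction_example = {
    "instruction": "Summarize the theory of relativity in one sentence.",
    "response": "It links space and time for observers in relative motion.",
}

# Reinforcement learning from human feedback: an annotator sees two
# responses to the same prompt and marks which one they prefer.
preference_example = {
    "prompt": "Explain why the sky is blue.",
    "chosen": "Sunlight scatters off air molecules; blue scatters most.",
    "rejected": "The sky is blue because it reflects the ocean.",
}

# A reward model is trained so it scores the chosen response above the
# rejected one. The Bradley-Terry loss on a single preference pair:
def preference_loss(score_chosen, score_rejected):
    return -math.log(1 / (1 + math.exp(-(score_chosen - score_rejected))))

# A reward model that already ranks the pair correctly gets a small loss:
print(round(preference_loss(2.0, -1.0), 3))  # 0.049
```

Minimizing this loss over many annotated pairs teaches the reward model human preferences, which then guide the language model's further training.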
It is easy to see how costs add up when building an AI model: hiring top-quality AI talent, building a data center with thousands of GPUs, collecting data for pretraining, and running pretraining on GPUs. Additionally, there are costs involved in data collection and computation in the instruction tuning and reinforcement learning from human feedback stages.
All included, the cost of building a cutting-edge AI model can run as high as US$100 million. GPU training is a significant component of the total cost.
The expenditure does not stop when the model is ready. When the model is deployed and responds to user prompts, it uses more computation, known as test-time or inference-time compute. Test-time compute also needs GPUs. In December 2024, OpenAI announced a new phenomenon they saw with their latest model, o1: as test-time compute increased, the model got better at logical reasoning tasks such as math olympiad and competitive coding problems.
Slimming down resource consumption
Thus it seemed that the path to building the best AI models in the world was to invest in more computation during both training and inference. But then DeepSeek entered the fray and bucked this trend.
Their V-series models, culminating in the V3 model, used a series of optimizations to make training cutting-edge AI models significantly more economical. Their technical report states that it took them less than $6 million to train V3. They admit that this cost does not include the costs of hiring the team, doing the research, trying out various ideas and data collection. But $6 million is still an impressively small figure for training a model that rivals leading AI models developed with much higher costs.
The reduction in costs was not due to a single magic bullet. It was a combination of many smart engineering choices including using fewer bits to represent model weights, innovation in the neural network architecture, and reducing communication overhead as data is passed around between GPUs.
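The first of those choices, using fewer bits to represent weights, can be sketched in a few lines. Note that DeepSeek's report describes FP8 mixed-precision training, a more sophisticated scheme than the simple 8-bit integer quantization shown here; this toy example only illustrates the underlying memory-versus-precision trade-off:

```python
# Toy illustration of "fewer bits per weight": store weights as 8-bit
# integers plus one shared scale factor instead of 32-bit floats,
# cutting memory per weight by 4x at the cost of small rounding errors.

weights = [0.12, -0.57, 0.33, 0.91, -0.08, -0.44]

# Quantize: map the range [-max|w|, +max|w|] onto integers -127..127.
scale = max(abs(w) for w in weights) / 127
quantized = [round(w / scale) for w in weights]

# Dequantize to approximate the original values.
recovered = [q * scale for q in quantized]

max_error = max(abs(w - r) for w, r in zip(weights, recovered))
print(quantized)               # small integers: 1 byte each, not 4
print(f"{max_error:.5f}")      # worst-case rounding error <= scale/2
```

Across hundreds of billions of weights, shrinking each one from 32 (or 16) bits down to 8 directly reduces memory traffic and GPU requirements, which is where much of the training cost lives.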
Notably, due to U.S. export restrictions on China, the DeepSeek team did not have access to high-performance GPUs like the Nvidia H100. Instead they used Nvidia H800 GPUs, which Nvidia designed with lower performance to comply with the export restrictions. Working within this limitation seems to have unleashed even more ingenuity from the DeepSeek team.
DeepSeek also innovated to make inference cheaper, reducing the cost of running the model. Moreover, they released a model called R1 that is comparable to OpenAI’s o1 model on reasoning tasks.
They released all the model weights for V3 and R1 publicly. Anyone can download and further improve or customize their models. Furthermore, DeepSeek released their models under the permissive MIT license, which allows others to use the models for personal, academic or commercial purposes with minimal restrictions.
Resetting expectations
DeepSeek has fundamentally altered the landscape of large AI models. An open weights model trained economically is now on par with more expensive and closed models that require paid subscription plans.
The research community and the stock market will need some time to adjust to this new reality.
Ambuj Tewari, Professor of Statistics, University of Michigan
This article is republished from The Conversation under a Creative Commons license. Read the original article.
STM Daily News is a vibrant news blog dedicated to sharing the brighter side of human experiences. Emphasizing positive, uplifting stories, the site focuses on delivering inspiring, informative, and well-researched content. With a commitment to accurate, fair, and responsible journalism, STM Daily News aims to foster a community of readers passionate about positive change and engaged in meaningful conversations. Join the movement and explore stories that celebrate the positive impacts shaping our world.
home improvement
Stay Protected from Cyberattacks: Simple Safeguards to Reduce Cyber Intrusions and Real-World Losses
Connected homes are becoming the norm with millions of Americans relying on Wi-Fi networks, mobile apps and smart devices to manage everything from door locks to thermostats. As convenience increases, so does exposure, and basic cybersecurity practices can help reduce both digital and physical risks.
Last Updated on May 9, 2026 by Daily News Staff
Connected homes are becoming the norm with millions of Americans relying on Wi-Fi networks, mobile apps and smart devices to manage everything from door locks to thermostats.
As convenience increases, so does exposure, and the experts at multiple-line insurance carrier Mercury Insurance are reminding homeowners that basic cybersecurity practices can help reduce both their digital and physical risks.
“Smart-home technology is incredibly useful, but it also expands the number of entry points into your home – not just digitally, but physically,” said Dustin Howard, head of information security at Mercury Insurance. “The good news is that many of the most effective protections are simple, proactive steps that homeowners can take today.”
Smart-home adoption continues to accelerate with recent studies showing roughly 70% of U.S. households now use at least one connected device. From video doorbells to smart garage doors, these tools provide visibility and control, but if not properly secured, they can also create vulnerabilities that bad actors may exploit.
Consider these cybersecurity best practices for connected homes:
- Secure your Wi-Fi network: Use strong, unique passwords – at least 14-16 characters with a mixture of letters, numbers and symbols – and enable WPA3 encryption when available to prevent unauthorized access. Also turn on your router’s built-in firewall and disable Wi-Fi Protected Setup (WPS).
- Update devices regularly: Firmware and software updates often include critical security patches that close known vulnerabilities. Turn on automatic updates for operating systems, applications, browsers and smart home devices such as thermostats and cameras. If devices are no longer able to update, it may be time to replace them to avoid compromising security.
- Enable multi-factor authentication (MFA): Adding a second layer of verification significantly reduces the risk of unauthorized account access. Enable MFA for email accounts, banking and financial apps, cloud storage and social media accounts, and use an authenticator app for confirmation rather than receiving a code via text or email.
- Segment your network: Consider placing smart-home devices, including televisions, security cameras and speakers, on a separate network from personal devices like laptops and phones. Also create a guest network for visitors to use to help further protect your main network.
- Change default settings: Many devices come with default usernames and passwords that are widely known and easily exploited. Change the defaults on your router as well as login credentials for any new devices, making admin accounts more difficult to target.
- Monitor device activity: Regularly review connected devices and remove any that are unfamiliar or no longer in use. If your router supports it, enable notifications for new device connections for real-time visibility.
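The password advice above can also be followed programmatically. Here is a minimal sketch using Python's standard secrets module; the 16-character default and the particular symbol set are illustrative choices that mirror the guidance, so adjust them to your own policy:

```python
import secrets
import string

SYMBOLS = "!@#$%^&*-_"  # illustrative symbol set; adjust to taste

def make_password(length=16):
    """Generate a random password mixing letters, digits and symbols."""
    alphabet = string.ascii_letters + string.digits + SYMBOLS
    while True:
        pw = "".join(secrets.choice(alphabet) for _ in range(length))
        # Require at least one of each character class so the result
        # actually satisfies the letters/numbers/symbols guidance.
        if (any(c.islower() for c in pw) and any(c.isupper() for c in pw)
                and any(c.isdigit() for c in pw)
                and any(c in SYMBOLS for c in pw)):
            return pw

print(make_password())  # e.g. a random 16-character string
```

The secrets module uses the operating system's cryptographic random source, which is what makes it suitable for passwords, unlike the general-purpose random module.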
“As homes become more connected, cybersecurity becomes a core part of overall home protection,” Howard said. “It’s not just about protecting your data – it’s about protecting your property, your privacy and your peace of mind.”
With smart-home technology expected to continue expanding, homeowners should treat cybersecurity as a routine part of home maintenance – just like checking smoke detectors or locking doors – to stay ahead of evolving risks.
For more information about protecting your home from cyberattacks, visit mercuryinsurance.com/resources.
Photos courtesy of Shutterstock
Our Lifestyle section on STM Daily News is a hub of inspiration and practical information, offering a range of articles that touch on various aspects of daily life. From tips on family finances to guides for maintaining health and wellness, we strive to empower our readers with knowledge and resources to enhance their lifestyles. Whether you’re seeking outdoor activity ideas, fashion trends, or travel recommendations, our lifestyle section has got you covered. Visit us today at https://stmdailynews.com/category/lifestyle/ and embark on a journey of discovery and self-improvement.
The Knowledge
Artemis II crew brought a human eye and storytelling vision to the photos they took on their mission
Artemis II’s astronaut photos show how human perspective and storytelling make space imagery feel authentic, especially in an era of AI-generated visuals.

Christye Sisson, Rochester Institute of Technology
In early April 2026, the Artemis II mission captivated me and millions of people watching from across the world. The crew’s courage, skill and infectious wonder served as tangible proof of human persistence and technological achievement, all against the mysterious backdrop of space.
People back on Earth got to witness the mission through remarkable photos of space captured by astronauts. Images created and shared by astronauts underscore how photography builds a powerful, authentic connection that goes beyond what technology alone can capture.
As a photographer and the director of the Rochester Institute of Technology’s School of Photographic Arts and Sciences, I am especially drawn to how these photographs have been at the center of the public’s collective experience of this mission.
In an era when image authenticity is often questioned and with the capabilities of autonomous, AI-driven imaging, NASA’s choice to train astronauts in photography has placed meaning over convenience and prioritized their human perspectives and creativity.
Capturing space from the crew’s perspective
Photography was not originally a high priority in NASA’s Apollo era. The astronauts took photographs only if they had the chance and all their other tasks were complete.
Thanks in large part to the public response to Apollo images such as “Earthrise” and the “Blue Marble,” which are widely credited with helping catalyze the modern environmental movement, NASA shifted its approach and began training its astronauts in photographic practices to help capture the public’s imagination.
The Artemis II mission’s photographs have helped cut through the increasing volume of artificially generated images circulating on social media. NASA’s social media releases of the crew’s photographs have garnered thousands of shares and comments.
This excitement could be explained by the novelty of photos from space, but these images also distinguish themselves as the products of astronauts experiencing these sights and interpreting them through their photographs. That difference makes it important to ask where technology ends and humanity begins.
Human perspective versus AI tools
Photography has long integrated AI-powered software and data-driven tools in a variety of ways: to process raw images, fill in missing color information, drive precise focus and guide image editing, among others. These modern technological assists help human photographers realize their vision.
Artificial intelligence is also increasingly capable of operating machinery competently and autonomously, from cars to drones and cameras.
And AI can generate convincing, realistic images and videos from nothing more than a text prompt, using readily available tools.
Researchers train AI to mimic patterns informed by millions of sample images, and the algorithm can then either take or create a photograph based on what it predicts would be the most likely version of a successful, believable image.
Human-created photos are rooted in direct observation, intent and lived experience, while AI images – or choices made by AI-driven tools – are not. While both can produce compelling and believable visuals, the human photographs carry emotional power because the photographer is drawing from their experiences and perspective in that moment to tell an authentic story.
Artemis II photographs resonate, not only because they are historic, but because they reflect the deliberate choices and intent of a human being in that specific moment and context. The exposure, camera setting, lens choice and composition are all dictated by the astronaut’s vision, skill, perspective and experience. Each image is unique in comparison with the others. These choices give the images narrative power, anchoring them in human perspective.
Images to tell a story
Photographers choose what to include in the final version of their image to tell a story. In the Artemis II images, this human perspective comes out. In the “Earthset” photo, you see a striking juxtaposition of the Moon’s monochromatic, textured surface in the foreground against a slivered, bright Earth.
The choice to include both in the frame contrasts these objects literally and figuratively, inviting comparison. It creates a narrative where Earth is contrasted against the Moon – life is contrasted against the absence of it.
Another photo shows the nightside of the whole Earth, featuring the Sun’s halo, auroras and city lights. The choice to include the subtle framing of the window of the capsule in the lower left corner reminds the viewer where and how this image was captured: by a human, inside a capsule, hurtling through space. That detail grounds the photograph in the human perspective.
Both photos are reminiscent of Earthrise and the Blue Marble. These past images hold a place in the global collective consciousness, shaped by a shared historical moment.
The Artemis II photographs are anchored in this collective moment of lived human experience, yet also shaped by each astronaut’s viewpoint. The crew’s unique perspectives exemplify photography’s transformative power by inviting viewers to engage emotionally and intellectually with their journey. These photographs share the astronauts’ awe and wonder and affirm the value of human creativity and its ability to connect us in a captured moment.
Christye Sisson, Professor of Photographic Sciences, Rochester Institute of Technology
This article is republished from The Conversation under a Creative Commons license. Read the original article.
Dive into “The Knowledge,” where curiosity meets clarity. This playlist, in collaboration with STMDailyNews.com, is designed for viewers who value historical accuracy and insightful learning. Our short videos, ranging from 30 seconds to a minute and a half, make complex subjects easy to grasp in no time. Covering everything from historical events to contemporary processes and entertainment, “The Knowledge” bridges the past with the present. In a world where information is abundant yet often misused, our series aims to guide you through the noise, preserving vital knowledge and truths that shape our lives today. Perfect for curious minds eager to discover the ‘why’ and ‘how’ of everything around us. Subscribe and join in as we explore the facts that matter. https://stmdailynews.com/the-knowledge/
Food
CropX Launches CropX Vision, an AI Tool for Vineyard Water Stress Monitoring
Last Updated on April 30, 2026 by Daily News Staff
CropX Technologies has launched CropX Vision, a new AI-powered vineyard monitoring solution designed to help growers measure vine water stress using a single canopy image.
The new tool uses computer vision and agronomic modeling to estimate leaf water potential from a smartphone photo, giving growers and agronomists a faster and more scalable way to assess plant stress across entire vineyard blocks. The company says the goal is to support better irrigation decisions throughout the growing season.
CropX Vision is available globally on both iOS and Android. The platform is also integrated into the broader CropX application, allowing users to combine canopy-based stress insights with other agronomic data in one place.
According to CropX, the technology offers an in-season alternative to traditional pressure chamber measurements, which can be more time-consuming and limited in sampling range. Instead of relying on specialized equipment, growers can capture a single image in the field and receive plant-level water stress insights.
The product builds on technology originally developed by Tule Technologies, a California-based precision irrigation company acquired by CropX in 2023. Tule’s canopy sensing technology has already been used in California vineyards, and CropX is now expanding that capability to growers worldwide.
CropX says the global release reflects its continued focus on data-driven tools that help growers improve productivity while managing water more efficiently.
CropX Vision is now available for download via the app stores:
- iOS: https://apps.apple.com/nl/app/cropx-vision/id6756921607?l=en-GB
- Android: https://play.google.com/store/apps/details?id=com.cropx.cropx_vision&pcampaignid=web_share
For more information, visit CropX Vision.
Visit the Food and Drink section on STM Daily News for the latest food news, beverage trends, restaurant stories, seasonal recipes, culinary events, and community-driven lifestyle coverage.
