Tech

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Stanford adjunct professor Zain Asgar, a founder with a successful exit behind him, just raised an $80 million Series A for a startup that solves the AI inference bottleneck problem in an astute way. The round was led by Menlo Ventures.

The company, Gimlet Labs, has created what it claims is the first and only “multi-silicon inference cloud,” software that allows an AI workload to run simultaneously across diverse types of hardware. It can split an AI app’s work across both traditional CPUs and AI-tuned GPUs, as well as high-memory systems.

“We basically run across whatever different hardware that’s available,” Asgar told TechCrunch. 

A single agent may chain together multiple steps, and each “requires different hardware: Inference is compute-bound; decode is memory-bound; and tool calls are network-bound,” writes lead investor, Menlo’s Tim Tully, in a blog post about the funding.  
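
As a rough illustration of the idea in Tully's quote, the sketch below routes each step of an agent to the hardware pool that matches its bottleneck. The pool names, the Step class, and the route function are hypothetical and are not drawn from Gimlet's software. The mapping simply restates the quote, and pairing network-bound tool calls with CPUs is an assumption made for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping from a step's bottleneck to a hardware pool.
# Pool names are illustrative assumptions, not Gimlet's configuration.
POOL_FOR_BOTTLENECK = {
    "compute": "gpu_pool",         # compute-bound work
    "memory": "high_memory_pool",  # memory-bound work, such as decode
    "network": "cpu_pool",         # network-bound work, such as tool calls (assumed)
}


@dataclass(frozen=True)
class Step:
    name: str
    bottleneck: str  # one of "compute", "memory", "network"


def route(steps):
    """Return (step name, pool) pairs, one per step, in the original order."""
    return [(s.name, POOL_FOR_BOTTLENECK[s.bottleneck]) for s in steps]


# The three step types named in the quote, chained as one agent.
agent = [
    Step("inference", "compute"),
    Step("decode", "memory"),
    Step("tool_call", "network"),
]

for name, pool in route(agent):
    print(f"{name} -> {pool}")
```

A real orchestrator would also have to weigh availability, cost, and data movement between pools. This sketch only shows the basic matching of a step to a hardware type.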

No chip yet does it all, but as new hardware gets rolled out, and aging GPUs get redeployed, “the multi-silicon fleet is ready — it’s just missing the software layer to make it work.” That’s what Tully believes Gimlet Labs offers.

If the current deploy-more-compute trend continues, McKinsey estimates data center spending will tally nearly $7 trillion by 2030. Asgar says that apps are using the hardware already deployed only “somewhere between 15 to 30 percent” of the time.

“Another way to think about this: you’re wasting hundreds of billions of dollars because you’re just leaving idle resources,” he said. “Our goal was basically to try to figure out how you can get AI workloads to be 10x more efficient than ever, today.” 
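
To make the utilization figures concrete, here is a small back-of-the-envelope sketch. It takes only the 15 and 30 percent figures Asgar cites. The function names are hypothetical, and the "headroom" number is a theoretical ceiling that assumes idle time could be filled perfectly, not a figure from Gimlet.

```python
def idle_fraction(utilization):
    """Share of the time that deployed hardware sits unused."""
    return 1.0 - utilization


def headroom(utilization):
    """Theoretical ceiling on how much more work the same hardware
    could do if it were busy 100% of the time (an idealized assumption)."""
    return 1.0 / utilization


# The two ends of the range quoted in the article.
for u in (0.15, 0.30):
    print(
        f"utilization {u:.0%}: "
        f"idle {idle_fraction(u):.0%}, "
        f"headroom {headroom(u):.1f}x"
    )
```

At the quoted utilization levels, the hardware sits idle 70 to 85 percent of the time, which is the waste Asgar is describing.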

So he and his cofounders, Michelle Nguyen, Omid Azizi, and Natalie Serrino, set about building orchestration software that slices up agentic workloads so that they can be simultaneously spread across all kinds of hardware.

Gimlet Labs claims it reliably speeds AI inference up by 3x to 10x for the same cost and power. Gimlet says it can even slice the underlying model so that it runs across different architectures, using the best chip for each portion of the model. 

The company has already partnered with chip makers NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix.  

Gimlet’s product, delivered either as software or through an API to its own Gimlet Cloud, isn’t for the rank-and-file AI app developer. It’s for the largest AI model labs and data centers. 

The company publicly launched in October with, it said, eight-figure revenues out of the gate (so at least $10 million). Asgar said that his customer base has more than doubled in the last four months and now includes a major model maker and an extremely large cloud computing company, although he declined to name them.  

The cofounders had previously worked together at Pixie, a startup that created an open source observability tool for Kubernetes. Pixie was acquired by New Relic in 2020, just two months after it launched with a $9 million Series A led by Benchmark. (Pixie’s tech is now part of the open source org that oversees Kubernetes.)  

After Asgar randomly ran into Tully about a year ago and also received angel investments from Stanford professors, VCs started calling. After launch, a term sheet landed on Asgar’s desk. When VCs heard Asgar was looking at offers, “we got a pretty big swarm of funding,” and the round was quickly oversubscribed, he said. 

With the previous seed, the startup has now raised a total of $92 million, including from a slew of angels like Sequoia’s Bill Coughran, Stanford Professor Nick McKeown, former CEO of VMware Raghu Raghuram and Intel CEO Lip-Bu Tan. The company currently employs 30 people.

Other investors include Factory, who led the seed, Eclipse Ventures, Prosperity7 and Triatomic.

Tech

Microsoft under fire for threatening security researcher with criminal investigation

After a security researcher published a series of unpatched bugs in Microsoft products, along with code to exploit them, the company is now threatening to take legal action and call the cops on them. Microsoft’s veiled threat reignites a long-running argument over what responsibility, if any, security researchers have to disclose vulnerabilities affecting large and wealthy tech giants.

On Wednesday, Microsoft published a blog post criticizing the researcher, who goes by the handle “Nightmare Eclipse,” for publicly disclosing a series of bugs, including BlueHammer, RedSun, UnDefend, and YellowKey. The flaws affected products such as the Windows built-in antivirus engine Defender and the disk-encryption tool BitLocker. 

The core of Microsoft’s complaints is that the researcher did not attempt to report the bugs so that the company could fix them. That would have been “responsible,” as Microsoft’s blog put it. The other side of the company’s argument is that by publishing the details of the bugs and how to exploit them before they were patched, Nightmare Eclipse may have aided malicious hackers. Some of the vulnerabilities Nightmare Eclipse disclosed have since been used by hackers in real-world attacks, according to Microsoft, as well as the U.S. cybersecurity agency CISA.

“Our Digital Crimes Unit will continue bringing cases against these actors and those that enable their criminal activity — coordinating as needed with law enforcement around the world,” Microsoft wrote. (Microsoft’s Digital Crimes Unit has the mission of protecting the company through different strategies, including “civil legal actions, technical countermeasures, criminal referrals, and public-private partnerships,” according to its website).

In a series of blogs published in the last couple of weeks — without providing many specific details — Nightmare Eclipse claimed to have been in contact with Microsoft, but the company allegedly mistreated them, including revoking access to their Microsoft Security Response Center account, the portal where researchers can report vulnerabilities to the tech giant. Nightmare Eclipse’s implication was that they had no choice but to release the vulnerabilities publicly, which essentially meant that at that point they were zero-days, a specific term for security flaws that are unknown to the software maker affected at the time they are disclosed or exploited.

The researcher published the bugs on the code-hosting platforms GitHub (owned by Microsoft) and GitLab. The researcher’s accounts on those platforms have been banned.

Nightmare Eclipse and Microsoft did not respond to a request for comment. 

Cybersecurity veterans warn of chilling effect

This public spat brings back a long-running and still somewhat controversial debate: Do independent security researchers have a duty to make sure the vulnerabilities they find get fixed? And how far are they supposed to go to make sure the companies whose products are vulnerable actually fix them? 

One part of this debate, which has been fully settled and widely recognized, is that researchers deserve to get paid for their work. While it may sound obvious these days, it took years of struggle, captured in part by a campaign launched in 2009 called “No More Free Bugs.” Almost 20 years later, most companies small and large pay “bug bounty” financial rewards, which today can run to six figures or more, to researchers who privately disclose bugs and coordinate publishing their details once the bugs are fixed.

In response to this latest controversy with Nightmare Eclipse, countless researchers have shared their bad experiences reporting bugs to Microsoft. It’s fair to say that much of the cybersecurity community is vocally unhappy about how Microsoft is handling this issue. This includes cybersecurity veterans, such as Luta Security founder Katie Moussouris, who while working at Microsoft in the mid- to late 2000s pioneered bug bounties and convinced the technology giant to move away from the concept of “responsible disclosure” by framing the process as “coordinated disclosure.”

“Invoking the term ‘responsible’ disclosure was the first strike in my book,” Moussouris told TechCrunch, referring to Microsoft’s blog post. “Adding a threat of prosecution by mentioning [Digital Crimes Unit] was over the top, and will only result in security researchers distrusting Microsoft.”

Moussouris warned that the consequences of security researchers losing trust with Microsoft could result in a chilling effect of fewer people coming forward to report bugs, “making it less safe for all of us.”

Security researcher and former Microsoft employee Kevin Beaumont also called out Microsoft in a blog post, describing the company’s position as a “dumpster fire of its own making.”

“Proof of concept exploit creation and distribution for zero days is ‘criminal activity’ now?” wrote Beaumont. “Responsible disclosure quite often is framed to protect the product owner, not the customer — using it to try to criminally prosecute people is a new low.”

Tech

After Nvidia’s $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M

Groq is looking to raise $650 million in new funding from existing investors, sources tell Axios, as it leans into its inference neocloud business that relies on its homegrown AI chip and systems.

In December, Groq struck one of those not-an-acquisition agreements with Nvidia for a reported $20 billion, which involved the departure of some senior Groq employees to the chip giant and the licensing of Groq’s hardware technology to Nvidia. That deal was good news for the startup’s investors, who got paid out in cash in what would have been Nvidia’s largest purchase if the deal had been a full acquisition, Axios reports.

Now these investors have been asked to pony up and back the company’s plans to grow its inference cloud business, which lets developers and enterprises host their inference-hungry apps. Inference is the processing that happens after an AI prompt and is currently a much bigger need in the AI world than model training.

The new direction is led right now by Groq’s interim CEO and CFO, Adam Winter and Matt Eng, respectively. 

In some ways, the $650 million in funding is guaranteed. Axios reports that Groq’s backers Disruptive and Infinitium have agreed to fill the round should other existing investors not want their pro-rata shares.

Tech

What happens when companies become too AI-pilled?

The people deciding that AI can replace your job are also the ones least likely to understand what your job truly involves, according to Box founder Aaron Levie, who pointed to this as an example of “AI psychosis.” Indeed, ClickUp recently cut 22% of its workforce in favor of AI agents, tech layoffs in 2026 are already nearly matching all of 2025, and DuckDuckGo installs are climbing from users who want Google to stop forcing AI into search and just give them links.

Watch as TechCrunch’s Equity podcast hosts Kirsten Korosec, Anthony Ha, and Sean O’Kane dig into what happens when the AI-pilled and the AI-skeptical are both right at the same time, plus three deals worth knowing about and Waymo’s new robotaxi hitting the road. 

Subscribe to Equity on YouTube, Apple Podcasts, Overcast, Spotify and all the casts. You also can follow Equity on X and Threads, at @EquityPod. 

