HoundDog.ai, a startup that helps developers ensure their code doesn’t leak personally identifiable information (PII), came out of stealth Wednesday and announced a $3.1 million seed round led by E14, Mozilla Ventures and ex/ante, along with a number of angel investors. Unlike other scanning tools, HoundDog actually looks at the code a developer is writing, using both traditional pattern matching and large language models (LLMs) to find potential issues.
HoundDog was founded by Amjad Afanah, who previously co-founded DCHQ, which Gridstore acquired in 2016 (Gridstore, to complicate matters, then changed its name to HyperGrid). Afanah also co-founded apisec.ai, which is still up and running, and worked at self-driving startup Cruise. The inspiration for HoundDog came during his time at data security startup Cyral, where he talked to privacy teams, he told me.
“When I was at Cyral, we had a lot of data,” he said. “What Cyral does, like many others in the data security space, is they focus on production systems. They help you discover, classify your structured data and your databases, and then help you apply access controls. But the overwhelming feedback that I kept hearing from security and privacy teams alike was: ‘You know, it’s a little too reactive and it doesn’t keep up with the changes in the code base.’”
So HoundDog shifts this process even further left. While it still sits in the continuous integration flow and not yet in the development environment (though that may happen in the future), the idea is to find potential data leaks before the code is merged. Most importantly, HoundDog does so by looking at the actual code, not the data flow it produces. “Our source of truth is the code base,” Afanah said.
Because of this, if a development team starts collecting Social Security numbers, for example, HoundDog would raise a flag and warn the team before the code is ever merged; it would also alert the security team. After all, collecting that data unnoticed could turn into a major and costly issue.
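To make the idea concrete, here is a minimal sketch of what the pattern-matching half of such a scan could look like. It is not HoundDog’s implementation, which the company has not published in this detail; the term list, the `Finding` type and the `scan_source` function are hypothetical names chosen for illustration, and a real scanner would use a far larger, curated rule set plus language-aware parsing.

```python
import re
from dataclasses import dataclass

# Matches identifiers such as user_ssn or socialSecurityNumber.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Splits camelCase and snake_case identifiers into individual words.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")

# Hypothetical rule set: category -> word sequences that suggest PII.
PII_TERMS = {
    "social_security_number": [("ssn",), ("social", "security")],
    "email_address": [("email",)],
}


@dataclass(frozen=True)
class Finding:
    line: int
    identifier: str
    category: str


def split_words(identifier: str) -> list[str]:
    """Lowercased words of an identifier, e.g. userSsn -> ['user', 'ssn']."""
    return [w.lower() for w in WORD.findall(identifier)]


def contains_sequence(words: list[str], seq: tuple[str, ...]) -> bool:
    """True if seq appears as a contiguous run inside words."""
    n = len(seq)
    return any(tuple(words[i:i + n]) == seq for i in range(len(words) - n + 1))


def scan_source(source: str) -> list[Finding]:
    """Flag identifiers whose words match a PII term, line by line."""
    findings = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        for match in IDENTIFIER.finditer(text):
            words = split_words(match.group())
            for category, sequences in PII_TERMS.items():
                if any(contains_sequence(words, s) for s in sequences):
                    findings.append(Finding(line_no, match.group(), category))
    return findings
```

Matching on whole words rather than raw substrings keeps an identifier like `className` from being flagged just because it happens to contain the letters "ssn". In a continuous integration job, a non-empty result from `scan_source` could be used to fail the check or notify the security team before the merge.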
The service currently supports code written in Java, C#, JavaScript and TypeScript, as well as SQL, GraphQL and OpenAPI/Swagger queries. Support for Python is imminent, the company says.
Afanah noted that a tool like this is becoming especially important in this age of AI-generated code, something Replit CEO (and HoundDog angel investor) Amjad Masad also echoed.
“As an increasing number of companies turn to AI-generated code to accelerate development, embedding security best practices and ensuring the security of the generated code becomes essential,” Masad said. “HoundDog.ai is leading the way in securing PII data early in the development cycle, making it an indispensable component of any AI code generation workflow. This is the reason I chose to invest in this company.”
HoundDog itself uses AI, too. It currently relies on OpenAI’s models for this, but it’s important to stress that this is optional. Users who worry about their code leaving their private repositories can choose to rely only on the company’s more traditional code scanner.
A major part of HoundDog’s value proposition is that it can cut compliance costs for startups thanks to its automated reporting capabilities. The service can automatically generate a record of processing activities (RoPA). To do this, HoundDog uses generative AI to write these reports and sends the relevant data to OpenAI. The team stresses that only the tokens the service has discovered through its regular scanner are shared with OpenAI and that the actual source code isn’t shared.
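The distinction between sharing discovered tokens and sharing source code can be illustrated with a short sketch. This is an assumption about the general shape of such a step, not a description of HoundDog’s actual pipeline: the function name `build_report_payload` and the payload field names are hypothetical, and no request is sent to any API here.

```python
import json


def build_report_payload(detected_tokens: list[str]) -> str:
    """Build a JSON payload that carries only detected data-element names.

    The caller passes the names found by the scanner (for example "ssn");
    the source code itself is never an input, so it cannot end up in the
    payload.
    """
    unique_tokens = sorted(set(detected_tokens))  # dedupe, stable order
    return json.dumps(
        {
            "task": "draft_ropa",  # hypothetical field name
            "detected_data_elements": unique_tokens,  # hypothetical field name
        }
    )
```

Because the function accepts only token names, whatever consumes the payload sees which categories of data the code handles but none of the code that handles them.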
The company offers a limited free plan, with paid plans starting at $200/month for scanning up to two repos.