simonw 21 hours ago [-]
"The values passed to _sort were concatenated directly into SQL ORDER BY clauses with no validation" - sounds to me like this project had some low-hanging fruit!
Looks like every single one of the 38 vulnerabilities were either SQL injection, XSS, path traversal or "Insecure Direct Object Reference" aka failing to check the caller was allowed to access the record.
This is actually a pretty good example of the value of AI security scanners - even really strong development teams still occasionally let bugs like this slip through, and having an AI scanner that can spot them feels worthwhile to me.
tmoertel 20 hours ago [-]
> Looks like every single one of the 38 vulnerabilities were either SQL injection, XSS, path traversal or "Insecure Direct Object Reference" aka failing to check the caller was allowed to access the record.
Seems like code review against a checklist of the most common vulnerabilities would have prevented these problems. So I guess there are two takeaways here:
First, AI scanners are useful for catching security problems your team has overlooked.
Second, maintaining a checklist of the most-common vulnerabilities and using it during code review is likely not only to prevent most of the problems that AI is likely to catch, but also to show your development team many of their security blind spots at review time and teach them how to illuminate those areas. That is, the team learns how to avoid creating those security mistakes in the first place.
Falell 19 hours ago [-]
I think it shows exactly the opposite of the second. Even with the availability of checklists, and instructions to use them, people won't and don't actually use them consistently.
'With enough eyes, all bugs are shallow' and AI is an automatable eye that looks at things we can tell nobody has seriously looked at before. It's not a panacea, there will be lots of false positives, but there's value there that we clearly aren't getting by 'just telling humans to use the tools available'.
See also: modern practices and sanitizers and tools and test frameworks to avoid writing memory errors in C, and the reality that we keep writing memory errors in C.
Avamander 17 hours ago [-]
> See also: modern practices and sanitizers and tools and test frameworks to avoid writing memory errors in C, and the reality that we keep writing memory errors in C.
I think there's a difference in how trivial some of these things are to detect and how difficult others are. IDOR and SQLi aren't nearly as complex as C unsafety is.
capiki 20 hours ago [-]
What about having the checklist and having an AI tool use it to catch things at review time (or even development time)?
tmoertel 20 hours ago [-]
Having AI tools do the review against the checklist would probably prevent the problems. However, it would probably be substantially inferior as a teaching tool for your team. The exercise of having reviewers hunt the checklisted vulnerabilities for themselves is what develops the mental muscles needed to understand the vulnerabilities in depth and avoid them when designing and writing future code.
But, yes, I'd augment any manual review with a checklist and AI review as a final step. If the AI then catches any problems, your reviewers will be primed to think about why they overlooked them.
dylan604 20 hours ago [-]
> The exercise of having reviewers hunt the checklisted vulnerabilities for themselves is what develops the mental muscles needed to understand the vulnerabilities in depth and avoid them when designing and writing future code.
Could not agree any more strongly. These automagic tools are one thing in the hands of a dev who groks basics like these examples. It would be one thing if new devs were actually reviewing the generated code to understand it, but so much is just vibe coded and deployed as soon as it "works". I get flak for not immediately deploying generated code because I want to take time to understand how it works. It's really grating, and a lot of friction is coming from it.
capitalhilbilly 19 hours ago [-]
For vulnerabilities of this nature, is there really a point in training if an AI will catch them from now on? Seems like a variant of the allowing-calculators problem, and maybe the problem codeless platforms would have had. If this style of bug doesn't change design in any meaningful way, then the user can just write pseudo variables, the AI can normalize them to safe code, and their ability to work without the AI and IDE is probably less relevant than freeing their cognitive load for more complex constraint problems.
ndriscoll 18 hours ago [-]
Suppose we still need humans writing code and caring about this stuff for the foreseeable future, so we need people to continue learning about the ways things can go wrong. For something like injection, you still ideally have a lint rule that says "don't concatenate things that look like SQL/HTML/etc.; use the correct macros for string interpolation". What does it actually teach for a reviewer to tell you that? You can ask the reviewer for more information, but you can ask your teammate anyway if you don't understand why the linter is mad. You can also ask the robot, who will patiently explain it to you even long after all of the knowledgeable humans have retired or died. The robot could even link to a prompt asking it to explain:
https://chatgpt.com/share/69f10515-8808-83ea-abe3-a758d3144c...
If people aren't learning more with AI, that's a meta skill they need to develop.
As for training the review muscles, why would you do that if you have a linter that rejects when you make the mistake? I don't expect reviewers to check whether you eschew nulls or uninitialized variables; I expect the compiler to do that, and I expect over time that more and more things will become tooling concerns (especially given that rigid tools with appropriate feedback are clearly a massive force multiplier for LLMs).
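To make "a linter that rejects when you make the mistake" concrete, here is a toy sketch (Python and its stdlib ast module, purely as an illustration - real linters and SAST tools do far more) that flags execute() calls whose query text is assembled inline:

```python
import ast
import sys

class SqlStringBuildChecker(ast.NodeVisitor):
    """Toy lint: flag execute() calls whose query text is assembled inline."""

    def visit_Call(self, node):
        func = node.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", "")
        if name == "execute" and node.args:
            query = node.args[0]
            # ast.JoinedStr is an f-string; ast.BinOp covers "+" and "%" building.
            if isinstance(query, (ast.JoinedStr, ast.BinOp)):
                print(f"{sys.argv[1]}:{query.lineno}: query built by string "
                      "interpolation; use placeholders instead")
        self.generic_visit(node)

with open(sys.argv[1]) as f:
    SqlStringBuildChecker().visit(ast.parse(f.read()))
```

Run it as `python checker.py some_module.py`; a real rule would also handle aliased calls, format()/join(), and so on, but the shape is the same.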
tmoertel 18 hours ago [-]
Two issues here. First, teams that decide to delegate security responsibilities to AI are more likely to do things fast and loose, in general, and thus be less likely to "ask the robot to patiently explain" problems until they understand the problems' root causes and update their mental models to prevent those problems.
Second, to use your example, the ChatGPT response you provided does a crappy job of explaining the root cause of the problem: namely, that every string is drawn from some underlying language that gives the string its meaning, and therefore when strings of different languages are combined, the result can cause a string drawn from one language to be interpreted as if it were drawn from another and, consequently, be given an unintended meaning.
So, if the idea is that smart teams can not only delegate the catching of problems but also the explanation of those problems to ChatGPT -- presumably because it is a better teacher than the senior engineers who actually understand the salient concepts -- I'd say AI ain't there yet.
ndriscoll 13 hours ago [-]
> teams that decide to delegate security responsibilities to AI are more likely to do things fast and loose
Is that true? Is that also true of e.g. teams using type checkers to avoid nulls or exceptions? Or teams that use memory safe languages to avoid memory corruption? Or using a library that has an `unsafeStringToSql` API surface, and a linter to flag its use (where you're expected to use safe macros instead)? My experience is that better tools (or languages and library designs) scanning for issues lead to fewer defects and less playing fast and loose since the entire point of the tools is to ban these mistakes.
On education, it literally tells you that the top concern is SQL injection made possible by concatenating strings, and gives an example of an auth bypass: `name = "foo' OR 1=1 --"`. It also notes that this is not just a minor nitpick, but that actually the solution is fundamentally doing something completely different (query objects with bound parameters). If you don't understand what it means you can just ask:
> Elaborate on 1
> Walk through examples of what goes wrong and why, and how the solution avoids it
etc. The knowledge is all there; you just need to ask for it. It's an infinitely patient teacher with infinite available attention to give to you. You can keep asking follow-ups, ask it to check your understanding, etc. Or there are tons of materials about it on the web or in textbooks, and if you still don't understand, you can still ask a more senior engineer to explain what's wrong.
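For anyone who wants the `OR 1=1` bypass above made concrete, a minimal sketch using Python's stdlib sqlite3 module (illustrative only; the same idea applies to any driver with placeholder support):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

name = "foo' OR 1=1 --"  # attacker-controlled input

# Vulnerable: the input is spliced into the query text, so the quote and the
# trailing comment change the shape of the query itself.
print(conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall())
# -> [('alice', 0)]  (the WHERE clause was bypassed)

# Safe: the query shape is fixed; the value travels out of band and can only
# ever be compared as data.
print(conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall())
# -> []  (no user is literally named "foo' OR 1=1 --")
```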
tmoertel 11 hours ago [-]
> Is that true [that teams that decide to delegate security responsibilities to AI are more likely to do things fast and loose in general]?
Yes. See: vibe coding. See also: The shockingly widespread hype for and acceptance of vibe coding across industries that ought to know better.
Do you deny that there is a correlation between AI use and not knowing what you are doing? Isn’t one of the big selling points of AI that it lets “regular people” create “real world” projects that they could only dream about previously?
I am not saying that serious engineers don’t use AI or that when they use it, they do so foolishly. I’m only pointing out that AI has let a lot of people who don’t know what they’re doing crank out code without understanding how it works (or doesn’t).
> Is that also true of e.g. teams using type checkers to avoid nulls or exceptions? Or teams that use memory safe languages to avoid memory corruption?
No, it is not true of those teams. When people choose to use languages with statically checked types or with memory safety or the other examples you offered, they are rarely doing it because they have no idea how to write sound code. But when people turn to AI to crank out code they couldn’t write themselves (see: vibe coding), that’s what they are doing.
> On education, [ChatGPT] literally tells you that the top concern is SQL injection from essentially concatenating strings, and gives an example of an auth bypass: `name = "foo' OR 1=1 --"`. If you don't understand what that means you can just ask...
Again, that’s a crappy explanation of the real problem. It promotes no understanding of the underlying issue—that strings are drawn from languages that give them their meanings. And, unless you understand that it’s a crappy explanation that ignores the underlying issue—which a person being gaslit by the crappy explanation would not—what stimulus is going to provoke you to ask for a better explanation? How are you going to know that the crappy explanation is crappy and tell ChatGPT to take another direction?
> The knowledge is all there; you just need to ask for it. It's an infinitely patient teacher with infinite available attention to give to you.
Yeah, and if it steers you down a crappy path, such as in your sql-injection session with ChatGPT, it will be infinitely happy to keep leading you down that crappy path. Unless you know that it’s leading you down a crappy path, you won’t be able to tell it to stop and take another path. But if you are relying on the AI to tell you what’s good and what’s crappy, you won’t be able to tell which is which. You’ll be stuck on whatever path the AI first presents to you.
> Or there are tons of materials about it on the web or in textbooks, and if you still don't understand, you can still ask a more senior engineer to explain what's wrong.
And that’s equivalent to “don’t ask the AI, use a traditional resource,” right?
ndriscoll 11 hours ago [-]
I'm not following the scenario here. The original discussion was around teams using these tools, not vibe coders chasing their dreams.
If you're a "regular person" vibe coder, you're not doing code reviews with a team anyway. You presumably had no teacher and no one to tell you your mistakes. So having a security bot is strictly an improvement.
If you're on a professional team, then you're presumably in the non-foolish group that already had standards, and is using it as a tool like any of the other quality tools they use. And if they don't have standards and don't know this stuff already, well, the bot is again an improvement. At least it raises the issue for someone to ask what it means.
If you're a professional, I also assume you've heard of SQL injection (does it never come up in a CS degree?), so you don't really need more than a "this method is exposed to SQL injection" explanation. It's like saying "tail recursion is preferred because it compiles to a loop, so it's not prone to stack overflow". It assumes it doesn't need to elaborate further, but if you don't understand a term, you can just ask. Or look it up.
And yeah books or Wikipedia still exist even if you use an automated linter. You can go read about these things if you don't know them. I frequently tell my team members to go read about things. Actually I ended up in a digression about CSRF the other day (we work on low level networking, so generally not relevant), and I suggested the person I was talking to could go read about it if they're interested so as not to make them listen to me ramble.
Also I'm still unclear on why you think the explanation is crappy. It says the problem is you're making a query via simple string substitution, shows how you can abuse quotes if you do that (so concretely illustrates the problem), and says the reason the better solution is better is that it makes a structural object where you have a query with placeholders followed separately by parameters (so you can't misinterpret the query shape), which seems better than "strings are drawn from languages that give them their meanings"?
tmoertel 10 hours ago [-]
The root of this subthread was this claim that I made and you questioned:
> Teams that decide to delegate security responsibilities to AI are more likely to do things fast and loose in general.
Note the word delegate. I claimed that teams that delegate security responsibilities to AI are more likely to play fast and loose in general. That’s because delegating security to AI—not supplementing existing security practices with AI—is likely to be a good way to launch insecure garbage into the world. AI simply isn’t good enough to get security right on its own. Maybe someday it will be good enough, but like I wrote earlier, it ain’t there yet. And any team that plays fast and loose with security is likely to play fast and loose in general.
See any problems with that logic?
I only used vibe coding as an obvious example that shows there are lots of teams that delegate security responsibilities to AI. (Vibe coders are delegating almost everything to AI.)
> If you're a "regular person" vibe coder, you're not doing code reviews with a team anyway. You presumably had no teacher and no one to tell you your mistakes. So having a security bot is strictly an improvement.
How is it strictly an improvement? Before vibe coding, “regular people” couldn't launch insecure garbage upon an unsuspecting world—they couldn't launch anything. Do you believe that it’s "strictly better" that now everyone can launch insecure garbage courtesy of their AI minions? Do you think it’s “strictly better” that lots of users are having their data sucked into insecure apps and web sites that are destined to be hacked?
> Also I'm still unclear on why you think the explanation is crappy.
It’s crappy because it tells you how to use a tool (a custom SQL interpolator) without helping you understand the cause of the problem that the tool is trying to solve. You could read this ChatGPT explanation about avoiding SQL injection in Scala and not be any wiser about how to avoid that problem in other programming languages.
Worse, you would never learn from this explanation that the underlying cause of SQL injection is the same as for cross-site-scripting holes and a host of other logic and security problems in software. That’s because ChatGPT was trained on explanations of these problems scraped from the internet, and 99% of those explanations are superficial because the people who wrote them didn’t understand the underlying issues.
But if you deeply understand the following, you will never make this kind of mistake again in any programming language:
1. Every string is drawn from an underlying language and must conform to the syntax and semantics of that language.
2. To combine strings safely, you must ensure that they are all drawn from the same language and are combined according to that language’s syntax and semantics.
Therefore, as a programmer, you must (a) understand the language beneath each and every string, (b) combine strings only when you can prove that they have the same underlying language, and (c) combine strings only according to that underlying language’s syntax and semantics.
If you understand these things, you will know how to avoid all SQL injection and XSS holes and related problems in all programming languages. Things like escaping will make sense: it converts a string in one language into its equivalent string in another language. Further, you will know when you can safely delegate some of your responsibilities to tools such as parsers, type systems, custom SQL interpolators, and the like.
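(As a small illustration of point 2 and of escaping as translation between languages - a sketch using Python's stdlib html module, chosen only for convenience:)

```python
import html

comment = 'I <3 this & "that"'   # a string in the plain-text "language"

# Spliced into HTML as-is, the same characters take on HTML's meanings:
# "<" starts a tag, "&" starts an entity.
unsafe = f"<p>{comment}</p>"

# Escaping translates the plain-text string into its HTML-language equivalent,
# so it still means the same text once a browser parses it.
safe = f"<p>{html.escape(comment)}</p>"
print(unsafe)  # <p>I <3 this & "that"</p>
print(safe)    # <p>I &lt;3 this &amp; &quot;that&quot;</p>
```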
But you wouldn’t learn any of this from the ChatGPT explanation you received. Worse, you wouldn’t even think to ask for this deeper explanation because you would have no reason to suspect from ChatGPT’s explanation that this deeper explanation even existed.
In any case, I appreciate your willingness to continue this conversation. It’s been fun and educational and has forced me to articulate some of my ideas more clearly. Thanks!
ndriscoll 9 hours ago [-]
But I delegate checks to tools all the time. e.g. I could spend my time checking whether locks are all used correctly in our code, or I could use libraries designed to force correctness[0]. An LLM isn't an ideal solution to linting, but if you're stuck with a language with a weak type system maybe that's all you can reasonably do.
The actual problem is that you're using strings at all. The SQL solution (that the scala macros do) is to use prepared statements and bound parameters, not to escape the string substitution. Basically, work in the domain, not in the serialized representation (strings). Likewise with XSS, you put the stuff into a Text node or whatever and work directly with the DOM so the structural interpretation has all already happened before the user data is examined.
But "work in the domain as much as possible" is a good idea for a whole bunch of reasons (as chatgpt said).
It did also several times indicate there was more to the story. It didn't just say "because that way is safer"; it said it
> Builds a structured query object, not a raw string
> Instead of manipulating strings, you’re working with a query AST / fragment system
And concluded by saying there's absolutely more detail, and that it's important to understand:
> If you tell me which library you’re using (Doobie, Slick, Quill, etc.), I can show exactly what guarantees sql"..." gives in your stack—those details matter quite a bit.
On vibe coded "garbage", I expect there won't be much of a market for such things (why would there be when you can also just vibe it?), so it will more be a personal computing improvement, which already limits the blast radius (and maybe already improves the situation vs the precarious-by-default SaaS/cloud proliferation today even with poor security). I also think tooling and vibe security will be better than median professional level by the time it's actually as easy as people claim it is to vibe code an application anyway. i.e. security (which is an active area of improvement to sell to professionals) will probably be "solved" before ease-of-use. Partly exactly because issues like code injection are already "solved" in better programming languages (which are also more concise and have better tooling/libraries in general), so the bot just needs to default to those languages.
> But I delegate checks to tools all the time. e.g. I could spend my time checking whether locks are all used correctly in our code, or I could use libraries designed to force correctness[0].
Do you believe that, because you can delegate some responsibilities without sacrificing important requirements, it follows that you can delegate all responsibilities without sacrificing important requirements? Do you not understand the difference between delegating to the computer proofs, such as type checking, that the computer can discharge faithfully without error, and delegating something as wide and perilous as security to something as currently flawed as AI?
> An LLM isn't an ideal solution to linting, but if you're stuck with a language with a weak type system maybe that's all you can reasonably do.
No, in such a situation you can add LLM-based checks to your responsibility for security. But you can’t delegate away your responsibility to LLMs and say that you care about security. AI ain’t there yet.
> The actual problem is that you're using strings at all.
What percentage of the world’s existing code do you believe does not use strings at all? Tragically, that is the world we live in.
> Basically, work in the domain, not in the serialized representation (strings).
Sure, but you can’t do all your work in the domain. At some point you must take data from the outside world as input or emit data as output. And, even if you are lucky enough to work in a domain where someone has done the parsing and serialization and modeling work for you so that you have the luxury of a semantic model to work with instead of strings, who had to write that domain library? What rules did that person have to know to write that library without introducing security holes?
> [ChatGPT] did also several times indicate there was more to the story.
Great. Then show me how a person who didn’t know of the existence of the rules I shared with you in my previous post would naturally arrive at them by continuing your conversation with ChatGPT.
> security (which is an active area of improvement to sell to professionals) will probably be "solved" before ease-of-use.
I think that this is a naive hope. Security is different from virtually all other responsibilities in computing, such as ease of use, because getting it right 99.99% of the time isn’t good enough. In security, there is no “happy path”: it takes just one vulnerability to thoroughly sink a system. Security is also different because you must expect that adversaries exist who will search unceasingly for vulnerabilities, and they will use increasingly novel and clever methods. Users won’t probe your system looking for ease-of-use failures in the UI. So if you think that AIs are going to get security right before ease-of-use, I think you are likely to be mistaken.
ulimn 20 hours ago [-]
Checking for OWASP top 10 items during code review is usually a mid level dev interview question IME. It's nothing new. Teams don't have to come up with these. These things exist.
simonw 20 hours ago [-]
Yes, absolutely. A team with a strong code review culture that incorporates security review against common exploits ideally wouldn't end up with holes like this.
ihaveajob 19 hours ago [-]
I guess the value of the tool is that it gives you that same benefit for the cost of a few tokens.
tmoertel 19 hours ago [-]
> I guess the value of the tool is that it gives you that same benefit for the cost of a few tokens.
But it doesn't give you the same benefit. It gives you the partial benefit of catching these problems before they go to production, but it doesn't give you the remaining benefit of teaching your team about where their mental models are broken. A team that decides to delegate this responsibility entirely to AI is going to have a hard time learning about these serious defects in their mental models. Fixing those defects will pay dividends throughout the code base, not just in the places where AI would detect security failings.
dgb23 20 hours ago [-]
But by not having a checklist, you avoid having your blind spots exposed.
tmoertel 20 hours ago [-]
Why would you want to prevent your development team from learning about their blind spots?
dgb23 16 hours ago [-]
I was trying to make a joke about the tension between security and accountability.
dylan604 20 hours ago [-]
So you can move faster to the next features obviously. Refactoring for secure code is time consuming, and clearly wasted cycles as the code is working. /s
camdenreslink 20 hours ago [-]
I don't think strong development teams are still letting SQL injection vulnerabilities through by manually concatenating strings to build queries with user-provided data. Not in the year 2026.
voxic11 20 hours ago [-]
Keep in mind this project is a 25 year old PHP application.
zarzavat 20 hours ago [-]
That actually makes it more confusing since a 25 year old PHP application is exactly where you'd expect to find SQL injection vulnerabilities.
If I were in charge of a 25 year old PHP application, tracking down every SQL query and converting it to a safe form would be high on my list of priorities. You don't need AI for that, just ripgrep and a basic amount of care for your users.
whythismatters 18 hours ago [-]
Most (proprietary) 25 year old PHP codebases I've seen are a huge mess riddled with issues: exorbitant LOC, a mix of tabs and spaces and weird indentation, DRY violations, slightly diverging code blocks copy-pasted all over the place, etc., etc. Resolving technical debt (let alone reviewing the "stuff that works", like SQL queries) is often low priority because it's tedious and does not create any "business value".
otabdeveloper4 19 hours ago [-]
Replacing/automating manual ripgrep is a top-1 use case for AI though.
pseudalopex 19 hours ago [-]
Their point was that a competent team would have done this 10 or 20 years ago, I thought.
simonw 20 hours ago [-]
Good frameworks can protect against SQL injection and XSS (through default escaping of output variables) but protecting against insecure direct object access is a lot harder.
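For anyone unfamiliar with the term: the missing check is usually one line of business logic that no framework can infer for you. A hedged sketch with purely hypothetical names (not any particular framework, and not OpenEMR's actual schema):

```python
from dataclasses import dataclass

@dataclass
class LabResult:
    id: int
    patient_id: int
    value: str

class Forbidden(Exception):
    pass

def get_lab_result(current_patient_id: int, result_id: int, results: dict) -> LabResult:
    row = results[result_id]
    # The IDOR check: being authenticated isn't enough; the record itself must
    # belong to the caller (or the caller must hold an explicit grant).
    if row.patient_id != current_patient_id:
        raise Forbidden(f"lab result {result_id} does not belong to this patient")
    return row

results = {1: LabResult(1, patient_id=42, value="ok"),
           2: LabResult(2, patient_id=99, value="ok")}
print(get_lab_result(42, 1, results).value)  # allowed: the caller's own record
get_lab_result(42, 2, results)               # raises Forbidden: someone else's record
```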
tdeck 7 hours ago [-]
Last time I had to build an ORDER BY clause in MySQL, it didn't support query parameters in prepared statements, which is probably how this happens. It's not an excuse at all but the standard path of "just throw a ? (or whatever) in there and use bound params" doesn't work for order by (or at least it didn't at some time in the recent past). You would end up having to concatenate strings somehow or other.
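Right - identifiers and sort direction generally can't be bound parameters, so the usual workaround is an allowlist that maps user-facing sort keys to known-good SQL fragments rather than concatenating the input. A rough sketch (hypothetical column names, Python purely for illustration):

```python
# Map user-facing sort keys to known-good SQL, so nothing user-controlled is
# ever spliced into the ORDER BY clause. Column names here are made up.
SORT_COLUMNS = {"name": "last_name", "dob": "date_of_birth", "updated": "updated_at"}

def build_order_by(sort_param: str) -> str:
    field, _, direction = sort_param.partition(":")
    column = SORT_COLUMNS.get(field, "last_name")  # unknown keys fall back to a default
    direction = "DESC" if direction.lower() == "desc" else "ASC"
    return f"ORDER BY {column} {direction}"

print(build_order_by("dob:desc"))           # ORDER BY date_of_birth DESC
print(build_order_by("1; DROP TABLE x--"))  # ORDER BY last_name ASC (attacker input never reaches SQL)
```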
IshKebab 20 hours ago [-]
Yeah this is a huge red flag that would make me avoid this project for sure.
Unfortunately you have no easy way of checking if closed source projects are similarly amateur.
Taters91 20 hours ago [-]
These kinds of checks were available without AI.
sheikhnbake 20 hours ago [-]
Math is doable without a calculator
happytoexplain 20 hours ago [-]
The headline is "AI uncovers...", implying that the standard static analyzers used by basically everybody didn't catch them.
serf 20 hours ago [-]
Isn't this just sort of turning into chicken-or-egg?
If an AI uses static analyzers to do the work, is it the tool or the AI?
If the AI is using grep to do the work, is it the AI or grep?
I mean, essentially all agent work boils down to "cat or grep?"
RA_Fisher 20 hours ago [-]
AI gives us a means of leverage. We can do more with less. production = f(labor, capital, technology) + eps
krainboltgreene 19 hours ago [-]
This always comes up and the only thing I can think is: Doesn't Google make like 10B a quarter in profit from GCP alone? Did we really need a cheaper SQL injection checker?
positron26 20 hours ago [-]
Was the human labor?
gchamonlive 20 hours ago [-]
Isn't this something SonarQube catches?
webXL 20 hours ago [-]
Yes. Isn't this something code review catches? :)
happytoexplain 17 hours ago [-]
Sometimes, but not nearly as reliably as a static analyzer. But I'm assuming the unstated point you are sarcastically implying is "you don't need SonarQube" - maybe you're trying to say something else.
gchamonlive 14 hours ago [-]
webXL's comment makes no sense, because it's comparing apples to oranges.
gchamonlive 20 hours ago [-]
I'm really curious what's your line of thinking here. Could you elaborate?
positron26 20 hours ago [-]
Presuming there is an infinite pool of programmers who tirelessly work for a low price?
Groxx 17 hours ago [-]
>even really strong development teams still occasionally let bugs like this slip through
agreed, though I think you'd be hard-pressed to find anyone who uses healthcare-related software professionally who thinks any "really strong development team" was involved in its creation.
nudpiedo 20 hours ago [-]
There are static code analyzers that would already have detected that.
And these are also automatic. It looks very likely that the team didn't give a damn about the most basic security and good practices.
Just as a house made of paper wouldn't be an example of the insecurity of the construction industry.
simonw 19 hours ago [-]
Which static code analyzers do you recommend?
happytoexplain 17 hours ago [-]
SonarQube is extremely common, but I'm sure there are many.
gowld 20 hours ago [-]
I think SQL Injection detectors were pretty mature even before the "AI" version?
hilariously 20 hours ago [-]
Honestly, those all sound like things common linters could find, like string concatenation.
EGreg 20 hours ago [-]
“even really strong development teams”
One would think a single really strong developer, let alone a team, would look for interpolation in strings fed to an RDBMS?
srveale 20 hours ago [-]
And yet here we are
prerok 18 hours ago [-]
Everybody knew somebody should do it, but nobody did it.
Classic.
gostsamo 20 hours ago [-]
> This is actually a pretty good example of the value of AI security scanners
Are you fuckin' serious? This would be caught with any self-respecting scanner even 5 years ago and with most educated juniors even earlier.
I use AI every day, but I'm not deep enough in the dilulu to believe that everything above two brain cells should be a transformer.
simonw 19 hours ago [-]
Which scanners catch insecure direct object access?
demorro 20 hours ago [-]
Completely normal and expected.
People thinking that this isn't the case everywhere need a reality check. Most software is riddled with obvious security issues. If we can remediate them with AI, great, but don't be thinking that this is something that we could only have dealt with with AI. Enough attention and prioritization of these issues would also have sorted it.
Ask yourself: if we weren't currently in an era of AI focus and AI were just another boring tool, would we be bothering to do this sort of thing? Loads of us still aren't bothering with basic static analysis.
unshavedyak 19 hours ago [-]
Heck, unless AI gets absurdly cheap, I feel like even this will be temporary. To your point, we don't do that now because it's not fun and no one broadly finances this sort of thing. But AI costs money, so why are we spending it now? I imagine it's just a temporary spend to explore the space, show what models are capable of, further embed usage of AI for future rugpulls, etc.
Point is, unless it eventually becomes cheap enough that we all have this at home and can run SOTA analysis ourselves, this too will pass. I imagine it will get cheap enough, FWIW, but... yeah.
demorro 16 hours ago [-]
Aye. The "use boring tech" advice isn't just about technical stability. You need to guard yourself against eventual boredom and ecosystem decay. Hype and enthusiasm can mask how likely these systems we put in place are to actually be maintained or used in the long term.
I'm sure that doesn't matter much to big tech folks seeking to fill that promotion packet though, or to executives seeking to demonstrate the overwhelming utility of this new income stream.
prerok 18 hours ago [-]
Presumably now we also have exploits written by AI, so I guess security has to be one-upped now?
Not that I expect companies to be more proactive now. I have been disillusioned of that long ago. With AI they could be at least a little bit more proactive, which I guess is a great selling point of AI to corporate.
dflock 20 hours ago [-]
No one knows how many vulnerabilities there are in closed source medical record software - because we can't check. There are _probably_ loads though, because that medical software is super terrible in every way that we _can_ check.
nradov 18 hours ago [-]
Well the closed-source EHR applications that use NoSQL databases such as MUMPS (InterSystems Caché) probably don't have many SQL injection vulnerabilities.
oatmeal1 20 hours ago [-]
Or voting machines.
mixedmath 17 hours ago [-]
I wasn't aware that there were any public, commonly-used voting machines that we could check.
1970-01-01 19 hours ago [-]
Isn't anything closed-source by definition this? Why speak of the subset of closed-source medical record software when it's just the entire class of software?
0xdeadbeefbabe 20 hours ago [-]
SQL injection and XSS come up in dynamic analysis too.
david_shaw 19 hours ago [-]
We'll see more of this, but this particular review is driven by marketing narrative. I'll explain what I mean:
Back in 2010, as a security engineer, I also looked at OpenEMR. It was an absolute disaster, and was (and is) somewhat well-known as such. I found and published vulnerabilities very similar to these sixteen years ago. This is not exactly the Fort Knox of software.
It makes sense for AISLE to demonstrate that they're able to find vulnerabilities here, but I'd love to see a side-by-side comparison of modern SAST and DAST reviews. I bet we'd find similar vulnerabilities.
>I was the main contributor and maintainer to OpenEMR about ~20 years ago and then decided it was irredeemable and started over with ClearHealth/HealthCloud. Shockingly some of my code lives on (from PHP 3). I am reluctant to say don't use it but if you do please don't expose it to anything public, which sadly happens most of the time. There are some real problems that exist in that code base from a security and HIPAA perspective.
Finding SQL injections etc is definitely valuable, but at the same time they did not hack Epic; the "100000 medical providers" number links to https://www.hhs.gov/sites/default/files/open-emr-sector-aler... which links open-emr.org/blog/openemr-is-proud-to-announce-seamless-support-for-telehealth/ which...404s. Per archive.org the source is something the CEO of now defunct lifemesh.ai said.
"medical record software" makes it sound super serious, but again OpenEMR should not be taken as seriously as for instance Epic.
muglug 20 hours ago [-]
Most of these vulnerabilities could have been discovered much earlier had the same security researchers pointed a SAST tool at the codebase.
I wrote an OSS PHP SAST tool 6 years ago, but it's suffered from industry neglect — most people only care about security after an incident, and PHP has enough magical behaviour that any tool needs to be tuned to how specific repositories behave.
I agree there's a big opportunity for LLMs to take this work forward, filling in for a lack of human expertise.
unethical_ban 19 hours ago [-]
Where can I learn more about SAST, and do you have a link to your tool?
I stood up a Dokuwiki instance recently and had Qwen look through the codebase, and it didn't find anything critical. It identified "fragile patterns", though.
This is the new trend that keeps me awake at night: adversaries now have access to off-the-books inference, and they will be able to scan pretty much any widely used open source project and discover and exploit zero days. I think making a project closed source offers a bit more security, but that will only buy time, as it is possible to reverse engineer it with current closed source models with extreme ease.
If you are sufficiently funded, you could benefit from the flip side of discovery, but it looks bleak if you are the sole maintainer of a large project that is a dependency in many deployed instances, with no revenue or donations, and nobody digging deep enough to care or to spend inference on it (would your company spend the money on extra inference? is the question, more often than not). With this happening on both sides of the fence, we are going to see massive disruptions across the board.
Cybersecurity is becoming a proof-of-work of sorts, and the race is on. There might be an unknown number of zero days being silently discovered and deployed, which likely has an impact on the economics too, making access far more widespread.
I do wonder if this means our tech stacks will go back to being as boring and simple as possible... you wouldn't hack a static HTML website being served by nginx, would you?
eithed 19 hours ago [-]
It's nothing new - even without LLMs you have automated tools that will try stuff to see if your application is vulnerable. You can abuse a misconfigured nginx server. To be fair, to your point, LLMs are amazing pattern recognizers: "this pattern it has seen in this codebase applies to that codebase, so a vulnerability is most likely." I'm unsure if they can "innovate" (still, recognizing patterns is enough). And while it can spot "this pattern it has seen causes a crash", I don't know if we're at the point where it can put two and two together and use a set of unrelated code issues to, for example, exfiltrate credentials.
rustyhancock 20 hours ago [-]
I think we'll see a lot more of this (and it's a good thing).
Automation doesn't usually replace humans it just hikes up the floor.
I.e. nearly all of these bugs (most bugs in general?) will be spotted quickly by a trained eye. But it's hard to get trained eyes on code all the time. AI will catch all the low-hanging fruit.
What's great about this is that it all seems mostly low-hanging, i.e. even basic AI will help people patch holes.
giancarlostoro 20 hours ago [-]
I've said it a few times, and I will keep saying it, especially for the anti-AI crowd. Sure, you don't want it to write your code, fine, not bothered at all, but reviewing your code for serious security flaws and enhancing security audits? You definitely want AI there. I foresee that in the next few years we will see all sorts of companies, sites, and critical infrastructure being hacked. Heck, we're already seeing more and more of this. It's not going to end very well. If your company is sleeping on its cyber security, tomorrow isn't when you want to deal with it - get on it while you still can.
I say this purely as a Software Engineer, not a security expert, but you have to consider hackers can, are, and will use AI against you.
The Mexican government was hacked by people using Claude[0]. This was apparently many government systems and services - all that PII for everyone in the country in these systems. Even if Claude somehow "patches" this, there are so many open source models out there, and they get better every day. I've seen people fully reverse engineer programs, disassembling their original code and turning it into compilable code in its original programming language, with Claude happily churning until it is fully translated, compiles, and runs.
Whatever your thoughts on AI are, if you aren't at least considering it for security auditing (or to enhance security auditing) you are sleeping at the wheel just waiting to be hacked by some teenager skiddie with AI.
I've bounced back and forth on my feelings for AI and have landed in the realm of:
- there are certain things it is exceptional at that humans cannot replicate.
- there are certain things I do not want to use it for.
And review falls squarely in that first category. Similarly, it is exceptional at working through "low hanging fruit" type problems such as spotting inefficiencies, analyzing a profile to find flaws in software, etc.
caycep 18 hours ago [-]
how healthy is the open source community around openEMR? I feel like, by nature, it is decidedly more unsexy and less attractive for volunteers to work on. I work in healthcare, and the PTSD from various EMRs runs so deep that working on an actual EMR is the most unappealing thing I can think of to tinker around with, code-wise....
joshghent 19 hours ago [-]
Had exactly the same sort of experience using AI to audit a code base we inherited recently at $dayJob.
It spotted over 100 "security issues", but after whittling them down via reproduction scripts and validating they were real CVEs, that number was around 30.
Even so - it was a huge win and something we wouldn't have spotted otherwise.
It’s something I’ve now codified into repowarden.dev
dgb23 20 hours ago [-]
It seems to me that this sort of work is actually a very fitting use case for LLM agents and the like, because they can be trained and tuned to find commonly known vulnerability patterns.
Here, something that looks like the thing is a strong signal, as long as the probability is high enough to be useful.
Remember Netflix‘ chaos monkey?
KaiserPro 16 hours ago [-]
Ok this is cool and all, but show me the prompts.
Was this autonomous, as in "look at this repo and find me all the CVEs that could exist"?
Or was it much more guided?
bobkb 18 hours ago [-]
I had reviewed this application long ago and found tons of issues - not surprised many of them are now CVEs. I am also surprised that the product is still active.
mbesto 20 hours ago [-]
What's probably WAY worse than this is that most healthcare providers running OpenEMR are likely on older versions of OpenEMR with already-known CVEs.
doctorpangloss 20 hours ago [-]
Nobody uses OpenEMR. No chance. They are lying about their numbers.
tecleandor 20 hours ago [-]
Well, maybe it's not popular at bigger hospitals, but back in the day I think it was relatively popular with smaller practices, even in the US. I don't know if it has lost traction (or not) with the popularization of cloud services; I'm not super up to date...
whartung 20 hours ago [-]
I can't speak to OpenEMR, but OpenMRS is popular overseas, and has done a lot of work in Africa.
OpenEMR may be in similar spaces.
ranger_danger 20 hours ago [-]
> used by over 100,000 medical providers serving more than 200 million patients across 34 languages
Interesting... I have been working with many different EHR platforms across the country for the last 15 years and I have never heard of OpenEMR before, or any open-source platform for that matter.
jjwiseman 20 hours ago [-]
The Aisle "the moat is the system, not the model" blog post comparing Mythos' results to their system's was misleading, and seemed to be an attempt to ride the coattails of attention on Mythos. It was of low enough quality that I'd want to see more details of exactly how these vulnerabilities were found.
breezedream 16 hours ago [-]
It’s marketing content. This is a fundamental aspect of the cybersecurity business: marketing fused with research. Seems HN is susceptible to it.
motoxpro 20 hours ago [-]
A better headline would be "AI finds mistakes made by humans". It's not that it's doing something novel; every single person in this thread has made mistakes, and big ones, not because we aren't trying - it just happens. AI helps find some mistakes: not all, not every time, not without effort, not without slop/false positives, just some mistakes. That's a very good thing.
hereme888 20 hours ago [-]
OpenEMR? Used by some missionary doctor in remote Afghanistan?
floatrock 19 hours ago [-]
Right. You're not a real medical group unless you go through an 18-month RFP procurement cycle including being wined and dined by the Epic rep who already knows they're gonna get your $50MM wallet because they're golf buddies with your CEO and already embedded with all your labs. God forbid anyone practicing Real Medicine tries to go the OSS route, medicine is too complicated for something like that.
jabl 18 hours ago [-]
$50M? Pfft. The regional health service provider over here has spent close to a billion € migrating to Epic over the past decade. The feedback has been so devastating they're apparently now considering starting over from scratch. Love seeing the consultants lighting my tax money on fire like that.
DANmode 17 hours ago [-]
Remote Afghanistan has 100,000 medical providers working in 34 languages?
hereme888 13 hours ago [-]
34 dialects? The numbers are here and there, you know.
0123456789ABCDE 20 hours ago [-]
something i am missing in this area is education and services.
if, during an automated code review, claude finds a vulnerability in a dependency, where should i direct it to share the findings?
who would be willing to take the slop-report, and validate it?
i've never done vulnerability disclosure, yet, with opus at max effort, i have found some security issues in popular frameworks/libraries i depend on.
a proper report can't be one pass, it has to validate it's a real problem, but ask opus to do that and you run the risk of the api refusing the request, endangering your account status. you ask it to do it anyway, and write a report, and now you're burning tokens on a report that's likely to be ignored, because slop.
so i sit on this, and hope it doesn't hit me.
hedgehog 20 hours ago [-]
It often takes strong understanding of the upstream codebase and roadmap to write a good patch. It's easy enough to write a rough PoC and draft patch but getting all the way through the cycle takes up a bunch of time both from you and the maintainers (who are often already overloaded). My advice would be to draft a bunch privately, take one of the highest impact all the way through a deployed fix, and then plan based on what you learn. Some people's answer is to maintain private forks with automated fixes applied, with a periodic rebase on upstream.
0123456789ABCDE 20 hours ago [-]
i'm well aware that a pull-request with a fix is a lot of work. i don't pretend to have the capacity to do this, with all the rest i have to attend to.
it just doesn't sit well with me that i am aware of something being broken and am not telling someone who would otherwise want to know about it.
hedgehog 20 hours ago [-]
In my opinion maintainers can easily run a "hey robot, scan my code for risky patterns" to get a rough list, or they can solicit unreviewed contributions, but otherwise better not to add noise.
0123456789ABCDE 20 hours ago [-]
i'd be happy to use an official skill for vulnerability reporting
the skill would be manually triggered when vulnerabilities are found; do another pass for details; version, files, lines, then write a lightweight report and submit somewhere. anthropic could host this, or work with h1 to do that. when the models have extra capacity a process comes around and picks up these reports one by one, does another check, maybe with proof-of-concept, reports through proper channels.
esafak 19 hours ago [-]
Share it in the repo's issues, discussions, or chat?
0123456789ABCDE 19 hours ago [-]
that would be full disclosure, i don't particularly dislike the idea, but it's slop, the devs are already overwhelmed, i don't fully understand the legal implications i would be exposed to.
esafak 19 hours ago [-]
You can omit the details and share them on request.
danaw 19 hours ago [-]
now let's open source all healthcare systems so we can at least collectively improve these things rather than trusting companies like oracle to be good faith actors with equally acceptable security
asah 19 hours ago [-]
only 38 CVEs - that's pretty good!
...so far !
Exoristos 20 hours ago [-]
Now do Epic.
DANmode 17 hours ago [-]
Is there a process server at your place yet?
0xdeadbeefbabe 20 hours ago [-]
Also the attackers may become hypochondriacs after reading too much medical stuff.
MattCruikshank 19 hours ago [-]
EDIT: Looks like they did responsibly disclose - that's nice. I missed the single line at the bottom of the article. I'd prefer if an article like this opened with a paragraph about their conversation with the maintainers, and how all vulnerabilities have already been patched, etc. But I guess that's a personal preference.
===
Did they privately disclose these vulnerabilities to the developers and give them a reasonable amount of time to fix them, before they announced them to the world?
Because, and I'm going to highlight, if someone exploits a CVE in an EMR, they can wreak havoc on actual real patient data, and can endanger health and lives.
"Option 1 (preferred) : Report the vulnerability at this link. See Privately reporting a security vulnerability for instruction on doing this."
Did they do that?
Because if they didn't responsibly disclose, this sure seems like a hit job performed by someone who'd rather EMR software be closed source.
1970-01-01 19 hours ago [-]
RTFA, Matt. Your answer is at the end of it.
MattCruikshank 18 hours ago [-]
Have you heard of the term, "bury the lede"?
I'd love to see an opening paragraph like this one:
"All discovered vulnerabilities have already been patched. We waited to publish this article until they were. Release 8.0.3 addresses all of them, and we advise updating as soon as possible. We waited until 95% of installs had already updated to that version."
Looks like every single one of the 38 vulnerabilities were either SQL injection, XSS, path traversal or "Insecure Direct Object Reference" aka failing to check the caller was allowed to access the record.
This is actually a pretty good example of the value of AI security scanners - even really strong development teams still occasionally let bugs like this slip through, having an AI scanner that can spot them feels worthwhile to me.
Seems like code review against a checklist of the most common vulnerabilities would have prevented these problems. So I guess there are two takeaways here:
First, AI scanners are useful for catching security problems your team has overlooked.
Second, maintaining a checklist of the most-common vulnerabilities and using it during code review is likely to not only prevent most of the problems that AI is likely to catch, but also show your development team many of their security blind spots at review time and teach them how to light those areas. That is, the team learns how to avoid creating those security mistakes in the first place.
'With enough eyes, all bugs are shallow' and AI is an automatable eye that looks at things we can tell nobody has seriously looked at before. It's not a panacea, there will be lots of false positives, but there's value there that we clearly aren't getting by 'just telling humans to use the tools available'.
See also: modern practices and sanitizers and tools and test frameworks to avoid writing memory errors in C, and the reality that we keep writing memory errors in C.
I think there's a difference in how trivial some of these things are to detect and how difficult others are. IDOR and SQLi aren't nearly as complex as C unsafety is.
But, yes, I'd augment any manual review with a checklist and AI review as a final step. If the AI catches any problems then, your reviewers will be primed to think about why they overlooked them.
Could not agree any more strongly. These automagic tools are one thing in the hands of a dev that groks the basics like these examples. It would be one thing if new devs were actually reviewing the generated code to understand it, but so much is just vibe coded and deployed as soon as it "works". I get flack from not immediately deploying generated code because I want to take time to understand how it works. It's really grating and a lot of friction is coming from it.
https://chatgpt.com/share/69f10515-8808-83ea-abe3-a758d3144c...
If people aren't learning more with AI, that's a meta skill they need to develop.
As for training the review muscles, why would you do that if you have a linter that rejects when you make the mistake? I don't expect reviewers to check whether you eschew nulls or uninitialized variables; I expect the compiler to do that, and I expect over time that more and more things will become tooling concerns (especially given that rigid tools with appropriate feedback are clearly a massive force multiplier for LLMs).
Second, to use your example, the ChatGPT response you provided does a crappy job of explaining the root cause of problem: Namely, that every string is drawn from some underlying language that gives the string its meaning, and therefore when strings of different languages are combined, the result can cause a string drawn from one language to be interepreted as if it were drawn from another and, consequently, be given an unintended meaning.
So, if the idea is that smart teams can not only delegate the catching of problems but also the explanation of those problems to ChatGPT -- presumably because it is a better teacher than the senior engineers who actually understand the salient concepts -- I'd say AI ain't there yet.
Is that true? Is that also true of e.g. teams using type checkers to avoid nulls or exceptions? Or teams that use memory safe languages to avoid memory corruption? Or using a library that has an `unsafeStringToSql` API surface, and a linter to flag its use (where you're expected to use safe macros instead)? My experience is that better tools (or languages and library designs) scanning for issues lead to fewer defects and less playing fast and loose since the entire point of the tools is to ban these mistakes.
On education, it literally tells you that the top concern is SQL injection made possible by concatenating strings, and gives an example of an auth bypass: `name = "foo' OR 1=1 --"`. It also notes that this is not just a minor nitpick, but that actually the solution is fundamentally doing something completely different (query objects with bound parameters). If you don't understand what it means you can just ask:
> Elaborate on 1
> Walk through examples of what goes wrong and why, and how the solution avoids it
etc. The knowledge is all there; you just need to ask for it. It's an infinitely patient teacher with infinite available attention to give to you. You can keep asking follow-ups, ask it to check your understanding, etc. Or there are tons of materials about it on the web or in textbooks, and if you still don't understand, you can still ask a more senior engineer to explain what's wrong.
Yes. See: vibe coding. See also: The shockingly widespread hype for and acceptance of vibe coding across industries that ought to know better.
Do you deny that there is a correlation between AI use and not knowing what you are doing? Isn’t one of the big selling points of AI is that it lets “regular people” create “real world” projects that they could only dream about previously?
I am not saying that serious engineers don’t use AI or that when they use it, they do so foolishly. I’m only pointing out that AI has let a lot of people who don’t know what they’re doing crank out code without understanding how it works (or doesn’t).
> Is that also true of e.g. teams using type checkers to avoid nulls or exceptions? Or teams that use memory safe languages to avoid memory corruption?
No, it is not true of those teams. When people choose to use languages with statically checked types or with memory safety or the other examples you offered, they are rarely doing it because they have no idea how to write sound code. But when people turn to AI to crank out code they couldn’t write themselves (see: vibe coding), that’s what they are doing.
> On education, [ChatGPT] literally tells you that the top concern is SQL injection from essentially concatenating strings, and gives an example of an auth bypass: `name = "foo' OR 1=1 --"`. If you don't understand what that means you can just ask...
Again, that’s a crappy explanation of the real problem. It promotes no understanding of the underlying issue—that strings are drawn from languages that give them their meanings. And, unless you understand that it’s a crappy explanation that ignores the underlying issue—which a person being gaslit by the crappy explanation would not—what stimulus is going to provoke you to ask for a better explanation? How are you going to know that the crappy explanation is crappy and tell ChatGPT to take another direction?
> The knowledge is all there; you just need to ask for it. It's an infinitely patient teacher with infinite available attention to give to you.
Yeah, and if it steers you down a crappy path, such as in your sql-injection session with ChatGPT, it will be infinitely happy to keep leading you down that crappy path. Unless you know that it’s leading you down a crappy path, you won’t be able to tell it to stop and take another path. But if you are relying on the AI to tell you what’s good and what’s crappy, you won’t be able to tell which is which. You’ll be stuck on whatever path the AI first presents to you.
> Or there are tons of materials about it on the web or in textbooks, and if you still don't understand, you can still ask a more senior engineer to explain what's wrong.
And that’s equivalent to “don’t ask the AI, use a traditional resource,” right?
If you're a "regular person" vibe coder, you're not doing code reviews with a team anyway. You presumably had no teacher and no one to tell you your mistakes. So having a security bot is strictly an improvement.
If you're on a professional team, then you're presumably in the non-foolish group that already had standards, and is using it as a tool as with any of the other quality tools they use. And if they don't have standards and don't know this stuff already, well, the bot is again an improvement. At least it raises the issue for someone to ask what it means.
If you're a professional, I also assume you've heard of SQL injection (does it never come up in a CS degree?), so you don't really need more than a "this method is exposed to SQL injection" explanation. It's like saying "tail recursion is preferred because it compiles to a loop, so it's not prone to stack overflow". It assumes it doesn't need to elaborate further, but if you don't understand a term, you can just ask. Or look it up.
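(To make that analogy concrete, a toy Scala sketch, just illustrating the claim about tail calls:)

    import scala.annotation.tailrec

    def sumAll(xs: List[Int]): Long = {
      // The recursive call is in tail position, so the compiler rewrites it
      // into a loop: no stack frame per element, hence no stack overflow.
      @tailrec
      def loop(rest: List[Int], acc: Long): Long = rest match {
        case Nil    => acc
        case h :: t => loop(t, acc + h)
      }
      loop(xs, 0L)
    }

    // A list this long would overflow the stack with naive head recursion.
    println(sumAll((1 to 1000000).toList))  // 500000500000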
And yeah books or Wikipedia still exist even if you use an automated linter. You can go read about these things if you don't know them. I frequently tell my team members to go read about things. Actually I ended up in a digression about CSRF the other day (we work on low level networking, so generally not relevant), and I suggested the person I was talking to could go read about it if they're interested so as not to make them listen to me ramble.
Also I'm still unclear on why you think the explanation is crappy. It says the problem is that you're building the query via simple string substitution, shows how you can abuse quotes if you do that (so it concretely illustrates the problem), and says the better solution is better because it builds a structured object: a query with placeholders followed separately by parameters, so the query shape can't be misinterpreted. Doesn't that seem better than "strings are drawn from languages that give them their meanings"?
> Teams that decide to delegate security responsibilities to AI are more likely to do things fast and loose in general.
Note the word delegate. I claimed that teams that delegate security responsibilities to AI are more likely to play fast and loose in general. That’s because delegating security to AI—not supplementing existing security practices with AI—is likely to be a good way to launch insecure garbage into the world. AI simply isn’t good enough to get security right on its own. Maybe someday it will be good enough, but like I wrote earlier, it ain’t there yet. And any team that plays fast and loose with security is likely to play fast and loose in general.
See any problems with that logic?
I only used vibe coding as an obvious example that shows there are lots of teams that delegate security responsibilities to AI. (Vibe coders are delegating almost everything to AI.)
> If you're a "regular person" vibe coder, you're not doing code reviews with a team anyway. You presumably had no teacher and no one to tell you your mistakes. So having a security bot is strictly an improvement.
How is it strictly an improvement? Before vibe coding, “regular people” couldn't launch insecure garbage upon an unsuspecting world—they couldn't launch anything. Do you believe that it’s "strictly better" that now everyone can launch insecure garbage courtesy of their AI minions? Do you think it’s “strictly better” that lots of users are having their data sucked into insecure apps and web sites that are destined to be hacked?
> Also I'm still unclear on why you think the explanation is crappy.
It’s crappy because it tells you how to use a tool (a custom SQL interpolator) without helping you understand the cause of the problem that the tool is trying to solve. You could read this ChatGPT explanation about avoiding SQL injection in Scala and not be any wiser about how to avoid that problem in other programming languages.
Worse, you would never learn from this explanation that the underlying cause of SQL injection is the same as for cross-site-scripting holes and a host of other logic and security problems in software. That’s because ChatGPT was trained on explanations of these problems scraped from the internet, and 99% of those explanations are superficial because the people who wrote them didn’t understand the underlying issues.
But if you deeply understand the following, you will never make this kind of mistake again in any programming language:
1. Every string is drawn from an underlying language and must conform to the syntax and semantics of that language.
2. To combine strings safely, you must ensure that they are all drawn from the same language and are combined according to that language’s syntax and semantics.
Therefore, as a programmer, you must (a) understand the language beneath each and every string, (b) combine strings only when you can prove that they have the same underlying language, and (c) combine strings only according to that underlying language’s syntax and semantics.
If you understand these things, you will know how to avoid all SQL injection and XSS holes and related problems in all programming languages. Things like escaping will make sense: it converts a string in one language into its equivalent string in another language. Further, you will know when you can safely delegate some of your responsibilities to tools such as parsers, type systems, custom SQL interpolators, and the like.
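For instance, here's a hand-rolled Scala sketch (not a production escaper) of what it means to convert a string from the plain-text language into the HTML text-content language before combining it with other HTML:

    // Translate a string from the "plain text" language into the
    // "HTML text content" language; only then is it safe to splice
    // into an HTML document.
    def htmlEscape(text: String): String =
      text.map {
        case '&'  => "&amp;"
        case '<'  => "&lt;"
        case '>'  => "&gt;"
        case '"'  => "&quot;"
        case '\'' => "&#39;"
        case c    => c.toString
      }.mkString

    val comment = "<script>alert('x')</script>"
    // As plain text this is inert; spliced into HTML unconverted it would be
    // a live script element. After conversion it renders as literal text.
    println(htmlEscape(comment))  // &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;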
But you wouldn’t learn any of this from the ChatGPT explanation you received. Worse, you wouldn’t even think to ask for this deeper explanation because you would have no reason to suspect from ChatGPT’s explanation that this deeper explanation even existed.
In any case, I appreciate your willingness to continue this conversation. It’s been fun and educational and has forced me to articulate some of my ideas more clearly. Thanks!
The actual problem is that you're using strings at all. The SQL solution (which the Scala macros implement) is to use prepared statements and bound parameters, not to escape the string substitution. Basically, work in the domain, not in the serialized representation (strings). Likewise with XSS: you put the stuff into a Text node or whatever and work directly with the DOM, so the structural interpretation has already happened before the user data is examined.
But "work in the domain as much as possible" is a good idea for a whole bunch of reasons (as ChatGPT said).
It also indicated, several times, that there was more to the story. It didn't just say "because that way is safer"; it said it:
> Builds a structured query object, not a raw string
> Parameterizes inputs safely (turns $id into ? + bound parameter)
> Often adds compile-time or runtime checks
> Instead of manipulating strings, you’re working with a query AST / fragment system
And concluded by saying there's absolutely more detail, and that it's important to understand:
> If you tell me which library you’re using (Doobie, Slick, Quill, etc.), I can show exactly what guarantees sql"..." gives in your stack—those details matter quite a bit.
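To make "structured query object" concrete, here's a toy Scala sketch of the idea (deliberately not Doobie/Slick/Quill; the names are made up). The SQL text with ? placeholders and the parameter values travel together, and binding only ever happens through a PreparedStatement:

    import java.sql.{Connection, PreparedStatement}

    // Toy fragment type: query text and parameters are kept separate until
    // they're bound through a PreparedStatement, so user data can never
    // change the shape of the query.
    final case class Frag(sql: String, params: List[AnyRef]) {
      def ++(other: Frag): Frag =
        Frag(sql + " " + other.sql, params ++ other.params)

      def prepare(conn: Connection): PreparedStatement = {
        val ps = conn.prepareStatement(sql)
        params.zipWithIndex.foreach { case (p, i) => ps.setObject(i + 1, p) }
        ps
      }
    }

    // Usage: composing fragments, with the user-supplied id as a bound value.
    def findName(conn: Connection, id: String) = {
      val q = Frag("SELECT name FROM users WHERE", Nil) ++ Frag("id = ?", List(id))
      q.prepare(conn).executeQuery()
    }

The real libraries layer type-safe parameter encoding and (in some cases) compile-time checking on top, but the separation of query shape from values is the core idea.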
On vibe-coded "garbage", I expect there won't be much of a market for such things (why would there be, when you can also just vibe it yourself?), so it will be more of a personal-computing improvement, which already limits the blast radius (and maybe already improves the situation vs the precarious-by-default SaaS/cloud proliferation today, even with poor security). I also think tooling and vibe security will be better than the median professional level by the time it's actually as easy as people claim to vibe code an application. That is, security (which is an active area of improvement to sell to professionals) will probably be "solved" before ease of use. Partly because issues like code injection are already "solved" in better programming languages (which are also more concise and have better tooling/libraries in general), so the bot just needs to default to those languages.
[0] https://news.ycombinator.com/item?id=47693559
Do you believe that, because you can delegate some responsibilities without sacrificing important requirements, it follows that you can delegate all responsibilities without sacrificing important requirements? Do you not understand the difference between delegating to the computer proofs, such as type checking, that it can discharge faithfully without error, and delegating something as wide and perilous as security to something as currently flawed as AI?
> An LLM isn't an ideal solution to linting, but if you're stuck with a language with a weak type system maybe that's all you can reasonably do.
No, in such a situation you can add LLM-based checks to your responsibility for security. But you can’t delegate away your responsibility to LLMs and say that you care about security. AI ain’t there yet.
> The actual problem is that you're using strings at all.
What percentage of the world’s existing code do you believe does not use strings at all? Tragically, that is the world we live in.
> Basically, work in the domain, not in the serialized representation (strings).
Sure, but you can’t do all your work in the domain. At some point you must take data from the outside world as input or emit data as output. And, even if you are lucky enough to work in a domain where someone has done the parsing and serialization and modeling work for you so that you have the luxury of a semantic model to work with instead of strings, who had to write that domain library? What rules did that person have to know to write that library without introducing security holes?
> [ChatGPT] did also several times indicate there was more to the story.
Great. Then show me how a person who didn’t know of the existence of the rules I shared with you in my previous post would naturally arrive at them by continuing your conversation with ChatGPT.
> security (which is an active area of improvement to sell to professionals) will probably be "solved" before ease-of-use.
I think that this is a naive hope. Security is different from virtually all other responsibilities in computing, such as ease of use, because getting it right 99.99% of the time isn’t good enough. In security, there is no “happy path”: it takes just one vulnerability to thoroughly sink a system. Security is also different because you must expect that adversaries exist who will search unceasingly for vulnerabilities, and they will use increasingly novel and clever methods. Users won’t probe your system looking for ease-of-use failures in the UI. So if you think that AIs are going to get security right before ease-of-use, I think you are likely to be mistaken.
But it doesn't give you the same benefit. It gives you the partial benefit of catching these problems before they go to production, but it doesn't give you the remaining benefit of teaching your team about where their mental models are broken. A team that decides to delegate this responsibility entirely to AI is going to have a hard time learning about these serious defects in their mental models. Fixing those defects will pay dividends throughout the code base, not just in the places where AI would detect security failings.
If I were in charge of a 25-year-old PHP application, tracking down every SQL query and converting it to a safe form would be high on my list of priorities. You don't need AI for that, just ripgrep and a basic amount of care for your users.
Unfortunately you have no easy way of checking if closed source projects are similarly amateur.
If an AI uses static analyzers to do the work, is it the tool or the AI?
if AI is using grep to do the work, is it the AI or grep?
I mean essentially all agent work boils down to "cat or grep?"
agreed, though I think you'd be hard-pressed to find anyone who uses healthcare-related software professionally who thinks any "really strong development team" was involved in its creation.
And these were also automatic. It looks very likely that the team didn’t give a damn about even the most basic security and good practices.
Like a house made of paper wouldn’t be an example of the insecurity of the construction industry.
One would think a single really strong developer, let alone a team, would look for interpolation in strings fed to an RDBMS?
Classic.
Are you fuckin' serious? This would be caught with any self-respecting scanner even 5 years ago and with most educated juniors even earlier.
I use AI every day, but I'm not deep enough in the dilulu to believe that everything above two brain cells should be a transformer.
People thinking that this isn't the case everywhere need a reality check. Most software is riddled with obvious security issues. If we can remediate them with AI, great, but don't think this is something we could only have dealt with by using AI. Enough attention and prioritization of these issues would also have sorted it.
Ask yourself if we weren't currently in an era of AI-focus and AI was just another boring tool, if we would be bothering to do this sort of thing. Loads of us still aren't bothering with basic static analysis.
Point is unless it eventually becomes cheap enough that we all have this at home and can run SOTA analysis ourselves, this too will pass. I imagine it will get cheap enough fwiw, but.. yea.
I'm sure that doesn't matter much to big tech folks seeking to fill that promotion packet though, or to executives seeking to demonstrate the overwhelming utility of this new income stream.
Not that I expect companies to be more proactive now; I lost that illusion long ago. With AI they could be at least a little bit more proactive, which I guess is a great selling point of AI to corporate.
Back in 2010, as a security engineer, I also looked at OpenEMR. It was an absolute disaster, and was (and is) somewhat well-known as such. I found and published vulnerabilities very similar to these sixteen years ago. This is not exactly the Fort Knox of software.
It makes sense for AISLE to demonstrate that they're able to find vulnerabilities here, but I'd love to see a side-by-side comparison of modern SAST and DAST reviews. I bet we'd find similar vulnerabilities.
> I was the main contributor and maintainer to OpenEMR about ~20 years ago and then decided it was irredeemable and started over with ClearHealth/HealthCloud. Shockingly some of my code lives on (from PHP 3). I am reluctant to say don't use it but if you do please don't expose it to anything public, which sadly happens most of the time. There are some real problems that exist in that code base from a security and HIPAA perspective.
Finding SQL injections etc. is definitely valuable, but at the same time they did not hack Epic; the "100000 medical providers" number links to https://www.hhs.gov/sites/default/files/open-emr-sector-aler... which links to open-emr.org/blog/openemr-is-proud-to-announce-seamless-support-for-telehealth/ which...404s. Per archive.org the source is something the CEO of the now-defunct lifemesh.ai said.
"medical record software" makes it sound super serious, but again OpenEMR should not be taken as seriously as for instance Epic.
I wrote an OSS PHP SAST tool 6 years ago, but it's suffered from industry neglect — most people only care about security after an incident, and PHP has enough magical behaviour that any tool needs to be tuned to how specific repositories behave.
I agree there's a big opportunity for LLMs to take this work forward, filling in for a lack of human expertise.
I stood up a Dokuwiki instance recently and had Qwen look through the codebase, and it didn't find anything critical. It identified "fragile patterns", though.
If you are sufficiently funded, you could benefit from the flip side of discovery. But it looks bleak if you are the sole maintainer of a large project that is a dependency in many deployed instances, with no revenue or donations, and nobody digging deep enough to care or to spend the inference (would your company spend the money on extra inference? that is the question, more often than not). On both sides of the fence, we are going to see massive disruptions across the board.
Cybersecurity is becoming a proof-of-work of sorts, and the race is on. There might be an unknown number of zero-days being silently discovered and deployed, likely with an impact on the economics too, making access far more widespread.
I do wonder if this means our tech stacks will go back to being as boring and simple as possible... you wouldn't hack a static HTML website served on nginx, would you?
Automation doesn't usually replace humans; it just hikes up the floor.
I.e. nearly all of these bugs (most bugs in general?) will be spotted quickly by a trained eye. But it's hard to get trained eyes on code all the time. AI will catch all the low-hanging fruit.
What's great about this is that it seems mostly low-hanging fruit, i.e. even basic AI will help people patch holes.
I say this purely as a Software Engineer, not a security expert, but you have to consider hackers can, are, and will use AI against you.
The Mexican government was hacked by people using Claude[0]. This apparently hit many government systems and services, with PII for everyone in the country in those systems. Even if Claude somehow "patches" this, there are so many open source models out there, and they get better every day. I've seen people fully reverse engineer programs, disassembling the original code back into compilable code in its original programming language, Claude happily churning until it is fully translated, compiles, and runs.
Whatever your thoughts on AI are, if you aren't at least considering it for security auditing (or to enhance security auditing), you are asleep at the wheel, just waiting to be hacked by some teenage skiddie with AI.
[0]: https://news.ycombinator.com/item?id=47280739
I've bounced back and forth on my feelings for AI and have landed in the realm of:
- there are certain things it is exceptional at that humans cannot replicate.
- there are certain things I do not want to use it for.
And review falls squarely in that first category. Similarly, it is exceptional at working through "low hanging fruit" type problems such as spotting inefficiencies, analyzing a profile to find flaws in software, etc.
Spotted over 100 "security issues", but after whittling them down via reproduction scripts and validating they were real CVEs, that number was around 30.
Even so, it was a huge win and something we wouldn’t have spotted otherwise.
It’s something I’ve now codified into repowarden.dev
Here, something that looks like the thing is a strong signal, as long as the probability is high enough to be useful.
Remember Netflix's Chaos Monkey?
Was this autonomous, as in "look at this repo and find me all the CVEs that could exist"?
Or was it much more guided?
OpenEMR may be in similar spaces.
Interesting... I have been working with many different EHR platforms across the country for the last 15 years and I have never heard of OpenEMR before, or any open-source platform for that matter.
if, during an automated code review, claude finds a vulnerability in a dependency, where should i direct it to share the findings?
who would be willing to take the slop-report, and validate it?
i've never done vulnerability disclosure, yet, with opus at max effort, i have found some security issues in popular frameworks/libraries i depend on.
a proper report can't be one pass, it has to validate it's a real problem, but ask opus to do that and you run the risk of the api refusing the request, endangering your account status. you ask it to do it anyway, and write a report, and now you're burning tokens on a report that's likely to be ignored, because slop.
so i sit on this, and hope it doesn't hit me.
it just doesn't sit well with me that, i am aware of something being broken, and not telling about it to someone who would otherwise want to know about it.
the skill would be manually triggered when vulnerabilities are found; do another pass for details (version, files, lines), then write a lightweight report and submit it somewhere. anthropic could host this, or work with h1 to do that. when the models have extra capacity, a process comes around and picks up these reports one by one, does another check, maybe with a proof-of-concept, and reports through proper channels.
...so far!
Did they privately disclose these vulnerabilities to the developers and give them a reasonable amount of time to fix them, before they announced them to the world?
Because, and I'm going to highlight this, if someone exploits a CVE in an EMR, they can wreak havoc on actual real patient data, and can endanger health and lives.
https://github.com/openemr/openemr/security
"Option 1 (preferred) : Report the vulnerability at this link. See Privately reporting a security vulnerability for instruction on doing this."
Did they do that?
Because if they didn't responsibly disclose, this sure seems like a hit job performed by someone who'd rather EMR software be closed source.
I'd love to see an opening paragraph like this one:
"All discovered vulnerabilities have already been patched. We waited to publish this article until they were. Release 8.0.3 addresses all of them, and we advise updating as soon as possible. We waited until 95% of installs had already updated to that version."