Ancestors

Toot

Written by unknowing8343@discuss.tchncs.de on 2024-08-31 at 16:50

Any AI tool to analyse a git repo for malicious code?

https://discuss.tchncs.de/post/21299499

=> More informations about this toot | More toots from unknowing8343@discuss.tchncs.de

Descendants

Written by Kalcifer on 2024-08-31 at 20:32

Huh. That’s actually kind of a clever use case. I hadn’t considered that. I presume the main obstacle would be the token limit of whatever LLM one is using (presuming that an LLM was used). Depending on the project, analyzing an entire codebase would likely require an enormous number of tokens, more than an LLM could handle, or it would just be prohibitively expensive. To be clear, that’s not to say that I know that such an LLM doesn’t exist (one very well could), but if one doesn’t, then that would be the rationale that I would currently stand behind.
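
To get a feel for the token-limit concern, here is a minimal sketch that estimates how many tokens a checked-out repo might need. It assumes roughly 4 characters per token, which is only a rule of thumb and not a real tokenizer; the suffix list and the names `estimate_repo_tokens` and `fits_in_context` are hypothetical, chosen for illustration.

```python
from pathlib import Path

# Assumption: ~4 characters per token. Real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4
# Illustrative list of source file suffixes; adjust for the project at hand.
SOURCE_SUFFIXES = {".py", ".js", ".ts", ".c", ".h", ".go", ".rs", ".sh"}


def estimate_repo_tokens(repo_dir, chars_per_token=CHARS_PER_TOKEN):
    """Return a rough token estimate for the source files under repo_dir."""
    root = Path(repo_dir)
    total_chars = 0
    for path in root.rglob("*"):
        # Skip git's own metadata directory.
        if ".git" in path.relative_to(root).parts:
            continue
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            total_chars += len(path.read_text(encoding="utf-8", errors="ignore"))
    return total_chars // chars_per_token


def fits_in_context(token_estimate, context_limit):
    """Check an estimate against a context limit supplied by the caller."""
    return token_estimate <= context_limit
```

The context limit is passed in rather than hard-coded, since it depends entirely on which model is used.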

=> More informations about this toot | More toots from Kalcifer@sh.itjust.works

Written by unknowing8343@discuss.tchncs.de on 2024-08-31 at 22:53

I understand, but I wouldn’t be surprised to see some solution out there that could maybe feed the AI chunks of code without context… It may still be able to detect “hey you told me this software is supposed to do X and here it seems to be doing Y”.

I guess we’ll have to wait a couple of years for these tools to be accessible and affordable.
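
The "feed the AI chunks of code" idea could start from something like the sketch below, which splits a file into pieces of whole lines under a size budget. The function name `chunk_source` and the default of 8000 characters are assumptions for illustration, not taken from any existing tool.

```python
def chunk_source(text, max_chars=8000):
    """Split source text into chunks made of whole lines.

    Each chunk is at most max_chars long, except that a single line
    longer than max_chars is kept intact as its own oversized chunk.
    """
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget.
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Splitting on line boundaries keeps statements readable, but each chunk still loses the surrounding context, which is exactly the limitation mentioned above.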


Written by Static_Rocket@lemmy.world on 2024-09-01 at 07:27

You would first need to define malicious code within the context of that repo. To some people, telemetry is malicious.


Written by unknowing8343@discuss.tchncs.de on 2024-09-01 at 07:38

Yes, of course, the idea would be something like passing the AI a repo link and a prompt like “this repo is supposed to be used for X, tell me if you find anything weird that doesn’t fit that purpose”.
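
A prompt along those lines might be assembled as in this sketch. The template wording and the name `build_review_prompt` are hypothetical; how the prompt gets sent to a model is deliberately left out, since that depends on whichever service or local model is used.

```python
# Hypothetical template; the wording is an illustration, not a tested prompt.
PROMPT_TEMPLATE = (
    "This repository is supposed to be used for: {purpose}\n"
    "Below is the file {file_path}.\n"
    "Tell me if you find anything that does not fit that purpose, "
    "such as unexpected network access, downloads, or obfuscated code.\n"
    "\n"
    "----- BEGIN FILE -----\n"
    "{code}\n"
    "----- END FILE -----\n"
)


def build_review_prompt(purpose, file_path, code):
    """Fill the template with the stated purpose and one file's contents."""
    return PROMPT_TEMPLATE.format(purpose=purpose, file_path=file_path, code=code)
```

Stating the purpose up front is what lets the reviewer, human or model, judge whether a given piece of code "fits".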


Written by remram@lemmy.ml on 2024-09-02 at 14:49

Probably not. Obfuscation works, and the malicious part might even depend on remote code being downloaded at either build time or run time.

There are a lot of heuristics you can use (e.g. disallowing some functions/modules) to check a codebase, but those already exist, no AI required. Unless you call static analysis “AI”, who knows.
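
As a rough illustration of the "disallowing some functions/modules" heuristic, here is a minimal sketch for Python sources using the standard `ast` module. The deny lists and the name `find_suspicious` are assumptions for the example; real static analysis tools are far more thorough.

```python
import ast

# Illustrative deny lists (assumptions); a real policy would be project-specific.
SUSPICIOUS_MODULES = {"socket", "subprocess", "urllib", "requests", "ctypes"}
SUSPICIOUS_CALLS = {"eval", "exec", "compile", "__import__"}


def find_suspicious(source, filename="<unknown>"):
    """Return sorted (line, description) pairs for deny-listed imports and calls."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Compare only the top-level package name.
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"from {node.module} import ..."))
        elif isinstance(node, ast.Call):
            # Only direct calls by bare name are caught here.
            if isinstance(node.func, ast.Name) and node.func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"call to {node.func.id}()"))
    return sorted(findings)
```

For example, `find_suspicious("import os\nimport urllib.request\nx = eval('1+1')\n")` flags lines 2 and 3. A check like this is easy to evade, for instance with `getattr` tricks or encoded strings, which is the obfuscation problem raised above.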


Written by unknowing8343@discuss.tchncs.de on 2024-09-02 at 15:08

But an AI can “realise” the code might be downloading something it doesn’t need to. That’s the point.

AI is “smart”: it understands that you told it the library was supposed to do something specific, and it can look for things that seem unrelated to the purpose of the repo.


Written by remram@lemmy.ml on 2024-09-02 at 15:20

If you’re one of those people who think every product is better if there’s “AI” on the box, then sure. What you’re describing is static analysis, though; it is not new.


Written by unknowing8343@discuss.tchncs.de on 2024-09-02 at 20:39

Where’s that tool then?


Written by fruitycoder@sh.itjust.works on 2024-09-03 at 03:59

GitLab has a SAST (static application security testing) tool.


Written by Sethayy@sh.itjust.works on 2024-09-02 at 17:26

It’s got a dataset of billions of tokens; you’re better off running the stock market as an antivirus.

Instead, if you care, use programs specifically curated for the task, like antivirus software.


Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113057663415381049