• fruitcantfly@programming.dev
    1 day ago

    It probably depends on which model you are using, but what you describe is more akin to the advice I’ve gotten when I asked for optimization suggestions, rather than when I asked the LLM to identify (not solve) problems in the code. When I’ve asked the LLM to identify problems, the overwhelming majority of the issues it raised were true positives, though most of them weren’t very serious either.

    • starelfsc2@sh.itjust.works
      21 hours ago

      Yeah, I would say it’s pretty good at catching mistakes the majority of the time. It’s just that the few times it does make a mistake, and I’m not sure whether it is one, it’ll argue in a way that seems plausible, which forces me to rethink and verify code that was already correct. I probably still save time with it, but it can be incredibly annoying and time-consuming when it fails.

      • fruitcantfly@programming.dev
        17 hours ago

        True. Most of the false positives were easy to dismiss, but I did spend a significant amount of time on a couple of them, since it wasn’t immediately clear to me that the code was correct. In one case the agent had missed a precondition that was verified elsewhere, and in the other it had misrepresented what the C++ standard actually said.