• starelfsc2@sh.itjust.works
    1 day ago

    Yeah, I would say it’s pretty good at catching mistakes most of the time. The problem is the few times it flags something and I’m not sure whether it’s actually a mistake: it’ll argue in a way that seems plausible, which forces me to rethink and verify code that was already correct. I probably still save time with it, but it can be incredibly annoying and time-consuming when it fails.

    • fruitcantfly@programming.dev
      1 day ago

      True. Most of the false positives were easy to dismiss, but I did spend a significant amount of time on a couple of them, since it wasn’t immediately clear to me that the code was correct. In one case the agent had missed a precondition that was verified elsewhere, and in the other it had misrepresented what the C++ standard actually said.