AI outcome is inconsistent

Hi! One thing I’ve noticed about TestDriver is that the results aren’t always consistent. Even when following the exact same steps, it sometimes passes and sometimes fails. The more complicated a command is, the more likely it is to fail.

I think this could be a deal-breaker for a lot of people. Does TestDriver plan to implement a system that can store outcomes for better consistency?

Thanks @jibril. Can you tell me more about what outcomes are inconsistent? Is it when using “explore” mode or “prompt” mode? Are you using assertions?

Can you share an example of a test file that is inconsistent?

I’m not sure if it’s explore mode, but I typed the instructions live in the terminal.

The instructions were exactly the same, but on the second or third try it would sometimes fail.

Sorry for the late response btw, just saw your message now

Ah, yes, that is explore mode. Explore mode generates a .yaml file that records the steps so they can be repeated.

With explore mode, you may get something different every time. To get consistent results, use run to replay the generated .yaml file.