I test different AIs regularly: both ones I have already tested before (to see if they are improving) and new ones I was not aware of. I usually do that while I am developing something, not because I want AI to help me but to see how it performs in “real world” situations. I have been doing this from time to time for about two years now. Call it curiosity.
I will not mention any particular AI here, because I don’t want to sound like I am attacking or promoting any of them. Also, this is obviously not a survey but rather part of my experience from testing AIs. I will restrict myself to just two example cases.
A few days ago, I spent some time finding something I needed about a rather popular C library. Then I asked an AI as well, and it gave me a wrong answer, as is usually the case before guiding it. I gave it a hint, but still got another wrong answer. After a few iterations, I was expecting it to fall into the usual endless loop, giving the exact same answer as before even when I kept telling it “you are just repeating the same wrong answer”.
That particular AI, though, actually gave up, openly admitted it could not find the correct answer, and asked me what the answer was. I could not resist the temptation: I replied with nonsense, just presented in a reasonable-sounding way (a thing AI is notorious for). Call it revenge.
Guess what: it happily accepted my nonsense and even used it to write an example program. The program was actually well written, but based on my nonsensical answer; it didn’t try to verify my nonsense at all. Then I told it I was joking and gave it the real answer. As a small exercise, I also asked it to use that information to solve a side problem concerning computer arithmetic in general. It was not hard to solve, but I can easily think of some students failing it. That AI actually found the solution and reasoned about it well. I am guessing it used information gathered from StackOverflow or similar, since this was a general question with information available all over the web. But the fact is, the AI actually solved that little problem.
In another case, just yesterday, I asked yet another new AI to write a C function that should use a rather niche library for timing. Its answer was incorrect because it assumed that the library measures time in CPU cycles. I asked, “are you sure it’s not milliseconds?”. It said “yes, I am certain”, and added a long text of “reasoning” why. I said “you are wrong, look at the library implementation” - which is exactly what I had had to do just before asking. It kept insisting it was certain, with yet another long text of reasoning. Then I pointed out exactly where it should look. This time it admitted the time returned is indeed in milliseconds. It even pointed out some details I was planning to ask about next. I checked, and it was correct. That actually saved me a few minutes.
In this case the AI failed to find the correct answer to questions about the niche library, apparently because there is not much information about it on the web. It also failed to “think” about where to look in the library’s source files. But in the end, after guiding it, the AI could find and explain the correct answer, and even saved me some boredom digging through source code.
My general impression is that AI is indeed improving a bit. But you should never forget what it is, and never ever trust it. Never assume its answers are correct; always verify them. A certain AI even openly says so.
In “real-world” situations, where you really are looking for an answer, you still need to guide it, and with proper guidance plus a pint of luck, it can even save you some time. But overall, I don’t think the time needed to guide an AI and verify its answers is worth it. Most of the time, you will find what you are looking for faster without it.