Proving Software Correctness

CajunCenturion · Feb 16, 2010

Can you prove that some piece of software has no bugs?
Edsger Dijkstra said no when he said

Edsger Dijkstra said:
Program testing can be used to show the presence of bugs, but never to show their absence!

I believe that the statement is true. You cannot never prove there are no [more] bugs in any given piece of software. You can only say there are no more known bugs.

On the other hand, thargtheslayer posted

thargtheslayer said:
Actually I differ with the illustrious Mr Dijkstra on that one. It demonstrably is possible to prove that software is working correctly, by the use of automated unit test tools and the like.

For example, I believe that the 4 colour map problem was solved several years ago by computer. The protagonists (I believe) simply set a computer to explore every possible combination of maps and awaited the outcome. Obviously this must have relied upon some form of topology, but I am not mathematically competent to pass judgement.

So, if a program has a domain space of say 10 billion outcomes, one can set a test tool running to explore its performance under each and every circumstance which may ever arise. By such exhaustive examination, it is therefore (in my opinion) possible to prove that a program has no bugs within a restricted solution domain.

I believe that Mr. Dijkstra was speaking in general, and his statement is therefore self-evidently true, but I assert that it may be possible to demonstrate to the contrary in restricted answer domain problems.

thargtheslayer has made some good point, but raises a number of questions.
[li]Does "prove that software is working correctly" mean the same as "prove the absence of bugs"?[/li]
[li]Can you prove that a program has a fixed domain space of outcomes when initial data condition are variable? User input is involved?[/li]

If you use an automated test tool, it would be necessary to prove that every possible path through the program is tested with every possible set of initial condition with every possible data input option along the way. How can you be sure that there is not another path with a different data set that leads to a failure?

With respect to the assertion that "it may be possible to demonstrate to the contrary in restricted answer domain problems", I believe the validity of the assertion is directly proportional to the amount of restriction is the answer domain. And that's if you can prove a restricted answer domain, and of course, whether that restriction is reasonable within the problem space.

For everyone, please share you thoughts.

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

audiopro · Feb 16, 2010

Very few apps cannot be broken by user intervention.
I wrote an app. many years ago, one of my first and was very proud of it.

That was, until a user wrote in the log book.......

"If you press F3, 11 times, it crashes."

Keith

http://www.studiosoft.co.uk

stackdump · Feb 16, 2010

I would say software can be proven to be bug free, but it depends on the whether the search space is confined sufficiently that testing can be exhaustive. (Let's ignore any faults that can be contributed from hardware).

The statement that software can *never* be proven to be bug free is not well put. I guess I agree with CC that unless software can be validated across its entire input range, in a realistic time, then validation must rely on sampling and assertions which inevitably incurs a probability of error. I have some experience of using code coverage tools (LINT) and assertion engines to validate code and have still found bugs that survived the sieve.

Slightly off track but an important facet we should not ignore is that a lot of commercial software is launched out before the programmers are ready to cut it loose. There is an old joke that most software doesn't get released, it escapes. Joking aside, the writers of buggy software will frequently confess that they were bullied into pushing versions out of the door ahead of time, or put another way, commercial pressure compromised quality. There are too many examples of disasters caused by this circumstance.

I recently used an online check-in at an airport. This was a new facility at that airport, new screens, new software, new everything. I had no bags to be loaded and on completion of printing the boarding card I got the instruction to "Now take your 0 bags to the luggage drop point". I find it hard to believe that the programmers of such a system could omit a conditional to say that if NumberOfBags>0 then print that message, or that someone did not notice that during testing. There were various other obvious bugs too, my colleagues boarding card came out with a different persons name on it! This check-in system was a product of one of the most famous hardware/software manufacturers. I suspect the poor quality was not caused by bad programmers or a result of bad testing, but was most likely caused by commercial pressure to launch it asap, regardless of whatever state it was deemed to be in (i.e. not finished and not tested).

kmcferrin · Feb 17, 2010

It is a basic scientific principle that it is impossible to prove that something does not exist. Therefore you cannot prove that bugs do not exist in your code.

While you may think that you are able to run some sort of automated test process to run through all of the possible permutations of using an application, you actually can only run through all of the permutations that you are aware of or can conceive of. Case in point, if you don't think that someone will press the F3 key eleven times then you won't test for it. Likewise, you cannot test every possible deployment scenario or interactions with other applications that may be installed. There are a plethora of situations that you may not be able to forsee or adequately test for.

More to the point, if that capability did exist to test every possible permutation it would not be feasible financially or from a time perspective. If you don't want your competition to rocket past you in the market you make sure that you release often and with as few known bugs as possible, and then patch afterwards.

Anyone who tells you that their software is bug free is either deluded or lying, possibly both.

________________________________________
CompTIA A+, Network+, Server+, Security+
MCTS:Windows 7
MCTS:Hyper-V
MCTS:System Center Virtual Machine Manager
MCTS:Windows Server 2008 R2, Server Virtualization
MCSE:Security 2003
MCITP:Enterprise Administrator

stackdump · Feb 17, 2010

Good points, but I wonder if that is really true in every scenario. If we switch to a philosophical train of thought and ignore practicality for a moment, consider the following...

If I take a 'raw' computer with no applications installed, no services running etc. If I write some code that adds two single digit integers, where the user has a multiple choice selection for number A (0-9 radio buttons) and a similar multiple choice for B together with an 'Add' button. I can evaluate every combination of A and B because the search space is very small. Since no data input is involved, no parsing etc. then its hard to see how such an app could not be completely validated under all conditions (and be declared free from bugs).

So could we state that an application *could* be 100% tested, provided that (a) all inputs can be exercised (b) the computer is dedicated to the function (there are no other apps/services present to interfere) (c) the hardware is working perfectly. This is of course unrealistic, but isn't there a scenario where the presence of bugs *could* be declared as zero?

As an example, I have a pocket calculator on my desk, it has never crashed or given me an incorrect result. That calculator is a dedicated computer, driven by software coded in hardware.

kmcferrin · Feb 17, 2010

OK, can you practically certify that any code is bug-free? No. Can you theoretically prove that a bit of code can be written that is bug-free? Yes.

10 Print "Hello world!"
20 Goto 10

That's completely bug-free Basic code right there. It serves no practical purpose, but it's bug-free.

________________________________________
CompTIA A+, Network+, Server+, Security+
MCTS:Windows 7
MCTS:Hyper-V
MCTS:System Center Virtual Machine Manager
MCTS:Windows Server 2008 R2, Server Virtualization
MCSE:Security 2003
MCITP:Enterprise Administrator

stackdump · Feb 17, 2010

ok. So if it's possible to produce code that is definitely bug free then Dijkstra is wrong. But its also possible to produce code for which it is impractical to determine whether it is bug free. So given that, there has to be a break point.

If a validation step could be completed in 1ns, a system of two inputs would have four combinations, so 4ns to validate. A system of 64 inputs has 1.844E19 combinations which would take 584 years. So in this case validation must either take place in parallel, or rely on sampling and assertions, which will inevitably incur a probability of error.

So my assertion would be that bug free code is possible but the break point is simply one of the time needed to completely validate it.

CajunCenturion · Feb 17, 2010

I'm not so sure that Dijkstra is wrong. We can look at the snippet of code that kmcferrin posted and intuitively know that the only bug it has is the infinite loop. That being said, intuitively knowing something and being able to prove it are two entirely different things.

And precisely with respect to that above snippet of code, what type of test would prove that it's an infinite loop?

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

stackdump · Feb 17, 2010

A code checker would easily spot that form of loop. Given that the snippet has no conditional branch of any kind within that loop, then the original intention could have been that it should run indefinitely, so we could argue that the snippet has no bugs, but the code checker should fail safe and highlight it as something to be validated by the author.

Knowing and proving are certainly different things. But we can infer that something is awry by testing syntax and searching for abnormal sequences. We know for example pythagoras theorem, but nobody has actually tested that it works for all possible dimensions of right-angled triangles. It is not always necessary to test all possible cases empirically in order to prove something to be true.

To prove that snippet would actually execute forever is impossible because we cannot wait to find out (could be a long test!). So in other words the proof comes back to being a question of time. But we know for sure that it will run indefinitely by inspection.

CajunCenturion · Feb 17, 2010

==> It is not always necessary to test all possible cases empirically in order to prove something to be true.
Of course not. The Pythagorean theorem has been mathematically proven. It is not necessary to test it against all possible right triangles because it does have a mathematical proof. However, if you don't have, or can't build a mathematical proof for your software, how can you prove it has no bugs?

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

stackdump · Feb 17, 2010

If you cannot make the equivalent of a mathematical proof then the only choices are;
(a) exhaustively check the entire search space, this is time intensive but has no risk of error.
or
(b) by structurally and also randomly sampling the search space, this is time conservative but incurs a risk of error.

The difference between (a) and (b) is only time, no other factors or influences come into it.

CajunCenturion · Feb 17, 2010

With respect to (a), before you can exhaustively check the entire search space, you must prove that there IS a finite search space. That in itself is problematic.

With respect to (b), the method itself, because it incurs a risk of error, fails short of the necessary level of proof.

Given that the snippet has no conditional branch of any kind within that loop, then the original intention could have been that it should run indefinitely, so we could argue that the snippet has no bugs, ...

That in and of itself highlights another problem with trying to prove the absence of bugs. What constitutes a bug? The point here is that you want the base a proof on a judgment of intent. How does the algorithm determine that it was the intent of the programmer to create an infinite loop or that the programmer left out a stop condition? The code is identical and the algorithm can only evaluates the code. In one case there is no bug and in the other case there is a bug. How can you base on proof on that?

Again, proving there are no bugs requires proving a negative, and as kmcferrin states, that may not be possible.

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

stackdump · Feb 18, 2010

Yes, (a) is the exhaustive case and there is nothing that cannot be tested this way. However the time and resources required to do this can be impractical. There are no exceptions, any type of software can be fully tested. We get 100% coverage but this is prohibitive in time.

Alternative (b) is the non-exhaustive flavor, its coverage is only as good as the number of test cases. So this is weaker coverage but is practical from a time perspective. The Dijkstra comment is valid here since this method is by definition subject to error.

Before software is written, a set of rules, results, desires that represent compliance should be generated. The test plan should validate that each compliance is satisfied. This is in most cases satisfied to a reasonable level of confidence using (b).

In terms of what constitutes a bug, we would have to restate that as what represents a failure to comply versus that which represents an unexpected result for which no compliance was specified. In one case it is a failure of the author to meet a specified compliance and a failure of the tester to check it (requires two failures). In the second case it is a failure of the architect and not in fact connected to the practice of writing software.

I have recently written a compliance specification for a hardware/software system that runs to over 300 pages. That system will be tested using formal verification and validation (verification and validation are not the same thing). This will involve simulation, structural analysis, constraint, inference and assertion based testing.

From this, inference testing is the most interesting because this ignores the compliances completely, this testing method works backwards and determines what the function is from the way that something behaves. What the function is concluded to be, is compared with the source and any differences highlighted. A little bit like working our the starting position of a chess game, by looking at the way pieces move and the final positions of many games.

mikrom · Feb 19, 2010

There is a mathematical theory veryfying programs.
I can remember, as I was student we had to learn something of Mathematical Theory of Program Verification.
We used for this topic the book Mathematical theory of computation by Zohar Manna.

But I don't know if this theory has a practical usage - I don't use it

CajunCenturion · Feb 19, 2010

==> Yes, (a) is the exhaustive case and there is nothing that cannot be tested this way.
You cannot test infinite-state systems this way.

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

stackdump · Feb 19, 2010

Well you can, but it would take an infinite amount of time, so that would drop into the category of being time prohibitive. So you would need to go to (b) and use something like UCLID. There's a lot of work going on this area, take a look at methods like Symbolic Forward Analysis.

Thargy · Feb 20, 2010

Also,

I believe that one or two things have crept in here that don't belong.

One poster mentioned hardware errors, well clearly that's not a bug in the software, but a hardware fault. Nobody expects software to run correctly on faulty hardware, so I reject that point.

kcmcferrin said:
It is a basic scientific principle that it is impossible to prove that something does not exist. Therefore you cannot prove that bugs do not exist in your code.

I cite the "green swan" argument here and cry "Rubbish" to Mr McFerrin. Just because nobody has ever seen a green swan, does not prove that none such exist - true. Proving that bugs do not exist in code is different because the answer space is finite.

If you can define a bug, then you must perforce define the answer space, otherwise how do you know it's a bug? So, if for a given set of inputs there exist a defined set of correct (ergo bug-free) outputs, and the program exhibits that behaviour, it is by definition bug free. The green swan argument is about a supposedly infinite answer space (i.e. all the swans in the world) but in reality, it is possible to scan the entire surface of the planet, detect no green swans, and the job is done.

Of course, the problem then becomes "demonstrate that there are no green swans on earth, the moon, mars and venus" but that's just daft - isn't it?

I am wrestling with problems that have infinite answer spaces but for which any answer is demonstrably either right or wrong. I think that's what Dijkstra was alluding to, but it's too late in the day to think clearly right now.

Regards

T

CajunCenturion · Feb 21, 2010

==> Well you can, but it would take an infinite amount of time, so that would drop into the category of being time prohibitive.
I disagree with that claim. I think there is very clear difference between the inability to prove a process because it's infinite and the inability to prove a process because it's not practical to do so. Whether or not it's practical to do so, either a process can be proven or it cannot. Finite state systems, such as the Towers of Hanoi, or other combinatorial explosive systems can be exhaustively proven even though it may take decades or centuries to do so. However, infinite stats systems cannot be exhaustively proven at all because there will always be another state that has not been tested.

Yes, there is a considerable amount of research being done in finite-state systems and a lot of it is quite interesting. One of the major hurdles are determine the stop conditions for the model. But again, these process are are models of systems and what you can with these tools is show that the model is consistent and stable. However, programs are implementations and while is one thing to heuristically prove the model, it's quite another to prove the implementation.

==> Proving that bugs do not exist in code is different because the answer space is finite.
Although the answer space may be finite, that doesn't mean there are a finite number of paths to that answer space. For example, if the only correct answer to my program is '2', then I clearly have a finite answer space. However, there are an infinite number of mathematical functions which yield an answer of two. To prove there are no bugs would require that you verify every possible mathematical function that could be evaluated by the program actually results in an answer of 2. If there are an infinite number of such expressions, then you cannot prove that every one results in a 2; therefore, you cannot prove there are no bugs in the program.

Further, even if you did show that every known function resulted in a 2, that doesn't mean that another unknown function would fail. That's the problem with trying to prove a negative. You cannot prove that something doesn't exits. You can show that there is no evidence for existence, but that's not the same a proving non-existence.

To paraphrase the green swan claim: "Just because nobody has ever found a bug in this program does not prove that none such exist."

--------------
Good Luck
To get the most from your Tek-Tips experience, please read
FAQ181-2886
As a circle of light increases so does the circumference of darkness around it. - Albert Einstein

Thargy · Feb 21, 2010

CC,

Right now, I am thinking of "proof by induction" from my A level maths days (back when I needed a calculator on a daily basis). It's modus operandi is to prove by mathematical analysis (induction) that if something is true for an integer n, then it is also true for n+1. This elegantly proves that it is true for all numbers, of which there is an infinite supply (answer domain).

So, by induction, I use an analytic method to prove one thing (and one thing only) but as a by-product, this one thing proves something about an infinite number of things.

I'm trying to apply that to infinite answer spaces in software - and failing at the moment.

Regards

T

stackdump · Feb 21, 2010

We're heading into deep philosophy here! Are there an infinite number of grains of sand on earth, well no, because if that were the case then the earth would be inifinitely large. So we by induction know that the number of grains of sand is finite, they just cannot be realistically counted. We don't need to count them on that basis, to disprove the conjecture.

From earlier discussion, given that I can write a program which is 100% testable and 100% guaranteed to be bug free, then there has to be a break point somewhere where testability becomes compromised by 'something'. I would say that the 'something' is a matter of the validation time becoming unrealistic.

In that case we have to look to the abstract and look for an analytical rather than an empirical proof. It could be argued that the Riemann Hypothesis was likely to be true, but not supported by absolute proof. Same for the final Fermat theorem. Given time and brain cells, both of these are now known to be correct. So I can well imagine that there are many currently unproven theories that will also become proven in time.

So if we were to give the world a chunk of software that is seemingly untestable, I can also envisage that the best brains would be able conclude its validity or otherwise. When that happens, perhaps the same method can be considered on a more universal basis to validate an entire raft of software. The good news is that people are working on this right at this moment. Based on our evolution to date, the impossible has had habit of becoming possible, so all we can say is that time will tell.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Proving Software Correctness

Programmer

Programmer

Technical User

MIS

Technical User

MIS

Technical User

Programmer

Technical User

Programmer

Technical User

Programmer

Technical User

Programmer

Programmer

Technical User

Technical User

Programmer

Technical User

Technical User

Similar threads

Log in

Part and Inventory Search

Sponsor