About "Metamorphic Testing" (property based tests)

vmagnin · March 3, 2022, 9:29am

I have stumbled on “metamorphic testing”. If you use tests in your programs, you have probably written some metamorphic tests without knowing what they are called. For example, you may have exploited some symmetry in your problem, as in this canonical case: you don’t need to know the exact value of $\sin(1.2)$ to test that $\sin(1.2) = \sin(\pi - 1.2)$.
I have read the survey you will find below and there are some papers that could be interesting for us, particularly about modelling, numerical programs and compilers. But the concept is so general that it still feels quite fuzzy to me. What do you think about it?

awvwgk · March 3, 2022, 10:26am

I use this kind of testing a lot in my unit tests (and now I also have a name for it), for example when dealing with analytical derivatives (compare against a finite difference result) or when checking symmetry properties of integrals over products of spherical harmonics. In my experience, it makes it easy to parametrize a number of tests over a certain class of property, e.g. you only need one analytical vs. finite differences tester to check different boundary conditions and interaction kernels for an expression. It helps if the project has a modular structure that supports this kind of property check.
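As a minimal sketch of such a reusable tester (not taken from any actual project), the module below compares an analytical derivative against a central finite difference. The names `fd_check`, `derivative_ok`, `gauss` and `cube`, the step size and the tolerance are all assumptions chosen for illustration.

```fortran
module fd_check
   implicit none
   integer, parameter :: dp = selected_real_kind(15)

   abstract interface
      pure function func(x) result(y)
         import :: dp
         real(dp), intent(in) :: x
         real(dp) :: y
      end function func
   end interface

contains

   ! Hypothetical helper: .true. if the analytical derivative df agrees with
   ! a central finite difference of f at x within the tolerance tol.
   function derivative_ok(f, df, x, tol) result(ok)
      procedure(func) :: f, df
      real(dp), intent(in) :: x, tol
      logical :: ok
      real(dp), parameter :: h = 1.0e-5_dp   ! assumed step size
      real(dp) :: numeric

      numeric = (f(x + h) - f(x - h)) / (2*h)
      ok = abs(numeric - df(x)) <= tol
   end function derivative_ok

   ! Two illustrative functions with their hand-written derivatives.
   pure function gauss(x) result(y)
      real(dp), intent(in) :: x
      real(dp) :: y
      y = exp(-x**2)
   end function gauss

   pure function dgauss(x) result(y)
      real(dp), intent(in) :: x
      real(dp) :: y
      y = -2*x*exp(-x**2)
   end function dgauss

   pure function cube(x) result(y)
      real(dp), intent(in) :: x
      real(dp) :: y
      y = x**3
   end function cube

   pure function dcube(x) result(y)
      real(dp), intent(in) :: x
      real(dp) :: y
      y = 3*x**2
   end function dcube

end module fd_check

program test_derivatives
   use fd_check
   implicit none
   real(dp), parameter :: points(3) = [-0.7_dp, 0.3_dp, 1.9_dp]
   integer :: i

   ! The same tester is reused for every function/derivative pair;
   ! no reference values need to be stored.
   do i = 1, size(points)
      if (.not. derivative_ok(gauss, dgauss, points(i), 1.0e-8_dp)) error stop "gauss"
      if (.not. derivative_ok(cube, dcube, points(i), 1.0e-8_dp)) error stop "cube"
   end do
   print '(a)', "all derivative checks passed"
end program test_derivatives
```

Adding another expression then only requires a new function/derivative pair, not a new table of reference numbers.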

I found it provides a good way to reduce the number of reference results you have to store when writing unit tests.

Arjen · March 3, 2022, 10:26am

Well, the term is completely new to me, but I will have a look at this. Any method to improve the testing process at least deserves attention.

vmagnin · March 3, 2022, 10:36am

Yes, let me know what you think.
On my side, after reading that whole survey, my feeling is that I have learned a new term, but nothing really precise or new from a practical point of view. Probably I should read some references dealing with that concept in modelling and numerical applications to get a better view.

Arjen · March 3, 2022, 8:12pm

I have read the introduction and the overview (section 2) now and I find the method rather questionable. Suppose you are trying to solve this ODE:

dy/dx = - k y

just about the simplest you can think of without being trivial. Then you know that scaling the initial condition by a certain factor will scale the solution by the same factor. But that will also happen if you made a mistake and are actually solving:

dy/dx = - k x y
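To make this concrete, here is a small sketch that integrates both equations with explicit Euler and applies the scaling check as well as a comparison against the exact solution y0 exp(-k x) at x = 1. The integrator, the value of k, the step count and the tolerances are assumptions chosen for illustration only.

```fortran
program scaling_relation
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   real(dp), parameter :: k = 0.8_dp        ! assumed rate constant
   real(dp), parameter :: factor = 3.0_dp   ! scaling applied to y(0)
   real(dp) :: y_ref, y_scaled
   logical :: buggy, scaling_holds, matches_exact
   integer :: pass

   do pass = 1, 2
      buggy = (pass == 2)   ! second pass solves dy/dx = -k x y by mistake
      y_ref = integrate(1.0_dp, buggy)
      y_scaled = integrate(factor, buggy)

      ! Metamorphic check: scaling y(0) must scale y(1) by the same factor.
      scaling_holds = abs(y_scaled - factor*y_ref) <= 1.0e-10_dp*abs(y_scaled)
      ! Reference check: compare with the exact solution exp(-k) at x = 1.
      matches_exact = abs(y_ref - exp(-k)) <= 1.0e-2_dp

      print '(a,l2,a,l2,a,l2)', "buggy:", buggy, &
         "  scaling check:", scaling_holds, &
         "  reference check:", matches_exact
   end do

contains

   ! Explicit Euler from x = 0 to x = 1 with n steps.
   function integrate(y0, buggy) result(y)
      real(dp), intent(in) :: y0
      logical, intent(in) :: buggy
      real(dp) :: y
      integer, parameter :: n = 1000
      real(dp), parameter :: h = 1.0_dp / n
      real(dp) :: x
      integer :: i

      y = y0
      do i = 1, n
         x = (i - 1)*h
         if (buggy) then
            y = y - h*k*x*y   ! wrong right-hand side
         else
            y = y - h*k*y     ! intended right-hand side
         end if
      end do
   end function integrate

end program scaling_relation
```

Both right-hand sides are linear in y, so the scaling check holds in both passes; only the comparison against the exact solution distinguishes the intended equation from the mistaken one.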

Even simpler, the identity:

sin(x)  = sin(pi - x)

would also work if your implementation of the sine function had an embarrassing flaw and was actually:

sin(x) = 0   for any value of x

Okay, stupid examples, but what about solving a non-linear model? What transformations on the input can you think of then? If you have a PDE or system of PDEs, one transformation could be reflecting the boundary conditions, but whatever errors you flush out with such a procedure, you are not testing the actual outcome much, leaving your system open to stupidities like the above or more intricate problems that require careful analysis of the actual outcome.

Short summary: the idea is sympathetic and attractive, but I fear that in many practical situations it is at most a complementary technique and not an easy-to-implement alternative.

certik · March 3, 2022, 10:22pm

I do not use this method for the same reason @Arjen wrote. Here is an example of a test for sin in LFortran: integration_tests/sin_04.f90 (commit c5903bf4ae8db84722935639f95843f21e486bce, lfortran/lfortran on GitLab). As you can see, it checks an actual value. A very common error is that the function returns 0; this has happened to me for several reasons:

- The C interop under the hood is broken, and we get a random floating point number, which very often is zero
- A mistake in the LLVM backend (any number of bugs, in fact), which results in returning 0

If instead we rewrote the line:

if (abs(sin(1.5_dp) - 0.997494996_dp) > 1e-5_dp) error stop

to

if (abs(sin(1.5_dp) - sin(pi - 1.5_dp)) > 1e-5_dp) error stop

Then this test would still pass in the presence of the most common errors that LFortran had in the past while I was developing this feature.
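A small sketch illustrates the point. The function `broken_sin` is a hypothetical stand-in for a faulty implementation that always returns 0; it is not LFortran code.

```fortran
program broken_sin_demo
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   real(dp), parameter :: pi = 3.141592653589793_dp
   logical :: symmetry_ok, reference_ok

   ! Symmetry check: satisfied by any constant function, including the broken one.
   symmetry_ok = abs(broken_sin(1.5_dp) - broken_sin(pi - 1.5_dp)) <= 1e-5_dp

   ! Reference check: compares against the actual, nonzero value.
   reference_ok = abs(broken_sin(1.5_dp) - 0.997494996_dp) <= 1e-5_dp

   print '(a,l2)', "symmetry check passes: ", symmetry_ok
   print '(a,l2)', "reference check passes:", reference_ok

contains

   ! Hypothetical faulty implementation: ignores its argument and returns 0.
   pure function broken_sin(x) result(y)
      real(dp), intent(in) :: x
      real(dp) :: y
      y = 0.0_dp
   end function broken_sin

end program broken_sin_demo
```

The symmetry check reports success while the reference check reports failure, which is exactly the situation described above.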

So I would always recommend checking an actual, nonzero value.

yizhang · March 4, 2022, 4:20am

There seems to be some confusion around this concept. The intention of the method is to reduce the number of reference tests (labeled tests) by reusing the reference tests in new ways. That is, if we have tested $f(x) \equiv y$, and we know a property of a map $T$, say, that it commutes with $f$, then we can test $f(T(x)) = T(f(x)) \equiv T(y)$ without needing an independent reference value for $f(T(x))$. This often helps when $f(\cdot)$ is expensive to evaluate or the reference result $y$ is not easy to find. So back to the $\sin(x)$ example: metamorphic tests do not say that we don’t need to test it against some reference results, but that we can add more tests.
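As a sketch of this pattern, assume f is the sine function and T is negation, which commutes with it since sin(-x) = -sin(x). The reference value is the one used in the LFortran test quoted earlier in the thread; the names `f`, `t` and the tolerance are illustrative choices.

```fortran
program follow_up_tests
   implicit none
   integer, parameter :: dp = selected_real_kind(15)
   real(dp), parameter :: tol = 1e-5_dp
   real(dp), parameter :: x = 1.5_dp
   real(dp), parameter :: y_ref = 0.997494996_dp   ! stored reference for f(x)
   real(dp) :: y

   ! Source test: f(x) against the stored reference value.
   y = f(x)
   if (abs(y - y_ref) > tol) error stop "reference test failed"

   ! Follow-up test: f(T(x)) must equal T(y); no new reference value is stored.
   if (abs(f(t(x)) - t(y)) > tol) error stop "follow-up test failed"

   print '(a)', "reference and follow-up tests passed"

contains

   ! Function under test (here simply the intrinsic sine).
   pure function f(arg) result(val)
      real(dp), intent(in) :: arg
      real(dp) :: val
      val = sin(arg)
   end function f

   ! Transformation T: negation, which commutes with f.
   pure function t(arg) result(val)
      real(dp), intent(in) :: arg
      real(dp) :: val
      val = -arg
   end function t

end program follow_up_tests
```

Because the follow-up test runs only after the reference test has passed, an implementation that always returns 0 is caught by the first check rather than slipping through the second.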

This is quite useful in machine learning and inverse problems. In image labeling, we can add more tests using the labeled images (they are already tested) by applying the learning algorithm to the same image but rotated, reflected, shifted, etc., without using additional training sets.

Arjen · March 4, 2022, 8:54am

Point taken :).

Such symmetries provide welcome additional test cases, indeed. But they can never replace specific tests.

Slightly OT:
Another type of test that can be useful for a certain class of applications is that of “manufactured solutions”, and that is where such a thing as automatic differentiation comes in quite handy.

vmagnin · March 4, 2022, 10:12am

Yes, the objective is to create more tests starting from successful basic tests, as explained by @yizhang .
People try to create a mathematical formalism for these tests. A kind of algebra?

I have read the following 2014 paper about those tests in multi-precision arithmetic computations. I am not yet convinced that the concept can bring more than what our intuition would bring. But the research field is recent and reading more recent papers could reveal if it has produced something fruitful or not.

https://www.researchgate.net/publication/289316333_Metamorphic_relations_to_improve_the_test_accuracy_of_Multi_Precision_Arithmetic_software_applications

everythingfunctional · March 4, 2022, 2:27pm

I’ve learned about such things under the name “Property Based Tests.” Perhaps this is the academic jargon equivalent. As others have pointed out, they are not always sufficient by themselves, but do provide a method of saying in code (and thus able to be verified by execution) what an expected behavior of your system is, which I think is quite valuable.

vmagnin · March 4, 2022, 2:32pm

Thanks, I like “Property Based Tests” which is a clear and simple expression. Metamorphic sounds rather bombastic or esoteric, you need to read paragraphs to understand what it means… and finally feel rather disappointed.
=> I have modified the title of my post!
