By Anne M Jensen, MSc, DC, DPhil (PhD)
Ever since word got around that I was studying the accuracy of muscle testing at Oxford University, I have received emails, phone calls and other messages from disgruntled muscle testers about our choice of using Muscle Response Testing (MRT) to detect truth/lies. This was not a decision that we took lightly. In fact, it was debated for over a year within my department, amongst myself, my supervisors and other advisors, all renowned clinical researchers. There were very specific and important reasons why we chose this model. It is my hope that sharing the background and the reasoning will clarify the question and quell any discontent within the muscle testing community. After all, we are playing for the same team! In this article, I will summarize the design process and explain its salient points, and hopefully the reader will gain a better appreciation of the methods we used and why we used them.
During the first year of my Oxford DPhil, I was charged with the task of figuring out what it was I planned to study, which involved talking with experts in various fields and seeing what others were doing and had done. I did not initially intend to study muscle testing. I wished to study the effectiveness of an emotional healing modality (e.g. HeartSpeak) in a population of people with minor depression. However, since HeartSpeak uses muscle testing within its protocol, and because of its seemingly poor face validity, my supervisors insisted that I first show that MRT is a valid tool. Needless to say, this endeavour took on a life of its own, so I put the HeartSpeak study on the back burner and began studying muscle testing in earnest. Following a thorough search of the muscle testing literature, I read dozens of studies; some showed promise, but most were discouraging.
This resulted in a sharp cognitive dissonance within me: as a practitioner, I knew that muscle testing was one of the biggest strengths of my practice, and yet the existing research did not support its validity. While we were reviewing the literature, my supervisor suggested that I start at the beginning, with something simple, concrete and straightforward.
My supervisor also suggested I use the STARD guidelines when designing my studies. STARD stands for “Standards for Reporting of Diagnostic Accuracy Studies.” Just as Randomized Controlled Trials (RCTs) are the accepted standard for evaluating interventions, Diagnostic Test Accuracy Studies are the established method for evaluating new tests. At first, I argued against MRT being a “diagnostic test,” because any kinesiologist knows that nothing is diagnosed (per se) with a muscle test. However, my supervisor insisted I look into it. What I discovered was this: a diagnostic test (1) detects the presence or absence of a target condition, and (2) is used to guide care. So yes, MRT did meet the criteria of a diagnostic test, and therefore I could use the STARD guidelines to design my studies, which helpfully made things quite straightforward. To assess a new diagnostic test, patients enrolled in the study are given two tests: the new test and the currently accepted standard, called the Reference Standard.
For example, if a lab develops a new blood test to detect prostate cancer, they must run studies to assess its accuracy, where all the participants would be tested using the new test and also with the PSA test (Prostate Specific Antigen) — the current Reference Standard for prostate cancer. The researchers would compare the results of these two tests, and then accuracy statistics would be calculated — one of which is “accuracy.” See Figure 1 for a graphical explanation of accuracy.
As Figure 1 demonstrates, to estimate accuracy, the results of a new test (called the Index Test) are compared to the results of the standard test in use at the time (called the Reference Standard). Since no test is 100% accurate, it is exceptionally important to choose a highly reputable Reference Standard. I realized early on that because of the skepticism already surrounding the validity of MRT, the choice of a solid Reference Standard was a key piece in designing these studies. After all, among the detractors, it is believed that MRT cannot possibly be used with any degree of accuracy or reliability. So, I needed to find an indisputable Reference Standard comparator test, one which leaves little room for uncertainty. With this in mind, we had to choose a target condition. Recall that a diagnostic test is used to detect the presence or absence of a target condition. For example, a sphygmomanometer can be used to detect hypertension (or high blood pressure), and likewise, the blood test CA-125 is used to detect ovarian cancer.
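The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the analysis code used in the studies: the function names and the eight paired results are hypothetical, and `True` is assumed to mean "target condition detected." Each participant's Index Test result is cross-tabulated against the Reference Standard result, and overall accuracy is the share of participants on whom the two tests agree.

```python
from collections import Counter


def two_by_two(index_results, reference_results):
    """Cross-tabulate paired results (True = target condition detected)."""
    if len(index_results) != len(reference_results):
        raise ValueError("each participant needs a result from both tests")
    counts = Counter(zip(index_results, reference_results))
    return {
        "true_positive": counts[(True, True)],     # both tests detect it
        "false_positive": counts[(True, False)],   # only the Index Test does
        "false_negative": counts[(False, True)],   # only the Reference Standard does
        "true_negative": counts[(False, False)],   # neither test detects it
    }


def overall_accuracy(table):
    """Proportion of participants on whom the two tests agree."""
    total = sum(table.values())
    if total == 0:
        raise ValueError("no results to compare")
    return (table["true_positive"] + table["true_negative"]) / total


# Hypothetical results for eight participants (made up for illustration).
index_results = [True, True, False, False, True, False, True, False]
reference_results = [True, False, False, False, True, True, True, False]

table = two_by_two(index_results, reference_results)
print(table)
print(f"Overall accuracy: {overall_accuracy(table):.0%}")
```

Because every disagreement is counted as an error of the Index Test, any mistakes made by the Reference Standard are silently charged to the new test, which is why the choice of Reference Standard matters so much.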
In these two examples, the target conditions are hypertension and ovarian cancer. However, currently, out in the field, MRT is used to detect innumerable conditions — including: stress, organ dysfunction, aberrant nerve function, meridian imbalance, the need for a specific nutritional supplement, personal beliefs, the presence of past trauma, and yes, truth / lies. Choosing which condition to specifically target was challenging — but exceptionally important.
I had to consider carefully what conditions MRT could likely detect with high accuracy, and then choose one condition that was also detectable by another, widely accepted test (the Reference Standard) so that their results could be compared. In the existing muscle testing literature, there are a number of studies in which MRT was used to detect an allergen or a toxin, all with discouraging results. I understood the difficulty in designing such studies, possibly due to the differing definitions of “allergen” or “toxin.” For instance, what one body considers to be “toxic,” the next body may not. I knew this was going to be a problem. As another example, some chiropractors use muscle testing to detect spinal subluxations, and from there they know where to make their adjustments. However, there is no one agreed-upon way to detect a subluxation, and therefore no Reference Standard exists, which was another problem. I needed to find a target condition that had a solid Reference Standard and was also in common use.
In my practice as a mind/body specialist, I regularly use muscle testing to detect truth / lies in patients. That is, I ask patients to speak a statement to see if it is true for them at that moment in time. The paradigm that I use is that if the statement is true (for them at that moment), the muscle will remain strong; conversely, a lie will result in a weak muscle response. This paradigm is practiced routinely within many different muscle testing techniques. Therefore, I considered truth / lies a strong candidate for my target condition.
Truth / lies also had other advantages as a target condition. It was simple, and we could construct each statement such that it was either true or not true. With a concrete outcome like this, we had the ideal Reference Standard, even a Gold Standard. Also, we could randomize the presentation of truths and lies, an important piece of clinical research.
All in all, truth / lies seemed the ideal choice of target condition to use for this initial series of studies. Now, one could argue, “What is truth?” — which could quite possibly be a philosophical discussion for another time. Yet, it was something that I had to consider very carefully during the design phase of this research.
I found myself pondering a number of important questions, such as: What is Truth? And what is a Lie? Are all lies conscious? Can one lie unconsciously? Is truth absolute or relative? Transient or stable? Universal or personal? Answering these questions fully is beyond the scope of this article; however, from my experiences, I will posit the following speculations, especially in the context of this research: Truth is personal. Truth is dynamic, relative and transient. Truth is conscious and truth is unconscious — and yes, these two truths may differ — which may make no sense at all! On the other hand, lying, paradoxically, requires the intent to deceive. Which may also be unconscious. All told, truth is complex. In the end, I used a colloquial definition of “truth”: that which is generally accepted as fact or reality.
(This is as opposed to abstract concepts of “truth,” such as “the Universal Truth” or “the Higher Truth”). In contrast, “lying” was defined as the opposite of “truth,” or more specifically: a false statement made with deliberate intent to deceive; an intentional untruth; a falsehood. So, keeping the definitions simple like this allowed us to know, indisputably, if a statement was true or not. This then was clearly an ideal Reference Standard, a true Gold Standard. Knowing this, we could calculate an accuracy statistic, and for these studies, we used the actual percent correct:
Accuracy (% Correct) = (# Correct MRTs / # Total Statements) × 100
The percent correct was calculated for each participating pair, and then a mean (average) percent correct was calculated and reported as an estimation of the accuracy of MRT.
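As a worked illustration of this calculation, here is a minimal Python sketch. It assumes each statement is coded `True` (a true statement, or a "strong" muscle response) or `False` (a lie, or a "weak" response). The function names and the three pairs of results are hypothetical and are not data from the studies.

```python
from statistics import mean


def percent_correct(mrt_calls, statement_truths):
    """Percent of statements on which the MRT call matched the known truth."""
    if not statement_truths or len(mrt_calls) != len(statement_truths):
        raise ValueError("need one MRT call for every statement")
    correct = sum(call == truth for call, truth in zip(mrt_calls, statement_truths))
    return correct / len(statement_truths) * 100


def mean_percent_correct(pairs):
    """Average the per-pair percent correct across all participating pairs."""
    return mean(percent_correct(calls, truths) for calls, truths in pairs)


# Hypothetical results for three practitioner-patient pairs, four statements each.
# Each entry is (MRT calls, known truth of each statement).
pairs = [
    ([True, False, True, True], [True, False, False, True]),     # 3 of 4 correct
    ([True, True, False, False], [True, False, True, False]),    # 2 of 4 correct
    ([False, True, True, False], [False, True, True, False]),    # 4 of 4 correct
]

per_pair = [percent_correct(calls, truths) for calls, truths in pairs]
print(per_pair)                     # [75.0, 50.0, 100.0]
print(mean_percent_correct(pairs))  # 75.0
```

Averaging the per-pair percentages gives every pair equal weight, regardless of how many statements each pair tested.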
Ultimately, to complete the requirements for my DPhil (PhD) degree at Oxford University, I conducted a series of six Diagnostic Test Accuracy Studies on MRT for detecting lies. My full dissertation can now be downloaded from the Oxford website using this link. In addition, there are three papers from this series of studies that have now been published. They can be found here: PAPER1, PAPER2, PAPER3. More are currently under review and in press. If you would like to be kept up-to-date with these and future publications of mine, you can sign up HERE.
This series of studies showed very favourable results in support of the validity of MRT to detect truth / lies. However, this is just a start, and more robust research is keenly needed. For example, we need to determine if MRT can accurately detect other conditions, aside from truth / lies. To accomplish this, a future researcher may wish to choose another target condition with a solid and acceptable Reference Standard, such as attempting to use MRT to detect stress and comparing it with another real-time stress detector such as a polygraph or heart-rate variability. In addition, more research is needed to assess what patient or practitioner characteristics produce better accuracies, which was not established in my series of studies. Finally, and perhaps most importantly, research is needed to assess the clinical utility of techniques that use MRT. For instance, one such research question might be: Are there better patient outcomes when a nutritionist uses MRT to prescribe supplements compared to when a nutritionist does not use MRT?
There is much work yet to be done to establish the true usefulness of MRT, and everyone must do his or her own part. If you are interested in supporting future MRT research, please consider donating by clicking HERE. Every little bit helps! Thank you for your support and your interest in this series of MRT research.
Anne is a researcher and the creator of the HeartSpeak program.