Who you gonna call when you’ve got an upset stomach? Well, for more than one-third of Americans, the answer is usually the internet.

We’re increasingly turning to the web to self-diagnose the shifting aches and pains that regularly trouble us, a phenomenon that’s been jokingly referred to as visiting "Dr. Google." But it isn’t just search engines that we rely on to decide whether we should call in sick, take an extra dose of Advil and tough it out, or visit the emergency room. We’re also using any number of so-called symptom checkers: dedicated apps and websites that generate a diagnosis and recommend a course of action depending on what symptoms we type into them.

Most, such as WebMD’s version, are careful to note that they aren’t a "substitute for professional medical advice, diagnosis or treatment." And for very good reason, it turns out, according to a new study published in The BMJ.

Testing 23 different symptom checkers, the study found that they were, on average, able to name the correct diagnosis only about one-third of the time. Even when offering as many as 20 possible diagnoses, they listed the right answer among them only about 60 percent of the time. Worse still, the checkers often suggested seeking medical care when simple bed rest and observation would have been enough. The one saving grace is that the checkers were generally decent at recommending immediate emergency care when confronted with urgent, potentially life-threatening symptoms, doing so about 80 percent of the time.

The authors tested the symptom checkers, which differ from one another in their programming and in the medical information they draw on, using a tried-and-true method from medical school. They gave each checker a list of symptoms derived from 45 clinical vignettes, which are case reports or scenarios intended to approximate a real-life person’s health concerns. The vignettes were split into thirds: one third represented someone in need of urgent medical attention, such as with a stroke; one third represented someone who should see a doctor but isn’t in immediate danger, such as with an ear infection; and one third represented someone who could manage with self-care and be just fine, such as with a cold. Then it was a simple matter of grading the checkers on their accuracy.

"Across all symptom checkers the correct diagnosis was listed in the first three diagnoses in 51 percent of standardized patient evaluations and in the first 20 diagnoses in 58 percent of standardized patient evaluations," the authors concluded. "Diagnostic accuracy for listing the correct diagnosis in the top three and top 20 was higher for self care conditions than for emergent conditions and was also higher for common conditions than for uncommon conditions."
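For readers curious how a "top three" or "top 20" score is tallied, here is a minimal sketch in Python. It is not the authors' code, and the diagnoses below are invented for illustration rather than taken from the study; the function name `top_k_accuracy` is likewise hypothetical. Each evaluation pairs a checker's ranked list of diagnoses with the vignette's correct answer, and the score is the share of evaluations in which the correct answer appears among the first k entries.

```python
def top_k_accuracy(ranked_lists, correct, k):
    """Share of evaluations whose correct diagnosis is in the first k entries."""
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, correct) if truth in ranked[:k]
    )
    return hits / len(correct)


# Hypothetical evaluations (made up for illustration, not study data):
# each pair is (checker's ranked diagnoses, correct diagnosis).
evaluations = [
    (["common cold", "influenza", "sinusitis"], "common cold"),
    (["migraine", "tension headache", "stroke"], "stroke"),
    (["otitis media", "earwax blockage"], "otitis media"),
    (["gastroenteritis", "food poisoning"], "appendicitis"),
]
ranked = [r for r, _ in evaluations]
truth = [t for _, t in evaluations]

top1 = top_k_accuracy(ranked, truth, 1)  # correct answer listed first
top3 = top_k_accuracy(ranked, truth, 3)  # correct answer in the first three
print(f"top-1: {top1:.0%}, top-3: {top3:.0%}")
```

In this toy data the correct answer is listed first in two of four evaluations and within the first three in three of four, which shows why a top-three figure can never be lower than a listed-first figure for the same checker.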

Fifteen of the 23 checkers also advised users on whether they should seek medical care. While these were better at identifying a serious medical issue, they were less capable of correctly recognizing a self-limiting condition, likely because the checkers were programmed to err on the side of caution.

"The risk averse nature of symptom checkers’ triage advice is a concern. In two thirds of standardized patient evaluations where medical attention was not necessary, we found symptom checkers encouraged care," they wrote. "Some patients researching health conditions online are motivated by fear, and the listing of concerning diagnoses by symptom checkers could contribute to hypochondriasis and 'cyberchondria,' which describes the escalated anxiety associated with self diagnosis on the internet."

Despite the checkers’ lackluster performance, the researchers note that their success rates match up fairly well with those of telephone medical advice lines, which are sometimes, but not always, staffed by nurses. And they still proved to be better than "Dr. Google."

"If symptom checkers are seen as an alternative for simply entering symptoms into an online search engine such as Google, then symptom checkers are likely a superior alternative," they wrote. "A recent study found that when typing acute symptoms that would require urgent medical attention into search engines to identify symptom-related web sites, advice to seek emergent care was present only 64 percent of the time."

Overall, the findings point to a need for improvement among these apps, though that improvement might not be too hard to achieve with the technology already available.

"For instance, addition of real time information about the local incidence of illness in the community greatly improved the performance of a diagnostic tool for group A streptococcal pharyngitis," they suggested. "Diagnosis and triage rates could also be improved if symptom checkers incorporated individual clinical data from medical claims or the electronic medical record."

Source: Semigran H, Linder J, Gidengil C, et al. Evaluation of symptom checkers for self diagnosis and triage: audit study. The BMJ. 2015.