Measuring What We See

Measuring What We See: A Proposal for How Ayurveda Can Begin to Validate Its Own Clinical Language

Dr. Aakash Kembhavi, MD, PGDMLS, MS (Counseling & Psychotherapy)

[This article was developed with AI collaboration as an intellectual tool. All academic positions, arguments, and professional responsibility for this content rest solely with the author.]

Disclaimer: This article presents a proposal — a starting point for discussion, not a final methodology. The worked example of a pilot study on Shoola that follows is illustrative. It is intended to demonstrate that validated measurement of Ayurvedic Lakshanas is possible and to invite critical engagement from researchers, clinicians, and students. The author does not claim expertise in all dimensions of psychometric methodology or clinical research design, and does not present this as the definitive framework. It is a beginning. It should be questioned, improved, and corrected by those with the relevant expertise to do so.

Where We Have Arrived

Two previous articles on this blog have moved through two distinct stages of diagnosis.

The first — Why Ayurveda Needs Research and Why Many of Us Resist It — named the biases and cultural habits that prevent honest self-inquiry: confirmation bias, authority bias, the naturalistic fallacy, the Semmelweis Reflex. The problem was psychological and institutional.

The second — Ayurveda’s Missing Data Crisis — went deeper. It argued that even when Ayurvedic researchers attempt clinical research, they are working without the foundational baseline data that makes research interpretable. We do not have population-level Prakriti distribution data. We have no validated normative ranges for any component of Ashtavidha Pariksha. Every grading scale used in PG dissertations was invented by the student who used it, validated by no one, and used once. We are not conducting science — we are performing science, and the costume is starting to show.

Both stages were necessary. But diagnosis, however accurate, does not by itself constitute progress. At some point, we must ask: what does the first step actually look like?

This article is an attempt to answer that question — concretely, specifically, and in a way that any practitioner, PG scholar, or final-year BAMS student can understand, evaluate, and potentially replicate.

The answer proposed here begins with the TRISUTRA — the three-field framework that the Acharyas themselves identified as the structure of clinical knowledge. And it demonstrates, using Shoola as a worked example, how Ayurveda can begin to validate its own clinical language without abandoning its principles, without blindly copying biomedical methodology, and without pretending that the work is already done.

Part I: The TRISUTRA as Ayurveda’s Own Research Framework

Before discussing measurement methodology, it is important to establish something: the framework proposed here is not imported from modern medicine. It is already present in the Ayurvedic tradition, articulated with remarkable clarity in the very first chapter of the Charaka Samhita.

The TRISUTRA Ayurveda — the three-field structure of Ayurvedic knowledge — is:

Hetu — Causation. The study of Nidana: what produces disease, what disturbs the equilibrium of the living system, what etiological factors are operative in a given patient.

Linga — Clinical characterisation. The study of Lakshana: what signs and symptoms appear, how they are distributed, what patterns they form, how they vary across patients and across time.

Aushadha — Therapeutic response. The study of what interventions modify the disease state, in which direction, to what degree, and under what conditions.

This three-field structure is, in essence, a research framework. It describes, in the Acharyas’ own language, the three questions that clinical medicine must answer: What causes disease? How does it manifest? What relieves it?

Modern clinical research asks exactly these three questions — under the headings of epidemiology (Hetu), clinical characterisation and diagnosis (Linga), and therapeutic trials (Aushadha). The methodology is different. The questions are identical.

What this means is that when we propose systematic, rigorous investigation of Ayurvedic clinical phenomena, we are not importing a foreign framework. We are operationalising the framework the Acharyas themselves proposed — using the more powerful investigative tools now available to us. The TRISUTRA did not specify that Hetu must be investigated by clinical interview alone, or that Linga must be assessed without validated scales, or that Aushadha must be evaluated by individual clinical impression without comparison groups. These were the tools available. We have better tools. Using them is not betrayal. It is continuation.

Part II: The Measurement Problem — Stated Honestly

Before measuring anything, we must be clear about what we are measuring and what we are claiming about it.

A distinction that Ayurvedic researchers must make explicitly is the difference between two types of claims:

Claim Type 1: Phenomenological measurement. We are measuring the clinical phenomenon as it appears — its characteristics, intensity, distribution, pattern, aggravating and relieving factors, temporal behaviour — without making claims about the mechanism that produces it. This is what the patient experiences. This is what the clinician observes. This is real, reproducible, and measurable without resolving any ontological questions about Doshas or Srotas.

Claim Type 2: Mechanistic measurement. We are measuring something that we believe reflects an underlying Ayurvedic physiological mechanism — Vata aggravation, Sroto-avarodha, Ama involvement — and using the measurement as evidence for or against that mechanism. This is a much stronger claim. It requires not only reliable measurement of the phenomenon but also a validated relationship between the phenomenon and the proposed mechanism.

The critical error in most Ayurvedic clinical research is conflating these two claims — treating measurements of Claim Type 1 as evidence for Claim Type 2 without establishing the link between them.

The pilot study proposed in this article operates entirely at the level of Claim Type 1. The goal is to measure Shoola as a clinical phenomenon — its characteristics, its dimensions, its variability across patients. We are not, at this stage, claiming to measure Vata. We are not claiming that the measurement validates the Tridosha explanation of pain. We are doing something prior to and more fundamental than that: establishing that the clinical phenomenon can be reliably characterised and reproducibly measured.

That is the first step. It is not a small one.

Part III: Why Shoola?

Pain — Shoola in its broadest classical usage — is perhaps the single most clinically important Lakshana in Ayurvedic practice. It appears as a cardinal feature across virtually every major disease category. It is described with remarkable specificity in the Samhitas: Vata-type pain is characterised as moving, piercing, associated with crackling and popping, aggravated by cold and relieved by warmth, worse at night. Pitta-type pain is burning, penetrating, aggravated by heat. Kapha-type pain is dull, heavy, associated with stiffness, worse in the morning and after rest.

These descriptions are not vague. They are operationally rich — they specify quality, location behaviour, aggravating factors, relieving factors, temporal patterns, and associated sensations. This richness is precisely what makes Shoola a good candidate for a pilot validation study: the Samhitas have already done the descriptive work. What has not been done is converting those descriptions into a standardised, reliable measurement instrument.

Pain also has a significant methodological advantage as a starting point: modern pain science has already solved many of the problems that Ayurveda now faces. The McGill Pain Questionnaire demonstrated that even the most subjective dimensions of pain experience — its quality, its emotional character, its intensity — can be reliably measured using structured verbal descriptor scales. The methodological infrastructure for measuring subjective clinical experience exists. We do not need to invent it. We need to adapt it to Ayurvedic clinical language.

Furthermore, choosing Shoola allows the proposal to remain at the phenomenological level. We are not yet asking whether Vata causes pain. We are asking: can the characteristics of pain as described in Ayurvedic clinical language be reliably identified, graded, and documented across different examiners and different clinical settings? That is a prior and more fundamental question. And it is one we can answer.

Part IV: A Pilot Validation Study — The Worked Example

What follows is a proposed pilot study design. It is illustrative, not definitive. Its purpose is to demonstrate that this kind of work is possible and to provide a concrete template that researchers can evaluate, critique, and improve.

Study Title

Characterisation and Inter-Rater Reliability of Shoola Assessment Using a Structured Ayurvedic Pain Descriptor Scale: A Pilot Multicentric Study

Background and Rationale

Shoola is a cardinal Lakshana in Ayurvedic clinical practice, described across multiple dimensions in the classical texts including Charaka Samhita and Ashtanga Hridayam. Despite its clinical centrality, no validated, standardised instrument exists for the structured assessment of Shoola characteristics in contemporary Ayurvedic clinical practice. As a result, Shoola is typically assessed by individual clinical impression, making its documentation unreliable across examiners and its use as an outcome measure in clinical research methodologically unsound. This pilot study proposes to develop, pilot, and assess the inter-rater reliability of a structured Ayurvedic Shoola Descriptor Scale (ASDS).

Study Objectives

Primary objective: To assess the inter-rater reliability of the Ayurvedic Shoola Descriptor Scale (ASDS) among trained Ayurvedic clinicians evaluating the same patients.

Secondary objectives:

1. To characterise the distribution of Shoola types (Vata-predominant, Pitta-predominant, Kapha-predominant, mixed) in a sample of patients presenting with musculoskeletal pain to Ayurvedic OPDs.
2. To document the prevalence of each classical Shoola descriptor in the study sample.
3. To identify which descriptors show the highest and lowest inter-rater agreement, informing further instrument refinement.

The Instrument: Ayurvedic Shoola Descriptor Scale (ASDS)

The ASDS is built directly from classical Samhita descriptions of Shoola and its Dosha-associated variants. Each descriptor is operationally defined so that different examiners are assessing the same clinical feature, not their individual interpretation of it.

The scale has three components.

Component A: Pain Quality Descriptors (Patient-Reported)

This component is completed by the patient, guided by a trained research assistant. Each item uses a simple Yes/No response with a brief operational definition.

| Item | Descriptor | Definition Provided to Patient |
| --- | --- | --- |
| 1 | Chala (Moving) | Does your pain shift from one location to another, or move within the same area? |
| 2 | Toda (Piercing) | Does your pain feel like needles or pins being pressed into the affected area? |
| 3 | Sphutana (Bursting) | Does your pain feel like something is about to burst or split open from inside? |
| 4 | Daha (Burning) | Does your pain have a burning or hot quality? |
| 5 | Paka (Suppurating heat) | Does the painful area feel hot to your own touch? |
| 6 | Guru (Heavy) | Does the painful area feel heavy, as if something is pressing down on it? |
| 7 | Stambha (Stiff) | Is the painful area stiff, particularly in the morning or after rest? |
| 8 | Shita (Cold aggravation) | Does cold make the pain worse? |
| 9 | Ushna (Heat aggravation) | Does heat make the pain worse? |
| 10 | Nisha-vriddhi (Nocturnal aggravation) | Is your pain worse at night? |

Component B: Pain Intensity and Impact (Patient-Reported)

A five-point ordinal scale for overall pain intensity, with clearly defined anchor descriptions:

1 — Minimal: Pain is present but does not interfere with any daily activity.
2 — Mild: Pain is noticeable and occasionally distracts from daily activity but does not prevent it.
3 — Moderate: Pain regularly interferes with daily activity and requires conscious effort to manage.
4 — Severe: Pain substantially prevents normal daily activity.
5 — Incapacitating: Pain prevents all normal activity and requires the patient to remain still or seek immediate relief.
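
Components A and B translate naturally into a small, checkable record. The following is a minimal sketch in Python, assuming one record per patient; the class name, the lower-case descriptor keys, and the patient identifier format are illustrative assumptions rather than part of the proposed instrument.

```python
from dataclasses import dataclass

# Descriptor keys follow the Component A items in the order listed above.
DESCRIPTORS = (
    "chala", "toda", "sphutana", "daha", "paka",
    "guru", "stambha", "shita", "ushna", "nisha_vriddhi",
)

# Component B anchors, keyed by the five-point rating.
INTENSITY_LABELS = {
    1: "Minimal", 2: "Mild", 3: "Moderate", 4: "Severe", 5: "Incapacitating",
}


@dataclass(frozen=True)
class PatientReport:
    """One patient's Component A (Yes/No) and Component B (1 to 5) responses."""
    patient_id: str   # hypothetical identifier format
    quality: dict     # descriptor key -> True (Yes) or False (No)
    intensity: int    # Component B rating

    def __post_init__(self):
        missing = set(DESCRIPTORS) - set(self.quality)
        extra = set(self.quality) - set(DESCRIPTORS)
        if missing or extra:
            raise ValueError(
                f"descriptor mismatch: missing={sorted(missing)}, extra={sorted(extra)}"
            )
        if not all(isinstance(v, bool) for v in self.quality.values()):
            raise ValueError("Component A answers must be True or False")
        if self.intensity not in INTENSITY_LABELS:
            raise ValueError("Component B intensity must be between 1 and 5")

    def positive_descriptors(self):
        """Descriptors the patient answered Yes to, in scale order."""
        return [d for d in DESCRIPTORS if self.quality[d]]


# Example with made-up answers, for illustration only.
answers = {d: False for d in DESCRIPTORS}
answers["chala"] = True
answers["shita"] = True
report = PatientReport("P001", answers, 3)
print(report.positive_descriptors(), INTENSITY_LABELS[report.intensity])
```

Rejecting incomplete or out-of-range forms at entry is a design choice: it keeps missing answers from being silently counted as "No" in the later frequency tables.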

Component C: Clinician Assessment (Examiner-Completed)

This component is completed independently by two trained Ayurvedic clinicians examining the same patient, without knowledge of each other’s assessments. It records:

Dominant Dosha involvement in the pain presentation (Vata / Pitta / Kapha / Mixed — with operational definitions for each based on the descriptor profile from Component A)
Presence of Sama features associated with the pain (Yes/No — based on defined criteria: tongue coating, heaviness, reduced Agni, malodour)
Srotas primarily involved (clinician judgment with documented reasoning)
Overall Shoola severity on the same five-point scale as Component B
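
The operational definitions linking a Component A profile to a dominant Dosha are not specified in this proposal. Purely as an illustration of what such a rule might look like, the sketch below tallies endorsed descriptors. The mapping uses only the Dosha associations summarised in Part III; Sphutana and Paka are left unmapped because that summary does not assign them, and the tie-breaking margin is a hypothetical parameter that a real study would have to define and justify.

```python
from collections import Counter

# Associations taken from the classical summary in Part III of this article.
# Sphutana and Paka are deliberately unmapped (not assigned in that summary).
DOSHA_OF = {
    "chala": "Vata", "toda": "Vata", "shita": "Vata", "nisha_vriddhi": "Vata",
    "daha": "Pitta", "ushna": "Pitta",
    "guru": "Kapha", "stambha": "Kapha",
}


def dominant_dosha(positive_descriptors, margin=1):
    """Hypothetical tally rule, not the author's definition.

    The Dosha with the most endorsed descriptors is returned if it leads
    the runner-up by at least `margin`; otherwise the result is "Mixed".
    Returns None when no mapped descriptor was endorsed.
    """
    counts = Counter(DOSHA_OF[d] for d in positive_descriptors if d in DOSHA_OF)
    if not counts:
        return None
    ranked = counts.most_common()
    if len(ranked) == 1:
        return ranked[0][0]
    (top, n_top), (_, n_second) = ranked[0], ranked[1]
    return top if n_top - n_second >= margin else "Mixed"


print(dominant_dosha(["chala", "toda", "daha"]))  # two Vata, one Pitta
print(dominant_dosha(["chala", "daha"]))          # tie between Vata and Pitta
```

A rule of this kind would serve only as a transparent cross-check on the clinician's judgment; it does not replace it, and it makes no claim about underlying mechanism.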

Study Design

Design: Cross-sectional inter-rater reliability study with descriptive characterisation component.

Setting: A minimum of three Ayurvedic teaching hospital OPDs in different geographical zones (one each from North, South, and either East or West India), to capture regional variation in presentation.

Sample: 150 patients presenting with musculoskeletal pain as the chief complaint, 50 per centre. This sample size is not based on a formal power calculation for a hypothesis test, because this is a descriptive and reliability study, not a hypothesis-testing trial. Its justification is precision rather than power: what matters is how wide the confidence interval around each reliability coefficient will be. A pooled sample of 150 paired assessments should give reasonably stable reliability coefficients and characterise the distribution of descriptor patterns with reasonable precision. Estimates from a single centre of 50 will be noticeably less precise and should be read as exploratory.
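
Whether 150 paired assessments give stable reliability coefficients can be checked with a rough calculation. The sketch below uses a simple large-sample approximation for the standard error of Cohen's Kappa. The observed agreement (0.80) and chance agreement (0.50) are illustrative assumptions, not data from any study, and the function name is hypothetical.

```python
import math


def kappa_ci_half_width(p_observed, p_chance, n, z=1.96):
    """Approximate 95% CI half-width for Cohen's Kappa.

    Uses the simple large-sample standard error
    sqrt(po * (1 - po) / (n * (1 - pe) ** 2)).
    """
    se = math.sqrt(p_observed * (1 - p_observed) / (n * (1 - p_chance) ** 2))
    return z * se


# Assumed agreement values; Kappa here is (0.80 - 0.50) / (1 - 0.50) = 0.60.
for n in (50, 150):
    print(n, round(kappa_ci_half_width(0.80, 0.50, n), 3))
# 50 0.222
# 150 0.128
```

Under these assumed values, a Kappa of 0.60 would carry an interval of roughly plus or minus 0.13 with 150 paired assessments, against roughly plus or minus 0.22 with 50.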

Inclusion criteria: Adults aged 18–65 presenting with musculoskeletal pain of any cause, duration more than two weeks, able to communicate in the regional language, willing to participate.

Exclusion criteria: Acute trauma, post-surgical pain, pain requiring immediate medical intervention, cognitive impairment preventing reliable self-report.

Examiner training: All participating clinicians undergo a two-hour standardised training session covering the operational definitions of each ASDS item, the rating procedure, and the documentation format. Training is documented and examiner understanding is assessed before data collection begins.

Data Collection Procedure

1. Each consenting patient is seen independently by two trained Ayurvedic clinicians within the same OPD visit, in sequence, with a 15-minute interval between assessments.
2. Clinician 1 completes Component C without knowledge of Clinician 2’s assessment, and vice versa.
3. Components A and B are completed by the patient (with research assistant guidance) once, between the two clinician assessments.
4. All data are recorded on a standardised paper form and entered into a central electronic database.

Statistical Analysis

The statistical analysis required for this study is deliberately simple — because the study question is simple, and complex statistics would not answer it better.

For inter-rater reliability:

Cohen’s Kappa coefficient for each item in Component C (categorical agreement between Clinician 1 and Clinician 2).
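Intraclass Correlation Coefficient (ICC) for the overall severity rating, treating the five-point scale as approximately interval; a weighted Kappa is a common alternative for ordinal ratings.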
Percentage agreement for each item as a plain-language companion to Kappa.
Interpretation standard: Kappa above 0.6 is considered acceptable reliability; above 0.8 is excellent.
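
The reliability statistics listed above need nothing beyond the standard library. The following is a minimal sketch, with made-up ratings for illustration only. The proposal does not specify which form of ICC is intended; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). All function names are hypothetical.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Proportion of patients on whom the two raters gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Unweighted Cohen's Kappa for two raters and nominal categories."""
    n = len(r1)
    po = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from each rater's marginal category proportions.
    pe = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    if pe == 1.0:
        return float("nan")  # undefined: both raters used one single category
    return (po - pe) / (1 - pe)


def icc_2_1(r1, r2):
    """ICC(2,1): two-way random effects, absolute agreement, single rater."""
    n, k = len(r1), 2
    rows = list(zip(r1, r2))
    grand = (sum(r1) + sum(r2)) / (n * k)
    ss_rows = k * sum((sum(row) / k - grand) ** 2 for row in rows)
    ss_cols = n * sum((sum(col) / n - grand) ** 2 for col in (r1, r2))
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Made-up Dosha classifications from two clinicians for eight patients.
c1 = ["Vata", "Vata", "Pitta", "Kapha", "Mixed", "Vata", "Kapha", "Pitta"]
c2 = ["Vata", "Vata", "Pitta", "Kapha", "Vata", "Vata", "Mixed", "Pitta"]
print(percent_agreement(c1, c2))        # 0.75
print(round(cohens_kappa(c1, c2), 3))   # 0.644

# Made-up severity ratings where Clinician 2 is always one point higher.
print(round(icc_2_1([1, 2, 3, 4], [2, 3, 4, 5]), 3))  # 0.769
```

The last example shows why the absolute-agreement form was assumed here: two clinicians who rank patients identically but differ by a constant point are penalised, which is the behaviour one wants when the severity score itself is to be used as an outcome measure.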

For descriptive characterisation:

Frequency and percentage of each descriptor in the full sample and by centre.
Cross-tabulation of descriptor patterns against dominant Dosha classification.
Simple descriptive statistics (mean, median, range) for pain intensity scores.
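
The descriptive outputs listed above are equally simple to produce. A minimal sketch follows, using made-up responses from four patients and only two descriptors for brevity; the function names and data layout are assumptions for illustration.

```python
from collections import Counter
from statistics import mean, median


def descriptor_frequencies(reports):
    """reports: list of dicts mapping descriptor -> bool (Yes/No).

    Returns descriptor -> (count of Yes answers, percentage of sample).
    """
    n = len(reports)
    counts = Counter(d for r in reports for d, yes in r.items() if yes)
    return {d: (counts[d], 100 * counts[d] / n) for d in reports[0]}


def crosstab(rows, cols):
    """Count how often each pair of categories occurs together."""
    return Counter(zip(rows, cols))


def summarise(scores):
    """Mean, median and range of the five-point intensity scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "range": (min(scores), max(scores)),
    }


# Made-up data for illustration only.
reports = [
    {"daha": True, "guru": False},
    {"daha": True, "guru": True},
    {"daha": False, "guru": False},
    {"daha": True, "guru": False},
]
doshas = ["Pitta", "Pitta", "Vata", "Mixed"]

print(descriptor_frequencies(reports))
print(crosstab([r["daha"] for r in reports], doshas))
print(summarise([2, 3, 3, 5]))
```

Running the same functions on each centre's subset gives the by-centre breakdown without any additional machinery.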

No regression modelling. No multivariate analysis. No complex inferential statistics. Generating meaningful foundational data does not require methodological complexity. It requires methodological rigour in design and simplicity in analysis.

Expected Outputs

A reliability-tested Ayurvedic Shoola Descriptor Scale: a first step toward a validated instrument of this kind, since reliability alone does not establish validity.
A characterisation of the distribution of Shoola types and intensities in contemporary Ayurvedic OPD patients across three regions.
Identification of which ASDS items require further refinement (those with Kappa below 0.6).
A documented research methodology that can be replicated, extended, and used as a template for similar Lakshana validation studies.
A publishable study that contributes to the evidence base at the most fundamental level — demonstrating that Ayurvedic clinical observation can be reliably standardised.
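
Flagging items for refinement follows directly from the stated thresholds. The sketch below applies them to made-up Kappa values; the item keys are hypothetical, and treating a Kappa of exactly 0.6 as needing refinement is an assumption, since the proposal only specifies "above 0.6" and "below 0.6".

```python
def reliability_band(kappa):
    """Band a Kappa value using the thresholds stated in this proposal."""
    if kappa > 0.8:
        return "excellent"
    if kappa > 0.6:
        return "acceptable"
    return "needs refinement"  # includes exactly 0.6 (an assumption)


def items_needing_refinement(kappa_by_item):
    """Names of items whose Kappa falls short of the acceptable band."""
    return sorted(
        item for item, k in kappa_by_item.items()
        if reliability_band(k) == "needs refinement"
    )


# Made-up Kappa values for the Component C items, for illustration only.
example = {
    "dominant_dosha": 0.72,
    "sama_features": 0.55,
    "srotas": 0.41,
    "severity": 0.83,
}
print(items_needing_refinement(example))  # ['sama_features', 'srotas']
```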

Part V: What a Proper Ayurvedic Research Proposal Looks Like

The pilot study above embeds — implicitly — the structural elements of a proper research proposal. Making them explicit is useful, because one of the consistent failures in Ayurvedic PG dissertations is the absence of clear proposal structure.

A research proposal grounded in TRISUTRA Ayurveda and meeting basic scientific standards should contain the following elements, in this sequence:

1. The Research Question — stated in one sentence, specifically enough that a reader unfamiliar with the work could understand exactly what is being investigated. “A study on Shoola” is not a research question. “What is the inter-rater reliability of a structured Ayurvedic Shoola Descriptor Scale among trained clinicians?” is a research question.

2. Background and Gap — what is already known, what is not known, and why the gap matters. This requires an honest literature review that acknowledges existing work rather than dismissing or ignoring it.

3. Objectives — primary and secondary, numbered, specific. The primary objective drives the sample size and the primary analysis. Secondary objectives are exploratory. A pilot study should not have more than three secondary objectives.

4. Study Design — the research architecture. State the design type explicitly and justify why it is appropriate for the research question.

5. Setting and Sample — where, who, how many, and why. The sample size must be justified — not with a power calculation if the study does not require one, but with a clear rationale for why the proposed number is sufficient for the study’s purpose.

6. Measurement — what instruments will be used, how they were developed, and what their reliability properties are. If no validated instrument exists — which is the current situation for most Ayurvedic Lakshanas — this section must acknowledge that, explain how the candidate instrument was developed, and make reliability assessment a study objective rather than an assumption.

7. Procedure — exactly what will happen, in sequence, from recruitment to data collection to data entry. Specific enough that another researcher could replicate it.

8. Analysis Plan — stated before data collection, not chosen after looking at the data. Name the statistical tests, explain why they are appropriate for the data type, and state what threshold of results would be considered meaningful.

9. Limitations — honest acknowledgment of what the study cannot conclude. A pilot study cannot establish efficacy. A reliability study cannot establish validity. These limitations define the study’s scope — they do not invalidate it.

10. Implications — what this study will contribute, and what research it makes possible next. All research is a step in a sequence. A pilot that establishes measurement reliability makes the next study — which uses that validated measure as an outcome — possible and interpretable.

Part VI: The Population-Level Question — What Remains to Be Done

This pilot study is one step. Establishing that a Lakshana can be reliably measured is the prerequisite for everything that follows. But it does not, by itself, address the population-level data crisis that the previous article described.

To develop population-level normative data for Shoola — to establish with confidence what the distribution of Shoola types looks like in the Indian population, how it varies by Prakriti, geography, age, sex, and season — the pilot study must be followed by a larger descriptive epidemiological study. That study would require a representative sample, careful sampling methodology, and the validated ASDS that the pilot produces.

The sequence is:

Pilot reliability study → Validated instrument → Large-scale descriptive study → Population norms → Properly powered clinical trials

Ayurveda is currently attempting the last step without having completed the first four. This is precisely the structural problem the Missing Data Crisis article described, and precisely why the results of current trials cannot be reliably interpreted.

The good news is that Steps 1 and 2 do not require large grants, sophisticated infrastructure, or complex statistical expertise. They require disciplined clinical documentation, a standardised form, two examiners, and basic statistical analysis. They require, in other words, exactly the resources that a motivated PG scholar, supervised by a methodologically literate guide, could bring to bear.

The barrier is not resources. It is the willingness to begin.

Closing Reflection: The Acharyas Would Recognise This as Their Own Work

The TRISUTRA — Hetu, Linga, Aushadha — was not a philosophical aspiration. It was a research programme. The Acharyas were proposing that clinical medicine must be built on systematic understanding of causation, careful characterisation of clinical features, and rigorous evaluation of therapeutic response. They did not have Kappa coefficients or intraclass correlation coefficients. They had careful observation, structured clinical interview, and the intellectual honesty to document what they found rather than what they expected.

We have their observations. We have their clinical descriptions — rich, detailed, and operationally specific enough to form the basis of a validation instrument right now, today, without waiting for institutional reform or government funding. And we have methodological tools they could not have imagined.

The pilot study proposed in this article is not a betrayal of that tradition. It is its continuation. It takes the Linga, the clinical characterisation that the Acharyas built, and asks: can different clinicians, across different settings, reliably identify the same features when they examine the same patients?

If the answer is yes, we have taken the first step toward building the evidence base that Ayurveda has been promising for decades and has not yet delivered.

If the answer is no — if inter-rater reliability is poor — we have learned something equally important: that the classical descriptions, as currently taught and applied, do not produce consistent clinical identification of the same phenomena across different examiners. That is not a failure. That is a finding. And findings, whether positive or negative, are what science is made of.

The Acharyas observed the world as it was. We owe them the same discipline.

Dr Kembhavi’s Ayurveda Unfiltered Blog is a forum for critical engagement with Ayurvedic education, practice, and research. Articles are published on this blog every week.

Share your thoughts in the comments below.
