Home | Signal | Curator | Morey | SwarmCare | Hedera | Discord
Pharmaceutical AI Training Data -- Sealed & Verified

SwarmPharma

50,000 pharmacology-focused training pairs. 16 task types. 5-step trajectory methodology.
From drug interactions to pediatric dosing -- clinical-grade intelligence for pharmaceutical AI.

16 Task Types → SwarmPharma-35B →
~50,000 Pharma Pairs
16 Task Types
5-Step Trajectory
sealed SwarmPharma-35B

5-Step Trajectory in Every Output

Every trajectory-enhanced pair follows the same clinical reasoning chain. No shortcuts. No hallucinated conclusions. Each step is verified.

Step 1
IDENTIFY
Drug, patient, context parameters
Step 2
MECHANISM
Receptor, pathway, molecular target
Step 3
ASSESS
Risk, severity, clinical significance
Step 4
CALCULATE
Dosing, PK params, adjustments
Step 5
RECOMMEND
Action, monitoring, alternatives

The trajectory methodology ensures your model doesn't just produce answers -- it produces reasoning chains. Every output traces the clinical logic from drug identification through mechanism analysis to a concrete recommendation with monitoring parameters.

Where the Pairs Come From

Two verified sources. No synthetic-only generation. Textbook ground truth combined with trajectory-enhanced clinical pairs.

Katzung's Basic & Clinical Pharmacology
The gold standard pharmacology textbook. Core drug knowledge, mechanisms, therapeutic principles.
22,083 pairs
Trajectory-Enhanced Pairs (R2: sb-medical/trajectory/)
5-step methodology. 27 shards. 16 pharma task types. Labeled trajectory=true v1. Verified and sealed.
28,624 pairs
Total Pharmacology Pairs
~50,707 pairs

16 Pharmaceutical Task Types

Each task type teaches a distinct pharmacological capability. Every pair is trajectory-verified and quality-gated.

Core Pharmacology -- Drug Mechanisms & Interactions
Core

Drug Interaction Analysis

drug_interaction_analysis

Multi-drug interaction assessment, DDI severity grading, contraindication identification, and interaction cascade analysis across polypharmacy regimens.

Why it matters: Drug-drug interactions cause 125,000+ hospitalizations per year. Your model needs to catch what humans miss in complex medication lists.
Core

Mechanism of Action

mechanism_of_action

Receptor binding profiles, signal transduction pathways, molecular target identification, and downstream pharmacodynamic effects at the cellular level.

Why it matters: Understanding MOA is the foundation of all pharmacology. A model that can explain receptor-level mechanics can reason about novel drug combinations.
Core

Drug Metabolism

drug_metabolism

CYP450 enzyme interactions, phase I/II metabolic pathways, genetic polymorphism effects (CYP2D6, CYP2C19), and metabolite activity profiles.

Why it matters: 75% of drugs are metabolized by CYP450 enzymes. Genetic polymorphisms create 10-100x dosing variability. Your model must understand this machinery.
Core

Drug Class Comparison

drug_class_comparison

Therapeutic class analysis, head-to-head efficacy comparison, side effect profiles, cost-effectiveness evaluation, and guideline-based selection criteria.

Why it matters: Clinicians choose between drugs within the same class constantly. Your model needs to articulate the clinical rationale for one agent over another.
Clinical Practice -- Dosing, Monitoring & PK/PD
Clinical

Pharmacokinetic Modeling

pharmacokinetic_modeling

ADME parameter estimation, dose-response curve analysis, PK/PD modeling, compartmental analysis, and bioavailability calculations across patient populations.

Why it matters: PK modeling drives every dosing decision. Teaching your model ADME fundamentals means it can reason about drug behavior in any patient context.
Clinical

Dosing Optimization

dosing_optimization

Weight-based dosing calculations, renal/hepatic dose adjustments (CrCl, Child-Pugh), therapeutic window management, and loading/maintenance dose protocols.

Why it matters: Wrong dosing is the #1 medication error category. Your model must calculate adjustments for organ impairment, body weight, and drug levels.
Clinical

Therapeutic Monitoring

therapeutic_monitoring

TDM protocol design, drug level interpretation (trough/peak), dose titration strategies, narrow therapeutic index management, and monitoring frequency protocols.

Why it matters: Drugs like vancomycin, lithium, and warfarin have razor-thin therapeutic windows. Models that interpret levels and titrate doses save lives.
Clinical

Formulation Analysis

formulation_analysis

Drug delivery system comparison, bioavailability profiling, extended-release vs IR analysis, route of administration selection, and formulation-specific pharmacokinetics.

Why it matters: The same drug in different formulations can have dramatically different PK profiles. Your model needs to distinguish ER from IR, IV from PO, patch from tablet.
Safety & Pharmacovigilance -- Risk Assessment & Surveillance
Safety

Adverse Event Detection

adverse_event_detection

Side effect profiling, pharmacovigilance signal detection, adverse reaction severity grading, causality assessment (Naranjo scale), and reporting protocol generation.

Why it matters: Post-market adverse events are the leading cause of drug withdrawals. A pharmacovigilance-aware model catches signals before they become crises.
Safety

Drug Safety Assessment

drug_safety_assessment

Black box warning interpretation, REMS program requirements, risk-benefit analysis frameworks, and contraindication assessment for complex patient scenarios.

Why it matters: 350+ drugs carry black box warnings. Your model must understand REMS obligations and articulate risk-benefit ratios with clinical precision.
Safety

Pregnancy Drug Safety

pregnancy_drug_safety

FDA pregnancy categories, teratogenicity risk assessment, lactation safety evaluation, trimester-specific contraindications, and safer alternative recommendations.

Why it matters: 90% of pregnant women take at least one medication. Teratogenicity assessment requires specialized knowledge that general models consistently get wrong.
Safety

Regulatory Review

regulatory_review

FDA approval pathway analysis (NDA, BLA, 505(b)(2)), labeling requirements, post-market surveillance obligations, and regulatory timeline estimation.

Why it matters: The regulatory pathway determines a drug's market trajectory. Your model needs to navigate NDA vs ANDA vs 505(b)(2) with precision.
Special Populations & Patient Care -- Age-Specific & Education
Patient

Pediatric Dosing

pediatric_dosing

Weight-based dose calculations (mg/kg), age-appropriate formulation selection, developmental pharmacokinetics, and neonatal/infant-specific adjustments.

Why it matters: Children are not small adults. Immature hepatic/renal function, different body composition, and developmental PK changes demand specialized dosing logic.
Patient

Geriatric Pharmacology

geriatric_pharmacology

Beers criteria application, polypharmacy management, age-related PK/PD changes, fall risk assessment from medications, and deprescribing protocols.

Why it matters: Adults 65+ take an average of 5+ medications. The Beers criteria alone flag 30+ drug classes to avoid. Your model must navigate this complexity.
Patient

Patient Counseling

patient_counseling

Medication adherence strategies, patient education content generation, lifestyle-drug interaction guidance, and health literacy-appropriate communication.

Why it matters: 50% of medications are not taken as prescribed. Teaching your model to generate clear, actionable patient guidance directly impacts therapeutic outcomes.
Clinical

Clinical Trial Design

clinical_trial_design

Protocol design methodology, primary/secondary endpoint selection, statistical power calculations, inclusion/exclusion criteria, and adaptive trial frameworks.

Why it matters: 90% of clinical trials fail. Better trial design -- endpoints, power, patient selection -- is the highest-leverage intervention in drug development.

SwarmPharma-35B v1

Sealed February 28, 2026. Trained on RTX PRO 6000 Blackwell. Zero quantization loss at Q4_K_M.

Training Configuration

SEALED
Base ModelQwen3.5-35B-A3B
Methodbf16 LoRA r=64 alpha=32
Training Pairs25,629 pairs
Eval Pairs1,348 pairs
Final Train Loss0.337
Training Time13.56 hours
Steps2,402
Step Time20.3s/step
HardwareRTX PRO 6000 Blackwell (96GB)
Sealed Date2026-02-28 08:59 UTC

Deployment Artifacts

VERIFIED
Merged Model67GB, 14 shards bf16
GGUFswarmpharma-35b-v1-q4_k_m.gguf
GGUF Size20GB
QuantizationQ4_K_M
Inference Speed22 tok/s @ -ngl 20
Quantization LossZero
Think Modereasoning_content (Qwen3.5)
Trajectory5-step in every output
SHA2564c522a92e9cd...

Tested at both fp16 (Blackwell) and Q4_K_M (3090 Ti via llama-server). Identical accuracy across both precision levels.

7/7
DDI Accuracy
4/4
Off-Domain Rejection
5/5
JSON Format
0
Quantization Loss
22
tok/s Q4_K_M
29
tok/s fp16

What a Pharma Pair Looks Like

Every trajectory-enhanced pair follows the 5-step reasoning chain. This is what your model learns to produce.

swarmpharma-35b-v1 -- drug_interaction_analysis
// User prompt $ Assess the interaction between warfarin and fluconazole in a 72-year-old patient with atrial fibrillation and recent Candida infection. Current INR: 2.4 // Model output (trajectory-verified) [IDENTIFY] Patient: 72yo, AF on warfarin (INR 2.4), new fluconazole for Candida. Polypharmacy risk: geriatric + narrow therapeutic index drug + CYP inhibitor. [MECHANISM] Fluconazole is a potent CYP2C9 inhibitor (Ki = 7.0 uM). Warfarin S-enantiomer (3-5x more potent) is primarily metabolized by CYP2C9. Inhibition reduces S-warfarin clearance by 40-70%, causing INR elevation. [ASSESS] Severity: MAJOR. Clinical significance: HIGH. Expected INR rise: 1.5-3.0 units over 3-7 days. Bleeding risk: elevated (HAS-BLED score increases). Age >65 compounds risk (reduced hepatic reserve). [CALCULATE] Recommended warfarin dose reduction: 25-50%. If current dose 5mg/day, reduce to 2.5-3.75mg/day. Monitor INR at day 3, 7, 14 after fluconazole start. Target INR: 2.0-3.0 (AF indication). [RECOMMEND] 1) Reduce warfarin dose by 30-50% on day 1 of fluconazole. 2) INR check at 72h, then twice weekly. 3) Consider shorter fluconazole course if possible. 4) Alternative: topical antifungal if non-systemic infection. 5) Hold warfarin if INR >4.0.

What You Receive

Every pharma data order ships with 5 formats, full provenance, and drug interaction lineage in the DATA_CARD.

Delivery Manifest

SwarmPharma -- 50,000 Pharmacology Pairs

SEALED
5 Delivery Formats -- Train + Eval Split Each
1. ChatML -- swarmpharma_train.chatml.jsonl
OpenAI API, TRL, Unsloth, Axolotl
2. Alpaca -- swarmpharma_train.alpaca.jsonl
LLaMA-Factory, HuggingFace trainers
3. ShareGPT -- swarmpharma_train.sharegpt.jsonl
FastChat, Vicuna, multi-turn trainers
4. OpenAI -- swarmpharma_train.openai.jsonl
Direct upload to gpt-4o fine-tuning
5. Completion -- swarmpharma_train.completion.jsonl
Legacy pipelines, custom training loops
Provenance & Verification
DATA_CARD.json
Drug interaction provenance, source textbook references, trajectory verification status, quality gate scores, task type distribution, model lineage.
guarantee.json
Merkle root of every pair. SHA-256 sealed. Tamper-evident provenance chain from source material to final training pair.
README.txt
Quickstart for Python, Unsloth, OpenAI API. Copy, paste, train. Under 5 minutes to first pharma training run.
Per-Pair Metadata
Every pair carries: task_type, trajectory=true, source (Katzung/trajectory), quality gate result, content fingerprint, drug entities.
Quality Disclosure
5-Step
Trajectory verified: IDENTIFY, MECHANISM, ASSESS, CALCULATE, RECOMMEND
16 Types
Complete pharmaceutical task coverage from DDI to pediatric dosing
6 Gates
Deterministic: length, trajectory, content, dedup, degeneration, schema
Katzung
Gold standard textbook source. Not synthetic-only. Real pharmacology ground truth.

This is what ships. Every pharma data order. All 16 task types included.
Your team picks the framework -- the data is ready.

R2 Storage & Build Artifacts

All pharma data is sealed in Cloudflare R2 with SHA-256 verification. Frozen snapshots are immutable.

R2 BUCKET LAYOUT sb-medical
# Trajectory-enhanced pharma pairs sb-medical/trajectory/ 28,624 pairs · 27 shards · labeled trajectory=true v1 # Core medical + pharma base sb-medical/ ~432,196 total pairs (403,572 base + 28,624 trajectory) 85 specialties # Build artifacts (swarmrails) /data2/swarmpharma-35b/frozen-v1/ adapter + tokenizer + config + logs + SHA256 /data2/swarmpharma-35b/models/ swarmpharma-35b-v1-merged/ # 67GB, 14 shards bf16 gguf/swarmpharma-35b-v1-q4_k_m.gguf # 20GB # GGUF SHA256 4c522a92e9cda7c67efab2f6af27a6545c4ea174fe2b6bec24f6b9667f144a4b