Mission-Critical Training Data — Aviation Vertical

45,222 Verified Aviation Pairs.
Zero Tolerance for Error.

Aviation does not run on suggestions. It runs on checklists, regulations, and procedures, where incorrect outputs cost lives.
Every pair in this dataset is CoVe-promoted to platinum tier, across 50+ specialties ranging from FAR compliance to METAR interpretation.
If your model gives aviation advice, this is the data it trains on — or it does not ship.

SAFETY NOTICE: Aviation training data demands a higher standard. Every pair undergoes 2-stage verification (Llama-70B rewrite + Qwen-235B scoring). Accuracy threshold: ≥4/5. No exceptions. No waivers.
45,222 Verified Pairs
50+ Specialties
Platinum CoVe-Promoted Tier
sb-aviation R2 Bucket

Core Flight Specialties

High-volume training domains. These form the structural backbone of aviation AI competence — the areas where model failure is measured in accident reports, not user complaints.

HIGH-VOLUME CORE — 14,241 pairs
8,282 pairs

safety-compliance

FAR compliance verification, Safety Management Systems (SMS) implementation, audit protocol design, OSHA aviation workplace standards, hazard identification and risk matrices. Teaches AI to interpret and apply the regulatory framework that keeps aircraft in the air and people alive.

2,581 pairs

flight-ops

Flight planning with weight & balance calculations, fuel management and reserve computation, MEL dispatch decisions, operational procedures for Part 91/121/135. Trains models on the operational math and decision logic that governs every departure.

2,496 pairs

pilot-training

Ground school curriculum delivery, checkride preparation and evaluation, instrument proficiency standards, Crew Resource Management (CRM) scenarios. Builds AI that can assess readiness, identify knowledge gaps, and deliver training content at FAA standards.

882 pairs

mro

Maintenance, Repair, and Overhaul operations. Airworthiness Directive (AD) compliance tracking, inspection interval management, MEL/CDL administration, return-to-service documentation. Trains AI on the maintenance logic that determines whether an aircraft is legal to fly.

REGULATIONS & AIRSPACE — 729 pairs
375 pairs

regulations

14 CFR interpretation across all parts (61, 91, 121, 135, 141, 145). ICAO Annex compliance, bilateral aviation agreements, EASA cross-reference. Teaches AI to parse regulatory language and deliver operationally correct answers to complex compliance questions.

348 pairs

atc

Air traffic control procedures and phraseology, radar and non-radar separation standards, approach sequencing, ATIS interpretation, pilot-controller communication protocols. Trains models on the precise language and procedures that govern controlled airspace.

6 pairs

air-traffic-control

Advanced ATC scenarios including TRACON operations, sector handoff procedures, and high-density traffic management. Specialist extension of core ATC procedures for terminal radar approach control.

TECHNICAL SYSTEMS & METEOROLOGY — 130 pairs
80 pairs

technical

Aircraft systems theory: hydraulic, pneumatic, electrical, and flight control architectures. Avionics integration, glass cockpit operation, powerplant fundamentals (reciprocating and turbine). Teaches AI the engineering-level systems knowledge required for troubleshooting and instruction.

50 pairs

weather

Aviation meteorology: METAR and TAF decoding, convective SIGMET interpretation, icing and turbulence assessment, mountain wave recognition, microburst avoidance procedures. Trains models to translate raw weather data into go/no-go pilot decisions.

OPERATIONS & MANAGEMENT — 23 pairs
10 pairs

airport-management

Airport operations, ground handling coordination, terminal management procedures, NOTAMs, and aerodrome certification standards.

7 pairs

drones

Part 107 remote pilot operations, airspace authorization (LAANC), waiver applications, and UAS certification requirements.

6 pairs

ground-handling

Ramp safety protocols, aircraft marshalling signals, pushback procedures, de-icing operations, and FOD prevention programs.

SAFETY, RISK & HUMAN FACTORS — 12 pairs
5 pairs

risk-assessment

FRAT (Flight Risk Assessment Tool) methodology, threat and error management, operational risk quantification for dispatch and go/no-go decisions.

4 pairs

safety-management

SMS implementation frameworks, safety culture development, voluntary and mandatory reporting systems, safety promotion programs.

3 pairs

human-factors

CRM principles, SHELL model application, error chain analysis, fatigue risk management, and situational awareness training.

SPECIALIZED TRAINING & MEDICAL — 15 pairs
4 pairs

airport-planning

Airport master planning, runway capacity analysis, environmental impact assessments, obstruction evaluation, and land-use compatibility studies.

4 pairs

flight-simulator

FSTD qualification standards (FFS, FTD, AATD levels), simulator fidelity requirements, training device regulatory approval process.

4 pairs

instrument-rating

IFR procedures, approach plate interpretation (Jeppesen/FAA), holding pattern entry, missed approach procedures, and alternate planning.

3 pairs

aviation-medicine

Aeromedical certification (Class I/II/III), BasicMed qualifications, hypoxia recognition, spatial disorientation, and pilot fitness-for-duty evaluation.
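As a quick cross-check, the per-specialty counts listed above can be summed against the group subtotals shown in the section headings. The sketch below is illustrative only: the dictionary names are hypothetical and the numbers are copied from this page, not read from the dataset itself.

```python
# Per-specialty pair counts as listed on this page, grouped by section heading.
# Names of the dictionaries are illustrative; the figures come from the page text.
LISTED_COUNTS = {
    "HIGH-VOLUME CORE": {
        "safety-compliance": 8282, "flight-ops": 2581,
        "pilot-training": 2496, "mro": 882,
    },
    "REGULATIONS & AIRSPACE": {
        "regulations": 375, "atc": 348, "air-traffic-control": 6,
    },
    "TECHNICAL SYSTEMS & METEOROLOGY": {"technical": 80, "weather": 50},
    "OPERATIONS & MANAGEMENT": {
        "airport-management": 10, "drones": 7, "ground-handling": 6,
    },
    "SAFETY, RISK & HUMAN FACTORS": {
        "risk-assessment": 5, "safety-management": 4, "human-factors": 3,
    },
    "SPECIALIZED TRAINING & MEDICAL": {
        "airport-planning": 4, "flight-simulator": 4,
        "instrument-rating": 4, "aviation-medicine": 3,
    },
}

# Subtotals as stated in the section headings.
STATED_SUBTOTALS = {
    "HIGH-VOLUME CORE": 14241,
    "REGULATIONS & AIRSPACE": 729,
    "TECHNICAL SYSTEMS & METEOROLOGY": 130,
    "OPERATIONS & MANAGEMENT": 23,
    "SAFETY, RISK & HUMAN FACTORS": 12,
    "SPECIALIZED TRAINING & MEDICAL": 15,
}


def group_subtotals(groups: dict) -> dict:
    """Sum the per-specialty counts inside each group."""
    return {name: sum(counts.values()) for name, counts in groups.items()}


computed = group_subtotals(LISTED_COUNTS)
for name, total in computed.items():
    status = "ok" if total == STATED_SUBTOTALS[name] else "MISMATCH"
    print(f"{name}: {total} ({status})")
```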

30+ ADDITIONAL SPECIALTIES

The remaining specialties span the full scope of aviation operations. Each contains verified, platinum-tier pairs covering niche but critical domains. Together they represent the long tail of aviation competence that separates a general-purpose model from one that actually understands the industry.

crew-resource-management, aviation-management, operations-research, cargo-operations, emergency-procedures, flight-dispatch, aviation-security, aerodynamics, navigation, aircraft-performance, cabin-safety, accident-investigation, aviation-law, military-aviation, helicopter-ops, space-operations, aviation-insurance, fuel-management, noise-abatement, wildlife-management, runway-safety, aviation-english, fatigue-management, flight-data-analysis, aviation-cybersecurity, sustainable-aviation, air-charter, aerial-survey, agricultural-aviation, banner-tow-ops

Quality Pipeline

Aviation data goes through the same CoVe (Chain-of-Verification) promotion pipeline as all Swarm & Bee verticals, with zero-tolerance thresholds.

79.4%
Pass Rate
15,236 promoted from 19,181 candidates. The 20.6% that failed were killed, not downgraded.
≥4/5
Accuracy Floor
No pair ships with accuracy below 4. In aviation, "mostly correct" is the same as wrong.
≥20/25
Total Score
5 criteria × 5 points: accuracy, completeness, structure, relevance, SFT quality.
2-Stage
CoVe Process
Stage 1: Llama-70B rewrite for clarity. Stage 2: Qwen-235B scoring against 5 criteria.
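The scoring cutoffs stated on this page (accuracy ≥4, every criterion ≥3, total ≥20/25, binary PASS or KILL) can be written as a small decision function. This is a minimal sketch under stated assumptions, not the production scorer: the criterion key names (for example `sft_quality`) and the function names are hypothetical, and the Qwen-235B scoring call itself is not shown.

```python
# Criterion names follow the five criteria listed on the page;
# the exact key spellings are assumptions for this sketch.
CRITERIA = ("accuracy", "completeness", "structure", "relevance", "sft_quality")

ACCURACY_FLOOR = 4    # accuracy must be at least 4/5
CRITERION_FLOOR = 3   # every criterion must be at least 3/5
TOTAL_FLOOR = 20      # total must be at least 20/25


def cove_verdict(scores: dict) -> str:
    """Return "PASS" or "KILL" for one pair's five criterion scores."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    values = [scores[c] for c in CRITERIA]
    if any(not 1 <= v <= 5 for v in values):
        raise ValueError("each criterion score must be between 1 and 5")
    passed = (
        scores["accuracy"] >= ACCURACY_FLOOR
        and min(values) >= CRITERION_FLOOR
        and sum(values) >= TOTAL_FLOOR
    )
    return "PASS" if passed else "KILL"


def pass_rate(promoted: int, candidates: int) -> float:
    """Fraction of candidate pairs that were promoted."""
    if candidates <= 0:
        raise ValueError("candidates must be positive")
    return promoted / candidates


# A 23/25 pair with accuracy 5 passes; a 23/25 pair with accuracy 3 does not.
print(cove_verdict({"accuracy": 5, "completeness": 5, "structure": 5,
                    "relevance": 5, "sft_quality": 3}))   # PASS
print(cove_verdict({"accuracy": 3, "completeness": 5, "structure": 5,
                    "relevance": 5, "sft_quality": 5}))   # KILL
print(round(pass_rate(15_236, 19_181) * 100, 1))          # 79.4
```

Note that the accuracy floor is checked independently of the total, so a high overall score cannot compensate for an accuracy score of 3.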

Verification Pipeline

1

Raw Generation

Aviation Q&A pairs generated from curated specialty prompts. Grounded in FAA publications, AIM, AC series, and ICAO documentation.

2

Deterministic Gates

6 hard gates: JSON schema validity, output length (≥500 chars), numeric content verification, concept presence, deduplication (MD5 fingerprint), degeneration detection.

3

Llama-70B Rewrite

Stage 1 CoVe: Llama-3.3-70B-Instruct rewrites each answer for technical precision and instructional clarity while preserving regulatory accuracy.

4

Qwen-235B Scoring

Stage 2 CoVe: Qwen-235B-A22B scores on 5 criteria (1-5 each). Hard cutoffs: accuracy ≥4, all criteria ≥3, total ≥20/25. Binary result: PASS or KILL.

5

Platinum Promotion

Passing pairs promoted to platinum tier. SHA-256 sealed. Pushed to R2 bucket sb-aviation. Tamper-evident, audit-ready.
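The deterministic gates in step 2 can be illustrated with a short sketch. Only the gate names, the 500-character minimum, and the use of an MD5 fingerprint come from this page. Everything else is an assumption made for illustration: the `question`/`answer` record shape, the "contains at least one digit" reading of numeric verification, the trigram-repetition heuristic for degeneration, and all function names.

```python
import hashlib
import json
import re

MIN_OUTPUT_CHARS = 500  # stated on the page


def fingerprint(text: str) -> str:
    """MD5 of whitespace- and case-normalized text (normalization is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def is_degenerate(text: str, min_unique_ratio: float = 0.5) -> bool:
    """Assumed heuristic: flag text whose 3-word sequences mostly repeat."""
    words = text.split()
    if len(words) < 6:
        return False
    trigrams = [tuple(words[i:i + 3]) for i in range(len(words) - 2)]
    return len(set(trigrams)) / len(trigrams) < min_unique_ratio


def run_gates(raw_line: str, required_concepts: list, seen: set) -> list:
    """Return the names of the gates a raw JSONL line fails (empty list = all pass)."""
    # Gate 1: schema. Hypothetical minimal schema: string "question" and "answer".
    try:
        record = json.loads(raw_line)
        question, answer = record["question"], record["answer"]
        if not isinstance(question, str) or not isinstance(answer, str):
            raise TypeError("question and answer must be strings")
    except (json.JSONDecodeError, KeyError, TypeError):
        return ["schema"]

    failures = []
    if len(answer) < MIN_OUTPUT_CHARS:                        # Gate 2: length
        failures.append("length")
    if not re.search(r"\d", answer):                          # Gate 3: numeric (assumed rule)
        failures.append("numeric")
    lowered = answer.lower()
    if not any(c.lower() in lowered for c in required_concepts):  # Gate 4: concept
        failures.append("concept")
    fp = fingerprint(answer)                                  # Gate 5: dedup
    if fp in seen:
        failures.append("dedup")
    else:
        seen.add(fp)
    if is_degenerate(answer):                                 # Gate 6: degeneration
        failures.append("degeneration")
    return failures
```

Because every gate is a pure function of the text and the set of fingerprints already seen, rerunning the gates over the same input in the same order yields the same result, which is what makes this stage deterministic.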

What Ships

Same package as every Swarm & Bee vertical. 5 formats, train/eval split, full provenance. Your framework, our data.

Delivery Manifest
SwarmAviation — 45,222 pairs
SEALED
5 Delivery Formats — Train + Eval Split Each
1. ChatML — aviation_train.chatml.jsonl
OpenAI API, TRL, Unsloth, Axolotl
2. Alpaca — aviation_train.alpaca.jsonl
LLaMA-Factory, HuggingFace trainers
3. ShareGPT — aviation_train.sharegpt.jsonl
FastChat, Vicuna, multi-turn trainers
4. OpenAI — aviation_train.openai.jsonl
Direct upload to gpt-4o fine-tuning
5. Completion — aviation_train.completion.jsonl
Legacy pipelines, custom training loops
Train / Eval Split — All 5 formats include both splits
42,961 train  /  2,261 eval
95/5 hash-based split. Reproducible. No data leakage.
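A hash-based split assigns each pair to train or eval from a hash of a stable key, so the assignment does not depend on file order or a random seed. The sketch below shows one common way to do this. The choice of SHA-256, the use of a per-pair ID or content fingerprint as the key, and the function name are assumptions; the page states only that the split is 95/5 and hash-based.

```python
import hashlib

EVAL_PERCENT = 5  # 95/5 split, as stated on the page


def assign_split(pair_key: str, eval_percent: int = EVAL_PERCENT) -> str:
    """Deterministically map a stable per-pair key to "train" or "eval".

    pair_key is assumed to be something stable such as a content
    fingerprint, so identical content always lands in the same split.
    """
    digest = hashlib.sha256(pair_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "eval" if bucket < eval_percent else "train"


# The same key always gets the same answer, run after run.
keys = [f"pair-{i}" for i in range(10_000)]
eval_count = sum(assign_split(k) == "eval" for k in keys)
print(f"eval share: {eval_count / len(keys):.1%}")
```

Keying on content rather than on row position is what prevents leakage in this scheme: a duplicated pair cannot end up on both sides of the split.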
Provenance & Verification
DATA_CARD.json
Quality metrics, model lineage, gate pass rates, specialty distribution, generation model ID, CoVe scores
guarantee.json
Merkle root of every pair. SHA-256 sealed. Optional Hedera HCS on-chain timestamp. Tamper-evident.
README.txt
Quickstart for Python, Unsloth, OpenAI API. Copy, paste, train. Under 5 minutes to first training run.
Per-Pair Metadata
source, order_id, vertical, specialty, model, quality gate result, CoVe score, content fingerprint
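To show how a Merkle root makes a delivery tamper-evident, here is a minimal sketch that hashes each pair with SHA-256 and folds the hashes pairwise into a single root. The canonical JSON serialization, the rule of duplicating the last node on odd levels, and the function names are assumptions for illustration. The actual layout of guarantee.json is not described on this page.

```python
import hashlib
import json


def leaf_hash(pair: dict) -> bytes:
    """SHA-256 of a canonical JSON encoding of one pair (canonical form is assumed)."""
    canonical = json.dumps(pair, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).digest()


def merkle_root(leaves: list) -> str:
    """Fold leaf hashes pairwise until a single root remains; return it as hex."""
    if not leaves:
        raise ValueError("cannot compute a Merkle root of zero leaves")
    level = list(leaves)  # copy so the caller's list is not modified
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


pairs = [
    {"specialty": "weather", "q": "Decode this METAR", "a": "..."},
    {"specialty": "mro", "q": "Is this AD recurring?", "a": "..."},
    {"specialty": "atc", "q": "Read back this clearance", "a": "..."},
]
root = merkle_root([leaf_hash(p) for p in pairs])

# Changing any single pair changes the root.
tampered = [dict(pairs[0], a="edited")] + pairs[1:]
print(root != merkle_root([leaf_hash(p) for p in tampered]))
```

A recipient who recomputes the root over the delivered pairs and compares it with the published value can detect any added, removed, or altered pair, provided both sides use the same serialization and pair order.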
Quality Disclosure
50+
Aviation specialties from safety-compliance to agricultural-aviation
79.4%
CoVe pass rate. Failed pairs killed, not recycled.
6 Gates
Deterministic: length, schema, numeric, concept, dedup, degeneration
2-Stage
Llama-70B rewrite + Qwen-235B scoring. No shortcuts.

This is what ships. Every order. Specialty filtering available — request safety-compliance only, flight-ops only, or the full 45K package.
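If you receive the full package and want to narrow it yourself, specialty filtering can also be done locally. The sketch below assumes each JSONL line carries a `metadata.specialty` field, as in the sample pair shown on this page; whether every delivery format embeds that field inline is an assumption, and the function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_by_specialty(lines: Iterable[str], specialties: set) -> Iterator[dict]:
    """Yield parsed records whose metadata.specialty is in the requested set."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("metadata", {}).get("specialty") in specialties:
            yield record


# In-memory demo; with a real file, pass the open file handle instead, e.g.
#   with open("aviation_train.chatml.jsonl", encoding="utf-8") as f:
#       kept = list(filter_by_specialty(f, {"safety-compliance"}))
demo_lines = [
    json.dumps({"messages": [], "metadata": {"specialty": "safety-compliance"}}),
    json.dumps({"messages": [], "metadata": {"specialty": "weather"}}),
    "",
]
kept = list(filter_by_specialty(demo_lines, {"safety-compliance", "flight-ops"}))
print(len(kept))
```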

Sample Pair

What a platinum-tier aviation training pair looks like. This is the quality floor, not the ceiling.

sb-aviation / safety-compliance / pair_example.jsonl
// ChatML format (one of 5 delivery formats)
{
  "messages": [
    {
      "role": "system",
      "content": "You are an aviation safety specialist..."
    },
    {
      "role": "user",
      "content": "A Part 121 carrier's SMS has identified a recurring trend of unstabilized approaches at KDEN Runway 16R. Three events in 30 days, all during gusty crosswind conditions. What corrective actions should the Safety Review Board mandate?"
    },
    {
      "role": "assistant",
      "content": "Based on the trend analysis of three unstabilized approaches at KDEN 16R within 30 days, the Safety Review Board should mandate the following corrective actions under 14 CFR 5.51(c):\n\n 1. IMMEDIATE: Issue Operations Bulletin requiring mandatory go-around if not stabilized by 1,000 AGL in gusty conditions (vs standard 500 AGL)... \n\n[2,400+ chars of regulatory-grounded response]"
    }
  ],
  "metadata": {
    "vertical": "aviation",
    "specialty": "safety-compliance",
    "cove_score": 23,
    "accuracy": 5,
    "tier": "platinum"
  }
}
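For readers who want to see how a record shaped like the sample above relates to the Alpaca layout, here is a minimal conversion sketch. It assumes single-turn records with at most one system, one user, and one assistant message. The extra `system` key is an assumption that some Alpaca-style trainers accept; drop it if yours does not. The shipped Alpaca files may be structured differently.

```python
def chatml_to_alpaca(record: dict) -> dict:
    """Convert one single-turn ChatML-style record to an Alpaca-style dict."""
    by_role = {}
    for message in record["messages"]:
        # Keep only the first message seen for each role (single-turn assumption).
        by_role.setdefault(message["role"], message["content"])
    if "user" not in by_role or "assistant" not in by_role:
        raise ValueError("record needs one user and one assistant message")
    return {
        "instruction": by_role["user"],
        "input": "",
        "output": by_role["assistant"],
        "system": by_role.get("system", ""),
    }


example = {
    "messages": [
        {"role": "system", "content": "You are an aviation safety specialist..."},
        {"role": "user", "content": "What corrective actions should the board mandate?"},
        {"role": "assistant", "content": "The board should mandate..."},
    ],
    "metadata": {"specialty": "safety-compliance"},
}
converted = chatml_to_alpaca(example)
print(sorted(converted))
```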

What This Trains

These are not toy demos. These are the aviation AI capabilities that matter.

Safety-Critical

Regulatory Compliance Engines

AI that can interpret 14 CFR, identify applicable regulations for a given operation, and flag compliance gaps. The model that tells you what you missed before the FAA does.

Operations

Flight Planning Assistants

Weight & balance verification, fuel burn computation, alternate planning with weather integration. Models that do the math pilots rely on.

Training

Ground School AI Tutors

Checkride prep, instrument knowledge testing, CRM scenario generation. AI that teaches to FAA Airman Certification Standards, not generic aviation trivia.

Maintenance

MRO Decision Support

AD compliance tracking, inspection scheduling, MEL dispatch guidance. The model that knows whether this aircraft is airworthy — and can cite why.

ATC

Controller Training Systems

Phraseology validation, separation standard verification, scenario-based ATC training. Building AI that speaks the language of controlled airspace correctly.

Weather

Aviation Weather Briefing

METAR/TAF interpretation, convective hazard assessment, icing probability analysis. Models that translate raw meteorological data into pilot-actionable decisions.