Politically Informed

Methodology

How data enters Politically Informed, how it's validated, what we can prove, and what we can't.

Our governance posture, in one paragraph

Politically Informed is a read-only aggregator of Canadian federal political data. We do not generate data; we ingest it from primary public sources, transform it through a documented pipeline, and present it with provenance attached to every record. We fail closed on missing evidence rather than guessing. We tell you what the data says and what it doesn't.

Data sources

Every record on the site is sourced from one of the following primary public datasets, each operating under the Open Government Licence — Canada or an equivalent permissive license:

  • Members of Parliament + Office Terms: openparliament.ca and the official Members of Parliament register at ourcommons.ca. Coverage: 41st Parliament onward.
  • Parliamentary Vote Records: Per-MP per-vote position records sourced from openparliament.ca. Coverage: 39th–current parliaments, approximately 1.38M individual MP-vote records.
  • Bills + Motions: Bill metadata from openparliament.ca. Includes bill number, session, sponsor (where resolvable), and current legislative stage.
  • MP Office Expenditures: Proactive disclosure from ourcommons.ca, broken down by quarter and category (salaries, travel, hospitality, contracts). Currently ~6,900 fiscal-quarter records across all MPs.
  • Campaign Contributions: Elections Canada contribution disclosures (~16.3M raw donation records). We match donations to recipient politicians by name, electoral district, and election cycle, and attribute a donation only when the match confidence is ≥ 0.7. The matched dataset is approximately 1.92M donations across 457 unique recipient MPs.
  • Riding Boundaries: Federal electoral district boundary files from the Canadian government, used for the map view and constituency lookup.

We do not collect, scrape, or interpret any data not already published by the Government of Canada or one of its arm's-length agencies. We do not augment records with editorial judgment or political commentary.

Pipeline architecture

Data flows through a five-stage pipeline before reaching this site:

  1. Acquisition: Raw datasets are downloaded from their primary sources and stored verbatim. We never modify the source files.
  2. Canonical normalization: Each raw source is transformed into a canonical JSONL schema (one record per line, deterministic field ordering, UTF-8 with LF line endings). This produces a per-source canonical_build directory that's reproducible from the raw inputs.
  3. Bridge transformation: Canonical records are converted into the Politically Informed record schema (entities, office terms, vote records, contribution records, etc.) with cross-source reference resolution where possible.
  4. Sealing: The full record set is written to disk as one JSON file per record, indexed by lookup field (e.g. by person, by bill, by election cycle), with a manifest containing record counts and SHA-256 hashes for spot-check integrity validation.
  5. Promotion: The sealed build is loaded into the running API instance at /builds/current. Every page on this site reads from this build. The build tag and built-at timestamp are visible in the footer of every page.
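
As an illustration of stages 2 and 4, here is a minimal Python sketch of byte-stable JSONL output and a manifest entry with a SHA-256 hash. It is not the production code: the function names (canonical_line, write_canonical_jsonl, manifest_entry) and the manifest field names are assumptions made for this example, and sorting keys alphabetically is only one possible way to get deterministic field ordering.

```python
import hashlib
import json
from pathlib import Path


def canonical_line(record: dict) -> str:
    """Serialize one record with sorted keys so the bytes do not depend on input key order."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False, separators=(",", ":"))


def write_canonical_jsonl(records: list[dict], path: Path) -> None:
    """Write one record per line, UTF-8 encoded, with LF line endings on every platform."""
    with path.open("w", encoding="utf-8", newline="\n") as fh:
        for record in records:
            fh.write(canonical_line(record) + "\n")


def manifest_entry(path: Path) -> dict:
    """Return a record count and SHA-256 hash for one file (hypothetical manifest layout)."""
    digest = hashlib.sha256()
    records = 0
    with path.open("rb") as fh:
        for line in fh:
            digest.update(line)
            records += 1
    return {"file": path.name, "records": records, "sha256": digest.hexdigest()}
```

Because the serialization is deterministic, two runs over the same raw input should produce identical files and therefore identical hashes, which is what makes a spot check against the manifest meaningful.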

Update cadence

The current published build is identified in the footer below as 'build: canonical_2026q2_pi_20260612'. Builds are produced on a periodic schedule, not in real time. Every record on this site reflects the data as of the build's ingest window, which is also shown in the footer. Vote records and parliamentary activity typically lag the live ourcommons.ca feed by 1 to 7 days. Contribution data is published by Elections Canada quarterly or annually, depending on the source.

Matching logic and confidence levels

Most associations are deterministic — a vote record carries the MP's official identifier, a bill carries its parliamentary reference number. Two areas involve probabilistic matching, and we surface that:

  • Contribution → Recipient politician: Elections Canada records donor and recipient names but not stable identifiers across cycles. We match donations to MPs by name + electoral district + cycle, scoring confidence between 0 and 1. We only attribute a donation to an MP when confidence is ≥ 0.7. The remaining ~14.4M donations (about 16.3M raw records minus about 1.92M matched) exist in the source data but are not attributed on any politician's profile. Aggregate contribution totals shown on profile pages reflect only these high-confidence matches.
  • Bill → Sponsor MP: Bill sponsorship is sourced from openparliament.ca's bill records. When sponsor_entity_id is not resolved in the source, the bill appears in the database but is not shown on any MP's profile as a sponsored bill. We do not guess.
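
The threshold rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not our scoring function: the text only specifies the inputs (name, electoral district, cycle) and the 0.7 cutoff, so the 0.6/0.4 weighting, the use of difflib string similarity, and the names Candidate, score, and attribute are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

THRESHOLD = 0.7  # from the text: attribute only when confidence >= 0.7


@dataclass(frozen=True)
class Candidate:
    """A possible recipient MP (hypothetical shape for this sketch)."""
    entity_id: str
    name: str
    district: str
    cycle: int


def _similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0 and 1."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def score(name: str, district: str, cycle: int, cand: Candidate) -> float:
    """Confidence between 0 and 1; a different cycle rules the candidate out."""
    if cycle != cand.cycle:
        return 0.0
    # Assumed weighting: name counts for more than district.
    return 0.6 * _similarity(name, cand.name) + 0.4 * _similarity(district, cand.district)


def attribute(name: str, district: str, cycle: int,
              candidates: list[Candidate]) -> Optional[str]:
    """Return the best candidate's id, or None when nothing clears the threshold."""
    scored = [(score(name, district, cycle, c), c) for c in candidates]
    if not scored:
        return None
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best.entity_id if best_score >= THRESHOLD else None
```

Returning None rather than the closest candidate is the fail-closed behaviour described above: a donation that does not clear the threshold stays unattributed instead of being guessed.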

Known limitations and gaps

  • Pre-2015 contribution matching quality degrades because Elections Canada formatting changed multiple times. As a result, older periods show a disproportionately high share of unmatched donations.
  • Senators are not currently in scope. This site covers the House of Commons only.
  • Provincial and municipal political activity is out of scope except for the jurisdictions listed under Three-layer coverage below.
  • Sponsored travel disclosure data exists in the pipeline but is not yet surfaced on profile pages.
  • Per-record vote files are indexed, but not all of them are individually materialized in the current build (a storage tradeoff on a single Linux filesystem). Vote counts displayed on profiles reflect the full index; the per-record /votes endpoint serves a representative sample.
  • Bilingual coverage (EN/FR) of UI text is complete; bilingual coverage of source data depends on what each upstream provider publishes in both languages.

AI-assisted features

AI features are OPTIONAL on Politically Informed. Subscribers may choose to connect their own LLM provider (Anthropic, OpenAI, or self-hosted Ollama) to generate plain-English explanations and multi-turn discussions about bills, vote patterns, and contribution patterns. The platform provides only the infrastructure — the AI compute itself runs against the subscriber's own account at their own cost. We don't provide AI; we provide the ability to connect AI. Subscribers who don't want AI features simply don't connect a provider; the rest of the platform works identically without it.

Politically Informed does not store user prompts or LLM responses. We log only metadata (which provider, which model, token counts, success/failure) for billing and rate-limit purposes. Conversation history, if any, is the user's responsibility to retain in their own environment.
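
One way to picture a metadata-only log is a record type that has no field for text at all. The Python sketch below is illustrative only: the class name LlmCallMetadata, the function record_call, and the exact field names are assumptions; the four categories (provider, model, token counts, success/failure) come from the paragraph above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LlmCallMetadata:
    """Metadata for one LLM call; deliberately has no prompt or response field."""
    provider: str       # e.g. "anthropic", "openai", "ollama"
    model: str
    input_tokens: int
    output_tokens: int
    succeeded: bool


def record_call(provider: str, model: str, input_tokens: int,
                output_tokens: int, succeeded: bool) -> dict:
    """Build the log entry from counts alone; text never reaches this function."""
    return asdict(LlmCallMetadata(provider, model, input_tokens, output_tokens, succeeded))
```

Because the logging function accepts only counts and flags, there is no code path through which prompt or response text could be written to the log.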

AI-generated explanations are supplementary, not authoritative. Every explanation displayed on this site is accompanied by a disclosure asking readers to verify any claim against the original source records on ourcommons.ca and openparliament.ca.

Provenance on every page

The footer of every page on this site shows:

  • Records as of: the build's ingest window end time
  • build: the deterministic build tag (e.g. canonical_2026q2_pi_20260612)
  • schema: the record schema version
  • chain depth: how many earlier builds this build chains from

Together, these four identifiers let you cite exactly which build a given fact was retrieved from.
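
For readers who want to record a citation programmatically, here is a small Python sketch that combines the four footer values into one string. The citation format, the class name BuildProvenance, and the function cite are assumptions for this example; the site does not prescribe a citation style.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildProvenance:
    """The four footer values, copied as shown on the page."""
    records_as_of: str
    build: str
    schema: str
    chain_depth: int


def cite(page_path: str, p: BuildProvenance) -> str:
    """Format a citation string (hypothetical format) for one page and build."""
    return (
        f"Politically Informed, {page_path} "
        f"(records as of {p.records_as_of}; build {p.build}; "
        f"schema {p.schema}; chain depth {p.chain_depth})"
    )
```

For example, cite("/some/page", BuildProvenance("<time from footer>", "canonical_2026q2_pi_20260612", "<schema from footer>", 0)) yields a single line you can paste into notes; every value other than the build tag here is a placeholder to be replaced with what the footer actually shows.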

Three-layer coverage

Politically Informed is jurisdiction-agnostic. The pipeline (acquisition → canonical → bridge → seal → promote) accepts any structured political dataset that we can write an adapter for. We currently cover three layers of Canadian government, at varying stages of completeness:

  • Federal — Parliament of Canada: Live. MPs, votes, bills, expenses, and Elections Canada contributions. 1.37 million records under continuous ingest.
  • Provincial — Alberta Legislative Assembly: Scaffolded at /fr/alberta. MLA roster pipeline points at the CKAN API on open.alberta.ca. Per-MLA voting records require HTML parsing of Votes and Proceedings on assembly.ab.ca — Tier 2 work in progress.
  • Provincial — BC Legislative Assembly: Live at Tier 1: all 93 MLAs of the 43rd Parliament with party, electoral district, and official headshot, plus all 93 electoral district boundaries (2023 redistribution, Elections BC Open Data Licence). Roster from the leg.bc.ca members directory via OpenNorth Represent. Voting records are Tier 2 — leg.bc.ca publishes Votes and Proceedings as documents, not a structured feed.
  • Municipal — Edmonton City Council: Live with 5,863 records. 12 current councillors, 741 motions with full text, 5,075 deduplicated votes, 12 wards with GeoJSON. Sourced from data.edmonton.ca via Socrata API.
  • Municipal — Calgary City Council: Live. Every councillor and mayor who cast a recorded council vote since October 2020 (three council terms), 2,500+ motions with full resolution text, 36,000+ deduplicated votes, 14 wards with GeoJSON. Sourced from data.calgary.ca via Socrata API. Committee votes are excluded because they include appointed citizen members; ward attribution for past terms derives from official election results.
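
The adapter idea described at the top of this section can be sketched in Python as below. This is a hedged illustration, not our internal interface: the names SourceAdapter, acquire, to_canonical, run_adapter, and InMemoryCouncilAdapter are hypothetical, and the example adapter reads from an in-memory list rather than a real open-data API.

```python
from typing import Iterable, Iterator, Protocol


class SourceAdapter(Protocol):
    """What a jurisdiction adapter would need to provide (assumed shape)."""
    jurisdiction: str

    def acquire(self) -> Iterable[dict]: ...
    def to_canonical(self, raw: dict) -> dict: ...


def run_adapter(adapter: SourceAdapter) -> Iterator[dict]:
    """Acquire raw rows, normalize each one, and tag it with its jurisdiction."""
    for raw in adapter.acquire():
        record = adapter.to_canonical(raw)
        record["jurisdiction"] = adapter.jurisdiction
        yield record


class InMemoryCouncilAdapter:
    """Toy adapter over a list of rows; stands in for a real data source."""
    jurisdiction = "example-city"

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def acquire(self) -> Iterable[dict]:
        return iter(self._rows)

    def to_canonical(self, raw: dict) -> dict:
        # Map source column names onto canonical field names.
        return {"name": raw["Name"].strip(), "ward": raw["Ward"]}
```

Under this design, adding a jurisdiction means writing one new adapter class, while the later stages (bridge, seal, promote) stay unchanged.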

Subscribers get access to all three layers at no additional cost. There's a single price for the whole product. Expansion to additional municipalities and other provinces is demand-driven — if you operate or cover a Canadian jurisdiction and want it next, write to [email protected].

Corrections + factual disputes

If you spot a factual error, a mistaken identity, a stale upstream link, or an attribution you'd like to dispute, please write to [email protected] or see the corrections process page. We treat correction requests on political figures as time-sensitive.