
Methodology

How our data actually works

Every number on this site comes from a real source. Here is where each one comes from, how we compute it, what we leave out, and what “estimate” really means.

What the numbers mean

You will see a few different totals around the site. They come from different slices of the same data, so here is exactly what each one counts.

| Number on the site | What it counts | Where it comes from | Includes the historical archive? |
| --- | --- | --- | --- |
| Journeys tracked (~22,120) | Every journey in our archive | The community tracker plus a one-time import of historical Immitracker records, deduplicated by handle and AOR date | Yes |
| Reached eCOPR (~1,575) | Journeys in that archive that have an eCOPR date recorded | Same archive | Yes |
| Timelines analyzed (~10,641) | The rows the Insights page computes on: community plus still-active historical records, with chronologically impossible rows removed | The tracker view, cleaned | Active records only |
| Community applicants (~692) | People in the live community tracker itself (two overlapping tabs, deduplicated) | The community Google Sheet only | No |

In short: journeys tracked is the whole archive including that one-time historical import; timelines analyzed is the cleaned, still-active slice the Insights page runs on; and community applicants is just the live community tracker.

1. Where the data comes from

Four sources feed this site. Each has a different cadence, a different kind of authority, and a different freshness guarantee.

Community IRCC PR Master Tracker

A public Google Spreadsheet maintained by and for Express Entry applicants. About 692 unique applicants appear after deduplication, spread across two overlapping tabs. The first tab covers all streams (~613 unique applicants); the second tab also covers all streams (~636 unique applicants) and adds an ITA date column. Rows that appear in both tabs are reconciled by a fingerprint of username, AOR date, and stream: the second tab wins on any conflict.

Community members enter and update their own rows directly in the Sheet. We mirror it into our database once an hour using the Google Sheets API values endpoint, which always returns the raw cell data regardless of any view filters a member may have left open.
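As a rough illustration of the mirroring step, the sketch below turns a values-endpoint payload into one record per row. The Sheets `values` response carries a `values` array of rows and omits trailing empty cells, so short rows are padded to the header width. The column names and the sample payload are hypothetical, not the tracker's real layout.

```python
from typing import Any


def rows_from_values(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Turn a Sheets values payload into one dict per data row."""
    values = payload.get("values", [])
    if not values:
        return []
    header, *data = values
    records = []
    for row in data:
        # Trailing empty cells are omitted by the API, so pad short rows.
        padded = list(row) + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Hypothetical payload: the eCOPR cell of the only data row was left blank.
sample = {
    "values": [
        ["Username", "Stream", "AOR", "eCOPR"],
        ["maple_42", "CEC General", "2025-05-27"],
    ]
}
records = rows_from_values(sample)
```

Padding keeps every record the same shape, so later steps can treat a blank milestone as an absent date rather than a missing key.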

Source: IRCC PR Master Tracker • refreshed hourly

IRCC Express Entry draw history

The official IRCC draws JSON feed, published by the Government of Canada. We ingest every draw: round number, date, draw type, minimum CRS score, and number of invitations issued. IRCC publishes this feed after each draw event, which typically happens every two weeks.

Source: IRCC official draws feed • refreshed hourly

IRCC open data

Government of Canada open datasets: Express Entry invited-candidate demographics (age, education, language, field of study, country of residence, CMA), operational processing times by application category, and monthly permanent resident admissions by immigration program. These are published on irregular schedules ranging from monthly to quarterly. We seed them into our database when a new release is available; the “as of” date shown on every chart is the publication date of the underlying dataset, not the date we loaded it.

Source: Open Government Canada • checked daily, updated when IRCC publishes

Statistics Canada census and population data

Population and immigration context figures from Statistics Canada, seeded at site setup and refreshed when new census-cycle data is released. Used for population denominators in the Trends section.

Source: Statistics Canada • seeded; updated on new census release

What “as of” means. Every figure that can go stale carries an “as of” label. For IRCC processing times, that is the date IRCC last published the figure. For community timelines, it is the date of our last successful hourly mirror. Nothing on this site is presented as real-time.

2. How we compute the numbers

Cleaning and normalizing

Stream names in the community tracker are entered freely by members: “CEC”, “CEC General”, “CEC_General”, and “CEC (Trade)” all refer to the same stream. We map every variant to a canonical set before storing or displaying anything.
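A minimal sketch of this kind of mapping, assuming the first token of a cleaned label identifies the stream. Only the CEC variants come from the text above; the `CANONICAL_STREAMS` table and the function name are hypothetical.

```python
import re
from typing import Optional

# Hypothetical canonical map; a real one would list every supported stream.
CANONICAL_STREAMS = {"cec": "CEC"}


def normalize_stream(raw: str) -> Optional[str]:
    """Map a free-text stream label to its canonical name, or None if unknown."""
    # Lowercase, then collapse underscores, brackets and spaces into single spaces.
    cleaned = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    if not cleaned:
        return None
    # Assumption: the first token identifies the stream ("cec general" -> "cec").
    return CANONICAL_STREAMS.get(cleaned.split()[0])


variants = ["CEC", "CEC General", "CEC_General", "CEC (Trade)"]
canonical = {normalize_stream(v) for v in variants}
```

Returning `None` for an unrecognized label, instead of guessing, keeps unknown streams visible as a data-quality problem.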

Dates appear in at least three formats across the Sheet: 2025-05-27, May 27, 2025, and 6/19/2024. Some cells contain free-text notes rather than dates. We try each known format in order and treat anything we cannot parse as absent. UI code always treats dates as nullable.
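The try-each-format approach could look roughly like the sketch below. The three format strings correspond to the examples above; reading `6/19/2024` as month/day/year is an assumption that the example itself supports, since 19 cannot be a month. The function name is hypothetical.

```python
from datetime import date, datetime
from typing import Optional

# The three formats named above, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%B %d, %Y", "%m/%d/%Y")


def parse_date(cell: str) -> Optional[date]:
    """Return the parsed date, or None when the cell is not a recognizable date."""
    text = cell.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # not this format; try the next one
    return None  # free-text note or unknown format: treated as absent
```

For example, `parse_date("May 27, 2025")` and `parse_date("2025-05-27")` yield the same date, while a note such as `"waiting on PPR"` yields `None`.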

Duration columns like “BIL to FD” are stored as human-readable strings (e.g. “0 years, 0 months, 19 days”). We convert these to total day counts for sorting and averaging, while preserving the original string for display.
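A sketch of that conversion follows. The text above does not say how years and months are weighted, so the 365-day year and 30-day month used here are assumptions, as are the names.

```python
import re
from typing import Optional

DURATION_RE = re.compile(
    r"(\d+)\s*years?,\s*(\d+)\s*months?,\s*(\d+)\s*days?", re.IGNORECASE
)

DAYS_PER_YEAR = 365  # assumption: the conversion factors are not documented
DAYS_PER_MONTH = 30  # assumption


def duration_to_days(text: str) -> Optional[int]:
    """Convert 'Y years, M months, D days' to a total day count, or None."""
    match = DURATION_RE.search(text)
    if match is None:
        return None
    years, months, days = (int(part) for part in match.groups())
    return years * DAYS_PER_YEAR + months * DAYS_PER_MONTH + days
```

The original string is left untouched, so it can still be displayed exactly as the member typed it.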

Deduplication

The two tracker tabs overlap significantly (about 557 applicants appear in both). We deduplicate by a fingerprint of username, AOR date, and stream. Where a row appears in both tabs, the second tab’s version wins, because it carries the additional ITA date column. This yields roughly 692 unique applicants (613 + 636 − 557) rather than the inflated ~1,249 that would result from treating the tabs as entirely separate.
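The merge can be sketched as a dictionary keyed by the fingerprint, filled from the first tab and then overwritten by the second. The field names are hypothetical, and lowercasing the username before comparison is an assumption.

```python
from typing import Optional

Row = dict[str, Optional[str]]


def fingerprint(row: Row) -> tuple:
    """Identity key: username, AOR date, and stream."""
    username = (row["username"] or "").strip().lower()  # assumption: case-insensitive
    return (username, row["aor_date"], row["stream"])


def merge_tabs(first_tab: list[Row], second_tab: list[Row]) -> list[Row]:
    """Deduplicate two tabs; the second tab wins on any conflict."""
    merged: dict[tuple, Row] = {}
    for row in first_tab:
        merged[fingerprint(row)] = row
    for row in second_tab:
        merged[fingerprint(row)] = row  # overwrites the first-tab version
    return list(merged.values())


tab1 = [
    {"username": "maple_42", "aor_date": "2025-05-27", "stream": "CEC"},
    {"username": "north_star", "aor_date": "2025-04-02", "stream": "CEC"},
]
tab2 = [
    {"username": "Maple_42", "aor_date": "2025-05-27", "stream": "CEC", "ita_date": "2025-03-10"},
    {"username": "loon_7", "aor_date": "2025-06-11", "stream": "CEC"},
]
unique = merge_tabs(tab1, tab2)  # 3 applicants, not 4
```

The same inclusion-exclusion arithmetic applies to the real tabs: 613 + 636 − 557 = 692.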

Incomplete timelines

Many rows in the tracker are still in progress: the applicant has an AOR but no eCOPR yet. We store these rows and display them in the community timeline table, but we exclude them from any median or average calculation. A day-count computed on an in-progress case would not represent how long the process actually takes; it would just represent how long someone has been waiting so far, which is a different and less useful figure.

Community medians (completed eCOPR cases only)

The community median processing time for any stream or cohort is computed only from applicants who have already received their eCOPR (i.e. their AOR-to-eCOPR day count is greater than zero). We never mix completed and in-progress cases when computing a median. The cohort count displayed next to each median is the number of such completed journeys it is derived from.
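In code, the rule amounts to filtering before taking the median. This is a sketch with hypothetical names; `None` stands for an in-progress case with no eCOPR yet.

```python
from statistics import median
from typing import Optional


def community_median(day_counts: list[Optional[int]]) -> tuple[Optional[float], int]:
    """Median AOR-to-eCOPR days over completed cases, plus the cohort count."""
    # Keep only completed journeys: a recorded, strictly positive day count.
    completed = [d for d in day_counts if d is not None and d > 0]
    if not completed:
        return None, 0
    return median(completed), len(completed)


# Two in-progress or invalid entries (None, 0, -3) are ignored.
result = community_median([120, None, 90, 0, 150, -3])
```

Here `result` is the median of 90, 120, and 150 together with a cohort count of 3, which is the number shown next to the median.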

Cohorts are built by matching on stream and, where available, AOR month. If your exact AOR month has fewer completed cases than the minimum cohort size required for a reliable figure, we fall back to the nearest available month. The confidence tier displayed (Strong, Moderate, Weak, Very weak) reflects the cohort size.
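The nearest-month fallback could be sketched as below. The threshold value, the `"YYYY-MM"` key format, and the tie-break toward the earlier month are all assumptions; the text above does not specify them.

```python
from typing import Optional

MIN_COHORT_SIZE = 5  # hypothetical threshold; the real minimum is not stated


def month_index(ym: str) -> int:
    """Convert 'YYYY-MM' to a running month number for distance comparisons."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month)


def pick_cohort_month(
    target: str,
    completed_by_month: dict[str, int],
    min_size: int = MIN_COHORT_SIZE,
) -> Optional[str]:
    """Return the target month if large enough, else the nearest month that is."""
    eligible = [m for m, n in completed_by_month.items() if n >= min_size]
    if not eligible:
        return None
    # Nearest by month distance; ties resolve to the earlier month (assumption).
    return min(eligible, key=lambda m: (abs(month_index(m) - month_index(target)), m))


counts = {"2025-01": 40, "2025-02": 3, "2025-04": 25}
chosen = pick_cohort_month("2025-02", counts)  # February is too small
```

With these sample counts, February falls back to January, which is one month away, rather than April, which is two.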

People-ahead estimates

The “people ahead of you” figure is derived from IRCC’s operational processing data, which reports how many applications IRCC had received by each intake month. We find the cohort corresponding to your AOR month (or the closest available month) and sum the applicants received before yours. This gives a rough queue position, not an exact count: IRCC’s data has the rounding and suppression described in section 3.
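A minimal sketch of the queue-position sum, assuming intake counts keyed by `"YYYY-MM"` month strings. The sample figures are invented for illustration and are not IRCC data.

```python
def people_ahead(aor_month: str, received_by_month: dict[str, int]) -> int:
    """Sum applications received in intake months strictly before the AOR month."""
    # "YYYY-MM" strings sort chronologically, so plain string comparison works.
    return sum(n for month, n in received_by_month.items() if month < aor_month)


# Invented intake counts, for illustration only.
intake = {"2025-01": 1200, "2025-02": 950, "2025-03": 1100}
ahead = people_ahead("2025-03", intake)
```

Because the comparison is on the month string, an AOR month missing from the data still gets a sensible sum over all earlier months.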

Projections

The projected decision date shown on your profile is simply your AOR date plus IRCC’s published processing-time standard for your application category. It is not a model; it is basic arithmetic on a published figure. If IRCC’s standard changes, the projection changes with it on the next refresh. We label every projection “an estimate, never a promise” precisely because IRCC’s standard is itself a target, not a guarantee.
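The arithmetic is a single date addition. The sketch assumes the standard is expressed in days; if it is published in months, it would need converting first. The 180-day figure is only an example, not IRCC's actual standard.

```python
from datetime import date, timedelta


def projected_decision_date(aor: date, standard_days: int) -> date:
    """AOR date plus the published processing-time standard, in days."""
    return aor + timedelta(days=standard_days)


# Example only: a 180-day standard applied to a 2025-05-27 AOR.
projection = projected_decision_date(date(2025, 5, 27), 180)
```

With these inputs the projection lands on 2025-11-23.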

Follow one row’s journey

Every number on this site started as a cell someone typed in a Google Spreadsheet. Here is what happens between that moment and the number appearing on your screen.

  1. Typed in the Sheet. A community member adds or updates their row directly in the public Google Spreadsheet. Dates, stream names, and milestones are entered by the applicant themselves.
  2. Read hourly by our ingest cron. Once per hour, our ingest Worker reads the Sheet via the Google Sheets API values endpoint, which always returns raw cell data regardless of any view filters a member may have left open.
  3. Stream name normalized. Stream names are entered freely: “CEC_General”, “CEC General”, and “CEC (Trade)” all map to the canonical “CEC” label. We normalize every variant before storing or displaying anything.
  4. Dates parsed across every format. Dates appear in at least three formats across the Sheet. We try each known format in order and treat anything we cannot parse as absent. All dates are nullable throughout the app.
  5. Deduped against the other tab. The Sheet has two overlapping tabs. We dedupe by a fingerprint of username, AOR date, and stream. Where a row appears in both, the second tab wins because it carries the extra ITA date column.
  6. Stored in D1. Every row we read is upserted into our Cloudflare D1 database with a stable ID, so each hourly run updates it in place instead of creating a duplicate. This full archive is what the journeys-tracked total counts.
  7. Checked when we compute the numbers. When we build the Insights medians and averages, we set aside any row whose dates are impossible, for example an AOR date recorded after the PPR date. That is the only plausibility rule, and it only applies to rows that record both of those dates.
  8. Counted and shown. Rows that pass the check feed the stream medians and insights, and every row appears in the community timeline table. That is the full journey from a typed cell to a data point that helps the next applicant.
  Branch (if chronologically impossible): Honestly excluded from the analyzed numbers. If a row has dates that cannot be reconciled (for example, an AOR recorded after the PPR), we leave it out of the computed medians rather than trust a number we cannot verify. The row stays in the database and still appears in the community tracker; it just never skews the analyzed figures.
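The plausibility check in the steps above can be sketched as a single predicate plus a split into analyzed and excluded rows. The field names are hypothetical; treating a same-day AOR and PPR as possible is an assumption, since the text only rules out an AOR after the PPR.

```python
from datetime import date
from typing import Optional


def is_chronologically_possible(aor: Optional[date], ppr: Optional[date]) -> bool:
    """False only when both dates are recorded and the AOR falls after the PPR."""
    if aor is None or ppr is None:
        return True  # the rule applies only to rows that record both dates
    return aor <= ppr


def split_rows(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition rows into (analyzed, excluded) without discarding any of them."""
    analyzed, excluded = [], []
    for row in rows:
        ok = is_chronologically_possible(row.get("aor"), row.get("ppr"))
        (analyzed if ok else excluded).append(row)
    return analyzed, excluded


rows = [
    {"id": "a", "aor": date(2025, 1, 10), "ppr": date(2025, 6, 1)},
    {"id": "b", "aor": date(2025, 6, 1), "ppr": date(2025, 1, 10)},  # impossible
    {"id": "c", "aor": date(2025, 2, 3), "ppr": None},  # still in progress
]
analyzed, excluded = split_rows(rows)
```

Returning the excluded rows alongside the analyzed ones, instead of dropping them, is what allows the excluded branch to be counted and shown.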

Why we show the excluded branch. Silently dropping bad rows would inflate our confidence. Showing exactly what was excluded, and why, is how we keep every total honest and auditable.

3. What we exclude, and why

Records excluded from metrics

The following rows are stored but excluded from any computed metric:

  • Rows where the AOR-to-eCOPR day count is zero or negative (completed timelines only; in-progress rows are excluded by definition, as described above).
  • Rows where essential identity fields (username, AOR date, or stream) could not be parsed. These are stored for completeness but cannot be fingerprinted or cohorted.
  • Cohorts below the minimum size threshold are not shown for medians: we suppress the figure entirely rather than show a number based on too few data points.

Community data is self-selected

The community tracker is maintained by volunteers. Participation is opt-in. People who update the tracker regularly tend to be more engaged with their case status, which may mean faster or more attentive applicants are over-represented. The dataset is not a random sample of Express Entry applicants. Community medians may differ from IRCC’s aggregate figures, and that difference is expected. Use them as a rough reference from people in a similar situation, not as a statistical guarantee.

Confidence tiers

Every community median is labelled with one of four confidence tiers based on the number of completed eCOPR cases in the cohort:

| Tier | Cohort size | What it means |
| --- | --- | --- |
| Strong | 500 or more | Large enough to be fairly stable; the median moves little as new rows arrive. |
| Moderate | 50 to 499 | Reasonable reference, but can shift noticeably with a few new entries. |
| Weak | 10 to 49 | Directionally useful but treat it carefully; a handful of outliers can move it. |
| Very weak | Fewer than 10 | Shown for completeness; the figure can be heavily skewed by one or two cases. |
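The tier boundaries translate directly into a small lookup. The function name is hypothetical; the thresholds are the ones listed above.

```python
def confidence_tier(completed_cases: int) -> str:
    """Map the number of completed eCOPR cases in a cohort to its tier label."""
    if completed_cases >= 500:
        return "Strong"
    if completed_cases >= 50:
        return "Moderate"
    if completed_cases >= 10:
        return "Weak"
    return "Very weak"
```

For example, a cohort of 49 completed cases is labelled Weak, and one more case moves it to Moderate.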

4. How we keep it honest

Estimates that mislead anxious people are harmful. Here is what we do to avoid that.

Rounding and suppression

Counts are rounded to the nearest five to avoid implying false precision in small populations. Any count below five is suppressed entirely rather than displayed. This follows Statistics Canada’s standard for protecting privacy and preventing over-interpretation of thin data.
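A sketch of the rounding and suppression rule, assuming suppression is decided on the raw count before rounding. The function name is hypothetical, and `None` stands for a suppressed value.

```python
from typing import Optional


def display_count(raw: int) -> Optional[int]:
    """Round a count to the nearest five; suppress anything below five."""
    if raw < 5:
        return None  # suppressed: too few cases to display
    # Integer arithmetic avoids floating-point rounding surprises.
    return (raw + 2) // 5 * 5
```

Under this rule 7 displays as 5, 8 displays as 10, and 4 is not displayed at all.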

“As of” timestamps

Every figure that comes from a dated source carries a visible “as of” timestamp so you can see how fresh it is. IRCC processing times are published infrequently and can be several months old. The “as of” date tells you when IRCC last updated the figure, not when we last checked.

Estimates are never promises

Every projection and estimate on this site is labelled as such. The phrase “an estimate, never a promise” appears wherever we display a projection derived from IRCC’s standard or from the community median. IRCC does not guarantee processing times, and neither do we.

What “private” means for your timeline

If you create an application inside Track Your App without claiming a row in the community tracker, that application is visible only to you. It lives in our database, scoped to your user account, and no one else can read it.

If you claim a row in the community tracker, you are linking your account to a row that is already publicly visible in the Google Sheet. Editing milestone dates or visa-office details after claiming will write those changes back to the public Sheet. Only claim a row that is already yours, and only if you are comfortable with the information in that row being visible to anyone who reads the Sheet.

Independence

Track Your App is an independent community project. It is not affiliated with IRCC or the Government of Canada. The numbers we display from IRCC sources are reproduced faithfully, without editorial adjustment. Where we disagree with a published figure we say so explicitly rather than silently changing it. Where we cannot verify a figure we say so.

Questions about the data or how a specific figure is computed? Reach us via the contact page.
