The model
A gravity model of student flows
We model each origin–destination flow as a gravity-style regression. This is the project's step-3 design: the specification below describes how the collected text and spatial data would be used to answer the research question. It is documented, not estimated.
+ controls(gdp_gap, price_level_gap, distance, language_affinity)
+ origin FE + destination FE + εij
Language affinity is a gravity control (a bilateral friction that directly shapes flows), not an instrument. Origin and destination fixed effects absorb broad country-level differences. The estimate is descriptive, not causal.
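For orientation, the full specification might be written out as follows. This is a sketch, not the project's confirmed equation: the left-hand side and the three β terms are assumptions reconstructed from labels used elsewhere on this page (β₁ salary gap, β₂ perceived quality of life, β₃ institution quality; log(1+flow) for OLS/WLS, flow level for PPML). The symbols γ, α_i and δ_j are notation introduced here for the control coefficients and the origin and destination fixed effects, and the variable name quality_of_life is a placeholder.

```latex
% Log-linear form (OLS/WLS), outcome log(1 + flow)
\log(1 + \mathrm{flow}_{ij}) =
    \beta_1\,\mathrm{earnings\_gap}
  + \beta_2\,\mathrm{quality\_of\_life}
  + \beta_3\,\mathrm{institution\_quality}
  + \gamma^{\top}\,\mathrm{controls}(\mathrm{gdp\_gap},\ \mathrm{price\_level\_gap},\ \mathrm{distance},\ \mathrm{language\_affinity})
  + \alpha_i + \delta_j + \varepsilon_{ij}

% PPML form, outcome is the flow level; same linear index inside the exponential
\mathbb{E}\left[\mathrm{flow}_{ij} \mid \cdot\right] =
  \exp\!\left(
      \beta_1\,\mathrm{earnings\_gap}
    + \beta_2\,\mathrm{quality\_of\_life}
    + \beta_3\,\mathrm{institution\_quality}
    + \gamma^{\top}\,\mathrm{controls}
    + \alpha_i + \delta_j
  \right)
```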
Explanatory Variables (the βs)
Each term in the model is listed with its data source. "built" means the data is already in the processed pipeline; "planned" means the term is part of the described design (text layer, rankings, language groupings).
Explanatory – labour-market pull
Source: Eurostat net earnings (earnings_gap, sector_earnings_2018.csv)
Expected: + (higher destination pay attracts flows)
Explanatory – perceived liveability
Source: Reddit r/Erasmus, sentiment + themes via OpenRouter (DeepSeek) + Pydantic (text layer)
Expected: + (better perceived life attracts flows)
Explanatory – academic pull
Source: QS World University Rankings 2023, geocoded + snapped to destination city (institution_quality.csv)
Expected: + (stronger universities attract flows)
Control – origin–destination language affinity (gravity friction)
Source: Shared language-family dummy (geo.py LANGUAGE_FAMILY)
Expected: + for shared/affine language family
Control – income difference
Source: World Bank GDP per capita (gdp_gap_usd_wb)
Expected: ambiguous
Control – cost of living
Source: Eurostat price levels (price_level_gap)
Expected: − (costlier destinations deter flows)
Control – gravity friction
Source: Origin-country centroid → destination-city great-circle km (geo.py)
Expected: − (farther destinations deter flows)
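The distance control is a great-circle distance in kilometres. A minimal sketch of one standard way to compute it (the haversine formula) is shown here. The function name, the Earth-radius constant, and the example coordinates are illustrative assumptions, not the contents of geo.py.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Approximate, illustrative coordinates only (not values from the pipeline).
origin_centroid = (51.0, 10.0)     # rough centroid of an origin country
destination_city = (41.39, 2.17)   # rough location of a destination city
distance_km = great_circle_km(*origin_centroid, *destination_city)
```

The haversine form treats the Earth as a sphere, which is usually accurate enough for a gravity-model friction term.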
Model Results
Every coefficient (β) — headline PPML gravity model
All estimated terms from the cross-sectional PPML model (Poisson on the flow level, city-clustered SEs), plus the planned terms not yet built. Terms significant at p < 0.05 are shown in the accent colour. Effect / 1 SD is the % change in flow for a one-standard-deviation rise in each variable, which makes the βs comparable regardless of their raw units.
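Because a PPML model is exponential in its linear index, a common way to turn a coefficient into an "effect per 1 SD" figure is (exp(β × SD) − 1) × 100. The sketch below assumes that convention and a sample standard deviation; the dashboard's exact formula is not stated, and the function name and numbers are hypothetical.

```python
import math
import statistics


def effect_per_sd_pct(beta, values):
    """Percent change in expected flow for a one-SD rise in a regressor.

    Assumes an exponential-mean (PPML) model, so the effect is
    exp(beta * sd) - 1, expressed in percent.
    """
    sd = statistics.stdev(values)  # sample SD; an assumption
    return (math.exp(beta * sd) - 1.0) * 100.0


# Hypothetical coefficient and regressor values, for illustration only.
example_effect = effect_per_sd_pct(0.1, [1.0, 2.0, 3.0])
```

Scaling by the standard deviation is what lets coefficients on variables measured in euros, kilometres, and index points sit in one column.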
Quality-of-life coefficient across specifications
β₂ across all specifications. The outcome is log(1+flow) for OLS/WLS and the flow level for PPML; p-values are cluster-robust. The headline specification is the city-level two-stage model, in which panel destination attractiveness is explained by the Erasmusu rating.
Panel gravity — year fixed effects (2014–2023)
Destination-pull over time from the panel PPML, relative to 2014. The dip at 2020 is the COVID collapse in mobility; recovery follows from 2021.
Selected Country Overview
Use the sending-country selector in the top bar. Main flow definition: Higher Education learners, study mobility activity, age 18+, cross-border flows only.
University Flow Map
Mobility Over Time (2014–2023)
Higher Education study mobility by calendar year of mobility start, from the spatially enriched Erasmus+ panel (Väisänen et al. 2025). This is the temporal backbone for the panel gravity design.
All countries — total outgoing
Total cross-border HE study participants per year. The 2020 drop reflects the COVID-19 disruption to mobility.
Selected country — top destinations
Yearly flows from the selected sending country to its largest destinations. Use the sending-country selector in the top bar.
University-Level Flow Detail
Top Destinations
GDP Gap and Flow Share — control
GDP gap = receiving-country GDP per capita minus sending-country GDP per capita. Flow share = flow to that destination divided by all outgoing flows from the selected sending country.
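The two definitions above can be sketched in a few lines. The function names and the numbers are hypothetical and only illustrate the arithmetic.

```python
def gdp_gap(receiving_gdp_pc, sending_gdp_pc):
    """Receiving-country GDP per capita minus sending-country GDP per capita."""
    return receiving_gdp_pc - sending_gdp_pc


def flow_shares(outgoing_flows):
    """Share of each destination in all outgoing flows of one sending country."""
    total = sum(outgoing_flows.values())
    if total == 0:
        return {dest: 0.0 for dest in outgoing_flows}
    return {dest: count / total for dest, count in outgoing_flows.items()}


# Made-up figures for one sending country.
example_gap = gdp_gap(receiving_gdp_pc=48000.0, sending_gdp_pc=30000.0)
example_shares = flow_shares({"DE": 600, "FR": 300, "IT": 100})
```

A positive gap means the destination is richer than the origin; the shares for one sending country sum to 1.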
Selected Country Destination Table
Flow Heatmap
Rows and columns use the same country order: countries are sorted by total Erasmus flow volume among the displayed top countries.
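One way to produce such a shared ordering is sketched below. It assumes that "total flow volume" means outgoing plus incoming counts and breaks ties alphabetically; both choices, and the function names, are assumptions rather than the dashboard's documented behaviour.

```python
from collections import defaultdict


def heatmap_order(flows, top_n=None):
    """Order countries by total flow volume (outgoing + incoming), descending."""
    totals = defaultdict(int)
    for (origin, dest), count in flows.items():
        totals[origin] += count
        totals[dest] += count
    ordered = sorted(totals, key=lambda c: (-totals[c], c))
    return ordered if top_n is None else ordered[:top_n]


def heatmap_matrix(flows, order):
    """Square matrix with rows (origins) and columns (destinations) in one order."""
    return [[flows.get((o, d), 0) for d in order] for o in order]


# Toy flows keyed by (origin, destination); not real Erasmus counts.
toy_flows = {("A", "B"): 10, ("B", "A"): 5, ("A", "C"): 3, ("C", "B"): 1}
order = heatmap_order(toy_flows)
matrix = heatmap_matrix(toy_flows, order)
```

Using one ordering for both axes keeps the diagonal meaningful, so reciprocal flows sit mirrored across it.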
Flow-Weighted Destination Economics
Click column headers to sort. Destination metrics are weighted by each sending country's flow counts.
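The flow weighting can be read as a weighted average over a sending country's destinations. The sketch below assumes that reading and skips destinations whose metric is missing; the function name and the figures are hypothetical.

```python
def flow_weighted_mean(flows_by_dest, metric_by_dest):
    """Average a destination metric, weighting each destination by its flow count.

    Destinations with a missing metric (absent or None) are skipped.
    Returns None when no destination contributes any weight.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for dest, flow in flows_by_dest.items():
        value = metric_by_dest.get(dest)
        if value is None:
            continue
        weighted_sum += flow * value
        weight_total += flow
    return weighted_sum / weight_total if weight_total else None


# Made-up flows and price-level values for one sending country.
example_mean = flow_weighted_mean({"X": 3, "Y": 1}, {"X": 100.0, "Y": 200.0})
```

Weighting by flows means the figure reflects where students from that country actually go, not a simple average over all possible destinations.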
Expected Earnings by Economic Sector — evidence for β₁ (salary gap)
Mean monthly gross earnings by NACE economic sector, Structure of Earnings Survey 2018. This is a salary-by-sector reference, not salary by field of study (Eurostat does not publish earnings by field of study). Coverage is limited to the 17 countries that report this dataset.
Perceived Quality of Life — evidence for β₂ (Reddit r/Erasmus)
Destination mentions extracted from r/Erasmus posts and comments with OpenRouter (DeepSeek) + Pydantic (city/country, sentiment, themes). Net sentiment = (positive − negative) / mentions. Score-weighted net also weights by extraction confidence and log Reddit score. Per-country counts are small, so read this as a qualitative perception signal, not a precise measure. Click headers to sort.
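The two sentiment measures can be sketched as follows. Net sentiment follows the formula above. The weight used for the score-weighted version, confidence × (1 + log(1 + score)), is an assumption, since the exact weighting is not given here; the field names and sample mentions are also hypothetical.

```python
import math

SIGN = {"positive": 1, "neutral": 0, "negative": -1}


def net_sentiment(mentions):
    """(positive - negative) / mentions for one destination."""
    if not mentions:
        return None
    return sum(SIGN[m["sentiment"]] for m in mentions) / len(mentions)


def score_weighted_net(mentions):
    """Net sentiment weighted by extraction confidence and log Reddit score.

    The weight confidence * (1 + log1p(score)) is an assumed form;
    negative scores are floored at zero.
    """
    weights = [m["confidence"] * (1.0 + math.log1p(max(m["score"], 0)))
               for m in mentions]
    total = sum(weights)
    if total == 0:
        return None
    signed = sum(w * SIGN[m["sentiment"]] for w, m in zip(weights, mentions))
    return signed / total


# Hypothetical extracted mentions for one destination.
sample = [
    {"sentiment": "positive", "confidence": 0.9, "score": 12},
    {"sentiment": "positive", "confidence": 0.8, "score": 3},
    {"sentiment": "negative", "confidence": 0.95, "score": 40},
    {"sentiment": "neutral", "confidence": 0.7, "score": 1},
]
plain = net_sentiment(sample)
weighted = score_weighted_net(sample)
```

In this sample the highly upvoted negative mention pulls the weighted figure below the plain one, which is the kind of difference the second column is meant to show.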
Data Sources and Measurement Notes
- Erasmus flows: Erasmus-KA1-Mobility-Data-2022.xlsx, sheet KA1 mobilities 2022. Filtered to Higher Education study mobility.
- Flow scope: Erasmus mobility is modeled as cross-border origin–destination movement. Same-country rows in the KA1 file are extremely rare and treated as coding artifacts, not as a domestic Erasmus layer.
- HE grants: annual-report-2024-statistical-annex.xlsx, sheet KA1_estimated paxs, Call Year 2022, Higher Education rows.
- GDP per capita: World Bank current US$ is used for the dashboard GDP-gap plot because it covers more Erasmus partner countries; Eurostat GDP remains in the processed dataset.
- Price levels and net earnings: Eurostat API. Missing values are shown as n/a.
- Scholarship proxy: grant per participant is applicant-country-level, not the exact grant received by individual students or country pairs. Treat the social-elevator angle as descriptive, not causal.
- Quality of life (built): Reddit r/Erasmus posts and comments, with destination, sentiment, and themes extracted via OpenRouter (DeepSeek) + Pydantic. Aggregated to per-country net sentiment. Feeds β₂. Per-country counts are small; treat as a qualitative perception layer.
- Institution quality (planned): destination university ranking score. Feeds β₃.
- Language affinity (planned): Mediterranean / Anglo-Saxon / Germanic country groupings used as an origin–destination affinity control (gravity friction), not an instrument.