import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import plotly.express as px
import plotly.io as pio
pio.renderers.default = "notebook"

Hakki
January 30, 2026
The World Happiness Report surveys more than 150 countries and measures factors such as GDP, health, social support, freedom, trust, and generosity. Using the 2015 data, I examine how these factors relate to happiness scores and look at both expected patterns and surprising deviations across regions.
***
Data Source: This analysis uses the World Happiness Report dataset, provided by the Sustainable Development Solutions Network and curated by Abigail Larion on Kaggle. Licensed under CC0.
|  | Country | Region | Happiness Rank | Happiness Score | Standard Error | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Switzerland | Western Europe | 1 | 7.587 | 0.03411 | 1.39651 | 1.34951 | 0.94143 | 0.66557 | 0.41978 | 0.29678 | 2.51738 |
| 1 | Iceland | Western Europe | 2 | 7.561 | 0.04884 | 1.30232 | 1.40223 | 0.94784 | 0.62877 | 0.14145 | 0.43630 | 2.70201 |
| 2 | Denmark | Western Europe | 3 | 7.527 | 0.03328 | 1.32548 | 1.36058 | 0.87464 | 0.64938 | 0.48357 | 0.34139 | 2.49204 |
| 3 | Norway | Western Europe | 4 | 7.522 | 0.03880 | 1.45900 | 1.33095 | 0.88521 | 0.66973 | 0.36503 | 0.34699 | 2.46531 |
| 4 | Canada | North America | 5 | 7.427 | 0.03553 | 1.32629 | 1.32261 | 0.90563 | 0.63297 | 0.32957 | 0.45811 | 2.45176 |
I explored the dataset to better understand its structure and content.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 158 entries, 0 to 157
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country 158 non-null object
1 Region 158 non-null object
2 Happiness Rank 158 non-null int64
3 Happiness Score 158 non-null float64
4 Standard Error 158 non-null float64
5 Economy (GDP per Capita) 158 non-null float64
6 Family 158 non-null float64
7 Health (Life Expectancy) 158 non-null float64
8 Freedom 158 non-null float64
9 Trust (Government Corruption) 158 non-null float64
10 Generosity 158 non-null float64
11 Dystopia Residual 158 non-null float64
dtypes: float64(9), int64(1), object(2)
memory usage: 14.9+ KB
I checked the dataset for missing values and verified the data types. There are no null values: each of the 12 columns has 158 non-null entries.
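A minimal sketch of how such a check could be written with pandas. The three rows are copied from the preview table above; the variable names `sample`, `missing_per_column`, and `has_missing` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Three rows copied from the preview table; the full dataset has 158 rows.
sample = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark"],
    "Region": ["Western Europe"] * 3,
    "Happiness Rank": [1, 2, 3],
    "Happiness Score": [7.587, 7.561, 7.527],
})

# Count missing values per column; all zeros means nothing is missing.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Show the dtype pandas inferred for each column.
print(sample.dtypes)

# A single boolean that can guard the rest of the analysis.
has_missing = bool(sample.isna().any().any())
print("Any missing values:", has_missing)
```

The same three calls should work unchanged on the full DataFrame.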
# Rename columns for easier access
df_2015 = df_2015.rename(columns={
    "Happiness Rank": "Rank", "Happiness Score": "Score", "Standard Error": "SE",
    "Economy (GDP per Capita)": "GDP", "Health (Life Expectancy)": "Health",
    "Trust (Government Corruption)": "Trust", "Dystopia Residual": "DR",
})
df_2015["Year"] = 2015
df_2015.head(10)
|  | Country | Region | Rank | Score | SE | GDP | Family | Health | Freedom | Trust | Generosity | DR | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Switzerland | Western Europe | 1 | 7.587 | 0.03411 | 1.39651 | 1.34951 | 0.94143 | 0.66557 | 0.41978 | 0.29678 | 2.51738 | 2015 |
| 1 | Iceland | Western Europe | 2 | 7.561 | 0.04884 | 1.30232 | 1.40223 | 0.94784 | 0.62877 | 0.14145 | 0.43630 | 2.70201 | 2015 |
| 2 | Denmark | Western Europe | 3 | 7.527 | 0.03328 | 1.32548 | 1.36058 | 0.87464 | 0.64938 | 0.48357 | 0.34139 | 2.49204 | 2015 |
| 3 | Norway | Western Europe | 4 | 7.522 | 0.03880 | 1.45900 | 1.33095 | 0.88521 | 0.66973 | 0.36503 | 0.34699 | 2.46531 | 2015 |
| 4 | Canada | North America | 5 | 7.427 | 0.03553 | 1.32629 | 1.32261 | 0.90563 | 0.63297 | 0.32957 | 0.45811 | 2.45176 | 2015 |
| 5 | Finland | Western Europe | 6 | 7.406 | 0.03140 | 1.29025 | 1.31826 | 0.88911 | 0.64169 | 0.41372 | 0.23351 | 2.61955 | 2015 |
| 6 | Netherlands | Western Europe | 7 | 7.378 | 0.02799 | 1.32944 | 1.28017 | 0.89284 | 0.61576 | 0.31814 | 0.47610 | 2.46570 | 2015 |
| 7 | Sweden | Western Europe | 8 | 7.364 | 0.03157 | 1.33171 | 1.28907 | 0.91087 | 0.65980 | 0.43844 | 0.36262 | 2.37119 | 2015 |
| 8 | New Zealand | Australia and New Zealand | 9 | 7.286 | 0.03371 | 1.25018 | 1.31967 | 0.90837 | 0.63938 | 0.42922 | 0.47501 | 2.26425 | 2015 |
| 9 | Australia | Australia and New Zealand | 10 | 7.284 | 0.04083 | 1.33358 | 1.30923 | 0.93156 | 0.65124 | 0.35637 | 0.43562 | 2.26646 | 2015 |
The columns were renamed for readability, and a Year column was added for future comparisons with other years.
Region
Australia and New Zealand 7.285000
North America 7.273000
Western Europe 6.689619
Latin America and Caribbean 6.144682
Eastern Asia 5.626167
Middle East and Northern Africa 5.406900
Central and Eastern Europe 5.332931
Southeastern Asia 5.317444
Southern Asia 4.580857
Sub-Saharan Africa 4.202800
Name: Score, dtype: float64
Although Western Europe dominates the Top 10 list with seven countries, North America and Australia/New Zealand have higher regional averages.
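The top-10 count can be checked directly. The sketch below rebuilds the ten rows shown in the earlier `head(10)` output and counts countries per region; `top10` and `region_counts` are illustrative names.

```python
import pandas as pd

# The ten highest-ranked countries, copied from the head(10) table.
top10 = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark", "Norway", "Canada",
                "Finland", "Netherlands", "Sweden", "New Zealand", "Australia"],
    "Region": (["Western Europe"] * 4            # Switzerland to Norway
               + ["North America"]               # Canada
               + ["Western Europe"] * 3          # Finland to Sweden
               + ["Australia and New Zealand"] * 2),
    "Score": [7.587, 7.561, 7.527, 7.522, 7.427,
              7.406, 7.378, 7.364, 7.286, 7.284],
})

# Count how many top-10 countries each region contributes.
region_counts = top10["Region"].value_counts()
print(region_counts)
```

On the full data, something like `df_2015.nsmallest(10, "Rank")["Region"].value_counts()` would give the same counts without retyping the rows.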
count mean median
Region
Sub-Saharan Africa 40 4.202800 4.272
Southern Asia 7 4.580857 4.565
Middle East and Northern Africa 20 5.406900 5.262
Central and Eastern Europe 29 5.332931 5.286
Southeastern Asia 9 5.317444 5.360
Eastern Asia 6 5.626167 5.729
Latin America and Caribbean 22 6.144682 6.149
Western Europe 21 6.689619 6.937
North America 2 7.273000 7.273
Australia and New Zealand 2 7.285000 7.285
The regional summary shows that North America and Australia/New Zealand each include only two countries, so their high average and median scores rest on very small samples.
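A summary of this shape can be produced with a grouped aggregation. This is a sketch on six rows taken from the tables above, so only the Australia and New Zealand group is complete; the other counts and averages differ from the full-data output. The names `sample` and `regional_summary` are illustrative.

```python
import pandas as pd

# Six rows from the tables above. Only Australia and New Zealand is complete;
# North America is missing its second country in this sample.
sample = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Denmark",
                "Canada", "New Zealand", "Australia"],
    "Region": ["Western Europe"] * 3 + ["North America"]
              + ["Australia and New Zealand"] * 2,
    "Score": [7.587, 7.561, 7.527, 7.427, 7.286, 7.284],
})

# Country count, mean, and median score per region, ordered by median.
regional_summary = (
    sample.groupby("Region")["Score"]
    .agg(["count", "mean", "median"])
    .sort_values("median")
)
print(regional_summary)
```

Keeping `count` next to `mean` and `median` makes it easy to see when an average is based on only one or two countries.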
Next, I use a correlation matrix to explore how each variable relates to the happiness score.
|  | Rank | Score | SE | GDP | Family | Health | Freedom | Trust | Generosity | DR |
|---|---|---|---|---|---|---|---|---|---|---|
| Rank | 1.000000 | -0.992105 | 0.158516 | -0.785267 | -0.733644 | -0.735613 | -0.556886 | -0.372315 | -0.160142 | -0.521999 |
| Score | -0.992105 | 1.000000 | -0.177254 | 0.780966 | 0.740605 | 0.724200 | 0.568211 | 0.395199 | 0.180319 | 0.530474 |
| SE | 0.158516 | -0.177254 | 1.000000 | -0.217651 | -0.120728 | -0.310287 | -0.129773 | -0.178325 | -0.088439 | 0.083981 |
| GDP | -0.785267 | 0.780966 | -0.217651 | 1.000000 | 0.645299 | 0.816478 | 0.370300 | 0.307885 | -0.010465 | 0.040059 |
| Family | -0.733644 | 0.740605 | -0.120728 | 0.645299 | 1.000000 | 0.531104 | 0.441518 | 0.205605 | 0.087513 | 0.148117 |
| Health | -0.735613 | 0.724200 | -0.310287 | 0.816478 | 0.531104 | 1.000000 | 0.360477 | 0.248335 | 0.108335 | 0.018979 |
| Freedom | -0.556886 | 0.568211 | -0.129773 | 0.370300 | 0.441518 | 0.360477 | 1.000000 | 0.493524 | 0.373916 | 0.062783 |
| Trust | -0.372315 | 0.395199 | -0.178325 | 0.307885 | 0.205605 | 0.248335 | 0.493524 | 1.000000 | 0.276123 | -0.033105 |
| Generosity | -0.160142 | 0.180319 | -0.088439 | -0.010465 | 0.087513 | 0.108335 | 0.373916 | 0.276123 | 1.000000 | -0.101301 |
| DR | -0.521999 | 0.530474 | 0.083981 | 0.040059 | 0.148117 | 0.018979 | 0.062783 | -0.033105 | -0.101301 | 1.000000 |
The correlation analysis shows that GDP (r ≈ 0.78), family support (r ≈ 0.74), and health (r ≈ 0.72) have the strongest positive relationships with happiness scores, while generosity (r ≈ 0.18) has the weakest of the six factors.
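One way to read a ranking like this off the data is to take the `Score` column of the correlation matrix and sort it. The sketch below uses six rows copied from tables in this post, so the coefficients it prints will not match the full-data matrix above; `sample` and `score_corr` are illustrative names.

```python
import pandas as pd

# Six rows copied from tables in this post (renamed columns).
sample = pd.DataFrame({
    "Country": ["Switzerland", "Canada", "Qatar",
                "Slovenia", "North Cyprus", "Estonia"],
    "Score": [7.587, 7.427, 6.611, 5.848, 5.695, 5.429],
    "GDP": [1.39651, 1.32629, 1.69042, 1.18498, 1.20806, 1.15174],
    "Family": [1.34951, 1.32261, 1.07860, 1.27385, 1.07008, 1.22791],
    "Health": [0.94143, 0.90563, 0.79733, 0.87337, 0.92356, 0.77361],
})

# Pearson correlation matrix over the numeric columns only.
corr_matrix = sample.select_dtypes(include="number").corr()

# Correlation of each factor with Score, strongest positive first.
score_corr = corr_matrix["Score"].drop("Score").sort_values(ascending=False)
print(score_corr)
```

Dropping the `Score` entry removes the trivial self-correlation of 1.0 from the ranking.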
This scatter plot illustrates the relationship between GDP and happiness scores, showing a clear positive association.
We cannot draw meaningful conclusions for North America and Australia/New Zealand because each includes only two countries. Western Europe and the Middle East and Northern Africa (MENA), however, show more interesting boxplot patterns. In the MENA region, there appear to be two distinct groups: higher-scoring countries such as Israel, the UAE, and Oman, and lower-scoring countries such as Syria, Yemen, and Egypt. The median sits toward the lower end of the box, which is likely influenced by countries like Yemen and Syria.
Similar differences can be observed in Western Europe, where the Nordic countries contrast with countries such as Greece and Portugal. In this region, the median sits toward the higher end.
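As a rough numeric cross-check on these two regions, the mean and median from the regional summary table can be compared. This is only a proxy for where the median sits inside a box, not a substitute for the boxplot; the column name `mean_minus_median` is illustrative.

```python
import pandas as pd

# Mean and median scores copied from the regional summary table above.
summary = pd.DataFrame(
    {"mean": [5.406900, 6.689619], "median": [5.262, 6.937]},
    index=["Middle East and Northern Africa", "Western Europe"],
)

# Positive gap: the median lies below the mean.
# Negative gap: the median lies above the mean.
summary["mean_minus_median"] = summary["mean"] - summary["median"]
print(summary)
```

The gap is positive for MENA and negative for Western Europe, which matches the direction of the two observations above.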
When we consider this plot together with the previous one, we can see that Latin American countries have GDP levels similar to those of Central and Eastern Europe and the MENA region, yet their happiness scores are comparable to those of Western European countries. This suggests that factors other than GDP may be influencing happiness, so it is worth examining other correlations.
***
When I examined the other correlations, I observed that Latin American countries have higher Dystopia Residual values than other regions. After researching this further, I found that this phenomenon is discussed in the literature as the “Latin America Happiness Paradox.”
More information on this concept can be found below.
https://www.happinessandwellbeing.org/rojas
https://www.mappmagazine.com/articles/the-well-being-paradox
|  | Country | Region | Rank | Score | SE | GDP | Family | Health | Freedom | Trust | Generosity | DR | Year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 71 | Hong Kong | Eastern Asia | 72 | 5.474 | 0.05051 | 1.38604 | 1.05818 | 1.01328 | 0.59608 | 0.37124 | 0.39478 | 0.65429 | 2015 |
| 27 | Qatar | Middle East and Northern Africa | 28 | 6.611 | 0.06257 | 1.69042 | 1.07860 | 0.79733 | 0.64040 | 0.52208 | 0.32573 | 1.55674 | 2015 |
| 72 | Estonia | Central and Eastern Europe | 73 | 5.429 | 0.04013 | 1.15174 | 1.22791 | 0.77361 | 0.44888 | 0.15184 | 0.08680 | 1.58782 | 2015 |
| 65 | North Cyprus | Western Europe | 66 | 5.695 | 0.05635 | 1.20806 | 1.07008 | 0.92356 | 0.49027 | 0.14280 | 0.26169 | 1.59888 | 2015 |
| 54 | Slovenia | Central and Eastern Europe | 55 | 5.848 | 0.04251 | 1.18498 | 1.27385 | 0.87337 | 0.60855 | 0.03787 | 0.25328 | 1.61583 | 2015 |
Are these countries less happy than expected given their measured factors?
The happiness of the top-scoring countries is mostly explained by measurable factors like GDP and health. Latin American countries, on the other hand, have high residual scores. This suggests that other factors, not captured by these six variables, contribute to their happiness.
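In the rows shown in this post, the six factor columns plus `DR` add up to `Score` to within rounding, so the residual's share of the score can be computed directly. The sketch below uses two rows copied from the tables above; no Latin American rows appear in this post, so none are included. The column names `explained`, `reconstructed`, and `residual_share` are illustrative.

```python
import pandas as pd

factors = ["GDP", "Family", "Health", "Freedom", "Trust", "Generosity"]

# Two rows copied from the tables above (renamed columns).
rows = pd.DataFrame({
    "Country": ["Switzerland", "Hong Kong"],
    "Score": [7.587, 5.474],
    "GDP": [1.39651, 1.38604],
    "Family": [1.34951, 1.05818],
    "Health": [0.94143, 1.01328],
    "Freedom": [0.66557, 0.59608],
    "Trust": [0.41978, 0.37124],
    "Generosity": [0.29678, 0.39478],
    "DR": [2.51738, 0.65429],
})

# Part of the score attributed to the six measured factors.
rows["explained"] = rows[factors].sum(axis=1)

# Adding the residual back should reproduce Score up to rounding.
rows["reconstructed"] = rows["explained"] + rows["DR"]

# Fraction of the score carried by the residual.
rows["residual_share"] = rows["DR"] / rows["Score"]
print(rows[["Country", "Score", "explained", "reconstructed", "residual_share"]])
```

Sorting the full DataFrame by `residual_share` would be one way to see which countries rely most on the unexplained component.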
Next, I focus on identifying countries with high happiness scores despite low levels of trust and freedom.
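A simple filter can shortlist such countries before plotting. This is a sketch under assumptions: the thresholds are sample medians chosen for illustration, the rows are seven countries copied from the tables above, and a country qualifies if it is low on at least one of trust or freedom, because no row in this small sample is low on both.

```python
import pandas as pd

# Seven rows copied from tables in this post (renamed columns).
sample = pd.DataFrame({
    "Country": ["Switzerland", "Iceland", "Qatar", "Slovenia",
                "North Cyprus", "Hong Kong", "Estonia"],
    "Score": [7.587, 7.561, 6.611, 5.848, 5.695, 5.474, 5.429],
    "Freedom": [0.66557, 0.62877, 0.64040, 0.60855, 0.49027, 0.59608, 0.44888],
    "Trust": [0.41978, 0.14145, 0.52208, 0.03787, 0.14280, 0.37124, 0.15184],
})

# Illustrative thresholds: the sample median of each column.
high_score = sample["Score"] > sample["Score"].median()
low_trust = sample["Trust"] < sample["Trust"].median()
low_freedom = sample["Freedom"] < sample["Freedom"].median()

# High score combined with low trust or low freedom.
outliers = sample[high_score & (low_trust | low_freedom)]
print(outliers)
```

Replacing `|` with `&` gives the stricter reading, low on both trust and freedom, which returns no rows for this sample.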
The bubbles for Central and Eastern Europe and Sub-Saharan Africa appear intertwined, and their sizes vary due to other factors. Rwanda, in particular, stands out among the Sub-Saharan African countries as a small bubble in the upper-right corner. This raises questions about potential data quality or country-specific measurement effects.