Covid-19 Insights

Mortality Data Insights

Aims Grid

Purpose

To unlock the impact of the Covid-19 to UK government for decision support and automate them to reduced manual timeĀ  spent in data gathering.

Stakeholders

  • UK Government
  • NGOs
  • Data & Analytics Team

End Result

An automated dashboard providing quick & latestĀ  insights in order to support data driven decision making

Success Criteria

  • Dashboard(s) uncovering Covid-19 insights with latest data available
  • UK government able to take better decisions & save 10% more lives

Data reference:
Our world in data: https://ourworldindata.org/covid-deaths

Data Cleansing & Transformation (SQL)

To create the necessary data model for doing analysis and fulfilling the business needs defined in the user stories the following tables were extracted using SQL.

The data source (covid) were provided in .csv format by the owid github repo, and were connected in the data model in a later step of the process.
Below are the SQL statements for cleansing and transforming necessary data.

sales:


Use Covid;

Select Location, date, total_cases, new_cases, total_deaths, population
From Covid..CovidDeaths
Where continent is not null 
order by 1,2

-- Total Cases vs Total Deaths
-- Shows mortality rate from Covid-19 in the United Kingdom

Select Location, date, total_cases,total_deaths, (total_deaths/total_cases)*100 as DeathPercentage
From Covid..CovidDeaths
Where location like '%kingdom%'
and continent is not null 
order by 1,2 DESC


-- Total Cases vs Population
-- Shows what percentage of population infected with Covid

Select Location, date, Population, total_cases,  (total_cases/population)*100 as PercentPopulationInfected
From Covid..CovidDeaths
Where location like '%kingdom%'
order by 1,2 DESC


-- Countries with Highest Infection Rate compared to Population
Select Location, Population, MAX(total_cases) as HighestInfectionCount,  Max((total_cases/population))*100 as PercentPopulationInfected
From Covid..CovidDeaths
Where location like '%kingdom%'
Group by Location, Population
order by PercentPopulationInfected desc


-- Countries with Highest Death Count per Population
Select Location, MAX(cast(Total_deaths as int)) as TotalDeathCount
From Covid..CovidDeaths
--Where location like '%kingdom%'
Where continent is not null 
Group by Location
order by TotalDeathCount desc


-- BREAKING THINGS DOWN BY CONTINENT

-- Showing contintents with the highest death count per population

Select continent, MAX(cast(Total_deaths as int)) as TotalDeathCount
From Covid..CovidDeaths
--Where location like '%kingdom%'
Where continent is not null 
Group by continent
order by TotalDeathCount desc


-- GLOBAL NUMBERS

Select SUM(new_cases) as total_cases, SUM(cast(new_deaths as int)) as total_deaths, SUM(cast(new_deaths as int))/SUM(New_Cases)*100 as DeathPercentage
From Covid..CovidDeaths
--Where location like '%states%'
where continent is not null 
--Group By date
order by 1,2



-- Total Population vs Vaccinations
-- Shows Percentage of Population that has recieved at least one Covid Vaccine

Select dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations
, SUM(CONVERT(int,vac.new_vaccinations)) OVER (Partition by dea.Location Order by dea.location, dea.Date) as RollingPeopleVaccinated
--, (RollingPeopleVaccinated/population)*100  Can't perform this yet 
From Covid..CovidDeaths dea
Join Covid..CovidVaccinations vac
	On dea.location = vac.location
	and dea.date = vac.date
where dea.continent is not null 
order by 2,3


-- Using CTE to perform Calculation on Partition By in previous query

With PopvsVac (Continent, Location, Date, Population, New_Vaccinations, RollingPeopleVaccinated)
as
(
Select dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations
, SUM(CONVERT(int,vac.new_vaccinations)) OVER (Partition by dea.Location Order by dea.location, dea.Date) as RollingPeopleVaccinated
--, (RollingPeopleVaccinated/population)*100
From Covid..CovidDeaths dea
Join Covid..CovidVaccinations vac
	On dea.location = vac.location
	and dea.date = vac.date
where dea.continent is not null 
--order by 2,3
)
Select *, (RollingPeopleVaccinated/Population)*100
From PopvsVac



-- Using Temp Table to perform Calculation on Partition By in previous query

DROP Table if exists #PercentPopulationVaccinated
Create Table #PercentPopulationVaccinated
(
Continent nvarchar(255),
Location nvarchar(255),
Date datetime,
Population numeric,
New_vaccinations numeric,
RollingPeopleVaccinated numeric
)

Insert into #PercentPopulationVaccinated
Select dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations
, SUM(CONVERT(int,vac.new_vaccinations)) OVER (Partition by dea.Location Order by dea.location, dea.Date) as RollingPeopleVaccinated
--, (RollingPeopleVaccinated/population)*100
From Covid..CovidDeaths dea
Join Covid..CovidVaccinations vac
	On dea.location = vac.location
	and dea.date = vac.date
--where dea.continent is not null 
--order by 2,3

Select *, (RollingPeopleVaccinated/Population)*100 as PercentPopulationVaccinated
From #PercentPopulationVaccinated




-- Creating View to store data for later visualizations

Create View PercentPopulationVaccinated as
Select dea.continent, dea.location, dea.date, dea.population, vac.new_vaccinations
, SUM(CONVERT(int,vac.new_vaccinations)) OVER (Partition by dea.Location Order by dea.location, dea.Date) as RollingPeopleVaccinated
--, (RollingPeopleVaccinated/population)*100
From Covid..CovidDeaths dea
Join Covid..CovidVaccinations vac
	On dea.location = vac.location
	and dea.date = vac.date
where dea.continent is not null 

Data Cleansing & Transformation (Power Query)

The data (covid) from the SQL server was exported to Power BI for further cleaning and transformation.
Below are the Power Query for cleansing and transforming necessary data.


Market Table:
-- Removing blank data rows in market zone column
= Table.SelectRows(sales_markets, each ([zone] <> ""))


Transactions Table:
-- Removing null data rows of -1 and 0 in transactions column
= Table.SelectRows(sales_transactions, each ([sales_amount] <> -1 and [sales_amount] <> 0))

-- Adding new column to convert sales amount in USD to INR
= Table.AddColumn(#"Filtered Rows", "n_sales_amount", each if [currency] = "USD" or [currency] = "USD#(cr)" then [sales_amount]*75 else [sales_amount])

Data Model

Below is a screenshot of the data model after cleansed and prepared tables were read into Power BI.

This data model also shows star schema of the dimension data from the customer, calendar, products and transactions connected to the facts table.

Star Schema

Covid Dashboard

The finished covid dashboard with one page works as a dashboard and overview.

Other Projects

Gadget Sales Analysis

Tech Stack:
+ Python
+ Pandas
+ Jupyter Notebook

Computer Hardware Manufacturer

Tech Stack:
+ Power BI
+ SQL
+ Power Query