Czech record-level mortality data by vaccination status (part 1) - sars2.net

Links to other parts: czech2.html, czech3.html, czech4.html.

Contents

Background

In March 2024 Stanislav Veselý received a FOI response of vaccination data from the Institute of Health Information and Statistics of the Czech Republic: https://www.skirsch.com/covid/CzechFOIA.pdf. The dataset includes rows for about 11 million people, who have columns for the year of birth, date of death, dates of vaccination for doses 1-7, and the type and batch number of each dose.

The data was already uploaded to GitHub in March 2024, but it was almost completely overlooked until Steve Kirsch published his analysis of the data in July 2024: https://kirschsubstack.com/p/breaking-record-level-data-from-czech, https://github.com/skirsch/Czech.

Files used in my analysis

Full record-level data:

Bucket files:

Population and deaths:

COVID data published by the Czech Ministry of Health:

Other:

Were Pfizer and Moderna vaccines allocated randomly?

In an earlier version of the GitHub repository about the Czech data, Kirsch wrote: "Vaccines were randomly distributed for those wishing to get vaxxed. [...] People were not allowed to select which vaccines they got. [...] The randomization of which vaccine someone got created a perfect real-world randomized clinical trial where we could compute the mortality rates for 1 year after Dose 2 for the two most popular vaccines." [https://github.com/skirsch/Czech/blob/5725ac1b64ede7124e00b72af68892f31736b349/README.md]

However I didn't find any source which said that the vaccines were actually allocated randomly in the Czech Republic.

In the FOIA record-level the average year of birth is about 1973 for people whose vaccine type for the second dose was "Comirnaty" (Pfizer) but about 1966 for "SPIKEVAX" (Moderna), which indicates that the vaccine types were not allocated randomly. And there's also about 6 million people whose second vaccine type was Comirnaty but only about 500,000 people whose second vaccine type was Spikevax:

> system("curl -Ls https://github.com/skirsch/Czech/raw/main/data/CR_records.csv.xz|xz -dc>CR_records.csv")
> rec=data.table::fread("CR_records.csv")
> rec[,.(year=mean(Rok_narozeni),.N),.(type=OckovaciLatka_2)][order(year)]|>print(r=F)
                                      type     year       N
                                 VAXZEVRIA 1952.506  439705 # AstraZeneca
                                   Valneva 1953.500       2
                Comirnaty Omicron XBB.1.5. 1960.761      67
                         Nuvaxovid XBB 1.5 1964.500       2
                                  SPIKEVAX 1965.842  517783 # Moderna
                  COVID-19 Vaccine Janssen 1969.026     271
                                   Covovax 1971.517      29
                                 Comirnaty 1972.682 5519975 # Pfizer
      Comirnaty Original/Omicron BA.4/BA.5 1973.527     300
                                   Sinovac 1974.657     178
           Comirnaty Original/Omicron BA.1 1976.048     230
                                 Sinopharm 1979.538      39
                                 Nuvaxovid 1980.135    5098
                                 Sputnik V 1981.300      10
   Spikevax bivalent Original/Omicron BA.1 1981.375      16
                                Covishield 1984.713      94
                                           1989.098 4544448 # people with no second dose listed
                                   COVAXIN 1990.500       4
 SPIKEVAX BIVALENT ORIGINAL/OMICRON BA.4-5 1992.000       2
           Comirnaty Omicron XBB.1.5. 5-11 2017.000       2
                            Comirnaty 6m-4 2019.902      92
            COMIRNATY OMICRON XBB.1.5 6m-4 2020.840      25
                                      type     year       N

Kirsch included these comments in the file Pfizer v. Moderna mortality by age.xlsx:

A Simpson's paradox example here that really surprised us

There is a really interesting simpson's paradox due to ratios. For each individual age, the MRR (mortality rate ratio) rarely go above 2.25. Yet when you divide the CMR for Moderna/Pfizer, you get 2.25.

You can see this in the calculation on the Combined but skip tab at the bottom.

But of course our result isn't a simpson's paradox because we computed the CMR ratios by single digit age groups and five year brackets and plotted it.

However people who received a Moderna vaccine were older on average than people who received a Pfizer vaccine, which explains why Kirsch's ratio is higher for all ages aggregated together than for individual age groups.

Added later: Kirsch later edited the README file at GitHub and he told me: "i've changed randomly to non-systematically in the github. sorry for the error."

Bucket analysis

The following code generates a file for deaths and person-days grouped by ongoing month, month of vaccination, weeks since vaccination, single year of age, dose number, and vaccine type.

The record-level data only has a year of birth for each person but not a date of birth, so here I generated a random date of birth for each person.

library(data.table)

# unique apply (faster for long vectors with many repeated values)
ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}

# fast way to get a floored number of years between dates after 1900 and before 2100
age=\(x,y){class(x)=class(y)=NULL;(y-x-(y-789)%/%1461+(x-789)%/%1461)%/%365}

mindate=as.Date("2020-1-1");maxdate=as.Date("2022-12-31")

rec=fread("Czech/data/CR_records.csv")
t=rec[,.(id=1:.N,death=DatumUmrti)]
set.seed(0);t$birth=ua(paste0(rec$Rok_narozeni,"-1-1"),as.Date)+sample(0:364,nrow(t),T)
t=t[rep(1:nrow(t),8),]
t$dose=rep(0:7,each=nrow(rec))
t$date=c(rep(mindate,nrow(rec)),rec[,`class<-`(unlist(.SD,,F),"Date"),.SDcols=paste0("Datum_",1:7)])
t$type=c(rep("",nrow(rec)),rec[,unlist(.SD,,F),.SDcols=paste0("OckovaciLatka_",1:7)])
t=t[!is.na(date)][date<=maxdate][order(-date)]
t$vaxmonth=ua(t$date,substr,1,7)

name1=unique(t$type);name2=rep("Other",length(name1))
name2[name1==""]=""
name2[grep("comirnaty",ignore.case=T,name1)]="Pfizer"
name2[grep("spikevax",ignore.case=T,name1)]="Moderna"
name2[grep("nuvaxovid",ignore.case=T,name1)]="Novavax"
name2[name1=="COVID-19 Vaccine Janssen"]="Janssen"
name2[name1=="VAXZEVRIA"]="AstraZeneca"
t$type=name2[match(t$type,name1)]

rm(rec) # free up memory so the script won't have to use swap

buck=data.table()
for(day in as.list(seq(mindate,maxdate,1))){
  cat(as.character(day),"\n")
  buck=rbind(buck,unique(t[date<=day&(is.na(death)|day<=death)&day>=birth],by="id")[,.( # remove under earlier doses
  # buck=rbind(buck,t[date<=day&(is.na(death)|day<=death)&day>=birth][!(dose==0&id%in%id[dose>0])][,.( # keep under earlier doses
    month=substr(day,1,7),vaxmonth,week=ifelse(type=="",0,as.numeric(day-date)%/%7),
    age=age(birth,day),dose,type,alive=1,dead=death==day)])[,.(
      alive=sum(alive),dead=sum(dead,na.rm=T)),by=.(month,vaxmonth,week,age,dose,type)]
}

setorder(buck,month,vaxmonth,week,age,dose,type)
fwrite(buck,"czbuckets.csv",quote=F)

Here's different versions of the output produced by the script:

My code is similar to the buckets.py script provided by Kirsch except my code accounts for the aging of people over time correctly. In buckets.py each person has a constant age that is either the age on the day of death for people who died or the age on the day when the script was ran for people who didn't die (and in both cases the age is calculated incorrectly as a floored division of the age in days by 365). [https://github.com/skirsch/Czech/blob/main/code/buckets.py] So if for example the buckets.py script is ran in July 2024, the age of someone who was born in January 1950 and who didn't die is treated as 74 even during 2020.

Deaths and population estimates by single year of age from Eurostat

Eurostat has yearly deaths and population estimates for Czech Republic by single year of age: https://ec.europa.eu/eurostat/data/database. The following code combines the two datasets into a more conveniently formatted single CSV file:

api="https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/"

pop=read.table(paste0(api,"demo_pjan?format=TSV"),sep="\t",header=T,check.names=F,na.strings=":")
pop=pop[grep("^A,NR,(?!(TOTAL|UNK)).*,T,CZ",pop[,1],perl=T),]
age=c(as.numeric(head(sub(".*,Y(.*?),.*","\\1",pop[,1]),-2)),0,100)
flat=as.numeric(sub(":? *\\w?$","",unlist(pop[,-1])))
d1=data.frame(year=rep(as.numeric(colnames(pop)[-1]),each=nrow(pop)),age,pop=flat)

dead=read.table(paste0(api,"demo_magec?format=TSV"),sep="\t",header=T,check.names=F,na.strings=":")
dead=dead[grep(",T,(?!(TOTAL|UNK)).*,CZ",dead[,1],perl=T),]
age=c(as.numeric(head(sub(".*,Y(.*?),.*","\\1",dead[,1]),-2)),0,100)
flat=as.numeric(unlist(dead[,-1]))
d2=data.frame(year=rep(as.numeric(colnames(dead)[-1]),each=nrow(dead)),age,dead=flat)

me=merge(d1,d2,all=T)
me=me[order(me$year,me$age),]
write.csv(me,"czpopdead.csv",quote=F,row.names=F,na="")

The output looks like this: [f/czpopdead.csv]

$ curl -Ls sars2.net/f/czpopdead.csv|sed 3q
year,age,pop,dead
1960,0,126977,2581
1960,1,139022,222

In my CSV file age 100 means ages 100 and above. The population estimates are for January 1st and not mid-year estimates.

This dataset has resident population estimates on December 31st 2022 by age group, region, and sex: https://data.gov.cz/datová-sada?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatové-sady%2F00025593%2Fa129a5408e8e5fd99497e9a22c39775e. The total population size in the dataset is identical to Eurostat's population estimate for January 1st 2023:

> system("wget csu.gov.cz/docs/107508/a53bbc83-5e04-5a74-36f9-549a090a806e/130142-24data051724.zip;unzip 130142-24data051724.zip")
> pop=fread("130181-23data2022.csv")
> pop[uzemi_typ=="stát"&is.na(vek_txt)&pohlavi_txt=="",hodnota] # region type is whole country, age is empty (all ages), and sex is empty (both males and females)
[1] 10827529
> fread("http://sars2.net/f/czpopdead.csv")[year==2023,sum(pop)]
[1] 10827529

The Czech Statistical Office has published Excel files which show the yearly number of deaths by ICD code, age group, and region: https://csu.gov.cz/produkty/zemreli-podle-seznamu-pricin-smrti-pohlavi-a-veku-v-cr-krajich-a-okresech-fgjmtyk2qr. In the years 2020 to 2022, the yearly number of deaths in the Excel files is identical to the data published by Eurostat, and both are otherwise identical to the record-level data except the record-level data is missing a single death in 2021:

> rec=fread("curl -Ls github.com/skirsch/Czech/raw/main/data/CR_records.csv.xz|xz -dc")
> reclev=rec[,.(reclev=.N),.(year=year(DatumUmrti))]
> eurostat=fread("http://sars2.net/f/czpopdead.csv")[,.(eurostat=sum(dead)),year]
> icd=fread("http://sars2.net/f/czicd.csv.gz")[,.(icd=sum(dead)),year]
> merge(icd,merge(reclev,eurostat))|>print(r=F)
year    icd reclev eurostat
2020 129289 129289   129289
2021 139891 139890   139891
2022 120219 120219   120219

Simple yearly ASMR calculation

When I used Eurostat's data to calculate the yearly ASMR among the total population of the Czech Republic, it was about 1343 for 2021 and 1143 for 2022. My standard population was Eurostat's Czech population estimates for January 1st 2022:

> pop=read.csv("http://sars2.net/f/czpopdead.csv")
> std=pop$pop[pop$year==2022]
> asmr=tapply(pop$dead/pop$pop*std[pop$age+1]/sum(std)*1e5,pop$year,sum)
> round(tail(asmr))
2017 2018 2019 2020 2021 2022
1143 1134 1104 1244 1343 1143

In the record-level data unvaccinated people had an ASMR of about 2064 in 2021 and 1865 in 2022, but it might partially be because of the healthy vaccinee effect because people with 1 or 2 doses also had high ASMR in 2022:

> buck=fread("http://sars2.net/f/czbuckets.csv.gz")
> a=aggregate(buck[,4:5],list(year=substr(buck$month,1,4),dose=buck$dose,age=buck$age),sum)
> m=tapply(a$dead/a$alive*std[pmin(a$age,100)+1]/sum(std)*365e5,a[,1:2],sum,na.rm=T)
> round(m)
      dose
year      0    1    2   3    4    5  6  7
  2020 1245    0    0  NA   NA   NA NA NA
  2021 2078 1068  885 610 4837    0  0 NA
  2022 1875 2099 1773 889  866 2206  0  0

This shows the excess ASMR as percentage of ASMR the same year among the total Czech population:

> round((m/c(tail(asmr,4))-1)*100)
      dose
year    0    1    2   3   4    5    6    7
  2020 13 -100 -100  NA  NA   NA   NA   NA
  2021 67   -3  -23 -55 289 -100 -100   NA
  2022 40   69   60 -22 -36   77 -100 -100

Simple ASMR by vaccine type during the first year after the first dose

In the code below I used a simplified but somewhat inaccurate method to calculate ASMR within the first 365 days from the first dose, so that I treated the age of each person as their year of birth subtracted from 2021, and I didn't remove people from the population size after they died. I only included people who got the first dose before 2022 so all people had at least 365 days of follow-up time to die after vaccination, since the record-level data only includes deaths up to the end of 2022.

I got about 32% higher ASMR for Moderna than Pfizer, but they were still both far below the ASMR of the total Czech population in 2021 and 2022:

> std=fread("http://sars2.net/f/czpopdead.csv")[year==2022,pop]
> system("curl -Ls https://github.com/skirsch/Czech/raw/main/data/CR_records.csv.xz|xz -dc>CR_records.csv")
> rec=fread("CR_records.csv")
> cmr=rec[Datum_1<as.Date("2022-1-1"),.(cmr=mean(!is.na(DatumUmrti)&DatumUmrti<Datum_1+365),.N),.(age=pmin(100,2021-Rok_narozeni),type=OckovaciLatka_1)]
> cmr[,.(asmr=sum(cmr*std[age+1]/sum(std)*1e5),N=sum(N)),type][order(-N)]
                        type      asmr       N
 1:                Comirnaty  787.3562 5461526 # Pfizer
 2:                 SPIKEVAX 1028.3506  516562 # Moderna
 3:                VAXZEVRIA  978.4670  447437 # AstraZeneca
 4: COVID-19 Vaccine Janssen 1325.4320  404824
 5:                  Sinovac    0.0000     178
 6:               Covishield    0.0000     115
 7:                Sinopharm    0.0000      36
 8:                  Covovax    0.0000      28
 9:                Sputnik V    0.0000       9
10:                  COVAXIN    0.0000       5

The output above shows that there's almost as many people whose first dose was Janssen as people whose first dose was Moderna, but I got a much higher ASMR for Janssen than Moderna.

Almost all people got the second vaccine from the same type as the first vaccine, except the schedule for the Janssen vaccine only included a single dose:

> rec[,.(type1=OckovaciLatka_1,type2=OckovaciLatka_2)][,.N,.(type1,type2)][order(-N)][1:16]
                                   type1     type2       N
 1:                            Comirnaty Comirnaty 5517414 # Pfizer
 2:                                                4046219 # unvaccinated
 3:                             SPIKEVAX  SPIKEVAX  517152 # Moderna
 4:                            VAXZEVRIA VAXZEVRIA  439653 # AstraZeneca
 5:             COVID-19 Vaccine Janssen            412314
 6:                            Comirnaty             68726
 7:                             SPIKEVAX              8467
 8:                            VAXZEVRIA              5524
 9:                            Nuvaxovid Nuvaxovid    5090
10:           Comirnaty Omicron XBB.1.5.              1924
11:                            VAXZEVRIA Comirnaty    1801
12:                             SPIKEVAX Comirnaty     736
13: Comirnaty Original/Omicron BA.4/BA.5               665
14:                            VAXZEVRIA  SPIKEVAX     329
15:                            Nuvaxovid               323
16:                            Comirnaty  SPIKEVAX     300

Ther's many people who got a Janssen vaccine for the first dose but who later got a booster from another vaccine type, but their field for the second dose is blank and the booster is listed as the third dose:

> rec[,.(type1=OckovaciLatka_1,type2=OckovaciLatka_2,type3=OckovaciLatka_3)][grep("Jans",type1),.N,.(type1,type2,type3)][order(-N)]
                       type1 type2                                     type3      N
 1: COVID-19 Vaccine Janssen                                                 243484
 2: COVID-19 Vaccine Janssen                                       Comirnaty 123603
 3: COVID-19 Vaccine Janssen                                        SPIKEVAX  40641
 4: COVID-19 Vaccine Janssen            Comirnaty Original/Omicron BA.4/BA.5   1877
 5: COVID-19 Vaccine Janssen                        COVID-19 Vaccine Janssen   1518
 6: COVID-19 Vaccine Janssen                 Comirnaty Original/Omicron BA.1    599
 7: COVID-19 Vaccine Janssen                      Comirnaty Omicron XBB.1.5.    436
 8: COVID-19 Vaccine Janssen         Spikevax bivalent Original/Omicron BA.1     70
 9: COVID-19 Vaccine Janssen                                       Nuvaxovid     66
10: COVID-19 Vaccine Janssen                                       VAXZEVRIA      9
11: COVID-19 Vaccine Janssen                 Comirnaty Omicron XBB.1.5. 5-11      6
12: COVID-19 Vaccine Janssen       SPIKEVAX BIVALENT ORIGINAL/OMICRON BA.4-5      4
13: COVID-19 Vaccine Janssen                               Nuvaxovid XBB 1.5      1

However the average date of first vaccination was much later for Janssen vaccines than for the other three most popular vaccine types:

> rec[,.(date=rec$Datum_1,type=rec$OckovaciLatka_1)][type!=""&type%in%type[rowid(type)==1e4]][,.(date=mean(date),.N),type][order(date)]
                       type       date       N
1:                VAXZEVRIA 2021-04-02  447450 # AstraZeneca
2:                 SPIKEVAX 2021-05-29  526453 # Moderna
3:                Comirnaty 2021-06-08 5586979 # Pfizer
4: COVID-19 Vaccine Janssen 2021-09-05  412314

And this shows that people who got a Pfizer or Moderna vaccine in late 2021 subsequently also had high ASMR over the next 365 days:

rec=fread("CR_records.csv")
std=fread("http://sars2.net/f/czpopdead.csv")[year==2022,pop]

rec2=rec[Datum_1<as.Date("2022-1-1")]
d=rec2[,.(dead=!is.na(DatumUmrti)&DatumUmrti<Datum_1+365,type=OckovaciLatka_1)]
d$age=pmin(100,2021-rec2$Rok_narozeni)

ua=\(x,fun,...){u=unique(x);fun(u,...)[match(x,u)]}
d$vaxmonth=ua(rec2$Datum_1,substr,1,7)

name1=c("Comirnaty","SPIKEVAX","VAXZEVRIA","COVID-19 Vaccine Janssen")
name2=c("Pfizer","Moderna","AstraZeneca","Janssen")
d$type=ifelse(d$type%in%name1,name2[match(d$type,name1)],"Other")
d$type=factor(d$type,names(sort(table(d$type),T)))

rate=d[,.(rate=mean(dead)),.(age=age,type=type,vaxmonth=vaxmonth)]
round(tapply(rate$rate*std[rate$age+1]/sum(std)*1e5,rate[,2:3],sum))
             vaxmonth
type          2020-12 2021-01 2021-02 2021-03 2021-04 2021-05 2021-06
  Pfizer         1461    1402    1369     740     868     823    1029
  Moderna          NA    3369    1952     975     927    1089    1231
  AstraZeneca      NA      NA    1628     904    1061    1184    1676
  Janssen          NA      NA      NA       0     918    1089    1402
  Other            NA       0       0       0       0       0       0
             vaxmonth
type          2021-07 2021-08 2021-09 2021-10 2021-11 2021-12
  Pfizer         1067    1241    1639    1757    1499    1782
  Moderna        1134    1204    1995    1647    1552    1882
  AstraZeneca    1992    1985     470      96       0       0
  Janssen        1340    1410    1569    1352    1422    1862
  Other             0       0       0       0       0       0

Plot for ASMR by dose and date

In the plot below the ASMR of people with one dose shoots up when the second dose is rolled out, and the ASMR of people with two doses shoots up when the third dose is rolled out. Martin Neil and Norman Fenton speculated that a similar phenomenon in the English ONS data was explained by the so-called cheap trick, where people were categorized under the previous vaccine dose for a certain number of weeks after a new vaccine dose, so that for example a death that occured soon after the second dose would've been classified under the first dose. Jeffrey Morris said that the phenomenon was explained by the healthy vaccinee effect instead, because the healthy vaccinees who move under the nth dose have low mortality, it means that the so-called unhealthy stragglers who remain under dose n-1 will have increased mortality. However the same kind of a phenomenon can also be seen in the Czech data even though it doesn't employ the cheap trick, which makes it seem more likely that Morris was right and Neil and Fenton were wrong. And the New Zealand data released by Barry Young doesn't employ the cheap trick either but the same phenomenon can also be seen in Barry's data.

From the plot below you can also see that during a COVID wave in December 2021, unvaccinated people have a big spike in mortality but there is essentially no spike in mortality in people with 3 doses, and there is only a small spike in the black line which shows the mortality among all vaccinated people. People with 2 doses do have a big increase in mortality in November to December 2021, but it's probably not only because of COVID but also because a lot of people got the third dose so the unhealthy stragglers were left under the second dose:

In the plot above the dark gray line shows the mortality rate among both vaccinated and unvaccinated people in the record-level data, and the light gray line shows the mortality rate based on the weekly number of deaths in 5-year age groups reported by the Czech Statistical Office combined with population estimates from Eurostat. [https://csu.gov.cz/produkty/number-of-deaths-weekly-and-monthly-time-series, https://ec.europa.eu/eurostat/data/database] The discrepancy between the lines might partially be because I used 5-year age groups up to 95+ for the dark gray line but 90+ for the light gray line.

library(data.table);library(ggplot2)

isoweek=\(year,week,weekday=1){d=as.Date(paste0(year,"-1-4"));d-(as.integer(format(d,"%w"))+6)%%7-1+7*(week-1)+weekday}
ma=\(x,b=1,f=b){x[]=rowMeans(embed(c(rep(NA,b),x,rep(NA,f)),f+b+1),na.rm=T);x}

xstart=as.Date("2020-1-1");xend=as.Date("2023-1-1");dates=seq(xstart,xend,1)

std=fread("http://sars2.net/f/czcensus2021pop.csv")[,.(pop=sum(pop)),.(age=pmin(age,95)%/%5*5)][,.(std=pop/sum(pop),age)]

b=fread("http://sars2.net/f/czbucketsdaily.csv.gz")
b=b[,.(alive=sum(alive),dead=sum(dead)),.(date,age=pmin(age,95)%/%5*5,dose=ifelse(dose==0,"Unvaccinated",paste0("Dose ",ifelse(dose>=4,"4+",dose))))]
b=unique(rbind(b,cbind(expand.grid(lapply(b[,1:3],unique)),alive=0,dead=0)),by=1:3)[order(date)]

b=rbind(b,b[,.(alive=sum(alive),dead=sum(dead),dose="All people (record-level data)"),.(age,date)])
b=rbind(b,b[grep("Dose",dose),.(alive=sum(alive),dead=sum(dead),dose="All vaccinated"),.(age,date)])

p=b[,.(date,alive=ma(alive,10),dead=ma(dead,10)),.(age,dose)]
p=merge(p,std)[,.(y=sum(dead/alive*std*365e5,na.rm=T),pop=sum(alive)),.(x=date,z=dose)]

dead=fread("http://sars2.net/f/czweeklydead.csv")
dead=dead[,approx(isoweek(year,week,4),dead/7,dates,rule=2),age][,.(x,dead=y,age)]
pop=fread("http://sars2.net/f/czpopdead.csv")[year>=2010]
pop=pop[,.(pop=sum(pop)),.(age=pmin(90,age%/%5*5),year)]
pop=pop[,spline(as.Date(paste0(year,"-1-1")),pop,xout=dates),age][,.(x,pop=y,age)]
std2=std[,.(std=sum(std)),.(age=pmin(age,90))]
p=rbind(p,merge(std2,merge(dead,pop))[,.(y=sum(dead/pop*std*365e5),pop=NA,z="All people (Czech Statistical Office)"),.(x=as.IDate(x))])

p[,z:=factor(z,unique(z))]
p[pop<2e3,y:=NA]

yend=4e3;ystep=1e3;yend2=100;ystep2=25;secmult=yend/yend2

color=c(hcl(c(210,120,60,0,300)+15,90,50),"black","gray45","gray70")
fill=c(hcl(c(210,120,60,0,300)+15,80,70),"black","gray45","gray70")

label=data.frame(x=xstart+.02*(xend-xstart),y=seq(yend,,-yend/15,nlevels(p$z))-yend/16,label=levels(p$z))

sub="The lines are ±10-day moving averages. People are removed under earlier doses after a new
dose. The standard population is the estimated resident population in the 2021 census by 5-year
age groups. ASMR values are not displayed on days with a population size below 2,000. Sources:
github.com/PalackyUniversity/uzis-data-analysis, scitani.gov.cz/age-structure, ec.europa.eu/
eurostat/data/database, csu.gov.cz/produkty/number-of-deaths-weekly-and-monthly-time-series."

p2=p[grep("Unvaccinated|Dose",z)]
p2=merge(p2,p2[,.(popsum=sum(pop)),x])

ggplot(p2,aes(x,y))+
geom_area(aes(color=z,fill=z,y=pop/popsum*99.999*secmult),linewidth=.1,alpha=.22)+
geom_line(aes(color=z),linewidth=.4)+
geom_line(data=p[grep("All vaccinated",z)],color=color[6],linewidth=.4)+
geom_line(data=p[grep("record-level",z)],color=color[7],linewidth=.4)+
geom_line(data=p[grep("Office",z)],color=color[8],linewidth=.4)+
geom_hline(yintercept=c(0,yend),linewidth=.3,lineend="square")+
geom_vline(xintercept=c(xstart,xend),linewidth=.3,lineend="square")+
geom_label(data=label,aes(x=x,y=y,label=label),fill=alpha("white",.7),label.r=unit(0,"pt"),label.padding=unit(1,"pt"),label.size=0,color=color[1:nrow(label)],size=2.7,hjust=0)+
labs(title="Age-standardized mortality rate in Czech record-level data",subtitle=sub,x=NULL,y="ASMR per 100,000 person-years")+
scale_x_date(limits=c(xstart,xend),breaks=seq(xstart,xend,"2 month"),date_labels="%b\n%y")+
scale_y_continuous(limits=c(0,yend),breaks=seq(0,yend,ystep),labels=\(x)ifelse(x==0,x,paste0(x/1e3,"k")),sec.axis=sec_axis(trans=~./secmult,breaks=seq(0,yend2,ystep2),name="Percentage of people with dose"))+
scale_color_manual(values=color)+
scale_fill_manual(values=fill)+
coord_cartesian(clip="off",expand=F)+
theme(axis.text=element_text(size=6.5,color="black"),
  axis.ticks=element_line(linewidth=.3),
  axis.title.y.left=element_text(margin=margin(0,4,0,0)),
  axis.title.y.right=element_text(margin=margin(0,-1,0,4)),
  axis.ticks.length=unit(3,"pt"),
  axis.title=element_text(size=7.5),
  legend.position="none",
  panel.background=element_rect(fill="white"),
  plot.margin=margin(4,4,4,4),
  plot.subtitle=element_text(size=7,margin=margin(0,0,4,0)),
  plot.title=element_text(size=7.5,face="bold",margin=margin(2,0,4,0)))
ggsave("0.png",width=5,height=3.3,dpi=400*4)
system("magick 0.png -resize 25% 1.png")

Why is there a peak in mortality rate about 30 to 35 weeks after the second dose?

Kirsch included this comment in the file CR time series analysis.xlsx: "MR peaking 35 weeks after the shots #2 for male and female is hard for them to explain. It can't be HVE because HVE doesn't 'peak'."

For example in this plot the mortality rate peaks on week 30 after the second dose:

However it's because the mortality rate of people who remained under the second dose shot up when the third dose was rolled out, and people got the third dose about 30 weeks after the second dose on average:

> rec=fread("CR_records.csv")
> mean(rec$Datum_3-rec$Datum_2,na.rm=T)/7
[1] 29.6133

The average date of third doses is only about 26 weeks later than the average date of second doses:

> as.Date(mean(rec$Datum_2,na.rm=T))
[1] "2021-07-09"
> as.Date(mean(rec$Datum_3,na.rm=T))
[1] "2022-01-04"
> (mean(rec$Datum_3,na.rm=T)-mean(rec$Datum_2,na.rm=T))/7
[1] 25.57143

However people who got both the second and third dose got the second dose much earlier on average than people who only got the second dose (because younger people got vaccinated later and younger people were less likely to get a booster):

> as.Date(mean(rec$Datum_2[!is.na(rec$Datum_3)],na.rm=T))
[1] "2021-06-10" # average date of second dose for people who also got a third dose
> as.Date(mean(rec$Datum_2[is.na(rec$Datum_3)],na.rm=T))
[1] "2021-08-31"" # average date of second dose for people who didn't get a third dose

Excess mortality relative to all people matched by age

Here for each combination of month and dose number, the expected number of deaths was derived by multiplying the number of person-days for each single year of age by the mortality rate that month of the same age among all people who are included in the dataset, and the results for each age were added together to get the baseline number of deaths among all ages.

For example in January 2021 the mortality rate of all people aged 70 in the dataset was about 9.734e-5 deaths per person-days, so the expected number of deaths was about 2.606 (from 9.734e-5*26871). So the excess mortality percent was about (2/2.606-1)*100 which is about -23%. When I repeated the same calculation for all ages, the expected number of deaths for all ages added together was about 523.6. But the actual number of deaths was 326, so the total excess mortality percent was about (326/523.6-1)*100 which is about -38%.

In the plot above the excess mortality of unvaccinated people peaks in December 2021 when there was a COVID wave. And conversely in December 2021 the excess mortality of the "All vaccinated" group is also lower than in the surrounding months.

In the plot above I used a different baseline for each month, so the excess mortality rates were adjusted for seasonal variation in mortality and the effect of COVID waves. But in the next plot I'm using the total mortality rate in 2021-2022 as the baseline throughout the plot, which increases the excess mortality percentages during the COVID wave in December 2021:

In the plot above unvaccinated people have 174% excess mortality in December 2021 but 43% in September 2021, so it's an approximately 1.92-fold increase in mortality (from (173.53+100)/(42.68+100)). But vaccinated people only have an approximately 1.54-fold increase in mortality when December 2021 is compared to September 2021 (from (5.46+100)/(-31.64+100)). In the US Medicare data that Kirsch published in 2023, the spikes in deaths during COVID waves were also bigger in unvaccinated people than vaccinated people.

library(data.table)

b=fread("http://sars2.net/f/czbuckets.csv.gz",showProgress=F)[month>="2020-12"]
b[,dose:=ifelse(dose==0,"Unvaccinated",paste0("Dose ",ifelse(dose>=4,"4+",dose)))]

me=merge(b,b[,.(base=sum(dead)/sum(alive)),.(age,month)]) # use different baseline for each month
# me=merge(b,b[,.(base=sum(dead)/sum(alive)),age]) # use 2021-2022 total as baseline for each month

a=me[,base:=base*alive][,.(alive=sum(alive),dead=sum(dead),base=sum(base)),.(month,dose)]
a=rbind(a,a[,.(dead=sum(dead),alive=sum(alive),base=sum(base),month="Total"),dose])
a=rbind(a,a[dose!="Unvaccinated",.(dead=sum(dead),alive=sum(alive),base=sum(base),dose="All vaccinated"),month])

a$dose=factor(a$dose,c("Unvaccinated",paste0("Dose ",1:3),"Dose 4+","All vaccinated"))

mpop=xtabs(alive~dose+month,a)/365
m=tapply(with(a,(dead-base)/ifelse(dead>base,base,dead)*100),a[,2:1],c)
disp=tapply(100*(a$dead/a$base-1),a[,2:1],round)
hide=mpop<10;m[hide]=disp[hide]=NA
exp=.8;m=abs(m)^exp*sign(m)
maxcolor=300^exp;m[is.infinite(m)]=-maxcolor

pheatmap::pheatmap(m,filename="i1.png",display_numbers=disp,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="white",
  number_color=ifelse((abs(m)>.55*maxcolor)&!is.na(m),"white","black"),
  breaks=seq(-maxcolor,maxcolor,,256),
  colorRampPalette(hsv(rep(c(7/12,0),5:4),c(.9,.75,.6,.3,0,.3,.6,.75,.9),c(.4,.65,1,1,1,1,1,.65,.4)))(256))

exp2=.6;mpop=mpop^exp2;maxcolor2=max(mpop[-nrow(m),-ncol(m)])

kimi=\(x){e=floor(log10(ifelse(x==0,1,abs(x))));e2=pmax(e,0)%/%3+1;p=!is.na(x)
  x[p]=paste0(sprintf(paste0("%.",ifelse(e[p]%%3==0,1,0),"f"),x[p]/1e3^(e2[p]-1)),c("","k","M","B","T")[e2[p]]);x}

pheatmap::pheatmap(mpop,filename="i2.png",display_numbers=kimi(mpop),
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="white",
  number_color=ifelse(mpop>maxcolor2*.45,"white","black"),
  breaks=seq(0,maxcolor2,,256),
  sapply(seq(1,0,,256),\(i)rgb(i,i,i)))

system("w=`identify -format %w i1.png`;pad=38;convert -gravity northwest -pointsize 42 -font Arial \\( -splice x24 -size $[w-pad]x caption:'Czech record-level data: Excess mortality percent relative to all people matched by age and ongoing month. People are removed under earlier doses after a new dose.' -extent $[w-pad]x -gravity center \\) i1.png -gravity northwest \\( -size $[w-40]x caption:'Person-years by dose and month of death' -extent $[w-pad]x -gravity center \\) i2.png -append 1.png")

Triangle plot for excess mortality by month of vaccination and month of death

In the New Zealand data released by Barry Young, I noticed an effect where people who got vacinated during the early part of the main vaccine rollout subsequently had lower excess mortality than people who got vaccinated later on, which I called the "early vaccinee effect". I later also noticed a similar effect in a May 2024 FOI response to Clare Craig, which showed the number of deaths in England by week of vaccination, month of death, and age group: statistic.html#Clare_Craig_May_2024_UKHSA_FOI_response_for_deaths_in_vaccinated_people.

A similar early vaccinee effect is also visible in the Czech data. The monthly number of first doses given peaked in May 2021, so people who got the first vaccine dose during February to May 2021 subsequently had the lowest excess mortality, but people who got vaccinated in June 2021 already had close to 0% excess mortality, and the excess mortality gradually increased to over 100% in people who waited until 2022 to get the first dose:

library(data.table)

b=fread("http://sars2.net/f/czbucketskeep.csv.gz")[,age:=pmin(age,95)]
d=merge(b[dose==1],b[dose<=1,.(base=sum(dead)/sum(alive)),.(age,month)])[,base:=alive*base]
d=d[,.(dead=sum(dead),alive=sum(as.double(alive)),base=sum(base)),.(month,vaxmonth)]
d=rbind(d,d[,.(dead=sum(dead),alive=sum(alive),base=sum(base),month="Total"),vaxmonth])
d=rbind(d,d[,.(dead=sum(dead),alive=sum(alive),base=sum(base),vaxmonth="Total"),month])

mpop=xtabs(alive~vaxmonth+month,d)/365
disp=d[,tapply((dead/base-1)*100,list(vaxmonth,month),round)]
m=d[,tapply((dead-base)*100/ifelse(dead>base,base,dead),list(vaxmonth,month),c)]

hide=mpop<10;m[hide]=disp[hide]=NA
disp[lower.tri(disp)&row(disp)!=nrow(disp)]=""
exp=.8;m=abs(m)^exp*sign(m)
maxcolor=300^exp;m[is.infinite(m)]=-maxcolor

pheatmap::pheatmap(m,filename="mort.png",display_numbers=disp,
  gaps_col=ncol(m)-1,gaps_row=nrow(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,
  cellwidth=17,cellheight=17,fontsize=9,fontsize_number=8,border_color=NA,na_col="white",
  number_color=ifelse(!is.na(m)&abs(m)>.5*maxcolor,"white","black"),
  breaks=seq(-maxcolor,maxcolor,,256),
  colorRampPalette(hsv(rep(c(7/12,0),4:5),c(.9,.75,.6,.3,0,.3,.6,.75,.9),c(.4,.65,1,1,1,1,1,.65,.4)))(256))

ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}

kimi=\(x){e=floor(log10(ifelse(x==0,1,abs(x))));e2=pmax(e,0)%/%3+1
  x[]=ifelse(abs(x)<10,round(x),paste0(round(x/1e3^(e2-1),ifelse(e%%3==0,1,0)),c("","k","M","B","T")[e2]));x}

rec=fread("CR_records.csv")
vaxn=table(ua(rec$Datum_1,substr,1,7))
vaxn["Total"]=sum(vaxn)
pop=as.matrix(vaxn[rownames(m)]);colnames(pop)="New doses"
exp2=.6;maxcolor2=max(head(pop,-1))

pheatmap::pheatmap(pop^exp2,filename="pop.png",display_numbers=kimi(pop),
  gaps_row=nrow(m)-1,cluster_rows=F,cluster_cols=F,legend=F,
  cellwidth=21,cellheight=17,fontsize=9,fontsize_number=8,border_color=NA,na_col="white",
  number_color=ifelse(pop^exp2>maxcolor2^exp2*.4,"white","black"),
  breaks=seq(0,maxcolor2^exp2,,256),sapply(seq(1,0,,256),\(i)rgb(i,i,i)))

system("convert mort.png -crop -177 \\( -gravity northwest pop.png -shave 16x0 -splice 0x5 -gravity west \\) -gravity north +append 00.png;w=`identify -format %w 00.png`;pad=48;convert \\( -pointsize 46 -font Arial-Bold -size $[w-pad]x caption:'Czech record-level data, dose 1: Excess mortality percent up to end of 2022' -gravity north -splice x26 \\) 00.png -append 0.png")
system("convert 0.png -fill black -gravity southwest -size 900x700 -pointsize 40 -font Arial caption:'The x-axis shows the month of death and the y-axis shows the month of vaccination. People are kept included under the first dose even after they have gotten a second dose. The baseline number of deaths for each cell was calculated so that the number of person-days for each combination of age and month was multiplied by the mortality rate among all people included in the record-level data for the same combination of age and month, and the results were added together to get the total expected deaths for each cell. So the baseline is adjusted for seasonal variation in mortality.' -gravity southwest -geometry +60+330 -composite 1.png")

Excess mortality by type of first vaccine

The plot below shows excess mortality by the type of the first vaccine. People with two or more doses almost always got the second dose from the same vaccine type as the first dose, except Janssen vaccines only had a single dose so people who got a Janssen vaccine for the first dose often got a different type of vaccine for the booster.

The plot below shows that even in late 2022, people got a Moderna vaccine for the first dose had much higher excess mortality than people who got a Pfizer vaccine for the first dose. If the difference in mortality would be because of vaccine deaths like Kirsch says, you'd expect the difference to be greatest in the first few weeks or months after vaccination and get weaker over time. But because the difference remains in place more than a year after vaccination, it rather seems to be caused by some confounding factors which I didn't adjust for here:

In the plot above Janssen vaccines get by far the highest excess mortality percentage, but it's probably because the average date of first vaccination is much later for Janssen vaccines than for the other three vaccine types, so the "late vaccinees" who got vaccinated in the second half of 2021 are overrepresented among the people who got a Janssen vaccine.

In the next plot the x-axis shows the month of vaccination and not the month of death, and the number in each cell is the excess mortality up to the end of 2022 and not the excess mortality on the month that is shown on the x-axis. The plot shows that people who got a Pfizer vaccine in late 2021 subsequently also had much higher excess mortality than people who got a Pfizer vaccine in the first half of 2021:

library(data.table)

b=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1][dose==0,type:="Unvaccinated"][,age:=pmin(age,95)]

# first plot
d=merge(b[month>="2020-12"&type!="Other"],b[,.(base=sum(dead)/sum(alive)),.(age,month)])[,base:=base*alive]
d=d[,.(dead=sum(dead),alive=sum(as.double(alive)),base=sum(base)),.(type,month)]
d=rbind(d,d[,.(dead=sum(dead),alive=sum(alive),base=sum(base),month="Total"),type])
d=rbind(d,d[type!="Unvaccinated",.(dead=sum(dead),alive=sum(alive),base=sum(base),type="All vaccinated"),month])
d[,type:=factor(type,unique(c("Unvaccinated","All vaccinated",d[,sum(alive),type][order(-V1)]$type)))]

# # second plot
# d=merge(b[type!="Other"&dose>0],b[,.(base=sum(dead)/sum(alive)),.(age,month)])[,base:=base*alive]
# d=d[,.(dead=sum(dead),alive=sum(as.double(alive)),base=sum(base)),.(type,vaxmonth)]
# d=rbind(d,d[,.(dead=sum(dead),alive=sum(alive),base=sum(base),vaxmonth="Total"),type])
# d=rbind(d,d[,.(dead=sum(dead),alive=sum(alive),base=sum(base),type="All vaccinated"),vaxmonth])
# d[,type:=factor(type,unique(d[,sum(alive),type][order(-V1)]$type))]

mpop=xtabs(d$alive~d[[1]]+d[[2]])/365
m=with(d,tapply((dead-base)/ifelse(dead>base,base,dead)*100,d[,1:2],c))
disp=tapply(100*(d$dead/d$base-1),d[,1:2],round)
m[mpop<10]=disp[mpop<10]=NA
exp=.8;m=abs(m)^exp*sign(m)
maxcolor=300^exp;m[is.infinite(m)]=-maxcolor

pheatmap::pheatmap(m,filename="i1.png",display_numbers=disp,
  gaps_col=ncol(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="white",
  number_color=ifelse((abs(m)>.55*maxcolor)&!is.na(m),"white","black"),
  breaks=seq(-maxcolor,maxcolor,,256),
  colorRampPalette(hsv(rep(c(7/12,0),5:4),c(1,.8,.6,.3,0,.3,.6,.8,1),c(.3,.65,rep(1,5),.65,.3)))(256))

kimi=\(x){e=floor(log10(ifelse(x==0,1,abs(x))));e2=pmax(e,0)%/%3+1;x[]=ifelse(abs(x)<1e3,round(x),
  paste0(sprintf(paste0("%.",ifelse(e%%3==0,1,0),"f"),x/1e3^(e2-1)),c("","k","M","B","T")[e2]));x}

exp2=.6;maxcolor2=max(mpop[-nrow(m),-ncol(m)])

pheatmap::pheatmap(mpop^exp2,filename="i2.png",display_numbers=kimi(mpop),
  gaps_col=ncol(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="white",
  number_color=ifelse(mpop^exp2>maxcolor2^exp2*.45,"white","black"),
  breaks=seq(0,maxcolor2^exp2,,256),
  sapply(seq(1,0,,256),\(i)rgb(i,i,i)))

The plot below is otherwise similar to the first plot except the baseline is the mortality among only unvaccinated people instead of both unvaccinated and vaccinated people. It shows that during the COVID wave in December 2021, people with a first dose from a Moderna vaccine had about 65% lower deaths than unvaccinated people matched by age (even though Kirsch was making it seem like people with a Moderna vaccine were dropping like flies):

Deaths by weeks after first dose

In 2023 Kirsch published a table of data from Medicare which showed the number of deaths by days since the first COVID vaccine given in 2021 that was listed in the database, which was generally the first dose of each person. The table seems to have a bump in deaths about 20-30 days after vaccination, which I speculated might have been because of people who died soon after the second dose, because the second dose was often given about 3-4 weeks after the first dose: [https://kirschsubstack.com/p/game-over-medicare-data-shows-the#%C2%A7the-medicare-data-that-i-received]

In the plot below the dark green points show the same data for deaths in 2021 as the screenshot above. The spline that I fitted to the dark green points seems to have a bump where it's temporarily elevated around day 25 from vaccination, but the red line for deaths in the year 2022 has a more smooth curve with no clear increase around day 25. I though it might have been because most of the doses given in 2022 were booster doses, and people didn't get a second booster dose right after the first booster, so in the scenario where the bump in the dark green line was caused by people dying from the second dose, the bump could've been missing from the red line because people didn't get the second booster soon after the first booster:

I noticed that the Czech data also seems to be a bump in deaths around weeks 2-4 after vaccination (where week 0 consists of the day of vaccination and the next 6 days):

However in the plot above my baseline was not adjusted for seasonal variation in mortality, because I calculated the baseline based on the total mortality rates for each age in 2021-2022. In the plot below the dark gray baseline was calculated the same way as in the plot above. But if you look at the light gray baseline which is adjusted for seasonal variation in mortality and for the impact of COVID waves, there's a big fall in the baseline during the first 10 weeks after vaccination, because many people got vaccinated during a period of time from March to May 2021 when COVID deaths fell from the highest point during the pandemic to near zero:

In the plot above if you compare the black line for deaths against the dark gray baseline that is not adjusted for seasonality, the period after vaccination that has clearly reduced mortality seems to last only about 2 weeks, which is surprisingly short compared to other datasets like Barry Young's data from New Zealand. But the period with reduced mortality is actually longer if you compare the black line against the light gray line instead, as you can see from this plot which shows the excess percentage of deaths relative to the baseline:

So I suspect the reason why the US Medicare data had a spike in deaths around days 20-30 might similarly be because many people received the first dose in January to February 2021 when there was a sharp spike in excess mortality in the United States, but by March 2021 the excess mortality had fallen down close to zero.

Here's code for making the two previous plots:

buck=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1]
buck=buck[,.(dead=sum(dead),alive=sum(alive)),by=.(month,age,type,week)]

rate1=buck[,.(rate1=sum(dead)/sum(alive)),by=age]
rate2=buck[,.(rate2=sum(dead)/sum(alive)),by=.(month,age)]
me=merge(merge(buck,rate1),rate2,by=c("age","month"))[type!=""]
xy=me[,.(excess1=(sum(dead)/sum(rate1*alive)-1)*100,excess2=(sum(dead)/sum(rate2*alive)-1)*100),by=week]

xy=me[,.(dead=sum(dead),base1=sum(rate1*alive),base2=sum(rate2*alive)),by=week][order(week)]

png("1.png",1200,800,res=198)
par(mar=c(2.3,2.3,2.1,.8),mgp=c(0,.6,0))
tit="Czech record-level data: Deaths by weeks after first dose"
leg=c("Deaths","Same baseline for each month","Different baseline for each month")
col=c("black","gray50","gray80")
plot(xy$week,xy$base2,type="l",col=col[3],main=tit,xlab=NA,ylab=NA,lwd=1.5)
lines(xy$week,xy$base1,type="l",col=col[2],lwd=1.5)
lines(xy$week,xy$dead,type="l",col=col[1],lwd=1.5)
legend("bottom",legend=leg,col=col,lty=1,lwd=1.5)
dev.off()

And here's code for the plot before them:

library(ggplot2);library(data.table)

ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}

bin=7;maxdate=as.Date("2022-12-31")

rec=fread("CR_records.csv",showProgress=F)

t=rec[,.(id=1:.N,death=DatumUmrti,date=Datum_1,type=OckovaciLatka_1,birth=Rok_narozeni)][!is.na(date)]
set.seed(0);t$birth=ua(ua(t$birth,paste0,"-1-1"),as.Date)+sample(0:364,nrow(t),T)

name1=c("Comirnaty","SPIKEVAX","VAXZEVRIA","COVID-19 Vaccine Janssen")
name2=c("Pfizer","Moderna","AstraZeneca","Janssen")
t=t[!is.na(date)&type%in%name1][,type:=name2[match(type,name1)]]

age=as.numeric(t$date-t$birth)/365.25
dead=t[!is.na(t$death)]
deadbin=(as.numeric(dead$death)-as.numeric(dead$date))%/%bin
endbin=(as.numeric(maxdate)-as.numeric(t$date))%/%bin

cmr=fread("http://sars2.net/f/czpopdead.csv")[year%in%2021:2022,.(sum(dead)/sum(pop)*1e5),age][[2]]

bins=0:max(endbin)
pop=rev(cumsum(rev(table(factor(endbin,bins)))))*bin
baseline=sapply(bins,\(i)mean(cmr[pmin(100,floor(age[i<=endbin]+i*bin/365))+1]))

xy=data.frame(bin=bins,baseline,pop)
xy$dead=as.numeric(table(factor(deadbin,xy$bin)))
xy$cmr=xy$dead/xy$pop*1e5*365
xy$age=sapply(bins,\(i)mean(age[i<=endbin]))+xy$bin*bin/365
deadage=(as.numeric(dead$death)-as.numeric(dead$birth))/365.25
xy$deadage=tapply(deadage,factor(deadbin,xy$bin),mean)+xy$bin*bin/365
xy$deadbase=xy$baseline*xy$pop/1e5/365

xy$cmr[xy$pop<3e4]=NA

lab=read.csv(row.names=1,text="name,title
cmr,\"Mortality rate per 1,000 person-years\"
baseline,Baseline for mortality rate
dead,Deaths
deadbase,Baseline for deaths
age,Average age of population
deadage,Average age at death
pop,Population in 100k people")
lab$color=c(hcl(255,120,40),hcl(255,60,70),"black","gray50",hcl(60,90,60),hcl(60,110,40),hcl(135,80,50))

lab1=strsplit("cmr,baseline,age,deadage,pop",",")[[1]]
lab2=strsplit("dead,deadbase",",")[[1]]

lab$mult=1
lab["pop",]$mult=1/1e5/7
lab["baseline",]$mult=lab["cmr",]$mult=1/100

cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymax=max(t(t(xy[,lab1])*lab[lab1,]$mult),na.rm=T)
ystep=cand[which.min(abs(cand-ymax/6))]
yend=ystep*ceiling(ymax/ystep)
xstep=cand[which.min(abs(cand-max(xy$bin)/9))]
xend=xstep*ceiling(max(xy$bin)/xstep)
xbreak=seq(0,xend,xstep);ybreak=seq(0,yend,ystep)
ymax2=max(t(t(xy[,lab2])*lab[lab2,]$mult),na.rm=T)
ystep2=cand[which.min(abs(cand-ymax2/6))]
yend2=ceiling(ymax2/ystep2)*ystep2
secmult=yend/yend2

legh=.38;legstep=15
legy1=rev(seq(0,,yend/legstep,length(lab1)))+(legh-length(lab1)/legstep/2)*yend
legy2=rev(seq(0,,yend/legstep,length(lab2)))+(legh-length(lab2)/legstep/2)*yend
leg1=data.frame(x=.02*xend,y=legy1,label=lab[lab1,]$title,color=lab[lab1,]$color)
leg2=data.frame(x=.98*xend,y=legy2,label=lab[lab2,]$title,color=lab[lab2,]$color)

lab$mult=lab$mult*ifelse(rownames(lab)%in%lab2,secmult,1)
xy2=as.data.frame(t(t(xy)*c(1,lab[names(xy)[-1],]$mult)))

kimi=\(x)ifelse(abs(x)>=1e6,paste0(x/1e6,"M"),ifelse(abs(x)>=1e3,paste0(x/1e3,"k"),x))

ggplot(xy2,aes(x=bin))+
geom_hline(yintercept=0,linewidth=.35,lineend="square")+
geom_vline(xintercept=c(0,xend),linewidth=.35,lineend="square")+
geom_line(aes(y=dead),linewidth=.4,color=lab["dead",]$color)+
geom_line(aes(y=deadbase),linewidth=.4,color=lab["deadbase",]$color)+
geom_line(aes(y=cmr),linewidth=.4,color=lab["cmr",]$color)+
geom_line(aes(y=baseline),linewidth=.4,color=lab["baseline",]$color)+
geom_line(aes(y=pop),linewidth=.4,color=lab["pop",]$color)+
geom_point(aes(y=age),size=.4,color=lab["age",]$color)+
geom_point(aes(y=deadage),size=.4,color=lab["deadage",]$color)+
geom_label(data=leg1,aes(x=x,y=y,label=label),fill=alpha("white",.8),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,size=3.2,hjust=0,vjust=.5,color=leg1$color)+
geom_label(data=leg2,aes(x=x,y=y,label=label),fill=alpha("white",.8),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,size=3.2,hjust=1,vjust=.5,color=leg2$color)+
labs(x=NULL,y="",title="Czech record-level data: Deaths by weeks after first vaccine dose",subtitle="People are kept under the first dose after they get a second dose. The baseline was calculated by multiplying a vector of Czech mortality rates for each age in 2021-2022 by a vector for the number of person-days for each age, so the baseline is not adjusted for seasonal variation in mortality."|>stringr::str_wrap(90))+
coord_cartesian(clip="off",expand=F)+
scale_x_continuous(limits=c(0,xend),breaks=xbreak)+
scale_y_continuous(limits=c(0,yend),breaks=ybreak,label=kimi,sec.axis=sec_axis(trans=~./secmult,breaks=seq(0,yend2,ystep2),label=kimi))+
theme(axis.text=element_text(size=8,color="black"),
  axis.ticks=element_line(linewidth=.35,color="black"),
  axis.ticks.length=unit(.2,"lines"),
  axis.title=element_blank(),
  axis.title.y.right=element_text(margin=margin(0,0,0,5)),
  legend.position="none",
  panel.background=element_rect(fill="white"),
  panel.grid=element_blank(),
  plot.margin=margin(.3,.3,.3,.3,"lines"),
  plot.subtitle=element_text(size=7.8,margin=margin(0,0,.6,0,"lines")),
  plot.title=element_text(size=9,margin=margin(.2,0,.4,0,"lines")))
ggsave("1.png",width=5,height=3.4,dpi=400)

Projected mortality rates by age group based on a 2010-2019 trend

The Czech Statistical Office has published weekly data for deaths by 5-year age groups: https://vdb.czso.cz/vdbvo2/faces/en/index.jsf?page=vystup-objekt-parametry&pvo=DEMD007KR-R2024&sp=A&skupId=3629&katalog=30845&z=T.

The following code calculates daily mortality rates in 2010-2019 by interpolating the weekly deaths to daily deaths, and it interpolates Eurostat's yearly population sizes to daily population sizes. And then it calculates a seasonality-adjusted linear trend of the mortality rate in each 5-year age group and projects the trend to 2020-2022, and then the expected number of deaths is derived by multiplying the population size by the projected mortality rate:

isoweek=\(year,week,weekday=1){d=as.Date(paste0(year,"-1-4"));d-(as.integer(format(d,"%w"))+6)%%7-1+7*(week-1)+weekday}

dates=seq(as.Date("2010-1-1"),as.Date("2022-12-31"),1)

t=read.csv("http://sars2.net/f/czweeklydead.csv")
dead=sapply(split(t,t$age),\(x)predict(smooth.spline(isoweek(x$year,x$week,4),x$dead/7),as.numeric(dates))$y)
pop=read.csv("http://sars2.net/f/czpopdead.csv")|>subset(year>=2009)
pop=sapply(split(pop,pop$age),\(x)predict(smooth.spline(an(as.Date(paste0(x$year,"-1-1"))),x$pop),as.numeric(dates))$y)
pop=t(rowsum(t(pop),pmin(0:100%/%5*5,90)))

seapre=\(x,y,x2=x,spar=.5){ # seasonal prediction
  x=as.Date(x,"1970-1-1");x2=as.Date(x2,"1970-1-1")
  d=data.frame(x,y);lm=lm(y~x,d)
  daily=tapply(y-predict(lm,list(x)),substr(x,6,10),mean)
  daily=daily[names(daily)!="02-29"]
  daily[]=smooth.spline(rep(daily,3),spar=spar)$y[366:730]
  daily["02-29"]=mean(daily[c("02-28","03-01")])
  data.frame(x=x2,y=predict(lm,list(x=x2))+daily[substr(x2,6,10)])
}

d=data.frame(date=dates,age=rep(seq(0,90,5),each=length(dates)),pop=c(pop),dead=c(dead))
d$projdead=c(apply(dead/pop,2,\(x)seapre(dates,x)$y))*d$pop

d$pop=sprintf("%.2f",d$pop)
d$dead=sprintf("%.5f",d$dead)
d$projdead=sprintf("%.5f",d$projdead)
write.csv(d,"czdeadproj.csv",quote=F,row.names=F)

The output looks like this: [f/czdeadproj.csv]

$ curl -Ls sars2.net/f/czdeadproj.csv|sed 3q
date,age,pop,dead,projdead
2010-01-01,0,570301.15,1.15035,1.04195
2010-01-02,0,570360.14,1.15010,1.04254

Excess mortality by weeks after vaccination and age group

Jeffrey Morris has coined the term "temporal healthy vaccinee effect" to describe the phenomenon where mortality rate is temporarily reduced for the first weeks or months after vaccination. The plot below shows that the temporal HVE seems to be the strongest around ages 50-79, but it's much weaker in ages 90+ and in younger age groups:

In the plot above my baseline was the total mortality rate among both vaccinated and unvaccinated people, which results in a bias where age groups with a higher percentage of vaccinated people tend to have excess mortality closer to zero. If for example there is one age group where there's 90% vaccinated people, and vaccinated people have a mortality rate of 123 and unvaccinated people have a mortality rate of 246, then it would result in only about -9% excess mortality for vaccinated people (from 123/(123*.90+246*.1)-1). But if 60% of people are vaccinated then vaccinated people would be about -29% excess mortality with the same mortality rates.of 123 and 246.

However the bias doesn't probably explain why ages 90+ don't have as low excess mortality during the first weeks of vaccination as younger age groups, because the Czech data actually has a lower percentage of vaccinated people in ages 90+ than in ages 50-79:

> rec=fread("CR_records.csv")
> rec[!(!is.na(DatumUmrti)&DatumUmrti<"2021-01-01"),.(vaxpct=mean(!is.na(Datum_1))*100),.(age=pmax(0,pmin(90,(2021-Rok_narozeni)%/%10*10)))][order(age)]|>round()|>print(r=F)
 age vaxpct
   0      3
  10     47
  20     68
  30     65
  40     73
  50     76
  60     82
  70     86
  80     84
  90     70

In order to avoid the bias, in the next plot I calculated the baseline, based on the trend in mortality rates within five-year age groups in 2010-2019. However now my excess mortality on weeks 0-4 was about -61% for ages 50-59 but only about -37% for ages 90+, even though later on the difference got smaller, which might indicate that the temporal healthy vaccinee effect might in fact be less strong in the oldest age groups:

library(data.table)

weekbin=5;agebin=10;maxage=90

t=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1][,.(dead=sum(dead),alive=sum(alive)),.(month,age,type,week,dose)]

# first plot
t=merge(t[dose>0],t[,.(rate=sum(dead)/sum(alive)),by=.(month,age)])

# # second plot
# proj=fread("http://sars2.net/f/czdeadproj.csv")[,.(dead=sum(dead),pop=sum(pop)),.(month=substr(date,1,7),age)]
# t=merge(t[type!=""][,age:=pmin(90,age%/%5*5)],proj[,.(month,age,rate=dead/pop)])

t=t[,.(dead=sum(dead),alive=sum(alive),base=sum(rate*alive)),.(week%/%weekbin*weekbin,age=pmin(90,age%/%agebin*agebin))][order(week)]

t$age=`levels<-`(factor(t$age),paste0(seq(0,maxage,agebin),c(paste0("-",seq(agebin-1,maxage,agebin)),"+")))
t$week=`levels<-`(factor(t$week),sapply(sort(unique(t$week)),\(i)paste0(i,"-",i+weekbin-1)))
t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),base=sum(base),week="Total"),by=age])

m=tapply(with(t,(dead-base)/ifelse(dead>base,base,dead)*100),t[,2:1],c)
disp=round(tapply((t$dead/t$base-1)*100,t[,2:1],c))
disp[is.nan(disp)]=-100
mpop=xtabs(alive~age+week,t)

# m[mpop<100]=disp[mpop<100]=NA
exp=.8;m=abs(m)^exp*sign(m)
maxcolor=400^exp;m[is.infinite(m)]=-maxcolor;m[is.nan(m)]=-maxcolor

pal=hsv(rep(c(7/12,0),5:4),c(.9,.75,.6,.3,0,.3,.6,.75,.9),c(.4,.65,1,1,1,1,1,.65,.4))

pheatmap::pheatmap(m,filename="0.png",display_numbers=disp,
  gaps_col=ncol(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,
  cellwidth=18,cellheight=18,fontsize=9,fontsize_number=8,border_color=NA,na_col="white",
  number_color=ifelse(!is.na(m)&abs(m)>.5*maxcolor,"white","black"),
  breaks=seq(-maxcolor,maxcolor,,256),colorRampPalette(pal)(256))

People who got the second dose from a different vaccine type than the first dose

Kirsch included these comments in the file Full analysis - All vaccines.xlsx:

This is the pivot table analysis for Study #1 which enrolled any who got their first COVID shot anytime in 2021.

Note that if you didn't get a second shot, it's highly likely that it is because you died.

However the mortality rate ratio (MRR) which is MR(Moderna)/MR(Pfizer) is also high for those people when it should be 1 if both vaccines are safe.

See the Detail 1 tab which shows that the MRR ratio is >1 for each and every 5 year age group examined which eliminates the Simpson's paradox.

Additional note

Those who mixed vaccines between dose 1 vs. dose 2 had the highest increase in ACM.

The high MR line items for the second vaccine being None is likely because those people died before their second vaccine so this should not be misinterpreted as you must get two vaccines or you'll die.

Note that if you Moderna then Pfizer vs. Pfizer and then Moderna, the MRS were nearly identical around 2175) so the order of shots did not matter!!!

However, the MR of a mixed dose did NOT give you an MR between the two pure vaccines. It was higher than pure Pfizer or pure Moderna.

Note that MOST people did not mix brands between dose 1 and dose 2.

In the code below I calculated ASMR during the first year from the first dose using the same simple but inaccurate method as earlier, where I treated the age of each person as their year of birth subtracted from 2021. However I actually got very low ASMR for people who got Pfizer for the first dose and Moderna for the second dose (even though they consisted of only 279 people so the confidence interval is huge):

retype=\(x){name1=c("","Comirnaty","SPIKEVAX","VAXZEVRIA","COVID-19 Vaccine Janssen");name2=c("None","Pfizer","Moderna","AstraZeneca","Janssen");ifelse(x%in%name1,name2[match(x,name1)],"Other")}

std=fread("http://sars2.net/f/czpopdead.csv")[year==2022,pop]
rec=fread("CR_records.csv")

t=rec[Datum_1<as.Date("2022-1-1"),.(dead=sum(!is.na(DatumUmrti)&DatumUmrti<Datum_1+365),pop=.N),.(age=pmin(100,2021-Rok_narozeni),type1=retype(OckovaciLatka_1),type2=retype(OckovaciLatka_2))]
t[,.(asmr=round(sum(dead/pop*std[age+1]/sum(std)*1e5)),dead=sum(dead),pop=sum(pop),meanage=round(weighted.mean(age,pop))),.(type1,type2)][order(-pop)]|>print(r=F)
       type1       type2  asmr  dead     pop meanage
      Pfizer      Pfizer   705 45595 5411522      49
     Moderna     Moderna   918  8778  509704      55
 AstraZeneca AstraZeneca   781 10639  439648      68
     Janssen        None  1325  4418  404824      46
      Pfizer        None 11599  5914   49445      43
     Moderna        None 11274  1200    6224      52
 AstraZeneca        None 17457  3091    5522      75
 AstraZeneca      Pfizer   938    46    1795      61
     Moderna      Pfizer  1507    13     569      48
       Other       Other     0     0     348      43
 AstraZeneca     Moderna   973    10     329      62
      Pfizer     Moderna   421     6     279      50
      Pfizer       Other     0     0     130      49
      Pfizer     Janssen   253     1     126      43
 AstraZeneca     Janssen  1653     7     109      64
 AstraZeneca       Other     0     0      34      72
     Moderna     Janssen   989     2      33      47
     Moderna       Other     0     0      25      62
      Pfizer AstraZeneca     0     0      24      51
       Other AstraZeneca     0     0      21      48
     Moderna AstraZeneca   630     1       7      59
       Other        None     0     0       1      47
       Other      Pfizer     0     0       1      80
       type1       type2  asmr  dead     pop meanage

But actually ASMR is unreliable in cases like this where there's only a small number of people and the people can be unevenly distributed across age groups. In the following code I calculated the expected number of deaths by multiplying the Czech mortality rate for each age in 2021-2022 with the number of people for each age. I looked at all 6 pairs of crossovers between Pfizer, Moderna, and AstraZeneca vaccines, but the total excess mortality between them was only about -18%:

retype=\(x){name1=c("","Comirnaty","SPIKEVAX","VAXZEVRIA","COVID-19 Vaccine Janssen");name2=c("None","Pfizer","Moderna","AstraZeneca","Janssen");ifelse(x%in%name1,name2[match(x,name1)],"Other")}

base=fread("http://sars2.net/f/czpopdead.csv")[year%in%2021:2022,.(sum(dead)/sum(pop)),age][[2]]
rec=fread("CR_records.csv")

t=rec[Datum_1<as.Date("2022-1-1"),.(dead=sum(!is.na(DatumUmrti)&DatumUmrti<Datum_1+365),pop=.N),.(age=pmin(100,2021-Rok_narozeni),type1=retype(OckovaciLatka_1),type2=retype(OckovaciLatka_2))]
pick=c("Pfizer","Moderna","AstraZeneca")
t=t[type1%in%pick&type2%in%pick&type1!=type2]

t=rbind(t,t[,.(dead=sum(dead),pop=sum(pop),type1="Total"),.(age,type2)])
t=rbind(t,t[,.(dead=sum(dead),pop=sum(pop),type2="Total"),.(age,type1)])

t[,.(excesspct=round((sum(dead)/sum(pop*base[pmin(age,100)+1])-1)*100),dead=sum(dead),pop=sum(pop),meanage=round(weighted.mean(age,pop))),.(type1,type2)][order(type1,type2)]|>print(r=F)
       type1       type2 excesspct dead  pop meanage
 AstraZeneca     Moderna       -20   10  329      62
 AstraZeneca      Pfizer       -24   46 1795      61
 AstraZeneca       Total       -23   56 2124      62
     Moderna AstraZeneca       982    1    7      59
     Moderna      Pfizer         6   13  569      48
     Moderna       Total        13   14  576      48
      Pfizer AstraZeneca      -100    0   24      51
      Pfizer     Moderna       -11    6  279      50
      Pfizer       Total       -18    6  303      50
       Total AstraZeneca        51    1   31      53
       Total     Moderna       -17   16  608      56
       Total      Pfizer       -19   59 2364      58
       Total       Total       -18   76 3003      58

In the output above Pfizer-Moderna has a similar excess mortality percent as Moderna-Pfizer, even though in the block of output before it Moderna-Pfizer got over 3 times higher ASMR than Pfizer-Moderna. However that's because the Moderna-Pfizer group had one death in ages 10-19 and one death in ages 40-49, which together added over 500 units to the total ASMR value. With ASMR, the mortality rates of younger age groups that make up a large percent of the standard population are given more weight than the mortality rates of the oldest age groups that make up a small percent of the standard population:

std=fread("http://sars2.net/f/czpopdead.csv")[year==2022,pop]
t[type1=="Moderna"&type2=="Pfizer",.(asmr=sum(dead/pop*std[age+1]/sum(std)*1e5),dead=sum(dead),pop=sum(pop),stdpoppct=100*sum(std[age+1]/sum(std))),.(agegroup=age%/%10*10)][order(agegroup)]|>transform(asmrperdead=asmr/dead)|>mutate_if(is.double,round)|>print(r=F)
 agegroup asmr dead pop stdpoppct asmrperdead
       10  282    1  33         8         282
       20    0    0 107        10         NaN
       30    0    0 105        13         NaN
       40  258    1  73        16         258
       50    0    0  67        13         NaN
       60    0    0  64        12         NaN
       70  671    4  72        10         168
       80  179    3  33         3          60
       90  117    4  15         0          29 # for ages 90-99 the contribution to ASMR per death was about one tenth of the contribution in ages 10-19

Plot of multiple variables from OWID

In the Czech Republic spikes in excess mortality coincide with spikes in PCR positivity like in other countries, except that after Omicron there seems to be high PCR positivity coupled with low excess mortality, which is also similiar to many other countries. In winter 2020-2021 there's a weird three-hump pattern where there's three different spikes in excess deaths, but each of them also coincides with a spike in PCR positivity. And around mid-2021 when the number of new vaccine doses given peaked, there was low excess mortality.

At OWID weekly excess deaths peaked at about 104% in November 2020 and 58% in December 2021:

$ wget covid.ourworldindata.org/data/owid-covid-data.csv
$ sed '1n;/Czechia/!d' owid-covid-data.csv|csvtk cut -flocation,date,excess_mortality|awk -F, '$3>50'
location,date,excess_mortality
Czechia,2020-10-25,71.85
Czechia,2020-11-01,101.35
Czechia,2020-11-08,103.79
Czechia,2020-11-15,80.85
Czechia,2020-11-22,60.63
Czechia,2021-01-03,53.57
Czechia,2021-01-10,58.77
Czechia,2021-01-17,56.32
Czechia,2021-03-07,55.34
Czechia,2021-03-14,63.21
Czechia,2021-03-21,60.61
Czechia,2021-04-04,50.22
Czechia,2021-11-28,50.76
Czechia,2021-12-05,57.82
Czechia,2021-12-12,54.9

After mid-2021 when new vaccine doses peaked, there were only two relatively brief periods when the mortality was clearly elevated above the baseline. The first was during the COVID spike around December 2021. The second was around December 2022 when many European countries had a spike in all-cause deaths but not COVID deaths, which was probably caused by influenza A since it coincided with a peak in the number of positive tests for influenza A: [https://app.powerbi.com/view?r=eyJrIjoiYWU4YjUyN2YtMDBkOC00MGI1LTlhN2UtZGE5NThjY2E1ZThhIiwidCI6ImY2MTBjMGI3LWJkMjQtNGIzOS04MTBiLTNkYzI4MGFmYjU5MCIsImMiOjh9]

Germany also had a sharp spike in excess deaths in December 2022, which coincided with a high incidence of influenza-like illness and acute respiratory illness reported by the Robert Koch Institute's GrippeWeb service, but at the same time there was a low number of COVID deaths compared to the two previous winters: [https://github.com/robert-koch-institut/GrippeWeb_Daten_des_Wochenberichts]

Kirsch seems to suggest that a large part of the Czech excess mortality after vaccination can be attributed to the vaccines, and he also says that the vaccines don't necessarily kill people soon after vaccination but they result in a chronic increase in mortality, because his plots show that the number of deaths by weeks since vaccination goes up over time, and he has attributed the increase in his plots to deaths caused by vaccines. But then from mid-2021 onwards after most people had been vaccinated, why were the excess deaths concentrated into two relatively sharp spikes around December 2021 and December 2022? And if vaccines killed a particularly high number of people in December 2021, then why was the ratio of unvaccinated to vaccinated ASMR elevated in December 2021?

Representation of age groups compared to 2021 census and Eurostat

The resident population estimates at Eurostat are for January 1st of each year. In the record-level data you can get the age on each person on December 31st 2020 if you subtract the year of birth from 2020, so it's only one day off from the date of the population estimates at Eurostat.

The resident population estimates in the 2021 Czech census are for March 26th 2021: https://scitani.gov.cz/age-structure#skupina-282345.

In most 10-year age groups the record-level data contains about 98-104% of the people in the 2021 census, but in ages 90-99 it's about 107% and in ages 100+ it's about 209%:

rec=fread("CR_records.csv")
t=rec[Rok_narozeni<2021&!(!is.na(DatumUmrti)&DatumUmrti<"2021-01-01"),list(recordlevel=.N),.(age=pmin(2020-Rok_narozeni,100)%/%10*10)][order(age)]
t=merge(t,fread("http://sars2.net/f/czcensus2021pop.csv")[,.(census2021=sum(pop)),.(age=age%/%10*10)])
t=merge(t,fread("http://sars2.net/f/czpopdead.csv")[year==2021,.(eurostat=sum(pop)),.(age=age%/%10*10)])

t=rbind(t,t[,lapply(.SD,sum)][,age:="Total"])

t[,censuspct:=recordlevel/census2021*100]
t[,eurostatpct:=recordlevel/eurostat*100]
print(mutate_if(t,is.double,round,1),row.names=F)
   age recordlevel census2021 eurostat censuspct eurostatpct
     0     1122110    1110656  1106350     101.0       101.4
    10     1083970    1070940  1064611     101.2       101.8
    20     1091962    1078231  1074387     101.3       101.6
    30     1465653    1409650  1398642     104.0       104.8
    40     1773632    1735533  1731067     102.2       102.5
    50     1361789    1354501  1345943     100.5       101.2
    60     1298397    1284689  1293120     101.1       100.4
    70     1038356    1037997  1035782     100.0       100.2
    80      382614     378684   380997     101.0       100.4
    90       65762      62639    63318     105.0       103.9
   100         957        647      619     147.9       154.6
 Total    10685202   10524167 10494836     101.5       101.8

In the code above I excluded people who were born in 2021 or later or who had died before 2021. But the total population size in the record-level data was still about 1-2% higher than the resident population estimates in the 2021 census and Eurostat. But both of them only included the resident population, so I don't know if the record-level data also includes non-residents.

The Czech Statistical Office has published a dataset for resident population estimates on December 31st 2022 by age group, region, and sex: https://data.gov.cz/datová-sada?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatové-sady%2F00025593%2Fa129a5408e8e5fd99497e9a22c39775e. However the total population size in the file is identical to Eurostat's population estimate for January 1st 2023:

> fread("http://sars2.net/f/czpopdead.csv")[year==2023,sum(pop)]
[1] 10827529
> system("wget csu.gov.cz/docs/107508/a53bbc83-5e04-5a74-36f9-549a090a806e/130142-24data051724.zip ;unzip 130142-24data051724.zip")
> pop=fread("130181-23data2022.csv")
> pop[uzemi_typ=="stát"&is.na(vek_txt)&pohlavi_txt=="",hodnota]
[1] 10827529 # region type is whole country, age is blank (all ages), and sex is blank (both males and females)
> pop[uzemi_typ=="stát"&!is.na(vek_txt)&pohlavi_txt!="",sum(hodnota)]
[1] 10827529 # individual age groups add up to the total for all age groups, and males and females add up to the total for both males and females
> pop[uzemi_typ=="kraj"&is.na(vek_txt)&pohlavi_txt=="",sum(hodnota)]
[1] 10827529 # individual kraj-level regions add up the total for the whole nation

Monthly ratio of unvaccinated ASMR to vaccinated ASMR and comparison to Maldives data

In October 2021 unvaccinated people had about 2.1 times higher ASMR than vaccinated people, but over the next two months when there was a spike in excess deaths caused by COVID, the ratio increased first to about 2.8 in November and then about 3.1 in December. But by May 2022 when COVID deaths had fallen close to zero, the ratio had fallen back down to about 1.8:

> std=fread("http://sars2.net/f/czcensus2021pop.csv")[,sum(pop),pmin(age,95)][,V1/sum(V1)]
> b=fread("http://sars2.net/f/czbucketskeep.csv.gz")
> d=b[month>="2020-12"&dose<=1]
> d=d[,.(dead=sum(dead),alive=sum(alive)),.(month,vax=dose==1,age=pmin(age,95))]
> d=d[,.(asmr=sum(dead/alive*std[age+1])),.(month,vax)]
> d[,.(ratio=round(asmr[!vax]/asmr[vax],1)),month]|>print(r=F)
   month ratio
 2020-12   Inf
 2021-01   1.2
 2021-02   1.1
 2021-03   2.8
 2021-04   3.2
 2021-05   3.4
 2021-06   3.0
 2021-07   2.6
 2021-08   2.3
 2021-09   2.2
 2021-10   2.1
 2021-11   2.8
 2021-12   3.1
 2022-01   2.6
 2022-02   2.5
 2022-03   2.3
 2022-04   1.8
 2022-05   1.8
 2022-06   1.7
 2022-07   1.7
 2022-08   1.6
 2022-09   1.7
 2022-10   1.7
 2022-11   1.6
 2022-12   1.6
   month ratio

In December 2023 Kirsch published record-level data from the Maldives which included a complete or nearly complete table of people who died in the Maldives in 2021-2022 along with their dates of vaccination. [moar.html#Data_for_deaths_in_2020_2023_in_Maldives] I used the data to calculate a rough ratio of unvaccinated CMR to vaccinated CMR so that during for example in May 2021 when about 41.922% of people were unvaccinated and there were 122 deaths in unvaccinated people and 81 deaths in vaccinated people, I calculated the ratio as (122/41.922)/(81/(100-41.922)), which is about 2.1. In May and June 2021 when Maldives had the most COVID deaths, the ratio of unvaccinated to vaccinated CMR was about 2.1, but the ratio was much lower in the surrounding months, which seems to indicate that unvaccinated people were more likely to die of COVID:

The method I used to calculate the mortality ratios in the Maldives didn't even take into account that vaccinated people were older on average than unvaccinated people, so I would've probably gotten higher ratios if I would've been able to calculate mortality ratios normalized by age.

However now with the Czech data I was able to use a more sophisticated method to calculate a ratio between unvaccinated and vaccinated mortality so that it was adjusted for age, but I still got similar results where the ratio was temporarily elevated during a period of months that had a high number of COVID deaths.

Ratio between monthly ASMR for Moderna and Pfizer vaccines

The ratio column below shows the ASMR in people whose first dose was Moderna divided by the ASMR in people whose first dose was Pfizer. The ratio is actually lower than usual during the COVID wave in November to December 2021. It might possibly indicate that Moderna vaccines were more effective in preventing COVID deaths than Pfizer vaccines, even though it might also indicate that there's some confounding factors which caused people who got a Moderna vaccine to be less likely to die from COVID:

> b=fread("http://sars2.net/f/czbucketskeep.csv.gz")[month>="2020-12"&dose<=1][dose==0,type:="Unvaccinated"]
> std=fread("http://sars2.net/f/czcensus2021pop.csv")[,sum(pop),pmin(age,95)%/%5][,V1/sum(V1)]
> d=b[,.(dead=sum(dead),alive=sum(alive)),.(month,type,age=pmin(age,95)%/%5)]
> d=d[,.(asmr=sum(dead/alive*std[age/5+1]*365e5)),.(month,type)]
> o=dcast(d,month~type,value.var="asmr")[,ratio:=Moderna/Pfizer]
> o=o[,.(month,ratio=round(ratio,2),Pfizer,Moderna,Unvaccinated)]
> o[,3:5]=lapply(o[,3:5],round)
> print(o,r=F)
   month ratio Pfizer Moderna Unvaccinated
 2020-12    NA      0      NA         6041
 2021-01  1.29   4449    5740         5809
 2021-02  0.98   4423    4345         5695
 2021-03  1.61   3031    4873         7081
 2021-04  1.29   2671    3453         5888
 2021-05  1.04   2920    3051         5135
 2021-06  1.22   3107    3787         4941
 2021-07  1.25   3096    3865         4817
 2021-08  1.24   2960    3661         4536
 2021-09  1.28   3381    4336         4680
 2021-10  1.26   3720    4690         5106
 2021-11  0.94   4535    4242         8093
 2021-12  1.10   4516    4979         9136
 2022-01  1.22   3737    4575         5833
 2022-02  1.02   4388    4477         6698
 2022-03  1.29   4089    5265         6001
 2022-04  1.31   4216    5527         4848
 2022-05  1.40   3367    4726         4163
 2022-06  1.03   3508    3620         3998
 2022-07  1.24   3608    4483         4032
 2022-08  1.31   3652    4782         4158
 2022-09  1.12   3686    4138         4786
 2022-10  1.16   4155    4836         4999
 2022-11  1.12   4043    4518         4337
 2022-12  1.36   4844    6574         5710
   month ratio Pfizer Moderna Unvaccinated

However when I used a seasonality-adjusted 2010-2019 linear projection as the baseline, the oldest age groups had a relatively low percentage of excess deaths in November to December 2021 compared to ages 55-79:

> proj=fread("http://sars2.net/f/czdeadproj.csv")
> proj[grepl("2021-1[12]",date),.(excess=round(100*(sum(dead)/sum(projdead)-1))),age][,setNames(excess,age)]
  0   5  10  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90
 -6   6   8   2  25 -10   8  10  19  11  22  23  27  27  24  25  20  15  16

And Moderna had a higher percentage of people in the oldest age groups than Pfizer:

> t=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1]
> type=factor(t$type,names(sort(tapply(t$alive,t$type,sum),T)))
> age=pmin(t$age%/%10*10,90)
> m=xtabs(t$alive~age+type)
> round(t(m)/colSums(m)*100,1)
             age
type             0   10   20   30   40   50   60   70   80   90
              15.8 12.3 10.3 14.1 15.6 11.4 10.1  7.3  2.6  0.5
  Pfizer       0.3  7.2  9.9 13.5 19.5 16.4 15.4 12.1  4.9  0.8
  Moderna      0.0  2.1  8.4 11.5 16.4 14.9 16.0 21.8  7.6  1.4
  AstraZeneca  0.0  0.1  1.4  2.9  6.7  9.7 20.8 42.3 14.4  1.9
  Janssen      0.0  2.0 16.4 17.8 20.5 16.6 14.8  9.1  2.3  0.4
  Novavax      0.0  4.2 20.1 26.3 21.0 14.4  8.9  4.3  0.7  0.1
  Other        0.4  1.2 19.6 27.7 17.8  9.6 13.8  7.3  2.6  0.0

Monthly excess mortality in both vaccinated and unvaccinated people relative to a 2010-2019 trend

Here during the period from October 2020 to April 2021 which had high excess mortality because of COVID, the excess mortality seems to have fallen back down earlier in older age groups, so it first fell close to zero in January in ages 90+, in April in ages 80-89, in May in ages 70-79, and in June in ages 60-69. It might be because older age groups got vaccinated earlier than younger age groups:

This shows that the percentage of vaccinated person-days out of all person-days first reached above 50% in March in ages 90+ and 80-89, in April in ages 70-79, in May in ages 60-69, and in June in ages 50-59:

> b=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1]
> b[,.(vaxpct=sum(alive[dose==1])/sum(alive)*100),.(month,age=pmin(age%/%10*10,90))][,round(xtabs(vaxpct~month+age))[12:24,]]
         age
month      0 10 20 30 40 50 60 70 80 90
  2020-12  0  0  0  0  0  0  0  0  0  0
  2021-01  0  0  1  1  1  2  1  1  6  8
  2021-02  0  0  2  2  3  3  2  2 27 32
  2021-03  0  0  3  4  6  7  6 22 58 51
  2021-04  0  0  5  6 10 13 18 56 71 60
  2021-05  0  1  8 11 24 41 59 75 77 65
  2021-06  0  4 21 33 52 61 72 80 81 68
  2021-07  0 15 40 45 59 65 75 82 82 69
  2021-08  0 27 49 50 63 68 78 84 83 70
  2021-09  0 31 52 52 64 70 79 85 84 71
  2021-10  0 33 54 54 65 71 79 85 84 71
  2021-11  0 37 60 58 69 74 81 86 85 72
  2021-12  0 41 65 62 71 76 82 88 86 74

In the plot above I thought that maybe the reason why the excess mortality fell close to 0% earlier in older age groups might have been some kind of an artifact of the method I used to calculate the baseline, because in the dataset for weekly deaths by age group I used to calculate the baseline, ages 90 and above were aggregated together and younger ages were split into 5-year age groups, so it might result in the excess mortality of the oldest age groups in being understated if the upper ends of the age groups were overrepresented in the record-level data relative to the lower ends. And I also projected the baseline for each age group into the future by doing a linear regression of the mortality rate in 2010-2019, which might not be accurate for all age groups.

So therefore I made the plot below where I used a different baseline, where I first calculated the total mortality rates in the record-level data for each single year of age in 2020-2022, except I aggregated together ages 100 and above. And then in order to derive the expected number of deaths for each cell of the heatmap, I multiplied a vector of person-days for each age in the cell by a vector of the total mortality rates for each age in 2020-2022. However it also gave me a similar result, where during the period with high excess deaths that lasted for late 2020 until early 2021, the excess deaths fell back down first in ages 90+, next in ages 80-89, and later in the younger age groups:

t=data.table::fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1]

# first plot
t=t[,.(dead=sum(dead),alive=sum(alive)),.(month,age=pmin(90,age%/%5*5))]
proj=fread("http://sars2.net/f/czdeadproj.csv")[,.(projdead=sum(projdead),pop=sum(pop)),.(month=substr(date,1,7),age)]
t=merge(t,proj[,.(month,age,rate=projdead/pop)])

# # second plot
# t=t[,.(dead=sum(dead),alive=sum(alive)),.(month,age=pmin(age,100))]
# t=merge(t,t[,.(rate=sum(dead)/sum(alive)),age])

t=t[,.(dead=sum(dead),alive=as.numeric(sum(alive)),base=sum(rate*alive)),.(month,age=pmin(90,age%/%10*10))]

t[,age:=`levels<-`(factor(age),paste0(0:9*10,c(paste0("-",1:9*10-1),"+")))]
t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),base=sum(base),month="Total"),age])
t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),base=sum(base),age="Total"),month])

m=tapply(with(t,(dead-base)/ifelse(dead>base,base,dead)*100),t[,2:1],c)
disp=tapply((t$dead/t$base-1)*100,t[,2:1],round)
disp[is.nan(disp)]=-100
mpop=xtabs(alive~age+month,t)
exp=.8;m=abs(m)^exp*sign(m)
maxcolor=300^exp;m[is.infinite(m)]=-maxcolor;m[is.nan(m)]=-maxcolor

pheatmap::pheatmap(m,filename="0.png",display_numbers=disp,legend=F,cluster_rows=F,cluster_cols=F,
  gaps_row=nrow(m)-1,gaps_col=ncol(m)-1,
  cellwidth=18,cellheight=18,fontsize=9,fontsize_number=8,border_color=NA,na_col="white",
  number_color=ifelse(!is.na(m)&abs(m)>.5*maxcolor,"white","black"),breaks=seq(-maxcolor,maxcolor,,256),
  colorRampPalette(hsv(rep(c(7/12,0),5:4),c(.9,.75,.6,.3,0,.3,.6,.75,.9),c(.4,.65,1,1,1,1,1,.65,.4)))(256))

Moderna-Pfizer ratio by age group and month of vaccination

The README file in Kirsch's GitHub repository includes this table:

I thought the difference might partially be if people who got vaccinated later had a higher Moderna-Pfizer ratio and younger people got vaccinated later than older people, so I made the heatmap below. I didn't calculate CMR for each age group like Kirsch, but I calculated the ASMR for each single year of age within an age group and then I added together the results to get the total ASMR for the ten-year age groups (where the ASMR for a single age is the CMR of the age multiplied by the fraction of the age in the standard population).

But anyway my heatmap below shows that the Moderna-Pfizer ratio was actually the highest in ages 90+, even though it was fairly low in ages 70-79 and 80-89.

For some reason people who got vaccinated in April 2021 subsequently had a low Moderna-Pfizer ratio of only about 1.1. But ages 70-79 received a large number of first doses in April 2021, which might partially explain their low ratio.

The Moderna-Pfizer ratio seems to have gotten lower starting from around October 2021. Among people who received the first vaccine dose in October 2021, November 2021, or February 2022, Pfizer had higher total ASMR than Moderna, so the Moderna-Pfizer ratio was below 1. So there seem to be some confounding factors which cause the ratio to shift dramatically over time:

library(data.table)

agecut=\(x,y)cut(x,c(y,Inf),paste0(y,c(paste0("-",y[-1]-1),"+")),T,F)

std=fread("http://sars2.net/f/czcensus2021pop.csv")[,sum(pop),pmin(age,95)%/%5][,V1/sum(V1)]

t=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose==1]
t=t[,.(dead=sum(dead),alive=sum(as.double(alive))),.(type,vaxmonth,age=pmin(age,95)%/%5*5)]
t=t[,.(dead=sum(dead),alive=sum(alive)),.(type,age,age2=agecut(age,0:9*10),vaxmonth)]
t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),vaxmonth="Total"),.(age,type,age2)])
t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),age2="Total"),.(age,vaxmonth,type)])
t=t[,.(alive=sum(alive),asmr=sum(dead/alive*std[age/5+1])*365e5),.(type,age2,vaxmonth)]

a=tapply(t$asmr,t[,1:3],sum,na.rm=T)
m1=a["Moderna",,];m2=a["Pfizer",,]
m=(m1-m2)/ifelse(m1>m2,m2,m1)

pop=t[type%in%c("Moderna","Pfizer"),xtabs(alive~age2+vaxmonth)]/365
disp=matrix(sprintf(ifelse(m1/m2>10,"%.1f","%.2f"),m1/m2),nrow(m))
hide=pmin(m1,m2)==0;m[hide]=NA;disp[is.na(m)]=""
exp=.7;m=abs(m)^exp*sign(m);maxcolor=10^exp

pheatmap::pheatmap(m,filename="i1.png",display_numbers=disp,
  gaps_row=nrow(m)-1,gaps_col=ncol(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="gray90",
  number_color=ifelse((abs(m)>.55*maxcolor)&!is.na(m),"white","black"),
  breaks=seq(-maxcolor,maxcolor,,256),
  colorRampPalette(hsv(rep(c(7/12,0),5:4),c(.9,.75,.6,.3,0,.3,.6,.75,.9),c(.4,.65,1,1,1,1,1,.65,.4)))(256))

kimi=\(x){e=floor(log10(ifelse(x==0,1,abs(x))));e2=pmax(e,0)%/%3+1;p=!is.na(x)
  x[p]=paste0(sprintf(paste0("%.",ifelse(e[p]%%3==0,1,0),"f"),x[p]/1e3^(e2[p]-1)),c("","k","M","B","T")[e2[p]]);x}

exp2=.6;maxcolor2=max(pop[-nrow(pop),-ncol(pop)])

pheatmap::pheatmap(pop^exp2,filename="i2.png",display_numbers=kimi(ifelse(pop==0,0,pop)),
  gaps_row=nrow(m)-1,gaps_col=ncol(m)-1,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=19,cellheight=19,fontsize=9,fontsize_number=8,
  border_color=NA,na_col="white",
  number_color=ifelse(pop^exp2>maxcolor2^exp2*.45,"white","black"),
  breaks=seq(0,maxcolor2^exp2,,256),
  sapply(seq(1,0,,256),\(i)rgb(i,i,i)))

cap="Czech record-level data: Moderna ASMR divided by Pfizer ASMR, grouped by month of vaccination and age group. For example 1.5 means that people who received a Pfizer vaccine for their first dose had 1.5 times higher ASMR than people who received a Moderna vaccine for their first dose. The ASMR is calculated from the day of vaccination up to the end of 2022. The ratio is not displayed in cells where there are zero deaths for either Pfizer or Moderna vaccines. The standard population is the 2021 census population by single year of age."
system(paste0("w=`identify -format %w i1.png`;pad=44;convert -gravity northwest -pointsize 42 -font Arial \\( -splice x24 -size $[w-pad]x caption:'",cap,"' -extent $[w-pad]x -gravity center \\) i1.png -gravity northwest \\( -size $[w-40]x caption:'Person-years until end of 2022 for people who received a Pfizer or Moderna vaccine for their first dose.' -extent $[w-pad]x -gravity center \\) i2.png -append 1.png"))

The total Moderna-Pfizer ratio was about 1.00 for people who got the first dose in October 2021 or later:

> t=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose==1]
> t=t[vaxmonth>="2021-10",.(dead=sum(dead),alive=sum(as.double(alive))),.(type,age=pmin(age,95)%/%5*5)]
> t=t[,.(dead=sum(dead),alive=sum(alive)),.(type,age,age2=age%/%10*10)]
> t=rbind(t,t[,.(dead=sum(dead),alive=sum(alive),age2="Total"),.(type,age)])
> t=t[,.(alive=sum(alive),asmr=sum(dead/alive*std[age/5+1])*365e5),.(type,age2)]
> t[,.(ratio=asmr[type=="Moderna"]/asmr[type=="Pfizer"]),.(age=age2)][order(age)]|>print(r=F)
   age     ratio
     0 0.0000000
    10 2.4172339
    20 1.2744856
    30 1.0276323
    40 0.7548193
    50 0.8654081
    60 0.9301195
    70 1.0902050
    80 1.0001228
    90 0.9487740
 Total 1.0002985

I also tried calculating a weighted average of ASMR for each vaccine type by month of vaccination, where the weight was the number of vaccine doses of any type that were given that month, but Moderna still got a much higher ASMR than Pfizer:

> buck=fread("http://sars2.net/f/czbucketskeep.csv.gz")
> std=fread("http://sars2.net/f/czcensus2021pop.csv")[,sum(pop),pmin(age,95)%/%5][,V1/sum(V1)]
> t=buck[vaxmonth<="2022-08"&dose==1,.(dead=sum(dead),alive=sum(alive)),.(type,age=pmin(age,95)%/%5*5,vaxmonth)]
> t=t[,.(alive=as.double(sum(alive)),asmr=sum(dead/alive*std[age/5+1]*365e5)),.(type,vaxmonth)]
> t[,.(weightedasmr=weighted.mean(asmr,tapply(alive,vaxmonth,sum)[vaxmonth])),type][order(weightedasmr)]|>print(r=F)
        type weightedasmr
       Other        0.000
     Novavax     1011.801
      Pfizer     1054.668
 AstraZeneca     1117.712
     Moderna     1299.177
     Janssen     1407.315

(In the code above I only included vaccines given up to August 2022, because after that many people started getting Pfizer's Omicron vaccines which are listed under the "Other" type in my buckets file.)

Daily deaths and vaccine doses by age group

Czech Republic had a period of high excess deaths that lasted from roughly the fourth quarter of 2020 until the second quarter of 2021, which had an unusual pattern where there were three distinct humps in the deaths. In ages 40-59 and 60-79 the first hump was the lowest and the third hump was the highest. But in ages 80+ the first jump was the highest and the third hump was the lowest, which might be because ages 80+ got vaccinated the earliest so many people in ages 80+ had already been vaccinated by the time of the third hump, and some people had even been vaccinated by the time of the second hump:

The plot above shows that there was a spike in deaths around December 2021, which a lot of people will probably blame on deaths caused by the first booster which was rolled out around the same time. In the plot above the spike in December 2021 is difficult to see or nonexistent in ages 0-19 and 20-39. However in the case of ages 40-59, 60-79, and 80+, the peak in deaths occurs in December in each of the age groups, even though the peak in new vaccine doses occurs earlier in older age groups and later in younger age groups, so that vaccine doses peak in November in ages 80+, in December in ages 60-79, and in January in ages 40-59.

In Denis Rancourt's paper about southern-hemisphere and equatorial countries, he included the plots for Peru that are shown in the GIF file below, and he argued that the spike of deaths in early 2021 was caused by the vaccines because the spike roughly coincided with the rollout of the first two doses in the oldest age groups. [https://correlation-canada.org/covid-19-vaccine-associated-mortality-in-the-southern-hemisphere/] However a major weakness of his argument is that the spike in deaths also occurred around the same time in younger age groups even though younger age groups got vaccinated much later than older age groups:

From the next plot which shows ASMR instead of the raw number of deaths, you can see that the third hump is higher than the first hump in unvaccinated people in ages 80+. But in ages 80+ about half of people had already been vaccinated by the time of the third hump, so the total height of the third hump among both vaccinated and unvaccinated people is relatively low. But in ages 40-59 less than 10% of people had been vaccinated by the time of the third hump, so the height of the hump is similar among all people and unvaccinated people:

The Czech Ministry of Health has published CSV files for COVID deaths and hospitalizations by age: https://onemocneni-aktualne.mzcr.cz/api/v2/covid-19. Both COVID deaths and hospitalizations also have similar three-hump pattern as all-cause deaths, and in ages 80+ the first hump is the highest and the third hump is the lowest, but in ages 60-79 and 40-59 the third hump is higher than the first hump:

download.file("https://onemocneni-aktualne.mzcr.cz/api/v2/covid-19/nakazeni-hospitalizace-testy.csv","nakazeni-hospitalizace-testy.csv")
download.file("http://sars2.net/f/czbucketsdaily.csv.gz","czbucketsdaily.csv.gz")
download.file("https://onemocneni-aktualne.mzcr.cz/api/v2/covid-19/umrti.csv","umrti.csv")

library(data.table);library(ggplot2)

ma=\(x,b=1,f=b)setNames(rowMeans(embed(c(rep(NA,b),x,rep(NA,f)),f+b+1),na.rm=T),names(x))
ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}
age=\(x,y){x=as.numeric(x);y=as.numeric(y);(y-x-(y-789)%/%1461+(x-789)%/%1461)%/%365}

agecut=\(x,y)cut(pmax(x,0),c(y,Inf),paste0(y,c(paste0("-",y[-1]-1),"+")),T,F)
# ages=c(0,12,18,30,40,60,85)
ages=0:4*20
agelev=agecut(0:120,ages)

xstart=as.Date("2020-01-01");xend=as.Date("2023-1-1")

rec=fread("Czech/data/CR_records.csv",showProgress=F)
set.seed(0);rec$birth=ua(paste0(rec$Rok_narozeni,"-1-1"),as.Date)+sample(0:364,nrow(rec),T)
dead=rec[!is.na(DatumUmrti),.(dead=.N),.(date=DatumUmrti,age=agelev[pmax(0,age(birth,DatumUmrti))+1])]
k=grep("Datum_",names(rec));dt=data.table(date=as.IDate(unlist(rec[,..k],,F)),birth=rec$birth)
p=merge(dead,dt[!is.na(date)][,.(vax=.N),.(date,age=agelev[pmax(0,age(birth,date))+1])],all=T)

std=fread("http://sars2.net/f/czcensus2021pop.csv")$pop
buck=fread("czbucketsdaily.csv.gz")
buck=rbind(buck,cbind(expand.grid(date=as.IDate(xstart:xend),age=0:100,dose=unique(buck$dose)),alive=0,dead=0))|>unique(by=1:3)
unvax=buck[dose==0&alive>=1e3,.(dead=sum(dead),alive=sum(alive)),.(date,age2=age)]
vax=buck[dose>0&alive>=1e3,.(dead=sum(dead),alive=sum(alive)),.(date,age2=age)]
unvax=unvax[,.(asmr1=sum(dead/alive*std[pmin(age2,100)+1]/sum(std)*365e5)),.(age=agelev[age2+1],date)]
vax=vax[,.(asmr2=sum(dead/alive*std[pmin(age2,100)+1]/sum(std)*365e5)),.(age=agelev[age2+1],date)]
p=merge(p,merge(unvax,vax,all=T),all=T)

testy=fread("nakazeni-hospitalizace-testy.csv",na.strings="-")
testy=testy[,.(tests=sum(provedene_testy,na.rm=T),cases=sum(potvrzene_pripady,na.rm=T),hosp=sum(nove_hospitalizace,na.rm=T)),.(date=datum,age=agelev[as.numeric(sub("[-+].*","",vekova_kategorie))+1])][!is.na(age)]
p=merge(p,testy[tests>0,.(date,age,pcr=cases/tests*100,hosp)],all=T)

p=merge(p,fread("umrti.csv")[,.(coviddead=.N),.(date=datum,age=agelev[vek+1])],all=T)

p=p[date%in%xstart:xend]

p$dead[is.na(p$dead)]=0
p$coviddead[is.na(p$coviddead)&p$date>p$date[!is.na(p$coviddead)][1]]=0
p$vax[is.na(p$vax)&p$date>p$date[!is.na(p$vax)][1]]=0
p$hosp[is.na(p$hosp)&p$date>p$date[!is.na(p$hosp)][1]]=0

xbreak=seq(xstart,xend,"6 month");xlab=c(rbind("",2020:2022),"")

var=read.csv(row.names=1,text="name,group
dead,All-cause deaths,1
vax,Vaccines,2
asmr1,Unvaccinated ASMR,3
asmr2,Vaccinated ASMR,3
hosp,Hospitalizations,4
pcr,PCR positivity,5
coviddead,COVID deaths,6")
var$color=c("black","#cc00cc",hsv(7/12,1,.8),hsv(0,1,.8),hsv(0,.4,1),hsv(1/3,1,.7),"gray50")

# keep=c("dead","vax")
# keep=c("vax","asmr1","asmr2")
keep=c("dead","coviddead","vax","hosp")
long=cbind(p[,1:2],y=unlist(p[,..keep],,F),z=factor(rep(keep,each=nrow(p)),keep))
color=var[keep,]$color
long[,mav:=ma(y,10),.(age,z)]
long[,group:=var[levels(z)[z],]$group]
long=merge(long,long[,.(max=max(mav,na.rm=T)),.(group,age)])
levels(long$z)=var[levels(long$z),]$name

ggplot(long,aes(x=date))+
facet_wrap(~age,ncol=1,strip.position="top")+
geom_vline(xintercept=seq(xstart,xend,"month"),linewidth=.2,color="gray90")+
geom_vline(xintercept=seq(xstart,xend,"3 month"),linewidth=.25,color="gray70")+
geom_vline(xintercept=seq(xstart,xend,"year"),linewidth=.25,lineend="square")+
geom_hline(yintercept=0:1,linewidth=.25,lineend="square")+
geom_line(aes(y=mav/max,color=z),linewidth=.35)+
geom_label(data=data.frame(age=levels(p$age)),aes(label=paste0("\n   ",age,"   \n")),x=xstart,y=1,lineheight=.5,hjust=0,vjust=1,size=2.3,fill=alpha("white",1),label.r=unit(0,"lines"),label.padding=unit(0,"lines"),label.size=.25)+
labs(x=NULL,y=NULL,title="Czech record-level data: Daily deaths and vaccine doses",subtitle="Displayed as percentage of maximum value, ±10-day moving averages."|>stringr::str_wrap(70))+
scale_x_date(limits=c(xstart,xend),breaks=xbreak,labels=xlab)+
scale_y_continuous(limits=0:1)+
scale_color_manual(values=color)+
coord_cartesian(clip="off",expand=F)+
guides(color=guide_legend(nrow=1,byrow=F))+
theme(axis.text=element_text(size=7,color="black"),
  axis.text.y=element_blank(),
  axis.ticks=element_blank(),
  axis.ticks.x=element_line(color=alpha("black",c(1,0)),linewidth=.25),
  axis.ticks.length=unit(0,"lines"),
  axis.ticks.length.x=unit(.15,"lines"),
  legend.background=element_blank(),
  legend.box.spacing=unit(0,"pt"),
  legend.direction="horizontal",
  legend.justification=c(1,.5),
  legend.key.height=unit(.7,"lines"),
  legend.key.width=unit(1.1,"lines"),
  legend.key=element_rect(fill=alpha("white",0)),
  legend.margin=margin(.15,0,0,0,"lines"),
  legend.position="bottom",
  legend.spacing.x=unit(.1,"lines"),
  legend.text=element_text(size=7,vjust=.5),
  legend.title=element_blank(),
  panel.spacing=unit(0,"lines"),
  panel.background=element_rect(fill="white"),
  panel.grid=element_blank(),
  plot.margin=margin(.3,.3,.3,.3,"lines"),
  plot.subtitle=element_text(size=7,margin=margin(0,0,.6,0,"lines")),
  plot.title=element_text(size=7.5,face="bold",margin=margin(.1,0,.5,0,"lines")),
  strip.background=element_blank(),
  strip.text=element_blank())
ggsave("1.png",width=3.5,height=4,dpi=450)

Paper about two subsets of Czech record-level data from insurance companies

The CSV file for the record-level data was committed to GitHub by Tadeáš Fryčák, but he was also one of the authors of a paper published in May 2024 titled "Does the healthy vaccinee bias rule them all? Association of COVID-19 vaccination status and all-cause mortality from an analysis of data from 2.2 million individual health records": https://www.sciencedirect.com/science/article/pii/S1201971224000468. The authors of the paper described two sets of record-level data which only included about 2 million people in total, and not over 11 million people like the new dataset:

Using a Freedom of Information request, we obtained data from two health insurance houses (in the Czech Republic, health insurance is compulsory and multiple health insurance houses provide the service). Each line in the dataset corresponded to a unique individual and included their sex, age, dates and types of all COVID vaccines, and (if applicable), the date of death. When trying to find out whether there is a difference between the baseline frailty of the groups, ACM is the most telling parameter as it is not burdened with any possible misclassification on the cause of death. For this reason, only ACM will be considered in this study.

Each cooperating insurance company decided to blur the data slightly to exclude the possibility of identification of any individual. The first dataset provided by the Czech Business Insurance Company (CPZP) comprises 1,362,924 individuals, i.e., approx. 13% of the Czech population. It is by no means a fully representative sample of the Czech population because the clients of CPZP tend to be younger than average. However, the size of the cohort allows interesting analyses. The time data were blurred to week and year. The other dataset provided by the Professional Insurance Company (OZP) comprises 827,475 individuals, representing another 8% of the Czech population, the time data were blurred to month and year of events. The OZP dataset will serve for validation of the results obtained from the larger CPZP dataset. In all, we have complete data on mortality and vaccination of more than 2 million individuals over the entire period of 2021 and 2022. Both the datasets are available at https://github.com/PalackyUniversity/hve.

The authors observed the phenomenon where the mortality rate of people with 2 doses shot up when the third dose was rolled out, but they attributed it to the healthy vaccinee effect. Martin Neil and Norman Fenton have hypothesized that the same phenomenon might be explained by a classification delay which they call the "cheap trick", where deaths that occured within a certain number of weeks after the third dose would've been classified under the second dose. But because the authors of the Czech paper had access to the record-level data which showed the dates of vaccination and death of individual people, the authors could know for sure that the cheap trick was not being used in the data. They included the following comments about the plot shown below:

[...]

Now, let us focus on the red panels, covering the "high-COVID" period of October 2021-May 2022. In that period, the Czech Republic recorded almost 10,000 COVID-related deaths, which translates into an average of 40 deaths with/from COVID-19 per day. The vaccine effectiveness in preventing COVID-related deaths should lead to an increase in the ratio of unvaccinated:vaccinated ACM. However, the exact opposite happened. In all age groups, the ACM in the cohort with completed primary course (yellow bars) more than doubled compared to the low-COVID period, while the ACM of the unvaccinated cohort rose only by a third. A sanity check with the same analysis performed for the other dataset (OZP insurance company) yielded consistent results (see Supplement S6). This paradoxical observation can be caused by the fact that in the high-COVID period, vaccination by the third dose was underway, which, again, led to a subselection of the healthier group for booster dose vaccination, while infirm individuals concentrated in the group "primary course for more than 2 months."

Table for percentage of different vaccine types given each month

More than two thirds of all vaccines had the type "Comirnaty" (Pfizer) each month until September 2022, when many people started getting Pfizer's Omicron vaccines ("Comirnaty Original/Omicron BA.1" and "Comirnaty Original/Omicron BA.4/BA.5"). In the table below only vaccines with the type "Comirnaty" are included under Pfizer.

Most Janssen vaccines were given between August 2021 and November 2021. Most AstraZeneca vaccines were given between February 2021 and July 2021, and there were almost no AstraZeneca vaccines given after November 2021.

> ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}
> rec=fread("CR_records.csv")
> date=rec[,`class<-`(unlist(.SD,,F),"Date"),.SDcols=paste0("Datum_",1:7)]|>ua(substr,1,7)
> type=rec[,unlist(.SD,,F),.SDcols=paste0("OckovaciLatka_",1:7)]
> name1=c("Comirnaty","SPIKEVAX","VAXZEVRIA","COVID-19 Vaccine Janssen","")
> name2=c("Pfizer","Moderna","AstraZeneca","Janssen","")
> type=ifelse(type%in%name1,name2[match(type,name1)],"Other")
> m=data.table(date,type)[type!="",.N,.(date,type)][,xtabs(N~date+type)]
> round(m/rowSums(m)*100,1)
         type
date      AstraZeneca Janssen Moderna Other Pfizer
  2020-12         0.0     0.0     0.0   0.0  100.0
  2021-01         0.0     0.0     2.8   0.0   97.2
  2021-02         6.7     0.0     8.0   0.0   85.3
  2021-03        19.2     0.0    12.2   0.0   68.5
  2021-04        10.5     0.4    11.3   0.0   77.8
  2021-05         7.3     1.6     7.8   0.0   83.4
  2021-06         7.0     1.8     7.4   0.0   83.8
  2021-07         5.2     2.4     6.8   0.0   85.5
  2021-08         2.2     5.4     3.5   0.0   88.9
  2021-09         0.5    10.4     7.0   0.0   82.1
  2021-10         0.2    10.1     7.1   0.0   82.6
  2021-11         0.0     8.6    10.3   0.0   81.1
  2021-12         0.0     0.8    12.9   0.0   86.3
  2022-01         0.0     0.3    12.9   0.0   86.8
  2022-02         0.0     0.3    10.9   0.0   88.8
  2022-03         0.0     0.4    11.3   3.3   85.0
  2022-04         0.0     0.4     9.5   4.2   86.0
  2022-05         0.0     0.5     9.0   2.3   88.2
  2022-06         0.0     0.5     8.5   2.0   88.9
  2022-07         0.0     0.3     8.3   0.8   90.6
  2022-08         0.0     0.2     8.9   0.5   90.5
  2022-09         0.0     0.1     4.3  46.8   48.7
  2022-10         0.0     0.0     0.7  85.9   13.3
  2022-11         0.0     0.1     0.5  88.6   10.8
  2022-12         0.0     0.2     0.4  87.6   11.9
  2023-01         0.0     1.0     1.4  84.6   13.1
  2023-02         0.0     1.6     1.7  79.5   17.2
  2023-03         0.0     2.2     1.0  75.5   21.4
  2023-04         0.0     4.6     1.3  73.9   20.3
  2023-05         0.0     2.5     0.7  72.5   24.2
  2023-06         0.0     1.5     0.8  69.5   28.2
  2023-07         0.0     1.7     1.1  86.9   10.3
  2023-08         0.0     1.1     0.0  85.6   13.3
  2023-09         0.0     0.0     0.1  93.0    7.0
  2023-10         0.0     0.0     0.1  94.7    5.3
  2023-11         0.0     0.0     0.1  95.6    4.3
  2023-12         0.0     0.0     0.0  96.5    3.5
  2024-01         0.0     0.0     0.0  97.5    2.5
  2024-02         0.0     0.0     0.0  96.6    3.4
  2024-03         0.0     0.0     0.0  98.9    1.1

Kirsch's calculation for increase in all-cause mortality

Kirsch included this comment in the file Full analysis - All vaccines.xlsx:

>>> def dp(r, mrr):
...     return (r-1)/(r-mrr)
...
>>> dp(3, 1.5)
1.3333333333333333
>

I think that's probably the most conservative estimate: that Moderna has an excess mortality ratio (EMR, which is R in the function above) of 3, and an absolute MRR of 1.5 per the Czech data.

This would mean that Pfizer has an MR vs a saline shot of 1.33, or a 33% increase in all cause mortality.

So if MRP = 1.3, then MRm=1.3*1.5=1.95

And if half the public got Pfizer and half got Moderna, we have an excess ACM of 62%.

If Moderna is 1/4 of the doses given and has a 3X EMR, then Moderna death numbers = Pfizer death numbers.

But only 75% of people are vaccinated so that there would be an overall 46% increase in ACM in the group affected the most.

This corresponds of what the insurance data says.

In the code below I calculated a linear trend in mortality rate in 2013-2019 for each single year of age, I projected the trend to 2021-2022, and I multiplied the projected mortality rates by the population estimates for each age in 2021 and 2022 to get the expected number of deaths. But it gave me a total excess mortality of only about 17% in 2021-2022, which is less than half of Kirsch's 46% increase:

> t=fread("http://sars2.net/f/czpopdead.csv")
> t$base=t$pop*c(t(sapply(split(t,t$age),\(x)predict(lm(dead/pop~year,x[year%in%2013:2019]),x))))
> t=t[year%in%2021:2022,.(dead=sum(dead),base=sum(base)),.(year,age=pmin(80,age%/%20*20))]
> t=rbind(t,t[,.(dead=sum(dead),base=sum(base),age="Total"),year])
> t=rbind(t,t[,.(dead=sum(dead),base=sum(base),year="Total"),age])
> xtabs(round((dead/base-1)*100)~year+age,t)
       age
year      0  20  40  60  80 Total
  2021  -11  17  30  30  20    25
  2022   -9  13  10   6  11     8
  Total -10  15  20  18  16    17

Kirsch said that 75% of people were vaccinated, but I don't know if his figure is supposed to have included all ages or if it didn't include children in the ages that generally didn't get vaccinated. Because if you include all ages, then the total percentage of vaccinated people in the record-level data is only about 64-65% in 2022 (which is similar to OWID's figure of about 66% vaccinated people in 2022):

> t=fread("http://sars2.net/f/czbucketskeep.csv.gz")[dose<=1&month>="2020-12"]
> t=t[,.(alive=sum(as.double(alive)),dead=sum(dead)),.(dose,month,age=factor(pmin(100,age%/%10*10)))]
> t=rbind(t,t[,.(alive=sum(alive),dead=sum(dead),month="Total"),.(age,dose)])
> t=rbind(t,t[,.(alive=sum(alive),dead=sum(dead),age="Total"),.(month,dose)])
> t[,.(vaxpct=sum(alive[dose==1])/sum(alive)*100),.(month,age)][,round(xtabs(vaxpct~month+age))]
         age
month      0 10 20 30 40 50 60 70 80 90 100 Total
  2020-12  0  0  0  0  0  0  0  0  0  0   0     0
  2021-01  0  0  1  1  1  2  1  1  6  8   7     1
  2021-02  0  0  2  2  3  3  2  2 27 32  19     3
  2021-03  0  0  3  4  6  7  6 22 58 51  28     8
  2021-04  0  0  5  6 10 13 18 56 71 60  32    15
  2021-05  0  1  8 11 24 41 59 75 77 65  34    29
  2021-06  0  4 21 33 52 61 72 80 81 68  35    43
  2021-07  0 15 40 45 59 65 75 82 82 69  35    50
  2021-08  0 27 49 50 63 68 78 84 83 70  35    55
  2021-09  0 31 52 52 64 70 79 85 84 71  35    56
  2021-10  0 33 54 54 65 71 79 85 84 72  35    57
  2021-11  0 37 60 58 69 74 81 86 85 73  35    60
  2021-12  0 41 65 62 71 76 82 88 86 75  36    63
  2022-01  2 44 67 64 72 76 83 88 87 76  36    64
  2022-02  3 45 68 64 72 77 83 88 87 76  37    65
  2022-03  3 45 68 64 72 77 83 88 88 76  37    65
  2022-04  3 44 68 64 72 77 83 88 88 77  37    65
  2022-05  3 44 69 64 72 77 83 88 88 77  37    65
  2022-06  2 43 69 64 72 77 83 88 88 77  37    65
  2022-07  2 43 69 64 72 77 83 88 88 77  37    64
  2022-08  2 43 69 64 72 77 83 88 88 78  37    64
  2022-09  2 42 69 64 72 77 83 88 88 78  37    64
  2022-10  2 42 69 64 72 77 83 88 88 78  36    64
  2022-11  2 41 69 64 72 77 83 88 88 78  36    64
  2022-12  2 41 69 64 72 77 83 88 88 78  36    64
  Total    1 28 47 46 54 59 65 73 75 66  33    49