Other parts: czech2.
In March 2024, Stanislav Veselý received an FOI response containing
vaccination data from the Institute of Health Information and Statistics
of the Czech Republic: https://
The dataset was already uploaded to GitHub in March 2024, but it was
almost completely overlooked until Steve Kirsch published his analysis
of the data in July 2024:
https://
Full record-level data:
Bucket files:
czbuckets.csv but people are kept under earlier doses after a new dose.
czbucketskeep.csv but with an additional column for batch identifier.
Population and deaths:
COVID data published by the Czech Ministry of Health:
Other:
In an earlier version of the GitHub repository about the Czech data,
Kirsch wrote: "Vaccines were randomly distributed
for those wishing to get vaxxed. [...] People
were not allowed to select which vaccines they got. [...] The randomization of which vaccine someone got
created a perfect real-world randomized clinical trial where we could
compute the mortality rates for 1 year after Dose 2 for the two most
popular vaccines." [https://
However, I didn't find any source which said that the vaccines were actually allocated randomly in the Czech Republic.
In the FOI record-level data, the average year of birth is about 1973 for people whose vaccine type for the second dose was "Comirnaty" (Pfizer) but about 1966 for "SPIKEVAX" (Moderna), which indicates that the vaccine types were not allocated randomly. There are also about 5.5 million people whose second vaccine type was Comirnaty but only about 500,000 people whose second vaccine type was Spikevax:
> system("curl -Ls https://github.com/skirsch/Czech/raw/main/data/CR_records.csv.xz|xz -dc>CR_records.csv")
> rec=data.table::fread("CR_records.csv")
> rec[,.(year=mean(Rok_narozeni),.N),.(type=OckovaciLatka_2)][order(year)]|>print(r=F)
                                     type     year       N
                                VAXZEVRIA 1952.506  439705 # AstraZeneca
                                  Valneva 1953.500       2
               Comirnaty Omicron XBB.1.5. 1960.761      67
                        Nuvaxovid XBB 1.5 1964.500       2
                                 SPIKEVAX 1965.842  517783 # Moderna
                 COVID-19 Vaccine Janssen 1969.026     271
                                  Covovax 1971.517      29
                                Comirnaty 1972.682 5519975 # Pfizer
     Comirnaty Original/Omicron BA.4/BA.5 1973.527     300
                                  Sinovac 1974.657     178
          Comirnaty Original/Omicron BA.1 1976.048     230
                                Sinopharm 1979.538      39
                                Nuvaxovid 1980.135    5098
                                Sputnik V 1981.300      10
  Spikevax bivalent Original/Omicron BA.1 1981.375      16
                               Covishield 1984.713      94
                                          1989.098 4544448 # people with no second dose listed
                                  COVAXIN 1990.500       4
SPIKEVAX BIVALENT ORIGINAL/OMICRON BA.4-5 1992.000       2
          Comirnaty Omicron XBB.1.5. 5-11 2017.000       2
                           Comirnaty 6m-4 2019.902      92
           COMIRNATY OMICRON XBB.1.5 6m-4 2020.840      25
                                     type     year       N
Kirsch included these comments in the file
Pfizer v.:
A Simpson's paradox example here that really surprised us
There is a really interesting simpson's paradox due to ratios. For each individual age, the MRR (mortality rate ratio) rarely go above 2.25. Yet when you divide the CMR for Moderna/Pfizer, you get 2.25.
You can see this in the calculation on the Combined but skip tab at the bottom.
But of course our result isn't a simpson's paradox because we computed the CMR ratios by single digit age groups and five year brackets and plotted it.
However, people who received a Moderna vaccine were older on average than people who received a Pfizer vaccine, which explains why Kirsch's ratio is higher for all ages aggregated together than for individual age groups.
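The effect of the age difference can be sketched with a toy example. The numbers below are made up for illustration and are not taken from the Czech data: there are two hypothetical age groups, the Moderna/Pfizer mortality rate ratio is set to 1.2 within each age group, and the Moderna recipients are skewed towards the older group. The variable names (`d`, `mrr_by_age`, `crude_ratio`, `std_ratio`) are my own.

```r
# Hypothetical person-years and death rates (NOT the Czech data).
# Within each age group the Moderna rate is 1.2 times the Pfizer rate,
# but a much larger share of the Moderna person-years is in the older group.
d <- data.frame(
  age    = c("40-49", "80-89", "40-49", "80-89"),
  type   = c("Pfizer", "Pfizer", "Moderna", "Moderna"),
  pyears = c(900000, 100000, 50000, 50000),
  rate   = c(0.001, 0.05, 0.0012, 0.06) # deaths per person-year
)
d$dead <- d$pyears * d$rate

p <- d[d$type == "Pfizer", ]
m <- d[d$type == "Moderna", ]
stopifnot(identical(p$age, m$age)) # rows must line up by age group

# Mortality rate ratio (Moderna / Pfizer) within each age group
mrr_by_age <- setNames(m$rate / p$rate, p$age)

# Crude ratio: all ages pooled together before dividing
crude <- tapply(d$dead, d$type, sum) / tapply(d$pyears, d$type, sum)
crude_ratio <- crude[["Moderna"]] / crude[["Pfizer"]]

# Directly age-standardized ratio: both vaccine types are weighted by the
# same standard population (here the combined person-years per age group)
std <- p$pyears + m$pyears
std_ratio <- sum(m$rate * std) / sum(p$rate * std)

print(mrr_by_age)
print(crude_ratio)
print(std_ratio)
```

In this toy example the pooled ratio comes out at about 5.2 even though the ratio is 1.2 in both age groups, and standardizing both vaccine types to the same age distribution brings the ratio back to 1.2.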
Added later: Kirsch later edited the README file at GitHub and he told me: "i've changed randomly to non-systematically in the github. sorry for the error."
The following code generates a file for deaths and person-days grouped by ongoing month, month of vaccination, weeks since vaccination, single year of age, dose number, and vaccine type.
The record-level data only has a year of birth for each person but not a date of birth, so here I generated a random date of birth for each person.
library(data.table)

# unique apply (faster for long vectors with many repeated values)
ua=\(x,y,...){u=unique(x);y(u,...)[match(x,u)]}

# fast way to get a floored number of years between dates after 1900 and before 2100
age=\(x,y){class(x)=class(y)=NULL;(y-x-(y-789)%/%1461+(x-789)%/%1461)%/%365}

mindate=as.Date("2020-1-1");maxdate=as.Date("2022-12-31")

rec=fread("Czech/data/CR_records.csv")
t=rec[,.(id=1:.N,death=DatumUmrti)]
set.seed(0);t$birth=ua(paste0(rec$Rok_narozeni,"-1-1"),as.Date)+sample(0:364,nrow(t),T)
t=t[rep(1:nrow(t),8),]
t$dose=rep(0:7,each=nrow(rec))
t$date=c(rep(mindate,nrow(rec)),rec[,`class<-`(unlist(.SD,,F),"Date"),.SDcols=paste0("Datum_",1:7)])
t$type=c(rep("",nrow(rec)),rec[,unlist(.SD,,F),.SDcols=paste0("OckovaciLatka_",1:7)])
t=t[!is.na(date)][date<=maxdate][order(-date)]
t$vaxmonth=ua(t$date,substr,1,7)

name1=unique(t$type);name2=rep("Other",length(name1))
name2[name1==""]=""
name2[grep("comirnaty",ignore.case=T,name1)]="Pfizer"
name2[grep("spikevax",ignore.case=T,name1)]="Moderna"
name2[grep("nuvaxovid",ignore.case=T,name1)]="Novavax"
name2[name1=="COVID-19 Vaccine Janssen"]="Janssen"
name2[name1=="VAXZEVRIA"]="AstraZeneca"
t$type=name2[match(t$type,name1)]

rm(rec) # free up memory so the script won't have to use swap

buck=data.table()
for(day in as.list(seq(mindate,maxdate,1))){
  cat(as.character(day),"\n")
  buck=rbind(buck,unique(t[date<=day&(is.na(death)|day<=death)&day>=birth],by="id")[,.( # remove under earlier doses
  #buck=rbind(buck,t[date<=day&(is.na(death)|day<=death)&day>=birth][!(dose==0&id%in%id[dose>0])][,.( # keep under earlier doses
    month=substr(day,1,7),vaxmonth,week=ifelse(type=="",0,as.numeric(day-date)%/%7),
    age=age(birth,day),dose,type,alive=1,dead=death==day)])[,.(alive=sum(alive),dead=sum(dead,na.rm=T)),by=.(month,