Comments to Ethical Skeptic - sars2.net

Contents

Plot for cardiac deaths in ages 0-54

Ethical Skeptic posted this tweet where he commented on a plot by Truth In Numbers: [https://x.com/EthicalSkeptic/status/1757181136306848244]

  1. He used the wrong ICD-codes: used 'heart disease in older persons' (many of whom died in 2020/21 from Covid), not 'sudden cardiac death in younger persons' (see panel 1 chart done correctly).

  2. Used an ICD R99/999 and Lag depleted 2023 data set (see data in panel 2). This is simply malicious.

  3. Diluted the group with population growth (this cohort shrank actually, but he used total population - cheating, idiocy or both).

  4. Did not use pull forward effect, as a professional does.

  5. Quashed the chart into a full range y-axis magnitude to make it appear flat.

  6. Put in a fictitious regression line, when the weekly line shows clearly an increase (save for his corrupted 2023 numbers)

He didn't specify how he calculated the baseline for excess deaths. But if he only fitted the baseline based on data from 2018 and 2019, then it might explain why the line for excess deaths appears to be so flat in 2018 and 2019, and the line would've probably looked less flat if he used a longer fitting period so it wouldn't have fitted as closely to the data in 2018-2019. His plot is also missing data before 2018 so it's not clear what the trend was before COVID.

I did a query at CDC WONDER where I used the same ICD codes as Ethical Skeptic but I did one query where multiple causes of death included one of the cardiac-related ICD codes he used in his plot, and I did another query for the two unknown causes of death he included in his plot (R96 and R99): https://wonder.cdc.gov/mcd.html. I excluded COVID deaths by doing a query where the multiple causes of death included both one of the cardiac ICD codes and a code for COVID, and then I subtracted the weekly COVID deaths from the cardiac total (but I didn't bother subtracting COVID deaths from the unknown causes of death).

I noticed that there was a spike in cardiac deaths in early 2018, but for some reason Ethical Skeptic omitted approximately the first two months of 2018 from his plot. The peak in deaths had a similar height as the peaks in the winters after COVID, so at first I thought he may have omited in early 2018 from his plot which made the spikes after COVID seem higher in comparison. But then I noticed that he also omitted approximately the first two months of 2018 from his other plots, so it might have something to do with his "lag adjustment" method.

Ethical Skeptic omitted the last 26 weeks of 2023 from his plot because there was a big increase in deaths where the cause had not yet been specified. But his plot had an increase in deaths in the first half of 2023, but it seems to be due to deaths where the cause has not yet been specified and not due to cardiac deaths:

library(tempdisagg);library(ggplot2);library(stringr)

wonder=\(x){t=readLines(x);t=paste(t[1:(grep("^\"(---|Total)",t)[1]-1)],collapse="\n");read.table(sep="\t",text=t,header=T,na.strings="Not Applicable")[-1]}

old=wonder("Multiple Cause of Death, 1999-2020.txt")
old=td(data.frame(as.Date(paste(old$Month.Code,1),"%Y/%m %d"),old$Deaths)~1,,"daily","fast")$values

new=wonder("Provisional Mortality Statistics, 2018 through Last Week.txt")
covid=wonder("wondercovid.txt")
mt=match(covid$MMWR.Week.Code,new$MMWR.Week.Code)
new$Deaths[mt]=new$Deaths[mt]-covid$Deaths
new=td(data.frame(as.Date(sub(".*ending ","",new$MMWR.Week),"%B %d, %Y")+3,new$Deaths)~1,,"daily","fast")$values

ill=wonder("ill.txt")
ill=td(data.frame(as.Date(sub(".*ending ","",ill$MMWR.Week),"%B %d, %Y")+3,ill$Deaths)~1,,"daily","fast")$values
illold=wonder("illold.txt")
illold=td(data.frame(as.Date(paste(illold$Month.Code,1),"%Y/%m %d"),illold$Deaths)~1,,"daily","fast")$values

xy=rbind(old[!old$time%in%new$time,],new)
xy$z="MCoD potentially involving sudden cardiac death, COVID excluded"

xy=rbind(xy,cbind(rbind(ill,illold[!illold$time%in%ill$time,]),z="Other sudden death with unknown cause, other ill-defined and unspecified causes of mortality"))

colnames(xy)=c("x","y","z")
xy$z=factor(xy$z,unique(xy$z))

xstart=as.Date("2011-1-1")
xend=as.Date("2024-1-1")
xbreak=seq(xstart,xend,"6 month")
xlab=c(rbind("",2011:2023),"")
cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymax=max(xy$y,na.rm=T)
ystart=0
ystep=cand[which.min(abs(cand-ymax/5))]
yend=ystep*ceiling(ymax/ystep)
ybreak=seq(ystart,yend,ystep)

color=c("#aa0000","#888888")

sub="Data from wonder.cdc.gov/mcd.html. Weekly data from 2018-2023 and monthly data from 1999-2017 was interpolated to daily data. Multiple causes of death that potentially involve sudden cardiac death consist of I45 (Other conduction disorders), I46 (Cardiac arrest), I47 (Paroxysmal tachycardia), I48 (Atrial fibrillation and flutter), I49 (Other cardiac arrhythmias), I50 (Heart failure), I51 (Complications and ill-defined descriptions of heart disease), ROO (Abnormalities of heart beat), R01 (Cardiac murmurs and other cardiac sounds), R09.2 (Respiratory arrest), R09.8 (Other specified symptoms and signs involving the circulatory and respiratory systems). Ill-defined and unknown causes of death consist of multiple causes of death R96 (Other sudden death, cause unknown) and R99 (Other ill-defined and unspecified causes of mortality)."

labels=data.frame(x=xstart+.02*(xend-xstart),y=seq(yend*.93,,-yend/13,nlevels(xy$z)),label=levels(xy$z))

ggplot(xy,aes(x=x,y=y,color=z))+
geom_hline(yintercept=c(ystart,0,yend),color="gray65",linewidth=.25)+
geom_vline(xintercept=c(xstart,xend),color="gray65",linewidth=.25)+
geom_line(aes(color=z),linewidth=.3)+
geom_label(data=labels,aes(x=x,y=y,label=label),fill=alpha("white",.8),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,color=color[1:nrow(labels)],size=2.4,hjust=0)+
labs(title=str_wrap("CDC Wonder, ages 0-54: daily number of deaths with multiple cause of death potentially involving sudden cardiac death. MCoD U07.1 (COVID) and U09.9 (long COVID) excluded.",88),caption=str_wrap(sub,95),x=NULL,y=NULL)+
coord_cartesian(clip="off",expand=F)+
scale_x_date(limits=c(xstart,xend),breaks=xbreak,labels=xlab)+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=color)+
theme(axis.text=element_text(size=6.5,color="black"),
  axis.ticks=element_line(linewidth=.25,color="gray65"),
  axis.ticks.x=element_line(color=alpha("gray65",c(1,0))),
  axis.ticks.length=unit(.15,"lines"),
  axis.title=element_text(size=8),
  legend.position="none",
  panel.background=element_rect(fill="white"),
  panel.grid=element_blank(),
  plot.background=element_rect(fill="white"),
  plot.margin=margin(.4,.65,.4,.4,"lines"),
  plot.caption=element_text(size=7,hjust=0,margin=margin(.6,0,0,0,"lines")),
  plot.title=element_text(size=7.5))
ggsave("1.png",width=4.6,height=3.3,dpi=450)

In Ethical Skeptic's plot the excess deaths looked flat in 2018-2019, but it might partially be because he fitted the trend against data from 2018-2019, because if you fit a seasonality-adjusted trend against only two years of data then it's easy to get the trend to match the data closely. In the plot below where I fitted the trend against data from 2021-2022, I also got my excess deaths to look mostly flat in 2021-2022:

library(tempdisagg);library(ggplot2);library(stringr)

wonder=\(x){t=readLines(x);t=paste(t[1:(grep("^\"(---|Total)",t)[1]-1)],collapse="\n");read.table(sep="\t",text=t,header=T,na.strings="Not Applicable")[-1]}

old=wonder("Multiple Cause of Death, 1999-2020.txt")
old=td(data.frame(as.Date(paste(old$Month.Code,1),"%Y/%m %d"),old$Deaths)~1,,"daily","fast")$values

new=wonder("Provisional Mortality Statistics, 2018 through Last Week.txt")
covid=wonder("wondercovid.txt")
mt=match(covid$MMWR.Week.Code,new$MMWR.Week.Code)
new$Deaths[mt]=new$Deaths[mt]-covid$Deaths
new=td(data.frame(as.Date(sub(".*ending ","",new$MMWR.Week),"%B %d, %Y")+3,new$Deaths)~1,,"daily","fast")$values

ill=wonder("ill.txt")
ill=td(data.frame(as.Date(sub(".*ending ","",ill$MMWR.Week),"%B %d, %Y")+3,ill$Deaths)~1,,"daily","fast")$values
illold=wonder("illold.txt")
illold=td(data.frame(as.Date(paste(illold$Month.Code,1),"%Y/%m %d"),illold$Deaths)~1,,"daily","fast")$values

d=rbind(old[!old$time%in%new$time,],new)|>"colnames<-"(c("x","y"))
xy=cbind(d,z="Actual deaths")

ma=\(x,b=1,f=b)rowMeans(embed(c(rep(NA,b),x,rep(NA,f)),f+b+1),na.rm=T)

prediction=d$x>="2018-1-1"&d$x<="2019-12-31"
linear=predict(lm(y~x,d[prediction,]),d)
days=substr(d$x,6,10)
daily=tapply(d$y[prediction]-linear[prediction],days[prediction],mean)
daily=ma(rep(daily,3),10)[(length(daily)+1):(2*length(daily))]|>setNames(names(daily))
excess=d$y-(linear+daily[days])
xy=rbind(xy,data.frame(x=d$x,y=excess,z="Excess deaths using seasonality-adjusted linear trend fitted against data from 2018-2019"))

prediction=d$x>="2021-01-01"&d$x<="2022-12-31"
linear=predict(lm(y~x,d[prediction,]),d)
days=substr(d$x,6,10)
daily=tapply(d$y[prediction]-linear[prediction],days[prediction],mean)
daily=ma(rep(daily,3),10)[(length(daily)+1):(2*length(daily))]|>setNames(names(daily))
excess=d$y-(linear+daily[days])
xy=rbind(xy,data.frame(x=d$x,y=excess,z="Excess deaths using seasonality-adjusted linear trend fitted against data from 2021-2022"))

xy$z=factor(xy$z,unique(xy$z))

xstart=as.Date("2011-1-1");xend=as.Date("2024-1-1")
xy=xy[xy$x>=xstart&xy$x<=xend,]

xbreak=seq(xstart,xend,"6 month")
xlab=c(rbind("",2011:2023),"")
cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymax=max(xy$y,na.rm=T);ymin=min(xy$y,na.rm=T)
ystep=cand[which.min(abs(cand-(ymax-ymin)/5))]
ystart=ystep*floor(ymin/ystep)
yend=ystep*ceiling(ymax/ystep)
ystart=-120;yend=220
ybreak=seq(-100,yend,ystep)

color=c("#aa0000","#889944","#44aa44")

labels=data.frame(x=xstart+.02*(xend-xstart),y=seq(ystart+(yend-ystart)*.72,,-(yend-ystart)/15,nlevels(xy$z)),label=levels(xy$z))

sub=str_wrap("Data from wonder.cdc.gov/mcd.html. Weekly data from 2018-2023 and monthly data from 1999-2017 was interpolated to daily data. Multiple causes of death that potentially involve sudden cardiac death consist of I45 (Other conduction disorders), I46 (Cardiac arrest), I47 (Paroxysmal tachycardia), I48 (Atrial fibrillation and flutter), I49 (Other cardiac arrhythmias), I50 (Heart failure), I51 (Complications and ill-defined descriptions of heart disease), ROO (Abnormalities of heart beat), R01 (Cardiac murmurs and other cardiac sounds), R09.2 (Respiratory arrest), and R09.8 (Other specified symptoms and signs involving the circulatory and respiratory systems).",95)
sub=paste0(sub,"\n\n",str_wrap("The trend was calculated by first doing linear regression for daily data, then calculating the average difference to the trend for each 365 or 366 days of the year, then taking a 21-day centered moving average of the vector for daily differences, and then adding the difference to the trend on the corresponding day of each year.",95))

ggplot(xy,aes(x=x,y=y,color=z))+
geom_hline(yintercept=c(ystart,0,yend),color="gray65",linewidth=.25)+
geom_vline(xintercept=c(xstart,xend),color="gray65",linewidth=.25)+
geom_line(aes(color=z),linewidth=.3)+
geom_label(data=labels,aes(x=x,y=y,label=label),fill=alpha("white",.8),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,color=color[1:nrow(labels)],size=2.4,hjust=0)+
labs(title=str_wrap("CDC Wonder, ages 0-54: daily number of deaths with multiple cause of death potentially involving sudden cardiac death. MCoD U07.1 (COVID) and U09.9 (long COVID) excluded.",88),x=NULL,y=NULL,caption=sub)+
coord_cartesian(clip="off",expand=F)+
scale_x_date(limits=c(xstart,xend),breaks=xbreak,labels=xlab)+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=color)+
theme(axis.text=element_text(size=6.5,color="black"),
  axis.ticks=element_line(linewidth=.25,color="gray65"),
  axis.ticks.x=element_line(color=alpha("gray65",c(1,0))),
  axis.ticks.length=unit(.15,"lines"),
  axis.title=element_text(size=8),
  legend.position="none",
  panel.background=element_rect(fill="white"),
  panel.grid=element_blank(),
  plot.background=element_rect(fill="white"),
  plot.margin=margin(.4,.65,.4,.4,"lines"),
  plot.caption=element_text(size=6.7,hjust=0,margin=margin(.6,0,0,0,"lines")),
  plot.title=element_text(size=7.5))
ggsave("1.png",width=4.6,height=4.5,dpi=450)

Updated plot for cardiac deaths in ages 0-54 with drug deaths included

After Ethical Skeptic had published the plot described in the previous section of this HTML file, he published an updated version of the plot where he included deaths due to accidental drug poisoning, and he included this note at the bottom of the plot: "X42-44 ((narcotics, heroin, fentanyl, meth, other drug) excess mortality surge removed (for 2020 only - lockdown impact is not trend data)". [https://theethicalskeptic.substack.com/p/the-state-of-things-pandemic-week-706] The plot now had a sudden increase in deaths between December 2020 and January 2021 which was missing from the earlier version, but it's because he included drug overdose deaths in 2021-2023 but not 2020:

The reason why the increase between 2020 and 2021 in the plot above is not instant but it's spread over several weeks is because ES told me that he used a 7-week moving average in the plot (which probably has a centered window which extends 3 weeks backwards and forwards, since in some of his earlier plots he has used a moving average with a window that extended 3 weeks backwards and 2 weeks forwards). [https://theethicalskeptic.substack.com/p/the-state-of-things-pandemic-week-706/comment/53237226, https://www.covid-datascience.com/post/evaluating-claims-of-excess-cancer-deaths-in-the-usa-during-the-pandemic]

Ethical Skeptic said that he omitted excess drug overdose deaths from 2020 but not 2021, but he didn't mention what baseline he used to calculate the excess drug deaths. However the plot below shows that if you simply use the 2018-2019 average as the baseline, then there's even more excess drug deaths in 2021 than in the post-lockdown period of 2020 Q2-Q4. (But an average baseline wouldn't make sense because there was already a sharply increasing trend in drug overdose deaths in the years before COVID. And there necessarily weren't even that many excess drug deaths caused by the lockdowns, and it wouldn't make sense to blame excess drug deaths in 2023 on the lockdown.)

In the plot above I didn't exclude deaths with MCoD COVID. In 2020-2023, only about 0.5% of deaths with MCoD X42-44 (accidental drug poisoning) also included MCoD U07.1 (COVID) (983 out of 184920). But about 8.5% of deaths with one of the cardiac ICD codes also included MCoD COVID (8677 out of 101714). The spikes in cardiac deaths coincide with spikes in COVID deaths, so I guess COVID actually contributed to some of the cardiac deaths.

My plot above also shows that in the third quarter of 2023 there is a decrease in drug overdose deaths, which is probably because of a registration delay, but it gets offset by an increase in R96 and R99 deaths where the cause of death has not yet been resolved. Ethical Skeptic omitted approximately the last 12 weeks of 2023 from his plot, but he didn't mention that there was already an increase in R96 and R99 deaths in the third quarter of 2023.

When someone asked ES why there was a huge increase in deaths in early 2021 even though only few people in ages 0-54 had been vaccinated, ES said it was the vaccines but he didn't even mention that he included drug deaths in 2021 but not 2020:

When someone asked him why he included drug deaths in his plot, he said that the drug deaths were caused by the vaccines:

In one tweet he further explained that the reason why he included a certain proportion of drug deaths is because if someone died from both a cardiac cause and a drug overdose, only the drug overdose would be listed as a multiple cause of death: "The reason we can and ethically should add the incremental vaccine-caused portion of these deaths (from Chart 4c) into this analysis, is because this is a multiple cause of death (MCoD) analysis, not an underlying cause of death (UCoD) analysis. MCoD analysis is used to detect prevalence and trend, when an overlap between causes is determined. However, when death is ascribed to a Non-Natural Cause, no MCoD overlap is afforded that pool of mortality, as only one cause of death is allowed - thereby depleting the sudden cardiac death trend numbers artificially. This has been abused (as one can ascertain from Chart 4c) - therefore, competent analysis on this must include the incremental portion of 'unknown drug overdose' which is clearly sensitive to the introduction of the vaccine." [https://x.com/EthicalSkeptic/status/1778166029601943622] However for example in 2021 in all ages, CDC WONDER returned 4702 deaths where the MCoD included X42-X44 (which the ICD codes for accidental drug overdoses that ES included in his plot) and where the MCoD also included I45-I51, R00, or R01 (which were the cardiac ICD codes). So I don't know if ES is right that only one cause of death is allowed for deaths from non-natural causes. And if a vaccinated person does cocaine and dies from a heart attack, then was the heart attack caused by the vaccine? (Maybe the vaccine could've been a contributing factor for some drug-induced heart attacks, but I don't think it would be enough to explain why the excess mortality would suddenly triple between December 2020 and January 2021.)

The GIF file below shows that when Ethical Skeptic posted the same plot on Twitter, it didn't include the line at the bottom which said: "X42-44 ((narcotics, heroin, fentanyl, meth, other drug) excess mortality surge removed (for 2020 only - lockdown impact is not trend data)". [https://x.com/EthicalSkeptic/status/1775711298061275434] X42 is "Accidental poisoning by and exposure to narcotics and psychodysleptics [hallucinogens], not elsewhere classified", X43 is "Accidental poisoning by and exposure to other drugs acting on the autonomic nervous system", and X44 is "X44 (Accidental poisoning by and exposure to other and unspecified drugs, medicaments and biological substances)". The tweet was dated April 4th 02:25 UTC, a Wordpress blog post which included the same version of the plot as the Substack post was dated April 4th 23:09 UTC, and the Substack post was dated April 4th 29:39 UTC. [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/] So if the version of the plot on Substack and Wordpress is newer than the version on Twitter, it might be that ES forgot to include the X42 and X43 ICD codes to the caption on Twitter even though his plot actually included deaths for X42 and X43. In the year 2021, CDC WONDER returned 45,053 deaths with MCoD X42 and 45,710 deaths with MCoD X44 but only 35 deaths with MCoD X43. But in any case, it was misleading that the plot he posted on Twitter was missing the text which said that he only excluded drug deaths in 2020 but not 2021-2023:

Someone on Twitter also asked: "Wonder what that graph looks like without that bottom code (and similar others)." [https://x.com/All_Im_sayn/status/1775728867543535904] (The bottom line meant the ICD code X44, because the X42 and X43 codes were not included in the version of the post that was posted on Twitter.) But ES replied: "It is 268 deaths per week out of the 929. Decrement that, and the injury-death tally is far below the actual vaccine impact - by conducting cherry picking to force a desired conclusion - and it does not even accomplish the task." 929 refers to the number of excess deaths on week 39 of 2023, so I don't know if the number of excess drug deaths would've been similar on other weeks. And I don't know if the figure of 268 only included drug deaths under X44 or also X42 and X43 (which would likely be around double of X44 alone).

One of the three "vaccine uptake periods" that are highlighted in Ethical Skeptic's plot is in mid-2022. However I don't know which vaccine dose or age group it refers to, because in ages 25-49 the percentage of people who had received a booster dose didn't increase that much after February 2022, and second boosters aren't even listed in CDC's vaccination dataset by age: [https://data.cdc.gov/Vaccinations/COVID-19-Vaccination-Age-and-Sex-Trends-in-the-Uni/5i5k-6cmh]

> t=data.table::fread("COVID-19_Vaccination_Age_and_Sex_Trends_in_the_United_States__National_and_Jurisdictional_20240131.csv")
> t=t[t$Demo=="Ages_25-49_yrs"&t$Location=="US",]
> a=aggregate(t[,9:12],list(sub("(..)/../..(..).*","\\2-\\1",t$Date)),mean,na.rm=T)
> colnames(a)=sub("_(Pop|Vax|pct).*","",colnames(a))
> round(t(data.frame(a,row.names=1)))
                   20-12 21-01 21-02 21-03 21-04 21-05 21-06 21-07 21-08 21-09 21-10 21-11 21-12
Administered_Dose1     1     6    10    20    39    51    56    60    64    69    72    74    77
Series_Complete        0     1     6    11    23    40    49    53    56    60    63    65    67
Booster_Doses        NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN     0     1     3    10    23
Second_Booster       NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN
                   22-01 22-02 22-03 22-04 22-05 22-06 22-07 22-08 22-09 22-10 22-11 22-12 23-01
Administered_Dose1    80    81    82    82    83    83    83    83    84    84    85    85    85
Series_Complete       68    69    70    70    70    70    71    71    71    71    72    72    72
Booster_Doses         33    38    39    40    40    41    41    41    42    42    43    43    43
Second_Booster       NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN
                   23-02 23-03 23-04 23-05
Administered_Dose1    85    85    85    86
Series_Complete       72    72    72    72
Booster_Doses         43    43    43    43
Second_Booster       NaN   NaN   NaN   NaN

Around the same time when Ethical Skeptic's plot had the vaccine uptake period in mid-2022, it also had a noticeable spike in excess deaths. But in my plot above there wasn't any cause of death which had a clear increase in deaths around the same time, apart from a small increase in COVID deaths (which weren't even included in Ethical Skeptic's plot). However his plot displayed excess deaths and not raw deaths like my plot, so the increase in mid-2022 might have been an artifact of the way he calculated his baseline.

In his blog Ethical Skeptic had the chutzpah to say that it would equate to obfuscation if he also added drug deaths to his plot in 2020: "It should be noted that an initial mid-2020 surge in excess deaths due to overdoses (X42-44) during the lockdown period has been removed from this data. This is a death-surge which is non-Natural/non-Vaccine in its basis and can be viewed as to its arrival shape and magnitude in Chart 4c above. This 2020 surge is not trend data, hence its removal in favor of the true arrival trend attributable to the mRNA vaccine. There exists a slight elevation in these deaths which remains on the chart, attributable to Covid-19, but the permanent surge in mortality arrived as a result of the Covid vaccine beginning in December (Week 52 - 14 days after the vaccine start) 2020. To conflate that pre-vaccine surge with the vaccine arrival data, would constitute obfuscation." [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/]

Plot for X44 drug deaths

Ethical Skeptic included this plot in one of his articles: [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/]

Chart 4c - Excess Mortality from Overdose from Unspecified Drug and Climate Change - Yes, the combination of lockdowns and Covid-19 resulted in a temporary increase in undetermined drug overdoses. However, this is not 'trend' data. The trend begins at the end of 2020 with the introduction of the vaccine (which peaks the very week of vaccine maximum uptake). Vaccine-induced myocarditis rendered this population vulnerable to sudden cardiac death - which was then falsely attributed to drug overdose. This excess attribution, as of Week 25 2023, is in the range of 270-300 deaths per week, for 'unspecified drug' and 'climate'. A query-determined non-overlapping portion of this tally is added back into the Sudden Cardiac Death mortality totals show in DFT Chart 10 below. It suggests an alarming increase in young person Sudden Cardiac Death.

In the year 2021, I got 556 deaths for MCoD X30 (Exposure to excessive natural heat (hyperthermia)) but 35,671 deaths for MCoD X44 (Accidental poisoning by and exposure to other and unspecified drugs, medicaments and biological substances). So in the plot above the deaths associated with hyperthermia have little impact on total mortality.

There's about as many deaths with MCoD X42 ("Accidental poisoning by and exposure to narcotics and psychodysleptics [hallucinogens], not elsewhere classified") as MCoD X44 ("Accidental poisoning by and exposure to other and unspecified drugs, medicaments and biological substances"). So I don't know why Ethical Skeptic only included X44 deaths in his plot. But it might be because X42 had a higher peak in 2020 than 2021 and the peak in 2020 cannot be attributed to the vaccines, but X44 had a higher peak in 2021 than 2020 so it's easier to blame the deaths on vaccines. Or it might also be because X44 had about 81% more deaths in 2021-2022 than 2018-2019 but X42 only had about 50% more deaths in 2021-2022 than 2018-2019. Or it might also be because X44 deaths have a more flat trend in 2018-2019 so the level of deaths in 2021-2023 looks like it's far above the trend from his plot, but X42 has a trend that is more sloped upwards in 2018-2019 so the level of deaths in 2022 doesn't seem that far off the trend even if you're not seeing past data before 2018:

The plot by ES started from 2018 and it only included X44 deaths, so it looked like there was an anomalous sustained increase in drug deaths in 2021-2023. But from my plot above where you can see a more long-term trend and where X44 deaths are also included, it looks like 2022 isn't that far off the long-term trend but rather the level of deaths in mid-2021 looks elevated, and there's still probably a lot of deaths missing in 2023 because drug-related deaths have a long registration delay.

Before COVID the long-term trend in drug deaths seemed to be curved upwards, so the 2015-2019 linear trend in my plot above might be too low, and in the plot below where I used a 2010-2019 polynomial trend instead, the X42 deaths were actually below the trend in 2022:

Effect of using a 2018-2019 baseline

When I asked Ethical Skeptic how he calculates his baseline in the plots which show weekly data from 2018 onwards, he posted this reply: [https://theethicalskeptic.substack.com/p/the-state-of-things-pandemic-week-706/comment/53292593]

I'm not analyzing generational or population-diluted trends (potentially misleading statistics), I am detecting deviation from trend dynamics (detecting system inflection).

I cannot use Wonder 2017 and earlier because the numbers are boosted by the small county single-record effect. This is why Wonder 2018 and beyond is a new database (shows as a variance in your chart which does not actually exist), under that HIPAA constraint. Nor can I use 2002 - 2017 data because this contain a dissipating generational effect of those Boomers who suffered cardiac arrest from specific past health impacts which are no longer salient to this age bracket. That trend line must be moved to the 65+ bracket now. This has caused a slight downtrend into 2019, which I do accommodate, yes.

I compared that baseline to a linear regression of the old growth rate (2014-2017) before hand to ensure it was 'in context'. However, I did not use a pure linear regression of 2018 and 19 for this reason. I fit the baseline to the variances linearly, but not by least-squares regression (how I avoided the early 2018 surge). There is of course not enough data to then use a non-linear function beyond that as well. That would be whipsawing fiction.

But much of this is moot, because none of these manipulations serve to make this dramatic an inflection disappear. These are trend manipulations. Four years from now, trend models will have to be adjusted again, but the inflection will still be there. The argument will be over just how much excess death there indeed is... but I anticipate that it will be moot. If you have a strong inflection like this and your trend analysis makes it disappear quickly, odds are the trend analysis is biased.

As a result, I don't have either a Jan 2018 high or 2019 lowering bias in my baseline.

I don't understand how his method of calculating a linear baseline is different from a regular linear regression or how he accounted for the low number of deaths in 2019, but he banned me before I was able to ask him to clarify it.

There was an exceptionally low number of deaths in 2019 and there was a high number of deaths in early 2018, so if you fit a linear baseline using only data from the years 2018 and 2019, the slope of the baseline will be too low (but I'm not sure to what extent Ethical Skeptic is able to adjust the slope of the baseline using his undocumented methodology):

Ethical Skeptic included the following point on his list of crimes committed by other analysts: "16. Add volatility of peak winter mortality periods into the trend regression, and not index off the summer base. That way a rough flu year will raise the baseline artificially to your advantage." [https://x.com/EthicalSkeptic/status/1773066903000396096] So he might omit weeks with high mortality in early 2018 when he calculates the baseline.

The plot below shows that if you fit a seasonality-adjusted baseline against only one year of weekly data so that the baseline is adjusted to match the actual deaths each week, there's 0% excess deaths during each week of the fitting period. Then the standard deviation is zero so you subsequently get infinite z-scores. And similarly if you fit a seasonality-adjusted baseline using only two years of data, you can subsequently get higher z-scores than with a 4-year baseline. The plot also demonstrates that shorter seasonality-adjusted baselines tend to produce a lower standard deviation for the residuals during the fitting period, and they also tend to produce higher z-scores after the fitting period:

library(ggplot2)

isoweek=\(year,week,weekday=1){d=as.Date(paste0(year,"-1-4"));d-(as.integer(format(d,"%w"))+6)%%7-1+7*(week-1)+weekday}
ma=\(x,b=1,f=b)rowMeans(embed(c(rep(NA,b),x,rep(NA,f)),f+b+1),na.rm=T)

t=read.csv("https://www.mortality.org/File/GetDocument/Public/STMF/Outputs/NZL_NPstmfout.csv")|>subset(Sex=="b")
t$date=isoweek(t$Year,t$Week,4)

xy=do.call(rbind,lapply(c(2011,2012,2014),\(i){
  pred=t$Year%in%2011:i
  trend=predict(lm(Total~date,t[pred,]),t)
  weekly=tapply(t$Total[pred]-trend[pred],t$Week[pred],mean)
  ma(rep(weekly,3),0)[53:154]
  o=data.frame(x=t$date,z=paste0("2011-",i),dead=t$Total)
  o$trend=trend+weekly[pmin(t$Week,52)]
  o$y=ma(o$dead-o$trend,1)
  o$sd=sd(o$dead[pred]-o$trend[pred])
  o}))

xstart=as.Date("2011-1-1");xend=as.Date("2016-1-1")
xy=xy[xy$x>=xstart&xy$x<=xend,]

xbreak=seq(xstart,xend,"6 month")
xlab=c(rbind("",2011:2015),"")
cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymax=max(xy$y,na.rm=T);ymin=min(xy$y,na.rm=T)
ystep=cand[which.min(abs(cand-(ymax-ymin)/5))]
ystart=ystep*floor(ymin/ystep)
yend=ystep*ceiling(ymax/ystep)
ybreak=seq(ystart,yend,ystep)

color=c(hcl(c(210,260,310)+15,100,50))

sd=sprintf("%.1f",tapply(xy$sd,xy$z,unique))
sigma=c("Inf",sprintf("%.1f",tapply(abs(xy$y/xy$sd),xy$z,max))[-1])
lab=paste0(unique(xy$z)," (SD ",sd,", max abs sigma ",sigma,")")
xy$z=lab[factor(xy$z)]

tit="Weekly excess deaths in New Zealand. The baseline is a linear regression which was adjusted for seasonality by for example adding the average difference between the actual deaths and the baseline on each week 1 to the baseline for week 1, and similarly for other weeks."

ggplot(xy,aes(x=x,y=y,color=z))+
geom_hline(yintercept=c(ystart,0,yend),linewidth=.25)+
geom_vline(xintercept=c(xstart,xend),linewidth=.25)+
geom_line(aes(color=z),linewidth=.3)+
labs(title=stringr::str_wrap(tit,85),x=NULL,y=NULL)+
coord_cartesian(clip="off",expand=F)+
scale_x_date(limits=c(xstart,xend),breaks=xbreak,labels=xlab)+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=color)+
guides(colour=guide_legend(title="Fitting period",override.aes=list(linewidth=.4)))+
theme(axis.text=element_text(size=6.5,color="black"),
  axis.ticks=element_line(linewidth=.25),
  axis.ticks.x=element_line(color=alpha("black",c(1,0))),
  axis.ticks.length=unit(.15,"lines"),
  axis.title=element_text(size=8),
  legend.background=element_blank(),
  legend.box.just="left",
  legend.key=element_rect(fill="white"),
  legend.spacing.x=unit(.15,"lines"),
  legend.key.size=unit(.8,"lines"),
  legend.position=c(0,0),
  legend.justification=c(0,0),
  legend.box.background=element_rect(fill=alpha("white",.85),color="black",linewidth=.25),
  legend.margin=margin(.3,.4,.3,.4,"lines"),
  legend.text=element_text(size=7,vjust=.5),
  legend.title=element_text(size=8),
  panel.background=element_rect(fill="white"),
  panel.grid=element_blank(),
  plot.background=element_rect(fill="white"),
  plot.margin=margin(.4,.65,.4,.4,"lines"),
  plot.caption=element_text(size=6.7,hjust=0,margin=margin(.6,0,0,0,"lines")),
  plot.title=element_text(size=7.8))
ggsave("1.png",width=4.6,height=3.4,dpi=450)

Ethical Skeptic has published three articles in a series called "Houston We Have a Problem" where he explains some of his methodology. [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/] In the third part he included this image which described how he calculates his baseline, but it sounds like he's just doing a regular linear regression and he's not applying any special method to account for the low number of deaths in 2019 or for winter peaks with an unusually high number of deaths. Even though here he seems to desrcibe the methodology he uses for his plots that start from 2014, and it could be that he uses different methodology in other types of plots where he calculates the baseline using data from 2018 and 2019 alone:

COVID deaths per capita compared to percentage of vaccinated population in US counties

Ethical Skeptic posted these tweets: [https://x.com/EthicalSkeptic/status/1531517512906592256]

Covid Deaths per 100 K population trend down as the percentage of the population vaccinated rises.

Vaccines working right?

No, this is the function which existed prior to the vaccines being rolled out (22 Jan 2020 - 4 Apr 2021).

Watch for charlatans exploiting this effect.

In actuality, when we compare both periods before & after the launch of the vax (MMWR Wk 14 2021) - we observe that there exists no difference between pre & post vax death rates by US county. It is EXACTLY the same function pre & post vaccine.

Any benefit MUST show here.

However for some reason he extended his pre-vaccination period up to April 4th 2021, even though about 30% of the US population was already vaccinated at that point (and the majority of people were vaccinated in the elderly age groups which account for most COVID deaths):

$ curl https://data.cdc.gov/api/views/rh2h-3yt2/rows.csv>statesvax.csv
$ csvtk filter2 -f '$Date=="04/03/2021"&&$date_type=="Report"&&$Location=="US"' statesvax.csv|csvtk cut -f Date,Location,Administered_Dose1_Pop_Pct
Date,Location,Administered_Dose1_Pop_Pct
04/03/2021,US,31.4

He also didn't explain how he calculated the trend line, and it looks too low near the end of the x-axis, and for some reason he only drew the trend up to the point where the value of the x-axis was about 85%.

I tried plotting the percentage of vaccinated population on March 1st 2022 on the x-axis like Ethical Skeptic, but I compared COVID deaths in 2020 against COVID deaths in 2021-2022 so that I didn't include the first three months of 2021 as part of the pre-vaccination period. My trend was a lot steeper in 2021-2022 than 2020. And even in 2020 the overall trend looks more flat in my plot, because I drew the trend up to the end of the x-axis and I didn't smooth out the trend as heavily as Ethical Skeptic (or maybe his trend was just a polynomial curve or something). When I weighted the trend by population size, it got even flatter because larger counties have a higher percentage of vaccinated people than smaller counties, and in 2020 there were many large northeastern counties that had a high number of COVID deaths per capita:

library(data.table);library(ggplot2)

# download.file("https://github.com/nytimes/covid-19-data/raw/master/us-counties-2020.csv","us-counties-2020.csv")
# download.file("https://github.com/nytimes/covid-19-data/raw/master/us-counties-2022.csv","us-counties-2022.csv")
# download.file("https://www2.census.gov/programs-surveys/popest/datasets/2020-2022/counties/totals/co-est2022-alldata.csv","uscountypop.csv")

# download manually:
# https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-County/8xkx-amqh/about_data

t=fread("COVID-19_Vaccinations_in_the_United_States_County_20240227.csv")
t=t[Date=="03/01/2022",c("FIPS","Administered_Dose1_Recip")]
colnames(t)=c("fips","vax")
t$fips=as.integer(t$fips)

pop=fread("uscountypop.csv")[,.(fips=as.integer(sprintf("%d%03d",STATE,COUNTY)),pop=POPESTIMATE2022)]

nyt=fread("us-counties-2022.csv")[date=="2022-12-31",.(fips,deaths)]
nyt2=fread("us-counties-2020.csv")[date=="2020-12-31",.(fips,deaths)]
nyt=merge(nyt,nyt2,by="fips")[,.(fips,deaths=(deaths.x-deaths.y)/2)]|>na.omit()

states=read.csv("https://github.com/cphalpert/census-regions/raw/master/us%20census%20bureau%20regions%20and%20divisions.csv")[,c(2,3)]
fips=read.csv("https://gist.githubusercontent.com/dantonnoriega/bf1acd2290e15b91e6710b6fd3be0a53/raw/11d15233327c8080c9646c7e1f23052659db251d/us-state-ansi-fips.csv",strip.white=T)[,c(3,2)]
states=merge(states,fips,by=1)
states=setNames(states[,2],states[,3])

me=merge(merge(t,nyt),pop)[vax>0]|>na.omit()
xy=data.frame(x=pmin(100,me$vax/me$pop*100),y=pmin(500,me$deaths/me$pop*1e5),pop=me$pop)
xy$size=pmax(.3,log(me$pop,20)-2.8)
xy$region=factor(states[sub("...$","",me$fips)],c("Northeast","Midwest","South","West"))

xstart=10;xend=100;xstep=10;ystart=0;yend=500;ystep=100
xbreak=seq(xstart,xend,xstep)
ybreak=seq(ystart,yend,ystep)

smooth1=as.data.frame(predict(smooth.spline(xy$x,xy$y,spar=.85,w=xy$pop/max(xy$pop)),seq(0,100,.1)))
smooth2=as.data.frame(predict(smooth.spline(xy,spar=.85),seq(0,100,.1)))

leg1=data.frame(x=11.4,y=seq(yend*.94,,-yend/15,2),label=c("Smoothed spline (weighted by population size, spar=.85)","Smoothed spline (not weighted, spar=.85)"))
leg2=data.frame(x=xend-1.4,y=seq(yend*.94,,-yend/15,4),label=levels(xy$region))

pal1=c("#ee66aa","#993377")
pal2=c(hcl(225,100,50),hcl(55,50,50),"black",hcl(135,100,50))

ggplot(xy,aes(x,y))+
geom_label(data=leg1,aes(x=x,y=y,label=label),fill=alpha("white",.85),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,size=2.3,hjust=0,color=pal1)+
geom_label(data=leg2,aes(x=x,y=y,label=label),fill=alpha("white",.85),label.r=unit(0,"lines"),label.padding=unit(.04,"lines"),label.size=0,size=2.3,hjust=1,color=pal2)+
geom_vline(xintercept=c(xstart,xend),linewidth=.25,lineend="square")+
geom_hline(yintercept=c(ystart,yend),linewidth=.25,lineend="square")+
geom_point(aes(color=region),stroke=0,size=xy$size,alpha=.3)+
geom_line(data=smooth1,color=pal1[1],linewidth=.35)+
geom_line(data=smooth2,color=pal1[2],linewidth=.35)+
labs(x="Percentage of vaccinated population on March 1st 2022",y="COVID deaths per 100k person-years in 2020",title="US counties: COVID deaths vs percentage of vaccinated population on March 1st 2022",caption="Point size indicates population size. Y-axis values are truncated to 500. Population estimates are for mid-2022. Sources:
data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-County/8xkx-amqh/about_data,
www2.census.gov/programs-surveys/popest/datasets/2020-2022/counties/totals/co-est2022-alldata.csv,
github.com/nytimes/covid-19-data.")+
scale_x_continuous(limits=c(xstart,xend),breaks=xbreak)+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=pal2)+
coord_cartesian(clip="off",expand=F)+
theme(axis.text=element_text(size=5.5,color="black"),
  axis.ticks=element_line(linewidth=.25),
  axis.ticks.length=unit(.15,"lines"),
  axis.title=element_text(size=6.5),
  legend.position="none",
  panel.grid=element_blank(),
  panel.background=element_rect(fill="white"),
  plot.margin=margin(.4,.8,.4,.4,"lines"),
  plot.caption=element_text(size=6.3,hjust=0),
  plot.title=element_text(size=7.3))
ggsave("1.png",width=5.2,height=3.1,dpi=450)

Deaths from malignant neoplasms with excess MCoD normalization

Ethical Skeptic wrote: [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/]

The green-to-red line in Chart 5 below shows the CDC Wonder Excess Attributions of underlying cause of death Cancer to multiple cause of death Cancer, per week for the two years prior, as well as years-of and years-post the pandemic. The dotted baseline is normalized from the years 2018-2019, and indexed to the last 7 week average as compared to those same 7 weeks of 2018/19, so it is not a ‘regression’. Since this is a relative index, it should exhibit no trend or growth rate (outside the context of Covid mortality peak periods of course). The current excess of 321 deaths reflects cancer deaths which are concealed from the underlying cause of death ICD code, and must be added back into the mix in order to make Excess Cancer Mortality comparable to its past baseline. We reconcile this into Chart 6 below, applying it only after the pandemic period ended.


Chart 5 - Cancer Concealment by Excess MCOD Attribution - stands at 394 concealed cancer deaths per week. In this practice, cancers are moved from the underlying cause of death line in Part I of the death certificate, to one of the multiple cause of death lines above the UCoD line, in excess of past practice. Once Covid has abated, this balance should trend back to zero. It has not. Therefore, in order to capture the true cancer mortality increase, these cancer deaths falsely attributed to Covid and other underlying causes must be added back into the total in the Excess Cancer Mortality chart below. These missing deaths are phased back in from Week 10 of 2022 (point of large decline in Covid mortality) onward.

The plot above uses a 2018-2019 average baseline even though there is an increasing trend in cancer deaths per year, so the baseline is too low especially towards the right end of the x-axis. However the increasing trend is not visible from the plot above, because it only displays two years before COVID, and 2019 had a low number of deaths and early 2018 had a high number of deaths.

Next ES showed this plot where he incorporated his method of "excess MCoD normalization":


Chart 6 - Excess Cancer Mortality (USA) - stands at 5.9% or 747 persons per week over the last three weeks on average. More alarmingly however, is the novel compound annualized growth rate (CAGR) of 2.2%, a significant departure from the old growth rate of 0.23%, normalized from the years 2014-2019 (dark orange line). This chart is expertly adjusted for reporting lag (procedure here), Pull Forward Effect (345 week PFE method is depicted in Charts 3b and 6), and excess MCoD attributions (critical data mining method depicted in Chart 5 above). In other words it is showing the true cancer UCoD rate (shorted by pressure on physicians to not report cancers unless 'proven' by multiple tests). Despite all these shorts in the data collection, the raw (and misleading) UCoD data nonetheless shows a rise - which can be seen by clicking here.

I believe the plot above displays UCoD deaths until week 10 of 2022, starting from which it also incorporates a fraction of MCoD deaths, where the fraction increases until week 20 of 2023 after which it remains at the maximum level indefinitely. So is this method of "excess MCoD normalization" justified?

From my plot below you can see that the blue UCoD cancer deaths remained near the baseline throughout 2020-2023. The bright green MCoD cancer deaths remain above the baseline from 2020 onwards, but they dip below the baseline in mid-2023 (which is partially because it's summer and my baseline is not adjusted for seasonal variation in mortality, but it's also because in mid-2023 COVID deaths fell to the lowest point since 2020, as you can see from the red line at the bottom). However from the dark green line which shows deaths with MCoD cancer but not MCoD COVID, you can see that the number of deaths with both MCoD cancer and MCoD COVID is much higher than the total excess deaths with MCoD cancer alone. And even in 2023 there was a large number of deaths with both MCoD cancer and MCoD COVID:

And from the next plot which shows the COVID deaths on the same scale as the cancer deaths, you can see that the spikes in MCoD cancer deaths coincide with spikes in COVID deaths in January 2022 and in August to September 2021:

Does the provisional dataset have different suppression behavior than the final dataset?

Ethical Skeptic wrote: "I cannot use Wonder 2017 and earlier because the numbers are boosted by the small county single-record effect. This is why Wonder 2018 and beyond is a new database (shows as a variance in your chart which does not actually exist), under that HIPAA constraint." [https://theethicalskeptic.substack.com/p/the-state-of-things-pandemic-week-706/comment/53292593]

He also wrote: "Please note however, that the provisional mortality figures for 2018-2024 are lower than the figures prior to 2018 because the NVSS suppresses county level data with fewer than nine records. We do not adjust 2018 and later mortality figures upward for this. Therefore, these excess mortality projections are lower than the reality as a result. For this reason, all inflection and DFT charts begin with the 2018 suppressed data, in order to avoid false 'downtrends' or inflections in the data." [https://theethicalskeptic.com/2024/04/04/the-state-of-things-pandemic-week-50-2023/]

The link in the text above was a screenshot of this part of the methodology article about the provisional deaths dataset: [https://wonder.cdc.gov/wonder/help/mcd-provisional.html]

1. What are the Assurance of Confidentiality constraints for vital statistics?

All statistics representing one through nine (1-9) persons are suppressed, in the provisional mortality online database for years 2018 and later.

Data reports for years 1989 and later must meet the NCHS data use restrictions. Vital statistics data are suppressed due to confidentiality constraints, in order to protect personal privacy. The term "Suppressed" replaces sub-national death counts, births counts, death rates and associated confidence intervals and standard errors, as well as corresponding population figures, when the figure represents one to nine (1-9) persons.

ES may have been confused by the text which said that the deaths are suppressed "in the provisional mortality online database for years 2018 and later", as if there would have been a change in policy in the year 2018, even though I think it just means that the deaths are suppressed throughout the provisional mortality dataset which starts in 2018. Similar text is also included in the methodology article for the final mortality data in 1999-2020, which says that the suppression begins in the year 1999: "The term 'Suppressed' replaces death counts, births counts, death rates and associated confidence intervals and standard errors, as well as corresponding population figures, when the figure represents one to nine (1-9) persons for deaths in 1999 and later years." [https://wonder.cdc.gov/wonder/help/mcd.html]

CDC WONDER suppresses the number of deaths on rows with 1-9 deaths both in the datasets for provisional and final deaths, and I haven't noticed any differences in the suppression behavior between the datasets. For example I did a request for provisional mortality statistics here: https://wonder.cdc.gov/mcd-icd10-provisional.html. I set "group by" to residence county and month, I set the state to Rhode Island, I set the year to 2020, and I set underlying cause to U07.1 (COVID). Rhode Island has 5 counties, but only 22 out of 60 rows were shown because the number of deaths was suppressed or zero on other rows. The total number of deaths was 1511. However when I removed the option to group the results by county, now the total number of deaths increased to 1580, because now no deaths were suppressed apart from 5 deaths in March. And I got identical figures when I repeated the experiment using the 1999-2020 final dataset instead of the 2018-2024 provisional dataset. So the number of deaths in small counties is not suppressed in aggregate statistics.

If there was some major change to the data suppression behavior in 2018, then I think it would be highlighted more clearly in the documentation.

ES also wrote that the reason why "Wonder 2018 and beyond is a new database" was because of the change to the suppression behavior. But actually it might be because the data from 2018 onwards includes more race categories than the earlier data, and it includes weekly data in addition to monthly data. There's also a separate dataset for 1999-2004 which only includes 3 different race groups, where monthly deaths are missing, and where Hispanic origin is not indicated:

The natality data at CDC WONDER is also split off into different datasets for different ranges of years, and the natality data also has separate databases for provisional and final data: [https://wonder.cdc.gov/natality.html]

Deaths associated with pregnancy and childbirth

Ethical Skeptic posted this plot where he implied that a spike in pregnancy-associated deaths which coincided with the Delta spike were caused by vaccines, and that subsequently a reduced number of deaths could've been because the birth rate was reduced by vaccines: [https://x.com/EthicalSkeptic/status/1778268480619331786]

Not particularly good news here.

1. We don't use the 'placental disorder unspecified' like the UK.

2. Expanding query to all Maternal deaths O00-O99 for Pregnancy, Childbirth & Puerperium reveals

a. large vaccine impact
b. distressing news about implied childbirth rate

Ethical Skeptic told me that in these plots which display weekly deaths from 2018 onwards, he fits the baseline using only data from 2018 and 2019. He said that his baseline is linear but it's somehow different from a least squares linear regression, but I don't know how. [https://theethicalskeptic.substack.com/p/the-state-of-things-pandemic-week-706/comment/53237226]

The year 2019 had an exceptionally low number of deaths, so normally if you would fit a linear baseline using only data from 2018 and 2019, the slope of the baseline would be pointed too much downwards. However in the case of deaths with an underlying cause related to pregnancy and childbirth, 2019 actually had a higher number of deaths than 2018, so if you fit a linear baseline based on deaths from 2018 and 2019 only, it has a too high slope compared to the long-term trend. So it probably explains why the plot by ES had a steadily decreasing trend in excess deaths in 2022-2023, because the same can be seen in my plot below when the excess deaths were calculated relative to the 2018-2019 trend, even though my excess deaths were roughly flat during the same period when I used a 2013-2019 trend instead:

library(ggplot2)

t=read.csv("http://sars2.net/f/wonderpregnancymonthly.csv")

x=as.Date(paste0(t$year,"-",t$month,"-15"))
xy=data.frame(x,y=t$dead,z="Actual deaths")

trend1=predict(lm(y~x,xy[data.table::year(xy$x)%in%2018:2019,]),xy)
trend2=predict(lm(y~x,xy[data.table::year(xy$x)%in%2013:2019,]),xy)

rbd=\(x,...)rbind(x,setNames(data.frame(...),names(x)))
xy=rbd(xy,x,trend1,"2018-2019 linear regression")
xy=rbd(xy,x,trend2,"2013-2019 linear regression")
xy=rbd(xy,x,t$dead-trend1,"Excess deaths (2018-2019)")
xy=rbd(xy,x,t$dead-trend2,"Excess deaths (2013-2019)")

xy$z=factor(xy$z,unique(xy$z))

cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymax=max(xy$y,na.rm=T);ymin=min(xy$y,na.rm=T)
ystep=cand[which.min(abs(cand-(ymax-ymin)/6))]
yend=ystep*ceiling(ymax/ystep)
ystart=ystep*floor(ymin/ystep)
ybreak=seq(ystart,yend,ystep)

xstart=as.Date("1999-1-1");xend=as.Date("2024-1-1")
xbreak=seq(xstart,xend,"6 month")
xlab=c(rbind("",seq(1999,2023)),"")

color=c("black",hcl(210,60,60),hcl(260,100,40))

kim=\(x)ifelse(x>=1e3,ifelse(x>=1e6,paste0(x/1e6,"M"),paste0(x/1e3,"k")),x)

ggplot(xy,aes(x,y))+
geom_vline(xintercept=seq(xstart,xend,"year"),color="gray93",linewidth=.25,lineend="square")+
geom_vline(xintercept=as.Date(paste0(c(2013,2018,2020),"-1-1")),color="gray50",linetype=2,linewidth=.25,lineend="square")+
geom_hline(yintercept=c(ystart,0,yend),color="black",linewidth=.25,lineend="square")+
geom_vline(xintercept=c(xstart,xend),color="black",linewidth=.25,lineend="square")+
geom_line(aes(y=y,linetype=z,color=z,alpha=z,linewidth=z))+
labs(title=stringr::str_wrap("CDC WONDER: Monthly deaths with underlying cause O00-O99 (Pregnancy, childbirth and the puerperium)",70),x=NULL,y=NULL)+
coord_cartesian(clip="off",expand=F)+
scale_x_continuous(limits=c(xstart,xend),breaks=xbreak,labels=xlab)+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=color[c(1,2,3,2,3)])+
scale_linetype_manual(values=c(1,2,2,1,1))+
scale_linewidth_manual(values=c(.3,.3,.3,.23,.23))+
scale_alpha_manual(values=c(1,1,1,.6,.6))+
theme(axis.text=element_text(size=7,color="black"),
  axis.ticks.x=element_line(color=alpha("black",c(1,0))),
  axis.text.x=element_text(angle=90,vjust=.5,hjust=1),
  axis.ticks=element_line(linewidth=.25,color="black"),
  axis.ticks.length=unit(.2,"lines"),
  axis.title=element_text(size=8),
  legend.background=element_blank(),
  legend.box.just="left",
  legend.key=element_rect(fill="white"),
  legend.spacing.x=unit(.15,"lines"),
  legend.key.size=unit(.65,"lines"),
  legend.key.width=unit(.8,"lines"),
  legend.position=c(0,1),
  legend.justification=c(0,1),
  legend.box.background=element_rect(fill="white",color="black",linewidth=.3),
  legend.margin=margin(-.2,.3,.2,.3,"lines"),
  legend.text=element_text(size=7,vjust=.5),
  legend.title=element_blank(),
  panel.background=element_rect(fill="white"),
  plot.margin=margin(.4,.4,.4,.4,"lines"),
  plot.title=element_text(size=7.7,margin=margin(.1,0,.6,0,"lines")))
ggsave("1.png",width=4.2,height=2.8,dpi=450)

The biggest spike in deaths was in the third quarter of 2021, so it's probably associated with the Delta wave in August-September 2021 when there was the highest excess mortality in southeastern states. My plot below also shows that southern states had by far the highest excess mortality in the third quarter of 2021. But in northeastern states which had the highest percentage of vaccinated people, the excess mortality in 2021 was by far the lowest. Northeastern states also had 50% excess deaths in the second quarter of 2020, which might be due to COVID:

My plot above shows that there were negative excess MCoD deaths throughout 2020-2023 and even in the first quarter of 2020. I don't know why, but there might have been some change to coding practices. I don't think I made any error either, because I got the similar results when I looked at yearly provisional data in the whole US, where 2018 and 2019 had a much higher number of deaths than 2020:

Here's R code for generating the above heatmap:

library(colorspace)

t=read.csv("http://sars2.net/f/wonderpregnancyregionquarter.csv")
y=read.csv("http://sars2.net/f/wonderpregnancyregionyearly.csv")|>subset(year%in%2013:2019)

dim=data.frame(cause=y$cause,region=factor(y$region,unique(y$region)),date=y$year)
dim=rbind(dim,data.frame(cause=t$cause,region=t$region,date=paste0(t$year,"Q",t$quarter)))

a=tapply(c(y$dead,t$dead),dim,sum)

for(i in 1:2){
  m=a[i,,]
  m[,1:7]=m[,1:7]/4
  ave=rowMeans(m[,1:7])
  disp=round((m/ave-1)*100)
  m=(m-ave)/ifelse(m>ave,ave,m)

  maxcolor=2
  pal=hex(HSV(c(210,210,210,210,0,0,0,0,0),c(1,.8,.6,.3,0,.3,.6,.8,1),c(.3,.65,1,1,1,1,1,.65,.3)))

  pheatmap::pheatmap(m,filename=paste0("i",i,".png"),display_numbers=disp,
    cluster_rows=F,cluster_cols=F,legend=F,cellwidth=18,cellheight=18,fontsize=9,fontsize_number=8,
    border_color=NA,na_col="gray90",
    number_color=ifelse(abs(m)>maxcolor*.6,"white","black"),
    breaks=seq(-maxcolor,maxcolor,,256),
    colorRampPalette(pal)(256))
}

system("f=i1.png;w=`identify -format %w $f`;convert -interline-spacing -2 -gravity northwest -font Arial -pointsize 38 -size $[w-44]x \\( -splice 22x14 caption:'CDC WONDER: Excess percentage of deaths for cause O00-O99 (Pregnancy, childbirth and the puerperium). The baseline is the average number of deaths per quarter in 2013-2019, and all quarters have the same baseline which is not adjusted for seasonal variation in mortality.' \\) -pointsize 38 -font Arial-Bold \\( -splice 22x14 caption:'Underlying cause of death' \\) i1.png \\( -splice 22x14 caption:'Multiple cause of death' \\) i2.png -append 1.png")

The plot below shows that in the southern states which had a high number of pregnancy-associated deaths in the third quarter of 2021, deaths with UCoD COVID also accounted for a large percentage of all deaths in ages 15-44. I included both sexes in the plot, because females alone would've had some months with less than 10 deaths per census region so the number of deaths would've been suppressed:

The plot below also shows that the percentage of vaccinated people was the highest in the northeastern states which had the lowest percentage of excess deaths associated with pregnancy in 2021, but in the southern states which had a high percentage of excess deaths in the third quarter of 2021, the percentage of vaccinated people was much lower (R code). So if the deaths were caused by the vaccines, then why wasn't there a higher percentage of excess deaths in the states where more people were vaccinated?

Ethical Skeptic speculated that the reason why his plot had a decrease in excess deaths associated with childbirth in 2022 and 2023 could've been because there was a decrease in birth rate. However in the plot below the raw number of births was about 2.9% below the 2016-2019 linear trend in 2020 and 0.3% below the trend in 2021, but in 2022 and 2023 it was actually above the trend:

t=read.csv("http://sars2.net/f/wondernatality.csv")
t$date=as.Date(paste0(t$year,"-",t$month,"-1"))
t$trend=predict(lm(born~date,subset(t,date<as.Date("2020-1-1"))),t)

yearly=aggregate(t[,c("born","trend")],t[,1,drop=F],sum)

setNames(round((yearly$born/yearly$trend-1)*100,1),yearly$year)
# 2016 2017 2018 2019 2020 2021 2022 2023 2024
#  1.1 -0.1 -0.5 -0.5 -2.9 -0.3  1.0  0.3 -3.0

png("1.png",1800,1100,res=300)
par(mar=c(3.6,2.3,2.1,.8),mgp=c(0,.6,0))
tit="CDC WONDER: births per month"
sub="Source: wonder.cdc.gov/natality.html"
leg=c("Births per month","2016-2019 linear trend")
col=c("black","#dd7700")
plot(t$date,t$born,type="l",col=col[1],main=tit,xlab=NA,ylab=NA,lwd=1.5)
axis(1,at=seq(min(t$date),max(t$date),"year"),labels=2016:2024)
lines(t$date,t$trend,type="l",col=col[2],lwd=1.5)
legend("topright",legend=leg,col=col,lty=1,lwd=1.5)
mtext(text=sub,side=1,line=2)
dev.off()

Cancer ASMR compared to historical ASMR up to 1950

Ethical Skeptic posted this tweet: [https://x.com/EthicalSkeptic/status/1803267649385668791]

I've taken a look at the cancer long term Age Standardized Per Capita stats, aannd....

SV40 cancers only took 50 years to work their way out of the population. Just a blip on an epochal chart really.

Just a minor spike the last 3 years as well. Nothing to worry about.

Really. Honestly. For sure... 🤥

The plot above used a dataset for historical cancer ASMR that was only available for paid members of Statista.com, so I wasn't able to download it. The description of the dataset said: [https://www.statista.com/statistics/184566/deaths-by-cancer-in-the-us-since-1950/]

This statistic was assembled from several editions of "Health, United States".
* Malignant neoplasms.
Numbers are for malignant neoplasms. All rates are age-adjusted. 5 Age-adjusted rates are calculated using the year 2000 standard population. Prior to 2001, age-adjusted rates were calculated using standard million proportions based on rounded population numbers. Starting with 2001 data, unrounded population numbers are used to calculate age-adjusted rates.

But anyway, I don't understand how Ethical Skeptic got higher ASMR in 2023 than 2022, because I got lower ASMR in 2022 and 2023 than 2021. And I already got a big increase in 2020 if I looked at all neoplasms instead of only malignant neoplasms:

Ethical Skeptic may have applied some of his usual adjustments to his plot, like including part of deaths where the cause has not yet been resolved under cancer deaths. Or maybe he also added in deaths that he estimated to be missing because of a registration delay, because he got even higher ASMR in 2024 than 2023.

CDC WONDER doesn't return population estimates for ages 85 and above, so in my plot above I took mid-year population estimates by single year of age from these files: https://www2.census.gov/programs-surveys/popest/datasets/2010-2020/national/asrh/nc-est2020-agesex-res.csv, https://www2.census.gov/programs-surveys/popest/datasets/2020-2023/national/asrh/nc-est2023-agesex-res.csv (res stands for resident population).

In the plot below I calculated ASMR using ten-year age groups instead of single-year age groups, and I used population estimates from CDC WONDER which uses 2023 population estimates for 2022 which slightly elevates the ASMR in 2023. WONDER also uses vintage 2020 population estimates for 2020, vintage 2021 estimates for 2021, and so on, but my previous plot used the vintage 2023 estimates for 2020 and 2021 instead. However I'm still not getting any massive increase in ASMR after 2020:

Baseline that started from the 1970s for CMR in New Zealand

One of the terms invented by Ethical Skeptic is a "paltered" baseline, which means that 2020 or later years are included in the fitting period of a baseline that is used to calculate excess deaths during COVID.

OWID says that the fitting period of their baseline consists of the years 2015 to 2019: "We use an estimate produced by Ariel Karlinsky and Dmitry Kobak as part of their World Mortality Dataset (WMD). To produce this estimate, they first fit a regression model for each region using historical deaths data from 2015–2019. They then use the model to project the number of deaths we might normally have expected in 2020–2024. Their model can capture both seasonal variation and year-to-year trends in mortality." [https://ourworldindata.org/excess-mortality-covid#how-is-excess-mortality-measured]

However for some reason in this tweet Ethical Skeptic said that the baseline used by OWID was "paltered" even though it doesn't fit his usual definition of what paltered means: [https://twitter.com/EthicalSkeptic/status/1720267405883027598]

In the plot above the A baseline roughly looks like it's fitted against the last 5-year segment displayed in the plot (or it's hard to tell if the last 5-year segment is actually a 4-year segment like in my plot below). But OWID's baseline is not even calculated based on CMR but based on the raw number of deaths. His plot would've been more clear if it would've showed the CMR for each year and not for the 5-year blocks, and if he would've calculated the 2015-2019 linear trend in deaths based on the raw number of deaths and then converted it to CMR by dividing it by the population size, so it would've been closer to the baseline that is actually used by OWID.

I wasn't able to reproduce the plot above perfectly, but I think he used a method similar to this to calculate CMR for the 5-year blocks:

In Ethical Skeptic's plot the B baseline looks closer to my lighter blue dashed line which was fitted against the 5-year blocks for 1970-2023 than my darker blue dashed line which was fitted against the 5-year blocks for 1970-2019 (or maybe it's somewhere in between). I don't know if he just drew the B baseline by hand, but it doesn't look steep enough to be fitted against data from 1970-2019 but it looks more like it's fitted against data up to the present in his plot. (Which is not really an important point, but it would be funny if he was simeultaneously guilty of paltering his own baseline while he falsely accused OWID of paltering.)

For some reason the y-axis of Ethical Skeptic's plot says "deaths per 1000 people" but each point has over 6,000 deaths, so I don't know if he meant deaths per million people.

But in any case, the age structure of the New Zealand population has completely changed since the 70s, so it doesn't make any sense to just use a linear baseline for CMR that starts from the 70s. I only had population data that went back to 1992, but the change since the 70s would've been even more dramatic:

library(data.table)

agecut=\(x,y)cut(x,c(y,Inf),paste0(y,c(paste0("-",y[-1]-1),"+")),T,F)

t=fread("http://sars2.net/f/nzpopdead.csv")

m=t[,.(pop=sum(pop,na.rm=T)),.(age=agecut(age,0:9*10),year)][year>=1992,xtabs(pop~age+year)]
m=t(t(m)/colSums(m))*100

maxcolor=max(m)

pal=colorRampPalette(hsv(c(21,21,21,16,11,6,3,0,0,0)/36,c(0,.25,rep(.5,8)),c(rep(1,8),.5,0)))(256)

disp=ifelse(m<10,sprintf("%.1f",m),round(m))

pheatmap::pheatmap(m,filename="0.png",display_numbers=disp,
  cluster_rows=F,cluster_cols=F,legend=F,cellwidth=14,cellheight=14,fontsize=9,fontsize_number=8,
  border_color=NA,
  number_color=ifelse(m>maxcolor*.8,"white","black"),
  breaks=seq(0,maxcolor,,256),pal)

system("w=`identify -format %w 0.png`;pad=20;convert -gravity northwest -pointsize 42 -font Arial \\( -size $[w-pad] caption:'Percentage of each age group in New Zealand population' -splice 20x24 \\) 0.png -append 1.png")

In the plot below I think my green baseline is the most accurate but it's far above the actual CMR in 2020:

library(data.table);library(ggplot2);library(stringr)

t=fread("http://sars2.net/f/nzpopdead.csv")
a=t[,.(dead=sum(dead),pop=sum(pop,na.rm=T)),.(age=pmin(95,age%/%5*5),year)]

nzcmr=fread("http://sars2.net/f/nzcmr.csv")
p=nzcmr[year>=1970,.(x=year,y=cmr,z="Actual CMR")]

lm=a[year%in%2010:2019,.(year=2010:2023,lm=predict(lm(dead/pop~year),.(year=2010:2023))),age]
p=rbind(p,merge(a,lm)[,.(dead=sum(lm*pop),pop=sum(pop)),year][,.(y=dead/pop*1e5,z="2010-2019 trend in CMR for 5-year age groups multiplied by population size"),.(x=year)])

p=rbind(p,nzcmr[year%in%1970:2019,.(x=1970:2023,y=predict(lm(cmr~year),.(year=1970:2023)),z="1970-2019 linear trend in CMR (like Ethical Skeptic's B baseline)")])

owid=a[,.(dead=sum(dead)),year][year%in%2015:2019,.(year=2015:2023,owid=predict(lm(dead~year),.(year=2015:2023)))]
p=rbind(p,merge(a[,.(pop=sum(pop)),year],owid)[,.(x=year,y=owid/pop*1e5,z="2015-2019 linear trend in deaths (like OWID)")])

p$z=factor(p$z,unique(p$z))

xstart=1970;xend=2023
p=p[x>=xstart&x<=xend]

cand=c(sapply(c(1,2,5),\(x)x*10^c(-10:10)))
ymin=min(p$y,na.rm=T);ymax=max(p$y,na.rm=T)
ystep=cand[which.min(abs(cand-(ymax-ymin)/6))]
# yend=ystep*ceiling(ymax/ystep)
yend=ymax
ystart=ystep*floor(ymin/ystep)
ybreak=seq(ystart,yend,ystep)
ystart=ymin

color=c("black","#00bb00","#0055ff","#ee0000",hsv(30/36,1,.8))

cap="    Source: infoshare.stats.govt.nz/Default.aspx."
cap=paste0(cap,"\n    ","In the green baseline, for example the CMR in 2023 was calculated by first calculating a trend in CMR for each 5-year age group in 2010-2019, then the trend projected to 2023 was multiplied with the population size of each 5-year age group in 2023 to get the expected deaths for each age group, and the expected deaths of all age groups were added together and then divided by the population size in 2023."|>str_wrap(90))
cap=paste0(cap,"\n    ","The red line was calculated by doing a linear regression for the raw number of deaths and by then dividing the projected trend with the population size to convert it to CMR."|>str_wrap(90))

ggplot(p,aes(x,y))+
geom_vline(xintercept=c(xstart,xend),linewidth=.3,lineend="square")+
geom_hline(yintercept=c(ystart,yend),linewidth=.3,lineend="square")+
geom_line(aes(color=z),linewidth=.4)+
geom_point(data=p[grep("Actual",z)],aes(color=z),size=.5,show.legend=F)+
labs(title="Crude mortality rate in New Zealand (deaths per 100,000 people)",subtitle=cap,x=NULL,y=NULL)+
scale_x_continuous(limits=c(xstart,xend),breaks=seq(1900,2100,5))+
scale_y_continuous(limits=c(ystart,yend),breaks=ybreak)+
scale_color_manual(values=color)+
coord_cartesian(clip="off",expand=F)+
theme(axis.text=element_text(size=7,color="black"),
  axis.ticks=element_line(linewidth=.3),
  axis.ticks.length=unit(3,"pt"),
  legend.direction="vertical",
  legend.background=element_rect(fill=alpha("white",.8)),
  legend.key.height=unit(9,"pt"),
  legend.key.width=unit(15,"pt"),
  legend.key=element_rect(fill="white"),
  legend.box.spacing=unit(2,"pt"),
  legend.margin=margin(,,3),
  legend.justification=c(0,1),
  legend.position="bottom",
  legend.spacing.x=unit(1.5,"pt"),
  legend.text=element_text(size=7),
  legend.title=element_blank(),
  panel.grid.major=element_blank(),
  panel.background=element_rect(fill="white"),
  plot.margin=margin(3,5,4,4),
  plot.subtitle=element_text(size=6.7,margin=margin(0,0,4,0)),
  plot.title=element_text(size=7.5,face="bold",margin=margin(2,0,4,0)))
ggsave("0.png",width=4.2,height=3.7,dpi=400*4)
system("magick 0.png -resize 25% 1.png")

New Zealand clearly also had negative excess mortality in 2020 if you look at ASMR instead of CMR. Here I used the 2020 NZ population by single year of age as the standard population:

library(data.table)

t=fread("http://sars2.net/f/nzpopdead.csv")
a=t[,.(dead=sum(dead),pop=sum(pop,na.rm=T)),.(age=pmin(95,age),year)]
p=merge(a,a[year==2020,.(age,std=pop/sum(pop))])[year>=1992,.(asmr=sum(dead/pop*std*1e5)),year]
p$trend=p[year%in%2012:2019,predict(lm(asmr~year),.(year=p$year))]
p$poly=p[year%in%1992:2019,predict(lm(asmr~poly(year,2)),.(year=p$year))]

png("1.png",2000,1200,res=300)
par(mar=c(2.2,3,2.1,.8),mgp=c(0,.6,0),las=1)
tit="Age-standardized mortality rate in New Zealand"
leg=c("Actual ASMR","2012-2019 linear trend","1992-2019 quadaratic trend")
col=c("black","#0000ff","#00aa00")
plot(p$year,p$asmr,type="l",main=tit,xlab=NA,ylab=NA,ylim=range(p[,-1]),lwd=1.5,xaxs="i")
lines(p$year,p$trend,type="l",col=col[2],lwd=1.5)
lines(p$year,p$poly,type="l",col=col[3],lwd=1.5)
legend("topright",legend=leg,col=col,lty=1,lwd=1.5)