Logistic Regression Science R Questions

4/25/23, 12:08 PM
https://d2l.arizona.edu/content/enforced/1257946-548-2231-1ISTA116101101A-F201201A401401A/Project 4 DeLong 2018.html
Introduction
For this project you will read DeLong 2018, “A lowered probability of pregnancy in females in the USA aged 25–29
who received a human papillomavirus vaccine injection.” The general idea of this paper is that perhaps the HPV
vaccine (a vaccine given to lower the risk of genital warts and certain types of cervical cancer) could cause lowered
fertility. This paper was retracted for spurious correlation, meaning that the relationship the author found between
pregnancy and HPV vaccination might be related to other issues. I have included the letter to the editor that was
written in an effort to get the paper retracted (Shibata & Kataoka 2019) but there are some other problems that exist
in the paper as well.
DeLong 2018 downloaded data from the National Health and Nutrition Examination Survey (NHANES), a US survey
that is run every two years by the National Center for Health Statistics (NCHS) at the CDC. I have provided a .csv file
of this data in its raw format.
Your job for this project will be to:
1. Process, filter, and mutate the data in R as directed by Dr. DeLong and see if there were mistakes in how Dr.
DeLong cleaned and processed the data
2. Dr. DeLong neither analyzed nor described the accuracy of their models; you will re-run the models and assess
their accuracy
3. Look at some of the additional variables I’ve given you and consider how they might have impacted the model
interpretation
4. Consider the difference between causation, correlation, and spurious correlation and how they might be
measured by this study
Files for Downloading
Project 4 DeLong 2018 Comments.pdf
Project 4 DeLong 2018 Full Data.csv
Project 4 Letter to the Editor Re DeLong 2018.pdf (Not required, but helpful)
Trigger Warning
Be aware that the paper and the variables considered in the paper do broadly touch on fertility, pregnancies,
abortions, and miscarriages without considering whether the participants in the study wanted to get pregnant. The
paper also does treat gender as a binary, and seems to have a very traditional view of marriage and children as a goal
for all persons capable of bearing a child.
Remember that ultimately this paper was retracted (functionally, it was unpublished) as it is a pile of garbage.
A Note on Multivariate Logistic Regression
Look to Lab 11 for a refresher on multivariate regression – essentially, using more than one x variable at a time to
predict a y variable. You can also do this for other forms of regression, like logistic regression – it’s the same idea, but
with a binomial y variable. Running and interpreting logistic regression is covered in Lab 12. In R, the formula to
combine these into multivariate logistic regression is very simple:
glm(y ~ x1 + x2 + x3…, data = dataframe, family = binomial)
Just as in regular logistic regression, using the summary() function will tell you the slope estimate (positive or
negative) and the statistical significance of that variable.
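Here is a minimal sketch of that formula in action. The data are invented with set.seed() so that the block runs on its own; the names toy, x1, x2, x3, and y are placeholders, not columns from the project file.

```r
# Synthetic data: three predictors and a 0/1 outcome (all values invented).
set.seed(116)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rbinom(n, size = 1, prob = 0.5)

# By construction the outcome depends on x1 (positively) and x3 (negatively);
# x2 is pure noise.
p <- plogis(-0.5 + 1.2 * x1 - 0.8 * x3)
y <- rbinom(n, size = 1, prob = p)
toy <- data.frame(y, x1, x2, x3)

# Multivariate logistic regression: several x variables, one binary y.
fit <- glm(y ~ x1 + x2 + x3, data = toy, family = binomial)

# One row per term: slope estimate (in log-odds), standard error,
# z value, and p-value.
coefs <- summary(fit)$coefficients
print(coefs)
```

The sign of each Estimate tells you the direction of the relationship, and the Pr(>|z|) column is the p-value for that variable.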
A Note on Dummy Variables
Dummy variables are essentially variables that are forced into 0/1 status. This is often done to reduce the number of
categories, especially if those categories are nominal and not ordinal. This can be done well, but it often leads to
problems.
For a basic example, gender is often recorded as Male or Female. But Logistic Regression needs numbers, not words.
So you have to code them as 0/1 or 1/0, depending on which one you want to have as the 1.
Sometimes you need to force quite a lot of categories down into numbers. Let’s say for example that you are looking
at hair color. If you have recorded hair color as brown, black, red, blond, and gray, it’s going to be hard to categorize
them as an ordinal variable. You could turn them into 1, 2, 3, 4, 5, but what would be the order? Is red above or below
brown hair? Because of this problem, people often split these out from a single column into multiple columns. Like so:
Hair Color | Brown | Red | Black | Blond | Gray
Brown      | 1     | 0   | 0     | 0     | 0
Red        | 0     | 1   | 0     | 0     | 0
Brown      | 1     | 0   | 0     | 0     | 0
Black      | 0     | 0   | 1     | 0     | 0
This can lead to problems, however, as it adds a lot of additional variables that each carry very little information. So
another option is to take one of these multi-category variables and turn it into a single yes/no style variable. Let’s
say you really think that having black hair versus other colors is the most important – you would take that categorical
data of hair color and just force it into a single yes/no column, like so:
Hair Color | Black Hair
Brown      | 0
Red        | 0
Brown      | 0
Black      | 1
You will be following DeLong’s coding decisions in part 3.
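As a rough sketch, both versions of the hair color example can be built in base R like this. The object names (hair, one_hot, black_hair) are just illustrative.

```r
# The four example rows from the hair color tables.
hair       <- c("Brown", "Red", "Brown", "Black")
all_colors <- c("Brown", "Red", "Black", "Blond", "Gray")

# Option 1: one 0/1 column per category.
# sapply() returns a matrix with one column per color.
one_hot <- sapply(all_colors, function(col) as.integer(hair == col))
one_hot <- data.frame(hair_color = hair, one_hot)
print(one_hot)

# Option 2: collapse everything into a single yes/no column.
one_hot$black_hair <- ifelse(hair == "Black", 1, 0)
print(one_hot[, c("hair_color", "black_hair")])
```

Notice that the Blond and Gray columns are all zeros here: columns like that add variables without adding information.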
Instructions
While this isn’t a lab, you will be downloading, filtering, running some models and otherwise analyzing a dataset to
see if it matches up to what Dr. DeLong did. Go through the instructions below and write down your answers. When
you are ready, take the Project 4 quiz to enter your results. Not every question I ask you in the Instructions section
will be on the quiz, but they should guide how you read and analyze this paper.
The instructions do ask you to open up RStudio and read in and analyze this data. You need to do that BEFORE you
open the quiz, as the quiz is timed and you won’t have time to do the analyses once you open the quiz.
Part 1. Read the Paper
Download the DeLong 2018 paper from D2L. If you like, you may also download the editorial. Read through the
sections that are still black (I grayed out unnecessary nonsense, and all of her terrible and not helpful tables). Here are
some helpful questions & comments to guide your reading (which are also listed in the PDF comments):
- Dr. DeLong is a professor of economics – read the conflict of interest section to figure out why she is writing a
  paper theorizing a medical connection between vaccines and health issues.
- The author suggests that if 100% of women in the study had been vaccinated, the number of women to
  conceive would have fallen by 2 million. Is this conclusion assuming correlation or causation?
- Geier and Geier are cited several times and thanked in the acknowledgements. Geier senior has had his medical
  license revoked for giving a chemical castration drug to autistic children (some of whom were not actually
  autistic). https://www.discovermagazine.com/the-sciences/antivaxxer-mark-geier-has-license-revoked-in-maryland
- There is a misinterpretation of the VAERS system on page 3: a VAERS report does not mean that an incident was
  caused by a vaccine, just that it happened after a vaccine and was reported. So a patient could be vaccinated,
  stub their toe, and report that as an effect of the vaccine.
- For each variable listed, what type of data do you think it would be – nominal, ordinal, continuous, discrete?
- On page 4, the options for RHQ131 are listed. What would you filter out in order to make sure you only had
  data with answers?
- In Table 2 there are a lot of t-tests and I believe Chi Square tests. Only 2 are significant out of 24. Can you use
  binomial probability to figure out if getting 2 significant results out of 24 tests is itself statistically
  significant, or could these potentially be false positives?
- Page 6 has an in-depth discussion of how the data was recoded, so pay close attention there.
- On page 6, pay attention to how the married/not married data was coded. People who marked “living with a
  partner” or a 6 are not addressed clearly here.
- While the csv file I will give you is the raw data with 945 participants, Dr. DeLong filtered it to be 700 women.
  You will do the same, following the paper’s directions as closely as you can.
- On page 10, the McInerney et al. 2017 study sounds like a much better study. What makes it so different from
  this one?
- Participants in the study data used in this paper did report if they were using birth control, but were not filtered
  out of the dataset. How might that impact results?
- The author claims: “Data suggest that at least part of the reason for the recent decline in US birth rates amongst
  females aged 25–29 may be associated with increasing injection of the HPV vaccine,” meaning that this paper’s
  data suggest that. Is that true?
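For the Table 2 question, one way to set up the binomial calculation is sketched below. It assumes a significance threshold of 0.05 and that the 24 tests are independent of one another; both are my assumptions for illustration, not something the paper states.

```r
n_tests <- 24     # number of tests reported in Table 2
alpha   <- 0.05   # assumed false-positive rate per test

# If no real effects exist, the number of "significant" tests X follows
# Binomial(24, 0.05). P(X >= 2) = 1 - P(X <= 1).
p_two_or_more <- 1 - pbinom(1, size = n_tests, prob = alpha)

# Expected number of false positives under the same assumptions.
expected_false_pos <- n_tests * alpha

print(p_two_or_more)       # roughly 0.34
print(expected_false_pos)  # 1.2
```

Under those assumptions, seeing two or more significant results by chance alone happens about a third of the time.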
Part 2. Investigate Variables from Paper
Dr. DeLong uses a variety of variables from the NHANES survey, which I have edited and put into your datasheet.
Here is a table of their descriptions, as well as the URL to their description on the NHANES site. The last two
variables were not used in the first regression but were mentioned enough I put them into this table so you can see
them.
Read through the variable descriptions on the NHANES site. Dr. DeLong makes several claims about filtering that you
will shortly find out are incorrect.
Shorthand | Use | Question | URL
Year | | This is the year of the survey |
SEQN | | The unique identifier for each participant (like a student ID number, essentially). |
RIAGENDR | | The gender in male or female (no other options were allowed). Filtered to just be female. |
RHQ131 | In regression | Have you ever been pregnant? Please include (current pregnancy,) live births, miscarriages, stillbirths, tubal pregnancies and abortions | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/RHQ_E.htm#RHQ131
IMQ040 | In regression | Have you ever received one or more doses of the HPV vaccine? (The brand name for the vaccine is Gardasil.) | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/IMQ_E.htm#IMQ040
DMDEDUC2 | In regression | What is the highest grade or level of school you completed or the highest degree you received? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/DEMO_E.htm#DMDEDUC2
DMDMARTL | In regression | What is your Marital Status? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/DEMO_E.htm#DMDMARTL
RIDAGEYR | In regression | Age in years at screening | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/DEMO_E.htm#RIDAGEYR
INDFMPIR | In regression | Ratio of family income to poverty threshold | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/DEMO_E.htm#INDFMPIR
RIDRETH1 | In regression, split into multiple variables | Reported race and ethnicity information | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/DEMO_E.htm#RIDRETH1
IMQ045 | Not in regression, mentioned later for second regression | How many doses have you received? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/IMQ_E.htm#IMQ045
BMXHT | Not in regression, mentioned later for second regression | Height in cm | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/BMX_E.htm#BMXHT

There are also two additional clustering variables used in the regression described on page 4 that are too complex for
a 100-level class, so be aware you won’t be able to replicate exact results, but should be able to approximate.
Part 3. Data Cleaning & Dummy Variables
Download the DeLong 2018.csv dataset. This is the raw data I processed for you from NHANES. However, it has all
945 participants (identified by SEQN, or their identifying number) and still has the original data without the dummy
variables.
Dr. DeLong helpfully walks us through the filtering and re-coding steps on page 6. They also discuss there how
they made what are known as dummy variables – essentially, these are variables that are forced into 0/1 status. This
is often done to reduce the number of categories, especially if those categories are nominal and not ordinal.
However, as is common with survey data, many of the questions did not contain data for everyone, either because they
did not answer, did not know, or refused to answer. I have already filtered gender, age, and the years of the survey for
you, which gets you to 945 women. You will apply further filters to get to 700. Because those changes are a bit
buried, I have highlighted them here in steps.
You can use the dplyr package in R, specifically the mutate and filter functions to process the data. You will also want
to look at the ifelse function to help you in mutate, and group_by() and tally() will help you to count observations.
filter() is covered in Lab 6, Lab 8, Lab 12
mutate() is covered in Lab 6, Lab 8, Lab 11, Lab 12
ifelse() is covered in Lab 11, Lab 12
group_by() and tally() are in Lab 6, Lab 8, Lab 12
Filters:
1. Must have an answer in RHQ131 of either yes or no
2. Must have an answer in IMQ040 of either yes or no
3. Cannot have an NA value in INDFMPIR
Dr. DeLong says that ‘Observations where the response was “refused,” “don’t know,” or “missing” were dropped from
the analysis (page 6 paragraph 1)’. However, that is not universally true – if it were, there would actually be fewer than
700 participants.
Take a look at the NHANES website and how these variables are coded – which variables have values that equate to
refused, don’t know, or missing yet were not filtered out to get you to the 700 individuals?
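Here is a minimal sketch of the three filters, written in base R on a tiny invented data frame so it runs on its own. The values are made up; only the column names and the 1 = yes, 2 = no, 7 = refused, 9 = don't know coding come from the paper's description of RHQ131 and IMQ040. The same conditions can be passed to dplyr's filter().

```r
# Eight invented participants (not the real project data).
raw <- data.frame(
  SEQN     = 1:8,
  RHQ131   = c(1, 2, 7, 1, NA, 2, 1, 9),   # ever pregnant
  IMQ040   = c(1, 2, 2, 9, 1, 2, 1, 2),    # ever had HPV vaccine
  INDFMPIR = c(1.5, NA, 2.0, 3.1, 0.8, 4.2, 2.7, 1.1)
)

# Count the response codes first, including missing values.
print(table(raw$RHQ131, useNA = "ifany"))

# Keep a row only if all three conditions hold.
# %in% treats NA as "not a match", so missing answers are dropped too.
keep <- raw$RHQ131 %in% c(1, 2) &
        raw$IMQ040 %in% c(1, 2) &
        !is.na(raw$INDFMPIR)
cleaned <- raw[keep, ]

print(cleaned)
print(nrow(cleaned))
```

Checking nrow() after each filter is an easy way to see whether you are converging on 700.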
Dummy Variables as stated by Dr. DeLong:
1. The educational variable, DMDEDUC2, was recoded into a dummy variable that was 1 if the woman was a
graduate of a 4-year college (DMDEDUC2 = 5) and 0 if not (DMDEDUC2 = 1, 2, 3, or 4).
2. If a female was married, widowed, divorced, or separated (responses 1, 2, 3, or 4), she was considered for this
analysis to be “married, currently or formerly.” Women who chose response 5 were considered to be “never
married.”
3. RIDRETH1 was recoded into four racial/ethnic groups—Hispanic, non-Hispanic black, non-Hispanic white, and
other.
Dummy Variables Translated:
1. DMDEDUC2 should be 0 (not a graduate) or 1 (a graduate). However, take a look at the options – are there
people that should have been filtered out for an NA here? How many?
2. DMDMARTL should be 0 (never married) or 1 (married at some point). However, take a look at the options –
what happens to the “6 – living with partner” option with this dummy variable? (hint: look at counts in the table
to find out).
3. RIDRETH1 will be turned into four binomial variables:
a. Hispanic (1 if RIDRETH1 is 1 or 2, 0 if not)
b. Non-Hispanic White (1 if RIDRETH1 is 3, 0 if not)
c. Non-Hispanic Black (1 if RIDRETH1 is 4, 0 if not)
d. Other race (1 if RIDRETH1 is 5, 0 if not)
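A rough sketch of these recodes with ifelse() follows, using six invented rows. The new column names (college, married, hispanic, and so on) are my own placeholders, and the codes 9 and 77 stand in for "don't know"/"refused" style answers; check the NHANES codebook for the actual values.

```r
# Six invented participants (not the real project data).
demo <- data.frame(
  DMDEDUC2 = c(5, 3, 1, 9, 5, 4),
  DMDMARTL = c(1, 5, 6, 3, 5, 77),
  RIDRETH1 = c(1, 2, 3, 4, 5, 3)
)

# Education: 1 if college graduate, 0 otherwise.
demo$college <- ifelse(demo$DMDEDUC2 == 5, 1, 0)

# Marital status: 1 if response is 1-4, 0 otherwise.
demo$married <- ifelse(demo$DMDMARTL %in% c(1, 2, 3, 4), 1, 0)

# Race/ethnicity: four 0/1 columns.
demo$hispanic <- ifelse(demo$RIDRETH1 %in% c(1, 2), 1, 0)
demo$nh_white <- ifelse(demo$RIDRETH1 == 3, 1, 0)
demo$nh_black <- ifelse(demo$RIDRETH1 == 4, 1, 0)
demo$other    <- ifelse(demo$RIDRETH1 == 5, 1, 0)

# Cross-tabulate original codes against the dummy to see where each code went.
print(table(original = demo$DMDMARTL, dummy = demo$married))
print(table(original = demo$DMDEDUC2, dummy = demo$college))
```

The cross-tabs make it obvious that a plain ifelse() sends every code not named in the condition to 0, including codes that are not real answers.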
Part 4. Modeling Part 1
You should now have cleaned and filtered your data to have 700 participants, a single y variable, and 9 different x
variables: HPV status, Education, Marital, Hispanic, NonHispanic White, NonHispanic Black, Other Race, Age, and
Income to poverty ratio.
1. Build a model that uses DeLong’s exact data. Use the summary() function to interpret the model. What variables
are statistically significant – specifically, is immunization statistically significant? What is the AIC?
2. Use the predict() and confusionMatrix() functions to determine if your model is particularly accurate. Does it
identify across both categories?
NOTE: you will probably get a warning message for predict that says “In predict.lm(object, newdata, se.fit, scale = 1,
type = if (type == : prediction from a rank-deficient fit may be misleading”, and you will see that one of your variables
has NA under summary. That’s fine. The four race/ethnicity dummy variables add up to 1 for every participant, so the last
one carries no information beyond the other three plus the intercept, and R drops it. It’ll still work.
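The sketch below walks through predict() and a confusion matrix on invented data. It uses base R's table() so it runs without extra packages; if your labs load confusionMatrix() from the caret package, that function reports the same counts plus accuracy, sensitivity, and specificity. The 0.5 cutoff and all variable names are assumptions for illustration.

```r
set.seed(2018)
n    <- 300
race <- sample(1:5, n, replace = TRUE)   # stand-in for RIDRETH1 codes 1-5
toy  <- data.frame(
  age      = sample(25:29, n, replace = TRUE),
  hispanic = as.integer(race %in% c(1, 2)),
  nh_white = as.integer(race == 3),
  nh_black = as.integer(race == 4),
  other    = as.integer(race == 5)
)
# Invented outcome that depends on age and one dummy.
toy$pregnant <- rbinom(n, 1, plogis(-6 + 0.22 * toy$age + 0.5 * toy$hispanic))

# All four race dummies plus an intercept: the last dummy is redundant,
# so its coefficient comes back as NA.
fit <- glm(pregnant ~ age + hispanic + nh_white + nh_black + other,
           data = toy, family = binomial)
print(coef(fit))

# Predicted probabilities, then a 0/1 call at an assumed 0.5 cutoff.
prob <- suppressWarnings(predict(fit, type = "response"))
pred <- ifelse(prob > 0.5, 1, 0)

# Confusion matrix: rows are predictions, columns are the truth.
cm <- table(Predicted = factor(pred, levels = c(0, 1)),
            Actual    = factor(toy$pregnant, levels = c(0, 1)))
print(cm)

accuracy    <- sum(diag(cm)) / sum(cm)
sensitivity <- cm["1", "1"] / sum(cm[, "1"])  # share of actual 1s caught
specificity <- cm["0", "0"] / sum(cm[, "0"])  # share of actual 0s caught
print(c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity))
```

A model can have decent overall accuracy while doing badly on one of the two categories, which is why sensitivity and specificity are worth reading separately.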
On page 7 the author mentions that income and education have both been found in the past to be well-correlated
with pregnancies, and that this study also found that. When filtered correctly, the education variable has an R value
of .46 and the income variable has a correlation of .34. Marital status is also very correlated, with 0.31. Meanwhile,
immunization has a correlation of only 0.17.
3. Build a model with only education, income, and marital status as x variables.
4. Use the predict() and confusionMatrix() functions to determine if your model is particularly accurate. Does it
identify across both categories? Is it more or less accurate (or the same amount) as your previous model?
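One possible way to put the two models side by side is sketched here, again on invented data with placeholder column names. It compares AIC and accuracy (at an assumed 0.5 cutoff) and also shows how cor() gives the kind of correlation values quoted above.

```r
set.seed(42)
n <- 400
toy <- data.frame(
  college = rbinom(n, 1, 0.3),
  married = rbinom(n, 1, 0.5),
  income  = runif(n, 0, 5),
  hpv     = rbinom(n, 1, 0.2),
  age     = sample(25:29, n, replace = TRUE)
)
# Invented outcome: driven by education, marriage, and income only.
toy$pregnant <- rbinom(
  n, 1,
  plogis(0.5 - 1.5 * toy$college + 1.0 * toy$married - 0.3 * toy$income)
)

full    <- glm(pregnant ~ hpv + college + married + age + income,
               data = toy, family = binomial)
reduced <- glm(pregnant ~ college + married + income,
               data = toy, family = binomial)

# Share of rows where the 0/1 prediction matches the actual outcome.
accuracy <- function(fit, actual) {
  pred <- ifelse(predict(fit, type = "response") > 0.5, 1, 0)
  mean(pred == actual)
}

comparison <- data.frame(
  model    = c("full", "reduced"),
  AIC      = c(AIC(full), AIC(reduced)),
  accuracy = c(accuracy(full, toy$pregnant), accuracy(reduced, toy$pregnant))
)
print(comparison)

# Correlation of each predictor with the outcome.
cors <- sapply(toy[c("college", "married", "income", "hpv")],
               cor, y = toy$pregnant)
print(round(cors, 2))
```

A lower AIC means a better trade-off between fit and number of variables, so a smaller model that matches the larger one on accuracy is usually the one to prefer.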
Part 5. Correlation, Causation, and Spurious Correlation
The dataset also contains six additional variables that were available in NHANES but were not included in DeLong’s
analysis. These are sexual activity, sexual orientation, the use of birth control pills, whether a woman has had a
hysterectomy, and two variables that measure whether an individual is working full time or not.
Shorthand | Question | URL
SXD021 | Have you ever had vaginal, anal, or oral sex? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/SXQ_E.htm#SXQ021
SXQ294 | Do you think of yourself as Heterosexual or straight (that is, sexually attracted only to men); homosexual or lesbian (that is, sexually attracted only to women); bisexual (that is, sexually attracted to men and women); something else; or you’re not sure? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/SXQ_E.htm#SXQ294
RHD442 | Are you taking birth control pills? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/RHQ_E.htm#RHD442
RHD280 | Have you had a hysterectomy? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/RHQ_E.htm#RHD280
OCQ180 | How many hours did you work last week at all jobs or businesses? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/OCQ_E.htm#OCQ180
OCQ210 | Do you usually work 35 hours or more per week in total at all jobs or businesses? | https://wwwn.cdc.gov/Nchs/Nhanes/2007-2008/OCQ_E.htm#OCQ210
Look at these variables and how they might have influenced the dataset and contemplate the following questions:
1. Do the sexual activity and sexual orientation variables have information that should have been used to
potentially filter out certain individuals from the study?
2. Does being on birth control pills or having had a hysterectomy (removal of the uterus) potentially
influence a woman’s chance of getting pregnant? (note: if you think the answer is no, you need to google a lot of
things we won’t cover in this class…)
3. How might work hours influence someone’s likelihood of getting pregnant or not?
4. The date of pregnancy and the date of the HPV shot are not recorded here, but it is now recommended that all
individuals ages 9 – 45 get the HPV vaccine. How might that timing information be important for understanding
if this is correlation or causation?
5. Imagine the ideal scenario for figuring out if the HPV vaccine causes infertility. How would you design that
study? What is different between your potential study and the study conducted here?
Part 6. Modeling Part 2
Remove all the women who either had a hysterectomy or didn’t have data for it; either had birth control or didn’t
report it; or were virgins or wouldn’t answer the question. This takes you from 700 entries to 299. Use this new data
set in the same two models that you ran up above.
How have the significant variables changed?
How has the accuracy changed?
Take the Quiz!
Journal of Toxicology and Environmental Health, Part A
Current Issues
ISSN: 1528-7394 (Print) 1087-2620 (Online) Journal homepage: https://www.tandfonline.com/loi/uteh20
RETRACTED ARTICLE: [A lowered probability of
pregnancy in females in the USA aged 25–29
who received a human papillomavirus vaccine
injection]
Gayle DeLong
To cite this article: Gayle DeLong (2018) RETRACTED ARTICLE: [A lowered probability of
pregnancy in females in the USA aged 25–29 who received a human papillomavirus vaccine
injection], Journal of Toxicology and Environmental Health, Part A, 81:14, 661-674, DOI:
10.1080/15287394.2018.1477640
To link to this article: https://doi.org/10.1080/15287394.2018.1477640
Published online: 11 Jun 2018.
JOURNAL OF TOXICOLOGY AND ENVIRONMENTAL HEALTH, PART A
2018, VOL. 81, NO. 14, 661–674
https://doi.org/10.1080/15287394.2018.1477640
RETRACTED ARTICLE: [A lowered probability of pregnancy in females in the USA
aged 25–29 who received a human papillomavirus vaccine injection]
Gayle DeLong
Department of Economics and Finance, Baruch College/City University of New York, New York, NY, USA
ABSTRACT
Birth rates in the United States have recently fallen. Birth rates per 1000 females aged 25–29 fell
from 118 in 2007 to 105 in 2015. One factor may involve the vaccination against the human
papillomavirus (HPV). Shortly after the vaccine was licensed, several reports of recipients experiencing primary ovarian failure emerged. This study analyzed information gathered in National
Health and Nutrition Examination Survey, which represented 8 million 25-to-29-year-old women
residing in the United States between 2007 and 2014. Approximately 60% of women who did not
receive the HPV vaccine had been pregnant at least once, whereas only 35% of women who were
exposed to the vaccine had conceived. For married women, 75% who did not receive the shot
were found to conceive, while only 50% who received the vaccine had ever been pregnant. Using
logistic regression to analyze the data, the probability of having been pregnant was estimated for
females who received an HPV vaccine compared with females who did not receive the shot.
Results suggest that females who received the HPV shot were less likely to have ever been
pregnant than women in the same age group who did not receive the shot. If 100% of females
in this study had received the HPV vaccine, data suggest the number of women having ever
conceived would have fallen by 2 million. Further study into the influence of HPV vaccine on
fertility is thus warranted.
Article history: received 5 August 2017; revised 13 May 2018; accepted 14 May 2018.
Introduction
The birth rates in the United States for women
under the age of 30 are at record lows (Martin,
Hamilton, and Osterman 2017). Birth rates per
1000 females aged 25–29 fell 11.5% from 118.1 in
2007 to 104.5 in 2015. The recent decline follows a
steady increase of 8.5% between 1995 and 2006
(from 108.8 to 118). The basis for the recent
decrease remains unknown. Factors contributing
to the reduction might be associated with more
effective and better use of contraceptives
(Sundaram et al. 2017) as well as the recession of
2008 (Schneider 2015).
Perhaps exposure to one or more environmental
toxins might be influencing the birth rates.
Domingo (1994) reported the adverse effects of
metals such as mercury and lead that are common
in the human environment as well as metals used
in pharmacological products such as aluminum
(Al) on fetal development and teratogenicity in
mammals. Bhatt (2000) surveyed the literature on
environmental endocrine disrupters such as dioxins and polychlorinated biphenyls and found these
chemicals were shown to be associated with infertility, menstrual irregularities, and spontaneous
abortions. Garry et al. (2002) reported an
increased frequency of miscarriages and spontaneous abortions in women exposed to pesticides.
Marwa et al. (2017) found that introducing Al to
ovarian cells of rats triggered intracellular damage
primarily by altering the cellular mitochondria. It
is of interest that Veras et al. (2010) demonstrated
that exposure to ambient air pollutants was associated with decreased female and male fertility.
In 2006, the U.S. Food and Drug
Administration (2006) licensed the first of two
vaccines to protect women against the human
papillomavirus (HPV). Both HPV vaccines
(Gardasil and Cervarix) address HPV 16 and 18,
two strains of HPV that produce approximately
70% of cervical cancer cases. Further, Gardasil
protected against genital warts by interfering with
HPV 6 and 11 (Markowitz et al. 2014). The vaccine is recommended for females (and since 2011
for males) aged 11–26.
Reports of young women experiencing primary
or premature ovarian failure (POF) after receiving
the vaccine were noted (Colafrancesco et al. 2013;
Little and Ward 2012, 2014). POF, defined as the
onset of menopause before the age of 40, is sometimes referred to as premature ovarian insufficiency and thought to be extremely rare.
Symptoms include menstrual disturbances such
as primary or secondary amenorrhea as well as
hot flashes and mood swings. The estimated incidence for females under the age of 30 is 1 in 1000,
rising to 1 in 100 for females under the age of 40
(Rafique, Sterling, and Nelson 2012). However, the
use of the birth control pill might mask the existence of POF and thereby understate the incidence
of the disorder. Islam and Cartwright (2011) noted
that of the 4968 females in a UK birth cohort that
had been born in 1958, the number of women who
experienced POF was 370 (7.4%). Underlying conditions such as radiation and chemotherapy might
give rise to the malady, but 80–90% of POF cases
have no apparent cause. POF may be an autoimmune disorder and between 10% and 30% of
women with POF also have other autoimmune
disorders (Maclaran and Panay 2015).
Both licensed HPV vaccines contain aluminum
(Al), which has been associated with autoimmune
disorders (Colafrancesco et al. 2013). No apparent
epidemiological study on the influence of Al on
fertility exists (Krewski et al. 2007), but Karakis
et al. (2014) found an association between prenatal
exposure to Al and neonatal morbidity. Evidence
also suggests a link between Al exposure and POF
(Pellegrino et al. 2014).
Geier and Geier (2017) examined the Vaccine
Adverse Events Reporting System (VAERS) database to determine whether uptake of the HPV
vaccine affected the number of reports of autoimmune reactions. VAERS is a passive system where
vaccine administrators or recipients report adverse
effects after receiving a vaccine. Between 2006 and
2014, HPV vaccine recipients or their health care
providers noted 48 cases of ovarian damage associated with autoimmune reactions. In addition to
the Geier and Geier findings, the VAERS database
between 2006 and 2017 indicated other symptoms
that affect the ability to bear children: spontaneous
abortion (214 cases), amenorrhea (130 cases), and
irregular menstruation (123 cases).
CONTACT Gayle DeLong, gayle.delong@baruch.cuny.edu,
Department of Economics and Finance, Baruch College/City University of New York, One
Bernard Baruch Way, Box B10-225, New York, NY 10010, USA.
© 2018 Taylor & Francis
Methods
This study examined the decline in birth rates
amongst women at the peak of their childbearing
years in the United States since 2007. Data on live
births per 1000 females aged 25–29 originated from
the Centers for Disease Control and Prevention
(CDC) WONDER database “Births” section:
https://wonder.cdc.gov/natality.html. The database
reports the numbers starting in 1995. The number
of births is divided by the number of females in the
age group using data from WONDER database
“Population” section: https://wonder.cdc.gov/bridged-race-population.html. Figure 1 illustrates
national numbers from 1995 to 2015. The chart
reveals the steady increase in birth rates through
the mid-2000s, followed by a recent sharp decline
that is the subject of this analysis.
[Figure 1. Birth rates per 1000 females in the United States aged 25–29 from 1995 to 2015. Chart title: “Birth rates per 1,000 U.S. females aged 25 to 29 years”; y-axis: birth rate (95.0 to 120.0); x-axis: year (1995 to 2015).]
To determine whether the change in birth rates
over time is statistically significant, regression analysis was performed. Birth rates per 1000 females in
the United States aged 25–29 over time (BR(t))
were regressed on a constant (C) that equals 1 as
well as two indicators of time: TREND indicates the
overall time trend and equals 1 if the observation
occurred in the year 1995, 2 in the year 1996, …, 21
in the year 2015, and Post-2006_DUMMY is a
dummy variable that equals 1 if year of analysis is
2007–2015 and 0 if year of analysis is 1995–2006.
Table 1 shows the results of the regression:
BR(t) = C + C × Post-2006_DUMMY + TREND
+ TREND × Post-2006_DUMMY. The coefficient
on the TREND variable (1.0) is positive and
statistically significant, suggesting that over time,
the birth rate rose an average 1 per year. However, a
change seemed to occur beginning in 2007. The
coefficient on the TREND × Post-2006_DUMMY
variable (−2.6) is negative and statistically
significant, suggesting that birth rates fell an
average of 1.6 per year (= 1.0 − 2.6) between 2007
and 2015.
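The slope arithmetic in this paragraph can be illustrated with a short R sketch. The series below is invented (it is not the CDC WONDER data) and is built without noise so that the fitted coefficients come back exactly; the variable names POST and BR are placeholders, and the starting values 107 and 30 are arbitrary.

```r
# Invented, noise-free series with a slope change in 2007.
year  <- 1995:2015
TREND <- year - 1994                 # 1 in 1995, ..., 21 in 2015
POST  <- as.integer(year >= 2007)    # stand-in for Post-2006_DUMMY
BR    <- 107 + 1.0 * TREND + POST * (30 - 2.6 * TREND)

# Constant, post-2006 shift, trend, and trend-by-post interaction.
fit <- lm(BR ~ POST + TREND + TREND:POST)
b   <- coef(fit)
print(round(b, 2))

# Slope before 2007 is the TREND coefficient alone; the slope from 2007
# onward is TREND plus the interaction coefficient.
slope_before <- unname(b["TREND"])
slope_after  <- unname(b["TREND"] + b[grep(":", names(b))])
print(c(before = slope_before, after = slope_after))
```

With a TREND coefficient of 1.0 and an interaction coefficient of −2.6, the post-2006 slope is 1.0 − 2.6 = −1.6 per year, which is the calculation the text reports.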
To analyze possible influences associated with
these changes in birth rates, this study examined
responses to the National Health and Nutrition
Examination Survey (NHANES). The survey collects data on health status of individuals in the
United States along with demographic and socioeconomic information. The National Center for
Health Statistics (NCHS) at the CDC administered
the survey and selected a representative sample of
the US population based upon a complex sampling
procedure (for details, see https://www.cdc.gov/nchs/nhanes/participant.htm). Data are provided
in 2-year cycles.
Table 1. Results from regressing birth rates per 1000 females in
the United States aged 25–29 between 1995 and 2015 on a
constant and time indicators: BR(t) = constant + constant × post-2006_DUMMY + TREND + TREND ×
post-2006_DUMMY, where constant = 1; post-2006_DUMMY =
1 if year of analysis was 2007–2015 and 0 if year of analysis
was 1995–2006; and TREND = 1 if the observation was in the
year 1995, 2 in the year 1996, …, 21 in the year 2015.
Variables                    |
Constant                     | 106.91 (0.0000)
Constant × post-2006_DUMMY   | 29.31 (0.0000)
TREND                        | 0.95 (0.0000)
TREND × post-2006_DUMMY      | −2.55 (0.0000)
N                            | 21
Adj R2                       | .8947
Starting in 1999, the NHANES asked females
aged 12 and up “RHQ131: Has the survey participant ever been pregnant? Please include (current
pregnancy,) live births, miscarriages, stillbirths,
tubal pregnancies and abortions.” Responses
could be (1) yes, (2) no, (7) refused, (9) don’t
know, or (.) missing. Starting in 2007, the
NHANES asked the question to females aged 9
and above, “IMQ040: Has the survey participant
ever received one or more doses of the HPV vaccine?” Response choices were the same as for the
pregnancy question. In 2015, the NCHS moved
these questions to the National Health Interview
Survey, an annual survey that is not directly compatible with NHANES. The years of study are
therefore 2007—when NHANES first asked about
HPV vaccine uptake—to 2014, the final year
NHANES included the questions concerning pregnancy and HPV shots.
To analyze the data, the SURVEYFREQ and SURVEYLOGISTIC procedures from SAS
Version 9.4 were used. The SURVEYFREQ procedure provided analysis of the relationship
between exposure to the HPV vaccine and prevalence of having been pregnant. The
SURVEYLOGISTIC procedure performed a multiple logistic regression on the data and
determined whether the odds of having been pregnant (the response variable) were
influenced by explanatory variables such as receiving the HPV shot. Following the
NHANES tutorial at
https://www.cdc.gov/nchs/tutorials/NHANES/NHANESAnalyses/LogisticRegression/Task2b_SAS92.htm,
the variable SDMVSTRA was included to control for stratification when estimating the
variance. To control for the clustering effect of observations, the variable SDMVPSU
was used to identify the primary sampling unit. Since the response to the questionnaire
varied among different groups, NHANES oversamples some groups of people. A weighting
variable was included in the analysis to seek to
ensure that the sample reflects the US population.
Since this study examined 8 years of data, the
given weight in each 2-year database
(WTINT2YR) was divided by 4.
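The weight adjustment described above can be sketched as follows. The reasoning assumed here is that each 2-year cycle's weights are scaled to represent the whole population, so pooling four cycles without rescaling would count that population four times. The respondent weights below are invented purely for illustration.

```python
N_CYCLES = 4  # 2007-2008, 2009-2010, 2011-2012, 2013-2014


def combined_weight(wtint2yr: float, n_cycles: int = N_CYCLES) -> float:
    """Rescale a 2-year weight for an analysis that pools n_cycles cycles."""
    return wtint2yr / n_cycles


# Hypothetical respondents as (cycle, WTINT2YR); the numbers are made up.
respondents = [
    ("2007-2008", 12000.0),
    ("2009-2010", 8000.0),
    ("2011-2012", 10000.0),
    ("2013-2014", 6000.0),
]

pooled_total = sum(combined_weight(w) for _, w in respondents)
naive_total = sum(w for _, w in respondents)
print(pooled_total)  # population represented after rescaling
print(naive_total)   # four times larger when the weights are left unscaled
```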
The study compared women who received the
HPV shot with those who did not. Matching the
average age of the women in the vaccinated group
with the average age of the women in the unvaccinated group is extremely important. Since the
vaccine is relatively new and uptake is steadily
increasing over time (Stokley et al. 2014), most of
the women who have had the shot are relatively
young. If the mean age at the time of the interview
is significantly higher for the group of women who
did not receive the HPV shot than the vaccinated
group, the analysis would be comparing older
women who were not exposed to the HPV vaccine
with younger, vaccinated women. The older
women would have a higher probability of being
pregnant, precisely because they were older. The
younger, vaccinated women would be less likely to
ever have been pregnant, perhaps because of the
vaccine, but also perhaps because of age.
Table 2 presents descriptive statistics of the data
in this analysis according to vaccine status. The
SURVEYMEANS procedure in SAS 9.4 was
employed to determine the statistics. The dataset
was restricted to observations that provided information for all response and explanatory variables.
That is, observations that did not provide information on all the variables were dropped from any of
the analysis. Table 2 reports that the mean age of
the group of vaccinated as well as the unvaccinated
women at the time of the interview was 27. The
result of a t-test suggests that the difference between the two groups was not
statistically significant.
Besides receiving the HPV shot, the analysis
included other explanatory variables that might
affect whether a female had ever been pregnant.
Table 2. Demographic and socioeconomic characteristics, NHANES 2007–2014, females aged 25–29.
Cells show the mean with the se of mean in parentheses; the Difference column shows the
difference with the p value in parentheses.

Total sample
Variable                             HPV shot (n = 118)    No HPV shot (n = 582)    Difference (p Value)
Age at interview                     27.006 (.123)         26.996 (.079)            .010 (.947)
Ratio of family income to poverty    3.173 (.202)          2.739 (.093)             .434 (.053)
College graduate (a)                 50.1% (.062)          34.3% (.032)             15.8% (.025)
Race/Ethnicity: NH White (a)         62.5% (.051)          64.0% (.030)             −1.5% (.795)
Race/Ethnicity: Hispanic (a)         13.4% (.031)          17.2% (.018)             −3.8% (.297)
Race/Ethnicity: NH Black (a)         13.8% (.033)          10.8% (.014)             3.0% (.400)
Race/Ethnicity: Other (a)            10.3% (.025)          7.9% (.012)              2.4% (.400)
Height                               27.006 (.123)         26.996 (.079)            .010 (.947)

Married, currently or formerly
Variable                             HPV shot (n = 32)     No HPV shot (n = 272)    Difference (p Value)
Age at interview                     27.449 (.268)         27.195 (.106)            .253 (.386)
Ratio of family income to poverty    3.343 (.355)          2.885 (.119)             .458 (.230)
College graduate (a)                 51.2% (.101)          32.3% (.034)             18.9% (.086)
Race/Ethnicity: NH White (a)         68.4% (.090)          69.6% (.035)             −1.2% (.905)
Race/Ethnicity: Hispanic (a)         11.4% (.046)          18.1% (.026)             −6.7% (.217)
Race/Ethnicity: NH Black (a)         14.0% (.051)          4.5% (.008)              9.4% (.078)
Race/Ethnicity: Other (a)            6.2% (.046)           7.8% (.017)              −1.6% (.749)
Height                               27.449 (.268)         27.195 (.106)            .254 (.385)

Never married
Variable                             HPV shot (n = 86)     No HPV shot (n = 310)    Difference (p Value)
Age at interview                     26.806 (.138)         26.781 (.110)            .025 (.886)
Ratio of family income to poverty    3.096 (.216)          2.581 (.136)             .514 (.047)
College graduate (a)                 49.6% (.071)          36.5% (.042)             13.1% (.115)
Race/Ethnicity: NH White (a)         59.8% (.061)          58.1% (.033)             1.8% (.799)
Race/Ethnicity: Hispanic (a)         14.3% (.040)          16.2% (.016)             −1.9% (.659)
Race/Ethnicity: NH Black (a)         13.8% (.036)          17.7% (.024)             −3.9% (.370)
Race/Ethnicity: Other (a)            12.1% (.027)          8.1% (.013)              4.0% (.184)
Height                               26.806 (.138)         26.781 (.110)            .025 (.886)

(a) Variable is a dummy variable, where 1 indicates the attribute and 0 otherwise. The mean
of this variable is the percentage of the sample that possesses the particular attribute.
Statistically significant results in bold.
se of mean: Standard error of mean.
Statistical significance determined by
(mean1 − mean2) / sqrt((se of mean1)^2 + (se of mean2)^2).
NH: Non-Hispanic
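The significance formula in the Table 2 footnote can be applied directly to the printed means and standard errors. The sketch below does so with a standard normal reference distribution, which is an assumption: the paper mentions a t-test but does not state the degrees of freedom, so the p values here should only approximate the printed ones.

```python
import math


def difference_test(mean1: float, se1: float, mean2: float, se2: float):
    """Return (difference, test statistic, two-sided normal p value)."""
    diff = mean1 - mean2
    stat = diff / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided tail area
    return diff, stat, p_value


# Age at interview, total sample (Table 2): HPV shot vs no HPV shot.
age = difference_test(27.006, 0.123, 26.996, 0.079)

# Ratio of family income to poverty, never-married women (Table 2).
income = difference_test(3.096, 0.216, 2.581, 0.136)

for label, (diff, stat, p_value) in (("age", age), ("income", income)):
    print(label, round(diff, 3), round(stat, 3), round(p_value, 3))
```

With these inputs the age difference is far from significant and the never-married income difference falls just under .05, which agrees in direction with the printed p values of .947 and .047.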
JOURNAL OF TOXICOLOGY AND ENVIRONMENTAL HEALTH, PART A
These variables were age at the time of the interview (RIDAGEYR), ratio of family income
to poverty (INDFMPIR), and educational level (DMDEDUC2). Race and ethnicity (RIDRETH1)
were added as random controls. Observations where the response was “refused,” “don’t
know,” or “missing” were dropped from the analysis. The raw values of RIDAGEYR and
INDFMPIR were used. The educational variable, DMDEDUC2, was recoded into a dummy
variable that was 1 if the woman was a graduate of a 4-year college (DMDEDUC2 = 5) and
0 if not (DMDEDUC2 = 1, 2, 3, or 4). RIDRETH1 was recoded into four racial/ethnic
groups: Hispanic, non-Hispanic black, non-Hispanic white, and other. The study included
only females (RIAGENDR = 2), and subsets of the sample were analyzed according to the
marital status (DMDMARTL) of the participant. If a female was married, widowed,
divorced, or separated (responses 1, 2, 3, or 4), she was considered for this analysis
to be “married, currently or formerly.” Women who chose response 5 were considered to
be “never married.” The dichotomy is important, because most never-married women are
probably seeking to avoid pregnancy, while married women probably want to have a child
or children at some point.
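The recoding rules in this paragraph can be sketched as small helper functions. The RIDRETH1 code-to-group mapping below is an assumption taken from the NHANES codebook, since the paper names the four groups but not the codes. The paper also does not say how other DMDMARTL responses (for example 6, "living with partner" in the codebook) were treated, so this sketch leaves them unclassified. All function names and the example respondent are hypothetical.

```python
from typing import Optional

# Assumed RIDRETH1 codes (NHANES codebook, not stated in the paper):
# 1 Mexican American, 2 Other Hispanic, 3 Non-Hispanic White,
# 4 Non-Hispanic Black, 5 Other race.
RACE_GROUPS = {1: "Hispanic", 2: "Hispanic", 3: "NH White",
               4: "NH Black", 5: "Other"}


def college_graduate(dmdeduc2: Optional[int]) -> Optional[int]:
    """1 if DMDEDUC2 = 5, 0 if DMDEDUC2 is 1-4, None otherwise."""
    if dmdeduc2 == 5:
        return 1
    if dmdeduc2 in (1, 2, 3, 4):
        return 0
    return None  # refused, don't know, or missing


def marital_group(dmdmartl: Optional[int]) -> Optional[str]:
    """Collapse DMDMARTL into the paper's two marital-status subsets."""
    if dmdmartl in (1, 2, 3, 4):
        return "married, currently or formerly"
    if dmdmartl == 5:
        return "never married"
    return None  # any other response is not classified by the paper's rule


def in_study(riagendr: Optional[int], ridageyr: Optional[int]) -> bool:
    """Females (RIAGENDR = 2) aged 25-29 at the interview."""
    return riagendr == 2 and ridageyr is not None and 25 <= ridageyr <= 29


# Hypothetical respondent, not real survey data.
person = {"RIAGENDR": 2, "RIDAGEYR": 27, "DMDEDUC2": 5,
          "DMDMARTL": 5, "RIDRETH1": 2}
if in_study(person["RIAGENDR"], person["RIDAGEYR"]):
    print(college_graduate(person["DMDEDUC2"]),
          marital_group(person["DMDMARTL"]),
          RACE_GROUPS.get(person["RIDRETH1"]))
```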
Table 2 demonstrates that the group of vaccinated women was similar to those females who did
not receive the HPV injection, with two exceptions. The differences in the means of most of
the explanatory variables were not statistically significant. For the entire sample, there was no
marked difference in age, ratio of family income
to poverty, or ethnic/racial composition between
the two groups (column 3). College attainment
was higher for women who received the shot, but
only for the entire sample. There was no marked
difference in college attainment for the subsets of
married women (column 6) and unmarried
women (column 9). The ratio of family income to poverty was higher for never-married
women who received the shot (3.1) than for those who did not receive the shot (2.6).
These differences are addressed in robustness checks of the model.
The sample included 700 females aged 25–29 between the years 2007 and 2014. Recall that
NHANES selects survey participants strategically such that the survey reflects the US
population. The sample of 700 females represented 7,934,091 females. The subset of
ever-married women included 304 survey participants who represented 3,842,661 women;
the subset of 396 never-married women represented 4,091,429 women.
Results
Table 3 presents chi-square analysis and prevalence ratios of pregnancy of women who received
at least one HPV shot compared with women who
did not receive the HPV vaccine. Using the SURVEYFREQ procedure in SAS 9.4, chi-square
statistics were computed for the 2 × 2 tables that report pregnancy prevalence according
to vaccine status. Results for the entire sample as well as the subsets of ever-married
women and never-married women were significant, suggesting that the prevalence of having
been pregnant was not independent of exposure to the HPV vaccine.
Using formulas from Medicalbiostatistics.Com
(n.d.), calculations of prevalence ratios were made.
Table 3 reports that for the entire sample, the
difference in prevalence rate of having been
pregnant between women who received the HPV
shot (35.3%) and those who did not receive the
vaccine (61.1%) was −25.8%. At the level of vaccine uptake reflected in this sample
(16.7%), the lowered prevalence rate resulted in 341,654 (= −0.258 × the weighted
frequency of 1,325,396 women who received the shot) fewer women having been pregnant.
If 100% of the females in this study had received the HPV shot, the number of women who
would ever have been pregnant might have fallen by 2 million (= −0.258 × the weighted
frequency of all 7,934,091 women in the study).
For married women, the difference in prevalence of pregnancy between the group exposed
to the HPV vaccine (50.7%) and the unexposed group (76.9%) was −26.2%. If all the
married women had received the HPV shot, the number of those women who had ever been
pregnant would have diminished by 1 million (= −0.262 × 3,842,662).
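The prevalence arithmetic in the preceding paragraphs can be reproduced from the weighted frequencies in Table 3. The sketch below is an illustrative recalculation, not the SAS output: it divides the weighted "ever pregnant" count by the weighted group total for each vaccine status, then scales the difference by the weighted number of women. The dictionary layout and function name are hypothetical.

```python
# Weighted frequencies from Table 3: (ever pregnant = yes, group total).
WEIGHTED = {
    "total sample": {"shot": (467_579, 1_325_396),
                     "no_shot": (4_035_001, 6_608_695)},
    "ever-married": {"shot": (208_754, 411_865),
                     "no_shot": (2_637_592, 3_430_797)},
}


def prevalence_summary(group: str) -> dict:
    """Prevalence by vaccine status, their difference and ratio, and the
    implied change in ever-pregnant women under two uptake scenarios."""
    yes_shot, n_shot = WEIGHTED[group]["shot"]
    yes_none, n_none = WEIGHTED[group]["no_shot"]
    p_shot = yes_shot / n_shot
    p_none = yes_none / n_none
    diff = p_shot - p_none
    return {
        "p_shot": p_shot,
        "p_none": p_none,
        "difference": diff,
        "ratio": p_shot / p_none,
        "at_sample_uptake": diff * n_shot,        # only vaccinated women
        "if_all_vaccinated": diff * (n_shot + n_none),
    }


for group in WEIGHTED:
    s = prevalence_summary(group)
    print(group,
          f"{s['p_shot']:.1%}", f"{s['p_none']:.1%}", f"{s['difference']:.1%}",
          round(s["at_sample_uptake"]), round(s["if_all_vaccinated"]))
```

For the ever-married subset this gives a vaccinated prevalence of about 50.7%, which is the value consistent with the reported −26.2% difference. These figures describe an association in the weighted sample only; the "if all vaccinated" counts assume the difference is entirely caused by the vaccine.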
To analyze the dataset further, logistic regressions were utilized to determine the influence of
the HPV vaccine on the probability of having
been pregnant. Results are presented in Table 4.
Results without covariates are shown in the first
Table 3. Prevalence ratios of ever having been pregnant for women who received an HPV shot versus women who did not.

Total sample
                                             Ever pregnant
HPV shot exposure                            Yes          No           Total
Received HPV shot      Frequency             52           66           118
                       Weighted frequency    467,579      857,817      1,325,396
                       Percentage            5.9          10.8         16.7
Did not receive shot   Frequency             387          195          582
                       Weighted frequency    4,035,001    2,573,694    6,608,695
                       Percentage            50.9         32.4         83.3
Total                  Frequency             439          261          700
                       Weighted frequency    4,502,580    3,431,511    7,934,091
                       Percentage            56.8         43.2         100.0
Rao-Scott Chi-square   18.9012

Ever-married women
                                             Ever pregnant
HPV shot exposure                            Yes          No           Total
Received HPV shot      Frequency             21           11           32
                       Weighted frequency    208,754      203,111      411,865
                       Percentage            5.4          5.3          10.7
Did not receive shot   Frequency             221          51           272
                       Weighted frequency    2,637,592    793,205      3,430,797
                       Percentage            68.6         20.6         89.3

Never-married women
                                             Ever pregnant
HPV shot exposure                            Yes          No           Total
Received HPV shot      Frequency             31           55           86
                       Weighted frequency    258,826      654,706      913,532
                       Percentage            6.3          16           22.3
Did not receive shot   Frequency             166          144          310
                       Weighted frequency    1,397,409    1,780,489    3,177,898
                       Percentage            34.2         43.5         77.7

Remaining rows (values not present in this copy):
p > Chi-square
Prevalence rate of ever being pregnant
  Received HPV shot
  Did not receive HPV shot
Relative prevalence rate of ever being pregnant
Attributable prevalence rate of ever being pregnant
Population attributable prevalence:
  At levels of vaccine uptake in sample
  If 25% of population vaccinated
  If 50% of population vaccinated
  If 100% of population vaccinated
