Using the Oregon Student Health Survey (SHS) Data Portal for Running Item Crosstabs
The Crosstabs function on the Oregon SHS Data Portal allows you to examine responses
to one SHS question relative to responses to a second item. For example, you might
want to know whether 30-day alcohol use differs between youth who believe their
parents would say it is “Very wrong” for them to use alcohol and youth who believe
their parents would say it is “A little bit wrong.” Alternatively, you might want to
examine the relationship between feeling “sad or hopeless for two weeks or more” in
the past year and past 30-day marijuana use. The Crosstabs function can answer these
types of questions.
To Run a Crosstab Report:
Use the drop-down lists (Question 1 and Question 2) to select the two SHS items that
you want to examine. For the crosstab output table to provide the information you
want, the two questions must be entered in the appropriate order (i.e., reversing
the order of the two items will not provide the same response frequencies). The
Question 1 field serves as the primary variable of interest, while the Question 2
field serves as the contextual (or predictor) variable. For example, if you are
interested in examining 30-day alcohol use (primary variable) in relation to parents’
attitudes about their children using alcohol (contextual variable), you would choose
the item “ALCDAY30X_DFC: Past 30 Day Use - Alcohol” for the Question 1 field and the
item “PWRGDALC: How wrong do your parents feel it would be for you to: Have one or
two drinks of” for the Question 2 field (see example below).
Interpreting the Output Table:
The crosstabs output table presents the percentage of participants who indicated
each response to the primary variable, broken out by their response to the contextual
variable. In the example above, you can examine 30-day alcohol use responses based on
perceived parental attitudes about youth alcohol use in the statewide 11th Grade SHS
sample for 2022. Among 11th graders who indicated that their parents would feel it
was “Not Wrong at All” for them to use alcohol, 36.9% reported using alcohol in the
30 days prior to the survey (63.1% reported not using alcohol). By contrast, among
youth who indicated that their parents would feel it was “Very wrong” for them to use
alcohol, only 13.0% reported using alcohol, while 87.0% reported no use. The use
rates for 11th graders whose parents would say it is “Wrong” and “A little bit wrong”
were 23.8% and 35.7%, respectively.
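The column-percentage arithmetic described above can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not the portal's
implementation: the function name `column_percentages` is hypothetical, and the
response counts are invented purely to show the calculation (they are not SHS data).

```python
from collections import Counter, defaultdict

def column_percentages(pairs):
    """Return {q2_response: {q1_response: percent}} for (q1, q2) pairs.

    Each percentage is computed within one Question 2 response (one column),
    so the values for a given Question 2 response sum to 100.
    """
    columns = defaultdict(Counter)
    for q1, q2 in pairs:
        columns[q2][q1] += 1
    table = {}
    for q2, counts in columns.items():
        total = sum(counts.values())
        table[q2] = {q1: 100.0 * n / total for q1, n in counts.items()}
    return table

# Hypothetical (made-up) responses: (alcohol use, perceived parental attitude).
pairs = (
    [("Yes", "Very wrong")] * 2 + [("No", "Very wrong")] * 8
    + [("Yes", "Not Wrong at All")] * 2 + [("No", "Not Wrong at All")] * 3
)

table = column_percentages(pairs)
for q2, column in table.items():
    print(q2, {q1: round(pct, 1) for q1, pct in column.items()})
```

With these invented counts, 2 of the 10 "Very wrong" respondents report use (20%),
compared with 2 of the 5 "Not Wrong at All" respondents (40%). Each column sums to
100% on its own.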
The percentages within each column always add up to 100% because each cell’s
percentage is calculated using only the participants who indicated the specified
response to the contextual variable. In contrast, the row totals present the overall
percentage of participants who indicated each response to the primary variable (e.g.,
83.0% of all 11th grade participants [regardless of response to perceived parental
attitudes] indicated no alcohol use in the past 30 days). This highlights the
importance of entering the survey items in the correct order, with Question 1 as the
primary variable of interest and Question 2 as the contextual variable. (If you
REVERSE the order of the variables, the table provides a different perspective on the
data. In this case, it would show how participants responded to the parental
disapproval item based on their alcohol use responses.)
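To see why the order matters, the sketch below computes percentages in both
directions, plus one row total, from the same small set of responses. The helper
`percent_within` and the response counts are hypothetical and invented for
illustration. They are not SHS data or the portal's code.

```python
from collections import Counter

def percent_within(pairs, given_index):
    """Percent of each response on the other item, within each response
    on the item at position `given_index` of the (q1, q2) pair."""
    other_index = 1 - given_index
    totals = Counter(p[given_index] for p in pairs)
    cells = Counter((p[other_index], p[given_index]) for p in pairs)
    return {
        (other, given): 100.0 * n / totals[given]
        for (other, given), n in cells.items()
    }

# Hypothetical (made-up) responses: (alcohol use, perceived parental attitude).
pairs = (
    [("Yes", "Very wrong")] * 2 + [("No", "Very wrong")] * 8
    + [("Yes", "Not Wrong at All")] * 2 + [("No", "Not Wrong at All")] * 3
)

# Original order: alcohol use within each parental-attitude response.
use_given_attitude = percent_within(pairs, given_index=1)
# Reversed order: parental attitude within each alcohol-use response.
attitude_given_use = percent_within(pairs, given_index=0)
# Row total: overall percent reporting no use, regardless of attitude.
overall_no_use = 100.0 * sum(1 for q1, _ in pairs if q1 == "No") / len(pairs)

print(use_given_attitude[("Yes", "Very wrong")])   # 20.0
print(attitude_given_use[("Very wrong", "Yes")])   # 50.0
print(round(overall_no_use, 1))                    # 73.3
```

The same cell of respondents yields 20% in one direction ("of youth whose parents
say very wrong, how many drank?") and 50% in the other ("of youth who drank, how many
have parents who say very wrong?"), so the two tables answer different questions.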
Helpful Tips for Running Crosstabs:
Use the SHS Question Reference for a list of the items that are included in the
drop-down lists for Question 1 and Question 2. Because the items (or the order of the
items) can change with each survey administration, there is a separate reference
document for each survey year. The SHS Question Reference link (located within the
gray Crosstabs entry field area) will open the appropriate document based on the year
that is selected in the Year field.
Use the Question Category drop-down lists (links located above both the Question 1
and Question 2 boxes) to filter the list of items in each of the Question fields to a
smaller and more focused set of items based on topic. This will make it much easier
to find the items that you want to include in the crosstab analysis.
Use the largest relevant sample possible for your crosstab analyses. The larger the
sample size, the more reliable the observed relationships are likely to be. For
example, the total state sample will provide more reliable results than smaller
geographical areas. That said, you may need to drill down to a smaller subset of the
data for the results to be relevant to your work. If so, keep in mind that your
confidence in the results should be tempered as you analyze the data at smaller and
smaller levels (as sample sizes are reduced).
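As a rough illustration of how sample size affects reliability, the sketch below
computes an approximate 95% margin of error for a reported percentage. It assumes a
simple random sample and the normal approximation. The SHS may use weighting or other
design features that this ignores, so treat it as a rule of thumb rather than the
portal's method. The function name and the sample sizes are hypothetical. The 13.0%
figure is the "Very wrong" use rate quoted earlier in this document.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    proportion p observed in a simple random sample of size n."""
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

p = 0.13  # 13.0% use rate from the example table
for n in (5000, 500, 50):  # hypothetical sample sizes
    print(n, round(margin_of_error(p, n), 1))
```

Under these assumptions the margin grows from about 1 percentage point at n = 5000 to
about 3 points at n = 500 and about 9 points at n = 50, which is why results for small
geographies deserve more caution.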
Pay close attention to what the data table is telling you (based on which item is
specified as Question 1 and which as Question 2). Remember that the order of the
items DOES matter for interpretation. Fortunately, if you accidentally put the items
in the wrong order, you can simply click the “Reverse Questions” button next to the
Question 2 field. If you are not sure how to interpret the data table, please revisit
the section above in this document or reach out to the XXXX for technical assistance.
The Crosstabs tool is also excellent for examining the response breakout for any
individual survey item. For example, if you are interested in the actual frequency of
30-day alcohol use responses rather than the simpler breakout of youth who “drank any
alcohol” vs. “those who did not drink,” you can run a crosstab with “ALCCAY30: During
the past 30 days, on how many days did you have at least one drink of alcohol” as
Question 1 and grade (TRUGRADE: Grade) as Question 2. This will allow you to examine
frequencies for each response category for the number of days on which alcohol was
used.
Important Cautions for Interpreting Crosstabs:
Most importantly, do not make assumptions about cause and effect. The crosstab tables
only allow you to examine the relationship between two specified variables. While it
may be tempting to interpret the data as showing a cause-and-effect relationship
(e.g., Question 2 causes Question 1), the crosstab data DO NOT imply causality. The
crosstab tool provides the opportunity to examine relationships in the data, but it
is important NOT to make assertions that the data do not warrant. When you find a
relationship between two variables, some possible explanations include:
- Question 2 causes Question 1 (there is a cause and effect relationship)
- Question 1 causes Question 2 (cause and effect is reversed)
- Some other variable causes both Question 1 and Question 2
- There is no true relationship between Question 1 and Question 2 (the relationship
is an artifact of the survey; see the next caution below)
Running crosstabs is a form of data “mining.” Each SHS contains a large number of
items, and sometimes relationships between items appear by chance (not because of a
true relationship between the variables). It is always good to do some reality checks
when you find relationships in the data. For example:
- Does the relationship hold up over time (not just for one year of the survey)?
- Does the relationship seem to be systematic (e.g., is there a linear relationship
or is it sporadic)?
- Is there a theoretical foundation that plausibly connects the variables?
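To get a feel for how easily chance relationships can turn up, the sketch below counts
how many item pairs a survey offers and how many would look notable purely by chance.
It rests on assumptions that are not from this document: the item count of 100 is
hypothetical, and the 5% threshold borrows the common statistical convention of
treating a result as notable if it would occur by chance less than 5% of the time.
The portal itself does not necessarily report significance tests.

```python
from math import comb

def expected_chance_findings(n_items, alpha=0.05):
    """Return the number of distinct item pairs and the expected number of
    pairs that would appear related at level alpha if none truly were."""
    n_pairs = comb(n_items, 2)
    return n_pairs, alpha * n_pairs

n_pairs, expected = expected_chance_findings(n_items=100)  # 100 is hypothetical
print(n_pairs, round(expected))
```

With 100 items there are 4,950 possible pairs, so under these assumptions roughly 250
pairs could appear related even if no true relationships existed. This is why the
reality checks listed above are worth doing.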
Form A only and Form B only items - The number of items collected through the SHS
has increased over the years to the point where two forms of the survey are now
administered each year. Most items appear on both Form A and Form B, but some items
appear only on Form A, and other items appear only on Form B (please see the SHS
Question Reference document for each survey year for the specific items that are only
on Form A or Form B). When either Question 1 or Question 2 is an item that appears on
Form A only or Form B only, please note that the sample size for the crosstab run is
likely to be about half of the total SHS sample for the specified geography. When a
Form A only item AND a Form B only item are chosen for a crosstab run, the table will
display “n/a” in each cell because the two sets of items are mutually exclusive: no
participant completed both a Form A only item and a Form B only item.
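The effect of form-specific items on sample size can be sketched as follows. The
respondent records, the item names (`BOTH_1`, `BOTH_2`, `A_ONLY`, `B_ONLY`), and both
helper functions are hypothetical and exist only to illustrate the logic. They do not
reflect actual SHS items or the portal's internals.

```python
def crosstab_sample_size(respondents, q1, q2):
    """Count respondents who answered both items (an item that was not on a
    respondent's form is simply absent from that respondent's record)."""
    return sum(1 for r in respondents if q1 in r and q2 in r)

def cell_display(n):
    """Mimic the behavior described above: show "n/a" when no one answered both."""
    return "n/a" if n == 0 else str(n)

# Hypothetical respondents: two completed Form A, two completed Form B.
respondents = [
    {"form": "A", "BOTH_1": "No", "BOTH_2": "Yes", "A_ONLY": "Yes"},
    {"form": "A", "BOTH_1": "Yes", "BOTH_2": "No", "A_ONLY": "No"},
    {"form": "B", "BOTH_1": "No", "BOTH_2": "No", "B_ONLY": "Yes"},
    {"form": "B", "BOTH_1": "No", "BOTH_2": "Yes", "B_ONLY": "No"},
]

print(cell_display(crosstab_sample_size(respondents, "BOTH_1", "BOTH_2")))  # 4
print(cell_display(crosstab_sample_size(respondents, "BOTH_1", "A_ONLY")))  # 2
print(cell_display(crosstab_sample_size(respondents, "A_ONLY", "B_ONLY")))  # n/a
```

Two items that appear on both forms use the full sample, pairing one of them with a
Form A only item halves it, and pairing a Form A only item with a Form B only item
leaves no respondents at all.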