A Mixed-Methods Evaluation to Measure 4-H STEM Program Quality

The 4-H Science: Building a 4-H Career Pathway Initiative was a 3-year collaboration among National 4-H Council, Lockheed Martin, and state 4-H grantees to help more than 50,000 youth in 13 states develop STEM and workforce skills for STEM professions. A mixed-methods design used observations and interviews to assess program quality. Researchers observed 4-H STEM programming and conducted individual and focus group interviews with youth, parents, community volunteers, corporate volunteers, and professionals. Observations were conducted using a validated observational tool, the Out-of-School Time (OST) Observation Instrument with STEM Plug-In. This instrument measured youth relationship building, youth participation, staff relationship building, staff instructional strategies, activity content and structure, and STEM instruction. Findings from the observations and interviews were combined to assess program quality. Sites scoring highest on the OST Observation Instrument reported using quality STEM curriculum, especially National 4-H Youth Science Day lessons. The 4-H STEM programs demonstrated highly evident and consistent youth relationship building (e.g., being friendly and collaborative) and youth participation (e.g., contributing ideas and taking leadership). Yet, in many cases, STEM youth skill development (e.g., drawing connections to real-world concepts) and STEM staff instructional practices (e.g., discussing how youth could pursue STEM content through their education and/or career) were inconsistent and rarely evident. Recommendations include substantive professional and volunteer development for both STEM competencies and enhanced youth development. The OST Observation Instrument with STEM Plug-In provided a comprehensive tool to evaluate program quality, and it is recommended for use in evaluating other 4-H STEM programs.


Introduction
Out-of-school time (OST) STEM learning has the potential to address shortages in science fields by helping youth understand the connection between STEM activities and future careers. A comprehensive literature review of informal STEM programs and connections to career pathways identified six factors that contribute to a young person's STEM career pursuit: career awareness and decision to pursue a STEM career; academic preparation and achievement; identification with STEM careers; self-efficacy; external environmental factors; and interest, enjoyment, and motivation (Dorsen et al., 2006). OST STEM learning (National Informal STEM Education Network, 2015). A survey of more than 400 youth participating in 4-H STEM programs in eight states found that 4-H youth self-reported that their attitudes about future science-related careers were greater than the National Assessment of Educational Progress benchmarks (Mielke & Butler, 2013). Flores-Lagunes and Timko (2015) found a positive association between 4-H participation and youth knowledge in science and math including higher standardized test scores. In addition, participation in 4-H is related to taking more advanced science courses in school as well as reporting a more positive attitude about science overall (Heck et al., 2012; Lerner & Lerner, 2013; Rice et al., 2016).

4-H Science: Building a 4-H Career Pathway Initiative was a 3-year collaboration among National 4-H Council, Lockheed Martin, and 13 state 4-H programs. The overall goal of the initiative was to strengthen the link between 4-H STEM activities and advanced education and careers in STEM fields. The program concept drew on several recommendations from Riley and Butler's (2012) national review of eight promising 4-H science programs, specifically: involving science experts to lead and advise local programming and developing program activities to expose youth to science careers. In this project, the science experts were corporate volunteers providing their time and expertise in local 4-H STEM programs.
 What are the program approaches of quality 4-H STEM programs?
 Can an exemplary 4-H STEM program be identified for potential replication and study?
 Does the 4-H STEM Career Pathway Programming Model contribute to program quality?

Methodology
This study was approved by the University of Tennessee, Knoxville Institutional Review Board (IRB Number UTK IRB-15-02714-XP). This was a convergent mixed-methods design whereby quantitative and qualitative data sets were obtained and combined. The data were combined to understand the project in summa, draw conclusions, and propose recommendations (Creswell, 2015; Creswell & Plano Clark, 2018).

Participants
Participants were 4-H youth and parents, Extension 4-H professionals, corporate volunteers, and community volunteers participating in the 4-H Science: Building a 4-H Career Pathway Initiative. The actual 4-H programs varied by location and included a robotics club conducted in a community setting, an after-school 4-H STEM program, and an in-school 4-H enrichment program focused on gardening. All participants signed consent forms for both the observations and focus groups; researchers obtained both parental consent and youth assent for youth participants. Our study involved 155 research participants in five of the thirteen states served in the program: 59 Extension 4-H professionals, 14 community 4-H volunteers, seven corporate volunteers, 49 youth, and 26 parents.

Procedures
Mixed-methods research has the potential to address multiple needs and issues related to assessing 4-H STEM programs. Mixed-methods research is a social science research approach in which researchers combine quantitative and qualitative data and make interpretations and conclusions based on the combined robustness of the data (Creswell, 2015; Creswell & Plano Clark, 2018). As part of mixed-methods evaluation, site visits can be particularly useful (Patton, 2015). Site visits allow for direct observations of an ongoing program and may not be as disruptive to normal programming as tests, surveys, and interviews (Fu et al., 2019).
A major disadvantage for evaluative site visits has been impression management, or the program staff's tendency to show the evaluators only what they want them to see. Despite this, emerging research on the practice of evaluative site visits has the potential to produce more accurate, useful results. As outlined by Nelson (2017), strategies to reduce the influence of impression management included triangulating the data from multiple sources, conducting frequent and longer visits, and being focused on learning and improvement rather than judgement. These strategies were employed in this study as state grantees implementing all four phases were visited two times in 2 different years; visits were conducted across 2 to 5 days rather than just 1 day; data from both qualitative and quantitative strands were combined; and the observation tool was provided to all state grantees in advance of the site visits to emphasize that these visits were focused on finding best practices rather than strictly judgement. This research procedure reflected the four steps of an explanatory sequential mixed-methods design as described by Creswell and Plano Clark (2018):
Step 1. Design and implement the quantitative strand.
Step 2. Use strategies to connect from the quantitative strand.
Step 3. Design and implement the qualitative strand.
Step 4. Interpret the connected results.

For Step 1, the quantitative strand, we collected and aggregated monthly activity reports via curriculum used (approaches and innovations in reaching youth). The funding agency had set benchmarks for state grantees that included serving up to 60% girls and minorities. We used these quantitative data to select the states to visit. States were organized into two cohorts. Cohort 1 comprised the three states implementing all four phases of the 4-H STEM Career Pathway, and Cohort 2 comprised the states implementing only the explore phase of the 4-H STEM Career Pathway. We selected all three of the Cohort 1 states because these states could provide the most depth and breadth regarding experience with the model. Of the 10 Cohort 2 states, we selected the four states that had the highest percentages of girls reached, had the highest percentages of minority youth reached, and/or had the most developed partnerships with corporate volunteers.

In Step 3 of the explanatory sequential mixed-methods design, we traveled to sites within each state selected by the state grantee and made observations of programs selected by 4-H professionals. To provide consistency and uniformity in observation, the OST Observation Instrument was used (Pechman et al., 2008). The OST uses five domains: youth relationship building, youth participation, staff relationship building, staff instructional strategies, and activity content and structure. Each domain includes indicators and descriptions, and the total instrument has 28 indicators. A companion instrument for STEM programming, the STEM Plug-In, includes 14 indicators. The indicators for the OST Observation and STEM Plug-In are scored on a 1-7 scale: 1 (the indicator is not evident), 3 (the indicator is rarely evident) and 7 (the indicator is highly evident and consistent). The OST Observation with STEM Plug-In refers to "staff," and in this study, we defined staff as being all inclusive of Extension 4-H professionals, para-professionals, and corporate and community volunteers.
The instrument has been validated in numerous studies with demonstrated inter-rater reliability and internal consistency. For a discussion of construct validity, internal consistency, concurrent validity, and validity of scale structure, see Pechman et al. (2008). In addition to the observational data, both individual and focus group interviews were conducted to provide information about how the program worked, to understand the depth and breadth of the program, and to understand lessons learned. Focus group interviews with 4-H youth and parents were conducted together in the same groups, whereas community volunteers, corporate volunteers, and 4-H professionals were each interviewed in separate groups for their distinct audience. We interviewed 29 participants in individual interviews and 126 participants in group interviews (Table 1).
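The participant counts reported in the Participants section and in the paragraph above can be cross-checked with simple arithmetic. The short Python sketch below does only that; the variable names are illustrative, and the numbers are the ones stated in the text.

```python
# Participant counts by role, as reported in the Participants section.
participants_by_role = {
    "Extension 4-H professionals": 59,
    "community 4-H volunteers": 14,
    "corporate volunteers": 7,
    "youth": 49,
    "parents": 26,
}

# Participant counts by interview format, as reported above.
participants_by_format = {
    "individual interviews": 29,
    "group interviews": 126,
}

total_by_role = sum(participants_by_role.values())
total_by_format = sum(participants_by_format.values())

# Both breakdowns should describe the same 155 research participants.
print(total_by_role, total_by_format)
```

Both totals come to 155, which matches the overall number of research participants reported for the study.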
The focus group and individual interviews were completely organic; that is, the format of the 4-H STEM educational program dictated how interviews were conducted. For example, at a community 4-H robotics club, parents were interviewed individually because they arrived at different times to pick up their children after the event. The same questions were used for both individual and focus group interviews, and sample questions included:
Note. The focus group interviews ranged in size from 2 to 26 participants with a mean of 7.41 participants.
In Step 4 of the explanatory sequential mixed-methods design, we analyzed data, combined the qualitative and quantitative strands, and interpreted the connected results. The focus groups were recorded using digital audio recorders, and these files were transcribed. We coded the transcripts using an open-coding approach. The categories from each individual focus group were then aggregated across all focus groups. From the codes, we developed themes. Next, we examined the quantitative observational data as well as the qualitative observational data in the form of our field notes (containing information about the STEM activities, emerging themes, and impressions). Finally, we compared and contrasted the themes from the interviews in the context of the observational data and vice versa.
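The aggregation of codes across focus groups described above can be illustrated with a minimal Python sketch. This is not the authors' analysis procedure, which was carried out by human coders; the group labels, codes, and code-to-theme mapping are hypothetical and serve only to show how per-group categories might be tallied and rolled up into themes.

```python
from collections import Counter

# Hypothetical open codes assigned to transcript segments, keyed by focus group.
coded_groups = {
    "youth_parents_group": ["one-on-one help", "hands-on kits", "one-on-one help"],
    "professionals_group": ["hands-on kits", "volunteer recruitment"],
    "volunteers_group": ["one-on-one help"],
}

# Hypothetical mapping from codes to broader themes (an analyst's judgment).
theme_of = {
    "one-on-one help": "individualized instruction",
    "hands-on kits": "established curriculum",
    "volunteer recruitment": "science experts",
}

def aggregate_codes(groups):
    """Count how many focus groups raised each code at least once."""
    counts = Counter()
    for codes in groups.values():
        counts.update(set(codes))  # set() so a code counts once per group
    return counts

def aggregate_themes(code_counts, theme_map):
    """Roll code counts up into theme counts using the code-to-theme mapping."""
    themes = Counter()
    for code, n in code_counts.items():
        themes[theme_map[code]] += n
    return themes

code_counts = aggregate_codes(coded_groups)
theme_counts = aggregate_themes(code_counts, theme_of)
```

Counting each code once per group, rather than once per mention, keeps a single talkative group from dominating the tally. That is a design choice of this sketch, not something the study specifies.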

Results
We observed actual programs during site visits. The OST with STEM Plug-In instrument was used, and the indicators were scored on a scale of 1 (the indicator is not evident) to 7 (the indicator is highly evident and consistent). Table 2 shows the OST with STEM Plug-In scores for all 10 sites observed. We averaged the OST score for each indicator and calculated means for the domains on a site-by-site basis.
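The averaging described above can be sketched in a few lines of Python. The site names, the subset of domains, and all scores below are hypothetical and are not data from the study; the sketch assumes each indicator receives a single score on the 1-7 scale per site.

```python
from statistics import mean

# Hypothetical indicator scores on the 1-7 scale, keyed by site and then domain.
# Site names, domain subsets, and values are illustrative only, not study data.
observations = {
    "Site A": {
        "youth relationship building": [7, 6, 7],
        "staff instructional strategies": [6, 5, 7],
        "STEM instruction": [3, 2, 4],
    },
    "Site B": {
        "youth relationship building": [6, 6, 5],
        "staff instructional strategies": [5, 4, 6],
        "STEM instruction": [2, 3, 1],
    },
}

def domain_means(site_scores):
    """Mean indicator score for each domain at one site, rounded to 2 places."""
    return {domain: round(mean(scores), 2) for domain, scores in site_scores.items()}

def overall_mean(site_scores):
    """Mean across every indicator observed at one site, rounded to 2 places."""
    all_scores = [s for scores in site_scores.values() for s in scores]
    return round(mean(all_scores), 2)

# Site-by-site summary: domain means plus an overall site mean.
summary = {
    site: {"domains": domain_means(scores), "overall": overall_mean(scores)}
    for site, scores in observations.items()
}
```

With a summary in this shape, sites can be compared domain by domain, and repeating the observation periodically would show whether a given domain improves over time.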

Table 2. Out-of-School-Time Observation Instrument With STEM Plug-In Scores
Note. The OST instrument with the STEM Plug-In was used, and the indicators were scored on a scale of 1 to 7 where: 1 (the indicator is not evident), 3 (the indicator is rarely evident) and 7 (the indicator is highly evident and consistent).
a Staff is all-inclusive of Extension 4-H professionals, para-professionals, corporate and community volunteers.
For the observations, mean scores ranged from 3.2 to 7.0 across the seven domains.
Observational scores reinforced that youth received a high level of support and guidance during 4-H STEM programs. The instrument showed positive results, and the indicators that scored the highest were
Youth listen attentively to peers and staff (mean = 6.57).
Staff used positive behavior management techniques (mean = 6.3).
Activity content and structure requires analytical thinking (mean = 6.14).
Lower-scoring indicators included connecting content to the real world (mean = 3) and discussing how youth could pursue STEM content through their education and/or in a career (mean = 2.83).
Focus group results indicated that the four sites with the highest OST scores (Sites 2, 3, 6, and 7 in Table 2) all had Extension 4-H professionals who consistently followed four practices: (a) they used established curriculum, (b) they recruited diverse science experts and role models for youth, (c) they recruited new community volunteers to serve as science experts, and (d) they provided one-on-one instruction for youth.

Established Curriculum
The Extension 4-H professionals at these sites all used previous National Youth Science Day curriculum in their current STEM programming. As a California 4-H professional noted, "The National Youth Science Day curriculum is really nice. It comes as a kit. It's got a facilitator manual. It's really hands-on." The highest-scoring sites also engaged youth in building and coding for Lego Mindstorms robots and Science Education and Resources for Informal Education Settings (SERIES). In Maryland, an integral program was the Adventures in Science, a program of Maryland 4-H that was previously identified as one of the most promising 4-H nonformal science programs in the nation (Riley & Butler, 2012).
The local Extension 4-H professionals recruited new community volunteers from local colleges and universities. Both graduate students and faculty were recruited. In all cases except one, these volunteers had an extensive formal education background in STEM fields. These relationships were described as mutually beneficial for the volunteers and the youth such as in this comment from a 4-H professional: "As far as the college kids, it actually helps them to also expose the youth to what it's like to be on a college campus. It makes the youth excited about that because they get to work with somebody closer to their age."

One-on-One Instruction for Youth
The Extension 4-H professionals provided multiple forms of instruction, and one important commonality was that all provided one-on-one instruction for youth. Youth reported that this support and guidance was in contrast to school science classes where one-on-one instruction is limited. Typical comments included
"In school, since there's more students, you don't really get one-on-one help and then you don't really understand what you're doing. When you're here, since we have mentors, we get more help." (4-H youth)
"At school, we are beginning to do science, but they don't explain that much. But when I get here, they explain more." (4-H youth)
"One of the things that I've found with 4-H is that the camp, the management, and the Lockheed team, they're so accessible. They want to help you . . . I think that's phenomenal cause you don't get that in a lot of places-that patience and desire to teach and share their knowledge." (4-H parent)
The site with the overall highest OST score was Site 3, the Paso Robles STEM program in California. Prior to the 3rd year of the 4-H Science: Building a 4-H Career Pathway Initiative, a description of this site was shared with all state grantees for program improvement purposes. This site provides a model for how 4-H STEM programs can work with community partners to bring STEM programs to minority and underserved youth. The description is shown in the Appendix.

Discussion, Implications and Conclusions
Youth concluded that studies of nonformal engineering programs were important for helping those designing the education to provide appropriate protocols and understand skills needed by adults for engaging youth. However, involvement of science experts alone does not guarantee a quality program. Areas identified for improvement were in STEM youth skill development and STEM instructional practices. Specifically, 4-H programs need greater emphasis on connecting 4-H STEM activities to real-world applications and to educational pathways and careers. These are areas of concern since the entire initiative was aimed at developing these skills.
Throughout the states, 4-H professionals identified two major challenges that could explain this shortfall. First, 4-H professionals discussed their lack of confidence and competence related to STEM programming and skills. They discussed their limited STEM educational background and how this restricted their ability to conduct advanced STEM programming that could link youth more directly to these careers. Second, 4-H professionals acknowledged the difficulty of identifying more advanced STEM experiences such as career shadowing and internship opportunities for students in middle and high schools.
Youth and parents discussed the importance of one-on-one instruction and mentoring. This finding underscores the important role of 4-H in STEM learning and achievement. Interestingly, this finding echoes Bloom's (1984) groundbreaking research which emphasized the role of individualized instruction, mastery learning, and the need to explore how group instruction could be as effective as individual instruction. Additional research should investigate effective 4-H one-on-one instructional settings and curricula. Furthermore, Extension 4-H professionals may be able to use the 4-H organization's proclivities for one-on-one instruction and mentoring for marketing 4-H to youth and parents.
As measured by program evaluation questionnaires, various 4-H science programs have been successful at teaching science concepts including robotics (Barker et al., 2008), biotechnology (Ripberger & Blalock, 2013), and aquaculture (Horton & House, 2015). Yet, Ripberger and Blalock (2015) suggested that youth may benefit from STEM programs in ways that cannot be easily captured in questionnaires. In the study reported here, evaluative site visits and the OST Observation Instrument with STEM Plug-In were used successfully to measure quality, and these approaches are recommended for improving STEM programs throughout the national 4-H system. The OST Observation Instrument with STEM Plug-In can be used easily by supervisors and other professionals to assess programming quality, and the numeric score provides the opportunity to show improvement if observations are conducted periodically.
Youth development practitioners and researchers should look for ways to improve STEM youth skill development and STEM instructional practices. The researchers focused the site visits on the need to learn about the program and improve processes for future implementation. One area for future research is the use of program exemplars (such as the Paso Robles 4-H STEM Program) and how these may or may not influence program improvement among 4-H professionals and volunteers.
Since this initiative began, a new Dimensions of Success (DoS) tool has been validated for OST STEM learning (Shah et al., 2018). Research is needed to correlate program outcomes to an observation tool that could be used to improve practice for higher quality 4-H STEM programming. Likewise, the program quality measurement instruments themselves need to be updated and reconsidered over time. Program quality measures tend to reflect major themes from developmental theory, empirical research in human sciences programs, and youth program evaluations from previous years, all of which could change over time (Arnold & Cater, 2011 [Barrett et al., 2013]) were not widely adopted by Extension 4-H professionals in this study. This lack of adoption of current OST STEM curricula echoes the need for greater access to curricula among 4-H professionals and volunteers, perhaps through a national clearinghouse of "4-H Science-approved curricula" (Worker et al., 2017, Curricula section).