Motivations

2019

When I worked for the New York City Department of Education’s Office of School Support Services (NYC DOE OSSS), my work focused primarily on school food research and program evaluation. We believed that students need to eat well to learn well. In this capacity, I was the program evaluator for the Garden To Café project (GTC).

This project was led by a chef [1]. GTC’s chefs obtain fresh, seasonal food (raw ingredients) from local farms and the schools’ own gardens, prepare the available ingredients into one or more dishes, and then have the students at participating schools try those dishes at special tasting events, often during lunch.

The chefs always had a pretty good sense of how students were responding to the dishes. The GTC coordinator also wanted formal feedback on the dishes. Over a period of four to five years, he and I worked together to develop on-site taste testing methodology that would efficiently deliver quality feedback to the program. We also wanted to make sure the schools benefited from the time they gave us for the taste testing.

This report delivers analysis of the last such taste test I conducted at NYC DOE schools.

I felt I had an obligation to my colleagues at the Garden To Café program, to the staff and students at the school at which the taste test was held, to the field of school food research and to the taxpayers of New York City to complete the work. The final project for Intro to R felt like an ideal opportunity to bring the work to completion, while simultaneously expanding my analysis skills.

1997

Another primary motivation for this report and analysis is rooted some two decades earlier.

I am enrolled in the Teachers College Learning Analytics Masters program, but this is not my first tour as a graduate student. I completed my first graduate program by earning a Ph.D. in Education at Cornell University, studying with Dr. Joseph Novak.

The Education program at Cornell was very good in many ways. One way it fell short of the mark, in retrospect, was insufficient emphasis on quantitative analysis. I don’t think people at the time anticipated how quantitative the field of educational research would become. The Cornell Education Department didn’t have any education statistics courses; Education graduate students had to go to Biology or Industrial and Labor Relations. No one pushed me to take statistics courses as a substantial portion of my studies. It is what it was.

Working in the field has given me many quantitative analysis skills and made me a disciple of effect size. Those skills have taken me far, but not far enough.

I have tried to supplement my skills on the fly for specialized needs, but nothing was working well. When the pandemic came and everything slowed down, I decided to try going back for a full degree program. I consider myself to be a very applied researcher. Learning Analytics seemed to be the most applied of Teachers College’s analysis-focused programs. Pages of code that runs without error later, so far so good.

Some specific motivating questions

  1. How do assessments of the components of taste relate to overall assessments of a dish?
  2. To what extent did Garden To Café achieve the program goal of increasing students’ willingness to try new foods? [2]
  3. How do student responses to the dish being taste tested vary by demographic characteristics (age and previous experience with taste tests)?

Introductions

Technical - Setting up the R session to run the R Markdown file

Each analysis and report in R requires various packages to run. These packages need to be installed and loaded before the analysis using those packages starts.

This next section presumes all of the needed packages are already installed, but provides the install.packages() calls commented out so that they are easy to run if a package is missing. The packages are then loaded using the library() function.

# Install Packages
# DO NOT RUN this section, unless these packages have not been installed yet!!
# install.packages("readxl")
# install.packages('epiDisplay')


# install.packages("gmodels") # for CrossTable()
# install.packages("xtable")

# install.packages("ggthemes")

# Load Packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.0.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(readxl)
library(ggplot2)
library(epiDisplay)
## Loading required package: foreign
## Loading required package: survival
## Loading required package: MASS
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
## Loading required package: nnet
## 
## Attaching package: 'epiDisplay'
## The following object is masked from 'package:ggplot2':
## 
##     alpha
library(dplyr)
library(tidyr)
library(knitr)
library(gmodels) # for CrossTable()
## 
## Attaching package: 'gmodels'
## The following object is masked from 'package:epiDisplay':
## 
##     ci
library(xtable)

library(ggthemes) # for fancy ggplot plots
library(gridExtra) # to combine ggplots
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
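
As an alternative to commenting and uncommenting install.packages() lines, a check along the following lines can report which packages are missing before anything is loaded. This is a minimal sketch, not part of the original analysis: the helper name find_missing_packages() is hypothetical, and the package list simply mirrors the library() calls above.

```r
# Hypothetical helper: return the names of packages that are not installed.
# requireNamespace() loads a package's namespace without attaching it.
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package list copied from the library() calls in this report.
needed <- c("tidyverse", "readxl", "epiDisplay", "knitr",
            "gmodels", "xtable", "ggthemes", "gridExtra")
missing_pkgs <- find_missing_packages(needed)

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
}
```

Keeping the actual install call commented out preserves the report's original caution about not reinstalling packages unnecessarily.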

Content introduction

On May 14, 2019, I conducted a taste test at a public elementary school in NYC for the Garden To Café project (GTC). For various reasons, the data from this taste test has not been analysed until now.

This report of the May 14, 2019 Garden To Café taste test replicates, more or less, the report “Pilot test of a Garden To Café scannable taste test survey for snack fruit administered in classrooms at PSABX on 12/14/2017” dated 12/21/2017 [3].

Additional analysis was done following the model of “Supplemental results from a Garden To Café scannable taste test survey for snack fruit administered in classrooms at PSABX on 12/14/2017” dated 2/14/2018 [4]. None of this analysis is an exact replication of these reports since the surveys used and dishes served were somewhat different.

Also note that this report was written in part to explore and demonstrate techniques in R, so it contains more plots than a typical report written for a client would. Because it is also meant to demonstrate facility in R and R Markdown, it displays the code interspersed with the text. This makes the report less useful for other audiences, such as the client or executives. A likely next step would therefore be to produce a version of the report with little or no code displayed. To do this, I would likely experiment with putting echo = FALSE and warning = FALSE into the {r ...} header of some or all code chunks in the R Markdown file [5].
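
The chunk options mentioned above can also be set once for the whole document rather than chunk by chunk. The sketch below assumes it would sit in a setup chunk near the top of the R Markdown file; setting message = FALSE as well is an added assumption, not something this report specifies.

```r
# Set default chunk options for every later chunk in the document.
# Assumes the knitr package is installed (it is loaded earlier in this report).
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo    = FALSE,  # hide the code
    warning = FALSE,  # hide warnings
    message = FALSE   # hide package start-up messages (an added assumption)
  )
}
```

Individual chunks can still override these defaults in their own {r ...} header, for example to show the code for one key step.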

The dish that was taste tested

One of the Garden To Café chefs prepared a salad with arugula, spinach and sliced carrots. The salad was lightly dressed with an apple-based dressing. Three photos of the salad are shown below: with flash, without flash and as served in sample cups plus that day’s school food lunch (hamburger, fries, onion rings and a pear or apple). The last photo also shows the ingredient handouts that were available to the teacher and students who wished to take a set.

Photo of spinach, arugula and sliced carrot salad, with flash

Photo of spinach, arugula and sliced carrot salad, without flash

Photo of spinach, arugula and sliced carrot in tasting cups, plus that day’s school lunch of hamburger, apple, pear, onion rings and French fries, plus ingredient handouts

Some data wrangling

This report contains a lot of data wrangling.

Import the data from Excel into R

# Import the data from Excel
# Note: On a Mac, the software called XQuartz must be installed for this to work in R Markdown.
GTC_Tschool_P1to7_5_14_19_adj_10_24_21 <- read_excel("GTC_Tschool_P1to7_5-14-19_adj_10-24-21.xlsx")

Recode variables to the _Name form, and create factors

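This step recodes each variable into a new variable with spelled-out values. The values are first given letter prefixes (a_, b_, and so on) and then converted to factors with explicit levels and labels, so that they appear in charts in the same order as on the survey.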

# Recode Q1

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty,
       "Low" = "a_Low",
       "Med" = "b_Medium",
       "High" = "c_High",
       "IDK" = "d_I don't know",
       "LeftBlank" = "h_LeftBlank",
       .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name, 
       levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
       labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet,
        "Low" = "a_Low",
        "Med" = "b_Medium",
        "High" = "c_High",
        "IDK" = "d_I don't know",
        "LeftBlank" = "h_LeftBlank",
        .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name, 
                 levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
                 labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter,
                 "Low" = "a_Low",
                 "Med" = "b_Medium",
                 "High" = "c_High",
                 "IDK" = "d_I don't know",
                 "LeftBlank" = "h_LeftBlank",
                 .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name, 
                 levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
                 labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour,
                 "Low" = "a_Low",
                 "Med" = "b_Medium",
                 "High" = "c_High",
                 "IDK" = "d_I don't know",
                 "LeftBlank" = "h_LeftBlank",
                 .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name, 
                 levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
                 labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy,
                 "Low" = "a_Low",
                 "Med" = "b_Medium",
                 "High" = "c_High",
                 "IDK" = "d_I don't know",
                "LeftBlank" = "h_LeftBlank",
                .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name, 
                 levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
                 labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful,
                  "Low" = "a_Low",
                  "Med" = "b_Medium",
                  "High" = "c_High",
                  "IDK" = "d_I don't know",
                  "LeftBlank" = "h_LeftBlank",
                  .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name, 
            levels = c("a_Low", "b_Medium", "c_High", "d_I don't know", "h_LeftBlank", "i_Default"),
            labels = c("Low", "Medium", "High", "I don't know", "LeftBlank", "Default"))


# Recode Q2

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish,
             "F" = "a_Frozen",
             "C" = "b_Cold",
             "W" = "c_Warm",
             "H" = "d_Hot",
             "VH" = "e_Very Hot",
             "IDK" = "f_I don't know",
             "LeftBlank" = "h_LeftBlank",
             .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name, 
             levels = c("a_Frozen", "b_Cold", "c_Warm", "d_Hot", "e_Very Hot", "f_I don't know", "h_LeftBlank", "i_Default"),
             labels = c("Frozen", "Cold", "Warm", "Hot", "Very Hot", "I don't know", "LeftBlank", "Default"))


# Recode Q3

# Note: the backslash character has to be escaped with another backslash character to avoid generating an error.
# Thus, to get "(\)", one has to write "(\\)".

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish,
             "D" = "a_Delicious (Smile Face)",
             "O" = "b_Okay (Flat Line Face)",
             "U" = "c_Unsatisfying (Frown Face)",
             "IDK" = "d_I don't know (\\)",
             "IDNT" = "e_I didn't try it (\\)",
             "LeftBlank" = "h_LeftBlank",
             .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name, 
             levels = c("a_Delicious (Smile Face)", "b_Okay (Flat Line Face)", "c_Unsatisfying (Frown Face)", "d_I don't know (\\)", "e_I didn't try it (\\)", "h_LeftBlank", "i_Default"),
             labels = c("Delicious (Smile Face)", "Okay (Flat Line Face)", "Unsatisfying (Frown Face)", "I don't know (\\)", "I didn't try it (\\)", "LeftBlank", "Default"))


# Note: Q4 does not need recoding because it is a string variable, OCR scanned character box by character box.
# It may need to be compared to the original paper surveys to check for OCR errors, but that can happen later, it isn't as important,
# and the responses are pretty much as expected even without checking for OCR errors: mostly repeats the multiple choice responses without adding detail.

# Recode Q5

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish,
              "3+" = "a_3+ times",
              "2" = "b_2 times",
              "1" = "c_1 time",
              "N" = "d_Never",
              "IDK" = "e_I don't know",
              "LeftBlank" = "h_LeftBlank",
              .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name, 
              levels = c("a_3+ times", "b_2 times", "c_1 time", "d_Never", "e_I don't know", "h_LeftBlank", "i_Default"),
              labels = c("3+ times", "2 times", "1 time", "Never", "I don't know", "LeftBlank", "Default"))



GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore,
             "3+" = "a_3+ times",
             "2" = "b_2 times",
             "1" = "c_1 time",
             "N" = "d_Never",
             "IDK" = "e_I don't know",
             "LeftBlank" = "h_LeftBlank",
             .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name, 
             levels = c("a_3+ times", "b_2 times", "c_1 time", "d_Never", "e_I don't know", "h_LeftBlank", "i_Default"),
             labels = c("3+ times", "2 times", "1 time", "Never", "I don't know", "LeftBlank", "Default"))


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore,
             "3+" = "a_3+ times",
             "2" = "b_2 times",
             "1" = "c_1 time",
             "N" = "d_Never",
             "IDK" = "e_I don't know",
             "LeftBlank" = "h_LeftBlank",
             .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name, 
             levels = c("a_3+ times", "b_2 times", "c_1 time", "d_Never", "e_I don't know", "h_LeftBlank", "i_Default"),
             labels = c("3+ times", "2 times", "1 time", "Never", "I don't know", "LeftBlank", "Default"))


# Recode Q6
# Note: This question goes from K to 12 plus Adult, but in this dataset, the values are only K to 5.
# Also note that if this question did go beyond grade 7, the consistent LeftBlank = 8 and .default = 9 wouldn't work.
# This is mostly the same as the _Num version, but I have run this recode to be consistent.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade,
        "K" = "0 (K)",
        "1" = "1",
        "2" = "2",
        "3" = "3",
        "4" = "4",
        "5" = "5",
        "LeftBlank" = "h_LeftBlank",
        .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name, 
        levels = c("0 (K)", "1", "2", "3", "4", "5", "h_LeftBlank", "i_Default"),
        labels = c("0 (K)", "1", "2", "3", "4", "5", "LeftBlank", "Default"))


# Recode Q7 and Q8

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore,
                "Yes" = "a_Yes",
                "Maybe" = "b_Maybe",
                "No" = "c_No",
                "LeftBlank" = "h_LeftBlank",
                .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name, 
                levels = c("a_Yes", "b_Maybe", "c_No", "h_LeftBlank", "i_Default"),
                labels = c("Yes", "Maybe", "No", "LeftBlank", "Default"))


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore,
                "Yes" = "a_Yes",
                "Maybe" = "b_Maybe",
                "No" = "c_No",
                "LeftBlank" = "h_LeftBlank",
                .default = "i_Default")

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name <- factor(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name, 
                 levels = c("a_Yes", "b_Maybe", "c_No", "h_LeftBlank", "i_Default"),
                 labels = c("Yes", "Maybe", "No", "LeftBlank", "Default"))

# Recode ClassPeriod

# ClassPeriod shouldn't need to be recoded. See comment below in Recode to _Num section.
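
The six Q1 recodes above repeat the same mapping, so they could in principle be driven by one helper and a loop. The following is a sketch only: it uses base R's factor() rather than dplyr's recode(), the helper name recode_rating() and the toy data frame are hypothetical, and it is meant to mirror the mapping above rather than replace it.

```r
# Hypothetical helper: map raw Q1 codes to labelled factor levels in survey order.
recode_rating <- function(x) {
  codes  <- c("Low", "Med", "High", "IDK", "LeftBlank")
  labels <- c("Low", "Medium", "High", "I don't know", "LeftBlank")
  # Any non-missing value outside the expected codes is flagged as "Default".
  x <- ifelse(is.na(x) | x %in% codes, x, "Default")
  factor(x, levels = c(codes, "Default"), labels = c(labels, "Default"))
}

# Toy data standing in for the survey file (not real responses).
toy <- data.frame(
  Q1_RateTasteComponentOfDish_Salty = c("Low", "High", "IDK", "??"),
  Q1_RateTasteComponentOfDish_Sweet = c("Med", "LeftBlank", "Low", "High"),
  stringsAsFactors = FALSE
)

# Apply the same recode to each Q1 column, adding a *_Name column for each.
for (col in names(toy)) {
  toy[[paste0(col, "_Name")]] <- recode_rating(toy[[col]])
}

levels(toy$Q1_RateTasteComponentOfDish_Salty_Name)
table(toy$Q1_RateTasteComponentOfDish_Salty_Name)
```

Because the factor levels are set explicitly, the chart order no longer depends on the a_, b_, c_ prefixes sorting alphabetically.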

Recode variables to the _Num form

This section recodes the variables to numeric values, which are more useful for certain analyses.

## Recode variables from string variables into new numeric variables

# Recode Q1

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty,
             "Low" = 1,
             "Med" = 2,
             "High" = 3,
             "IDK" = 7,
             "LeftBlank" = 8,
             .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet,
             "Low" = 1,
             "Med" = 2,
             "High" = 3,
             "IDK" = 7,
             "LeftBlank" = 8,
             .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter,
              "Low" = 1,
              "Med" = 2,
              "High" = 3,
              "IDK" = 7,
              "LeftBlank" = 8,
              .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour,
              "Low" = 1,
              "Med" = 2,
              "High" = 3,
              "IDK" = 7,
              "LeftBlank" = 8,
              .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy,
              "Low" = 1,
              "Med" = 2,
              "High" = 3,
              "IDK" = 7,
              "LeftBlank" = 8,
              .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful,
               "Low" = 1,
               "Med" = 2,
               "High" = 3,
               "IDK" = 7,
               "LeftBlank" = 8,
               .default = 9)

# Recode Q2

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish,
               "F" = 1,
               "C" = 2,
               "W" = 3,
               "H" = 4,
               "VH" = 5,
               "IDK" = 7,
               "LeftBlank" = 8,
               .default = 9)

# Recode Q3

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish,
               "D" = 1,
               "O" = 2,
               "U" = 3,
               "IDK" = 7,
               "IDNT" = 6,
               "LeftBlank" = 8,
               .default = 9)

# Note: Q4 does not need recoding because it is a string variable, OCR scanned character box by character box.
# It may need to be compared to the original paper surveys to check for OCR errors, but that can happen later, it isn't as important,
# and the responses are pretty much as expected even without checking for OCR errors: mostly repeats the multiple choice responses without adding detail.

# Recode Q5

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish,
        "3+" = 3,
        "2" = 2,
        "1" = 1,
        "N" = 0,
        "IDK" = 7,
        "LeftBlank" = 8,
        .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore,
        "3+" = 3,
        "2" = 2,
        "1" = 1,
        "N" = 0,
        "IDK" = 7,
        "LeftBlank" = 8,
        .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore,
        "3+" = 3,
        "2" = 2,
        "1" = 1,
        "N" = 0,
        "IDK" = 7,
        "LeftBlank" = 8,
        .default = 9)

# Recode Q6
# Note: This question goes from K to 12 plus Adult, but in this dataset, the values are only K to 5.
# Also note that if this question did go beyond grade 7, the consistent LeftBlank = 8 and .default = 9 wouldn't work.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade,
         "K" = 0,
         "1" = 1,
         "2" = 2,
         "3" = 3,
         "4" = 4,
         "5" = 5,
         "LeftBlank" = 8,
         .default = 9)

# Recode Q7 and Q8

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore,
         "Yes" = 1,
         "Maybe" = 0.5,
         "No" = 0,
         "LeftBlank" = 8,
         .default = 9)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Num <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore,
         "Yes" = 1,
         "Maybe" = 0.5,
         "No" = 0,
         "LeftBlank" = 8,
         .default = 9)

# Recode ClassPeriod

# I don't believe that ClassPeriod needs to be recoded, as it has all numbers which are class periods,
# and there can, by definition, be no missing values or out of range values, unless I made a
# data entry error.
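
One caution with the numeric codes above: 7, 8 and 9 stand for "I don't know", "LeftBlank" and unexpected responses, so they need to be set to NA before computing means or correlations. The sketch below illustrates this with a named lookup vector in base R. The responses are invented, and the threshold of 7 applies to the Q1 coding only (Q3 also uses 6 for "I didn't try it").

```r
# Lookup table for the Q1 codes, matching the recode() calls above.
q1_codes <- c(Low = 1, Med = 2, High = 3, IDK = 7, LeftBlank = 8)

# Invented responses, including one unexpected value.
raw <- c("Low", "Med", "High", "IDK", "LeftBlank", "Oops")

num <- unname(q1_codes[raw])
num[is.na(num) & !is.na(raw)] <- 9   # same role as .default = 9

# Treat the sentinel codes (7, 8, 9) as missing before summarising.
num_valid <- ifelse(num >= 7, NA, num)

mean(num)                       # distorted by the sentinel codes
mean(num_valid, na.rm = TRUE)   # mean of the substantive ratings only
```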

Visualization of the data

Regular boxplots and Tuftean boxplots

I created several plots using ggplot() for the POTD-5, Task 2 assignment. However, those bar charts have been superseded by the tab1() bar charts shown below, and the scatterplots, while useful as a learning exercise, are not worth reporting, so I have cut them from this report.

There are many kinds of plots that can be created using ggplot(), so I will continue exploring that later. One good reference is this web page: Top 50 ggplot Visualizations, downloaded 12/8/2021

To try to go the extra mile, and since I have already created two Tufte-inspired plots for Intro to R, I am going to attempt the Tuftean boxplot, side by side with the regular boxplot, using ggplot().

The boxplots below show the distribution of the Grade level variable. It appears that Tuftean boxplots can only be shown when oriented vertically, so I have oriented all of the boxplots that way. For reasons I can’t determine, the univariate boxplots show a scale along the X-axis, where there should be no scale. The dummy version uses a second variable where all values are “1” as one attempt to force a univariate boxplot. The boxplots seem to be happiest when a grouping variable is provided in aes(), so I tried one with Grade level grouped by “Taken part in a taste test before”, which turned out to be interesting because the distributions within groups are different from each other.

The final set of boxplots are the regular and Tuftean boxplots grouped side by side. (Note, though, that Tufte would probably be opposed to the grey grid that is generated by default, on the grounds that the grid wastes too much non-data ink. I can experiment with plotting the same chart without the grids as a next step.)
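
On the X-axis scale noted above: as far as I can tell, when only a y variable is mapped, ggplot2 places the single box on a continuous x scale, which is why numbers appear there. One possible workaround, sketched below with invented data, is to map x to a constant string so the axis becomes discrete, and to blank the grid lines with theme(). The data frame toy_grades is hypothetical.

```r
library(ggplot2)

# Invented grade levels standing in for Q6_Grade_Num_MissNA.
toy_grades <- data.frame(grade = c(0, 1, 1, 2, 3, 3, 4, 5, NA))

grade_boxplot_clean <- ggplot(toy_grades, aes(x = "", y = grade)) +
  geom_boxplot(na.rm = TRUE) +            # drop missing grades without a warning
  labs(title = "Regular Boxplot", x = NULL, y = "Grade level") +
  theme(
    panel.grid   = element_blank(),       # remove the grid lines
    axis.ticks.x = element_blank()        # no tick for the single category
  )

grade_boxplot_clean
```

The same theme() settings could be added to the grouped boxplots to address the non-data ink concern.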

# Boxplot and Tuftean boxplot

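# Recode a variable that has a numeric range, and set all missing values to NA (NA_real_, the missing value for a double vector).
# Caution: the _Name and _Num recodes of Q6_Grade above match "K" for kindergarten, while this recode matches "0".
# If the raw data use "K", kindergarten rows fall through to .default and become NA here.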
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num_MissNA <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade,
         "0" = 0,
         "1" = 1,
         "2" = 2,
         "3" = 3,
         "4" = 4,
         "5" = 5,
         "LeftBlank" = NA_real_,
         .default = NA_real_)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$AllDummy_var <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade,
         "0" = 1,
         "1" = 1,
         "2" = 1,
         "3" = 1,
         "4" = 1,
         "5" = 1,
         "LeftBlank" = 1,
         .default = 1)

# Boxplots
Grade_boxplot <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(y = Q6_Grade_Num_MissNA)) +
   geom_boxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Regular Boxplot",
       subtitle="Grade level",
       caption="Coding ideas from r-statistics.co",
       x="This is not a scale.",
       y="Grade level")

Grade_boxplot_Dummy <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(AllDummy_var, Q6_Grade_Num_MissNA)) +
   geom_boxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Regular Boxplot",
       subtitle="Grade level",
       caption="Coding ideas from r-statistics.co",
       x="This is a dummy scale where all values = 1",
       y="Grade level")

Grade_boxplot3_TasteTestBefore <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(Q7_TakenPartIn_TasteTestBefore, Q6_Grade_Num_MissNA)) +
   geom_boxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Regular Boxplot with groups",
       subtitle="Grade level by Previous taste test participation",
       caption="Coding ideas from r-statistics.co",
       x="Have you taken part in a taste test before?",
       y="Grade level")

Grade_boxplot
## Warning: Removed 3 rows containing non-finite values (stat_boxplot).

Grade_boxplot_Dummy
## Warning: Removed 3 rows containing non-finite values (stat_boxplot).

Grade_boxplot3_TasteTestBefore
## Warning: Removed 3 rows containing non-finite values (stat_boxplot).

# Tuftean boxplots
Grade_Tufte_boxplot <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(y = Q6_Grade_Num_MissNA)) +
   geom_tufteboxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Tuftean Boxplot",
       subtitle="Grade level",
       caption="Coding ideas from r-statistics.co",
       x="This is not a scale.",
       y="Grade level")

Grade_Tufte_boxplot_Dummy <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(AllDummy_var, Q6_Grade_Num_MissNA)) +
   geom_tufteboxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Tuftean Boxplot",
       subtitle="Grade level",
       caption="Coding ideas from r-statistics.co",
       x="This is a dummy scale where all values = 1",
       y="Grade level")

Grade_Tufte_boxplot_TasteTestBefore <- ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(Q7_TakenPartIn_TasteTestBefore, Q6_Grade_Num_MissNA)) +
   geom_tufteboxplot() +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Tuftean Boxplot with groups",
       subtitle="Grade level by Previous taste test participation",
       caption="Coding ideas from r-statistics.co",
       x="Have you taken part in a taste test before?",
       y="Grade level")

Grade_Tufte_boxplot
## Warning: Removed 3 rows containing non-finite values (stat_fivenumber).

Grade_Tufte_boxplot_Dummy
## Warning: Removed 3 rows containing non-finite values (stat_fivenumber).

Grade_Tufte_boxplot_TasteTestBefore
## Warning: Removed 3 rows containing non-finite values (stat_fivenumber).

TwoBoxPlotsTogether <- grid.arrange(Grade_boxplot3_TasteTestBefore, Grade_Tufte_boxplot_TasteTestBefore, ncol=2)
## Warning: Removed 3 rows containing non-finite values (stat_boxplot).
## Warning: Removed 3 rows containing non-finite values (stat_fivenumber).

TwoBoxPlotsTogether
## TableGrob (1 x 2) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]
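
The TableGrob listing above appears because grid.arrange() draws the combined figure immediately and returns a layout object. Printing that object then shows its text summary rather than the plot. A possible alternative, sketched here with invented data, is to build the layout with arrangeGrob() and draw it explicitly with grid.draw(). The names toy, p_left and p_right are hypothetical.

```r
library(ggplot2)
library(gridExtra)

# Invented data standing in for grade level by previous taste test participation.
toy <- data.frame(
  taste_test_before = rep(c("Yes", "No"), each = 5),
  grade = c(1, 2, 3, 4, 5, 0, 1, 2, 2, 3)
)

p_left  <- ggplot(toy, aes(taste_test_before, grade)) + geom_boxplot()
p_right <- ggplot(toy, aes(taste_test_before, grade)) + geom_boxplot() + theme_minimal()

# arrangeGrob() builds the side-by-side layout without drawing it.
two_plots <- arrangeGrob(p_left, p_right, ncol = 2)

# Draw the stored layout on a fresh page when it is needed.
grid::grid.newpage()
grid::grid.draw(two_plots)
```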

Exploratory analysis

I originally created summaries of variables for the Describe the Variables 10/26/2021 assignment using ggplot(). These were frequency bar charts. However, I subsequently created similar frequency bar charts using tab1(), which look better and are more Tuftean, particularly in minimizing non-data ink. I could clean up the ggplot() bar graphs to explore ggplot() further, but I will leave that for another day.

Results

Introduction

Here I will replicate the types of analysis done in the report “Pilot test of a GTC scannable taste test survey for snack fruit administered in classrooms 12-15-17”. (It won’t be exactly the same because that 12-15-17 taste test included two dishes, whereas this taste test had one, and the survey used in the 12-15-17 taste test was slightly different.)

The tab1() output isn’t a perfect replication of the standard SPSS frequency table used in the previous report, but it is probably close enough, since it omits only the Valid Percent column. The text-formatted tables are far from ideal, more Stata than SPSS, but since they are small they are acceptable.
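
If the Valid Percent column were wanted, it could be computed by hand. The following is a minimal base R sketch, not part of the original analysis. The counts are copied from the Salty frequency table in this report. Treating only LeftBlank as missing is an assumption, and the names `is_missing` and `freq_table` are illustrative.

```r
# Counts copied from the "Salty" frequency table in this report.
counts <- c("Low" = 36, "Medium" = 12, "High" = 8,
            "I don't know" = 21, "LeftBlank" = 3)

# Assumption: only LeftBlank counts as a missing response.
is_missing <- names(counts) == "LeftBlank"

# Percent uses all responses; Valid Percent uses non-missing responses only.
percent <- 100 * counts / sum(counts)
valid_percent <- ifelse(is_missing, NA, 100 * counts / sum(counts[!is_missing]))

freq_table <- data.frame(
  Frequency = as.vector(counts),
  Percent = round(as.vector(percent), 1),
  ValidPercent = round(valid_percent, 1),
  CumPercent = round(cumsum(as.vector(percent)), 1),
  row.names = names(counts)
)

freq_table
```

Here the cumulative column is accumulated over all responses, as in the tab1() output.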

I like the tab1() bar charts because they follow Tufte’s principle of minimizing non-data ink (such as the lack of a horizontal frame line and a vertical frame line that doesn’t go beyond what is needed), without going overboard in terms of minimalism.

I think these bar charts could be improved if there were an option to add Tufte’s idea of white lines as the horizontal grid lines.
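
As a rough illustration of the white grid line idea, here is a minimal sketch using base R's `barplot()` and `abline()` rather than tab1(). The counts come from the Salty table in this report. The grid spacing of 10 and the single bar color are arbitrary choices for the example.

```r
# Counts copied from the "Salty" frequency table in this report.
counts <- c("Low" = 36, "Medium" = 12, "High" = 8,
            "I don't know" = 21, "LeftBlank" = 3)

# Draw to a null PDF device so the sketch also runs without a display;
# remove this line and the dev.off() call to see the chart.
pdf(NULL)

# barplot() returns the horizontal midpoint of each bar.
bar_mid <- barplot(counts, col = "lightblue", border = NA, las = 1,
                   main = "Tell us how today's dish tastes: Salty?")

# White horizontal lines drawn over the bars act as the grid:
# they are visible only where they cross a bar.
grid_at <- seq(10, max(counts), by = 10)
abline(h = grid_at, col = "white", lwd = 2)

invisible(dev.off())
```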

Create chart titles

Here I create chart titles for use with each variable.

# Exploratory analysis 11-2-21
# Frequency tables

# There also ought to be a way to define a label for the question, so it can be displayed instead of or in addition to the variable name.

# I could assign these to a variable, but I am not right now.


## Create Chart Title Variables for each variable.
Q1_RateTasteComponentOfDish_Salty_VarTitle <- "Tell us how today's dish tastes: Salty?"
Q1_RateTasteComponentOfDish_Sweet_VarTitle <- "Tell us how today's dish tastes: Sweet?"
Q1_RateTasteComponentOfDish_Bitter_VarTitle <- "Tell us how today's dish tastes: Bitter?"
Q1_RateTasteComponentOfDish_Sour_VarTitle <- "Tell us how today's dish tastes: Sour?"
Q1_RateTasteComponentOfDish_Spicy_VarTitle <- "Tell us how today's dish tastes: Spicy?"
Q1_RateTasteComponentOfDish_Flavorful_VarTitle <- "Tell us how today's dish tastes: Flavorful?"
Q2_RateTemperatureOfDish_VarTitle <- "What is the temperature of today's dish?"
Q3_OverallTasteRatingOfDish_VarTitle <- "Overall, I think today's dish tastes..."
Q5_WillingnessToTry_DifferentKindsOfTodaysDish_VarTitle <- "I would like to try different kinds of *today's dish*..."
Q5_WillingnessToTry_FruitIHaventEatenBefore_VarTitle <- "I would like to try *fruit* that I haven't eaten before..."
Q5_WillingnessToTry_VegetablesIHaventEatenBefore_VarTitle <- "I would like to try *vegetables* that I haven't eaten before..."
Q6_Grade_VarTitle <- "What grade are you in? \n(If this is the Summer, what grade did you just complete?)"
Q7_TakenPartIn_TasteTestBefore_VarTitle <- "Have you taken part in a taste test before?"
Q8_TakenPartIn_GTCEventBefore_VarTitle <- "Have you taken part in a Garden To Café event before?"
ClassPeriod_VarTitle <- "Class Period"
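
One possible alternative to a separate title variable per question is a single named character vector, optionally with the question text attached to the column as a `"label"` attribute. This is a minimal sketch, assuming a toy data frame `toy` in place of the survey data. The `"label"` attribute name is a convention used by some packages, not something base R interprets, and it can be lost when a column is subset or recoded.

```r
# A few of the titles from this report, keyed by variable name.
var_titles <- c(
  Q2_RateTemperatureOfDish = "What is the temperature of today's dish?",
  Q3_OverallTasteRatingOfDish = "Overall, I think today's dish tastes...",
  Q7_TakenPartIn_TasteTestBefore = "Have you taken part in a taste test before?"
)

# Hypothetical toy data frame standing in for the survey data.
toy <- data.frame(
  Q7_TakenPartIn_TasteTestBefore = c("Yes", "No", "Yes", "Maybe")
)

# Attach the question text to the column itself.
attr(toy$Q7_TakenPartIn_TasteTestBefore, "label") <-
  var_titles[["Q7_TakenPartIn_TasteTestBefore"]]

# Look a title up by variable name, for example to pass as main = ... to a chart.
chart_title <- var_titles[["Q7_TakenPartIn_TasteTestBefore"]]
chart_title
```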

Frequency analysis, with default sort order, with counts

(Note that the white bars should be labeled “LeftBlank”, and they were so labeled in R, but now are not in R Markdown. I will investigate this later. My best guess is that it is an HTML formatting issue. In technical terms, this requires a call to the Ack!() function, or to the Ecce!() function if using the Latin version of R. The easiest solution may be turning all the tab1() bar charts to horizontal, since the horizontal charts don’t have this missing label, even though the code syntax is the same. The “I don’t know” label may be too wide for the charts, pushing out the space for the next label.)
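
Base R's `axis()` omits tick labels that would overlap a label it has already drawn, and `barplot()` draws its category names through `axis()`. If tab1() draws its labels the same way (an assumption, not checked against its source), a wide “I don’t know” label could crowd out “LeftBlank”. The sketch below shows the horizontal layout in base R, with the counts taken from the Salty table. The margin sizes are guesses.

```r
# Counts copied from the "Salty" frequency table in this report.
counts <- c("Low" = 36, "Medium" = 12, "High" = 8,
            "I don't know" = 21, "LeftBlank" = 3)
bar_cols <- c("lightblue", "blue", "darkblue", "lightgreen", "white")

pdf(NULL)  # null device; drop this and dev.off() to view the chart

# Widen the left margin so the longest label fits beside the axis.
old_par <- par(mar = c(4, 8, 3, 1))

# Horizontal bars are drawn bottom-up, so rev() keeps "Low" at the top.
# las = 1 writes the category labels horizontally.
bar_mid <- barplot(rev(counts), horiz = TRUE, las = 1, col = rev(bar_cols),
                   main = "Tell us how today's dish tastes: Salty?")

par(old_par)
invisible(dev.off())
```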

## Run the frequency analysis for real on all variables, with default sort order.

tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Salty_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name : 
##              Frequency Percent Cum. percent
## Low                 36    45.0         45.0
## Medium              12    15.0         60.0
## High                 8    10.0         70.0
## I don't know        21    26.2         96.2
## LeftBlank            3     3.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Sweet_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name : 
##              Frequency Percent Cum. percent
## Low                 28    35.0         35.0
## Medium              20    25.0         60.0
## High                13    16.2         76.2
## I don't know        13    16.2         92.5
## LeftBlank            6     7.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Bitter_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name : 
##              Frequency Percent Cum. percent
## Low                 16    20.0         20.0
## Medium              18    22.5         42.5
## High                20    25.0         67.5
## I don't know        20    25.0         92.5
## LeftBlank            6     7.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Sour_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name : 
##              Frequency Percent Cum. percent
## Low                 25    31.2         31.2
## Medium              12    15.0         46.2
## High                21    26.2         72.5
## I don't know        13    16.2         88.8
## LeftBlank            9    11.2        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Spicy_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name : 
##              Frequency Percent Cum. percent
## Low                 32    40.0         40.0
## Medium               6     7.5         47.5
## High                12    15.0         62.5
## I don't know        19    23.8         86.2
## LeftBlank           11    13.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name, sort.group = "", cum.percent = TRUE, main = Q1_RateTasteComponentOfDish_Flavorful_VarTitle, col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name : 
##              Frequency Percent Cum. percent
## Low                 18    22.5         22.5
## Medium              15    18.8         41.2
## High                14    17.5         58.8
## I don't know        23    28.7         87.5
## LeftBlank           10    12.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name, sort.group = "", cum.percent = TRUE, main = Q2_RateTemperatureOfDish_VarTitle, col=c("darkblue","lightblue", "orange", "red", "darkred", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name : 
##              Frequency Percent Cum. percent
## Frozen               6     7.5          7.5
## Cold                40    50.0         57.5
## Warm                17    21.2         78.8
## Hot                  2     2.5         81.2
## Very Hot             1     1.2         82.5
## I don't know         6     7.5         90.0
## LeftBlank            8    10.0        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name, sort.group = "", cum.percent = TRUE, main = Q3_OverallTasteRatingOfDish_VarTitle, col=c("darkblue","blue","lightblue", "lightgreen", "grey", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name : 
##                           Frequency Percent Cum. percent
## Delicious (Smile Face)           24    30.0         30.0
## Okay (Flat Line Face)            22    27.5         57.5
## Unsatisfying (Frown Face)        26    32.5         90.0
## I don't know (\\)                 1     1.2         91.2
## I didn't try it (\\)              3     3.8         95.0
## LeftBlank                         4     5.0        100.0
## Default                           0     0.0        100.0
##   Total                          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name, sort.group = "", cum.percent = TRUE, main = Q5_WillingnessToTry_DifferentKindsOfTodaysDish_VarTitle, col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name : 
##              Frequency Percent Cum. percent
## 3+ times            29    36.2         36.2
## 2 times              4     5.0         41.2
## 1 time              18    22.5         63.8
## Never               16    20.0         83.8
## I don't know         6     7.5         91.2
## LeftBlank            7     8.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name, sort.group = "", cum.percent = TRUE, main = Q5_WillingnessToTry_FruitIHaventEatenBefore_VarTitle, col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name : 
##              Frequency Percent Cum. percent
## 3+ times            44    55.0         55.0
## 2 times             10    12.5         67.5
## 1 time               4     5.0         72.5
## Never                6     7.5         80.0
## I don't know         4     5.0         85.0
## LeftBlank           12    15.0        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name, sort.group = "", cum.percent = TRUE, main = Q5_WillingnessToTry_VegetablesIHaventEatenBefore_VarTitle, col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name : 
##              Frequency Percent Cum. percent
## 3+ times            21    26.2         26.2
## 2 times             13    16.2         42.5
## 1 time              16    20.0         62.5
## Never               12    15.0         77.5
## I don't know         5     6.2         83.8
## LeftBlank           13    16.2        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
# Note: The color scheme is fudged here because there were no Grade 4 students.
# Because these are categories, not a numeric sequence, the bar chart is not showing a 0 for Grade 4.
# It would be nice if there were a way to include the missing category in the table and chart.
# I don't know what it does if indicator variables are set to true: gen.ind.vars = FALSE
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name, sort.group = "", cum.percent = TRUE, main = Q6_Grade_VarTitle, col=c("blue", "yellow", "blue", "yellow", "blue", "yellow", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name : 
##           Frequency Percent Cum. percent
## 0 (K)             1     1.2          1.2
## 1                27    33.8         35.0
## 2                22    27.5         62.5
## 3                13    16.2         78.8
## 4                 0     0.0         78.8
## 5                15    18.8         97.5
## LeftBlank         2     2.5        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name, sort.group = "", cum.percent = TRUE, main = Q7_TakenPartIn_TasteTestBefore_VarTitle, col=c("darkblue", "blue", "lightblue", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name : 
##           Frequency Percent Cum. percent
## Yes              40    50.0         50.0
## Maybe             8    10.0         60.0
## No               30    37.5         97.5
## LeftBlank         2     2.5        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name, sort.group = "", cum.percent = TRUE, main = Q8_TakenPartIn_GTCEventBefore_VarTitle, col=c("darkblue", "blue", "lightblue", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name : 
##           Frequency Percent Cum. percent
## Yes              20    25.0         25.0
## Maybe            15    18.8         43.8
## No               40    50.0         93.8
## LeftBlank         5     6.2        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$ClassPeriod, sort.group = "", cum.percent = TRUE, main = ClassPeriod_VarTitle, col=c("blue", "yellow")) # Since this variable can't have missing values, a repeating color scheme should work.

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$ClassPeriod : 
##         Frequency Percent Cum. percent
## 1              13    16.2         16.2
## 2              21    26.2         42.5
## 3              17    21.2         63.8
## 6              15    18.8         82.5
## 7              14    17.5        100.0
##   Total        80   100.0        100.0

Frequency analysis, with default sort order, with percentages

## Run the frequency analysis for real on all variables, with default sort order,
# but this time use percentages.

tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Salty_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Name : 
##              Frequency Percent Cum. percent
## Low                 36    45.0         45.0
## Medium              12    15.0         60.0
## High                 8    10.0         70.0
## I don't know        21    26.2         96.2
## LeftBlank            3     3.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Sweet_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Name : 
##              Frequency Percent Cum. percent
## Low                 28    35.0         35.0
## Medium              20    25.0         60.0
## High                13    16.2         76.2
## I don't know        13    16.2         92.5
## LeftBlank            6     7.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Bitter_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Name : 
##              Frequency Percent Cum. percent
## Low                 16    20.0         20.0
## Medium              18    22.5         42.5
## High                20    25.0         67.5
## I don't know        20    25.0         92.5
## LeftBlank            6     7.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Sour_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Name : 
##              Frequency Percent Cum. percent
## Low                 25    31.2         31.2
## Medium              12    15.0         46.2
## High                21    26.2         72.5
## I don't know        13    16.2         88.8
## LeftBlank            9    11.2        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Spicy_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Name : 
##              Frequency Percent Cum. percent
## Low                 32    40.0         40.0
## Medium               6     7.5         47.5
## High                12    15.0         62.5
## I don't know        19    23.8         86.2
## LeftBlank           11    13.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q1_RateTasteComponentOfDish_Flavorful_VarTitle, "\n(percentages)"), col=c("lightblue","blue","darkblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Name : 
##              Frequency Percent Cum. percent
## Low                 18    22.5         22.5
## Medium              15    18.8         41.2
## High                14    17.5         58.8
## I don't know        23    28.7         87.5
## LeftBlank           10    12.5        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q2_RateTemperatureOfDish_VarTitle, "\n(percentages)"), col=c("darkblue","lightblue", "orange", "red", "darkred", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Name : 
##              Frequency Percent Cum. percent
## Frozen               6     7.5          7.5
## Cold                40    50.0         57.5
## Warm                17    21.2         78.8
## Hot                  2     2.5         81.2
## Very Hot             1     1.2         82.5
## I don't know         6     7.5         90.0
## LeftBlank            8    10.0        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q3_OverallTasteRatingOfDish_VarTitle, "\n(percentages)"), col=c("darkblue","blue","lightblue", "lightgreen", "grey", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Name : 
##                           Frequency Percent Cum. percent
## Delicious (Smile Face)           24    30.0         30.0
## Okay (Flat Line Face)            22    27.5         57.5
## Unsatisfying (Frown Face)        26    32.5         90.0
## I don't know (\\)                 1     1.2         91.2
## I didn't try it (\\)              3     3.8         95.0
## LeftBlank                         4     5.0        100.0
## Default                           0     0.0        100.0
##   Total                          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q5_WillingnessToTry_DifferentKindsOfTodaysDish_VarTitle, "\n(percentages)"), col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Name : 
##              Frequency Percent Cum. percent
## 3+ times            29    36.2         36.2
## 2 times              4     5.0         41.2
## 1 time              18    22.5         63.8
## Never               16    20.0         83.8
## I don't know         6     7.5         91.2
## LeftBlank            7     8.8        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q5_WillingnessToTry_FruitIHaventEatenBefore_VarTitle, "\n(percentages)"), col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Name : 
##              Frequency Percent Cum. percent
## 3+ times            44    55.0         55.0
## 2 times             10    12.5         67.5
## 1 time               4     5.0         72.5
## Never                6     7.5         80.0
## I don't know         4     5.0         85.0
## LeftBlank           12    15.0        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q5_WillingnessToTry_VegetablesIHaventEatenBefore_VarTitle, "\n(percentages)"), col=c("darkblue","blue3","blue1", "lightblue", "lightgreen", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Name : 
##              Frequency Percent Cum. percent
## 3+ times            21    26.2         26.2
## 2 times             13    16.2         42.5
## 1 time              16    20.0         62.5
## Never               12    15.0         77.5
## I don't know         5     6.2         83.8
## LeftBlank           13    16.2        100.0
## Default              0     0.0        100.0
##   Total             80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q6_Grade_VarTitle, "\n(percentages)"), col=c("blue", "yellow", "blue", "yellow", "blue", "yellow", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Name : 
##           Frequency Percent Cum. percent
## 0 (K)             1     1.2          1.2
## 1                27    33.8         35.0
## 2                22    27.5         62.5
## 3                13    16.2         78.8
## 4                 0     0.0         78.8
## 5                15    18.8         97.5
## LeftBlank         2     2.5        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q7_TakenPartIn_TasteTestBefore_VarTitle, "\n(percentages)"), col=c("darkblue", "blue", "lightblue", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Name : 
##           Frequency Percent Cum. percent
## Yes              40    50.0         50.0
## Maybe             8    10.0         60.0
## No               30    37.5         97.5
## LeftBlank         2     2.5        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(Q8_TakenPartIn_GTCEventBefore_VarTitle, "\n(percentages)"), col=c("darkblue", "blue", "lightblue", "white", "black"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Name : 
##           Frequency Percent Cum. percent
## Yes              20    25.0         25.0
## Maybe            15    18.8         43.8
## No               40    50.0         93.8
## LeftBlank         5     6.2        100.0
## Default           0     0.0        100.0
##   Total          80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$ClassPeriod, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste(ClassPeriod_VarTitle, "\n(percentages)"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$ClassPeriod : 
##         Frequency Percent Cum. percent
## 1              13    16.2         16.2
## 2              21    26.2         42.5
## 3              17    21.2         63.8
## 6              15    18.8         82.5
## 7              14    17.5        100.0
##   Total        80   100.0        100.0
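
The repeated calls above could in principle be driven by a loop over a named vector of titles. A minimal sketch of that pattern follows, using base R `table()` on a hypothetical toy data frame so that it runs without the survey data. The column names `Q2` and `Q7` and the toy responses are made up for illustration. With the real data, the body of the function would presumably call tab1() with `main = titles[[v]]` instead.

```r
# Hypothetical toy data; the real report uses the survey data frame.
toy <- data.frame(
  Q2 = factor(c("Cold", "Cold", "Warm", "Cold", "Warm"),
              levels = c("Cold", "Warm")),
  Q7 = factor(c("Yes", "No", "Yes", "Maybe", "Yes"),
              levels = c("Yes", "Maybe", "No"))
)

# Chart titles keyed by variable name.
titles <- c(Q2 = "What is the temperature of today's dish?",
            Q7 = "Have you taken part in a taste test before?")

# Build one frequency table per variable instead of repeating the call.
freq_tables <- lapply(names(titles), function(v) {
  counts <- table(toy[[v]])
  data.frame(Frequency = as.vector(counts),
             Percent = round(100 * as.vector(counts) / sum(counts), 1),
             row.names = names(counts))
})
names(freq_tables) <- titles

freq_tables
```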

More recoding and tab1()’s

This sets up analysis of how many questions were left blank.

# Count _Num == 8 (LeftBlanks) across all 14 variables.
# Results match a similar Excel calculation.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count <- 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num == 8, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num == 8, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num == 8, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num == 8, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num == 8, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Num == 8, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Num == 8, 1, 0)

tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count, sort.group = "", cum.percent = TRUE, bar.values = "frequency", main = paste("Count of LeftBlanks for each student"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count : 
##         Frequency Percent Cum. percent
## 0              44    55.0         55.0
## 1              15    18.8         73.8
## 2               6     7.5         81.2
## 3               5     6.2         87.5
## 4               3     3.8         91.2
## 5               2     2.5         93.8
## 6               2     2.5         96.2
## 7               2     2.5         98.8
## 8               1     1.2        100.0
##   Total        80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste("Count of LeftBlanks for each student\n(percentages)"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count : 
##         Frequency Percent Cum. percent
## 0              44    55.0         55.0
## 1              15    18.8         73.8
## 2               6     7.5         81.2
## 3               5     6.2         87.5
## 4               3     3.8         91.2
## 5               2     2.5         93.8
## 6               2     2.5         96.2
## 7               2     2.5         98.8
## 8               1     1.2        100.0
##   Total        80   100.0        100.0
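
The same per-student count can be written more compactly with `rowSums()` over a logical comparison. This is a minimal sketch on a hypothetical toy data frame, with made-up column names, using the report's codes of 8 for LeftBlank and 7 for I don't know. One difference from the `ifelse()` sum is that a true `NA` in any column makes that sum `NA`, whereas `na.rm = TRUE` here simply skips it.

```r
# Hypothetical toy data: 8 = LeftBlank, 7 = I don't know.
toy <- data.frame(
  Q1_Num = c(1, 8, 7, 2),
  Q2_Num = c(8, 8, 3, 7),
  Q3_Num = c(2, 1, 7, NA)
)

num_cols <- c("Q1_Num", "Q2_Num", "Q3_Num")

# Comparing the block of columns to a code gives a logical matrix;
# rowSums() then counts the TRUE values in each row (one row per student).
toy$LeftBlank_Count <- rowSums(toy[num_cols] == 8, na.rm = TRUE)
toy$IDK_Count <- rowSums(toy[num_cols] == 7, na.rm = TRUE)

toy
```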
# Count _Num == 7 (I Don't Know - IDK) across all 11 variables where it was relevant.
# I updated the code so that IDK is always == 7
# Results match a similar Excel calculation.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count <- 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num == 7, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num == 7, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num == 7, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num == 7, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num == 7, 1, 0) + 
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num == 7, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num == 7, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num == 7, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num == 7, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num == 7, 1, 0) +
        ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num == 7, 1, 0)
        # +
        # ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num == 7, 1, 0) +
        # ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Num == 7, 1, 0) +
        # ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Num == 7, 1, 0)

tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count, sort.group = "", cum.percent = TRUE, bar.values = "frequency", main = paste("Count of I Don't Know-s for each student"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count : 
##         Frequency Percent Cum. percent
## 0              24    30.0         30.0
## 1              24    30.0         60.0
## 2              14    17.5         77.5
## 3              11    13.8         91.2
## 4               3     3.8         95.0
## 6               1     1.2         96.2
## 7               1     1.2         97.5
## 10              1     1.2         98.8
## 11              1     1.2        100.0
##   Total        80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste("Count of I Don't Know-s for each student\n(percentages)"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count : 
##         Frequency Percent Cum. percent
## 0              24    30.0         30.0
## 1              24    30.0         60.0
## 2              14    17.5         77.5
## 3              11    13.8         91.2
## 4               3     3.8         95.0
## 6               1     1.2         96.2
## 7               1     1.2         97.5
## 10              1     1.2         98.8
## 11              1     1.2        100.0
##   Total        80   100.0        100.0
## Add the LeftBlank and IDK counts
# Results match the Excel method

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_LeftBlankCtPLUSIDKCt <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfLeftBlankYes1No0_Count + GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_IfIDKYes1No0_Count

tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_LeftBlankCtPLUSIDKCt, sort.group = "", cum.percent = TRUE, bar.values = "frequency", main = paste("Count of IDK-s and Blanks for each student"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_LeftBlankCtPLUSIDKCt : 
##         Frequency Percent Cum. percent
## 0               9    11.2         11.2
## 1              19    23.8         35.0
## 2              15    18.8         53.8
## 3              13    16.2         70.0
## 4               9    11.2         81.2
## 5               3     3.8         85.0
## 6               4     5.0         90.0
## 7               4     5.0         95.0
## 8               1     1.2         96.2
## 9               1     1.2         97.5
## 10              1     1.2         98.8
## 11              1     1.2        100.0
##   Total        80   100.0        100.0
tab1(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_LeftBlankCtPLUSIDKCt, sort.group = "", cum.percent = TRUE, bar.values = "percent", main = paste("Count of IDK-s and Blanks for each student\n(percentages)"), col=c("blue", "yellow"))

## GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1to8_LeftBlankCtPLUSIDKCt : 
##         Frequency Percent Cum. percent
## 0               9    11.2         11.2
## 1              19    23.8         35.0
## 2              15    18.8         53.8
## 3              13    16.2         70.0
## 4               9    11.2         81.2
## 5               3     3.8         85.0
## 6               4     5.0         90.0
## 7               4     5.0         95.0
## 8               1     1.2         96.2
## 9               1     1.2         97.5
## 10              1     1.2         98.8
## 11              1     1.2        100.0
##   Total        80   100.0        100.0
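
The counts above are built by adding one ifelse() term per question. A more compact sketch of the same idea, assuming (as in the code above) that 7 codes "I don't know" and 8 codes a blank, compares a whole block of columns at once and sums across each row. The data frame `responses` and its column names are made up for illustration; they are not the report's data.

```r
# Toy data: three hypothetical question columns. Values 1-6 stand for real
# answers, 7 for "I don't know" and 8 for a blank (the coding used above).
responses <- data.frame(
  Q_a_Num = c(1, 7, 8, 7),
  Q_b_Num = c(7, 7, 2, 8),
  Q_c_Num = c(3, 7, 8, 8)
)

question_cols <- c("Q_a_Num", "Q_b_Num", "Q_c_Num")

# responses[question_cols] == 7 is a logical matrix; rowSums() counts the TRUEs
# in each row, which equals summing ifelse(x == 7, 1, 0) over the columns.
responses$IDK_Count        <- rowSums(responses[question_cols] == 7)
responses$LeftBlank_Count  <- rowSums(responses[question_cols] == 8)
responses$IDKorBlank_Count <- responses$IDK_Count + responses$LeftBlank_Count

responses
```

As with the ifelse() version, a missing value in any question column gives NA for that student's count unless `na.rm = TRUE` is passed to rowSums().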

Yet more recoding

This is a different way of accomplishing the recode task shown a few pages earlier. These variables may turn out to be vestigial, but as the code comment notes, they could be useful for later analysis (for example, counting IDKs or blanks for a subset of questions), so I will leave them as they are.

### Create IDK and LeftBlank variables for each variable, to compute IDK OR LeftBlank counts,
# and other options later.
# Not sure if these are strictly necessary, but since I have set them up, might as well run them.
# This would be useful if I wanted to find IDKs, LeftBlanks or combination for a subset of Qs.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num == 8, 1, 0) 
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Num == 8, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_IfLeftBlankYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Num == 8, 1, 0)


GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num == 7, 1, 0)
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_IfIDKYes1No0 <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num == 7, 1, 0)
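
The indicator assignments above repeat one pattern per question. As a sketch only, the same columns could be generated in a loop over the question stems. The toy data frame `survey` and the two stems below are stand-ins chosen for illustration, not the report's data.

```r
# Toy data with two of the question stems, using the same coding as above:
# 7 = "I don't know", 8 = left blank.
survey <- data.frame(
  Q2_RateTemperatureOfDish_Num    = c(1, 7, 8),
  Q3_OverallTasteRatingOfDish_Num = c(8, 2, 7)
)

stems <- c("Q2_RateTemperatureOfDish", "Q3_OverallTasteRatingOfDish")

for (stem in stems) {
  num_col <- paste0(stem, "_Num")
  # One 0/1 indicator per question for blanks and one for IDKs.
  survey[[paste0(stem, "_IfLeftBlankYes1No0")]] <- ifelse(survey[[num_col]] == 8, 1, 0)
  survey[[paste0(stem, "_IfIDKYes1No0")]]       <- ifelse(survey[[num_col]] == 7, 1, 0)
}

names(survey)
```

Adding or dropping a question then means editing the `stems` vector rather than copying a line.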

Taste test faces scatterplot

This is an attempt to create a graphic similar to one of the data graphics Tufte reprinted, which used faces as points in a scatterplot.

First, I had to find face characters in a font that matched the icons used in the survey. The best match was in the font Wingdings. I also needed to find characters in the font that could plausibly represent “I don’t know”, “I didn’t try it” and “LeftBlank”.
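
The mapping from survey responses to font characters can be written as a lookup table. The sketch below uses the same six pairs as the recode() call in this section, but applies them with a base R named vector to a made-up response vector. In Wingdings, "J", "K" and "L" render as the smiling, neutral and frowning faces.

```r
# Response code -> character to draw in the Wingdings font
# (same pairs as the recode() call in this section).
face_lookup <- c(D = "J", O = "K", U = "L", IDK = "O", IDNT = "S", LeftBlank = "T")

# Made-up responses for illustration only.
ratings <- c("D", "U", "IDK", "O", "LeftBlank", "D")

# Indexing a named vector by name performs the recode; unname() drops the labels.
faces <- unname(face_lookup[ratings])
faces
```

A response code that is not in the lookup comes back as NA, which makes unexpected values easy to spot.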

To make the legend display correctly, I needed the override.aes argument of the guide_legend() function.

To some extent, this faces scatterplot was a solution in search of a problem. Sometimes that is okay. I decided to explore whether there was a relationship between two ratings of taste components and the overall taste rating. After the initial plots, I realized that the scatterplot needs jitter to be readable, because many responses fall on the same pair of categories.
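
Here is a minimal sketch of the two ggplot2 pieces mentioned above. It uses made-up data and ordinary letters rather than Wingdings characters, so it does not depend on an installed font. By default the legend keys for geom_text() show a generic "a"; override.aes replaces them with the character used for each group. position_jitter() spreads out points that would otherwise stack on the same category pair. The category labels in `toy` are invented for the example.

```r
library(ggplot2)

# Made-up responses: two categorical ratings, an overall rating, and the
# character to draw for each point.
toy <- data.frame(
  salty  = c("Too little", "Just right", "Just right", "Too much", "Just right", "Too much"),
  sweet  = c("Just right", "Just right", "Too much", "Too much", "Just right", "Too little"),
  rating = c("D", "D", "O", "U", "D", "O"),
  glyph  = c("J", "J", "K", "L", "J", "K")
)

p <- ggplot(toy, aes(salty, sweet, color = rating)) +
  # The label is mapped inside aes() here so it stays aligned with the rows.
  geom_text(aes(label = glyph), size = 5,
            position = position_jitter(width = 0.25, height = 0.25, seed = 123)) +
  # Legend keys follow the sorted levels of `rating` (D, O, U), so the
  # override labels are given in that order.
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L")),
                              title = "Overall rating"))
p
```

The fixed `seed` makes the jitter reproducible, so the same plot is drawn each time the document is knitted.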

## Create a scatterplot with ggplot using smile faces as points with corrected legend.

## 1

# Recode from original variable.
# Note that I only recoded one _Faces2 variable. If I wanted to `color` the points by some
# other variable, I would have to recode that variable to the _Faces2 form.
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2 <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish, "D" = "J", "O" = "K", "U" = "L", "IDK" = "O", "IDNT" = "S", "LeftBlank" = "T") # removed {, 9 = "o"} - may not like it because no .defaults in data


# This starts the Faces plots I want.

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Taste Component of Dish: Sweet")

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Taste Component of Dish: Bitter")

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Taste Component of Dish: Sour")

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Taste Component of Dish: Spicy")

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Taste Component of Dish: Flavorful")

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Salty_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Salty",
       y="Rate Temperature of Dish")

The faces scatterplots below use variable names for the axis labels. This is one time where long variable names can be useful, because you can easily figure out what the variables represent, although the text labels do look better. I did add the text labels to the last two faces scatterplots, since those are the most important and interesting: they have wider scales and are better suited to being scatterplots.

This is also an example of what the NYC Civil Service exams call Inboxing: prioritizing what to do when there is too much work and too little time, and choosing the appropriate sequence in which to do tasks. In any case, these faces scatterplots, while happiness-inducing to contemplate, are somewhat inconclusive as analysis, so it isn’t the end of the world if not all of them are perfectly formatted.

More importantly, speaking of the NYC Civil Service exams: if you are ever considering taking a job with a NYC government agency, you MUST take and pass one of the NYC Civil Service exams. A Ph.D. is not enough. Otherwise, someone might discriminate against you. I speak from experience.

print("I tried to insert the above text chunk using the `print()` function, but it didn't work. Created new text and code chunks instead.")
## [1] "I tried to insert the above text chunk using the `print()` function, but it didn't work. Created new text and code chunks instead."
####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sweet_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Bitter_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Sour_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Spicy_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_Flavorful_Name`, `Q2_RateTemperatureOfDish_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

####
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Salty_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Sweet_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Bitter_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Sour_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Spicy_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q2_RateTemperatureOfDish_Name`, `Q1_RateTasteComponentOfDish_Flavorful_Name`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.25, h = 0.25, seed = 123)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) # {, "o"}

Recode variables for a scatterplot that sums variables

This creates the variables needed for a scatterplot with a greater range, given the available data. In this version, “I don’t know” responses, “I didn’t try it” responses and missing values are coded as 0.

### Recode for faces scatterplot

# Recode Q1_Num to Q1_SumNum

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num,
         "1" = 1,
         "2" = 2,
         "3" = 3,
         "7" = 0,
         "8" = 0,
         "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num,
                                                                                          "1" = 1,
                                                                                          "2" = 2,
                                                                                          "3" = 3,
                                                                                          "7" = 0,
                                                                                          "8" = 0,
                                                                                          "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num,
                                                                                           "1" = 1,
                                                                                           "2" = 2,
                                                                                           "3" = 3,
                                                                                           "7" = 0,
                                                                                           "8" = 0,
                                                                                           "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num,
                                                                                         "1" = 1,
                                                                                         "2" = 2,
                                                                                         "3" = 3,
                                                                                         "7" = 0,
                                                                                         "8" = 0,
                                                                                         "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num,
                                                                                          "1" = 1,
                                                                                          "2" = 2,
                                                                                          "3" = 3,
                                                                                          "7" = 0,
                                                                                          "8" = 0,
                                                                                          "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num,
                                                                                              "1" = 1,
                                                                                              "2" = 2,
                                                                                              "3" = 3,
                                                                                              "7" = 0,
                                                                                              "8" = 0,
                                                                                              "9" = 0)


# Recode Q5_Num to Q5_SumNum

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_Num,
    "3" = 3,
    "2" = 2,
    "1" = 1,
    "0" = 0,
    "7" = 0,
    "8" = 0,
    "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num,
                                                                                                    "3" = 3,
                                                                                                    "2" = 2,
                                                                                                    "1" = 1,
                                                                                                    "0" = 0,
                                                                                                    "7" = 0,
                                                                                                    "8" = 0,
                                                                                                    "9" = 0)

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_SumNum <- recode(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num,
                                                                                                         "3" = 3,
                                                                                                         "2" = 2,
                                                                                                         "1" = 1,
                                                                                                         "0" = 0,
                                                                                                         "7" = 0,
                                                                                                         "8" = 0,
                                                                                                         "9" = 0)
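The nine recode() calls above all apply essentially one mapping. As a minimal sketch of a more compact route, a named lookup vector can do the same recode in a loop. This assumes every `_Num` column of interest should receive the same mapping; the original Q1 calls do not list a "0" code, so including it for Q1 is an assumption. The `toy` data frame, `sum_codes` and `recode_for_sum()` are hypothetical names used only for illustration, and the values are invented.

```r
# Hypothetical lookup: response codes -> values usable in a sum.
# Codes 7, 8 and 9 cover the "I don't know", "I didn't try it" and
# missing responses, all of which become 0.
sum_codes <- c("0" = 0, "1" = 1, "2" = 2, "3" = 3, "7" = 0, "8" = 0, "9" = 0)

recode_for_sum <- function(x) {
  # Index the lookup by the code's character form; unknown codes become NA.
  unname(sum_codes[as.character(x)])
}

# Toy stand-in for the real dataset (values invented for illustration).
toy <- data.frame(
  Q1_Salty_Num = c(1, 2, 3, 7, 8, 9),
  Q5_Fruit_Num = c(3, 0, 1, 9, 2, 7)
)

# Create a matching _SumNum column for every _Num column.
for (col in grep("_Num$", names(toy), value = TRUE)) {
  toy[[sub("_Num$", "_SumNum", col)]] <- recode_for_sum(toy[[col]])
}
```

On the real dataset the `grep()` pattern would need to be narrowed to the Q1 and Q5 columns, since other `_Num` variables (such as Q2 and Q3) exist and should not be recoded this way.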

Compute variables of the sum of taste components and willingness to try

This computes the variables used to show the sum of taste component responses on the one hand, and the sum of willingness to try variables on the other.

### Compute Q1 and Q5 _Sum variables

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_SumAll <-
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_SumAll <-
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_DifferentKindsOfTodaysDish_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_SumNum +
  GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_SumNum
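The same sums could probably be computed with rowSums() over the columns selected by name. This is a sketch on an invented `toy` data frame with shortened, hypothetical column names, not the code used for the report. Like the `+` version above, rowSums() returns NA for a row in which any component is NA.

```r
# Toy stand-in with three of the Q1 _SumNum columns (values invented).
toy <- data.frame(
  Q1_Salty_SumNum = c(1, 0, 3),
  Q1_Sweet_SumNum = c(2, 2, 0),
  Q1_Sour_SumNum  = c(3, 1, NA)
)

# Select the columns by pattern, then add them up row by row.
q1_cols <- grep("^Q1_.*_SumNum$", names(toy), value = TRUE)
toy$Q1_SumAll <- unname(rowSums(toy[q1_cols]))
```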

Scatterplots of the sum of taste components vs sum of willingness to try

Here I plot the sum of the taste component variables against the sum of the willingness to try new foods variables.

## Scatterplot of Q1_SumAll vs Q5_SumAll
# No jitter
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_SumAll`, `Q5_WillingnessToTry_SumAll`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5) + # , position = position_jitter(w = 0.25, h = 0.25, seed = 123)
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot, no jitter",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Sum of all components",
       y="Willingness to try new foods: Sum of all willingness to try variables")

# With jitter
ggplot(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, aes(`Q1_RateTasteComponentOfDish_SumAll`, `Q5_WillingnessToTry_SumAll`, color = `Q3_OverallTasteRatingOfDish_Name`)) +
  geom_text(label=GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Faces2, family = "Wingdings", size = 5, position = position_jitter(w = 0.4, h = 0.4, seed = 42)) +
  guides(color = guide_legend(override.aes = list(label = c("J", "K", "L", "O", "S", "T")), title="Overall Taste Rating of Dish")) +
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) +
  labs(title="Faces scatterplot, with jitter",
       subtitle="Inspired by Tufte",
       x="Rate Taste Component of Dish: Sum of all components",
       y="Willingness to try new foods: Sum of all willingness to try variables")

The initial takeaway, after all of the coding, is that the overall taste response seems to have no obvious relationship to willingness to try new foods or taste component ratings. On the other hand, it is a small sample, and others should stare at the results for a while too.

Social Network Analysis

Generate matrix variables for Social Network Analysis

This section generates a set of matrix variables for Social Network Analysis (SNA), which can then be exported, trimmed in Excel and imported into the Gephi software for SNA visualization. This visualization can also be accomplished using the igraph R package, but I haven’t figured out how to use igraph yet. This section was primarily for the Educational Data Mining course.

The attached/included R code takes several variables from an 80-student dataset from a taste test conducted by Robert Abrams prior to the pandemic. It is an attempt to see whether the 80 students can be grouped by social network analysis of their responses to the components of taste. It compares each pair of students on each of six variables. If the students have an exact match on a variable, a 1 is assigned. If they are off by one category, a 0.5 is assigned, and 0.25 if off by two categories. If one student has assessed the component of taste, but the other has said “I Don’t Know” or left it missing, a 0 is assigned. These six comparison values are then summed, unless the two students are the same student, in which case the missing value “NA” is recorded. (The dish being taste tested was a spinach, arugula and sliced carrot salad with an apple-based dressing.)
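As a worked example of this scoring rule, here is a minimal sketch for one pair of students. The helper name `gradient_match()` and the two students' ratings are invented for illustration; ratings are coded 1 = Low, 2 = Medium, 3 = High, with 0 standing for “I don’t know” or missing. Because equality is checked first, as it is in the script later in this section, two students who both have a 0 on a component also score 1 on it.

```r
# Hypothetical helper implementing the scoring rule described above.
gradient_match <- function(a, b) {
  gap <- abs(a - b)
  ifelse(a == b, 1,                  # exact match
         ifelse(a == 0 | b == 0, 0,  # only one of the two students rated it
                1 / (2 * gap)))      # off by one -> 0.5, off by two -> 0.25
}

# Two invented students, six taste components each.
student_a <- c(Salty = 1, Sweet = 2, Bitter = 3, Sour = 0, Spicy = 1, Flavorful = 3)
student_b <- c(Salty = 1, Sweet = 3, Bitter = 1, Sour = 2, Spicy = 0, Flavorful = 3)

per_component <- gradient_match(student_a, student_b)
pair_score <- sum(per_component)
```

Here the per-component scores are 1, 0.5, 0.25, 0, 0 and 1, so the value stored for this pair would be 2.75.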

As such, this represents a network of students who all attended the same school.

The R code transforms existing conventional variables into a matrix suitable for social network analysis. This matrix is stored in the dataset, which is then exported, cleaned up in Excel, and then imported into Gephi. The OpenOrd algorithm with Cutting set to 1.0 seemed to produce the most plausible result.
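An alternative to cleaning the exported matrix in Excel might be to reshape it into an edge list in R first. The sketch below assumes that Gephi's spreadsheet import accepts an edge table with Source, Target and Weight columns; the student IDs, the weights and the file name are all invented, and this is not the export route used for the report.

```r
# Toy symmetric similarity matrix for three invented students.
ids <- c("P1501", "P1502", "P2503")
sim <- matrix(c(NA,  2.5, 0,
                2.5, NA,  1,
                0,   1,   NA),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# Keep each pair once (upper triangle), which also skips the NA diagonal.
keep <- which(upper.tri(sim), arr.ind = TRUE)
edges <- data.frame(
  Source = rownames(sim)[keep[, "row"]],
  Target = colnames(sim)[keep[, "col"]],
  Weight = sim[keep]
)

# Drop pairs with no similarity so they do not become zero-weight edges.
edges <- edges[edges$Weight > 0, ]

# write.csv(edges, "edges.csv", row.names = FALSE)  # hypothetical file name
```

Whether zero-weight pairs should be dropped is a design choice; keeping them would give every pair of students an edge.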

A visualization of the network produced by Gephi is attached/included.

There appear to be three rough clusters of students. If this analysis were done with identifiable students (these students are not identifiable), one possible use of the analysis would be to have the teacher group the students in their clusters to discuss why they responded to the components of taste the way they did. It is clear, even from this slightly difficult to read graphic, that the students in each cluster came from different periods (the number after the P) in which the taste test was administered, and thus also different classes and grades (K-5). Thus, this social network analysis would enable the school to implement supplemental instruction that reaches beyond the normal class and grade structure, potentially enhancing both direct and indirect instructional goals.

This represents an imperfect process and result, with an imperfect understanding of social network analysis, but it is learning progress.

The techniques shown here, particularly the regular variable to matrix variable transformation, could be applied to other social network analysis projects. And, because the R script generates a data matrix that could be used inside of or outside of R, it would also support teams who are using a combination of statistical programming languages, which is the case for our EDM class team. (There were several other similar regular to matrix data transformations tried before this one, but they didn’t produce useful results, so are omitted here.)

As I understand it, I need a set of vertices, and then define edges between them. The equivalent terminology in concept mapping is nodes (or concepts) and links, but since the igraph package uses vertices and edges, I will use that terminology to be consistent. (For concept mapping, I wrote a program called LifeMap. Or try CMapTools.)

My vertices are the 80 students. The data already has a unique identifier for each student. The CodeNum variable has values that look like this: P7514. The P means Period. The number after the P is the number of the period in the school’s schedule during which the student participated in the taste test. The next number is always a 5 in this dataset. To be honest, I can’t remember why it is there. The next two digits are a number assigned to each survey: 01, 02, 03, and so on. The main purpose of the CodeNum variable in this case was to be able to error-check the scanned data against the paper surveys. Otherwise, the number assigned to each survey is random in the sense that the surveys happened to be collected in an order that had no particular meaning, and then the ID numbers were written on the surveys after they were collected. (And some of the students wouldn’t normally have been in the science classroom in that period, but were sent by the principal or other school staff to participate in the taste test, so the period number doesn’t necessarily identify students by their regular science period either. Since I didn’t record who would have had science in that period and who would not have, the period number isn’t useful at all in terms of identifying any students, which in this case is good.)
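To make the CodeNum layout concrete, here is a small sketch that splits it into its parts by position. It assumes a single-digit period (the dataset name suggests periods 1 to 7) and a two-digit survey number; apart from P7514, the example values are invented.

```r
# Example CodeNum values in the format described above.
code_num <- c("P7514", "P1501", "P3527")

# Character 2 is the period; characters 4 and 5 are the survey number.
period    <- as.integer(substr(code_num, 2, 2))
survey_no <- as.integer(substr(code_num, 4, 5))
```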

My first coding challenge is to generate a VerticeList vs VerticeList dataset from my existing dataset.
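One way to picture the target is a square table with one row and one column per student, where each cell holds the comparison between two students. A minimal base R sketch, using invented ratings and the hypothetical names `code_num`, `q3_rating` and `match_q3`, builds such a table with outer():

```r
# Invented stand-ins for CodeNum and the Q3 overall rating codes.
code_num  <- c("P1501", "P1502", "P2503", "P2504")
q3_rating <- c(3, 1, 3, 2)

# outer() compares every student with every other student in one call.
# Adding 0 turns TRUE/FALSE into 1/0.
match_q3 <- outer(q3_rating, q3_rating, "==") + 0

# Label rows by student and columns by the student's _V_Q3 variable name.
dimnames(match_q3) <- list(code_num, paste0(code_num, "_V_Q3"))
```

In this toy table the first and third students gave the same rating, so their shared cells hold 1, and every student matches themselves on the diagonal.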

### Attempt to implement Social Network Analysis (SNA) using the GTC taste test data
# This is building on the other course I am taking: Educational Data Mining
# It would use the igraph package if the SNA visualization were done in R. In this case, I exported the data and did the visualization in the specialized SNA software Gephi.

# Set up first _V_Q3 variable

# This produces a list of modified CodeNum-s that will then be mirrored in a set of variable names.
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q3 <- 
  paste(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum, "_V_Q3", sep="")

### Create the 80 SNA variables. The 9 value is a placeholder of the same length as the real values, since the real values, in this case, can only be 0 or 1.
# It might be possible to instead use something like vector() to make the cells empty.
# I could probably also use a variable to set the end of the while() loop, but I know how many data points I have, so I am not going to bother right now.
StudentCounterVar1 <- 1
while (StudentCounterVar1 < 81) {
  eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q3[StudentCounterVar1]," <- ", 9)))
  StudentCounterVar1 = StudentCounterVar1+1
}

#### Assign student v student comparison values to the 80 variables.
StudentCounterVar1 <- 1
StudentCounterVar2 <- 1
while (StudentCounterVar1 < 81) {
  StudentCounterVar2 <- 1
  while (StudentCounterVar2 < 81) {
    ComparisonVar <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q3_OverallTasteRatingOfDish_Num[StudentCounterVar2], 1, 0)
    eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q3[StudentCounterVar1],"[StudentCounterVar2]", "<-", "ComparisonVar")))
    StudentCounterVar2 = StudentCounterVar2+1
  }
  StudentCounterVar1 = StudentCounterVar1+1
}



# Q1 Salty
# This produces a list of modified CodeNum-s that will then be mirrored in a set of variable names.

GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1_Salty <- 
  paste(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum, "_V_Q1_Salty", sep="")

### Create the 80 SNA variables for _V_Q1_Salty
StudentCounterVar1 <- 1
while (StudentCounterVar1 < 81) {
  eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1_Salty[StudentCounterVar1]," <- ", 9)))
  StudentCounterVar1 = StudentCounterVar1+1
}

#### Assign student v student comparison values to the 80 variables for _V_Q1_Salty .
StudentCounterVar1 <- 1
StudentCounterVar2 <- 1
while (StudentCounterVar1 < 81) {
  StudentCounterVar2 <- 1
  while (StudentCounterVar2 < 81) {
    ComparisonVar <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num[StudentCounterVar2], 1, 0)
    eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1_Salty[StudentCounterVar1],"[StudentCounterVar2]", "<-", "ComparisonVar")))
    StudentCounterVar2 = StudentCounterVar2+1
  }
  StudentCounterVar1 = StudentCounterVar1+1
}



# Q1 All plus Q2
# This produces a list of modified CodeNum-s that will then be mirrored in a set of variable names.
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allPlusQ2 <- 
  paste(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum, "_V_Q1allPlusQ2", sep="")

### Create the 80 SNA variables for _V_Q1allPlusQ2
# In this case, I am setting the values of the new variables to NA (missing) rather than 9, because the data that follows might contain a 9, depending on how it is computed.
StudentCounterVar1 <- 1
while (StudentCounterVar1 < 81) {
  eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allPlusQ2[StudentCounterVar1]," <- ", NA)))
  StudentCounterVar1 = StudentCounterVar1+1
}

#### Assign student v student comparison values to the 80 variables for _V_Q1allPlusQ2 .
# 11-19-2021
# This script block follows the realization from class last night that SNA can take values other than 0 and 1.
# Here I want to know on how many components of taste each student matches with each student.
# I have included Q2, temperature, partly because I need the number of components to be an odd number, for reasons I will explain later.
# I am going to hard-code the variables to be computed within the inner while() loop for now. I could make those work off of a variable later.
# One reason to hard-code them first is that then I can see what I am doing laid out. Then, I can compress the code and be more sure I am doing what I want correctly.
# That's the theory, anyway.
# I am also going to set the cells for students compared to themselves to NA (missing). Apparently this can be set either way, depending on the type of SNA.
# The inner while() loop will compute one SNA matrix. I should also be able to do this by computing a series of matrices and then adding the matrices together.

StudentCounterVar1 <- 1
StudentCounterVar2 <- 1
while (StudentCounterVar1 < 81) {
  StudentCounterVar2 <- 1
  while (StudentCounterVar2 < 81) {
    ComparisonVar_Salty <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Sweet <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Bitter <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Sour <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Spicy <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Flavorful <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_Num[StudentCounterVar2], 1, 0)
    ComparisonVar_Temperature <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q2_RateTemperatureOfDish_Num[StudentCounterVar2], 1, 0)

    # Sets matrix cell value to NA (missing) if student is compared to self, otherwise sets matrix cell to sum of seven matches on taste components.
    # Apparently there are situations where one wants to match students to selves, but I am not sure when or why.
    # Note: Gephi doesn't like NA nor "." as missing value markers. It wants empty cells. Might be able to accomplish the same thing by removing the NA here and unchecking the Self-Loops option on import into Gephi.
    ComparisonVar_AllSeven <- ifelse(StudentCounterVar1 == StudentCounterVar2, NA, ComparisonVar_Salty + ComparisonVar_Sweet + ComparisonVar_Bitter + ComparisonVar_Sour + ComparisonVar_Spicy + ComparisonVar_Flavorful + ComparisonVar_Temperature)
    
    eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allPlusQ2[StudentCounterVar1],"[StudentCounterVar2]", "<-", "ComparisonVar_AllSeven")))
    StudentCounterVar2 = StudentCounterVar2+1
  }
  StudentCounterVar1 = StudentCounterVar1+1
}





#######


# Q1 All Gradient
# This produces a list of modified CodeNum-s that will then be mirrored in a set of variable names.
GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allGradient <- 
  paste(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum, "_V_Q1allGradient", sep="")

### Create the 80 SNA variables for _V_Q1allGradient
# In this case, I am setting the values of the new variables to NA (missing) rather than 9, because the data that follows might contain a 9, depending on how it is computed.
StudentCounterVar1 <- 1
while (StudentCounterVar1 < 81) {
  eval(parse(text = paste0("GTC_Tschool_P1to7_5_14_19_adj_10_24_21$",GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allGradient[StudentCounterVar1]," <- ", NA)))
  StudentCounterVar1 = StudentCounterVar1+1
}

#### Assign student v student comparison values to the 80 variables for _V_Q1allGradient .
# 11-22-2021
# This script block follows the realization from class last night that SNA can take values other than 0 and 1.
# Here I want to know on how many components of taste each student matches with each student,
# except I will also allow for gradient-partial matches.
# Instead of using the _Num variables, I am using the _SumNum variables. These have been recoded so that I don't know and missing are 0.
# For Q1_SumNum, Low = 1, Medium = 2 and High = 3. IDK and Missing = 0, but these are not a sequence with the others in this case.
# Note to self: The Q5_SumNum variables won't work as is for this purpose, because Never/0 is treated the same as IDK/0 and Missing/0.
# Have to recode that one again if I use it for this purpose with SNA.

StudentCounterVar1 <- 1
StudentCounterVar2 <- 1
while (StudentCounterVar1 < 81) {
  StudentCounterVar2 <- 1
  while (StudentCounterVar2 < 81) {
    
    ComparisonVar_Salty <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Salty_SumNum[StudentCounterVar2]) * 2)
                                         )
                                         )
                                         )

    ComparisonVar_Sweet <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sweet_SumNum[StudentCounterVar2]) * 2)
                                         )
                                  )
    )
    
    ComparisonVar_Bitter <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Bitter_SumNum[StudentCounterVar2]) * 2)
                                         )
                                  )
    )
    
    ComparisonVar_Sour <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Sour_SumNum[StudentCounterVar2]) * 2)
                                         )
                                  )
    )
    
    ComparisonVar_Spicy <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Spicy_SumNum[StudentCounterVar2]) * 2)
                                         )
                                  )
    )
    
    ComparisonVar_Flavorful <- ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar1] == GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar2], 1, 
                                  ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar1] == 0, 0,
                                         ifelse(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar2] == 0, 0,
                                                1/(abs(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar1] - GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q1_RateTasteComponentOfDish_Flavorful_SumNum[StudentCounterVar2]) * 2)
                                         )
                                  )
    )
    

    # Sets matrix cell value to NA (missing) if student is compared to self, otherwise sets matrix cell to sum of six match scores on taste components.

    ComparisonVar_AllSix <- ifelse(StudentCounterVar1 == StudentCounterVar2, NA, ComparisonVar_Salty + ComparisonVar_Sweet + ComparisonVar_Bitter + ComparisonVar_Sour + ComparisonVar_Spicy + ComparisonVar_Flavorful)
    
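    # Write the score into the column named for student 1, at the row for student 2.
    # Indexing the column by name with [[ ]] avoids building and evaluating code as text.
    GTC_Tschool_P1to7_5_14_19_adj_10_24_21[[as.character(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$CodeNum_VerticeList_Q1allGradient[StudentCounterVar1])]][StudentCounterVar2] <- ComparisonVar_AllSix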
    StudentCounterVar2 = StudentCounterVar2+1
  }
  StudentCounterVar1 = StudentCounterVar1+1
}


### Re-export dataset after transformations
# https://caltechlibrary.github.io/data-carpentry-R-ecology-lesson/03-dplyr.html
# REMEMBER to change the date in the file name to create a new file.
## Uncomment this when needing to export the file, and adjust the path as needed.
# Maybe it is just me, but I think it would be nice if R could pop up a dialogue box for saving the file. If I didn't want to interrupt a longer code-flow, I could trigger the dialogue box at the beginning of the run, save the path and file name to a variable, and then use that variable here later.
# write.csv(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, file = "Intro to R class/Final Project/Robert Abrams Final R Project /RobertAbramsTasteTestR/GTC_Ts_P1t7_051419_adj102421_Ot112221.csv")
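
The dialogue-box idea in the comment above can be sketched with the tcltk package that ships with most R installations. This is a minimal sketch under assumptions: the helper name choose_export_path() and the file name "taste_test_export.csv" are hypothetical, and the dialogue only appears in an interactive session where Tcl/Tk is available. Otherwise the default path is returned unchanged, so a non-interactive run is not interrupted.

```r
# Hypothetical helper: ask for a save location once, near the start of a run,
# and fall back to a default path when no dialogue can be shown.
choose_export_path <- function(default_path) {
  if (interactive() && requireNamespace("tcltk", quietly = TRUE)) {
    chosen <- tcltk::tclvalue(
      tcltk::tkgetSaveFile(
        initialfile = basename(default_path),
        defaultextension = ".csv"
      )
    )
    # An empty string means the dialogue was cancelled.
    if (nzchar(chosen)) {
      return(chosen)
    }
  }
  default_path
}

# Trigger the dialogue at the beginning of the run and keep the result.
export_path <- choose_export_path("taste_test_export.csv")  # hypothetical file name

# Later, at the export step:
# write.csv(GTC_Tschool_P1to7_5_14_19_adj_10_24_21, file = export_path)
```

Storing the path in a variable up front means the export step at the end of a longer code-flow needs no further interaction.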

Social Network Analysis visualization

This image was produced by the Gephi software. For now, the main point is that the matrix-variable data transformation did yield a visualization of interest.

Taste Test Social Network Analysis visualization produced by the Gephi software
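
The pairwise comparison loop that produced the matrix variables behind this visualization can also be sketched in vectorized form. The following is a minimal sketch, not the code used for the report: the pair_score() helper, the toy ratings data frame and its values are hypothetical, and only two of the six taste components are shown. It applies the same scoring rule as the loop (1 for identical ratings, 0 if either rating is 0, otherwise 1 divided by twice the absolute difference) and sums the scores across components.

```r
# Score one pair of ratings on a single taste component.
# Same rule as the nested ifelse() calls in the loop.
pair_score <- function(a, b) {
  ifelse(a == b, 1,
         ifelse(a == 0 | b == 0, 0, 1 / (abs(a - b) * 2)))
}

# Hypothetical toy ratings for four students on two components.
ratings <- data.frame(
  Salty = c(3, 3, 0, 1),
  Sweet = c(2, 4, 2, 2)
)

# outer() scores every student against every other student, per component.
per_component <- lapply(ratings, function(x) outer(x, x, pair_score))

# Sum the component matrices into one student-by-student similarity matrix.
similarity <- Reduce("+", per_component)

# A student compared to self is set to NA (missing), as in the loop.
diag(similarity) <- NA

print(similarity)
```

Because the scoring rule is symmetric, the resulting matrix is symmetric, which is the shape an undirected network tool expects for edge weights.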

Replicate supplemental taste test results report

I will try as much as possible for this coding run to stick to tidyverse solutions.

The 2/14/2018 supplemental results report was primarily about cross-tabs, so this analysis will also focus on cross-tabs, especially since I haven’t done any for this analysis so far. However, the snack fruit taste test included two fruits, so it was possible to cross-tab responses to those, such as “What percentage of students found at least one fruit they thought tasted delicious?” and “What percentage of students reported they thought two fruits tasted delicious?” Since the taste test being analysed here only tested one dish, such cross-tabs are not possible. Instead, I will find other questions where similar cross-tabs might be reasonable. (Also note that the CrossTable() text-tables display okay on the screen, with the shortened substitution for the variable name, but not so well when printed in portrait orientation. It could be worth trying to print into a PDF in landscape orientation for the CrossTable() tables section, if printing is needed.)

## 11-29-2021
## Replicate Supplemental results from a Garden To Café scannable taste test survey for snack fruit administered in classrooms at PSABX on 12/14/2017 / by Robert Abrams, Ph.D., published 2/14/2018

# I tried piping variables to summarize(), but it didn't produce the results I wanted. The formatting of the tables looked more okay than just text-based tables, though.

# I am going to stay within
# http://analyticswithr.com/contingencytables.html
# but jump to CrossTable()
# It looks like it produces a better labeled table, if still Stata-ish.
# Run the CrossTable() command, with your two variables as inputs.
# CrossTable(mpg$class, mpg$cyl)

# Make shorter variables so the cross-tab displays better
Willingness_To_Try_Fruit <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num
# For some CrossTable() cross-tabs, an even shorter variable name is needed.
Will_Try_Fruit <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num

Willingness_To_Try_Vegetables <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num

CrossTable(Will_Try_Fruit, Willingness_To_Try_Vegetables)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  80 
## 
##  
##                | Willingness_To_Try_Vegetables 
## Will_Try_Fruit |         0 |         1 |         2 |         3 |         7 |         8 | Row Total | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              0 |         2 |         2 |         2 |         0 |         0 |         0 |         6 | 
##                |     1.344 |     0.533 |     1.078 |     1.575 |     0.375 |     0.975 |           | 
##                |     0.333 |     0.333 |     0.333 |     0.000 |     0.000 |     0.000 |     0.075 | 
##                |     0.167 |     0.125 |     0.154 |     0.000 |     0.000 |     0.000 |           | 
##                |     0.025 |     0.025 |     0.025 |     0.000 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              1 |         1 |         1 |         1 |         1 |         0 |         0 |         4 | 
##                |     0.267 |     0.050 |     0.188 |     0.002 |     0.250 |     0.650 |           | 
##                |     0.250 |     0.250 |     0.250 |     0.250 |     0.000 |     0.000 |     0.050 | 
##                |     0.083 |     0.062 |     0.077 |     0.048 |     0.000 |     0.000 |           | 
##                |     0.013 |     0.013 |     0.013 |     0.013 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              2 |         0 |         6 |         1 |         3 |         0 |         0 |        10 | 
##                |     1.500 |     8.000 |     0.240 |     0.054 |     0.625 |     1.625 |           | 
##                |     0.000 |     0.600 |     0.100 |     0.300 |     0.000 |     0.000 |     0.125 | 
##                |     0.000 |     0.375 |     0.077 |     0.143 |     0.000 |     0.000 |           | 
##                |     0.000 |     0.075 |     0.013 |     0.037 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              3 |         8 |         7 |         9 |        16 |         2 |         2 |        44 | 
##                |     0.297 |     0.368 |     0.479 |     1.715 |     0.205 |     3.709 |           | 
##                |     0.182 |     0.159 |     0.205 |     0.364 |     0.045 |     0.045 |     0.550 | 
##                |     0.667 |     0.438 |     0.692 |     0.762 |     0.400 |     0.154 |           | 
##                |     0.100 |     0.087 |     0.113 |     0.200 |     0.025 |     0.025 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              7 |         0 |         0 |         0 |         1 |         3 |         0 |         4 | 
##                |     0.600 |     0.800 |     0.650 |     0.002 |    30.250 |     0.650 |           | 
##                |     0.000 |     0.000 |     0.000 |     0.250 |     0.750 |     0.000 |     0.050 | 
##                |     0.000 |     0.000 |     0.000 |     0.048 |     0.600 |     0.000 |           | 
##                |     0.000 |     0.000 |     0.000 |     0.013 |     0.037 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              8 |         1 |         0 |         0 |         0 |         0 |        11 |        12 | 
##                |     0.356 |     2.400 |     1.950 |     3.150 |     0.750 |    42.001 |           | 
##                |     0.083 |     0.000 |     0.000 |     0.000 |     0.000 |     0.917 |     0.150 | 
##                |     0.083 |     0.000 |     0.000 |     0.000 |     0.000 |     0.846 |           | 
##                |     0.013 |     0.000 |     0.000 |     0.000 |     0.000 |     0.138 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##   Column Total |        12 |        16 |        13 |        21 |         5 |        13 |        80 | 
##                |     0.150 |     0.200 |     0.163 |     0.263 |     0.062 |     0.163 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
## 
## 
# CrossTable(GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_FruitIHaventEatenBefore_Num, GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q5_WillingnessToTry_VegetablesIHaventEatenBefore_Num)

# This CrossTable() produces a better initial result, although an ability to control the output better would be nice.

We learn from the cross-tab, using the Row and Column Totals, that 55.0% of the 80 students would be willing to try, 3 or more times, fruit they haven’t eaten before, and 26.3% of the 80 students would be willing to try, 3 or more times, vegetables they haven’t eaten before. 20.0% of the 80 students gave that response for both fruit and vegetables. Adding the fruit and vegetable percentages and then subtracting the 20.0% overlap once (so that students who gave that response for both are not counted twice), 61.3% of the 80 students would be willing to try either fruit or vegetables they haven’t eaten before 3 or more times.
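
The "either fruit or vegetables" figure can be checked directly from the counts in the cross-tab. This is a small worked sketch using the counts printed in the table above; the variable names are made up for illustration.

```r
# Counts read from the CrossTable() output above.
n_students <- 80
n_fruit_3  <- 44  # Row Total for Will_Try_Fruit == 3
n_veg_3    <- 21  # Column Total for Willingness_To_Try_Vegetables == 3
n_both_3   <- 16  # cell where both variables equal 3

# Students counted in the row total and the column total overlap in the
# "both" cell, so that cell is subtracted once.
n_either_3    <- n_fruit_3 + n_veg_3 - n_both_3
prop_either_3 <- n_either_3 / n_students

print(n_either_3)
print(prop_either_3)
```

The result is 49 of 80 students, a proportion of 0.6125, which is reported in the text as 61.3%.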

While the surveys for the older taste test and the current one were different, some questions overlap. Of the 43 students in the older taste test, 69.8% reported wanting to try other kinds of fruit they hadn’t eaten before. This was a higher percentage than the 55.0% in the current taste test. Similarly, 30.2% of the 43 students in the older taste test reported wanting to try vegetables they hadn’t eaten before 3 or more times. This was also higher than the 26.3% found in the current taste test. The results are not directly comparable, because the taste tests were at different schools, at different times, and for different dishes. One might hypothesize that students’ willingness to try fruit might be higher after a taste test of fruit than after a taste test of a vegetable dish like a salad. It is, though, a start to viewing patterns of taste responses across an entire school district.

Over time, as the taste tests accumulated, this would give the client data to know how well the program was achieving its goal of encouraging students to be willing to try new foods, and perhaps provide insight into how to adjust to better reach that goal. The reports for the older taste test did not report a cross-tab for willingness to try fruit vs vegetables. One could go back to the older data and run such a cross-tab, and then compare those numbers to the same for the current taste test.

As more reports of the same kind of data and analysis are needed, the need for smoother and more efficient report generation increases. That said, the analyst shouldn’t rely only on past work and narrow in on just what was reported before. Some time spent staring at the data and the output can be useful for finding patterns that are new or that were missed before.
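
One way to get more control over cross-tab output, and to make repeated reports easier to generate, is to compute the pieces that CrossTable() prints with base R's table(), prop.table() and addmargins(). The sketch below is only an illustration under assumptions: the toy vectors and the helper name crosstab_summary() are hypothetical, and base R is used rather than the tidyverse simply to keep the example self-contained.

```r
# Hypothetical toy responses, using the same kind of numeric codes as the report.
fruit <- c(3, 3, 2, 0, 3, 8, 1, 3)
veg   <- c(3, 1, 2, 0, 3, 8, 1, 0)

# Hypothetical helper: return each piece of a cross-tab as its own table,
# so that any one of them can be formatted or exported separately.
crosstab_summary <- function(row_var, col_var, labels = c("row", "col")) {
  counts <- table(row_var, col_var, dnn = labels)
  list(
    counts     = addmargins(counts),                       # N with row and column totals
    row_prop   = round(prop.table(counts, margin = 1), 3), # N / Row Total
    col_prop   = round(prop.table(counts, margin = 2), 3), # N / Col Total
    table_prop = round(prop.table(counts), 3)              # N / Table Total
  )
}

result <- crosstab_summary(fruit, veg, labels = c("Will_Try_Fruit", "Will_Try_Veg"))
print(result$counts)
print(result$row_prop)
```

Keeping the counts and the three kinds of proportions as separate objects makes it possible to print only the ones a given report needs.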

Next I will run cross-tabs for willingness to try new fruit against whether students had taken part in taste tests before, GTC events before, grade and class period.

Taste_Test_Before <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q7_TakenPartIn_TasteTestBefore_Num
GTC_Event_Before <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q8_TakenPartIn_GTCEventBefore_Num
Grade_Number <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$Q6_Grade_Num
Class_Period <- GTC_Tschool_P1to7_5_14_19_adj_10_24_21$ClassPeriod

CrossTable(Willingness_To_Try_Fruit, Taste_Test_Before)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  80 
## 
##  
##                          | Taste_Test_Before 
## Willingness_To_Try_Fruit |         0 |       0.5 |         1 |         8 | Row Total | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        0 |         3 |         0 |         3 |         0 |         6 | 
##                          |     0.250 |     0.600 |     0.000 |     0.150 |           | 
##                          |     0.500 |     0.000 |     0.500 |     0.000 |     0.075 | 
##                          |     0.100 |     0.000 |     0.075 |     0.000 |           | 
##                          |     0.037 |     0.000 |     0.037 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        1 |         3 |         0 |         1 |         0 |         4 | 
##                          |     1.500 |     0.400 |     0.500 |     0.100 |           | 
##                          |     0.750 |     0.000 |     0.250 |     0.000 |     0.050 | 
##                          |     0.100 |     0.000 |     0.025 |     0.000 |           | 
##                          |     0.037 |     0.000 |     0.013 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        2 |         3 |         1 |         5 |         1 |        10 | 
##                          |     0.150 |     0.000 |     0.000 |     2.250 |           | 
##                          |     0.300 |     0.100 |     0.500 |     0.100 |     0.125 | 
##                          |     0.100 |     0.125 |     0.125 |     0.500 |           | 
##                          |     0.037 |     0.013 |     0.062 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        3 |        12 |         5 |        26 |         1 |        44 | 
##                          |     1.227 |     0.082 |     0.727 |     0.009 |           | 
##                          |     0.273 |     0.114 |     0.591 |     0.023 |     0.550 | 
##                          |     0.400 |     0.625 |     0.650 |     0.500 |           | 
##                          |     0.150 |     0.062 |     0.325 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        7 |         4 |         0 |         0 |         0 |         4 | 
##                          |     4.167 |     0.400 |     2.000 |     0.100 |           | 
##                          |     1.000 |     0.000 |     0.000 |     0.000 |     0.050 | 
##                          |     0.133 |     0.000 |     0.000 |     0.000 |           | 
##                          |     0.050 |     0.000 |     0.000 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        8 |         5 |         2 |         5 |         0 |        12 | 
##                          |     0.056 |     0.533 |     0.167 |     0.300 |           | 
##                          |     0.417 |     0.167 |     0.417 |     0.000 |     0.150 | 
##                          |     0.167 |     0.250 |     0.125 |     0.000 |           | 
##                          |     0.062 |     0.025 |     0.062 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##             Column Total |        30 |         8 |        40 |         2 |        80 | 
##                          |     0.375 |     0.100 |     0.500 |     0.025 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
## 
## 
CrossTable(Willingness_To_Try_Fruit, GTC_Event_Before)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  80 
## 
##  
##                          | GTC_Event_Before 
## Willingness_To_Try_Fruit |         0 |       0.5 |         1 |         8 | Row Total | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        0 |         4 |         1 |         1 |         0 |         6 | 
##                          |     0.333 |     0.014 |     0.167 |     0.375 |           | 
##                          |     0.667 |     0.167 |     0.167 |     0.000 |     0.075 | 
##                          |     0.100 |     0.067 |     0.050 |     0.000 |           | 
##                          |     0.050 |     0.013 |     0.013 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        1 |         3 |         1 |         0 |         0 |         4 | 
##                          |     0.500 |     0.083 |     1.000 |     0.250 |           | 
##                          |     0.750 |     0.250 |     0.000 |     0.000 |     0.050 | 
##                          |     0.075 |     0.067 |     0.000 |     0.000 |           | 
##                          |     0.037 |     0.013 |     0.000 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        2 |         6 |         2 |         1 |         1 |        10 | 
##                          |     0.200 |     0.008 |     0.900 |     0.225 |           | 
##                          |     0.600 |     0.200 |     0.100 |     0.100 |     0.125 | 
##                          |     0.150 |     0.133 |     0.050 |     0.200 |           | 
##                          |     0.075 |     0.025 |     0.013 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        3 |        17 |        10 |        16 |         1 |        44 | 
##                          |     1.136 |     0.371 |     2.273 |     1.114 |           | 
##                          |     0.386 |     0.227 |     0.364 |     0.023 |     0.550 | 
##                          |     0.425 |     0.667 |     0.800 |     0.200 |           | 
##                          |     0.212 |     0.125 |     0.200 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        7 |         3 |         0 |         1 |         0 |         4 | 
##                          |     0.500 |     0.750 |     0.000 |     0.250 |           | 
##                          |     0.750 |     0.000 |     0.250 |     0.000 |     0.050 | 
##                          |     0.075 |     0.000 |     0.050 |     0.000 |           | 
##                          |     0.037 |     0.000 |     0.013 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##                        8 |         7 |         1 |         1 |         3 |        12 | 
##                          |     0.167 |     0.694 |     1.333 |     6.750 |           | 
##                          |     0.583 |     0.083 |     0.083 |     0.250 |     0.150 | 
##                          |     0.175 |     0.067 |     0.050 |     0.600 |           | 
##                          |     0.087 |     0.013 |     0.013 |     0.037 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
##             Column Total |        40 |        15 |        20 |         5 |        80 | 
##                          |     0.500 |     0.188 |     0.250 |     0.062 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|
## 
## 
CrossTable(Will_Try_Fruit, Grade_Number)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  80 
## 
##  
##                | Grade_Number 
## Will_Try_Fruit |         0 |         1 |         2 |         3 |         5 |         8 | Row Total | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              0 |         0 |         3 |         2 |         1 |         0 |         0 |         6 | 
##                |     0.075 |     0.469 |     0.074 |     0.001 |     1.125 |     0.150 |           | 
##                |     0.000 |     0.500 |     0.333 |     0.167 |     0.000 |     0.000 |     0.075 | 
##                |     0.000 |     0.111 |     0.091 |     0.077 |     0.000 |     0.000 |           | 
##                |     0.000 |     0.037 |     0.025 |     0.013 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              1 |         0 |         2 |         2 |         0 |         0 |         0 |         4 | 
##                |     0.050 |     0.313 |     0.736 |     0.650 |     0.750 |     0.100 |           | 
##                |     0.000 |     0.500 |     0.500 |     0.000 |     0.000 |     0.000 |     0.050 | 
##                |     0.000 |     0.074 |     0.091 |     0.000 |     0.000 |     0.000 |           | 
##                |     0.000 |     0.025 |     0.025 |     0.000 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              2 |         0 |         2 |         3 |         1 |         3 |         1 |        10 | 
##                |     0.125 |     0.560 |     0.023 |     0.240 |     0.675 |     2.250 |           | 
##                |     0.000 |     0.200 |     0.300 |     0.100 |     0.300 |     0.100 |     0.125 | 
##                |     0.000 |     0.074 |     0.136 |     0.077 |     0.200 |     0.500 |           | 
##                |     0.000 |     0.025 |     0.037 |     0.013 |     0.037 |     0.013 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              3 |         1 |        13 |        12 |         7 |        10 |         1 |        44 | 
##                |     0.368 |     0.230 |     0.001 |     0.003 |     0.371 |     0.009 |           | 
##                |     0.023 |     0.295 |     0.273 |     0.159 |     0.227 |     0.023 |     0.550 | 
##                |     1.000 |     0.481 |     0.545 |     0.538 |     0.667 |     0.500 |           | 
##                |     0.013 |     0.163 |     0.150 |     0.087 |     0.125 |     0.013 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              7 |         0 |         3 |         0 |         1 |         0 |         0 |         4 | 
##                |     0.050 |     2.017 |     1.100 |     0.188 |     0.750 |     0.100 |           | 
##                |     0.000 |     0.750 |     0.000 |     0.250 |     0.000 |     0.000 |     0.050 | 
##                |     0.000 |     0.111 |     0.000 |     0.077 |     0.000 |     0.000 |           | 
##                |     0.000 |     0.037 |     0.000 |     0.013 |     0.000 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##              8 |         0 |         4 |         3 |         3 |         2 |         0 |        12 | 
##                |     0.150 |     0.001 |     0.027 |     0.565 |     0.028 |     0.300 |           | 
##                |     0.000 |     0.333 |     0.250 |     0.250 |     0.167 |     0.000 |     0.150 | 
##                |     0.000 |     0.148 |     0.136 |     0.231 |     0.133 |     0.000 |           | 
##                |     0.000 |     0.050 |     0.037 |     0.037 |     0.025 |     0.000 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
##   Column Total |         1 |        27 |        22 |        13 |        15 |         2 |        80 | 
##                |     0.013 |     0.338 |     0.275 |     0.163 |     0.188 |     0.025 |           | 
## ---------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
## 
## 
CrossTable(Willingness_To_Try_Fruit, Class_Period)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  80 
## 
##  
##                          | Class_Period 
## Willingness_To_Try_Fruit |         1 |         2 |         3 |         6 |         7 | Row Total | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        0 |         2 |         2 |         0 |         1 |         1 |         6 | 
##                          |     1.078 |     0.115 |     1.275 |     0.014 |     0.002 |           | 
##                          |     0.333 |     0.333 |     0.000 |     0.167 |     0.167 |     0.075 | 
##                          |     0.154 |     0.095 |     0.000 |     0.067 |     0.071 |           | 
##                          |     0.025 |     0.025 |     0.000 |     0.013 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        1 |         1 |         2 |         0 |         1 |         0 |         4 | 
##                          |     0.188 |     0.860 |     0.850 |     0.083 |     0.700 |           | 
##                          |     0.250 |     0.500 |     0.000 |     0.250 |     0.000 |     0.050 | 
##                          |     0.077 |     0.095 |     0.000 |     0.067 |     0.000 |           | 
##                          |     0.013 |     0.025 |     0.000 |     0.013 |     0.000 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        2 |         2 |         3 |         4 |         0 |         1 |        10 | 
##                          |     0.087 |     0.054 |     1.654 |     1.875 |     0.321 |           | 
##                          |     0.200 |     0.300 |     0.400 |     0.000 |     0.100 |     0.125 | 
##                          |     0.154 |     0.143 |     0.235 |     0.000 |     0.071 |           | 
##                          |     0.025 |     0.037 |     0.050 |     0.000 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        3 |         6 |        11 |        11 |         8 |         8 |        44 | 
##                          |     0.185 |     0.026 |     0.291 |     0.008 |     0.012 |           | 
##                          |     0.136 |     0.250 |     0.250 |     0.182 |     0.182 |     0.550 | 
##                          |     0.462 |     0.524 |     0.647 |     0.533 |     0.571 |           | 
##                          |     0.075 |     0.138 |     0.138 |     0.100 |     0.100 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        7 |         1 |         0 |         0 |         2 |         1 |         4 | 
##                          |     0.188 |     1.050 |     0.850 |     2.083 |     0.129 |           | 
##                          |     0.250 |     0.000 |     0.000 |     0.500 |     0.250 |     0.050 | 
##                          |     0.077 |     0.000 |     0.000 |     0.133 |     0.071 |           | 
##                          |     0.013 |     0.000 |     0.000 |     0.025 |     0.013 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##                        8 |         1 |         3 |         2 |         3 |         3 |        12 | 
##                          |     0.463 |     0.007 |     0.119 |     0.250 |     0.386 |           | 
##                          |     0.083 |     0.250 |     0.167 |     0.250 |     0.250 |     0.150 | 
##                          |     0.077 |     0.143 |     0.118 |     0.200 |     0.214 |           | 
##                          |     0.013 |     0.037 |     0.025 |     0.037 |     0.037 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
##             Column Total |        13 |        21 |        17 |        15 |        14 |        80 | 
##                          |     0.163 |     0.263 |     0.212 |     0.188 |     0.175 |           | 
## -------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
## 
## 

Cross-tabs for Willingness to try new Fruits vs Demographics

Fruits vs Prior taste test & GTC event participation

18 students wanted to try new fruits 1 to 3+ times, out of 30 students who had never taken part in a taste test before, or 60.0%. 6 students wanted to try new fruits 1 to 3+ times, out of 8 students who had maybe taken part in a taste test before, or 75.0%. (The sub-group N here is too low to say much.) 32 students wanted to try new fruits 1 to 3+ times, out of 40 students who had taken part in a taste test before, or 80.0%. Thus, for these students, having taken part in a taste test before was associated with a higher likelihood of wanting to try new fruits.

26 students wanted to try new fruits 1 to 3+ times, out of 40 students who had never taken part in a GTC event before, or 65.0%. 13 students wanted to try new fruits 1 to 3+ times, out of 15 students who had maybe taken part in a GTC event before, or 86.7%. (The sub-group N here is a little too low to say much.) 17 students wanted to try new fruits 1 to 3+ times, out of 20 students who had taken part in a GTC event before, or 85.0%. Thus, for these students, having taken part in a Garden To Café event before was associated with a higher likelihood of wanting to try new fruits.
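
The percentages in the two paragraphs above can be reproduced from the cross-tab counts. This is a small sketch using the counts read from the CrossTable() output (rows 1, 2 and 3 of Willingness_To_Try_Fruit summed within each column, and the Column Totals); the object names and the short row labels are made up for illustration.

```r
responses <- c("No", "Maybe", "Yes")
questions <- c("Taste test before", "GTC event before")

# Students who wanted to try new fruits 1 to 3+ times, by prior participation.
willing <- matrix(c(18,  6, 32,
                    26, 13, 17),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(questions, responses))

# Column Totals for the same response categories.
totals <- matrix(c(30,  8, 40,
                   40, 15, 20),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(questions, responses))

# Percent willing within each prior participation response category.
pct_willing <- round(100 * willing / totals, 1)
print(pct_willing)
```

Working from the counts rather than typing the percentages by hand keeps the summary table tied to the cross-tabs it summarizes.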

(One next step would be to check the differences when “taken part in a taste test before” and “taken part in a GTC event before” are combined into one variable. Additional analysis can also be run for today’s dish and vegetables.)

Below is a summary of these results, using the xtable() function. (I had to create an as.table() table first, and then feed that into the xtable() function. It needed two code chunks, because printing the as.table() table gets tripped up by the chunk option results = 'asis', while xtable() needs it. But it worked! Cue the Teachers College Marching Band!)

# For xtable(), the curly brackets of the chunk header need ", results = 'asis'", but that option interferes with printing the "tab" table, so the two are in separate chunks.

# tab <- matrix(c(7, 5, 14, 19, 3, 2, 17, 6, 12), ncol=3, byrow=TRUE)
# colnames(tab) <- c('colName1','colName2','colName3')
# rownames(tab) <- c('rowName1','rowName2','rowName3')
# tab <- as.table(tab)
# print(tab)
# https://www.statology.org/create-table-in-r/

tab <- matrix(c('60%', '75%', '80%', '65%', '87%', '85%'), ncol=3, nrow=2, byrow=TRUE)
colnames(tab) <- c('No','Maybe', 'Yes')
rownames(tab) <- c('Taken part in a taste test before?','Taken part in a Garden To Café event before?')
tab <- as.table(tab)
print("Table showing percent of students in each prior participation response category who were willing to try new fruits they hadn't eaten before.")
## [1] "Table showing percent of students in each prior participation response category who were willing to try new fruits they hadn't eaten before."
print(tab)
##                                              No  Maybe Yes
## Taken part in a taste test before?           60% 75%   80%
## Taken part in a Garden To Café event before? 65% 87%   85%

Summary of crosstab showing Willingness to try new fruits was higher with Prior taste test and GTC event participation

print(xtable(tab, caption = "Summary of Crosstab of Willingness to try new fruits versus Prior participation in taste testing and GTC events. Percentages show the percentage of students who were willing to try new fruits they hadn't eaten before, within Prior participation response categories. The table shows that students who reported prior participation in events similar to today's taste test were more likely to report willingness to try new fruits. Thus, taste tests and GTC events appear to contribute to increased willingness to try new fruits, although larger samples would be needed to confirm this result."), type = "html")
Summary of Crosstab of Willingness to try new fruits versus Prior participation in taste testing and GTC events. Percentages show the percentage of students who were willing to try new fruits they hadn’t eaten before, within Prior participation response categories. The table shows that students who reported prior participation in events similar to today’s taste test were more likely to report willingness to try new fruits. Thus, taste tests and GTC events appear to contribute to increased willingness to try new fruits, although larger samples would be needed to confirm this result.
                                             No  Maybe Yes
Taken part in a taste test before?           60% 75%   80%
Taken part in a Garden To Café event before? 65% 87%   85%

(Note that neither table is worded in a way that makes it immediately clear that the percentages are Willingness to try new fruits, but I can’t think of how to fix it right now. The reader has to reference the caption, and then it makes sense, which is okay, but not ideal. I will have to have a colleague look at it some other time.)
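One possible way to make the table say what the percentages mean is to name the table's dimensions, so the label is printed with the table itself. The sketch below is only a suggestion under that assumption: the percentages are copied from the summary table above, the label wording is invented here, and whether xtable() keeps these dimension names in its HTML output would still need to be checked.

```r
# Percentages copied from the summary table above, stored as numbers.
# The dimension labels are suggested wording, not the report's own.
pct <- matrix(
  c(60, 75, 80,
    65, 87, 85),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    "Prior participation in..." = c("a taste test", "a GTC event"),
    "% willing to try new fruits, by answer" = c("No", "Maybe", "Yes")
  )
)
pct <- as.table(pct)

# print() shows the dimension labels above the rows and columns
print(pct)
```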

Fruits vs Grade level

(Note these analyses contrast willingness to try against unwillingness grouped with IDK and Missing.)


Gr K = 1/1 = 100% (Sub-group N too low.)

Gr 1 = 17/27 = 63.0%

Gr 2 = 17/22 = 77.3%

Gr 3 = 8/13 = 61.5% (Sub-group N a little too low.)

Gr 4 = NA (No students were in Grade 4.)

Gr 5 = 13/15 = 86.7% (Sub-group N a little too low.)

(2 students were willing to try new fruits, but left the grade number question blank.)


There appears to be a pattern among these students where as students get older, they are more likely to be willing to try new fruits, except for Grade 3.

This pattern is very tentative since it is not completely consistent and the sub-group Ns are low.

It could be used for hypothesis generation in a larger study.

Or, one could argue that because of Grade 3, there is no pattern by grade level at all.

As with every sub-group analysis in this 80-student dataset, caution is advised when the sub-group Ns are below 20. If the results of sub-group analysis look promising, go back to the field and collect more data.
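The grade-level percentages and the low-N caution can be put side by side in one small data frame. This is a sketch using the counts listed above; the column names and the `min_n` variable are illustrative, and the threshold of 20 is the one named in the preceding paragraph.

```r
# Counts copied from the grade-level list above (Grade 4 had no students).
by_grade <- data.frame(
  grade   = c("K", "1", "2", "3", "5"),
  willing = c(1, 17, 17, 8, 13),
  n       = c(1, 27, 22, 13, 15)
)
min_n <- 20  # caution threshold named in the text above

# Percent willing per grade, plus a flag for sub-groups below the threshold
by_grade$pct_willing <- round(100 * by_grade$willing / by_grade$n, 1)
by_grade$low_n <- by_grade$n < min_n
print(by_grade)
```

With the flag in the same table, only Grades 1 and 2 clear the threshold, which is why the apparent pattern by grade is treated as tentative.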

Fruits vs Class period


Period 1 = 9/13 = 69.2% (Sub-group N a little too low.)

Period 2 = 16/21 = 76.2%

Period 3 = 15/17 = 88.2% (Sub-group N a little too low.)

Period 6 = 9/15 = 60.0% (Sub-group N a little too low.)

Period 7 = 9/14 = 64.3% (Sub-group N a little too low.)


Given the similarity of the results comparing periods at the start of the day to the end of the day, there seems to be no discernible difference by time of day. There might be some differences in some class periods compared to some other class periods, but the sub-group Ns are too small, and not enough is known about the classes who happened to be asked to complete the survey in each period, to say whether class period had any influence on or association with the results. The working hypothesis is that there was no association between willingness to try new fruits and class period.

Discussion

The theory behind the Mid-Reflective Taste Test Survey is that students would provide more accurate overall taste-response assessments of a dish if they had time to reflect on that dish for multiple taste points, compared to the single taste point of many taste tests (and certainly of previous taste test surveys that led to this survey). Another purpose of this taste test design is to provide the client with a more detailed understanding of how students perceived the dish. An extension of this purpose is the theory that students often have an underdeveloped vocabulary of taste.

Where students in a class either have a high number of “I don’t know” or blank responses to a taste component question, or have large variability in their perception of a taste component of the same dish, this may indicate an opportunity for the client and/or the school to invest in additional taste education on that taste component.

Assessments of taste components

Salty

In the case of Salty, there was a substantial consensus that the salad was low for Salty (45%). However, 30% of students didn’t know how salty the salad was or left it blank. This suggests an opportunity for salt education. Given that it is known that perception of salt is relative and can shift if salt intake is adjusted slowly, the variability makes some sense.

I am experimenting with calculations to represent the variability in taste component responses.

(Note that “O2” is “Other 2 levels”. There are three possible response options, so the analysis identifies the mode (M), and then calculates the percentage for the other two levels combined.)


Mode = 45.0%

Other 2 levels = 25.0%

Mode/O2 = 1.80

M – O2 = +20.0%
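The Mode and O2 calculations can be wrapped in a small function so the same steps are applied to every taste component. This is a minimal sketch, not the report's own code: the function name `consensus_metrics` and the level name "Medium" are assumptions, and only the Salty totals (45% mode, 25% other two levels, out of 80 surveys) come from the report.

```r
# Sketch of the Mode vs. "Other 2 levels" (O2) calculations.
# `counts` holds the three rated levels only (IDK and blanks excluded);
# `n_total` is the number of surveys, so percentages match the report.
consensus_metrics <- function(counts, n_total) {
  pct      <- 100 * counts / n_total
  mode_pct <- max(pct)             # share of the most common level
  o2_pct   <- sum(pct) - mode_pct  # the other two levels combined
  list(
    mode_level = names(counts)[which.max(counts)],  # first one if tied
    mode_pct   = round(mode_pct, 1),
    o2_pct     = round(o2_pct, 1),
    ratio      = round(mode_pct / o2_pct, 2),  # Mode/O2: higher = more consensus
    difference = round(mode_pct - o2_pct, 1)   # M - O2, in percentage points
  )
}

# Salty: 45% Low out of 80 surveys is 36 students; the other two levels total
# 25% (20 students). The 12/8 split between them is hypothetical.
salty <- c(Low = 36, Medium = 12, High = 8)
str(consensus_metrics(salty, n_total = 80))
```

Dividing by the number of surveys rather than the number of rated responses keeps the percentages comparable with the "I don't know" share reported for each component.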


Sweet

For Sweet, the mode response was Low, which is consistent with the dish. However, using Mode/O2 as a measure of variability (high #s = high consensus, low variability and low #s = low consensus, high variability), there was quite a lot of variability in how students rated the sweetness of the salad. 23.7% of students didn’t know how sweet the salad was (includes blanks).


Mode = 35%

O2 = 41.2%

M/O2 = 0.85

M – O2 = -6.2%


Bitter

For Bitter, the mode of High makes sense given the kind of salad. That said, the consensus was low, with an M/O2 of 0.59. 32.5% of students couldn’t rate the bitterness of the salad (includes blanks).


Mode = 25.0%

O2 = 42.5%

M/O2 = 0.59

M – O2 = -17.5%


Sour

Unlike the previous taste components, which were unimodal, Sour was bimodal. M/O2 = 0.76. Don’t know (plus blanks) was 27.5%.


Mode = 31.2%

O2 = 41.2%

M/O2 = 0.76

M – O2 = -10.0%


Spicy

Spicy had high consensus. Low spicy (40% of students) makes sense for this salad. Still, 15% of students thought it was High spicy. 37.5% of students didn’t know the spiciness of the dish (includes blanks).


Mode = 40%

O2 = 22.5%

M/O2 = 1.78

M – O2 = +17.5%


Flavorful

The mode for Flavorful was Low (22.5%). M/O2 = 0.62. Don’t know (including blanks) = 41.2%.


Mode = 22.5%

O2 = 36.3%

M/O2 = 0.62

M – O2 = -13.8%


Temperature

The mode for Temperature was as expected: Cold (50%), although Warm could also be a fair description for room temperature, which is generally, but not always, the expectation for a green salad. Only 17.5% of students didn’t know the temperature of the dish.

Overall assessment

The overall assessment of the salad was fairly evenly distributed among Delicious (30%), Okay (27.5%) and Unsatisfying (32.5%). Don’t know plus Blanks was 6.2%. Didn’t try it was 3.8%.

I need to go back and check how students who answered Didn’t try it to this question answered the taste component and other questions. There were only three such students, so it won’t impact the results much either way.

Willingness to try new foods

In terms of willingness to try new foods, in this case different kinds of today’s dish, 36.2% of students were strongly willing to try (3+ times), 27.5% were willing to try (1-2 times), and 20% were not willing to try. 16.3% didn’t know (includes blanks).

Willingness to try new fruits (that is, fruits the student hadn’t eaten before), which is expected to be high, or at least higher than for vegetables, was 55% for 3+ times, 17.5% for 1-2 times, and 7.5% for never. 20% didn’t know (including blanks).

Willingness to try new vegetables (that the student hadn’t eaten before), which is expected to be low, or lower than fruits, was 26.2% for 3+ times, 36.2% for 1-2 times, and 15% for never. I don’t know (including blanks) was 22.5%.

A concept behind these willingness to try questions was to measure Garden To Café’s effectiveness: if willingness to try new foods increased over time, this would be one indicator that GTC was meeting its goals. Most schools participating in GTC have only a small number of GTC events, so expectations for change should be set appropriately, unless the school staff reinforced GTC messages such as “You don’t have to like the new food, you just have to try it” in between GTC events.

A series of pre-, immediate-post- and later-post- willingness to try assessments could be warranted, if feasible within budget constraints.

Grade levels

Students were largely in grades 1 and 2, with smaller but still sizable numbers in grades 3 and 5. These were all self-reports on the survey. One student self-reported being in Kindergarten. Only 2.5% left the question blank. The participating students were classes selected by the school staff based on availability and scheduling. Thus, while the selection of classes (groups of students) was not entirely random on the part of the school staff, I as the GTC program evaluator had no hand in the selection. Since entire classes were selected, within classes we had the whole population of students attending school in each such class on that day.

Prior taste test and Garden To Café event participation

Half of students (50%) reported participating in a taste test before this event. The other half said No or were uncertain (Maybe or blanks).

25% of students reported having participated in a GTC event before. The other 75% either said No (50%) or were uncertain (Maybe or blank).

Class period

Responses by class period ranged from a low of 13 students in period 1 to a high of 21 students in period 2.

Response rate

Eighty out of 82 students present completed a survey, for an overall response rate of 97.6%.

I don’t know and blank responses

Number of “I don’t know” or Blank responses per student: The distribution of I don’t knows and blanks for each student, across all questions (14 sub-questions total), can be found in the bar chart titled “Count of IDK-s and Blanks for each student.” The maximum number of IDKs and blanks for any one student was 11. Eight out of 80 students (10%) left half or more of the questions blank or answered I don’t know. While questions left blank are missing data, they also potentially provide the program with an indication of where taste education might be targeted.

(Note: ideally, some of the numbers in this section would be dynamically generated using in-line r code, but since I have already written this section, I will leave that enhancement for another day.)
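As a sketch of what that enhancement could look like, the block below counts "I don't know" and blank answers per student and computes the response rate, so the results could be dropped into the text with in-line R code. The mini data set and all object names are hypothetical; only the 80 and 82 in the response rate come from the report.

```r
# Hypothetical mini data set: 4 students x 3 questions (the real survey has
# 14 sub-questions). NA stands for a question left blank.
responses <- data.frame(
  salty = c("Low", "IDK", NA,    "High"),
  sweet = c("Low", "IDK", "IDK", "Low"),
  sour  = c("IDK", NA,    "IDK", "High"),
  stringsAsFactors = FALSE
)

# Count IDKs and blanks per student (one row per student)
is_idk_or_blank <- is.na(responses) | responses == "IDK"
idk_per_student <- rowSums(is_idk_or_blank)

# Students who left half or more of the questions blank or answered IDK
n_half_or_more <- sum(idk_per_student >= ncol(responses) / 2)

# Response rate from the counts reported above: 80 surveys, 82 students present
response_rate <- round(100 * 80 / 82, 1)

# In the R Markdown text, these could then be referenced in-line, for example:
#   The response rate was `r response_rate`%.
```

Numbers written this way update themselves when the data changes, which matters most if the report is rerun for another school.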

Next steps

For analysis and reporting of this taste test data

One next step would be to conduct further exploratory analysis on the relationships between taste components and overall taste assessment.

A key next step would be to obtain input from the client on reporting needs, especially given programming changes due to the pandemic. This would include discussing the current results with the client.

A third next step would be to share the results of the taste test data analysis with the school at which the data was collected, in a much shorter form that could also function as a math or science lesson for a teacher to try with students. I might also print up small “reward tokens” or stickers to give to the students who participated in the taste test.

For continued learning

One necessary precondition for learning is to be honest with oneself, so here are some items on my continuing education to-do list.

  1. Check if CrossTable() has any other parameters that can be set.
  2. Do a refresher of Chi-squared cross-tab statistics.
  3. Explore pivot_wider() and pivot_longer() in the tidyverse in more detail.
  4. Investigate the other links to see if there are other approaches to cross-tabs in R. More options for this can be found in the full R script version of this document by searching for “4)”.
  5. See if there is a way to display # and ## R Markdown header levels so that they look more distinct from each other. Right now, to my eye, they look so similar as to blur together.
  6. Find someone who knows how to install Linux into a Samsung Android tablet, so that both OSes can run simultaneously, so that I can run R Studio on my tablet. Either that or convince the R Studio folks to develop an Android-native version of R Studio. Reading works better on a tablet than on a laptop. My tablet is lighter than my laptop. I need to minimize the extent to which I carry around my laptop which, while relatively light, is still heavy enough to injure my back if I carry it around too often.
  7. For whatever reason, I didn’t have a need to put text into italic or bold in this report so far, but since I do know how to do that, I have done so here. (However, how do I make text italics and bold at the same time? I have tried nesting the formatting codes, with contrasting underscores and asterisks. We shall see if it works. Yes, it worked! Cue the fireworks!) And, I just stumbled across how to put a footnote into an R Markdown document, even though I wasn’t looking for it, so here is a footnote7. So, another next step for continued learning is learning more R Markdown syntax.8 9
  8. Create a mock-up of what an improved Figure 27.2 in WG would look like.
  9. R Markdown files are not currently knitting to PDFs because “font family ‘Wingdings’ not found in PostScript font database” so I need to look into this and fix it.
  10. I have leads in WG 29.6 for building interactive dashboards, which is potentially very useful because, in my experience, many clients are more interested in easy access to simple metrics, than in detailed, fancy analysis. R is starting to look like one way to build such an interactive dashboard at a reasonable cost, even when the underlying data is messy and siloed in multiple data systems.
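For item 2 on the list, a refresher could start from a small 2 x 2 cross-tab built from counts already reported in this document. This is a sketch for practice rather than a finding: it compares only the "No" and "Yes" prior taste test groups, leaves out the small "Maybe" group, and uses the base R functions chisq.test() and fisher.test(). No results are quoted here, since the sub-group Ns are small and the output should be read with the same caution as the rest of the sub-group analysis.

```r
# 2 x 2 cross-tab built from counts reported earlier in this report:
# prior taste test "No" (18 of 30 willing) versus "Yes" (32 of 40 willing).
# "Not willing" groups Never with IDK and Missing, as in the cross-tabs above.
xtab <- matrix(
  c(18, 12,
    32,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    prior_taste_test = c("No", "Yes"),
    new_fruits       = c("Willing", "Not willing")
  )
)

chi_sq <- chisq.test(xtab)   # Pearson chi-squared; Yates correction is the default for 2 x 2
fisher <- fisher.test(xtab)  # exact test, often preferred when counts are small

print(chi_sq$expected)       # expected counts under independence
print(chi_sq$p.value)
print(fisher$p.value)
```

Looking at the expected counts first is a useful habit, because the chi-squared approximation becomes unreliable when they are small.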

Conclusions

2019

In the cross-tabs for fruit versus demographics, we saw that for the 80 students who took part in this taste test, students who had taken part in a taste test before were more likely to want to try new fruits they hadn’t eaten before. Students who had taken part in a Garden To Café event before were also more likely to want to try new fruits they hadn’t eaten before. This provides evidence that Garden To Café has been meeting its overarching program goal of increasing students’ willingness to try new foods. Larger sample sizes at multiple schools would be needed to confirm the result. A similar analysis for “today’s dish” and “vegetables” should be run as a next step, particularly since willingness to try new vegetables is known to be more difficult to achieve than willingness to try new fruits (that is, vegetable neophobia is harder to overcome).

With this analysis, we begin to see the potential for comparisons of taste responses across many, and perhaps all, New York City public schools.

This report gives the Garden To Café program staff more ways to present the results of their work, both in terms of analysis techniques and visualizations.

1997

I have gone from knowing no R to feeling that I can write it with some proficiency.

I have used R to replicate three of the charts offered by Tufte as examples of good data graphic design, two in this report and one elsewhere.

I know or know of several analysis techniques which I did not know before. I have a clear path of next steps, both in courses and in my own independent work, to continue my learning in educational research.

So, so far so good.

Methodology

EpiDisplay and tab1()

EpiDisplay package and tab1() function by Virasakdi Chongsuvivatwong

Color schemes for the tab1() bar charts were chosen with sequences of blues, in the case of temperature also reds, to represent sequences of response options. “I don’t know” is always light green so that it can be recognized consistently across questions. “LeftBlank” is white, also for consistency across questions, and to reflect that this is missing data. Where the response options are not a logical sequence, or where using a sequence wouldn’t add useful information, such as for Grade Level, alternating blue and yellow bars are used to make adjacent bars more easily distinguishable.
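The color scheme described above can be sketched with base R alone. Here barplot() stands in for tab1(), and the specific color names are assumptions rather than the ones used in the report. The first three counts follow from the fruit percentages in the Discussion (for example, 55% of 80 is 44); the 10/6 split between "I don't know" and blank is hypothetical.

```r
# Counts for one willingness question; the IDK/blank split is hypothetical.
counts <- c("Never" = 6, "1-2 times" = 14, "3+ times" = 44,
            "I don't know" = 10, "LeftBlank" = 6)

# Light-to-dark blues for the ordered response options, then the two fixed
# colors used across questions: light green for IDK, white for blanks.
blues <- colorRampPalette(c("lightblue", "darkblue"))(3)
bar_colors <- c(blues, "lightgreen", "white")
names(bar_colors) <- names(counts)

pdf(NULL)  # null device so the sketch also runs non-interactively; omit in RStudio
barplot(counts, col = bar_colors, main = "Willingness to try new fruits")
invisible(dev.off())
```

Naming the color vector after the response options makes it harder to mix up which color belongs to which bar when the same palette is reused across questions.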

I like the tab1() bar charts because they use a reasonable amount of Tuftean data-ink minimization, while not being overly minimalist. I also like that the tab1() function generates the frequency table at the same time as the bar chart.

The tab1() output isn’t always perfect, but I have been able to tweak it so that it works well enough, and then some.

R Script versus R Markdown

R Markdown gave me endless tsuris (Yiddish for aggravation) throughout the course. For instance, R code that worked in an R Script would sometimes fail in an R Markdown file. R Markdown is supposed to make creating presentations easier and more convenient, but in practice it did the opposite.

I feel that R and R Markdown are two different languages, even if they are related. I would rather learn R first, and then learn R Markdown second.

In addition, I have realized that trying to create the presentation before completing the analysis has the process backwards. I need to know what the presentation should look like, and then design the R Markdown to match. Once that R Markdown file is built, if I need to rerun the presentation with updated data, then the R Markdown approach will provide convenience.

Note also that some of my thoughts on this topic, which were written earlier in the course, have changed since taking the R Markdown course on Datacamp, and giving in to the strange, and vaguely threatening, error messages about “XQuartz” such that I installed it on my laptop. Now R Markdown works for me, as evidenced by the ~150 pages of text and code you just read.

I get that there are some advantages to open source software development, but sometimes this results in an end product where clearly the left hand did not know what the right hand was doing. A key case in point is the # character: in R script it comments out code, but in R Markdown it generates header formatting. These two functions are almost diametrical opposites. For me, it generated confusion that impeded my learning. Now that I have used # enough in both R and R Markdown it is okay, if still a puzzling choice on the part of the programmers. (See my R script and R Markdown replication of the NY Times 1978 economic forecasts chart reprinted in Tufte for a more detailed explanation of my learning process for R Markdown. Plus, the chart is fun!)

Better table formatting

This is an example of a table produced using the xtable() function. The formatting is much clearer than that produced with text-only tables. One next step would be to figure out how to apply xtable() or something similar to the tables produced by tab1() and CrossTable().

print(xtable(head(cars)), type = "html")
  speed  dist
1  4.00  2.00
2  4.00 10.00
3  7.00  4.00
4  7.00 22.00
5  8.00 16.00
6  9.00 10.00

The Mid-reflective Taste Test Survey

The Mid-reflective Taste Test Survey is the result of about five years of work at the New York City Department of Education’s Office of School Support Services and School Food (NYC DOE OSSS/SchoolFood, which later became the Office of Food and Nutrition Services (OFNS)). It was part of my work as the program evaluator for NYC DOE’s Garden To Café project. Shortly after the data presented in this report was collected, organizational issues arose that prevented my completing the data analysis at the time. (Those issues are still being addressed. I am happy to discuss them separately.)

Current and past surveys

Mid-reflective taste test survey used in the current study

Taste test survey used in the previous study

References

Abrams, R. (2019). Taste test data collected at a NYC public school in May 2019 using the Mid-reflective Taste Test Scannable Survey (Unpublished) [Data set]. Contact the author for more information.

The two Garden To Café reports referenced in this report are available from the author on request. They were publicly published on the web, but are not currently available on the web. Citations below.

Abrams, R., Arnold, N., Sorkin, H., Sedito, V., Edwards, G. (2017). Pilot test of a Garden To Café scannable taste test survey for snack fruit administered in classrooms at PSABX on 12/14/2017. New York City Department of Education, Office of School Food. (This report was dated 12/21/2017 and was made publicly available on the web. It is currently not available on the web, so please contact Robert Abrams for a copy of the report.)

Abrams, R. (2018). Supplemental results from a Garden To Café scannable taste test survey for snack fruit administered in classrooms at PSABX on 12/14/2017. New York City Department of Education, Office of School Food. (This report was dated 2/14/2018 and was made publicly available on the web. It is currently not available on the web, so please contact Robert Abrams for a copy of the report.)

WG = R for Data Science: Visualize, model, transform, tidy and import data by Hadley Wickham & Garrett Grolemund. O’Reilly. r4ds.had.co.nz/index.html accessed 12/21/2021.

Tufte refers primarily to The Visual Display of Quantitative Information, Second Edition, by Edward R. Tufte (Graphics Press, 2001), as well as to his work in general.

Thank you

Thank you to Dr. Bryan Keller, who taught the Teachers College HUDM 5026 Intro to R class with aplomb (or possibly + aplomb()), not just for what I learned, but also for what I now know I need to learn next.

Thank you to Rachel Lee, the course TA, for her support and responsiveness.

Thank you to my classmates, who provided engaging discussion and invaluable collegiality.


  1. GTC is still operating, although the name has been changed.

  2. This analysis can’t answer the increase part of the question, being only one point in time, but it can establish a methodology and a baseline. Except, in a bit of Brechtian foreshadowing, the cross-tabs suggest even this analysis can answer the question to a certain, limited extent. See motivating question #3.

  3. Note that we turned this report around from data collection, through scanning and analysis, to a finished report in one week. For the NYC DOE, this was unheard of, lightning speed.

  4. The idea was to deliver the first report as quickly as possible, followed by a second report with deeper analysis later.

  5. And, this gives me an excuse to show I know what echo = FALSE does, even though I am not supposed to use it for this Intro to R report. And to lightly spice the paragraph with in-line back-ticks (`).

  6. For Social Network Analysis (SNA), I need a way of determining if any two vertices are connected. For an initial SNA, I tried specifying that if two students answered Q3 (the overall taste response) the same, they are connected. This would produce SNA matrix variables where student pairs would be “1” if they match and “0” if they don’t match. I would predict, though, that this will produce six clusters of connected students, with no connections between the clusters. Without anything else, this would presumably be more like a clustering analysis (PCA, Silhouette, etc.). I think I would then need to create another SNA layer using other questions, at which point students should start to have groups of vertices that are then more or less strongly connected between those groups.

  7. Serendipity, written by Stephen Cosgrove and illustrated by Robin James, was an extremely important book for me as a child. Also, as can be seen from this footnote, the really good stuff tends to be in the footnotes.

  8. I subsequently, in editing, found reasons to use italics, bold and footnotes.

  9. Why, in the name of every researcher who ever sailed the seven seas, is the code used for a new line in R Markdown not the same as the code used for a new line in R?!?!? On the other hand, I stumbled upon the R Markdown code for a horizontal rule, AKA a line, so that’s useful.