Creating an Evaluation Tool & Dashboard
THE GOAL
My goal for this project is to create an evaluation tool and dashboard that can be used to track teacher observation data and provide insights for growth management and coaching. In this case, we’ll be building the tool for an imaginary school, "Charest Academy," in Rhode Island.
STEP 1: RESEARCH
To start off, I did some research on the requirements and structures that exist for teacher evaluation in Rhode Island through the Rhode Island Department of Education. Here are some of the main resources I examined at this stage:
I kept track of what I learned on the "Research" tab of my Observation Workbook. In reviewing these materials, I got a better sense of what the evaluation system requires and, more importantly, what the observation tools are lacking. To conduct a meaningful observation, an evaluator has to have intricate knowledge of the professional practice rubric, OR keep a copy of it with them while completing the observation.
This is all the state-created observation form includes:
There is VERY little guidance here on the form itself.
The framework outline in the appendix of the teacher guidance document offers a little more detail on what evaluators are looking for, as well as a full breakdown of each domain and component.
However, if I’m conducting an evaluation, the observation tool on its own is not enough, and the total rubric (all 16 pages of it for the relevant domains and subfactors) is too much.
STEP 2: DESIGNING THE OBSERVATION FORM
A. Developing the Scale
The first step was figuring out the scale for the observer to follow (because this will influence the language of the measures in the observation form). Both the state-created observation form and rubric have a 4-point scale.
However, there’s no clear delineation between each step of that scale. In addition, the actual scoring isn’t beholden to it either: the “Final Effectiveness Rating” is out of 100 points. So, for ease of this project, and for our would-be observers, I used a 5-point Likert scale that I had created for one of my projects at work (see the "Scale" tab of the Observation Workbook).
This works out perfectly, because the “component sum” can just be the sum of each component’s average (min 4, max 20), and you can then multiply that score by 5 to get the weighted points (out of 100) for the overall “Final Effectiveness Rating.”
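For example, with made-up component averages (and assuming four components, each scored 1 to 5):

```
4.2 + 3.8 + 4.6 + 4.0 = 16.6    component sum (range 4–20)
16.6 × 5 = 83                    weighted points (out of 100)
```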
B. Creating the Measures
The next step was creating a new observation form, with specific measurements for each component. To do this, I looked at the “Indicators” and Level 4 “Critical Attributes” for each component (trying to connect the critical attributes to appropriate indicators).
I organized these into a Google Sheets file (the "Overview" tab of the Observation Workbook) to determine the individual measures I would include in the form itself. Once I had that organized, I reviewed it alongside the language used in each component’s listed elements from the rubric to create a list of metrics to be measured in the observation form.
There were some challenges to this, since the attributes didn’t always align well with the indicators on the rubric document, and there was some overlap between the domain components. It took me a little while to create a list of metrics that were 1) clear statements, 2) measurable, and 3) properly aligned.
In the end, I developed 5 distinct measures for each component. You can see the full list in the "Measurements" tab of the Observation Workbook.
C. Creating the Form
This was probably the easiest step of all. There are many tools to collect data (SurveyMonkey, Typeform, Qualtrics, etc.), but I went with Old Faithful: Google Forms. I know from my time as a teacher that this is something EVERYONE knows how to use. In fact, most schools use Google Workspace as their main toolset, so it’d be easily accessible for school leadership and evaluators.
I also made sure the form captured details about the lesson: the date, start time, stop time, etc. This can be helpful for analysis later on.
D. Creating Fake Data
For this, as much as I have issues with it, I relied on ChatGPT rather than creating a bunch of responses manually. It was a bit of a task, and I had to do some finagling to get the data to match the output of the form. The first dataset was fairly good with the non-metrics but used the wrong Likert scale. The second dataset was a little better, but the information didn’t always make sense. I ran out of free usage on ChatGPT and didn’t want to wait, so I made the best of it and made some slight modifications to the raw data so it fit the correct format of the form. (Admittedly, I spent more time fiddling with this than I should have…)
STEP 3: CLEANING & PREPARING THE DATA
A. Organizing the Data
First, I always like to have a clean, organized sheet of just the form responses, to make it easier to read and review quickly. So I created a new tab called “Responses_Wide” in the Response sheet, where the questions are easier to read and more clearly connected to their domains and components.
In addition, I created a “Reference” tab in the sheet that did the following:
B. Aggregating the Data
This is something that could be done in Looker Studio with calculated fields, but I always find it easier to do in Sheets. Plus, some folks love spreadsheets, so having all the data in one source that's easy to pull from later works best for them.
The first thing I did was create a field for “Minutes Observed,” the difference between the end time and the start time. This revealed the first flaw of the ChatGPT dataset: I had to adjust several start/end times so we wouldn’t have negative minutes or thousand-minute observations. I didn’t go crazy here, so my fake dataset still has some… lengthy observations.
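A minimal version of that calculation, assuming (hypothetically) that the start time lives in column B and the end time in column C. Sheets stores times as fractions of a day, so multiplying the difference by 1440 converts it to minutes:

```
=IF(ISBLANK(B2),"", (C2-B2)*1440)
```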
Then I created averages for each of the components and domains, plus a calculation for the total scores. I wrapped each calculation in an =IF(ISBLANK(...),"",...) check and carried it down the full column. This way, as more observations are completed, the sheet will keep aggregating the data, and empty rows stay blank instead of showing errors.
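As a sketch (the ranges here are hypothetical, not the workbook's actual columns):

```
Component average (five measures in D2:H2):
=IF(ISBLANK($A2),"", AVERAGE(D2:H2))

Domain average (across that domain's component-average columns):
=IF(ISBLANK($A2),"", AVERAGE(AA2:AD2))
```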
C. Converting the Data to Long Format
The final adjustment in Sheets was to convert the data to long format, since some of the visualizations I planned for Looker Studio only work with long-format data.
To do this, I used a combination of the ARRAYFORMULA, SPLIT, and FLATTEN formulas, combined with a FILTER that limited the result to rows starting with a number (since my first column was a date).
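Here's a sketch of that unpivot pattern, with hypothetical ranges: dates in Responses_Wide column A, measure names in the header row C1:AZ1, and scores in C2:AZ. (Concatenation turns the dates into serial numbers, which is why the rows "start with a number"; instead of checking for that, this version filters out blank score cells, which achieves the same cleanup.)

```
=ARRAYFORMULA(SPLIT(FILTER(
    FLATTEN(Responses_Wide!A2:A & "|" & Responses_Wide!C1:AZ1 & "|" & Responses_Wide!C2:AZ),
    LEN(FLATTEN(Responses_Wide!C2:AZ)) > 0),
  "|"))
```

This spills three columns: date, measure, and score.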
I then used a few VLOOKUPs wrapped in IFERRORs to connect each measurement to its proper domain and component. And voilà, the data is ready for the dashboard.
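Assuming a lookup table on the "Reference" tab with the measure name in column A, its domain in column B, and its component in column C (again, a hypothetical layout), and the long-format measure name sitting in column B of the long tab, those lookups could look like:

```
Domain:     =IFERROR(VLOOKUP($B2, Reference!$A:$C, 2, FALSE), "")
Component:  =IFERROR(VLOOKUP($B2, Reference!$A:$C, 3, FALSE), "")
```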
STEP 4: THE DASHBOARD TOOL
I created the dashboard in Looker Studio, with six distinct pages, each serving a separate purpose:
Here you can see the full dashboard:
With two different datasets (wide and long format), you would normally set up “Blends” to connect the data. BUT, this is sometimes a challenge in Looker Studio because of the limits on the number of fields in blended data. One way to work around this is to create connected field IDs in both datasets.
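A sketch of that kind of key, assuming (hypothetically) the observation date is in column A and the teacher name in column B. Computing the same formula in both the wide and long tabs gives Looker Studio a single field to match the sources on:

```
=IF(ISBLANK($A2),"", $B2 & "_" & TEXT($A2, "yyyy-mm-dd"))
```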
STEP 5: FEEDBACK FORM
I created a template feedback form in Google Docs with the key information that evaluators would want to share with teachers post-observation. Specifically, it includes the scores (out of 5) for each domain and its components, as well as any comments from the observation.
With the help of ChatGPT (because I do NOT know JavaScript), I created a Google Apps Script that copies the template and fills in all the relevant information from a feeder sheet (which I created using a combined IMPORTRANGE and QUERY function to bring in the relevant fields).
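The gist of the script is something like the sketch below. Everything here is a placeholder standing in for the real thing: the file IDs, the feeder sheet name and column order, and the %Token% markers in the template.

```javascript
// Sketch: generate one feedback doc per observation row in the feeder sheet.
function generateFeedbackForms() {
  const TEMPLATE_ID = 'TEMPLATE_DOC_ID';   // placeholder: the Google Docs template
  const FOLDER_ID = 'OUTPUT_FOLDER_ID';    // placeholder: destination Drive folder
  const feeder = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Feeder');
  const folder = DriveApp.getFolderById(FOLDER_ID);
  const rows = feeder.getDataRange().getValues();

  rows.slice(1).forEach(function (row) {    // skip the header row
    const [teacher, date, domainAvg, componentAvg, comments] = row;  // hypothetical columns
    const copy = DriveApp.getFileById(TEMPLATE_ID)
        .makeCopy('Feedback - ' + teacher + ' - ' + date, folder);
    const doc = DocumentApp.openById(copy.getId());
    const body = doc.getBody();

    // Swap each %Token% in the template for the row's values.
    body.replaceText('%TEACHER%', String(teacher));
    body.replaceText('%DATE%', String(date));
    body.replaceText('%DOMAIN_AVG%', String(domainAvg));
    body.replaceText('%COMPONENT_AVG%', String(componentAvg));
    body.replaceText('%COMMENTS%', String(comments));
    doc.saveAndClose();
  });
}
```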
As a result of the code, I had a folder of feedback forms, each with the appropriate information filled in for its observation.
FINAL THOUGHTS
While this dashboard and reporting tool is tailored to the imaginary "Charest Academy," it really could be adapted by anyone using the Rhode Island Model of evaluation set forth by the Rhode Island Department of Education. The insights from this could be far more substantial than what the DOE currently provides in terms of: