Internship Engineering
Introduction
What makes an internship successful?
For the business, it's cost reduction, predictable hiring, and stable delivery capacity.
For students, it's knowledge, experience, networking, and growth.
It seems like a win-win, but only if the whole process is engineered. If the rules, targets, and scoring are unclear, both sides will interpret the results differently and the “win-win” becomes random.
This article is not scientific research. It is a proof of concept and field notes on incentive design for internship outcomes: how rules, scoring, checkpoints, and transparency shape intern learning and selection.
The program did not finish as designed due to internal company restructuring: Internship III and full-time offers never happened. Because of an NDA, I describe only the mechanics and aggregated observations, and focus on what should be tested next.
While developing this field, I tried to reduce ambiguity as much as possible and to introduce the mechanics needed for learning, assessment, and scoring.
Setup
I believe in engineering practice and apply its parts where they fit. I split the whole internship process into several phases and put guards and rules in place. The phases were Screening, Onboarding, Grading, and Internship I, II, and III.
The internship was framed as a 12-month process targeting a full-time contract for the selected interns, plus another month for screening and interviews.
Checkpoints
Every phase had its own checkpoint and quantitative assessment. This was necessary to provide a clear and comprehensive solution that could be expanded, modified, and reproduced based on feedback and known limitations.
For almost every phase, the guards included requirements, templates, assessment rules, and data-backed decision making.
Process
The process needed to stay close to standard practice to keep it clear and straightforward, but it required substantial changes. The target was to satisfy both the company and the interns and to avoid the common accusation of bias.
Method
The method below describes a staged internship pipeline with quantitative checkpoints. The scoring model is intended to be transparent and reproducible within one cohort, not a universal benchmark across different years or different intern groups.
The Pass Score was defined as 80% of the mean of the top 10 scores in the cohort. This threshold is a calibrated heuristic from prior observations on much smaller groups (compared to the current 2×25-person target). It tends to select “good enough” candidates by both pass rate and expected intern capability, without hard-coded grading. It can be tuned for other constraints (for example, a fixed internship capacity), but in this proof of concept I kept it stable.
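As a minimal sketch of that rule (the function name and parameters are mine, not part of the program's tooling):

```python
import statistics

def pass_score(cohort_scores: list[float], top_n: int = 10, ratio: float = 0.8) -> float:
    """Pass Score: `ratio` (80%) of the mean of the top `top_n` cohort scores."""
    top = sorted(cohort_scores, reverse=True)[:top_n]
    return ratio * statistics.mean(top)

# If the top 10 scores average 90 points, the Pass Score is 72.
```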
The method itself must stay relatively simple to avoid general confusion and to remain verifiable by an average student. The intent is not to find “the best of the best”, but to make the system predictable, stable, and open to communication from both sides.
Screening
The Screening phase aimed to select and onboard candidates independently of their background and marks. It targeted good-enough candidates who may or may not have the necessary university grades and performance, but who have a passion for engineering.
As a side effect, this removed the need to compare colleges and universities. No one received bonuses or penalties because of their place of study, residence, or similar factors.
Numbers
Assessment task
Screening scoring
Checkpoint 1: earn enough points to stay above the cohort median.
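A sketch of the checkpoint rule, assuming per-candidate point totals are collected (the names here are illustrative, not from the program's tooling):

```python
import statistics

def passes_checkpoint_1(candidate_points: float, cohort_points: list[float]) -> bool:
    # Checkpoint 1: a candidate stays in the process only while
    # their total is above the cohort median.
    return candidate_points > statistics.median(cohort_points)
```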
Onboarding
The Onboarding process was set up to align all interns on tools and methods and to get them acquainted with each other. It was designed as a core onboarding discipline where students have the power to assess and decide. All numbers, requirements, and rules were made public on the very first day of the internship and remained continuously visible to both groups.
Students were encouraged to collaborate, to control their project quality, to use AI tools, and to do everything possible to earn the top score. AI usage was allowed and encouraged, but it was not required and was not scored as a separate criterion. Additionally, onboarding was a chance for the best performers to choose a team to join.
Numbers
Assessment task
Project Scoring
Grading
Onboarding was a fixed-length period and involved many activities around writing and reviewing designs. Grading itself served two main purposes: to gradually increase project complexity and effort, and to make continuous reviewing a sustainable norm.
To support this, the template had a structure that allowed project complexity and time to grow step by step, and the entire onboarding was designed around weekly reviews of project increments.
To support collaboration, peers were assigned randomly at the start of onboarding, and this information was open to every participant. Completing the peer reviews was mandatory for finishing the Onboarding phase; a sketch of such an assignment is shown below.
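The exact assignment procedure is not disclosed here; one simple balanced variant, assuming a shuffled ring so that nobody reviews their own work and everyone gives and receives the same number of reviews, could look like this:

```python
import random

def assign_peers(interns: list[str], reviews_each: int = 2,
                 seed: int | None = None) -> dict[str, list[str]]:
    """Random, balanced peer-review assignment on a shuffled ring."""
    assert 0 < reviews_each < len(interns), "need more interns than reviews each"
    rng = random.Random(seed)
    ring = interns[:]
    rng.shuffle(ring)
    n = len(ring)
    # The intern at position i reviews the next `reviews_each` interns on the
    # ring, so no one reviews themselves and the load is identical for everyone.
    return {ring[i]: [ring[(i + j) % n] for j in range(1, reviews_each + 1)]
            for i in range(n)}
```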
Pass criteria
Checkpoint 2: onboarding final score and completion of required peer reviews.
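As with the other checkpoints, the pass decision reduces to a checkable predicate. A sketch (the names and signature are mine; both conditions are mandatory):

```python
def passes_checkpoint_2(final_score: float, reviews_done: int,
                        threshold: float, reviews_required: int) -> bool:
    # Onboarding is passed only when the final score reaches the Pass Score
    # threshold AND all required peer reviews have been completed.
    return final_score >= threshold and reviews_done >= reviews_required
```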
Internship
The internship was designed to let students understand the company and team culture and gradually increase their participation in the project. The target was to retain as many interns as possible and to encourage them to continue studying.
The final schedule was designed as follows:
Internship I was a time for cultural fit and team feedback. It was necessary to align interns with teams and find the best fit.
Internship II and III were aligned with university exams to ensure interns would continue their education.
Pass criteria
Return rate (in this article) means the percentage of interns who returned after the Internship II phase, even considering the company's difficulties.
Checkpoint 3: completion of Internship II and the return decision (since Internship III did not fully happen).
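The return-rate definition above reduces to a one-line computation; the counts below are placeholders, since the real numbers are not disclosed:

```python
def return_rate(returned: int, cohort_size: int) -> float:
    """Return rate (%): interns who came back after Internship II."""
    return 100.0 * returned / cohort_size

# Placeholder example: 22 of 25 interns returning gives 88.0%,
# in line with the "above 85%" reported later in this article.
```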
Execution and Observations
This was an experimental proof of concept executed in real conditions. The program was stopped prematurely due to internal company restructuring at the Internship III stage, and the Internship II stage was severely affected. Because of this, outcomes are limited to Screening → Onboarding → Grading → Internship I–II, and no full-time conversion results exist in this dataset.
By the end of the observation, the engineering internship programme had staffed more than 70 interns in total. This number includes the previous, much smaller groups used to calibrate the scoring and the Pass Score heuristic, as well as the current 2×25-person target cohort.
Definitions used below
This section focuses on observed effects and side observations. Interpretation is kept minimal, because the program was neither completed nor designed as scientific research.
Retention and Economy
From a general perspective, the presented approach showed decent maturity and deserves further observation, though not final conclusions. The return rate in this proof of concept was above 85%: most interns returned after the Internship II phase, even considering the company's difficulties and the programme interruption.
The economy index for the 9-month run was close to 2.0 against a baseline of 1.0 (the market salary level of an entry-level engineer). Within the limits of this observation, the engineered checkpoints and extended internship path produced more output per cost unit than expected, but it is not yet possible to say how stable this effect is over time or across other cohorts.
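The output metric itself is covered by the NDA, so this sketch only fixes the normalization implied above (all names are assumptions):

```python
def economy_index(output_per_cost: float, baseline_output_per_cost: float) -> float:
    """Output per unit of cost, normalized so that an entry-level engineer
    at market salary scores 1.0; an index of ~2.0 therefore means roughly
    twice the output per cost unit versus that baseline."""
    return output_per_cost / baseline_output_per_cost
```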
Knowledge and Experience
From the very beginning, it was obvious that the incoming knowledge and experience were insufficient for the apprentice job. This was observed in prior experiments on smaller groups and confirmed during the present experiment.
The Onboarding phase proved necessary for aligning interns on tools, methods, and instrumentation. Artificial problems and final designs let interns align with other teams in a simpler, safer environment before touching production-grade tasks.
Collaboration
From the very beginning, everyone was encouraged to collaborate and to use the available tools for all aspects of their Onboarding. Even so, it quickly became obvious that modern education does not provide enough practical experience in collaborating on design artefacts.
It was intentionally designed and stated that:
Here I use collaboration in a narrow sense: the visible impact of peer review and shared work on artefacts, even when the projects differ. In my experience, real collaboration tends to produce noticeable similarities (structures, wording, solution patterns), comparable to joint source-code authoring in pair programming, not just informal communication.
Neither group showed signs of strong collaboration toward improving project quality. The published projects had no significant similarities in any aspect. Based on this observation, I conclude that collaborative work needs further development before peer review can succeed. Future iterations will require explicit mechanics for shared work, not just encouragement and visibility of the scoring model.
AI usage
The application of AI to the Onboarding project showed even more disappointing results. AI was encouraged but not required, and it was not used as a separate scoring criterion.
Even though the AI adoption rate was above 70%, structured and systematic application was found in less than 5% of the works. The common problems were:
In rare cases, even after the second and third mentor review, the problems listed above reappeared and persisted in the text. In the next iterations, AI should be presented explicitly as a tool for checking and structuring one's own thinking, not as a replacement for it, and its correct use must be practised on small, safe exercises before being applied to the main project work.
Conclusion
While immature and incomplete, this experiment showed both the potential and the limitations of the presented approach. It helped frame the method and reveal the next steps toward a better-engineered internship, with explicit incentives and checkpoints instead of implicit expectations.
Key findings
Next experiments
I am grateful to everyone who participated in this project. Special thanks to Mariya Pavina, Ekaterina Sedyukova, Anna Vasilenko, and Mariam Matusevich for their help and support throughout the project.