Baselines for Verification – Creating and Using them

Introduction

In the previous post we talked about how we can reduce the churn of bugs being introduced into the repository by using a continuous integration flow.  This involved having a release flow, where each release goes out only after the code has passed a sanity test.  When a release goes out, we create something called a 'baseline'. A baseline is simply a recipe for setting up a workspace to match a release. In this post, we’ll discuss our process for creating baselines and how syncing to baselines also greatly improves your ability to reproduce regression failures.

Creating a Baseline

There are many ways to implement baseline creation, but I’ll describe what our team does.  Instead of calling repository commands directly to sync out a workspace, we have a script that does the syncing for us.  This script also lives in our repository and is managed by Jenkins (which runs our sanity tests).  Each time a sanity run completes successfully, Jenkins updates and submits the script with the exact file versions used in that run.  When a team member wants to sync to the latest baseline (called the ‘head’ baseline), they just sync the latest version of the sync script and run it – at that point they are guaranteed to have the same file versions that passed the most recent sanity.
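As a rough sketch of the Jenkins-side step, the generated sync script just pins every file to the revision the sanity ran against. The depot paths, the `p4 sync path#rev` syntax, and the snapshot dictionary below are illustrative assumptions; a real flow would query the version control system for the synced revision of every file in the workspace.

```python
def make_sync_script(file_revisions):
    """Render a baseline sync script that pins every file to an exact revision.

    file_revisions: mapping of depot path -> revision number, as captured
    from the workspace the sanity just passed in.
    """
    lines = ["#!/bin/sh", "# Auto-generated baseline - do not edit by hand."]
    for path, rev in sorted(file_revisions.items()):
        # Perforce-style "file#rev" pins the file to one exact revision.
        lines.append(f"p4 sync {path}#{rev}")
    return "\n".join(lines) + "\n"

# Toy snapshot of what the sanity run was tested against (hypothetical paths).
snapshot = {
    "//depot/rtl/top.sv": 41,
    "//depot/tb/env.sv": 17,
    "//depot/tb/test_pkg.sv": 9,
}

script = make_sync_script(snapshot)
print(script)
```

Jenkins would then submit this regenerated script back to the repository, and that submission itself becomes the new head baseline.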

Other environment setup, such as Linux modules to load (tool/library versions) and environment variables, should live in a separate script that is part of the baseline. The reason is that those items need to be repeated for every terminal that gets opened. Ideally the baseline script contains only operations that need to be done once to set up the workspace, which can then be reused by any terminal you open.

Give each Baseline a Unique Number (or name)

We’ve talked about syncing to the head baseline, but it should always be possible to sync to any baseline that has ever been created.  This is done by giving each baseline a unique number or name.  In our implementation, since each baseline is just a revision of the sync script, we use the changelist number of each submission of the script as the baseline number.  To sync to a non-head baseline, we simply sync the version of the script as it existed at that changelist and run it.
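The resolution step can be sketched as follows: given a baseline number, take the newest submission of the sync script at or before that changelist. The in-memory `submissions` history below stands in for the repository’s revision history and is purely illustrative.

```python
import bisect

# Hypothetical history: each submission of the sync script recorded as
# (changelist, script_text). The changelist of a submission IS the
# baseline number.
submissions = [
    (1001, "p4 sync //depot/...@1001\n"),
    (1057, "p4 sync //depot/...@1057\n"),
    (1142, "p4 sync //depot/...@1142\n"),
]

def script_for_baseline(baseline, history):
    """Return the sync script as of the given baseline (changelist) number."""
    changelists = [cl for cl, _ in history]
    # Newest submission at or before the requested changelist.
    i = bisect.bisect_right(changelists, baseline) - 1
    if i < 0:
        raise ValueError(f"no baseline exists at or before changelist {baseline}")
    return history[i][1]

print(script_for_baseline(1100, submissions))
```

This mirrors how version control systems already resolve `@changelist` syntax, which is why reusing the changelist number as the baseline name is so convenient: no separate bookkeeping is required.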

Baselines Improve Regression Reliability

One problem that can occur when launching regressions, especially automatically launched ones, is that they can be taken down by a bad check-in.  If a change that breaks the compile or introduces a catastrophic failure lands too close to the regression launch, you can lose the whole regression.  If you launch regressions nightly, for instance, that means losing a day’s worth of results.

In contrast, if you always launch your regressions on a baseline, the code is guaranteed to have passed a sanity, which means it will at least compile and pass basic testing.  If a catastrophic bug is introduced right before a regression launch, the sanity will fail and the bug will never make it into the baseline, so your regression has some level of protection against bad submissions.

Baselines for Reproducibility

Since we normally work in constrained-random environments, reproducing failures can be a challenge.  Simply knowing the command to run a test isn’t always enough, even if you know what seed to use.  This is because changes in the code base can affect randomization results or the behavior of your stimulus even when using the same command.  This is particularly challenging when trying to reproduce hard-to-hit corner cases.

This is generally not a problem if you don’t need to recompile and can simply reuse the simulation binary, but that means the binary must be kept until all failures have been debugged.  It also has the severe limitation that you cannot change any files and rerun: changing a file requires a recompile, which you cannot do without access to the original runner’s workspace.  This is problematic because debugging often requires changing files to add debug code or to test a fix for a given failure.

Another strategy is to reuse the workspace that launched the regression when debugging.  This approach works but has the same limitation: you cannot modify files unless you own the workspace.  Additionally, if the workspace is inadvertently changed (e.g., the owner syncs it), the ability to reproduce is lost.  The same happens if the owner decides to add debug code or test a fix.

Baselines provide an elegant solution to the problems above.  Since all regressions are launched on a baseline, all team members can sync their workspace to the baseline that was used to run any regression.  Even if the regression results are deleted and the launching workspace has changed, as long as you know the baseline to use and the command to run, you’re able to reproduce failures indefinitely.
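The full reproduction recipe, then, is just three pieces of information: the baseline number, the test command, and the seed. A minimal sketch, where the script path, the `run_test` command, and the seed-passing plusarg are all hypothetical names used only for illustration:

```python
def repro_steps(baseline, test_cmd, seed):
    """List the shell steps any team member would run to reproduce a failure.

    baseline: changelist number of the baseline the regression ran on.
    test_cmd: the test command recorded with the failure (hypothetical).
    seed: the random seed recorded with the failure.
    """
    return [
        # Fetch the sync script as it existed at the baseline changelist.
        f"p4 sync //depot/scripts/sync_baseline.sh@{baseline}",
        # Rebuild the workspace to the exact file versions of that baseline.
        "./sync_baseline.sh",
        # Rerun the failing test with the original seed.
        f"{test_cmd} +ntb_random_seed={seed}",
    ]

steps = repro_steps(1142, "run_test smoke_test", 3735928559)
for step in steps:
    print(step)
```

Because every ingredient is a small, durable identifier rather than a live artifact like a binary or a workspace, the recipe keeps working long after the original regression results are gone.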

Baselines for Sharing Workspace State

Although less common, from time to time it is helpful for team members to share their workspace state with each other.  Baselines provide a convenient way to do that.  If a workspace is synced to a baseline, a team member can pass along that workspace state by simply providing the baseline number.  This can be helpful if you need input from another team member on some issue (or on a local change you have) but must continue making changes in your workspace while waiting.

Conclusion

In this post, we’ve talked about how to create a baseline and why it matters.  Baselines not only protect our regressions and our team from bugs being introduced into the code base; they also let us sync a workspace to the exact baseline used to run a regression or test, easing failure reproduction.

In the subsequent blog posts, we’ll talk about regression reporting for nightly regressions, and a strategy that can be used to allow for indefinite failure reproduction with only the regression report.
