Fun with Big Lists - Multi Value Choices
Some of the more interesting programming challenges in SharePoint happen when you exceed the List View Threshold. This article will first explain the List View Threshold. Then it will illustrate how traditional filtering in a big list is impossible when targeting a multi-value choice field. Finally it will present a jQuery strategy to compensate for that shortfall. The solution at the end uses JQuery to take advantage of promises and $.ajax in an environment that doesn’t support typescript.
The “List View Threshold” is a number in “Central Administration” that ensures SharePoint breaks when reading an item collection exceeding that number. Let’s imagine the List View Threshold was left at the default value of 5,000. In that environment, SharePoint will choke when trying to look at the first 5,001 items in a 32,000 item library. This happens in both standard browsing and when executing REST queries and is well described in the Microsoft documentation for Big Lists.
Microsoft’s solution for this dilemma is to filter the View. By ensuring that the number of items in a view is less than 5,000, you should be able to divide that 32,000 item library into chunks and bypass any errors…right? Kind of... The next thing you might discover is that filtering can also cause problems. Filtering only works on “Indexed Fields” in big lists.
The natural solution is to index your field, filter on it, and then celebrate your success. But what if you are trying to filter and index a multi-value choice field? This presents a problem. Multi-value choice fields are “Unsupported Column Types”. More information on unsuported column types can be found in the Microsoft documentation for Indexing a SharePoint Column. This means that Multi-value choice fields can’t be indexed and, consequently, can’t be used for filtering in big lists.
In my case, instead of trying to filter by a multi-value choice field, it made sense to grab every item in the list and count it selectively using JavaScript. Using a number range to filter the items by ID, I was able to divide the items up into chunks. As a chunk concludes its query, I looped through the results to count the documents by type. Filtering was implementing by skipping the count if the document type was in the "dontCount" array.
I was concerned that having so many queries executing asynchronously would cause performance issues. So I decided to store part of the REST for each query in an array. Then, I used a second array to ensure I had exactly three active queries at a time. As soon as one query ends, another begins until all are complete. This allows the server to track fewer open requests and the client to store the results of fewer requests at one time by throttling the speed of the query. Counter intuitively, the end performance is way better than trying to run all of those queries at the same time.
Fair warning: Be careful when using this approach. Even when limiting execution to 3 active queries, this can quickly use too many resources if people use it concurrently. Check out the complete code on my git-hub account.