Creating Intuition for Data Structures and Algorithms in Salesforce Apex.
Collections are the heart of the Apex language. We have prebuilt data structures such as List, Set, and Map (collections) built into the language. So do we still need knowledge of data structures and algorithms? We will try to find the answer in this blog.
We use collections in Apex for almost every task, whether it is bulkification, storing objects, storing Ids, DML, and so on. We also follow patterns such as no SOQL inside a for loop, no deeply nested for loops, and no deep if-else branching.
At this point we should stop and ask why these patterns actually work.
So let's start with some basic code patterns we write and try to understand why some code is more efficient than other code, and by what measure.
1. No SOQL inside the for loop.
Explanation 1: the maximum number of SOQL queries allowed in a single transaction is 100.
Explanation 2: a query is a database read operation and is very expensive, so it should not be used inside an iteration.
Let's say the execution time of one query is c seconds and the loop size is n.
n = 1, time = c
n = 10, time = 10*c
n = 100, time = 100*c
n = 10000, time = 10000*c
n = n, time = n*c
For small n the time is low, but as n increases the time grows at the same rate, paying the full query cost at every step.
So the solution here is to use a collection: fetch the results in one query and then iterate over the collection.
List<Account> accountsList = [SELECT Id, Name FROM Account];
for (Account ac : accountsList) {
    // business logic
}
Now the time will be c + accountsList.size() seconds,
i.e. the time is reduced from c*n to c + n.
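The difference can be sketched outside of Apex as well. Below is a minimal Java sketch (Java mirrors Apex closely); the QueryCostDemo class and its query counter are invented stand-ins for a database, not a real API, but they make the n*c versus c+n costs countable.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryCostDemo {
    static int queryCount = 0;

    // Hypothetical stand-in for a SOQL query: each call is one database round trip.
    static List<String> runQuery() {
        queryCount++;
        return new ArrayList<>(List.of("a1", "a2", "a3"));
    }

    // Naive pattern: one query per iteration -> n queries (time ~ n * c).
    static int naive(int n) {
        queryCount = 0;
        for (int i = 0; i < n; i++) {
            runQuery();
        }
        return queryCount;
    }

    // Bulkified pattern: one query up front, then iterate (time ~ c + n).
    static int bulkified(int n) {
        queryCount = 0;
        List<String> records = runQuery();
        for (String r : records) {
            // business logic per record; no queries in here
        }
        return queryCount;
    }

    public static void main(String[] args) {
        System.out.println("naive(100) issues " + naive(100) + " queries");
        System.out.println("bulkified(100) issues " + bulkified(100) + " query");
    }
}
```

Counting queries instead of wall-clock time keeps the comparison deterministic: the naive loop issues n round trips, the bulkified version always issues exactly one.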
....
2. No deeply nested for loops.
for (Account ac : accounts) {
    for (Contact ct : contacts) {
        for (ChildOfContact ch : ct.ChildOfContacts) {
            // business logic
        }
    }
}
Let's assume the size of accounts is n,
the average number of contacts per account is m,
the average number of child contacts per contact is k,
and the business logic takes z seconds.
Then the total execution time will be n*m*k*z seconds.
If n = 1000, m = 100, and k = 50, then time = 5,000,000*z seconds.
If n = m = k, the cost is n*n*n*z.
Time with the optimized solutions:
1. If you optimize to 1 query and two nested for loops, the time will be
c seconds for the query + n*n*z seconds.
2. If you can optimize to 1 query and one for loop, the time will be
c seconds for the query + n*z seconds.
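The usual way to flatten those nested scans is to build a map from parent key to child records in one pass. Here is a Java sketch of that idea; the Contact record with an accountId field is a made-up minimal stand-in for the sObject relationship.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupingDemo {
    // Hypothetical minimal stand-in for a Contact holding its parent Account Id.
    record Contact(String id, String accountId) {}

    // One linear pass builds accountId -> contacts, instead of rescanning the
    // whole contact list for every account (the n * m nested-loop cost).
    static Map<String, List<Contact>> groupByAccount(List<Contact> contacts) {
        Map<String, List<Contact>> byAccount = new HashMap<>();
        for (Contact ct : contacts) {
            byAccount.computeIfAbsent(ct.accountId(), k -> new ArrayList<>()).add(ct);
        }
        return byAccount;
    }

    public static void main(String[] args) {
        List<Contact> contacts = List.of(
            new Contact("c1", "a1"),
            new Contact("c2", "a1"),
            new Contact("c3", "a2"));
        Map<String, List<Contact>> grouped = groupByAccount(contacts);
        System.out.println(grouped.get("a1").size() + " contacts under a1"); // 2 contacts under a1
    }
}
```

After grouping, the per-account loop looks up its children in constant time, which is exactly the "1 query and 1 for loop" shape costed above.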
....
3. List is ordered, but Set and Map are not.
Ordered: we can determine the order of the elements and can access an element by
specifying an index.
Unordered: we cannot determine the order of the elements and cannot access an
element by specifying an index.
To understand this we need to first understand the process of storing the
elements in each of these collections.
1. List: We can think of a list as a dynamic array. An array is a collection of
elements of the same type; a dynamic array is a resizable array, i.e. it can grow
as elements are added and shrink as elements are removed.
add(ele): the add operation appends the element to the end of the list, so
the list preserves insertion order.
List<Integer> ints = new List<Integer>(); // []
ints.add(1); // [1]
ints.add(2); // [1,2]
ints.add(3); // [1,2,3]
Hence the order of the list matches the order of input, and we can specify an
index to get an element.
2. Set: A set is also backed by a dynamic array, but the process of populating it
with values is different from that of a list. It uses the concept of hashing to
store the values.
Hashing: in hashing we map a set of values to a set of keys, using a hash
function to compute the key for a value.
e.g. let the hash function be h(ele) = ele % 10, the remainder when dividing
by 10.
h(1) = 1 % 10 = 1, h(5) = 5, h(9) = 9 % 10 = 9
for ele = 1, key = 1, value = 1
for ele = 9, key = 9, value = 9
We then use this key as the index at which the value is stored in the set's
backing array.
Set<Integer> intSet = new Set<Integer>();
intSet.add(5); // can be viewed as bucket[h(5)] = bucket[5] = 5
intSet.add(2); // can be viewed as bucket[h(2)] = bucket[2] = 2
intSet.add(9); // can be viewed as bucket[h(9)] = bucket[9] = 9
Notice that we add the elements in the order (5, 2, 9), but the values are
stored at positions determined by the hash function. So we have no idea in what
order the values sit inside the set, i.e. it is unordered, and we cannot access
elements of a set by specifying an index.
*10 is the number of buckets here; we can choose another size as well. It is
called the bucket size.
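The bucket picture above can be made concrete with a toy Java sketch. This is deliberately simplified, a 10-slot array with h(ele) = ele % 10, matching the example; real Set implementations use more elaborate hashing and collision handling.

```java
public class BucketDemo {
    // Toy hash table with 10 buckets and h(ele) = ele % 10.
    static Integer[] buckets = new Integer[10];

    static void add(int ele) {
        buckets[ele % 10] = ele; // the hash picks the slot, not insertion order
    }

    public static void main(String[] args) {
        add(5);
        add(2);
        add(9);
        // Insertion order was 5, 2, 9, but storage order follows the hash:
        // slot 2 holds 2, slot 5 holds 5, slot 9 holds 9.
        for (int i = 0; i < buckets.length; i++) {
            if (buckets[i] != null) {
                System.out.println("slot " + i + " -> " + buckets[i]);
            }
        }
    }
}
```

Walking the buckets yields 2, 5, 9, which is why iterating a set need not reflect the order in which you added elements.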
3. Map: A map also uses the concept of hashing to store its keys and values. The
difference between a set and a map is that a set stores only keys, while a map
stores (key, value) pairs.
Map<Integer, Integer> intMap = new Map<Integer, Integer>(); // {}
intMap.put(5, 10); // {5=10}
The map calculates the hash of the key 5 using the hash function h(5) and stores
the pair (5, 10) at the position given by h(5).
So a map is also unordered, as we don't know where the keys are being stored.
...
4. A Set does not contain duplicate values.
A set stores values based on the hash function.
So if you provide a duplicate value, the key (index) will be the same, and the
set just replaces the old value with the identical new value.
intSet.add(2); // bucket[h(2)] = bucket[2 % 10] = bucket[2] = 2
If we add 2 to the set again, it just sets bucket[2] = 2 once more.
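A quick Java sketch shows the same behavior with the standard HashSet (Java's add also reports whether anything new was stored, which Apex's Set.add does as well):

```java
import java.util.HashSet;
import java.util.Set;

public class SetDedupDemo {
    public static void main(String[] args) {
        Set<Integer> intSet = new HashSet<>();
        intSet.add(2);                 // stored at the slot chosen by the hash
        boolean added = intSet.add(2); // same hash, same slot: nothing new stored
        System.out.println("size = " + intSet.size() + ", second add returned " + added);
        // size = 1, second add returned false
    }
}
```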
...
5. Keys must be unique in a Map.
Keys should be unique so that values don't get overridden.
intMap.put(2, 4); // bucket[h(2)] = bucket[2 % 10] = bucket[2] = (2, 4)
intMap.put(2, 5); // bucket[h(2)] = bucket[2 % 10] = bucket[2] = (2, 5)
// The old value is replaced by the new value with the same key.
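In Java the overwrite is directly observable, because put returns the previous value for the key (null when the key is new):

```java
import java.util.HashMap;
import java.util.Map;

public class MapOverwriteDemo {
    public static void main(String[] args) {
        Map<Integer, Integer> m = new HashMap<>();
        m.put(2, 4);                    // key 2 hashes to a slot; stores (2, 4)
        Integer previous = m.put(2, 5); // same key, same slot: value is replaced
        System.out.println("previous = " + previous + ", now = " + m.get(2));
        // previous = 4, now = 5
    }
}
```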
...
6. How a Map uses an object (non-primitive type) as a key:
As we have already seen, the key must be unique enough to produce a unique hash
value.
1. sObject type:
Let's look at this code:
Map<Account, Integer> objectMap = new Map<Account, Integer>();
Account a = new Account(Name = 'kumar');
Account b = new Account(Name = 'kumar');
objectMap.put(a, 1);
objectMap.put(b, 1);
System.debug(objectMap); // output: {Account:{Name=kumar}=1}
Even though a and b are two different objects, the size of the map is only one.
The hash values for these two objects are the same because a.Name and b.Name are
both 'kumar'.
For sObjects, the default hash value (the uniqueness of the key) is calculated
from the object's field values.
2. Other non-primitive types:
For other non-primitive types, such as user-defined types, we can define the
hash function (the key's uniqueness) by overriding the equals and hashCode
methods:
https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/langCon_apex_collections_maps_keys_userdefined.htm
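The same equals/hashCode contract exists in Java, so the idea can be sketched there; the Person class below is a made-up user-defined type whose key identity is its name field.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyDemo {
    // User-defined type: equality and hashing are based on the name field,
    // analogous to overriding equals/hashCode on an Apex class used as a map key.
    static class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Person)) {
                return false;
            }
            return name.equals(((Person) o).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<Person, Integer> byPerson = new HashMap<>();
        byPerson.put(new Person("kumar"), 1);
        byPerson.put(new Person("kumar"), 1); // equal fields -> same key -> one entry
        System.out.println("entries = " + byPerson.size()); // entries = 1
    }
}
```

Two distinct instances with equal fields collapse to one map entry, exactly like the two Account objects above.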
...
7. Does Set.contains(ele) take less time than List.contains(ele)?
Suppose we have a collection of Ids, Collection<Id> (Set<Id> or List<Id>),
and a list of contacts: List<Contact> ctList.
We need to filter the contacts whose Id is in that collection.
Which will give better performance, the set or the list?
for (Contact ct : ctList) {
    if (idCollection.contains(ct.Id)) {
        // do something
    }
}
1. If the collection is a list:
List<Integer> intList = new List<Integer>{1, 2, 3, 4};
intList.contains(4) can be viewed as:
for (Integer i : intList) {
    if (i == 4) {
        return true;
    }
}
i.e. it searches the list element by element. So if the size of the list is
n = 1000, in the worst case (the last element) it iterates through all 1000
elements: order O(n).
2. If the collection is a set:
Set<Integer> setInt = new Set<Integer>{1, 2, 3, 4};
If the hash function is h(ele) = ele % 10, then h(3) = 3 % 10 = 3,
and setInt.contains(3) can be viewed as:
if (bucket[h(3)] != null) return true;
It takes constant time to return true or false: c1 time to calculate the hash
value and c2 time to access the value at that index.
Then c1 + c2 = c seconds, which is constant time of order O(1).
So set.contains(ele) is clearly much faster than list.contains(ele) for large
collections.
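Rather than timing (which is noisy), the O(n) cost can be made visible by counting comparisons in a hand-rolled linear search, which is essentially what List.contains does, and contrasting it with a HashSet lookup. A Java sketch:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsDemo {
    static int comparisons = 0;

    // Linear search, what List.contains does under the hood: up to n comparisons.
    static boolean linearContains(List<Integer> list, int target) {
        comparisons = 0;
        for (int i : list) {
            comparisons++;
            if (i == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            list.add(i);
        }
        Set<Integer> set = new HashSet<>(list);

        linearContains(list, 1000); // worst case: the last element
        System.out.println("list comparisons = " + comparisons); // list comparisons = 1000

        // set.contains(1000) jumps straight to the hashed slot: O(1) expected time.
        System.out.println("set contains 1000: " + set.contains(1000));
    }
}
```

The linear scan touches all 1000 elements in the worst case, while the set lookup computes one hash and checks one bucket, which is why Set<Id> is the idiomatic choice for membership filters in Apex.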
....
8. Is the enhanced for loop better than the traditional for loop?
It depends on your use case.
In a traditional for loop you have access to the index, so you can customize
things: replace values, adjust the counter, and so on. It is, however, more
error prone.
An enhanced for loop is less error prone and easier to write than the
traditional for loop. It is less customizable and gives direct access to each
element of the collection.
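A short Java sketch of the trade-off (the helper names are invented for illustration): the traditional loop's index lets us replace elements in place, while the enhanced loop is the cleaner choice for read-only traversal.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopStylesDemo {
    // Traditional loop: index access lets us replace elements in place.
    static void doubleAll(List<Integer> values) {
        for (int i = 0; i < values.size(); i++) {
            values.set(i, values.get(i) * 2);
        }
    }

    // Enhanced loop: direct element access, no index bookkeeping to get wrong,
    // but we cannot replace elements through the loop variable.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(List.of(1, 2, 3));
        doubleAll(values);
        System.out.println(values + " sum = " + sum(values)); // [2, 4, 6] sum = 12
    }
}
```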
...
We have seen that even though we are happily using collections, we are actually using data structure and algorithm concepts every day. As time passes, the complexity of the problems will grow, so knowledge of data structures becomes necessary for developers.