Calculating Memory Requirements for Analysis Services Tabular Models

One of the challenges with implementing Tabular Analysis Services models is right-sizing the hardware that will host the database. Conventional wisdom holds that the xVelocity engine yields roughly 90% compression. Accordingly, the commonly accepted formula for estimating the memory requirement is ([Source Database Size] / 10) * 2; for a 1 TB source database, that works out to (1,000 GB / 10) * 2 = 200 GB. There are a few ways to determine the source database size for that calculation.

1. Shrink the database to remove free space, then check the file size.

2. Execute the sp_spaceused stored procedure for each table used in the tabular model, then sum the values in the data and index_size columns.

3. Execute the sp_spaceused stored procedure against the database as a whole (assuming it contains no tables that are excluded from the tabular model) and use the sum of the data and index_size columns. A T-SQL sketch of the second approach follows this list.
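
As an illustration of the second approach, here's a minimal T-SQL sketch that captures sp_spaceused output for each model table into a temp table and sums the data and index_size columns. The table names are placeholders; substitute whatever tables actually feed your model.

    -- Sketch of approach 2: capture sp_spaceused output per table,
    -- then sum the data and index_size columns.
    -- The table names below are hypothetical placeholders.
    CREATE TABLE #SpaceUsed (
        [name]     sysname,
        [rows]     varchar(50),
        reserved   varchar(50),
        data       varchar(50),
        index_size varchar(50),
        unused     varchar(50)
    );

    INSERT #SpaceUsed EXEC sp_spaceused N'dbo.FactSales';
    INSERT #SpaceUsed EXEC sp_spaceused N'dbo.DimCustomer';
    INSERT #SpaceUsed EXEC sp_spaceused N'dbo.DimDate';

    -- sp_spaceused reports sizes as strings such as '123456 KB',
    -- so strip the suffix before converting and summing.
    SELECT SUM(CAST(REPLACE(data, ' KB', '') AS bigint)
             + CAST(REPLACE(index_size, ' KB', '') AS bigint)) / 1024.0 AS SourceSizeMB
    FROM #SpaceUsed;

    DROP TABLE #SpaceUsed;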

That's fine as far as it goes, but stopping there is a problem for the following reasons:

1. In tabular mode, the Analysis Services process (msmdsrv.exe), running with no data loaded, requires approximately 71 MB of volatile memory.

2. The data is likely to inflate somewhat when loaded into memory.

3. The calculation doesn't allow for additional memory overhead for trivial things like connections and administrative/maintenance operations. For some reason, it's occasionally necessary to create backups or process data.

4. The calculation assumes that you actually get something around 90% compression. In practice, the level of compression is more likely to fall in a range between 60% and 85%. So with a 1 TB relational database, you might realistically expect a disk footprint between 150 GB (at 85% compression) and 400 GB (at 60% compression).

Because of those considerations, I've been using a slightly different and frankly more conservative formula for determining hardware size requirements when building Tabular models.

The factors that the conventional formula fails to consider are:

1. Server operating memory requirements.

2. Increases in data volumes (and data volumes in data marts do tend to grow).

3. Server memory overhead to support user connections.

4. Server memory overhead to support administrative operations such as database backup or database processing.

Not being one who particularly enjoys going to the hardware guys to beg and grovel for more memory, I factor in all four. In my testing, the following formula provides a more conservative initial estimate of the memory required to process and store the data in local memory:

 71 + (([Size of uncompressed data (in MB)] / 6) * 5.2)

1. The constant 71 is the server's idle memory consumption, in MB, with no data loaded.

2. When the database is loaded from disk into memory, it will inflate to approximately twice the size of the data files stored on disk.

3. The 5.2 multiplier allows for additional memory for processing and disaster recovery operations, plus headroom approximately equal to the size of the database in memory, which helps prevent paging. A worked example of the formula follows this list.
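
As a sanity check, here's a hypothetical worked example of the formula in T-SQL. The 1,000,000 MB (~1 TB) figure is an assumption; plug in your own measurement of the uncompressed source data.

    -- Hypothetical worked example of the sizing formula.
    -- @UncompressedMB is an assumed value; substitute your own measurement.
    DECLARE @UncompressedMB decimal(18, 2) = 1000000.0;  -- ~1 TB of source data

    DECLARE @EstimatedMB decimal(18, 2) = 71 + ((@UncompressedMB / 6) * 5.2);

    SELECT @EstimatedMB        AS EstimatedMemoryMB,  -- 866737.67 MB
           @EstimatedMB / 1024 AS EstimatedMemoryGB;  -- roughly 846 GB

For comparison, the conventional ([Source Database Size] / 10) * 2 formula yields only about 200 GB for the same source data, which is why estimates based on it tend to come up short.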

I've inherited a legacy Tabular model that is small overall but spikes to more than 10x its size during processing due to what would now be considered poor design choices. For now we're living with it, since redesigning it at this point would be a huge effort.
