5 Replies Latest reply on Jul 7, 2015 9:41 AM by m60freeman

# Page Life Expectancy for NUMA

I was just reading an interesting article (Page Life Expectancy isn't what you think... - Paul S. Randal) that states the following:

Most new systems today use NUMA, and so the buffer pool is split up and managed per NUMA node, with each NUMA node getting its own lazy writer thread, managing its own buffer free list, and dealing with node-local memory allocations. Think of each of these as a mini buffer pool.

The Buffer Manager:Page Life Expectancy counter is calculated by adding the PLE of each mini buffer pool and then calculating the mean. But it’s not the arithmetic mean as we’ve all thought forever, it’s the harmonic mean (see Wikipedia here), so the value is lower than the arithmetic mean. (5/11/2015: Thanks to Matt Slocum (b | t) for pointing out a discrepancy from the arithmetic mean on a large NUMA system and making me dig into this more, and my friend Bob Dorr from CSS for digging into the code.)

What does this mean? It means that the overall PLE is not giving you a true sense of what is happening on your machine as one NUMA node could be under memory pressure but the *overall* PLE would only dip slightly. One of my friends who’s a Premier Field Engineer and MCM just had this situation today, which prompted this blog post. The conundrum was how can there be 100+ lazy writes/sec occurring when overall PLE is relatively static – and this was the issue.

For instance, for a machine with 4 NUMA nodes, with the PLE of each being 4000, the overall PLE is 4000.

The calculation is: add the reciprocals of (1000 x PLE) for each node, divide that into the number of nodes and then divide by 1000.

In my example, this is 4 / (1/(1000 x 4000) + 1/(1000 x 4000) + 1/(1000 x 4000) + 1/(1000 x 4000)) / 1000 = 4000.

Now, if one of them drops to 2200, the overall PLE only drops to: 4 / (1/(1000 x 2200) + 1/(1000 x 4000) + 1/(1000 x 4000) + 1/(1000 x 4000)) / 1000 = 3321.

If you had an alert set watching for a 20% drop in PLE then that wouldn’t fire, even though one of the buffer nodes was under high pressure.

And you have to be careful not to overreact, either. If one of them drops to 200, the overall PLE only drops to: 4 / (1/(1000 x 200) + 1/(1000 x 4000) + 1/(1000 x 4000) + 1/(1000 x 4000)) / 1000 = 696, which might make you think that the server is suffering hugely across the board.
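The quoted formula is just a harmonic mean of the per-node PLE values (the 1000x scaling cancels out). A small Python sketch of the worked examples above, using hypothetical node values:

```python
def overall_ple(node_ples):
    # Harmonic mean of per-NUMA-node PLE values, following the
    # quoted formula: N / sum(1/(1000 * PLE_i)) / 1000.
    n = len(node_ples)
    return n / sum(1.0 / (1000 * p) for p in node_ples) / 1000

# All four nodes healthy at 4000:
print(round(overall_ple([4000, 4000, 4000, 4000])))  # 4000

# One node dips to 2200 -- overall barely moves:
print(round(overall_ple([2200, 4000, 4000, 4000])))  # 3321

# One node craters to 200 -- overall drops far more than 1/4 of the way:
print(round(overall_ple([200, 4000, 4000, 4000])))   # 696
```

Note how the harmonic mean understates a single node's moderate drop (3321, well within a 20% alert threshold) but is dragged down hard by one very low node, which is exactly the overreaction risk described above.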

It seems that DPA is simply showing the average PLE and not the PLE for each node. Is that true?

• ###### Re: Page Life Expectancy for NUMA

Here is the calc DPA is using:

```sql
select cntr_value
from sys.dm_os_performance_counters
where lower(counter_name) = 'page life expectancy'
  and object_name like '%Buffer Manager%'
```

As you can see, we're relying on SQL Server for this counter. My question would be: are they calculating it correctly? I have not looked into how MS calculates PLE to populate this DMV...
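For what it's worth, the same DMV also exposes PLE per NUMA node under the Buffer Node counter object, where `instance_name` identifies the node. A sketch of that query (assuming a NUMA system; on non-NUMA hardware there is a single node 000 matching the Buffer Manager value):

```sql
select instance_name as numa_node, cntr_value as ple_seconds
from sys.dm_os_performance_counters
where lower(counter_name) = 'page life expectancy'
  and object_name like '%Buffer Node%'
```

Alerting on the minimum of these per-node values, rather than on the overall harmonic mean, would catch the single-node pressure case described above.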

• ###### Re: Page Life Expectancy for NUMA

If we're only getting a single number, then we are likely just getting that harmonic mean that Paul Randal mentioned. This might be worth digging into. SQL Sentry apparently monitors the individual buffer nodes.

It isn't a matter of whether they are calculating it correctly, but that it provides insufficient information. If you have 3 checking accounts and have $500 in one, $400 in another, and $1.25 in a third, an average (or a harmonic mean) balance across all three isn't going to tell you that you have a problem with that third one.

• ###### Re: Page Life Expectancy for NUMA

Yep, I get the mathematical significance.  8 )

Feature request?

• ###### Re: Page Life Expectancy for NUMA

I would encourage people to up-vote this if it is relevant to their interests.
