1 of 1 people found this helpful
Hey Mike, here are some of my thoughts and things for you to look at.
Commits is being called out as it's own activity/statement likely because that's how it's being used by the application.
Think in terms like the following transaction:
update table foo set bar = 10
insert into table foo2 values (...)
delete from users where userid = (select userid from users where username = 'Mike.King')
In this one transaction or unit of work, you have 3 SQL statements plus a commit. Each statement (and the commit) will accrue wait time against them which will be tracked/displayed by DPA.
When a commit is done, the database then has to persist that transaction to the transaction log so that it can replay it in case of a recovery since the last backup point (oversimplifying here a bit).
The time it takes to persist that transaction to the tran log is tracked by SQL Server under the WRITELOG wait type.
Things to look for:
- Make sure the write latency for the storage where the transaction log resides is healthy (if mission critical database, less than 10ms). You can look for this by looking at the "Resources" correlation tab below the wait times in trend and timeslice. Look for a metric called SQL disk write latency. You can also drill into an hour in the Storage I/O tab and select the file for the transaction log, then select latency vs throughput to see what kind of volume and stalls I/O requests are being experienced for this file.
- Also, look at the logic for the transactions as well. It's not uncommon for high volume transactions to commit after each and every update say instead of batching them up (batching means significantly reduced commits and reduced wait times).
Hope this helps and provides you at least some things to dig into.
Others may have thoughts or things to look at as well...
2 of 2 people found this helpful
I just wanted to clarify that when commit is run, it's not necessarily the entire transaction that is being persisted to disk. There is a continual process of persisting the transaction log to disk, even for uncommitted transactions.
SQL writes uncommitted transactions to the transaction log, especially on large transactions. The log will be written to disk, and associated data to the data file. If there is a system outage, if the commit statement cannot be found upon startup, then the transaction will be rolled back, and the accurate data will be restored from the transaction log.
This blog post says it well. http://www.sqlnotes.info/2011/10/04/sql-server-transaction-log-buffers/
"Log buffer is a small contiguous memory structure, only 60k maximum, to host in memory transaction logs per database. When the transaction is committed, entire buffer will be written to transaction log file before commit finishes. While running a big transaction, the log buffer will be filled up immediately. As this happens, the buffer will be forced to be written to the disk and being cleaned up for rest of the transactions. In concurrent transaction within one database scenario, the log buffer contains mixed logs from all active transactions. Any one of the transaction gets committed or the 60k buffer gets full will cause entire buffer to be written."
Also, keep in mind that if you are running AGs in synchronous mode, those log writes have to be hardened on the replica node before you will get a commit confirmation. This can cause significant performance hits in your transactional system.
In Mike's case, we took a look at his environment and DPA spelled it out very clearly that write operations during certain periods were taking a way long time. Clear textbook example of storage latency killing database performance.
That makes sense. Thanks for interacting with me on my first ever THWACK comment.
Thanks for the reply it's really appreciated.
Ok that makes sense, with regard to it tracking it's own time and not being directlylinked. I can't really find any queries before hand though that are run in transactions. 90% of our calls are via Entity Framework, and the rest are Sprocs. I will dig into that further to see if there's anything I'm missing.
As for the disk and latency stats I've added the charts below. There definitely seems to be an issue here, but I guess it's chicken and egg and trying to see what is causing this spike, although there doesn't seem to be anything application based that would cause this when i investigate the SQL, mainly selects rather than update/inserts.
Few additional questions i have:
* I can see Tempdb/templog spike before everything else and seem to set the latency on an upward trend, if i can see the "commits" on DB A, is it possible that DB B is the cause which then makes DB A slow? Note all the logs/files are on the same disk, but our customer has discounted this as the cause.
* In the final graph the biggest proportion of Wait time is other waits, from reading i have done on the forum I can see that this is non-IO waits, so does this imply the issue is actually elsewhere?
1 of 1 people found this helpful
Let me verify if WRITELOG is included in i/o waits. I think it should be, but want to make sure.
This may be easier to discuss on a phone call... Can you open a support case and mention you have been working with me on this?
We can coordinate a time next week to hop on a screenshare and investigate more. (I'm currently OOO until Friday)
Let me know your thoughts.
Yeah that's great. I've raised the support call, so we'll arrange a call next week.