SNMP data not written to database at night

FormerMember

Has anyone seen this problem? SNMP data is not collected between 11:00 pm and 5:00 am every day. This has been going on ever since I purchased Orion in May 2007. Ping data is always good; only SNMP seems to be affected. I have included a screenshot to illustrate. SolarWinds support has not been able to resolve this and I believe they have given up, so I hope that someone out there has some suggestions before I scrap the product.

Some things we have done to troubleshoot:

  • Moved SQL database from external cluster back to the local system
  • Reinstalled the product (many times, Solarwinds support has also done this remotely)
  • Applied all hot fixes
  • Adjusted Polls per second.
  • Reduced poll frequency
  • Verified that no other apps are running during that time frame. (backups, AV scans)
  • Orion server is not taxed. CPU runs around 20%

Even the data for the Orion server itself has dropouts.

  • We have seen a similar problem and opened a ticket with support. They recommended adjusting the polling interval from 5 to 4 minutes. This did not help. No further direction as of yet.

  • What time does your nightly maintenance start and finish?
    This normally happens because the maintenance script locks some tables in the DB, so data can no longer be written while it runs.

    Check this file in your Orion installation folder for the nightly DB maint. timestamps:
    ....\Program Files\SolarWinds\Orion\swdebugMaintenance.log

    2007-10-01 02:15:54,919 [1] INFO  DatabaseMaintenanceGui.Program - Database Maintenance Starting.
    2007-10-01 02:15:54,950 [1] INFO  DatabaseMaintenanceGui.Program - Database Maintenance running in automated mode.
    2007-10-01 02:15:54,981 [1] INFO  SolarWinds.Data.DatabaseMaintenance.Settings - Retaining Detailed for 35
    2007-10-01 02:15:54,997 [1] INFO  SolarWinds.Data.DatabaseMaintenance.Settings - Retaining Hourly for 65
    2007-10-01 02:15:54,997 [1] INFO  SolarWinds.Data.DatabaseMaintenance.Settings - Retaining Daily for 370
    2007-10-01 02:15:54,997 [1] INFO  SolarWinds.Data.DatabaseMaintenance.Settings - Retaining Events for 365
    2007-10-01 02:15:54,997 [1] INFO  SolarWinds.Data.DatabaseMaintenance.Settings - Retaining Wireless Sessions for 30
    2007-10-01 02:15:55,122 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Begining Maintenance
    <SNIP>
    2007-10-01 02:49:05,517 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Removing older data from VolumeUsage
    2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Finializing maintenance for  VolumeUsage
    2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.NetworkElements - Clearing Deleted Nodes
    2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.NetworkElements - Clearing Deleted Interfaces
    2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.NetworkElements - Clearing Deleted Volumes
    2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Database Maintenance Complete.

    I have about a one-hour gap in all my stats at this time every morning.
    This has been an ongoing issue for me for over 3 years, and I have been unable to resolve it.
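To find the maintenance window without reading the log by eye, the start and end lines can be pulled out with a short script. This is a minimal sketch, not a SolarWinds tool: the function names are made up for illustration, the two marker strings are copied from the log excerpt above, and the sample lines stand in for the contents of swdebugMaintenance.log.

```python
from datetime import datetime

# Marker text copied from the log excerpt in this thread.
START_MARKER = "Database Maintenance Starting."
END_MARKER = "Database Maintenance Complete."


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def maintenance_windows(lines):
    """Return a list of (start, end) datetimes, one per maintenance run."""
    windows = []
    start = None
    for raw in lines:
        line = raw.strip()
        if line.endswith(START_MARKER):
            start = parse_timestamp(line)
        elif line.endswith(END_MARKER) and start is not None:
            windows.append((start, parse_timestamp(line)))
            start = None
    return windows


# Sample lines taken from the excerpt above; in practice, read them from the log file.
sample = [
    "2007-10-01 02:15:54,919 [1] INFO  DatabaseMaintenanceGui.Program - Database Maintenance Starting.",
    "2007-10-01 02:15:55,122 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Begining Maintenance",
    "2007-10-01 02:49:05,580 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Database Maintenance Complete.",
]

windows = maintenance_windows(sample)
for begin, finish in windows:
    print(begin, "->", finish, "duration", finish - begin)
```

Comparing the printed window against the hours where the graphs have holes shows quickly whether maintenance is a plausible cause.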

  • FormerMember in reply to Network_Guru

    Database maintenance runs at 8:15 am:

    2007-10-01 08:15:59,849 [1] INFO  DatabaseMaintenanceGui.Program - Database Maintenance Starting.
    2007-10-01 08:23:05,514 [1] INFO  SolarWinds.Data.DatabaseMaintenance.MaintenanceEngine - Database Maintenance Complete.

  • We are writing to an external SQL DB. No log file on Orion server.


  • Have you checked for this file in your Orion directory?
    This is a job that runs from a DLL included in Orion and is based on the data retention settings you provide in System Manager.
    If this log file does not exist on your server, then you will be keeping detailed statistics forever and your DB will just keep growing, unless you manually truncate it.

    In any case, it appears this is not the issue for Mark, as his maintenance runs at 8:15 am.
    It was just something to check...
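The reasoning here is an interval-overlap check, with the wrinkle that the reported gap (11:00 pm to 5:00 am) crosses midnight. A minimal sketch of that check follows. The helper names are hypothetical, times are rounded to the minute, and the second comparison is a what-if using the 02:15 to 02:49 window from the other log in this thread, not a claim about Mark's system.

```python
from datetime import time

MINUTES_PER_DAY = 24 * 60


def to_minutes(t):
    """Minutes since midnight for a datetime.time value."""
    return t.hour * 60 + t.minute


def split_at_midnight(start, end):
    """Return the window as one or two same-day (start, end) minute ranges."""
    s, e = to_minutes(start), to_minutes(end)
    if s <= e:
        return [(s, e)]
    # The window wraps past midnight, so split it into two pieces.
    return [(s, MINUTES_PER_DAY), (0, e)]


def windows_overlap(a_start, a_end, b_start, b_end):
    """True if two daily time windows share any minute."""
    for s1, e1 in split_at_midnight(a_start, a_end):
        for s2, e2 in split_at_midnight(b_start, b_end):
            if s1 < e2 and s2 < e1:
                return True
    return False


snmp_gap = (time(23, 0), time(5, 0))            # gap reported in the first post
maintenance_0815 = (time(8, 15), time(8, 24))   # Mark's maintenance window, rounded
maintenance_0215 = (time(2, 15), time(2, 50))   # what-if: window from the other log

print(windows_overlap(*snmp_gap, *maintenance_0815))  # False
print(windows_overlap(*snmp_gap, *maintenance_0215))  # True
```

A maintenance run at 8:15 am falls outside the gap entirely, which is why maintenance is ruled out for Mark.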

  • This looks a bit like ours, except that we lost the polling data until the service was restarted. That is, we would have holes in the graph that remained until something kicked the service in the head.

    Now, SolarWinds recommended a second polling engine due to the size of our network, so we did that. The problem remained. Then they said that our brand new quad-core servers had issues, since it was the CPU cycle (which is 1/4 on a quad-core) that had to be greater than 2.6 GHz. [case # 5102]

    Since the problem remains, I am quite confident that this was not the case. The issue is a software bug, and I have spent several thousand on new servers and an extra polling engine. I really am NOT happy with this.

    We are also writing to an external SQL DB. No log file on Orion server.


    The processing is what appears to be the issue on the application servers. Even though the server is set up with quad-core 1.6 GHz processors, Orion is only going to use one processor. The minimum requirement for the number of devices you have is 2.6 GHz. Orion may simply be fully utilizing that processor. The utilization won't ever be higher than about 25% since it is a quad core, but again, Orion is only going to use one processor. Do you have another server with the above GHz available to test an Orion installation on?

    Regards,
    Matthew Harvey
    Solarwinds Tier II Support


    After I rolled back to 2 x 1-CPU servers, everything slowed down (of course):

    When I open a web page with all nodes (approx. 1500) it takes 15-16 seconds to open it. This makes it difficult to work with. Actually, when running on 2 x quad-core servers it took ~9 seconds. And on one server, 12 seconds.

    I think this must be an architecture issue.
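
The 25% figure in the support reply is plain arithmetic: if a process is limited to one core, the overall CPU number on a quad-core box tops out at a quarter even when that core is saturated. A minimal sketch of the calculation, with hypothetical helper names and made-up per-core readings (no real measurements from this thread):

```python
def overall_utilization(per_core_percent):
    """Average of per-core utilization, as an overall CPU graph would show it."""
    return sum(per_core_percent) / len(per_core_percent)


def has_saturated_core(per_core_percent, threshold=95.0):
    """True if any single core is at or above the threshold."""
    return any(p >= threshold for p in per_core_percent)


# Illustrative readings: one core pegged by a single-threaded process, three idle.
quad_core = [100.0, 0.0, 0.0, 0.0]

print(overall_utilization(quad_core))  # 25.0
print(has_saturated_core(quad_core))   # True
```

So an overall reading of around 20-25% does not by itself rule out a bottleneck; the per-core view is the one to check.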

  • FormerMember in reply to btp

    I would advise both of you to open Support tickets again. With these tickets, please provide us with some data to speed up our analysis:

    1. Please run the diagnostic utility on the Orion server.

    2. Please send us the event logs for both the Orion server and the database server for the time period where you're missing data.


  • FormerMember in reply to FormerMember

    I have opened a new case, #19589. The old case was #10197.

  • FormerMember in reply to FormerMember

    I have opened a new case and sent in the diagnostic info, and guess what: no one will call me back. I left more messages. Still no call back.

  • Let me pull the case and see what the status is. Looking at some of the cases you have opened, you might want to check your spam filter for replies from the support reps to make sure you are receiving the responses.