
Multiple simultaneous Serv-U server crashes (+high memory/RAM)

Hello all,

We had a very strange incident yesterday (Sunday 22nd February 2015 around 11.20AM GMT) with multiple Serv-U servers.

Our alerting systems started telling us that FTP and the HTTP/HTTPS web interface on multiple servers were inaccessible. I logged straight on to find that they had indeed all locked up. Further investigation showed that every affected server had serv-u.exe consuming around 1000 MB of RAM. Long experience with Serv-U tells me that when this happens it has crashed and is stuck in some kind of loop, or has a memory leak.

I've seen this happen before, and the service needs to be restarted manually. However, this time it happened across 5 servers, all around the same time: 3 straight away, 1 shortly after (30 minutes or so), and then we pre-empted another and restarted the process quickly.

3 of the servers use network storage and two use local storage, so it seems unlikely that storage was the cause. The versions also vary, so if it is a bug, it is present even in the latest build.

I'm intrigued to know if anyone else had such an issue yesterday, or has had it in the past. I know various memory leaks have been fixed in Serv-U over the years, but this seems to be a much rarer and more severe issue.

Server, switch and storage logs didn't show anything around that time that could have disrupted Serv-U, so it looks to be a direct issue with the program. I am, however, open to any suggestions, of course.

Any help or insight appreciated!

8 Replies

The fact that all 5 died about the same time is strange.  Are all of them taken on & offline as a group?

If they are all taken on & offline as a group, and there is a constant leak, that might make sense.

Do all servers get the same traffic?  Or are they in a load balanced configuration?

What kind of monitoring system do you use?  What is its poll rate?


Thanks for your reply Josh, here are my responses:

The fact that all 5 died about the same time is strange.  Are all of them taken on & offline as a group?
No, they all run independently.

If they are all taken on & offline as a group, and there is a constant leak, that might make sense.

Do all servers get the same traffic?  Or are they in a load balanced configuration?
Each has its own configuration; they are not load balanced.

What kind of monitoring system do you use?  What is its poll rate?
The system monitors response times plus up/down state and alerts. Poll rate is every 60 seconds. Monitoring worked perfectly and correctly identified the issue across all servers at the time they became unavailable.
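For context, the kind of up/down check our monitor performs can be sketched in a few lines of Python. The host names, ports and timeout below are placeholders, not our real configuration:

```python
import socket

# Illustrative sketch of a 60-second up/down poll: try a TCP connection
# to each service port and report success/failure per host.
SERVERS = ["ftp1.example.com", "ftp2.example.com"]  # placeholder hosts
PORTS = (21, 443)            # FTP and the HTTPS web interface
POLL_INTERVAL_SECONDS = 60   # matches the poll rate described above

def service_up(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def poll_once():
    """One polling pass: {host: {port: up?}} for every monitored service."""
    return {host: {port: service_up(host, port) for port in PORTS}
            for host in SERVERS}
```

A real monitor would also time the connection attempt to get the response-time figure, and raise an alert when a check flips from up to down.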



A very strange one, isn't it! I thought it was a prime candidate for discussion on Thwack!


Any further thoughts appreciated. My main focus is the huge memory allocation by the serv-u.exe process on all servers, which ultimately caused them all to crash.


If all of your machines are independent, it sounds like the most significant commonality between them is the monitoring.  Are there any other commonalities, such as being in the same rack, or hosted on the same VM controller?

Are you able to detect a constant leak?  You can use the charting abilities of Performance Monitor to view memory usage over time.

If you can detect the leak over time, what happens if you switch off monitoring?  What if you move it to a different rack?

You may need to open a ticket, so this situation can be examined closer.


Thanks for your reply. They are on separate VMs on separate hosts, and no other issue like this has occurred with other software. In fact, each VM was running fine apart from Serv-U's leak.

Our monitoring showed no gradual incline; they ALL just went bang at that time.

It's a real mystery!


Wait, you mean that memory usage was at normal levels, and then ALL of them grabbed several hundred MB of memory at the same time?!


Yes, exactly that! I've never seen anything like it. All independent, all at the same time. This made me think it was something Serv-U-specific and time/date related?


I have never heard of anything like that.  That is probably not a memory leak, as a leak would happen over the lifetime of the process, not all of a sudden.
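That distinction shows up clearly in sampled memory data. As a sketch (with illustrative thresholds, not figures from this thread), here is one way to classify a series of per-minute serv-u.exe memory readings as a gradual leak versus a sudden grab:

```python
def classify_memory_growth(samples, jump_mb=200.0):
    """Classify (minute, rss_mb) samples of a process's memory usage.

    Returns "sudden_jump" if any single-step increase exceeds jump_mb
    (what was described in this thread), "gradual_leak" if usage climbs
    steadily without such a jump, otherwise "stable". The 200 MB jump
    threshold is illustrative, not a Serv-U-specific number.
    """
    deltas = [b[1] - a[1] for a, b in zip(samples, samples[1:])]
    if any(d >= jump_mb for d in deltas):
        return "sudden_jump"
    total_growth = samples[-1][1] - samples[0][1]
    if total_growth > 0 and all(d >= 0 for d in deltas):
        return "gradual_leak"
    return "stable"
```

Fed with the Performance Monitor data suggested earlier, a classic leak would show a steady positive slope, while the incident described here would show one enormous step.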

I can only think of 2 logical explanations:

1) You had a massive traffic spike that hit some part of Serv-U and triggered some massive allocations.  You could probably check your server/router logs to see if that was the case.  It would have to be a huge spike, given the volume of memory allocated, and it would also have to be coordinated to hit all the servers at the same time.  Maybe a DDoS?

2) There was some kind of task on the network that either caused a very bad error in Serv-U, or forced Serv-U to allocate memory.  Forcing a process to allocate memory is possible, but it requires very high permissions (debug level, I think).  I would investigate whether you had any huge failures (like a RAID/SAN catching fire or exploding) or invasive tasks (AV scan, vulnerability scan, firewall install, etc.) scheduled at the same time.
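For option 1, a quick way to eyeball the logs for a spike is to bucket entries per minute. This sketch assumes each log line begins with an HH:MM:SS timestamp, which may not match the real Serv-U log format, and the threshold is purely illustrative:

```python
import re
from collections import Counter

# Hypothetical format: lines starting with "HH:MM:SS"; adjust the pattern
# to the actual log layout before relying on this.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}):\d{2}")

def connections_per_minute(lines):
    """Count log entries per HH:MM minute bucket."""
    counts = Counter()
    for line in lines:
        m = TIMESTAMP.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def spike_minutes(lines, threshold=500):
    """Minutes whose entry count exceeds threshold (illustrative value)."""
    return [minute for minute, n in connections_per_minute(lines).items()
            if n > threshold]
```

Running this over the logs from around 11:20 on each server would show whether the memory grab coincided with a coordinated burst of connections, as a DDoS would produce.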

My gut says it is not date/time related, as a lot more people would be noticing it.  But it is possible.


Thanks for your ideas. We didn't see anything initially, and all other services were fine, which is why I posted here.

I doubt anyone will post about this now unless it happens again, but at least it is documented.

Thanks again.
