Shutdown timeout on a stopped process

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Shutdown timeout on a stopped process

John Poth
Hi wrapper users,

When using the service wrapper on Apache Karaf 4.0.8, I noticed that the shutdown timeouts aren't respected when the underlying java process changes status from 'Running' to 'Stopped' (this can be triggered by using the kill -STOP karaf_java_pid) after the kill $wrapper_pid was called. This doesn't allow for graceful shutdown in this scenario. I noticed the following line in the attached logs:

ERROR  | wrapper  | 2017/03/21 11:15:23 | Shutdown failed: Timed out waiting for signal from JVM.

which seems to indicated the timeout setting I'm looking for. I was wondering if there was a way to configure this timeout? I've tried a couple with no success (see attached conf):

wrapper.jvm_cleanup.timeout=300
wrapper.shutdown.timeout=300
wrapper.ping.timeout=300
wrapper.startup_thread.timeout=300
wrapper.startup.timeout=300
wrapper.cpu.timeout=300
wrapper.jvm_exit.timeout=300
wrapper.ping.timeout.action=300

To reproduce, simply execute a kill $wrapper_pid and right after that a kill -STOP $java_pid and you'll see that the 300 seconds aren't taken into account. Oddly, If you do a kill -STOP $java_pid first and then kill $wrapper_pid you'll see the 300 seconds are taken into account.

I'd be happy to provide a full reproducer with Apache Karaf just let me know.

Thanks a lot,

John.

PS: the required info

Wrapper (Version 3.2.3)
Linux 4.7.3-200.fc24.x86_ GNU/Linux


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user

wrapper.log (16K) Download Attachment
karaf-wrapper.conf (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Shutdown timeout on a stopped process

Maxime
Hello John

Thank you for your email.

When requested to shutdown, the Wrapper asks the JVM to stop and waits for the JVM to respond that it is stopping, then wait until the process actually stopped.

If you send a SIGSTOP to the JVM right after killing the Wrapper, then JVM will go in a paused state and may not be able to respond to the Wrapper. In this case the Wrapper will simply wait until wrapper.shutdown.timeout exceed and then will shutdown.


If you leave a few milliseconds between "kill $wrapper_pid" and "kill -STOP $java_pid", then the JVM may have enough time to signal it is stopping but will eventually go in a pause state before it can actually shutdown. In this case the Wrapper will simply wait until wrapper.jvm_exit.timeout exceed and then will shutdown.


I could test both cases with the testwrapper sample application that is provided with the Wrapper, but I could not reproduce a case where any of these timeouts are ignored. If I wait longer between the two command, the JVM will have enough time stop cleanly and the Wrapper will stop normally right after.

case 1 (almost same time):
kill $(head -n 1 testwrapper.pid) & kill -STOP $(head -n 1 jvm.pid)

case 2 (manually execute the command one after the other ~200ms between them):
kill $(head -n 1 testwrapper.pid)
kill -STOP $(head -n 1 jvm.pid)

Could you also have a try with the testwrapper sample application? Approximately how long do you wait between the two commands?

PS: note that wrapper.ping.timeout.action=300 is not correct. This property should specify an action (RESTART, DEBUG, etc.) and not a number of seconds.

Regards,

Maxime

On Tue, Mar 21, 2017 at 9:13 PM, John Poth <[hidden email]> wrote:
Hi wrapper users,

When using the service wrapper on Apache Karaf 4.0.8, I noticed that the shutdown timeouts aren't respected when the underlying java process changes status from 'Running' to 'Stopped' (this can be triggered by using the kill -STOP karaf_java_pid) after the kill $wrapper_pid was called. This doesn't allow for graceful shutdown in this scenario. I noticed the following line in the attached logs:

ERROR  | wrapper  | 2017/03/21 11:15:23 | Shutdown failed: Timed out waiting for signal from JVM.

which seems to indicated the timeout setting I'm looking for. I was wondering if there was a way to configure this timeout? I've tried a couple with no success (see attached conf):

wrapper.jvm_cleanup.timeout=300
wrapper.shutdown.timeout=300
wrapper.ping.timeout=300
wrapper.startup_thread.timeout=300
wrapper.startup.timeout=300
wrapper.cpu.timeout=300
wrapper.jvm_exit.timeout=300
wrapper.ping.timeout.action=300

To reproduce, simply execute a kill $wrapper_pid and right after that a kill -STOP $java_pid and you'll see that the 300 seconds aren't taken into account. Oddly, If you do a kill -STOP $java_pid first and then kill $wrapper_pid you'll see the 300 seconds are taken into account.

I'd be happy to provide a full reproducer with Apache Karaf just let me know.

Thanks a lot,

John.

PS: the required info

Wrapper (Version 3.2.3)
Linux 4.7.3-200.fc24.x86_ GNU/Linux


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user
Reply | Threaded
Open this post in threaded view
|

Re: Shutdown timeout on a stopped process

John Poth
Hello Maxime!

Thank you for your answer and time. It turns out that it was a problem on the application side and not with the Wrapper. It looks like Apache Karaf was using org.tanukisoftware.wrapper.WrapperManager#signalStopping(5000*2) [1] which I'm not sure why it was done but I think it overrides the wrapper.shutdown.timeout timeout. I've submitted a pull request to have that changed to Karaf's shutdown timeout and it seems to work. I've tested the case where the timeout is exceeded and not.

Thanks again for everything,

John.



On Thu, Mar 23, 2017 at 5:01 AM, Maxime <[hidden email]> wrote:
Hello John

Thank you for your email.

When requested to shutdown, the Wrapper asks the JVM to stop and waits for the JVM to respond that it is stopping, then wait until the process actually stopped.

If you send a SIGSTOP to the JVM right after killing the Wrapper, then JVM will go in a paused state and may not be able to respond to the Wrapper. In this case the Wrapper will simply wait until wrapper.shutdown.timeout exceed and then will shutdown.


If you leave a few milliseconds between "kill $wrapper_pid" and "kill -STOP $java_pid", then the JVM may have enough time to signal it is stopping but will eventually go in a pause state before it can actually shutdown. In this case the Wrapper will simply wait until wrapper.jvm_exit.timeout exceed and then will shutdown.


I could test both cases with the testwrapper sample application that is provided with the Wrapper, but I could not reproduce a case where any of these timeouts are ignored. If I wait longer between the two command, the JVM will have enough time stop cleanly and the Wrapper will stop normally right after.

case 1 (almost same time):
kill $(head -n 1 testwrapper.pid) & kill -STOP $(head -n 1 jvm.pid)

case 2 (manually execute the command one after the other ~200ms between them):
kill $(head -n 1 testwrapper.pid)
kill -STOP $(head -n 1 jvm.pid)

Could you also have a try with the testwrapper sample application? Approximately how long do you wait between the two commands?

PS: note that wrapper.ping.timeout.action=300 is not correct. This property should specify an action (RESTART, DEBUG, etc.) and not a number of seconds.

Regards,

Maxime

On Tue, Mar 21, 2017 at 9:13 PM, John Poth <[hidden email]> wrote:
Hi wrapper users,

When using the service wrapper on Apache Karaf 4.0.8, I noticed that the shutdown timeouts aren't respected when the underlying java process changes status from 'Running' to 'Stopped' (this can be triggered by using the kill -STOP karaf_java_pid) after the kill $wrapper_pid was called. This doesn't allow for graceful shutdown in this scenario. I noticed the following line in the attached logs:

ERROR  | wrapper  | 2017/03/21 11:15:23 | Shutdown failed: Timed out waiting for signal from JVM.

which seems to indicated the timeout setting I'm looking for. I was wondering if there was a way to configure this timeout? I've tried a couple with no success (see attached conf):

wrapper.jvm_cleanup.timeout=300
wrapper.shutdown.timeout=300
wrapper.ping.timeout=300
wrapper.startup_thread.timeout=300
wrapper.startup.timeout=300
wrapper.cpu.timeout=300
wrapper.jvm_exit.timeout=300
wrapper.ping.timeout.action=300

To reproduce, simply execute a kill $wrapper_pid and right after that a kill -STOP $java_pid and you'll see that the 300 seconds aren't taken into account. Oddly, If you do a kill -STOP $java_pid first and then kill $wrapper_pid you'll see the 300 seconds are taken into account.

I'd be happy to provide a full reproducer with Apache Karaf just let me know.

Thanks a lot,

John.

PS: the required info

Wrapper (Version 3.2.3)
Linux 4.7.3-200.fc24.x86_ GNU/Linux


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Wrapper-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/wrapper-user