Add "cmdline" column to "threads" command #249

Merged
merged 1 commit into from
Oct 19, 2020

prakashsurya
Contributor

No description provided.

@prakashsurya
Contributor Author

Example:

sdb> threads | tail 50 | threads
task               state         pid   prio comm            cmdline
------------------ ------------- ----- ---- --------------- ------------------------------------------------
0xffff8c9532921740 INTERRUPTIBLE 8171  120  sdb             /usr/bin/python3 /usr/bin/sdb
0xffff8c9532922e80 RUNNING       8170  120  sdb             /usr/bin/python3 /usr/bin/sdb
0xffff8c9542750000 INTERRUPTIBLE 28559 120  vmstat          vmstat -tw 60 60
0xffff8c9542751740 INTERRUPTIBLE 28554 120  iostat          iostat -dxzt 60 60
0xffff8c954efaae80 INTERRUPTIBLE 28590 120  zpool           zpool iostat -vw 60 60
0xffff8c959aaa2e80 INTERRUPTIBLE 8168  120  sudo            sudo sdb
0xffff8c959aaa5d00 INTERRUPTIBLE 8141  120  (sd-pam)        (sd-pam)
0xffff8c9603332e80 INTERRUPTIBLE 11590 120  CloseDspConnect /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c96033345c0 INTERRUPTIBLE 20037 120  CloseDspConnect /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9603335d00 INTERRUPTIBLE 11373 120  EnvironmentMoni /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c966a719740 INTERRUPTIBLE 8152  120  bash            bash
0xffff8c966afd5d00 INTERRUPTIBLE 8137  120  sshd            sshd: delphix [priv]
0xffff8c966f9a0000 INTERRUPTIBLE 5610  120  http-nio-127.0. /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c966f9a45c0 INTERRUPTIBLE 22102 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c966f9a5d00 INTERRUPTIBLE 5613  120  http-nio-127.0. /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c968daf45c0 INTERRUPTIBLE 11604 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9691f69740 INTERRUPTIBLE 8151  120  sshd            sshd: delphix@pts/0
0xffff8c96988345c0 INTERRUPTIBLE 5662  120  oom_waiter      /opt/delphix/server/bin/oom_waiter
0xffff8c969c1e0000 INTERRUPTIBLE 11374 120  CloseDspConnect /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c969c1e1740 INTERRUPTIBLE 20050 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c969c1e2e80 INTERRUPTIBLE 12680 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c969c9c1740 INTERRUPTIBLE 5599  120  http-nio-127.0. /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c96a40d2e80 INTERRUPTIBLE 20036 120  EnvironmentMoni /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c96a7c20000 INTERRUPTIBLE 5651  120  native_oom_hand /bin/bash [...]
0xffff8c9706d08000 INTERRUPTIBLE 7952  120  sleep           sleep 60
0xffff8c9706d0ae80 INTERRUPTIBLE 7948  120  sleep           sleep 60
0xffff8c9708152e80 IDLE          8259  120  kworker/1:0
0xffff8c9708155d00 INTERRUPTIBLE 7983  120  delphix-startup /usr/bin/python3 /usr/bin/delphix-startup-screen
0xffff8c970a4f5d00 INTERRUPTIBLE 8050  120  sleep           sleep 60
0xffff8c970b558000 INTERRUPTIBLE 28558 120  timeout         timeout --kill-after 5s 7200s vmstat -tw 60 60
0xffff8c970b55dd00 INTERRUPTIBLE 28589 120  timeout         timeout --kill-after 5s 7200s zpool iostat [...]
0xffff8c970de58000 INTERRUPTIBLE 8044  120  sleep           sleep 60
0xffff8c970f148000 INTERRUPTIBLE 8348  120  sleep           sleep 5
0xffff8c9710082e80 INTERRUPTIBLE 28553 120  timeout         timeout --kill-after 5s 7200s iostat -dxzt 60 60
0xffff8c97105e8000 INTERRUPTIBLE 7370  120  sleep           sleep 600
0xffff8c97105eae80 INTERRUPTIBLE 8104  120  sleep           sleep 60
0xffff8c97105ec5c0 INTERRUPTIBLE 8054  120  sleep           sleep 60
0xffff8c9714e29740 IDLE          18631 120  kworker/u4:0
0xffff8c9716dd0000 INTERRUPTIBLE 11589 120  EnvironmentMoni /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9716dd45c0 INTERRUPTIBLE 11381 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9716dd5d00 INTERRUPTIBLE 15866 120  worker-manager- /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c971924ae80 INTERRUPTIBLE 7953  120  delphix-startup /usr/bin/python3 /usr/bin/delphix-startup-screen
0xffff8c97196a9740 IDLE          8276  120  kworker/1:3
0xffff8c97196ac5c0 IDLE          8274  120  kworker/1:1
0xffff8c971de0ae80 INTERRUPTIBLE 8140  120  systemd         /lib/systemd/systemd --user
0xffff8c971de9ae80 IDLE          6820  120  kworker/0:0
0xffff8c9722300000 INTERRUPTIBLE 5600  120  http-nio-127.0. /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9722301740 INTERRUPTIBLE 5601  120  http-nio-127.0. /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9722302e80 INTERRUPTIBLE 5602  120  Catalina-utilit /usr/bin/java -Ddelphix.debug=true [...]
0xffff8c9726e10000 IDLE          8256  120  kworker/u4:3

@sdimitro
Contributor

Thanks for working on this! It looks great.

This PR unfortunately will continue to get test failures until we merge this one -> #246
As for testing, I wouldn't worry too much about it since the command should already be covered. That said, since the output of threads changed, we should still be getting some failures. To update the expected test output, consult this wiki page -> https://github.com/delphix/sdb/wiki/Integration-Tests

I also opened an issue for grabbing a 2nd regression dump at some point that includes user-pages for this to be tested more thoroughly: #250

@prakashsurya
Contributor Author

prakashsurya commented Oct 12, 2020

Sure, I haven't worried about getting the tests to pass. I figured I'd wait to get some feedback, and see if folks think this is a good approach to take.

I'm a little unsure how best to accomplish this, since as-is, we won't get the full command line for any threads with long command lines. I don't like limiting to 50 characters, since that may lead us to not get the information we need. I also don't like the arbitrary limit of 50 characters that I'm using right now (e.g. why not 40? or 60? or anything else?).

Further, if this approach seems reasonable and we keep an arbitrary default limit (like the 50 chars in use now), it'd probably be best to allow the user to override it to get the full command line(s). For example, I was thinking about a -v and/or --verbose flag that could be used. The problem then is that the table formatting is awkward: the cmdline column is sized to the longest command line in the table, which adds lots of wasted blank characters for each thread, since most threads will not have such a long command line in that column.
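To make the trade-off concrete, here is a minimal sketch of a default truncation with a verbose override. It is not the code from this PR: `format_cmdline`, `DEFAULT_CMDLINE_WIDTH`, and the `verbose` parameter are hypothetical names, and using the standard library's `textwrap.shorten` is an assumption (its default `" [...]"` placeholder happens to match the example output above).

```python
import textwrap

# Hypothetical constant: the arbitrary 50-character default discussed above.
DEFAULT_CMDLINE_WIDTH = 50


def format_cmdline(cmdline: str, verbose: bool = False,
                   width: int = DEFAULT_CMDLINE_WIDTH) -> str:
    """Return the command line unchanged when verbose, else shortened."""
    if verbose:
        return cmdline
    # textwrap.shorten collapses runs of whitespace and drops trailing
    # words until the text plus the placeholder fits within `width`.
    return textwrap.shorten(cmdline, width=width, placeholder=" [...]")


if __name__ == "__main__":
    print(format_cmdline("sleep 60"))
    print(format_cmdline("timeout --kill-after 5s 7200s zpool iostat -vw 60 60"))
```

Truncating at word boundaries keeps the visible part readable, at the cost of sometimes showing fewer than 50 characters.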

@sdimitro
Contributor

So here are 2 ideas that are not necessarily mutually exclusive:

[1] We can have a pargs command that internally does a find_task and then calls drgn.cmdline to get the full arguments

[2] We can augment threads to be more flexible, like slabs or spl_kmem_caches (see: https://github.com/delphix/sdb/blob/master/sdb/commands/linux/slabs.py for example). The idea is that we have some good default fields, and then people can request more fields with -o and/or sort the entries with -s if they want to. In our case, if the user needs to see the args, they can request that field explicitly, at which point we won't be enforcing the 50 character limit.
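As a rough illustration of idea [2], the sketch below renders a table from a caller-selected list of fields with an optional sort key. It is independent of sdb's real table code: `render`, `DEFAULT_FIELDS`, and the dictionary rows are hypothetical stand-ins, and the mapping of `fields` and `sort_by` to `-o` and `-s` is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical row type; in sdb the values would come from task_struct objects.
Row = Dict[str, object]

# Hypothetical defaults: cmdline is only shown when explicitly requested.
DEFAULT_FIELDS = ["task", "state", "pid", "prio", "comm"]


def render(rows: List[Row], fields: Optional[List[str]] = None,
           sort_by: Optional[str] = None) -> str:
    """Render rows as a left-justified table limited to the chosen fields."""
    fields = fields or DEFAULT_FIELDS
    if sort_by is not None:
        rows = sorted(rows, key=lambda r: r[sort_by])
    # Each column is as wide as its longest cell (or its header).
    widths = {
        f: max([len(f)] + [len(str(r[f])) for r in rows]) for f in fields
    }
    lines = [" ".join(f.ljust(widths[f]) for f in fields).rstrip()]
    for r in rows:
        cells = (str(r[f]).ljust(widths[f]) for f in fields)
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        {"pid": 28559, "comm": "vmstat", "cmdline": "vmstat -tw 60 60"},
        {"pid": 8171, "comm": "sdb", "cmdline": "/usr/bin/python3 /usr/bin/sdb"},
    ]
    print(render(demo, fields=["pid", "comm", "cmdline"], sort_by="pid"))
```

Because the caller opts in to the `cmdline` field, the default view can stay narrow.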

Thoughts?

@prakashsurya
Contributor Author

I think both are good ideas, actually.

My only concern with (2) is that it still means the tabular output will be awkward when encountering long command line fields. Here's what I mean:
[Image: Screenshot from 2020-10-12 14-36-14]

Although, it does look OK when line wrapping is not a factor:

sdb> threads | tail 40 | head 3 | threads
task               state         pid   prio comm            cmdline

0xffff8c96033345c0 INTERRUPTIBLE 20037 120  CloseDspConnect /usr/bin/java -Ddelphix.debug=true -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -server -d64 -enableassertions -XX:MaxMetaspaceSize=512m -XX:MetaspaceSize=16m -XX:MaxMetaspaceFreeRatio=40 -XX:CompressedClassSpaceSize=256m -Xfuture -Xmx2g -Xms2g -Xss512k -XX:+UseCompressedOops -XX:InlineSmallCode=500 -XX:-OmitStackTraceInFastThrow -XX:+PreserveFramePointer -javaagent:/opt/delphix/server/lib/exec/tomcat-launcher/libs/com.google.code.java-allocation-instrumenter/java-allocation-instrumenter-3.0.jar=manualOnly -XX:ErrorFile=/var/delphix/server/log/hs_err_pid%p.log -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/crash -Dcom.sun.management.jmxremote -Ddelphix.group=0 -Ddelphix.server.root=/opt/delphix/server -Ddelphix.user=0 -Djava.security.properties==/opt/delphix/server/etc/java.security -Djna.dump_memory=true -Dlog.dir=/var/delphix/server/log -Dsun.net.inetaddr.negative.ttl=0 -Dsun.net.inetaddr.ttl=0 -Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true -Djdk.tls.ephemeralDHKeySize=2048 -Djava.security.auth.login.config=/opt/delphix/server/etc/krbjaas.conf -Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext -Djavax.security.auth.useSubjectCredsOnly=false -jar /opt/delphix/server/lib/exec/tomcat-launcher/tomcat-launcher.jar /var/tmp /opt/delphix/server/lib/module/login.war /opt/delphix/server/lib/module/dxcore.war /opt/delphix/server/lib/module/styleguide.war /opt/delphix/server/lib/module/dxtest.war /opt/delphix/server/lib/module/api-json.war /opt/delphix/server/lib/module/resources.war /opt/delphix/server/lib/module/ROOT.war /opt/delphix/server/lib/module/jetstream.war /opt/delphix/server/lib/module/api.war /opt/delphix/server/lib/module/connector.war
0xffff8c969c1e1740 INTERRUPTIBLE 20050 120  worker-manager- /usr/bin/java -Ddelphix.debug=true -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -server -d64 -enableassertions -XX:MaxMetaspaceSize=512m -XX:MetaspaceSize=16m -XX:MaxMetaspaceFreeRatio=40 -XX:CompressedClassSpaceSize=256m -Xfuture -Xmx2g -Xms2g -Xss512k -XX:+UseCompressedOops -XX:InlineSmallCode=500 -XX:-OmitStackTraceInFastThrow -XX:+PreserveFramePointer -javaagent:/opt/delphix/server/lib/exec/tomcat-launcher/libs/com.google.code.java-allocation-instrumenter/java-allocation-instrumenter-3.0.jar=manualOnly -XX:ErrorFile=/var/delphix/server/log/hs_err_pid%p.log -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/crash -Dcom.sun.management.jmxremote -Ddelphix.group=0 -Ddelphix.server.root=/opt/delphix/server -Ddelphix.user=0 -Djava.security.properties==/opt/delphix/server/etc/java.security -Djna.dump_memory=true -Dlog.dir=/var/delphix/server/log -Dsun.net.inetaddr.negative.ttl=0 -Dsun.net.inetaddr.ttl=0 -Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true -Djdk.tls.ephemeralDHKeySize=2048 -Djava.security.auth.login.config=/opt/delphix/server/etc/krbjaas.conf -Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext -Djavax.security.auth.useSubjectCredsOnly=false -jar /opt/delphix/server/lib/exec/tomcat-launcher/tomcat-launcher.jar /var/tmp /opt/delphix/server/lib/module/login.war /opt/delphix/server/lib/module/dxcore.war /opt/delphix/server/lib/module/styleguide.war /opt/delphix/server/lib/module/dxtest.war /opt/delphix/server/lib/module/api-json.war /opt/delphix/server/lib/module/resources.war /opt/delphix/server/lib/module/ROOT.war /opt/delphix/server/lib/module/jetstream.war /opt/delphix/server/lib/module/api.war /opt/delphix/server/lib/module/connector.war
0xffff8c971dcc45c0 IDLE          21007 120  kworker/u4:1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
sdb>

@sdimitro
Contributor

Whoa! Ok I vastly underestimated the length of args that we generally see. I'm ok with the 50 limit for now as long as we can still have something like pargs for users that want the complete list of args.
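For reference, a pargs-style listing could look roughly like the sketch below. It only covers the formatting step and assumes the arguments have already been obtained as a list of byte strings; `format_pargs` is a hypothetical helper, not an existing sdb or drgn function.

```python
from typing import Iterable, List


def format_pargs(argv: Iterable[bytes]) -> List[str]:
    """Format each argument on its own line, pargs-style."""
    lines = []
    for i, arg in enumerate(argv):
        # Arguments are raw bytes from the target's memory; replace
        # undecodable sequences rather than failing on them.
        lines.append(f"argv[{i}]: {arg.decode('utf-8', errors='replace')}")
    return lines


if __name__ == "__main__":
    for line in format_pargs([b"zpool", b"iostat", b"-vw", b"60", b"60"]):
        print(line)
```

Printing one argument per line sidesteps the column-width problem entirely, since no table has to accommodate the longest command line.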

@prakashsurya prakashsurya force-pushed the threads-cmdline branch 2 times, most recently from 4f1ff11 to 775f900 Compare October 13, 2020 17:37
@prakashsurya prakashsurya changed the title WIP: Add "cmdline" column to "threads" command Add "cmdline" column to "threads" command Oct 13, 2020
@prakashsurya
Contributor Author

@sdimitro OK, sounds good. Perhaps then, let's move forward with this, and we can always improve on it later (e.g. add the new pargs command, and/or implement your suggestions w.r.t. -s and -o support).

@codecov-io

codecov-io commented Oct 13, 2020

Codecov Report

Merging #249 into master will increase coverage by 0.00%.
The diff coverage is 87.50%.

@@           Coverage Diff           @@
##           master     #249   +/-   ##
=======================================
  Coverage   87.37%   87.37%           
=======================================
  Files          60       60           
  Lines        2488     2496    +8     
=======================================
+ Hits         2174     2181    +7     
- Misses        314      315    +1     
Impacted Files Coverage Δ
sdb/commands/threads.py 96.42% <87.50%> (-3.58%) ⬇️


@prakashsurya prakashsurya merged commit f0aa360 into delphix:master Oct 19, 2020