NAME
       memacctd - memory accounting daemon

SYNOPSIS
       memacctd  [  -hVnj1mdSB ] [ -T tolerance ] [ -r restart_interval ] [ -R
       restart_RSS ] [ -t timeout_msec ] [ -f fallback_percent ]

DESCRIPTION
       memacctd is a system daemon that intercepts memacct(3) calls to
       get_weighted_memory_size(), computes the RSS value for the requested
       process ID (pid), and stores the result in an internal cache before
       returning the value to the function.

OPTIONS
       -h | --help
              display this help message.

       -V | --version
              show version information.

       -n | --nodaemon
              do not detach from current session.

       -j | --job_only
              ignore processes without an assigned jid (see job(7)).
              Instead, return the system RSS as obtained via /proc/<pid>/stat.
              Job support: starting with SGI MPI 1.0, SGI job is no longer
              supplied, so the -j flag serves no purpose but is still
              accepted for compatibility.

              To  use this flag, the job module must be loaded and functional.
              In particular, the library /usr/lib/libjob.so must be present on
              the system. If not, the flag is ignored.

       -1 | --nopid1
              ignore entries with ppid=1.

       -m | --monitor
              monitor the following process events.

              -m     monitor process exit(), allowing memacctd to free the
                     resources associated with the exiting process id.

              -mm    same as -m, but also monitor the parent process id on
                     fork(), allowing memacctd to pre-load the cache with an
                     entry and hence likely increasing the cache hit rate.

              -mmm   same as -mm, but also monitor the child process id on
                     fork(), allowing memacctd to pre-load the cache with an
                     entry and hence likely increasing the cache hit rate.
                     NOTE: since each entry addition and deletion requires a
                     lock on the table, this monitoring option may in fact
                     result in slower responses, because many processes may
                     come and go, especially on a machine with an interactive
                     load profile.

              NOTE: Please read section ISSUES item 1 for more information.

       -d | --debug
              enable debug. Can be specified multiple times.

       -S | --no_signal
              append SIGUSR1 output to /var/run/memacctd.stats and ignore the
              SIGUSR2 signal.

       -B | --bypass
              Just return system RSS; do not compute the true RSS.

       -T | --tolerance
              tolerance in Kbytes. Default: 0. memacctd will re-compute the
              RSS if the difference between the current system RSS and the
              cached system RSS is greater than or equal to the value
              supplied.

       -r | --restart_interval
              restart interval in seconds. Default: 0 (disabled). At each
              interval, check whether the daemon's RSS exceeds restart_RSS
              (in Kbytes).

       -R | --restart_RSS
              RSS value (in Kbytes) that triggers an automatic daemon restart.
              Default: 16384 Kbytes. Note that memacctd is estimated to have
              an RSS of about 2 Mbytes when there are no entries in the table.

       -t | --timeout
              timeout (in milliseconds) to wait for the kernel ioctl()
              operation to complete. When a timeout occurs and there is no
              value in cache, the returned value is: system RSS x <fallback>
              / 100. Default: 20 milliseconds. NOTE: The environment variable
              MEMACCTD_TIMEOUT may be specified by this parameter. It is not
              to be confused with the MEMACCT_TIMEOUT variable.

       -f | --fallback
              when a timeout occurs and there is no value in cache, fake a
              trueRSS as <fallback> / 100 x system RSS. Default: 100. NOTE:
              The environment variable MEMACCTD_FALLBACK may be specified by
              this parameter. It is not to be confused with the
              MEMACCT_FALLBACK variable.

OVERVIEW
       The goal of memacctd is to cache, in a table, the weighted RSS value
       (see the BACKGROUND section of memacct(3)) computed via the kernel
       services. Further requests for a pid already in the table are
       satisfied from the table (hence, the cache) if the parent process
       (ppid) did not change and if the VmRSS reported in /proc/<pid>/stat
       did not change by more than a given tolerance.
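
       As an illustration only (not taken from the memacctd sources), the
       sketch below shows one way to read the ppid and system RSS of a
       process from /proc/<pid>/stat. The field positions follow proc(5);
       the function names parse_stat_line and read_system_rss are
       hypothetical, and the conversion from pages to Kbytes is an
       assumption about how the value would be compared with -T.

```python
import os


def parse_stat_line(stat_line, page_size_kb):
    """Return (pid, ppid, rss_kb) from one /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may itself contain spaces or ')',
    # so split on the LAST ')' before tokenising the remaining fields.
    head, _, tail = stat_line.rpartition(")")
    pid = int(head.split("(", 1)[0])
    fields = tail.split()
    # fields[0] is field 3 (state), so field N lives at index N - 3:
    # ppid is field 4, rss is field 24 (counted in pages, see proc(5)).
    ppid = int(fields[1])
    rss_pages = int(fields[21])
    return pid, ppid, rss_pages * page_size_kb


def read_system_rss(pid):
    """Read (pid, ppid, rss_kb) for a live process on a Linux system."""
    page_size_kb = os.sysconf("SC_PAGE_SIZE") // 1024
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat_line(f.read(), page_size_kb)
```

       For example, read_system_rss(os.getpid()) returns the values for the
       calling process.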

       memacct uses the get_weighted_memory_size() function when:

              1.     the parent pid of the current pid request is not equal to
                     the parent pid stored in cache. This covers pid
                     recycling.

              2.     the current system RSS value (as stored in
                     /proc/<pid>/stat) is not within the tolerance specified
                     by the command-line flag -T. If it is within the
                     tolerance, the following is returned instead:
                     cached computed RSS value x current system RSS / cached
                     system RSS.

                     This prevents unnecessary re-computation of the
                     computed RSS.

       In all other cases, memacctd returns the value stored in cache.
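
       The decision rules above can be sketched as follows. This is a
       minimal interpretation, not the daemon's code: the Entry class, the
       lookup function and the compute callback are hypothetical names, a
       Python dict stands in for the daemon's table, and an unchanged system
       RSS is treated as a plain cache hit.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    ppid: int
    system_rss: int    # Kbytes, as read from /proc/<pid>/stat
    computed_rss: int  # Kbytes, as computed via the kernel


def lookup(cache, pid, ppid, system_rss, tolerance, compute):
    """Return the weighted RSS for pid, re-computing only when needed."""
    entry = cache.get(pid)
    # A different ppid means the pid was recycled: the entry is stale.
    if entry is not None and entry.ppid == ppid:
        delta = abs(system_rss - entry.system_rss)
        if delta == 0:
            return entry.computed_rss
        if delta < tolerance and entry.system_rss > 0:
            # Within tolerance: scale the cached value instead of
            # asking the kernel again.
            return entry.computed_rss * system_rss // entry.system_rss
    computed = compute(pid)
    cache[pid] = Entry(ppid, system_rss, computed)
    return computed
```

       With the default tolerance of 0, any change in the system RSS forces
       a re-computation in this sketch.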

       get_weighted_memory_size() performs the request by writing the pid
       value to a UNIX file socket (/var/run/memacctd.socket). The request is
       read by memacctd and handled in a separate pthread, so multiple
       requests are handled asynchronously.

       When memacctd handles the request, it computes the trueRSS if it is
       not already in cache. If the value is not in cache and cannot be
       computed within MEMACCTD_TIMEOUT milliseconds, the returned value is
       computed using the -f specification. If the value is already in cache,
       the previous value is returned instead.
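
       The timeout behaviour described above can be illustrated with the
       sketch below. It assumes a worker thread stands in for the kernel
       request; the names value_on_timeout, compute_with_timeout and the
       compute callback are hypothetical and do not come from memacctd.

```python
import concurrent.futures


def value_on_timeout(cached_rss, system_rss, fallback_percent=100):
    """Value returned when the computation did not finish in time."""
    if cached_rss is not None:
        return cached_rss  # previous value from the cache
    return system_rss * fallback_percent // 100  # -f fallback


def compute_with_timeout(compute, pid, cached_rss, system_rss,
                         timeout_msec=20, fallback_percent=100):
    """Run compute(pid) in a thread and wait at most timeout_msec."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(compute, pid)
    try:
        return future.result(timeout=timeout_msec / 1000.0)
    except concurrent.futures.TimeoutError:
        return value_on_timeout(cached_rss, system_rss, fallback_percent)
    finally:
        # Do not block the caller on a computation that is still running.
        pool.shutdown(wait=False)
```

       With -f 50 and no cached value, a timed-out request for a process
       whose system RSS is 1000 Kbytes yields 500 Kbytes in this sketch.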

       If there are too many pthreads, memacctd will  perform  the  task  syn-
       chronously.

       With the default options, the memacctd cache table grows as processes
       are created on the system, until every possible pid number has been
       used. To work around this problem, the following combination of
       options may be used:

       -1     by not collecting any information for processes with a value  of
              PPID=1.

       -j     by  not  collecting  any  information  for  processes without an
              assigned jid ( See job(7) ).

       -m     by monitoring process exit() events (using the PF_NETLINK
              kernel process event connector facilities) and hence allowing
              memacctd to invalidate and free the entries for PIDs that no
              longer exist. Note that the "-m" option is handled in a
              separate pthread.

       -mm and -mmm
              by monitoring process exit() and fork() events and proactively
              adding and removing entries in the cache as processes "come
              and go".

              NOTE: Please read section ISSUES item 1 for more information.

              NOTE: fork and exit events are handled in the same separate
              pthread. Also note that a system with a high rate of fork() and
              exit() may keep memacctd very busy.

       -r interval
              by specifying a check interval (in seconds) at which the
              memacctd VmRSS is compared with the value specified via the -R
              parameter (in Kbytes). When it exceeds that value, the daemon
              is automatically restarted, exactly as if it had received a
              SIGHUP signal.

SIGNAL HANDLING
       memacctd responds to 4 signals:

       SIGHUP the daemon is restarted with the original  command-line  parame-
              ters.

       SIGQUIT
              dump the cache table and append it to /var/run/memacctd.stats.

       SIGUSR1
              dump detailed status information to stdout (if -n is active) or
              via syslog facilities.

       SIGUSR2
              increment the debug level by 1. After a value of 2, it is set
              back to 0.
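
       As a usage illustration, the sketch below sends one of these signals
       to the daemon using the pid recorded in /var/run/memacctd.pid. It
       assumes the file holds a single decimal pid; read_pid and
       signal_daemon are hypothetical helper names.

```python
import os
import signal

PID_FILE = "/var/run/memacctd.pid"


def read_pid(path=PID_FILE):
    """Read the daemon pid (assumed: one decimal number in the file)."""
    with open(path) as f:
        return int(f.read().strip())


def signal_daemon(sig, path=PID_FILE):
    """Send sig to the process whose pid is stored in path."""
    os.kill(read_pid(path), sig)


# Examples (need a running daemon and sufficient privileges):
#   signal_daemon(signal.SIGUSR1)  # dump status information
#   signal_daemon(signal.SIGQUIT)  # append cache table to memacctd.stats
#   signal_daemon(signal.SIGHUP)   # restart with the original parameters
```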

SIGUSR1 INFORMATION
       memacctd dumps information such as its uptime, the CPU time spent, and
       its current RSS. It also dumps the following statistical information:

       entries
              current number of entries. Each entry contains the quadruplet
              {pid,ppid,linux_rss,computed_rss}. The table is a binary tree
              keyed on pid.

       table size
              #entries * sizeof( entry )

       request
              number of requests received on the memacctd socket

       req_pipeErr
              number of times memacctd got an EPIPE from libmemacct. This
              occurs when the libmemacct get_weighted_memory_size() call
              times out after its MEMACCT_TIMEOUT, which is normally 2000
              milliseconds.

       req_timeout
              number  of  times memacctd timed out after MEMACCTD_TIMEOUT mil-
              liseconds.

       reqIoctl
              number of times memacctd performed a /proc/numatools ioctl()  to
              obtain the trueRss

       usec/reqIoctl
              average time spent (microseconds/req) performing the
              /proc/numatools ioctl() to obtain the trueRss

       #ioctl Thread
              current number of threads waiting for /proc/numatools ioctl() to
              complete

       #ioctl MaxThr
              max  #ioctl  Thread  number  ever  reached  since  memacctd  was
              started.

       #ioctlWhenMax
              date/time when the #ioctl MaxThr value was reached since
              memacctd was started.

       request/sec
              #request / memacctd uptime

       usec/req
              average time spent (microseconds/req) on each request,
              regardless of whether it was served from cache.

       ignored
              #entries ignored because '-1' is activated, because the request
              was about memacctd's own pgrp id, or because -j is activated
              and the pid has jid=-1. It does not count exit()ed pids that
              are not present in the table.

       insertions
              # of new entries, added either because they were requested
              through the memacctd socket or because the process monitor
              captured a fork() parent or child pid.

       deletions
              #  of entries deleted from the table. Those entries were already
              present in the table.  Entries are  deleted  upon  receiving  an
              exit() event ( activated by -m ).

       hits   Cache hits for the memacctd socket ONLY. A cache hit occurs
              when the requested entry is present in the table -and- its
              current ppid is the same -and- its current linux_rss is also
              the same.

       hits w/tol
              Number of entries included in the hits count that were within
              tolerance Kbytes.

       misses Cache misses for the memacctd socket ONLY. A cache miss is
              defined as a request for a pid already in the table whose
              current ppid differs from the one stored in the table (the pid
              has been recycled) -or- whose linux_rss has changed. In this
              case, a new computed_rss value must be computed.

       events # events handled by the process  monitor.  The  type  of  events
              depends on the '-m' flags used.

       events/sec
              # events / uptime

       hits events
              Cache  hit  for process monitor request ONLY.  The definition is
              the same as 'hits'.

       hits e/tol.
              Number of entries included in the hits events count that were
              within tolerance Kbytes.

       misses evnt
              Cache misses for process monitor request ONLY. The definition is
              the same as 'misses'.

SIGQUIT information
       memacctd dumps the cache table and appends it to
       /var/run/memacctd.stats.

              NOTE: There is no table when -B flag is used.

       It contains the following information:

       Depth  The entry's binary tree depth.

       PID    The pid. This is the binary tree key.

       PPID   The parent pid. This is used to determine  if  a  pid  has  been
              recycled or not.

       systemRSS
              The vmRss value obtained from /proc/<pid>/stat

       trueRSS
              The detailed computed RSS value obtained from the kernel. It
              should normally be less than the systemRSS. In rare cases it
              can be equal; an example is pid 1, i.e. init.

ISSUES
       1. memacctd -mm and -mmm command-line flags are deprecated
              Since version 1.3, memacctd ignores the -mm and -mmm flags,
              reducing process monitoring to the -m level. Enabling the -mm
              and -mmm options brought no useful benefit.

       2. memacctd -mm and -mmm not working well on SLES10
              On SLES10, when memacctd is told to monitor parent pid forking
              events (using the -mm flag), the kernel has been observed to
              return garbage values, creating a huge number of entries in
              the cache table. Those entries are not useful and make further
              requests take longer to search. Unfortunately, there is no way
              to detect this condition. For this reason, it is strongly
              recommended not to use -mm and -mmm on SLES10-based kernels.

       3. memacctd hung in kernel finish_stop()
              On SLES10 and SLES10SP1, when memacctd is under load and a
              "strace/Ctrl-C" sequence is repeatedly executed on the daemon,
              the daemon will appear hung. Sending a SIGCONT seems to unlock
              it. For this reason, it is strongly recommended not to strace
              the daemon.

FILES
       /var/run/memacctd.socket
              Requests are received on this Unix socket

       /var/run/memacctd.pid
              The file containing the process id of memacctd

       /var/run/memacctd.stats
              The file containing memacctd stats generated by SIGUSR1 when
              the -S flag is active. It also contains the RSS cache table
              when SIGQUIT is used.

COPYRIGHT
       Copyright © 2007-2010 Silicon Graphics Inc.

SEE ALSO
       memacct(3), ps(1), proc(5), job(7)

Version 1.3                    16 February, 2012                   MEMACCTD(8)
