What is a quick and simple way to make sure that only one instance of a shell script is running at a given time?


Current answer

If the limitations of flock, which are described elsewhere in this thread, are not a problem for you, then this should work:

#!/bin/bash

{
    # exit if we are unable to obtain a lock; this would happen if 
    # the script is already running elsewhere
    # note: -x (exclusive) is the default
    flock -n 100 || exit

    # put commands to run here
    sleep 100
} 100>/tmp/myjob.lock 

Other answers

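The existing answers either rely on the CLI utility flock or do not properly secure the lock file. The flock utility is not available on all non-Linux systems (e.g. FreeBSD), and it does not work properly on NFS.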

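In my early days of system administration and system development, I was told that a safe and relatively portable way to create a lock file is to create a temp file with mkstemp(3) or mktemp(1), write identifying information (e.g. the PID) to the temp file, and then hard link the temp file to the lock file. If the link succeeds, you have obtained the lock.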

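When using locks in shell scripts, I typically place an obtain_lock() function in a shared profile and then source it from the scripts. Below is an example of my lock function: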

obtain_lock()
{
  LOCK="${1}"
  LOCKDIR="$(dirname "${LOCK}")"
  LOCKFILE="$(basename "${LOCK}")"

  # create temp lock file
  TMPLOCK=$(mktemp -p "${LOCKDIR}" "${LOCKFILE}XXXXXX" 2> /dev/null)
  if test "x${TMPLOCK}" = "x";then
     echo "unable to create temporary file with mktemp" 1>&2
     return 1
  fi
  echo "$$" > "${TMPLOCK}"

  # attempt to obtain lock file
  ln "${TMPLOCK}" "${LOCK}" 2> /dev/null
  if test $? -ne 0;then
     rm -f "${TMPLOCK}"
     echo "unable to obtain lockfile" 1>&2
     if test -f "${LOCK}";then
        echo "current lock information held by: $(cat "${LOCK}")" 1>&2
     fi
     return 2
  fi
  rm -f "${TMPLOCK}"

  return 0;
};

The following is an example of how to use the lock function:

#!/bin/sh

. /path/to/locking/profile.sh
PROG_LOCKFILE="/tmp/myprog.lock"

clean_up()
{
  rm -f "${PROG_LOCKFILE}"
}

obtain_lock "${PROG_LOCKFILE}"
if test $? -ne 0;then
   exit 1
fi
# POSIX sh expects signal names without the SIG prefix; exit after cleaning up
trap 'clean_up; exit 1' HUP INT TERM

# bulk of script

clean_up
exit 0
# end of script

Remember to call clean_up at every exit point in your script.

I have used the above in both Linux and FreeBSD environments.
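Since the lock file stores the holder's PID, a script that fails to get the lock can check whether that process is still alive. The sketch below is not part of the approach above: lock_is_stale is a hypothetical helper name, and it assumes the lock holder runs as the same user (kill -0 also fails for a process you are not allowed to signal) and that the PID has not been reused. Removing a lock that looks stale is itself racy, so treat the result as a hint rather than as safe automatic recovery.

```shell
# lock_is_stale LOCKFILE: hypothetical helper, not part of the answer above.
# Returns 0 only if LOCKFILE exists and the PID stored in it is not running.
lock_is_stale()
{
  stale_lock="${1}"
  test -f "${stale_lock}" || return 1      # no lock file, nothing to call stale
  stale_pid="$(cat "${stale_lock}" 2> /dev/null)"
  case "${stale_pid}" in
    ''|*[!0-9]*) return 1 ;;               # empty or non-numeric content: be conservative
  esac
  # kill -0 delivers no signal; it only reports whether the process can be signalled
  if kill -0 "${stale_pid}" 2> /dev/null; then
    return 1                               # holder is still alive
  fi
  return 0
}
```

For example, after obtain_lock fails: `lock_is_stale "${PROG_LOCKFILE}" && echo "lock looks stale" 1>&2`.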

I have a simple solution based on the file name:

#!/bin/bash

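# Name of this script, as it appears in the process list
MY_FILENAME=$(basename "${BASH_SOURCE[0]}")

# Count processes whose command line mentions this script,
# excluding the grep itself and our own PID
MY_PROCESS_COUNT=$(ps a -o pid,cmd | grep -- "$MY_FILENAME" | grep -v grep | grep -vw "$$" | wc -l)

if [ "$MY_PROCESS_COUNT" -ne 0 ]; then
  echo "found another process"
  exit 0
fi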

# Follows the code to get the job done.

For shell scripts, I tend to go with mkdir over flock, as it makes the locks more portable.
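What makes mkdir usable as a lock is that it either creates the directory or fails because the directory already exists, in a single step, so two instances cannot both succeed. A minimal sketch of that idea; the function names and the demo lock path are made up for illustration:

```shell
# Hypothetical helper names; the lock path is chosen by the caller.

# acquire_dir_lock DIR: succeeds only for the one caller that creates DIR.
acquire_dir_lock()
{
  mkdir "${1}" 2> /dev/null
}

# release_dir_lock DIR: removes the (empty) lock directory.
release_dir_lock()
{
  rmdir "${1}"
}

# Example: take the lock, do the work, release it.
demo_lock="${TMPDIR:-/tmp}/mkdir-lock-demo.$$"
if acquire_dir_lock "${demo_lock}"; then
  echo "lock acquired"
  release_dir_lock "${demo_lock}"
else
  echo "another instance holds the lock" 1>&2
fi
```

This sketch releases the lock only on the normal path; it does nothing about signals or early exits.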

Either way, using set -e is not enough. That only exits the script if a command fails. Your locks will still be left behind.

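For proper lock cleanup, you really should set your traps to something like this pseudocode (lifted, simplified and untested, but from actively used scripts):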

#=======================================================================
# Predefined Global Variables
#=======================================================================

TMP_DIR=/tmp/myapp
[[ ! -d $TMP_DIR ]] \
    && mkdir -p "$TMP_DIR" \
    && chmod 700 "$TMP_DIR"

LOCK_DIR=$TMP_DIR/lock

#=======================================================================
# Functions
#=======================================================================

function mklock {
    __lockdir="$LOCK_DIR/$(date +%s.%N).$$" # Private Global. Use Epoch.Nano.PID

    # If it can create $LOCK_DIR then no other instance is running
    if mkdir "$LOCK_DIR"
    then
        mkdir "$__lockdir"  # create this instance's specific lock in queue
        LOCK_EXISTS=true    # Global
    else
        echo "FATAL: Lock already exists. Another copy is running or manual lock cleanup is required."
        exit 1001  # Or work out some sleep_while_execution_lock elsewhere
    fi
}

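function rmlock {
    if [[ ! -d $__lockdir ]]; then
        echo "WARNING: Lock is missing. $__lockdir does not exist"
    else
        rmdir "$__lockdir"
    fi
    # Remove the parent lock directory too, otherwise the next run cannot mkdir it
    rmdir "$LOCK_DIR" 2>/dev/null
}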

#-----------------------------------------------------------------------
# Private Signal Traps Functions
#
# DANGER: SIGKILL cannot be trapped. So, try not to `kill -9 PID` or 
#         there will be *NO CLEAN UP*. You'll have to manually remove 
#         any locks in place.
#-----------------------------------------------------------------------
function __sig_exit {

    # Place your clean up logic here 

    # Remove the LOCK
    [[ -n $LOCK_EXISTS ]] && rmlock
}

function __sig_int {
    echo "WARNING: SIGINT caught"    
    exit 1002
}

function __sig_quit {
    echo "SIGQUIT caught"
    exit 1003
}

function __sig_term {
    echo "WARNING: SIGTERM caught"    
    exit 1015
}

#=======================================================================
# Main
#=======================================================================

# Set TRAPs
trap __sig_exit EXIT    # SIGEXIT
trap __sig_int INT      # SIGINT
trap __sig_quit QUIT    # SIGQUIT
trap __sig_term TERM    # SIGTERM

mklock

# CODE

exit # No need for cleanup code here being in the __sig_exit trap function

Here is what will happen. All traps end in an exit, so the function __sig_exit will always run (barring a SIGKILL), and it cleans up your locks.
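The chain "signal trap calls exit, exit fires the EXIT trap" can be checked in isolation. The following is a small self-contained sketch, not taken from the scripts above: run_trap_demo and DEMO_LOCK are hypothetical names, and the child shell simply sends SIGTERM to itself while holding a mkdir lock in a throwaway directory.

```shell
# run_trap_demo: hypothetical demo function. It starts a child shell that
# takes a mkdir lock, sends itself SIGTERM, and leaves cleanup to the EXIT trap.
run_trap_demo()
{
  demo_dir="$(mktemp -d)" || return 1
  DEMO_LOCK="${demo_dir}/lock" sh -c '
    cleanup() { rmdir "$DEMO_LOCK" && echo "EXIT trap removed the lock"; }
    trap cleanup EXIT
    trap "echo \"TERM caught\"; exit 143" TERM
    mkdir "$DEMO_LOCK" || exit 1
    kill -TERM $$
    echo "not reached"
  ' || true                       # the child exits non-zero on purpose
  # Report whether the child left its lock directory behind
  if test -d "${demo_dir}/lock"; then
    echo "lock left behind"
  fi
  rm -rf "${demo_dir}"
}

run_trap_demo
```

With bash or dash as sh, the TERM handler should run first and the EXIT trap right after it, without reaching the final echo in the child.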

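Note: my exit values are not low values. Why? Various batch processing systems produce or expect the numbers 0 through 31. By setting mine to something else, I can have my scripts and batch streams react accordingly to the previous batch job or script. Keep in mind that the caller only sees the low 8 bits of an exit status, so in bash exit 1001 is reported as 233.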
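A quick way to see what a caller actually observes for these values is to run exit in a child bash and print $?. This is only an illustrative check; observed_status is a hypothetical helper name.

```shell
# observed_status N: hypothetical helper that runs "exit N" in a child bash
# and prints the status the parent actually sees.
observed_status()
{
  observed=0
  bash -c "exit ${1}" || observed=$?
  echo "${observed}"
}

for code in 1001 1002 1003 1015; do
  echo "exit ${code} is seen as $(observed_status "${code}")"
done
```

Since the status is taken modulo 256, the four values used above come back as 233, 234, 235 and 247, which still lie outside the 0 to 31 range.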

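There is a wrapper around the flock(2) system call called, unimaginatively, flock(1). It makes it relatively easy to reliably obtain exclusive locks without worrying about cleanup and the like. The man page has examples of how to use it in a shell script.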
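One of the forms flock(1) supports wraps a command directly: it takes the lock file and the command as arguments, holds the lock while the command runs, and releases it when the command exits. A small sketch, assuming the util-linux flock(1) is installed; run_once is a hypothetical wrapper name.

```shell
# run_once LOCKFILE COMMAND [ARGS...]: hypothetical wrapper around flock(1).
# Runs COMMAND only if LOCKFILE is not already locked by another process.
run_once()
{
  run_once_lockfile="${1}"
  shift
  # -n: give up immediately (non-zero status) instead of waiting for the lock
  flock -n "${run_once_lockfile}" "$@"
}
```

Usage could look like `run_once /tmp/myjob.lock sleep 100 || echo "not run, or the command failed" 1>&2`. A non-zero status can come from the lock being held or from the command itself, so the two cases are not distinguished here.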

An example with flock(1) but without a subshell. The flock()ed file /tmp/foo is never removed, but that does not matter, since it only gets flock()ed and un-flock()ed.

#!/bin/bash

exec 9<> /tmp/foo
# -n: fail immediately instead of waiting if another instance holds the lock
if ! flock -n 9; then
    echo "lock failed, exiting"
    exit 1
fi

#Now we are inside the "critical section"
echo "inside lock"
sleep 5
exec 9>&- #close fd 9, and release lock

#The part below is outside the critical section (the lock)
echo "lock released"
sleep 5