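I want to provide a structured configuration file that is as easy as possible for a non-technical user to edit (unfortunately it has to be a file), so I wanted to use YAML. However, I can't find any way of parsing it from a Unix shell script.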


Current answer

I know my answer is specific, but if you already have PHP and Symfony installed, it can be very handy to use Symfony's YAML parser.

For example:

php -r "require '$SYMFONY_ROOT_PATH/vendor/autoload.php'; \
    var_dump(\Symfony\Component\Yaml\Yaml::parse(file_get_contents('$YAML_FILE_PATH')));"

Here I simply used var_dump to output the parsed array, but of course you can do much more... :)

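Other answers

If you have Python 2 and PyYAML, you can use this parser I wrote, parse_yaml.py. Some of the neater things it does are letting you choose a prefix (in case you have more than one file with similar variables) and picking a single value out of a YAML file.

For example, if you have these YAML files: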

staging.yaml:

db:
    type: sqlite
    host: 127.0.0.1
    user: dev
    password: password123

prod.yaml:

db:
    type: postgres
    host: 10.0.50.100
    user: postgres
    password: password123

You can load both without conflict.

$ eval $(python parse_yaml.py prod.yaml --prefix prod --cap)
$ eval $(python parse_yaml.py staging.yaml --prefix stg --cap)
$ echo $PROD_DB_HOST
10.0.50.100
$ echo $STG_DB_HOST
127.0.0.1

You can even pick out just the values you want.

$ prod_user=$(python parse_yaml.py prod.yaml --get db_user)
$ prod_port=$(python parse_yaml.py prod.yaml --get db_port --default 5432)
$ echo $prod_user
postgres
$ echo $prod_port
5432

It is possible to pass a small script to an interpreter such as Python. A simple way to do this with Ruby and its YAML library is:

$ RUBY_SCRIPT="data = YAML::load(STDIN.read); puts data['a']; puts data['b']"
$ echo -e '---\na: 1234\nb: 4321' | ruby -ryaml -e "$RUBY_SCRIPT"
1234
4321

Here, data is a hash (or array) holding the values from the YAML.

As a bonus, it parses Jekyll's front matter just fine.

ruby -ryaml -e "puts YAML::load(open(ARGV.first).read)['tags']" example.md

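If you know which tags you are interested in and the YAML structure you expect, it is not hard to write a simple YAML parser in Bash.

In the following example the parser reads a structured YAML file into shell variables, an array and an associative array.

Note: the complexity of this parser is tied to the structure of the YAML file. A separate subroutine is needed for each structured component of the file. Highly structured YAML files may call for a more sophisticated approach, such as a generic recursive descent parser.

The xmas.yaml file: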

# Xmas YAML example
---
 # Values
 pear-tree: partridge
 turtle-doves: 2.718
 french-hens: 3

 # Array
 calling-birds:
   - huey
   - dewey
   - louie
   - fred

 # Structure
 xmas-fifth-day:
   calling-birds: four
   french-hens: 3
   golden-rings: 5
   partridges:
     count: 1
     location: "a pear tree"
   turtle-doves: two

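The parser uses mapfile to read the file into memory as an array, then loops over each tag and creates shell variables.

- pear-tree:, turtle-doves: and french-hens: end up as simple shell variables
- calling-birds: becomes an array
- xmas-fifth-day: the structure is represented as an associative array, though if you are not using Bash 4.0 or later you could encode its entries as plain variables instead
- Comments and blank lines are ignored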
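The read-and-split step can be tried on its own before looking at the full script. The following is only a minimal sketch, assuming Bash 4 or later for mapfile; the sample lines are inlined through a here-document, and the TAGS and VALUES arrays are hypothetical names used for illustration, not part of the parser below.

```shell
#!/bin/bash
# Sketch: read lines with mapfile, then split each line into a tag and a value.
# The sample text is inlined (an assumption) so the block runs without xmas.yaml.
mapfile -t CONFIG <<'EOF'
# a comment
 pear-tree: partridge
 location: "a pear tree"

EOF

TAGS=()
VALUES=()
for ROW in "${!CONFIG[@]}"; do
  # The unquoted expansion word-splits the line, which drops the indentation.
  LINE=( ${CONFIG[$ROW]} )
  TAG=${LINE[0]}
  # Everything after the first word, joined by single spaces.
  VALUE="${LINE[*]:1}"
  TAGS[$ROW]=$TAG
  VALUES[$ROW]=$VALUE
  echo "[$ROW] TAG=<$TAG> VALUE=<$VALUE>"
done
```

Because the split relies on unquoted expansion, runs of spaces inside a value collapse to one space and glob characters such as * would be expanded, so this approach only suits simple values.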

#!/bin/bash
# -------------------------------------------------------------------
# A simple parser for the xmas.yaml file
# -------------------------------------------------------------------
# 
# xmas.yaml tags
#  #                        - Ignored
#                           - Blank lines are ignored
#  ---                      - Initialiser for days-of-xmas 
#   pear-tree: partridge    - a string
#   turtle-doves: 2.718     - a string, no float type in Bash
#   french-hens: 3          - a number
#   calling-birds:          - an array of strings
#     - huey                - calling-birds[0]
#     - dewey
#     - louie
#     - fred
#   xmas-fifth-day:         - an associative array
#     calling-birds: four   - a string
#     french-hens: 3        - a number
#     golden-rings: 5       - a number
#     partridges:           - changes the key to partridges.xxx
#       count: 1            - a number
#       location: "a pear tree" - a string
#     turtle-doves: two     - a string
# 
# This requires the following routines
# ParseXMAS
#   parses #, ---, blank line
#   unexpected tag error
#   calls days-of-xmas
#
# days-of-xmas
#   parses pear-tree, turtle-doves, french-hens
#   calls calling-birds
#   calls xmas-fifth-day
# 
# calling-birds
#   elements of the array
#
# xmas-fifth-day
#   parses calling-birds, french-hens, golden-rings, turtle-doves
#   calls partridges
# 
# partridges
#   parses partridges.count, partridges.location
#

function ParseXMAS()
{

  # days-of-xmas
  #   parses pear-tree, turtle-doves, french-hens
  #   calls calling-birds
  #   calls xmas-fifth-day
  # 
  function days-of-xmas()
  {
    unset PearTree TurtleDoves FrenchHens

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "  days-of-xmas[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "pear-tree:" ]
      then
        declare -g PearTree=$VALUE
      elif [ "$TAG" = "turtle-doves:" ]
      then
        declare -g TurtleDoves=$VALUE
      elif [ "$TAG" = "french-hens:" ]
      then
        declare -g FrenchHens=$VALUE
      elif [ "$TAG" = "calling-birds:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        calling-birds
        continue
      elif [ "$TAG" = "xmas-fifth-day:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        xmas-fifth-day
        continue
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi

      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # calling-birds
  #   elements of the array
  function calling-birds()
  {
    unset CallingBirds

    declare -ag CallingBirds

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "    calling-birds[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "-" ]
      then
        CallingBirds[${#CallingBirds[*]}]=$VALUE
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi

      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # xmas-fifth-day
  #   parses calling-birds, french-hens, golden-rings, turtle-doves
  #   calls partridges
  # 
  function xmas-fifth-day()
  {
    unset XmasFifthDay

    declare -Ag XmasFifthDay

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "    xmas-fifth-day[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "calling-birds:" ]
      then
        XmasFifthDay[CallingBirds]=$VALUE
      elif [ "$TAG" = "french-hens:" ]
      then
        XmasFifthDay[FrenchHens]=$VALUE
      elif [ "$TAG" = "golden-rings:" ]
      then
        XmasFifthDay[GOLDEN-RINGS]=$VALUE
      elif [ "$TAG" = "turtle-doves:" ]
      then
        XmasFifthDay[TurtleDoves]=$VALUE
      elif [ "$TAG" = "partridges:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        partridges
        continue
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi
 
      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  function partridges()
  {
    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "      partridges[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "count:" ]
      then
        XmasFifthDay[PARTRIDGES.COUNT]=$VALUE
      elif [ "$TAG" = "location:" ]
      then
        XmasFifthDay[PARTRIDGES.LOCATION]=$VALUE
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi
 
      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # ===================================================================
  # Load the configuration file

  mapfile CONFIG < xmas.yaml

  let ROWS=${#CONFIG[@]}
  let CURRENT_ROW=0

  # +
  # #
  #
  # ---
  # -
  while [ $CURRENT_ROW -lt $ROWS ]
  do
    LINE=( ${CONFIG[${CURRENT_ROW}]} )
    TAG=${LINE[0]}
    unset LINE[0]

    VALUE="${LINE[*]}"

    echo "[${CURRENT_ROW}] ${TAG}=${VALUE}"

    if [ "$TAG" = "---" ]
    then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        days-of-xmas
        continue
    elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
    then
        # Ignore comments and blank lines
        true
    else
        echo "Unexpected tag at line $(($CURRENT_ROW + 1)): <${TAG}>={${VALUE}}"
        break
    fi

    let CURRENT_ROW=$(($CURRENT_ROW + 1))
  done
}

echo =========================================
ParseXMAS

echo =========================================
declare -p PearTree
declare -p TurtleDoves
declare -p FrenchHens
declare -p CallingBirds
declare -p XmasFifthDay

This produces the following output:

=========================================
[0] #=Xmas YAML example
[1] ---=
  days-of-xmas[2] #=Values
  days-of-xmas[3] pear-tree:=partridge
  days-of-xmas[4] turtle-doves:=2.718
  days-of-xmas[5] french-hens:=3
  days-of-xmas[6] =
  days-of-xmas[7] #=Array
  days-of-xmas[8] calling-birds:=
    calling-birds[9] -=huey
    calling-birds[10] -=dewey
    calling-birds[11] -=louie
    calling-birds[12] -=fred
    calling-birds[13] =
    calling-birds[14] #=Structure
    calling-birds[15] xmas-fifth-day:=
  days-of-xmas[15] xmas-fifth-day:=
    xmas-fifth-day[16] calling-birds:=four
    xmas-fifth-day[17] french-hens:=3
    xmas-fifth-day[18] golden-rings:=5
    xmas-fifth-day[19] partridges:=
      partridges[20] count:=1
      partridges[21] location:="a pear tree"
      partridges[22] turtle-doves:=two
    xmas-fifth-day[22] turtle-doves:=two
=========================================
declare -- PearTree="partridge"
declare -- TurtleDoves="2.718"
declare -- FrenchHens="3"
declare -a CallingBirds=([0]="huey" [1]="dewey" [2]="louie" [3]="fred")
declare -A XmasFifthDay=([CallingBirds]="four" [PARTRIDGES.LOCATION]="\"a pear tree\"" [FrenchHens]="3" [GOLDEN-RINGS]="5" [PARTRIDGES.COUNT]="1" [TurtleDoves]="two" )
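For shells without associative arrays, the structure could instead be flattened into plain variables, as the note before the script suggests. The sketch below is one possible encoding, not something taken from the script above: set_flat and get_flat are hypothetical helper names, and any character that is not valid in a variable name is replaced with an underscore.

```shell
# Hypothetical helpers: emulate an associative array with plain variables.
# $1 = prefix, $2 = key, $3 = value (set_flat only).
set_flat() {
  _flat_name="${1}_$(printf '%s' "$2" | tr -c 'A-Za-z0-9_' '_')"
  # \$3 is expanded by eval itself, so the value is never re-parsed as code.
  eval "$_flat_name=\$3"
}

get_flat() {
  _flat_name="${1}_$(printf '%s' "$2" | tr -c 'A-Za-z0-9_' '_')"
  eval "printf '%s\n' \"\$$_flat_name\""
}

set_flat XmasFifthDay golden-rings 5
set_flat XmasFifthDay partridges.location '"a pear tree"'

get_flat XmasFifthDay golden-rings
get_flat XmasFifthDay partridges.location
```

One caveat of this encoding is that distinct keys can collide after substitution, for example a-b and a.b both map to a_b.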

Here is an extended version of Stefan Farestam's answer:

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" $1 | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){  vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

This version supports the - notation and the short notation for dictionaries and lists. The following input:

global:
  input:
    - "main.c"
    - "main.h"
  flags: [ "-O3", "-fpic" ]
  sample_input:
    -  { property1: value, property2: "value2" }
    -  { property1: "value3", property2: 'value 4' }

produces this output:

global_input_1="main.c"
global_input_2="main.h"
global_flags_1="-O3"
global_flags_2="-fpic"
global_sample_input_1_property1="value"
global_sample_input_1_property2="value2"
global_sample_input_2_property1="value3"
global_sample_input_2_property2="value 4"

As you can see, the - items are numbered automatically so that each item gets a distinct variable name. Bash has no multidimensional arrays, so this is one way to work around that. Multiple levels are supported. To work around the problem with trailing white space mentioned by @briceburg, enclose the values in single or double quotes. Some limitations remain: expanding dictionaries and lists can produce wrong results when values contain commas, and more complex structures such as values spanning multiple lines (like ssh-keys) are not (yet) supported.

A few words about the code: The first sed call expands the short notation of lists, converting [ entry, ... ] into an itemized list with the - notation. The second sed call does the same for the short form of dictionaries, { key: value, ... }, converting them to the simpler block style. The third sed call is the original one that handled normal dictionaries, now extended to handle lists with - and indentation. The awk part introduces an index for each indentation level and increments it when the variable name is empty (i.e. when processing a list item). The current value of the counter is used in place of the empty vname. When going up one level, the counters of the deeper levels are reset to zero.
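The counter logic in the awk stage can be seen in isolation with a few hand-written records. This is only an illustrative sketch: the records are already split into indent, name and value, a "|" stands in for the \034 separator used by the function, and the prefix handling is left out.

```shell
# Pre-split sample records: indent|name|value ("|" stands in for \034).
records='|global|
  |input|
    ||main.c
    ||main.h
  |flags|
    ||-O3'

out=$(printf '%s\n' "$records" | awk -F'|' '{
  indent = length($1) / 2
  vname[indent] = $2
  # Leaving a deeper level: forget its names and reset its counters.
  for (i in vname) { if ((i + 0) > indent) { delete vname[i]; idx[i] = 0 } }
  # An empty name marks a list item: use the running counter instead.
  if (length($2) == 0) { vname[indent] = ++idx[indent] }
  if (length($3) > 0) {
    vn = ""
    for (i = 0; i < indent; i++) { vn = vn vname[i] "_" }
    printf("%s%s=\"%s\"\n", vn, vname[indent], $3)
  }
}')
printf '%s\n' "$out"
```

The counter for level 2 restarts when the parser moves from input to flags, which is why the flags entry is numbered from 1 again.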

Edit: I have created a GitHub repository for this.

Given that Python 3 and PyYAML are dependencies that are quite easy to satisfy nowadays, the following may help:

yaml() {
    python3 -c "import yaml;print(yaml.safe_load(open('$1'))$2)"
}

VALUE=$(yaml ~/my_yaml_file.yaml "['a_key']")
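The function above pastes $1 and $2 straight into Python source, so a file name containing a quote would break it. A possible variant, sketched here under the assumption that only a single top-level key is needed, passes both values as arguments instead. The name yaml_get is hypothetical, and PyYAML still has to be installed.

```shell
# Hypothetical variant: the file name and key reach Python via sys.argv,
# so they are never interpreted as Python code.
yaml_get() {
    python3 - "$1" "$2" <<'PY'
import sys
import yaml  # PyYAML, assumed to be installed

with open(sys.argv[1]) as f:
    data = yaml.safe_load(f)
# Only a single top-level key is looked up in this sketch.
print(data[sys.argv[2]])
PY
}
```

It would be called as VALUE=$(yaml_get ~/my_yaml_file.yaml a_key). The trade-off is that nested lookups such as ['a']['b'] are no longer possible without extending the embedded script.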