yq コマンドで markdown のテーブル・データをつくる

yq コマンドの出力結果として、markdown 形式のテーブルデータを作ります。

環境

$ yq --version
yq (https://github.com/mikefarah/yq/) version v4.44.3

yq コマンドには、golang 製のものと Python 製のものがありますが、これは golang 製のものです。

利用する YAML ファイル

下記の、dbt model properties ファイルを処理対象として利用します。

schema.yml

version: 2

models:
  - name: customers
    description: This table has basic information about a customer, as well as some derived facts based on a customer's orders

    columns:
      - name: customer_id
        description: This is a unique identifier for a customer
        tests:
          - unique
          - not_null

      - name: first_name
        description: Customer's first name. PII.

      - name: last_name
        description: Customer's last name. PII.

      - name: first_order
        description: Date (UTC) of a customer's first order

      - name: most_recent_order
        description: Date (UTC) of a customer's most recent order

      - name: number_of_orders
        description: Count of the number of orders a customer has placed

      - name: total_order_amount
        description: Total value (AUD) of a customer's orders

  - name: orders
    description: This table has basic information about orders, as well as some derived facts based on payments

    columns:
      - name: order_id
        tests:
          - unique
          - not_null
        description: This is a unique identifier for an order

      - name: customer_id
        description: Foreign key to the customers table
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id

      - name: order_date
        description: Date (UTC) that the order was placed

      - name: status
        description: '{{ doc("orders_status") }}'
        tests:
          - accepted_values:
              values: ["placed", "shipped", "completed", "return_pending", "returned"]

      - name: amount
        description: Total amount (AUD) of the order
        tests:
          - not_null

      - name: credit_card_amount
        description: Amount of the order (AUD) paid for by credit card
        tests:
          - not_null

      - name: coupon_amount
        description: Amount of the order (AUD) paid for by coupon
        tests:
          - not_null

      - name: bank_transfer_amount
        description: Amount of the order (AUD) paid for by bank transfer
        tests:
          - not_null

      - name: gift_card_amount
        description: Amount of the order (AUD) paid for by gift card
        tests:
          - not_null

https://github.com/dbt-labs/jaffle-shop-classic/blob/main/models/schema.yml

基本

echo "| table | description |" ;\
echo "| --- | --- |" ;\
yq '.models[] | ["|" + .name, .description + "|"] | join("|")' schema.yml


  • テーブルのヘッダー部分は echo コマンドで処理
  • array 出力したものを、| 文字列にて結合


table description
customers This table has basic information about a customer, as well as some derived facts based on a customer's orders
orders This table has basic information about orders, as well as some derived facts based on payments

アンネストしたデータを作成

echo "| table | column | description |" ;\
echo "| --- | --- | --- |" ;\
yq '
  .models[] |
  .name as $table |
  .columns[] |
  [ "|" + $table, .name, .description + "|" ] |
  join("|")
' schema.yml



table column description
customers customer_id This is a unique identifier for a customer
customers first_name Customer's first name. PII.
customers last_name Customer's last name. PII.
customers first_order Date (UTC) of a customer's first order
customers most_recent_order Date (UTC) of a customer's most recent order
customers number_of_orders Count of the number of orders a customer has placed
customers total_order_amount Total value (AUD) of a customer's orders
orders order_id This is a unique identifier for an order
orders customer_id Foreign key to the customers table
orders order_date Date (UTC) that the order was placed
orders status {{ doc("orders_status") }}
orders amount Total amount (AUD) of the order
orders credit_card_amount Amount of the order (AUD) paid for by credit card
orders coupon_amount Amount of the order (AUD) paid for by coupon
orders bank_transfer_amount Amount of the order (AUD) paid for by bank transfer
orders gift_card_amount Amount of the order (AUD) paid for by gift card

複雑なもの

これまでの点も応用して、より複雑なものを。

echo "| table | column | tests |" ;\
echo "| --- | --- | --- |" ;\
yq '
  .models[] | .name as $table | .columns[] |
  [ "|" + $table, .name, (
      .tests // [] | .[] |= (
          with(select(tag == "!!str");
              . = .
          ) |
          with(select(tag == "!!map");
              . = keys[0]
          )
      ) | join("<br>")
  ) + "|"] |
  join("|")
' schema.yml


  • .models[].name.columns[].tests[] 内の要素を <br> タグで結合したい訳だが
  • そもそも .models[].name.columns[].tests[] がない場合があるため、その場合には空の array を用意(.tests // [])
  • array 内の要素に map 処理を実行
    • map 処理の記法は |=
  • しかしながら、要素内には string 型と map 型があるため、データ型に応じて処理を分岐してあげないと
    • yq では if-else 構文は用意されていないものの、withselect を利用して代替が可能
      • Logic without if/elif/else
      • コマンド例では、map 型の場合には最初の key 名(keys[0])で値を更新して string 型にしている
        • これで .models[].name.columns[].tests[] 内を <br> タグで結合できるようなった
      • なお、yq でのデータ型のリテラルTag を参照


table column tests
customers customer_id unique
not_null
customers first_name
customers last_name
customers first_order
customers most_recent_order
customers number_of_orders
customers total_order_amount
orders order_id unique
not_null
orders customer_id not_null
relationships
orders order_date
orders status accepted_values
orders amount not_null
orders credit_card_amount not_null
orders coupon_amount not_null
orders bank_transfer_amount not_null
orders gift_card_amount not_null