Using jq and curl to Wrangle JSON Arrays from the Terminal

In this tutorial, we use jq and curl to query a web service and retrieve JSON objects containing embedded arrays. We then invoke a command based on each element in the array rather than simply printing the values to the console. A special thanks to one of my readers, B. Anderson, who left a comment on my Consuming Web API JSON Data Using curl and jq post and provided this question for us to explore. Let's get started!

Objective

Our goal is to query a web service containing metadata about weather stations in the area. The web service is located at https://thisdavej.com/api/weather-station.json and outputs the following JSON content:

{
  "name": "Station1",
  "location": "Centerville",
  "sensors": ["temperature", "humidity"]
}

We ultimately want to process the array of sensors embedded in the JSON object and fetch the current value of each sensor using a command that retrieves the values from a cloud-based time-series database. For this exercise, we'll create a Python script called get-sensor to simulate the retrieval of sensor values from the time-series database:

get-sensor
#!/usr/bin/env python3
"""Simulate reading a sensor value from a time-series database."""

import argparse
import random

# Both positional arguments are required.
parser = argparse.ArgumentParser()
parser.add_argument("station")
parser.add_argument("sensor")
args = parser.parse_args()

station = args.station
sensor = args.sensor

# (low, high) bounds for the simulated reading of each known sensor.
ranges = {
    "temperature": (50, 100),
    "humidity": (0, 100),
    "rainfall": (0, 3),
    "wind_speed": (0, 60),
}
default_range = (0, 100)
# Named value_range so it does not shadow the built-in range().
value_range = ranges.get(sensor, default_range)

ndigits = 2
current_value = round(random.uniform(*value_range), ndigits)

print(f"{station}.{sensor} current value: {current_value}")

After creating the above script, change the file access permissions to make the script executable:

$ chmod u+x get-sensor

The get-sensor script requires two arguments (station and sensor) and prints a simulated sensor value produced by a random number generator:

$ ./get-sensor Station1 temperature
Station1.temperature current value: 77.58

Initial Setup

To get started, check if curl is installed on your system since we will be using it to fetch web data. It is installed by default on most systems.

$ which curl
/usr/bin/curl

I'm running Ubuntu on WSL at the moment and an executable path is returned so it's installed for me. If the which command returns nothing, you'll need to install it. For Debian-based distros such as Ubuntu, install it like this:

$ sudo apt install curl

We'll also be using jq, an awesome tool for slicing, dicing, and transforming JSON data. Let's check if we have it installed.

$ which jq

It looks like it needs to be installed so install it now:

$ sudo apt install jq

Note: We could also check if jq is installed and install it in one shot as follows:

$ which jq || sudo apt install jq
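The which command is not guaranteed to be present or to behave identically on every system, whereas command -v is specified by POSIX. As a minimal sketch (have_tool is a helper name made up for this example, not part of any tool), the check for both programs could look like this:

```shell
#!/bin/bash
# have_tool is a hypothetical helper name chosen for this sketch.
# It succeeds (exit status 0) when the named command can be found.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

for tool in curl jq; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing (on Debian-based distros: sudo apt install $tool)"
  fi
done
```

The loop only reports what is missing; installing is left as a manual step so that nothing runs with sudo unexpectedly.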

Next, create a shell script called get-station.sh to fetch the data using curl and output the result to standard output:

get-station.sh
#!/bin/bash

content=$(curl -sS "https://thisdavej.com/api/weather-station.json")
echo "$content"

curl parameters used:

  • -s Silent or quiet mode. Don't show progress meter or error messages. Makes Curl mute.
  • -S When used with -s it makes curl show an error message if it fails.

Consult the curl man page for more details or run the command syntax through explainshell, which is one of my favorite tools.
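One thing -sS does not cover: by default curl exits successfully even when the server answers with an HTTP error such as 404, so an error page could end up being piped into jq as if it were data. A possible stricter variant, sketched here with a hypothetical fetch_json helper that is not part of the script above, adds the -f (--fail) option so that HTTP errors produce a non-zero exit status:

```shell
#!/bin/bash
# fetch_json is a hypothetical helper name used only in this sketch.
#   -f  exit with a non-zero status on HTTP errors (for example 404)
#   -s  hide the progress meter
#   -S  still print an error message when something fails
fetch_json() {
  curl -fsS "$1"
}

# Example use (requires network access, so it is left commented out):
# fetch_json "https://thisdavej.com/api/weather-station.json" || echo "fetch failed" >&2
```

With a check like this in place, a script can stop early instead of handing bad input to the next command in the pipeline.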

Change the script file access permissions to make the script executable:

$ chmod u+x get-station.sh

Finally, run the script to ensure it returns results:

$ ./get-station.sh
{
  "name": "Station1",
  "location": "Centerville",
  "sensors": ["temperature", "humidity"]
}

All looks good so we're ready to pipe the output through jq and do some amazing things!

Process one JSON object

Our web service ( https://thisdavej.com/api/weather-station.json ) returns one JSON object and we'll start using jq in this context. In the second section, we'll process an array of JSON objects representing multiple weather stations.

Let's start by instructing jq to pretty print the JSON output from the web service. The JSON returned here is already nicely formatted, so the only visible change is that jq places each element of the sensors array on its own line.

$ ./get-station.sh | jq .
{
  "name": "Station1",
  "location": "Centerville",
  "sensors": [
    "temperature",
    "humidity"
  ]
}

Had the web service returned compact JSON mashed together on one line, jq would have expanded it into this same layout.
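To see the effect more clearly, we can feed jq a compact, single-line version of the same object. This sketch hardcodes the JSON in a variable (compact_json is just an illustrative name) instead of calling the web service:

```shell
#!/bin/bash
# The station object squeezed onto one line, as a compact API might return it.
compact_json='{"name":"Station1","location":"Centerville","sensors":["temperature","humidity"]}'

# The identity filter (.) leaves the data unchanged; jq simply re-indents it.
pretty_json=$(printf '%s\n' "$compact_json" | jq .)
printf '%s\n' "$pretty_json"

# The -c option goes the other way and prints each result on a single line.
roundtrip_json=$(printf '%s\n' "$pretty_json" | jq -c .)
printf '%s\n' "$roundtrip_json"
```

The first printf shows the indented form and the second shows the original one-line form again, which confirms that only the whitespace changed.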

Next, let's use jq to filter the JSON and return only one field, the name of our weather station:

$ ./get-station.sh | jq '.name'
"Station1"

This looks good, but we'd like to return the string directly rather than returning it as a JSON string with quotes. We use the jq -r (raw output) command-line option to accomplish this goal:

$ ./get-station.sh | jq -r '.name'
Station1

Excellent. Let's fetch the array of sensors available for this weather station next:

$ ./get-station.sh | jq -r '.sensors'
[
  "temperature",
  "humidity"
]

This is a great start, but we want to output just the list of sensors rather than the square brackets denoting a JSON array so we can ultimately feed these values into our get-sensor script. We change the filter from .sensors to .sensors[] to return just the sensors available.

$ ./get-station.sh | jq -r '.sensors[]'
temperature
humidity

Our get-sensor script requires two parameters, the station name and the sensor; therefore, we need jq to filter and return both parameters.

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"'
Station1 temperature
Station1 humidity
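One object produces two lines because .sensors[] yields one result per array element, and jq evaluates the surrounding string once for each of them. The same pairing can be spelled out more explicitly by saving the name in a variable first. The sketch below uses an inline copy of the JSON (station_json is an illustrative name) so that it runs without the web service:

```shell
#!/bin/bash
station_json='{"name":"Station1","location":"Centerville","sensors":["temperature","humidity"]}'

# Implicit form: the string is produced once per element of .sensors.
implicit=$(printf '%s\n' "$station_json" | jq -r '"\(.name) \(.sensors[])"')

# Explicit form: bind the name to $n, iterate the sensors, then build the string.
explicit=$(printf '%s\n' "$station_json" | jq -r '.name as $n | .sensors[] | "\($n) \(.)"')

printf '%s\n' "$implicit"
printf '%s\n' "$explicit"
```

Both forms print the same two lines. The leading ". |" in the filter used above is the identity filter, so it can be left out without changing the result.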

We're getting very close to victory. We can use xargs to build and execute commands from standard input. The -n2 option instructs xargs to pass 2 arguments at a time, one pair per invocation, to match our 2 parameters. xargs appends each pair to the end of the command we supply, so no placeholders are needed. We'll start by using xargs in conjunction with the echo command to ensure we are processing the arguments as expected.

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"' | xargs -n2 echo
Station1 temperature
Station1 humidity
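Note that $1 and $2 written directly on an xargs command line are expanded by the calling shell, not by xargs. If we do want to refer to the two arguments by position, for example to label or reorder them, one option is to have xargs start a small sh -c script, which receives them as its own positional parameters. A sketch, using printf to stand in for the jq output:

```shell
#!/bin/bash
# printf stands in for the two lines produced by the jq filter.
# The single quotes keep $1 and $2 away from the calling shell; the inner
# sh sees "_" as $0, the station as $1, and the sensor as $2.
labeled=$(printf '%s\n' "Station1 temperature" "Station1 humidity" |
  xargs -n2 sh -c 'echo "station=$1 sensor=$2"' _)

printf '%s\n' "$labeled"
```

For our purposes the plain form is enough, since get-sensor already expects the station first and the sensor second.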

This looks good! We're ready to put everything together and process an array of values using jq and take action on each of the values by invoking our get-sensor script:

$ ./get-station.sh | jq -r '. | "\(.name) \(.sensors[])"' | xargs -n2 ./get-sensor
Station1.temperature current value: 60.59
Station1.humidity current value: 41.21

It works! Let's take it one step further and process an array of JSON objects.

Process an array of JSON objects

Our second web service ( https://thisdavej.com/api/weather-stations.json ) returns an array of JSON objects containing metadata about multiple weather stations:

[
    {
        "name": "Station1",
        "location": "Centerville",
        "sensors": ["temperature", "humidity"]
    },
    {
        "name": "Station5",
        "location": "Anytown",
        "sensors": ["temperature", "humidity", "rainfall", "wind_speed"]
    }
]

Create a script called get-stations.sh to fetch the data using curl and output the result to the console:

get-stations.sh
#!/bin/bash

content=$(curl -sS "https://thisdavej.com/api/weather-stations.json")
echo "$content"

Change the script file access permissions to make the script executable:

$ chmod u+x get-stations.sh

Finally, run the script to ensure it writes the JSON content to standard output:

$ ./get-stations.sh
[
    {
        "name": "Station1",
        "location": "Centerville",
        "sensors": ["temperature", "humidity"]
    },
    {
        "name": "Station5",
        "location": "Anytown",
        "sensors": ["temperature", "humidity", "rainfall", "wind_speed"]
    }
]

Once again, we'll start by instructing jq to pretty print the JSON output from the web service. The data is unchanged, but jq applies its own two-space indentation and puts each array element on its own line:

$ ./get-stations.sh | jq .
[
  {
    "name": "Station1",
    "location": "Centerville",
    "sensors": [
      "temperature",
      "humidity"
    ]
  },
  {
    "name": "Station5",
    "location": "Anytown",
    "sensors": [
      "temperature",
      "humidity",
      "rainfall",
      "wind_speed"
    ]
  }
]

Next, let's use jq to filter the JSON and return only one field, the name of our weather station, for each weather station object in the array:

$ ./get-stations.sh | jq '.[] .name'
"Station1"
"Station5"

Ah yes, the double quotes. Let's use the jq -r (raw output) command-line option once again to eradicate the double quotes:

$ ./get-stations.sh | jq -r '.[] .name'
Station1
Station5

Excellent. Let's return a list of all items from the sensors array for both weather stations:

$ ./get-stations.sh | jq -r '.[] .sensors[]'
temperature
humidity
temperature
humidity
rainfall
wind_speed

This is a good start but recall that our get-sensor script requires both the station name and the sensor name as parameters. Let's return the station name also:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"'
Station1 temperature
Station1 humidity
Station5 temperature
Station5 humidity
Station5 rainfall
Station5 wind_speed

We're getting close to victory! Let's use xargs once again and practice first with the echo command:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"' | xargs -n2 echo
Station1 temperature
Station1 humidity
Station5 temperature
Station5 humidity
Station5 rainfall
Station5 wind_speed

Finally, we bring it all together and process an array of values using jq and take action on each of the values by invoking our get-sensor script:

$ ./get-stations.sh | jq -r '.[] | "\(.name) \(.sensors[])"' | xargs -n2 ./get-sensor
Station1.temperature current value: 71.62
Station1.humidity current value: 83.43
Station5.temperature current value: 96.1
Station5.humidity current value: 5.6
Station5.rainfall current value: 2.32
Station5.wind_speed current value: 11.35
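One caveat with the xargs approach is that it splits its input on any whitespace, so a station named, say, "Station 5" would be broken into two arguments. A possible alternative, sketched below, has jq emit tab-separated pairs with @tsv and reads them in a while loop. The inline JSON and the station name containing a space are made up for illustration, and echo stands in for ./get-sensor:

```shell
#!/bin/bash
# Made-up sample data; note the space in the station name.
stations_json='[{"name":"Station 5","sensors":["rainfall","wind_speed"]}]'

tab=$(printf '\t')

# Emit one "name<TAB>sensor" line per sensor, then split each line on the tab only.
results=$(printf '%s\n' "$stations_json" |
  jq -r '.[] | .name as $n | .sensors[] | [$n, .] | @tsv' |
  while IFS="$tab" read -r station sensor; do
    # In the real pipeline this would be: ./get-sensor "$station" "$sensor"
    echo "station=[$station] sensor=[$sensor]"
  done)

printf '%s\n' "$results"
```

The station names returned by our web service contain no spaces, so the xargs version works fine for this tutorial; the loop is only worth the extra lines when the values may contain whitespace.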

Mission accomplished - jq is a very powerful tool!

Conclusion

The jq command is very useful for slicing, dicing, and transforming JSON data. We successfully utilized jq and curl to invoke a web service and retrieve JSON objects containing embedded arrays. We also moved beyond simply displaying the array values to the console and took action on each element. To learn more about jq, see my article on Consuming Web API JSON Data Using curl and jq as well as the official jq manual.

Follow @thisDaveJ (Dave Johnson) on Twitter to stay up to date with the latest tutorials and tech articles.

Last updated Jan 28 2020

