Manipulating JSON on the command line with jq: an example with cURL

I use the command line almost exclusively to call APIs with cURL. It's convenient and fast, easy to script when needed, and the bash history saves me time. But the response returned by cURL is not always readable, especially when it is JSON.
To format cURL's response into readable JSON, I use jq. This tool is packaged for many Linux distributions and is also available for Mac and Windows. On Debian, the package can be installed without adding a new repository to sources.list.


If an example fails, run the curl command with the -I option to see the response headers. If you get a 403 error, change the string after the -A option (the user agent).


> apt-get install jq

To illustrate the commands, I will use the API of the site Have I Been Pwned. A plain call with cURL returns the raw response:

> curl -X GET -A mon-code.net-exemple https://haveibeenpwned.com/api/v2/breachedaccount/john@doe.com
 


[{"Title":"000webhost","Name":"000webhost","Domain":"000webhost.com","BreachDate":"2015-03-01","AddedDate":"2015-10-26T23:35:45Z"
,"ModifiedDate":"2015-10-26T23:35:45Z","PwnCount":13545468,
"Description":"In approximately March 2015, the free web hosting 
provider <a href=\"http://www.troyhunt.com/2015/10/breaches-traders-plain-text-passwords.html\" target=\"_blank\"
 
rel=\"noopener\">000webhost suffered a major data breach</a> that exposed over 13 million customer records. The data was sold 
and traded before 000webhost was alerted in October. 
The breach included names, email addresses and plain text passwords.",
"DataClasses":["Email addresses","IP addresses","Names","Passwords"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,
"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"png"},{"Title":"Acne.org","Name":"AcneOrg","Domain":"acne.org",
"BreachDate":"2014-11-25","AddedDate":"2016-03-06T11:07:41Z","ModifiedDate":"2016-03-06T11:07:41Z","PwnCount":432943,"Description"
:"In November 2014, the acne website <a href=\"http://www.acne.org/\" target=\"_blank\" rel=\"noopener\">acne.org</a>...


With jq added after a pipe on the command line, the response becomes easier to read. Note the dot at the end: it is jq's identity filter, which outputs the input unchanged, and it should not be omitted.

> curl -X GET -A mon-code.net-exemple https://haveibeenpwned.com/api/v2/breachedaccount/john@doe.com | jq .
 


[
  {
    "Title": "000webhost",
    "Name": "000webhost",
    "Domain": "000webhost.com",
    "BreachDate": "2015-03-01",
    "AddedDate": "2015-10-26T23:35:45Z",
    "ModifiedDate": "2015-10-26T23:35:45Z",
    "PwnCount": 13545468,
    "Description": "In approximately March 2015, the free web hosting provider <a href=\"http://www.troyhunt.com/2015/10/breaches-traders-plain-text-passwords.html\" target=\"_blank\" rel=\"noopener\">000webhost suffered a major data breach</a> that exposed over 13 million customer records. The data was sold and traded before 000webhost was alerted in October. The breach included names, email addresses and plain text passwords.",
    "DataClasses": [
      "Email addresses",
      "IP addresses",
      "Names",
      "Passwords"
    ],
    "IsVerified": true,
    "IsFabricated": false,
    "IsSensitive": false,
    "IsActive": true,
    "IsRetired": false,
    "IsSpamList": false,
    "LogoType": "png"
  },
....]


As you can see, it's easier to read, but jq can do more with filters. If I only want the list of all the domains whose breaches exposed the john@doe.com address, I just have to add a filter.

> curl -X GET -A mon-code.net-exemple https://haveibeenpwned.com/api/v2/breachedaccount/john@doe.com | jq '.[].Domain'
"000webhost.com"
"acne.org"
"adobe.com"
"armyforceonline.com"
"cheapassgamer.com"
"dailymotion.com"
"dropbox.com"
"edmodo.com"
"elance.com"
"evony.com"
"ffshrine.org"
...
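
To experiment with this filter without calling the API, here is a minimal sketch on a hand-made sample. The `sample` variable is hypothetical, trimmed to a few fields of the first two entries shown above; the -r option asks jq for raw strings, without the JSON quotes.

```shell
# Hypothetical two-entry sample, trimmed to a few fields of the response above.
sample='[{"Name":"000webhost","Domain":"000webhost.com","PwnCount":13545468},{"Name":"AcneOrg","Domain":"acne.org","PwnCount":432943}]'

# .[] iterates over the array, .Domain picks one field of each entry.
# -r prints raw strings instead of JSON-quoted ones.
domains=$(printf '%s' "$sample" | jq -r '.[].Domain')
printf '%s\n' "$domains"
```

Raw output like this is convenient when the list is passed on to other commands such as sort or grep.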

Still not convinced by jq? You can also build your own JSON from the supplied JSON data. For example, I can associate each domain with the types of data exposed.

> curl -X GET -A mon-code.net-exemple https://haveibeenpwned.com/api/v2/breachedaccount/john@doe.com | jq '[.[] | {(.Domain): .DataClasses}]'
[{
  "000webhost.com": [
    "Email addresses",
    "IP addresses",
    "Names",
    "Passwords"
  ]
},
{
  "acne.org": [
    "Dates of birth",
    "Email addresses",
    "IP addresses",
    "Passwords",
    "Usernames"
  ]
},
...]
 

Note the '()' around the expression I use to extract the domain. The parentheses make jq evaluate the expression so that its result is used as the key; otherwise jq expects a simple string.
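
The key expression can also be tried offline. The sketch below assumes a hand-made two-entry sample (the `sample` variable and its shortened data-class lists are illustrative, not real API output). It runs `.[] |` first so that each object is built from a single breach entry, which keeps every domain paired with its own data classes.

```shell
# Hypothetical sample: two entries with shortened DataClasses lists.
sample='[{"Domain":"000webhost.com","DataClasses":["Email addresses","Passwords"]},{"Domain":"acne.org","DataClasses":["Email addresses","Usernames"]}]'

# For each entry, build an object whose key is the evaluated (.Domain)
# and whose value is that same entry's DataClasses.
# -c prints compact JSON on a single line.
pairs=$(printf '%s' "$sample" | jq -c '[.[] | {(.Domain): .DataClasses}]')
printf '%s\n' "$pairs"
```

Without the parentheses, jq would not evaluate `.Domain` as an expression in key position, so the domain values could not be used as keys.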

jq allows many other manipulations; I invite you to browse the documentation, which is very well done. And for those who want to experiment without calling an API, jq play can be used for that.
