Author: | Dave Kuhlman |
---|---|
Contact: | dkuhlman (at) reifywork (dot) com |
Address: | http://www.reifywork.com |
Revision: | 1.0a |
Date: | October 25, 2024 |
Copyright: | Copyright (c) 2015 Dave Kuhlman. All Rights Reserved. This software is subject to the provisions of the MIT License http://www.opensource.org/licenses/mit-license.php. |
Abstract: | This document provides hints, guidance, and sample code for access to an h5serv server. |
h5serv, the HDF SERVER, serves information about and data from HDF5 data files.
I installed h5serv under the Anaconda Python distribution from Continuum. For more information about Anaconda, see https://store.continuum.io/cshop/anaconda/.
Instructions on installing h5serv under Anaconda and setting up your environment are included with the h5serv distribution. See the file ../docs/Installation/ServerSetup.rst in the h5serv distribution.
Installation -- Do this under Linux:
$ conda create -n h5serv python=2.7 h5py twisted tornado requests pytz
Set up your environment -- Depending on where you have installed Anaconda, do something like the following:
$ source ~/a1/Python/Anaconda/Anaconda01/envs/h5serv/bin/activate h5serv
If and when you need to deactivate this environment, use:
$ source deactivate
Server startup -- Go to the server sub-directory in your h5serv installation, and run app.py. For example:

    $ cd ~/a1/Python/Anaconda/H5serv/Git/h5serv/server
    $ python app.py

The curl command line tool is an easy way to make REST requests to an h5serv server. Some examples:
$ curl -X GET -H "host:testdata04.hdfgroup.org" http://crow:5000
Here is a bash shell script that makes several requests (I've added echo after each command so that each response is followed by a newline):

    #!/bin/bash

    # get info about a database hdf5 file.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000 ; echo

    # get the IDs of the datasets in the file.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets ; echo

    # get info about one specific dataset.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89 ; echo

    # get the data values from a specific dataset.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value ; echo

You will need to install the requests package. You can find it here: https://pypi.python.org/pypi/requests. For this testing, I used the Anaconda distribution of Python, which, I believe, includes requests by default. You can learn about Anaconda here: https://store.continuum.io/cshop/anaconda/.
Using IPython:

    In [1]: import requests

    In [2]: req = 'http://crow:5000/'

    In [3]: hdrs = {'host': 'testdata04.hdfgroup.org'}

    In [4]: rsp = requests.get(req, headers=hdrs)

    In [5]: rsp
    Out[5]: <Response [200]>

    In [6]: print rsp.text
    {"lastModified": "2015-07-02T23:49:18.303330Z", "hrefs": [{"href": "http://testdata04.hdfgroup.org/", "rel": "self"}, {"href": "http://testdata04.hdfgroup.org/datasets", "rel": "database"}, {"href": "http://testdata04.hdfgroup.org/groups", "rel": "groupbase"}, {"href": "http://testdata04.hdfgroup.org/datatypes", "rel": "typebase"}, {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89", "rel": "root"}], "root": "f416d152-2114-11e5-81d4-0019dbe2bd89", "created": "2015-07-02T23:49:18.303330Z"}

    In [7]:

    In [7]: print rsp.json()
    {u'lastModified': u'2015-07-02T23:49:18.303330Z', u'hrefs': [{u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'self'}, {u'href': u'http://testdata04.hdfgroup.org/datasets', u'rel': u'database'}, {u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'groupbase'}, {u'href': u'http://testdata04.hdfgroup.org/datatypes', u'rel': u'typebase'}, {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'}], u'root': u'f416d152-2114-11e5-81d4-0019dbe2bd89', u'created': u'2015-07-02T23:49:18.303330Z'}

    In [8]:

    In [8]: req = 'http://crow:5000/groups'

    In [9]: rsp = requests.get(req, headers=hdrs)

    In [10]: rsp
    Out[10]: <Response [200]>

    In [11]: print rsp.json()
    {u'hrefs': [{u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'self'}, {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'}, {u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'home'}], u'groups': [u'f416d155-2114-11e5-81d4-0019dbe2bd89', u'f416d158-2114-11e5-81d4-0019dbe2bd89', u'f416d15b-2114-11e5-81d4-0019dbe2bd89']}

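The hrefs list in each response is what lets a client find related resources. As a minimal sketch (the helper name find_href is my own and is not part of h5serv or requests), the following looks up a link by its rel value. It uses an abridged copy of the response text shown above, so it runs without a server:

```python
import json

# Abridged copy of the JSON text returned for 'http://crow:5000/' above.
SAMPLE_TEXT = '''
{"hrefs": [
    {"href": "http://testdata04.hdfgroup.org/", "rel": "self"},
    {"href": "http://testdata04.hdfgroup.org/datasets", "rel": "database"},
    {"href": "http://testdata04.hdfgroup.org/groups", "rel": "groupbase"},
    {"href": "http://testdata04.hdfgroup.org/datatypes", "rel": "typebase"},
    {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89",
     "rel": "root"}],
 "root": "f416d152-2114-11e5-81d4-0019dbe2bd89"}
'''


def find_href(data, rel):
    """Return the href whose 'rel' matches, or None if there is none.

    Hypothetical helper: 'data' is a decoded response such as rsp.json().
    """
    for link in data.get('hrefs', []):
        if link.get('rel') == rel:
            return link.get('href')
    return None


data = json.loads(SAMPLE_TEXT)
print(find_href(data, 'root'))
print(find_href(data, 'database'))
```

With a live server, pass rsp.json() instead of data. Notice that the hrefs in the sample name the domain (testdata04.hdfgroup.org) rather than the machine (crow:5000), so in a setup like mine you would still send the request to the machine and pass the domain in the host header.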
And here is a Python script containing examples of several requests like those above:

    #!/usr/bin/env python

    import requests


    def test():
        rsp = requests.get(
            'http://crow:5000',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.text
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/groups',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/groups/f416d155-2114-11e5-81d4-0019dbe2bd89',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/datasets',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        value = rsp.json()['value']
        print 'value: {}'.format(value)
        return rsp.json()


    def main():
        test()


    if __name__ == '__main__':
        main()

The following Python script is functionally equivalent to the previous one, but it hides some of the repetition and messiness in a class:

    #!/usr/bin/env python

    import requests


    class H5servRequest(object):
        def __init__(self, host, machine, port):
            self.host = host
            self.machine = machine
            self.port = port
            self.location = "{}:{}".format(machine, port)

        def get(self, path):
            rsp = requests.get(
                self.location + path,
                headers={'host': self.host})
            return rsp.json()


    def test():
        req = H5servRequest(
            'testdata04.hdfgroup.org',
            'http://crow',
            5000)
        data = req.get('')
        print '-----\n{}'.format(data)
        data = req.get('/groups')
        print '-----\n{}'.format(data)
        data = req.get('/datasets')
        print '-----\n{}'.format(data)
        data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89')
        print '-----\n{}'.format(data)
        data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value')
        print '-----\n{}'.format(data)


    def main():
        test()


    if __name__ == '__main__':
        main()

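The class above does not check the HTTP status code before decoding the response. Here is a sketch of one possible variation that does. The class name H5servClient, the url method, and the fetch parameter are my own inventions for illustration; fetch defaults to requests.get, and FakeResponse and fake_fetch are stand-ins so that the example runs without a server:

```python
class H5servClient(object):
    """Hypothetical variant of H5servRequest with a status check."""

    def __init__(self, host, base_url, fetch=None):
        self.host = host
        self.base_url = base_url.rstrip('/')
        if fetch is None:
            # Imported lazily so that a stand-in can be used for testing.
            import requests
            fetch = requests.get
        self.fetch = fetch

    def url(self, path):
        # Join the base URL and the path with exactly one slash.
        return self.base_url + '/' + path.lstrip('/')

    def get(self, path):
        rsp = self.fetch(self.url(path), headers={'host': self.host})
        # With requests, this raises an exception for a 4xx or 5xx status.
        rsp.raise_for_status()
        return rsp.json()


class FakeResponse(object):
    """Stand-in for a requests response; not real server output."""

    def __init__(self, payload):
        self.payload = payload

    def raise_for_status(self):
        pass

    def json(self):
        return self.payload


def fake_fetch(url, headers=None):
    # Echo back what would have been requested.
    return FakeResponse({'url': url, 'host': headers['host']})


client = H5servClient(
    'testdata04.hdfgroup.org', 'http://crow:5000', fetch=fake_fetch)
print(client.get('/datasets'))
```

To talk to a real server, leave out the fetch argument. Passing the function in as a parameter is only a convenience for trying the class without a network connection.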
Here is a similar example written in Node.js:

    #!/usr/bin/env node

    var http = require('http');
    var log = console.log;

    function do_request(path, cb) {
        var opt = {};
        opt.hostname = 'crow';
        opt.port = 5000;
        opt.method = 'GET';
        opt.headers = {host: 'testdata04.hdfgroup.org'};
        opt.path = path;
        log('opt: ' + JSON.stringify(opt));
        var req = http.request(opt, function (response) {
            var body = '';
            // A response can arrive in several chunks, so collect them all
            // and process the complete body when the response ends.
            response.on('data', function (chunk) {
                body += chunk;
            });
            response.on('end', function () {
                log('-----\nbody: ' + body);
                if (cb !== null) {
                    cb(body);
                }
            });
        });
        req.on('error', function(e) {
            log('request error: ' + e.message);
        });
        req.end();
    }

    function test() {
        do_request('/', null);
        do_request('/groups', null);
        do_request('/datasets', null);
        do_request('/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null);
        do_request(
            '/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value',
            function(data) {
                var content, values;
                content = JSON.parse(data);
                values = content.value;
                log('-----\nvalues: ' + values);
            });
    }

    test();

The HTTP requests in the above example are asynchronous, and, therefore, the results may not come out in the same order as our calls to do_request. Here is an example that uses a recursive loop to execute these operations in serial order:

    #!/usr/bin/env node

    var http = require('http');
    var log = console.log;

    var args = [
        ['/', null],
        ['/groups', null],
        ['/datasets', null],
        ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null],
        ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value', function(data) {
            var content, values;
            content = JSON.parse(data);
            values = content.value;
            log('-----\nvalues: ' + values);
        }],
    ];

    function do_request(args, idx) {
        if (idx < args.length) {
            var path = args[idx][0],
                cb = args[idx][1],
                opt = {};
            opt.hostname = 'crow';
            opt.port = 5000;
            opt.method = 'GET';
            opt.headers = {host: 'testdata04.hdfgroup.org'};
            opt.path = path;
            log('opt: ' + JSON.stringify(opt));
            var req = http.request(opt, function (response) {
                var body = '';
                // Collect every chunk, then act once the response is complete.
                response.on('data', function (chunk) {
                    body += chunk;
                });
                response.on('end', function () {
                    log('-----\nbody: ' + body);
                    if (cb !== null) {
                        cb(body);
                    }
                    // Start the next request only after this one has finished.
                    do_request(args, idx + 1);
                });
            });
            req.on('error', function(e) {
                log('request error: ' + e.message);
            });
            req.end();
        }
    }

    function test() {
        do_request(args, 0);
    }

    test();

Notice that, in the example above, we do not make the recursive call to do_request until the response.on callback for the current request has been called.
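For comparison, the ordering problem does not come up in the Python scripts earlier in this document, because requests.get does not return until the response has arrived, so a plain loop already runs the requests one at a time. Here is a minimal sketch of the same path-and-callback table in Python. The names run_in_order, show_values, and fake_fetch are hypothetical, and fake_fetch returns made-up placeholder data rather than real server output, so that the example runs without a server:

```python
def run_in_order(steps, fetch):
    """Call fetch(path) for each (path, callback) pair, one at a time."""
    results = []
    for path, callback in steps:
        data = fetch(path)      # blocks until this response is available
        results.append((path, data))
        if callback is not None:
            callback(data)
    return results


seen = []


def show_values(data):
    seen.append(data['value'])
    print('values: {}'.format(data['value']))


def fake_fetch(path):
    # Stand-in for a real request; the data returned here is made up.
    if path.endswith('/value'):
        return {'value': [1, 2, 3]}
    return {'path': path}


STEPS = [
    ('/', None),
    ('/groups', None),
    ('/datasets', None),
    ('/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', None),
    ('/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value', show_values),
]

results = run_in_order(STEPS, fake_fetch)
for path, data in results:
    print('{} -> {}'.format(path, data))
```

To run this against a real server, you could pass the get method of an H5servRequest instance (from the class shown earlier) in place of fake_fetch.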