How SQL can unify entry to APIs


Within the authentic proposal for the World Large Internet, Tim Berners-Lee wrote:

A generic software might maybe be made to permit any database which makes use of a business DBMS to be displayed as a hypertext view.

We did get these hypertext views, within the type of APIs, however not in a generic means. Database-backed net purposes grew APIs that yielded numerous XML after which JSON outputs. Programming languages grew libraries to assist builders devour these outputs. Studying to make use of every particular person API was difficult, becoming a member of outputs from a number of APIs much more so. 

In 2009 I used to be constructing a system to mix calendar info from many sources. It requested the identical query of all of them: What occasions are scheduled for a given place and time? To do this it had to make use of a half-dozen APIs, every requiring a unique option to make a request and unpack the response.

So once I realized about Challenge Astoria I used to be a right away fan. Astoria was the generic hypertext view of databases that we wanted. With Astoria layered on prime, each database might robotically present a default API. If the half-dozen programs I used to be querying for occasions supported one thing like Astoria, none would have wanted to invent bespoke APIs and all can be queryable in the identical constant means.

The concept matured as Open Information, aka OData, an OASIS normal since 2014. In precept any database-backed net app might now sport an “OData head” that would supply a default API, requiring no code to be written by builders of the app, and no new request/response protocols to be realized by builders utilizing the API. 

In apply that largely hasn’t occurred. 

Greater than ever, software program building requires builders to compose options utilizing a rising proliferation of APIs. Usually there’s a library to wrap every API in your programming language of selection, so that you’re spared the hassle of creating uncooked REST calls and parsing the outcomes. However every wrapper has its personal means of representing outcomes, so when composing a multi-API answer it’s important to normalize these representations. Since combining outcomes occurs in a language-specific means, your answer is tied to that language. And if that language is JavaScript or Python or Java or C# then it’s arguably not essentially the most common and highly effective option to question (or replace) a database.

What’s one of the best ways? It’s been hiding in plain sight all alongside: SQL. Battle-hardened for many years, and developed past the pure relational mannequin, SQL has restablished itself because the preeminent interface to information. And it’s positioned to turn out to be the API unifier that we’d like greater than ever.

Overseas information wrappers for APIs

Steampipe ( is an open-source software that fetches information from numerous APIs and makes use of it to populate tables in a database. The database is Postgres, which is, these days, a platform on which to construct every kind of database-like programs by creating extensions that deeply customise the core. One class of Postgres extension, the overseas information wrapper (FDW), creates tables from exterior information. Steampipe embeds an occasion of Postgres that masses an API-oriented overseas information wrapper. The FDW in flip communicates with a rising household of plug-ins that devour APIs and feed the info by way of the FDW into Postgres tables.

To make these abstractions concrete, ask your self how you’ll clear up the next drawback. You use public AWS providers, and also you’d wish to know if any of their endpoints present up as weak in Shodan, a service that scans public endpoints. Your answer in all probability seems one thing like this:

  1. Discover ways to use the AWS API that finds your endpoints
  2. Discover ways to use the Shodan API the checks your endpoints
  3. Discover ways to mix these two APIs to reply the query

Right here’s the Steampipe answer.

  aws_ec2_instance a
left be a part of
  shodan_host s on a.public_ip_address = s.ip
the place
  a.public_ip_address shouldn't be null;
| instance_id         | ports    | vulns              | security_groups                                             |
| i-0dc60dd191cb84239 | <null>   | <null>             | [{"GroupId":"sg-042fe79169eb42818","GroupName":"lockdown"}] |
| i-042a51a815773780d | [80,22]  | <null>             | [{"GroupId":"sg-042042bac705630f4","GroupName":"bastion"}]  |
| i-00cf426db9b8a58b6 | [22]     | <null>             | [{"GroupId":"sg-0423f79169eb42818","GroupName":"default"}]  |
| i-0e97f373db42dfa3f | [22,111] | ["CVE-2018-15919"] | [{"GroupId":"sg-0423f79169eb42818","GroupName":"default"}]  |

The 2 tables joined listed below are offered by Steampipe plug-ins for AWS and Shodan. The primary maps the sprawling catalog of AWS APIs to (presently) 269 tables; the second gives a dozen Shodan tables.

You configure these plug-ins to authenticate to the APIs with the identical credentials you’d want if utilizing the APIs straight. However you don’t must know the rest about underlying REST calls, or libraries wrapped round them. The answer is created from tables that work the identical means inside and throughout APIs. You examine them (aws_ec2_instance, shodan_host) to find the names of their columns, and also you be a part of them within the time-honored SQL means.

A plug-in for each API

Clearly this two-API answer will depend on the existence of plug-ins to map each APIs to tables. If each providers applied OData that wouldn’t be crucial. The APIs would robotically be queryable, albeit arguably not joinable with the magnificence that SQL affords. However these two providers, like most, don’t current a unified interface to their APIs. In order that unification must be layered on prime of them. Steampipe’s plug-in SDK smooths the way in which for plug-in authors by abstracting connection administration, retry logic, caching, and naturally the mapping of API outcomes to tables. 

Steampipe plug-ins are written in Go. They leverage the excellent catalog of Go libraries that wrap APIs. However solely plug-in authors must know that. As a developer working with Steampipe you solely see tables, and also you solely write SQL. These days, as a result of SQL has developed, that features options like Widespread Desk Expressions (aka CTEs or WITH clauses) and JSON columns. But it surely’s nonetheless simply SQL.

Can such plug-ins feasibly be constructed for each API? Nicely, Steampipe launched in early 2021 with a handful of plug-ins, immediately there are greater than 60, and the quantity is rising shortly. Thus far most have been written by the core staff however exterior contributions are rising. Because of the plug-in SDK, which does the heavy lifting, it’s easy to construct a plug-in that maps an API to a set of tables. 

Standing on the shoulders of Postgres

By embedding Postgres, Steampipe inherits all of its capabilities. So, for instance, you’ll be able to be a part of API-sourced overseas tables with native Postgres tables. And whereas Steampipe’s major profit is dwell querying of APIs, you’ll be able to create materialized views to persist that information and write Postgres capabilities to function on it. You’ll be able to even load different Postgres extensions and use them with Steampipe tables. Postgres’s built-in tablefunc extension, for instance, can do crosstabs, in SQL, with spreadsheet information from Steampipe’s Google Sheets plug-in.

One other good thing about embedding Postgres: Any Postgres-compatible API shopper can connect with Steampipe. That features command-line instruments like psql and GUI-based ones like Tableau, Energy BI, Metabase, and Superset that convey visualization and interactivity to dwell API information. 

Postgres could by no means be as broadly embedded as the ever-present SQLite but it surely’s extra succesful, and more and more it’s used to energy an ecosystem of interoperable instruments for working with information. Steampipe extends Postgres to ship unified entry to APIs, and a typical SQL atmosphere through which to cause concerning the information they supply.

Jon Udell is the group lead for Steampipe at Turbot.

New Tech Discussion board gives a venue to discover and talk about rising enterprise expertise in unprecedented depth and breadth. The choice is subjective, primarily based on our choose of the applied sciences we consider to be essential and of biggest curiosity to InfoWorld readers. InfoWorld doesn’t settle for advertising collateral for publication and reserves the precise to edit all contributed content material. Ship all inquiries to

Copyright © 2022 IDG Communications, Inc.