An Intro to Qri Command Line

Published in qri.io · 6 min read · Aug 14, 2020

Qri is ultimately about collaborating on datasets, and the command line is a great place to start as it shows off a lot of what qri can do. We cut this video (and accompanying blog post) to help you get started.

Download Qri CLI & Get Started

Download the latest Qri command line client: https://github.com/qri-io/qri/releases, or just:

curl -fsSL https://qri.io/install.sh | sh

Quickstart: https://qri.io/docs/getting-started/qri-cli-quickstart

Video Breakdown

0:47 Save a dataset

Once you’ve got qri installed, the first step is creating a dataset. In a sense, I’m introducing a file I already have locally (synths.csv) to qri via the ‘save’ command:

$ qri save --body synths.csv

Proof the dataset saved:

More on that ‘body’ flag later. For now it’s worth knowing that my synths.csv file has been saved to a qri repository.
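If you don’t have a CSV handy, a stand-in file is enough to try the save step. The sketch below writes a tiny synths.csv; the column names and rows are placeholders I made up for illustration, not the contents of the real Synths dataset.

```shell
# Write a small CSV to stand in for the author's synths.csv.
# Column names and rows are illustrative assumptions, not the real data.
cat > synths.csv <<'EOF'
name,manufacturer,year
Example Synth A,Example Maker,1975
Example Synth B,Example Maker,1983
Example Synth C,Another Maker,1991
EOF

# Show the header row so you can confirm the file looks right.
head -n 1 synths.csv

# With qri installed, this is the file the command above would save:
#   qri save --body synths.csv
```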

1:43 Your Local Qri repo

To see what datasets you have in your qri repo at any given point, run ‘qri list’:

$ qri list

If you’ve already created a few datasets, as I have, you’ll get something like this:

2:20 Push to qri.cloud

Next step is to push that dataset to qri cloud using the ‘qri push’ command:

$ qri push me/synths

Once you’ve pushed the dataset, you (and the world) can view it on qri cloud. In this case, the URL generated would be https://qri.cloud/b5/synths. Voilà:

3:25 Dataset Components

Qri datasets are more than just data. This is our point of view on what makes others’ datasets easier to work with. If you’re going to use someone else’s data/information, you need to be able to understand what it is for yourself. Qri dataset components help you do this.

See also: What are dataset components on Qri?

Using HTML as an analogy, the ‘body’ of a qri dataset is kind of like the ‘body’ of an HTML page. The ‘body’ is…the data, the stuff we care the most about.

You can read all about the other Qri dataset components here.

4:05 Version Control

Qri is, at its core, about version control. This is the most important difference between Qri and any other dataset tool. Every dataset in qri is versioned, and new versions can be created very easily, again, with the save command. In the example below, we’re creating a new version of our Synths dataset by adding two new components, a readme.md file and a meta.json file with the following command(s):

$ qri save --file readme.md --file meta.json me/synths

You can see that new version (and eventually all previous versions) with the ‘qri log’ command:

$ qri log me/synths

The result is a new version!

You may also notice qri has inferred a commit message, “updated meta and readme.” This is particularly useful when qri is working inside of data pipelines, where machines are doing much of the data movement and manipulation, and are unable to add human-readable context to key changes.

Using the push command you’re already familiar with, you can now share this new version (enhanced with metadata and a readme) with the world on qri cloud.
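For reference, the two component files passed to the save command in this section are ordinary files you write yourself. Here is a minimal sketch of what they might look like. The text is placeholder content, and the meta.json field names (title, description, keywords) are my assumption about a reasonable minimal set, so check the dataset components documentation for the full list.

```shell
# Hypothetical readme: plain markdown describing the dataset.
cat > readme.md <<'EOF'
# Synths

A small catalog of synthesizers (placeholder text).
EOF

# Hypothetical metadata; the field names here are assumptions for illustration.
cat > meta.json <<'EOF'
{
  "title": "Synths",
  "description": "A small catalog of synthesizers (placeholder text).",
  "keywords": ["synthesizers", "music"]
}
EOF

# Confirm both files exist before saving.
ls readme.md meta.json

# With qri installed, each file is passed with its own --file flag:
#   qri save --file readme.md --file meta.json me/synths
```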

5:46 Structure

Qri automatically infers and assigns a structure (or schema) to each dataset, which defines how the dataset becomes machine-readable. Structure includes the format (CSV, JSON) and the schema (JSON Schema, used by OpenAPI). In this case qri correctly identified the data (body) content in columns 1 and 2 as strings, and column 3 as integers.

A view on the Structure component of a dataset on qri.cloud
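To get a feel for what “inferring a structure” means, here is a rough sketch that guesses a type for each column of a CSV using awk. It is only an approximation of the idea, not qri’s actual inference logic. The file sample.csv, its contents, and the infer_types helper are all hypothetical, and the script assumes a simple CSV with no quoted commas.

```shell
# Placeholder CSV: two text columns and one numeric column (values are made up).
cat > sample.csv <<'EOF'
name,manufacturer,year
Example Synth A,Example Maker,1975
Example Synth B,Another Maker,1983
EOF

# infer_types FILE: for each column, print "integer" if every data value
# is a whole number, otherwise "string". The first row is treated as a header.
infer_types() {
  awk -F',' '
    NR == 1 {
      n = NF
      for (i = 1; i <= n; i++) { name[i] = $i; type[i] = "integer" }
      next
    }
    {
      for (i = 1; i <= n; i++) if ($i !~ /^-?[0-9]+$/) type[i] = "string"
    }
    END {
      for (i = 1; i <= n; i++) print name[i] ": " type[i]
    }
  ' "$1"
}

infer_types sample.csv
# prints:
#   name: string
#   manufacturer: string
#   year: integer
```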

This comes in handy for the next step…

6:20 Dataset SQL

Let’s pretend another user named “b6” comes along. b6 can use the command ‘qri sql’ to run SQL directly against any dataset, even those b6 does not have locally.

$ qri sql

In the example below, b6 joins two datasets that have similar structures and are therefore joinable, using country codes as the primary key.

Qri then finds those datasets, pulls the latest versions down to my local repository, runs the SQL command, and spits out the join:

FUN!
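If joining on a shared key is new to you, the idea can be tried locally with the standard join utility before involving qri at all. This is only an illustration of an inner join on a country-code column, not how qri executes SQL. The file names, the three-letter codes, and the numbers are all invented placeholders.

```shell
# Two tiny placeholder tables keyed by a country code (all values invented).
cat > population.csv <<'EOF'
AAA,100
BBB,200
CCC,300
EOF

cat > gdp.csv <<'EOF'
AAA,10
CCC,30
DDD,40
EOF

# join needs both inputs sorted on the key; these already are.
# -t, sets the field delimiter; the key defaults to field 1 in both files.
join -t, population.csv gdp.csv
# prints:
#   AAA,100,10
#   CCC,300,30
```

Only the codes present in both files survive, which is why structures that agree on the key column matter.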

From here, once you have qri installed on your machine, you can give someone an SQL statement. When they run it, you’ll know the statement is being run against the latest versions of the datasets in question.

Using the qri log command…

$ qri log b5/world_bank_population

…returns a log of that dataset. This will show you which versions of the dataset you have locally (local), and which are held by others (remote), either as peers or on qri cloud.

You can easily fetch older versions and work with them directly.

8:30 Local Checkout

Turn that dataset back into individual files with the ‘checkout’ command:

$ qri checkout b5/world_bank_population

This creates a linked working directory that I can use other tools on. Here you see the components ‘broken up’ into normal, standard files (JSON, markdown, CSV) that other tools and apps know how to work with:

Qri Pull

Data changes (updates) all the time. To be assured you’re working with the latest version of a dataset, use the ‘qri pull’ command:

$ qri pull b5/world_bank_population

9:17 Conclusion

We think these features add up to more than the sum of their parts. Among other key benefits, Qri datasets are:

interoperable
easier to version
easier to move
easier to understand and contextualize as you inspect and prepare to work with the data (body).
…and ultimately easier to collaborate on, which means less work for everybody.

It can be very difficult to rely on one another’s data, so we need version histories and consistent structures to help us stay organized and informed. We hope making data easier to work with in this way will bring down the barriers that prevent us from working on data together, and allow us to build on each other’s work.

Datasets referenced:

Synth Catalog | Dataset Published on qri.cloud

I'm a synth nerd, and I've always wanted to see a "timeline" of which years synths are really blowing up. We often talk…

World Bank Population | Dataset Published on qri.cloud

( 1 ) United Nations Population Division. World Population Prospects: 2017 Revision. ( 2 ) Census reports and other…

World Bank GDP (current US$) | Dataset Published on qri.cloud

This dataset is a mirror of data published by the world bank, with two differences: * This dataset only includes…

Written by Rico Gardaphe