
tspurway/hustle

A column oriented, embarrassingly distributed relational event database.

tspurway/hustle.json
{
"createdAt": "2014-02-19T02:13:45Z",
"defaultBranch": "master",
"description": "A column oriented, embarrassingly distributed relational event database.",
"fullName": "tspurway/hustle",
"homepage": "",
"language": "Python",
"name": "hustle",
"pushedAt": "2018-04-14T02:03:05Z",
"stargazersCount": 238,
"topics": [],
"updatedAt": "2025-08-26T02:15:19Z",
"url": "https://github.com/tspurway/hustle"
}

![Hustle](doc/_static/hustle.png)

A column oriented, embarrassingly distributed, relational event database.

  • column oriented - super fast queries
  • events - write-only semantics
  • distributed insert - designed for petabyte scale distributed datasets with massive write loads
  • compressed - bitmap indexes, lz4, and prefix trie compression
  • relational - join gigantic data sets
  • partitioned - smart shards
  • embarrassingly distributed (based on Disco)
  • embarrassingly fast (uses LMDB)
  • NoSQL - Python DSL
  • bulk append-only semantics
  • highly available, horizontally scalable
  • REPL/CLI query interface
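
To make the "bitmap indexes" item concrete, here is a minimal sketch of the general idea in plain Python. It is not Hustle's implementation: it uses uncompressed Python integers as bitmaps, and the column names and values are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an int whose set bits mark matching rows."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical columns, each stored separately as in a column-oriented layout.
ad_id = [30010, 30011, 30010, 30012, 30010]
site = ["a.com", "b.com", "a.com", "a.com", "b.com"]

ad_index = build_bitmap_index(ad_id)
site_index = build_bitmap_index(site)

# (ad_id == 30010) & (site == "a.com") becomes a single bitwise AND.
matches = rows_of(ad_index[30010] & site_index["a.com"])
```

Because each predicate resolves to a bitmap, combining predicates costs one bitwise operation rather than a scan of every row, which is one reason bitmap indexes pair well with column storage.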
Example query:

select(impressions.ad_id, impressions.date, h_sum(pix.amount), h_count(),
       where=((impressions.date < '2014-01-13') & (impressions.ad_id == 30010),
              pix.date < '2014-01-13'),
       join=(impressions.site_id, pix.site_id),
       order_by=impressions.date)
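
For readers new to the DSL, the following plain-Python sketch shows roughly what a query of this shape computes. It does not use Hustle at all. The sample rows are hypothetical, and it assumes that the aggregates group by the non-aggregated columns (`ad_id` and `date`), as they would in SQL.

```python
from collections import defaultdict

# Hypothetical sample rows; the column names follow the query above.
impressions = [
    {"ad_id": 30010, "date": "2014-01-11", "site_id": 1},
    {"ad_id": 30010, "date": "2014-01-12", "site_id": 2},
    {"ad_id": 30010, "date": "2014-01-14", "site_id": 1},  # too late
    {"ad_id": 30011, "date": "2014-01-11", "site_id": 1},  # wrong ad
]
pix = [
    {"site_id": 1, "date": "2014-01-10", "amount": 5},
    {"site_id": 1, "date": "2014-01-12", "amount": 7},
    {"site_id": 2, "date": "2014-01-13", "amount": 9},  # too late
    {"site_id": 2, "date": "2014-01-09", "amount": 2},
]

CUTOFF = "2014-01-13"  # ISO dates compare correctly as strings

# where=: filter each table on its own predicate.
imps = [r for r in impressions if r["date"] < CUTOFF and r["ad_id"] == 30010]
pixels = [r for r in pix if r["date"] < CUTOFF]

# join=: bucket the pix rows by site_id for a hash join.
pix_by_site = defaultdict(list)
for p in pixels:
    pix_by_site[p["site_id"]].append(p)

# Aggregates: sum(amount) and row count per (ad_id, date) group.
totals = defaultdict(lambda: [0, 0])
for imp in imps:
    for p in pix_by_site[imp["site_id"]]:
        group = totals[(imp["ad_id"], imp["date"])]
        group[0] += p["amount"]
        group[1] += 1

# order_by=: sort the grouped rows by impression date.
result = sorted(
    ((ad, date, s, n) for (ad, date), (s, n) in totals.items()),
    key=lambda row: row[1],
)
```

With this sample data, `result` holds one row per impression date, each carrying the summed `amount` and the number of joined rows.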

After cloning this repo, here are some considerations:

  • you will need Python 2.7 or higher - note that it probably won’t work on 2.6 (has to do with pickling lambdas…)
  • you need to install Disco 0.5 and its dependencies - get that working first
  • you need to install Hustle and its ‘deps’ thusly:
cd hustle
sudo ./bootstrap.sh

Please refer to the Installation Guide for more details.
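
Before running the bootstrap script, the first two prerequisites can be checked with a short script. This is a hypothetical helper, not part of Hustle. It only confirms that a module named `disco` is importable and does not verify that the installed Disco is version 0.5.

```python
import sys


def preflight():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < (2, 7):
        problems.append(
            "Python 2.7 or higher is required, found %d.%d"
            % sys.version_info[:2]
        )
    try:
        __import__("disco")  # Disco's Python package is named 'disco'
    except ImportError:
        problems.append("the 'disco' package is not importable; "
                        "install Disco first")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("problem: " + problem)
```

Run it with the same interpreter you intend to use for Hustle. No output means both checks passed.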

Hustle User Guide

Hustle Mailing List

Special thanks to the following open-source projects:
