We wrote Essentia to help solve the day-to-day ‘big data’ analysis problems we faced when processing different types of data from different types of users. Specifically, we needed a framework that would allow us to quickly:
Essentia combines fast, scalable data processing operations with an in-memory NoSQL database to simplify many common problems encountered by data engineers and scientists. These pages are meant to train users to use Essentia and integrate it into their data processing workflows. The Essentia Forums are another useful resource that supplements this documentation.
We maintain a GitHub repository that contains test data and source code for some of the tutorials and use cases you will find in this documentation.
To get started, pull the tutorial repository via:
$ git clone https://github.com/auriq/EssentiaPublic.git
The data and scripts relevant for most of the documentation tutorials are under tutorials, and those relevant for the examples and integrations are under case studies.
To get started, go to Essentia Tutorials.
Essentia is made to run in the cloud, where we can spin up as many worker nodes as needed to scale to difficult problems. Currently, the Amazon cloud is supported. Essentia can also be used on an on-premises cluster; contact us for details. We also offer a single-node version that can be run from a desktop; however, the power of Essentia lies in the cloud. If you want to run Essentia on the Microsoft cloud, you can install this single-node version on an Azure Linux VM.
Note
The tutorials assume you are using the bash shell.
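If you are unsure which shell you are in, a quick check along these lines may help. This is a minimal sketch and not part of Essentia: it relies only on the BASH_VERSION variable that bash sets for itself, and the shell_name variable is a hypothetical name used here for illustration.

```shell
# BASH_VERSION is set by bash itself; other shells normally leave it unset.
if [ -n "${BASH_VERSION:-}" ]; then
    shell_name="bash"
    echo "Running bash ${BASH_VERSION}"
else
    shell_name="other"
    echo "Not running bash; start it by typing: bash"
fi
```

If the check reports another shell, starting bash before running the tutorial commands should be enough.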