ASTERIX: A Highly Scalable Parallel Platform for Semi-structured Data Management and Analysis


Overview

The ASTERIX project is developing new technologies for ingesting, storing, managing, indexing, querying, analyzing, and subscribing to vast quantities of semi-structured information. The project is combining ideas from three distinct areas – semi-structured data, parallel databases, and data-intensive computing – to create a next-generation, open source software platform that scales by running on large, shared-nothing commodity computing clusters. ASTERIX targets a wide spectrum of semi-structured information, from “data” use cases – where information is well-tagged and highly regular – to “content” use cases – where data is irregular and much of each datum is textual. ASTERIX is taking an open stance on data formats and addressing research issues including highly scalable data storage and indexing, semi-structured query processing on very large clusters, and merging parallel database techniques with today’s data-intensive computing techniques to support performant yet declarative solutions to the problem of analyzing semi-structured information.

                                  

ASTERIX Eco-System

ASTERIX Applications/Customers

 

Acknowledgement: This project is supported by an eBay matching grant, one Facebook Fellowship Award, NSF Awards No. IIS-0910989, IIS-0910859, and IIS-0910820, a UC Discovery grant, three Yahoo! Key Scientific Challenge Awards, and generous industrial gifts from Google, HTC, Microsoft, and Oracle Labs.


For any questions regarding this project, please send email to asterix AT ics.uci.edu.