Storing and Managing Big Data
Really Big!

What if you have hundreds of terabytes – or even multiple petabytes – of raw data that you need to keep online and process on a regular basis? Storing that data in a traditional relational database can be prohibitively expensive. If the pattern of access is write-once/read-many, then Ab Initio’s Indexed Compressed Flat File (ICFF) facility may be for you.

The Ab Initio® ICFF facility makes it easy to take hundreds of gigabytes to multiple petabytes of raw data, compress it, index it, and store it in a set of traditional flat files that may be dispersed across multiple disks, possibly on different servers. ICFFs require very little temporary storage, and because the standard (non-proprietary) compression utilities used by ICFFs often achieve compression ratios of 10:1, an ICFF can easily need only one-tenth, if not one-twentieth, of the physical storage that a database would. That much less disk can mean real money savings.
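The general idea of compressing records in blocks and keeping a small index over them can be illustrated with a minimal Python sketch. This is not Ab Initio's implementation or file format: the class `IndexedCompressedFile`, its methods, the tab-separated record layout, and the use of `zlib` as the standard compression library are all assumptions made for illustration, and an in-memory buffer stands in for a flat file on disk.

```python
import bisect
import io
import zlib


class IndexedCompressedFile:
    """Toy block-compressed, indexed record store (hypothetical, illustrative only)."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # records per compressed block
        self.data = io.BytesIO()      # stands in for a flat file on disk
        self.index = []               # one (first_key, offset, length) entry per block
        self.pending = []             # records received but not yet compressed

    def append(self, key, value):
        # Assumes write-once data whose string keys arrive in ascending order
        # and whose keys and values contain no tab or newline characters.
        self.pending.append((key, value))
        if len(self.pending) >= self.block_size:
            self.flush()

    def flush(self):
        # Compress the pending records into one block and index its first key.
        if not self.pending:
            return
        raw = "\n".join(f"{k}\t{v}" for k, v in self.pending).encode("utf-8")
        blob = zlib.compress(raw)
        offset = self.data.seek(0, io.SEEK_END)
        self.data.write(blob)
        self.index.append((self.pending[0][0], offset, len(blob)))
        self.pending = []

    def lookup(self, key):
        # Records that are not yet compressed are visible immediately.
        for k, v in self.pending:
            if k == key:
                return v
        # Binary-search the index for the one block that could hold the key.
        first_keys = [entry[0] for entry in self.index]
        i = bisect.bisect_right(first_keys, key) - 1
        if i < 0:
            return None
        _, offset, length = self.index[i]
        self.data.seek(offset)
        raw = zlib.decompress(self.data.read(length)).decode("utf-8")
        for line in raw.split("\n"):
            k, v = line.split("\t", 1)
            if k == key:
                return v
        return None


store = IndexedCompressedFile(block_size=4)
for n in range(10):
    store.append(f"{n:04d}", f"value-{n}")

print(store.lookup("0009"))  # found among records not yet compressed
store.flush()
print(store.lookup("0007"))  # found by decompressing a single block
```

In this sketch only one block is decompressed per lookup, so the cost of a read stays small however large the file grows. A real system would also have to persist the index and spread blocks across disks and servers, which this toy example does not attempt.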

The performance of ICFFs is also extraordinary. Data can be added to an ICFF at a rate of hundreds of thousands of records per second, in real time. These records can be looked up immediately after they are received, with no waiting on big bulk loads that might take minutes or hours. Indexed lookup performance is limited only by the total number of disk arms that can move at any one moment. Full table scans enable parallel streams of data from each data partition to flow into Ab Initio applications, and these applications can run in parallel across as many CPUs as desired. ICFFs can even be queried via SQL, with support for federated queries across an ICFF and datasets stored in databases and/or flat files. For the right applications, an ICFF can’t be beat.