In April we released cstore_fdw, the first columnar store foreign data wrapper for PostgreSQL. Our initial blog post received lots of interest in cstore_fdw but also lots of questions. In this blog post we're going to attempt to answer some of the more common performance related questions:
Before we start we will briefly touch on the test environment. All of the tests below were done using data and some representative queries generated by the industry standard TPC-H benchmark. We ran our tests on EC2 hosts using 10 GB of generated TPC-H data loaded into CitusDB version 3.0 or PostgreSQL version 9.3.4. All database files were stored on SSD drives. CitusDB and PostgreSQL were configured identically with the only variations from the stock configuration being setting shared_buffers to 2 GB and work_mem to 512 MB.
cstore_fdw allows users to enable compression on their tables to reduce the size of the data at rest on disk and to reduce the amount of I/O required to read the table. We won't dive into the details of the compression technique used in this post but a short description is that each column of data is broken up into blocks of ten thousand values which are compressed using PostgreSQL's built-in LZ compression. Compression has four primary benefits:
In order to test how well our compression technique works we took 10 GB of raw data generated by TPC-H and loaded that into a standard PostgreSQL database and into a PostgreSQL database that uses compressed cstore_fdw tables. We then compared the sizes of the tables in each of the databases. The table below summarizes the compression ratios we saw for each table:
|Table Type||Lineitem (9.1 GB)||Orders (2.1 GB)||Part (0.3 GB)||Partsupp (1.4 GB)||Customer (0.3 GB)|
Depending on the TPC-H table compression ratios varied between 2.5x and 16x. This type of variability is expected depending on the size of the table and the type of the data contained in it. One other interesting note is that our 13.1 GB PostgreSQL database compressed down to only 3.5 GB when using our compressed storage format.
Using compression with cstore_fdw we would expect users to see compression ratios of 3-4x for their tables.
cstore_fdw improves query times by reducing the amount of I/O in two ways:
We wanted to evaluate how much of an improvement in query times users of cstore_fdw were likely to see in the real world. To do this we used the data and queries from the TPC-H benchmark as these are designed to represent real-world use cases. Our test consisted of loading 10 GB of generated TPC-H data into both a standard PostgreSQL database and a PostgreSQL database that used cstore_fdw storage. We then flushed the OS cache to make sure that all data was coming from SSDs. Next, we ran a selection of TPC-H queries against each storage engine.
As shown above the improvement from cstore_fdw varies depending on the query. In the best case query runtimes dropped to half their previous levels, in the worst case the improvement is only 11% (with compression enabled).
Explaining this variation requires a bit of an explanation of how databases work. The total query time can be thought of as the sum of two types of work: time spent doing I/O and time spent doing computation. Queries that use complex filters, aggregations, or other functions spend increased amounts of time doing computation. For these types of queries it can be difficult to significantly affect overall query time by reducing I/O as the majority of the query time is spent doing non-I/O related work. However, for many common queries there is relatively little computation and reducing I/O does have a significant impact on overall query times.
Across the set of TPC-H queries we see that when using cstore_fdw with compression every query executes faster and users can expect their analytic queries to run 20-30% faster. We'll show how to turn this 20-30% improvement into a 60-70% one later in our investigation on skiplist indexes.
CitusDB enables its users to create a horizontally scalable PostgreSQL database that can leverage the PostgreSQL extension ecosystem. To determine the benefits of cstore_fdw for CitusDB users we set up an experiment identical to the one we did for PostgreSQL except this time we compared two CitusDB clusters; one using PostgreSQL's native storage format for the TPC-H data and one using cstore_fdw to store the same TPC-H data.
|CitusDB + cstore||13.664943||13.34371||6.176959||14.522777||18.638935||17.483885||9.619221||9.401148||17.571072|
|CitusDB + cstore (compressed)||12.476285||12.060501||5.600474||12.884196||17.033522||16.009109||7.595766||8.272748||12.080371|
The results are similar to those when using cstore_fdw on PostgreSQL, however there is a significant difference in the amount of improvement seen. When using PostgreSQL we saw an average improvement to query times of 28% whereas when using CitusDB we see an average improvement to query times of 66%. This occurs because CitusDB enables you simultaneously use all cores on a machine to answer a query. In doing so CitusDB introduces more demand and hence more contention for limited available disk I/O. As a result the reduction in I/O times that cstore_fdw brings contributes more significantly to improving query times. Users running cstore_fdw on PostgreSQL in an environment with multiple clients querying in parallel should expect similar results.
cstore_fdw users with CitusDB can expect to see their performance improve by 60-70% for their analytic query workloads.
Indexing is a common technique for improving query performance in databases and PostgreSQL has a number of indexing techniques available for its users. However, when using a foreign data wrapper, like cstore_fdw, users can no longer leverage PostgreSQL's built-in indexing techniques. In order to give users of cstore_fdw the benefits of indexes we adapted the indexing technology from ORC. The ORC indexing system works by breaking each column's data into blocks and storing the min and max values of each block. In order to determine how much of an improvement this indexing technique offers we used the TPC-H data loaded onto a couple PostgreSQL databases using cstore_fdw. The most effective use of ORC's indexing system requires that the column data be sorted. Into one of these databases we loaded sorted data and into the other unsorted data was loaded. As previously we then ran the TPC-H queries against each of these databases. It's worth noting that all data was sorted by its natural ordering, meaning for example that orders were sorted by order date.
One thing to note about these results is that the tests were done without compression enabled. Users should expect to see additional improvements from enabling compression as described in previous sections.
When using cstore_fdw with sorted input users should expect a 30-35% improvement to query times compared to using unsorted input and a 60-70% improvement compared to PostgresSQL.
After benchmarking cstore_fdw in a number of analytic query scenarios we learned that on SSDs:
We hope this post left you interested to see what cstore_fdw and CitusDB can do for your data. Feel free to contact us with questions or comments using our Contact Us form or by simply emailing us at engage @ citusdata.com.
We are excited to open source our columnar store extension for PostgreSQL and share it with the community! Columnar stores bring notable benefits for analytic workloads, where data is loaded in batches.
This columnar store extension uses the Optimized Row Columnar (ORC) format for its data layout. ORC improves upon the RCFile format developed at Facebook, and brings the following benefits:
Further, we used the Postgres foreign data wrapper APIs and type representations with this extension. This brings:
It's worth noting that the columnar store extension is self-contained. If you're a PostgreSQL user, you can get the entire source code and build using the instructions on our GitHub page. You can even join columnar store and regular Postgres tables in the same SQL query.
Now, let's see how all this fits together with an example. For this, we start by downloading customer review data from Amazon for the year 1998. In this table, each review represents an event. We find that events or fact tables, or tables that have a large number of columns are good fits for the columnar store.
wget http://examples.citusdata.com/customer_reviews_1998.csv.gz gzip -d customer_reviews_1998.csv.gz
Next, let's connect to the PostgreSQL database and create a foreign table.
CREATE EXTENSION cstore_fdw; CREATE SERVER cstore_server FOREIGN DATA WRAPPER cstore_fdw; CREATE FOREIGN TABLE customer_reviews ( customer_id TEXT, review_date DATE, review_rating INTEGER, review_votes INTEGER, review_helpful_votes INTEGER, product_id CHAR(10), product_title TEXT, product_sales_rank BIGINT, product_group TEXT, product_category TEXT, product_subcategory TEXT, similar_product_ids CHAR(10) ) SERVER cstore_server OPTIONS(filename '/opt/citusdb/3.0/cstore/customer_reviews.cstore', compression 'pglz');
Note that we specified the compression method as the LZ-class algorithm that's already built into PostgreSQL. The cstore extension will use this algorithm to compress column values in groups of 10K; and this group size is also configurable.
Next, we can use PostgreSQL's copy command to load data into the columnar store.
COPY customer_reviews FROM '/home/user/customer_reviews_1998.csv' WITH CSV;
Finally, let's run an example SQL query on the column store table.
-- Do we have a correlation between a book's title's length and its review ratings? SELECT width_bucket(length(product_title), 1, 50, 5) title_length_bucket, round(avg(review_rating), 2) AS review_average, count(*) FROM customer_reviews WHERE product_group = 'Book' GROUP BY title_length_bucket ORDER BY title_length_bucket;
This example shows that the columnar store extension works nicely with your PostgreSQL database. The natural follow up question is, what are the benefits? To quantify them, we ran some preliminary numbers using the industry standard TPC-H benchmark.
We started by measuring compression ratios for the tables in this benchmark and found that they compressed by 3.5x with pglz. This means that you can now fit 3.5x of your working set into memory. Also, when going to disk, you read proportionally less data. Finally, if you're using SSDs, you save notably from storage costs.
After looking at compression ratios, we next measured query run times on an m1.xlarge instance with rotational disks. We also flushed the page cache before each test to see the impact on disk I/O. Further, we ran Analyze on each foreign table so that PostgreSQL has the statistics it needs to choose the best query plan.
|Table Type||TPC-H 3||TPC-H 5||TPC-H 6||TPC-H 10|
This first diagram shows four representative queries from the TPC-H benchmark, and highlights query run-times improving by about 2x. It's worth noting that the tables in this benchmark have 10-15 columns each and we see these benefits grow as the number of columns increase to 100s.
The second diagram focuses on disk I/O, and shows total volume of data read from disk with regular Postgres tables, cstore tables, and cstore tables with pglz compression. The benefits of reading only relevant columns are more apparent here. With pglz, cstore reduces data read from disk by more than 10x.
|Table Type||TPC-H 3||TPC-H 5||TPC-H 6||TPC-H 10|
Besides all this, we are really excited by how we can further improve upon this extension! Some features that we're thinking about include new compression methods, better query cost estimation, improved read/write performance, and checksums. We outline these features in our GitHub page and welcome your opinions.
Finally, if you start using the columnar store and need for it to scale out, CitusDB could easily take care of that. Citus extends Postgres to provide distributed queries across a cluster; and we're actively working on improving it. We also have a few cool projects in the works around cstore and more. So please do get in touch with us at engage @ citusdata.com to tell us what you think!
Got comments? Join the discussion on Hacker News.