Citus Blog

Articles tagged: Postgres

Schema-based sharding comes to PostgreSQL with Citus

Written by Onur Tirtir | July 31, 2023

Citus, a database scaling extension for PostgreSQL, is known for its ability to shard data tables and efficiently distribute workloads across multiple nodes. With Citus 12.0, Citus introduces a very exciting feature called schema-based sharding. The new schema-based sharding feature gives you a choice of how to distribute your data across a cluster, and for some data models (think: multi-tenant apps, microservices, etc.) this schema-based sharding approach may be significantly easier!

In this blog post, we will take a deep dive into the new schema-based sharding feature, and you will learn:

Keep reading

In Postgres 15, the Postgres community released a new feature, MERGE, that modifies rows in a target table using data from a source. MERGE provides a single SQL statement that can conditionally INSERT, UPDATE, or DELETE rows, a task that would otherwise require multiple procedural language statements or workarounds such as INSERT with an ON CONFLICT clause.

In this blog post, you will get a high-level overview of how Postgres MERGE works. The post then covers some practical use cases and explains the different strategies Citus employs to handle MERGE in a distributed environment.
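
To make the idea concrete, here is a minimal Python sketch that models what a MERGE statement does, using in-memory dictionaries in place of tables. It only illustrates the conditional INSERT / UPDATE / DELETE logic and says nothing about how Postgres or Citus implement it. The function name `merge_rows`, the `deleted` flag, and the sample data are all assumptions made up for this example.

```python
from typing import Dict


def merge_rows(target: Dict[int, dict], source: Dict[int, dict]) -> Dict[int, dict]:
    """Toy model of MERGE: match source rows to target rows by key, then act per row."""
    # Work on a copy so the caller's "table" is left untouched.
    result = {key: dict(row) for key, row in target.items()}
    for key, src in source.items():
        if key in result:
            if src.get("deleted"):
                # Roughly: WHEN MATCHED AND <condition> THEN DELETE
                del result[key]
            else:
                # Roughly: WHEN MATCHED THEN UPDATE
                result[key]["qty"] = src["qty"]
        elif not src.get("deleted"):
            # Roughly: WHEN NOT MATCHED THEN INSERT
            result[key] = {"qty": src["qty"]}
    return result


# Hypothetical data: key 1 is updated, key 2 is deleted, key 3 is inserted.
target = {1: {"qty": 10}, 2: {"qty": 5}}
source = {1: {"qty": 12}, 2: {"qty": 0, "deleted": True}, 3: {"qty": 7}}
merged = merge_rows(target, source)
print(merged)  # {1: {'qty': 12}, 3: {'qty': 7}}
```

All three branches run in a single pass over the source, which is the convenience MERGE offers over separate INSERT, UPDATE, and DELETE statements.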

Keep reading

Citus 12: Schema-based sharding for PostgreSQL

Written by Marco Slot | July 18, 2023

What if you could automatically shard your PostgreSQL database across any number of servers and get industry-leading performance at scale without any special data modelling steps?

Our latest Citus open source release, Citus 12, adds a new and easy way to transparently scale your Postgres database: Schema-based sharding, where the database is transparently sharded by schema name.

Schema-based sharding gives an easy path for scaling out several important classes of applications that can divide their data across schemas:

  • Multi-tenant SaaS applications
  • Microservices that use the same database
  • Vertical partitioning by groups of tables

Each of these scenarios can now be enabled on Citus using regular CREATE SCHEMA commands. That way, many existing applications and libraries (e.g. django-tenants) can scale out without any changes, and developing new applications can be much easier. Moreover, you keep all the other benefits of Citus, including distributed transactions, reference tables, rebalancing, and more.
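
As a rough sketch of what this can look like for a multi-tenant app, the Python helper below builds the SQL an application might run when onboarding a tenant. It assumes the `citus.enable_schema_based_sharding` setting described in the Citus 12 documentation is what turns newly created schemas into distributed schemas. The helper names, the `tenant_` prefix, and the `orders` table are hypothetical and exist only for illustration.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a SQL identifier PostgreSQL-style, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def tenant_setup_sql(tenant: str) -> List[str]:
    """Return the statements to create one schema (and one example table) per tenant."""
    schema = quote_ident(f"tenant_{tenant}")
    return [
        # Assumption: with this setting on, schemas created afterwards are distributed.
        "SET citus.enable_schema_based_sharding TO on;",
        # A regular CREATE SCHEMA command; no distribution column is chosen.
        f"CREATE SCHEMA {schema};",
        # Hypothetical table; it lives inside the tenant's schema.
        f"CREATE TABLE {schema}.orders (order_id bigserial PRIMARY KEY, total numeric);",
    ]


statements = tenant_setup_sql("acme")
for statement in statements:
    print(statement)
```

The statements are only built as strings here; an application would run them through its usual PostgreSQL driver or migration tool.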

Keep reading

Introducing Path To Citus Con, a podcast for developers who love Postgres. Why? Because sometimes, something you build gets bigger than you thought it would. The monthly podcast Path To Citus Con was originally meant to be a “pre-event” to build excitement and give a hands-on experience for people who would be attending Citus Con: An Event for Postgres. The audience would get a chance to talk to speakers for the conference and hear a deep dive conversation.

It’s now its own monthly podcast with guests from around the world. Guests include people who have been deep in the world of databases and the Citus database extension to Postgres, as well as people from the Postgres community and technology more generally. It’s the human side of open source, PostgreSQL, and the many PG extensions (including Citus).

In this blog post, you’ll learn what Path To Citus Con is, how you can participate in, listen to, and read each episode, and about episodes like “Working in public on open source,” “Why giving talks at Postgres conferences matters,” and more (details below).

Keep reading

Distributed PostgreSQL has become a hot topic. Several distributed database vendors have added support for the PostgreSQL protocol as a convenient way to gain access to the PostgreSQL ecosystem. Others (like us) have built a distributed database on top of PostgreSQL itself.

For the Citus database team, distributed PostgreSQL is primarily about achieving high performance at scale. The unique thing about Citus, the technology powering Azure Cosmos DB for PostgreSQL, is that it is fully implemented as an open-source extension to PostgreSQL. It also leans entirely on PostgreSQL for storage, indexing, low-level query planning and execution, and various performance features. As such, Citus inherits the performance characteristics of a single PostgreSQL server but applies them at scale.

That all sounds good in theory, but to see whether this holds up in practice, you need benchmark numbers. We therefore asked GigaOM to run performance benchmarks comparing Azure Cosmos DB for PostgreSQL to other distributed implementations. GigaOM compared the transaction performance and price-performance of these popular managed services of distributed PostgreSQL, using the HammerDB benchmark software:

Keep reading

One of the most important improvements in Citus 11.3 is that Citus offers more reliable metadata sync. Before 11.3, when a Citus cluster had thousands of distributed objects (such as distributed tables), Citus occasionally experienced memory problems while running metadata sync. Due to these memory errors, some users with very large numbers of tables were sometimes unable to add new nodes or upgrade beyond Citus 11.0.

To address the memory issues, we added an alternative "non-transactional" mode to the current metadata sync in Citus 11.3.

The default mode for metadata sync is still the original single transaction mode that we introduced in Citus 11.0. But now in 11.3 or later, if you have a very large number of tables and you run into the memory error, you can switch to the non-transactional mode, which syncs the metadata via many transactions. While most of you who use Citus will not need to enable this alternative metadata sync mode, this is how to do it:

Keep reading

If you're building a software application that serves multiple tenants, you may have already encountered the challenges of managing and isolating tenant-specific data. That's where the django-multitenant library comes in. This library, actively used since 2017 and now downloaded more than 10K times per month, offers a simple and flexible solution for building multi-tenant Django applications.

In this blog post, we'll dive deeper into the concept of multi-tenancy and explore how django-multitenant can help you build scalable, secure, and maintainable multi-tenant applications on top of PostgreSQL and the Citus database extension. We'll also provide a practical example of how to use django-multitenant in a real-world scenario. So, if you're looking to simplify your multi-tenant development process, keep reading.

Keep reading

A developer friend of mine prefers to read about what to expect at upcoming events in the narrative form of a blog, rather than having to click in and out of different abstracts on a schedule page.

So this ultimate guide post is my gift to those of you who want to know more about the 37 talks that will be presented at this year’s 2nd annual Citus Con: An Event for Postgres 2023—and who want to read about it in blog post form.

And yes, Citus Con is virtual again this year! This means you can watch all the livestream & on-demand talks from the comfort of your very own desk—and chit-chat in the virtual hallway track on the #cituscon channel on Discord.

[Update in May 2023]: It's a wrap! The categories in this ultimate guide will help you find the talks which are most useful to you and your work/interests. Or you can jump straight to the playlist of all 37 Citus Con 2023 talks on YouTube.

So what’s on the schedule at Citus Con: An Event for Postgres 2023, exactly? Be sure to check out both tabs on the Schedule page, the Live Sessions & the On-Demand Sessions tabs, to learn about the:

Keep reading

Citus is a PostgreSQL extension that makes PostgreSQL scalable by transparently distributing and/or replicating tables across one or more PostgreSQL nodes. You can use Citus on the Azure cloud or, since the Citus database extension is fully open source, download and install it anywhere you like.

A typical Citus cluster consists of a special node called the coordinator and a few worker nodes. Applications usually send their queries to the Citus coordinator node, which relays them to worker nodes and accumulates the results. (Unless of course you’re using the Citus query from any node feature, an optional feature introduced in Citus 11, in which case the queries can be routed to any of the nodes in the cluster.)
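
The relay-and-accumulate flow can be pictured with a toy Python model. This is a deliberately simplified sketch rather than how Citus works internally: the `ToyCluster` class, the `tenant_id` distribution column, and the modulo routing (standing in for real hash-based shard placement) are all assumptions for illustration.

```python
from typing import Callable, List


class ToyCluster:
    """Toy coordinator that keeps each worker's rows in a plain Python list."""

    def __init__(self, worker_count: int) -> None:
        self.workers: List[List[dict]] = [[] for _ in range(worker_count)]

    def insert(self, row: dict) -> None:
        # Route the row by its distribution column; modulo is a stand-in for hashing.
        worker = row["tenant_id"] % len(self.workers)
        self.workers[worker].append(row)

    def count_where(self, predicate: Callable[[dict], bool]) -> int:
        # "Relay" the query: every worker counts its own matching rows...
        partial_counts = [sum(1 for row in rows if predicate(row)) for rows in self.workers]
        # ...and the coordinator accumulates the partial results.
        return sum(partial_counts)


cluster = ToyCluster(worker_count=3)
for tenant_id in range(10):
    cluster.insert({"tenant_id": tenant_id, "amount": tenant_id * 10})

total = cluster.count_where(lambda row: row["amount"] >= 50)
print(total)  # 5
```

Each worker only ever looks at its own rows, and the coordinator combines the small partial results.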

Anyway, one of the most frequently asked questions is: “How does Citus handle failures of the coordinator or worker nodes? What’s the HA story?”

And with the exception of when you’re running Citus in a managed service in the cloud, the answer so far was not great: just use PostgreSQL streaming replication to run the coordinator and workers with HA, and it is up to you how to handle a failover.

In this blog post, you’ll learn how Patroni 3.0+ can be used to deploy a highly available Citus database cluster—just by adding a few lines to the Patroni configuration file.

Keep reading

Our goal for the Citus extension is for you to be able to use all PostgreSQL features at any scale, with a seamless scaling experience. Distributed tables (or more generally “Citus tables”) are a powerful tool to get high performance at any scale. There are only a few remaining limitations when distributing a PostgreSQL table, but we are determined to solve them all. The Citus 11.2 release checks off another five SQL & DDL features that now work seamlessly on Citus tables. We also improved progress tracking for the shard rebalancer, so you know exactly what’s going on when rebalancing your cluster.

We also want PostgreSQL tools to work out-of-the-box even if you have a distributed PostgreSQL cluster. One of the most frequent questions we get on the Citus Slack from our open source users is how to set up high availability. Alexander Kukushkin, who is the primary maintainer of Patroni and recently joined the Citus database engine team, therefore developed a new version of Patroni which includes support for Citus!

Before we dive in, you can find detailed release notes for Citus 11.2 by the engineering team on our Updates page.

Keep reading
