TAO is a very important part of the infrastructure at Facebook. This is my attempt at summarizing the TAO paper, the ruby-forum.org post, and the talk by Nathan Bronson. I am referring purely to public-domain materials for this post.



Memcache was being used as a cache for serving the FB graph, which is persisted on MySQL. Using Memcache along with MySQL as a look-aside/write-through cache makes it complicated for product engineers to write code modifying the graph while taking care of consistency, retries, etc. There has to be glue code to unify this, which can be buggy.

A new abstraction of Objects & Associations was created, which allowed expressing a lot of actions on FB as objects and their associations. Initially there seems to have been a PHP layer which deprecated direct access to MySQL for operations that fit this abstraction, while continuing to use Memcache and MySQL under the covers.

This PHP layer for the above model is not ideal, since:

Incremental Updates: For one-to-many associations, such as the association between a page and its fans on FB, any incremental update to the fan list would invalidate the entire list in the cache.

Distributed Control Logic: Control logic resides in fat clients, which is always problematic.

Expensive Read After Write Consistency: Unclear to me.


TAO is a write-through cache backed by MySQL.

TAO objects have a type ($otype$), along with a 64-bit globally unique id. Associations have a type ($atype$) and a creation timestamp. Two objects can have only one association of the same type. As an example, users can be objects and their friendship can be represented as an association. TAO also provides the option to add inverse assocs when adding an assoc.


The TAO API is simple by design. Most calls are intuitive to understand.

assoc_add(id1, atype, id2, time, (k→v)*): Add an association of type atype from id1 to id2.

assoc_delete(id1, atype, id2): Delete the association of type atype from id1 to id2.

assoc_get(id1, atype, id2set, high?, low?): Returns assocs of atype between id1 and members of id2set, whose creation time lies between $[low, high]$.

assoc_count(id1, atype): Number of assocs from id1 of type atype.

And a few others; refer to the paper.
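To make the object-and-association model concrete, here is a toy, in-memory sketch of these calls in Python. The class name, storage layout, and attribute handling are all my own illustration, not TAO's implementation:

```python
import time
from collections import defaultdict

class TinyTao:
    """Toy, in-memory sketch of the TAO association API (not Facebook's code).
    At most one assoc of a given type may exist between two objects,
    matching the paper's model."""

    def __init__(self):
        # (id1, atype) -> {id2: (creation_time, attrs)}
        self._assocs = defaultdict(dict)

    def assoc_add(self, id1, atype, id2, t=None, **attrs):
        # Adding again overwrites: only one assoc of a type per (id1, id2).
        self._assocs[(id1, atype)][id2] = (t if t is not None else time.time(), attrs)

    def assoc_delete(self, id1, atype, id2):
        self._assocs[(id1, atype)].pop(id2, None)

    def assoc_get(self, id1, atype, id2set, high=None, low=None):
        # Return only assocs whose creation time lies in [low, high].
        out = {}
        for id2 in id2set:
            entry = self._assocs[(id1, atype)].get(id2)
            if entry is None:
                continue
            t, attrs = entry
            if (low is None or t >= low) and (high is None or t <= high):
                out[id2] = (t, attrs)
        return out

    def assoc_count(self, id1, atype):
        return len(self._assocs[(id1, atype)])
```

For example, adding two "friend" assocs from object 1 and then querying with `high=150` returns only the assoc created at time 100.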

As per the paper:

TAO enforces a per-atype upper bound (typically 6,000) on the actual limit used for an association query.

This is also probably why the maximum number of friends you can have on FB is 5000.


There are two important factors in the TAO architecture design:

On FB the aggregate consumption of content (reads) is far more than the aggregate content creation (writes). The TAO API is such that, to generate a newsfeed story (for example), the web server will need to do the dependency resolution on its own, and hence will require multiple round-trips to the TAO backend. This further amplifies reads as compared to writes, bringing the read-write ratio to 500:1, as mentioned in Nathan's talk.

The choice of being okay with multiple round-trips to build a page, while wanting to ensure a snappy product experience, imposes the requirement that:

Each of these read requests should have a low read latency (cannot cross data-center boundaries for every request). The read availability is required to be pretty high.

Choice of Backing Store

The underlying DB is MySQL, and the TAO API is mapped to simple SQL queries. MySQL had been operated at FB for a long time, and internally backups, bulk imports, async replication etc. using MySQL were well understood. Also, MySQL provides atomic write transactions and few latency outliers.

Sharding / Data Distribution

Objects & Associations are in different tables. Data is divided into logical shards. Each shard is served by a database.

Quoting from the paper:

In practice, the number of shards far exceeds the number of servers; we tune the shard to server mapping to balance load across different hosts.


And it seems like the sharding trick we credited to Pinterest might have been used by FB first :-)

Each object id contains an embedded shard id that identifies its hosting shard.

The above setup means that your shard id is pre-decided. An assoc is stored in the shard belonging to its id1.
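A minimal sketch of what an embedded shard id could look like. The paper does not specify the bit layout, so the 16-bit width and low-bit placement below are assumptions for illustration:

```python
SHARD_BITS = 16  # assumed width; the real layout is not public
SHARD_MASK = (1 << SHARD_BITS) - 1

def new_object_id(sequence: int, shard_id: int) -> int:
    """Pack a shard id into the low bits of a 64-bit object id."""
    return (sequence << SHARD_BITS) | (shard_id & SHARD_MASK)

def shard_of(object_id: int) -> int:
    """Recover the hosting shard without any lookup table."""
    return object_id & SHARD_MASK

def shard_of_assoc(id1: int, id2: int) -> int:
    """An assoc lives on the shard of its id1, per the paper."""
    return shard_of(id1)
```

The point of the design is visible here: routing a request needs no directory lookup, because the id itself says where the data lives.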

Consistency Semantics

TAO also requires “read-what-you-wrote” consistency semantics for writers, and eventual consistency otherwise.

Leader-Follower Architecture

TAO is set up with multiple regions, and user requests hit the region closest to them. The diagram below illustrates the caching architecture.

There is one ‘master’ region and several ‘slave’ regions. Each region has a complete copy of the databases. There is ongoing async replication from the master to the slave(s). In each region, there are groups of machines called ‘followers’, where each individual group of followers caches and completely serves read requests for the entire domain of the data. Clients are sticky to a specific group of followers.

In each region, there is a group of leaders, with one leader per shard. Read requests are served by the followers; cache misses are forwarded to the leaders, which in turn return the result either from their cache or by querying the DB.

Write requests are forwarded to the leader of that shard in the local region. If the current region is a slave region, the request is forwarded onward to the leader of that shard in the master region.

The leader sends cache-refill/invalidation messages to its followers, and to the slave leader if the leader belongs to the master region. These messages are idempotent.
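One common way to make refill/invalidation messages idempotent is to version them, so that duplicates and out-of-order deliveries become no-ops. A toy sketch of that idea (not TAO's actual wire protocol):

```python
class FollowerCache:
    """Toy follower cache where refill/invalidate messages are idempotent:
    each message carries a version, and duplicate or stale messages are no-ops."""

    def __init__(self):
        self._data = {}  # key -> (version, value); value None means invalidated

    def apply(self, key, version, value=None):
        cur = self._data.get(key)
        if cur is not None and cur[0] >= version:
            return  # duplicate or out-of-order message: safe to ignore
        self._data[key] = (version, value)

    def get(self, key):
        entry = self._data.get(key)
        return None if entry is None else entry[1]
```

Because applying the same message twice leaves the cache unchanged, the leader can safely retry deliveries without coordination.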

The way this is set up, reads can never be stale in the master leader region. Followers in the master region, the slave leader, and by extension slave followers might serve stale reads. The authors mention an average replication lag of 1s between master and slave DBs, though they don't mention whether this is same-coast / cross-country / trans-atlantic replication.

When the leader fails, reads go directly to the DB. Writes to the failed leader go through a random member in the leader tier.

Read Availability

There are multiple places to read from, which increases read availability. If the follower that the client is talking to dies, the client can talk to some other follower in the same region. If all followers are down, it can talk directly to the leader in the region. Following whose failure, the client contacts the DB in the current region, or followers / leaders in other regions.
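The failover order described above can be sketched as a simple fallback loop. The tier names and error handling here are illustrative assumptions, not TAO's client code:

```python
def read_with_failover(key, local_followers, local_leader, local_db, remote_tiers):
    """Try read sources in decreasing order of preference. Each source is a
    callable that returns a value or raises on failure (names illustrative)."""
    sources = [*local_followers, local_leader, local_db, *remote_tiers]
    for source in sources:
        try:
            return source(key)
        except Exception:
            continue  # this tier is down; fall through to the next one
    raise RuntimeError("all read sources failed")
```

Each extra tier multiplies availability: a read fails only if every follower, the leader, the local DB, and all remote tiers fail together.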


These are some client-side observed latency & hit-rate numbers in the paper.

The authors report a failure rate of $4.9 \times 10^{-6}$, which is five 9s! Though one caveat, as mentioned in the paper: because of the ‘chained’ nature of TAO requests, an initial failed request would imply the dependent requests would not be tried to begin with.


This again is a relatively very readable paper. I could understand most of it in 3 readings. It helped that there is a talk and a ruby-forum.org post about this, which makes the material easier to grasp.

I liked that the system is designed to have a simple API, and focuses on making it as fast as possible. Complex operations have not been built into the API. Eventual consistency is fine for a lot of use cases.

There is no transactional support. So if we have assocs and inverse assocs (for example likes_page and page_liked_by edges), we would ideally want to remove both atomically. However, it is possible that the assoc in one direction was removed, but there was a failure to remove the assoc in the other direction. These dangling pointers are removed by an async job, as per the paper. So clients have to ensure that they are fine with this.
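A toy sketch of what such an async repair job could look like. The paper only says dangling assocs are removed asynchronously; the exact policy below (dropping the orphaned half rather than re-creating its inverse) is an assumption for illustration:

```python
def repair_dangling(assocs, inverse_of):
    """Toy async-repair pass over a set of (id1, atype, id2) triples.
    For every assoc whose atype has a known inverse, drop it if the
    matching inverse edge is missing (policy assumed, not from the paper)."""
    repaired = set(assocs)
    for (id1, atype, id2) in assocs:
        inv = inverse_of.get(atype)
        if inv and (id2, inv, id1) not in assocs:
            repaired.discard((id1, atype, id2))  # dangling half: remove it
    return repaired
```

A client that can tolerate briefly seeing one direction without the other is fine; one that cannot has to build its own safeguards on top.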

From the Q&A after the talk, Nathan Bronson mentions that there exists a flag in the calls which can be set to force a cache miss / stronger consistency guarantees. This could be specifically useful in certain use cases such as blocking / privacy settings.