Highlights DoorDash: Building Faster Indexing with Apache Kafka and Elasticsearch The DoorDash team faced an issue of a very long time for updating the search index. They’ve built a search system relying on open source technologies. It uses Kafka as a message queue and for data storage, Flink for data transformation, and sending data to Elasticsearch. A reliable indexing system would ensure that changes in stores and items are reflected in the search index in real-time.
Posts with the tag mysql:
Highlights DoorDash: Optimizing OpenTelemetry’s Span Processor for High Throughput and Low CPU Costs With an effort to migrate from the monolith to microservices, there is a need to trace requests between these services. OpenTelemetry is a new project aiming to become a standard. As a request hits a system, OpenTelemetry assigns a unique ID, all the underlying services (called spans) receive the same ID. Spans data is being collected on a local collector, then it’s sent to a collector gateway.
Highlights FullContact: Improving the Graph: Transition to ScyllaDB FullContact set an ambitious goal of 10,000 QPS. Initially, they moved their database from HBase to Cassandra. Cluster consisted of 3 instances of c5.2xlarge EC2 + 2 TB of gp2 EBS storage. With the growing amount of records in the database, response time crept from 100 ms to 300 ms. It turned out that the default Size Tiered Compaction Strategy is optimized for inserts which lead to a single file for SSTable.