Announcing Azure Copilot agents and AI infrastructure innovations (8 minute read)
Azure Copilot is an agentic interface that automates cloud migration, optimization, troubleshooting, and governance while aligning all actions with organizational policies, RBAC, and compliance standards. Backed by Azure's expanding AI-ready infrastructure and six specialized Copilot agents in preview, the platform aims to modernize workloads, streamline operations, and enhance reliability.
Stack Overflow Introduces Stack Internal (3 minute read)
Stack Internal is an evolution of Stack Overflow's enterprise knowledge platform that blends AI automation with human validation to create a secure, continuously improving knowledge ecosystem for modern engineering teams. New capabilities, such as AI-powered knowledge ingestion and the MCP Server, connect organizational tools and copilots to trusted, verified content, reducing hallucinations, strengthening compliance, and boosting developer productivity.
Kubernetes 1.35: 10 new Alpha features (6 minute read)
Kubernetes 1.35 introduces 10 new alpha features, including Dynamic Resource Allocation improvements for binding conditions and partitionable devices, that aim to better orchestrate AI-native workloads. Other highlights include a Mixed Version Proxy to address version skew during upgrades, CSI driver options to receive generated service account tokens in secrets, and a framework for nodes to declare available Kubernetes features.
Why I (still) love Linux (12 minute read)
Linux provides unmatched freedom, accessibility, and longevity, even as many modern distributions drift from classic Unix principles and introduce instability in the name of progress. Despite frustrations like systemd's scope creep and corporate influence on development direction, Linux remains deeply valuable thanks to its hardware support, minimalistic distributions, and the decades of learning, growth, and reliability it has enabled.
"Good engineering management" is a fad (8 minute read)
Engineering management expectations swing with industry cycles: the 2000s rewarded organizational navigation, the 2010s rewarded hypergrowth-era people leadership, and the post-2022 era rewards hands-on execution as AI and economic conditions reshape organizations. These shifts are driven by business realities, not moral narratives, so the only durable path is developing broad foundational and long-term leadership skills while managing your career energy and priorities over a decades-long horizon.
Introducing the fully managed Amazon EKS MCP Server (preview) (9 minute read)
Amazon Elastic Kubernetes Service (EKS) has launched a fully managed EKS Model Context Protocol (MCP) Server in preview. The EKS MCP Server lets users manage EKS clusters through natural language instead of complex commands, and it can be integrated with AI tools like Kiro, Cursor, and Cline. Amazon Q integration also enables AI-powered troubleshooting within the Amazon EKS console. The EKS MCP Server is currently available in all AWS commercial regions; it is not available in the US GovCloud or China regions.
PgDog (GitHub Repo)
PgDog, a Rust-based transaction pooler and logical replication manager, can shard PostgreSQL and manage hundreds of databases and connections. As an application-layer load balancer, PgDog supports multiple strategies, such as round robin, and can automatically route queries to shards, even splitting COPY commands. Health checks maximize database availability, and configuration can be changed at runtime without breaking connections. It is free and open source under the AGPL v3 license.
Apache TsFile (Resource)
TsFile is a columnar storage file format designed for time series data. It supports efficient compression and high read and write throughput, and it is compatible with various frameworks, such as Spark and Flink. It is easy to integrate TsFile into IoT big data processing frameworks.
Reliability lessons from the 2025 Cloudflare outage (5 minute read)
Cloudflare's outage on November 18 began with a configuration change that caused a file used by its Bot Management system to exceed its size limit. The resulting HTTP 5XX errors cascaded across multiple dependent services and took major websites like X, ChatGPT, and Shopify offline. To reduce similar risks, organizations should test dependencies with fault injection, monitor health checks, identify single points of failure, and implement failover or error-handling mechanisms.
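One possible shape for the error-handling advice above is to validate a newly generated configuration before activating it, and to keep serving the last known good version when validation fails. The sketch below is illustrative only and is not Cloudflare's implementation; the function name `apply_update`, the `max_entries` limit, and the list-of-strings config shape are all hypothetical.

```python
from typing import List, Tuple


def apply_update(
    last_good: List[str], candidate: List[str], max_entries: int
) -> Tuple[List[str], bool]:
    """Return the config to serve and whether the candidate was accepted.

    Hypothetical helper: an oversized candidate is rejected and the
    last known good config stays active instead of raising an error.
    """
    if len(candidate) > max_entries:
        # Fail safe: keep serving the previous config.
        return last_good, False
    return candidate, True


# Usage: the oversized candidate is rejected, the valid one is accepted.
active, accepted = apply_update(["a", "b"], ["a", "b", "c"], max_entries=2)
active2, accepted2 = apply_update(["a", "b"], ["x"], max_entries=2)
```

In a real system the rejection would also raise an alert, since silently staying on stale configuration hides the underlying problem.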
Demystifying Determinism in Durable Execution (8 minute read)
Durable execution frameworks recover by re-running a function from the top and reusing previously recorded side-effect results, which means the control flow must be fully deterministic so that every retry makes the same decisions and passes the same arguments. Side effects themselves can be non-deterministic, but they must be idempotent or duplication-tolerant because they may be re-invoked if their results weren't durably recorded.
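The replay behavior described above can be sketched in a few lines. This is a toy model under stated assumptions, not the API of any particular framework: `Journal`, `Context`, `step`, and the `reserve`/`charge` side effects are hypothetical names, and the journal lives in memory rather than in durable storage.

```python
from typing import Any, Callable, List


class Journal:
    """Hypothetical stand-in for a durable log of side-effect results."""

    def __init__(self) -> None:
        self.results: List[Any] = []


class Context:
    """Per-run context that replays recorded results in order."""

    def __init__(self, journal: Journal) -> None:
        self.journal = journal
        self.position = 0  # index of the next side effect in this run

    def step(self, effect: Callable[[], Any]) -> Any:
        if self.position < len(self.journal.results):
            # Already recorded in an earlier run: reuse, do not re-invoke.
            result = self.journal.results[self.position]
        else:
            result = effect()
            self.journal.results.append(result)
        self.position += 1
        return result


calls: List[str] = []  # tracks which side effects actually ran


def reserve() -> str:
    calls.append("reserve")
    return "reservation-1"


def charge() -> str:
    calls.append("charge")
    return "receipt-1"


def workflow(ctx: Context, crash_after_reserve: bool = False) -> str:
    # Control flow is deterministic: the same steps in the same order.
    reservation = ctx.step(reserve)
    if crash_after_reserve:
        # Simulates a process crash, not a business decision.
        raise RuntimeError("simulated crash")
    receipt = ctx.step(charge)
    return f"{reservation}/{receipt}"


journal = Journal()
try:
    workflow(Context(journal), crash_after_reserve=True)
except RuntimeError:
    pass  # the first run dies after recording the reservation

# The retry re-runs from the top; reserve() is replayed, charge() runs once.
outcome = workflow(Context(journal))
```

Because results are matched to steps by position, a retry that took a different branch would read the wrong recorded value, which is why the control flow has to be deterministic. The sketch also records each result immediately; if a crash happened between running an effect and recording it, the effect would run again on retry, which is why side effects need to be idempotent or duplication-tolerant.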
Want to advertise in TLDR? 📰
If your company is interested in reaching an audience of devops professionals and decision makers, you may want to advertise with us.
Want to work at TLDR? 💼
Apply here or send a friend's resume to jobs@tldr.tech and get $1k if we hire them!
If you have any comments or feedback, just respond to this email!
Thanks for reading,
Kunal Desai & Martin Hauskrecht