Tiger Teams at Grafana 🐅, Apache Kafka 4.1 🆕, DBT Best Practices 📜

TLDR DevOps <dan@tldrnewsletter.com>

September 8, 11:10 am


Together With The Linux Foundation

TLDR DevOps 2025-09-08

Vector search that's fast, reliable, and open source (Sponsor)

Vector workloads for RAG, semantic search, and GenAI need low latency at scale. OpenSearch 3.0 introduces a next-gen vector engine built to deliver, with no vendor lock-in required.

⚡️ Performance-first: Consistent low latency at enterprise scale

⚡️ AI-ready: Hybrid + vector search on an open architecture

⚡️ Operates anywhere: Portable across environments; integrates with Faiss, Lucene

⚡️ Predictable by design: Apache 2.0, license-free model

⚡️ Built for production: Handles heavy traffic and growth

Future-proof your AI stack with OpenSearch

📱

News & Trends

Kubernetes v1.34: PSI Metrics for Kubernetes Graduates to Beta (2 minute read)

Kubernetes v1.34 has graduated Pressure Stall Information (PSI) Metrics to Beta, offering a way to quantify resource pressure on nodes. Collected by the kubelet with the KubeletPSI feature gate enabled, PSI metrics expose data from the Linux kernel via the Summary API and the /metrics/cadvisor Prometheus endpoint to monitor CPU, memory, and I/O pressure at the node, pod, and container level.
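
To make the kernel side of this concrete, here is a minimal sketch of parsing the text format the Linux kernel uses for pressure stall files such as /proc/pressure/cpu. It only illustrates the raw kernel data; it does not call the kubelet Summary API or the Prometheus endpoint, and the sample values and the names `PressureLine` and `parse_psi` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PressureLine:
    kind: str       # "some" or "full"
    avg10: float    # % of time stalled over the last 10 seconds
    avg60: float    # % of time stalled over the last 60 seconds
    avg300: float   # % of time stalled over the last 300 seconds
    total_us: int   # cumulative stall time in microseconds


def parse_psi(text: str) -> dict[str, PressureLine]:
    """Parse the contents of a /proc/pressure/<resource> file."""
    result = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        fields = dict(pair.split("=", 1) for pair in pairs)
        result[kind] = PressureLine(
            kind=kind,
            avg10=float(fields["avg10"]),
            avg60=float(fields["avg60"]),
            avg300=float(fields["avg300"]),
            total_us=int(fields["total"]),
        )
    return result


# Hypothetical sample in the kernel's format (values are invented).
SAMPLE = """some avg10=1.50 avg60=0.75 avg300=0.10 total=123456
full avg10=0.25 avg60=0.10 avg300=0.00 total=7890
"""

if __name__ == "__main__":
    for kind, line in parse_psi(SAMPLE).items():
        print(kind, line.avg10, line.total_us)
```

On a Linux host with PSI enabled, the same function could be pointed at the real file contents instead of `SAMPLE`.
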
Introducing Apache Kafka® 4.1.0: What's New and How to Upgrade (6 minute read)

Apache Kafka 4.1.0 introduces major updates, including the preview of Queues for Kafka (KIP-932), early access to the new Streams Rebalance Protocol (KIP-1071), consistent error handling for transactions, expanded OAuth support, and deadlock protection in producers. It also adds improvements across Streams and Connect, like explicit naming for internal topics and support for running multiple connector versions, making the platform more robust, scalable, and easier to manage.
🚀

Opinions & Tutorials

From Symptoms to Solutions: Reducing MTTR through error analysis in New Relic (10 minute read)

Modern outages cost businesses up to $23,750 per minute, but structured error analysis with observability reduces downtime by turning noisy alerts into root cause insights. Teams can correlate symptoms, signals, logs, traces, and change events in one workflow, cutting MTTR and enabling faster, more reliable remediation.
Tiger teams: How we tackle urgent, cross-functional challenges at Grafana Labs (8 minute read)

Grafana Labs successfully used "tiger teams" (small, focused groups of specialists) to address high-priority, cross-functional problems, improving release cycles and tackling urgent issues. The concept originated with NASA's Apollo 13 mission; at Grafana Labs, a tiger team consists of three to five specialized engineers who work together temporarily, typically for one to two quarters, until a specific goal is achieved, such as simplifying the release process or securing a major partnership. The success of the Release Tiger Team led to the creation of a full-time dedicated team to iterate on release issues, though tiger teams can also be dissolved once their objectives have been met.
Understanding dbt: basics and best practices (5 minute read)

Data Build Tool (dbt) is an open-source analytics engineering framework that allows teams to transform raw data in warehouses like Snowflake and BigQuery using SQL-based workflows. Available as dbt Core (free CLI tool) and dbt Cloud (managed platform), dbt introduces software engineering best practices such as version control, automated testing, and CI/CD to analytics workflows. OpenLineage integration allows users to capture dbt job details, such as start and end times, to proactively alert and perform root cause analysis.
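
As a rough illustration of alerting on job start and end times, the sketch below pairs START and COMPLETE events by run ID and flags runs that exceed a duration threshold. The field names (`eventType`, `eventTime`, `run.runId`, `job.name`) follow my understanding of the OpenLineage run event shape and should be treated as an assumption; the job names, timestamps, and the `slow_runs` helper are hypothetical.

```python
from datetime import datetime, timedelta


def parse_time(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def slow_runs(events, threshold: timedelta):
    """Return (job name, duration) for completed runs longer than threshold."""
    starts = {}
    slow = []
    for ev in events:
        run_id = ev["run"]["runId"]
        when = parse_time(ev["eventTime"])
        if ev["eventType"] == "START":
            starts[run_id] = when
        elif ev["eventType"] == "COMPLETE" and run_id in starts:
            duration = when - starts.pop(run_id)
            if duration > threshold:
                slow.append((ev["job"]["name"], duration))
    return slow


# Hypothetical events; job names and times are invented for illustration.
EVENTS = [
    {"eventType": "START", "eventTime": "2025-09-08T10:00:00Z",
     "run": {"runId": "r1"}, "job": {"name": "stg_orders"}},
    {"eventType": "START", "eventTime": "2025-09-08T10:00:00Z",
     "run": {"runId": "r2"}, "job": {"name": "fct_revenue"}},
    {"eventType": "COMPLETE", "eventTime": "2025-09-08T10:02:00Z",
     "run": {"runId": "r1"}, "job": {"name": "stg_orders"}},
    {"eventType": "COMPLETE", "eventTime": "2025-09-08T10:12:00Z",
     "run": {"runId": "r2"}, "job": {"name": "fct_revenue"}},
]

if __name__ == "__main__":
    for name, duration in slow_runs(EVENTS, timedelta(minutes=5)):
        print(f"{name} took {duration}")
```

In practice the threshold would likely come from each job's historical run times rather than a fixed value.
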
๐Ÿง‘โ€๐Ÿ’ป

Resources & Tools

Bridging the K8s Management Gap: Visibility, Availability, and Automation (Sponsor)

Is scaling Kubernetes creating management chaos, security headaches, and unexpected costs? This IDC Spotlight paper offers a roadmap for automating and unifying application, storage, and data management. Key solution capabilities include multicluster visibility, high availability, and extensive automation for policies like storage provisioning, backup, and security. Get a free copy from SUSE
Kubewall (GitHub Repo)

Kubewall is a single-binary Kubernetes dashboard that facilitates multi-cluster management and AI integration with options like OpenAI, Gemini, and Claude 4.
GitLab CI Local (GitHub Repo)

GitLab CI Local allows users to run GitLab pipelines locally as shell or Docker executors, eliminating the need to push code for testing.
Stelvio (GitHub Repo)

Stelvio is a Python framework that simplifies AWS cloud infrastructure management and deployment. It lets users define their cloud infrastructure using pure Python, with smart defaults that handle complex configuration automatically.
🎁

Miscellaneous

Broadcom's New Bitnami Restrictions? Migrate Easily with Docker (5 minute read)

Broadcom is moving most Bitnami images behind a paid subscription, leaving only :latest tags free and shifting older images into an unsupported legacy registry, which raises cost, stability, and compliance risks for teams. Docker's Official Images and Hardened Images are free and enterprise-ready alternatives that offer security, stability, and predictable pricing for organizations needing reliable container solutions.
Dapr meets GitOps: A Guide to Dapr and Argo CD (13 minute read)

Dapr abstracts distributed system complexity while Argo CD enforces GitOps workflows, enabling consistent, automated, and self-service microservice deployments on Kubernetes. Together, they reduce operational overhead, prevent drift, and provide a scalable foundation for secure and reliable cloud-native applications.
⚡

Quick Links

See the secure DevOps platform used by defense, intelligence, and critical infrastructure orgs (Sponsor)

If your work is mission-critical, generalist tools won't cut it. Learn how Mattermost Enterprise Advanced delivers Zero Trust security, classified data controls, and BYOD readiness. Join live
Real-Time Security with Continuous Access Evaluation (CAE) comes to Azure DevOps (2 minute read)

Continuous Access Evaluation is now supported in Azure DevOps, enabling near real-time enforcement of Conditional Access policies to quickly revoke access after critical events such as password changes, account disablement, or IP changes.
What Experts See That the Rest of Us Miss During Incidents (5 minute read)

Skilled incident commanders manage cognitive saturation and redirect unhelpful executive involvement into productive contributions.


Thanks for reading,
Kunal Desai & Martin Hauskrecht