PagerDuty’s Vision for AI-Operations ✨, Monitoring Claude Usage 🖥️, Left To Right Programming ↔️

TLDR DevOps <dan@tldrnewsletter.com>

August 20, 11:10 am

Together With IBM

TLDR DevOps 2025-08-20

Build. Automate. Scale. (Sponsor)

From pipelines to production, IBM TechXchange 2025 is where DevOps engineers level up

🧪 Live labs. Open source tooling. HashiCorp & Red Hat experts.
💬 Learn from 1,000+ speakers across 12 tracks, and walk away with skills you can apply immediately.
🎟️ Pass options include Full, Single-Day, Community Day, and Student Dev Day.

Explore DevOps sessions →
View pass types →
Register today →

📱

News & Trends

The PagerDuty Vision for AI-First Operations (9 minute read)

AI-powered systems require new strategies focused on model accuracy, data freshness, governance, and human-AI collaboration. This post outlines PagerDuty's vision for AI-first operations. It introduces a framework that details seven guiding principles and three AI types that together reduce manual toil, improve incident resolution, and free humans for higher-value work.
Tuning Linux Swap for Kubernetes: A Deep Dive (6 minute read)

Kubernetes v1.34 is expected to introduce a stable NodeSwap feature, enabling swap usage on Linux nodes to improve resource utilization and reduce out-of-memory (OOM) kills. Performance and stability rely on tuning Linux kernel parameters like swappiness and min_free_kbytes, which influence memory management, I/O latency, and the kubelet's eviction mechanism.
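
Before tuning, it can help to see what a node is currently set to. The following is a minimal Python sketch, assuming a Linux host where both parameters are exposed under /proc/sys/vm. The helper name read_vm_params and its root argument are illustrative and do not come from the linked article.

```python
from pathlib import Path

# Kernel parameters named in the summary; on Linux both live under /proc/sys/vm.
VM_PARAMS = ("swappiness", "min_free_kbytes")


def read_vm_params(root="/proc/sys/vm"):
    """Return {name: int or None} for each parameter in VM_PARAMS.

    `root` is overridable so the function can be pointed at a test directory.
    """
    values = {}
    for name in VM_PARAMS:
        path = Path(root) / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing file (for example a non-Linux host) or unexpected content.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_params().items():
        print(f"vm.{name} = {value}")
```

The sketch only reads values. Changing them requires root privileges and should follow the guidance in the article itself.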
Introducing langchain-gradient: Seamless LangChain Integration with DigitalOcean Gradient AI Platform (2 minute read)

langchain-gradient is an open-source integration that connects LangChain's framework for building applications powered by large language models (LLMs) to DigitalOcean's Gradient AI Platform. The package helps developers build and deploy applications such as retrieval-augmented chatbots, automated customer support, and AI copilots for developers.
🚀

Opinions & Tutorials

Solving secret zero with Vault and OpenShift Virtualization (12 minute read)

Red Hat OpenShift Virtualization can use Kubernetes service accounts and Vault Agent to securely authenticate virtual machines to HashiCorp Vault without relying on secret zero. By templating these configurations, organizations can standardize and automate secure VM provisioning while enabling access to Vault-managed secrets across workloads.
How to monitor Claude usage and costs: introducing the Anthropic integration for Grafana Cloud (7 minute read)

Grafana Cloud now integrates with Anthropic, allowing users to monitor the costs and performance of Claude LLMs. The integration pulls usage data directly from the Anthropic Usage and Cost API, converts it into Prometheus-format metrics, and stores it in Grafana Cloud, providing a pre-built dashboard and customizable alerts for tracking token consumption, costs, and model usage. To get started, users must enter their Anthropic Admin API key in the integration settings to automatically add pre-built dashboards and alerts.
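
To make the "converts it into Prometheus-format metrics" step concrete, here is a rough Python sketch of rendering usage records in the Prometheus text exposition format. It is not Grafana's implementation. The metric name anthropic_usage_tokens, the record keys (model, token_type, tokens), and the model name in the example are all assumptions made for illustration.

```python
def escape_label(value):
    """Escape a label value for the Prometheus text format (backslash, quote, newline)."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_prometheus(records, metric="anthropic_usage_tokens"):
    """Render usage records as Prometheus text-format lines.

    `records` is a list of dicts with hypothetical keys: model, token_type, tokens.
    """
    lines = [
        f"# HELP {metric} Tokens reported for a usage bucket (hypothetical metric).",
        f"# TYPE {metric} gauge",
    ]
    for rec in records:
        labels = ",".join(
            f'{key}="{escape_label(str(rec[key]))}"' for key in ("model", "token_type")
        )
        # Doubled braces emit the literal { } that wrap the label set.
        lines.append(f"{metric}{{{labels}}} {rec['tokens']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [{"model": "example-model", "token_type": "input", "tokens": 1200}]
    print(to_prometheus(sample), end="")
```

Each record becomes one sample line with its dimensions as labels, which is what lets a dashboard or alert rule slice token counts by model or token type.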
Monitor Claude usage and cost data with Datadog Cloud Cost Management (5 minute read)

Datadog's Cloud Cost Management (CCM) now integrates with the Anthropic Usage and Cost Admin API, enabling users to ingest Claude usage and cost data directly into existing dashboards, reports, and monitors. With this integration, organizations can monitor costs by model, workspace, API key, and service tier, and then set alerts for cost and usage data with prebuilt monitor templates. The integration normalizes Claude usage and cost data using the FinOps Foundation's FOCUS format, providing a unified schema for analysis and cost governance.
🧑‍💻

Resources & Tools

⚠️ Your "free" Kubernetes platform actually costs $100k+ per year (Sponsor)

Kubernetes is open source, but the people needed to run it aren't free. New research from Portainer breaks down the hidden labor costs most teams miss. At enterprise scale, annual operational costs routinely exceed $1 million, before infrastructure spend. See the complete cost breakdown by operational scale.
Rotel (GitHub Repo)

Rotel, a high-performance and resource-efficient solution for OpenTelemetry data collection, processing, and exporting, is now available under the Apache 2.0 license. Configured via command-line flags or environment variables, Rotel supports multiple exporters (including OTLP, Datadog, and Kafka) and includes a Python processor SDK for custom data modification.
GitHub Agentic Workflows (GitHub Repo)

GitHub Agentic Workflows lets you write agentic workflows in natural-language Markdown and run them in GitHub Actions.
🎁

Miscellaneous

Patterns for safe and efficient cache purging in CI/CD pipelines (17 minute read)

Caching in CI/CD pipelines improves build speed, reduces compute costs, and enhances runtime performance by storing dependencies, build artifacts, and static assets across various stages. Effective cache purging strategies like content hashing, TTL, and automated invalidation help prevent stale data and ensure fresh user content.
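
Two of the strategies named above, content hashing and TTL, can be sketched in a few lines of Python. This is an illustration under assumptions rather than the article's code: the function names, the 16-character key suffix, and the idea of hashing lockfile bytes are choices made for the example.

```python
import hashlib
import time


def cache_key(prefix, file_contents):
    """Derive a cache key from a prefix and the bytes of the files that define the cache.

    Any change to the inputs (for example a lockfile) yields a new key, so stale
    entries are simply never looked up again instead of being purged explicitly.
    """
    digest = hashlib.sha256()
    for content in file_contents:
        # Hash each length first so (b"ab", b"c") and (b"a", b"bc") produce different keys.
        digest.update(len(content).to_bytes(8, "big"))
        digest.update(content)
    return f"{prefix}-{digest.hexdigest()[:16]}"


def is_expired(created_at, ttl_seconds, now=None):
    """Return True once an entry created at `created_at` (epoch seconds) has outlived its TTL."""
    now = time.time() if now is None else now
    return now - created_at >= ttl_seconds


if __name__ == "__main__":
    key = cache_key("deps", [b"requests==2.0\n"])
    print(key, is_expired(created_at=0, ttl_seconds=3600))
```

Content hashing covers caches whose validity depends on known inputs, while a TTL acts as a backstop for entries whose inputs cannot be enumerated.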
AI Agents Transform Platform Engineering at Microsoft (5 minute read)

Microsoft's platform engineering team uses AI coding agents to autonomously implement security updates, modernize pipelines, manage dependencies, and enforce code quality across thousands of repositories, reducing months of work to weeks with greater consistency and less developer disruption. Amanda Silver envisions smaller, more strategic teams focused on designing systems and standards while AI handles repetitive, large-scale implementation tasks.
⚑

Quick Links

AWS Cloud Map adds support for cross-account service discovery (2 minute read)

AWS Cloud Map now supports cross-account service discovery through AWS Resource Access Manager, enabling centralized namespace sharing across accounts, Organizational Units, or an entire AWS Organization.
Left To Right Programming (4 minute read)

Programs should support “left-to-right” coding, where code remains valid as it's typed, so editors can provide meaningful autocompletion and guidance.

Thanks for reading,
Kunal Desai & Martin Hauskrecht